[TEZ-1499] Add SortMergeJoinExample to tez-examples - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.5.1
Component/s: None
Labels: None
Hadoop Flags: Incompatible change, Reviewed

Description

In the current join example, the inputs of JoinProcessor are unordered, so it always has to load one input into memory and stream the other. This only works when one dataset is small enough to fit into memory (even without broadcast, memory may not be enough). So I'd like to add another join example in which the inputs of JoinProcessor are ordered (using OrderedPartitionedKVEdgeConfig). This kind of join can be used when both datasets are large.
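To illustrate why ordered inputs remove the need to hold one side in memory, here is a minimal sketch of the merge step of a sort-merge join in plain Java. It is not the code from the attached patches and uses no Tez APIs: the class name `SortMergeJoinSketch`, the `join` method, and the sample data are hypothetical, and it assumes both inputs are sorted ascending, contain distinct non-null keys, and are joined on the key alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SortMergeJoinSketch {

    // Returns the next element of the iterator, or null when it is exhausted.
    private static String advance(Iterator<String> it) {
        return it.hasNext() ? it.next() : null;
    }

    // Inner join on keys. Both iterators must yield distinct, non-null keys
    // in ascending order (assumption of this sketch). Only the current key of
    // each side is held at any time, so neither input is loaded fully.
    public static List<String> join(Iterator<String> left, Iterator<String> right) {
        List<String> matches = new ArrayList<>();
        String l = advance(left);
        String r = advance(right);
        while (l != null && r != null) {
            int cmp = l.compareTo(r);
            if (cmp == 0) {
                // Key present on both sides: emit it and move both cursors.
                matches.add(l);
                l = advance(left);
                r = advance(right);
            } else if (cmp < 0) {
                // Left key is smaller, so it cannot appear later on the right.
                l = advance(left);
            } else {
                // Right key is smaller, so it cannot appear later on the left.
                r = advance(right);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("apple", "banana", "cherry", "fig");
        List<String> b = Arrays.asList("banana", "date", "fig", "grape");
        System.out.println(join(a.iterator(), b.iterator())); // prints [banana, fig]
    }
}
```

The output list is collected here only to keep the sketch easy to check; a real processor would presumably write each match to its output as it is found, so memory use stays constant regardless of input size.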

Attachments

- Tez-1499-3.patch, 22/Sep/14 03:05, 58 kB, Jeff Zhang
- Tez-1499-2.patch, 19/Sep/14 06:33, 56 kB, Jeff Zhang
- Tez-1499.patch, 03/Sep/14 09:02, 15 kB, Jeff Zhang

People

Assignee: Jeff Zhang

Reporter: Jeff Zhang

Votes: 0

Watchers: 3

Dates

Created: 26/Aug/14 06:00

Updated: 02/Oct/14 21:41

Resolved: 22/Sep/14 05:41