Apache Tez / TEZ-1499

Add SortMergeJoinExample to tez-examples

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.5.1
    • None
    • None
    • Incompatible change, Reviewed

    Description

      In the current join example, the inputs of JoinProcessor are unordered, so it always has to load one input into memory and stream the other. This only works when one dataset is small enough to fit into memory (even without broadcast, memory may not be enough). So I'd like to add another join example in which the inputs of JoinProcessor are ordered (using OrderedPartitionedKVEdgeConfig). This kind of join can be used when both datasets are large.
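To illustrate why ordered inputs remove the need to hold a whole dataset in memory, here is a minimal sketch of a merge join over two sorted key streams. It is plain Java and does not use any Tez API; the class name `SortMergeJoinSketch`, the method `mergeJoin`, and the sample keys are hypothetical, and it assumes both inputs are already sorted and contain no null keys. It is not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SortMergeJoinSketch {

    // Returns each key present in both inputs. Both iterators must yield
    // non-null keys in ascending order. Only the current key of each side
    // is held at any time; neither input is loaded fully into memory.
    static List<String> mergeJoin(Iterator<String> left, Iterator<String> right) {
        List<String> out = new ArrayList<>();
        String l = left.hasNext() ? left.next() : null;   // null marks end of input
        String r = right.hasNext() ? right.next() : null;
        while (l != null && r != null) {
            int cmp = l.compareTo(r);
            if (cmp < 0) {
                // Left key is smaller, so it cannot match anything later on the right.
                l = left.hasNext() ? left.next() : null;
            } else if (cmp > 0) {
                // Right key is smaller, so advance the right side.
                r = right.hasNext() ? right.next() : null;
            } else {
                out.add(l);
                String matched = l;
                // Skip repeated occurrences of the matched key on both sides.
                while (l != null && l.equals(matched)) {
                    l = left.hasNext() ? left.next() : null;
                }
                while (r != null && r.equals(matched)) {
                    r = right.hasNext() ? right.next() : null;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("apple", "banana", "cherry", "kiwi");
        List<String> b = Arrays.asList("banana", "cherry", "cherry", "mango");
        System.out.println(mergeJoin(a.iterator(), b.iterator())); // prints [banana, cherry]
    }
}
```

In this sketch the output list is kept in memory only for demonstration; a real processor would presumably write each matched key to its output as it is found.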

      Attachments

        1. Tez-1499-3.patch
          58 kB
          Jeff Zhang
        2. Tez-1499-2.patch
          56 kB
          Jeff Zhang
        3. Tez-1499.patch
          15 kB
          Jeff Zhang

        Activity

          People

            zjffdu Jeff Zhang
            zjffdu Jeff Zhang
            Votes: 0
            Watchers: 3
