Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: tez
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Tez has auto-parallelism built in. We can start by reusing it and then look at customization for better performance.

      Attachments

        1. PIG-3846-1.patch
          56 kB
          Daniel Dai
        2. PIG-3846-3.patch
          85 kB
          Daniel Dai
        3. PIG-3846-5.patch
          103 kB
          Daniel Dai
        4. PIG-3846-6.patch
          151 kB
          Daniel Dai
        5. PIG-3846-7.patch
          146 kB
          Daniel Dai
        6. PIG-3846-9.patch
          146 kB
          Daniel Dai

        Issue Links

          Activity

            daijy Daniel Dai added a comment -

            Attached the initial patch. Test cases still need to be added, and the unit tests and e2e tests still need to be run.

            daijy Daniel Dai added a comment - RB link: https://reviews.apache.org/r/21302/
            daijy Daniel Dai added a comment -

            Summary of changes:
            1. TezOperDependencyParallelismEstimator: estimates the parallelism of a vertex from the parallelism of its predecessors and the operators within the predecessors' physical plans.
            2. PigOrderByVertexManager: a VertexManagerPlugin for the sort vertex of an order by. It receives an event from the partition node and automatically decreases the parallelism of the sort vertex (TEZ-1107 prevents increasing the parallelism of the sort job).
            3. Changes to POReservoirSample, FindQuantilesTez, WeightedRangePartitionerTez, and PigProcessor to assist PigOrderByVertexManager: FindQuantilesTez estimates numQuantiles from the samples sent by POReservoirSample (which include stats of the previous job), WeightedRangePartitionerTez partitions the incoming data into the estimated numQuantiles partitions, and PigProcessor sends numQuantiles to PigOrderByVertexManager.
            4. Set the auto-parallelism flag of ShuffleVertexManager to true for applicable vertices.
            5. Add estimatedParallelism to TezOperator. If requestedParallelism is not set, TezOperDependencyParallelismEstimator estimates the parallelism and instructs the VertexManager to determine the parallelism dynamically.
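
            For illustration only, a minimal Java sketch of the two ideas in items 1 and 2. It is not taken from the patch: the class ParallelismSketch, the Predecessor type, the outputFactor field, and the rule of summing scaled predecessor parallelism are all assumptions made for this example. The only behavior drawn from the summary is that the estimate depends on predecessors and that the sort vertex parallelism is only ever decreased.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Pig or Tez code base.
class ParallelismSketch {

    /** Assumed description of one predecessor vertex. */
    public static final class Predecessor {
        final int parallelism;      // number of tasks in the predecessor
        final double outputFactor;  // assumed ratio of output size to input size

        public Predecessor(int parallelism, double outputFactor) {
            this.parallelism = parallelism;
            this.outputFactor = outputFactor;
        }
    }

    /**
     * Estimates parallelism as the sum of each predecessor's parallelism
     * scaled by its assumed output factor, with a floor of 1.
     */
    public static int estimateParallelism(List<Predecessor> predecessors) {
        double total = 0.0;
        for (Predecessor p : predecessors) {
            total += p.parallelism * p.outputFactor;
        }
        return Math.max(1, (int) Math.ceil(total));
    }

    /**
     * Picks the sort vertex parallelism: it may shrink to numQuantiles
     * but never grows beyond the current value, and never drops below 1.
     */
    public static int adjustSortParallelism(int current, int numQuantiles) {
        return Math.max(1, Math.min(current, numQuantiles));
    }

    public static void main(String[] args) {
        List<Predecessor> preds = Arrays.asList(
                new Predecessor(10, 0.5),   // e.g. a filtering predecessor
                new Predecessor(4, 2.0));   // e.g. a flattening predecessor
        System.out.println(estimateParallelism(preds));
        System.out.println(adjustSortParallelism(20, 8));
        System.out.println(adjustSortParallelism(5, 8));
    }
}
```

            In this sketch a requested increase is simply clamped to the current value, which mirrors the decrease-only constraint mentioned for TEZ-1107.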

            daijy Daniel Dai added a comment -

            Fix skewed join auto-parallelism.

            daijy Daniel Dai added a comment -

            Another update, pending on all the linked Tez patches.

            daijy Daniel Dai added a comment -

            Attached the final patch.

            daijy Daniel Dai added a comment -

            Patch committed to trunk. Thanks to Rohini for the review. The review comments are on RB.


            People

              daijy Daniel Dai
              rohini Rohini Palaniswamy
              Votes: 0
              Watchers: 2
