  1. Apache Tez
  2. TEZ-1107

Support increase of parallelism of vertex in case of custom partitioner

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The current VertexManagerPlugin/EdgeManager mechanism supports decreasing the parallelism of a vertex, but increasing it is not supported. In general, increasing parallelism requires repartitioning the data. However, in my simplified case, the preceding vertex uses a custom partitioner that can already partition to the final parallelism, so repartitioning is not needed. Even so, I hit an exception from the sorter:
      : Caused by: java.io.IOException: Illegal partition for Null: false index: 0 53.8 (2), TotalPartitions: 2
      : at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.collect(DefaultSorter.java:208)
      : at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.write(DefaultSorter.java:185)
      : at org.apache.tez.runtime.library.output.OnFileSortedOutput$1.write(OnFileSortedOutput.java:111)
      : at org.apache.pig.backend.hadoop.executionengine.tez.POIdentityInOutTez.getNextTuple(POIdentityInOutTez.java:148)
      : ... 8 more

      While increasing parallelism in general is harder, increasing parallelism when a custom partitioner is in use might be easier to support.
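
      The mismatch behind the exception can be sketched roughly as follows. This is a simplified illustration, not the Tez source: the class `PartitionBoundsSketch`, the methods `customPartition` and `collect`, and the partition counts are all hypothetical. The bounds check is only inferred from the error message above, where partition 2 is rejected when TotalPartitions is 2.

```java
import java.io.IOException;

public class PartitionBoundsSketch {

    // Hypothetical custom partitioner: spreads keys over the *final*
    // (increased) parallelism rather than the currently configured one.
    static int customPartition(Object key, int finalParallelism) {
        return Math.floorMod(key.hashCode(), finalParallelism);
    }

    // Simplified stand-in for the kind of bounds check a sorter might apply
    // before accepting a record (assumption, modeled on the error message).
    static void collect(Object key, int partition, int totalPartitions)
            throws IOException {
        if (partition < 0 || partition >= totalPartitions) {
            throw new IOException("Illegal partition for " + key
                    + " (" + partition + "), TotalPartitions: " + totalPartitions);
        }
        // A real sorter would buffer the record here; omitted in this sketch.
    }

    public static void main(String[] args) {
        int configuredPartitions = 2; // what the output was initialized with
        int finalParallelism = 4;     // what the custom partitioner targets

        for (int k = 0; k < 8; k++) {
            Integer key = k;
            int partition = customPartition(key, finalParallelism);
            try {
                collect(key, partition, configuredPartitions);
                System.out.println("key " + key + " accepted in partition " + partition);
            } catch (IOException e) {
                // Keys mapped to partitions >= configuredPartitions land here.
                System.out.println("key " + key + " rejected: " + e.getMessage());
            }
        }
    }
}
```

      Under these assumptions, every key the partitioner maps to a partition at or above the configured count is rejected, even though the partitioner itself is consistent with the intended final parallelism. Supporting the increase would presumably mean letting the configured partition count follow the partitioner's target.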

            People

              bikassaha Bikas Saha
              daijy Daniel Dai
              Votes: 0
              Watchers: 3
