Apache Tez / TEZ-3334

Tez Custom Shuffle Handler

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      When auto-parallelism is reduced (e.g. TEZ-3222), a custom shuffle handler could cut the number of fetches and fetch data more efficiently. In particular, a reducer that fetches 100 pieces serially from the same mapper could instead retrieve them in a single fetch call.
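
      The batching idea in the description can be illustrated with a minimal sketch. It is not the Tez implementation or its API: the function names (`coalesce_partitions`, `plan_fetches`) and the mapper labels are hypothetical, and it assumes that a run of consecutive partition ids from one mapper can be served by a single fetch call.

```python
from typing import Dict, Iterable, List, Tuple


def coalesce_partitions(partition_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Merge partition ids into (first_partition, count) runs of consecutive ids."""
    runs: List[Tuple[int, int]] = []
    for pid in sorted(set(partition_ids)):
        if runs and pid == runs[-1][0] + runs[-1][1]:
            # pid directly follows the last run, so extend that run by one.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Gap in the ids: start a new run.
            runs.append((pid, 1))
    return runs


def plan_fetches(needed: Dict[str, Iterable[int]]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each (hypothetical) mapper id to the runs it would be asked for."""
    return {mapper: coalesce_partitions(pids) for mapper, pids in needed.items()}


# A reducer that needs 100 consecutive pieces from mapper_0 and a few
# scattered pieces from mapper_1 (illustrative data only).
plan = plan_fetches({"mapper_0": range(100), "mapper_1": [3, 4, 5, 9]})
serial_calls = 100 + 4  # one call per piece
batched_calls = sum(len(runs) for runs in plan.values())  # one call per run
```

      Under these assumptions the 100 serial fetches from `mapper_0` collapse into one range request, while non-contiguous pieces still need one request per run.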

      Attachments

        1. TEZ-3334.3.patch
          447 kB
          Jonathan Turner Eagles
        2. TEZ-3334.2.patch
          435 kB
          Jonathan Turner Eagles
        3. TEZ-3334.1.patch
          423 kB
          Jonathan Turner Eagles

      Sub-Tasks

        1. Tez Custom Shuffle Handler POC (Closed; Jonathan Turner Eagles)
        2. Tez Custom Shuffle Handler Documentation (Closed; Jonathan Turner Eagles)
        3. Fetch Multiple Partitions from the Shuffle Handler (Closed; Jonathan Turner Eagles)
        4. Delete intermediate data at DAG level for Shuffle Handler (Closed; Kuhu Shukla)
        5. Delete intermediate data at the vertex level for Shuffle Handler (Resolved; Syed Shameerur Rahman; time spent: 11.5h)
        6. Query fetch stats without fetching for Shuffle Handler (Open; Unassigned)
        7. Add support for Multiple Files Fetch from the Shuffle Handler (Open; Kuhu Shukla)
        8. Allow tez to enable mapreduce or tez shuffle handler (Resolved; Unassigned)
        9. Tez Shuffle Bench (Open; Gopal Vijayaraghavan)
        10. Remove ShuffleHandler dependency on mapred.FadvisedChunkedFile and mapred.FadvisedFileRegion (Closed; Kuhu Shukla)
        11. Move Shuffle Handler configuration into the Tez namespace (Closed; Eric Badger)
        12. Shuffle Handler: Replace primitive wrapper's valueOf method with parse* method to avoid unnecessary boxing/unboxing (Closed; Kuhu Shukla)
        13. Remove ShuffleHandler dependency on mapred.JobShuffleInfoProto and mapreduce.JobID (In Progress; Jonathan Turner Eagles)
        14. Provide error information in shuffle response header (Open; Unassigned)
        15. Package Shuffle Handler as a shaded uber-jar (Closed; Jonathan Turner Eagles)
        16. Remove extra jetty dependency from Shuffle Handler (Closed; Jonathan Turner Eagles)
        17. Shuffle service name should be configureable and should not be hardcoded to ‘mapreduce_shuffle’ (Closed; Jonathan Turner Eagles)
        18. Allow Task Output Files to reside in DAG specific directories for Custom Shuffle Handler (Closed; Kuhu Shukla)
        19. ShuffleHandler should use Path.SEPARATOR instead of '/' (Closed; Kuhu Shukla)
        20. TestShuffleHandler#testSendMapCount should not used hard coded ShuffleHandler port (Closed; Kuhu Shukla)
        21. Modify ShuffleHandler to use Constants.DAG_PREFIX and fix AttemptPathIdentifier#toString() (Closed; Kuhu Shukla)
        22. Ability to configure shuffle server listen queue length (Resolved; Unassigned)
        23. Port MAPREDUCE-6763 to Tez ShuffleHandler (Closed; Kuhu Shukla)
        24. Make DAG Deletion Tracker configurable for ContainerLaunchers (Closed; Kuhu Shukla)
        25. Backport MAPREDUCE-6808. Log map attempts as part of shuffle handler audit log (Closed; Kuhu Shukla)
        26. Make Deletion Service Pluggable (Open; Kuhu Shukla)
        27. TEZ-3362 causes TestContainerLauncherWrapper#testDelegation to fail (Closed; Kuhu Shukla)
        28. Tez Shuffle Handler logging fails to initialize (Closed; Jonathan Turner Eagles)
        29. TezConfiguration#TEZ_DELETION_TRACKER_CLASS has the wrong config key-name (Closed; Kuhu Shukla)
        30. Remove fusesource.leveldbjni from the tez-auxservices shaded jar (Closed; Kuhu Shukla)
        31. Fetcher fetchInputs() can NPE on srcAttempt due to missing entry in pathToAttemptMap (Closed; Kuhu Shukla)
        32. Remove google.protobuf from the tez-auxservices shaded jar (Closed; Jonathan Turner Eagles)
        33. Composite Fetch account error for disk direct (Closed; Jonathan Turner Eagles)
        34. Number of Empty DME logged for Composite fetch is too high (Closed; Jonathan Turner Eagles)
        35. Composite Fetch hangs on certain DME empty events. (Closed; Jonathan Turner Eagles)
        36. Unordered Fetcher can hang if empty partitions are present (Closed; Kuhu Shukla)
        37. Remove the compositeInputAttemptIdentifier from remaining list upon fetch completion in the Ordered case (Closed; Kuhu Shukla)
        38. Fix debug log for empty partitions to the expanded partitionId in the Composite case (Closed; Kuhu Shukla)
        39. ShuffleScheduler#allEventsReceived check is too tight (Open; Kuhu Shukla)
        40. Fetcher can hang if copyMapOutput/fetchInputs returns early (Closed; Kuhu Shukla)
        41. Tez Shuffle Handler Content length does not match actual (Closed; Jonathan Turner Eagles)
        42. Shuffle Handler Loading cache equality tests always results is false (Closed; Jonathan Turner Eagles)
        43. UnorderedPartitionedKVOutput is missing the shuffle service config in the confKeys set (Closed; Kuhu Shukla)
        44. Optimize the Shuffle Handler content length calculation for keep alive (Closed; Jonathan Turner Eagles)
        45. Give Tez shuffle handler threads custom names (Closed; Jonathan Turner Eagles)
        46. Implement keep-alive timeout in tez shuffle handler (Closed; Jonathan Turner Eagles)
        47. Pass parameters instead of configuration for changes to support tez shuffle handler (Closed; Jonathan Turner Eagles)
        48. LocalContainerLauncher#shouldDelete member variable is not used (Closed; Kuhu Shukla)
        49. Incorporate first pass non-essential TEZ-3334 pre-merge feedback (Closed; Jonathan Turner Eagles)
        50. ShuffleHandler completedInputSet off-by-one error (Closed; Jonathan Turner Eagles)
        51. Tez shuffle jar includes service loader entry for ClientProtocolProvider but not the corresponding class (Closed; Jason Darrell Lowe)
        52. Modify DeletionTracker and deletion threads to be initialized only if enabled for tez_shuffle (Closed; Kuhu Shukla)
        53. Use Local FileContext for deleting dag level directories (Closed; Kuhu Shukla)
        54. Allow dag level deletion in cases where containers are reused (Closed; Kuhu Shukla)
        55. Cleanup http connections and other unnecessary fields in DAG Deletion tracker classes. (Closed; Kuhu Shukla)
        56. Clean up DeletionTracker's reflection instantiation and provide ContainerLauncher with dagComplete() functionality (Closed; Kuhu Shukla)
        57. Use a new abstract class for ContainerLauncher(s) that provide dagComplete() functionality (Resolved; Kuhu Shukla)
        58. Test failures in TestTaskAttempt and TestAMContainerMap (Closed; Kuhu Shukla)
        59. Clean up TEZ-3334-CHANGES.txt (Closed; Jonathan Turner Eagles)
        60. Delete intermediate attempt data for failed attempts for Shuffle Handler (Resolved; Syed Shameerur Rahman; time spent: 4h 20m)
        61. Remove javax.security from the tez-auxservices shaded jar (Resolved; László Bodor; time spent: 0.5h)

          People

            Assignee: Unassigned
            Reporter: Jonathan Turner Eagles (jeagles)
            Votes: 0
            Watchers: 11

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 16h 20m