  1. Apache Tez
  2. TEZ-4180

Show convenient input -> output vertex names in output/sort messages

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.2
    • Component/s: None
    • Labels: None

    Description

      When reading aggregated YARN application logs, the message below can be confusing (for those not yet familiar with sort/merge internals, or worn out from reading huge aggregated logs or application logs in LLAP). It makes the user think that something is happening in a reducer task, whereas map output spilling actually happens in the map task.

      2020-05-14 09:23:55,471 [INFO] [TezChild] |impl.PipelinedSorter|: Reducer 5: Spilling to /grid/2/yarn/nm/usercache/hive/appcache/application_1576231194218_0094/output/attempt_1576231194218_0094_1_12_000497_0_10147_0/file.out
      

      I would prefer a prefix like "Map 3 -> Reducer 5", which can be built from the two vertex names available on the output context:

      outputContext.getTaskVertexName() -> outputContext.getDestinationVertexName()
      

      This is also useful when only an excerpt of the app logs is available (e.g. lines grepped for "Spilling").
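
A minimal sketch of the proposed prefix, assuming only the two accessor names quoted above. The `VertexNames` interface and the `SpillLogPrefix` class are hypothetical stand-ins so the example runs without Tez on the classpath; this is not the actual patch.

```java
public class SpillLogPrefix {

    /** Hypothetical stand-in exposing the two accessors named in this issue. */
    interface VertexNames {
        String getTaskVertexName();        // vertex the task runs in, e.g. "Map 3"
        String getDestinationVertexName(); // vertex the output is sent to, e.g. "Reducer 5"
    }

    /** Joins the two vertex names into a "source -> destination" prefix. */
    static String format(String taskVertex, String destinationVertex) {
        return taskVertex + " -> " + destinationVertex;
    }

    /** Convenience overload that reads both names from the context. */
    static String format(VertexNames ctx) {
        return format(ctx.getTaskVertexName(), ctx.getDestinationVertexName());
    }

    public static void main(String[] args) {
        VertexNames ctx = new VertexNames() {
            @Override public String getTaskVertexName() { return "Map 3"; }
            @Override public String getDestinationVertexName() { return "Reducer 5"; }
        };
        // Prints: Map 3 -> Reducer 5
        System.out.println(format(ctx));
    }
}
```

With a prefix like this, a line grepped for "Spilling" shows both the vertex doing the work and the vertex it feeds.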

            People

              Assignee: abstractdog László Bodor
              Reporter: abstractdog László Bodor
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m