  1. Apache Tez
  2. TEZ-803 Fault Tolerance In Tez
  3. TEZ-1142

Bail out early if a vertex has too many failures

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      If a vertex sees a high number of failures, bail out early instead of waiting for 4 failures of the same task.
      Let's say the vertex sees N consecutive failures without any successful task completion. That's probably good enough evidence to infer that there is some bug in the code for the tasks in that vertex. Bailing out early wastes fewer resources.
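
      The rule described above could be sketched as a small counter that resets on any success and trips after N failures in a row. This is only an illustration under stated assumptions: ConsecutiveFailureTracker, onTaskSucceeded, onTaskFailed and the threshold parameter are hypothetical names, not existing Tez classes or APIs, and the choice of N is left open.

```java
/**
 * Hypothetical helper (not part of the Tez API): tracks consecutive task
 * failures within one vertex and signals when the vertex could be failed early.
 */
final class ConsecutiveFailureTracker {
    // "N" from the description: failures in a row, with no success in between.
    private final int maxConsecutiveFailures;
    private int consecutiveFailures = 0;

    ConsecutiveFailureTracker(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Any successful task completion in the vertex resets the streak. */
    synchronized void onTaskSucceeded() {
        consecutiveFailures = 0;
    }

    /**
     * Records one failed task attempt in the vertex.
     * Returns true once N failures have been seen in a row, meaning the
     * caller may choose to fail the vertex early.
     */
    synchronized boolean onTaskFailed() {
        consecutiveFailures++;
        return consecutiveFailures >= maxConsecutiveFailures;
    }
}
```

      In this sketch, whatever component receives task completion events for a vertex would call onTaskSucceeded() or onTaskFailed() and fail the vertex when the latter returns true. The existing per-task attempt limit would stay in place as the fallback.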

          People

            Assignee: Unassigned
            Reporter: Bikas Saha (bikassaha)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: