Hive / HIVE-10433

Cancel connection when remote driver process exited with error code [Spark Branch]

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • spark-branch

    Description

      Currently in HoS (Hive on Spark), after SparkClientImpl starts a remote driver process, it waits for that process to connect back. However, in some cases the process fails and exits with an error code, so no connection is ever attempted. In this situation, the HS2 process keeps waiting for the connection and eventually times out. Worse, the user may have to wait for two timeout periods: one for SparkSetReducerParallelism and another for the actual Spark job.

      We should cancel the timeout task and mark the promise as failed as soon as we know that the process has failed.
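
The proposed behavior could be sketched roughly as follows. This is a minimal illustration using only JDK classes, not the actual SparkClientImpl code: the class name `DriverConnectionWatcher`, the method `watch`, and the use of `CompletableFuture` in place of the promise type are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of Hive. */
final class DriverConnectionWatcher {

    /**
     * Returns a promise for the driver's connection that fails early if the
     * child process exits with a non-zero code before connecting.
     *
     * @param childExit completes with the child's exit code, for example
     *                  process.onExit().thenApply(Process::exitValue)
     */
    static CompletableFuture<String> watch(
            CompletableFuture<Integer> childExit,
            ScheduledExecutorService timer,
            long timeoutMillis) {
        CompletableFuture<String> connected = new CompletableFuture<>();

        // Behavior described in the issue: fail only when the timeout fires.
        ScheduledFuture<?> timeoutTask = timer.schedule(() -> {
            connected.completeExceptionally(new TimeoutException(
                "Timed out waiting for remote driver to connect"));
        }, timeoutMillis, TimeUnit.MILLISECONDS);

        // Proposed behavior: an error exit fails the promise right away.
        childExit.thenAccept(code -> {
            if (code != 0) {
                connected.completeExceptionally(new IllegalStateException(
                    "Remote driver exited with error code " + code));
            }
        });

        // Once the promise is settled either way, cancel the timeout task.
        connected.whenComplete((result, error) -> timeoutTask.cancel(false));
        return connected;
    }
}
```

In this sketch the caller completes the returned future with a value when the driver actually connects. A single failed promise of this kind would let every waiter (such as the two timeout periods mentioned above) observe the failure without waiting for its own timeout.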

          People

            Assignee: Unassigned
            Reporter: Chao Sun (csun)
            Votes: 0
            Watchers: 2
