Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- Impala 3.4.0
Description
I saw a hang triggered by test_failpoints in JoinBuilder::HandoffToProbesAndWait(), where the thread was blocked even though build_side_state->is_cancelled_ was true.
The sequence of events leading to the bug is as follows:
- Thread A is in HandoffToProbesAndWait(), reads is_cancelled_ and sees false.
- Thread B, in RuntimeState::Cancel(), sets is_cancelled_ = true, acquires cancellation_cvs_lock_, then calls NotifyAll() on the condition variable.
- Thread A calls Wait() on the cv and blocks forever, because the notification has already been delivered.
I think this is most likely to happen if thread A is de-scheduled between reading is_cancelled_ and calling Wait().
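The sequence above is a lost-wakeup race. As a rough illustration only, here is a minimal standalone sketch of one common remedy: the flag is updated and re-checked under the same mutex that the waiter holds, so the notification cannot slip in between the check and the wait. It uses std::mutex and std::condition_variable rather than Impala's own classes. BuildSideState, Cancel(), WaitForHandoffOrCancel() and RunCancelRace() are hypothetical names made up for this sketch, and it is not necessarily how the actual patch resolved the issue.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the shared build-side state (not Impala's class).
struct BuildSideState {
  std::mutex lock;             // plays the role of cancellation_cvs_lock_
  std::condition_variable cv;
  bool is_cancelled = false;   // in this sketch, only touched while holding lock
  bool handoff_done = false;   // never set here; kept to show the full predicate
};

// Thread B: set the flag while holding the lock, then notify.
// A waiter is therefore either not yet checking the flag (and will see true)
// or already blocked in wait() (and will receive the notification).
void Cancel(BuildSideState& s) {
  {
    std::lock_guard<std::mutex> l(s.lock);
    s.is_cancelled = true;
  }
  s.cv.notify_all();
}

// Thread A: check the flag and wait under the same lock.
// The predicate overload re-checks the condition before blocking and after
// every wakeup. Returns true if the handoff completed, false if cancelled.
bool WaitForHandoffOrCancel(BuildSideState& s) {
  std::unique_lock<std::mutex> l(s.lock);
  s.cv.wait(l, [&s] { return s.is_cancelled || s.handoff_done; });
  return !s.is_cancelled;
}

// Usage: cancel from the main thread while the waiter may or may not have
// started waiting yet. Terminates under either interleaving.
bool RunCancelRace() {
  BuildSideState s;
  bool completed = true;
  std::thread waiter([&] { completed = WaitForHandoffOrCancel(s); });
  Cancel(s);
  waiter.join();
  return completed;
}
```

In the buggy ordering, the flag is read before the lock is taken, so Cancel() can run in the gap. Here the check and the wait form one critical section, which closes that window regardless of how the threads are scheduled.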
Issue Links
- is broken by: IMPALA-9156 Share broadcast join builds between fragments (Resolved)
- is related to: IMPALA-9612 Runtime filter wait longer than it should be (Resolved)
- relates to: IMPALA-5904 Enable ThreadSanitizer for Impala (Open)