Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.0-alpha4, 3.1.1, 3.3.0
Fix Version/s: None
Component/s: None
Description
The sequence of steps ① to ⑩ ultimately leaves the Active ResourceManager's RMStateStore stopped in the FENCED state, so the status of jobs can no longer be updated.
Solution
First, adopting the solution described in YARN-11622 enables an ordered switch between "toActive" and "toStandby". With that ordering in place, the check on the "hasAlreadyRun" variable can be removed, which avoids this issue.
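
To illustrate what an "ordered switch" could mean, here is a minimal Java sketch. It is not the ResourceManager code and not necessarily how YARN-11622 implements the ordering. The class OrderedTransitions, its methods and the HaState enum are hypothetical names introduced only for this example. The assumption is that every transition request is placed on a single-threaded queue, so requests are applied strictly in the order they were submitted and no one-shot flag is needed to keep a late request from interleaving with a newer one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: serializes HA transitions on one worker thread.
class OrderedTransitions {
    enum HaState { STANDBY, ACTIVE }

    // A single worker thread means queued transitions run one at a time,
    // in submission order.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();

    // Records every applied transition so the order can be inspected.
    private final List<String> history =
        Collections.synchronizedList(new ArrayList<>());

    // Only read and written on the worker thread.
    private HaState state = HaState.STANDBY;

    Future<?> toActive() {
        return queue.submit(() -> apply(HaState.ACTIVE));
    }

    Future<?> toStandby() {
        return queue.submit(() -> apply(HaState.STANDBY));
    }

    private void apply(HaState target) {
        state = target;
        history.add(target.name());
    }

    // Stops accepting new requests, waits for queued ones, returns the order applied.
    List<String> shutdownAndGetHistory() {
        queue.shutdown();
        try {
            queue.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(history);
    }
}
```

In this sketch, calling toStandby(), toActive() and toStandby() in that order from any thread results in the transitions being applied in exactly that order, which is the property the proposed solution relies on before dropping the "hasAlreadyRun" check.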