Details
- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
This is the remaining part of YARN-2001: halting allocations after a restart until x% of the nodes have synced back with the RM. This avoids bad scheduling decisions while nodes are still rejoining after a restart or failover.
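To make the proposal concrete, here is a minimal sketch of such a gate. It is not taken from any patch on this issue: `AllocationGate` and all of its parameters are hypothetical names, and the upper bound on the wait is an assumption borrowed from the idea in YARN-4679 (wait for the earlier of recovery completion and a configured wait time).

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch, not an actual YARN class: holds back allocations after
 * an RM restart until enough nodes have re-registered or a timeout expires.
 */
final class AllocationGate {
    private final int expectedNodes;    // nodes known before the restart (assumed to be recoverable)
    private final int minPercent;       // the "x%" threshold from the description, 0..100
    private final long maxWaitMillis;   // assumed upper bound on how long to hold back allocations
    private final long startMillis;     // time at which the RM restarted or became active
    private final Set<String> resyncedNodes = new HashSet<>();

    AllocationGate(int expectedNodes, int minPercent, long maxWaitMillis, long startMillis) {
        this.expectedNodes = expectedNodes;
        this.minPercent = minPercent;
        this.maxWaitMillis = maxWaitMillis;
        this.startMillis = startMillis;
    }

    /** Record that a node has synced back; repeated calls for the same node count once. */
    synchronized void nodeResynced(String nodeId) {
        resyncedNodes.add(nodeId);
    }

    /** True once the threshold is met or the wait has timed out, whichever comes first. */
    synchronized boolean allocationsAllowed(long nowMillis) {
        if (expectedNodes <= 0) {
            return true; // nothing to wait for
        }
        if (nowMillis - startMillis >= maxWaitMillis) {
            return true; // do not block forever on nodes that never come back
        }
        // Integer arithmetic avoids floating-point rounding at the threshold.
        return (long) resyncedNodes.size() * 100 >= (long) minPercent * expectedNodes;
    }
}
```

In this sketch the scheduler would check `allocationsAllowed(...)` before handing out containers and skip the allocation cycle while it returns false. How the expected node count is obtained after a restart is left open here.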
Issue Links
- is duplicated by: YARN-4679 When work-preserving restart is enabled, the scheduler should wait for the earlier of recovery completion and configured wait time (Resolved)