Details
Type: Bug
Status: Patch Available
Priority: Major
Resolution: Unresolved
Description
The capacity scheduler does not set the node label when it creates the RMContainerImpl for a reserved container. When this container is allocated, LeafQueue treats it as an ignorePartitionExclusivityRMContainer.
This causes preemption to fail. When preemption happens, the preemption policy tries to preempt the reserved container, but LeafQueue does not remove it from ignorePartitionExclusivityRMContainers. In our cluster, we found that the preemption policy keeps trying to preempt the reserved container and actually preempts nothing.
We set the node label information on the reserved container's RMContainerImpl and reran our test. Preemption then performed as expected.
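The failure mode described above can be illustrated with a small toy model. This is only a sketch and does not use the real YARN classes: ToyContainer, ToyQueue, preemptOnce, and freedAfterRounds are hypothetical names, the partition name "x" and the memory sizes are made up, and the rule that a round only looks at the first candidate is a simplifying assumption. It is meant to show why a reserved container with a missing label can keep being picked while nothing is freed.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

public class ReservedContainerLabelSketch {
    static final String NO_LABEL = "";

    /** Toy stand-in for RMContainerImpl; not the real YARN class. */
    static final class ToyContainer {
        final String id;
        final String nodeLabel;  // label recorded when the container object was created
        final boolean reserved;  // a reserved container holds no running resource
        final int memoryMb;

        ToyContainer(String id, String nodeLabel, boolean reserved, int memoryMb) {
            this.id = id;
            this.nodeLabel = nodeLabel;
            this.reserved = reserved;
            this.memoryMb = memoryMb;
        }
    }

    /** Toy stand-in for the leaf queue's bookkeeping (hypothetical). */
    static final class ToyQueue {
        // Insertion-ordered so the candidate order is deterministic.
        final Set<ToyContainer> ignorePartitionExclusivity = new LinkedHashSet<>();

        void allocate(ToyContainer c, String nodePartition) {
            // A container with no recorded label that lands on a labeled node
            // looks like one that ignored partition exclusivity.
            if (c.nodeLabel.equals(NO_LABEL) && !nodePartition.equals(NO_LABEL)) {
                ignorePartitionExclusivity.add(c);
            }
        }

        /** One preemption round: tries the first candidate, returns MB freed. */
        int preemptOnce() {
            Iterator<ToyContainer> it = ignorePartitionExclusivity.iterator();
            if (!it.hasNext()) {
                return 0;
            }
            ToyContainer candidate = it.next();
            if (candidate.reserved) {
                // Modelled on the report: nothing is freed and the entry
                // stays in the set, so it is picked again next round.
                return 0;
            }
            it.remove();
            return candidate.memoryMb;
        }
    }

    /** Runs several preemption rounds and returns the total MB freed. */
    static int freedAfterRounds(String reservedContainerLabel, int rounds) {
        ToyQueue queue = new ToyQueue();
        queue.allocate(new ToyContainer("reserved", reservedContainerLabel, true, 1024), "x");
        queue.allocate(new ToyContainer("running", NO_LABEL, false, 2048), "x");
        int freed = 0;
        for (int i = 0; i < rounds; i++) {
            freed += queue.preemptOnce();
        }
        return freed;
    }

    public static void main(String[] args) {
        System.out.println("label missing: freed " + freedAfterRounds(NO_LABEL, 5) + " MB");
        System.out.println("label set:     freed " + freedAfterRounds("x", 5) + " MB");
    }
}
```

In this toy model, leaving the label empty makes every round stall on the reserved container, while recording the label keeps it out of the set so the running container is preempted instead.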