Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.1.0
Description
I was able to reproduce this issue by starting a job, and this job keeps requesting containers until it uses up cluster available resource. My cluster has 70200 vcores, and each task it applies for 100 vcores, I was expecting total 702 containers can be allocated but eventually there was only 701. The last container could not get allocated because queue used resource is updated to be more than 100%.