Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
When applying the rescale API to change parallelism, we should not change the min parallelism.
The problem currently is that if we cannot acquire the new resources within jobmanager.adaptive-scheduler.resource-wait-timeout, the job will fail completely.
The jobmanager.adaptive-scheduler.resource-stabilization-timeout still allows us to wait for quite a long time, if necessary, to reach the target parallelism, but failing completely because of the wait timeout seems very unfortunate.
It's best to keep the min resources unchanged and let the adaptive scheduler handle the parallelism changes together with the timeout settings.
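The proposed behavior can be illustrated with a minimal sketch. It does not call any Flink API: the class RescaleRequirements, the Bounds type, and the withNewTarget helper are hypothetical names introduced for illustration, and the lowerBound/upperBound fields are only meant to resemble per-vertex parallelism bounds. The sketch assumes a rescale request should raise or lower only the upper bound, and it lowers the existing minimum only when the new target falls below it (an assumption made here to keep the bounds valid).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RescaleRequirements {

    /** Hypothetical value type: parallelism bounds for one job vertex. */
    public static final class Bounds {
        public final int lowerBound;
        public final int upperBound;

        public Bounds(int lowerBound, int upperBound) {
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }
    }

    /**
     * Returns new bounds per vertex where only the upper bound moves to the
     * requested target. Vertices without a target are copied unchanged.
     */
    public static Map<String, Bounds> withNewTarget(
            Map<String, Bounds> current, Map<String, Integer> targetParallelism) {
        Map<String, Bounds> updated = new LinkedHashMap<>();
        for (Map.Entry<String, Bounds> entry : current.entrySet()) {
            Bounds old = entry.getValue();
            Integer target = targetParallelism.get(entry.getKey());
            if (target == null) {
                updated.put(entry.getKey(), old);
                continue;
            }
            // Keep the existing minimum. Assumption: if the target is below
            // the old minimum, lower the minimum so that lower <= upper holds.
            int lower = Math.min(old.lowerBound, target);
            updated.put(entry.getKey(), new Bounds(lower, target));
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Bounds> current = new LinkedHashMap<>();
        current.put("vertex-a", new Bounds(1, 2));

        // Scale up to 8: the minimum stays at 1, so the job can keep running
        // with fewer slots while the scheduler waits for more resources.
        Bounds result = withNewTarget(current, Map.of("vertex-a", 8)).get("vertex-a");
        System.out.println("vertex-a: " + result.lowerBound + ".." + result.upperBound);
        // prints "vertex-a: 1..8"
    }
}
```

Had the minimum been raised to 8 as well, all eight slots would become mandatory, which is the situation where the resource-wait-timeout can fail the job.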