Details
Type: Improvement
Status: Closed
Priority: Not a Priority
Resolution: Done
Description
In FLINK-21667, we decoupled RM leadership from RM lifecycle management. The RM is now started after obtaining leadership and stopped on losing leadership.
Ideally, we could start and stop multiple RMs as the process obtains and loses leadership. However, as discussed in the PR, having a single process start multiple RMs may cause problems in some deployment modes. For example, repeated AM registration is not allowed on Yarn.
We need to investigate, for all deployment modes:
- Whether having multiple leader sessions causes problems.
- If it does, what we can do to solve them.
For reference, multi-leader-session support for the RM has been implemented in FLINK-21667, but it is disabled by default. To enable it, set the system property "flink.tests.enable-rm-multi-leader-session".
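To make the intended behavior concrete, here is a minimal sketch of a leadership-driven RM lifecycle gated by that system property. It is not Flink's actual implementation: the class RmLeaderSessionSketch and its methods are hypothetical, and treating the mere presence of the property as "enabled" (regardless of its value) is an assumption. Only the property name comes from this ticket.

```java
/** Hypothetical sketch of an RM lifecycle driven by leadership changes. */
final class RmLeaderSessionSketch {

    // Property name taken from this ticket.
    public static final String ENABLE_MULTI_LEADER_SESSION_PROPERTY =
            "flink.tests.enable-rm-multi-leader-session";

    private final boolean multiLeaderSessionEnabled;
    private int startedSessions = 0;
    private boolean running = false;

    public RmLeaderSessionSketch() {
        // Assumption: the feature is on whenever the property is present.
        this.multiLeaderSessionEnabled =
                System.getProperties().containsKey(ENABLE_MULTI_LEADER_SESSION_PROPERTY);
    }

    /** Called when the process obtains leadership; returns true if an RM was started. */
    public boolean grantLeadership() {
        if (running) {
            return false; // an RM is already running for the current session
        }
        if (startedSessions > 0 && !multiLeaderSessionEnabled) {
            // Default behavior: never start a second RM in the same process,
            // e.g. to avoid repeated AM registration on Yarn.
            return false;
        }
        startedSessions++;
        running = true;
        return true;
    }

    /** Called when the process loses leadership; stops the current RM, if any. */
    public void revokeLeadership() {
        running = false;
    }

    public boolean isRunning() {
        return running;
    }

    public int getStartedSessions() {
        return startedSessions;
    }
}
```

With a standard JVM, a system property is usually passed on the command line as -Dflink.tests.enable-rm-multi-leader-session=true. Whether Flink inspects the value or only the presence of the key should be confirmed against the FLINK-21667 change.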
Issue Links
- relates to FLINK-21667 Standby RM might remove resources from Kubernetes (Closed)