Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.12.2
Description
Currently, on initialization, KubernetesResourceManagerDriver starts a watch to receive pod events. It can therefore start receiving events before obtaining leadership. Consequently, a standby RM may remove terminated pods from Kubernetes while handling those events.
This is not very damaging at the moment, since the removed pods are already terminated anyway. However, it would still be good for a standby RM to strictly follow the contract and make no modifications before obtaining leadership. We might consider postponing the start of the watch until leadership is granted.
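A minimal sketch of the proposed ordering, to make the idea concrete. It is not the actual KubernetesResourceManagerDriver code: `PodWatch`, `PodClient`, `LeaderGatedDriver`, and `FakePodClient` are hypothetical names invented for illustration, and closing the watch on revocation is an extra assumption that goes beyond what this ticket proposes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical handle for an open pod watch (not Flink's or fabric8's API). */
interface PodWatch {
    void close();
}

/** Hypothetical client that can open a pod watch and delete pods. */
interface PodClient {
    PodWatch watchPods(Consumer<String> onTerminatedPod);

    void deletePod(String podName);
}

/** Driver sketch: the watch is opened on leadership grant, not on initialization. */
final class LeaderGatedDriver {
    private final PodClient client;
    private PodWatch watch; // null while this instance is a standby

    LeaderGatedDriver(PodClient client) {
        this.client = client;
    }

    /** Initialization deliberately does not start the watch. */
    void initialize() {
        // Only side-effect-free setup would go here.
    }

    /** Starts the watch; terminated pods are removed only from this point on. */
    synchronized void onGrantLeadership() {
        if (watch == null) {
            watch = client.watchPods(client::deletePod);
        }
    }

    /** Assumption beyond the ticket: stop watching when leadership is lost. */
    synchronized void onRevokeLeadership() {
        if (watch != null) {
            watch.close();
            watch = null;
        }
    }

    synchronized boolean isWatching() {
        return watch != null;
    }
}

/** In-memory fake used to exercise the sketch without a cluster. */
final class FakePodClient implements PodClient {
    final List<String> deleted = new ArrayList<>();
    private Consumer<String> handler; // null when no watch is open

    @Override
    public PodWatch watchPods(Consumer<String> onTerminatedPod) {
        handler = onTerminatedPod;
        return () -> handler = null; // closing the watch drops the handler
    }

    @Override
    public void deletePod(String podName) {
        deleted.add(podName);
    }

    /** Simulates Kubernetes reporting a terminated pod. */
    void emitTerminated(String podName) {
        if (handler != null) {
            handler.accept(podName);
        }
    }
}
```

With this ordering, an event emitted between `initialize()` and `onGrantLeadership()` never reaches the driver, so a standby instance makes no modifications to the cluster.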
Issue Links
- blocks
  - FLINK-17707 Support configuring replica of Deployment based HA setups (Closed)
- causes
  - FLINK-23240 ResumeCheckpointManuallyITCase.testExternalizedFSCheckpointsWithLocalRecoveryZookeeper fails on azure (Closed)
  - FLINK-24038 DispatcherResourceManagerComponent fails to deregister application if no leading ResourceManager (Closed)
  - FLINK-25885 ClusterEntrypointTest.testWorkingDirectoryIsDeletedIfApplicationCompletes failed on azure (Closed)
- is related to
  - FLINK-22816 Investigate feasibility of supporting multiple RM leader sessions within JM process (Closed)