Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 1.16.0
Description
The current implementation of the multiple component leader election has a number of issues. Most of them stem from an attempt to make the multiple component leader election process work the same way as the single component leader election.
An attempt at listing the issues follows:
- Naming: MultipleComponentLeaderElectionService is similar in name to LeaderElectionService, but is in fact closer to LeaderElectionDriver.
- Similarity: The interfaces LeaderElectionService, LeaderElectionDriver and MultipleComponentLeaderElectionDriver are very similar to each other.
- Cyclic dependency: DefaultMultipleComponentLeaderElectionService holds a reference to the ZooKeeperMultipleComponentLeaderElectionDriver (MultipleComponentLeaderElectionDriver), which in turn holds a reference to the DefaultMultipleComponentLeaderElectionService (LeaderLatchListener).
- Unclear contract: With single component leader election drivers such as ZooKeeperLeaderElectionDriver, a call to LeaderElectionService#stop from JobMasterServiceLeadershipRunner#closeAsync implies giving up the leadership of the JobMaster. With the multiple component leader election this is no longer the case: the leadership is held until the HighAvailabilityServices are shut down. This logic may be difficult to understand from the perspective of one of the components (e.g., the Dispatcher).
- Long call hierarchy: DefaultLeaderElectionService -> MultipleComponentLeaderElectionDriverAdapter -> MultipleComponentLeaderElectionService -> ZooKeeperMultipleComponentLeaderElectionDriver
- Long prefix: "MultipleComponentLeaderElection" is quite a long prefix, yet it is shared by many classes.
- Adapter as primary implementation: All non-testing, non-multiple-component leadership drivers are deprecated, so the primary implementation of LeaderElectionDriver is the adapter MultipleComponentLeaderElectionDriverAdapter.
- Possible redundancy: We currently have similar methods for the Dispatcher, ResourceManager, JobMaster and WebMonitorEndpoint (e.g., for granting leadership). As these methods are called at the same time due to the multiple component leader election, it may make sense to combine this logic into a single object.
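To make the "Cyclic dependency" item more concrete, here is a minimal sketch of the shape of the problem. It is not the actual Flink code: SketchService, SketchDriver and SketchListener are hypothetical stand-ins for DefaultMultipleComponentLeaderElectionService, ZooKeeperMultipleComponentLeaderElectionDriver and the listener role, and the method names are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// All Sketch* names are hypothetical stand-ins, not the actual Flink classes.
class CyclicDependencySketch {
    public static void main(String[] args) {
        SketchService service = new SketchService();
        // The driver calls back into the very service that owns it.
        service.driver().simulateLeadershipAcquired();
        System.out.println(service.events());
    }
}

// Stand-in for the listener role through which the driver reports back.
interface SketchListener {
    void onLeadershipAcquired();
}

// Stand-in for the driver: it keeps a back-reference to its listener.
class SketchDriver {
    private final SketchListener listener;

    SketchDriver(SketchListener listener) {
        this.listener = listener;
    }

    SketchListener listener() {
        return listener;
    }

    // Simulates the underlying election backend reporting acquired leadership.
    void simulateLeadershipAcquired() {
        listener.onLeadershipAcquired();
    }
}

// Stand-in for the service: it owns the driver and is also the driver's listener.
class SketchService implements SketchListener {
    private final List<String> events = new ArrayList<>();
    private final SketchDriver driver;

    SketchService() {
        // Passing 'this' closes the cycle: service -> driver -> service.
        this.driver = new SketchDriver(this);
    }

    SketchDriver driver() {
        return driver;
    }

    List<String> events() {
        return events;
    }

    @Override
    public void onLeadershipAcquired() {
        events.add("leadership acquired");
    }
}
```

In this shape neither object can be constructed or reasoned about without the other, which is the kind of coupling the refactoring might want to remove.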
Attachments
Issue Links
- blocks
  - FLINK-31816 Refactor EmbeddedLeaderElectionService (Open)
- causes
  - FLINK-32678 Release Testing: Stress-Test to cover multiple low-level changes in Flink (Resolved)
  - FLINK-32994 LeaderElectionDriver.toString() is not implemented (Resolved)
  - FLINK-32503 FLIP-285 technical debt (In Progress)
- Discovered while testing
  - FLINK-25235 Re-enable ZooKeeperLeaderElectionITCase#testJobExecutionOnClusterWithLeaderChange (Resolved)
- fixes
  - FLINK-30484 ZooKeeperLeaderElectionTest.testZooKeeperReelection timed out (Open)
  - FLINK-30338 Clean leader election legacy code (Closed)
- is duplicated by
  - FLINK-30338 Clean leader election legacy code (Closed)
- requires
  - FLINK-25806 Remove legacy high availability services (Closed)