Description
Currently, Hadoop components such as the NameNode and JobTracker are single points of failure.
If the NameNode or JobTracker goes down, its service is unavailable until it is up and running again. If a Standby NameNode or JobTracker were available and ready to serve when the Active node goes down, service downtime could be reduced. Hadoop already provides a Standby NameNode implementation, but it is not a fully "hot" Standby.
The common problems to be addressed in any such Active-Standby cluster are leader election and failure detection. Both can be handled using ZooKeeper, as described in the ZooKeeper recipes:
http://zookeeper.apache.org/doc/r3.3.3/recipes.html
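As a rough illustration of the leader election recipe on that page, the sketch below shows only the decision each participant makes after listing the children of the election znode: the participant with the lowest sequence number is the leader, and every other participant watches the znode immediately preceding its own. It deliberately avoids the ZooKeeper client API and works on plain strings; the class names `ElectionDecision` and `ElectionStep` and the znode names are hypothetical, and it assumes names end in `_` followed by a non-negative, zero-padded sequence number.

```java
import java.util.List;
import java.util.Optional;

/** Outcome of one election round for a single participant (hypothetical type). */
final class ElectionDecision {
    final boolean leader;
    /** Predecessor znode to watch; empty when this participant is the leader. */
    final Optional<String> watchTarget;

    ElectionDecision(boolean leader, Optional<String> watchTarget) {
        this.leader = leader;
        this.watchTarget = watchTarget;
    }
}

/** Pure decision logic of the recipe; no ZooKeeper calls are made here. */
final class ElectionStep {
    /** Extracts the numeric suffix after the last '_', e.g. "n_0000000007" gives 7. */
    static long sequenceOf(String znodeName) {
        return Long.parseLong(znodeName.substring(znodeName.lastIndexOf('_') + 1));
    }

    /**
     * Decides the role of the participant that owns znode `self`,
     * given the current children of the election znode.
     */
    static ElectionDecision decide(List<String> children, String self) {
        if (!children.contains(self)) {
            throw new IllegalArgumentException("own znode is not among the children: " + self);
        }
        long mine = sequenceOf(self);
        String predecessor = null;
        for (String child : children) {
            long seq = sequenceOf(child);
            // Keep the child with the largest sequence number that is still below ours.
            if (seq < mine && (predecessor == null || seq > sequenceOf(predecessor))) {
                predecessor = child;
            }
        }
        if (predecessor == null) {
            return new ElectionDecision(true, Optional.empty());
        }
        return new ElectionDecision(false, Optional.of(predecessor));
    }
}
```

In a real deployment, each participant would rerun this decision whenever its watched predecessor disappears. Watching only the predecessor, rather than the leader, keeps a single failure from waking every standby at once.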
Leader Election Service (LES)
Any node that wants to participate in leader election can use this service. A node starts the service with the required configuration, and the service notifies the node whether it should start in Active or Standby mode. The service also notifies the node of any mode change at runtime. All other complexity is handled internally by the LES.
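To make the contract described above more concrete, here is a minimal in-memory sketch of what such a service could look like from a node's point of view. It is not the proposed implementation: the names `Mode`, `ModeListener`, and `InMemoryLeaderElectionService` are hypothetical, join order stands in for ZooKeeper sequence numbers, and an explicit `leave` call stands in for failure detection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** The two modes a participating node can be told to run in. */
enum Mode { ACTIVE, STANDBY }

/** Callback through which the service tells a node its initial mode and later changes. */
interface ModeListener {
    void onModeChange(Mode newMode);
}

/** In-memory stand-in for the LES; a real one would delegate to ZooKeeper. */
final class InMemoryLeaderElectionService {
    // Insertion order is assumed to play the role of ZooKeeper sequence numbers.
    private final Map<String, ModeListener> participants = new LinkedHashMap<>();
    private String activeId;

    /** Registers a node and immediately tells it which mode to start in. */
    synchronized void join(String nodeId, ModeListener listener) {
        if (participants.containsKey(nodeId)) {
            throw new IllegalArgumentException("already joined: " + nodeId);
        }
        participants.put(nodeId, listener);
        if (activeId == null) {
            activeId = nodeId;
            listener.onModeChange(Mode.ACTIVE);
        } else {
            listener.onModeChange(Mode.STANDBY);
        }
    }

    /** Simulates a node failing; if it was Active, the longest-waiting Standby is promoted. */
    synchronized void leave(String nodeId) {
        if (!participants.containsKey(nodeId)) {
            return;
        }
        participants.remove(nodeId);
        if (!nodeId.equals(activeId)) {
            return;
        }
        activeId = null;
        for (Map.Entry<String, ModeListener> next : participants.entrySet()) {
            activeId = next.getKey();
            next.getValue().onModeChange(Mode.ACTIVE);
            break; // only the first remaining participant is promoted
        }
    }

    /** Returns the id of the current Active node, or null if there is none. */
    synchronized String activeId() {
        return activeId;
    }
}
```

A node such as a NameNode would register a listener at startup and switch roles inside `onModeChange`, so the election details stay hidden behind the service.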
Attachments
Issue Links
- blocks
  - HIVE-2254 Provide an automatic recovery feature for Hive Server in case of failure (Open)
  - HDFS-2124 Namenode HA using Backup Namenode as Hot Standby (Resolved)
  - MAPREDUCE-2648 High Availability for JobTracker (Resolved)
- duplicates
  - ZOOKEEPER-1095 Simple leader election recipe (Closed)
- relates to
  - HDFS-1973 HA: HDFS clients must handle namenode failover and switch over to the new active namenode. (Resolved)