Details
Type: Sub-task
Status: Open
Priority: Critical
Resolution: Unresolved
Description
For long-running services (LRS) on YARN, eliminating the single point of failure caused by the failure of a critical container may not be strictly necessary. Some applications would like to build their own HA architecture. However, it would be ideal for YARN to provide some fundamental support for HA services, such as: launching containers marked as active or standby, monitoring containers and triggering failover, providing an endpoint for sharing information between active and standby containers, etc.
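
The three kinds of support listed above (role-tagged launch, failure monitoring with failover, and a shared information endpoint) could be sketched roughly as follows. This is only an illustrative in-memory model in Python, not YARN code: `Role`, `Container`, `HAGroup`, `launch`, `check_and_failover`, and `shared_state` are all hypothetical names invented for this sketch, and a real design would live in the ResourceManager/ApplicationMaster and use a durable store.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class Container:
    # Hypothetical stand-in for a YARN container running a critical service.
    container_id: str
    role: Role
    healthy: bool = True


@dataclass
class HAGroup:
    """Hypothetical group of containers backing one HA service."""
    containers: List[Container] = field(default_factory=list)
    # Stand-in for the endpoint used to share information between
    # active and standby containers (assumed to outlive any one container).
    shared_state: Dict[str, str] = field(default_factory=dict)

    def launch(self, container_id: str, role: Role) -> Container:
        """Launch a container marked as active or standby."""
        if role is Role.ACTIVE and self.active() is not None:
            raise ValueError("group already has a healthy active container")
        container = Container(container_id, role)
        self.containers.append(container)
        return container

    def active(self) -> Optional[Container]:
        """Return the healthy active container, if there is one."""
        for c in self.containers:
            if c.role is Role.ACTIVE and c.healthy:
                return c
        return None

    def check_and_failover(self) -> Optional[Container]:
        """Monitor step: promote a standby if no healthy active exists.

        Returns the newly promoted container, or None if nothing changed
        or no healthy standby is available.
        """
        if self.active() is not None:
            return None
        # Drop failed containers, then promote the first remaining standby.
        self.containers = [c for c in self.containers if c.healthy]
        for c in self.containers:
            if c.role is Role.STANDBY:
                c.role = Role.ACTIVE
                return c
        return None


# Usage: one active and one standby container share a checkpoint marker.
group = HAGroup()
group.launch("container_01", Role.ACTIVE)
group.launch("container_02", Role.STANDBY)
group.shared_state["last_checkpoint"] = "42"

# Simulate failure of the active container, then run the monitor step.
group.containers[0].healthy = False
promoted = group.check_and_failover()
```

In this sketch the shared state is kept on the group rather than on any container, so the promoted standby can pick up where the failed active left off. Applications that prefer to build their own HA architecture would simply not use such a group.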