Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version/s: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha4
- Hadoop Flags: Reviewed
Description
If timeline service v2 is enabled and the NM is restarted with recovery enabled, the NM fails to start and throws the error "flow context cannot be null".
This happens because the flow context did not exist before, but now that timeline service v2 is enabled, ApplicationImpl expects it to exist.
It would also happen even if the flow context had existed before: since we neither persist it nor read it back during ContainerManagerImpl#recoverApplication, it never gets passed in to ApplicationImpl.
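The failure mode and one possible remedy can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the actual Hadoop code or the committed patch: `FlowContext`, `StateStore`, `App`, and `recoverApplication` are hypothetical stand-ins for the real NM classes, and the default flow context used for applications stored without one is an assumed fallback policy.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-ins; these are NOT the real Hadoop classes. */
final class FlowContextRecoverySketch {

  /** Minimal flow context: flow name, flow version, and flow run id. */
  static final class FlowContext {
    final String flowName;
    final String flowVersion;
    final long flowRunId;

    FlowContext(String flowName, String flowVersion, long flowRunId) {
      this.flowName = flowName;
      this.flowVersion = flowVersion;
      this.flowRunId = flowRunId;
    }
  }

  /** Stand-in for the NM state store: application id -> persisted flow context. */
  static final class StateStore {
    private final Map<String, FlowContext> flowContexts = new HashMap<>();

    /** Persists the flow context with the application (may be null). */
    void storeApplication(String appId, FlowContext ctx) {
      flowContexts.put(appId, ctx);
    }

    /** Returns the persisted flow context, or null if none was stored. */
    FlowContext loadFlowContext(String appId) {
      return flowContexts.get(appId);
    }
  }

  /** Mirrors the null check that fails in the reported stack trace. */
  static final class App {
    final String appId;
    final FlowContext flowContext;

    App(String appId, FlowContext flowContext, boolean timelineV2Enabled) {
      if (timelineV2Enabled && flowContext == null) {
        throw new IllegalArgumentException("flow context cannot be null");
      }
      this.appId = appId;
      this.flowContext = flowContext;
    }
  }

  /** Recovery path that reads the flow context back before building the app. */
  static App recoverApplication(StateStore store, String appId,
      boolean timelineV2Enabled) {
    FlowContext ctx = store.loadFlowContext(appId);
    if (ctx == null && timelineV2Enabled) {
      // Assumed fallback: the app was stored before timeline service v2 was
      // enabled, so synthesize a default context instead of failing startup.
      ctx = new FlowContext("flow_" + appId, "1", 0L);
    }
    return new App(appId, ctx, timelineV2Enabled);
  }

  public static void main(String[] args) {
    StateStore store = new StateStore();
    store.storeApplication("app_1", new FlowContext("etl", "2", 42L));
    store.storeApplication("app_2", null); // stored before v2 was enabled

    App recovered = recoverApplication(store, "app_1", true);
    System.out.println(recovered.flowContext.flowName);

    App legacy = recoverApplication(store, "app_2", true);
    System.out.println(legacy.flowContext.flowName);
  }
}
```

In this sketch, passing the loaded value straight to the `App` constructor without the fallback reproduces the reported `IllegalArgumentException` for `app_2`, which is the situation an NM hits when recovering applications that were stored before timeline service v2 was turned on.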
Full stack trace:
2017-05-03 21:51:52,178 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
java.lang.IllegalArgumentException: flow context cannot be null
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
Attachments
Issue Links
- is related to: YARN-6323 Rolling upgrade/config change is broken on timeline v2. (Resolved)