Description
This is on a two-node cluster. The NodeManager on one node (c6402) starts fine, but on the second node (c6403) it fails to start.
Build #398
[root@c6402 keytabs]# ambari-server --hash
c0167f89d7c293f39f62fbf9df0b6640bde25b51
[root@c6402 keytabs]#
The NodeManager on c6403 fails to start with this error:
2015-01-24 18:12:33,968 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 184, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/nodemanager.py", line 58, in start
    self.configure(env) # FOR SECURITY
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/nodemanager.py", line 45, in configure
    yarn(name="nodemanager")
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/yarn.py", line 77, in yarn
    Execute(format("chown -R {params.smokeuser} {directory}"))
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 151, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 117, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 277, in action_run
    raise ex
Fail: Execution of 'chown -R ambari-qa /hadoop/yarn/local/usercache/ambari-qa' returned 1. chown: cannot access `/hadoop/yarn/local/usercache/ambari-qa': No such file or directory
ON C6402 (where NODEMANAGER STARTS FINE):
[vagrant@c6402 ~]$ ls -l /hadoop/yarn/local/
total 12
drwxr-xr-x 4 yarn hadoop 4096 Jan 24 17:42 filecache
drwx------ 2 yarn hadoop 4096 Jan 24 17:44 nmPrivate
drwxr-xr-x 3 yarn hadoop 4096 Jan 24 17:41 usercache
[vagrant@c6402 ~]$ ls -l /hadoop/yarn/local/usercache/
total 4
drwxr-x--- 4 ambari-qa hadoop 4096 Jan 24 17:41 ambari-qa
[vagrant@c6402 ~]$
ON C6403 (where NODEMANAGER DOES NOT START):
[vagrant@c6403 ~]$ ls -l /hadoop/yarn/local/
total 12
drwxr-xr-x 2 yarn hadoop 4096 Jan 24 17:40 filecache
drwx------ 2 yarn hadoop 4096 Jan 24 17:40 nmPrivate
drwxr-xr-x 2 yarn hadoop 4096 Jan 24 17:40 usercache
[vagrant@c6403 ~]$ ls -l /hadoop/yarn/local/usercache/
total 0
[vagrant@c6403 ~]$
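
The two listings suggest the chown in yarn.py fails only because usercache/ambari-qa has not been created yet on c6403. One possible guard, shown here as a minimal standalone sketch and not as the actual Ambari change, is to skip the chown when the directory is absent. The function names build_chown_command and chown_if_present are hypothetical, as is the injectable runner parameter; the sketch uses only the Python standard library, not the resource_management API.

```python
import os
import subprocess


def build_chown_command(user, directory):
    """Return the chown argument list, or None if the directory is missing."""
    if not os.path.isdir(directory):
        return None
    return ["chown", "-R", user, directory]


def chown_if_present(user, directory, runner=subprocess.check_call):
    """Run 'chown -R user directory' only when the directory exists.

    Returns True if the command was run, False if it was skipped.
    'runner' is a hypothetical hook so the command can be stubbed out.
    """
    cmd = build_chown_command(user, directory)
    if cmd is None:
        # Nothing to chown yet, e.g. no container has run as this user.
        return False
    runner(cmd)
    return True
```

With this guard, a call such as chown_if_present("ambari-qa", "/hadoop/yarn/local/usercache/ambari-qa") would be skipped on c6403 and executed on c6402, assuming the directory layouts shown above.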