Ambari / AMBARI-19344

HOU Fails To Restart NameNode in non-HA Cluster

Details

    Description

      Steps

      1. Deploy HDP-2.5.0.0 with Ambari-2.5.0.0-547 build (non-HA cluster)
      2. Start Host Ordered Upgrade to HDP-2.5.3

      Result: The upgrade fails at the pre-upgrade HDFS step with the following error:

      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/custom_actions/scripts/ru_execute_tasks.py", line 156, in <module>
          ExecuteUpgradeTasks().execute()
        File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 287, in execute
          method(env)
        File "/var/lib/ambari-agent/cache/custom_actions/scripts/ru_execute_tasks.py", line 153, in actionexecute
          shell.checked_call(task.command, logoutput=True, quiet=True)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
          result = function(command, **kwargs)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
          tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
          result = _call(command, **kwargs_copy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
          raise ExecutionFailed(err_msg, code, out, err)
      resource_management.core.exceptions.ExecutionFailed: Execution of 'source /var/lib/ambari-agent/ambari-env.sh ; /usr/bin/ambari-python-wrap /var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py prepare_rolling_upgrade /var/lib/ambari-agent/data/command-111.json /var/lib/ambari-agent/cache/custom_actions /var/lib/ambari-agent/data/structured-out-111.json INFO /var/lib/ambari-agent/tmp' returned 1. 2017-01-03 17:07:47,095 - In the middle of a stack upgrade/downgrade for Stack HDP and destination version 2.5.3.0-37, determining which hadoop conf dir to use.
      2017-01-03 17:07:47,096 - Hadoop conf dir: /usr/hdp/2.5.3.0-37/hadoop/conf
      2017-01-03 17:07:47,096 - The hadoop conf dir /usr/hdp/2.5.3.0-37/hadoop/conf exists, will call conf-select on it for version 2.5.3.0-37
      2017-01-03 17:07:47,096 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
      2017-01-03 17:07:47,097 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
      2017-01-03 17:07:47,141 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
      2017-01-03 17:07:47,141 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
      2017-01-03 17:07:47,179 - checked_call returned (0, '')
      2017-01-03 17:07:47,180 - Ensuring that hadoop has the correct symlink structure
      2017-01-03 17:07:47,180 - Using hadoop conf dir: /usr/hdp/2.5.3.0-37/hadoop/conf
      2017-01-03 17:07:47,182 - Stack Feature Version Info: stack_version=2.5, version=2.5.3.0-37, current_cluster_version=2.5.0.0-1245, upgrade_direction=upgrade -> 2.5.3.0-37
      2017-01-03 17:07:47,185 - In the middle of a stack upgrade/downgrade for Stack HDP and destination version 2.5.3.0-37, determining which hadoop conf dir to use.
      2017-01-03 17:07:47,185 - Hadoop conf dir: /usr/hdp/2.5.3.0-37/hadoop/conf
      2017-01-03 17:07:47,186 - The hadoop conf dir /usr/hdp/2.5.3.0-37/hadoop/conf exists, will call conf-select on it for version 2.5.3.0-37
      2017-01-03 17:07:47,186 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
      2017-01-03 17:07:47,187 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
      2017-01-03 17:07:47,224 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
      2017-01-03 17:07:47,225 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
      2017-01-03 17:07:47,262 - checked_call returned (0, '')
      2017-01-03 17:07:47,263 - Ensuring that hadoop has the correct symlink structure
      2017-01-03 17:07:47,263 - Using hadoop conf dir: /usr/hdp/2.5.3.0-37/hadoop/conf
      2017-01-03 17:07:47,271 - checked_call['rpm -q --queryformat '%{version}-%{release}' hdp-select | sed -e 's/\.el[0-9]//g''] {'stderr': -1}
      2017-01-03 17:07:47,382 - checked_call returned (0, '2.5.3.0-37', '')
      2017-01-03 17:07:47,385 - Performing a(n) upgrade of HDFS
      2017-01-03 17:07:47,385 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://vs-hucluster2-5.openstacklocal:8020 -rollingUpgrade prepare'] {'logoutput': True, 'user': 'cstm-hdfs'}
      ######## Hortonworks #############
      This is MOTD message, added for testing in qe infra
      PREPARE rolling upgrade ...
      17/01/03 17:07:51 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.rollingUpgrade over null. Not retrying because try once and fail.
      org.apache.hadoop.ipc.RemoteException(java.io.IOException): Safe mode should be turned ON in order to create namespace image.
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startRollingUpgradeInternalForNonHA(FSNamesystem.java:7839)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startRollingUpgrade(FSNamesystem.java:7799)
      	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollingUpgrade(NameNodeRpcServer.java:1273)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rollingUpgrade(ClientNamenodeProtocolServerSideTranslatorPB.java:808)
      	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
      
      	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1496)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1396)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
      	at com.sun.proxy.$Proxy16.rollingUpgrade(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rollingUpgrade(ClientNamenodeProtocolTranslatorPB.java:773)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
      	at com.sun.proxy.$Proxy17.rollingUpgrade(Unknown Source)
      	at org.apache.hadoop.hdfs.DFSClient.rollingUpgrade(DFSClient.java:2999)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.rollingUpgrade(DistributedFileSystem.java:1394)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin$RollingUpgradeCommand.run(DFSAdmin.java:375)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1932)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
      rollingUpgrade: Safe mode should be turned ON in order to create namespace image.
      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 420, in <module>
          NameNode().execute()
        File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 287, in execute
          method(env)
        File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 180, in prepare_rolling_upgrade
          namenode_upgrade.prepare_rolling_upgrade(hfds_binary)
        File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode_upgrade.py", line 244, in prepare_rolling_upgrade
          logoutput=True)
        File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
          self.env.run()
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
          self.run_action(resource, action)
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
          provider_action()
        File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
          tries=self.resource.tries, try_sleep=self.resource.try_sleep)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
          result = function(command, **kwargs)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
          tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
          result = _call(command, **kwargs_copy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
          raise ExecutionFailed(err_msg, code, out, err)
      resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://vs-hucluster2-5.openstacklocal:8020 -rollingUpgrade prepare' returned 255. ######## Hortonworks #############
      This is MOTD message, added for testing in qe infra
      PREPARE rolling upgrade ...
      17/01/03 17:07:51 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.rollingUpgrade over null. Not retrying because try once and fail.
      org.apache.hadoop.ipc.RemoteException(java.io.IOException): Safe mode should be turned ON in order to create namespace image.
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startRollingUpgradeInternalForNonHA(FSNamesystem.java:7839)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startRollingUpgrade(FSNamesystem.java:7799)
      	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollingUpgrade(NameNodeRpcServer.java:1273)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rollingUpgrade(ClientNamenodeProtocolServerSideTranslatorPB.java:808)
      	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
      
      	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1496)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1396)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
      	at com.sun.proxy.$Proxy16.rollingUpgrade(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rollingUpgrade(ClientNamenodeProtocolTranslatorPB.java:773)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
      	at com.sun.proxy.$Proxy17.rollingUpgrade(Unknown Source)
      	at org.apache.hadoop.hdfs.DFSClient.rollingUpgrade(DFSClient.java:2999)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.rollingUpgrade(DistributedFileSystem.java:1394)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin$RollingUpgradeCommand.run(DFSAdmin.java:375)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1932)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
      	at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
      rollingUpgrade: Safe mode should be turned ON in order to create namespace image.
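
The log above shows `hdfs dfsadmin -rollingUpgrade prepare` being rejected by `startRollingUpgradeInternalForNonHA` because the NameNode is not in safe mode. As a minimal sketch of one way to avoid that, and not necessarily what the attached patch does, the prepare step could enter safe mode first when the cluster is non-HA. The function name `build_prepare_commands` and its parameters are hypothetical. Only the `dfsadmin` subcommands `-safemode enter` and `-rollingUpgrade prepare` are real HDFS commands.

```python
def build_prepare_commands(hdfs_binary, namenode_address, is_ha):
    """Return the dfsadmin command lines to run, in order, to prepare a rolling upgrade.

    Hypothetical helper for illustration only; it is not part of Ambari.
    """
    # Same prefix as in the failing Execute[...] call from the log.
    dfsadmin = [hdfs_binary, "dfsadmin", "-fs", namenode_address]

    commands = []
    if not is_ha:
        # Assumption: on a non-HA cluster, the NameNode must be in safe mode
        # before it can create the rollback namespace image.
        commands.append(dfsadmin + ["-safemode", "enter"])
    commands.append(dfsadmin + ["-rollingUpgrade", "prepare"])
    return commands


if __name__ == "__main__":
    for cmd in build_prepare_commands(
        "/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs",
        "hdfs://vs-hucluster2-5.openstacklocal:8020",
        is_ha=False,
    ):
        print(" ".join(cmd))
```

The sketch only builds the command lines and does not run them. Whether safe mode should be left again after the image is created, and at which point in the upgrade, depends on the upgrade pack and is not covered here.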
      

      Attachments

        1. AMBARI-19344.patch
          22 kB
          Jonathan Hurley

            People

              jonathanhurley Jonathan Hurley
              shavi71 Vivek Sharma