Description
In RM-HA env, kill ZK leader and then perform RM failover.
Sometimes, active RM gets NPE and fail to come up successfully
2018-06-08 10:31:03,007 INFO client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(289)) - Client will use GSSAPI as SASL mechanism. 2018-06-08 10:31:03,008 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server xxx/xxx:2181. Will attempt to SASL-authenticate using Login Context section 'Client' 2018-06-08 10:31:03,009 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1146)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125) 2018-06-08 10:31:03,344 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService failed in state INITED java.lang.NullPointerException at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1033) at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1030) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1087) at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:1030) at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:347) at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.serviceInit(ActiveStandbyElectorBasedElectorService.java:110) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:336) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1479) 2018-06-08 10:31:03,345 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:quitElection(409)) - Yielding from election