Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Reviewed
Description
This is a very annoying bug, I have two clusters running on the same machine so I often stop the wrong one (both are for testing). When it happens, it leaves a HMaster running with this:
"main" prio=10 tid=0x0000000041dd3800 nid=0x55d5 in Object.wait() [0x00007f3c7165a000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00007f3bac94f528> (a java.lang.Object) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:382) - locked <0x00007f3bac94f528> (a java.lang.Object) at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:90) at org.apache.hadoop.hbase.master.HMasterCommandLine.stopMaster(HMasterCommandLine.java:160) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1058)
And if I happen to restart that cluster right away, IT KILLS IT!