Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
Here is a hungup regionserver stuck on the running of the dfs shutdown thread:
"Thread-11" prio=10 tid=0x00007fcd00a9b400 nid=0x751d waiting on condition [0x0000000042458000..0x0000000042458d00] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.ipc.Client.stop(Client.java:667) at org.apache.hadoop.ipc.RPC$ClientCache.stopClient(RPC.java:189) at org.apache.hadoop.ipc.RPC$ClientCache.access$400(RPC.java:138) at org.apache.hadoop.ipc.RPC$Invoker.close(RPC.java:229) - locked <0x00007fcd06c6b470> (a org.apache.hadoop.ipc.RPC$Invoker) at org.apache.hadoop.ipc.RPC$Invoker.access$500(RPC.java:196) at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:353) at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:213) - locked <0x00007fcd06c6b3a0> (a org.apache.hadoop.hdfs.DFSClient) at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413) - locked <0x00007fcd06ab9b68> (a org.apache.hadoop.fs.FileSystem$Cache) at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236) at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221) - locked <0x00007fcd06aaeee0> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
Above is just doing this:
// wait until all connections are closed while (!connections.isEmpty()) { try { Thread.sleep(100); } catch (InterruptedException e) { } }
Might never go down or wont' go down promptly.
Should interrupt threads if join timesout and just continue with exit.