Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 2.8.0, 2.7.1, 3.0.0-alpha1
- None
- Reviewed
Description
When a WebHDFS client-side exception occurs (for example, a read timeout), the exception carries no details beyond the fact that a timeout occurred. Ideally it should say which node is responsible for the timeout; failing that, it should at least say which node the client is talking to, so that node's logs can be examined to investigate further.
java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:150)
	at java.net.SocketInputStream.read(SocketInputStream.java:121)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	at sun.net.www.MeteredStream.read(MeteredStream.java:134)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3035)
	at org.apache.commons.io.input.BoundedInputStream.read(BoundedInputStream.java:121)
	at org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:188)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	at com.yahoo.grid.tools.util.io.ThrottledBufferedInputStream.read(ThrottledBufferedInputStream.java:58)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at com.yahoo.grid.replication.distcopy.tasklet.HFTPDistributedCopy.copyBytes(HFTPDistributedCopy.java:495)
	at com.yahoo.grid.replication.distcopy.tasklet.HFTPDistributedCopy.doCopy(HFTPDistributedCopy.java:440)
	at com.yahoo.grid.replication.distcopy.tasklet.HFTPDistributedCopy.access$200(HFTPDistributedCopy.java:57)
	at com.yahoo.grid.replication.distcopy.tasklet.HFTPDistributedCopy$1.doExecute(HFTPDistributedCopy.java:387)
	... 12 more
The trace gives no clue as to which datanode the client was talking to, or which datanode was responsible for the timeout.
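One possible approach, sketched here under assumptions, is to wrap the IOException where the client reads from the connection and prepend the host and port of the URL in use. The class and method names (`RemoteHostExceptions`, `withRemoteHost`) and the host `dn1.example.com:50075` are hypothetical, and this is not necessarily how the committed fix works.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, not part of Hadoop: adds the remote host:port to an
// IOException message so the failing node can be identified from the trace.
final class RemoteHostExceptions {

    static IOException withRemoteHost(IOException e, URL url) {
        // Fall back to the protocol's default port when the URL omits one.
        int port = url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
        String msg = url.getHost() + ":" + port + ": " + e.getMessage();

        IOException wrapped;
        try {
            // Keep the original type so callers that catch
            // SocketTimeoutException (for example) still match.
            wrapped = e.getClass().getConstructor(String.class).newInstance(msg);
        } catch (ReflectiveOperationException | RuntimeException ex) {
            // No usable (String) constructor: fall back to a plain IOException.
            wrapped = new IOException(msg);
        }
        // Preserve the original exception and its stack trace as the cause.
        wrapped.initCause(e);
        return wrapped;
    }

    public static void main(String[] args) throws Exception {
        // Assumed example URL; a real client would use the URL of the open connection.
        URL url = URI.create("http://dn1.example.com:50075/webhdfs/v1/tmp/f").toURL();
        IOException w = withRemoteHost(new SocketTimeoutException("Read timed out"), url);
        System.out.println(w.getClass().getName() + ": " + w.getMessage());
    }
}
```

With a wrapper like this around the read path, the first line of the trace would name the datanode, and the original exception would still appear as the cause.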