Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.0.0
- Labels: None
- Environment: Hadoop 3.0.0, HBase 2.0.0 + HBASE-20403. (hbase-site.xml) hbase.rs.prefetchblocksonopen=true
Description
Found the following exception thrown in an HBase RegionServer log (Hadoop 3.0.0 + HBase 2.0.0; the HBase prefetch bug HBASE-20403 was fixed on this cluster, but I am not sure whether that is related at all):
2018-07-11 11:10:44,462 WARN org.apache.hadoop.hbase.io.hfile.HFileReaderImpl: Stream moved/closed or prefetch cancelled?path=hdfs://ns1/hbase/data/default/IntegrationTestBigLinkedList_20180711003954/449fa9bf5a7483295493258b5af50abc/meta/e9de0683f8a9413a94183c752bea0ca5, offset=216505135, end=2309991906
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.net.NioInetPeer.getRemoteAddressString(NioInetPeer.java:99)
    at org.apache.hadoop.hdfs.net.EncryptedPeer.getRemoteAddressString(EncryptedPeer.java:105)
    at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.sendReadResult(BlockReaderRemote.java:330)
    at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.readNextPacket(BlockReaderRemote.java:233)
    at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.read(BlockReaderRemote.java:165)
    at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1050)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:992)
    at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1348)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1312)
    at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:331)
    at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:92)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock.positionalReadWithExtra(HFileBlock.java:805)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readAtOffset(HFileBlock.java:1565)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1769)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1594)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1488)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$1.run(HFileReaderImpl.java:278)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
The relevant Hadoop code:
BlockReaderRemote#sendReadResult
void sendReadResult(Status statusCode) {
  assert !sentStatusCode : "already sent status code to " + peer;
  try {
    writeReadResult(peer.getOutputStream(), statusCode);
    sentStatusCode = true;
  } catch (IOException e) {
    // It's ok not to be able to send this. But something is probably wrong.
    LOG.info("Could not send read status (" + statusCode + ") to datanode " +
        peer.getRemoteAddressString() + ": " + e.getMessage());
  }
}
So the NPE was thrown from within an exception handler. A possible explanation is that the socket had been closed, so the client could not write the read status, and Socket#getRemoteSocketAddress() returns null when the socket is closed.
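The top frame of the stack trace is NioInetPeer.getRemoteAddressString, which, as far as I can tell from the Hadoop 3.0.0 source, dereferences the remote address without a null check, roughly (paraphrased; the exact lines may differ):
NioInetPeer#getRemoteAddressString
@Override
public String getRemoteAddressString() {
  // socket is the channel-backed java.net.Socket wrapped by this peer;
  // Socket#getRemoteSocketAddress() can return null once the socket is closed,
  // which would make this call throw the NPE seen above.
  return socket.getRemoteSocketAddress().toString();
}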
Suggest checking for null and returning an empty string in NioInetPeer.getRemoteAddressString.
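A minimal sketch of that suggestion (not a tested patch; the field and method names are taken from NioInetPeer as I read it and may need adjusting):
@Override
public String getRemoteAddressString() {
  // getRemoteSocketAddress() returns null once the underlying socket has been
  // closed, so return an empty string instead of dereferencing a null address.
  java.net.SocketAddress address = socket.getRemoteSocketAddress();
  return address == null ? "" : address.toString();
}
With this guard, the LOG.info call in BlockReaderRemote#sendReadResult would log an empty address instead of masking the original IOException with an NPE.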