Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.18.1
- None
Description
I am seeing these exceptions on our cluster in output from tablemap/tablereduce jobs:
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> at java.io.DataInputStream.readFully(DataInputStream.java:175)
> at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
> at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1933)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1833)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
> at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:516)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getNext(StoreFileScanner.java:312)
When OOMEs like the one above occur, the cluster does not recover without manual intervention. The regionservers sometimes go down afterward; other times they stay up in an unhealthy state for a while. Regions go offline and remain unavailable.
Issue Links
- relates to HBASE-1045 Hangup by regionserver causes write to fail (Closed)