Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Duplicate
Description
Both HMasters went down abruptly with IllegalArgumentException: NO_REPLICA_FOUND, which caused "CorruptHFileException: Problem reading HFile Trailer from file".
Stack Trace:
2024-06-13 02:57:51,744 ERROR org.apache.hadoop.hbase.master.HMaster: Failed to become active master
java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1144)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1087)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:990)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:940)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7904)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7861)
    at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:307)
    at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:424)
    at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:122)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
    at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:284)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:334)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:306)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6365)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1110)
    at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1107)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
    at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349)
    at org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:123)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:706)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:364)
    at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:485)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:224)
    at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:262)
    ... 6 more
Caused by: java.lang.IllegalArgumentException: NO_REPLICA_FOUND
    at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
    at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:180)
    at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:161)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:342)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getBlockData(BlockInputStream.java:258)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:164)
    at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:370)
    at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
    at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:54)
    at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:96)
    at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
    at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:81)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:394)
    at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:339)
    ... 12 more
2024-06-13 02:57:51,745 ERROR org.apache.hadoop.hbase.master.HMaster: ***** ABORTING master vc0121.xyz.com,22001,1718272586518: Unhandled exception. Starting shutdown. *****
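The CorruptHFileException reported at the top of the trace is only a wrapper; the innermost "Caused by:" entry (IllegalArgumentException: NO_REPLICA_FOUND from the Ozone client) is the actual failure. When triaging similar master logs, something like the following Python sketch could pull out that innermost cause. It is a minimal sketch, not part of HBase or Ozone: the names `CAUSE_RE`, `root_cause`, and `SAMPLE` are hypothetical, the sample trace is abbreviated with placeholder frames, and the pattern assumes the exception message itself does not contain the word " at ".

```python
import re

# Matches "Caused by: <exception class>: <message>", stopping at the first
# stack frame (" at ...") or at end of line. This works whether the frames
# are on separate lines or flattened onto a single line.
CAUSE_RE = re.compile(
    r"Caused by: ([\w.$]+)(?::\s*(.*?))?(?=\s+at\s|\s*$)",
    re.MULTILINE,
)


def root_cause(trace):
    """Return (exception_class, message) of the innermost 'Caused by:' entry.

    Returns None when the text has no 'Caused by:' entry at all.
    """
    matches = CAUSE_RE.findall(trace)
    if not matches:
        return None
    exception_class, message = matches[-1]  # the last entry is the innermost cause
    return exception_class, message.strip()


# Abbreviated, illustrative trace (placeholder frames, not the full log).
SAMPLE = (
    "java.io.IOException: outer failure\n"
    "    at a.B.c(B.java:1)\n"
    "Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: "
    "Problem reading HFile Trailer from file ofs://example/path\n"
    "    at d.E.f(E.java:2)\n"
    "Caused by: java.lang.IllegalArgumentException: NO_REPLICA_FOUND\n"
    "    at g.H.i(H.java:3)\n"
    "    ... 12 more\n"
)

if __name__ == "__main__":
    print(root_cause(SAMPLE))
```

Taking the last match rather than the first is the point of the sketch: the first exception names the symptom (an unreadable HFile trailer), while the last names the storage-layer condition that produced it.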
Issue Links
- is fixed by
  - HDDS-11014 [hsync] Block finalization should also merge last chunk to blockDataTable (Resolved)
- is related to
  - HDDS-11014 [hsync] Block finalization should also merge last chunk to blockDataTable (Resolved)
- relates to
  - HDDS-7593 Supporting HSync and lease recovery (Resolved)