Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.99.0, hbase-10070
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      We sometimes see the following stack trace in test logs (TestReplicasClient), but the problem is not specific to that test:

      2014-03-26 21:44:18,662 ERROR [RS_OPEN_REGION-c64-s12:35852-2] handler.OpenRegionHandler(481): Failed open of region=TestReplicasClient,,1395895445056_0001.5f8b8db27e36d2dde781193d92a05730., starting to roll back the global memstore size.
      java.io.IOException: java.io.IOException: java.io.FileNotFoundException: File does not exist: hdfs://localhost:56276/user/jenkins/hbase/data/default/TestReplicasClient/856934fb87781c9030975706b66137a5/info/589000f197b048e0897e1d81dd7e3a90
        at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:739)
        at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:646)
        at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:617)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4447)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4417)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4389)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4345)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4296)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:465)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:139)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
      Caused by: java.io.IOException: java.io.FileNotFoundException: File does not exist: hdfs://localhost:56276/user/jenkins/hbase/data/default/TestReplicasClient/856934fb87781c9030975706b66137a5/info/589000f197b048e0897e1d81dd7e3a90
        at org.apache.hadoop.hbase.regionserver.HStore.openStoreFiles(HStore.java:531)
        at org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:486)
        at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:254)
        at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3357)
        at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:710)
        at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:707)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        ... 3 more
      Caused by: java.io.FileNotFoundException: File does not exist: hdfs://localhost:56276/user/jenkins/hbase/data/default/TestReplicasClient/856934fb87781c9030975706b66137a5/info/589000f197b048e0897e1d81dd7e3a90
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
        at org.apache.hadoop.hbase.regionserver.StoreFileInfo.<init>(StoreFileInfo.java:95)
        at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:600)
        at org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:121)
        at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:506)
        at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:503)
        ... 8 more
      

      The region replica fails to open because the primary region is running a compaction at the same time. The compacted file is moved to the archive directory between the moment the secondary lists the store files and the moment it opens them.

      The secondary region should be able to deal with this through its use of StoreFileInfo and HFile, but because we reconstruct the StoreFileInfo object a second time between HStore.openStoreFiles() and createStoreFileAndReader(), the second construction looks up the file status again (StoreFileInfo.<init> in the last cause above) and we get this exception.
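
      The race and one way to tolerate it can be sketched outside HBase. The following is a minimal illustration using only java.nio, not the actual patch: the class name, the resolve() and listNames() helpers, and the "data"/"archive" directory names are all hypothetical. It assumes the approach hinted at by the patch title (Use HFileLink in opening region files), namely resolving a listed file against more than one candidate location instead of assuming it is still where the listing saw it.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class ArchiveFallbackSketch {

    // Hypothetical helper (not an HBase API): return the first candidate
    // directory in which the named file currently exists.
    static Path resolve(String fileName, List<Path> candidateDirs) throws FileNotFoundException {
        for (Path dir : candidateDirs) {
            Path candidate = dir.resolve(fileName);
            if (Files.exists(candidate)) {
                return candidate;
            }
        }
        throw new FileNotFoundException("File does not exist: " + fileName);
    }

    // List the file names in a directory, as the secondary does before opening them.
    static List<String> listNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> names.add(p.getFileName().toString()));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("replica-sketch");
        Path dataDir = Files.createDirectory(root.resolve("data"));
        Path archiveDir = Files.createDirectory(root.resolve("archive"));
        Files.write(dataDir.resolve("storefile-1"), new byte[] {1, 2, 3});

        // Step 1: the secondary lists the store files.
        List<String> listed = listNames(dataDir);

        // Step 2: meanwhile, the primary's compaction archives the file.
        Files.move(dataDir.resolve("storefile-1"), archiveDir.resolve("storefile-1"));

        // Step 3: the secondary opens what it listed. Checking only dataDir
        // would fail here; falling back to archiveDir still finds the file.
        for (String name : listed) {
            Path opened = resolve(name, Arrays.asList(dataDir, archiveDir));
            System.out.println(name + " resolved under " + opened.getParent().getFileName());
        }
    }
}
```

      In this sketch the file resolves under the archive directory even though it was listed in the data directory. A file that is missing from every candidate location still surfaces as a FileNotFoundException.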

      Attachments

        1. hbase-10859_v1.patch
          12 kB
          Enis Soztutar
        2. hbase-10859_v2.patch
          22 kB
          Enis Soztutar
        3. 0030-HBASE-10859-Use-HFileLink-in-opening-region-files-fr.patch
          26 kB
          Enis Soztutar

          People

            Assignee: Enis Soztutar (enis)
            Reporter: Enis Soztutar (enis)