  1. Phoenix
  2. PHOENIX-7367

Snapshot-based MapReduce jobs fail after HBASE-28401


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2.1, 5.3.0
    • Component/s: None
    • Labels: None

    Description

      HBASE-28401 introduced a regression that makes HRegion#close throw an NPE while closing the memstore within the mapper.

      As a result, snapshot-based MR jobs have started failing in Phoenix.

      The failure occurs because TableSnapshotResultIterator ends up trying to release the read lock twice via HRegion#closeRegionOperation:

          • HRegion#closeRegionOperation releases the read lock successfully
          • HRegion#close then throws an IOException due to the memstore issue (HBASE-28401)
          • SnapshotScanner catches the IOException but doesn't set its region field to null
          • ScanningResultIterator's close is called again
          • Since the region field isn't null, HRegion#closeRegionOperation is called again and throws IllegalMonitorStateException while trying to release the read lock
          • The IllegalMonitorStateException then causes the whole mapper to fail
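
      The sequence above can be reproduced in isolation with a minimal sketch. FakeRegion and FakeScanner are hypothetical stand-ins, not the real HBase or Phoenix classes; the only assumption carried over is that the region operation is guarded by a read lock that is released in closeRegionOperation. The clearRegionFirst flag is an illustrative hardening that is not taken from the actual fix.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-ins; none of these are the real HBase/Phoenix types.
class DoubleCloseSketch {

    /** Mimics a region whose close() fails after the read lock was released. */
    static class FakeRegion {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void startRegionOperation() {
            lock.readLock().lock();
        }

        // Throws IllegalMonitorStateException if this thread holds no read lock.
        void closeRegionOperation() {
            lock.readLock().unlock();
        }

        void close() throws IOException {
            throw new IOException("simulated memstore close failure");
        }
    }

    /** Mimics a scanner that clears its region field only when close() succeeds. */
    static class FakeScanner {
        private FakeRegion region;
        private final boolean clearRegionFirst;

        FakeScanner(FakeRegion region, boolean clearRegionFirst) {
            this.region = region;
            this.clearRegionFirst = clearRegionFirst;
            region.startRegionOperation();
        }

        void close() {
            if (region == null) {
                return; // already closed
            }
            FakeRegion r = region;
            if (clearRegionFirst) {
                region = null; // illustrative hardening: forget the region up front
            }
            try {
                r.closeRegionOperation(); // releases the read lock
                r.close();                // throws in this sketch
                region = null;            // never reached when close() throws
            } catch (IOException e) {
                // Swallowed, as in the report; without the hardening the field stays set.
            }
        }
    }

    /** Closes the same scanner twice and reports what the second close did. */
    static String closeTwice(boolean clearRegionFirst) {
        FakeScanner scanner = new FakeScanner(new FakeRegion(), clearRegionFirst);
        scanner.close();
        try {
            scanner.close();
            return "ok";
        } catch (IllegalMonitorStateException e) {
            return "IllegalMonitorStateException";
        }
    }

    public static void main(String[] args) {
        System.out.println(closeTwice(false)); // second close unlocks a lock that is not held
        System.out.println(closeTwice(true));  // second close is a no-op
    }
}
```

      In the unguarded variant the second close reaches the read lock's unlock without holding it, which is the condition under which ReentrantReadWriteLock throws IllegalMonitorStateException.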

      Snapshot reads done directly via HBase don't fail (see HBASE-28743, where the same NPE was observed but the mapper still passed), because the closest equivalent code (the RecordReader within TableSnapshotInputFormat) doesn't try to close the region as part of its nextKeyValue method.
      This is generally much safer because record readers are always closed explicitly (even if the mapper's run method fails).

      Two improvements can be made here:
      1. Disable MSLAB for the region created within the snapshot (by setting hbase.hregion.memstore.mslab.enabled to false)
      2. In TableSnapshotResultIterator, remove the SnapshotScanner close (via ScanningResultIterator) that is called within the next method. It would be closed by the mapper at the end anyway
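
      The second improvement can be sketched as follows. SketchIterator and readAll are hypothetical names used only for illustration, not Phoenix code: next() never releases resources on exhaustion, and the caller (standing in for the mapper) closes the iterator exactly once in a finally block. Making close() idempotent is an extra assumption of this sketch, not something the issue prescribes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; not the actual TableSnapshotResultIterator.
class CloseByCallerSketch {

    static class SketchIterator implements AutoCloseable {
        private final Iterator<String> rows;
        private boolean open = true;
        private int releases = 0; // how many times resources were actually released

        SketchIterator(List<String> rows) {
            this.rows = rows.iterator();
        }

        /** Returns the next row, or null when exhausted; does NOT close anything. */
        String next() {
            return rows.hasNext() ? rows.next() : null;
        }

        /** Idempotent: only the first call releases resources. */
        @Override
        public void close() {
            if (!open) {
                return;
            }
            open = false;
            releases++;
        }

        int releases() {
            return releases;
        }
    }

    /** Stands in for the mapper: drains the iterator and closes it in finally. */
    static int readAll(List<String> rows) {
        SketchIterator it = new SketchIterator(rows);
        try {
            while (it.next() != null) {
                // process the row
            }
        } finally {
            it.close(); // runs even if processing throws
        }
        it.close(); // a repeated close is harmless in this sketch
        return it.releases();
    }

    public static void main(String[] args) {
        System.out.println(readAll(Arrays.asList("row1", "row2")));
    }
}
```

      Keeping the release in a single place owned by the caller means a failure inside close can no longer be followed by a second release attempt from next.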

      Attachments

        1. Screenshot 2024-07-19 at 8.18.06 PM.png
          2.19 MB
          Ujjawal Kumar
        2. Screenshot 2024-07-19 at 8.18.25 PM.png
          1.43 MB
          Ujjawal Kumar

            People

              Assignee: Ujjawal Kumar (ukumar)
              Reporter: Ujjawal Kumar (ukumar)