  1. HBase
  2. HBASE-11728

Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.96.1.1, 0.98.4
    • Fix Version/s: 0.99.0, 0.98.6
    • Component/s: Scanners
    • Labels: None
    • Environment: ubuntu12
      hadoop-2.2.0
      Hbase-0.96.1.1
      SUN-JDK(1.7.0_06-b24)

    • Hadoop Flags: Reviewed

    Description

      To test scanning, I prepared some data as below:

      Table description (using prefix-tree encoding):
      'prefix_tree_test',

      {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', TTL => '15552000'}

      and I put 5 rows:
      (RowKey , Qualifier, Value)
      'a-b-0-0', 'qf_1', 'c1-value'
      'a-b-A-1', 'qf_1', 'c1-value'
      'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
      'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
      'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'

      Then I scanned the row keys between 'a-b-A-1' and 'a-b-A-1:' and got the correct result:
      Test 1:
      Scan scan = new Scan();

      scan.setStartRow("a-b-A-1".getBytes());
      scan.setStopRow("a-b-A-1:".getBytes());
      ------------------------------------------------------
      'a-b-A-1', 'qf_1', 'c1-value'
      'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
      'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
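
The stop row 'a-b-A-1:' works as an upper bound here because row keys are compared byte by byte and '-' (0x2D) sorts before ':' (0x3A). The following is a minimal sketch of that ordering in plain Java, not HBase code; the class name RowKeyOrderSketch and the helpers compareRows and inRange are hypothetical names introduced only for illustration.

```java
import java.nio.charset.StandardCharsets;

class RowKeyOrderSketch {
    // Unsigned lexicographic comparison of two byte arrays. A key that is a
    // strict prefix of another key sorts before it.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when start <= row < stop (start inclusive, stop exclusive).
    static boolean inRange(String row, String start, String stop) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] s = start.getBytes(StandardCharsets.UTF_8);
        byte[] e = stop.getBytes(StandardCharsets.UTF_8);
        return compareRows(r, s) >= 0 && compareRows(r, e) < 0;
    }

    public static void main(String[] args) {
        String[] rows = {
            "a-b-0-0",
            "a-b-A-1",
            "a-b-A-1-1402329600-1402396277",
            "a-b-A-1-1402397227-1402415999",
            "a-b-B-2-1402397300-1402416535"
        };
        // Print which of the five rows fall inside the Test 1 range.
        for (String row : rows) {
            System.out.println(row + " in range: " + inRange(row, "a-b-A-1", "a-b-A-1:"));
        }
    }
}
```

Under this ordering, only the three rows starting with 'a-b-A-1' fall inside the Test 1 range, which matches the result listed above.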

      Next, I restricted the scan to a single column with addColumn:
      Test 2:
      Scan scan = new Scan();
      scan.addColumn(Bytes.toBytes("cf_1") , Bytes.toBytes("qf_2"));

      scan.setStartRow("a-b-A-1".getBytes());
      scan.setStopRow("a-b-A-1:".getBytes());
      ----------------------------------------------
      expected:
      'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
      'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'

      but I actually got nothing. Then I changed the addColumn call to scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and got the expected result 'a-b-A-1', 'qf_1', 'c1-value'.

      Then I did more testing. I changed the case so that the start row is greater than 'a-b-A-1':
      Test 3:
      Scan scan = new Scan();

      scan.setStartRow("a-b-A-1-".getBytes());
      scan.setStopRow("a-b-A-1:".getBytes());
      ------------------------------------------------------
      expected:
      'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
      'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'

      but I actually got nothing again. I then changed the start row to be greater than 'a-b-A-1-1402329600-1402396277':

      Scan scan = new Scan();
      scan.setStartRow("a-b-A-1-140239".getBytes());
      scan.setStopRow("a-b-A-1:".getBytes());

      and I got the expected row:
      'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'

      So I think it may be a bug in the prefix-tree encoding. It happens only after the data has been flushed to a store file; the results are correct while the data is still in the memstore.
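
For reference, the results a correct scan should return in each test can be modelled with a sorted map. This is only a sketch of the expected semantics (start row inclusive, stop row exclusive, optional qualifier selection) in plain Java, not the HBase implementation and not the attached TestPrefixTree.java. The names ExpectedScanSketch and expectedScan are hypothetical, and the sketch assumes ASCII row keys, for which String ordering agrees with byte ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ExpectedScanSketch {
    // The five cells from the report: {row key, qualifier, value}.
    // Each row holds exactly one cell in this data set.
    static final String[][] CELLS = {
        {"a-b-0-0", "qf_1", "c1-value"},
        {"a-b-A-1", "qf_1", "c1-value"},
        {"a-b-A-1-1402329600-1402396277", "qf_2", "c2-value"},
        {"a-b-A-1-1402397227-1402415999", "qf_2", "c2-value-2"},
        {"a-b-B-2-1402397300-1402416535", "qf_2", "c2-value-3"}
    };

    // Returns the row keys in [start, stop) whose cell matches the qualifier.
    // A null qualifier means no column restriction.
    static List<String> expectedScan(String start, String stop, String qualifier) {
        TreeMap<String, String> qualifierByRow = new TreeMap<>();
        for (String[] cell : CELLS) {
            qualifierByRow.put(cell[0], cell[1]);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e
                : qualifierByRow.subMap(start, true, stop, false).entrySet()) {
            if (qualifier == null || qualifier.equals(e.getValue())) {
                rows.add(e.getKey());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println("Test 2: " + expectedScan("a-b-A-1", "a-b-A-1:", "qf_2"));
        System.out.println("Test 3: " + expectedScan("a-b-A-1-", "a-b-A-1:", null));
        System.out.println("Last:   " + expectedScan("a-b-A-1-140239", "a-b-A-1:", null));
    }
}
```

In this model Test 2 and Test 3 both yield the two 'a-b-A-1-...' rows, while the final scan yields only 'a-b-A-1-1402397227-1402415999'. The empty results reported above for Test 2 and Test 3 are therefore the deviation.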

      Attachments

        1. TestPrefixTree.java
          5 kB
          ramkrishna.s.vasudevan
        2. HFileAnalys.java
          2 kB
          wuchengzhi
        3. HBASE-11728.patch
          5 kB
          ramkrishna.s.vasudevan
        4. HBASE-11728_4.patch
          11 kB
          ramkrishna.s.vasudevan
        5. HBASE-11728_3.patch
          11 kB
          ramkrishna.s.vasudevan
        6. HBASE-11728_2.patch
          10 kB
          ramkrishna.s.vasudevan
        7. HBASE-11728_1.patch
          11 kB
          ramkrishna.s.vasudevan
        8. 29cb562fad564b468ea9d61a2d60e8b0
          1 kB
          wuchengzhi

            People

              Assignee: ram_krish ramkrishna.s.vasudevan
              Reporter: bdifn wuchengzhi
              Votes: 0
              Watchers: 10

                Time Tracking

                  Original Estimate: 72h
                  Remaining Estimate: 72h
                  Time Spent: Not Specified