Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-809

Deletall doesn't and inserts after delete don't work as expected.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.2.0
    • 0.2.1, 0.18.0
    • None
    • None

    Description

      HBASE-808 describes program run in the below snippet from the list. Describes unexpected behaviors.

      ..
      
      Test 2
      
      A bit more surprising: I delete my row, using the delete-all command in
      shell:
      
      
      # SHELL 
      
      hbase(main):001:0> scan 'proxy-0.2'
      ROW                          COLUMN+CELL
       testrow                     column=bytes:, timestamp=1100, value=valbytes
      ts 1100
       testrow                     column=status:, timestamp=1100, value=valstat
      ts1100
      2 row(s) in 0.3560 seconds
      hbase(main):002:0> deleteall 'proxy-0.2', 'testrow'
      0 row(s) in 0.1050 seconds
      hbase(main):003:0> scan 'proxy-0.2'
      ROW                          COLUMN+CELL
      0 row(s) in 0.2540 seconds
      
      
      The table is now empty, and if I try to launch my dumpRowHistory() method,
      the emptiness is confirmed. Ok. Now I launch my test 1 again. Restarting
      from timestamp 1000:
      
      
      # OUTPUT
      
      > > Connecting to hbase master...
       > -- Inserted ts 1000
       > Versions or row : testrow
       > -- Inserted ts 1010
       > Versions or row : testrow
       > -- Inserted ts 1020
       > Versions or row : testrow
       > -- Inserted ts 1030
       > Versions or row : testrow
       > -- Inserted ts 1040
       > Versions or row : testrow
       > -- Inserted ts 1050
       > Versions or row : testrow
       > -- Inserted ts 1060
       > Versions or row : testrow
       > -- Inserted ts 1070
       > Versions or row : testrow
      
      
      It seems that the row are not inserted. Querying from shell:
      
      
      # SHELL 
      
      hbase(main):004:0> scan 'proxy-0.2'
      ROW                          COLUMN+CELL
      0 row(s) in 0.2030 seconds
      
      
      But, If I allow the program to make more iterations than the first time (ts
      > > 1100), the newest timestamps are taken in account. As if the table
      remembers of the previous maximum value of the timestamp:
      
      Relaunching the code of Test 1 :
      
      
      # OUTPUT
      
      > > Connecting to hbase master...
       > -- Inserted ts 1000
       > Versions or row : testrow
       > -- Inserted ts 1010
       > Versions or row : testrow
       > -- Inserted ts 1020
       > Versions or row : testrow
       > -- Inserted ts 1030
       > Versions or row : testrow
       > -- Inserted ts 1040
       > Versions or row : testrow
       > -- Inserted ts 1050
       > Versions or row : testrow
       > -- Inserted ts 1060
       > Versions or row : testrow
       > -- Inserted ts 1070
       > Versions or row : testrow
       > -- Inserted ts 1080
       > Versions or row : testrow
       > -- Inserted ts 1090
       > Versions or row : testrow
       > -- Inserted ts 1100
       > Versions or row : testrow
       > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100]
       > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
      ts1090 [1090]
       > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
      ts1080 [1080]
       > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
      ts1070 [1070]
       > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
      ts1060 [1060]
       > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
      ts1050 [1050]
       > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
      ts1040 [1040]
       > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
      ts1030 [1030]
       > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
      ts1020 [1020]
       > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
      ts1010 [1010]
       > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
      ts1000 [1000]
       > -- Inserted ts 1110
       > Versions or row : testrow
       > #1 MXTS[1110] bytes: => valbytes ts 1110 [1110]
       > #2 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
      ts1100 [1100]
       > #3 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
      ts1090 [1090]
       > #4 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
      ts1080 [1080]
       > #5 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
      ts1070 [1070]
       > #6 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
      ts1060 [1060]
       > #7 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
      ts1050 [1050]
       > #8 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
      ts1040 [1040]
       > #9 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
      ts1030 [1030]
       > #10 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
      ts1020 [1020]
       > #11 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
      ts1010 [1010]
       > #12 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
      ts1000 [1000]
       > -- Inserted ts 1120
       > Versions or row : testrow
       > #1 MXTS[1120] bytes: => valbytes ts 1120 [1120]
       > #2 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat
      ts1110 [1110]
       > #3 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
      ts1100 [1100]
       > #4 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
      ts1090 [1090]
       > #5 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
      ts1080 [1080]
       > #6 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
      ts1070 [1070]
       > #7 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
      ts1060 [1060]
       > #8 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
      ts1050 [1050]
       > #9 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
      ts1040 [1040]
       > #10 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
      ts1030 [1030]
       > #11 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
      ts1020 [1020]
       > #12 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
      ts1010 [1010]
       > #13 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
      ts1000 [1000]
       > -- Inserted ts 1130
       > Versions or row : testrow
       > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130]
       > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat
      ts1120 [1120]
       > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat
      ts1110 [1110]
       > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
      ts1100 [1100]
       > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
      ts1090 [1090]
       > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
      ts1080 [1080]
       > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
      ts1070 [1070]
       > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
      ts1060 [1060]
       > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
      ts1050 [1050]
       > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
      ts1040 [1040]
       > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
      ts1030 [1030]
       > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
      ts1020 [1020]
       > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
      ts1010 [1010]
       > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
      ts1000 [1000]
      
      
      Since the timestamp reachs a newest value, the row is inserted. Moreover,
      the previous insertions appears !
      
      Notice another problem: the last insertion is missing one cell: the
      'status:' column.
      
      Using shell to scan the table give the same result:
      
      
      # SHELL
      hbase(main):003:0> scan 'proxy-0.2'
      ROW                          COLUMN+CELL
       testrow                     column=bytes:, timestamp=1130, value=valbytes
      ts 1130
      
      
      Relauching hbase with the stop-hbase.sh / start-hbase.sh scripts yields to
      another unexpected behaviour:
      
      When I run the scan command in the shell, I have the same result than above:
      
      
      # SHELL
      hbase(main):001:0> scan 'proxy-0.2'
      ROW                          COLUMN+CELL
       testrow                     column=bytes:, timestamp=1130, value=valbytes
      ts 1130
      
      
      but if I launch the dumpRowHistory method it appears that most of history of
      the status: column is lost.
      
      Notice that I tried many times and I never had the same behaviour twice
      here, sometime the other column is missing, or the row is entirely lost
      giving no result at all.
      
      
      # OUTPUT
      
       > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130]
       > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat
      ts1120 [1120]
       > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110]
       > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100]
       > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090]
       > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080]
       > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070]
       > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060]
       > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050]
       > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040]
       > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030]
       > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020]
       > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010]
       > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000]
      
      
      I tried other tests, replacing only one column, using an existing timestamp
      to modify one single value, inserting past values, and so on... My
      conclusion is either I don't understand the general behaviour of that, or I
      make a bad usage of the API. 
      
      However, using normal insertion and normal query (I mean without any
      timestamp) gives me coherent and predictable results. As well as normal
      insertion and querying with past timestamps does.
      
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            stack Michael Stack
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: