Description
HBASE-808 describes program run in the below snippet from the list. Describes unexpected behaviors.
.. Test 2 A bit more surprising: I delete my row, using the delete-all command in shell: # SHELL hbase(main):001:0> scan 'proxy-0.2' ROW COLUMN+CELL testrow column=bytes:, timestamp=1100, value=valbytes ts 1100 testrow column=status:, timestamp=1100, value=valstat ts1100 2 row(s) in 0.3560 seconds hbase(main):002:0> deleteall 'proxy-0.2', 'testrow' 0 row(s) in 0.1050 seconds hbase(main):003:0> scan 'proxy-0.2' ROW COLUMN+CELL 0 row(s) in 0.2540 seconds The table is now empty, and if I try to launch my dumpRowHistory() method, the emptiness is confirmed. Ok. Now I launch my test 1 again. Restarting from timestamp 1000: # OUTPUT > > Connecting to hbase master... > -- Inserted ts 1000 > Versions or row : testrow > -- Inserted ts 1010 > Versions or row : testrow > -- Inserted ts 1020 > Versions or row : testrow > -- Inserted ts 1030 > Versions or row : testrow > -- Inserted ts 1040 > Versions or row : testrow > -- Inserted ts 1050 > Versions or row : testrow > -- Inserted ts 1060 > Versions or row : testrow > -- Inserted ts 1070 > Versions or row : testrow It seems that the row are not inserted. Querying from shell: # SHELL hbase(main):004:0> scan 'proxy-0.2' ROW COLUMN+CELL 0 row(s) in 0.2030 seconds But, If I allow the program to make more iterations than the first time (ts > > 1100), the newest timestamps are taken in account. As if the table remembers of the previous maximum value of the timestamp: Relaunching the code of Test 1 : # OUTPUT > > Connecting to hbase master... > -- Inserted ts 1000 > Versions or row : testrow > -- Inserted ts 1010 > Versions or row : testrow > -- Inserted ts 1020 > Versions or row : testrow > -- Inserted ts 1030 > Versions or row : testrow > -- Inserted ts 1040 > Versions or row : testrow > -- Inserted ts 1050 > Versions or row : testrow > -- Inserted ts 1060 > Versions or row : testrow > -- Inserted ts 1070 > Versions or row : testrow > -- Inserted ts 1080 > Versions or row : testrow > -- Inserted ts 1090 > Versions or row : testrow > -- Inserted ts 1100 > Versions or row : testrow > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100] > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090] > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080] > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070] > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060] > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050] > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040] > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030] > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020] > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010] > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000] > -- Inserted ts 1110 > Versions or row : testrow > #1 MXTS[1110] bytes: => valbytes ts 1110 [1110] > #2 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat ts1100 [1100] > #3 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090] > #4 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080] > #5 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070] > #6 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060] > #7 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050] > #8 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040] > #9 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030] > #10 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020] > #11 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010] > #12 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000] > -- Inserted ts 1120 > Versions or row : testrow > #1 MXTS[1120] bytes: => valbytes ts 1120 [1120] > #2 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat ts1110 [1110] > #3 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat ts1100 [1100] > #4 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090] > #5 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080] > #6 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070] > #7 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060] > #8 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050] > #9 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040] > #10 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030] > #11 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020] > #12 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010] > #13 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000] > -- Inserted ts 1130 > Versions or row : testrow > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130] > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat ts1120 [1120] > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat ts1110 [1110] > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat ts1100 [1100] > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat ts1090 [1090] > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat ts1080 [1080] > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat ts1070 [1070] > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat ts1060 [1060] > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat ts1050 [1050] > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat ts1040 [1040] > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat ts1030 [1030] > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat ts1020 [1020] > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat ts1010 [1010] > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat ts1000 [1000] Since the timestamp reachs a newest value, the row is inserted. Moreover, the previous insertions appears ! Notice another problem: the last insertion is missing one cell: the 'status:' column. Using shell to scan the table give the same result: # SHELL hbase(main):003:0> scan 'proxy-0.2' ROW COLUMN+CELL testrow column=bytes:, timestamp=1130, value=valbytes ts 1130 Relauching hbase with the stop-hbase.sh / start-hbase.sh scripts yields to another unexpected behaviour: When I run the scan command in the shell, I have the same result than above: # SHELL hbase(main):001:0> scan 'proxy-0.2' ROW COLUMN+CELL testrow column=bytes:, timestamp=1130, value=valbytes ts 1130 but if I launch the dumpRowHistory method it appears that most of history of the status: column is lost. Notice that I tried many times and I never had the same behaviour twice here, sometime the other column is missing, or the row is entirely lost giving no result at all. # OUTPUT > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130] > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat ts1120 [1120] > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110] > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100] > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090] > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080] > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070] > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060] > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050] > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040] > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030] > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020] > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010] > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000] I tried other tests, replacing only one column, using an existing timestamp to modify one single value, inserting past values, and so on... My conclusion is either I don't understand the general behaviour of that, or I make a bad usage of the API. However, using normal insertion and normal query (I mean without any timestamp) gives me coherent and predictable results. As well as normal insertion and querying with past timestamps does.