Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Since HBase does not provide a way to undo delete operations, Tephra currently implements deletes by writing a special delete marker for each column. In the case of a row delete, this means that we have to read the entire row being deleted, then issue a separate delete marker for each column in the row. For a wide row with many columns, this is very inefficient: the cost is a full row read plus one write per column.
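To make the cost concrete, here is a minimal sketch of the per-column approach against a toy in-memory table. It does not use the HBase or Tephra APIs; the table layout, the `DELETE_MARKER` value, and the function name `delete_row_per_column` are assumptions made for illustration only.

```python
# Toy model (not the HBase API): a table maps
# row key -> {column qualifier (bytes) -> value (bytes)}.

# Hypothetical marker value; an empty value stands in for "deleted" here.
DELETE_MARKER = b""


def delete_row_per_column(table, row_key):
    """Delete a row by reading it and writing one marker per column.

    Returns the number of marker writes issued, which equals the
    number of columns in the row.
    """
    row = table.get(row_key, {})        # the full row must be read first
    writes = 0
    for qualifier in list(row):
        row[qualifier] = DELETE_MARKER  # one marker write per column
        writes += 1
    return writes


# A wide row with 1000 columns needs 1000 marker writes.
table = {b"row1": {b"col%04d" % i: b"v" for i in range(1000)}}
marker_writes = delete_row_per_column(table, b"row1")
```

In this model the number of writes grows linearly with the row width, which is the overhead the proposal below aims to avoid.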
We should consider supporting row delete markers:
- for a row delete, we would record a special column family delete marker for each column family in the row
- the delete marker would be stored in a special reserved column name. Any attempted puts to the same column name would be rejected.
- we would have to ensure the delete marker sorts prior to any other columns in the column family
- on any get or scan operation, we would have to add this column to the requested columns for each column family being used
- if the delete marker exists, seek to the next row
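The steps above can be sketched with a toy in-memory model. This is only an illustration of the idea, not Tephra's implementation: the reserved qualifier (empty bytes, chosen because it sorts before every non-empty qualifier), the marker value, and all class and function names are hypothetical, and the model ignores cell versions, timestamps, and transaction visibility.

```python
# Hypothetical reserved qualifier: empty bytes sort before any other qualifier.
FAMILY_DELETE_QUALIFIER = b""
# Hypothetical marker payload.
FAMILY_DELETE_VALUE = b"\x01"


class Family:
    """Toy column family: qualifier (bytes) -> value (bytes)."""

    def __init__(self):
        self.cells = {}

    def put(self, qualifier, value):
        # Client puts to the reserved column name are rejected.
        if qualifier == FAMILY_DELETE_QUALIFIER:
            raise ValueError("qualifier is reserved for the delete marker")
        self.cells[qualifier] = value

    def mark_deleted(self):
        # A single write, regardless of how many columns the family holds.
        self.cells[FAMILY_DELETE_QUALIFIER] = FAMILY_DELETE_VALUE


def delete_row(row):
    """Write one marker per column family; return the number of writes."""
    for family in row.values():
        family.mark_deleted()
    return len(row)


def get(row, family_name, qualifiers):
    """Read the requested columns, always adding the marker column."""
    family = row[family_name]
    requested = sorted({FAMILY_DELETE_QUALIFIER, *qualifiers})
    result = {}
    for qualifier in requested:
        if qualifier not in family.cells:
            continue
        if qualifier == FAMILY_DELETE_QUALIFIER:
            return {}  # marker sorts first, so nothing else needs reading
        result[qualifier] = family.cells[qualifier]
    return result


def scan(table, family_name, qualifiers):
    """Yield (row key, cells) for rows whose family is not marked deleted."""
    for row_key in sorted(table):
        cells = get(table[row_key], family_name, qualifiers)
        if cells:  # an empty result means: move on to the next row
            yield row_key, cells
```

With this layout, deleting a row of any width costs one write per column family and no prior read. The stale cells stay in place but are hidden from every get and scan, because the marker column is always requested and is encountered first.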