Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
For unbucketed tables, DeleteReaderValue currently returns all delete events. Once we trust that
the N in bucketN for the "base" split is reliable, all delete events not matching N can be skipped.
This protects against extreme cases where someone runs an update/delete on a partition that matches 10 billion rows and thus generates a very large number of delete events.
Since HIVE-19890, the bucketid/writerid in the file name of every ACID data file must match the bucketid/writerid in ROW__ID in the data.
OrcRawRecordMerger.getDeltaFiles() should only return files representing the right bucket.
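The skipping described above can be sketched roughly as follows. This is a minimal illustration, not Hive's actual implementation: the `DeleteEvent` class, the `bucketIdFromFileName` and `eventsForSplit` helpers, and the assumed file-name pattern (`bucket_` followed by digits, with an optional numeric suffix) are all hypothetical and exist only for this example. It assumes the bucket id parsed from the split's file name can be trusted to match the bucket id recorded in each row's ROW__ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DeleteEventBucketFilter {

    /** Hypothetical, simplified delete event holding only the fields this sketch needs. */
    static final class DeleteEvent {
        final long writeId;
        final int bucketId;
        final long rowId;

        DeleteEvent(long writeId, int bucketId, long rowId) {
            this.writeId = writeId;
            this.bucketId = bucketId;
            this.rowId = rowId;
        }
    }

    // Assumed naming scheme for this sketch: "bucket_<digits>" with an optional "_<digits>" suffix.
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)(_\\d+)?");

    /** Extracts N from a file name such as "bucket_00003". */
    static int bucketIdFromFileName(String fileName) {
        Matcher m = BUCKET_FILE.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    /** Keeps only the delete events whose bucket id equals the N of the split's file. */
    static List<DeleteEvent> eventsForSplit(String splitFileName, List<DeleteEvent> allEvents) {
        int n = bucketIdFromFileName(splitFileName);
        List<DeleteEvent> kept = new ArrayList<>();
        for (DeleteEvent e : allEvents) {
            if (e.bucketId == n) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<DeleteEvent> all = new ArrayList<>();
        all.add(new DeleteEvent(5L, 0, 10L));
        all.add(new DeleteEvent(5L, 3, 11L));
        all.add(new DeleteEvent(6L, 3, 12L));
        List<DeleteEvent> kept = eventsForSplit("bucket_00003", all);
        System.out.println("kept " + kept.size() + " of " + all.size() + " delete events");
    }
}
```

With a filter of this shape, a reader working on one split holds only the delete events that can possibly apply to its rows, instead of every delete event in the partition.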
Attachments
Issue Links
- contains: HIVE-21710 Minor compaction writes delete records in unbucketed tables multiple times when we have multiple files <bucket_N> (Closed)
- is related to: HIVE-16812 VectorizedOrcAcidRowBatchReader doesn't filter delete events (Closed)
- requires: HIVE-19890 ACID: Inherit bucket-id from original ROW_ID for delete deltas (Closed)
- supercedes: HIVE-20579 VectorizedOrcAcidRowBatchReader.checkBucketId() should run for unbucketed tables (Resolved)