Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
3.0.0
None
n/a
Description
The Cleaner should use MIN_HISTORY_LEVEL to decide which files can be removed, instead of using Lock Manager state as it currently does.
This will eliminate possible race conditions.
See this comment
Suppose A is the set of all ValidTxnList instances across all active readers. Each ValidTxnList has a minOpenTxnId.
MIN_HISTORY_LEVEL allows us to determine X = min(minOpenTxnId) across all currently active readers.
This means that no active transaction in the system sees any txn with txnid < X as open.
It follows that if we construct a ValidTxnIdList with HWM = X-1 and use it in getAcidState(), any files that this call determines to be 'obsolete' will be seen as obsolete by every existing and future reader, i.e. they can be physically deleted.
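The computation of X and the resulting high watermark can be sketched as follows. This is only an illustration under stated assumptions: `ReaderSnapshot` and `cleaner_high_watermark` are hypothetical names, not Hive classes or methods, and each reader's ValidTxnList is reduced to its minOpenTxnId.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReaderSnapshot:
    """Hypothetical stand-in for one active reader's ValidTxnList."""
    reader_txn_id: int
    min_open_txn_id: int  # smallest txn id this reader sees as open


def cleaner_high_watermark(snapshots: Iterable[ReaderSnapshot]) -> Optional[int]:
    """Return X - 1, where X = min(minOpenTxnId) over all active readers.

    Returns None when there are no active readers; what the Cleaner
    should do in that case is outside the scope of this sketch.
    """
    min_open_ids = [s.min_open_txn_id for s in snapshots]
    if not min_open_ids:
        return None
    return min(min_open_ids) - 1


active = [
    ReaderSnapshot(reader_txn_id=17, min_open_txn_id=13),
    ReaderSnapshot(reader_txn_id=21, min_open_txn_id=19),
]
hwm = cleaner_high_watermark(active)
print(hwm)  # 12
```

With these two readers, X = 13, so a snapshot built with HWM = 12 contains only txns that every active reader already treats as resolved.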
This is also necessary for multi-statement transactions, where relying on the state of the Lock Manager is not sufficient. For example:
Suppose txn 17 starts at t1 and sees txnid 13 with writeID 13 as open.
13 commits (via its parent txn) at t2 > t1 (17 is still running).
Compaction runs at t3 > t2 to produce base_14 (or delta_10_14, for example) on Table1/Part1 (17 is still running).
Now delta_13 may be cleaned, since it can be seen as obsolete and there may be no locks on it, i.e. no one is reading it.
Now at t4 > t3, txn 17 (a multi-statement txn) may need to read Table1/Part1. It cannot use base_14, as that may have absorbed delete events from delete_delta_14.
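The difference between the two cleaning rules in this timeline can be sketched as below. The function names and data structures are hypothetical, and the sketch assumes, as the example does, that the write ID equals the txn ID; it models only the decision at t3, not Hive's actual Cleaner.

```python
# State at t3: txn 13 has committed and compaction has produced base_14.
locks_held = set()             # nobody holds a lock on delta_13 right now
min_open_by_reader = {17: 13}  # txn 17 started at t1 and saw txn 13 as open


def cleanable_by_locks(directory, locks):
    """Lock-based rule: a directory without locks may be removed."""
    return directory not in locks


def cleanable_by_min_history(max_write_id, min_open_by_reader):
    """MIN_HISTORY_LEVEL-style rule: only data at or below X - 1 may go."""
    x = min(min_open_by_reader.values())
    return max_write_id <= x - 1


print(cleanable_by_locks("delta_13", locks_held))        # True
print(cleanable_by_min_history(13, min_open_by_reader))  # False
```

The lock-based rule lets delta_13 go because nothing is reading it at t3, while the second rule keeps it because txn 17 still records 13 as open.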
Another Use Case
There are delta_1_1 and delta_2_2 on disk, both created by committed txns.
T5 starts reading these. At the same time, the compactor creates delta_1_2.
Now the Cleaner sees delta_1_1 and delta_2_2 as obsolete and may remove them while the read is still in progress. This is because the Compactor itself is not running in a txn, so the files that it produces are visible immediately. If it ran in a txn, the new files would only be visible once this txn is visible to others (including the Cleaner).
Using MIN_HISTORY_LEVEL solves this.
See description of HIVE-18747 for more details on MIN_HISTORY_LEVEL
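The race in the second use case can be reproduced with a small sketch. It uses a simplified containment rule (a delta is obsolete when another delta covers its whole write ID range) as a stand-in for what getAcidState() would report; `parse_delta` and `obsolete_deltas` are hypothetical helpers, not Hive code.

```python
import re


def parse_delta(name):
    """Split 'delta_<min>_<max>' into its (min, max) write ID range."""
    m = re.fullmatch(r"delta_(\d+)_(\d+)", name)
    if m is None:
        raise ValueError(f"not a delta directory name: {name}")
    return int(m.group(1)), int(m.group(2))


def obsolete_deltas(names):
    """Return deltas whose range is strictly covered by a wider delta."""
    ranges = {n: parse_delta(n) for n in names}
    obsolete = []
    for name, (lo, hi) in ranges.items():
        for other, (olo, ohi) in ranges.items():
            covers = olo <= lo and hi <= ohi and (olo, ohi) != (lo, hi)
            if other != name and covers:
                obsolete.append(name)
                break
    return sorted(obsolete)


reader_plan = ["delta_1_1", "delta_2_2"]  # what T5 listed when it started
on_disk = reader_plan + ["delta_1_2"]     # compactor output, visible at once

to_remove = obsolete_deltas(on_disk)
print(to_remove)  # ['delta_1_1', 'delta_2_2']

# Files the Cleaner would delete that T5 is still reading:
print(sorted(set(to_remove) & set(reader_plan)))  # ['delta_1_1', 'delta_2_2']
```

Both directories T5 planned to read appear in the removal list as soon as delta_1_2 exists, which is the window the description warns about.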
Attachments
Issue Links
- blocks: HIVE-18773 Support multiple instances of Cleaner (Open)
- is blocked by: HIVE-18747 Cleaner for TXN_TO_WRITE_ID table entries using MIN_HISTORY_LEVEL. (Closed)
- is related to: HIVE-20436 Lock Manager scalability - linear (Open)
- is superseded by: HIVE-20823 Make Compactor run in a transaction (Closed)
- relates to: HIVE-20459 add ThriftHiveMetastore.get_open_txns(long txnid) (Patch Available)