Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Because RO reads are lock-free, they may encounter write intents, and each such intent must be evaluated as committed, aborted, or pending. Write intent resolution requires the following steps:
- If PartitionReplicaListener reads a write intent, it checks the local txn state map for a committed or aborted state. The read is allowed if the state is committed and commitTs <= readTs.
- If the state cannot be determined locally, PartitionReplicaListener sends a TxStateReq to the coordinator via ReplicaService. This initiates the coordinator path. The coordinator address is fetched from the txn state map.
- If the coordinator path could not resolve the intent, then either the coordinator is dead or the txn state is not available in the cache. Calculate the commit partition and send the TxStateReq to its primary replica. This initiates the commit partition path.
- Retry the commit partition path until it succeeds or times out.
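The resolution order above can be sketched roughly as follows. This is a minimal illustration, not the Ignite implementation: `TxState`, `TxMeta`, `WriteIntentResolver`, and `canRead` are hypothetical names, timestamps are simplified to `long`, and the two remote paths are modeled as plain functions that return an empty `Optional` when they cannot resolve the txn. The 1 ms backoff between retries is also an assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.function.Function;

// Hypothetical txn states; the real enum in Ignite may differ.
enum TxState { PENDING, FINISHING, COMMITTED, ABORTED }

// Hypothetical entry of the txn state map; commitTs is only meaningful for COMMITTED.
final class TxMeta {
    final TxState state;
    final long commitTs;

    TxMeta(TxState state, long commitTs) {
        this.state = state;
        this.commitTs = commitTs;
    }

    boolean isFinal() {
        return state == TxState.COMMITTED || state == TxState.ABORTED;
    }
}

final class WriteIntentResolver {
    private final Map<UUID, TxMeta> localTxStates;
    // Stand-ins for sending TxStateReq to the coordinator / commit partition primary.
    private final Function<UUID, Optional<TxMeta>> coordinatorPath;
    private final Function<UUID, Optional<TxMeta>> commitPartitionPath;

    WriteIntentResolver(Map<UUID, TxMeta> localTxStates,
                        Function<UUID, Optional<TxMeta>> coordinatorPath,
                        Function<UUID, Optional<TxMeta>> commitPartitionPath) {
        this.localTxStates = localTxStates;
        this.coordinatorPath = coordinatorPath;
        this.commitPartitionPath = commitPartitionPath;
    }

    /** Returns true if an RO read at readTs may see the write intent of txId. */
    boolean canRead(UUID txId, long readTs, long timeoutMillis) {
        // Step 1: local txn state map.
        TxMeta local = localTxStates.get(txId);
        if (local != null && local.isFinal()) {
            return visible(local, readTs);
        }

        // Step 2: coordinator path.
        Optional<TxMeta> fromCoordinator = coordinatorPath.apply(txId);
        if (fromCoordinator.isPresent()) {
            return visible(fromCoordinator.get(), readTs);
        }

        // Steps 3 and 4: commit partition path, retried until success or timeout.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        do {
            Optional<TxMeta> fromCommitPartition = commitPartitionPath.apply(txId);
            if (fromCommitPartition.isPresent()) {
                return visible(fromCommitPartition.get(), readTs);
            }
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(1)); // assumed backoff
        } while (System.nanoTime() - deadline < 0);

        throw new IllegalStateException("Write intent of txn " + txId + " was not resolved in time");
    }

    // A write intent is readable only if its txn committed at or before readTs.
    private static boolean visible(TxMeta meta, long readTs) {
        return meta.state == TxState.COMMITTED && meta.commitTs <= readTs;
    }
}
```

In this sketch a pending or aborted txn simply makes the intent invisible to the read; how the caller reacts to the timeout is left open.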
On receiving a TxStateReq in ReplicaManager on the coordinator:
- ReplicaManager reads the txn state map. If the local txn is finished, return a response with the outcome: commit or abort. The txn state is stored in a local cache (https://issues.apache.org/jira/browse/IGNITE-17638).
- If the local txn is finishing (txState == Finishing) and is waiting for the finish state to be replicated, wait with a timeout for the outcome and return a response with that outcome: commit or abort. txState becomes Finishing in TxManager when the TxFinishReplicaRequest is created. TxManager holds the txn state map. A future can be used for concurrency, together with atomic operations on the txn state map.
- If the outcome is commit, an additional timestamp check is required: the commit timestamp must be <= readTs. If the condition does not hold, the outcome is changed to abort.
- If the local txn is active (txState is none of finishing, commit, abort), adjust the txn coordinator node's HLC according to readTs so that the txn commit timestamp is above the read timestamp. The read timestamp must be installed before the txn starts to commit, so that the commit timestamp is assigned after the read timestamp.
- If the txn state is not found in the local cache and the txn is not active, return NULL.
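The coordinator-side handling above might look roughly like the sketch below. All names (`TxOutcome`, `CoordinatorTx`, `TxStateRequestHandler`, `handle`) are hypothetical, the HLC is reduced to a single `AtomicLong`, and two behaviors are assumptions the description does not specify: an active txn is reported as `PENDING`, and a wait that times out returns `null` so that the caller can fall back to the commit partition path.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical response values; PENDING is an assumption for the active case.
enum TxOutcome { COMMIT, ABORT, PENDING }

// Hypothetical per-txn entry of the coordinator's txn state map.
final class CoordinatorTx {
    /** Set once a TxFinishReplicaRequest has been created (txState == Finishing). */
    volatile boolean finishing;
    /** Completes with the commit timestamp, or with null if the txn aborted. */
    final CompletableFuture<Long> finished = new CompletableFuture<>();
}

final class TxStateRequestHandler {
    final Map<UUID, CoordinatorTx> txStates = new ConcurrentHashMap<>();
    /** Simplified stand-in for the coordinator node's HLC. */
    final AtomicLong clock = new AtomicLong();

    /** Returns the outcome visible to a read at readTs, or null if it is unknown. */
    TxOutcome handle(UUID txId, long readTs, long waitMillis) {
        CoordinatorTx tx = txStates.get(txId);
        if (tx == null) {
            return null; // state not in the local cache and txn not active
        }

        if (tx.finished.isDone() || tx.finishing) {
            Long commitTs;
            try {
                // Returns immediately for a finished txn, waits for a finishing one.
                commitTs = tx.finished.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            } catch (ExecutionException | TimeoutException e) {
                return null; // assumption: unresolved, the caller tries another path
            }
            // A commit after readTs is reported as abort, as described above.
            return commitTs != null && commitTs <= readTs ? TxOutcome.COMMIT : TxOutcome.ABORT;
        }

        // Active txn: push the clock to at least readTs so that a later commit
        // timestamp is assigned above it. A real implementation would have to make
        // this atomic with respect to the transition to Finishing.
        clock.accumulateAndGet(readTs, Math::max);
        return TxOutcome.PENDING;
    }
}
```

Keeping the outcome in a `CompletableFuture` lets a finished txn and a finishing txn share one code path: the former returns immediately, the latter blocks for at most `waitMillis`.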
There is an open question about the MvPartitionStorage API feature: https://issues.apache.org/jira/browse/IGNITE-17627
Attachments
Issue Links
- Dependency
  - IGNITE-17222 Need to propagate HLC with transaction protocol events (Resolved)
- is blocked by
  - IGNITE-16882 Enlist txnState into single parition only (Resolved)
  - IGNITE-17627 Extend MvPartitionStorage read API with write intent resolution capabilities (Resolved)
- is duplicated by
  - IGNITE-20034 Implement writeIntentResolution coordinator path (Resolved)