Details
Improvement
Status: Resolved
Normal
Resolution: Invalid
Description
5 nodes cluster, RF = 3
→ LWT consistency = SERIAL
→ Mutation consistency = ONE
1. Client 1 sends an INSERT INTO test (partition_key, value) VALUES(1, 'ONE') IF NOT EXISTS
2. Paxos round successfully gets ballot1
3. The IF NOT EXISTS condition is validated (reading with QUORUM on all replicas)
4. Paxos round validated
5. The mutation is applied at consistency level ONE and pushed to replicas A, B and C
6. Replica A sends back its ack. Replicas B and C do not receive the mutation (temporary network issue). The LWT is nevertheless considered successful, since at CL ONE we do not wait for acks from B and C
7. Client 2 starts an LWT: INSERT INTO test (partition_key, value) VALUES(1, 'TWO') IF NOT EXISTS
8. Paxos round successfully gets ballot2 (ballot2 > ballot1)
9. The IF NOT EXISTS condition is validated (the QUORUM read is served by replicas B and C, which both reply that partition 1 does not exist)
10. The semantics of LWT are violated: partition 1 already exists with value = 'ONE' (the first LWT succeeded), yet the second IF NOT EXISTS condition passes
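The replica arithmetic behind steps 5 to 9 can be sketched with a toy model. This is only an illustration of the scenario as described above: it counts which replicas hold the row and which quorum-sized read sets could miss it, and it deliberately ignores everything else Cassandra does (Paxos state, hints, read repair). The names `REPLICAS`, `QUORUM` and `quorum_reads_missing_row` are hypothetical and not part of any Cassandra API.

```python
from itertools import combinations

REPLICAS = ("A", "B", "C")        # RF = 3, as in the scenario
QUORUM = len(REPLICAS) // 2 + 1   # 2 of 3


def quorum_reads_missing_row(holders):
    """Return every quorum-sized read set in which no replica holds the row."""
    return [q for q in combinations(REPLICAS, QUORUM) if not holders & set(q)]


# Commit at ONE: only replica A applied the mutation (steps 5-6).
blind_one = quorum_reads_missing_row({"A"})

# Commit at QUORUM: at least two replicas applied it, for example A and B.
blind_quorum = quorum_reads_missing_row({"A", "B"})

print(blind_one)     # [('B', 'C')]
print(blind_quorum)  # []
```

In this model, a commit acknowledged by a single replica leaves one quorum read set (B and C) that never sees the row, while a commit acknowledged by any two replicas overlaps every quorum read set.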
I'm not saying that there is a bug in our LWT implementation; it works as designed. The problem is that I can't see any legitimate use case for LWT with mutation CL = ONE: it opens the door to edge cases like the one described above and defeats the purpose of compare-and-swap. Furthermore, it can confuse people who are not familiar with how LWT is implemented internally.
Ideally, we should remove or ignore the mutation CL for LWT and always use QUORUM/LOCAL_QUORUM. But to avoid breaking the existing API, it would be sufficient to validate the mutation CL and reject anything below QUORUM.
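A minimal sketch of the proposed validation, written as a client-side guard rather than as Cassandra code. The function name `validate_lwt_commit_consistency` and the set `ALLOWED_LWT_COMMIT_LEVELS` are hypothetical; the level names are the standard Cassandra consistency levels, and treating EACH_QUORUM and ALL as "at least QUORUM" is an assumption of this sketch, not something stated in the ticket.

```python
# Hypothetical allow-list: levels assumed to be at least as strong as a quorum.
ALLOWED_LWT_COMMIT_LEVELS = frozenset(
    {"QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL"}
)


def validate_lwt_commit_consistency(level):
    """Return the normalized level, or raise ValueError if it is below QUORUM."""
    normalized = level.strip().upper()
    if normalized not in ALLOWED_LWT_COMMIT_LEVELS:
        raise ValueError(
            f"mutation CL {normalized} is not allowed for LWT; "
            f"use one of {sorted(ALLOWED_LWT_COMMIT_LEVELS)}"
        )
    return normalized


accepted = validate_lwt_commit_consistency("local_quorum")

try:
    validate_lwt_commit_consistency("ONE")
    rejected = False
except ValueError:
    rejected = True
```

An allow-list is used instead of a numeric ordering because consistency levels do not form a single strict order (for example, LOCAL_QUORUM and QUORUM are not directly comparable across datacenters).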