Details
Type: Improvement
Status: Done
Priority: Major
Resolution: Done
Description
The current implementation for resolving write-write conflicts on SnapshotEdge is overly complicated, so I suggest refactoring it to make it easier to understand.
On current master, test cases fail when the RPC failure probability is positive (hbase.fail.prob=0.01).
I can reproduce it by running `sbt "project s2core" -Dconfig.file=s2rest_play/conf/test.conf test`.
We can think of this as a critical section problem. Multiple threads that want to mutate the same SnapshotEdge can only proceed one at a time, so that the related IndexEdge stays consistent. To achieve this, we store extra data on the SnapshotEdge.
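To make the critical section idea concrete, here is a minimal in-memory sketch. It assumes the extra data on the SnapshotEdge acts as a lock marker that is set with an atomic compare-and-set. `SnapshotState`, `SnapshotStore` and `SnapshotLock` are hypothetical names for illustration, not the actual S2Graph classes, and the `ConcurrentHashMap` only stands in for the storage backend.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical, simplified model of a SnapshotEdge row (not the S2Graph class).
// lockedBy is the assumed "extra data": the id of the writer inside the critical section.
final case class SnapshotState(version: Long, lockedBy: Option[Long])

// In-memory stand-in for the storage backend.
final class SnapshotStore {
  private val rows = new ConcurrentHashMap[String, SnapshotState]()

  // Returns the current state, creating an unlocked row on first access.
  def read(key: String): SnapshotState = {
    rows.putIfAbsent(key, SnapshotState(0L, None))
    rows.get(key)
  }

  // Atomic compare-and-set: succeeds only if the stored value still equals `expected`.
  def compareAndSet(key: String, expected: SnapshotState, updated: SnapshotState): Boolean =
    rows.replace(key, expected, updated)
}

object SnapshotLock {
  // Tries to enter the critical section; returns the locked state on success.
  def tryAcquire(store: SnapshotStore, key: String, writerId: Long): Option[SnapshotState] = {
    val current = store.read(key)
    if (current.lockedBy.isDefined) None // another writer is already in the critical section
    else {
      val locked = current.copy(lockedBy = Some(writerId))
      if (store.compareAndSet(key, current, locked)) Some(locked) else None
    }
  }

  // Leaves the critical section: clears the lock marker and bumps the version.
  def release(store: SnapshotStore, key: String, held: SnapshotState): Boolean =
    store.compareAndSet(key, held, SnapshotState(held.version + 1, None))
}
```

With this model, a second writer calling `tryAcquire` on the same key gets `None` until the first writer calls `release`, which is the one-at-a-time behaviour described above.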
Details are on gitbook.
I think the following 3 things are the expected outcome of resolving this issue.
1. More comments and simplified code.
2. Remove the excessive re-fetch on every retry after a partial failure. Since it is guaranteed that only one thread can proceed at a time, the re-fetch is not necessary.
3. Remove the Thread.sleep between retries after a partial failure.
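Items 2 and 3 could look roughly like the sketch below. It assumes the thread already holds the lock on the SnapshotEdge and that each remaining mutation is modelled as a hypothetical `Step` returning `true` on success. `RetryWithoutRefetch` is an illustrative name, not existing S2Graph code.

```scala
import scala.annotation.tailrec

object RetryWithoutRefetch {
  // Hypothetical model of one mutation RPC: returns true on success, false on failure.
  type Step = () => Boolean

  // Runs the steps in order. A failed step is retried immediately (no Thread.sleep),
  // and steps that already succeeded are not repeated, so nothing is re-fetched.
  // Returns false once more than maxRetries failures have occurred.
  @tailrec
  def run(steps: List[Step], maxRetries: Int): Boolean = steps match {
    case Nil => true
    case step :: rest =>
      if (step()) run(rest, maxRetries)
      else if (maxRetries > 0) run(steps, maxRetries - 1)
      else false
  }
}
```

Retrying in place is only safe under the stated assumption that the lock holder is the sole writer, so the state it fetched before entering the critical section cannot have changed in the meantime.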