Details
- Bug
- Status: Resolved
- Normal
- Resolution: Duplicate
- None
- None
- Correctness - Consistency
- Normal
- Normal
- User Report
- All
- None
Description
Name queries (i.e. single-partition queries that specify the full clustering key) read sstables sequentially in recency order, in the hope that the most recent sstables contain the most recent data, so that reading older sstables can be avoided (see SinglePartitionReadCommand#reduceFilter).
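The skipping idea described above can be sketched with a toy model. This is a simplified illustration, not Cassandra's actual code: the `SSTable` class, its fields, and the `name_query` function are hypothetical names, and the real logic in SinglePartitionReadCommand tracks more state (per-column timestamps, partition deletions, and so on).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SSTable:
    # Hypothetical stand-in for an sstable: only what this sketch needs.
    name: str
    max_timestamp: int            # highest write timestamp stored in the sstable
    row_timestamp: Optional[int]  # timestamp of the queried row, None if absent


def name_query(sstables: List[SSTable]) -> Tuple[Optional[int], List[str]]:
    """Return (newest row timestamp found, names of the sstables actually read)."""
    newest = None
    read = []
    # Visit sstables from most recent to oldest.
    for sst in sorted(sstables, key=lambda s: s.max_timestamp, reverse=True):
        # Once a row newer than everything in this sstable has been found,
        # this sstable and all older ones are assumed to add nothing: stop.
        if newest is not None and sst.max_timestamp < newest:
            break
        read.append(sst.name)
        if sst.row_timestamp is not None:
            if newest is None or sst.row_timestamp > newest:
                newest = sst.row_timestamp
    return newest, read


tables = [
    SSTable("sstable1", max_timestamp=10, row_timestamp=None),
    SSTable("sstable2", max_timestamp=11, row_timestamp=11),
]
result = name_query(tables)
print(result)
```

In this model the older `sstable1` is never opened, because the row found in `sstable2` is newer than anything `sstable1` can contain.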
Unfortunately, this optimization may cause a digest mismatch if older sstables contain a range tombstone or row deletion with a timestamp lower than that of the newer data.
Test Code
Table with (pk, ck1, ck2)
Node1:
* delete row (pk=1, ck1=1) with ts=10
* insert row (pk=1, ck1=1, ck2=1) with ts=11
Node2:
* delete row (pk=1, ck1=1) with ts=10
* flush into sstable1
* insert row (pk=1, ck1=1, ck2=1) with ts=11
* flush into sstable2
Query with pk=1 and ck1=1 and ck2=1
* node1 returns: RT open marker, row, RT close marker
* node2 returns: row (because sstable1 is skipped)
Note: similar mismatch can happen with row deletion as well.
In the above example: is it safe for node1 to ignore the RT or row deletion in a name query when the row liveness has a higher timestamp?
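The test steps above can be modeled in a few lines to show where the two responses diverge. This is a toy simulation under stated assumptions, not Cassandra code: the `read`, `digest`, and `live_rows` helpers are hypothetical, each memtable or sstable is reduced to a `(max_timestamp, items)` pair, and SHA-256 stands in for whatever hash the real digest uses.

```python
import hashlib

RT_TS, ROW_TS = 10, 11

# Hypothetical model: a source (memtable or sstable) is (max_timestamp, items),
# and an item is (kind, timestamp).
rt_items = [("rt_open", RT_TS), ("rt_close", RT_TS)]
row_items = [("row", ROW_TS)]

# Node1 never flushed: the deletion and the insert share one memtable.
node1_sources = [(ROW_TS, rt_items + row_items)]
# Node2 flushed after each write: sstable1 holds the RT, sstable2 the row.
node2_sources = [(RT_TS, rt_items), (ROW_TS, row_items)]


def read(sources):
    """Read sources newest first, skipping any that are older than the row found."""
    items, row_ts = [], None
    for max_ts, content in sorted(sources, key=lambda s: s[0], reverse=True):
        if row_ts is not None and max_ts < row_ts:
            continue  # skipped by the name-query optimization
        items.extend(content)
        for kind, ts in content:
            if kind == "row":
                row_ts = ts if row_ts is None else max(row_ts, ts)
    # Present the response as: RT open marker, row, RT close marker.
    order = {"rt_open": 0, "row": 1, "rt_close": 2}
    return sorted(items, key=lambda item: order[item[0]])


def digest(items):
    """Stand-in digest over the full response, tombstone markers included."""
    return hashlib.sha256(repr(items).encode()).hexdigest()


def live_rows(items):
    """Rows that survive the tombstones present in the response."""
    rt_ts = max((ts for kind, ts in items if kind != "row"), default=-1)
    return [(kind, ts) for kind, ts in items if kind == "row" and ts > rt_ts]


node1 = read(node1_sources)
node2 = read(node2_sources)
print(node1, node2, digest(node1) == digest(node2))
```

In this model both nodes resolve to the same live row, yet the digests differ because only node1's response carries the RT markers.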
Issue Links
- duplicates
  - CASSANDRA-15962 Digest for some queries is different depending whether the data are retrieved from sstable or memtable (Resolved)
- is related to
  - CASSANDRA-15369 Fake row deletions and range tombstones, causing digest mismatch and sstable growth (Resolved)
  - CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info (Resolved)