Description
As part of the fix for ZOOKEEPER-1797, the call to FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java. As a result, some old-looking but required txn log files can be deleted, resulting in data corruption or loss.
For example, consider the following:
1. Configuration:
autopurge.snapRetainCount=3
2. Following files exist:
log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
snapshot.110 - snapshot as of zxid=110
snapshot.120 - snapshot as of zxid=120
snapshot.130 - snapshot as of zxid=130
The above scenario can arise when several snapshots are taken without an accompanying log rollover, which is possible if the server was running as a learner.
3. PurgeTxnLog retains all three snapshots but deletes log.100, because its starting zxid (100) is lower than the zxid of the oldest retained snapshot (110). Transactions 131-140, which exist only in log.100, are lost.
Before the fix for ZOOKEEPER-1797, this was avoided by the call to FileTxnSnapLog.getSnapshotLogs(), which finds and retains the newest txn log file whose starting zxid is lower than the highest zxid of the oldest retained snapshot.
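The difference between the two retention rules can be sketched as follows. This is a minimal illustration of the scenario above, not the actual PurgeTxnLog or FileTxnSnapLog code: the class `PurgeSketch` and its methods `retainNaive` and `retainWithPrecedingLog` are hypothetical names, and zxids are treated as plain numbers, as in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names do not correspond to real ZooKeeper classes.
public class PurgeSketch {

    // Rule described in step 3: keep only logs whose starting zxid is
    // not lower than the zxid of the oldest retained snapshot.
    static List<Long> retainNaive(List<Long> logStartZxids, long oldestSnapZxid) {
        List<Long> kept = new ArrayList<>();
        for (long start : logStartZxids) {
            if (start >= oldestSnapZxid) {
                kept.add(start);
            }
        }
        return kept;
    }

    // Rule described for the pre-1797 behaviour: additionally keep the newest
    // log whose starting zxid is lower than the oldest retained snapshot's zxid,
    // because that log may span transactions newer than the snapshot.
    static List<Long> retainWithPrecedingLog(List<Long> logStartZxids, long oldestSnapZxid) {
        long newestPreceding = Long.MIN_VALUE;
        for (long start : logStartZxids) {
            if (start < oldestSnapZxid && start > newestPreceding) {
                newestPreceding = start;
            }
        }
        // If no log starts before the snapshot, fall back to the snapshot zxid.
        long cutoff = (newestPreceding == Long.MIN_VALUE) ? oldestSnapZxid : newestPreceding;
        List<Long> kept = new ArrayList<>();
        for (long start : logStartZxids) {
            if (start >= cutoff) {
                kept.add(start);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> logs = Arrays.asList(100L); // log.100 spans zxid 100..140
        long oldestRetainedSnapshot = 110L;    // snapshot.110

        System.out.println("naive keeps: " + retainNaive(logs, oldestRetainedSnapshot));
        System.out.println("safer keeps: " + retainWithPrecedingLog(logs, oldestRetainedSnapshot));
    }
}
```

With the files from the example, the first rule keeps no log at all, while the second keeps log.100, so transactions 131-140 remain available for replay on top of snapshot.130.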
Attachments
Issue Links
- duplicates: ZOOKEEPER-2420 Autopurge deletes log file prior to oldest retained snapshot even though restore may need it (Resolved)
- is broken by: ZOOKEEPER-1797 PurgeTxnLog may delete data logs during roll (Resolved)
- is related to: ZOOKEEPER-2671 Fix compilation error in branch-3.4 (Closed)
- links to