Details
Description
In the even that the zookeeper transaction log or snapshot become corrupted and fail CRC checks (preventing startup) we should have a mechanism to get the cluster running again.
Previously we achieved this by loading the broken transaction log with a modified version of ZK with disabled CRC check and forced it to snapshot.
It'd very handy to have a tool which can do this for us. LogFormatter and SnapshotFormatter have already been designed to dump log and snapshot files, it'd be nice to extend their functionality and add ability for such recovery.
It has proven that once you end up with the corrupt txn log there is no way to recover except manually modifying the crc check. That's basically why the tool is needed.
Attachments
Issue Links
- is related to
-
ZOOKEEPER-2249 CRC check failed when preAllocSize smaller than node data
- Resolved
- links to