Description
StandbyCheckpointer triggers a rollback fsimage checkpoint when a rolling upgrade (RU) is started.
When RU is started, a flag (needRollbackImage) is set to true during edit log replay.
It is reset to false only when doCheckpoint() succeeds.
Consider the following scenario:
- RU is started, and needRollbackImage is set to true.
- doCheckpoint() fails.
- RU is finalized.
- namesystem.getFSImage().hasRollbackFSImage() always returns false from then on, since a rollback image cannot be generated once RU is over.
- needRollbackImage is therefore never reset to false.
- The checkpoint threshold (1m txns) and period (1hr) are no longer honored.
StandbyCheckpointer:
    void doWork() {
      ....
      doCheckpoint();
      // reset needRollbackCheckpoint to false only when we finish a ckpt
      // for rollback image
      if (needRollbackCheckpoint
          && namesystem.getFSImage().hasRollbackFSImage()) {
        namesystem.setCreatedRollbackImages(true);
        namesystem.setNeedRollbackFsImage(false);
      }
      lastCheckpointTime = now;
    }
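The scenario above can be reproduced with a small toy model. This is only a sketch, not the real HDFS code: the class RollbackFlagModel and all of its fields and methods are hypothetical stand-ins. It assumes, based on the description, that a pending rollback checkpoint makes the checkpointer skip the threshold and period checks.

```java
// Toy model of the flag handling described in this issue; NOT the real HDFS classes.
// Every name here (RollbackFlagModel, its fields and methods) is hypothetical.
class RollbackFlagModel {
    boolean needRollbackImage = false;    // set during edit log replay when RU starts
    boolean rollingUpgradeActive = false;
    boolean hasRollbackFsImage = false;   // stands in for getFSImage().hasRollbackFSImage()
    int checkpointAttempts = 0;

    void startRollingUpgrade() {
        rollingUpgradeActive = true;
        needRollbackImage = true;
    }

    void finalizeRollingUpgrade() {
        rollingUpgradeActive = false;
    }

    // One pass of the checkpointer loop. thresholdReached models the
    // txn-count / period check; checkpointSucceeds models doCheckpoint().
    void doWork(boolean thresholdReached, boolean checkpointSucceeds) {
        boolean needRollbackCheckpoint = needRollbackImage;
        // Assumption: a pending rollback checkpoint bypasses the thresholds.
        if (!needRollbackCheckpoint && !thresholdReached) {
            return;
        }
        checkpointAttempts++;
        if (!checkpointSucceeds) {
            return; // models doCheckpoint() failing; the flag is left untouched
        }
        // A rollback image can only be produced while the upgrade is in progress.
        if (needRollbackCheckpoint && rollingUpgradeActive) {
            hasRollbackFsImage = true;
        }
        // Same condition as in the snippet from the issue.
        if (needRollbackCheckpoint && hasRollbackFsImage) {
            needRollbackImage = false;
        }
    }

    public static void main(String[] args) {
        RollbackFlagModel m = new RollbackFlagModel();
        m.startRollingUpgrade();
        m.doWork(false, false);        // checkpoint fails during RU
        m.finalizeRollingUpgrade();
        for (int i = 0; i < 3; i++) {
            m.doWork(false, true);     // thresholds not reached, yet a checkpoint runs
        }
        // prints "needRollbackImage=true, attempts=4"
        System.out.println("needRollbackImage=" + m.needRollbackImage
            + ", attempts=" + m.checkpointAttempts);
    }
}
```

In this model the flag stays true after finalization, so every pass of the loop performs a checkpoint regardless of the thresholds, which matches the behavior reported here.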