Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Ideally, a healthy Ozone cluster would contain only open and closed containers. In practice, container replicas commonly end up in a mix of states, including quasi-closed and unhealthy, that the current system cannot resolve to cleanly closed replicas. These states are often caused by bugs or overly broad failure handling on the write path. Those causes should be fixed, but they expose a deeper problem: Ozone cannot reconcile mismatched container states on its own, regardless of how they arose. This has led to significant complexity in the replication manager, which must handle cases where only quasi-closed and unhealthy replicas are available, especially during decommissioning.
Even when all replicas are closed, the system assumes that they are identical and has no way to verify this. Checksums are kept for individual chunks within each container, but if two replicas end up with chunks that differ in length or content while both are marked closed and both pass their local checksum checks, the system cannot detect or resolve the discrepancy.
This Jira proposes a container reconciliation protocol to solve these problems. After implementing the proposal:
1. It should be possible for a cluster to progress to a state where it has only properly replicated closed and open containers.
2. We can verify the equality and integrity of all closed containers.
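To make goal 2 concrete, here is a minimal sketch of one way replica equality could be checked: fold each chunk's length and checksum into a single per-replica digest and compare digests across replicas. This is an illustration only and is not taken from the design doc. The class `ContainerDigestSketch`, its method names, and the choice of CRC32 and SHA-256 are all assumptions made for this example, not Ozone APIs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch: none of these names exist in Ozone.
class ContainerDigestSketch {

    // CRC32 of one chunk, standing in for the per-chunk checksum a replica checks locally.
    static long chunkChecksum(byte[] chunk) {
        CRC32 crc = new CRC32();
        crc.update(chunk);
        return crc.getValue();
    }

    // Folds every chunk's length and checksum, in order, into one SHA-256 hex digest.
    static String replicaDigest(List<byte[]> chunks) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (byte[] chunk : chunks) {
                String entry = chunk.length + ":" + chunkChecksum(chunk) + ";";
                sha.update(entry.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is mandatory for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Two replicas are treated as equal only when their digests match.
    static boolean replicasMatch(List<byte[]> a, List<byte[]> b) {
        return replicaDigest(a).equals(replicaDigest(b));
    }

    public static void main(String[] args) {
        byte[] first = "chunk-one".getBytes(StandardCharsets.UTF_8);
        byte[] full = "chunk-two".getBytes(StandardCharsets.UTF_8);
        byte[] truncated = "chunk-t".getBytes(StandardCharsets.UTF_8);

        List<byte[]> replicaA = List.of(first, full);
        List<byte[]> replicaB = List.of(first, truncated); // second chunk is shorter

        System.out.println("A matches A: " + replicasMatch(replicaA, replicaA));
        System.out.println("A matches B: " + replicasMatch(replicaA, replicaB));
    }
}
```

In this sketch each replica would pass its own chunk-level checks, since every chunk is consistent with its locally computed checksum, yet the replica-level digests differ because one chunk is shorter. A real protocol would also need to decide which replica is correct and how to repair the others, which this example does not attempt.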
The design doc is linked here as a markdown pull request for inline comments.
Issue Links
- is duplicated by
  - HDDS-9280 Quasi-closed container with unhealthy replicas may remain under-replicated in 4 node cluster (Resolved)
- relates to
  - HDDS-10931 Schedule on demand scan of containers after import (Open)
  - HDDS-7094 Enable Datanode side CRC checks by default (Reopened)
- links to