Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Currently, there is no notion of reserved space on datanodes as exists on HDFS datanodes. In addition, a datanode low on disk capacity continues to participate in pipeline allocation and keeps receiving write requests; these requests fail and can send the client into a retry loop.
Ratis log disks are also not currently covered by the disk checker. Once a Ratis disk fills up, existing pipelines cannot be closed, because closing a pipeline involves taking a Ratis snapshot, which does not succeed when the Ratis disk is full. New pipelines on such disks likewise cannot function and end up failing write requests, further contributing to client retries.
Finally, nodes low on disk capacity should not be chosen as targets for container re-replication.
The goal of this Jira is to address disk-related issues on datanodes holistically.
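As a rough illustration of the reserved-space idea, the sketch below computes a volume's usable space as capacity minus used minus a configured reserved headroom, mirroring HDFS's `dfs.datanode.du.reserved`. All class and method names here are hypothetical, not the actual Ozone implementation:

```java
// Hypothetical sketch: reserved-space accounting for a datanode volume,
// analogous to HDFS's dfs.datanode.du.reserved. Names are illustrative only.
public class VolumeSpaceCheck {

    // Space a new write may actually use: capacity minus what is already
    // used, minus the operator-configured reserved headroom.
    static long available(long capacity, long used, long reserved) {
        return Math.max(0L, capacity - used - reserved);
    }

    // A volume (or Ratis log dir) would qualify for new pipelines or as a
    // re-replication target only if it can hold the required bytes.
    static boolean hasRoom(long capacity, long used, long reserved, long required) {
        return available(capacity, used, reserved) >= required;
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        // 100 GB disk, 90 GB used, 5 GB reserved: only 5 GB remain usable.
        System.out.println(available(100 * gb, 90 * gb, 5 * gb));
        // A 10 GB container replica should not land on this volume.
        System.out.println(hasRoom(100 * gb, 90 * gb, 5 * gb, 10 * gb));
    }
}
```

With such a check in place, pipeline allocation, snapshot-taking, and re-replication target selection could all consult the same availability predicate instead of discovering a full disk only when a write fails.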
Attachments
Issue Links
- is a parent of
  - HDDS-3022 Datanode unable to close Pipeline after disk out of space (Resolved)
- is related to
  - HDDS-7365 Integrate container and volume scanners (Resolved)
  - RATIS-1375 Handle bad storage dir due to disk failures (Resolved)
- relates to
  - RATIS-1377 Ratis min free space for storage dirs (Resolved)