  1. Apache Ozone
  2. HDDS-4666

Handling disk issues in Datanodes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ozone Datanode, SCM
    • Labels: None

    Description

      Currently, Ozone datanodes have no notion of reserved space, unlike HDFS datanodes. As a result, a datanode that is low on disk capacity continues to participate in pipeline allocation and keeps receiving write requests. These requests fail and can send the client into a retry loop.
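
      As a rough illustration of the reserved-space idea, the sketch below subtracts a per-volume reservation from the free space and treats a datanode as writable only if at least one volume still has room. The class, field, and method names (`ReservedSpaceSketch`, `VolumeUsage`, `canAcceptWrites`) and the `minFreeBytes` threshold are hypothetical and are not taken from the Ozone code base.

```java
import java.util.List;

public class ReservedSpaceSketch {

    /** Hypothetical snapshot of one datanode volume; not an Ozone class. */
    static final class VolumeUsage {
        final long capacityBytes;
        final long usedBytes;
        final long reservedBytes; // space held back for non-Ozone use (assumption)

        VolumeUsage(long capacityBytes, long usedBytes, long reservedBytes) {
            this.capacityBytes = capacityBytes;
            this.usedBytes = usedBytes;
            this.reservedBytes = reservedBytes;
        }

        /** Space left for new writes after honouring the reservation, never negative. */
        long availableForWrites() {
            return Math.max(0L, capacityBytes - usedBytes - reservedBytes);
        }
    }

    /**
     * Returns true if at least one volume can still take minFreeBytes of new data.
     * A placement policy could skip datanodes for which this returns false.
     */
    static boolean canAcceptWrites(List<VolumeUsage> volumes, long minFreeBytes) {
        for (VolumeUsage volume : volumes) {
            if (volume.availableForWrites() >= minFreeBytes) {
                return true;
            }
        }
        return false;
    }
}
```

      Under these assumptions, a node whose volumes all fall below the threshold would be left out of pipeline allocation instead of accepting writes that are bound to fail.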

      Ratis log disks are also not currently covered by the disk checker. Once a Ratis disk is full, existing pipelines cannot be closed, because closing a pipeline involves taking a Ratis snapshot, which will not succeed on a full disk. Likewise, new pipelines cannot function on such disks; their write requests fail and add to the client retry chain.
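
      One way a disk checker might cover the Ratis log directory is to compare its usable space against a minimum threshold, so that the disk is flagged before it fills up completely. The sketch below is only an illustration using `java.io.File`; the class name `RatisDiskCheckSketch`, both method names, and the threshold parameter are hypothetical and do not describe the actual Ozone disk checker.

```java
import java.io.File;

public class RatisDiskCheckSketch {

    /** Pure threshold comparison, kept separate so it is easy to test. */
    static boolean hasEnoughSpace(long usableBytes, long minFreeBytes) {
        return usableBytes >= minFreeBytes;
    }

    /**
     * Returns true if the given directory exists and its partition still has
     * at least minFreeBytes usable. A missing directory is treated as unhealthy.
     */
    static boolean isLogDirHealthy(File ratisLogDir, long minFreeBytes) {
        if (!ratisLogDir.isDirectory()) {
            return false;
        }
        // getUsableSpace() reports the bytes available to this JVM on the partition.
        return hasEnoughSpace(ratisLogDir.getUsableSpace(), minFreeBytes);
    }
}
```

      Leaving headroom through such a threshold is what would allow a snapshot, and therefore a pipeline close, to still succeed once the disk is marked unhealthy.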

      In addition, nodes low on disk capacity should not be chosen as targets for container re-replication.

      The goal of this Jira is to address disk-related issues on datanodes holistically.

            People

              Assignee: Shashikant Banerjee
              Reporter: Shashikant Banerjee
              Votes: 0
              Watchers: 8
