  1. Apache Ozone
  2. HDDS-6462 Phase II : Erasure Coding Offline Recovery & Read/Write Improvements
  3. HDDS-7928

EC: Change ContainerReplicaPendingOps to store deadline rather than scheduled time

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: SCM

    Description

      In order to facilitate a "Move Manager" for the balancer, we need to schedule pending ops on the datanodes with different deadlines.

      For example, a replica scheduled due to replication would have a timeout of 10 minutes.

      However, the balancer schedules large batches of work each hour, so replicas scheduled by the balancer probably need a timeout of 60 minutes.

      To facilitate this, we need to change ContainerReplicaPendingOps to store the deadline rather than the scheduled time. This also fixes another issue: the ContainerReplicaPendingOps "expiry thread" had its own timeout setting for all pending ops, unrelated to the replication manager settings.
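
A minimal sketch of the idea, not the actual Ozone code: each pending op records an absolute deadline chosen by whoever schedules it, so the expiry check compares the current time against that deadline and needs no timeout setting of its own. The class `PendingOpsSketch`, its methods, and the string descriptions are hypothetical names used only for illustration. The 10 and 60 minute values come from the description above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration only; this is not ContainerReplicaPendingOps. */
public class PendingOpsSketch {

  /** A pending op that carries its own absolute deadline in milliseconds. */
  static final class PendingOp {
    final String description;
    final long deadlineMillis;

    PendingOp(String description, long deadlineMillis) {
      this.description = description;
      this.deadlineMillis = deadlineMillis;
    }
  }

  private final List<PendingOp> ops = new ArrayList<>();

  /** The caller picks the timeout, so each scheduler can use a different one. */
  public void schedule(String description, long nowMillis, Duration timeout) {
    ops.add(new PendingOp(description, nowMillis + timeout.toMillis()));
  }

  /** Removes and returns ops whose deadline has passed; no global timeout is needed. */
  public List<String> removeExpired(long nowMillis) {
    List<String> expired = new ArrayList<>();
    Iterator<PendingOp> it = ops.iterator();
    while (it.hasNext()) {
      PendingOp op = it.next();
      if (op.deadlineMillis < nowMillis) {
        expired.add(op.description);
        it.remove();
      }
    }
    return expired;
  }

  public int size() {
    return ops.size();
  }

  public static void main(String[] args) {
    PendingOpsSketch pending = new PendingOpsSketch();
    long start = 0L; // fixed start time keeps the example deterministic

    pending.schedule("replication", start, Duration.ofMinutes(10));
    pending.schedule("balancer move", start, Duration.ofMinutes(60));

    // After 30 minutes only the replication op has passed its deadline.
    long after30Min = start + Duration.ofMinutes(30).toMillis();
    System.out.println(pending.removeExpired(after30Min)); // prints [replication]

    // After 61 minutes the balancer op has expired as well.
    long after61Min = start + Duration.ofMinutes(61).toMillis();
    System.out.println(pending.removeExpired(after61Min)); // prints [balancer move]
  }
}
```

If the scheduled time were stored instead, the expiry check would have to apply one timeout to every op, which is the limitation described above.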

      As part of this change, some APIs into RM have been changed slightly to allow the move manager to schedule replication commands via RM, so that the logic is consolidated in a single place. A later PR that adds the MoveManager will use these APIs.

            People

              Assignee: Stephen O'Donnell (sodonnell)
              Reporter: Stephen O'Donnell (sodonnell)