Hadoop HDFS / HDFS-16836

StandbyCheckpointer can still trigger rollback fs image after RU is finalized

Details

    • Reviewed

    Description

      StandbyCheckpointer triggers a rollback fsimage checkpoint when a rolling upgrade (RU) is started.

      When RU is started, a flag (needRollbackImage) is set to true during edit log replay.

      It is reset to false only when doCheckpoint() succeeds.

      Consider the following scenario:

      1. RU is started; needRollbackImage is set to true.
      2. doCheckpoint() fails.
      3. RU is finalized.
      4. namesystem.getFSImage().hasRollbackFSImage() now always returns false, since a rollback image cannot be generated once RU is over.
      5. needRollbackImage is therefore never reset to false.
      6. Because the flag stays true, StandbyCheckpointer keeps attempting a rollback checkpoint, and the checkpoint thresholds (1m txns) and period (1hr) are not honored.

      StandbyCheckpointer:
      void doWork() {
        ....
        doCheckpoint();

        // reset needRollbackCheckpoint to false only when we finish a ckpt
        // for rollback image
        if (needRollbackCheckpoint
            && namesystem.getFSImage().hasRollbackFSImage()) {
          namesystem.setCreatedRollbackImages(true);
          namesystem.setNeedRollbackFsImage(false);
        }
        lastCheckpointTime = now;
      } 
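
The scenario above can be reproduced with a small, self-contained model. This is only a sketch: `RollbackFlagSketch`, `FakeNamesystem`, `doWorkOnce`, and `countAttempts` are hypothetical names, not Hadoop classes, and the model tracks only the transaction threshold (the period check is omitted). A failed checkpoint is modeled as one that simply produces no rollback image. The `applyGuard` branch shows one possible way to clear the flag once RU is no longer in progress; it is an illustrative assumption, not necessarily the fix adopted for this issue.

```java
// Hypothetical, simplified model of the flag handling described in this issue.
// None of these types are the real Hadoop classes.
class RollbackFlagSketch {

    // Stand-in for the namesystem state the checkpointer reads and writes.
    static class FakeNamesystem {
        boolean needRollbackFsImage; // set during edit log replay when RU starts
        boolean rollingUpgrade;      // true while RU is in progress
        boolean hasRollbackFsImage;  // true once a rollback image exists
    }

    // One pass of the checkpointer loop. Returns true if a checkpoint was attempted.
    static boolean doWorkOnce(FakeNamesystem ns, long uncheckpointedTxns,
                              long txnThreshold, boolean checkpointSucceeds,
                              boolean applyGuard) {
        // Assumed guard (illustration only): drop the flag once RU is over.
        if (applyGuard && ns.needRollbackFsImage && !ns.rollingUpgrade) {
            ns.needRollbackFsImage = false;
        }

        boolean needRollbackCheckpoint = ns.needRollbackFsImage;
        boolean needCheckpoint =
            needRollbackCheckpoint || uncheckpointedTxns >= txnThreshold;
        if (!needCheckpoint) {
            return false;
        }

        // Model of doCheckpoint(): a rollback image can only be produced
        // while RU is still in progress and the checkpoint succeeds.
        if (checkpointSucceeds && ns.rollingUpgrade) {
            ns.hasRollbackFsImage = true;
        }

        // Same reset condition as in the excerpt above.
        if (needRollbackCheckpoint && ns.hasRollbackFsImage) {
            ns.needRollbackFsImage = false;
        }
        return true;
    }

    // Replays the scenario and counts how many checkpoints are attempted.
    static int countAttempts(boolean applyGuard) {
        FakeNamesystem ns = new FakeNamesystem();
        ns.rollingUpgrade = true;       // step 1: RU started
        ns.needRollbackFsImage = true;  //         flag set during replay
        int attempts = 0;

        // Step 2: the rollback checkpoint fails.
        if (doWorkOnce(ns, 0, 1_000_000, false, applyGuard)) {
            attempts++;
        }

        ns.rollingUpgrade = false;      // step 3: RU finalized

        // Five later passes with no uncheckpointed transactions.
        for (int i = 0; i < 5; i++) {
            if (doWorkOnce(ns, 0, 1_000_000, true, applyGuard)) {
                attempts++;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println("attempts without guard: " + countAttempts(false));
        System.out.println("attempts with guard:    " + countAttempts(true));
    }
}
```

Without the guard, every pass after finalization still attempts a checkpoint even though no transactions are pending, which mirrors steps 4 to 6. With the guard, only the original failed attempt is counted.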

            People

              Lei Yang