  1. Hadoop HDFS
  2. HDFS-17573

Allow turning on both FSImage parallelization and compression


Details

    • Reviewed

    Description

      The feature added in HDFS-14617 (Improve FSImage load time by writing sub-sections to the FSImage index, by Stephen O'Donnell) makes loading the FSImage much faster.

       

      However, this option cannot be enabled when dfs.image.compress=true is set.

      In my opinion, larger clusters require both settings at the same time.

      For example, the cluster I'm using has approximately 6 million file system objects, and the FSImage is approximately 11GB with the dfs.image.compress=true setting.

      If the dfs.image.compress option is turned off, the FSImage is expected to exceed 30GB, in which case transferring it from the standby to the active NameNode will take a long time and consume a lot of network bandwidth.
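To put the two sizes above in perspective, here is a rough back-of-the-envelope calculation. The 1 Gbit/s link speed and the function name `transfer_seconds` are assumptions made for illustration only; they do not come from this issue, and real transfers are also affected by throttling and other traffic.

```python
# Rough transfer-time estimate for an FSImage of a given size.
# ASSUMPTION: a dedicated 1 Gbit/s link; this figure is illustrative only.

def transfer_seconds(size_gb: float, link_gbit_per_s: float) -> float:
    """Seconds needed to move size_gb gigabytes (10^9 bytes) over the link."""
    return size_gb * 8 / link_gbit_per_s

compressed = transfer_seconds(11, 1.0)    # FSImage size with dfs.image.compress=true
uncompressed = transfer_seconds(30, 1.0)  # expected lower bound without compression
print(f"compressed: {compressed:.0f} s, uncompressed: {uncompressed:.0f} s")
```

Under this assumption the uncompressed image takes nearly three times as long to transfer, and the gap grows with the size of the namespace.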

       

      HDFS-16147 (by kinit) demonstrated that parallel FSImage loading and FSImage compression can be turned on at the same time. (It also worked well in my environment.)
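The general idea of combining the two features can be sketched as follows: if every sub-section is compressed as an independent stream and its offset and length are recorded in an index, the sub-sections can be decompressed concurrently. This is only a conceptual sketch, not the code from HDFS-16147 or from Hadoop. It uses Python's `zlib` and a thread pool as stand-ins, and the names `write_image` and `load_image` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def write_image(sections):
    """Compress each sub-section on its own and record (offset, length) in an index."""
    blob = bytearray()
    index = []
    for raw in sections:
        packed = zlib.compress(raw)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index

def load_image(blob, index, threads=4):
    """Decompress sub-sections concurrently; each one is an independent stream."""
    def load_one(entry):
        offset, length = entry
        return zlib.decompress(blob[offset:offset + length])
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves the order of the index entries
        return list(pool.map(load_one, index))

# Toy data; the empty middle section is included as an edge case.
sections = [b"inode-data " * 1000, b"", b"directory-data " * 1000]
blob, index = write_image(sections)
restored = load_image(blob, index)
print(restored == sections)
```

Because no sub-section depends on the decompression state of another, a loader can start at any indexed offset, which is what makes parallel loading possible in this toy model.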

      I created this new Jira and PR because the discussion in HDFS-16147 ended in 2021, and I would like the change to be officially included in the next release rather than remaining in the Patch Available state.

      The actual code of the patch was written by kinit; I resolved the empty sub-section problem (see the comment below in HDFS-16147) and added test code.

      If this is not the proper way to do this, please let me know how else I can contribute.

      Thanks.

      Attachments

        1. compressed-image-load-serial.png
          102 kB
          Sungdong Kim
        2. compressed-subsection-image-load-parallel.png
          102 kB
          Sungdong Kim
        3. compressed-subsection-image-load-serial.png
          101 kB
          Sungdong Kim

            People

              Sungdong Kim