IMPALA-4789

Slow metadata loading with many partitions that have inconsistent HDFS path qualification

Details

    Description

      The fix for IMPALA-4172/IMPALA-3653 introduced a performance regression for loading tables that have many partitions with:
      1. inconsistent HDFS path qualification or
      2. a custom location (not under the table root dir)

      For the first issue, consider a table whose root path is 'hdfs://localhost:8020/warehouse/tbl/'.
      A partition with an unqualified location '/warehouse/tbl/p=1' will not be recognized as a descendant of the table root dir by FileSystemUtil.isDescendantPath() because of how Path.equals() behaves, even if 'hdfs://localhost:8020' is the default filesystem.
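The qualification mismatch described above can be illustrated with a small standalone sketch. It models locations with java.net.URI rather than Hadoop's Path, and the class and method names (QualifiedPathSketch, qualify, isDescendantNaive, isDescendantQualified) are hypothetical, not part of Impala. It only shows why comparing a qualified root against an unqualified partition location fails, and how qualifying both sides first would make the check succeed.

```java
import java.net.URI;
import java.util.Objects;

/** Illustration only: models locations with java.net.URI, not Hadoop's Path. */
final class QualifiedPathSketch {

  /** Adds the default filesystem's scheme and authority to an unqualified path. */
  static URI qualify(URI path, URI defaultFs) {
    if (path.getScheme() != null) return path;  // already qualified
    return URI.create(
        defaultFs.getScheme() + "://" + defaultFs.getAuthority() + path.getRawPath());
  }

  /** Strict descendant check that compares scheme and authority exactly as given. */
  static boolean isDescendantNaive(URI p, URI parent) {
    if (!Objects.equals(p.getScheme(), parent.getScheme())) return false;
    if (!Objects.equals(p.getAuthority(), parent.getAuthority())) return false;
    String parentPath = parent.getPath();
    String prefix = parentPath.endsWith("/") ? parentPath : parentPath + "/";
    String child = p.getPath();
    return child.length() > prefix.length() && child.startsWith(prefix);
  }

  /** Same check, but both sides are qualified against the default filesystem first. */
  static boolean isDescendantQualified(URI p, URI parent, URI defaultFs) {
    return isDescendantNaive(qualify(p, defaultFs), qualify(parent, defaultFs));
  }

  public static void main(String[] args) {
    URI defaultFs = URI.create("hdfs://localhost:8020");
    URI tblLocation = URI.create("hdfs://localhost:8020/warehouse/tbl/");
    URI partDir = URI.create("/warehouse/tbl/p=1");

    System.out.println(isDescendantNaive(partDir, tblLocation));                 // false
    System.out.println(isDescendantQualified(partDir, tblLocation, defaultFs));  // true
  }
}
```

Whether the real fix should qualify paths when they are stored or when they are compared is a separate design question that this sketch does not settle.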

      Such partitions are incorrectly recognized as having a custom location and are treated specially. The treatment of such partitions is very inefficient, as shown in the following code snippets:

      HdfsTable.loadAllPartitions():

      ...
              if (!dirsToLoad.contains(partDir) &&
                  !FileSystemUtil.isDescendantPath(partDir, tblLocation)) { <--- isDescendantPath() wrongly returns false here, so the partition is treated as custom
                // This partition has a custom filesystem location. Load its file/block
                // metadata separately by adding it to the list of dirs to load.
                dirsToLoad.add(partDir);
              }
      ...
      

      HdfsTable.loadMetadataAndDiskIds() calls HdfsTable.loadBlockMetadata() once for every location:

        private void loadMetadataAndDiskIds(List<Path> locations,
            HashMap<Path, List<HdfsPartition>> partsByPath) {
          LOG.info(String.format("Loading file and block metadata for %s partitions: %s",
              partsByPath.size(), getFullName()));
          for (Path location: locations) { loadBlockMetadata(location, partsByPath); }
          LOG.info(String.format("Loaded file and block metadata for %s partitions: %s",
              partsByPath.size(), getFullName()));
        }
      

      HdfsTable.loadBlockMetadata():

      ...
            // Clear the state of partitions under dirPath since they are now updated based
            // on the current snapshot of files in the directory.
            for (Map.Entry<Path, List<HdfsPartition>> entry: partsByPath.entrySet()) { <--- partsByPath has an entry for every partition in the table
              Path partDir = entry.getKey();
              if (!FileSystemUtil.isDescendantPath(partDir, dirPath)) continue;
              for (HdfsPartition partition: entry.getValue()) {
                partition.setFileDescriptors(new ArrayList<FileDescriptor>());
              }
            }
      ...
      

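      As a result, isDescendantPath() is called roughly #numLocations * #totalPartitions times, which adds up quickly for tables with many partitions. If every partition is treated as having a custom location, the number of calls grows roughly quadratically with the number of partitions.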

      There are two issues to fix here:
      1. The bug in recognizing partitions under the root table dir (for inconsistent qualification of table/partition locations)
      2. The expensive loop for partitions with custom locations (even if legitimately custom)
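For the second issue, one possible direction is sketched below. It is not the fix that was actually applied, and it assumes that each custom location passed to the loader is exactly a partition directory, as in the loadAllPartitions() snippet where partDir itself is added to dirsToLoad. Paths are modeled as plain strings, and all names (ClearPartitionsSketch, clearByScanning, clearByLookup) are hypothetical. The sketch contrasts scanning every partsByPath entry per location with a direct map lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: paths are plain strings, partitions are just names. */
final class ClearPartitionsSketch {

  static boolean isSameOrDescendant(String dir, String parent) {
    return dir.equals(parent) || dir.startsWith(parent + "/");
  }

  /** Mirrors the loop described above: every location scans every entry. */
  static long clearByScanning(List<String> locations,
      Map<String, List<String>> partsByPath, Set<String> cleared) {
    long checks = 0;
    for (String location : locations) {
      for (Map.Entry<String, List<String>> entry : partsByPath.entrySet()) {
        checks++;
        if (!isSameOrDescendant(entry.getKey(), location)) continue;
        cleared.addAll(entry.getValue());  // stands in for resetting file descriptors
      }
    }
    return checks;
  }

  /** Assumes each location is itself a partition directory: one lookup per location. */
  static long clearByLookup(List<String> locations,
      Map<String, List<String>> partsByPath, Set<String> cleared) {
    long lookups = 0;
    for (String location : locations) {
      lookups++;
      List<String> parts = partsByPath.get(location);
      if (parts != null) cleared.addAll(parts);
    }
    return lookups;
  }

  public static void main(String[] args) {
    List<String> locations = new ArrayList<>();
    Map<String, List<String>> partsByPath = new HashMap<>();
    for (int i = 0; i < 1000; i++) {
      String dir = "/custom/p=" + i;
      locations.add(dir);
      partsByPath.put(dir, Arrays.asList("p=" + i));
    }
    System.out.println(clearByScanning(locations, partsByPath, new HashSet<>()));  // 1000000
    System.out.println(clearByLookup(locations, partsByPath, new HashSet<>()));    // 1000
  }
}
```

The table root directory would still need the descendant scan, since many partitions live below it, but that scan happens once rather than once per custom location.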

            People

              alex.behm Alexander Behm