Details

    Description

      Hive QE have observed a slowdown in LLAP queries caused by the time taken to create and close S3A filesystem instances. A key factor is that they keep closing the FS instances (HIVE-27884), but the profiles suggest the regression itself comes from the following:

      • two S3 clients are being created (sync and async)
      • client creation seems to spend a lot of time scanning the classpath for "global interceptors", which is at least an O(jars) operation; the number of index entries in the zip files may be a factor too.
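
To get a feel for why such a scan scales with the number of jars, here is a minimal sketch that times a `ClassLoader.getResources()` lookup, which has to consult every classpath entry for the named resource. The default resource path below is an assumption about what the SDK looks for, not something taken from the profiles, and the class and method names are hypothetical; pass a different path as the first argument to test another resource.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical helper: times a classpath-wide resource lookup. */
class ClasspathScanTimer {

    // ASSUMPTION: resource path believed to be used for "global interceptors";
    // override it on the command line if it differs in your SDK version.
    static final String DEFAULT_RESOURCE =
        "software/amazon/awssdk/global/handlers/execution.interceptors";

    /** Returns every classpath location that provides the named resource. */
    static List<URL> findAll(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        List<URL> found = new ArrayList<>();
        try {
            // getResources() asks each classpath entry (jar or directory) in turn
            Enumeration<URL> urls = loader.getResources(resourceName);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : DEFAULT_RESOURCE;
        long start = System.nanoTime();
        List<URL> hits = findAll(name);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(hits.size() + " match(es) for " + name
            + " in " + micros + " us");
    }
}
```

Running this with the same classpath as the LLAP daemon, and again with a trimmed classpath, would give a rough indication of how much of the cost is attributable to the number of jars.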

      Proposed:

      • create the async client on demand, when the transfer manager is first invoked
      • investigate why passwords are still being looked up when InstanceProfileCredentialsProvider is in use; that seems slow too
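
The first proposal amounts to lazy initialization. A minimal sketch of the idea, assuming a generic holder class: `LazyReference` and `FakeAsyncClient` are hypothetical names used for illustration only, not S3A or SDK classes, and the real change may be structured differently.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical thread-safe holder that defers construction until first use. */
final class LazyReference<T> {
    private final Supplier<T> factory;
    private volatile T value;

    LazyReference(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the value on the first call; later calls return the cached one. */
    T get() {
        T result = value;
        if (result == null) {
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = factory.get();
                    value = result;
                }
            }
        }
        return result;
    }

    /** True once the factory has been invoked. */
    boolean isCreated() {
        return value != null;
    }
}

class LazyClientDemo {
    /** Hypothetical stand-in for an expensive-to-build async client. */
    static final class FakeAsyncClient { }

    static final AtomicInteger CREATED = new AtomicInteger();

    public static void main(String[] args) {
        // Filesystem initialization only registers the factory; nothing is built yet.
        LazyReference<FakeAsyncClient> asyncClient = new LazyReference<>(() -> {
            CREATED.incrementAndGet();
            return new FakeAsyncClient();
        });
        System.out.println("created after init: " + CREATED.get());      // 0

        // First use (e.g. the transfer manager) triggers creation; the second reuses it.
        asyncClient.get();
        asyncClient.get();
        System.out.println("created after two uses: " + CREATED.get());  // 1
    }
}
```

With this shape, a filesystem instance that is created and closed without ever using the transfer manager never pays for the second client. `isCreated()` lets `close()` skip shutting down a client that was never built.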

      SDK wishes

      • could the SDK let us turn off that scan for interceptors?

      Attaching screenshots of the profile. storediag snippet:

      
      [001]  fs.s3a.access.key = (unset)
      [002]  fs.s3a.secret.key = (unset)
      [003]  fs.s3a.session.token = (unset)
      [004]  fs.s3a.server-side-encryption-algorithm = (unset)
      [005]  fs.s3a.server-side-encryption.key = (unset)
      [006]  fs.s3a.encryption.algorithm = (unset)
      [007]  fs.s3a.encryption.key = (unset)
      [008]  fs.s3a.aws.credentials.provider = "com.amazonaws.auth.InstanceProfileCredentialsProvider" [core-site.xml]
      
      

      Attachments

        1. Screenshot 2024-06-14 at 17.12.59.png
          458 kB
          Steve Loughran
        2. Screenshot 2024-06-14 at 17.14.33.png
          456 kB
          Steve Loughran

            People

              stevel@apache.org Steve Loughran