IMPALA / IMPALA-13314

Create a store for HadoopCatalogs to avoid creating a new one for each table

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Frontend

    Description

      Currently, when we create a new Iceberg table in HadoopCatalog, we create a new HadoopCatalog instance for each such table.

      The problem is that a catalog object such as HadoopCatalog holds an Iceberg FileIO instance, and a single such instance can consume memory on the order of megabytes. This can blow up the catalog/localCatalog memory even if the Iceberg tables in HadoopCatalog are empty.

      As a solution, we should have a kind of HadoopCatalog store that is keyed on the location string: a lookup either returns the HadoopCatalog object already cached for that location or creates a new HadoopCatalog and caches it in the store. With this approach, tables under the same HadoopCatalog location would share the same HadoopCatalog instance, and we would no longer end up with one FileIO instance per table in HadoopCatalog.
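      The proposed store could look roughly like the sketch below. This is only an illustration under stated assumptions, not the eventual Impala change: the class name CatalogStore, the method getOrCreate, and the trailing-slash normalization are all hypothetical, and the type parameter C stands in for HadoopCatalog so that the sketch compiles without Iceberg on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical store that hands out one catalog object per location string.
 * C stands in for org.apache.iceberg.hadoop.HadoopCatalog.
 */
final class CatalogStore<C> {
  private final ConcurrentMap<String, C> catalogs = new ConcurrentHashMap<>();
  // Builds a catalog for a location; called at most once per distinct key.
  private final Function<String, C> factory;

  CatalogStore(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the cached catalog for the location, creating it on first use. */
  C getOrCreate(String location) {
    // computeIfAbsent is atomic, so concurrent table loads for the same
    // location end up sharing a single catalog instance.
    return catalogs.computeIfAbsent(normalize(location), factory);
  }

  /** Number of distinct catalog instances currently cached. */
  int size() {
    return catalogs.size();
  }

  // Assumption: trailing slashes are the only spelling difference that needs
  // to be normalized so that equivalent locations map to the same key.
  static String normalize(String location) {
    String s = location;
    while (s.length() > 1 && s.endsWith("/")) {
      s = s.substring(0, s.length() - 1);
    }
    return s;
  }
}
```

      In Impala the factory would presumably wrap the existing construction, for example loc -> new HadoopCatalog(conf, loc), where conf is the Hadoop Configuration already in use. Eviction and the lifetime of cached entries are not covered by this issue and are left out of the sketch.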

          People

            Assignee: Unassigned
            Reporter: Gabor Kaszab (gaborkaszab)
            Votes: 0
            Watchers: 2
