Details

- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Labels: ghx-label-1
Description
Currently, when we create a new Iceberg table in a HadoopCatalog, we create a new HadoopCatalog instance for each of these tables here.
The issue with this is that a catalog object such as HadoopCatalog holds an Iceberg FileIO instance, and the memory consumption of such an instance can be measured in MBs. This can blow up the catalog/localCatalog memory even if the Iceberg tables in the HadoopCatalog are empty.
As a solution, we should have a kind of HadoopCatalog store that caches HadoopCatalog objects keyed by their location string: on a lookup we either reuse a cached HadoopCatalog or create a new one and add it to the store. With this approach, tables under the same HadoopCatalog location would share the same HadoopCatalog instance, and we would no longer end up with as many FileIO instances as we have tables in the HadoopCatalog.
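A minimal sketch of the proposed store, assuming a concurrent map keyed by the catalog location string. The `HadoopCatalog` class below is a lightweight stand-in for the real `org.apache.iceberg.hadoop.HadoopCatalog`, and the `HadoopCatalogStore`/`getOrCreate` names are hypothetical, not existing Iceberg API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for org.apache.iceberg.hadoop.HadoopCatalog; in the real code the
// constructor would also set up the memory-heavy FileIO instance.
class HadoopCatalog {
    final String location;

    HadoopCatalog(String location) {
        this.location = location;
    }
}

// Hypothetical store: one shared HadoopCatalog per catalog location.
class HadoopCatalogStore {
    private static final Map<String, HadoopCatalog> CACHE = new ConcurrentHashMap<>();

    // Returns the cached catalog for this location, creating it only on first use.
    // computeIfAbsent guarantees at most one instance per key under concurrency.
    static HadoopCatalog getOrCreate(String location) {
        return CACHE.computeIfAbsent(location, HadoopCatalog::new);
    }
}

public class Demo {
    public static void main(String[] args) {
        HadoopCatalog a = HadoopCatalogStore.getOrCreate("hdfs://nn/warehouse");
        HadoopCatalog b = HadoopCatalogStore.getOrCreate("hdfs://nn/warehouse");
        HadoopCatalog c = HadoopCatalogStore.getOrCreate("hdfs://nn/other");
        System.out.println(a == b); // same location -> same shared instance
        System.out.println(a == c); // different location -> separate instance
    }
}
```

In a production version the store would likely also need an eviction policy (and `close()` calls on evicted catalogs to release their FileIO resources), but the key idea is the single shared instance per location.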