Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Resolved
Description
This is a placeholder for working out what we would need to do to support dynamic file-level pruning in Iceberg using runtime filters, i.e. to reach parity with partition pruning.
- If there is a single partition value per file, then applying bloom filters to the row group stats would be effective at pruning files.
- If there are partition transforms, e.g. hash-based, then I think we probably need to track the partition that each file is associated with and add some custom logic to the Parquet scanner to do the partition pruning.
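
To make the two cases above concrete, here is a rough sketch in Python of what the pruning step might look like. Everything in it is an assumption for illustration only: `DataFile`, `bucket`, `build_filter` and `prune_files` are hypothetical names, a plain set stands in for the bloom filter (only membership tests are used), and the hash in `bucket` is a placeholder rather than Iceberg's real bucket transform. It also assumes that, for the transformed case, the filter is built over the transformed join keys.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical per-file metadata: path plus the tracked partition value."""
    path: str
    partition_value: int


def bucket(key: int, num_buckets: int = 4) -> int:
    # Placeholder hash transform for illustration; NOT Iceberg's bucket function.
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def build_filter(build_keys: Iterable[int],
                 transform: Callable[[int], int] = lambda k: k) -> FrozenSet[int]:
    # A set stands in for a bloom filter here. The assumption is that the
    # filter is built over transformed keys when a partition transform exists.
    return frozenset(transform(k) for k in build_keys)


def prune_files(files: Iterable[DataFile],
                runtime_filter: FrozenSet[int]) -> List[DataFile]:
    # Keep only files whose tracked partition value may match the filter.
    return [f for f in files if f.partition_value in runtime_filter]


# Case 1: one untransformed partition value per file.
identity_files = [DataFile(f"part={v}/data.parquet", v) for v in (1, 2, 3)]
identity_kept = prune_files(identity_files, build_filter([2, 3]))

# Case 2: hash-based transform, with the bucket tracked per file.
bucket_files = [DataFile(f"bucket={b}/data.parquet", b) for b in range(4)]
bucket_kept = prune_files(bucket_files, build_filter([10, 11], bucket))
```

In the second case a file survives only if its tracked bucket equals the bucket of at least one build-side key, which is the custom logic the scanner would need because the raw column values in the file no longer identify the partition directly.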