Description
Split grouping currently relies on a file size measurement: the exact size of the split as it sits at rest on HDFS.
This is not a valid measure for columnar formats, and it suffers especially when the data is skewed toward highly compressible values.
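To illustrate the problem, here is a minimal sketch in Java. It is not the actual Hive or Tez grouping code: the `Split` class, its `estimatedBytes` field, the `group` method, and the greedy strategy are all hypothetical names and choices made for this example. It compares grouping by on-disk size with grouping by an estimated size that accounts for compression.

```java
import java.util.ArrayList;
import java.util.List;

class SplitGroupingSketch {
    // Hypothetical split descriptor: size at rest on HDFS plus an
    // assumed estimate of the data size once decompressed and projected.
    static final class Split {
        final String name;
        final long onDiskBytes;
        final long estimatedBytes;

        Split(String name, long onDiskBytes, long estimatedBytes) {
            this.name = name;
            this.onDiskBytes = onDiskBytes;
            this.estimatedBytes = estimatedBytes;
        }
    }

    // Greedy grouping (an assumption for illustration): close the current
    // group whenever adding the next split would exceed targetBytes.
    static List<List<Split>> group(List<Split> splits, long targetBytes, boolean useEstimate) {
        List<List<Split>> groups = new ArrayList<>();
        List<Split> current = new ArrayList<>();
        long currentSize = 0;
        for (Split s : splits) {
            long size = useEstimate ? s.estimatedBytes : s.onDiskBytes;
            if (!current.isEmpty() && currentSize + size > targetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(s);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sizes are in MB for readability; split "b" is highly compressible.
        List<Split> splits = new ArrayList<>();
        splits.add(new Split("a", 100, 100));
        splits.add(new Split("b", 10, 900));
        splits.add(new Split("c", 100, 100));

        System.out.println("groups by on-disk size:   " + group(splits, 256, false).size());
        System.out.println("groups by estimated size: " + group(splits, 256, true).size());
    }
}
```

With these made-up numbers, grouping by on-disk size packs all three splits into a single group even though split "b" expands to far more data than the target, while grouping by the estimate places it in a group of its own.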
Attachments
Issue Links
- is related to: HIVE-7428 OrcSplit fails to account for columnar projections in its size estimates (Resolved)