Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- 0.10.0
- None
- None
Description
BucketizedHiveInputFormat currently creates one split per input file, which can result in too many map tasks. If the input files are small (below some threshold, which could be made configurable), combining files that have the same bucket number and the same input format could help reduce total execution time.
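To make the proposal concrete, here is a minimal sketch of the grouping step, not a patch against Hive. The class `BucketSplitCombiner`, the `InputFile` holder, and the `thresholdBytes` parameter are all hypothetical names introduced for illustration. It assumes the bucket number and input format of each file are already known. Files are packed greedily, in listing order, into groups that share a bucket number and input format, and a group is closed once adding the next file would push it past the threshold.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive: groups small files into combined splits.
class BucketSplitCombiner {

    // Hypothetical description of one input file.
    static final class InputFile {
        final String path;
        final int bucket;          // bucket number the file belongs to
        final String inputFormat;  // input format class name
        final long length;         // file size in bytes

        InputFile(String path, int bucket, String inputFormat, long length) {
            this.path = path;
            this.bucket = bucket;
            this.inputFormat = inputFormat;
            this.length = length;
        }
    }

    // Returns one list of files per proposed split. Files are only combined
    // when they share both bucket number and input format, and a combined
    // split is not grown beyond thresholdBytes. A file larger than the
    // threshold ends up alone in its own split.
    static List<List<InputFile>> combine(List<InputFile> files, long thresholdBytes) {
        List<List<InputFile>> splits = new ArrayList<>();
        Map<String, List<InputFile>> openGroup = new HashMap<>();
        Map<String, Long> openSize = new HashMap<>();

        for (InputFile f : files) {
            String key = f.bucket + "|" + f.inputFormat;
            List<InputFile> group = openGroup.get(key);
            long size = (group == null) ? 0L : openSize.get(key);

            // Start a new split if none is open for this key,
            // or if this file would not fit in the open one.
            if (group == null || size + f.length > thresholdBytes) {
                group = new ArrayList<>();
                splits.add(group);
                openGroup.put(key, group);
                size = 0L;
            }
            group.add(f);
            openSize.put(key, size + f.length);
        }
        return splits;
    }
}
```

With a threshold of 100 bytes and files of 40, 50, and 20 bytes in the same bucket and format, the first two would share one split and the third would start another. Greedy packing in listing order is only one possible policy. Any real change would also need to preserve whatever file ordering the bucketed join relies on.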
Issue Links
- is blocked by
  - HIVE-3171 Bucketed sort merge join doesn't work when multiple files exist for small alias (Closed)