Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
None
Description
A quick investigation of ARROW-11781 showed that much of the overhead lies in evaluating partition expressions against the filter. Much of that cost is plain kernel evaluation, but we should still have benchmarks for key Datasets internals such as SimplifyWithGuarantee.
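For illustration only, here is a rough sketch of the kind of operation being discussed and how one might time it. It is a toy model in Python, not Arrow's C++ implementation or API: the tuple-based filter representation, the guarantee dictionary, and the function name simplify_with_guarantee are all assumptions made for this example. It assumes a filter is a conjunction of simple comparisons and a partition guarantee pins each partition field to one value.

```python
import operator
import timeit

# Toy model (hypothetical, not Arrow's API): a filter is a conjunction of
# (field, op, value) comparisons; a guarantee maps partition fields to the
# single value they are known to hold within one partition.
OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt, ">": operator.gt}


def simplify_with_guarantee(filter_terms, guarantee):
    """Return False if the guarantee contradicts the filter, otherwise the
    list of terms that still have to be checked against the data."""
    remaining = []
    for field, op, value in filter_terms:
        if field in guarantee:
            # The field's value is known, so the comparison folds to a constant.
            if not OPS[op](guarantee[field], value):
                return False  # the whole partition can be skipped
        else:
            remaining.append((field, op, value))
    return remaining


if __name__ == "__main__":
    filter_terms = [("year", "==", 2020), ("month", ">", 6), ("x", "<", 10)]
    # One guarantee per partition directory, e.g. year=2020/month=7.
    guarantees = [
        {"year": y, "month": m} for y in range(2000, 2022) for m in range(1, 13)
    ]

    def run():
        return [simplify_with_guarantee(filter_terms, g) for g in guarantees]

    seconds = timeit.timeit(run, number=100)
    print(f"{len(guarantees)} partitions x 100 runs: {seconds:.4f} s")
```

The point of the sketch is that the simplification runs once per partition, so its cost scales with the number of partitions even when only a few files end up being read. A real benchmark would exercise the actual C++ function with realistic expressions.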
Issue Links
- relates to ARROW-11781 [Python] Reading small amount of files from a partitioned dataset is unexpectedly slow (Resolved)
- links to