Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 0.17.1
Description
There is currently no way for a user to directly access the underlying ORC metadata of a given file. It seems the C++ functions and objects already exist, and only the plumbing to Cython/Python, plus potentially a few C++ shims, is missing. Giving users the ability to retrieve the metadata without first reading the entire file could help numerous applications improve query performance by letting them determine which ORC stripes actually need to be read.
This would allow for something like:
import pyarrow.orc as orc
orc_metadata = orc.ORCFile(filename).metadata()
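To illustrate why stripe-level metadata matters, here is a minimal sketch of how an application might pick stripes once per-stripe statistics are available. It does not use pyarrow: `StripeStats` and `stripes_to_read` are hypothetical names invented for this example, and the min/max values are made-up stand-ins for whatever the exposed metadata would actually provide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StripeStats:
    """Hypothetical per-stripe summary for a single integer column."""
    index: int      # position of the stripe within the file
    num_rows: int   # number of rows stored in the stripe
    min_value: int  # smallest column value in the stripe
    max_value: int  # largest column value in the stripe


def stripes_to_read(stats: List[StripeStats], lo: int, hi: int) -> List[int]:
    """Return indices of stripes whose [min, max] range overlaps [lo, hi]."""
    return [s.index for s in stats if s.max_value >= lo and s.min_value <= hi]


# Made-up statistics for a three-stripe file (assumption, not real data).
stripe_stats = [
    StripeStats(index=0, num_rows=1000, min_value=0, max_value=99),
    StripeStats(index=1, num_rows=1000, min_value=100, max_value=199),
    StripeStats(index=2, num_rows=1000, min_value=200, max_value=299),
]

# A query for values between 150 and 220 only needs stripes 1 and 2.
selected = stripes_to_read(stripe_stats, lo=150, hi=220)
print(selected)
```

The selected indices could then be passed one at a time to a per-stripe read, such as pyarrow's existing `ORCFile.read_stripe`, so that stripe 0 is never loaded.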