[ARROW-12231] [C++][Dataset] Separate datasets backed by readers from InMemoryDataset - ASF JIRA

Backing an InMemoryDataset with a reader is misleading. Let's split that out into a separate class.
Dataset scanning can then use an I/O thread for the new class. (For Python, we'll need to be careful to release the GIL before any such operation, so that the I/O thread can acquire it and call into the underlying Python reader/file object.)
Longer-term, we should interface with Python's async.
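
To make the proposed split concrete, here is a minimal pure-Python sketch. It is not Arrow's API: `ReaderBackedDataset`, its `scan` method, and the `readahead` parameter are hypothetical names, and this `InMemoryDataset` is a stand-in that only shares its name with the Arrow class. The sketch assumes the reason for the split is that a reader can be consumed only once, while in-memory batches can be rescanned freely. It also shows the reader being drained on a separate I/O thread.

```python
import queue
import threading
from typing import Iterable, Iterator, List

_DONE = object()  # sentinel the I/O thread sends when the reader is exhausted


class InMemoryDataset:
    """Stand-in: holds materialized batches, so it can be scanned repeatedly."""

    def __init__(self, batches: List[list]):
        self._batches = list(batches)

    def scan(self) -> Iterator[list]:
        return iter(self._batches)


class ReaderBackedDataset:
    """Hypothetical class wrapping a one-shot reader of batches."""

    def __init__(self, reader: Iterable[list]):
        self._reader = iter(reader)
        self._consumed = False

    def scan(self, readahead: int = 2) -> Iterator[list]:
        # Fail loudly on a second scan instead of silently yielding nothing.
        if self._consumed:
            raise RuntimeError("reader-backed dataset can only be scanned once")
        self._consumed = True
        buf: queue.Queue = queue.Queue(maxsize=readahead)

        def pump() -> None:
            # Runs on the I/O thread: pull from the reader, hand batches over.
            try:
                for batch in self._reader:
                    buf.put(batch)
                buf.put(_DONE)
            except Exception as exc:  # forward reader errors to the consumer
                buf.put(exc)

        threading.Thread(target=pump, daemon=True).start()
        return self._drain(buf)

    @staticmethod
    def _drain(buf: queue.Queue) -> Iterator[list]:
        while True:
            item = buf.get()  # blocks until the I/O thread produces something
            if item is _DONE:
                return
            if isinstance(item, Exception):
                raise item
            yield item
```

In this stand-in, the blocking `queue.Queue.get` call releases the GIL while it waits, which is what lets the I/O thread run the Python reader. A C++ scanner would have to release the GIL explicitly around the equivalent wait. The sketch does not handle a consumer that abandons the scan early; in that case the I/O thread stays blocked on a full queue.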

relates to

ARROW-10882 [Python][Dataset] Writing dataset from python iterator of record batches

links to

GitHub Pull Request #10070

Estimated: Not Specified

Logged: 2h 20m