Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: 8.0.1
Fix Version: None
Description
I created a file in the Arrow IPC file format with the script given in the PyArrow cookbook here: https://arrow.apache.org/cookbook/py/io.html#saving-arrow-arrays-to-disk
In a Node.js application, this file can be read as follows:
const r = await RecordBatchReader.from(fs.createReadStream(filePath));
await r.open();
for (let i = 0; i < r.numRecordBatches; i++) {
  const rb = await r.readRecordBatch(i);
  if (rb !== null) {
    console.log(rb.numRows);
  }
}
However, this approach loads the whole file into memory (is that a bug?), which does not scale to large files.
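For what it's worth, here is a minimal sketch of an incremental read using the reader's async iterator. It assumes the data is written in the Arrow IPC stream format (e.g. with pyarrow.ipc.new_stream; the file name "data.arrows" is illustrative). The file format produced by the cookbook script keeps its record-batch offsets in a footer at the end of the file, which may be why a forward-only stream ends up buffered:

import * as fs from "fs";
import { RecordBatchReader } from "apache-arrow";

// Hypothetical stream-format file written with pyarrow.ipc.new_stream.
const reader = await RecordBatchReader.from(fs.createReadStream("data.arrows"));
for await (const batch of reader) {
  // Only the current record batch needs to be materialized here.
  console.log(batch.numRows);
}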
To work around this scalability issue, I tried to load the data with fetch as described in the README.md. Both:
import { tableFromIPC } from "apache-arrow";
const table = await tableFromIPC(fetch(filePath));
console.table([...table]);
and
const r = await RecordBatchReader.from(await fetch(filePath));
await r.open();
fail with the error:
Uncaught (in promise) Error: Expected to read 1329865020 metadata bytes, but only read 1123.
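Incidentally, 1329865020 is 0x4F44213C, which is the little-endian int32 reading of the ASCII bytes "<!DO", so the fetch may be returning an HTML document rather than the Arrow file. A minimal diagnostic sketch, assuming filePath is an HTTP URL (an Arrow IPC file begins with the magic bytes "ARROW1"):

// Hypothetical check of what the server actually returns for filePath.
const response = await fetch(filePath);
console.log(response.status, response.headers.get("content-type"));
const { value } = await response.body.getReader().read();
// An Arrow IPC file should start with the "ARROW1" magic bytes.
console.log(new TextDecoder().decode(value.slice(0, 8)));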