Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.3.0
Drill 1.3.0 on a 3 node distributed-mode cluster on AWS.
Data files on S3.
S3 storage plugin configuration:
{
  "type": "file",
  "enabled": true,
  "connection": "s3a://<bucket-name-was-here>",
  "workspaces": {
    "root": { "location": "/", "writable": false, "defaultInputFormat": null },
    "views": { "location": "/processed", "writable": true, "defaultInputFormat": null },
    "tmp": { "location": "/tmp", "writable": true, "defaultInputFormat": null }
  },
  "formats": {
    "psv": { "type": "text", "extensions": [ "tbl" ], "delimiter": "|" },
    "csv": { "type": "text", "extensions": [ "csv" ], "extractHeader": true, "delimiter": "," },
    "tsv": { "type": "text", "extensions": [ "tsv" ], "delimiter": "\t" },
    "parquet": { "type": "parquet" },
    "json": { "type": "json" },
    "avro": { "type": "avro" },
    "sequencefile": { "type": "sequencefile", "extensions": [ "seq" ] },
    "csvh": { "type": "text", "extensions": [ "csvh", "csv" ], "extractHeader": true, "delimiter": "," }
  }
}
Description
When I query a .csv file (via sqlline or the Web UI), I get an IndexOutOfBoundsException:
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1-bad.csv` limit 1;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))
Fragment 0:0
[Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on ip-XXXXX.compute.internal:31010] (state=,code=0)
0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1;
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| FIELD_1 | FIELD_2 | FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | FIELD_35 |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
| 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0 | 925630488 | 0 | 925630488 | -1 | 19531580547 | 00000000 | 27/10/2015 02:00:00 | | 30 | 300 | 0 | 0 | 00000000 | 00000000 | 27/10/2015 02:05:27 | 0 | 1 | 0 | 35.0 | | | | 505 | 872.0 | | aBc | | | | |
+----------+----------------------+----------+----------+----------+------------+----------+------------+----------+--------------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+----------------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
1 row selected (1.094 seconds)
0: jdbc:drill:>
Both files are attached: the good file (apps1.csv) and the bad file (apps1-bad.csv).
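To narrow down what differs between the two files, a small script can list runs of consecutive empty fields in each row. This is only a diagnostic sketch: the focus on empty fields is an assumption taken from the title of the linked DRILL-4140, and `empty_field_runs` is a hypothetical helper, not part of Drill.

```python
import csv
import io


def empty_field_runs(text, min_run=2):
    """Yield (row_number, start_column, run_length) for every run of at
    least `min_run` consecutive empty fields. Rows and columns are 1-based.

    Hypothetical helper for comparing a good and a bad CSV file.
    """
    for row_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        run_start = None
        # A non-empty sentinel closes a run that reaches the end of the row.
        for col, value in enumerate(row + ["<end>"], start=1):
            if value == "":
                if run_start is None:
                    run_start = col
            else:
                if run_start is not None and col - run_start >= min_run:
                    yield (row_no, run_start, col - run_start)
                run_start = None


if __name__ == "__main__":
    # Inline sample data; replace with the contents of a local copy
    # of apps1.csv or apps1-bad.csv to compare the two files.
    sample = "a,b,c,d\n1,,,4\n5,6,,8\n"
    for row_no, start, length in empty_field_runs(sample):
        print(f"row {row_no}: {length} empty fields starting at column {start}")
```

Running it over local copies of both attachments and comparing the reported runs may show whether the failing file has longer stretches of empty columns than the working one.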
Attachments
Issue Links
- is duplicated by DRILL-4140 csv extractHeader: fails on multiple empty columns (Closed)