Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version/s: 0.8.0
Description
We don't recognize a schema change during streaming aggregation when a column is a mix of required and optional types across the Parquet files in a table. Hash aggregation does throw a correct error message.
I have a table 'mix' containing the following two files:
[Fri Mar 27 09:46:07 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ls -ltr
total 753
-rwxr-xr-x 1 root root 759879 Mar 27 09:41 optional.parquet
-rwxr-xr-x 1 root root 9867 Mar 27 09:41 required.parquet

[Fri Mar 27 09:46:09 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ~/parquet-tools-1.5.1-SNAPSHOT/parquet-schema optional.parquet
message root {
  optional binary c_varchar (UTF8);
  optional int32 c_integer;
  optional int64 c_bigint;
  optional float c_float;
  optional double c_double;
  optional int32 c_date (DATE);
  optional int32 c_time (TIME);
  optional int64 c_timestamp (TIMESTAMP);
  optional boolean c_boolean;
  optional double d9;
  optional double d18;
  optional double d28;
  optional double d38;
}

[Fri Mar 27 09:46:41 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ~/parquet-tools-1.5.1-SNAPSHOT/parquet-schema required.parquet
message root {
  required binary c_varchar (UTF8);
  required int32 c_integer;
  required int64 c_bigint;
  required float c_float;
  required double c_double;
  required int32 c_date (DATE);
  required int32 c_time (TIME);
  required int64 c_timestamp (TIMESTAMP);
  required boolean c_boolean;
  required double d9;
  required double d18;
  required double d28;
  required double d38;
}
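For anyone without access to these files, a pair of files with the same column name but mismatched repetition (required vs. optional) can be produced with a short parquet-avro program. This is only a minimal sketch of such a reproduction, not the tool that generated the files above; the WriteMix class, schema strings, file names, and single c_integer column are illustrative, and it assumes a current parquet-avro on the classpath.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

// Hypothetical helper: writes one Parquet file whose single column is either
// required (plain "int") or optional (union with "null"), mirroring the
// c_integer mismatch between required.parquet and optional.parquet.
public class WriteMix {
  public static void main(String[] args) throws Exception {
    write("required_demo.parquet",
        "{\"type\":\"record\",\"name\":\"root\",\"fields\":"
            + "[{\"name\":\"c_integer\",\"type\":\"int\"}]}", 1);
    write("optional_demo.parquet",
        "{\"type\":\"record\",\"name\":\"root\",\"fields\":"
            + "[{\"name\":\"c_integer\",\"type\":[\"null\",\"int\"],\"default\":null}]}", null);
  }

  static void write(String file, String schemaJson, Integer value) throws Exception {
    Schema schema = new Schema.Parser().parse(schemaJson);
    try (ParquetWriter<GenericRecord> writer =
             AvroParquetWriter.<GenericRecord>builder(new Path(file))
                 .withSchema(schema)
                 .build()) {
      GenericRecord rec = new GenericData.Record(schema);
      rec.put("c_integer", value);   // null is legal only for the optional schema
      writer.write(rec);
    }
  }
}

Dropping both output files into one directory and querying that directory from Drill should reproduce the required/optional mix described above.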
Nice error message on hash aggregation:
0: jdbc:drill:schema=dfs> select count(*) from mix group by c_integer;
+------------+
| EXPR$0 |
+------------+
Query failed: Query stopped., Hash aggregate does not support schema changes [ 2bc255ce-c7f9-47bf-80b0-a5c87cfa67be on atsqa4-134.qa.lab:31010 ]
java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
On streaming aggregation, the exception is hard for the end user to understand:
0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashagg` = false;
+------------+------------+
| ok | summary |
+------------+------------+
| true | planner.enable_hashagg updated. |
+------------+------------+
1 row selected (0.067 seconds)
0: jdbc:drill:schema=dfs> select count(*) from mix group by c_integer;
+------------+
| EXPR$0 |
+------------+
Query failed: RemoteRpcException: Failure while running fragment., Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.IntVector but was holding vector class org.apache.drill.exec.vector.NullableIntVector. [ 5610e589-38e0-4dc5-a560-649516180ba4 on atsqa4-134.qa.lab:31010 ] [ 5610e589-38e0-4dc5-a560-649516180ba4 on atsqa4-134.qa.lab:31010 ]
java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
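The improvement requested here is for the streaming aggregate to surface the schema change the same way the hash aggregate does, instead of failing later on the IntVector vs. NullableIntVector cast. As a hedged sketch of the general idea only (the class, field, and message below are illustrative and do not reflect the actual Drill code path), the operator could compare each incoming batch's schema against the one captured at setup and fail fast with an explicit message:

// Illustrative sketch: the kind of explicit schema-change check the streaming
// aggregate could perform; class and method names here are hypothetical.
import org.apache.drill.exec.record.BatchSchema;
import org.apache.drill.exec.record.RecordBatch;

class StreamingAggSchemaGuard {
  private BatchSchema expectedSchema;   // captured from the first incoming batch

  void checkIncoming(RecordBatch incoming) {
    BatchSchema current = incoming.getSchema();
    if (expectedSchema == null) {
      expectedSchema = current;         // first batch defines the schema
    } else if (!expectedSchema.equals(current)) {
      // Fail with a message the user can act on, mirroring the hash aggregate's
      // "does not support schema changes" error, rather than letting a vector
      // type mismatch surface as an internal exception.
      throw new UnsupportedOperationException(
          "Streaming aggregate does not support schema changes");
    }
  }
}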