Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Impala 2.0
Description
Casey sent me a query that hits a DCHECK. It turns out that the FE does not mark the tuple as nullable after a NAAJ (null-aware anti join). For example, in
the query below, tuple-id 4 on node 05:ANALYTIC should be nullable.
[localhost:21000] > explain select COUNT(t1.int_col_1) OVER () AS
int_col_1 FROM (SELECT MAX(t1.int_col) AS int_col_1 FROM alltypestiny
t1) t1 WHERE t1.int_col_1 NOT IN (SELECT SUM(t1.smallint_col) AS
smallint_col_1 FROM alltypes t1);
Query: explain select
COUNT(t1.int_col_1) OVER () AS int_col_1 FROM (SELECT MAX(t1.int_col)
AS int_col_1 FROM alltypestiny t1) t1 WHERE t1.int_col_1 NOT IN
(SELECT SUM(t1.smallint_col) AS smallint_col_1 FROM alltypes t1)
+------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=170.00MB VCores=1 |
| |
| 05:ANALYTIC |
| | functions: count(max(t1.int_col)) |
| | hosts=3 per-host-mem=unavailable |
| | tuple-ids=1,*4*,7 row-size=20B cardinality=0 |
| | |
| 04:HASH JOIN [NULL AWARE LEFT ANTI JOIN, BROADCAST] |
| | hash predicates: max(t1.int_col) = sum(t1.smallint_col) |
| | hosts=3 per-host-mem=unavailable |
| | tuple-ids=1,4 row-size=12B cardinality=0 |
| | |
| |--10:EXCHANGE [UNPARTITIONED] |
| | | hosts=3 per-host-mem=unavailable |
| | | tuple-ids=4 row-size=8B cardinality=0 |
| | | |
| | 09:AGGREGATE [FINALIZE] |
| | | output: sum:merge(t1.smallint_col) |
| | | hosts=3 per-host-mem=unavailable |
| | | tuple-ids=4 row-size=8B cardinality=0 |
| | | |
| | 08:EXCHANGE [UNPARTITIONED] |
| | | hosts=3 per-host-mem=unavailable |
| | | tuple-ids=4 row-size=8B cardinality=0 |
| | | |
| | 03:AGGREGATE |
| | | output: sum(t1.smallint_col) |
| | | hosts=3 per-host-mem=10.00MB |
| | | tuple-ids=4 row-size=8B cardinality=0 |
| | | |
| | 02:SCAN HDFS [functional.alltypes t1, RANDOM] |
| | partitions=24/24 size=478.45KB |
| | table stats: 7300 rows total |
| | column stats: all |
| | hosts=3 per-host-mem=160.00MB |
| | tuple-ids=3 row-size=2B cardinality=7300 |
| | |
| 07:AGGREGATE [FINALIZE] |
| | output: max:merge(t1.int_col) |
| | hosts=3 per-host-mem=unavailable |
| | tuple-ids=1 row-size=4B cardinality=0 |
| | |
| 06:EXCHANGE [UNPARTITIONED] |
| | hosts=3 per-host-mem=unavailable |
| | tuple-ids=1 row-size=4B cardinality=0 |
| | |
| 01:AGGREGATE |
| | output: max(t1.int_col) |
| | hosts=3 per-host-mem=10.00MB |
| | tuple-ids=1 row-size=4B cardinality=0 |
| | |
| 00:SCAN HDFS [functional.alltypestiny t1, RANDOM] |
| partitions=4/4 size=460B |
| table stats: 8 rows total |
| column stats: all |
| hosts=3 per-host-mem=32.00MB |
| tuple-ids=0 row-size=4B cardinality=8 |
+------------------------------------------------------------+
I have it in gdb. The DCHECK:
    DCHECK_EQ(null_indicators_per_block_, 0);
    for (int i = 0; i < tuples_per_row; ++i) {
      const int tuple_size = desc_.tuple_descriptors()[i]->byte_size();
      Tuple* t = row->GetTuple(i);
      // TODO: Once IMPALA-1306 (Avoid passing empty tuples of non-materialized slots)
      // is delivered, the check below should become DCHECK_NOTNULL(t).
      DCHECK(t != NULL || tuple_size == 0);
      // <== this is the DCHECK that hits. t is NULL. This tuple should have
      // been marked as NULLable.
      memcpy(tuple_buf, t, tuple_size);
      tuple_buf += tuple_size;
    }
  }
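To illustrate why the nullable marking matters, here is a minimal standalone sketch. It is not Impala's code: `TupleDesc`, `CopyRow`, and the bitmask null indicator are hypothetical stand-ins. It only shows that a row copy can tolerate a NULL tuple when the descriptor says the tuple is nullable, and must reject it otherwise, which is the situation the DCHECK above catches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tuple descriptor: byte size plus a nullable flag
// (the flag the FE is expected to set for tuple 4 after the NAAJ).
struct TupleDesc {
  int byte_size;
  bool nullable;
};

// Copies the tuples of one row into `out`.
// A NULL tuple is accepted only if its descriptor is nullable or its size is 0;
// in that case bit i of `null_mask` is set and nothing is copied for tuple i.
// Returns false for a NULL tuple that was not marked nullable.
bool CopyRow(const std::vector<TupleDesc>& descs,
             const std::vector<const uint8_t*>& row,
             std::vector<uint8_t>* out, uint32_t* null_mask) {
  *null_mask = 0;
  for (std::size_t i = 0; i < descs.size(); ++i) {
    const uint8_t* t = row[i];
    if (t == nullptr) {
      if (!descs[i].nullable && descs[i].byte_size != 0) return false;
      *null_mask |= (1u << i);  // record the NULL instead of dereferencing it
      continue;
    }
    out->insert(out->end(), t, t + descs[i].byte_size);
  }
  return true;
}
```

With a row shaped like the output of node 04 (left tuple set, right tuple NULL), `CopyRow` fails when the second descriptor is non-nullable and succeeds, setting the second mask bit, once it is marked nullable.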