IMPALA-1318

Analyzer does not mark tuple as nullable after NAAJ


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.0
    • Fix Version/s: Impala 2.0
    • Component/s: None
    • Labels: None

    Description

      Casey sent me a query that hits a DCHECK. It turns out that the FE does not mark the tuple as nullable after a NAAJ
      (null-aware anti-join). For example, in the query below, tuple-id 4 on node 05:ANALYTIC should be nullable.

      [localhost:21000] > explain select COUNT(t1.int_col_1) OVER () AS
      int_col_1 FROM (SELECT MAX(t1.int_col) AS int_col_1 FROM alltypestiny
      t1) t1 WHERE t1.int_col_1 NOT IN (SELECT SUM(t1.smallint_col) AS
      smallint_col_1 FROM alltypes t1);
      Query: explain select
      COUNT(t1.int_col_1) OVER () AS int_col_1 FROM (SELECT MAX(t1.int_col)
      AS int_col_1 FROM alltypestiny t1) t1 WHERE t1.int_col_1 NOT IN
      (SELECT SUM(t1.smallint_col) AS smallint_col_1 FROM alltypes t1)
      +------------------------------------------------------------+
      | Explain String                                             |
      +------------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=170.00MB VCores=1  |
      |                                                            |
      | 05:ANALYTIC                                                |
      | |  functions: count(max(t1.int_col))                       |
      | |  hosts=3 per-host-mem=unavailable                        |
      | |  tuple-ids=1,*4*,7 row-size=20B cardinality=0            |
      | |                                                          |
      | 04:HASH JOIN [NULL AWARE LEFT ANTI JOIN, BROADCAST]        |
      | |  hash predicates: max(t1.int_col) = sum(t1.smallint_col) |
      | |  hosts=3 per-host-mem=unavailable                        |
      | |  tuple-ids=1,4 row-size=12B cardinality=0                |
      | |                                                          |
      | |--10:EXCHANGE [UNPARTITIONED]                             |
      | |  |  hosts=3 per-host-mem=unavailable                     |
      | |  |  tuple-ids=4 row-size=8B cardinality=0                |
      | |  |                                                       |
      | |  09:AGGREGATE [FINALIZE]                                 |
      | |  |  output: sum:merge(t1.smallint_col)                   |
      | |  |  hosts=3 per-host-mem=unavailable                     |
      | |  |  tuple-ids=4 row-size=8B cardinality=0                |
      | |  |                                                       |
      | |  08:EXCHANGE [UNPARTITIONED]                             |
      | |  |  hosts=3 per-host-mem=unavailable                     |
      | |  |  tuple-ids=4 row-size=8B cardinality=0                |
      | |  |                                                       |
      | |  03:AGGREGATE                                            |
      | |  |  output: sum(t1.smallint_col)                         |
      | |  |  hosts=3 per-host-mem=10.00MB                         |
      | |  |  tuple-ids=4 row-size=8B cardinality=0                |
      | |  |                                                       |
      | |  02:SCAN HDFS [functional.alltypes t1, RANDOM]           |
      | |     partitions=24/24 size=478.45KB                       |
      | |     table stats: 7300 rows total                         |
      | |     column stats: all                                    |
      | |     hosts=3 per-host-mem=160.00MB                        |
      | |     tuple-ids=3 row-size=2B cardinality=7300             |
      | |                                                          |
      | 07:AGGREGATE [FINALIZE]                                    |
      | |  output: max:merge(t1.int_col)                           |
      | |  hosts=3 per-host-mem=unavailable                        |
      | |  tuple-ids=1 row-size=4B cardinality=0                   |
      | |                                                          |
      | 06:EXCHANGE [UNPARTITIONED]                                |
      | |  hosts=3 per-host-mem=unavailable                        |
      | |  tuple-ids=1 row-size=4B cardinality=0                   |
      | |                                                          |
      | 01:AGGREGATE                                               |
      | |  output: max(t1.int_col)                                 |
      | |  hosts=3 per-host-mem=10.00MB                            |
      | |  tuple-ids=1 row-size=4B cardinality=0                   |
      | |                                                          |
      | 00:SCAN HDFS [functional.alltypestiny t1, RANDOM]          |
      |    partitions=4/4 size=460B                                |
      |    table stats: 8 rows total                               |
      |    column stats: all                                       |
      |    hosts=3 per-host-mem=32.00MB                            |
      |    tuple-ids=0 row-size=4B cardinality=8                   |
      +------------------------------------------------------------+
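To make the join semantics concrete, here is a minimal sketch of what a null-aware left anti join computes for a NOT IN predicate. It is an illustration under stated assumptions, not Impala's implementation: `Tri`, `Value`, `NotIn`, and `NullAwareLeftAntiJoin` are hypothetical names, and each row is reduced to a single nullable integer. The point to notice is that the output contains only probe-side (left) rows, so the build-side tuple is never populated in rows leaving the join. This is consistent with tuple 4 needing to be nullable downstream.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Three-valued result of a SQL predicate.
enum class Tri { kFalse, kTrue, kUnknown };

// A nullable column value; std::nullopt stands for SQL NULL.
using Value = std::optional<long>;

// Evaluates `probe NOT IN (build...)` with SQL NULL semantics.
Tri NotIn(const Value& probe, const std::vector<Value>& build) {
  if (build.empty()) return Tri::kTrue;          // NOT IN over an empty set is TRUE
  if (!probe.has_value()) return Tri::kUnknown;  // NULL compared to anything is UNKNOWN
  bool saw_null = false;
  for (const Value& b : build) {
    if (!b.has_value()) {
      saw_null = true;                           // a NULL on the build side blocks TRUE
    } else if (*b == *probe) {
      return Tri::kFalse;                        // a real match: the row is rejected
    }
  }
  return saw_null ? Tri::kUnknown : Tri::kTrue;
}

// Hypothetical helper: keeps only probe rows for which NOT IN is TRUE.
// The result carries probe-side values only; nothing from the build side
// is attached to an output row.
std::vector<Value> NullAwareLeftAntiJoin(const std::vector<Value>& probe_rows,
                                         const std::vector<Value>& build_rows) {
  std::vector<Value> out;
  for (const Value& p : probe_rows) {
    if (NotIn(p, build_rows) == Tri::kTrue) out.push_back(p);
  }
  return out;
}
```

For example, probing `{1, 3, NULL}` against a build side of `{1, 2}` keeps only `3`, while a single NULL on the build side filters out every probe row that has no exact match.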
      

      I have it in gdb. The DCHECK:

          DCHECK_EQ(null_indicators_per_block_, 0);
          for (int i = 0; i < tuples_per_row; ++i) {
            const int tuple_size = desc_.tuple_descriptors()[i]->byte_size();
            Tuple* t = row->GetTuple(i);
            // TODO: Once IMPALA-1306 (Avoid passing empty tuples of
            // non-materialized slots) is delivered, the check below should
            // become DCHECK_NOTNULL(t).
            // This is the DCHECK that hits: t is NULL. This tuple should have
            // been marked as nullable.
            DCHECK(t != NULL || tuple_size == 0);
            memcpy(tuple_buf, t, tuple_size);
            tuple_buf += tuple_size;
          }
        }
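The loop above is the path taken when no tuple in the row is declared nullable, so a NULL tuple pointer is unexpected there. As a rough sketch of why the planner's nullability flag matters, the following shows a copy routine that consults a per-tuple `nullable` flag before dereferencing. Everything here is an assumption made for illustration: `TupleDesc`, `CopyRow`, and the returned null-flag vector are hypothetical stand-ins and do not reflect Impala's actual descriptor or buffering classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tuple descriptor.
struct TupleDesc {
  int byte_size;  // size of the materialized tuple in bytes
  bool nullable;  // the flag the planner is expected to set (e.g. for tuple 4)
};

// Appends the bytes of every non-NULL tuple to `out` and returns one
// null flag per tuple. A NULL tuple is only accepted when its descriptor
// says it may be NULL (or when it has no materialized bytes at all).
std::vector<bool> CopyRow(const std::vector<TupleDesc>& descs,
                          const std::vector<const uint8_t*>& tuples,
                          std::vector<uint8_t>* out) {
  assert(descs.size() == tuples.size());
  std::vector<bool> is_null(descs.size(), false);
  for (std::size_t i = 0; i < descs.size(); ++i) {
    const uint8_t* t = tuples[i];
    if (t == nullptr) {
      // Mirrors the failing check: a NULL tuple that was not declared
      // nullable indicates a planner bug, not a data condition.
      assert(descs[i].nullable || descs[i].byte_size == 0);
      is_null[i] = true;  // record the NULL instead of copying from it
      continue;
    }
    out->insert(out->end(), t, t + descs[i].byte_size);
  }
  return is_null;
}
```

With the second tuple marked nullable, a row whose second tuple pointer is NULL copies only the first tuple's bytes and reports the second as NULL. With the flag left false, the assertion fires, which is analogous to the failure described in this issue.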
      


          People

            Assignee: Alexander Behm (alex.behm)
            Reporter: Ippokratis Pandis (ippokratis)