Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-1339

Incorrect results in agg query grouping by CHAR (duplicate NULLs)

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • Impala 2.0
    • Impala 2.0
    • None
    • None

    Description

      This is using the patch to fix IMPALA-1337.

      After GROUP BY char_col there may be multiple NULL values in the grouping column result set.

      [localhost.localdomain:21000] > create table text_types as select cast(id as char(127)) char127_col, cast(int_col as char(129)) char129_col, cast(bigint_col as varchar(256)) varchar256_col, cast(tinyint_col as string) string_col from alltypesagg;
      Query: create table text_types as select cast(id as char(127)) char127_col, cast(int_col as char(129)) char129_col, cast(bigint_col as varchar(256)) varchar256_col, cast(tinyint_col as string) string_col from alltypesagg
      +-----------------------+
      | summary               |
      +-----------------------+
      | Inserted 11000 row(s) |
      +-----------------------+
      Fetched 1 row(s) in 0.91s
      
      
      [localhost.localdomain:21000] > create table text_types2 as select cast(bigint_col as char(127)) char127_col, cast(tinyint_col as char(129)) char129_col, cast(int_col as varchar(256)) varchar256_col, cast(id as string) string_col from alltypesagg;
      Query: create table text_types2 as select cast(bigint_col as char(127)) char127_col, cast(tinyint_col as char(129)) char129_col, cast(int_col as varchar(256)) varchar256_col, cast(id as string) string_col from alltypesagg
      +-----------------------+
      | summary               |
      +-----------------------+
      | Inserted 11000 row(s) |
      +-----------------------+
      Fetched 1 row(s) in 0.90s
      
      
      [localhost.localdomain:21000] > create table foo as 
      select t2.char127_col AS char127_col_1 
      FROM text_types t1 
      LEFT JOIN text_types t2 ON t2.char127_col = t1.char129_col 
      GROUP BY t2.char127_col;
      Query: create table foo as select t2.char127_col AS char127_col_1 FROM text_types t1 LEFT JOIN text_types t2 ON t2.char127_col = t1.char129_col GROUP BY t2.char127_col
      +----------------------+
      | summary              |
      +----------------------+
      | Inserted 1019 row(s) |
      +----------------------+
      Fetched 1 row(s) in 69.77s
      
      
      [localhost.localdomain:21000] > select * from foo order by char127_col_1 desc limit 10;
      Query: select * from foo order by char127_col_1 desc limit 10
      +---------------+
      | char127_col_1 |
      +---------------+
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      | NULL          |
      +---------------+
      Fetched 10 row(s) in 0.30s
      

      It's also interesting that the CTAS took 70 seconds. I'll look into that a little.

      Attachments

        Activity

          People

            victor.bittorf_impala_fcb6 Victor Bittorf
            caseyc casey
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: