Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0
Description
The stress test, run against a secure (Kerberos + SSL) cluster, finds that some queries produce wrong results. I haven't yet been able to pin down why, but I'm filing this bug now to record what I have. Note that the failures are intermittent: during a run, these queries only sometimes produce wrong results.
Queries the stress test has reported as producing wrong results:
tpch-q3, tpcds-q34, tpch-q12, tpch-q7
In the case of tpch-q3, I managed to get complete profiles of both a correct and an incorrect run of the query. See attached.
TPCH-Q3 is
select
  l_orderkey,
  sum(l_extendedprice * (1 - l_discount)) as revenue,
  o_orderdate,
  o_shippriority
from
  customer,
  orders,
  lineitem
where
  c_mktsegment = 'BUILDING'
  and c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and o_orderdate < '1995-03-15'
  and l_shipdate > '1995-03-15'
group by
  l_orderkey,
  o_orderdate,
  o_shippriority
order by
  revenue desc,
  o_orderdate
limit 10
I got as far as noticing that in the "wrong results" case, fewer rows are scanned (most visibly, 01:SCAN HDFS on tpch_100_parquet.orders returns 53.80M rows instead of 70.97M):
Results correct:
Operator             #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem   Est. Peak Mem  Detail
----------------------------------------------------------------------------------------------------------------------------------
13:MERGING-EXCHANGE  1       308.431us  308.431us  10       10          0          0              UNPARTITIONED
06:TOP-N             8       15.787ms   18.612ms   80       10          12.00 KB   580.00 B
12:AGGREGATE         8       506.351ms  692.593ms  1.13M    1.73M       14.24 MB   105.17 MB      FINALIZE
11:EXCHANGE          8       75.520ms   138.635ms  1.13M    1.73M       0          0              HASH(l_orderkey,o_orderdate...
05:AGGREGATE         8       389.129ms  650.835ms  1.13M    1.73M       13.70 MB   105.17 MB      STREAMING
04:HASH JOIN         8       1s901ms    2s576ms    2.99M    1.73M       153.17 MB  12.98 MB       INNER JOIN, PARTITIONED
|--10:EXCHANGE       8       235.256ms  552.595ms  3.00M    3.00M       0          0              HASH(c_custkey)
|  00:SCAN HDFS      5       323.828ms  621.551ms  3.00M    3.00M       29.30 MB   176.00 MB      tpch_100_parquet.customer
09:EXCHANGE          8       728.297ms  758.348ms  14.57M   6.00M       0          0              HASH(o_custkey)
03:HASH JOIN         8       24s679ms   29s349ms   14.57M   6.00M       777.11 MB  98.35 MB       INNER JOIN, PARTITIONED
|--08:EXCHANGE       8       5s521ms    8s310ms    70.97M   15.00M      0          0              HASH(o_orderkey)
|  01:SCAN HDFS      8       3s626ms    7s399ms    70.97M   15.00M      80.88 MB   352.00 MB      tpch_100_parquet.orders
07:EXCHANGE          8       14s268ms   15s285ms   323.49M  60.00M      0          0              HASH(l_orderkey)
02:SCAN HDFS         8       11s632ms   17s863ms   323.49M  60.00M      78.65 MB   352.00 MB      tpch_100_parquet.lineitem
Results incorrect:
ExecSummary:
Operator             #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem   Est. Peak Mem  Detail
----------------------------------------------------------------------------------------------------------------------------------
13:MERGING-EXCHANGE  1       304.504us  304.504us  10       10          0          0              UNPARTITIONED
06:TOP-N             8       19.261ms   29.196ms   80       10          12.00 KB   580.00 B
12:AGGREGATE         8       305.220ms  449.997ms  1.13M    1.73M       14.24 MB   105.17 MB      FINALIZE
11:EXCHANGE          8       66.207ms   96.284ms   1.13M    1.73M       0          0              HASH(l_orderkey,o_orderdate...
05:AGGREGATE         8       516.324ms  653.086ms  1.13M    1.73M       13.58 MB   105.17 MB      STREAMING
04:HASH JOIN         8       1s217ms    1s461ms    2.99M    1.73M       153.17 MB  12.98 MB       INNER JOIN, PARTITIONED
|--10:EXCHANGE       8       150.899ms  213.929ms  3.00M    3.00M       0          0              HASH(c_custkey)
|  00:SCAN HDFS      5       937.452ms  1s753ms    3.00M    3.00M       29.09 MB   176.00 MB      tpch_100_parquet.customer
09:EXCHANGE          8       563.317ms  581.895ms  11.04M   6.00M       0          0              HASH(o_custkey)
03:HASH JOIN         8       24s420ms   28s126ms   11.04M   6.00M       649.11 MB  98.35 MB       INNER JOIN, PARTITIONED
|--08:EXCHANGE       8       2s733ms    2s967ms    53.80M   15.00M      0          0              HASH(o_orderkey)
|  01:SCAN HDFS      8       30s937ms   47s728ms   53.80M   15.00M      85.11 MB   352.00 MB      tpch_100_parquet.orders
07:EXCHANGE          8       13s816ms   14s173ms   323.49M  60.00M      0          0              HASH(l_orderkey)
02:SCAN HDFS         8       10s053ms   12s288ms   323.48M  60.00M      78.57 MB   352.00 MB      tpch_100_parquet.lineitem
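For anyone repeating this comparison on other queries, the per-operator #Rows check can be scripted. The following is a minimal sketch, not part of the stress test: the helper names (`parse_rows`, `row_counts`, `diff_rows`) are hypothetical, and the regular expression assumes summary rows laid out like the ones pasted above. The summary rounds row counts (for example 53.80M), so small differences may be hidden and the reported values are approximate.

```python
import re

# Assumed row layout: "<id>:<OPERATOR NAME> <#Hosts> <Avg Time> <Max Time> <#Rows> ..."
# Tree prefixes such as "|--" are skipped because search() starts at the operator id.
ROW = re.compile(
    r"(?P<op>\d+:[A-Z][A-Z\- ]*?)\s+"   # operator, e.g. "03:HASH JOIN"
    r"(?P<hosts>\d+)\s+"                # #Hosts
    r"(?P<avg>\S+)\s+(?P<max>\S+)\s+"   # Avg Time, Max Time
    r"(?P<rows>[\d.]+[KMB]?)\s"         # #Rows, possibly rounded with a suffix
)

SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_rows(text):
    """Convert a rounded row count such as '53.80M' to an integer."""
    suffix = text[-1] if text[-1] in "KMB" else ""
    number = text[:-1] if suffix else text
    return round(float(number) * SUFFIX[suffix])


def row_counts(summary):
    """Map each operator in an exec summary to its #Rows value."""
    counts = {}
    for line in summary.splitlines():
        match = ROW.search(line)
        if match:
            counts[match.group("op")] = parse_rows(match.group("rows"))
    return counts


def diff_rows(good, bad):
    """Return {operator: (good_rows, bad_rows)} for operators whose #Rows differ."""
    good_counts, bad_counts = row_counts(good), row_counts(bad)
    return {
        op: (good_counts[op], bad_counts[op])
        for op in good_counts
        if op in bad_counts and good_counts[op] != bad_counts[op]
    }


# Excerpts of the two summaries from this report (bottom five operators only).
CORRECT = """\
03:HASH JOIN 8 24s679ms 29s349ms 14.57M 6.00M 777.11 MB 98.35 MB INNER JOIN, PARTITIONED
|--08:EXCHANGE 8 5s521ms 8s310ms 70.97M 15.00M 0 0 HASH(o_orderkey)
|  01:SCAN HDFS 8 3s626ms 7s399ms 70.97M 15.00M 80.88 MB 352.00 MB tpch_100_parquet.orders
07:EXCHANGE 8 14s268ms 15s285ms 323.49M 60.00M 0 0 HASH(l_orderkey)
02:SCAN HDFS 8 11s632ms 17s863ms 323.49M 60.00M 78.65 MB 352.00 MB tpch_100_parquet.lineitem
"""

INCORRECT = """\
03:HASH JOIN 8 24s420ms 28s126ms 11.04M 6.00M 649.11 MB 98.35 MB INNER JOIN, PARTITIONED
|--08:EXCHANGE 8 2s733ms 2s967ms 53.80M 15.00M 0 0 HASH(o_orderkey)
|  01:SCAN HDFS 8 30s937ms 47s728ms 53.80M 15.00M 85.11 MB 352.00 MB tpch_100_parquet.orders
07:EXCHANGE 8 13s816ms 14s173ms 323.49M 60.00M 0 0 HASH(l_orderkey)
02:SCAN HDFS 8 10s053ms 12s288ms 323.48M 60.00M 78.57 MB 352.00 MB tpch_100_parquet.lineitem
"""

changed = diff_rows(CORRECT, INCORRECT)
for op, (good_rows, bad_rows) in changed.items():
    print(f"{op}: {good_rows} -> {bad_rows}")
```

On these excerpts the sketch flags both scans, the 08 exchange, and the 03 join, while the 07 exchange shows the same rounded count in both runs.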
Attachments
Issue Links
- breaks
- IMPALA-5537 Impala does not retry RPCs that fail in SSL_read() (Resolved)
- IMPALA-5558 Query hang after coordinator crash because DoRpc(ReportExecStatus) fails and is not retried (Resolved)