Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Impala 2.3.0
Description
Profiling queries against nested types shows that ~90% of the time is spent in clock_gettime.
A cheaper time-accounting method can speed up nested queries by 8-9x.
Example query:
select count(*)
from customer.orders_string o, o.lineitems_string l
where l_shipmode in ('MAIL', 'SHIP')
  and l_commitdate < l_receiptdate
  and l_shipdate < l_commitdate
  and l_receiptdate >= '1994-01-01'
  and l_receiptdate < '1995-01-01'
group by l_shipmode
order by l_shipmode
Schema
+---------------+----------------------------------+---------+
| name          | type                             | comment |
+---------------+----------------------------------+---------+
| c_custkey     | bigint                           |         |
| c_name        | string                           |         |
| c_address     | string                           |         |
| c_nationkey   | bigint                           |         |
| c_phone       | string                           |         |
| c_acctbal     | double                           |         |
| c_mktsegment  | string                           |         |
| c_comment     | string                           |         |
| orders_string | array<struct<                    |         |
|               |   o_orderkey:bigint,             |         |
|               |   o_orderstatus:string,          |         |
|               |   o_totalprice:double,           |         |
|               |   o_orderdate:string,            |         |
|               |   o_orderpriority:string,        |         |
|               |   o_clerk:string,                |         |
|               |   o_shippriority:bigint,         |         |
|               |   o_comment:string,              |         |
|               |   lineitems_string:array<struct< |         |
|               |     l_partkey:bigint,            |         |
|               |     l_suppkey:bigint,            |         |
|               |     l_linenumber:bigint,         |         |
|               |     l_quantity:double,           |         |
|               |     l_extendedprice:double,      |         |
|               |     l_discount:double,           |         |
|               |     l_tax:double,                |         |
|               |     l_returnflag:string,         |         |
|               |     l_linestatus:string,         |         |
|               |     l_shipdate:string,           |         |
|               |     l_commitdate:string,         |         |
|               |     l_receiptdate:string,        |         |
|               |     l_shipinstruct:string,       |         |
|               |     l_shipmode:string,           |         |
|               |     l_comment:string             |         |
|               |   >>                             |         |
|               | >>                               |         |
+---------------+----------------------------------+---------+
These are the hottest functions reported by the profiler:
Function / Call Stack | Effective Time by Utilization | Spin Time | Overhead Time | Module | Function (Full) | Source File | Start Address
clock_gettime | 86.233s | 0s | 0s | librt.so.1 | clock_gettime | | 0x3e10
impala::UnnestNode::GetNext | 17.552s | 0s | 0s | impalad | impala::UnnestNode::GetNext(impala::RuntimeState*, impala::RowBatch*, bool*) | | 0xca9280
impala::NestedLoopJoinNode::GetNext | 17.380s | 0s | 0s | impalad | impala::NestedLoopJoinNode::GetNext(impala::RuntimeState*, impala::RowBatch*, bool*) | | 0xc77d50
impala::NestedLoopJoinNode::ConstructBuildSide | 17.242s | 0s | 0s | impalad | impala::NestedLoopJoinNode::ConstructBuildSide(impala::RuntimeState*) | | 0xc74f10
impala::UnnestNode::Open | 16.830s | 0s | 0s | impalad | impala::UnnestNode::Open(impala::RuntimeState*) | | 0xca96c0
impala::ScopedTimer<impala::MonotonicStopWatch>::~ScopedTimer | 8.769s | 0s | 0s | impalad | impala::ScopedTimer<impala::MonotonicStopWatch>::~ScopedTimer(void) | | 0x786630
impala::BlockingJoinNode::Open | 8.380s | 0s | 0s | impalad | impala::BlockingJoinNode::Open(impala::RuntimeState*) | | 0xcbbdf0
impala::HdfsScanNode::GetNext | 0.040s | 0s | 0s | impalad | impala::HdfsScanNode::GetNext(impala::RuntimeState*, impala::RowBatch*, bool*) | | 0xc23530
impala::PlanFragmentExecutor::OpenInternal | 0.020s | 0s | 0s | impalad | impala::PlanFragmentExecutor::OpenInternal(void) | | 0xbeaaf0
impala::ExecNode::RowBatchQueue::AddBatch | 0.020s | 0s | 0s | impalad | impala::ExecNode::RowBatchQueue::AddBatch(impala::RowBatch*) | | 0xc0c260
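The profile attributes most of the time to clock_gettime, reached through the ScopedTimer<MonotonicStopWatch> destructor in per-row Open/GetNext calls. The following is a minimal C++ sketch of why per-call timing is costly when the timed work is tiny, and of one possible cheaper accounting scheme: time only one call in every N and scale the measurement. It is not Impala's code and not necessarily the fix that was committed for this issue. CountingClock, SampledTimer, ProcessRow, ClockReadsPerCall and ClockReadsSampled are hypothetical names introduced for illustration, and this ScopedTimer is a simplified stand-in for the class named in the profile.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Number of clock reads so far; exists only to make the timing cost visible.
static std::uint64_t g_clock_reads = 0;

// Hypothetical wrapper around std::chrono::steady_clock that counts reads.
struct CountingClock {
  using time_point = std::chrono::steady_clock::time_point;
  static time_point now() {
    ++g_clock_reads;
    return std::chrono::steady_clock::now();
  }
};

// Nanoseconds elapsed since `start`; performs one clock read.
static std::int64_t ElapsedNs(CountingClock::time_point start) {
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
             CountingClock::now() - start).count();
}

// Per-call accounting: two clock reads for every timed scope.
class ScopedTimer {
 public:
  explicit ScopedTimer(std::int64_t* total_ns)
      : total_ns_(total_ns), start_(CountingClock::now()) {}
  ~ScopedTimer() { *total_ns_ += ElapsedNs(start_); }

 private:
  std::int64_t* total_ns_;
  CountingClock::time_point start_;
};

// Sampled accounting: time one call in every `period` calls and scale the
// measured duration by `period` to estimate the total (an approximation).
class SampledTimer {
 public:
  SampledTimer(std::int64_t* total_ns, std::uint64_t period)
      : total_ns_(total_ns), period_(period) {
    assert(period > 0);
  }

  template <typename Fn>
  void Run(Fn&& fn) {
    if (calls_++ % period_ != 0) {  // Untimed call: no clock reads at all.
      fn();
      return;
    }
    CountingClock::time_point start = CountingClock::now();
    fn();
    *total_ns_ += ElapsedNs(start) * static_cast<std::int64_t>(period_);
  }

 private:
  std::int64_t* total_ns_;
  std::uint64_t period_;
  std::uint64_t calls_ = 0;
};

// Stand-in for cheap per-row work such as one unnest step.
static void ProcessRow(std::uint64_t* rows) { ++*rows; }

// Clock reads needed when every row is wrapped in a ScopedTimer.
std::uint64_t ClockReadsPerCall(std::uint64_t num_rows) {
  g_clock_reads = 0;
  std::int64_t total_ns = 0;
  std::uint64_t rows = 0;
  for (std::uint64_t i = 0; i < num_rows; ++i) {
    ScopedTimer timer(&total_ns);
    ProcessRow(&rows);
  }
  return g_clock_reads;
}

// Clock reads needed when only one row in `period` is timed.
std::uint64_t ClockReadsSampled(std::uint64_t num_rows, std::uint64_t period) {
  g_clock_reads = 0;
  std::int64_t total_ns = 0;
  std::uint64_t rows = 0;
  SampledTimer timer(&total_ns, period);
  for (std::uint64_t i = 0; i < num_rows; ++i) {
    timer.Run([&rows] { ProcessRow(&rows); });
  }
  return g_clock_reads;
}
```

With per-call timing the number of clock reads grows as twice the number of rows, so the clock can dominate when each row does very little work. Sampling trades exactness of the reported time for far fewer reads; a coarser or cheaper clock source would be another option under the same reasoning.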
The explain plan
Estimated Per-Host Requirements: Memory=544.00MB VCores=2
WARNING: The following tables are missing relevant table and/or column statistics.
tpch_nested_parquet_30.customer

09:MERGING-EXCHANGE [UNPARTITIONED]
|  order by: l_shipmode ASC
|
06:SORT
|  order by: l_shipmode ASC
|
08:AGGREGATE [FINALIZE]
|  output: count:merge(*)
|  group by: l_shipmode
|
07:EXCHANGE [HASH(l_shipmode)]
|
05:AGGREGATE
|  output: count(*)
|  group by: l_shipmode
|
01:SUBPLAN
|
|--04:NESTED LOOP JOIN [CROSS JOIN]
|  |
|  |--02:SINGULAR ROW SRC
|  |
|  03:UNNEST [o.lineitems_string l]
|
00:SCAN HDFS [tpch_nested_parquet_30.customer.orders_string o]
   partitions=1/1 files=41 size=15.01GB
   predicates on l: l_shipmode IN ('MAIL', 'SHIP'), l_commitdate < l_receiptdate, l_shipdate < l_commitdate, l_receiptdate >= '1994-01-01', l_receiptdate < '1995-01-01'
Query summary
+---------------------------+--------+----------+----------+--------+------------+-----------+---------------+-------------------------------------------------+
| Operator                  | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail                                          |
+---------------------------+--------+----------+----------+--------+------------+-----------+---------------+-------------------------------------------------+
| 09:MERGING-EXCHANGE       | 1      | 49.90us  | 49.90us  | 2      | 450.00M    | 0 B       | -1 B          | UNPARTITIONED                                   |
| 06:SORT                   | 4      | 286.31us | 345.15us | 2      | 450.00M    | 24.00 MB  | 416.00 MB     |                                                 |
| 08:AGGREGATE              | 4      | 258.75ms | 288.16ms | 2      | 450.00M    | 3.27 MB   | 128.00 MB     | FINALIZE                                        |
| 07:EXCHANGE               | 4      | 35.41us  | 60.43us  | 8      | 450.00M    | 0 B       | 0 B           | HASH(l_shipmode)                                |
| 05:AGGREGATE              | 4      | 356.10ms | 381.49ms | 8      | 450.00M    | 41.33 MB  | 128.00 MB     |                                                 |
| 01:SUBPLAN                | 4      | 22.51s   | 22.78s   | 0      | 450.00M    | 28.59 MB  | 0 B           |                                                 |
| |--04:NESTED LOOP JOIN    | 4      | 69.36s   | 69.76s   | 0      | 10         | 0 B       | 16 B          | CROSS JOIN                                      |
| |  |--02:SINGULAR ROW SRC | 4      | 0ns      | 0ns      | 0      | 1          | 0 B       | 0 B           |                                                 |
| |  03:UNNEST              | 4      | 18.81s   | 18.94s   | 0      | 10         | 0 B       | 0 B           | o.lineitems_string l                            |
| 00:SCAN HDFS              | 4      | 408.93ms | 585.28ms | 46.50M | 45.00M     | 304.66 MB | 88.00 MB      | tpch_nested_parquet_30.customer.orders_string o |
+---------------------------+--------+----------+----------+--------+------------+-----------+---------------+-------------------------------------------------+
Attachments
Issue Links
- relates to: IMPALA-13058 TestRuntimeFilters.test_basic_filters failed in exhaustive mode on ARM builds (Resolved)