Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
Description
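In some cases, more than one Top N Key (TNK) operator in the same operator tree has the same key expressions, or the keys differ only by a constant column. In most of these cases, only one TNK operator should remain. In the plan below, TNK_60 and TNK_58 have the same top n (50), and their keys differ only by the constant 0L.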
+----------------------------------------------------+
| Explain |
+----------------------------------------------------+
| Plan not optimized by CBO. |
| |
| Vertex dependency in root stage |
| Map 1 <- Reducer 8 (BROADCAST_EDGE) |
| Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE) |
| Reducer 3 <- Reducer 2 (SIMPLE_EDGE) |
| Reducer 4 <- Reducer 3 (SIMPLE_EDGE) |
| Reducer 8 <- Map 7 (CUSTOM_SIMPLE_EDGE) |
| |
| Stage-0 |
| Fetch Operator |
| limit:50 |
| Stage-1 |
| Reducer 4 vectorized |
| File Output Operator [FS_127] |
| Limit [LIM_126] (rows=50 width=538) |
| Number of rows:50 |
| Select Operator [SEL_125] (rows=190 width=538) |
| Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"] |
| <-Reducer 3 [SIMPLE_EDGE] |
| SHUFFLE [RS_30] |
| Select Operator [SEL_29] (rows=190 width=538) |
| Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"] |
| Group By Operator [GBY_28] (rows=190 width=538) |
| Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"],aggregations:["avg(VALUE._col0)","avg(VALUE._col1)","avg(VALUE._col2)","avg(VALUE._col3)"],keys:KEY._col0, KEY._col1, KEY._col2 |
| <-Reducer 2 [SIMPLE_EDGE] |
| SHUFFLE [RS_27] |
| PartitionCols:_col0, _col1, _col2 |
| Group By Operator [GBY_26] (rows=190 width=1134) |
| Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"],aggregations:["avg(_col9)","avg(_col11)","avg(_col18)","avg(_col12)"],keys:_col102, _col93, 0L |
| Top N Key Operator [TNK_60] (rows=127 width=234) |
| keys:_col102, _col93, 0L,top n:50 |
| Select Operator [SEL_25] (rows=127 width=234) |
| Output:["_col9","_col11","_col12","_col18","_col93","_col102"] |
| Top N Key Operator [TNK_58] (rows=127 width=234) |
| keys:_col102, _col93,top n:50 |
| Filter Operator [FIL_49] (rows=127 width=234) |
| predicate:((_col22 = _col38) and (_col1 = _col101) and (_col6 = _col69) and (_col3 = _col26)) |
| Map Join Operator [MAPJOIN_102] (rows=2044 width=232) |
| Conds:MAPJOIN_101._col1=RS_123.i_item_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38","_col69","_col93","_col101","_col102"] |
| <-Map 9 [BROADCAST_EDGE] vectorized |
| BROADCAST [RS_123] |
| PartitionCols:i_item_sk |
| Filter Operator [FIL_122] (rows=204000 width=108) |
| predicate:i_item_sk is not null |
| TableScan [TS_4] (rows=204000 width=108) |
| tpcds_bin_partitioned_orc_100@item,item, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["i_item_sk","i_item_id"] |
| <-Map Join Operator [MAPJOIN_101] (rows=2010 width=118) |
| Conds:MAPJOIN_100._col6=RS_107.s_store_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38","_col69","_col93"] |
| <-Map 7 [BROADCAST_EDGE] vectorized |
| PARTITION_ONLY_SHUFFLE [RS_107] |
| PartitionCols:s_store_sk |
| Filter Operator [FIL_106] (rows=402 width=94) |
| predicate:s_store_sk is not null |
| TableScan [TS_3] (rows=402 width=94) |
| tpcds_bin_partitioned_orc_100@store,store, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["s_store_sk","s_state"] |
| <-Map Join Operator [MAPJOIN_100] (rows=9604000 width=24) |
| Conds:MERGEJOIN_99._col22=RS_118.d_date_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26","_col38"] |
| <-Map 6 [BROADCAST_EDGE] vectorized |
| BROADCAST [RS_118] |
| PartitionCols:d_date_sk |
| Filter Operator [FIL_117] (rows=73049 width=8) |
| predicate:d_date_sk is not null |
| TableScan [TS_2] (rows=73049 width=8) |
| tpcds_bin_partitioned_orc_100@date_dim,date_dim, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk"] |
| Dynamic Partitioning Event Operator [EVENT_121] (rows=1 width=8) |
| Group By Operator [GBY_120] (rows=1 width=8) |
| Output:["_col0"],keys:_col0 |
| Select Operator [SEL_119] (rows=73049 width=8) |
| Output:["_col0"] |
| Please refer to the previous Filter Operator [FIL_117] |
| <-Merge Join Operator [MERGEJOIN_99] (rows=9604000 width=16) |
| Conds:RS_114.ss_cdemo_sk=RS_116.cd_demo_sk(Inner),Output:["_col1","_col3","_col6","_col9","_col11","_col12","_col18","_col22","_col26"] |
| <-Map 1 [SIMPLE_EDGE] vectorized |
| SHUFFLE [RS_114] |
| PartitionCols:ss_cdemo_sk |
| Filter Operator [FIL_113] (rows=235814137 width=353) |
| predicate:(ss_cdemo_sk is not null and ss_store_sk is not null and ss_item_sk is not null and ss_store_sk BETWEEN DynamicValue(RS_17_store_s_store_sk_min) AND DynamicValue(RS_17_store_s_store_sk_max) and in_bloom_filter(ss_store_sk, DynamicValue(RS_17_store_s_store_sk_bloom_filter))) |
| TableScan [TS_0] (rows=275041999 width=723) |
| tpcds_bin_partitioned_orc_100@store_sales,store_sales, ACID table,Tbl:COMPLETE,Col:PARTIAL,Output:["ss_item_sk","ss_cdemo_sk","ss_store_sk","ss_quantity","ss_list_price","ss_sales_price","ss_coupon_amt"] |
| <-Reducer 8 [BROADCAST_EDGE] vectorized |
| BROADCAST [RS_112] |
| Group By Operator [GBY_111] (rows=1 width=24) |
| Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"] |
| <-Map 5 [SIMPLE_EDGE] vectorized |
| SHUFFLE [RS_116] |
| PartitionCols:cd_demo_sk |
| Filter Operator [FIL_115] (rows=1920800 width=8) |
| predicate:cd_demo_sk is not null |
| TableScan [TS_1] (rows=1920800 width=8) |
| tpcds_bin_partitioned_orc_100@customer_demographics,customer_demographics, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["cd_demo_sk"] |
| |
+----------------------------------------------------+
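The redundancy described above can be illustrated with a small sketch. This is not Hive's implementation: the function names, the tuple representation of an operator, and the regular expression used to recognize constant keys are all assumptions made for illustration. The sketch also assumes column names are not renamed between the operators being compared, which holds for TNK_60 and TNK_58 in this plan but not in general.

```python
import re
from collections import defaultdict

# Assumption: literals such as 0L, 42, 1.5 or 'abc' count as constant key columns.
_CONSTANT = re.compile(r"^(-?\d+(\.\d+)?[LlSsYy]?|'.*')$")


def normalize_keys(keys):
    """Drop constant key expressions; keep the remaining keys in order."""
    return tuple(k.strip() for k in keys if not _CONSTANT.match(k.strip()))


def find_duplicate_tnk(operators):
    """Group Top N Key operators that share non-constant keys and top n.

    `operators` is a list of (operator_id, keys, top_n) tuples taken from one
    operator tree (a hypothetical representation). Only groups with more than
    one member are returned; which member to keep is left to the caller.
    """
    groups = defaultdict(list)
    for op_id, keys, top_n in operators:
        groups[(normalize_keys(keys), top_n)].append(op_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# The two TNK operators from the plan in this issue.
plan_tnks = [
    ("TNK_60", ["_col102", "_col93", "0L"], 50),
    ("TNK_58", ["_col102", "_col93"], 50),
]
print(find_duplicate_tnk(plan_tnks))
```

For the plan above, both operators fall into one group because their keys match once the constant 0L is ignored, so one of them is a candidate for removal.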
Attachments
Issue Links
- relates to HIVE-17896 TopNKey: Create a standalone vectorizable TopNKey operator (Closed)
- links to