Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.15.0
Fix Version/s: None
Component/s: None
Description
For TPC-DS query 49, the query plan contains an extra Limit operator that is not needed.
Here is the query:
SELECT 'web' AS channel,
       web.item,
       web.return_ratio,
       web.return_rank,
       web.currency_rank
FROM   (SELECT item,
               return_ratio,
               currency_ratio,
               Rank() OVER (ORDER BY return_ratio) AS return_rank,
               Rank() OVER (ORDER BY currency_ratio) AS currency_rank
        FROM   (SELECT ws.ws_item_sk AS item,
                       (Cast(Sum(COALESCE(wr.wr_return_quantity, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(ws.ws_quantity, 0)) AS DEC(15, 4))) AS return_ratio,
                       (Cast(Sum(COALESCE(wr.wr_return_amt, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(ws.ws_net_paid, 0)) AS DEC(15, 4))) AS currency_ratio
                FROM   web_sales ws
                       LEFT OUTER JOIN web_returns wr
                         ON (ws.ws_order_number = wr.wr_order_number
                             AND ws.ws_item_sk = wr.wr_item_sk),
                       date_dim
                WHERE  wr.wr_return_amt > 10000
                       AND ws.ws_net_profit > 1
                       AND ws.ws_net_paid > 0
                       AND ws.ws_quantity > 0
                       AND ws_sold_date_sk = d_date_sk
                       AND d_year = 1999
                       AND d_moy = 12
                GROUP BY ws.ws_item_sk) in_web) web
WHERE  (web.return_rank <= 10 OR web.currency_rank <= 10)
UNION
SELECT 'catalog' AS channel,
       catalog.item,
       catalog.return_ratio,
       catalog.return_rank,
       catalog.currency_rank
FROM   (SELECT item,
               return_ratio,
               currency_ratio,
               Rank() OVER (ORDER BY return_ratio) AS return_rank,
               Rank() OVER (ORDER BY currency_ratio) AS currency_rank
        FROM   (SELECT cs.cs_item_sk AS item,
                       (Cast(Sum(COALESCE(cr.cr_return_quantity, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(cs.cs_quantity, 0)) AS DEC(15, 4))) AS return_ratio,
                       (Cast(Sum(COALESCE(cr.cr_return_amount, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(cs.cs_net_paid, 0)) AS DEC(15, 4))) AS currency_ratio
                FROM   catalog_sales cs
                       LEFT OUTER JOIN catalog_returns cr
                         ON (cs.cs_order_number = cr.cr_order_number
                             AND cs.cs_item_sk = cr.cr_item_sk),
                       date_dim
                WHERE  cr.cr_return_amount > 10000
                       AND cs.cs_net_profit > 1
                       AND cs.cs_net_paid > 0
                       AND cs.cs_quantity > 0
                       AND cs_sold_date_sk = d_date_sk
                       AND d_year = 1999
                       AND d_moy = 12
                GROUP BY cs.cs_item_sk) in_cat) catalog
WHERE  (catalog.return_rank <= 10 OR catalog.currency_rank <= 10)
UNION
SELECT 'store' AS channel,
       store.item,
       store.return_ratio,
       store.return_rank,
       store.currency_rank
FROM   (SELECT item,
               return_ratio,
               currency_ratio,
               Rank() OVER (ORDER BY return_ratio) AS return_rank,
               Rank() OVER (ORDER BY currency_ratio) AS currency_rank
        FROM   (SELECT sts.ss_item_sk AS item,
                       (Cast(Sum(COALESCE(sr.sr_return_quantity, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(sts.ss_quantity, 0)) AS DEC(15, 4))) AS return_ratio,
                       (Cast(Sum(COALESCE(sr.sr_return_amt, 0)) AS DEC(15, 4)) /
                        Cast(Sum(COALESCE(sts.ss_net_paid, 0)) AS DEC(15, 4))) AS currency_ratio
                FROM   store_sales sts
                       LEFT OUTER JOIN store_returns sr
                         ON (sts.ss_ticket_number = sr.sr_ticket_number
                             AND sts.ss_item_sk = sr.sr_item_sk),
                       date_dim
                WHERE  sr.sr_return_amt > 10000
                       AND sts.ss_net_profit > 1
                       AND sts.ss_net_paid > 0
                       AND sts.ss_quantity > 0
                       AND ss_sold_date_sk = d_date_sk
                       AND d_year = 1999
                       AND d_moy = 12
                GROUP BY sts.ss_item_sk) in_store) store
WHERE  (store.return_rank <= 10 OR store.currency_rank <= 10)
ORDER BY 1, 4, 5
LIMIT 100;
Here is the top of the plan:
00-00 Screen : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 100.0, cumulative cost = {1.5587382656934813E10 rows, 1.6644370208245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33692
00-01 Project(channel=[$0], item=[$1], return_ratio=[$2], return_rank=[$3], currency_rank=[$4]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 100.0, cumulative cost = {1.5587382646934813E10 rows, 1.6644370207245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33691
00-02 SelectionVectorRemover : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 100.0, cumulative cost = {1.5587382546934813E10 rows, 1.6644370157245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33690
00-03 Limit(fetch=[100]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 100.0, cumulative cost = {1.5587382446934813E10 rows, 1.6644370147245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33689
00-04 Limit(fetch=[100]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 100.0, cumulative cost = {1.5587382346934813E10 rows, 1.6644370107245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33688
00-05 SelectionVectorRemover : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 9067.461896625, cumulative cost = {1.5587382246934813E10 rows, 1.6644370067245007E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33687
00-06 TopN(limit=[100]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 9067.461896625, cumulative cost = {1.5587373179472916E10 rows, 1.664436916049882E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33686
00-07 HashAgg(group=[{0, 1, 2, 3, 4}]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 9067.461896625, cumulative cost = {1.5587364112011019E10 rows, 1.6644296869003403E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9294197896272392E10 memory}, id = 33685
00-08 Project(channel=[$0], item=[$1], return_ratio=[$2], return_rank=[$3], currency_rank=[$4]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank): rowcount = 90674.61896625, cumulative cost = {1.5587273437392052E10 rows, 1.664393417052754E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9289410276390976E10 memory}, id = 33684
00-09 HashToRandomExchange(dist0=[[$0]], dist1=[[$1]], dist2=[[$2]], dist3=[[$3]], dist4=[[$4]]) : rowType = RecordType(CHAR(7) channel, ANY item, DECIMAL(35, 20) return_ratio, BIGINT return_rank, BIGINT currency_rank, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 90674.61896625, cumulative cost = {1.5587182762773085E10 rows, 1.6643888833218057E11 cpu, 3.2256446355E10 io, 2.126707136508128E13 network, 1.9289410276390976E10 memory}, id = 33683
There are two limit operators, 00-03 and 00-04. Only one should be needed.
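For anyone checking other plans for the same pattern, here is a minimal Python sketch that flags directly adjacent Limit operators with identical arguments. It is not part of Drill: `find_stacked_limits` and `OPERATOR_RE` are hypothetical names, and it assumes the plan text lists operators in order as a single linear chain (as in the fragment above), so that adjacent entries are parent and child.

```python
import re

# Matches an operator id such as "00-03", the operator name, and an optional
# argument list, e.g. "00-03 Limit(fetch=[100])". Hypothetical helper pattern.
OPERATOR_RE = re.compile(r"(\d{2}-\d{2})\s+(\w+)(?:\(([^)]*)\))?")


def find_stacked_limits(plan_text):
    """Return (parent_id, child_id, args) for each pair of directly
    adjacent Limit operators that carry identical arguments.

    Assumes operators appear in plan order and form a linear chain."""
    ops = OPERATOR_RE.findall(plan_text)
    pairs = []
    for (pid, pname, pargs), (cid, cname, cargs) in zip(ops, ops[1:]):
        if pname == cname == "Limit" and pargs == cargs:
            pairs.append((pid, cid, pargs))
    return pairs


# Abbreviated copy of the operators from the plan in this report.
plan = """
00-02 SelectionVectorRemover : rowcount = 100.0
00-03 Limit(fetch=[100]) : rowcount = 100.0
00-04 Limit(fetch=[100]) : rowcount = 100.0
00-05 SelectionVectorRemover : rowcount = 9067.461896625
00-06 TopN(limit=[100]) : rowcount = 9067.461896625
"""

print(find_stacked_limits(plan))
```

On the abbreviated plan this reports the 00-03/00-04 pair. The TopN at 00-06 is not flagged because the check only compares operators named Limit.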