[IGNITE-16396] Calcite engine. Allow hash output distribution for aggregations - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Currently, only a single output distribution is allowed for aggregates. However, if the input has a hash distribution and every grouping set contains all of the distribution keys, the aggregation can be performed on the remote nodes and produce a hash output distribution with the same keys. This reduces memory consumption on the initiator node and makes some other optimizations possible.

For example, consider the query:

SELECT t1.aff_key, t2.cnt FROM t1 JOIN (SELECT aff_key, COUNT(*) AS cnt FROM t2 GROUP BY aff_key) AS t2 ON t1.aff_key = t2.aff_key

This query can use a colocated join if both tables are colocated on aff_key. Currently, the join for such a query is performed on the initiator node.

The same applies to set operations (EXCEPT, INTERSECT).
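
The condition described above can be sketched as a small check. This is only an illustration of the rule "every grouping set contains all of the distribution keys", not the planner code from the linked pull request. The class name `HashAggDistributionSketch`, the method `canKeepHashDistribution`, and the representation of columns as strings are all hypothetical. Returning false for empty inputs is likewise an assumption of this sketch.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not the Ignite Calcite planner implementation. */
final class HashAggDistributionSketch {
    private HashAggDistributionSketch() {
        // Static helper, no instances.
    }

    /**
     * Returns true when an aggregate over hash-distributed input could keep a hash
     * output distribution on the same keys, i.e. when every grouping set contains
     * all of the distribution keys.
     *
     * @param distributionKeys columns the input is hash-distributed on
     * @param groupingSets grouping sets of the aggregate (a plain GROUP BY has exactly one)
     */
    static boolean canKeepHashDistribution(Set<String> distributionKeys, List<Set<String>> groupingSets) {
        // Assumption of this sketch: with no keys or no grouping sets there is
        // no hash distribution to preserve.
        if (distributionKeys.isEmpty() || groupingSets.isEmpty())
            return false;

        for (Set<String> groupingSet : groupingSets) {
            // Rows of one group may live on different nodes if any distribution key is missing.
            if (!groupingSet.containsAll(distributionKeys))
                return false;
        }

        return true;
    }
}
```

With input distributed on aff_key, `canKeepHashDistribution(Set.of("aff_key"), List.of(Set.of("aff_key")))` holds, so the aggregation could stay on the remote nodes. Grouping by a column that is not a distribution key does not satisfy the check, and the aggregate would still need a single output distribution.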

Issue Links

is part of: IGNITE-12248 Apache Calcite based query execution engine (Open)

links to: GitHub Pull Request #9829

People

Assignee: Aleksey Plekhanov

Reporter: Aleksey Plekhanov

Votes: 0

Watchers: 2

Dates

Created: 26/Jan/22 09:55

Updated: 21/Nov/22 07:05

Resolved: 05/Mar/22 08:17

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 50m