[CALCITE-481] Add "Spool" operator, to allow re-use of relational expressions - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.21.0
Component/s: None
Labels: None

Description

If a sub-tree occurs more than once in a query, an efficient plan would probably evaluate it once and have two readers read the same data. We propose a "Spool" relational expression for this purpose.

Spool would have one input, the expression that populates it.

In the VolcanoPlanner, any RelNode can already have multiple consumers (each of which sees the same row type and the same data), but an optimal plan does not typically include multiple uses of the same node. As a result, most implementors (e.g. EnumerableRelImplementor) would not notice the sharing and would generate the same code twice. Having an explicit Spool would alert the implementor to re-use the result.

We do not prescribe a mechanism for implementing Spool as a physical operator. A job that populates a temporary table is one possible mechanism.
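To make the "evaluate once, read many times" idea concrete, here is a minimal sketch in plain Java. It does not use Calcite's API: the class name `SpoolSketch`, the `evaluations()` counter, and the in-memory list standing in for a temporary table are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration only; not Calcite's Spool operator. */
final class SpoolSketch<T> implements Iterable<T> {
  // The single input: the expression that populates the spool.
  private final Supplier<Iterator<T>> input;
  // In-memory buffer standing in for a temporary table (an assumption).
  private List<T> buffer;
  // Counts how many times the input was actually evaluated.
  private int evaluations;

  SpoolSketch(Supplier<Iterator<T>> input) {
    this.input = input;
  }

  /** Every consumer calls this; only the first call runs the input. */
  @Override public Iterator<T> iterator() {
    if (buffer == null) {
      List<T> rows = new ArrayList<>();
      input.get().forEachRemaining(rows::add);
      buffer = rows;
      evaluations++;
    }
    // Later consumers replay the buffered rows.
    return buffer.iterator();
  }

  int evaluations() {
    return evaluations;
  }
}
```

Two consumers iterating over the same `SpoolSketch` see the same rows while the input runs only once. The sketch is single-threaded and fully materializes the input before the first row is returned; a real physical operator could choose differently.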

As part of this case, we should implement Spool in Enumerable convention, and use it to evaluate some test queries.

The other reason to implement Spool is costing. The cost of a Spool with N consumers is typically something like A + B × N, where A, the fixed cost, is significantly larger than B, the re-play cost.
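The trade-off in the formula A + B × N can be sketched numerically. The comparison below assumes a third quantity that the description does not name: C, the cost of evaluating the sub-tree from scratch for each consumer when no Spool is used. The class `SpoolCostSketch` and its method names are hypothetical.

```java
/** Hypothetical cost comparison; not Calcite's cost model. */
final class SpoolCostSketch {
  private SpoolCostSketch() {}

  /** Cost with a Spool: fixed cost a plus re-play cost b per consumer. */
  static double withSpool(double a, double b, int n) {
    return a + b * n;
  }

  /** Cost without a Spool (assumption): re-evaluate at cost c per consumer. */
  static double withoutSpool(double c, int n) {
    return c * n;
  }

  /**
   * Smallest consumer count n for which the Spool is strictly cheaper,
   * i.e. a + b*n < c*n, which rearranges to n > a / (c - b).
   * Returns -1 if re-play is not cheaper than re-evaluation (c <= b).
   */
  static int breakEvenConsumers(double a, double b, double c) {
    if (c <= b) {
      return -1;
    }
    return (int) Math.floor(a / (c - b)) + 1;
  }
}
```

For example, with A = 120, B = 10 and C = 100, a single consumer is cheaper without the Spool (130 versus 100), while two consumers are cheaper with it (140 versus 200).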

Volcano's dynamic programming model does not make it easy to account for re-use. There are approaches in academia based on integer linear programming; see e.g. http://www.slideshare.net/INRIA-OAK/plreuse and https://hal.inria.fr/hal-01353891/document.

Issue Links

is related to

DRILL-3912 Common subexpression elimination in code generation (Closed)

relates to

CALCITE-468 Introduce semi join reduction optimization in Calcite (Open)

CALCITE-2116 The digests are not same for the common sub-expressions in HepPlanner (Closed)

CALCITE-2127 In Interpreter, allow a node to have more than one consumer (Closed)

CALCITE-1440 Implement planner for converting multiple SQL statements to unified RelNode Tree (Open)

CALCITE-6188 Multi-query optimization (Open)

CALCITE-482 Implement SQL and planner hints (Closed)

People

Assignee: Jesús Camacho Rodríguez

Reporter: Julian Hyde

Votes: 0

Watchers: 15

Dates

Created: 25/Nov/14 21:34

Updated: 27/Feb/24 22:23

Resolved: 02/Aug/19 00:09