Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.13.0
Fix Version/s: None
Description
For queries with a JOIN operator and a filter that is a disjunction of expressions,
additional filters can be derived and pushed down to prevent unnecessary scanning.
Query example:
SELECT * FROM t1 JOIN t2 ON T1.COLUMN = T2.COLUMN WHERE (PC = X AND <other filters>) OR (PC = Y AND <some other filters>)
Unit test for TestParquetFilterPushdownWithTransitivePredicates.java:
@Test
public void testForOrOperatorTestOr() throws Exception {
  String query = String.format("SELECT * FROM %s t1 " +
      "JOIN %s t2 ON t1.`month` = t2.`month` " +
      "WHERE ((t1.`period` = 4 AND t2.`year` = 1991) OR (t1.`period` = 3 AND t1.`year` = 1991)) ",
      FIRST_TABLE_NAME, SECOND_TABLE_NAME);

  final String[] expectedPlan = {"first.*numRowGroups=2", "second.*numRowGroups=1"};
  testPlanMatchingPatterns(query, expectedPlan);
}
LogicalProject(**=[$0], **0=[$4])
  LogicalFilter(condition=[OR(AND(=($2, 4), =($6, 1991)), AND(=($2, 3), =($3, 1991)))])
    LogicalJoin(condition=[=($1, $5)], joinType=[inner])
      EnumerableTableScan(table=[[dfs, parquetFilterPush/transitiveClosure/first]])
      EnumerableTableScan(table=[[dfs, parquetFilterPush/transitiveClosure/second]])
This improvement can be solved by CALCITE-2296.
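To illustrate the kind of derivation meant here, below is a minimal sketch, not the CALCITE-2296 implementation. It assumes the WHERE clause is already an OR of AND-lists and that each predicate references exactly one table. The class `DisjunctionFilterSketch`, the helper type `Pred` and the method `deriveFor` are hypothetical names introduced only for this example. A per-table filter is derived only when every OR branch contains at least one predicate on that table; otherwise some branch leaves the table unrestricted and nothing can be pushed down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class DisjunctionFilterSketch {

  /** Hypothetical helper: one comparison that references exactly one table. */
  static final class Pred {
    final String table;
    final String expr;

    Pred(String table, String expr) {
      this.table = table;
      this.expr = expr;
    }
  }

  /**
   * Derives a filter for one table from a WHERE clause given as an OR of AND-lists.
   * Returns empty when some OR branch has no predicate on the table, because that
   * branch then places no restriction on the table's rows.
   */
  static Optional<String> deriveFor(String table, List<List<Pred>> disjuncts) {
    List<String> branches = new ArrayList<>();
    for (List<Pred> conjuncts : disjuncts) {
      // Keep only the conjuncts of this branch that reference the given table.
      List<String> own = new ArrayList<>();
      for (Pred p : conjuncts) {
        if (p.table.equals(table)) {
          own.add(p.expr);
        }
      }
      if (own.isEmpty()) {
        return Optional.empty();
      }
      branches.add("(" + String.join(" AND ", own) + ")");
    }
    if (branches.isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(String.join(" OR ", branches));
  }

  public static void main(String[] args) {
    // The WHERE clause from the unit test above, written as an OR of AND-lists.
    List<List<Pred>> where = List.of(
        List.of(new Pred("t1", "t1.`period` = 4"), new Pred("t2", "t2.`year` = 1991")),
        List.of(new Pred("t1", "t1.`period` = 3"), new Pred("t1", "t1.`year` = 1991")));

    System.out.println("t1: " + deriveFor("t1", where).orElse("<none>"));
    System.out.println("t2: " + deriveFor("t2", where).orElse("<none>"));
  }
}
```

For the unit test's WHERE clause, this sketch derives (t1.`period` = 4) OR (t1.`period` = 3 AND t1.`year` = 1991) for t1 and nothing for t2, since the second branch has no predicate on t2. The derived filter is weaker than the original condition, so the original filter would still have to be applied after the join.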
Issue Links
- requires CALCITE-2296 Extra logic to derive additional filters in the FilterJoinRule (Open)