Description
Discovered while testing 1.15.0 RC0 with Hive. It seems this regression was introduced by CALCITE-1995.
Consider the following query:
select * from (
  select a.*, b.d d1, c.d d2
  from t1 a
  join t2 b on (a.id1 = b.id)
  join t2 c on (a.id2 = b.id)
  where b.d <= 1 and c.d <= 1
) z
where d1 > 1 or d2 > 1
We end up generating the following plan:
HiveProject(id1=[$0], id2=[$1], d1=[$3], d2=[$4])
  HiveJoin(condition=[OR(=($3, 1), =($4, 1))], joinType=[inner], algorithm=[none], cost=[not available])
    HiveJoin(condition=[AND(=($0, $2), =($1, $2))], joinType=[inner], algorithm=[none], cost=[not available])
      HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
        HiveProject(id1=[$0], id2=[$1])
          HiveTableScan(table=[[default.t1]], table:alias=[a])
      HiveFilter(condition=[AND(<=($1, 1), IS NOT NULL($0))])
        HiveProject(id=[$0], d=[$1])
          HiveTableScan(table=[[default.t2]], table:alias=[b])
    HiveFilter(condition=[<=($0, 1)])
      HiveProject(d=[$1])
        HiveTableScan(table=[[default.t2]], table:alias=[c])
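Observe that the condition in the top join is not correct. The inner query already guarantees d1 <= 1 and d2 <= 1, so the outer filter "d1 > 1 or d2 > 1" can never hold and should simplify to false. Instead it was rewritten to OR(=($3, 1), =($4, 1)), i.e. "d1 = 1 or d2 = 1", which does match rows.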
I can reproduce this in RexProgramTest.simplifyFilter with the following example:
// condition "a > 5 or b > 5"
// with pre-condition "a <= 5 and b <= 5"
// should yield "false" but yields "a = 5 or b = 5"
checkSimplifyFilter(or(gt(aRef, literal5), gt(bRef, literal5)),
    RelOptPredicateList.of(rexBuilder,
        ImmutableList.of(le(aRef, literal5), le(bRef, literal5))),
    "false");
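To make the expected result concrete, here is a small standalone sketch that does not use Calcite at all. It brute-forces integer pairs and compares the original condition with the reported simplification under the pre-condition. The class and method names (SimplifyFilterCheck, countConditionTrue, countReportedTrue) are hypothetical and only serve this illustration; the range 0..10 is an arbitrary choice.

```java
public class SimplifyFilterCheck {
    // Filter condition from the example: a > 5 OR b > 5.
    static boolean condition(int a, int b) {
        return a > 5 || b > 5;
    }

    // Known predicates (pre-condition): a <= 5 AND b <= 5.
    static boolean precondition(int a, int b) {
        return a <= 5 && b <= 5;
    }

    // The incorrect simplification reported above: a = 5 OR b = 5.
    static boolean reported(int a, int b) {
        return a == 5 || b == 5;
    }

    // Counts pairs in [lo, hi] x [lo, hi] that satisfy the pre-condition
    // and the original condition at the same time.
    static int countConditionTrue(int lo, int hi) {
        int n = 0;
        for (int a = lo; a <= hi; a++) {
            for (int b = lo; b <= hi; b++) {
                if (precondition(a, b) && condition(a, b)) {
                    n++;
                }
            }
        }
        return n;
    }

    // Counts pairs in [lo, hi] x [lo, hi] that satisfy the pre-condition
    // and the reported (incorrect) simplification.
    static int countReportedTrue(int lo, int hi) {
        int n = 0;
        for (int a = lo; a <= hi; a++) {
            for (int b = lo; b <= hi; b++) {
                if (precondition(a, b) && reported(a, b)) {
                    n++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // The original condition never holds under the pre-condition.
        System.out.println(countConditionTrue(0, 10)); // prints 0
        // The reported simplification accepts rows, so it is not equivalent.
        System.out.println(countReportedTrue(0, 10));  // prints 11
    }
}
```

Since the first count is 0 while the second is not, the simplified expression lets through rows that the original filter rejects, which is why "false" is the only correct result here.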
Issue Links
- blocks: HIVE-18068 Upgrade to Calcite 1.15 (Closed)
- is related to: CALCITE-1995 Remove predicates from Filter if they can be proved to be always true or false (Closed)