Details
Sub-task
Status: Open
Major
Resolution: Unresolved
3.2.0
None
None
Description
A join can become unresolved after `PullupCorrelatedPredicates`:
create view t1(c1, c2) as values (0, 1), (1, 2)
create view t2(c1, c2) as values (0, 2), (0, 3)

select (
  select sum(l.cnt + r.cnt)
  from (select count(*) cnt from t2 where t1.c1 = t2.c1 having cnt = 0) l
  join (select count(*) cnt from t2 where t1.c1 = t2.c1 having cnt = 0) r
  on l.cnt = r.cnt
) from t1

== Optimized Logical Plan ==
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input '(' expecting {<EOF>, '.', '-'}(line 1, pos 14)

== SQL ==
scalarsubquery(c1, c1)
--------------^^^
This is because duplicate attributes are not handled correctly when pulling up correlated predicates over joins. Both `pullOutCorrelatedPredicates` and `DecorrelateInnerQuery` are subject to this issue.
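To illustrate the kind of clash described above, here is a minimal toy model in Python. It is not Spark code: `Attr`, `conflicting_ids`, and `dedup_right` are hypothetical names, and the sketch assumes the problem is that both join children expose an attribute with the same expression ID once the inner side of the correlated predicate (`t2.c1`) is added to each child's output. A possible remedy, shown here only as an assumption, is to give the right-hand duplicates fresh IDs and record the mapping so that references above the join can be rewritten.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Attr:
    """Toy stand-in for a plan attribute: a name plus a unique expression ID."""
    name: str
    expr_id: int


def conflicting_ids(left, right):
    """Expression IDs that appear in the output of both join children."""
    return {a.expr_id for a in left} & {a.expr_id for a in right}


def dedup_right(left, right, fresh_ids):
    """Re-issue IDs for right-side attributes that clash with the left side.

    Returns the new right-side output and an old -> new mapping that a
    caller would use to rewrite references to the right side.
    """
    clashes = conflicting_ids(left, right)
    mapping = {}
    new_right = []
    for attr in right:
        if attr.expr_id in clashes:
            renamed = Attr(attr.name, next(fresh_ids))
            mapping[attr] = renamed
            new_right.append(renamed)
        else:
            new_right.append(attr)
    return new_right, mapping


# Assumed situation after pull-up: each child now outputs its own `cnt`
# plus `c1`, and both copies of `c1` carry the same ID (7 is arbitrary).
left = [Attr("cnt", 10), Attr("c1", 7)]
right = [Attr("cnt", 11), Attr("c1", 7)]

new_right, mapping = dedup_right(left, right, count(100))
```

In this model the join over `left` and `right` is ambiguous because `c1` with ID 7 could come from either side; after `dedup_right`, the two outputs no longer share any ID.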