Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 3.0.0
- Fix Version/s: None
Description
The following code fails the ambiguous self-join check, even though it does not actually contain a self join:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

val v1 = spark.range(3).toDF("m")
val v2 = spark.range(3).toDF("d")
val v3 = v1.join(v2, v1("m") === v2("d"))
val v4 = v3("d")
val w1 = Window.partitionBy(v4)
val out = v3.select(v4.as("a"), sum(v4).over(w1).as("b"))
org.apache.spark.sql.AnalysisException: Column a#45L are ambiguous. It's probably because you joined several Datasets together, and some of these Datasets are the same. This column points to one of the Datasets but Spark is unable to figure out which one. Please alias the Datasets with different names via `Dataset.as` before joining them, and specify the column using qualified name, e.g. `df.as("a").join(df.as("b"), $"a.id" > $"b.id")`. You can also set spark.sql.analyzer.failAmbiguousSelfJoin to false to disable this check.;
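The error message itself suggests two mitigations: alias the joined Datasets and refer to columns by qualified name, or disable the check through spark.sql.analyzer.failAmbiguousSelfJoin. Below is a minimal workaround sketch, assuming Spark 3.0.x and a local session; the SparkSession builder settings are illustrative and not part of the original report.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

val spark = SparkSession.builder()
  .appName("self-join-check-workaround")
  .master("local[*]")
  // Disable the overly strict ambiguous self-join check, as suggested in the error message.
  .config("spark.sql.analyzer.failAmbiguousSelfJoin", "false")
  .getOrCreate()

val v1 = spark.range(3).toDF("m")
val v2 = spark.range(3).toDF("d")
val v3 = v1.join(v2, v1("m") === v2("d"))
val v4 = v3("d")
val w1 = Window.partitionBy(v4)
// With the check disabled, the select resolves normally instead of raising AnalysisException.
val out = v3.select(v4.as("a"), sum(v4).over(w1).as("b"))
out.show()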
Issue Links
- duplicates SPARK-31956: Do not fail if there is no ambiguous self join (Resolved)