Description
The following Spark shell snippet reproduces this issue:
spark.range(10).createOrReplaceTempView("t1")
spark.range(10).map(i => i: java.lang.Long).toDF("id").createOrReplaceTempView("t2")
sql("SELECT struct(id) FROM t1 UNION ALL SELECT struct(id) FROM t2")
org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. StructType(StructField(id,LongType,true)) <> StructType(StructField(id,LongType,false)) at the first column of the second table;
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:57)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$11$$anonfun$apply$12.apply(CheckAnalysis.scala:291)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$11$$anonfun$apply$12.apply(CheckAnalysis.scala:289)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$11.apply(CheckAnalysis.scala:289)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$11.apply(CheckAnalysis.scala:278)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:278)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:132)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57)
  at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:61)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:573)
  ... 50 elided
The reason is that we treat two StructTypes as incompatible even if they differ from each other only in field nullability.
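To illustrate the distinction, here is a minimal, self-contained sketch of a nullability-insensitive type comparison. It does not use Spark's classes and is not Spark's implementation: `DType`, `LongT`, `FieldT`, `StructT`, and the `NullabilityInsensitive` helper are hypothetical stand-ins invented for this example. The idea is to erase nullability on both sides before comparing, so that plain equality fails for the two struct types in the error above while the relaxed check succeeds.

```scala
// Toy model of a type tree. These are NOT Spark's classes, only
// hypothetical stand-ins used to illustrate the comparison.
sealed trait DType
case object LongT extends DType
case class FieldT(name: String, dataType: DType, nullable: Boolean)
case class StructT(fields: List[FieldT]) extends DType

object NullabilityInsensitive {
  // Rewrite a type so that every (possibly nested) field is nullable.
  def asNullable(t: DType): DType = t match {
    case StructT(fields) =>
      StructT(fields.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case other => other
  }

  // Two types count as the same if they are equal once nullability is erased.
  def sameType(a: DType, b: DType): Boolean = asNullable(a) == asNullable(b)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val left  = StructT(List(FieldT("id", LongT, nullable = true)))
    val right = StructT(List(FieldT("id", LongT, nullable = false)))

    println(left == right)                                // strict equality: false
    println(NullabilityInsensitive.sameType(left, right)) // ignoring nullability: true
  }
}
```

Under this sketch, a union check that used `sameType` instead of strict equality would accept the two branches of the query; the result type would then need to be nullable wherever either side is nullable.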