Description
Spark supports merging schemas across table partitions where one partition is missing a subfield that is present in another. However, selecting that missing subfield with a query whose partition pruning predicate filters out every partition that contains the subfield fails with a `ParquetDecodingException` when the query results are fetched.
This bug is specifically exercised by the failing (but ignored) test case https://github.com/apache/spark/blob/f2d35427eedeacceb6edb8a51974a7e8bbb94bc2/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala#L125-L131.
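The scenario described above can be sketched as a standalone program. This is a minimal, hedged reproduction attempt, not the linked test itself: the case classes, sample rows, temporary directory, and the object name `MissingSubfieldRepro` are all hypothetical, and enabling `spark.sql.optimizer.nestedSchemaPruning.enabled` is an assumption based on the test suite's name. Whether the final query throws depends on the Spark version in use.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

object MissingSubfieldRepro {
  // Hypothetical schemas: partition p=1 has name.middle, partition p=2 does not.
  case class FullName(first: String, middle: String, last: String)
  case class Contact(id: Int, name: FullName, address: String)
  case class BriefName(first: String, last: String)
  case class BriefContact(id: Int, name: BriefName, address: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("missing-subfield-repro")
      // Assumption: the failure is tied to nested schema pruning being enabled.
      .config("spark.sql.optimizer.nestedSchemaPruning.enabled", "true")
      .getOrCreate()
    import spark.implicits._

    val base = Files.createTempDirectory("contacts").toString

    // Write each partition directory with its own Parquet schema.
    Seq(Contact(0, FullName("Jane", "X.", "Doe"), "123 Main Street"))
      .toDF().write.parquet(s"$base/p=1")
    Seq(BriefContact(1, BriefName("Janet", "Jones"), "567 Maple Drive"))
      .toDF().write.parquet(s"$base/p=2")

    // Merge the two partition schemas so the table schema includes name.middle.
    spark.read.option("mergeSchema", "true").parquet(base)
      .createOrReplaceTempView("contacts")

    // The predicate keeps only p=2, whose files lack name.middle.
    // On affected versions, fetching the results is where the exception surfaces.
    spark.sql("select name.middle, address from contacts where p = 2").show()

    spark.stop()
  }
}
```

On a version where the problem is fixed, the selected `name.middle` column would be expected to come back as null for the `p = 2` rows rather than raising an error.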
Issue Links
- causes
  - SPARK-31116 ParquetRowConverter does not follow case sensitivity (Resolved)
- is depended upon by
  - SPARK-31536 Backport SPARK-25407 Allow nested access for non-existent field for Parquet file when nested pruning is enabled (Resolved)
- is duplicated by
  - SPARK-25879 Schema pruning fails when a nested field and top level field are selected (Resolved)
- links to