[PARQUET-1656] Schema change results in exception - java.lang.ClassCastException - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.8.1, 1.12.0
Fix Version/s: None
Component/s: parquet-avro
Labels:
- Parquet
- avro
Environment:

Hoodie/Parquet/Avro

Parquet-1.8.1

Avro-1.7.6

External issue URL:
https://issues.apache.org/jira/browse/PARQUET-1681
External issue ID:
PARQUET-1681

Description

The following exception was seen with Parquet 1.8.1, and again with Parquet 1.12.0 when trying to reproduce it.

Exception in thread "main" java.lang.ClassCastException: optional binary phone_number (STRING) is not a group
at com.uber.komondor.shaded.org.apache.parquet.schema.Type.asGroupType(Type.java:250)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:279)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:232)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.access$100(AvroRecordConverter.java:78)
at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter$ElementConverter.<init>(AvroRecordConverter.java:536)
at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter.<init>(AvroRecordConverter.java:486)
at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:289)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:141)
at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:279)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:141)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:95)
at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:138)
at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
at util.ParquetToAvroSchemaConverter$.convert(ParquetToAvroSchemaConverter.scala:46)
at util.ParquetToAvroSchemaConverter$.main(ParquetToAvroSchemaConverter.scala:20)
at util.ParquetToAvroSchemaConverter.main(ParquetToAvroSchemaConverter.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

The original exception was triggered by the following schema change.
Schema before the change:
{
  "default": null,
  "name": "master_cluster",
  "type": [
    "null",
    {
      "fields": [
        { "name": "uuid", "type": "string" },
        { "name": "namespace", "type": "string" },
        { "name": "version", "type": "long" }
      ],
      "name": "master_cluster",
      "type": "record"
    }
  ]
},

Schema after the change:
{
  "default": null,
  "name": "master_cluster",
  "type": [
    "null",
    {
      "fields": [
        { "default": null, "name": "uuid", "type": [ "null", "string" ] },
        { "default": null, "name": "namespace", "type": [ "null", "string" ] },
        { "default": null, "name": "version", "type": [ "null", "long" ] }
      ],
      "name": "VORGmaster_cluster",
      "type": "record"
    }
  ]
},
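
To make the change easier to see, the two schema fragments can be compared field by field. The sketch below is illustrative only: `SchemaChangeDiff` and its helpers are hypothetical names, and the schemas are hand-transcribed into plain Java maps (with `null|string` standing in for the union `["null","string"]`) rather than parsed with Avro. It reports two kinds of difference: the nested record was renamed, and every field went from a plain type to a nullable union.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaChangeDiff {
    // Hypothetical helper: lists differences between two hand-transcribed record definitions.
    static List<String> diff(String oldName, Map<String, String> oldFields,
                             String newName, Map<String, String> newFields) {
        List<String> changes = new ArrayList<>();
        if (!oldName.equals(newName)) {
            changes.add("record name: " + oldName + " -> " + newName);
        }
        for (Map.Entry<String, String> e : oldFields.entrySet()) {
            String newType = newFields.get(e.getKey());
            if (newType == null) {
                changes.add("field removed: " + e.getKey());
            } else if (!newType.equals(e.getValue())) {
                changes.add("field " + e.getKey() + ": " + e.getValue() + " -> " + newType);
            }
        }
        for (String name : newFields.keySet()) {
            if (!oldFields.containsKey(name)) {
                changes.add("field added: " + name);
            }
        }
        return changes;
    }

    // Fields of the nested record before the change (insertion order preserved).
    static Map<String, String> before() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("uuid", "string");
        m.put("namespace", "string");
        m.put("version", "long");
        return m;
    }

    // Fields of the nested record after the change: each type is now a nullable union.
    static Map<String, String> after() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("uuid", "null|string");
        m.put("namespace", "null|string");
        m.put("version", "null|long");
        return m;
    }

    public static void main(String[] args) {
        for (String change : diff("master_cluster", before(), "VORGmaster_cluster", after())) {
            System.out.println(change);
        }
    }
}
```

The outer field keeps the name `master_cluster` in both versions; only the nested record's name differs.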

We suspected that ~~PARQUET-1441~~ could be in play, so we tried to reproduce the issue on parquet-1.12.0 and saw the same exception.

During the repro we noticed that the issue could lie in the Avro schema conversion: in the log below, the record name in the writer schema appears as the generic name "array". While we look into this further, we would like community input on whether this is a known issue, and any thoughts on a path forward.

19/09/12 22:34:37 DEBUG avro.SchemaCompatibility: Checking compatibility of reader {"type":"record","name":"IDphones_items","fields":[{"name":"phone_number","type":["null","string"],"default":null}]} with writer {"type":"record","name":"array","fields":[{"name":"phone_number","type":["null","string"],"default":null}]}
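
The log line shows reader and writer record schemas that carry the same single field but different record names (`IDphones_items` versus `array`). The sketch below is a toy model, not Avro's `SchemaCompatibility` implementation: `RecordNameCheck`, `RecordShape`, `sameFields`, and `sameName` are hypothetical names chosen for illustration. It only demonstrates that a comparison which also looks at the record name would tell the two schemas apart even though their fields match. Whether that is what triggers the exception is still under investigation.

```java
import java.util.Arrays;
import java.util.List;

public class RecordNameCheck {
    // Hypothetical stand-in for a record schema: a name plus "field:type" descriptors.
    static final class RecordShape {
        final String name;
        final List<String> fields;

        RecordShape(String name, String... fields) {
            this.name = name;
            this.fields = Arrays.asList(fields);
        }
    }

    // True when both records declare the same fields in the same order.
    static boolean sameFields(RecordShape reader, RecordShape writer) {
        return reader.fields.equals(writer.fields);
    }

    // True when both records have the same name.
    static boolean sameName(RecordShape reader, RecordShape writer) {
        return reader.name.equals(writer.name);
    }

    public static void main(String[] args) {
        // Values transcribed from the DEBUG log line above.
        RecordShape reader = new RecordShape("IDphones_items", "phone_number:null|string");
        RecordShape writer = new RecordShape("array", "phone_number:null|string");

        System.out.println("fields match: " + sameFields(reader, writer));
        System.out.println("names match: " + sameName(reader, writer));
    }
}
```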

People

Assignee: Xinli Shang
Reporter: Balajee Nagasubramaniam
Votes: 1
Watchers: 4

Dates

Created: 18/Sep/19 18:14
Updated: 23/Jun/24 03:31