Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 3.2.0
- None
- None
Description
I am filing this issue to report that when we tested Spark 3.2.0, we saw a significant performance degradation in code that handles Avro SpecificRecord objects. Some of our jobs slowed down by a factor of 4.
Spark 3.2.0 upgrades Avro from 1.8.2 to 1.10.2.
The degradation is caused by a change introduced in Avro 1.9.0. This change slows down the creation of Avro specific records in certain classloader topologies, such as the ones used in Spark.
We reported the problem upstream to the Avro project and proposed a simple fix there. (The linked issues contain more details.)
It is unclear to us how many other projects use Avro specific records in a Spark context and will be affected by this degradation.
Feel free to close this issue if you think it is too much of a corner case.
Issue Links
- is blocked by: AVRO-3156 Performance degradation in SpecificRecordBuilder introduced in 1.9.0 (Resolved)
- is fixed by: SPARK-37206 Upgrade Avro to 1.11.0 (Resolved)