Description
When loading a Word2VecModel of compressed size 58 MB using the Word2VecModel.load() method introduced in Spark 1.4.0, I get an `org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 2` exception.
This happens because the model is saved as a single file with no partitioning, and the Kryo buffer overflows when it tries to serialize the whole model at once.
Increasing `spark.kryoserializer.buffer.max` works as a temporary solution, but the limit needs to be increased again whenever the model size grows.
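For reference, the temporary workaround can be sketched as below. This is only a minimal illustration, not a fix: the `512m` value, the application name, and the model path are hypothetical placeholders, and the master URL is assumed to be supplied by `spark-submit`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.Word2VecModel

object LoadWord2VecWorkaround {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("LoadWord2VecWorkaround") // hypothetical application name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Assumed value for illustration only; it has to be raised again
      // whenever the serialized model outgrows it.
      .set("spark.kryoserializer.buffer.max", "512m")

    val sc = new SparkContext(conf)
    try {
      // Hypothetical path to a model previously written with model.save(sc, path).
      val model = Word2VecModel.load(sc, "/path/to/word2vec-model")
      println(s"Loaded ${model.getVectors.size} word vectors")
    } finally {
      sc.stop()
    }
  }
}
```

The same setting can also be passed on the command line with `--conf spark.kryoserializer.buffer.max=512m`, which avoids hard-coding a limit that will need to change as the model grows.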
Issue Links
- is related to
  - SPARK-15740 Word2VecSuite "big model load / save" caused OOM in maven jenkins builds (Resolved)
- relates to
  - SPARK-6725 Model export/import for Pipeline API (Scala) (Resolved)
- links to