Description
When using MLlib, calling toJSON on a query plan with many levels of sub-queries can cause an out-of-memory error with a stack trace like this:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at scala.collection.mutable.AbstractSeq.<init>(Seq.scala:47)
    at scala.collection.mutable.AbstractBuffer.<init>(Buffer.scala:48)
    at scala.collection.mutable.ListBuffer.<init>(ListBuffer.scala:46)
    at scala.collection.immutable.List$.newBuilder(List.scala:396)
    at scala.collection.generic.GenericTraversableTemplate$class.newBuilder(GenericTraversableTemplate.scala:64)
    at scala.collection.AbstractTraversable.newBuilder(Traversable.scala:105)
    at scala.collection.TraversableLike$class.filter(TraversableLike.scala:262)
    at scala.collection.AbstractTraversable.filter(Traversable.scala:105)
    at scala.collection.TraversableLike$class.filterNot(TraversableLike.scala:274)
    at scala.collection.AbstractTraversable.filterNot(Traversable.scala:105)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:20)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:20)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:20)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:25)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:20)
    at org.json4s.jackson.JValueSerializer.serialize(JValueSerializer.scala:7)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128)
    at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:2881)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2338)
    at org.json4s.jackson.JsonMethods$class.compact(JsonMethods.scala:34)
    at org.json4s.jackson.JsonMethods$.compact(JsonMethods.scala:50)
    at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:566)
The query plan, stack trace, and jmap distribution are attached.