Description
RDD is to DataFrame as Graph is to ??? (this JIRA).
It would be very useful in the long term to have a Graph type backed by two DataFrames (vertices and edges) instead of two RDDs.
The immediate benefit I have in mind is taking advantage of Spark SQL data sources and storage formats.
This could also be an opportunity to design an API that is more Java- and Python-friendly.
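To make the idea concrete, here is a minimal sketch of what such a type might look like from Python. Everything in it is an assumption for illustration: the class name `DataFrameGraph` and the column names `id`, `src`, and `dst` are hypothetical and are not specified anywhere in this issue. The sketch only relies on the inputs exposing a `.columns` list of names, as `pyspark.sql.DataFrame` does, so it does not import Spark itself.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical column names; this issue does not define a schema.
VERTEX_ID = "id"
EDGE_SRC = "src"
EDGE_DST = "dst"


@dataclass(frozen=True)
class DataFrameGraph:
    """Hypothetical graph type backed by two DataFrame-like tables."""

    vertices: Any  # e.g. a pyspark.sql.DataFrame with an "id" column
    edges: Any     # e.g. a pyspark.sql.DataFrame with "src" and "dst" columns

    def __post_init__(self):
        # Only the `.columns` attribute (a list of column names) is used,
        # so any DataFrame-like object works here.
        missing_v = {VERTEX_ID} - set(self.vertices.columns)
        missing_e = {EDGE_SRC, EDGE_DST} - set(self.edges.columns)
        if missing_v:
            raise ValueError(f"vertices is missing columns: {sorted(missing_v)}")
        if missing_e:
            raise ValueError(f"edges is missing columns: {sorted(missing_e)}")
```

With PySpark, `vertices` and `edges` could be DataFrames loaded through any Spark SQL data source reader, which is the kind of reuse the description has in mind. Extra columns on either table would simply ride along as vertex or edge attributes.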
Issue Links
- is required by SPARK-7258: spark.ml API taking Graph instead of DataFrame (Closed)