  1. SystemDS
  2. SYSTEMDS-969

Add Vector As Supported Type In Frame Conversions

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: SystemML 0.11

    Description

      Currently, we are able to pass in Spark DataFrames with Vector type columns as input to SystemML scripts to be converted to a SystemML matrix. We should add the Vector type to the frame conversion code too.

      Example: Given a Spark DataFrame with schema <double, vector, string>, we should be able to convert it to a SystemML frame consisting of the leading double column, one double column per vector element, and the final string column.
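
      The intended row-level mapping can be sketched in plain Python. This is only an illustration under stated assumptions, not the SystemML converter: `flatten_row` is a hypothetical helper, and a Python list stands in for a Spark Vector.

```python
from typing import List, Sequence, Union

Cell = Union[float, str]


def flatten_row(row: Sequence[object]) -> List[Cell]:
    """Expand each list/tuple field (a stand-in for a Spark Vector)
    into one double cell per element; copy scalar fields unchanged."""
    out: List[Cell] = []
    for field in row:
        if isinstance(field, (list, tuple)):
            # Vector field: one frame cell per vector element.
            out.extend(float(v) for v in field)
        else:
            # Scalar field (double, string, ...): keep as-is.
            out.append(field)
    return out


# A row with schema <double, vector, string> and a vector of length 3.
row = (1.5, [0.1, 0.2, 0.3], "label_a")
frame_row = flatten_row(row)
print(frame_row)  # [1.5, 0.1, 0.2, 0.3, 'label_a']
```

      Here a three-field DataFrame row becomes a five-column frame row: four double cells followed by the string cell.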

      cc mboehm7, acs_s

        Activity

          mboehm7 Matthias Boehm added a comment - edited

          OK, thanks again for pointing out this missing functionality. We now support DataFrame-frame conversions with mixed schemas that include vector columns. We make one simplifying assumption, though: we allow only a single vector column in the schema (at an arbitrary position and mixed with arbitrary scalar fields), because this lets us handle schema information, including the vector size, without looking at the data.
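
          One way to read the single-vector restriction: if the total number of frame columns is known from metadata, the length of a lone vector column follows by subtraction, whereas two vector columns would leave their individual lengths ambiguous. The sketch below illustrates that reading only; `expand_schema`, the string type names, and the `frame_ncols` parameter are assumptions made for illustration and are not taken from the SystemML code.

```python
from typing import List, Sequence


def expand_schema(df_schema: Sequence[str], frame_ncols: int) -> List[str]:
    """Derive a frame schema from a DataFrame schema without inspecting data.

    df_schema   -- column types, e.g. ["double", "vector", "string"]
    frame_ncols -- total number of frame columns (assumed known up front)
    """
    vec_positions = [i for i, t in enumerate(df_schema) if t == "vector"]
    if len(vec_positions) > 1:
        # With two or more vectors, the individual lengths cannot be
        # recovered from the total column count alone.
        raise ValueError("at most one vector column is supported")
    if not vec_positions:
        if frame_ncols != len(df_schema):
            raise ValueError("column count does not match scalar-only schema")
        return list(df_schema)

    # Every non-vector field contributes exactly one frame column.
    vec_len = frame_ncols - (len(df_schema) - 1)
    if vec_len < 1:
        raise ValueError("frame has too few columns for this schema")

    pos = vec_positions[0]
    return list(df_schema[:pos]) + ["double"] * vec_len + list(df_schema[pos + 1:])


print(expand_schema(["double", "vector", "string"], 5))
# ['double', 'double', 'double', 'double', 'string']
```

          The vector may sit at any position in the schema; only the count of vector columns is restricted.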

          mboehm7 Matthias Boehm added a comment

          Good point. Let's try to get this into the upcoming rc2 for 0.11.

          dusenberrymw Mike Dusenberry added a comment

          mboehm7 I wanted to try the workaround of converting a DataFrame I have to a SystemML frame to avoid shuffles, but the frame converters do not yet allow for Vector types.

          People

            Assignee: mboehm7 Matthias Boehm
            Reporter: dusenberrymw Mike Dusenberry
            Votes: 0
            Watchers: 2