Spark / SPARK-43797 Python User-defined Table Functions / SPARK-44559

Improve error messages for Python UDTF arrow type casts

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.0
    • Component/s: PySpark
    • Labels: None

    Description

      Currently, if a Python UDTF outputs a type that is incompatible with the specified output schema, Spark will throw the following confusing error message:

        File "pyarrow/array.pxi", line 1044, in pyarrow.lib.Array.from_pandas
        File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
        File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
        File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
      pyarrow.lib.ArrowInvalid: Could not convert [1, 2] with type list: tried to convert to int32

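      We should improve this, for example by reporting that the UDTF's output could not be cast to the type declared in the output schema.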
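
      As a rough illustration of what a clearer message could look like, the sketch below validates one UDTF output row against a declared schema before any Arrow conversion. It is not how Spark implements the fix: `check_row`, `UDTFOutputTypeError`, and the `PYTHON_TYPES` mapping are hypothetical names invented for this example, and the schema is simplified to (column name, type name) pairs.

```python
# Hypothetical sketch: none of these names are PySpark or PyArrow APIs.
from typing import Any, Sequence, Tuple

# Illustrative mapping from a few schema type names to accepted Python types.
PYTHON_TYPES = {"int": int, "string": str, "double": float}


class UDTFOutputTypeError(TypeError):
    """Raised when a UDTF row does not match the declared output schema."""


def check_row(row: Sequence[Any], schema: Sequence[Tuple[str, str]]) -> None:
    """Validate one output row against (column name, type name) pairs."""
    if len(row) != len(schema):
        raise UDTFOutputTypeError(
            f"UDTF returned {len(row)} value(s) but the output schema "
            f"declares {len(schema)} column(s)."
        )
    for value, (name, type_name) in zip(row, schema):
        expected = PYTHON_TYPES[type_name]
        # None is treated as a valid null for any column in this sketch.
        if value is not None and not isinstance(value, expected):
            raise UDTFOutputTypeError(
                f"Column '{name}' is declared as '{type_name}' in the output "
                f"schema, but the UDTF produced {value!r} of type "
                f"'{type(value).__name__}'."
            )


# Same shape as the failure in the traceback: a list where an int is declared.
try:
    check_row(([1, 2],), [("x", "int")])
except UDTFOutputTypeError as e:
    message = str(e)
    print(message)
```

      Unlike the PyArrow traceback, a message built this way names the offending column, the declared type, and the value the UDTF actually produced.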

          People

            Assignee: Allison Wang (allisonwang-db)
            Reporter: Allison Wang (allisonwang-db)