SPARK-41743 (sub-task of SPARK-41279 Feature parity: DataFrame API in Spark Connect)

groupBy(...).agg(...).sort does not actually sort the output

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect
    • Labels: None

    Description

      **********************************************************************
      File "/.../spark/python/pyspark/sql/connect/group.py", line 211, in pyspark.sql.connect.group.GroupedData.agg
      Failed example:
          df.groupBy(df.name).agg(F.min(df.age)).sort("name").show()
      Differences (ndiff with -expected +actual):
            +-----+--------+
            | name|min(age)|
            +-----+--------+
          + |  Bob|       5|
            |Alice|       2|
          - |  Bob|       5|
            +-----+--------+
          + <BLANKLINE>
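
The expected ordering in the doctest above can be modelled in plain Python, without Spark. This is only an illustrative sketch: the input rows and the helper name `min_age_by_name` are assumptions made for the example (the issue does not show the DataFrame contents), and the code is not the Spark Connect implementation or its fix.

```python
from operator import itemgetter

# Hypothetical input rows as (age, name); chosen so the aggregates match
# the values shown in the doctest (Alice -> 2, Bob -> 5).
rows = [(5, "Bob"), (2, "Alice"), (10, "Bob"), (3, "Alice")]


def min_age_by_name(rows):
    """Plain-Python model of groupBy(name).agg(min(age)).sort("name")."""
    mins = {}
    for age, name in rows:
        # Keep the smallest age seen so far for each name.
        mins[name] = min(age, mins.get(name, age))
    # The trailing sort is the step whose effect is missing in the
    # actual output reported above (Bob was printed before Alice).
    return sorted(mins.items(), key=itemgetter(0))


result = min_age_by_name(rows)
print(result)
```

With the sort applied, Alice precedes Bob regardless of the input row order, which is the ordering the doctest expects.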
      

          People

            Assignee: grundprinzip-db Martin Grund
            Reporter: gurwls223 Hyukjin Kwon