Details
-
Sub-task
-
Status: Open
-
Critical
-
Resolution: Unresolved
-
3.0.0
-
None
-
None
Description
A nested aggregate below with a window function seems to have different answers in the `rsum` column between PgSQL and Spark;
postgres=# create table gstest2 (a integer, b integer, c integer, d integer, e integer, f integer, g integer, h integer); postgres=# insert into gstest2 values postgres-# (1, 1, 1, 1, 1, 1, 1, 1), postgres-# (1, 1, 1, 1, 1, 1, 1, 2), postgres-# (1, 1, 1, 1, 1, 1, 2, 2), postgres-# (1, 1, 1, 1, 1, 2, 2, 2), postgres-# (1, 1, 1, 1, 2, 2, 2, 2), postgres-# (1, 1, 1, 2, 2, 2, 2, 2), postgres-# (1, 1, 2, 2, 2, 2, 2, 2), postgres-# (1, 2, 2, 2, 2, 2, 2, 2), postgres-# (2, 2, 2, 2, 2, 2, 2, 2); INSERT 0 9 postgres=# postgres=# select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum postgres-# from gstest2 group by rollup (a,b) order by rsum, a, b; a | b | sum | rsum ---+---+-----+------ 1 | 1 | 8 | 8 1 | 2 | 2 | 10 1 | | 10 | 20 2 | 2 | 2 | 22 2 | | 2 | 24 | | 12 | 36 (6 rows)
scala> sql(""" | select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum | from gstest2 group by rollup (a,b) order by rsum, a, b | """).show() +----+----+------+----+ | a| b|sum(c)|rsum| +----+----+------+----+ |null|null| 12| 12| | 1|null| 10| 22| | 1| 1| 8| 30| | 1| 2| 2| 32| | 2|null| 2| 34| | 2| 2| 2| 36| +----+----+------+----+
Attachments
Issue Links
- is related to
-
SPARK-29708 Different answers in aggregates of duplicate grouping sets
- Resolved