Description
mutate() in the dplyr package supports both adding new columns and replacing existing ones, but the current implementation of mutate() in SparkR only supports adding new columns.
This change also makes the behavior of mutate() more consistent with dplyr:
1. Throw an error when the DataFrame being mutated contains duplicated column names.
2. When the columns specified as arguments contain duplicated names, the last column with a given name takes effect.
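The two rules above, together with column replacement, can be sketched as a small plain-Python model. This is not SparkR code or the actual implementation: the function signature, the representation of columns as (name, value) pairs, and the ordering (a replaced column keeps its position, new columns are appended) are all assumptions made for illustration.

```python
def mutate(existing, assignments):
    """Hypothetical model of the proposed mutate() semantics.

    existing:    list of (name, value) pairs for columns already in the frame.
    assignments: list of (name, value) pairs passed as arguments, in call order.
    """
    names = [name for name, _ in existing]

    # Rule 1: refuse to mutate a frame that already has duplicated column names.
    duplicated = sorted({n for n in names if names.count(n) > 1})
    if duplicated:
        raise ValueError(f"duplicated column names in input: {duplicated}")

    # Rule 2: when the same name is assigned more than once,
    # the last assignment overrides the earlier ones.
    final = {}
    for name, value in assignments:
        final[name] = value

    # Replace existing columns in place (assumed ordering), then append new ones.
    result = [(name, final[name] if name in final else value)
              for name, value in existing]
    result += [(name, value) for name, value in final.items()
               if name not in names]
    return result


frame = [("a", [1, 2]), ("b", [3, 4])]
out = mutate(frame, [("a", [10, 20]), ("c", [5, 6]), ("c", [7, 8])])
print(out)
```

Here "a" is replaced rather than added a second time, and only the last assignment to "c" survives. Passing a frame such as `[("a", 1), ("a", 2)]` raises a `ValueError` under this model.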
Issue Links
- duplicates
  - SPARK-10346 SparkR mutate and transform should replace column with same name to match R data.frame behavior (Resolved)
- is duplicated by
  - SPARK-10346 SparkR mutate and transform should replace column with same name to match R data.frame behavior (Resolved)
- is related to
  - SPARK-12225 Support adding or replacing multiple columns at once in DataFrame API (Resolved)