Details
Type: Sub-task
Status: In Progress
Priority: Major
Resolution: Unresolved
3.3.0
None
None
Description
The method `ShuffledRowRDD#getPreferredLocations` has several issues:
- It does not respect the config `spark.shuffle.reduceLocality.enabled`, so reduce-side locality cannot be disabled for it.
- It does not respect `REDUCER_PREF_LOCS_FRACTION`, so the preferred locations bring no benefit when the DAG scheduler assigns a task to an executor that holds only a small share of the data. Worse, the driver uses more memory to store these useless locations.
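
The behaviour both points ask for can be sketched as follows. This is a minimal, self-contained illustration and not Spark's actual code: the object `PreferredLocationsSketch`, the function `preferredLocations`, its parameters, and the default fraction of 0.2 are all assumptions made for the example. The idea is that no locations are returned when reduce locality is disabled, and that a host is only reported as preferred when it holds at least the configured fraction of the reducer's input.

```scala
// Hypothetical sketch, not Spark's implementation.
object PreferredLocationsSketch {
  // Assumed stand-in for REDUCER_PREF_LOCS_FRACTION; the value is illustrative.
  val ReducerPrefLocsFraction: Double = 0.2

  /**
   * Returns the hosts that hold at least `fraction` of the reducer's input.
   *
   * @param bytesByHost           shuffle bytes available on each host (assumed input shape)
   * @param reduceLocalityEnabled stand-in for spark.shuffle.reduceLocality.enabled
   * @param fraction              minimum share of the input a host must hold
   */
  def preferredLocations(
      bytesByHost: Map[String, Long],
      reduceLocalityEnabled: Boolean,
      fraction: Double = ReducerPrefLocsFraction): Seq[String] = {
    val totalBytes = bytesByHost.values.sum
    if (!reduceLocalityEnabled || totalBytes <= 0L) {
      // Locality disabled or nothing to read: report no preference at all.
      Seq.empty
    } else {
      // Keep only hosts whose share of the input reaches the threshold.
      bytesByHost.collect {
        case (host, bytes) if bytes.toDouble / totalBytes >= fraction => host
      }.toSeq.sorted
    }
  }
}
```

With an input such as `Map("host-a" -> 700L, "host-b" -> 250L, "host-c" -> 50L)`, the sketch keeps `host-a` and `host-b` and drops `host-c`, so the driver does not have to store a location that holds only 5% of the data.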