Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
3.1.0
None
Description
At the beginning of a shuffle map stage, the driver needs to select external shuffle services to act as the mergers of the shuffle partitions for the corresponding shuffle.
We currently make this selection using the immediately available information about current and past executor locations. Ideally, the selection would sit behind a pluggable interface so that we can leverage information tracked outside of a Spark application, either for better load balancing or for a disaggregated deployment environment.
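To make the proposal concrete, here is a rough sketch of what such a pluggable interface could look like. It is written in Python purely for illustration and is not Spark's actual API: `MergerLocation`, `MergerLocationProvider`, `ExecutorHistoryProvider`, and `StaticListProvider` are hypothetical names, and the selection policy (active executor hosts first, then past hosts, capped at a maximum) is an assumption rather than the behavior of the existing code. The port 7337 is the default of `spark.shuffle.service.port`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MergerLocation:
    """Hypothetical address of an external shuffle service acting as a merger."""
    host: str
    port: int


class MergerLocationProvider(ABC):
    """Hypothetical plug-in point the driver would call when a shuffle map stage starts."""

    @abstractmethod
    def get_merger_locations(self, num_partitions: int, max_mergers: int) -> List[MergerLocation]:
        """Return at most min(num_partitions, max_mergers) merger locations."""


class ExecutorHistoryProvider(MergerLocationProvider):
    """Assumed default: pick hosts from current executors first, then from past ones."""

    def __init__(self, active_hosts: Sequence[str], past_hosts: Sequence[str], port: int = 7337):
        self.active_hosts = list(active_hosts)
        self.past_hosts = list(past_hosts)
        self.port = port

    def get_merger_locations(self, num_partitions: int, max_mergers: int) -> List[MergerLocation]:
        limit = max(0, min(num_partitions, max_mergers))
        # dict.fromkeys de-duplicates while preserving insertion order.
        ordered_hosts = list(dict.fromkeys(self.active_hosts + self.past_hosts))
        return [MergerLocation(h, self.port) for h in ordered_hosts[:limit]]


class StaticListProvider(MergerLocationProvider):
    """Assumed alternative: locations supplied by a system outside the application."""

    def __init__(self, locations: Sequence[MergerLocation]):
        self.locations = list(locations)

    def get_merger_locations(self, num_partitions: int, max_mergers: int) -> List[MergerLocation]:
        limit = max(0, min(num_partitions, max_mergers))
        return self.locations[:limit]


if __name__ == "__main__":
    provider = ExecutorHistoryProvider(["host-a", "host-b"], ["host-b", "host-c"])
    for loc in provider.get_merger_locations(num_partitions=10, max_mergers=3):
        print(loc.host, loc.port)
```

With an interface along these lines, the driver would depend only on `get_merger_locations`, so a deployment with disaggregated shuffle services could swap in a provider backed by an external registry without touching the scheduling code.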