Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
The input split assigner is currently a single instance shared across all jobs and inputs. The mapping between splits, split types, and assigners is very inflexible, because changing it requires modifying an internal class in the runtime project.
The following changes will make it simpler and more efficient:
- Attach the split assigner to the job vertex (the ExecutionJobVertex) for the task that consumes the input splits.
- Move the input split assigner interfaces into the core project, along with the simple base implementations of the assigners (default and locality-aware).
- Let input split producers (such as input formats) create their own assigners.
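
The proposed changes could look roughly like the following sketch. All type names here (InputSplitSource, DefaultSplitAssigner, RangeInputFormat, JobVertexSketch) are simplified, hypothetical stand-ins written for illustration; they are assumptions and do not reproduce the actual runtime or core classes. The sketch only shows the ownership model: the split producer creates its own assigner, and the assigner lives on the job vertex that consumes the splits instead of in one shared instance.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical stand-in for an input split.
interface InputSplit {
    int getSplitNumber();
}

// Hypothetical assigner interface, as it might live in the core project.
interface InputSplitAssigner {
    // Returns the next split for the requesting host, or null when none remain.
    InputSplit getNextInputSplit(String host);
}

// Hypothetical producer interface: the producer creates both the splits and the assigner.
interface InputSplitSource<T extends InputSplit> {
    T[] createInputSplits(int minNumSplits);
    InputSplitAssigner getInputSplitAssigner(T[] splits);
}

final class RangeSplit implements InputSplit {
    private final int number;
    RangeSplit(int number) { this.number = number; }
    @Override public int getSplitNumber() { return number; }
}

// Simple default assigner: hands out splits in order and ignores the host (no locality).
final class DefaultSplitAssigner implements InputSplitAssigner {
    private final Deque<InputSplit> remaining = new ArrayDeque<>();

    DefaultSplitAssigner(InputSplit[] splits) {
        remaining.addAll(Arrays.asList(splits));
    }

    @Override
    public synchronized InputSplit getNextInputSplit(String host) {
        return remaining.pollFirst(); // null once all splits are assigned
    }
}

// Example producer (e.g. an input format) that chooses its own assigner.
final class RangeInputFormat implements InputSplitSource<RangeSplit> {
    @Override
    public RangeSplit[] createInputSplits(int minNumSplits) {
        RangeSplit[] splits = new RangeSplit[minNumSplits];
        for (int i = 0; i < splits.length; i++) {
            splits[i] = new RangeSplit(i);
        }
        return splits;
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(RangeSplit[] splits) {
        return new DefaultSplitAssigner(splits);
    }
}

// Stand-in for the job vertex: each vertex holds its own assigner.
final class JobVertexSketch {
    private final InputSplitAssigner assigner;

    <T extends InputSplit> JobVertexSketch(InputSplitSource<T> source, int parallelism) {
        T[] splits = source.createInputSplits(parallelism);
        this.assigner = source.getInputSplitAssigner(splits);
    }

    InputSplit requestNextSplit(String host) {
        return assigner.getNextInputSplit(host);
    }
}
```

With this layout, a producer that needs locality-aware assignment would return a different assigner from getInputSplitAssigner, and no internal runtime class would have to change.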