Details
- Type: Task
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 3.1.0
- None
- None
Description
Based on our experience, no scenario strictly requires deploying multiple Workers on the same node with the Standalone backend. A Worker should claim all the resources reserved for Spark on the host where it is launched, and can then allocate those resources to one or more executors that it launches. Since each executor runs in a separate JVM, we can limit the memory of each executor to avoid long GC pauses.
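As a rough illustration of the single-Worker model described above, the sketch below estimates how many executors one Worker could host from its core and memory budget. It is a simplified model, not Spark's actual scheduler: the `Worker` class and `executors_that_fit` function are hypothetical names introduced here, and the inputs only loosely correspond to settings such as `SPARK_WORKER_CORES`, `SPARK_WORKER_MEMORY`, `spark.executor.cores` and `spark.executor.memory`. Memory overhead is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    """Hypothetical model of one Standalone Worker that owns all Spark resources on a host."""
    cores: int
    memory_mb: int


def executors_that_fit(worker: Worker, executor_cores: int, executor_memory_mb: int) -> int:
    """Return how many executors of the given size fit into one Worker.

    The count is bounded by whichever resource (cores or memory) runs out first.
    """
    if executor_cores <= 0 or executor_memory_mb <= 0:
        raise ValueError("executor size must be positive")
    by_cores = worker.cores // executor_cores
    by_memory = worker.memory_mb // executor_memory_mb
    return min(by_cores, by_memory)


# One Worker claims the whole host: 16 cores and 64 GiB.
host_worker = Worker(cores=16, memory_mb=64 * 1024)

# Capping each executor at 16 GiB keeps every JVM heap small,
# while the single Worker still uses the full host.
count = executors_that_fit(host_worker, executor_cores=4, executor_memory_mb=16 * 1024)
print(count)
```

Under these assumed numbers, the per-executor memory cap plays the role that splitting the host across several Workers used to play, without needing more than one Worker process.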
The remaining concern is that local-cluster mode is implemented by launching multiple Workers on the local host, so we might need to re-implement LocalSparkCluster to launch only one Worker with multiple executors. This should be fine because local-cluster mode is only used for running Spark unit test cases, so end users should not be affected by the change.
Removing support for multiple Workers on the same host would simplify the deploy model of the Standalone backend, and would also reduce the burden of supporting a legacy deploy pattern in future feature development.
The proposal is to update the documentation to deprecate support for the `SPARK_WORKER_INSTANCES` environment variable in 3.0, and to remove that support in the next major version (3.1.0).
Issue Links
- relates to: SPARK-30969 Remove resource coordination support from Standalone (Resolved)
1. Deprecate support of multiple workers on the same host in Standalone | Resolved | Unassigned