Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.2.0
Fix Version/s: None
Component/s: None
Environment: Spark on k8s
Description
We're using Spark 3.2.0 and have enabled the Spark decommission feature. As part of validating this feature, we wanted to check whether the RDD blocks and shuffle blocks from decommissioned executors are migrated to other executors.
However, we could not see this happening. Below is the configuration we used.
- Spark configuration used:
spark.local.dir /mnt/spark-ldir
spark.decommission.enabled true
spark.storage.decommission.enabled true
spark.storage.decommission.rddBlocks.enabled true
spark.storage.decommission.shuffleBlocks.enabled true
spark.dynamicAllocation.enabled true
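For reference, a minimal sketch of how a job is submitted with these settings; the API server address, container image, and application file below are placeholders, not our exact values:
$ spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.local.dir=/mnt/spark-ldir \
  --conf spark.decommission.enabled=true \
  --conf spark.storage.decommission.enabled=true \
  --conf spark.storage.decommission.rddBlocks.enabled=true \
  --conf spark.storage.decommission.shuffleBlocks.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  local:///opt/spark/app/gzip-compression-test.py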
- Brought up the Spark driver and executors on different nodes:
NAME                                           READY  STATUS   NODE
decommission-driver                            1/1    Running  Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-1  1/1    Running  Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-2  1/1    Running  Node2
gzip-compression-test-ae0b0b7e4d7fbe40-exec-3  1/1    Running  Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-4  1/1    Running  Node2
gzip-compression-test-ae0b0b7e4d7fbe40-exec-5  1/1    Running  Node1
- Brought down Node2, after which the pod statuses were:
NAME                                           READY  STATUS       NODE
decommission-driver                            1/1    Running      Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-1  1/1    Running      Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-2  1/1    Terminating  Node2
gzip-compression-test-ae0b0b7e4d7fbe40-exec-3  1/1    Running      Node1
gzip-compression-test-ae0b0b7e4d7fbe40-exec-4  1/1    Terminating  Node2
gzip-compression-test-ae0b0b7e4d7fbe40-exec-5  1/1    Running      Node1
- Driver logs:
{"type":"log", "level":"INFO", "time":"2022-01-12T08:55:28.296Z", "timezone":"UTC", "log":"Adding decommission script to lifecycle"}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:55:28.459Z", "timezone":"UTC", "log":"Adding decommission script to lifecycle"}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:55:28.564Z", "timezone":"UTC", "log":"Adding decommission script to lifecycle"}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:55:28.601Z", "timezone":"UTC", "log":"Adding decommission script to lifecycle"}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:55:28.667Z", "timezone":"UTC", "log":"Adding decommission script to lifecycle"}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:58:21.885Z", "timezone":"UTC", "log":"Notify executor 5 to decommissioning."}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:58:21.887Z", "timezone":"UTC", "log":"Notify executor 1 to decommissioning."}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:58:21.887Z", "timezone":"UTC", "log":"Notify executor 3 to decommissioning."}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:58:21.887Z", "timezone":"UTC", "log":"Mark BlockManagers (BlockManagerId(5, X.X.X.X, 33359, None), BlockManagerId(1, X.X.X.X, 38655, None), BlockManagerId(3, X.X.X.X, 35797, None)) as being decommissioning."}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:59:24.426Z", "timezone":"UTC", "log":"Executor 2 is removed. Remove reason statistics: (gracefully decommissioned: 0, decommision unfinished: 0, driver killed: 0, unexpectedly exited: 1)."}
{"type":"log", "level":"INFO", "time":"2022-01-12T08:59:24.426Z", "timezone":"UTC", "log":"Executor 4 is removed. Remove reason statistics: (gracefully decommissioned: 0, decommision unfinished: 0, driver killed: 0, unexpectedly exited: 2)."}
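The executor-side view can also be checked by searching the surviving executors' logs for migration activity; a sketch (namespace test as in the exec example below; the grep pattern is just a guess at relevant keywords):
$ for i in 1 3 5; do
    echo "== executor $i =="
    # look for any decommission/migration messages in the executor log
    kubectl logs gzip-compression-test-ae0b0b7e4d7fbe40-exec-$i -n test | grep -iE 'decommission|migrat' || echo "(no matches)"
  done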
- Verified by exec'ing into each of the live executors (1, 3, 5) and checking /mnt/spark-ldir/: each contains only its own block manager directory; no block manager directory from the decommissioned executors was copied to this location.
Example:
$ kubectl exec -it gzip-compression-test-ae0b0b7e4d7fbe40-exec-1 -n test -- bash
$ cd /mnt/spark-ldir/
$ ls
blockmgr-60872c99-e7d6-43ba-a43e-a97fc9f619ca
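The same check can be scripted across all live executors in one go, e.g.:
$ for i in 1 3 5; do
    pod=gzip-compression-test-ae0b0b7e4d7fbe40-exec-$i
    echo "== $pod =="
    # list the block manager directories present on each surviving executor
    kubectl exec "$pod" -n test -- ls /mnt/spark-ldir/
  done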
Since the migration was not happening, we also tried the fallback storage option, pointing it at HDFS. Unfortunately, we could not see the RDD and shuffle blocks in the fallback storage location either. Below is the configuration we used.
- Spark configuration used:
spark.decommission.enabled true
spark.storage.decommission.enabled true
spark.storage.decommission.rddBlocks.enabled true
spark.storage.decommission.shuffleBlocks.enabled true
spark.storage.decommission.fallbackStorage.path hdfs://namenodeHA/tmp/fallbackstorage
spark.dynamicAllocation.enabled true
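Before the run, the fallback path can be confirmed to be reachable and writable from the cluster; a sketch (the _probe file is just a hypothetical marker):
$ hdfs dfs -mkdir -p hdfs://namenodeHA/tmp/fallbackstorage
$ hdfs dfs -touchz hdfs://namenodeHA/tmp/fallbackstorage/_probe
$ hdfs dfs -ls hdfs://namenodeHA/tmp/fallbackstorage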
- Brought up one Spark driver and one executor on different nodes:
NAME                                           READY  NODE
decommission-driver                            1/1    Node1
gzip-compression-test-49acf67e679f9259-exec-1  1/1    Node2
- Brought down Node2, after which the pod statuses were:
Example:
NAME                                           READY  STATUS
decommission-driver                            1/1    Running
gzip-compression-test-49acf67e679f9259-exec-1  1/1    Running
gzip-compression-test-49acf67e679f9259-exec-1  1/1    Running
gzip-compression-test-49acf67e679f9259-exec-1  1/1    Terminating
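For reproducibility: one common way to bring a node down gracefully so that its pods terminate (an assumption about the setup, not necessarily the exact mechanism we used) is to drain it:
$ kubectl drain Node2 --ignore-daemonsets --delete-emptydir-data
Older kubectl versions use --delete-local-data instead of --delete-emptydir-data.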
- Verified data migration at the fallback storage location:
Example:
$ hdfs dfs -ls /tmp/fallbackstorage
Note: this location is still empty.
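If we read the Spark sources correctly, the fallback storage writes blocks under per-application subdirectories, so a recursive listing is the safer check (treat that layout as an assumption on our part):
$ hdfs dfs -ls -R /tmp/fallbackstorage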
- Driver logs:
Related to fallback storage:
{"type":"log", "level":"INFO", "time":"2022-01-17T10:40:21.682Z", "timezone":"UTC", "log":"Registering BlockManager BlockManagerId(fallback, remote, 7337, None)"}
{"type":"log", "level":"INFO", "time":"2022-01-17T10:40:21.682Z", "timezone":"UTC", "log":"Registering block manager remote:7337 with 0.0 B RAM, BlockManagerId(fallback, remote, 7337, None)"}
{"type":"log", "level":"INFO", "time":"2022-01-17T10:40:21.682Z", "timezone":"UTC", "log":"Registered BlockManager BlockManagerId(fallback, remote, 7337, None)"}
Related to decommissioning:
{"type":"log", "level":"INFO", "time":"2022-01-17T10:46:17.952Z", "timezone":"UTC", "log":"Executor 1 is removed. Remove reason statistics: (gracefully decommissioned: 0, decommision unfinished: 0, driver killed: 0, unexpectedly exited: 1)."}
Note: configuration files for both scenarios are attached.
Please let us know if we are missing anything that is preventing the migration of RDD and shuffle blocks.