Details
Type: Bug
Status: Accepted
Priority: Major
Resolution: Unresolved
Description
The Mesos Command Executor tries to support a grace period with escalation, but unfortunately it does not work. It launches the command by wrapping it in sh -c, which causes the process tree to look like this:
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 18
Sent SIGTERM to the following process trees:
[
-+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
 \--- 19 command...
]
Command terminated with signal Terminated (pid: 18)
This causes sh to exit immediately, and the executor with it, while the wrapped command might need more time to finish. As a result, the executor believes the command terminated gracefully, so it does not escalate to SIGKILL.
This causes leaks when the POSIX containerizer is used: if the command ignores SIGTERM, it is reparented to init and never gets killed. Using pid/namespace only masks the problem, because the hanging process is captured before it can shut down gracefully.
The fix is to send SIGTERM only to the children of sh. sh will exit when all of its child processes finish. If they do not finish, they will be killed by the escalation to SIGKILL.
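The proposed behaviour can be sketched roughly as follows. This is only an illustration in Python, not the executor's actual code (which is C++). The helper names child_pids and terminate_gracefully are hypothetical, and the sketch assumes a Linux /proc filesystem for discovering the children of the shell.

```python
import os
import signal
import subprocess


def child_pids(parent_pid):
    """Return the direct children of parent_pid by scanning /proc (Linux only)."""
    children = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is parenthesised and may contain spaces, so split
        # after the last ')'. The remaining fields start with: state, ppid.
        fields = stat[stat.rfind(")") + 2:].split()
        if int(fields[1]) == parent_pid:
            children.append(int(entry))
    return children


def send_signal(pid, signum):
    """Send a signal, ignoring processes that have already exited."""
    try:
        os.kill(pid, signum)
    except ProcessLookupError:
        pass


def terminate_gracefully(shell, grace_seconds):
    """SIGTERM only the shell's children; escalate to SIGKILL after the grace period.

    `shell` is a subprocess.Popen for the wrapping `sh -c ...` process.
    Returns True if escalation to SIGKILL was needed, False otherwise.
    """
    # Signal the wrapped command(s), not the shell itself, so the shell
    # keeps running until its children have actually finished.
    for pid in child_pids(shell.pid):
        send_signal(pid, signal.SIGTERM)
    try:
        shell.wait(timeout=grace_seconds)
        return False
    except subprocess.TimeoutExpired:
        # Grace period expired: kill the remaining children and the shell.
        for pid in child_pids(shell.pid) + [shell.pid]:
            send_signal(pid, signal.SIGKILL)
        shell.wait()
        return True
```

Because the shell is never sent SIGTERM, its exit can be used as the signal that the wrapped command has really finished, and a command that ignores SIGTERM is still cleaned up once the grace period expires. The sketch only looks at direct children of the shell; deeper descendants would need a recursive walk.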
All versions from 0.20 onward are affected.
This test should pass: src/tests/command_executor_tests.cpp:342
Mailing list thread
Attachments
Issue Links
- duplicates MESOS-1871 Sending SIGTERM to a task command may render it orphaned (Open)
- is related to MESOS-3363 custom executor's child process intermittently leaks to be a child of slave (Open)