Mesos / MESOS-5893

mesos-executor should adopt and reap orphan child processes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: containerization

    Description

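      The Mesos containerizer does not properly handle the death of child processes.

      Discovered using marathon-lb: each topology update forks another haproxy process. The old haproxy process should exit cleanly once its last client connection is terminated, but instead it turns into a zombie. In the process tree below, the defunct haproxy processes are direct children of mesos-executor, which never reaps them.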

       7716 ?        Ssl    0:00  |       \_ mesos-executor --launcher_dir=/usr/libexec/mesos --sandbox_directory=/mnt/mesos/sandbox --user=root --working_directory=/marathon-lb --rootfs=/mnt/mesos/provisioner/containers/3b381d5c-7490-4dcd-ab4b-81051226075a/backends/overlay/rootfses/a4beacac-2d7e-445b-80c8-a9b4e480c491
       7813 ?        Ss     0:00  |       |   \_ sh -c /marathon-lb/run sse --marathon https://marathon:8443 --auth-credentials user:pass --group 'external' --ssl-certs /certs --max-serv-port-ip-per-task 20050
       7823 ?        S      0:00  |       |   |   \_ /bin/bash /marathon-lb/run sse --marathon https://marathon:8443 --auth-credentials user:pass --group external --ssl-certs /certs --max-serv-port-ip-per-task 20050
       7827 ?        S      0:00  |       |   |       \_ /usr/bin/runsv /marathon-lb/service/haproxy
       7829 ?        S      0:00  |       |   |       |   \_ /bin/bash ./run
       8879 ?        S      0:00  |       |   |       |       \_ sleep 0.5
       7828 ?        Sl     0:00  |       |   |       \_ python3 /marathon-lb/marathon_lb.py --syslog-socket /dev/null --haproxy-config /marathon-lb/haproxy.cfg --ssl-certs /certs --command sv reload /marathon-lb/service/haproxy --sse --marathon https://marathon:8443 --auth-credentials user:pass --group external --max-serv-port-ip-per-task 20050
       7906 ?        Zs     0:00  |       |   \_ [haproxy] <defunct>
       8628 ?        Zs     0:00  |       |   \_ [haproxy] <defunct>
       8722 ?        Ss     0:00  |       |   \_ haproxy -p /tmp/haproxy.pid -f /marathon-lb/haproxy.cfg -D -sf 144 52
      

      Update: mesos-executor should register itself as a child subreaper (PR_SET_CHILD_SUBREAPER, see http://man7.org/linux/man-pages/man2/prctl.2.html), reap the orphans it adopts, and propagate signals.
      Code sample: https://github.com/krallin/tini/blob/master/src/tini.c
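
The three pieces of the request above (register as a subreaper, reap adopted children, forward signals) can be sketched roughly as follows. This is only an illustration of the mechanism on Linux, not how mesos-executor is or should be implemented; Mesos itself is C++, and Python is used here purely for brevity. The function names `become_subreaper`, `reap_children`, and `forward_signals` are hypothetical and do not exist in Mesos or tini.

```python
import ctypes
import os
import signal

# Value of PR_SET_CHILD_SUBREAPER in <linux/prctl.h> (Linux 3.4 and later).
PR_SET_CHILD_SUBREAPER = 36


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(
        PR_SET_CHILD_SUBREAPER,
        ctypes.c_ulong(1),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, raw_wait_status) pairs.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process currently has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def forward_signals(child_pid, signums=(signal.SIGTERM, signal.SIGINT)):
    """Relay the given signals to child_pid (hypothetical helper)."""

    def handler(signum, _frame):
        try:
            os.kill(child_pid, signum)
        except ProcessLookupError:
            pass  # the child is already gone

    for signum in signums:
        signal.signal(signum, handler)
```

In a real supervisor, `reap_children()` would typically be called from the main loop whenever SIGCHLD arrives, so that adopted processes such as the old haproxy instances are collected as soon as they exit instead of lingering as `<defunct>` entries.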


            People

              Assignee: Unassigned
              Reporter: Stéphane Cottin (kaalh)