  1. Mesos
  2. MESOS-1866

Race between ~Authenticator() and Authenticator::authenticate() can cause schedulers/slaves to never get authenticated

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • 0.21.0
    • None
    • None
    • Twitter Q4 Sprint 1
    • 2

    Description

      The master might get a duplicate authenticate() request while a previous authentication attempt is still in progress. Depending on what the AuthenticatorProcess is executing at that moment, there are two possible race conditions, either of which causes the scheduler/slave to retry authentication continuously without ever succeeding.

      We have seen both race conditions in a heavily loaded production cluster.

      Race 1:
      ----------
      --> An authenticate() event was dispatched to AuthenticatorProcess (Master::authenticate() called Authenticator::authenticate())

      --> A terminate() event was then injected at the front of the AuthenticatorProcess queue (a duplicate Master::authenticate() call ran ~Authenticator()) before the above authenticate() event was executed.

      --> Due to a bug in libprocess, the future returned by Authenticator::authenticate() to Master::authenticate() was never transitioned to discarded, so Master::_authenticate() was never called.

      --> This caused all the subsequent authentication retries to be enqueued on the master waiting for Master::_authenticate() to be executed.

      Fix: Transition the dispatched future to discarded if the libprocess process is terminated before the dispatch runs (https://reviews.apache.org/r/25945/)
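The sequence in Race 1 can be reproduced with a toy event queue. The sketch below is illustrative only and does not use the libprocess API: ToyFuture, ToyProcess, Event, and continuationRuns are hypothetical names, and the onAny callback merely stands in for Master::_authenticate(). It assumes that terminate() jumps to the front of the queue, as described above, and shows that a dispatch left behind the terminate never fires its continuation unless its future is explicitly transitioned to discarded.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

// Toy stand-in for a future's state; NOT the libprocess Future API.
enum class State { PENDING, READY, DISCARDED };

struct ToyFuture {
  State state = State::PENDING;
  std::function<void(State)> onAny;  // Plays the role of Master::_authenticate().

  void transition(State s) {
    if (state != State::PENDING) return;  // A future transitions only once.
    state = s;
    if (onAny) onAny(s);
  }
};

// One queued event: either a dispatch carrying a future, or a terminate.
struct Event {
  bool terminate;
  std::shared_ptr<ToyFuture> future;  // Null for terminate events.
};

class ToyProcess {
public:
  // Enqueues a dispatch at the back and hands its future to the caller.
  std::shared_ptr<ToyFuture> dispatch() {
    auto future = std::make_shared<ToyFuture>();
    queue.push_back({false, future});
    return future;
  }

  // Assumption taken from the report: terminate is injected at the front.
  void terminate() { queue.push_front({true, nullptr}); }

  // Drains the queue. With discardOnTerminate == false, dispatches still
  // queued behind a terminate are dropped silently (the buggy behavior).
  void run(bool discardOnTerminate) {
    while (!queue.empty()) {
      Event event = queue.front();
      queue.pop_front();
      if (!event.terminate) {
        assert(event.future != nullptr);
        event.future->transition(State::READY);
        continue;
      }
      for (Event& pending : queue) {
        if (discardOnTerminate && pending.future) {
          pending.future->transition(State::DISCARDED);
        }
      }
      queue.clear();
    }
  }

private:
  std::deque<Event> queue;
};

// Returns true if the continuation attached to the dispatched future ran.
bool continuationRuns(bool discardOnTerminate) {
  ToyProcess process;
  bool called = false;
  auto future = process.dispatch();
  future->onAny = [&called](State) { called = true; };
  process.terminate();  // Jumps ahead of the pending dispatch.
  process.run(discardOnTerminate);
  return called;
}
```

In this model, continuationRuns(false) corresponds to the bug (the continuation is never invoked, so retries pile up behind it) and continuationRuns(true) corresponds to the behavior the fix aims for.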

      Race 2:
      -----------
      --> An authenticate() event was dispatched to AuthenticatorProcess (Master::authenticate() called Authenticator::authenticate())

      --> AuthenticatorProcess::authenticate() executed and set promise.onDiscard(defer(self, Self::discarded)). NOTE: The internal promise of AuthenticatorProcess is discarded in AuthenticatorProcess::discarded()

      --> A terminate() event was then injected at the front of the AuthenticatorProcess queue (a duplicate Master::authenticate() call ran ~Authenticator()) before the above discarded() event was executed.

      --> AuthenticatorProcess was destructed without ever discarding the internal promise, so Master::_authenticate() was never called.

      --> This caused all the subsequent authentication retries to be enqueued on the master waiting for Master::_authenticate() to be executed.

      Fix: Discard the internal promise when the AuthenticatorProcess is destructed.
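The Race 2 fix can be sketched in the same toy style. Again, this is not the Mesos or libprocess code: ToyPromise, ToyAuthenticatorProcess, and continuationRuns are hypothetical names, the promise is shared with the caller only to keep the example short, and the onAny callback stands in for Master::_authenticate(). The point it illustrates is that when the queued discarded() handler never gets to run, the destructor is the last place where the internal promise can still be discarded.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Toy stand-in for a promise's state; NOT the libprocess Promise API.
enum class State { PENDING, READY, DISCARDED };

struct ToyPromise {
  State state = State::PENDING;
  std::function<void(State)> onAny;  // Plays the role of Master::_authenticate().

  void set(State s) {
    if (state != State::PENDING) return;  // A promise completes only once.
    state = s;
    if (onAny) onAny(s);
  }
};

class ToyAuthenticatorProcess {
public:
  ToyAuthenticatorProcess(std::shared_ptr<ToyPromise> p, bool discard)
    : promise(std::move(p)), discardInDestructor(discard) {}

  // The fix described above: discard the internal promise on destruction
  // so that waiters are always notified.
  ~ToyAuthenticatorProcess() {
    assert(promise != nullptr);
    if (discardInDestructor) promise->set(State::DISCARDED);
  }

  // Normally reached through the deferred onDiscard handler; in Race 2
  // the process is terminated before this handler ever runs.
  void discarded() { promise->set(State::DISCARDED); }

private:
  std::shared_ptr<ToyPromise> promise;
  bool discardInDestructor;
};

// Returns true if the continuation attached to the promise ran.
bool continuationRuns(bool discardInDestructor) {
  auto promise = std::make_shared<ToyPromise>();
  bool called = false;
  promise->onAny = [&called](State) { called = true; };
  {
    // The process goes away before discarded() is executed, as in Race 2.
    ToyAuthenticatorProcess process(promise, discardInDestructor);
  }
  return called;
}
```

Here continuationRuns(false) models the bug (the promise stays pending forever) and continuationRuns(true) models the fixed destructor.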

            People

              vinodkone Vinod Kone
              vinodkone Vinod Kone
              Benjamin Mahler Benjamin Mahler