Kafka / KAFKA-1310

Zookeeper timeout causes deadlock in Controller

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.1
    • Fix Version/s: 0.8.1.1
    • Component/s: None
    • Labels: None

    Description

      Steps to reproduce:

      1. Check out and build the 0.8.1 branch from GitHub:
      git clone git@github.com:apache/kafka.git && cd kafka && git checkout origin/0.8.1 && ./gradlew jar

      2. Start the ZooKeeper server:
      ./bin/zookeeper-server-start.sh config/zookeeper.properties

      3. Start the Kafka server:
      ./bin/kafka-server-start.sh config/server.properties

      4. Suspend the ZooKeeper process for about 10 seconds (press Ctrl-Z in its terminal, then resume it with %1).

      5. Check the broker registrations. The Kafka broker has not re-registered in ZooKeeper:
      ./bin/zookeeper-shell.sh
      ls /brokers/ids
      >> []

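      The root cause seems to be a deadlock between DeleteTopicsThread and SessionExpirationListener in KafkaController:

      1. DeleteTopicsThread acquires controllerLock and waits on deleteTopicsCond in awaitTopicDeletionNotification(). Calling await() releases controllerLock while the thread waits.

      2. SessionExpirationListener fires. It acquires controllerLock and tries to shut down deleteTopicManager (in onControllerResignation()). This interrupts DeleteTopicsThread, and the listener keeps holding controllerLock while the shutdown is in progress.

      3. DeleteTopicsThread cannot return from deleteTopicsCond.await(), because await() has to reacquire controllerLock before it returns and the lock is held by SessionExpirationListener. The listener never finishes the shutdown and the thread never exits, so the two are deadlocked.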
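
The three steps above can be reproduced outside Kafka with plain java.util.concurrent primitives. The following is a minimal sketch, not Kafka's actual code: the class and method names (ControllerDeadlockSketch, workerExitsWhileLockHeld) are hypothetical, the lock and condition merely borrow the names from the description, and the "listener" side joins the worker with a timeout (an assumption made so the demo terminates instead of hanging).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone illustration of the lock/condition interaction.
public class ControllerDeadlockSketch {

    // Returns true if the worker exited while the caller still held the lock.
    // joinTimeoutMs must be > 0 (join(0) would wait forever).
    static boolean workerExitsWhileLockHeld(long joinTimeoutMs) {
        final ReentrantLock controllerLock = new ReentrantLock();
        final Condition deleteTopicsCond = controllerLock.newCondition();
        final CountDownLatch lockTaken = new CountDownLatch(1);

        // Step 1: the worker takes the lock and waits on the condition.
        Thread worker = new Thread(() -> {
            controllerLock.lock();
            try {
                lockTaken.countDown();
                deleteTopicsCond.await(); // releases the lock while waiting
            } catch (InterruptedException e) {
                // Only reached after await() has reacquired the lock.
            } finally {
                controllerLock.unlock();
            }
        }, "delete-topics-thread");
        worker.start();

        try {
            lockTaken.await();
            boolean exited;
            // Step 2: the "listener" takes the lock, which only succeeds once
            // the worker has released it inside await().
            controllerLock.lock();
            try {
                worker.interrupt();
                // Step 3: the worker cannot leave await() without the lock,
                // so this join times out while the lock is still held.
                worker.join(joinTimeoutMs);
                exited = !worker.isAlive();
            } finally {
                controllerLock.unlock();
            }
            worker.join(); // with the lock released, the worker can finish
            return exited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo thread was interrupted", e);
        }
    }

    public static void main(String[] args) {
        // prints "worker exited while lock held: false"
        System.out.println("worker exited while lock held: "
                + workerExitsWhileLockHeld(500));
    }
}
```

With an unbounded wait in place of the timed join, the calling thread would block forever, which matches the hang described above. In this sketch the worker finishes as soon as the lock is released before waiting for it.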

            People

              Assignee: Neha Narkhede (nehanarkhede)
              Reporter: Fedor Korotkiy (slon)