Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-17130

Connect workers do not properly ensure group membership before responding to health checks

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.8.0, 3.9.0
    • None
    • connect
    • None

    Description

      Initially reported here.

      When a distributed Connect worker's herder begins an iteration of its tick loop, it tries to ensure that the worker is still in contact with the Kafka cluster that's used for cluster coordination and internal topics; see here.

      However, this method may return even if the Kafka cluster is down. It does not force a heartbeat request to be sent to the broker, and may return if the time since the last heartbeat is small enough.

      We may want to force at least one request (possibly, specifically a heartbeat) to the group coordinator to have been sent before returning from WorkerGroupMember::ensureActive in order to guarantee that the health check point only returns 200 if it has explicitly validated the health of the worker's connection to the group coordinator after the request to the endpoint was initiated.

      Attachments

        Activity

          People

            Unassigned Unassigned
            ChrisEgerton Chris Egerton
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: