Details
Type: Wish
Status: Open
Priority: Minor
Resolution: Unresolved
Description
Right now a request timeout in the network client is treated as a disconnection, which "hides" legitimate timeouts: cases where there is no "real" network disconnection and increasing request.timeout.ms would be a reasonable fix. For example:
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=describeConfigs, deadlineMs=1616147081029) timed out at 1616147081039 after 2 attempt(s)
Caused by: org.apache.kafka.common.errors.DisconnectException: Cancelled describeConfigs request with correlation id 8 due to node 1 being disconnected
The DisconnectException is thrown because the disconnect flag is set to true in https://github.com/apache/kafka/blob/3d0b4d910b681df7d873c8a0285eaca01d6c173a/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L352
Could we instead have a different path from https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java#L793 that propagates the fact that the connection was closed because request.timeout.ms expired, and then adjust the exception thrown later in https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L1195 so that it is not a DisconnectException?
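To make the request concrete, here is a minimal, self-contained sketch of the idea. It is not the actual Kafka code: `DisconnectCause`, `SketchTimeoutException`, `SketchDisconnectException` and `failureFor` are hypothetical names, with the two exception classes standing in for the real ones in org.apache.kafka.common.errors. The assumption is that the network client records why it closed the connection, and the admin client uses that reason when it fails the in-flight call.

```java
public class DisconnectCauseSketch {

    // Hypothetical: the reason the client closed the connection to a node.
    enum DisconnectCause { NETWORK_DISCONNECT, REQUEST_TIMEOUT }

    // Stand-in for org.apache.kafka.common.errors.DisconnectException (not the real class).
    static class SketchDisconnectException extends RuntimeException {
        SketchDisconnectException(String message) { super(message); }
    }

    // Stand-in for org.apache.kafka.common.errors.TimeoutException (not the real class).
    static class SketchTimeoutException extends RuntimeException {
        SketchTimeoutException(String message) { super(message); }
    }

    // Hypothetical helper: choose the exception used to fail a cancelled in-flight request,
    // based on the recorded cause instead of always reporting a disconnection.
    static RuntimeException failureFor(String callName, int correlationId, String nodeId,
                                       DisconnectCause cause, long requestTimeoutMs) {
        String prefix = "Cancelled " + callName + " request with correlation id " + correlationId;
        if (cause == DisconnectCause.REQUEST_TIMEOUT) {
            // The connection was closed by the client itself because the request expired.
            return new SketchTimeoutException(prefix + " on node " + nodeId
                    + " because request.timeout.ms=" + requestTimeoutMs + " expired");
        }
        // Any other cause keeps today's behaviour.
        return new SketchDisconnectException(prefix + " due to node " + nodeId
                + " being disconnected");
    }

    public static void main(String[] args) {
        System.out.println(failureFor("describeConfigs", 8, "1",
                DisconnectCause.REQUEST_TIMEOUT, 30000L));
        System.out.println(failureFor("describeConfigs", 8, "1",
                DisconnectCause.NETWORK_DISCONNECT, 30000L));
    }
}
```

With something along these lines, a caller could tell from the exception type alone whether raising request.timeout.ms is worth trying.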
Thank you