Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Won't Fix
-
0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2, 0.9.4, 1.0.0, 0.9.5, 0.9.6, 0.10.1, 2.0.0, 1.0.1, 0.10.2, 1.0.2, 1.1.0, 1.0.3, 1.x, 0.10.3, 1.0.4, 1.1.1
-
None
-
None
Description
In our case, a brief network problem left some Kafka partitions without leaders.
The nextTuple method of KafkaSpout threw a "No leader found for partition 0" exception at "_coordinator.refresh();". The exception comes from the function getLeaderFor in DynamicBrokersReader.java. As a result, the spout hung.
The Kafka partitions recovered shortly afterwards, but the spout could not recover from the failure. This problem has occurred several times on our server. For example:
Feb 25 06:31:19 CST 2017, KafkaSpout threw the exception.
Feb 25 06:31:21 CST 2017, Kafka partitions recovered.
To make the spout more robust, I think "_coordinator.refresh();" could be retried several times, throwing the exception only after the last attempt fails. The spout is going to die anyway, so why not try one more time?
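
A minimal sketch of the retry idea, not the actual storm-kafka code. The names RefreshRetry, Refreshable, refreshWithRetry, maxAttempts and backoffMs are hypothetical and only illustrate the proposal; Refreshable stands in for whatever object exposes refresh(). Note that sleeping between attempts would block nextTuple for that time, so the backoff should be kept short.

```java
public class RefreshRetry {

    /** Hypothetical stand-in for the coordinator object that exposes refresh(). */
    public interface Refreshable {
        void refresh();
    }

    /**
     * Calls refresh() up to maxAttempts times, pausing backoffMs between failed
     * attempts. Returns the number of the attempt that succeeded. If every
     * attempt fails, the exception from the last attempt is rethrown, so the
     * caller still sees the original failure.
     */
    public static int refreshWithRetry(Refreshable coordinator, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                coordinator.refresh();
                return attempt;
            } catch (RuntimeException e) {
                // For example "No leader found for partition 0" during a short outage.
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        throw last;
    }
}
```

With the timestamps above, the partitions recovered about two seconds after the exception, so under this sketch a few attempts spaced roughly a second apart would probably have been enough to ride out the outage.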