Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Private Beta
- None
Description
We saw this issue take down a server on bolt after one of its followers died. Eventually, we GCed that follower's logs, and then we started logging about 25 times a second that we couldn't GC its logs.
This also caused a lot of lock contention when trying to write to other peers, etc., and made consensus more or less grind to a halt.
Issue Links
- relates to KUDU-820 Add metrics for diagnosing recent cluster issues (Open)