Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Not A Problem
- 0.9.5
- None
- None
Description
When Flume receives a reconfiguration command from the master, it applies the changes in the heartbeat thread. As a result, the node drops heartbeats until either the task completes or Flume forcibly kills the existing driver thread. This isn't a show stopper, because shutting down the existing driver is bounded by a timeout, but it is an obvious place for errors to occur.
I believe this is indicative of a larger issue in the way the node handles heartbeats and (re)configuration. We should revisit this communication as part of the master re-arch, which implicitly involves the heartbeat and communication systems.
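To make the concern above concrete, here is a minimal sketch of one way the heartbeat path could be decoupled from reconfiguration: the heartbeat thread only records the beat and hands the new configuration to a separate single-threaded executor. This is not Flume's actual code. The class `HeartbeatSketch`, its methods, and the `applyMillis` parameter (a stand-in for however long it takes to stop the old driver and start the new one) are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names come from the Flume code base.
class HeartbeatSketch {
    // Reconfiguration runs here, so a slow driver shutdown cannot stall heartbeats.
    private final ExecutorService reconfigExecutor = Executors.newSingleThreadExecutor();
    private final AtomicInteger heartbeatsSent = new AtomicInteger();
    private final AtomicReference<String> activeConfig = new AtomicReference<>("initial");

    // Called on every tick of the heartbeat thread.
    // pendingConfig is null when the master has nothing new for this node.
    void heartbeat(String pendingConfig, long applyMillis) {
        heartbeatsSent.incrementAndGet();
        if (pendingConfig != null) {
            // Hand off and return immediately instead of reconfiguring inline.
            reconfigExecutor.execute(() -> applyConfig(pendingConfig, applyMillis));
        }
    }

    // Stand-in for stopping the existing driver and starting a new one.
    private void applyConfig(String config, long applyMillis) {
        try {
            Thread.sleep(applyMillis);
            activeConfig.set(config);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int heartbeatsSent() {
        return heartbeatsSent.get();
    }

    String activeConfig() {
        return activeConfig.get();
    }

    // Lets queued reconfigurations finish; returns false on timeout or interruption.
    boolean shutdownAndAwait(long timeoutMillis) {
        reconfigExecutor.shutdown();
        try {
            return reconfigExecutor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a call such as `node.heartbeat("new-config", 200)` returns right away and later heartbeats are still counted while the configuration is applied in the background. A single-threaded executor is used so that reconfigurations are applied in the order they arrive; whether that ordering guarantee is what the node actually needs is a design question for the re-arch, not something this sketch settles.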