Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The current behavior of the .NET Evaluator is as follows: if the evaluator fails to send a heartbeat to the driver three times in a row (which takes about 8 seconds), it considers the driver dead or unreachable and enters recovery mode. However, if the code provides no logic for handling reconnects, IDriverConnection falls back to its default implementation, MissingDriverConnection, which promptly throws NotImplementedException. The evaluator keeps trying to send heartbeats, and because it is already in recovery mode, each attempt throws the same exception. As a result, the evaluator loses any chance of reconnecting to the driver and hangs indefinitely.
We should fix this by checking whether a non-default implementation is bound for IDriverConnection. If there is one, the evaluator should enter recovery mode as before. If there is none, there is no point in entering recovery mode; instead, the evaluator should try to reach the driver a few more times and then fail, to avoid wasting resources.
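The proposed decision logic can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the actual REEF .NET code: the class and method names (`HeartbeatPolicy`, `next_action`, `ReconnectingDriverConnection`) are hypothetical, `DriverConnection` and `MissingDriverConnection` are simplified stand-ins for the real types, and the number of extra attempts is an assumed value that the description does not specify.

```python
from dataclasses import dataclass


class DriverConnection:
    """Simplified stand-in for IDriverConnection (hypothetical shape)."""

    def get_driver_information(self) -> str:
        raise NotImplementedError


class MissingDriverConnection(DriverConnection):
    """Stand-in for the default binding: no reconnect logic, any use raises."""


class ReconnectingDriverConnection(DriverConnection):
    """Hypothetical user-supplied implementation that can locate the driver."""

    def get_driver_information(self) -> str:
        return "driver-address"  # placeholder value for illustration


@dataclass
class HeartbeatPolicy:
    connection: DriverConnection
    failures_before_recovery: int = 3  # "3 times in a row", from the description
    extra_attempts: int = 10           # assumed value, not given in the issue

    def has_reconnect_logic(self) -> bool:
        # A non-default binding means recovery mode has a chance to succeed.
        return not isinstance(self.connection, MissingDriverConnection)

    def next_action(self, consecutive_failures: int) -> str:
        """Decide what to do after the given number of failed heartbeats."""
        if consecutive_failures < self.failures_before_recovery:
            return "retry"
        if self.has_reconnect_logic():
            return "enter_recovery"
        # No reconnect logic: keep trying for a while, then give up.
        limit = self.failures_before_recovery + self.extra_attempts
        if consecutive_failures < limit:
            return "retry"
        return "fail_evaluator"
```

With the default binding, the policy never returns `"enter_recovery"`; it retries until the assumed limit is reached and then fails the evaluator. With a custom binding, it enters recovery after the third consecutive failure, as before.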