Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 0.1.1
Description
I had a bunch of TaskTrackers time out because they were in the middle of job cleanup, and the JobTracker restarted them by responding to emitHeartbeat with UNKNOWN_TASKTRACKER. Afterwards, the list showed both the restarted TaskTracker and its stale pre-restart entry for the same node:
node1100_1234, 0-10 seconds since heartbeat
node1100_4321, >10,000 seconds since heartbeat
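
To make the symptom concrete, here is a minimal sketch of how such leftover entries could be detected. It is not Hadoop code and not the fix that was applied; it assumes tracker names have the form host_id as in the listing above, and the names find_stale_duplicates, host_of, and the 600-second threshold are hypothetical choices for illustration only.

```python
from collections import defaultdict

# Hypothetical expiry threshold in seconds; not a Hadoop default.
STALE_AFTER_SECONDS = 600


def host_of(tracker_name):
    """Return the host part of a name such as 'node1100_1234'."""
    return tracker_name.rsplit("_", 1)[0]


def find_stale_duplicates(last_heartbeat, now, stale_after=STALE_AFTER_SECONDS):
    """Return names of stale trackers that share a host with a fresher tracker.

    last_heartbeat maps tracker name -> time of its last heartbeat (seconds).
    """
    # Group tracker names by the host they run on.
    by_host = defaultdict(list)
    for name in last_heartbeat:
        by_host[host_of(name)].append(name)

    stale = []
    for names in by_host.values():
        if len(names) < 2:
            continue  # only one tracker on this host, nothing to compare
        freshest = max(names, key=lambda n: last_heartbeat[n])
        for name in names:
            if name != freshest and now - last_heartbeat[name] > stale_after:
                stale.append(name)
    return sorted(stale)


# Example mirroring the listing above, plus one unaffected node.
now = 20_000
trackers = {
    "node1100_1234": now - 5,       # restarted tracker, recent heartbeat
    "node1100_4321": now - 10_500,  # old entry that was never removed
    "node1101_7777": now - 3,       # only tracker on its host
}
print(find_stale_duplicates(trackers, now))
```

With the sample data, only node1100_4321 is reported, since it is the older of two entries on the same host and has not sent a heartbeat within the threshold.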