Details
Bug
Status: Closed
Major
Resolution: Fixed
0.10.0
None
None
Description
On a 500-node cluster, I had a bunch of map tasks get "lost" because they failed to report progress for 10 minutes. They appear to have been in the sort stage at the end of the map. I hypothesize that the patch for HADOOP-331 does not update the map's progress during the sort/merge, so if the sort/merge takes more than 10 minutes, the task is declared lost.
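
To illustrate the kind of fix this hypothesis suggests, here is a minimal sketch of a k-way merge that pings a progress callback every few records, so a long merge never goes silent. It is not the HADOOP-331 code or the actual patch: `MergeWithProgress`, `ProgressReporter`, and `reportEvery` are hypothetical names made up for this example, and the sorted runs are plain in-memory lists rather than spill files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class MergeWithProgress {

    /** Hypothetical callback standing in for whatever the framework uses to receive progress. */
    public interface ProgressReporter {
        void progress();
    }

    /**
     * Merges already-sorted runs into one sorted list, calling reporter.progress()
     * after every reportEvery output records (reportEvery must be positive).
     */
    public static List<Integer> merge(List<List<Integer>> runs,
                                      ProgressReporter reporter,
                                      int reportEvery) {
        // Heap entries are {value, runIndex, positionInRun}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new int[] {runs.get(i).get(0), i, 0});
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);

            // Refill the heap from the run the smallest element came from.
            List<Integer> run = runs.get(top[1]);
            int next = top[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), top[1], next});
            }

            // Report inside the loop so a long merge keeps signalling liveness.
            if (out.size() % reportEvery == 0) {
                reporter.progress();
            }
        }
        reporter.progress(); // one final report when the merge completes
        return out;
    }
}
```

The point of the sketch is only where the call sits: inside the merge loop rather than before or after it. Reporting every N records keeps the overhead small, though a time-based interval would arguably be more robust if record sizes vary a lot.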