Details
Improvement
Status: Resolved
Major
Resolution: Fixed
0.10.1
None
None
Description
I decided to split TEZ-4139 into two separate tasks, because the upstream problems can be fixed independently, and that is what I am focusing on now.
Split out from TEZ-4139, this ticket is intended to handle downstream failures as follows:
Collect all reported upstream mapper task attempts for a vertex, and if the number of reports for the same source (map) host goes beyond a certain amount, blame the mapper task immediately. In other words, blame the mapper task attempt as soon as possible if the read error is likely caused by an upstream node failure. (This has a somewhat similar goal to TEZ-3910, but TEZ-3910 currently aims to give the power of failing the downstream task completely to the AM.)
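The per-host counting idea described above could be sketched roughly as follows. This is only an illustration, not the actual Tez change: the class `HostFailureTracker`, the method `reportReadError`, and the threshold parameter `maxReportsPerHost` are hypothetical names, and the sketch assumes that "beyond a certain amount" means strictly more distinct reported attempts than the threshold.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Tez codebase.
class HostFailureTracker {
    // Assumed threshold: how many distinct reported attempts per host are tolerated.
    private final int maxReportsPerHost;

    // Source (map) host -> distinct upstream task attempts reported as unreadable.
    private final Map<String, Set<String>> reportedAttemptsByHost = new HashMap<>();

    HostFailureTracker(int maxReportsPerHost) {
        this.maxReportsPerHost = maxReportsPerHost;
    }

    /**
     * Records a read error reported by a downstream task against an upstream
     * task attempt that ran on the given host. Returns true once the number of
     * distinct reported attempts for that host exceeds the threshold, which is
     * the point where the upstream attempt would be blamed immediately.
     */
    boolean reportReadError(String sourceHost, String upstreamAttemptId) {
        Set<String> attempts =
            reportedAttemptsByHost.computeIfAbsent(sourceHost, h -> new HashSet<>());
        attempts.add(upstreamAttemptId); // repeated reports of one attempt count once
        return attempts.size() > maxReportsPerHost;
    }
}
```

Counting distinct attempts per host, rather than raw reports, is one possible design choice: it avoids blaming a host because a single attempt was reported many times by different readers.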