[MAPREDUCE-5251] Reducer should not implicate map attempt if it has insufficient space to fetch map output - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.23.7, 2.0.4-alpha
Fix Version/s: 0.23.10, 2.1.1-beta
Component/s: mrv2
Labels: None
Target Version/s: 0.23.10
Hadoop Flags: Reviewed

Description

A job can fail if a reducer happens to run on a node with insufficient space to hold a map attempt's output. The reducer keeps reporting the map attempt as bad, and if the map attempt is re-launched too many times before the reducer is recognized as the real problem, the job fails.

In that scenario it would be better to re-launch the reduce attempt, in the hope that it runs on another node with sufficient space to complete the shuffle. Reporting the map attempt as bad and re-launching the map task doesn't change the fact that the reducer can't hold the output.
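
The proposed behavior can be sketched roughly as below. This is only an illustration of the idea, not the code from the attached patches: the class `ShuffleFailureSketch`, the `Action` enum, the `InsufficientLocalSpaceException` type, and the byte counts are all hypothetical names and values chosen for the example. The assumption is that the reducer can tell a failure caused by its own lack of space apart from any other fetch error.

```java
import java.io.IOException;

// Illustrative sketch only: every name here is hypothetical and is not
// taken from the Hadoop code base or from the attached patches.
final class ShuffleFailureSketch {

    /** What the reducer does after a failed fetch of one map output. */
    enum Action { REPORT_MAP_ATTEMPT, FAIL_REDUCE_ATTEMPT }

    /** Hypothetical marker: the reducer itself lacks space for the output. */
    static final class InsufficientLocalSpaceException extends IOException {
        InsufficientLocalSpaceException(String message) {
            super(message);
        }
    }

    /** Throws if the reducer's node cannot hold a map output of the given size. */
    static void reserve(long usableBytes, long mapOutputBytes)
            throws InsufficientLocalSpaceException {
        if (mapOutputBytes > usableBytes) {
            throw new InsufficientLocalSpaceException(
                "need " + mapOutputBytes + " bytes, only " + usableBytes + " usable");
        }
    }

    /** Local-space errors blame the reducer; any other I/O error blames the map attempt. */
    static Action classify(IOException error) {
        return (error instanceof InsufficientLocalSpaceException)
            ? Action.FAIL_REDUCE_ATTEMPT
            : Action.REPORT_MAP_ATTEMPT;
    }

    /** Simulates one fetch; returns null when the fetch succeeds. */
    static Action fetch(long usableBytes, long mapOutputBytes) {
        try {
            reserve(usableBytes, mapOutputBytes);
            return null; // enough space: nothing to report
        } catch (IOException e) {
            return classify(e);
        }
    }
}
```

With this split, a reducer that cannot hold the output fails its own attempt and can be rescheduled elsewhere, while errors such as a broken connection to the map host would still count against the map attempt.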

Attachments

MAPREDUCE-5251-7-b23.txt, 26/Jul/13 15:34, 6 kB, Ashwin Shankar
MAPREDUCE-5251-7.txt, 25/Jul/13 18:30, 5 kB, Ashwin Shankar
MAPREDUCE-5251-6.txt, 23/Jul/13 22:02, 5 kB, Ashwin Shankar
MAPREDUCE-5251-5.txt, 23/Jul/13 19:19, 5 kB, Ashwin Shankar
MAPREDUCE-5251-4.txt, 18/Jul/13 20:26, 5 kB, Ashwin Shankar
MAPREDUCE-5251-3.txt, 27/Jun/13 20:47, 5 kB, Ashwin Shankar
MAPREDUCE-5251-2.txt, 13/Jun/13 19:33, 7 kB, Ashwin Shankar

Issue Links

is duplicated by: MAPREDUCE-4852 Reducer should not signal fetch failures for disk errors on the reducer's side (Resolved)
relates to: TEZ-952 Port MAPREDUCE-5209, MAPREDUCE-5251 (Closed)

People

Assignee: Ashwin Shankar
Reporter: Jason Darrell Lowe
Votes: 0
Watchers: 5

Dates

Created: 15/May/13 13:50
Updated: 03/Sep/14 23:52
Resolved: 26/Jul/13 17:59