Description
We are seeing the usedMemory leak in Fetcher#copyMapOutput happen when all of the following hold:
- a NodeManager's (NM's) disk goes bad during the creation of map output(s)
- the reducer's fetcher can read the shuffle header and reserve the memory
- the fetcher then gets an IOException when trying to shuffle for InMemoryMapOutput
- shuffle fetch retry is enabled
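The conditions above can be illustrated with a small, self-contained model. This is only a sketch and not Hadoop code: `MemoryBudget`, `fetchOnce`, `attemptsBeforeStall`, and the byte sizes are hypothetical stand-ins for the reducer's in-memory shuffle accounting and the fetch/retry loop. It assumes that a failed copy is simply retried, and shows how a reservation that is never released on the IOException path eventually leaves no memory for later attempts.

```java
import java.io.IOException;

public class LeakSketch {
    /** Hypothetical stand-in for the reducer's in-memory shuffle budget. */
    static class MemoryBudget {
        private final long limit;
        private long used;

        MemoryBudget(long limit) { this.limit = limit; }

        /** Returns true if the bytes were reserved, false if the caller would have to wait. */
        boolean reserve(long bytes) {
            if (used + bytes > limit) {
                return false;
            }
            used += bytes;
            return true;
        }

        void unreserve(long bytes) { used -= bytes; }
    }

    /** Models one fetch attempt whose copy step always fails, as with a bad disk. */
    static boolean fetchOnce(MemoryBudget budget, long bytes, boolean releaseOnError) {
        if (!budget.reserve(bytes)) {
            return false; // no memory left: a real fetcher would stall here
        }
        try {
            throw new IOException("simulated read failure during shuffle");
        } catch (IOException e) {
            if (releaseOnError) {
                budget.unreserve(bytes); // give the reservation back before retrying
            }
            return true; // the attempt was made and will be retried
        }
    }

    /** Counts how many retries obtain memory before the budget is exhausted. */
    public static int attemptsBeforeStall(long limit, long bytes,
                                          boolean releaseOnError, int maxAttempts) {
        MemoryBudget budget = new MemoryBudget(limit);
        int attempts = 0;
        while (attempts < maxAttempts && fetchOnce(budget, bytes, releaseOnError)) {
            attempts++;
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Without a release on error, only two 40-byte reservations fit in 100 bytes.
        System.out.println("leaky: " + attemptsBeforeStall(100, 40, false, 10)); // leaky: 2
        // With a release on error, every retry can reserve memory again.
        System.out.println("fixed: " + attemptsBeforeStall(100, 40, true, 10));  // fixed: 10
    }
}
```

In this model the leaky variant stops obtaining memory after two attempts, which is consistent with the duplicate report of a reducer hanging in the copy phase.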
Attachments
- MAPREDUCE-6334.002.patch (4 kB, Eric Payne)
- MAPREDUCE-6334.001.patch (4 kB, Eric Payne)
Issue Links
- is duplicated by MAPREDUCE-6351 Reducer hung in copy phase. (Resolved)
Activity
+1 overall
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | pre-patch | 14m 29s | Pre-patch trunk compilation is healthy. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
+1 | javac | 7m 31s | There were no new javac warning messages. |
+1 | javadoc | 9m 35s | There were no new javadoc warning messages. |
+1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
+1 | checkstyle | 5m 22s | There were no new checkstyle issues. |
+1 | install | 1m 32s | mvn install still works. |
+1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
+1 | findbugs | 1m 14s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
+1 | mapreduce tests | 1m 37s | Tests passed in hadoop-mapreduce-client-core. |
| | | 42m 18s | |
Subsystem | Report/Notes |
---|---|
Patch URL | http://issues.apache.org/jira/secure/attachment/12728059/MAPREDUCE-6334.001.patch |
Optional Tests | javadoc javac unit findbugs checkstyle |
git revision | trunk / 78fe6e5 |
hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/testReport/ |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/console |
This message was automatically generated.
Thanks for the patch, Eric! I think the patch will fix the particular issue but introduces another. What if nothing goes wrong and the transfer was successful? It looks like mapOutput will be non-null, but it would be bad if we aborted the map output after it committed. I think we only want to abort the mapOutput if something went wrong, so I think the abort logic should be grouped with the code that's reporting an error occurred (i.e., in the catch clause).
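The placement suggested in this comment can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: the `Output` class, its `shuffle`, `commit`, and `abort` methods, and the `outcome` helper are hypothetical names chosen to mirror the discussion. The point is that `abort()` lives only in the catch clause, so a successfully committed output is never aborted.

```java
import java.io.IOException;

public class AbortOnErrorSketch {
    /** Hypothetical stand-in for an in-memory map output with commit and abort. */
    static class Output {
        boolean committed;
        boolean aborted;

        void shuffle(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated shuffle failure");
            }
        }

        void commit() { committed = true; }

        void abort() { aborted = true; }
    }

    /** Copies one output; aborts only on the error path, never after a commit. */
    static Output copy(boolean fail) {
        Output out = new Output();
        try {
            out.shuffle(fail);
            out.commit(); // reached only if the transfer succeeded
        } catch (IOException e) {
            out.abort();  // grouped with the error handling, as the review suggests
        }
        return out;
    }

    /** Summarizes the final state of a copy for easy inspection. */
    public static String outcome(boolean fail) {
        Output out = copy(fail);
        if (out.committed && !out.aborted) {
            return "committed";
        }
        if (out.aborted && !out.committed) {
            return "aborted";
        }
        return "inconsistent";
    }

    public static void main(String[] args) {
        System.out.println(outcome(false)); // committed
        System.out.println(outcome(true));  // aborted
    }
}
```

Had the abort been placed in a finally block guarded only by a non-null check, the successful case in this model would end up both committed and aborted, which is the problem the comment describes.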
Thanks, jlowe. Good catch. I have uploaded a new patch that addresses your suggestions.
+1 overall
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
+1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
+1 | javac | 7m 31s | There were no new javac warning messages. |
+1 | javadoc | 9m 30s | There were no new javadoc warning messages. |
+1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
+1 | checkstyle | 5m 28s | There were no new checkstyle issues. |
+1 | install | 1m 34s | mvn install still works. |
+1 | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
+1 | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
+1 | mapreduce tests | 1m 37s | Tests passed in hadoop-mapreduce-client-core. |
| | | 42m 29s | |
Subsystem | Report/Notes |
---|---|
Patch URL | http://issues.apache.org/jira/secure/attachment/12728866/MAPREDUCE-6334.002.patch |
Optional Tests | javadoc javac unit findbugs checkstyle |
git revision | trunk / eccf709 |
hadoop-mapreduce-client-core test log | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt |
Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/testReport/ |
Java | 1.7.0_55 |
uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/console |
This message was automatically generated.
Thanks, Eric! I committed this to trunk, branch-2, and branch-2.7.
SUCCESS: Integrated in Hadoop-trunk-Commit #7695 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7695/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
- hadoop-mapreduce-project/CHANGES.txt
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
FAILURE: Integrated in Hadoop-Hdfs-trunk #2110 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2110/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/CHANGES.txt
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #169 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/169/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/CHANGES.txt
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #178 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/178/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/CHANGES.txt
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
FAILURE: Integrated in Hadoop-Yarn-trunk #912 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/912/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
- hadoop-mapreduce-project/CHANGES.txt
FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #179 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/179/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
- hadoop-mapreduce-project/CHANGES.txt
FAILURE: Integrated in Hadoop-Mapreduce-trunk #2128 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2128/)
MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
- hadoop-mapreduce-project/CHANGES.txt
The change applied cleanly to branch-2.6 for 2.6.2. Ran a clean build and ran the TestFetcher test.
Version 1 of the patch. jlowe, would you mind taking a look?