Hadoop Map/Reduce / MAPREDUCE-6334

Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.1, 2.6.2, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      We are seeing this happen when all of the following hold:

      • a NodeManager's (NM) disk goes bad during the creation of map output(s)
      • the reducer's fetcher can read the shuffle header and reserve the memory
      • the fetcher then gets an IOException when trying to shuffle into the InMemoryMapOutput
      • shuffle fetch retry is enabled
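
      The conditions above can be modeled with a toy sketch. It is not the Hadoop code: MemoryBudget, LeakSketch, fetchOnce, and usedAfterRetries are hypothetical names invented for illustration, and the sketch assumes that every failed attempt leaves its reservation counted unless it is explicitly released.

```java
import java.io.IOException;

/** Hypothetical stand-in for the reducer's in-memory shuffle accounting. */
class MemoryBudget {
    private long usedMemory;

    void reserve(long bytes) { usedMemory += bytes; }

    void unreserve(long bytes) { usedMemory -= bytes; }

    long getUsedMemory() { return usedMemory; }
}

/** Toy model of a fetch that fails after the memory was already reserved. */
class LeakSketch {

    /** One fetch attempt: the header is read, memory is reserved, then the copy fails. */
    static void fetchOnce(MemoryBudget budget, long size, boolean releaseOnFailure) {
        budget.reserve(size); // shuffle header was readable, so memory is reserved
        try {
            // Simulated failure while copying the map output into memory.
            throw new IOException("simulated bad disk on the NM");
        } catch (IOException e) {
            if (releaseOnFailure) {
                budget.unreserve(size); // give the reservation back on error
            }
            // Without the release, the reservation stays counted as used.
        }
    }

    /** Repeats the failing fetch, as a retry would, and reports the memory still counted as used. */
    static long usedAfterRetries(int attempts, long size, boolean releaseOnFailure) {
        MemoryBudget budget = new MemoryBudget();
        for (int i = 0; i < attempts; i++) {
            fetchOnce(budget, size, releaseOnFailure);
        }
        return budget.getUsedMemory();
    }
}
```

      In this model, three failed attempts of 100 bytes each leave 300 bytes counted as used when nothing is released, and 0 bytes when each failure releases its reservation.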

      Attachments

        1. MAPREDUCE-6334.002.patch
          4 kB
          Eric Payne
        2. MAPREDUCE-6334.001.patch
          4 kB
          Eric Payne

          Activity

            epayne Eric Payne added a comment -

            Version 1 of patch. jlowe would you mind taking a look?

            hadoopqa Hadoop QA added a comment -



            +1 overall

            Vote  Subsystem        Runtime  Comment
            0     pre-patch        14m 29s  Pre-patch trunk compilation is healthy.
            +1    @author          0m 0s    The patch does not contain any @author tags.
            +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
            +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
            +1    javac            7m 31s   There were no new javac warning messages.
            +1    javadoc          9m 35s   There were no new javadoc warning messages.
            +1    release audit    0m 22s   The applied patch does not increase the total number of release audit warnings.
            +1    checkstyle       5m 22s   There were no new checkstyle issues.
            +1    install          1m 32s   mvn install still works.
            +1    eclipse:eclipse  0m 33s   The patch built with eclipse:eclipse.
            +1    findbugs         1m 14s   The patch does not introduce any new Findbugs (version 2.0.3) warnings.
            +1    mapreduce tests  1m 37s   Tests passed in hadoop-mapreduce-client-core.
                                   42m 18s

            Subsystem Report/Notes
            Patch URL http://issues.apache.org/jira/secure/attachment/12728059/MAPREDUCE-6334.001.patch
            Optional Tests javadoc javac unit findbugs checkstyle
            git revision trunk / 78fe6e5
            hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
            Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/testReport/
            Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5448/console

            This message was automatically generated.

            jlowe Jason Darrell Lowe added a comment -

            Thanks for the patch, Eric! I think the patch will fix the particular issue but introduces another. What if nothing goes wrong and the transfer was successful? It looks like mapOutput will be non-null, but it would be bad if we aborted the map output after it committed. I think we only want to abort the mapOutput if something went wrong, so I think the abort logic should be grouped with the code that's reporting an error occurred (i.e.: in the catch clause).
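
            The review point above can be illustrated with a small sketch. The actual version 1 patch is not shown in this thread, so the abortAlways variant is only one assumed way the problem could arise; MapOutputSketch and AbortPlacementSketch are hypothetical names, not Hadoop classes.

```java
import java.io.IOException;

/** Hypothetical map output that records whether it was committed or aborted. */
class MapOutputSketch {
    boolean committed;
    boolean aborted;

    void shuffle(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated shuffle failure");
        }
    }

    void commit() { committed = true; }

    void abort() { aborted = true; }
}

class AbortPlacementSketch {

    /** Assumed problematic shape: abort whenever the output is non-null, even after a commit. */
    static MapOutputSketch abortAlways(boolean fail) {
        MapOutputSketch out = null;
        try {
            out = new MapOutputSketch();
            out.shuffle(fail);
            out.commit();
        } catch (IOException e) {
            // The failure would be reported here.
        } finally {
            if (out != null) {
                out.abort(); // also runs on the success path
            }
        }
        return out;
    }

    /** Shape suggested in the review: abort only where the error is handled. */
    static MapOutputSketch abortOnError(boolean fail) {
        MapOutputSketch out = new MapOutputSketch();
        try {
            out.shuffle(fail);
            out.commit();
        } catch (IOException e) {
            out.abort(); // grouped with the error reporting
        }
        return out;
    }
}
```

            With a successful transfer, abortAlways leaves the output both committed and aborted, while abortOnError leaves it committed only. With a failed transfer, abortOnError aborts without committing.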

            vinodkv Vinod Kumar Vavilapalli added a comment -

            Canceling the patch to address review comments.

            epayne Eric Payne added a comment -

            Thanks, jlowe. Good catch. I have uploaded a new patch that addresses your suggestions.

            hadoopqa Hadoop QA added a comment -



            +1 overall

            Vote  Subsystem        Runtime  Comment
            0     pre-patch        14m 37s  Pre-patch trunk compilation is healthy.
            +1    @author          0m 0s    The patch does not contain any @author tags.
            +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
            +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
            +1    javac            7m 31s   There were no new javac warning messages.
            +1    javadoc          9m 30s   There were no new javadoc warning messages.
            +1    release audit    0m 22s   The applied patch does not increase the total number of release audit warnings.
            +1    checkstyle       5m 28s   There were no new checkstyle issues.
            +1    install          1m 34s   mvn install still works.
            +1    eclipse:eclipse  0m 32s   The patch built with eclipse:eclipse.
            +1    findbugs         1m 15s   The patch does not introduce any new Findbugs (version 2.0.3) warnings.
            +1    mapreduce tests  1m 37s   Tests passed in hadoop-mapreduce-client-core.
                                   42m 29s

            Subsystem Report/Notes
            Patch URL http://issues.apache.org/jira/secure/attachment/12728866/MAPREDUCE-6334.002.patch
            Optional Tests javadoc javac unit findbugs checkstyle
            git revision trunk / eccf709
            hadoop-mapreduce-client-core test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/artifact/patchprocess/testrun_hadoop-mapreduce-client-core.txt
            Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/testReport/
            Java 1.7.0_55
            uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
            Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5465/console

            This message was automatically generated.

            jlowe Jason Darrell Lowe added a comment -

            +1 lgtm. Committing this.

            jlowe Jason Darrell Lowe added a comment -

            Thanks, Eric! I committed this to trunk, branch-2, and branch-2.7.

            hudson Hudson added a comment -

            SUCCESS: Integrated in Hadoop-trunk-Commit #7695 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7695/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            • hadoop-mapreduce-project/CHANGES.txt
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk #2110 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2110/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/CHANGES.txt
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #169 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/169/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/CHANGES.txt
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #178 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/178/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/CHANGES.txt
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Yarn-trunk #912 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/912/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            • hadoop-mapreduce-project/CHANGES.txt
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #179 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/179/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            • hadoop-mapreduce-project/CHANGES.txt
            hudson Hudson added a comment -

            FAILURE: Integrated in Hadoop-Mapreduce-trunk #2128 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2128/)
            MAPREDUCE-6334. Fetcher#copyMapOutput is leaking usedMemory upon IOException during InMemoryMapOutput shuffle handler. Contributed by Eric Payne (jlowe: rev bc1bd7e5c4047b374420683d36a8c30eda6d75b6)

            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
            • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
            • hadoop-mapreduce-project/CHANGES.txt

            vinodkv Vinod Kumar Vavilapalli added a comment -

            Targeting 2.6.2 per Eric's comment in the mailing lists.

            sjlee0 Sangjin Lee added a comment -

            The change applied cleanly to branch-2.6 for 2.6.2. Ran a clean build and ran the TestFetcher test.

            vishal.rajan vishal.rajan added a comment -

            How can I reproduce this issue? We faced a similar issue.

            epayne Eric Payne added a comment -

            vishal.rajan, what version of Hadoop are you running?

            vishal.rajan vishal.rajan added a comment -

            yarn 2.6.0


            People

              Assignee: epayne Eric Payne
              Reporter: epayne Eric Payne
              Votes: 0
              Watchers: 9
