Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-2403

JobHistory log files contain data that cannot be parsed by org.apache.hadoop.mapred.JobHistory

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • None
    • 0.19.0
    • None
    • None
    • Reviewed

    Description

      When some tasks failed, the job tracker writes an line to the history file with error message.
      However, the error message may mess up with the history file format, choking the history parser. Here is an example:

      MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090" TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED" FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050" ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,> substituted by null
      . Key XML-Ori:

      <Root>

      Attachments

        1. EncodeDecode.java
          1 kB
          Amar Kamat
        2. patch-2403.txt
          7 kB
          Amareshwari Sriramadasu
        3. patch-2403.txt
          2 kB
          Amareshwari Sriramadasu
        4. patch-2403.txt
          2 kB
          Amareshwari Sriramadasu
        5. patch-2403.txt
          0.5 kB
          Amareshwari Sriramadasu

        Issue Links

          Activity

            JobHistory parser reads line by line and parses it. If line ends with

            {b} '\' {b}

            , it will continue reading next line.
            So, If error messages have more than one line, the lines should be appended with

            {b} '\' {b}

            (except the last line) before writing to log file.

            Thoughts?

            amareshwari Amareshwari Sriramadasu added a comment - JobHistory parser reads line by line and parses it. If line ends with {b} '\' {b} , it will continue reading next line. So, If error messages have more than one line, the lines should be appended with {b} '\' {b} (except the last line) before writing to log file. Thoughts?

            , the lines should be appended with {b} '\' {b}.

            This is to be done by replace "\n" by "\ \n"

            amareshwari Amareshwari Sriramadasu added a comment - , the lines should be appended with {b} '\' {b}. This is to be done by replace "\n" by "\ \n"
            amar_kamat Amar Kamat added a comment -

            Following are the solutions that might work here
            1) Encode/decode the strings that are externally generated (like error).
            2) Add character count information to the key-val pair. Something like key=length:value. Now read length characters at a time for forming the value.
            3) Serialize job history information.

            Following are the problems with (2) when error-like strings are stored in JobHistory :
            1) The whole line needs to be read before parsing and hence there is no good way to detect the length at the key-value level.
            2) A simple solution would be to use record-level length. But again there is a problem :

            • Currently, the code checks for a ' " ' in the end before considering the record as complete. This can be erroneous as error string can contain ' " ' which might lead to premature termination. Also the splitting of key-val pairs is done based on space. Hence with error like strings in the history, the split will result into wrong key-val pairs. Hence encoding-decoding seems to a better fix for all these problems.

            While encoding, one should make sure that the data is written in one line. Hence the record parsing algorithm becomes
            1) Read line. Since all the entries fit in one line, there is no need to look for record end.
            2) Split the line based on space.
            3) Split the pair on '=' to get key and value. Recover the value if required.


            Thoughts?

            amar_kamat Amar Kamat added a comment - Following are the solutions that might work here 1) Encode/decode the strings that are externally generated (like error). 2) Add character count information to the key-val pair. Something like key=length:value . Now read length characters at a time for forming the value. 3) Serialize job history information. Following are the problems with (2) when error-like strings are stored in JobHistory : 1) The whole line needs to be read before parsing and hence there is no good way to detect the length at the key-value level. 2) A simple solution would be to use record-level length. But again there is a problem : Currently, the code checks for a ' " ' in the end before considering the record as complete. This can be erroneous as error string can contain ' " ' which might lead to premature termination. Also the splitting of key-val pairs is done based on space . Hence with error like strings in the history, the split will result into wrong key-val pairs. Hence encoding-decoding seems to a better fix for all these problems. While encoding, one should make sure that the data is written in one line. Hence the record parsing algorithm becomes 1) Read line. Since all the entries fit in one line, there is no need to look for record end. 2) Split the line based on space . 3) Split the pair on '=' to get key and value. Recover the value if required. Thoughts?
            amar_kamat Amar Kamat added a comment -

            Attaching a sample encode/decode implementation

            Data in code :

             StringBuffer line = new StringBuffer();
              line.append("I m happy");
              line.append('\n');
              line.append("me 2");
              line.append('\n');
              line.append("`1234567890-=qwertyuiop[]\\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:\"'ZXCVBNM<>?\r\t\b\n\f");
            

            Data on screen :
            Data :

            I m happy
            me 2
            `1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
            
            
            

            ------------------------------------
            Code :

            73-32-109-32-104-97-112-112-121-10-109-101-32-50-10-96-49-50-51-52-53-54-55-56-57-48-45-61-113-119-101-114-116-121-117-105-111-112-91-93-92-97-115-100-102-103-104-106-107-108-59-39-122-120-99-118-98-110-109-44-46-47-126-33-64-35-36-37-94-38-42-40-41-95-43-81-87-69-82-84-89-85-73-79-80-123-125-124-65-83-68-70-71-72-74-75-76-58-34-39-90-88-67-86-66-78-77-60-62-63-13-9-8-10-12-
            

            ------------------------------------
            Decoded :

            I m happy
            me 2
            `1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
            
            
            
            amar_kamat Amar Kamat added a comment - Attaching a sample encode/decode implementation Data in code : StringBuffer line = new StringBuffer (); line.append( "I m happy" ); line.append( '\n' ); line.append( "me 2" ); line.append( '\n' ); line.append( "`1234567890-=qwertyuiop[]\\asdfghjkl; 'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:\" ' ZXCVBNM<>?\r\t\b\n\f"); Data on screen : Data : I m happy me 2 `1234567890-=qwertyuiop[]\asdfghjkl; 'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"' ZXCVBNM<>? ------------------------------------ Code : 73-32-109-32-104-97-112-112-121-10-109-101-32-50-10-96-49-50-51-52-53-54-55-56-57-48-45-61-113-119-101-114-116-121-117-105-111-112-91-93-92-97-115-100-102-103-104-106-107-108-59-39-122-120-99-118-98-110-109-44-46-47-126-33-64-35-36-37-94-38-42-40-41-95-43-81-87-69-82-84-89-85-73-79-80-123-125-124-65-83-68-70-71-72-74-75-76-58-34-39-90-88-67-86-66-78-77-60-62-63-13-9-8-10-12- ------------------------------------ Decoded : I m happy me 2 `1234567890-=qwertyuiop[]\asdfghjkl; 'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"' ZXCVBNM<>?

            Here is patch with one line change which replaces occurances of \n with \ \n (there is no space in between both back slashes. I put it here because the editor is treating it as a line break). The patch is tested, now it parses history files containing new lines in value field.

            amareshwari Amareshwari Sriramadasu added a comment - Here is patch with one line change which replaces occurances of \n with \ \n (there is no space in between both back slashes. I put it here because the editor is treating it as a line break). The patch is tested, now it parses history files containing new lines in value field.
            amar_kamat Amar Kamat added a comment -

            I think we should fix the general problem to do with history parsing which are
            1) Detect if the record is complete or not. The client can fail while writing to the history and the failure can be exactly on the key-val boundary.
            2) Detect if the key-val pairs are correct. The error message can contain tabs and other characters like " which can error the history parsing. Currently a tab is the delimiter for records and a " is used for value encapsulation. Similarly other strings in the history can have these characters like counter-names etc.
            This problems can fail HADOOP-3245.

            I would go for having a record delimiter like a .(dot) to detect if the record is complete or not. Also incomplete records should not be parsed and should be ignored. We also need to make sure that the characters that are used as delimiter (., ", tab) should not occur in a value.


            Thoughts?

            amar_kamat Amar Kamat added a comment - I think we should fix the general problem to do with history parsing which are 1) Detect if the record is complete or not. The client can fail while writing to the history and the failure can be exactly on the key-val boundary. 2) Detect if the key-val pairs are correct. The error message can contain tabs and other characters like " which can error the history parsing. Currently a tab is the delimiter for records and a " is used for value encapsulation. Similarly other strings in the history can have these characters like counter-names etc. This problems can fail HADOOP-3245 . I would go for having a record delimiter like a . (dot) to detect if the record is complete or not. Also incomplete records should not be parsed and should be ignored. We also need to make sure that the characters that are used as delimiter ( . , " , tab) should not occur in a value . Thoughts?

            We also need to make sure that the characters that are used as delimiter (., ", tab) should not occur in a value

            I guess what you mean is that they must be escaped. This is similar to what Amarsri has done, but is asking that the escaping be done for all characters that have significance in job history. +1 for that.

            yhemanth Hemanth Yamijala added a comment - We also need to make sure that the characters that are used as delimiter (., ", tab) should not occur in a value I guess what you mean is that they must be escaped. This is similar to what Amarsri has done, but is asking that the escaping be done for all characters that have significance in job history. +1 for that.
            omalley Owen O'Malley added a comment -

            I think we need to be consistent and quote all of the separators. We also need to unquote them when they are being parsed.

            omalley Owen O'Malley added a comment - I think we need to be consistent and quote all of the separators. We also need to unquote them when they are being parsed.

            Existing history parsing code actually incorporates new lines in the values. The parsing problem occurs when the character " is followed by \n, because the value doesnt allow " inside. Since the JobHistory looks for KEY="VALUE" Pattern for parsing keys and values, parsing fails if value has " and = in it.

            The attached patch escapes " and = in the value and logs it. Regular expression for VALUE is modified to allow any character otherthan quote, but escaped quotes will be allowed. After parsing the value, both " and = are unescaped and returned.

            amareshwari Amareshwari Sriramadasu added a comment - Existing history parsing code actually incorporates new lines in the values. The parsing problem occurs when the character " is followed by \n , because the value doesnt allow " inside. Since the JobHistory looks for KEY="VALUE" Pattern for parsing keys and values, parsing fails if value has " and = in it. The attached patch escapes " and = in the value and logs it. Regular expression for VALUE is modified to allow any character otherthan quote, but escaped quotes will be allowed. After parsing the value, both " and = are unescaped and returned.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12389822/patch-2403.txt
            against trunk revision 693705.

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no tests are needed for this patch.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs warnings.

            +1 core tests. The patch passed core unit tests.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/testReport/
            Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
            Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/checkstyle-errors.html
            Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12389822/patch-2403.txt against trunk revision 693705. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/console This message is automatically generated.

            The patch does the following changes from earlier patch:
            1. Removes unnecessary change for DELIMITER from String to Char.
            2. Adds \n at the end of each line when the value spans through multiple lines, because BufferReader.readLine() doesnt read the line-termination character.
            3. Earlier patch was not putting the unescaped value in the parseBuffer. Fixed that in the patch.

            amareshwari Amareshwari Sriramadasu added a comment - The patch does the following changes from earlier patch: 1. Removes unnecessary change for DELIMITER from String to Char. 2. Adds \n at the end of each line when the value spans through multiple lines, because BufferReader.readLine() doesnt read the line-termination character. 3. Earlier patch was not putting the unescaped value in the parseBuffer. Fixed that in the patch.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12389900/patch-2403.txt
            against trunk revision 693705.

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no tests are needed for this patch.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs warnings.

            -1 core tests. The patch failed core unit tests.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/testReport/
            Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
            Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/checkstyle-errors.html
            Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12389900/patch-2403.txt against trunk revision 693705. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/console This message is automatically generated.

            Test failure org.apache.hadoop.hdfs.TestFileCreation.testClientTriggeredLeaseRecovery is not related to the patch

            amareshwari Amareshwari Sriramadasu added a comment - Test failure org.apache.hadoop.hdfs.TestFileCreation.testClientTriggeredLeaseRecovery is not related to the patch
            • In parseHistoryFromFS, the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped ?
            • Also, I think this patch needs a test case to ensure lines with the special characters are being handled correctly.
            yhemanth Hemanth Yamijala added a comment - In parseHistoryFromFS , the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped ? Also, I think this patch needs a test case to ensure lines with the special characters are being handled correctly.

            Updated patch with trunk. And also added a testcase to parse values with special characters and patterns.

            In parseHistoryFromFS, the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped

            This is already taken care in StringUtils

            amareshwari Amareshwari Sriramadasu added a comment - Updated patch with trunk. And also added a testcase to parse values with special characters and patterns. In parseHistoryFromFS, the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped This is already taken care in StringUtils

            test-patch result on trunk :

                 [exec] +1 overall.
                 [exec]
                 [exec]     +1 @author.  The patch does not contain any @author tags.
                 [exec]
                 [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
                 [exec]
                 [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
                 [exec]
                 [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
                 [exec]
                 [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
            
            amareshwari Amareshwari Sriramadasu added a comment - test-patch result on trunk : [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12390260/patch-2403.txt
            against trunk revision 696427.

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified tests.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs warnings.

            +1 core tests. The patch passed core unit tests.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/testReport/
            Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
            Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/checkstyle-errors.html
            Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12390260/patch-2403.txt against trunk revision 696427. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/console This message is automatically generated.

            +1 for the fix.

            yhemanth Hemanth Yamijala added a comment - +1 for the fix.
            ddas Devaraj Das added a comment -

            I just committed this. Thanks, Amareshwari!

            ddas Devaraj Das added a comment - I just committed this. Thanks, Amareshwari!
            hudson Hudson added a comment -
            hudson Hudson added a comment - Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/ )
            hudson Hudson added a comment -
            hudson Hudson added a comment - Integrated in Hadoop-trunk #618 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/618/ )

            People

              amareshwari Amareshwari Sriramadasu
              runping Runping Qi
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: