Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Hadoop Flags: Reviewed
Description
When a task fails, the JobTracker writes a line containing the error message to the job history file.
However, the error message can break the history file format and choke the history parser. Here is an example:
MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090" TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED" FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050" ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,> substituted by null
. Key XML-Ori:
<Root>
Attachments
- EncodeDecode.java (1 kB) - Amar Kamat
- patch-2403.txt (7 kB) - Amareshwari Sriramadasu
- patch-2403.txt (2 kB) - Amareshwari Sriramadasu
- patch-2403.txt (2 kB) - Amareshwari Sriramadasu
- patch-2403.txt (0.5 kB) - Amareshwari Sriramadasu
Issue Links
- is depended upon by HADOOP-3245 Provide ability to persist running jobs (extend HADOOP-1876) (Closed)
Activity
The JobHistory parser reads the history file line by line and parses each line. If a line ends with '\', it continues reading the next line. So, if an error message spans multiple lines, each line (except the last) should have '\' appended before it is written to the log file. This can be done by replacing "\n" with a backslash followed by "\n".
Thoughts?
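A minimal sketch of this continuation-line convention, with illustrative names only (this is not the actual JobHistory code):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ContinuationLineDemo {

  // Writer side: every line of a multi-line value except the last ends with '\'.
  static String escapeNewlines(String value) {
    return value.replace("\n", "\\\n"); // "\n" -> "\" followed by "\n"
  }

  // Reader side: keep reading while the current physical line ends with '\'.
  static String readLogicalLine(BufferedReader in) throws IOException {
    StringBuilder logical = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
      if (line.endsWith("\\")) {
        logical.append(line, 0, line.length() - 1).append('\n');
      } else {
        logical.append(line);
        break;
      }
    }
    return logical.toString();
  }

  public static void main(String[] args) throws IOException {
    String error = "java.lang.IllegalArgumentException: trouble\n  at Foo.bar(Foo.java:1)";
    BufferedReader in = new BufferedReader(new StringReader(escapeNewlines(error)));
    System.out.println(error.equals(readLogicalLine(in))); // prints true
  }
}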
Following are the solutions that might work here:
1) Encode/decode the strings that are externally generated (like error).
2) Add character count information to the key-val pair. Something like key=length:value. Now read length characters at a time for forming the value.
3) Serialize job history information.
Following are the problems with (2) when error-like strings are stored in JobHistory:
1) The whole line needs to be read before parsing, and hence there is no good way to detect the length at the key-value level.
2) A simple solution would be to use a record-level length. But again there is a problem:
- Currently, the code checks for a '"' at the end before considering the record complete. This can be erroneous, since the error string can itself contain '"', which might lead to premature termination. Also, the splitting of key-val pairs is done based on spaces, so with error-like strings in the history the split will produce wrong key-val pairs. Hence encoding/decoding seems to be a better fix for all these problems.
While encoding, one should make sure that each record is written on a single line. The record parsing algorithm then becomes:
1) Read line. Since all the entries fit in one line, there is no need to look for record end.
2) Split the line based on space.
3) Split the pair on '=' to get key and value. Recover the value if required.
Thoughts?
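As an illustration only (none of these names come from the Hadoop code base), here is a sketch of the parsing algorithm above, assuming the writer has already encoded each value so that a record fits on one line:

import java.util.HashMap;
import java.util.Map;

public class OneLineRecordParser {

  // Stand-in for whatever encoding scheme is chosen; a real implementation
  // would reverse that encoding here.
  static String decodeValue(String encoded) {
    return encoded;
  }

  // 1) the caller reads one full line; 2) split on space; 3) split each pair on '='.
  static Map<String, String> parseRecord(String line) {
    Map<String, String> record = new HashMap<String, String>();
    for (String pair : line.split(" ")) {
      int eq = pair.indexOf('=');
      if (eq < 0) {
        continue; // malformed pair; a real parser might mark the record as corrupt
      }
      record.put(pair.substring(0, eq), decodeValue(pair.substring(eq + 1)));
    }
    return record;
  }

  public static void main(String[] args) {
    System.out.println(parseRecord("TASK_TYPE=MAP TASK_STATUS=FAILED"));
  }
}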
Attaching a sample encode/decode implementation
Data in code :
StringBuffer line = new StringBuffer();
line.append("I m happy");
line.append('\n');
line.append("me 2");
line.append('\n');
line.append("`1234567890-=qwertyuiop[]\\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:\"'ZXCVBNM<>?\r\t\b\n\f");
Data on screen :
Data :
I m happy
me 2
`1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
------------------------------------
Code :
73-32-109-32-104-97-112-112-121-10-109-101-32-50-10-96-49-50-51-52-53-54-55-56-57-48-45-61-113-119-101-114-116-121-117-105-111-112-91-93-92-97-115-100-102-103-104-106-107-108-59-39-122-120-99-118-98-110-109-44-46-47-126-33-64-35-36-37-94-38-42-40-41-95-43-81-87-69-82-84-89-85-73-79-80-123-125-124-65-83-68-70-71-72-74-75-76-58-34-39-90-88-67-86-66-78-77-60-62-63-13-9-8-10-12-
------------------------------------
Decoded :
I m happy
me 2
`1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
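The attachment itself is not inlined here, but from the output above the encoding is simply the dash-separated character codes of the input (73 = 'I', 32 = space, and so on). A small sketch that would produce that kind of output; the names are illustrative and not necessarily those used in EncodeDecode.java:

public class EncodeDecodeSketch {

  // Encode every character as its integer code, separated (and terminated) by '-'.
  static String encode(String data) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < data.length(); i++) {
      out.append((int) data.charAt(i)).append('-');
    }
    return out.toString();
  }

  // Reverse the encoding: parse each code back into a character.
  static String decode(String encoded) {
    StringBuilder out = new StringBuilder();
    for (String code : encoded.split("-")) {
      if (code.length() > 0) {
        out.append((char) Integer.parseInt(code));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String data = "I m happy\nme 2";
    String code = encode(data);
    System.out.println("Code    : " + code);         // 73-32-109-32-104-...
    System.out.println("Decoded : " + decode(code)); // round-trips to the input
  }
}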
Here is a patch with a one-line change that replaces occurrences of "\n" with "\" followed by "\n" (there is no space between the two backslashes; the space appears here only because the editor treats the sequence as a line break). The patch is tested; it now parses history files containing newlines in the value field.
I think we should fix the general problems with history parsing, which are:
1) Detect whether a record is complete or not. The client can fail while writing to the history, and the failure can be exactly on a key-val boundary.
2) Detect whether the key-val pairs are correct. The error message can contain tabs and other characters like " which can break history parsing. Currently a tab is the delimiter for records and a " is used for value encapsulation. Other strings in the history, such as counter names, can contain these characters too.
These problems can make HADOOP-3245 fail.
I would go for having a record delimiter, like a '.' (dot), to detect whether the record is complete or not. Incomplete records should not be parsed and should be ignored. We also need to make sure that the characters used as delimiters (., ", tab) do not occur in a value.
Thoughts?
We also need to make sure that the characters that are used as delimiter (., ", tab) should not occur in a value
I guess what you mean is that they must be escaped. This is similar to what Amarsri has done, but is asking that the escaping be done for all characters that have significance in job history. +1 for that.
I think we need to be consistent and quote all of the separators. We also need to unquote them when they are being parsed.
The existing history parsing code actually incorporates newlines in the values. The parsing problem occurs when the character " is followed by \n, because the value does not allow " inside. Since JobHistory looks for the KEY="VALUE" pattern when parsing keys and values, parsing fails if the value has " and = in it.
The attached patch escapes " and = in the value before logging it. The regular expression for VALUE is modified to allow any character other than a quote, while still allowing escaped quotes. After parsing the value, both " and = are unescaped and returned.
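A rough, self-contained sketch of that escaping scheme; the actual change lives in JobHistory.java, so the regex and helper names here are only illustrative:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedKeyValueDemo {

  // KEY="VALUE", where VALUE is any run of non-quote characters or escaped characters (\x).
  static final Pattern PAIR = Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"");

  // Escape the escape character first, then the characters that confuse the parser.
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("=", "\\=");
  }

  // Single left-to-right pass: drop each '\' and keep the character it protects.
  static String unescape(String value) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c == '\\' && i + 1 < value.length()) {
        out.append(value.charAt(++i));
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String error = "Trouble to get key or value (\"<,>\" substituted by null)";
    String line = "ERROR=\"" + escape(error) + "\"";
    Matcher m = PAIR.matcher(line);
    while (m.find()) {
      System.out.println(m.group(1) + " -> " + unescape(m.group(2))); // recovers the original value
    }
  }
}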
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389822/patch-2403.txt
against trunk revision 693705.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/console
This message is automatically generated.
The patch makes the following changes relative to the earlier patch:
1. Removes the unnecessary change of DELIMITER from String to char.
2. Adds \n at the end of each line when the value spans multiple lines, because BufferedReader.readLine() does not read the line-termination character.
3. The earlier patch was not putting the unescaped value into the parseBuffer; that is fixed in this patch.
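A small illustration of change (2) above: BufferedReader.readLine() strips the line terminator, so a parser that concatenates continuation lines has to put the '\n' back itself, or multi-line values come out joined together.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ReadLineDemo {
  public static void main(String[] args) throws IOException {
    BufferedReader in = new BufferedReader(new StringReader("first line\nsecond line\n"));
    StringBuilder joined = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
      joined.append(line).append('\n'); // re-append the terminator that readLine() dropped
    }
    System.out.print(joined); // the two lines come back on separate lines
  }
}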
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389900/patch-2403.txt
against trunk revision 693705.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/console
This message is automatically generated.
The test failure org.apache.hadoop.hdfs.TestFileCreation.testClientTriggeredLeaseRecovery is not related to the patch.
- In parseHistoryFromFS, for the check whether a line ends with an escaped quote, does it also need to check whether the escape character itself is escaped?
- Also, I think this patch needs a test case to ensure that lines with these special characters are handled correctly.
Updated the patch against trunk, and also added a test case that parses values with special characters and patterns.
In parseHistoryFromFS, the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped
This is already taken care of in StringUtils.
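For context, a self-contained illustration of the concern: if the escape character itself is not escaped, a value that happens to end in '\' makes the closing quote look escaped. (The patch defers this to Hadoop's StringUtils escaping helpers; the code below is plain Java and purely illustrative.)

public class EscapeTheEscapeChar {

  // Escape the escape character first, then the quote.
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  public static void main(String[] args) {
    String value = "value ending in a backslash\\";
    String naive = "KEY=\"" + value.replace("\"", "\\\"") + "\""; // '\' itself not escaped
    String safe  = "KEY=\"" + escape(value) + "\"";
    System.out.println(naive); // ...backslash\"  -> the closing quote looks escaped
    System.out.println(safe);  // ...backslash\\" -> the closing quote is unambiguous
  }
}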
test-patch result on trunk:
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390260/patch-2403.txt
against trunk revision 696427.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/console
This message is automatically generated.
Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/)
Integrated in Hadoop-trunk #618 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/618/)