Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Hadoop Flags: Reviewed
Description
When a task fails, the JobTracker writes a line containing the error message to the job history file.
However, the error message can break the history file format and choke the history parser. Here is an example:
MapAttempt TASK_TYPE="MAP" TASKID="tip_200712102254_0001_m_000090" TASK_ATTEMPT_ID="task_200712102254_0001_m_000090_0" TASK_STATUS="FAILED" FINISH_TIME="1197327293253" HOSTNAME="XXXX:50050" ERROR="java.lang.IllegalArgumentException: Trouble to get key or value (<,> substituted by null
. Key XML-Ori:
<Root>
Attachments
- EncodeDecode.java (1 kB) - Amar Kamat
- patch-2403.txt (7 kB) - Amareshwari Sriramadasu
- patch-2403.txt (2 kB) - Amareshwari Sriramadasu
- patch-2403.txt (2 kB) - Amareshwari Sriramadasu
- patch-2403.txt (0.5 kB) - Amareshwari Sriramadasu
Issue Links
- is depended upon by HADOOP-3245 Provide ability to persist running jobs (extend HADOOP-1876) (Closed)
Activity
The JobHistory parser reads the history file line by line and parses each line. If a line ends with '\', it continues reading the next line. So, if an error message spans multiple lines, each line (except the last) should have '\' appended before it is written to the log file. This can be done by replacing "\n" with a backslash followed by "\n".
Thoughts?
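A minimal sketch of this continuation-line convention, with illustrative names only (this is not the actual JobHistory code):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ContinuationLineDemo {

  // Writer side: every line of a multi-line value except the last ends with '\'.
  static String escapeNewlines(String value) {
    return value.replace("\n", "\\\n"); // "\n" -> "\" followed by "\n"
  }

  // Reader side: keep reading while the current physical line ends with '\'.
  static String readLogicalLine(BufferedReader in) throws IOException {
    StringBuilder logical = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
      if (line.endsWith("\\")) {
        logical.append(line, 0, line.length() - 1).append('\n');
      } else {
        logical.append(line);
        break;
      }
    }
    return logical.toString();
  }

  public static void main(String[] args) throws IOException {
    String error = "java.lang.IllegalArgumentException: trouble\n  at Foo.bar(Foo.java:1)";
    BufferedReader in = new BufferedReader(new StringReader(escapeNewlines(error)));
    System.out.println(error.equals(readLogicalLine(in))); // prints true
  }
}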
Following are the solutions that might work here:
1) Encode/decode the strings that are externally generated (like error).
2) Add character count information to the key-val pair. Something like key=length:value. Now read length characters at a time for forming the value.
3) Serialize job history information.
Following are the problems with (2) when error-like strings are stored in JobHistory:
1) The whole line needs to be read before parsing, and hence there is no good way to detect the length at the key-value level.
2) A simple solution would be to use a record-level length. But again there is a problem:
- Currently, the code checks for a '"' at the end before considering the record complete. This can be erroneous, since the error string can itself contain '"', which might lead to premature termination. Also, the splitting of key-val pairs is done based on spaces, so with error-like strings in the history the split will produce wrong key-val pairs. Hence encoding/decoding seems to be a better fix for all these problems.
While encoding, one should make sure that each record is written on a single line. The record parsing algorithm then becomes:
1) Read line. Since all the entries fit in one line, there is no need to look for record end.
2) Split the line based on space.
3) Split the pair on '=' to get key and value. Recover the value if required.
Thoughts?
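As an illustration only (none of these names come from the Hadoop code base), here is a sketch of the parsing algorithm above, assuming the writer has already encoded each value so that a record fits on one line:

import java.util.HashMap;
import java.util.Map;

public class OneLineRecordParser {

  // Stand-in for whatever encoding scheme is chosen; a real implementation
  // would reverse that encoding here.
  static String decodeValue(String encoded) {
    return encoded;
  }

  // 1) the caller reads one full line; 2) split on space; 3) split each pair on '='.
  static Map<String, String> parseRecord(String line) {
    Map<String, String> record = new HashMap<String, String>();
    for (String pair : line.split(" ")) {
      int eq = pair.indexOf('=');
      if (eq < 0) {
        continue; // malformed pair; a real parser might mark the record as corrupt
      }
      record.put(pair.substring(0, eq), decodeValue(pair.substring(eq + 1)));
    }
    return record;
  }

  public static void main(String[] args) {
    System.out.println(parseRecord("TASK_TYPE=MAP TASK_STATUS=FAILED"));
  }
}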
Attaching a sample encode/decode implementation
Data in code :
StringBuffer line = new StringBuffer();
line.append("I m happy");
line.append('\n');
line.append("me 2");
line.append('\n');
line.append("`1234567890-=qwertyuiop[]\\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:\"'ZXCVBNM<>?\r\t\b\n\f");
Data on screen :
Data :
I m happy
me 2
`1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
------------------------------------
Code :
73-32-109-32-104-97-112-112-121-10-109-101-32-50-10-96-49-50-51-52-53-54-55-56-57-48-45-61-113-119-101-114-116-121-117-105-111-112-91-93-92-97-115-100-102-103-104-106-107-108-59-39-122-120-99-118-98-110-109-44-46-47-126-33-64-35-36-37-94-38-42-40-41-95-43-81-87-69-82-84-89-85-73-79-80-123-125-124-65-83-68-70-71-72-74-75-76-58-34-39-90-88-67-86-66-78-77-60-62-63-13-9-8-10-12-
------------------------------------
Decoded :
I m happy
me 2
`1234567890-=qwertyuiop[]\asdfghjkl;'zxcvbnm,./~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:"'ZXCVBNM<>?
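The attachment itself is not inlined here, but from the output above the encoding is simply the dash-separated character codes of the input (73 = 'I', 32 = space, and so on). A small sketch that would produce that kind of output; the names are illustrative and not necessarily those used in EncodeDecode.java:

public class EncodeDecodeSketch {

  // Encode every character as its integer code, separated (and terminated) by '-'.
  static String encode(String data) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < data.length(); i++) {
      out.append((int) data.charAt(i)).append('-');
    }
    return out.toString();
  }

  // Reverse the encoding: parse each code back into a character.
  static String decode(String encoded) {
    StringBuilder out = new StringBuilder();
    for (String code : encoded.split("-")) {
      if (code.length() > 0) {
        out.append((char) Integer.parseInt(code));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String data = "I m happy\nme 2";
    String code = encode(data);
    System.out.println("Code    : " + code);         // 73-32-109-32-104-...
    System.out.println("Decoded : " + decode(code)); // round-trips to the input
  }
}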
Here is a patch with a one-line change that replaces occurrences of "\n" with "\" followed by "\n" (there is no space between the two backslashes; the space appears here only because the editor treats the sequence as a line break). The patch is tested; it now parses history files containing newlines in the value field.
I think we should fix the general problems with history parsing, which are:
1) Detect whether a record is complete or not. The client can fail while writing to the history, and the failure can be exactly on a key-val boundary.
2) Detect whether the key-val pairs are correct. The error message can contain tabs and other characters like " which can break history parsing. Currently a tab is the delimiter for records and a " is used for value encapsulation. Other strings in the history, such as counter names, can contain these characters too.
These problems can make HADOOP-3245 fail.
I would go for having a record delimiter, like a '.' (dot), to detect whether the record is complete or not. Incomplete records should not be parsed and should be ignored. We also need to make sure that the characters used as delimiters (., ", tab) do not occur in a value.
Thoughts?
We also need to make sure that the characters that are used as delimiter (., ", tab) should not occur in a value
I guess what you mean is that they must be escaped. This is similar to what Amarsri has done, but is asking that the escaping be done for all characters that have significance in job history. +1 for that.
I think we need to be consistent and quote all of the separators. We also need to unquote them when they are being parsed.
The existing history parsing code actually incorporates newlines in the values. The parsing problem occurs when the character " is followed by \n, because the value does not allow " inside. Since JobHistory looks for the KEY="VALUE" pattern when parsing keys and values, parsing fails if the value has " and = in it.
The attached patch escapes " and = in the value before logging it. The regular expression for VALUE is modified to allow any character other than a quote, while still allowing escaped quotes. After parsing the value, both " and = are unescaped and returned.
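A rough, self-contained sketch of that escaping scheme; the actual change lives in JobHistory.java, so the regex and helper names here are only illustrative:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedKeyValueDemo {

  // KEY="VALUE", where VALUE is any run of non-quote characters or escaped characters (\x).
  static final Pattern PAIR = Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"");

  // Escape the escape character first, then the characters that confuse the parser.
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("=", "\\=");
  }

  // Single left-to-right pass: drop each '\' and keep the character it protects.
  static String unescape(String value) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c == '\\' && i + 1 < value.length()) {
        out.append(value.charAt(++i));
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String error = "Trouble to get key or value (\"<,>\" substituted by null)";
    String line = "ERROR=\"" + escape(error) + "\"";
    Matcher m = PAIR.matcher(line);
    while (m.find()) {
      System.out.println(m.group(1) + " -> " + unescape(m.group(2))); // recovers the original value
    }
  }
}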
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389822/patch-2403.txt
against trunk revision 693705.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3233/console
This message is automatically generated.
The patch makes the following changes relative to the earlier patch:
1. Removes the unnecessary change of DELIMITER from String to char.
2. Adds \n at the end of each line when the value spans multiple lines, because BufferedReader.readLine() does not read the line-termination character.
3. The earlier patch was not putting the unescaped value into the parseBuffer; that is fixed in this patch.
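A small illustration of change (2) above: BufferedReader.readLine() strips the line terminator, so a parser that concatenates continuation lines has to put the '\n' back itself, or multi-line values come out joined together.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ReadLineDemo {
  public static void main(String[] args) throws IOException {
    BufferedReader in = new BufferedReader(new StringReader("first line\nsecond line\n"));
    StringBuilder joined = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
      joined.append(line).append('\n'); // re-append the terminator that readLine() dropped
    }
    System.out.print(joined); // the two lines come back on separate lines
  }
}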
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389900/patch-2403.txt
against trunk revision 693705.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3243/console
This message is automatically generated.
The test failure org.apache.hadoop.hdfs.TestFileCreation.testClientTriggeredLeaseRecovery is not related to the patch.
- In parseHistoryFromFS, for the check whether a line ends with an escaped quote, does it also need to check whether the escape character itself is escaped?
- Also, I think this patch needs a test case to ensure that lines with these special characters are handled correctly.
Updated the patch against trunk, and also added a test case that parses values with special characters and patterns.
In parseHistoryFromFS, the check if a line ends with an escaped quote, does it also require to check if the escape character itself needs to be escaped
This is already taken care of in StringUtils.
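For context, a self-contained illustration of the concern: if the escape character itself is not escaped, a value that happens to end in '\' makes the closing quote look escaped. (The patch defers this to Hadoop's StringUtils escaping helpers; the code below is plain Java and purely illustrative.)

public class EscapeTheEscapeChar {

  // Escape the escape character first, then the quote.
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  public static void main(String[] args) {
    String value = "value ending in a backslash\\";
    String naive = "KEY=\"" + value.replace("\"", "\\\"") + "\""; // '\' itself not escaped
    String safe  = "KEY=\"" + escape(value) + "\"";
    System.out.println(naive); // ...backslash\"  -> the closing quote looks escaped
    System.out.println(safe);  // ...backslash\\" -> the closing quote is unambiguous
  }
}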
test-patch result on trunk:
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12390260/patch-2403.txt
against trunk revision 696427.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3292/console
This message is automatically generated.
Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/)
Integrated in Hadoop-trunk #618 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/618/)