Details
- Bug
- Status: Closed
- Critical
- Resolution: Duplicate
- 0.5.0
- None
- None
Description
The streaming code internally reads the input data into a UTF8 object. As a result, input longer than about 21000 characters is truncated before it is shipped to the mapper. The user gets no notice of this, except possibly in the logs on the individual task machines, which people would not normally read for an apparently successful job.
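A possible explanation for the "about 21000" figure, offered as an assumption rather than something this report states: if the string is stored behind an unsigned 16-bit byte-length field, and each character is budgeted at up to 3 bytes, the safe character count is 65535 / 3 = 21845. The sketch below models that arithmetic and the silent truncation. All names in it (`MAX_CHARS`, `store_record`, `store_record_strict`) are hypothetical and are not the Hadoop API.

```python
from typing import Tuple

# Hypothetical model of a string container with a 16-bit length prefix.
# Assumption: at most 3 encoded bytes are budgeted per character.
MAX_ENCODED_BYTES = 0xFFFF   # largest value an unsigned 16-bit field can hold
MAX_BYTES_PER_CHAR = 3       # assumed worst-case bytes per character
MAX_CHARS = MAX_ENCODED_BYTES // MAX_BYTES_PER_CHAR


def store_record(line: str) -> Tuple[str, bool]:
    """Model the reported behaviour: cut the record and carry on.

    Returns the stored text and a flag saying whether data was dropped.
    """
    if len(line) > MAX_CHARS:
        return line[:MAX_CHARS], True
    return line, False


def store_record_strict(line: str) -> str:
    """Alternative behaviour: fail loudly instead of shipping partial data."""
    if len(line) > MAX_CHARS:
        raise ValueError(
            "record of %d characters exceeds the limit of %d" % (len(line), MAX_CHARS)
        )
    return line
```

With this model, a 30000-character record comes back from `store_record` cut to `MAX_CHARS` characters while the job still appears to succeed, which matches the symptom described. The strict variant shows the kind of visible failure a user would need in order to notice the problem.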
Issue Links
- duplicates HADOOP-413 streaming: replace class UTF8 with class Text (Closed)