Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Won't Fix
Description
Currently, calling toString("UTF-8") creates a String object using Java's own conversion code.
That code does not handle some rare but still valid UTF-8 byte sequences properly. Hadoop has its own UTF-8
conversion code for string serialization/deserialization; the same code should be used here.
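
To illustrate how two UTF-8 decoding paths can disagree on the same bytes, here is a minimal JDK-only sketch. It does not reproduce the specific byte sequences this issue refers to (the report does not list them) and it does not call Hadoop's converter. The class name Utf8DecodeSketch and its two helper methods are hypothetical, introduced only for this example. It contrasts the lenient String constructor, which silently substitutes U+FFFD for input it cannot decode, with a CharsetDecoder configured to report errors.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of Hadoop.
public class Utf8DecodeSketch {

    // Lenient path: the String constructor replaces undecodable input
    // with U+FFFD instead of signalling an error.
    public static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict path: a decoder set to REPORT throws on undecodable input,
    // so a bad conversion cannot pass unnoticed.
    public static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // 'a' followed by a truncated two-byte sequence (assumed sample input).
        byte[] sample = {(byte) 'a', (byte) 0xC3};

        System.out.println("lenient length: " + decodeLenient(sample).length());
        try {
            decodeStrict(sample);
            System.out.println("strict: decoded");
        } catch (CharacterCodingException e) {
            System.out.println("strict: rejected input");
        }
    }
}
```

The point of the contrast is that a caller of the lenient path has no way to tell that the resulting String no longer round-trips to the original bytes, which is why serialization code generally wants one well-defined converter on both the write and read sides.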