Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
-
None
-
None
Description
When attempting to write an XML document containing valid UTF-16 surrogate pairs, an error occurs during validation, which causes the write to fail.
This issue appears to have been introduced with https://issues.apache.org/jira/browse/XERCESC-1854 in the following commit: http://svn.apache.org/viewvc/xerces/c/trunk/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp?r1=768978&r2=1226891.
I have supplied a reproducer and a potential patch. The string validator should be responsible for determining whether a code unit is part of a surrogate pair. However, I would also argue that this may not be the right place to validate the string, since a failure at this point leaves the output document in an inconsistent (half-written) state.
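
To illustrate the behaviour I would expect, here is a minimal standalone sketch. It is not the attached patch and does not use any Xerces-C++ API: the names `isHighSurrogate`, `isLowSurrogate`, `isXml10Char`, `isValidXmlUtf16` and `appendIfValid` are hypothetical, and `char16_t` merely stands in for a 16-bit `XMLCh`. It assumes XML 1.0 character rules. A well-formed surrogate pair is combined into a single code point before the range check, and the whole string is validated before anything is emitted, so a failure cannot leave partial output behind.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not Xerces-C++ functions.
inline bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// True if cp matches the XML 1.0 "Char" production.
inline bool isXml10Char(char32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Validates a UTF-16 buffer: surrogate pairs are decoded to one code point,
// lone surrogates are rejected, and every code point must be an XML 1.0 Char.
bool isValidXmlUtf16(const char16_t* s, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        char32_t cp = s[i];
        if (isHighSurrogate(s[i])) {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= len || !isLowSurrogate(s[i + 1])) return false;
            cp = 0x10000 + ((char32_t(s[i]) - 0xD800) << 10)
                         + (char32_t(s[i + 1]) - 0xDC00);
            ++i;  // consume the low surrogate as well
        } else if (isLowSurrogate(s[i])) {
            return false;  // low surrogate without a preceding high surrogate
        }
        if (!isXml10Char(cp)) return false;
    }
    return true;
}

// Validates the complete text first and only then appends it, so the
// output is either fully written or left untouched.
bool appendIfValid(std::u16string& out, const std::u16string& text) {
    if (!isValidXmlUtf16(text.data(), text.size())) return false;
    out += text;
    return true;
}
```

With this shape, a string such as `a`, U+1F600 (encoded as 0xD83D 0xDE00), `b` is accepted, while a lone 0xD83D or 0xDE00 is rejected before any output is produced.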