Details
Bug
Status: Resolved
Major
Resolution: Fixed
Version 2.1
None
None
All
Description
When whitespace stripping is specified, the parser does not detect XML entities such as &amp; and strips the whitespace following each entity.
For example,
<root>dog &amp; cat</root>
is parsed as
<root>dog &amp;cat</root>
The cause of the problem is the stripLeft() method in the org.apache.xmlbeans.impl.store.CharUtil class.
Below is a fixed version of the method that detects the ';' character after an entity, which indicates that the whitespace is significant and must be preserved. Note that this code does not fix the remaining case, where the iteration is a for loop.
public Object stripLeft ( Object src, int off, int cch )
{
    assert isValid( src, off, cch );

    if (cch > 0)
    {
        if (src instanceof char[])
        {
            char[] chars = (char[]) src;

            // Fix for &amp; etc: keep whitespace that directly follows ';'.
            // The off > 0 guard avoids reading before the start of the array.
            while ( cch > 0 && isWhiteSpace( chars[ off ] ) && !( off > 0 && chars[ off - 1 ] == ';' ) )
                { cch--; off++; }
        }
        else if (src instanceof String)
        {
            String s = (String) src;

            // Fix for &amp; etc, as above
            while ( cch > 0 && isWhiteSpace( s.charAt( off ) ) && !( off > 0 && s.charAt( off - 1 ) == ';' ) )
                { cch--; off++; }
        }
        else
        {
            // Other source types: the existing for-loop branch is omitted here; this fix does not cover it.
        }
    }

    if (cch == 0)
        { _offSrc = 0; _cchSrc = 0; return null; }

    _offSrc = off;
    _cchSrc = cch;

    return src;
}
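To make the effect of the ';' check easier to see in isolation, here is a minimal standalone sketch of the String branch only. It is not the XMLBeans code: the class StripLeftSketch, the method stripLeftOffset, and the simplified isWhiteSpace are hypothetical names introduced for illustration, and the input is assumed to be the raw text with the entity still in place.

```java
// Standalone illustration; StripLeftSketch is a hypothetical class, not part of XMLBeans.
class StripLeftSketch {
    // Simplified stand-in for CharUtil.isWhiteSpace (assumption): space, tab, LF, CR.
    static boolean isWhiteSpace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    // Returns the offset of the first character kept after stripping leading
    // whitespace from s[off, off + cch). Whitespace that directly follows ';'
    // is treated as significant and is not stripped.
    static int stripLeftOffset(String s, int off, int cch) {
        while (cch > 0 && isWhiteSpace(s.charAt(off))
                && !(off > 0 && s.charAt(off - 1) == ';')) {
            cch--;
            off++;
        }
        return off;
    }

    public static void main(String[] args) {
        String text = "dog &amp; cat";
        // The segment " cat" starts at offset 9, right after the ';' at offset 8.
        System.out.println(stripLeftOffset(text, 9, 4));    // prints 9: the space is kept
        // Ordinary leading whitespace is still stripped.
        System.out.println(stripLeftOffset("a   b", 1, 4)); // prints 4
    }
}
```

One consequence of this approach is that whitespace after any literal ';' is preserved, not only whitespace after an entity, so the check is a heuristic rather than true entity detection.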