Description
Background:
We use maven-site-plugin to generate our documentation site (the source is xdoc) and noticed that non-ASCII characters, such as Japanese or Chinese, in a table caption are not displayed correctly in the output HTML files.
During generation, these characters are encoded as numeric character references (e.g. '和' becomes '&#x548c;') and are normally displayed correctly in the browser.
However, in a table caption, the leading '&' of the reference is escaped a second time into '&amp;'.
So, for example, the actual output becomes '&amp;#x548c;' while the expected output is '&#x548c;'.
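The double escaping described above can be reproduced outside Doxia with a minimal sketch. Both helper methods are hypothetical stand-ins, not Doxia code: encodeNonAscii assumes the first pass emits hexadecimal numeric character references, and escapeMarkup assumes the second pass simply escapes '&', '<' and '>'.

```java
public class DoubleEscapeDemo {
    /** Hypothetical first pass: replace each non-ASCII code point with a hex numeric reference. */
    public static String encodeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                sb.append("&#x").append(Integer.toHexString(cp)).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    /** Hypothetical second pass: escape markup characters, as a text-oriented writer would. */
    public static String escapeMarkup(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String once = encodeNonAscii("和");
        String twice = escapeMarkup(once);
        System.out.println(once);  // prints &#x548c;
        System.out.println(twice); // prints &amp;#x548c;
    }
}
```

A browser renders the first string as '和', but shows the second one literally as '&#x548c;', which matches the symptom in the caption.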
To verify the issue, modify org.apache.maven.doxia.sink.impl.XhtmlBaseSinkTest.testTableCaption() as follows.
    public void testTableCaption()
    {
        try
        {
            sink = new XhtmlBaseSink( writer );
            sink.table();
            sink.tableRows( null, false );
            sink.tableCaption( attributes );
            // Insert '&'
            sink.text( "cap&tion" );
            sink.tableCaption_();
            sink.tableRows_();
            sink.table_();
        }
        finally
        {
            sink.close();
        }

        // the inserted '&' will become '&amp;amp;' instead of '&amp;'
        assertEquals( "<table border=\"0\" class=\"bodyTable\">"
            + "<caption style=\"bold\">cap&amp;tion</caption></table>", writer.toString() );
    }
Not sure if this is a proper fix, but I modified the following line in org.apache.maven.doxia.sink.impl.XhtmlBaseSink.write(String) ...
this.tableCaptionXMLWriterStack.getLast().writeText( unifyEOLs( text ) );
... to ...
this.tableCaptionXMLWriterStack.getLast().writeMarkup( unifyEOLs( text ) );
... and the issue was resolved without breaking existing tests.
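To illustrate why the one-line change helps, here is a minimal sketch with a hypothetical CaptionWriterSketch class. It only mimics the assumed difference between the two calls (writeText escapes its argument, writeMarkup appends it unchanged) and is not the actual XML writer used by XhtmlBaseSink.

```java
public class CaptionWriterSketch {
    private final StringBuilder out = new StringBuilder();

    /** Assumed behavior of a text write: escape markup characters before appending. */
    public void writeText(String text) {
        out.append(text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;"));
    }

    /** Assumed behavior of a markup write: append the input unchanged. */
    public void writeMarkup(String markup) {
        out.append(markup);
    }

    public String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        // The caption text has already been escaped once before it reaches the writer.
        String alreadyEscaped = "cap&amp;tion";

        CaptionWriterSketch viaText = new CaptionWriterSketch();
        viaText.writeText(alreadyEscaped);

        CaptionWriterSketch viaMarkup = new CaptionWriterSketch();
        viaMarkup.writeMarkup(alreadyEscaped);

        System.out.println(viaText.result());   // prints cap&amp;amp;tion
        System.out.println(viaMarkup.result()); // prints cap&amp;tion
    }
}
```

The trade-off in this sketch is that the markup path relies on every caller having escaped its input first. Whether that holds for all callers of write(String) is something a reviewer would need to confirm.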
I have attached the above modification as a patch.
Please let me know if you prefer a PR on GitHub or need more info.
Thanks in advance,
Iwao