Description
When the CLI unparses XML with the "xml" infoset type, it runs the following code:
case "xml" => {
  val rdr = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(anyRef.asInstanceOf[Array[Byte]])))
  new XMLTextInfosetInputter(rdr)
}
To create the XMLTextInfosetInputter, we construct an InputStreamReader without specifying an encoding. This means the JVM's default charset (taken from the Java "file.encoding" system property) is used to decode the XML. On machines where that property isn't UTF-8 (e.g. Windows), UTF-8 data in the XML can be decoded incorrectly, which leads to incorrect unparsed data.
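The effect can be reproduced on any machine with a minimal sketch that makes the charset explicit instead of relying on the default. The object and method names here (DefaultCharsetDemo, decode) are hypothetical and are not part of the CLI; ISO-8859-1 stands in for whatever non-UTF-8 default a given machine might have.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStreamReader}
import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical demo object, not CLI code.
object DefaultCharsetDemo {
  // Decode bytes through the same Reader stack as the CLI, but with the
  // charset passed explicitly so the result does not depend on file.encoding.
  def decode(bytes: Array[Byte], cs: Charset): String = {
    val rdr = new BufferedReader(
      new InputStreamReader(new ByteArrayInputStream(bytes), cs))
    try rdr.readLine() finally rdr.close()
  }

  // "caf\u00e9" is "cafe" with an acute accent on the final letter.
  val xml: String = "<name>caf\u00e9</name>"
  val utf8Bytes: Array[Byte] = xml.getBytes(StandardCharsets.UTF_8)

  // Correct: the bytes are decoded with the charset they were written in.
  val asUtf8: String = decode(utf8Bytes, StandardCharsets.UTF_8)

  // Incorrect: the two UTF-8 bytes of U+00E9 (0xC3 0xA9) are decoded as two
  // separate ISO-8859-1 characters, U+00C3 and U+00A9.
  val asLatin1: String = decode(utf8Bytes, StandardCharsets.ISO_8859_1)
}
```

With a non-UTF-8 default charset, the Reader-based code behaves like the asLatin1 case, so the corrupted characters are what get unparsed.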
I believe Woodstox can inspect the XML and determine the encoding from the preamble (the XML declaration), so we should take advantage of that: change the XMLTextInfosetInputter to accept an InputStream in its constructor instead of a Reader, and deprecate the Reader constructor.
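As a rough illustration of the idea, the sketch below hands raw bytes to a StAX parser through the standard javax.xml.stream API, which Woodstox implements, and lets the parser pick the encoding. It is not the actual XMLTextInfosetInputter change: the object and method names (StreamEncodingSketch, allText) are hypothetical, and XMLInputFactory.newInstance() returns whichever StAX implementation is on the classpath, which is Woodstox only if it is present.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import javax.xml.stream.{XMLInputFactory, XMLStreamConstants}

// Hypothetical sketch, not Daffodil code.
object StreamEncodingSketch {
  // Parse from an InputStream so the parser, not a Reader, decides how to
  // decode the bytes, then collect all character data in the document.
  def allText(is: InputStream): String = {
    val xsr = XMLInputFactory.newInstance().createXMLStreamReader(is)
    try {
      val sb = new StringBuilder
      while (xsr.hasNext) {
        if (xsr.next() == XMLStreamConstants.CHARACTERS) sb.append(xsr.getText)
      }
      sb.toString
    } finally xsr.close()
  }

  // The same logical document serialized in two different encodings, each
  // declaring its encoding in the XML declaration.
  val latin1Doc: Array[Byte] =
    "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>"
      .getBytes(StandardCharsets.ISO_8859_1)
  val utf8Doc: Array[Byte] =
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>"
      .getBytes(StandardCharsets.UTF_8)

  val fromLatin1: String = allText(new ByteArrayInputStream(latin1Doc))
  val fromUtf8: String = allText(new ByteArrayInputStream(utf8Doc))
}
```

Both inputs should yield the same text regardless of the "file.encoding" setting, because no Reader with a default charset sits between the bytes and the parser.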