Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3
    • Fix Version/s: None

    Description

      When the changelog.xml file is created, git commit comments that are already UTF-8 encoded get UTF-8 encoded a second time. For example, with the outputEncoding property set to ISO-8859-1 the output is (shown as an od dump):

      0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
                u   s       t   o   i   m   i   m   a   a   n       m   y   ├
      0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
                Â   s       l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
      

      And when set to UTF-8 the output is:

      0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
                i   m   i   m   a   a   n       m   y   ├   â   ┬   Â   s
      

      With outputEncoding set to UTF-8, the Scandinavian umlauts are garbled: "ö" is written as C3 83 C2 B6. The byte sequence C3 B6, as seen in the ISO-8859-1 run, is the correct UTF-8 encoding of "ö".
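
      The byte values in the two dumps are consistent with UTF-8 bytes being misread as ISO-8859-1 and then encoded as UTF-8 again. The following is a minimal standalone sketch of that effect, not code from the plugin; the class and method names (DoubleEncodingDemo, encodeOnce, encodeTwice, hex) are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    // Formats bytes as space-separated upper-case hex, e.g. "C3 B6".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Correct case: the text is encoded to UTF-8 exactly once.
    static byte[] encodeOnce(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Faulty case: the UTF-8 bytes are misread as ISO-8859-1 characters
    // and the resulting string is encoded to UTF-8 a second time.
    static byte[] encodeTwice(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u00F6"; // the letter "ö"
        System.out.println(hex(encodeOnce(text)));  // C3 B6
        System.out.println(hex(encodeTwice(text))); // C3 83 C2 B6
    }
}
```

      The second line matches the bytes in the UTF-8 dump above (od prints them as the 16-bit words 83c3 b6c2).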

      The ISO-8859-1 setting would be good enough for the site documentation, but because the changelog.xml header then declares ISO-8859-1 while the content is actually UTF-8, the rest of the process fails to handle the umlauts.

      I modified the writeChangelogXml() method of class ChangeLogReport by commenting out the writer change made for MCHANGELOG-86:

              PrintWriter pw = new PrintWriter(new BufferedOutputStream(new FileOutputStream(outputXML)));
              pw.write(changelogXml.toString());
              pw.flush();
              pw.close();
              // MCHANGELOG-86
      //        Writer writer = WriterFactory.newWriter( new BufferedOutputStream( new FileOutputStream( outputXML ) ),
      //                                                 getOutputEncoding() );
      //        writer.write(changelogXml.toString());
      //        writer.flush();
      //        writer.close();
      

      There might be double encoding in the Writer, since a couple of lines above, the changeset is already created with the encoding information:

                  String changeset = changelogSet.toXML(getOutputEncoding());
      

      However, this is just a wild guess, since I did not check the implementation of changelogSet.toXML() or writer.write(). It could also be something different in the version control access, since MCHANGELOG-86 was an SVN issue and here we are using Git.
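
      For comparison, here is a minimal sketch of writing an XML string with one explicit encoding step, so that the bytes on disk and the encoding named in the XML declaration agree. It uses only the JDK and is not the plugin's implementation; ChangelogWriteSketch and writeXml are hypothetical names. Note that the PrintWriter workaround above takes no charset argument and therefore encodes with the JVM's default charset, so its output depends on the environment.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class ChangelogWriteSketch {

    // Writes an XML declaration plus body, encoding the characters exactly
    // once with the named charset. The body is assumed to be an ordinary
    // Java String (already decoded text), not pre-encoded bytes.
    static void writeXml(Path target, String body, String encodingName) throws IOException {
        Charset charset = Charset.forName(encodingName);
        try (Writer writer = new OutputStreamWriter(
                new BufferedOutputStream(Files.newOutputStream(target)), charset)) {
            writer.write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n");
            writer.write(body);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("changelog", ".xml");
        writeXml(out, "<msg>my\u00F6s</msg>", "UTF-8");
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

      If the string passed in already contains misdecoded characters (for example from reading the git output with the wrong charset), this step cannot repair them, so it only rules out the final write as the source of the doubling.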

          People

            Assignee: Unassigned
            Reporter: Jukka Harkki (jht)
            Votes: 0
            Watchers: 4