Tika / TIKA-3112

NullPointerException at AbstractPDF2XHTML.extractXMPXFA() when using tika-app GUI

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.24.1
    • Fix Version/s: 1.25
    • Component/s: app

    Description

      I started tika-app-1.24.1 from a Linux terminal with java -jar tika-app-1.24.1.jar -g to open the Tika GUI window. As soon as I opened a PDF file containing CJK characters from the Tika App GUI, I received the error messages below. Note that opening the same file in tika-app-1.23 works as expected, without these errors.

      I also downloaded and compiled Tika master, which I believe is the unreleased tika-2.0.0. The same problem is present there as well.

      In both cases I also received the following warning in the terminal: (java:10218): GLib-GObject-WARNING **: 11:21:32.098: invalid cast from 'GtkToplevelAccessible' to 'JawToplevel'

      Apache Tika was unable to parse the document at /home/tssoon/NetBeansProjects/dbTesting/sample-4.pdf.
      The full exception stack trace is included below:
      org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@4e5c0db8
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
        at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
        at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:84)
        at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:358)
        at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:309)
        at org.apache.tika.gui.TikaGUI.actionPerformed(TikaGUI.java:267)
        at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
        at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348)
        at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
        at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
        at javax.swing.AbstractButton.doClick(AbstractButton.java:376)
        at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:842)
        at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:886)
        at java.awt.Component.processMouseEvent(Component.java:6539)
        at javax.swing.JComponent.processMouseEvent(JComponent.java:3324)
        at java.awt.Component.processEvent(Component.java:6304)
        at java.awt.Container.processEvent(Container.java:2239)
        at java.awt.Component.dispatchEventImpl(Component.java:4889)
        at java.awt.Container.dispatchEventImpl(Container.java:2297)
        at java.awt.Component.dispatchEvent(Component.java:4711)
        at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4904)
        at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4535)
        at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4476)
        at java.awt.Container.dispatchEventImpl(Container.java:2283)
        at java.awt.Window.dispatchEventImpl(Window.java:2746)
        at java.awt.Component.dispatchEvent(Component.java:4711)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:760)
        at java.awt.EventQueue.access$500(EventQueue.java:97)
        at java.awt.EventQueue$3.run(EventQueue.java:709)
        at java.awt.EventQueue$3.run(EventQueue.java:703)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
        at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:84)
        at java.awt.EventQueue$4.run(EventQueue.java:733)
        at java.awt.EventQueue$4.run(EventQueue.java:731)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:730)
        at org.GNOME.Accessibility.AtkWrapper$6.dispatchEvent(AtkWrapper.java:715)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
      Caused by: java.lang.NullPointerException
        at org.apache.tika.parser.pdf.AbstractPDF2XHTML.extractXMPXFA(AbstractPDF2XHTML.java:209)
        at org.apache.tika.parser.pdf.AbstractPDF2XHTML.endDocument(AbstractPDF2XHTML.java:678)
        at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:267)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:96)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:174)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        ... 46 more
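
When triaging a trace like this one, the useful part is the innermost "Caused by" entry rather than the long Swing event-dispatch prefix. The following is a minimal sketch, using only the Java standard library and not taken from Tika, of how a caller might walk the cause chain of a wrapped exception to find the root cause and its top frame. The class name RootCauseDemo, the helper names rootCause and failingFrame, and the plain Exception wrapper are all hypothetical stand-ins chosen for illustration; in this report the wrapper would be the TikaException.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class RootCauseDemo {

    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        // Identity-based set guards against a (rare) cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Throwable current = t;
        while (current.getCause() != null && seen.add(current)) {
            current = current.getCause();
        }
        return current;
    }

    /** Returns the top stack frame of the root cause, or null if its trace is empty. */
    static StackTraceElement failingFrame(Throwable t) {
        StackTraceElement[] frames = rootCause(t).getStackTrace();
        return frames.length > 0 ? frames[0] : null;
    }

    public static void main(String[] args) {
        try {
            try {
                Object missing = null;
                missing.toString(); // stand-in for the null dereference in the report
            } catch (RuntimeException e) {
                // Hypothetical wrapper, mimicking how a parser failure is re-thrown.
                throw new Exception("Unexpected RuntimeException from parser", e);
            }
        } catch (Exception wrapped) {
            System.out.println("Root cause: " + rootCause(wrapped).getClass().getName());
            System.out.println("Failing frame: " + failingFrame(wrapped));
        }
    }
}
```

Applied to the trace above, the same walk would land on the NullPointerException raised in AbstractPDF2XHTML.extractXMPXFA (AbstractPDF2XHTML.java:209), which is the frame to inspect first.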

       

      The document used in the steps above is attached: sample-4.pdf

      Attachments

        1. sample-4.pdf
          442 kB
          Ip Smile

            People

              Assignee: tallison Tim Allison
              Reporter: ipsmile Ip Smile
              Votes: 0
              Watchers: 4
