PDFBox / PDFBOX-3861

ClassCastException: org.apache.pdfbox.cos.COSStream cannot be cast to org.apache.pdfbox.cos.COSString

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.6
    • Fix Version/s: 2.0.7, 3.0.0 PDFBox
    • Component/s: PDModel
    • Labels: None

    Description

      I got a ClassCastException in PDFBox code while extracting HTML with Tika:

      org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@60b1fa63
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	... 16 more
      Caused by: java.lang.ClassCastException: org.apache.pdfbox.cos.COSStream cannot be cast to org.apache.pdfbox.cos.COSString
      	at org.apache.pdfbox.cos.COSDictionary.getDate(COSDictionary.java:787)
      	at org.apache.pdfbox.pdmodel.PDDocumentInformation.getCreationDate(PDDocumentInformation.java:212)
      	at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:256)
      	at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:146)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
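
The trace suggests that the document's creation date entry held a stream where a string was expected, and that the lookup cast the value without checking its type. The sketch below reproduces that failure mode and shows a type-checked alternative. It is only an illustration: `Value`, `StringValue`, `StreamValue`, `unsafeGet` and `safeGet` are hypothetical stand-ins, not the PDFBox classes or methods, and this is not presented as the change that resolved the issue.

```java
import java.util.HashMap;
import java.util.Map;

public class DateLookupSketch {
    // Hypothetical stand-ins for a dictionary's value types; NOT the PDFBox COS classes.
    interface Value {}

    static final class StringValue implements Value {
        final String text;
        StringValue(String text) { this.text = text; }
    }

    static final class StreamValue implements Value {}

    // Unchecked cast: throws ClassCastException if the entry is not a StringValue.
    static String unsafeGet(Map<String, Value> dict, String key) {
        StringValue s = (StringValue) dict.get(key);
        return s == null ? null : s.text;
    }

    // Type-checked lookup: returns null for a missing or wrongly typed entry.
    static String safeGet(Map<String, Value> dict, String key) {
        Value v = dict.get(key);
        return (v instanceof StringValue) ? ((StringValue) v).text : null;
    }

    public static void main(String[] args) {
        Map<String, Value> info = new HashMap<>();
        // Malformed entry: a stream where a date string is expected.
        info.put("CreationDate", new StreamValue());

        System.out.println("safeGet: " + safeGet(info, "CreationDate"));
        try {
            unsafeGet(info, "CreationDate");
        } catch (ClassCastException e) {
            System.out.println("unsafeGet failed with " + e.getClass().getSimpleName());
        }
    }
}
```

With the type check, a malformed entry is treated like a missing one, so metadata extraction can continue instead of aborting the whole parse.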
      

          People

            Assignee: Tilman Hausherr (tilman)
            Reporter: Jorge Spinsanti (Giorgy)
            Votes: 0
            Watchers: 3
