Details
Description
Parsing the file http://media.opentur.it/WEB/CHANNELS/COCKTAILVIAGGI/CMS/PDF/Irlanda%202009%2028-51pag.pdf the call to Tika "parse" fails with the followinf stack trace:
java.io.IOException: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.pdf.PDFParser@1578aab
at com.travelport.indexing.documentparser.GenericDocumentParserTikaImpl.parse(GenericDocumentParserTikaImpl.java:143)
at com.travelport.indexing.documentparser.GenericDocumentParserTikaImpl.main(GenericDocumentParserTikaImpl.java:306)
Caused by: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.pdf.PDFParser@1578aab
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:126)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:101)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:114)
at com.travelport.indexing.documentparser.GenericDocumentParserTikaImpl.parse(GenericDocumentParserTikaImpl.java:69)
... 1 more
Caused by: org.apache.pdfbox.exceptions.WrappedIOException
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:237)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:841)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:808)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:53)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:120)
... 4 more
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at org.apache.pdfbox.pdfparser.PDFXrefStreamParser.parse(PDFXrefStreamParser.java:115)
at org.apache.pdfbox.cos.COSDocument.parseXrefStreams(COSDocument.java:538)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:203)
... 8 more