Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
- 2.0.0
- None
Description
This issue tracks fixes for problems that came up as part of TIKA-1285 (Upgrade to PDFBox 2.0.0 when available), mainly:
- new exceptions compared to PDFBox 1.8.x
- regressions in text extraction
- lower quality text extraction
Each task or bug arising from these should be tracked in its own issue.
Attachments
Issue Links
- depends upon
  - PDFBOX-3051 COSArray.getObject() incorrect handling of indirect reference to COSNull (Closed)
  - PDFBOX-3059 java.io.IOException: Error: Unknown annotation type COSNull{} (Closed)