PDFBox / PDFBOX-1743

OutOfMemoryError in fontbox

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.8.3, 2.0.0
    • Component/s: FontBox, PDModel

    Description

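      When indexing a PDF document in Solr, PDFBox (FontBox) throws java.lang.OutOfMemoryError: Java heap space. The failure occurs while FontBox parses an embedded CFF font (IndexData.initData, called from CFFParser.readIndexData) during text extraction through Tika.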

      This is the stack trace:

      SEVERE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
          at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
          at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
          at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
          at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
      Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
          at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
          at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
          ... 3 more
      Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:542)
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
          ... 5 more
      Caused by: java.lang.OutOfMemoryError: Java heap space
          at org.apache.fontbox.cff.IndexData.initData(IndexData.java:95)
          at org.apache.fontbox.cff.CFFParser.readIndexData(CFFParser.java:152)
          at org.apache.fontbox.cff.CFFParser.parse(CFFParser.java:103)
          at org.apache.pdfbox.pdmodel.font.PDType1CFont.load(PDType1CFont.java:322)
          at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:104)
          at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:162)
          at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
          at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:203)
          at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:604)
          at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
          at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
          at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
          at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
          at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
          at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:455)
          at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:379)
          at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:335)
          at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:66)
          at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:153)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
          at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
          at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:127)
          at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:472)
          ... 6 more
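
      The innermost frames show the heap being exhausted while the CFF parser reads an INDEX structure. The report does not state the root cause, so the following is only a sketch of one defensive pattern, not the fix that was applied in PDFBox: before allocating, check the sizes an INDEX declares (a 16-bit count, a 1-byte offSize, then count + 1 offsets, as laid out in the CFF specification) against the bytes actually available. The class CffIndexGuard, its nested Index type, and the method readIndex are hypothetical names that do not exist in FontBox.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of FontBox: reads a CFF INDEX defensively.
final class CffIndexGuard {

    /** Hypothetical result type: count + 1 one-based offsets plus the raw object data. */
    static final class Index {
        final int[] offsets;
        final byte[] data;

        Index(int[] offsets, byte[] data) {
            this.offsets = offsets;
            this.data = data;
        }
    }

    /** Reads one INDEX from buf, validating every declared size before allocating. */
    static Index readIndex(ByteBuffer buf) throws IOException {
        if (buf.remaining() < 2) {
            throw new IOException("Truncated INDEX: missing count");
        }
        int count = buf.getShort() & 0xFFFF;            // Card16 count, big-endian
        if (count == 0) {
            return new Index(new int[0], new byte[0]);   // an empty INDEX is just the count field
        }
        if (buf.remaining() < 1) {
            throw new IOException("Truncated INDEX: missing offSize");
        }
        int offSize = buf.get() & 0xFF;                  // Card8 offSize, valid range 1..4
        if (offSize < 1 || offSize > 4) {
            throw new IOException("Invalid offSize: " + offSize);
        }
        long offsetBytes = (long) (count + 1) * offSize;
        if (offsetBytes > buf.remaining()) {
            throw new IOException("Offset array exceeds available bytes");
        }
        long dataLimit = buf.remaining() - offsetBytes;  // bytes left for object data
        int[] offsets = new int[count + 1];
        long previous = 1;                               // offsets are one-based and non-decreasing
        for (int i = 0; i <= count; i++) {
            long value = 0;
            for (int b = 0; b < offSize; b++) {
                value = (value << 8) | (buf.get() & 0xFF);
            }
            if ((i == 0 && value != 1) || value < previous || value - 1 > dataLimit) {
                throw new IOException("Invalid offset at entry " + i + ": " + value);
            }
            offsets[i] = (int) value;                    // safe: bounded by dataLimit
            previous = value;
        }
        byte[] data = new byte[offsets[count] - 1];      // size already checked against dataLimit
        buf.get(data);
        return new Index(offsets, data);
    }

    public static void main(String[] args) throws IOException {
        // One object, "hi": count=1, offSize=1, offsets {1, 3}.
        byte[] valid = {0, 1, 1, 1, 3, 'h', 'i'};
        System.out.println("data bytes: " + readIndex(ByteBuffer.wrap(valid)).data.length);

        // Declares 65535 objects with 4-byte offsets but supplies only 4 more bytes.
        byte[] corrupt = {(byte) 0xFF, (byte) 0xFF, 4, 0, 0, 0, 1};
        try {
            readIndex(ByteBuffer.wrap(corrupt));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      With checks of this kind, a font program whose declared sizes do not match the embedded stream fails with an IOException that the caller can handle per document, rather than an OutOfMemoryError that aborts the whole import.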

      Attachments

        1. document1.pdf
          373 kB
          Stanea Paul

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Stanea Paul (paul.stanea)
            Votes: 0
            Watchers: 4
