PDFBox / PDFBOX-5501

JempBox is slow on XMP with large event histories

Details

    • Type: Wish
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem

    Description

      While looking at the timeouts in a recent run against 8 million PDFs, I found one file whose long processing time was caused by extremely slow parsing of the media management schema.

      If I subclass enough of the classes involved and put a hard limit inside getEventSequenceList(), processing becomes fairly quick.
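
      The hard-limit idea can be sketched in isolation. The following is not JempBox code and does not reproduce its getEventSequenceList(); it is a minimal illustration using only the JDK's DOM API, assuming the event history is stored as rdf:li items inside an rdf:Seq (as XMP ordered arrays are). The class name CappedEventSequence, the methods parseSeq and readCapped, and the maxEvents parameter are all hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of JempBox: reads at most maxEvents
// entries from an rdf:Seq so a huge history cannot dominate run time.
final class CappedEventSequence {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    private CappedEventSequence() {
    }

    // Parses the XML and returns the first rdf:Seq element, or null if none exists.
    static Element parseSeq(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations, since the input is untrusted.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return (Element) doc.getElementsByTagNameNS(RDF_NS, "Seq").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walks the direct children of seq once and stops as soon as
    // maxEvents rdf:li elements have been collected.
    static List<Element> readCapped(Element seq, int maxEvents) {
        List<Element> events = new ArrayList<>();
        for (Node n = seq.getFirstChild();
                n != null && events.size() < maxEvents;
                n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && RDF_NS.equals(n.getNamespaceURI())
                    && "li".equals(n.getLocalName())) {
                events.add((Element) n);
            }
        }
        return events;
    }
}
```

      Walking siblings with getFirstChild()/getNextSibling() visits each node once and lets the loop exit early, so the work done is bounded by the cap rather than by the size of the history. Whether this matches the actual bottleneck in JempBox would need to be confirmed by profiling.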

      I realize that JempBox is not going to be supported going forward, and I understand if this is a "do not fix".

      Attachments

        1. big.xmp.gz
          467 kB
          Tim Allison

            People

              Assignee: Unassigned
              Reporter: Tim Allison (tallison)
              Votes: 0
              Watchers: 4
