  1. PDFBox
  2. PDFBOX-5166

Implement RichMedia annotation

Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PDModel

    Description

      See TIKA-3359. The attached file has an embedded Flash/SWF file. Tika is not currently extracting the embedded file.

      In the debugger, I can see that the annotation is loaded as a PDAnnotationUnknown. In its COSDictionary, the subtype is "RichMedia". If someone has the time, it'd be great to implement this so that we can extract more attachments in Tika. Obviously, others may find it useful too.
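
      To make the request concrete, here is a rough sketch of the lookup a RichMedia implementation would need. It is not PDFBox code: nested Maps stand in for COSDictionary so that the example runs on its own. The key names RichMediaContent, Assets, and Names are not from this report; they are assumed from the RichMedia annotation definition in the Adobe extensions to ISO 32000. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified model: each Map stands in for a COSDictionary, each List for a COSArray.
class RichMediaSketch {

    /** True when the annotation dictionary declares the subtype "RichMedia". */
    static boolean isRichMedia(Map<String, Object> annotation) {
        return "RichMedia".equals(annotation.get("Subtype"));
    }

    /**
     * Collects asset names from RichMediaContent -> Assets -> Names.
     * The Names array alternates between a name and its file specification.
     * Assumption: the name tree is flat (no Kids nodes); real files may nest it.
     */
    static List<String> assetNames(Map<String, Object> annotation) {
        List<String> result = new ArrayList<>();
        if (!isRichMedia(annotation)) {
            return result;
        }
        Object content = annotation.get("RichMediaContent");
        if (!(content instanceof Map)) {
            return result;
        }
        Object assets = ((Map<?, ?>) content).get("Assets");
        if (!(assets instanceof Map)) {
            return result;
        }
        Object names = ((Map<?, ?>) assets).get("Names");
        if (!(names instanceof List)) {
            return result;
        }
        List<?> pairs = (List<?>) names;
        // Step by two: even indexes hold names, odd indexes hold file specifications.
        for (int i = 0; i + 1 < pairs.size(); i += 2) {
            result.add(String.valueOf(pairs.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical file specification for an embedded SWF asset.
        Map<String, Object> fileSpec = Map.of("Type", "Filespec", "F", "movie.swf");
        Map<String, Object> annotation = Map.of(
            "Subtype", "RichMedia",
            "RichMediaContent", Map.of(
                "Assets", Map.of("Names", List.of("movie.swf", fileSpec))));
        System.out.println(assetNames(annotation));
    }
}
```

      In a real implementation, each file specification found this way would still need to be resolved to its embedded stream before Tika could extract the bytes.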

      Many thanks to Tyler Thorsted for the test file and many thanks to @terminalboredom and @beet_keeper.

      Attachments

        1. testFlashInPDF.pdf
          158 kB
          Tim Allison

          People

            Assignee: Unassigned
            Reporter: Tim Allison (tallison)