Details
Type: New Feature
Status: Open
Priority: Minor
Resolution: Unresolved
Description
See TIKA-3359. The attached file has an embedded Flash/SWF file. Tika is not currently extracting the embedded file.
In the debugger, I can see that the annotation is loaded as a PDAnnotationUnknown. In its COSDictionary, I can see that the subtype is "RichMedia". If someone has the time, it'd be great to implement this so that we can extract more attachments in Tika. Obviously, others may find this useful too.
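To make the request more concrete, here is a rough sketch of the dictionary walk that extraction would probably need. It is not PDFBox code: plain `Map`/`List` objects stand in for `COSDictionary`/`COSArray`, and the key names (`RichMediaContent`, `Assets`, `Names`, `Kids`, `EF`, `F`) are my reading of Adobe's RichMedia extension, not something verified against the attached file. The class name, the helper names and the `movie.swf` sample are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RichMediaWalk {

    // Returns asset name -> embedded bytes for an annotation whose Subtype is "RichMedia".
    // Any other annotation yields an empty map.
    static Map<String, byte[]> collectAssets(Map<String, Object> annotation) {
        Map<String, byte[]> out = new LinkedHashMap<>();
        if (!"RichMedia".equals(annotation.get("Subtype"))) {
            return out;
        }
        Object content = annotation.get("RichMediaContent");
        if (content instanceof Map) {
            Object assets = ((Map<?, ?>) content).get("Assets");
            if (assets instanceof Map) {
                walkNameTree((Map<?, ?>) assets, out);
            }
        }
        return out;
    }

    // Assumed name-tree layout: "Names" holds [name, fileSpec, name, fileSpec, ...],
    // and "Kids" holds child nodes of the same shape.
    static void walkNameTree(Map<?, ?> node, Map<String, byte[]> out) {
        Object names = node.get("Names");
        if (names instanceof List) {
            List<?> pairs = (List<?>) names;
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                Object name = pairs.get(i);
                Object spec = pairs.get(i + 1);
                if (name instanceof String && spec instanceof Map) {
                    // Assumed file specification layout: EF -> F -> embedded stream.
                    Object ef = ((Map<?, ?>) spec).get("EF");
                    if (ef instanceof Map) {
                        Object stream = ((Map<?, ?>) ef).get("F");
                        if (stream instanceof byte[]) {
                            out.put((String) name, (byte[]) stream);
                        }
                    }
                }
            }
        }
        Object kids = node.get("Kids");
        if (kids instanceof List) {
            for (Object kid : (List<?>) kids) {
                if (kid instanceof Map) {
                    walkNameTree((Map<?, ?>) kid, out);
                }
            }
        }
    }

    // Hypothetical stand-in for what the debugger shows on the annotation.
    static Map<String, Object> sampleAnnotation() {
        Map<String, Object> ef = new LinkedHashMap<>();
        ef.put("F", new byte[] {'F', 'W', 'S'});
        Map<String, Object> fileSpec = new LinkedHashMap<>();
        fileSpec.put("EF", ef);

        List<Object> names = new ArrayList<>();
        names.add("movie.swf");
        names.add(fileSpec);
        Map<String, Object> assets = new LinkedHashMap<>();
        assets.put("Names", names);

        Map<String, Object> content = new LinkedHashMap<>();
        content.put("Assets", assets);

        Map<String, Object> annotation = new LinkedHashMap<>();
        annotation.put("Subtype", "RichMedia");
        annotation.put("RichMediaContent", content);
        return annotation;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, byte[]> e : collectAssets(sampleAnnotation()).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```

A real implementation would presumably do the same walk on the annotation's COSDictionary and hand each embedded stream to whatever already handles file attachments. The sketch does not guard against cyclic Kids references, which a parser for untrusted PDFs would need to do.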
Many thanks to Tyler Thorsted for the test file and many thanks to @terminalboredom and @beet_keeper.