Description
Currently, all binaries are passed to Tika for text extraction. However, Tika can only parse a binary if a parser supporting its type is present. The extraction logic should therefore parse a binary only if its mimeType is supported by Tika.
With this change, jcr:mimeType would become a mandatory property.
Jackrabbit 2 (JR2) had a similar check [1].
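The check described above could be sketched as follows. This is a minimal illustration, not the actual Oak change: the class `MimeTypeGate` and its method `shouldExtract` are hypothetical names, and the set of supported types is passed in as plain strings. In a real setup that set would presumably be derived from the configured Tika parser (for example via `Parser.getSupportedTypes(ParseContext)`), which is left out here so the sketch has no dependencies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Oak or Tika: decides whether a binary
// should be handed to text extraction based on its jcr:mimeType value.
final class MimeTypeGate {
    // Assumption: lower-case base media types (no parameters) that the
    // configured Tika parser can handle.
    private final Set<String> supportedTypes;

    MimeTypeGate(Set<String> supportedTypes) {
        this.supportedTypes = supportedTypes;
    }

    // Returns true only if a mime type is present and is in the supported set.
    boolean shouldExtract(String mimeType) {
        if (mimeType == null || mimeType.trim().isEmpty()) {
            // A missing jcr:mimeType means the binary is skipped.
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before the lookup.
        int semicolon = mimeType.indexOf(';');
        String base = semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType;
        return supportedTypes.contains(base.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        MimeTypeGate gate = new MimeTypeGate(
                new HashSet<>(Arrays.asList("application/pdf", "text/plain")));
        System.out.println(gate.shouldExtract("application/pdf"));
        System.out.println(gate.shouldExtract("application/x-unknown"));
        System.out.println(gate.shouldExtract(null));
    }
}
```

Returning false for a missing mime type is what makes jcr:mimeType effectively mandatory: a binary without it is never extracted.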
Issue Links
- is related to OAK-2463 Provide support for providing custom Tika config (Closed)