Details
New Feature
Status: Open
Major
Resolution: Unresolved
1.1
None
Any
Description
The tika-server API (web service) provides only a subset of the functionality available in the tika-app command-line version. Notable omissions are:
1. Language recognition.
2. Output in various formats (JSON for metadata, XHTML for the extracted text).
Those are the two features most useful to me, but ideally the server should provide all the functionality of the command-line app, using the command line as the model to follow.