Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- 2.3.1Addons
- None
- Environment: Java 1.5 and later
Description
When TikaAnnotator is packaged in a PEAR file, calling UIMAFramework.produceAnalysisEngine() fails at the point where Tika asks the system for an XML parser. The exception is:
javax.xml.parsers.FactoryConfigurationError: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found
This happens because the XML parser is now built into Java, but the UIMA classloader (used with PEAR files) finds the older XML parser classes in xml-apis.jar first, and these are incompatible with the current XML interfaces. xml-apis.jar ends up in the PEAR because it is one of the transitive Maven dependencies of Tika 0.7. See this issue for more information:
https://issues.apache.org/jira/browse/TIKA-412
This was fixed in Tika 0.8.
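To confirm which copy of the XML API the classloader picks up, a small diagnostic along the following lines may help. This is only a sketch: the class name XmlParserOrigin and the helper originOf are hypothetical names chosen for illustration, and it relies solely on standard JDK calls. A class loaded from the JDK itself reports no code source, while one loaded from xml-apis.jar reports the jar's location.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical diagnostic, not part of UIMA or Tika.
class XmlParserOrigin {

    // Returns the jar or directory a class was loaded from,
    // or a fixed label when the class comes from the JDK's bootstrap classes.
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "JDK (bootstrap classes)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Where does the JAXP API class itself come from?
        System.out.println("DocumentBuilderFactory API loaded from: "
                + originOf(DocumentBuilderFactory.class));
        try {
            // Which implementation does the factory lookup select?
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            System.out.println("Implementation: " + factory.getClass().getName()
                    + " loaded from: " + originOf(factory.getClass()));
        } catch (FactoryConfigurationError e) {
            // The failure described in this issue surfaces here.
            System.out.println("Factory lookup failed: " + e.getMessage());
        }
    }
}
```

Run standalone, this reflects only the application classpath. To see what the PEAR classloader sees, the same two checks would have to be executed from code inside the PEAR, for example from the annotator's initialization.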
A workaround for UIMA users who want to use TikaAnnotator in PEAR files with Java 1.6 is to exclude xml-apis from the TikaAnnotator dependency in their Maven build, so that it is not packaged into the PEAR file:
<dependency>
  <groupId>org.apache.uima</groupId>
  <artifactId>TikaAnnotator</artifactId>
  <exclusions>
    <exclusion>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
    </exclusion>
  </exclusions>
</dependency>
However, a better fix would be to update the version of Tika used in TikaAnnotator.
Issue Links
- relates to UIMA-2547 Migrate TikaAnnotator to Tika 0.8 (Resolved)