Tika / TIKA-4130

Conflict with duplicate org/w3c and org/xml packages in tika-app jar


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0, 2.9.0
    • Fix Version/s: 3.0.0-BETA
    • Component/s: None
    • Labels: None
    • Environment: Java 8 and Java 11

    Description

      While migrating from Apache Tika 1.20 to 2.7, I encountered a LinkageError.

      We use a child-first class loader to isolate the tika-app JAR from the rest of the classpath when parsing files.

      The error message is:
      java.lang.LinkageError: loader constraint violation: when resolving overridden method "org.apache.xerces.jaxp.DocumentBuilderImpl.newDocument()Lorg/w3c/dom/Document;" the class loader (instance of org/xeustechnologies/jcl/JarClassLoader) of the current class, org/apache/xerces/jaxp/DocumentBuilderImpl, and its superclass loader (instance of <bootloader>), have different Class objects for the type org/w3c/dom/Document used in the signature.

      On analysis, the conflict is between the bootstrap class loader (rt.jar) and our child-first class loader: both supply their own version of classes in the org/w3c package, such as Node.class, so the same type name resolves to two different Class objects. The classes in the org/xml package cause the same problem.
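
The duplicate definitions described above can be confirmed with a small probe that reports which class loader defines each type. This is only a sketch: `LoaderProbe` and `describeLoader` are hypothetical names, not part of Tika or JCL, and it assumes the child-first loader has been installed as the thread context class loader before the probe runs.

```java
// Hypothetical diagnostic, not part of Tika or JCL.
class LoaderProbe {

    /** Names the defining loader of a class; a null loader means the bootstrap loader. */
    static String describeLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "<bootstrap>" : loader.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String[] names = {
            "org.w3c.dom.Document",
            "org.w3c.dom.Node",
            "org.xml.sax.XMLReader",
            "javax.xml.parsers.DocumentBuilder"
        };
        // Assumption: the isolating loader is the thread context class loader.
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        for (String name : names) {
            // Load without initializing; only the defining loader matters here.
            Class<?> type = Class.forName(name, false, context);
            System.out.println(name + " -> " + describeLoader(type));
        }
    }
}
```

If an org.w3c or org.xml type is reported against the isolating loader while javax.xml.parsers.DocumentBuilder is reported against the bootstrap loader, that matches the mismatch named in the LinkageError.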

      The parsing functionality worked correctly after removing the following packages from the tika-app JAR:
      1. org/w3c/**
      2. org/xml/**
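
A possible alternative to stripping these packages from the JAR is to have the child-first loader hand them to its parent. The sketch below is built on `java.net.URLClassLoader`, not on the JCL `JarClassLoader` we actually use, so it only illustrates the idea. `ChildFirstClassLoader`, `isParentFirst`, and the prefix list are assumptions made for the example.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical child-first loader; the prefix list is an assumption for illustration.
class ChildFirstClassLoader extends URLClassLoader {

    // Packages that should keep coming from the parent (ultimately the JDK).
    private static final String[] PARENT_FIRST_PREFIXES = {"java.", "org.w3c.", "org.xml."};

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    /** Returns true when the class name falls under a parent-first package prefix. */
    static boolean isParentFirst(String className) {
        for (String prefix : PARENT_FIRST_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (isParentFirst(name)) {
                    // Standard delegation: ask the parent before looking in our own URLs.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Child-first: look in our own URLs before asking the parent.
                        loaded = findClass(name);
                    } catch (ClassNotFoundException e) {
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

With this arrangement the copies of org.w3c and org.xml bundled in the JAR are never defined by the child loader as long as the parent can supply them, so both sides of the Xerces class hierarchy see the same Document class.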

      We are currently on Java 8 and would appreciate guidance on how to resolve this.

            People

              Assignee: Unassigned
              Reporter: RaahulUmapathy (raahul.u)