Description
We are using the Tika Java library to parse a batch of documents in various formats. The exception below appears regularly in our logs for certain documents. Any suggestions on how to fix it would be very useful. On initial investigation, it looks like a bug caused by a mismatch between the ASM level that XHTMLClassVisitor expects and the ASM version in the tika-parsers pom: the failure occurs when ASM calls visitNestMember on the visitor, which reports that the feature requires ASM7.
Failed to parse the document. org.apache.tika.exception.TikaException: Failed to parse a Java class
at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:66)
at org.apache.tika.parser.asm.ClassParser.parse (ClassParser.java:51)
at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse (AutoDetectParser.java:143)
at com.askscio.beam.docbuilder.processor.parsers.GenericParser.parse (GenericParser.java:55)
<snipped>
Caused by: java.lang.UnsupportedOperationException: This feature requires ASM7
at org.objectweb.asm.ClassVisitor.visitNestMember (ClassVisitor.java:236)
at org.objectweb.asm.ClassReader.accept (ClassReader.java:660)
at org.objectweb.asm.ClassReader.accept (ClassReader.java:400)
at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:61)
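To help narrow down which documents trigger the trace above, here is a minimal sketch of a pre-check that reads the class file header. It assumes the failing inputs are Java class files compiled for Java 11 or later (class file major version 55), which is where nest members were introduced. The class name ClassFileVersionCheck and its methods are hypothetical helpers, not part of Tika or ASM, and the sketch uses only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper (not part of Tika or ASM): inspects the first 8 bytes
// of a document to see whether it is a Java class file and which class file
// version it declares.
final class ClassFileVersionCheck {
    // Every Java class file starts with this 4-byte magic number.
    static final int CLASS_MAGIC = 0xCAFEBABE;
    // Class file major version 55 corresponds to Java 11.
    static final int JAVA_11_MAJOR = 55;

    // Returns the class file major version, or -1 if the bytes do not
    // start with a class file header.
    static int majorVersion(byte[] header) {
        if (header == null || header.length < 8) {
            return -1;
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            if (in.readInt() != CLASS_MAGIC) {
                return -1;
            }
            in.readUnsignedShort();        // minor version, not needed here
            return in.readUnsignedShort(); // major version
        } catch (IOException e) {
            return -1;
        }
    }

    // Assumption: class files at Java 11 level or newer may carry nest
    // member information, which is what the visitor rejects in the trace.
    static boolean mayNeedAsm7(byte[] header) {
        return majorVersion(header) >= JAVA_11_MAJOR;
    }
}
```

A caller such as our GenericParser could read the first 8 bytes of each document and log or skip inputs for which mayNeedAsm7 returns true, instead of passing them to AutoDetectParser. This would only be a stopgap for diagnosis; it does not address the underlying ASM mismatch.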