Apache Jena / JENA-2319

Concurrency errors in text search when using explicit Analyzers

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Jena 4.5.0
    • Component/s: None
    • Labels: None

    Description

      Seeing errors when multiple Jena text queries are in flight at the same time. Precise traces vary, but all examples seen so far occur in the Lucene analyzer phase of query parsing. Have only been able to reproduce this reliably when using the ConfigurableAnalyzer, but that code itself looks clean, suggesting that Lucene Analyzers in general are not thread-safe.

      Reproduced on Jena versions from 3.16.0 through 4.4.0.

      Will submit a PR with a test case and a brute-force fix (synchronizing the query parse step), though more subtle fixes may be possible.
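A minimal sketch of what the brute-force approach could look like, using only the JDK. This is not the code from the PR: `StatefulParser`, `SynchronizedParseStep`, `parseLock` and `countMismatches` are hypothetical names, and the stand-in parser merely lower-cases its input through a shared buffer to mimic a parse step that keeps mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a parser whose analysis step keeps mutable state. */
class StatefulParser {
    // Scratch buffer shared by every call; unsafe if two threads parse at once.
    private final StringBuilder buffer = new StringBuilder();

    String parse(String queryString) {
        buffer.setLength(0);
        for (int i = 0; i < queryString.length(); i++) {
            buffer.append(Character.toLowerCase(queryString.charAt(i)));
        }
        return buffer.toString();
    }
}

/** Brute-force guard: only one thread at a time may run the parse step. */
class SynchronizedParseStep {
    private final StatefulParser parser = new StatefulParser();
    private final Object parseLock = new Object();

    String parseQuery(String queryString) {
        synchronized (parseLock) {
            return parser.parse(queryString);
        }
    }

    /** Runs parses on several threads and counts results that differ from the expected value. */
    static int countMismatches(int threads, int parsesPerThread) {
        SynchronizedParseStep step = new SynchronizedParseStep();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                String input = "Query-" + t + "-ABCDEFGHIJKLMNOP";
                String expected = input.toLowerCase(Locale.ROOT);
                Callable<Integer> task = () -> {
                    int bad = 0;
                    for (int i = 0; i < parsesPerThread; i++) {
                        if (!expected.equals(step.parseQuery(input))) {
                            bad++;
                        }
                    }
                    return bad;
                };
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel parse run failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The lock serializes every parse, so it trades query-parsing throughput for safety. That is the cost the "more subtle fixes" remark alludes to.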

      Example partial stack traces:

      Caused by: java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
          at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:109) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.CharacterUtils.readFully(CharacterUtils.java:184) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:160) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:178) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:174) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:102) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:91) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:70) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:312) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:260) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.newFieldQuery(QueryParserBase.java:473) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:465) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:828) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]
          at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]

      ...

      and

      Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16 out of bounds for length 16
          at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:109) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.analysis.Analyzer.normalize(Analyzer.java:247) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:756) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:824) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]
          at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]
          at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]

      ...
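As an illustration only, the kind of failure in the first trace can be mimicked without Lucene by replaying, on a single thread, an interleaving that two unsynchronized callers of one shared stream could produce. `SharedTokenStream` and `InterleavingDemo` are hypothetical stand-ins, not Lucene or Jena classes, and the sketch assumes that the shared state is a reusable stream with a reset/consume/close lifecycle.

```java
/** Hypothetical stand-in for a reusable token stream with a reset/consume/close lifecycle. */
class SharedTokenStream {
    private String[] tokens; // null means "not reset" (closed or never opened)
    private int position;

    void reset(String text) {
        tokens = text.split(" ");
        position = 0;
    }

    /** Returns the next token, or null at end of input. */
    String next() {
        if (tokens == null) {
            throw new IllegalStateException("contract violation: next() without reset()");
        }
        return position < tokens.length ? tokens[position++] : null;
    }

    void close() {
        tokens = null;
    }
}

class InterleavingDemo {
    /** Replays one possible interleaving of callers A and B and reports the outcome. */
    static String replayInterleaving() {
        SharedTokenStream shared = new SharedTokenStream();
        try {
            shared.reset("alpha beta"); // caller A starts consuming
            shared.next();              // caller A reads "alpha"
            shared.reset("gamma");      // caller B reuses the same instance
            shared.next();              // caller B reads "gamma"
            shared.next();              // caller B reaches end of input
            shared.close();             // caller B finishes and closes
            shared.next();              // caller A resumes on a closed stream
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }
}
```

Calling `InterleavingDemo.replayInterleaving()` returns the contract-violation message, because caller A's final `next()` runs after caller B has closed the shared instance.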

          People

            Assignee: Dave Reynolds
            Reporter: Dave Reynolds