[LUCENE-7897] RangeQuery optimization in IndexOrDocValuesQuery - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: trunk, 7.0
Fix Version/s: 7.1, 8.0
Component/s: core/search
Labels: None
Lucene Fields: New

Description

For range queries, Lucene uses either points or doc values, chosen by cost estimation (https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/search/IndexOrDocValuesQuery.html). The scorer is chosen based on the minCost here: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/Boolean2ScorerSupplier.java#L16

However, the cost calculations for TermQuery and IndexOrDocValuesQuery seem to carry the same weight. Essentially, the cost depends on the docFreq in the terms dictionary, the number of points visited, and the number of doc values. When the docFreq is not very restrictive, this means a lot of doc-values lookups, and using points would have been better.

The following query, with 1M matches, takes 60 ms with doc values but only 27 ms with points. If I change the query to "message:*", which matches all docs, it chooses points (since the cost is the same), but with message:xyz it chooses doc values even though the doc frequency is 1 million, which results in many doc-values fetches. Would it make sense to make the cost of the doc-values query higher, or to use points when the docFreq of the term query is too high (that is, find an optimum threshold where points cost < doc-values cost)?

{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message:xyz"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": 1498652400000,
              "lte": 1498905000000,
              "format": "epoch_millis"
            }
          }
        }
      ]
    }
  }
}
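
To make the proposal above concrete, here is a minimal Java sketch of a weighted comparison between the two strategies. It is a simplified model for discussion, not Lucene's actual code: the class `RangeExecutionChooser`, the `choose` method, and the `dvPenalty` multiplier are all hypothetical names, and the range cost of 5,000,000 used in the usage note is a made-up number, since the issue does not state how many points the range matches.

```java
/**
 * Hypothetical model of the points-vs-doc-values choice discussed in this issue.
 * This is NOT Lucene's implementation; every name here is an assumption.
 */
final class RangeExecutionChooser {

    enum Strategy { POINTS, DOC_VALUES }

    /**
     * @param leadCost   estimated docs produced by the other required clause
     *                   (for example, the docFreq of the term query)
     * @param pointsCost estimated docs the range matches via the points index
     * @param dvPenalty  hypothetical multiplier (>= 1) for how much more a
     *                   per-document doc-values check costs than a points visit
     */
    static Strategy choose(long leadCost, long pointsCost, long dvPenalty) {
        if (leadCost < 0 || pointsCost < 0 || dvPenalty < 1) {
            throw new IllegalArgumentException("costs must be >= 0 and dvPenalty >= 1");
        }
        // Doc values verify each lead candidate individually, so their cost
        // grows with leadCost; saturate instead of overflowing.
        long weightedDvCost = leadCost > Long.MAX_VALUE / dvPenalty
                ? Long.MAX_VALUE
                : leadCost * dvPenalty;
        // Prefer doc values only when they are strictly cheaper; on a tie,
        // fall back to points, as in the "message:*" case described above.
        return weightedDvCost < pointsCost ? Strategy.DOC_VALUES : Strategy.POINTS;
    }
}
```

With `dvPenalty = 1` (equal weighting), `choose(1_000_000L, 5_000_000L, 1L)` selects doc values, which mirrors the behaviour reported for `message:xyz`. Raising the penalty to 8 flips that same call to points, while a highly selective term such as `choose(100L, 5_000_000L, 8L)` still selects doc values. The right value for such a penalty would have to be found by benchmarking.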

Attachments

LUCENE-7897.patch
01/Aug/17 13:26
40 kB
Adrien Grand

People

Assignee: Unassigned

Reporter: Murali Krishna P

Votes: 0

Watchers: 5

Dates

Created: 04/Jul/17 13:44

Updated: 28/Aug/22 15:17

Resolved: 10/Aug/17 10:14