Solr / SOLR-16942

Improve knn explain output


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The following is explain output for a query involving both reRank and a {!knn} query:
      1.4137135 = combined unscaled first and scaled second pass score 
        0.9137135 = first pass score
          0.9137135 = sum of:
            0.0039847707 = sum of:
              0.0039847707 = max of:
                0.0014896907 = weight(description_t:miles in 113) [SchemaSimilarity], result of:
                  0.0014896907 = score(freq=2.0), computed as boost * idf * tf from:
                    0.001 = boost
                    2.0111222 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
                      26 = n, number of documents containing term
                      197 = N, total number of documents with field
                    0.740726 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
                      2.0 = freq, occurrences of term within document
                      1.2 = k1, term saturation parameter
                      0.75 = b, length normalization parameter
                      21.0 = dl, length of field
                      47.243656 = avgdl, average length of field
                0.0039847707 = weight(title_t:miles in 113) [SchemaSimilarity], result of:
                  0.0039847707 = score(freq=2.0), computed as boost * idf * tf from:
                    0.002 = boost
                    2.84592 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
                      11 = n, number of documents containing term
                      197 = N, total number of documents with field
                    0.7000848 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
                      2.0 = freq, occurrences of term within document
                      1.2 = k1, term saturation parameter
                      0.75 = b, length normalization parameter
                      7.0 = dl, length of field
                      11.314721 = avgdl, average length of field
            0.90972877 = within top 100
        1.0 = second pass score scaled between:0-1
          3.9847708 = second pass score
            3.9847708 = sum of:
              3.9847708 = max of:
                1.4896905 = weight(description_t:miles in 113) [SchemaSimilarity], result of:
                  1.4896905 = score(freq=2.0), computed as boost * idf * tf from:
                    2.0111222 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
                      26 = n, number of documents containing term
                      197 = N, total number of documents with field
                    0.740726 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
                      2.0 = freq, occurrences of term within document
                      1.2 = k1, term saturation parameter
                      0.75 = b, length normalization parameter
                      21.0 = dl, length of field
                      47.243656 = avgdl, average length of field
                3.9847708 = weight(title_t:miles in 113) [SchemaSimilarity], result of:
                  3.9847708 = score(freq=2.0), computed as boost * idf * tf from:
                    2.0 = boost
                    2.84592 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
                      11 = n, number of documents containing term
                      197 = N, total number of documents with field
                    0.7000848 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
                      2.0 = freq, occurrences of term within document
                      1.2 = k1, term saturation parameter
                      0.75 = b, length normalization parameter
                      7.0 = dl, length of field
                      11.314721 = avgdl, average length of field
          0.8636209 = min second pass score
          3.9847708 = max sceond pass score
        0.5 = rerank weight
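
The reRank numbers above can be cross-checked by hand. The following is a minimal Python sketch, not Solr code: the helper names are made up for illustration, the two BM25 formulas are copied from the explain text, and the min-max scaling and the `first + weight * scaled` combination are assumptions that happen to reproduce the reported values.

```python
import math


def bm25_idf(n, N):
    # idf formula as printed in the explain output
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


def bm25_tf(freq, k1, b, dl, avgdl):
    # tf formula as printed in the explain output
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


def min_max_scale(score, lo, hi):
    # Assumption: "scaled between:0-1" means plain min-max scaling
    return (score - lo) / (hi - lo)


# title_t clause of the second pass (boost 2.0, n=11, N=197)
title_score = 2.0 * bm25_idf(11, 197) * bm25_tf(2.0, 1.2, 0.75, 7.0, 11.314721)

# First pass: lexical part plus the single knn entry
first_pass = 0.0039847707 + 0.90972877

# Second pass score scaled with the reported min and max
scaled = min_max_scale(3.9847708, 0.8636209, 3.9847708)

# Assumption: combined = first pass + rerank weight * scaled second pass
combined = first_pass + 0.5 * scaled

print(title_score, first_pass, scaled, combined)
```

Every lexical value can be traced back to its inputs this way. The knn contribution (0.90972877) is the only term that has to be taken as given.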

      Note the level of detail in the reRank part of the explain, compared with the knn part, which has a single entry:
        0.90972877 = within top 100

      (And we only know that this entry comes from the knn query because we ran a knn-only query.)

      Perhaps it doesn't need to be (and can't be) as detailed as the above, but it should at least include:

      • topK
      • dimensions
      • scoring method - dot product, cosine similarity, etc.
      • maybe some insight into the HNSW graph traversal?
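
As a rough illustration of the items listed above, here is a hypothetical Python sketch of what a richer knn explain entry might look like when rendered in the same nested "value = description" style. Nothing here is Solr or Lucene API: the `Explain` class, the `knn_explain` helper, the description strings, the dimension count (4) and the similarity name ("cosine") are all placeholders. Only the score and the "within top 100" label come from the output above.

```python
from dataclasses import dataclass, field


@dataclass
class Explain:
    # Hypothetical stand-in for a node in an explain tree
    value: float
    description: str
    details: list = field(default_factory=list)

    def render(self, depth=0):
        # Two spaces of indentation per level, as in the explain output above
        lines = ["  " * depth + f"{self.value} = {self.description}"]
        for child in self.details:
            lines.extend(child.render(depth + 1))
        return lines


def knn_explain(score, top_k, dimensions, similarity):
    # Placeholder structure: one child per item requested in the issue
    return Explain(score, f"within top {top_k}", [
        Explain(top_k, "topK, number of nearest neighbours requested"),
        Explain(dimensions, "dimensions of the vector field"),
        Explain(score, f"vector similarity score, computed with {similarity}"),
    ])


# 4 and "cosine" are made-up values for illustration only
example = knn_explain(0.90972877, 100, 4, "cosine")
print("\n".join(example.render()))
```

Details about the HNSW traversal are left out of the sketch, since it is unclear what information would be available at explain time.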

          People

            Assignee: Unassigned
            Reporter: Marc Byrd (zentimentalist)