Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
To perform ANN search with cosine similarity, users are expected to normalize the document and query vectors to unit length, then use VectorSimilarityFunction.DOT_PRODUCT. I think it would be good to also support cosine similarity directly through VectorSimilarityFunction.COSINE. This would allow users to perform ANN based on cosine similarity, while retaining access to the original vectors through VectorValues. That way they can use the original vectors in a reranking step or return them to the application for further processing.
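To make the current workaround concrete, here is a minimal sketch in plain Java. It does not use any Lucene classes. The class `CosineSketch` and its helper methods are hypothetical names made up for illustration, and the code assumes non-zero vectors of equal length. It shows that the dot product of unit-normalized vectors matches the cosine computed directly on the originals, and that the direct form leaves the original vectors untouched.

```java
public class CosineSketch {

    // Dot product of two equal-length vectors.
    public static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Returns a unit-length copy of v; the input array is not modified.
    // Assumes v is not the zero vector.
    public static float[] normalize(float[] v) {
        float norm = (float) Math.sqrt(dot(v, v));
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    // Cosine similarity computed directly on the original vectors.
    public static float cosine(float[] a, float[] b) {
        double denom = Math.sqrt((double) dot(a, a) * (double) dot(b, b));
        return (float) (dot(a, b) / denom);
    }

    public static void main(String[] args) {
        float[] doc = {3f, 4f};
        float[] query = {4f, 3f};

        // Current workaround: normalize both vectors, then take the dot product.
        float viaNormalization = dot(normalize(doc), normalize(query));

        // Proposed behavior: compute cosine on the unmodified vectors.
        float direct = cosine(doc, query);

        System.out.println("normalize + dot: " + viaNormalization);
        System.out.println("direct cosine:   " + direct);
    }
}
```

The two values agree up to floating-point rounding. The difference for users is only in what gets stored: with the workaround the index holds the normalized copies, while a direct cosine function could score against the vectors as they were originally supplied.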
It looks like nmslib and hnswlib support cosine similarity. On the other hand, FAISS only supports dot product and suggests that users normalize the vectors to get cosine similarity (https://github.com/facebookresearch/faiss/issues/95). To me, adding this one additional similarity function is worth it in terms of what it lets users accomplish.