Description
I'd like to explore using dedicated spatial trees for faster shape
intersection filters than postings-based implementations.
I implemented the tree data structure from
https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf
The idea is simple: it builds a full binary tree, partitioning 2D
space, alternately on lat and then lon, into smaller and smaller
rectangles until a leaf has <= N (default 1024) points.
It cannot index shapes (just points), and can then do fast shape
intersection queries. Multi-valued fields are supported.
I only implemented the "point is contained in this bounding box" query
for now, but I think polygon shape querying should be easy to
implement using the same approach from LUCENE-6450.
For indexing, you add BKDPointField (takes lat, lon) to your doc, and
must set up your Codec use BKDTreeDocValuesFormat for that field.
This DV format wraps Lucene50DVFormat, but then builds the disk-based
BKD tree structure on the side. BKDPointInBBoxQuery then requires this
DVFormat, and casts it to gain access to the tree.
I quantize each incoming double lat/lon to 32 bits precision (so 64
bits per point) = ~9 milli-meter lon precision at the equator, I
think.