Details
- Improvement
- Status: Closed
- Major
- Resolution: Duplicate
- None
- None
Description
Currently the input to FSTSuggester needs to be re-sorted, and this is done in memory. That largely defeats the purpose of the component: everything else is very efficient, but we never get that far because of OOMs during construction.
Robert suggested spilling to disk and running the merge sort on disk. I suggested creating a Lucene index and then either enumerating its terms for automaton construction or taking the automaton directly from the index structure (if it isn't pruned).
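For illustration, here is a minimal sketch of the spill-to-disk idea, written in Python rather than against any Lucene or Solr API. The names `external_sort`, `max_in_memory`, and the `.run` file suffix are hypothetical. The sketch assumes terms are byte strings that contain no newline character. It sorts bounded chunks in memory, writes each chunk to a temporary run file, and then streams a k-way merge over the runs, so memory use is bounded by the chunk size rather than by the whole input.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _spill(sorted_chunk: List[bytes], tmp_dir: str) -> str:
    """Write one already-sorted chunk to a temporary run file, one term per line."""
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
    with os.fdopen(fd, "wb") as out:
        for term in sorted_chunk:
            out.write(term + b"\n")
    return path


def _read_run(path: str) -> Iterator[bytes]:
    """Stream terms back from a run file without loading the whole file."""
    with open(path, "rb") as run:
        for line in run:
            yield line.rstrip(b"\n")


def external_sort(
    terms: Iterable[bytes], max_in_memory: int, tmp_dir: str
) -> Iterator[bytes]:
    """Yield terms in byte order while buffering at most max_in_memory terms.

    Assumption (not from the issue): terms never contain b"\\n".
    """
    run_paths: List[str] = []
    buffer: List[bytes] = []
    for term in terms:
        buffer.append(term)
        if len(buffer) >= max_in_memory:
            # Buffer is full: sort it and spill it to disk as one run.
            buffer.sort()
            run_paths.append(_spill(buffer, tmp_dir))
            buffer = []
    if buffer:
        # Spill the final, partially filled chunk.
        buffer.sort()
        run_paths.append(_spill(buffer, tmp_dir))
    try:
        # k-way merge: only one pending term per run is held in memory.
        yield from heapq.merge(*(_read_run(p) for p in run_paths))
    finally:
        # Remove the run files once the merge is consumed or abandoned.
        for p in run_paths:
            os.remove(p)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for term in external_sort([b"pear", b"apple", b"fig", b"apple"], 2, tmp):
            print(term.decode("utf-8"))
```

The merged stream arrives in sorted order, so a consumer such as the automaton builder could read it one term at a time without ever holding the full input in memory.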
Issue Links
- is part of SOLR-2888 FSTSuggester refactoring: utf8 storage, external sorts (OOM prevention), code cleanups (Closed)