Description
For both NUMERIC fields and ordinals of SORTED fields, we store data densely: every document gets a slot, whether or not it has a value. As a consequence, if only 1000 documents out of 1B have a value, and 8 bits are required to store each of those 1000 numbers, we will require not 1KB of storage but 1GB.
I suspect this mostly happens in abuse cases, but still it's a pity that we explode storage requirements. We could try to detect sparsity and compress accordingly.
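To make the arithmetic concrete, here is a minimal sketch of how a dense layout could be compared against a simple sparse one. It is not Lucene's implementation: the class `SparseDocValuesEstimate`, its method names, and the assumed sparse layout (one fixed-width doc ID stored next to each value) are all hypothetical and chosen only for illustration.

```java
// Hypothetical back-of-the-envelope estimator; not part of Lucene.
public class SparseDocValuesEstimate {

    // Dense layout: one slot of bitsPerValue for every document in the segment.
    static long denseBytes(long maxDoc, int bitsPerValue) {
        return (maxDoc * bitsPerValue + 7) / 8;
    }

    // Assumed sparse layout: for each document that has a value, store its
    // doc ID (fixed width, enough bits to address maxDoc - 1) plus the value.
    static long sparseBytes(long docsWithValue, long maxDoc, int bitsPerValue) {
        int bitsPerDocId = 64 - Long.numberOfLeadingZeros(Math.max(1, maxDoc - 1));
        return (docsWithValue * (bitsPerDocId + bitsPerValue) + 7) / 8;
    }

    // Naive sparsity detection: pick whichever estimate is smaller.
    static boolean preferSparse(long docsWithValue, long maxDoc, int bitsPerValue) {
        return sparseBytes(docsWithValue, maxDoc, bitsPerValue)
                < denseBytes(maxDoc, bitsPerValue);
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000_000L; // 1B documents, as in the description
        long docsWithValue = 1_000L;  // only 1000 of them have a value
        int bitsPerValue = 8;

        System.out.println("dense bytes:  " + denseBytes(maxDoc, bitsPerValue));
        System.out.println("sparse bytes: " + sparseBytes(docsWithValue, maxDoc, bitsPerValue));
        System.out.println("prefer sparse: " + preferSparse(docsWithValue, maxDoc, bitsPerValue));
    }
}
```

Under these assumptions the sparse estimate is somewhat larger than the ideal 1KB because doc IDs must be stored too, but it is still several orders of magnitude below the dense 1GB. When most documents have a value, the same comparison falls back to the dense layout.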
Attachments
Issue Links
- supersedes
  - LUCENE-5688 NumericDocValues fields with sparse data can be compressed better (Resolved)
  - LUCENE-4921 Create a DocValuesFormat for sparse doc values (Closed)