[LUCENE-7258] Tune DocIdSetBuilder allocation rate - ASF JIRA

Details

Type: Improvement
Status: Reopened
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 6.1, 7.0
Component/s: modules/spatial
Labels: None
Lucene Fields: New

Description

~~LUCENE-7211~~ converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but that change didn't actually reduce garbage generation for my Solr index.

Since something like 40% of my garbage (by space) is now attributed to DocIdSetBuilder.growBuffer, I charted a few different allocation strategies to see if I could tune things more.

See here: http://i.imgur.com/7sXLAYv.jpg
The jump-then-flatline at the right is where DocIdSetBuilder gives up and allocates a FixedBitSet for a 100M-doc index. (The curve and cutoff for a 1M-doc index looked similar.)

Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is terrible from an allocation standpoint if you're doing a lot of expansions, and it is especially terrible when used to build a short-lived data structure like this one.
By the time the builder switches to the FixedBitSet (FBS), it has allocated around twice as much memory for the buffer as it would have needed for the FBS alone.
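
To make the cost of a small growth factor concrete, here is a minimal, self-contained sketch that adds up every array allocated while a buffer grows to a cutoff size. It does not call Lucene: the class name GrowthSketch, the method totalAllocated, the starting size of 32, and the cutoff of maxDoc >>> 7 buffered ints are all assumptions made for illustration, and the real ArrayUtil.oversize also applies alignment that is ignored here.

```java
public class GrowthSketch {

    /**
     * Sums the sizes of every array allocated while a buffer grows from
     * `initial` elements until it holds at least `target` elements.
     * Each step adds (size >> shift) elements: shift = 3 models a 1/8th
     * growth factor, shift = 0 models doubling.
     */
    static long totalAllocated(long initial, long target, int shift) {
        long size = initial;
        long total = size; // the first array counts too
        while (size < target) {
            size += Math.max(1, size >> shift);
            total += size; // growing means allocating a whole new array
        }
        return total;
    }

    public static void main(String[] args) {
        long maxDoc = 100_000_000L;
        // Assumed cutoff at which the builder would switch to a bit set.
        long cutoff = maxDoc >>> 7;
        // A bit set needs one bit per document.
        long bitSetBytes = maxDoc / 8;

        long eighthBytes = totalAllocated(32, cutoff, 3) * Integer.BYTES;
        long doublingBytes = totalAllocated(32, cutoff, 0) * Integer.BYTES;

        System.out.printf("bit set alone:      %d bytes%n", bitSetBytes);
        System.out.printf("1/8th growth total: %d bytes (%.2fx the bit set)%n",
                eighthBytes, (double) eighthBytes / bitSetBytes);
        System.out.printf("doubling total:     %d bytes (%.2fx the bit set)%n",
                doublingBytes, (double) doublingBytes / bitSetBytes);
    }
}
```

The reasoning behind the numbers is a geometric series: with a growth factor of 1.125, the arrays allocated along the way add up to roughly 9 times the final array, whereas with doubling they add up to roughly 2 times the final array. Under the assumed cutoff, that puts the 1/8th strategy at a bit over twice the size of the bit set, which is in line with the observation above.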

Attachments

- allocation_plot.jpg, 26/Apr/16 20:40, 39 kB, Jeff Wartes
- LUCENE-7258.patch, 18/May/16 16:04, 10 kB, Adrien Grand
- LUCENE-7258-expanding.patch, 17/May/16 10:22, 8 kB, Adrien Grand
- LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, 27/Apr/16 21:11, 6 kB, Jeff Wartes
- LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, 26/Apr/16 20:20, 7 kB, Jeff Wartes

People

Assignee: Unassigned

Reporter: Jeff Wartes

Votes: 0

Watchers: 7

Dates

Created: 26/Apr/16 18:09

Updated: 17/Sep/24 20:53

Resolved: 23/May/16 07:29