Details
- Bug
- Status: Resolved
- Major
- Resolution: Not A Problem
- 9.0
- None
- None
- New
Description
LUCENE-9843 removed the compression option for SortedSetDocValues term dictionaries and enabled LZ4 compression unconditionally. This has a significant impact on our workloads, which make heavy use of sorted set doc values: it can cause performance regressions from 2x up to 5x. See the samples below.
❯ times_tasks Elasticsearch 7.10.2 (Lucene 8.7) - no terms dict compression

name                       type                         time_min  time_max  time_p50  time_p90
7.10.2-22.6-SNAPSHOT.json  total                        42        90        45        66
7.10.2-22.6-SNAPSHOT.json  SearchJoinRequest1           14        32        15        18
7.10.2-22.6-SNAPSHOT.json  SearchTaskBroadcastRequest2  23        53        27        43

❯ times_tasks Elasticsearch 7.17.1 (Lucene 8.11) - with terms dict compression

name                       type                         time_min  time_max  time_p50  time_p90
7.17.0-27.1-SNAPSHOT.json  total                        253       327       285       310
7.17.0-27.1-SNAPSHOT.json  SearchJoinRequest1           121       154       142       152
7.17.0-27.1-SNAPSHOT.json  SearchTaskBroadcastRequest2  122       173       140       152

❯ times_tasks Elasticsearch 7.17.1 (Lucene 8.11) - lucene_default codec is used to bypass the terms dict compression

name                         type                         time_min  time_max  time_p50  time_p90
7.17.0-27.1-SNAPSHOT.json.2  total                        48        96        63        75
7.17.0-27.1-SNAPSHOT.json.2  SearchJoinRequest1           19        44        25        31
7.17.0-27.1-SNAPSHOT.json.2  SearchTaskBroadcastRequest2  23        42        29        37

❯ times_tasks Elasticsearch 8.0 (Lucene 9.0) - with terms dict compression

name                       type                         time_min  time_max  time_p50  time_p90
8.0.0-28.0-SNAPSHOT.json   total                        260       327       287       313
8.0.0-28.0-SNAPSHOT.json   SearchJoinRequest1           122       168       148       158
8.0.0-28.0-SNAPSHOT.json   SearchTaskBroadcastRequest2  123       165       139       155
The benchmark clearly shows the impact of terms dict compression on our workload. Profiling the execution indicates that the bottleneck is LZ4.decompress; we have attached two flamegraph screenshots.
With Lucene 8.11 and no terms dict compression, the CPU time of the TermsDict.next method is around 2 seconds, while the same method in Lucene 9.0 takes around 12 seconds. This was measured on a small benchmark that reads a sorted set doc values field a fixed number of times; each document is created with a single keyword value representing a UUID.
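To make the cost model concrete, here is a minimal sketch (not Lucene's actual code) of a block-compressed term dictionary like the one at issue: sorted terms are grouped into fixed-size blocks and each block is compressed, so advancing an iterator across a block boundary forces a decompression. That per-block decompression is the work that shows up under LZ4.decompress in the flamegraphs. The block size of 64, the null-byte term separator, and the use of zlib as a stand-in for LZ4 are all assumptions of this sketch.

```python
# Hypothetical model of a block-compressed term dictionary; zlib stands in
# for LZ4, and the layout does not match Lucene's real on-disk format.
import uuid
import zlib

BLOCK_SIZE = 64  # assumed number of terms per compressed block

def build_blocks(terms, block_size=BLOCK_SIZE):
    """Compress sorted terms into blocks of `block_size` terms each."""
    blocks = []
    for i in range(0, len(terms), block_size):
        # Join the block's terms with a separator and compress the payload.
        payload = b"\x00".join(t.encode() for t in terms[i:i + block_size])
        blocks.append(zlib.compress(payload))
    return blocks

def iterate_terms(blocks):
    """Yield every term in order; each block must be decompressed to read
    any term inside it, which is the overhead a TermsDict-style next()
    pays compared to an uncompressed dictionary."""
    for block in blocks:
        for raw in zlib.decompress(block).split(b"\x00"):
            yield raw.decode()

# One UUID keyword term per document, as in the micro-benchmark above.
terms = sorted(str(uuid.UUID(int=i)) for i in range(1000))
blocks = build_blocks(terms)
recovered = list(iterate_terms(blocks))
assert recovered == terms
print(len(blocks))  # 1000 terms / 64 per block -> 16 blocks
```

Iterating the full dictionary touches every block once, so the decompression cost scales with the number of blocks read, not with how many terms the caller actually needs, which matches the profile where LZ4.decompress dominates TermsDict.next.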