Details
Bug
Status: Patch Available
Minor
Resolution: Unresolved
9.3
None
New
Description
This has the same origin as issue LUCENE-10676. Running a single process with thousands of fields across many indexes leads to a large number of duplicate strings being retained as keys and values in the `attributes` map. This can amount to GBs of heap for thousands of fields across a few thousand segments. In the heap dump analysis below, these strings account for more than half of the duplicate strings from `FieldInfo` instances (roughly 2/3, although the field names are somewhat unusually long in this example).
If we could deduplicate these obvious, well-known strings when reading `FieldInfo`, we could save GBs of heap for use cases like this.
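As a rough illustration of the idea (not the attached patch), the sketch below routes every key and value of an attributes map through a shared pool, so that equal strings read for different fields or segments end up as a single instance. The class `AttributeDedup`, its methods, and the sample attribute strings in the usage note are hypothetical names chosen for this example; none of them are part of Lucene.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not a Lucene class: illustrates deduplicating
// attribute strings at read time.
final class AttributeDedup {
    // Shared pool of canonical string instances. Assumption: the set of
    // distinct attribute keys and values is small, so the pool stays bounded.
    private static final Map<String, String> POOL = new ConcurrentHashMap<>();

    private AttributeDedup() {}

    // Returns the canonical instance for s, registering s if it is new.
    static String dedup(String s) {
        if (s == null) {
            return null;
        }
        String existing = POOL.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Builds a copy of the map whose keys and values are canonical instances.
    static Map<String, String> dedupAttributes(Map<String, String> attributes) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            result.put(dedup(e.getKey()), dedup(e.getValue()));
        }
        return result;
    }
}
```

For instance, if two segments each deserialize their own copy of an attribute such as `"PerFieldPostingsFormat.format"` mapped to `"Lucene90"`, passing both maps through `dedupAttributes` leaves them referencing the same two `String` objects. `String.intern()` would achieve a similar effect using the JVM-wide string table; an explicit pool is used here only to keep the sketch self-contained and easy to reason about.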