Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 9.0, 9.1
- None
Description
Summary
For fields using large="true", large values (which is what such fields are intended for) can be truncated in Solr 9.0 and later.
Example fieldtype definition:
<fieldtype name="string_large" class="solr.TextField" multiValued="false" indexed="false" stored="true" omitNorms="true" large="true" />
Cause
Looks like this is a bug introduced along with https://issues.apache.org/jira/browse/LUCENE-8805 / https://github.com/apache/lucene/issues/9849
The current code is here:
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrDocumentFetcher.java#L511
public void stringField(FieldInfo fieldInfo, String value) throws IOException {
  Objects.requireNonNull(value, "String value should not be null");
  bytesRef.bytes = value.getBytes(StandardCharsets.UTF_8);
  bytesRef.length = value.length();
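The problem is specific to the handling of "large" fields.
`value.length()` counts UTF-16 code units, not bytes. Whenever the value contains non-ASCII characters, its UTF-8 encoding is longer than `value.length()`, so setting `bytesRef.length` from the string length cuts off the trailing bytes, hence the truncation.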
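To illustrate the mismatch, here is a minimal standalone sketch using only the JDK. The class and method names (`Utf8LengthDemo`, `utf8Length`, `decodeWithStringLength`) are made up for this example and are not part of Solr. It compares the two lengths and decodes only the first `value.length()` bytes, which is roughly what a reader of the mis-sized `bytesRef` would see.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthDemo {
    /** Number of bytes in the UTF-8 encoding of the value. */
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Decodes only the first value.length() bytes, mimicking the wrong length assignment. */
    static String decodeWithStringLength(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, 0, value.length(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only; the text is "naive cafe" with two accented letters.
        String value = "na\u00efve caf\u00e9";
        System.out.println("chars: " + value.length());      // UTF-16 code units
        System.out.println("bytes: " + utf8Length(value));   // UTF-8 bytes
        System.out.println("decoded: " + decodeWithStringLength(value));
    }
}
```

For pure ASCII values the two lengths coincide, which may be why the problem is easy to miss in simple tests.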
Fix
bytesRef.length = bytesRef.bytes.length
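The effect of that one-line change can be sketched outside Solr as follows. This is an assumption-laden illustration, not the actual patch: `Ref` is a hypothetical stand-in for Lucene's `BytesRef` (a byte array plus offset and length), and `encode`/`decode` are invented helper names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

final class LargeFieldLengthFix {
    /** Hypothetical stand-in for Lucene's BytesRef: backing array, start offset, valid length. */
    static final class Ref {
        byte[] bytes = new byte[0];
        int offset;
        int length;
    }

    /** Encodes the value and records the length in bytes rather than in chars. */
    static Ref encode(String value) {
        Objects.requireNonNull(value, "String value should not be null");
        Ref ref = new Ref();
        ref.bytes = value.getBytes(StandardCharsets.UTF_8);
        ref.length = ref.bytes.length; // byte count, not value.length()
        return ref;
    }

    /** Decodes exactly the bytes the reference declares as valid. */
    static String decode(Ref ref) {
        return new String(ref.bytes, ref.offset, ref.length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "na\u00efve caf\u00e9";
        Ref ref = encode(value);
        System.out.println(decode(ref).equals(value)); // the full value survives the round trip
    }
}
```

Taking the length from the encoded array keeps the reference consistent with its backing bytes regardless of which characters the value contains.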
Issue Links
- is caused by LUCENE-8805 Parameter changes for stringField() in StoredFieldVisitor (Closed)