Description
Bug: the cached hash in `Utf8` is not reset correctly, so `hashCode()` can return different values for contents that are logically equal.
cause commit :
reproduce :
org.apache.avro.util.Utf8 utf8 = new org.apache.avro.util.Utf8();
utf8.set("hello");
System.out.println(utf8.hashCode()); // 99162322
utf8.set("hello");
System.out.println(utf8.hashCode()); // -1538739392
Suggested fix.
- Get rid of the `hasHash` field. In my opinion it makes the code more bug-prone by adding another variable to keep in sync. `hash != 0` should be enough to check whether a hash has been computed. The redundant recomputation for values whose hash is `0` should be negligible.
- Reset `hash = 0` on every mutating call.
- Not required, but I would like to see a minor optimization such as avoiding repeated field access by using a local variable (`int hash = 0; ... this.hash = hash`).
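The three points above could look roughly like the following. This is only a sketch of the idea, not the actual `Utf8` source: `CachedHashText` and its methods are hypothetical stand-ins, and the byte-wise `31 * h + b` hash is assumed here because it matches the `99162322` printed for "hello" in the reproduction.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a mutable UTF-8 string; NOT the real org.apache.avro.util.Utf8.
final class CachedHashText {
  private byte[] bytes = new byte[0];
  private int length;
  private int hash; // 0 means "not computed yet"; there is no separate hasHash flag

  CachedHashText set(String s) {
    byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
    this.bytes = encoded;
    this.length = encoded.length;
    this.hash = 0; // invalidate the cached hash on every mutation
    return this;
  }

  CachedHashText setByteLength(int newLength) {
    if (newLength > bytes.length) {
      bytes = Arrays.copyOf(bytes, newLength);
    }
    this.length = newLength;
    this.hash = 0; // invalidate the cached hash on every mutation
    return this;
  }

  @Override
  public int hashCode() {
    int h = this.hash; // work on a local copy instead of the field
    if (h == 0) {
      final byte[] b = this.bytes;
      final int n = this.length;
      for (int i = 0; i < n; i++) {
        h = 31 * h + b[i];
      }
      this.hash = h; // single write back; a true hash of 0 is simply recomputed next time
    }
    return h;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof CachedHashText)) return false;
    CachedHashText other = (CachedHashText) o;
    if (length != other.length) return false;
    for (int i = 0; i < length; i++) {
      if (bytes[i] != other.bytes[i]) return false;
    }
    return true;
  }
}
```

With this shape, calling `set("hello")` twice and hashing after each call yields the same value, because every mutator clears the cache and `hashCode()` recomputes from the current bytes.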
Issue Links
- fixes AVRO-2801 Cache Hashcode of UTF8 Strings in all Set Methods (Closed)