Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Currently, ByteBufferUtils.readVLong is used in all data block encodings to decode the memstoreTs field of each row. For a data block encoding such as prefix, ByteBufferUtils.readVLong can account for over 50% of the CPU time in BufferedEncodedSeeker.decodeNext, which can be quite a hot method in seek operations.
Since memstoreTs typically requires at least 6 bytes to store, we could vectorize the read path of readVLong so that it reads 8 bytes at a time instead of one byte at a time (as was done in https://issues.apache.org/jira/browse/HBASE-28025) in order to improve performance.
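To make the idea concrete, here is a minimal sketch of an 8-bytes-at-a-time decode. It assumes the Hadoop-style vlong layout (one header byte that encodes sign and length, followed by 1 to 8 big-endian payload bytes). The class and method names (VLongSketch, readVLongWide, writeVLong) are hypothetical and only for illustration; this is not the actual ByteBufferUtils code or the patch for this issue.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of HBase.
final class VLongSketch {
  /** Total encoded size (1..9 bytes) implied by the header byte. */
  static int decodeVIntSize(byte first) {
    if (first >= -112) return 1;
    if (first < -120) return -119 - first;
    return -111 - first;
  }

  /** True if the header byte marks a negative value. */
  static boolean isNegativeVInt(byte first) {
    return first < -120 || (first >= -112 && first < 0);
  }

  /** Reads a vlong, fetching the payload with a single 8-byte load when possible. */
  static long readVLongWide(ByteBuffer buf) {
    byte first = buf.get();
    int len = decodeVIntSize(first);
    if (len == 1) return first;          // small values live in the header byte
    int payload = len - 1;               // 1..8 payload bytes follow the header
    long value;
    if (buf.remaining() >= Long.BYTES) {
      // Fast path: one absolute 8-byte read, then drop the bytes we over-read.
      long word = buf.getLong(buf.position());
      if (buf.order() == ByteOrder.LITTLE_ENDIAN) word = Long.reverseBytes(word);
      value = word >>> ((Long.BYTES - payload) * 8);
      buf.position(buf.position() + payload);
    } else {
      // Slow path near the end of the buffer: byte-at-a-time, as before.
      value = 0;
      for (int i = 0; i < payload; i++) value = (value << 8) | (buf.get() & 0xFF);
    }
    return isNegativeVInt(first) ? ~value : value;
  }

  /** Encoder in the same layout, included only so the sketch can be round-tripped. */
  static void writeVLong(ByteBuffer buf, long v) {
    if (v >= -112 && v <= 127) { buf.put((byte) v); return; }
    int header = -112;
    if (v < 0) { v = ~v; header = -120; }
    for (long tmp = v; tmp != 0; tmp >>>= 8) header--;
    buf.put((byte) header);
    int payload = (header < -120) ? -(header + 120) : -(header + 112);
    for (int i = payload - 1; i >= 0; i--) buf.put((byte) (v >>> (i * 8)));
  }
}
```

The fast path only applies when at least 8 bytes remain after the header byte, so the byte-wise loop is kept as a fallback for values near the end of a block. Whether the wide read is actually faster depends on the buffer type and JIT, so it would need to be benchmarked.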
Attached is a CPU flamegraph of a region server process, which shows a surprising amount of time spent in ByteBufferUtils.readVLong while decoding rows from the DBE.
Attachments