Details
Description
See discussion on HBASE-9969.
This was brought up by mcorgan. Even though I didn't believe him first.
KeyValueScanner topScanner = this.heap.peek(); if (topScanner == null || this.comparator.compare(kvNext, topScanner.peek()) >= 0) { this.heap.add(this.current); this.current = pollRealKV(); }
We already have the invariant everywhere this.current always has a real KV polled. So adding current back to the heap followed by pollRealKV() is a no-op if this.current is the only scanner anyway.
This can be changed to:
... if (topScanner != null && this.comparator.compare(kvNext, topScanner.peek()) >= 0) { ...
With 20m rows, 5 cols each, everything in the cache, fully compacted - i.e. one HFile per store, a scan that filters everything at the server through all 100m KVs takes 15.1s without the change and 13.3s with it, so a 12% improvement.