Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
With the ROOT table gone, we no longer cache the location of the meta table (in MetaCache) in 0.96+. I've checked the 0.94 code, and there we cache meta, but not root.
However, not caching meta's own location means that we issue a ZooKeeper request every time we want to look up a region's location from meta. As a result, there is a significant spike in ZooKeeper requests whenever a region server goes down.
This affects trunk, 0.98 and 0.96, as well as the hbase-10070 branch. I discovered the issue in hbase-10070 because the integration test (HBASE-10572) results in 150K requests to ZooKeeper in 10 minutes.
A thread dump from one of the runs has 100+ client threads in this stack trace:
"TimeBoundedMultiThreadedReaderThread_20" prio=10 tid=0x00007f852c2f2000 nid=0x57b6 in Object.wait() [0x00007f85059e7000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
    - locked <0x00000000ea71aa78> (a org.apache.zookeeper.ClientCnxn$Packet)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:1853)
    at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:186)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:60)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1112)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1220)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1129)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:321)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.call(RpcRetryingCallerWithReadReplicas.java:257)
    - locked <0x00000000e9bcf238> (a org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:818)
    at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.queryKey(MultiThreadedReader.java:288)
    at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.readKey(MultiThreadedReader.java:249)
    at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.runReader(MultiThreadedReader.java:192)
    at org.apache.hadoop.hbase.util.MultiThreadedReader$HBaseReaderThread.run(MultiThreadedReader.java:150)
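To illustrate the kind of caching being asked for, here is a minimal sketch, not the actual patch. It assumes the registry lookup (the ZooKeeper round trip behind getMetaRegionLocation in the trace) can be wrapped as a Supplier, and it represents the location as a plain String. The class name CachedMetaLocator and both method names are hypothetical and do not exist in the HBase client.

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch: caches the meta location in front of an expensive
 * registry lookup. Not part of the HBase client API.
 */
public class CachedMetaLocator {
    // Assumed stand-in for the ZooKeeper-backed lookup.
    private final Supplier<String> registryLookup;
    // volatile so the unsynchronized fast-path read sees the latest value.
    private volatile String cached;

    public CachedMetaLocator(Supplier<String> registryLookup) {
        this.registryLookup = registryLookup;
    }

    /** Returns the cached location, going to the registry only on a miss. */
    public String getMetaLocation() {
        String loc = cached;
        if (loc != null) {
            return loc; // fast path: no registry request
        }
        synchronized (this) {
            // Re-check under the lock so concurrent callers that all missed
            // trigger a single registry request instead of one each.
            if (cached == null) {
                cached = registryLookup.get();
            }
            return cached;
        }
    }

    /**
     * Drops the cached entry, but only if it still equals the location the
     * caller found to be stale, so a fresher entry is not discarded.
     */
    public synchronized void invalidate(String staleLocation) {
        if (staleLocation != null && staleLocation.equals(cached)) {
            cached = null;
        }
    }
}
```

In this sketch a caller would invoke invalidate with the location it had been using once a request to that server fails, so the next getMetaLocation call refreshes from the registry. Comparing against the stale value keeps many failing threads from repeatedly clearing an entry that another thread has already refreshed.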
Attachments
Issue Links
- is duplicated by
  - HBASE-11758 Meta region location should be cached (Closed)
- is related to
  - HBASE-10701 Cache invalidation improvements from client side (Closed)
  - HBASE-10070 HBase read high-availability using timeline-consistent region replicas (Closed)