Details
- Improvement
- Status: Resolved
- Minor
- Resolution: Fixed
- 1.0.0, 3.0.0-alpha-1, 2.0.0
- None
- Reviewed
Description
When org.apache.hadoop.hbase.tool.Canary is run with the args -zookeeper -treatFailureAsError, the Canary tries to get a znode from each ZooKeeper server in the ensemble. If any server is unavailable or unresponsive, the Canary exits with a failure code.
If we use the Canary to gauge server health and alert accordingly, this can be too strict. For example, in a 5-node ZooKeeper ensemble, having one node down is safe and expected during rolling upgrades or patches.
This is a request to allow the Canary to take another parameter:
-permittedZookeeperFailures <N>
If N=1, in the 5-node ZooKeeper ensemble example, then the Canary will still pass if 4 ZooKeeper nodes are reachable, but fail if 3 or fewer are reachable.
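To make the requested behavior concrete, here is a minimal sketch of the pass/fail rule in Java. It is not the Canary's actual code: the class name `CanaryThresholdSketch`, the method `passes`, and the list-of-booleans input are all hypothetical, chosen only to illustrate the N=1 example above.

```java
import java.util.List;

// Hypothetical illustration only; not taken from the HBase Canary source.
final class CanaryThresholdSketch {

    /**
     * Returns true when the number of unreachable ZooKeeper servers
     * does not exceed the permitted number of failures.
     *
     * @param reachable         one entry per server in the ensemble; true if the znode read succeeded
     * @param permittedFailures the N from the proposed -permittedZookeeperFailures argument
     */
    static boolean passes(List<Boolean> reachable, int permittedFailures) {
        // Count the servers whose check failed.
        long failures = reachable.stream().filter(ok -> !ok).count();
        return failures <= permittedFailures;
    }

    public static void main(String[] args) {
        // 5-node ensemble with one node down, N=1: 4 reachable, so the check passes.
        List<Boolean> oneDown = List.of(true, true, true, true, false);
        // 5-node ensemble with two nodes down, N=1: 3 reachable, so the check fails.
        List<Boolean> twoDown = List.of(true, true, true, false, false);

        System.out.println(passes(oneDown, 1));
        System.out.println(passes(twoDown, 1));
        // N=0 corresponds to the current strict behavior: any failure fails the check.
        System.out.println(passes(oneDown, 0));
    }
}
```

With N=0 the rule reduces to today's behavior, so a default of zero would presumably keep existing alerting unchanged.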
(This is my first Jira posting... sorry if I messed anything up.)
Attachments
Issue Links
- is cloned by
  - HBASE-21145 Backport "HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1 (Resolved)
  - HBASE-21146 (2.0) Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes (Resolved)
  - HBASE-21147 (1.4) Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes (Resolved)
- links to