Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 1.0.1, 1.1.0, 0.98.11
None
Hadoop Flags: Reviewed
Release Note: Adds a new option to the HBCK tool, -fixOrphanedTableZnodes, which fixes orphaned table entries in ZooKeeper that do not have corresponding meta entries. This state can result from a failed create table attempt.
Description
If the HMaster bounces in the middle of table creation, we can be left in a state where a znode exists for the table but the table has not made it into META or onto HDFS. We've run into this a couple of times on our clusters. Once the table is in this state, the only fix is to rm the znode using the zookeeper-client. Doing this manually is error-prone. Could an option be added to hbck to catch and fix such inconsistencies?
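To make the requested check concrete, here is a minimal sketch of the idea, not the actual hbck implementation. It assumes the table names known to ZooKeeper and to META have already been read into plain Python collections; the function names and the `fix` parameter are hypothetical and exist only for illustration.

```python
def find_orphaned_table_znodes(zk_tables, meta_tables):
    """Return table names that have a znode but no entry in META (sorted)."""
    return sorted(set(zk_tables) - set(meta_tables))


def fix_orphaned_table_znodes(zk_state, meta_tables, fix=False):
    """Report orphaned table znodes; remove them from zk_state only if fix=True.

    zk_state is a dict standing in for the table znodes (name -> state).
    A real tool would delete the znode through a ZooKeeper client instead.
    """
    orphans = find_orphaned_table_znodes(zk_state, meta_tables)
    if fix:
        for name in orphans:
            del zk_state[name]
    return orphans


# Hypothetical cluster state: "events" was being created when the master bounced.
zk_state = {"users": "ENABLED", "events": "ENABLING", "logs": "ENABLED"}
meta_tables = {"users", "logs"}

print(fix_orphaned_table_znodes(zk_state, meta_tables))            # report only
print(fix_orphaned_table_znodes(zk_state, meta_tables, fix=True))  # report and remove
print(sorted(zk_state))
```

Keeping the report-only mode as the default mirrors the concern above: removing a znode is destructive, so the fix should be an explicit opt-in.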
A more general issue I'd like comment on is whether it makes sense for the HMaster to maintain its own write-ahead log. The idea is that after a bounce, the master would discover that it was in the middle of creating a table and either roll back or complete that operation. An issue we observed recently was that a table that was in the DISABLING state before a bounce was not in that state afterwards. A write-ahead log to persist table state changes seems useful. All of this state could be kept in ZK instead of a WAL; it doesn't matter where it is persisted as long as it is.
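The write-ahead-log idea can be sketched roughly as follows. This is an illustration under stated assumptions, not a proposal for the actual master design: the class `MasterOperationLog`, the phase names (`ZNODE_CREATED`, `DISABLING`, `DONE`), and the rollback rule are all hypothetical, and an in-memory list stands in for durable storage (a file on HDFS or a znode).

```python
import json


class MasterOperationLog:
    """Append-only log of table operations; a list stands in for durable storage."""

    def __init__(self):
        self.records = []

    def append(self, op_id, table, operation, phase):
        # Each phase change is logged before the master acts on it.
        record = {"op_id": op_id, "table": table,
                  "operation": operation, "phase": phase}
        self.records.append(json.dumps(record))

    def incomplete_operations(self):
        # Keep only the latest record per operation, then drop finished ones.
        latest = {}
        for line in self.records:
            record = json.loads(line)
            latest[record["op_id"]] = record
        return [r for r in latest.values() if r["phase"] != "DONE"]


def recover(log):
    """Decide, per table, whether to roll back or complete after a restart."""
    decisions = {}
    for record in log.incomplete_operations():
        # Assumed rule: a create that only got as far as the znode is rolled back.
        if (record["operation"] == "CREATE_TABLE"
                and record["phase"] == "ZNODE_CREATED"):
            decisions[record["table"]] = "rollback"
        else:
            decisions[record["table"]] = "complete"
    return decisions


log = MasterOperationLog()
log.append(1, "events", "CREATE_TABLE", "ZNODE_CREATED")  # master bounced here
log.append(2, "users", "DISABLE_TABLE", "DISABLING")      # interrupted disable
log.append(3, "logs", "CREATE_TABLE", "ZNODE_CREATED")
log.append(3, "logs", "CREATE_TABLE", "DONE")             # finished normally

print(recover(log))  # {'events': 'rollback', 'users': 'complete'}
```

With such a log, the interrupted create is cleaned up on restart instead of leaving an orphaned znode, and the table that was DISABLING is driven to completion rather than losing its state.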