Details
New Feature
Status: Resolved
Minor
Resolution: Fixed
None
None
Reviewed
Description
Among the current AMv2 issues, we hit a situation where some regions were recorded in meta with state OPENING and an RS startcode that was no longer valid. No AssignProcedure (AP) was running: the region was permanently logged as in transition in the master logs, yet no procedure was actually trying to bring it online. The current hbck2 assigns/unassigns commands did not work either. As the exception below shows, unassigns expects the region to be in state SPLITTING, SPLIT, MERGING, OPEN, or CLOSING:
WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Failed transition, suspend 1secs pid=7093, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; UnassignProcedure table=rc_accounts, region=db85127b77fa56f7ad44e2c988e53925, server=server1.example.com,16020,1552682193324; rit=OPENING, location=server1.example.com,16020,1552682193324; waiting on rectified condition fixed by other Procedure or operator intervention
org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
    at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:166)
    at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1479)
    at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:212)
    at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
    at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:957)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1835)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1595)
In this specific case, since we know the region is not actually being operated on by any procedure and is not really open anywhere, it is OK to manually set its state to one that assigns/unassigns can operate on. This JIRA therefore proposes a new hbck2 command that allows a region to be set to an arbitrary given state.
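
To illustrate the failure mode and the proposed workaround, here is a minimal, self-contained sketch of the guarded transition described by the exception. It is not HBase code: the class RegionStateSketch, the RegionNode type, and the forceState and tryUnassign methods are hypothetical names made up for this example, and the state set is taken only from the log message above. OPEN is used as the forced state purely because it is one of the states listed in the exception.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative model only; none of these types exist in HBase.
final class RegionStateSketch {

    // Only the states that appear in the log excerpt above.
    enum State { OPENING, OPEN, CLOSING, SPLITTING, SPLIT, MERGING }

    // States from which, per the exception text, a region may move to CLOSING.
    static final Set<State> CAN_MOVE_TO_CLOSING =
        EnumSet.of(State.SPLITTING, State.SPLIT, State.MERGING, State.OPEN, State.CLOSING);

    static final class RegionNode {
        private State state;

        RegionNode(State initial) {
            this.state = initial;
        }

        State getState() {
            return state;
        }

        // Guarded transition: rejects any current state outside the expected set.
        void transitionToClosing() {
            if (!CAN_MOVE_TO_CLOSING.contains(state)) {
                throw new IllegalStateException("Expected " + CAN_MOVE_TO_CLOSING
                    + " so could move to CLOSING but current state=" + state);
            }
            state = State.CLOSING;
        }

        // Hypothetical operator override: sets the state with no validation,
        // which is the kind of action the proposed command would allow.
        void forceState(State newState) {
            state = newState;
        }
    }

    // Returns true if the unassign-style transition was accepted.
    static boolean tryUnassign(RegionNode node) {
        try {
            node.transitionToClosing();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RegionNode region = new RegionNode(State.OPENING);
        System.out.println("unassign from OPENING accepted: " + tryUnassign(region));

        // Operator intervention: move the stuck region to a state unassign accepts.
        region.forceState(State.OPEN);
        System.out.println("unassign after forcing OPEN accepted: " + tryUnassign(region));
        System.out.println("final state: " + region.getState());
    }
}
```

The override deliberately bypasses the check, so it is only safe under the condition stated above: the operator has confirmed that no procedure holds the region and that it is not open on any server.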