Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.1.0
-
None
-
None
-
Reviewed
-
Description
Found this one when investigating ModifyTableProcedure got stuck while there was a MoveRegionProcedure going on after master restart.
Though this issue can be solved by HBASE-20752. But I discovered something else.
Before a MoveRegionProcedure can execute, it will hold the table's shared lock. so,, when a UnassignProcedure was spwaned, it will not check the table's shared lock since it is sure that its parent(MoveRegionProcedure) has aquired the table's lock.
// If there is parent procedure, it would have already taken xlock, so no need to take // shared lock here. Otherwise, take shared lock. if (!procedure.hasParent() && waitTableQueueSharedLock(procedure, table) == null) { return true; }
But, it is not the case when Master was restarted. The child procedure(UnassignProcedure) will be executed first after restart. Though it has a parent(MoveRegionProcedure), but apprently the parent didn't hold the table's lock.
So, since it began to execute without hold the table's shared lock. A ModifyTableProcedure can aquire the table's exclusive lock and execute at the same time. Which is not possible if the master was not restarted.
This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, I wrote a simple UT to repo this case.
I think we don't have to check the parent for table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it.
Attachments
Attachments
Issue Links
- relates to
-
HBASE-20920 Merge the update procedure store on locking with the general persist after a procedure execution
- Open
-
HBASE-20924 Backport "HBASE-20846 Restore procedure locks when master restarts"
- Resolved
- links to