Details
Type: Task
Status: Closed
Priority: Blocker
Resolution: Won't Fix
Description
There are a couple of JIRAs for deleting the znode on a process failure:
HBASE-5844 (RS)
HBASE-5926 (Master)
which are pretty neat; on process failure, they delete the znode of the underlying process so HBase can recover faster.
These JIRAs were implemented via the startup scripts; i.e. the script hangs around and waits for the process to exit, then deletes the znode.
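As a rough illustration of the wrapper pattern described above (not the actual HBase startup script), the following Python sketch launches a child process, blocks until it exits, and then runs a cleanup hook. The names run_and_cleanup and cleanup are hypothetical, and the sketch assumes the hook should fire only on a non-zero exit; in a real setup the hook would be whatever deletes the znode.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_and_cleanup(cmd: Sequence[str], cleanup: Callable[[int], None]) -> int:
    """Run cmd as a child process, wait for it, and call cleanup on failure.

    cleanup is a hypothetical stand-in for "delete the znode"; it receives
    the child's exit code. Assumption: it runs only when the exit code is
    non-zero (a crash or a kill), not on a clean shutdown.
    """
    # The wrapper stays alive for the child's whole lifetime, which is why
    # two processes show up per launched daemon.
    child = subprocess.Popen(cmd)
    exit_code = child.wait()
    if exit_code != 0:
        cleanup(exit_code)
    return exit_code


if __name__ == "__main__":
    # Demo: a child that exits with status 3 triggers the cleanup hook.
    failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_and_cleanup(failing_child, lambda code: print("cleanup after exit code", code))
```

Because the wrapper is itself an ordinary process, nothing restarts it or runs the hook if the wrapper is killed, which is one reason a real supervisor is attractive.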
There are a few problems with this approach, as noted in the JIRA comments linked below:
1) Hides startup output in script
https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
2) Two HBase processes listed per launched daemon
https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
3) Not run by a real supervisor
https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
4) Weird output after kill -9 of the actual process in standalone mode
https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
5) Can kill existing RS if called again
https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
6) Hides stdout/stderr
https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
I suspect running it via something like supervisor.d can solve these issues if we provide the right support.
Attachments
Issue Links
- depends upon
  HBASE-10310 ZNodeCleaner session expired for /hbase/master (Closed)
- is related to
  HBASE-5844 Delete the region servers znode after a regions server crash (Closed)
  HBASE-7334 We should expire the zk session for crashed servers rather than deleting ephemeral znodes (Closed)
  HBASE-5926 Delete the master znode after a master crash (Closed)
  HBASE-5843 Improve HBase MTTR - Mean Time To Recover (Closed)
- relates to
  HBASE-7334 We should expire the zk session for crashed servers rather than deleting ephemeral znodes (Closed)