Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
OM and DN fail to start against a non-HA SCM because the secretKeyManager is not initialized.
DN registration depends on the secret key, so no DN can register, and SCM therefore never comes out of safemode.
OM:
Execution of task getCurrentSecretKey failed permanently after 100 attempts
org.apache.hadoop.hdds.security.exception.SCMSecretKeyException: Secret key initialization is not finished yet.
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.handleError(SecretKeyProtocolClientSideTranslatorPB.java:102)
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.submitRequest(SecretKeyProtocolClientSideTranslatorPB.java:90)
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.getCurrentSecretKey(SecretKeyProtocolClientSideTranslatorPB.java:128)
    at org.apache.hadoop.hdds.utils.RetriableTask.call(RetriableTask.java:56)
    at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.loadInitialSecretKey(DefaultSecretKeySignerClient.java:113)
    at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.start(DefaultSecretKeySignerClient.java:77)
    at org.apache.hadoop.ozone.om.OzoneManager.startSecretManager(OzoneManager.java:1091)
    at org.apache.hadoop.ozone.om.OzoneManager.startSecretManagerIfNecessary(OzoneManager.java:2276)
    at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1620)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:190)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:86)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:74)
    at org.apache.hadoop.hdds.cli.GenericCli.call(GenericCli.java:38)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
    at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58)
4:04:45.698 AM ERROR OzoneManager Unable to initialize secret key.
DN:
org.apache.hadoop.hdds.security.exception.SCMSecretKeyException: Secret key initialization is not finished yet.
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.handleError(SecretKeyProtocolClientSideTranslatorPB.java:102)
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.submitRequest(SecretKeyProtocolClientSideTranslatorPB.java:90)
    at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.getCurrentSecretKey(SecretKeyProtocolClientSideTranslatorPB.java:128)
    at org.apache.hadoop.hdds.utils.RetriableTask.call(RetriableTask.java:56)
    at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.loadInitialSecretKey(DefaultSecretKeySignerClient.java:113)
    at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.start(DefaultSecretKeySignerClient.java:77)
    at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeyClient.start(DefaultSecretKeyClient.java:50)
    at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:283)
    at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:210)
    at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:178)
    at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:95)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
    at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
    at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:160)
4:00:59.649 AM ERROR HddsDatanodeService Exception in HddsDatanodeService.
java.lang.RuntimeException: Can't start the HDDS datanode plugin
    at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:332)
    at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:210)
When SCM is forced out of safemode, the services are able to start and continue, because the safemode exit triggers notifyStatusChanged() on SecretKeyManagerService, which then initializes the secret key manager.
For non-HA SCM, the secret key manager needs to be initialized before SCM goes into safemode.
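The circular dependency described above can be illustrated with a small, self-contained model. This is only a sketch and not Ozone code: the classes `KeyService` and `Scm`, the `initBeforeSafemode` flag, and the rule that a single DN registration is enough to leave safemode are all hypothetical simplifications. Only the name `notifyStatusChanged` is borrowed from the description.

```java
import java.util.Optional;

/** Hypothetical model of the startup dependency; names are illustrative, not Ozone APIs. */
public class SafemodeKeyInitSketch {

    /** Stand-in for the SCM-side secret key service. */
    static final class KeyService {
        private final boolean initBeforeSafemode; // assumed switch modelling the proposed fix
        private String currentKey;                // null until the key manager is initialized

        KeyService(boolean initBeforeSafemode) {
            this.initBeforeSafemode = initBeforeSafemode;
        }

        /** Called once when SCM starts, while it is still in safemode. */
        void onScmStart() {
            if (initBeforeSafemode) {
                currentKey = "key-1";
            }
        }

        /** Modelled on the callback named in the description: initializes on safemode exit. */
        void notifyStatusChanged(boolean inSafemode) {
            if (!inSafemode && currentKey == null) {
                currentKey = "key-1";
            }
        }

        Optional<String> getCurrentSecretKey() {
            return Optional.ofNullable(currentKey);
        }
    }

    /** Stand-in for a non-HA SCM that starts in safemode. */
    static final class Scm {
        private final KeyService keys;
        private boolean inSafemode = true;

        Scm(KeyService keys) {
            this.keys = keys;
            keys.onScmStart();
        }

        /** A DN can register only if it can fetch a key; registration lets SCM leave safemode. */
        boolean registerDatanode() {
            if (!keys.getCurrentSecretKey().isPresent()) {
                return false; // corresponds to "Secret key initialization is not finished yet"
            }
            exitSafemode();
            return true;
        }

        /** Models the manual workaround of forcing SCM out of safemode. */
        void forceExitSafemode() {
            exitSafemode();
        }

        boolean isInSafemode() {
            return inSafemode;
        }

        private void exitSafemode() {
            inSafemode = false;
            keys.notifyStatusChanged(false);
        }
    }

    public static void main(String[] args) {
        Scm stuck = new Scm(new KeyService(false));
        System.out.println("init on safemode exit, DN registered: " + stuck.registerDatanode());
        stuck.forceExitSafemode();
        System.out.println("after forced exit, DN registered: " + stuck.registerDatanode());

        Scm fixed = new Scm(new KeyService(true));
        System.out.println("init before safemode, DN registered: " + fixed.registerDatanode());
    }
}
```

In this model, initializing the key only on safemode exit leaves the DN unable to register until someone forces the exit, while initializing it at SCM start breaks the cycle without manual intervention.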