  1. Apache Ozone
  2. HDDS-9872

OM/DN startup failure with non-HA SCM for secret manager not initialized

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • None
    • 1.4.0
    • None

    Description

      OM and DN fail to start against a non-HA SCM because the secret key manager (secretKeyManager) is not initialized.

      Because DN registration depends on the secret key, the DNs never register, and as a result SCM never comes out of safemode.

       

      OM:

      Execution of task getCurrentSecretKey failed permanently after 100 attempts
      org.apache.hadoop.hdds.security.exception.SCMSecretKeyException: Secret key initialization is not finished yet.
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.handleError(SecretKeyProtocolClientSideTranslatorPB.java:102)
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.submitRequest(SecretKeyProtocolClientSideTranslatorPB.java:90)
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.getCurrentSecretKey(SecretKeyProtocolClientSideTranslatorPB.java:128)
      	at org.apache.hadoop.hdds.utils.RetriableTask.call(RetriableTask.java:56)
      	at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.loadInitialSecretKey(DefaultSecretKeySignerClient.java:113)
      	at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.start(DefaultSecretKeySignerClient.java:77)
      	at org.apache.hadoop.ozone.om.OzoneManager.startSecretManager(OzoneManager.java:1091)
      	at org.apache.hadoop.ozone.om.OzoneManager.startSecretManagerIfNecessary(OzoneManager.java:2276)
      	at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1620)
      	at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:190)
      	at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:86)
      	at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:74)
      	at org.apache.hadoop.hdds.cli.GenericCli.call(GenericCli.java:38)
      	at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
      	at picocli.CommandLine.access$1300(CommandLine.java:145)
      	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
      	at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
      	at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
      	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
      	at picocli.CommandLine.execute(CommandLine.java:2078)
      	at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
      	at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
      	at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58)
      4:04:45.698 AM	ERROR	OzoneManager	
      Unable to initialize secret key. 

      DN:

      org.apache.hadoop.hdds.security.exception.SCMSecretKeyException: Secret key initialization is not finished yet.
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.handleError(SecretKeyProtocolClientSideTranslatorPB.java:102)
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.submitRequest(SecretKeyProtocolClientSideTranslatorPB.java:90)
      	at org.apache.hadoop.hdds.protocolPB.SecretKeyProtocolClientSideTranslatorPB.getCurrentSecretKey(SecretKeyProtocolClientSideTranslatorPB.java:128)
      	at org.apache.hadoop.hdds.utils.RetriableTask.call(RetriableTask.java:56)
      	at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.loadInitialSecretKey(DefaultSecretKeySignerClient.java:113)
      	at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.start(DefaultSecretKeySignerClient.java:77)
      	at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeyClient.start(DefaultSecretKeyClient.java:50)
      	at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:283)
      	at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:210)
      	at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:178)
      	at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:95)
      	at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
      	at picocli.CommandLine.access$1300(CommandLine.java:145)
      	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
      	at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
      	at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
      	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
      	at picocli.CommandLine.execute(CommandLine.java:2078)
      	at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
      	at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
      	at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:160)
      4:00:59.649 AM	ERROR	HddsDatanodeService	
      Exception in HddsDatanodeService.
      java.lang.RuntimeException: Can't start the HDDS datanode plugin
      	at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:332)
      	at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:210) 

      When SCM is forced out of safemode, startup succeeds and continues, because the forced exit triggers the service through notifyStatusChanged() on SecretKeyManagerService, which initializes the secret manager.
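
The cycle described above can be illustrated with a small toy model. This is only a sketch: `ToyScm`, its fields, and the "at least one registered datanode" exit rule are assumptions made for illustration and are not Ozone classes or the real safemode rules. The only behaviour taken from this issue is that the key becomes available after a status change that follows safemode exit.

```java
/** Toy model of the startup cycle described in this issue; all names are hypothetical. */
class SafemodeKeyCycleDemo {

  /** Stand-in for SCM: tracks safemode, key readiness and registered datanodes. */
  static class ToyScm {
    private boolean inSafeMode = true;
    private boolean secretKeyReady = false;
    private int registeredDatanodes = 0;

    /** Models the reported behaviour: the key is only initialised once safemode is left. */
    void notifyStatusChanged() {
      if (!inSafeMode) {
        secretKeyReady = true;
      }
    }

    /** A datanode can register only after it has fetched the secret key. */
    boolean tryRegisterDatanode() {
      if (!secretKeyReady) {
        return false; // corresponds to "Secret key initialization is not finished yet."
      }
      registeredDatanodes++;
      return true;
    }

    /** Assumed exit rule: normal exit needs at least one registered datanode. */
    boolean tryExitSafeMode() {
      if (registeredDatanodes > 0) {
        inSafeMode = false;
        notifyStatusChanged();
      }
      return !inSafeMode;
    }

    /** Forced exit skips the datanode check and still fires the status change. */
    void forceExitSafeMode() {
      inSafeMode = false;
      notifyStatusChanged();
    }
  }

  /** Returns true if a datanode manages to register within the given number of attempts. */
  static boolean startup(ToyScm scm, int attempts) {
    for (int i = 0; i < attempts; i++) {
      if (scm.tryRegisterDatanode()) {
        return true;
      }
      scm.tryExitSafeMode();
    }
    return false;
  }

  public static void main(String[] args) {
    ToyScm stuck = new ToyScm();
    System.out.println("normal startup, DN registered: " + startup(stuck, 100));

    ToyScm forced = new ToyScm();
    forced.forceExitSafeMode();
    System.out.println("after forced exit, DN registered: " + startup(forced, 100));
  }
}
```

In this model the normal path never makes progress no matter how many attempts are allowed, while a forced exit breaks the cycle on the first attempt.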

       

      For non-HA SCM, the secret key manager needs to be initialized before SCM enters safemode.
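
One possible shape of that change is sketched below. It is a minimal illustration, not the actual patch: `ToySecretKeyService`, its `haEnabled` flag, and the `start()` hook are hypothetical names, and leaving the HA path dependent on a later status change is an assumption made only to contrast the two cases.

```java
/** Sketch of eager secret key initialisation for non-HA SCM; all names are hypothetical. */
class EagerKeyInitDemo {

  static class ToySecretKeyService {
    private final boolean haEnabled;
    private boolean keyReady = false;

    ToySecretKeyService(boolean haEnabled) {
      this.haEnabled = haEnabled;
    }

    /** Called once at startup, before any safemode check is evaluated. */
    void start() {
      if (!haEnabled) {
        initializeKey(); // proposed: non-HA does not wait for safemode exit
      }
    }

    /** Existing trigger named in the issue: a status change such as safemode exit. */
    void notifyStatusChanged() {
      initializeKey();
    }

    /** Idempotent, so calling it from both paths is harmless. */
    private void initializeKey() {
      keyReady = true;
    }

    /** What an OM or DN would observe when asking for the current key. */
    String getCurrentSecretKey() {
      if (!keyReady) {
        throw new IllegalStateException("Secret key initialization is not finished yet.");
      }
      return "toy-key";
    }
  }

  /** Returns true if a client can fetch the key right after start(), with no status change. */
  static boolean canFetchKeyAtStartup(boolean haEnabled) {
    ToySecretKeyService service = new ToySecretKeyService(haEnabled);
    service.start();
    try {
      service.getCurrentSecretKey();
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("non-HA, key available at startup: " + canFetchKeyAtStartup(false));
    System.out.println("HA, key available at startup: " + canFetchKeyAtStartup(true));
  }
}
```

With the key available as soon as the service starts, OM and DN can fetch it while SCM is still in safemode, so DN registration no longer waits on safemode exit.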

            People

              Assignee: sumitagrawl Sumit Agrawal
              Reporter: sumitagrawl Sumit Agrawal