Apache Ozone / HDDS-7593 Supporting HSync and lease recovery / HDDS-8352

OM crash with NPE in OMKeyCommitRequest due to missing user info

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.4.0

    Description

      While debugging HDDS-8292, a separate issue crashed the OM. The OM is unable to restart despite several attempts; each time it terminates with the error below.

      2023-03-30 01:17:48,248 ERROR org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Terminating with exit status 1: Request cmdType: CommitKey
      clientId: "client-D4205EE3CB56"
      commitKeyRequest {
        keyArgs {
          volumeName: "vol1"
          bucketName: "bucket1"
          keyName: "ozpc12-cf4-8.ozpc12-cf4.root.hwx.site%2C22101%2C1679350143649.1679357760087"
          dataSize: 561054
          type: RATIS
          factor: THREE
          keyLocations {
            blockID {
              containerBlockID {
                containerID: 8
                localID: 111677748019202442
              }
              blockCommitSequenceId: 0
            }
            offset: 0
            length: 561054
            createVersion: 0
            partNumber: 0
          }
        }
        clientID: 110058390173712730
      }
      failed with exception
      java.lang.NullPointerException
              at org.apache.hadoop.ozone.om.OzoneAclUtils.isOwner(OzoneAclUtils.java:146)
              at org.apache.hadoop.ozone.om.OzoneAclUtils.checkAllAcls(OzoneAclUtils.java:84)
              at org.apache.hadoop.ozone.om.request.OMClientRequest.checkAcls(OMClientRequest.java:364)
              at org.apache.hadoop.ozone.om.request.OMClientRequest.checkAcls(OMClientRequest.java:215)
              at org.apache.hadoop.ozone.om.request.key.OMKeyRequest.checkKeyAcls(OMKeyRequest.java:342)
              at org.apache.hadoop.ozone.om.request.key.OMKeyRequest.checkKeyAclsInOpenKeyTable(OMKeyRequest.java:391)
              at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:112)
              at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:337)
              at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:533)
              at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:324)
              at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      

      For some reason, callerUgi is null here: https://github.com/apache/ozone/blob/eafd2ccec01ffcb5b9966fc957d5e6b1ce4b3ddc/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneAclUtils.java#L146
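
      To illustrate the failure mode in the stack trace, here is a minimal, self-contained sketch. It is not the Ozone code and not the fix that was committed: CallerSketch, OwnerCheckSketch, and both method names are hypothetical stand-ins, and the exact expression at OzoneAclUtils.java:146 is assumed to be a plain dereference of the caller object. The sketch only shows why a null caller identity throws inside an ownership check, and how a defensive guard would turn the crash into a denied check instead.

```java
import java.util.Objects;

/** Hypothetical stand-in for the caller identity (not Hadoop's UserGroupInformation). */
final class CallerSketch {
  private final String shortUserName;

  CallerSketch(String shortUserName) {
    this.shortUserName = Objects.requireNonNull(shortUserName, "shortUserName");
  }

  String getShortUserName() {
    return shortUserName;
  }
}

/** Hypothetical ownership checks; the names are illustrative, not Ozone APIs. */
final class OwnerCheckSketch {
  private OwnerCheckSketch() {
  }

  /** Assumed shape of the failing check: dereferences the caller unconditionally. */
  static boolean isOwnerUnguarded(CallerSketch caller, String ownerName) {
    // Throws NullPointerException when caller is null.
    return caller.getShortUserName().equals(ownerName);
  }

  /** Defensive variant: a missing caller or owner is treated as "not the owner". */
  static boolean isOwnerGuarded(CallerSketch caller, String ownerName) {
    if (caller == null || ownerName == null) {
      return false;
    }
    return caller.getShortUserName().equals(ownerName);
  }

  public static void main(String[] args) {
    CallerSketch alice = new CallerSketch("alice");
    System.out.println("guarded, known caller: " + isOwnerGuarded(alice, "alice"));
    System.out.println("guarded, null caller: " + isOwnerGuarded(null, "alice"));
    try {
      isOwnerUnguarded(null, "alice");
    } catch (NullPointerException e) {
      System.out.println("unguarded, null caller: NullPointerException");
    }
  }
}
```

      A guard like this would only mask the symptom. Because the request is applied through the Ratis state machine, the exception is hit again on every replay, which is consistent with the OM failing to restart; the underlying question remains why the request reached the state machine without user info.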

      Attachments

        1. omdb_npe.tgz
          1.28 MB
          Wei-Chiu Chuang

            People

              Assignee: sumitagrawl Sumit Agrawal
              Reporter: weichiu Wei-Chiu Chuang