  1. Hadoop Map/Reduce
  2. MAPREDUCE-6410

Aggregated Logs Deletion doesn't work after refreshing Log Retention Settings in secure cluster


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.1, 3.0.0-alpha1
    • None
    • mrV2, secure mode

    • Reviewed

    Description

      A GSSException is thrown every time aggregated log deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster.

      The problem can be reproduced with the following steps:
      1. Start the history server in a secure cluster.
      2. Log deletion happens as expected.
      3. Execute the mapred hsadmin -refreshLogRetentionSettings command to refresh the configuration value.
      4. All subsequent log deletion attempts fail with a GSSException.
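
      Step 4 can be confirmed by counting aborted deletion attempts in the history server's log. The following is a minimal sketch, not part of any patch on this issue. It assumes pipe-delimited log lines of the form `timestamp | LEVEL | thread | message | source`, and the class name AbortedDeletionCounter and its methods are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class AbortedDeletionCounter {
    // Message fragment logged when a deletion attempt is aborted.
    static final String ABORT_MARKER = "this deletion attempt is being aborted";

    /** Counts ERROR lines whose message field contains the abort marker. */
    static long countAborted(List<String> lines) {
        return lines.stream()
                // Assumed layout: timestamp | LEVEL | thread | message | source
                .map(line -> line.split("\\s*\\|\\s*"))
                // Stack trace lines contain no pipes and are skipped here.
                .filter(fields -> fields.length >= 4)
                .filter(fields -> fields[1].equals("ERROR")
                        && fields[3].contains(ABORT_MARKER))
                .count();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: AbortedDeletionCounter <historyserver-log>");
            return;
        }
        // The log path is supplied by the caller; no default location is assumed.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(countAborted(lines));
    }
}
```

      If the count keeps growing with each deletion interval after the refresh command was run, that matches the behaviour described in step 4.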

      The following exception appears in the history server's log if log deletion is enabled.

      2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
      java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000; 
              at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
              at org.apache.hadoop.ipc.Client.call(Client.java:1414)
              at org.apache.hadoop.ipc.Client.call(Client.java:1363)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
              at com.sun.proxy.$Proxy9.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
              at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
              at com.sun.proxy.$Proxy10.getListing(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
              at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
              at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
              at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
              at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
              at java.util.TimerThread.mainLoop(Timer.java:555)
              at java.util.TimerThread.run(Timer.java:505)
      Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
              at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
              at org.apache.hadoop.ipc.Client.call(Client.java:1381)
              ... 21 more
      Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
              at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)
              at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550)
              at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711)
              ... 24 more
      Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
              at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
              at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
              at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
              at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
              at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
              at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
              ... 33 more
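
      The trace above nests three "Caused by:" levels, and the innermost one (GSSException: Failed to find any Kerberos tgt) is the line that matters for diagnosis. As a rough sketch for triaging similar traces, the helper below collects the "Caused by:" lines and returns the last one. The class CauseChain and its methods are hypothetical and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

public class CauseChain {
    static final String PREFIX = "Caused by: ";

    /** Returns the text of each "Caused by:" line, outermost first. */
    static List<String> causes(String trace) {
        List<String> out = new ArrayList<>();
        // \R matches any line break sequence (Java 8 and later).
        for (String line : trace.split("\\R")) {
            String trimmed = line.trim();
            // Only lines that start with the prefix count; a bracketed
            // "[Caused by ...]" embedded in a message is ignored.
            if (trimmed.startsWith(PREFIX)) {
                out.add(trimmed.substring(PREFIX.length()));
            }
        }
        return out;
    }

    /** Returns the innermost cause, or null if the trace has none. */
    static String rootCause(String trace) {
        List<String> chain = causes(trace);
        return chain.isEmpty() ? null : chain.get(chain.size() - 1);
    }
}
```

      Applied to the trace in this report, rootCause would return the GSSException line, which shows that the thread running the deletion task had no Kerberos TGT available.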
      

      Attachments

        1. YARN-3779.01.patch
          5 kB
          Varun Saxena
        2. YARN-3779.02.patch
          5 kB
          Varun Saxena
        3. log_aggr_deletion_on_refresh_error.log
          375 kB
          Varun Saxena
        4. log_aggr_deletion_on_refresh_fix.log
          294 kB
          Varun Saxena
        5. YARN-3779.03.patch
          3 kB
          Varun Saxena
        6. MAPREDUCE-6410.04.patch
          7 kB
          Varun Saxena
        7. MAPREDUCE-6410.05.patch
          7 kB
          Vinod Kumar Vavilapalli

        Activity

          People

            Assignee: Varun Saxena (varun_saxena)
            Reporter: Zhang Wei (sijing0410)
            Votes: 0
            Watchers: 10
