Ambari / AMBARI-25884

Set Keytab, Check Keytab and Remove Keytab operations failing on a few clusters


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.6
    • Fix Version/s: 2.8.0, 2.7.8
    • Component/s: None
    • Labels: None

    Description

      On large clusters, enabling Kerberos or running the Kerberos service check throws a NullPointerException (NPE) for the CHECK_KEYTABS, REMOVE_KEYTAB and SET_KEYTAB commands.

       

      2023-03-06 07:22:00,538  INFO [agent-command-publisher-0] AgentCommandsPublisher:174 - CHECK_KEYTABS called
      2023-03-06 07:22:00,538 ERROR [ambari-action-scheduler] AgentCommandsPublisher:126 - Exception on sendAgentCommand
      java.util.concurrent.ExecutionException: java.lang.NullPointerException
              at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
              at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.sendAgentCommand(AgentCommandsPublisher.java:124)
              at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:555)
              at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:347)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.NullPointerException
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
              at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
              at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1005)
              ... 4 more
      Caused by: java.lang.NullPointerException
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
              at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
              at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
              at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
              at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
              at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
              at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
              at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
              at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:650)
              at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.lambda$sendAgentCommand$1(AgentCommandsPublisher.java:103)
              at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
              at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
              at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
              at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
              at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:163)
      Caused by: java.lang.NullPointerException
              at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.prepareExecutionCommandsClusters(AgentCommandsPublisher.java:214)
              at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.populateExecutionCommandsClusters(AgentCommandsPublisher.java:192)
              at org.apache.ambari.server.events.publishers.AgentCommandsPublisher.lambda$null$0(AgentCommandsPublisher.java:122)
              at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
              at com.google.common.collect.CollectSpliterators$1.lambda$forEachRemaining$1(CollectSpliterators.java:116)
              at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
              at com.google.common.collect.CollectSpliterators$1.forEachRemaining(CollectSpliterators.java:116)
              at com.google.common.collect.CollectSpliterators$1FlatMapSpliterator.lambda$forEachRemaining$1(CollectSpliterators.java:247)
              at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1699)
              at com.google.common.collect.CollectSpliterators$1FlatMapSpliterator.forEachRemaining(CollectSpliterators.java:247)
              at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
              at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
              at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
              ... 4 more 

       

       

      This might be caused by using a TreeMap for executionCommandsClusters, which is updated from multiple threads; TreeMap is not thread-safe. executionCommandsClusters should therefore be changed to a thread-safe data structure.

       

      Because of this, the Kerberos service check gets stuck for 30 minutes; the commands are then sent to the agent again, and the service check succeeds.

      Also, on large clusters this happens multiple times while enabling Kerberos.
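
The thread-safety hypothesis in this description can be illustrated with a minimal, self-contained sketch. It is not the actual Ambari patch: the class name, method name and command strings are hypothetical, and ConcurrentSkipListMap is only one possible thread-safe replacement for a TreeMap (it also keeps keys sorted). The sketch groups commands per host from a parallel stream, which is the access pattern suggested by the stack trace.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.IntStream;

public class ThreadSafeCommandMap {

    // Hypothetical stand-in for grouping commands per host; these are not Ambari's real types.
    static Map<Long, List<String>> groupCommandsByHost(int hostCount, int commandsPerHost) {
        // ConcurrentSkipListMap keeps keys sorted like TreeMap but tolerates concurrent writers.
        Map<Long, List<String>> commandsByHost = new ConcurrentSkipListMap<>();

        IntStream.range(0, hostCount * commandsPerHost)
            .parallel() // several ForkJoinPool workers write to the map at once
            .forEach(i -> {
                long hostId = i % hostCount;
                commandsByHost
                    // Returns the list already installed for this host, or installs a new one.
                    .computeIfAbsent(hostId, id -> new CopyOnWriteArrayList<>())
                    .add("CHECK_KEYTABS-" + i);
            });
        return commandsByHost;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> result = groupCommandsByHost(500, 4);
        System.out.println("hosts=" + result.size()); // prints "hosts=500"
    }
}
```

Replacing the ConcurrentSkipListMap with a plain TreeMap in this sketch makes the result nondeterministic: entries can be lost or an exception can be thrown from inside the map, because TreeMap makes no guarantees under unsynchronized concurrent modification.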

            People

              Assignee: D M Murali Krishna Reddy (dmmkr)
              Reporter: D M Murali Krishna Reddy (dmmkr)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1.5h