Hadoop YARN / YARN-7880

CapacityScheduler$ResourceCommitterService throws NPE when running sls

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: yarn
    • Labels: None

    Description

      SLS test case: node count = 9000, job count = 10k, tasks per job = 500, task run time = 100s. The NPE does not occur when the node count is 500 or 2000.

      18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_000086 Container Transitioned from ACQUIRED to RUNNING
      
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)
      

      Some CapacityScheduler$AsyncScheduleThread threads also throw an NPE:

      18/02/02 20:40:34 INFO resourcemanager.DefaultAMSProcessor: AM registration appattempt_1517575125794_4564_000001
      18/02/02 20:40:34 INFO resourcemanager.RMAuditLogger: USER=default      OPERATION=Register App Master   TARGET=ApplicationMasterService RESULT=SUCCESS  APPID=application_1517575125794_4564    APPATTEMPTID=appattempt_1517575125794_4564_000001
      Exception in thread "Thread-43" 18/02/02 20:40:34 INFO appmaster.AMSimulator: Register the application master for application application_1517575125794_4564
      18/02/02 20:40:34 INFO resourcemanager.MockAMLauncher: Notify AM launcher launched:container_1517575125794_4564_01_000001
      18/02/02 20:40:34 INFO rmcontainer.RMContainerImpl: container_1517575125794_2703_01_000001 Container Transitioned from ACQUIRED to RUNNING
      18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_000001 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
      18/02/02 20:40:34 INFO attempt.RMAppAttemptImpl: appattempt_1517575125794_4564_000001 State change from LAUNCHED to RUNNING on event = REGISTERED
      18/02/02 20:40:34 INFO rmapp.RMAppImpl: application_1517575125794_4564 State change from ACCEPTED to RUNNING on event = ATTEMPT_REGISTERED
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequests(SchedulerApplicationAttempt.java:1341)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.canAssign(RegularContainerAllocator.java:302)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignOffSwitchContainers(RegularContainerAllocator.java:389)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainersOnNode(RegularContainerAllocator.java:470)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.tryAllocateOnNode(RegularContainerAllocator.java:252)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:816)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:854)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:856)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1111)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:735)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1343)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1337)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1434)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1199)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:474)
              at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:501)
      

            People

              Assignee: Jiandan Yang (yangjiandan)
              Reporter: Jiandan Yang (yangjiandan)