  1. Hadoop YARN
  2. YARN-3369

Missing NullPointer check in AppSchedulingInfo causes RM to die

Details

    • Reviewed

    Description

      In AppSchedulingInfo.java, the method checkForDeactivation() contains these two consecutive lines:

      ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
      if (request.getNumContainers() > 0) {
      

      The first line calls getResourceRequest(), which can return null:

      synchronized public ResourceRequest getResourceRequest(
      Priority priority, String resourceName) {
          Map<String, ResourceRequest> nodeRequests = requests.get(priority);
          return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
      }
      

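      The second line dereferences the returned reference directly, without a null check.
      If the reference is null, a NullPointerException is thrown while the scheduler handles a NODE_UPDATE event, and the RM exits: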

      2015-03-17 14:14:04,757 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
      java.lang.NullPointerException
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
      at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
      at java.lang.Thread.run(Thread.java:722)
      2015-03-17 14:14:04,758 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
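
      The missing check described above can be illustrated with a minimal, self-contained sketch. It is not the attached patch and not the real AppSchedulingInfo class: the class name NullSafeDeactivationSketch, the Request stand-in, the integer priorities, and the shouldDeactivate() loop are all assumptions made for illustration. Only the shape of getResourceRequest() and the guarded dereference follow the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for AppSchedulingInfo (illustration only). */
class NullSafeDeactivationSketch {
    // Assumed to play the role of ResourceRequest.ANY.
    static final String ANY = "*";

    /** Hypothetical stand-in for ResourceRequest. */
    static final class Request {
        private final int numContainers;
        Request(int numContainers) { this.numContainers = numContainers; }
        int getNumContainers() { return numContainers; }
    }

    // priority -> (resource name -> request); priorities are plain ints here.
    private final Map<Integer, Map<String, Request>> requests = new HashMap<>();

    synchronized void addRequest(int priority, String resourceName, int numContainers) {
        requests.computeIfAbsent(priority, p -> new HashMap<>())
                .put(resourceName, new Request(numContainers));
    }

    // Same shape as the method quoted in the description: may return null.
    synchronized Request getResourceRequest(int priority, String resourceName) {
        Map<String, Request> nodeRequests = requests.get(priority);
        return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
    }

    /** Returns true when no priority has outstanding ANY containers. */
    synchronized boolean shouldDeactivate() {
        for (Integer priority : requests.keySet()) {
            Request request = getResourceRequest(priority, ANY);
            // Guard against a missing ANY entry before dereferencing.
            if (request != null && request.getNumContainers() > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        NullSafeDeactivationSketch info = new NullSafeDeactivationSketch();
        info.addRequest(1, "host-a", 2);             // no ANY entry at priority 1
        System.out.println(info.shouldDeactivate()); // prints "true", no NPE
        info.addRequest(1, ANY, 2);
        System.out.println(info.shouldDeactivate()); // prints "false"
    }
}
```

      Without the request != null guard, the first shouldDeactivate() call in main would throw a NullPointerException, which corresponds to the failure in the trace above.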

      Attachments

        1. YARN-3369-003.patch
          2 kB
          Brahma Reddy Battula
        2. YARN-3369.patch
          1 kB
          Brahma Reddy Battula
        3. YARN-3369.2.patch
          1 kB
          Wangda Tan

          People

            brahmareddy Brahma Reddy Battula
            giovanni.fumarola Giovanni Matteo Fumarola
            Votes: 0
            Watchers: 16