Ignite / IGNITE-13960

Starvation in mgmt pool caused by MetadataTask execution

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.1
    • Fix Version/s: 2.10
    • Component/s: compute
    • Labels: None
    • Ignite Flags: Docs Required, Release Notes Required

    Description

      Issue:

      Requesting cache metadata from multiple threads causes starvation in the mgmt pool.

      Root Cause:

      GridCacheCommandHandler.MetadataJob runs on a mgmt pool thread. It calls GridCacheQueryManager#sqlMetadata(), which calls GridClosureProcessor#callAsyncNoFailover().get(); this executes another internal task and blocks until that task completes. The job response of the internal task must also be handled by the mgmt pool. Once every mgmt thread is blocked in get(), no thread is left to process the responses, and the pool starves.
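
The starvation pattern described above can be reproduced outside Ignite. The following is a minimal sketch using only java.util.concurrent, not Ignite code: the class name PoolStarvationDemo, the method completedJobs, and the "metadata" payload are hypothetical. Each outer job stands in for MetadataJob, and the inner task stands in for the internal task whose result has to be produced by the same pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Plain-JDK illustration of pool starvation; all names are hypothetical. */
class PoolStarvationDemo {
    /** Runs blocking jobs on one pool and returns how many finished within the timeout. */
    static int completedJobs(int poolSize, int jobs, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Make the demo deterministic: the first jobs wait for each other before going on.
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, jobs));
        try {
            List<Future<String>> outer = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                outer.add(pool.submit(() -> {
                    started.countDown();
                    started.await();
                    // The inner task is submitted to the SAME pool...
                    Future<String> inner = pool.submit(() -> "metadata");
                    // ...and this pool thread parks here until the inner task is done.
                    return inner.get();
                }));
            }
            int done = 0;
            for (Future<String> f : outer) {
                try {
                    f.get(timeoutMs, TimeUnit.MILLISECONDS);
                    done++;
                } catch (TimeoutException e) {
                    // The job is still parked in inner.get(): it is starved.
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return done;
        } finally {
            pool.shutdownNow(); // Interrupts any job that is still parked.
        }
    }

    public static void main(String[] args) {
        System.out.println("pool=2, jobs=2: completed " + completedJobs(2, 2, 500));
        System.out.println("pool=3, jobs=2: completed " + completedJobs(3, 2, 500));
    }
}
```

With as many concurrent blocking jobs as pool threads, no thread is free to run the inner tasks, so no job completes. One spare thread is enough for the jobs to finish in this sketch, which is why the problem only shows up under concurrent requests.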

      Proposed Fix:

      Make GridCacheQueryManager#sqlMetadata() asynchronous, so that it returns a future instead of blocking, and use a job continuation in GridCacheCommandHandler.MetadataJob so that the mgmt thread is released while the future returned by sqlMetadata() completes.
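
The idea behind the proposed fix can be sketched with CompletableFuture from the JDK. This is an illustration under assumptions, not the actual patch: sqlMetadataAsync, AsyncMetadataDemo, and the "response:" prefix are hypothetical names. Ignite's public compute API exposes job continuations through ComputeJobContext#holdcc() and ComputeJobContext#callcc(); the sketch does not use them so that it stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Plain-JDK illustration of the non-blocking variant; all names are hypothetical. */
class AsyncMetadataDemo {
    /** Hypothetical stand-in for an asynchronous sqlMetadata(): returns a future, never blocks. */
    static CompletableFuture<String> sqlMetadataAsync(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "metadata", pool);
    }

    /** Runs non-blocking jobs on one pool and returns how many finished within the timeout. */
    static int completedJobs(int poolSize, int jobs, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<String>> results = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                results.add(CompletableFuture
                    // The job starts the inner task and returns at once, freeing its thread.
                    .supplyAsync(() -> sqlMetadataAsync(pool), pool)
                    .thenCompose(f -> f)
                    // Continuation: builds the response once the metadata is available.
                    .thenApply(meta -> "response:" + meta));
            }
            int done = 0;
            for (CompletableFuture<String> r : results) {
                try {
                    r.get(timeoutMs, TimeUnit.MILLISECONDS);
                    done++;
                } catch (TimeoutException e) {
                    // Would indicate starvation; not expected in this variant.
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return done;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool=1, jobs=8: completed " + completedJobs(1, 8, 2000));
    }
}
```

Because no job ever parks a pool thread, even a single-threaded pool can work through all jobs and their inner tasks.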

      Thread dump excerpts showing the hanging mgmt threads:

      
      "mgmt-#10633" #14311 prio=5 os_prio=0 tid=0x0000560c79117000 nid=0x134c6 waiting on condition [0x00007f15baa77000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
      	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
      	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
      	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.sqlMetadata(GridCacheQueryManager.java:1803)
      	at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1123)
      	at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1088)
      	at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
      	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
      	at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
      	at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:490)
      	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
      	at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1270)
      	at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:2088)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
      	at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
      	at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      "mgmt-#81" #270 prio=5 os_prio=0 tid=0x0000562323c3c800 nid=0x592 waiting on condition [0x00007fba5f378000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
      	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
      	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
      	at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor$ClientChangeGlobalStateComputeRequest.run(GridClusterStateProcessor.java:1979)
      	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4.execute(GridClosureProcessor.java:1943)
      	at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
      	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
      	at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
      	at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:490)
      	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
      	at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1270)
      	at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:2088)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
      	at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
      	at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
      	at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
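
When checking a full thread dump for this pattern, it can help to list the pool threads that are parked in a future's get(). The following is a rough sketch that assumes the jstack-style text format shown above (a quoted thread name on the header line, followed by "at ..." frames). The class BlockedPoolThreads and its find method are hypothetical helpers, not part of Ignite or the JDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for scanning a jstack-style text dump. */
class BlockedPoolThreads {
    /**
     * Returns the names of threads whose name starts with namePrefix
     * and whose stack contains a frame with the given text.
     */
    static List<String> find(String dump, String namePrefix, String frame) {
        List<String> blocked = new ArrayList<>();
        String cur = null; // Matching thread currently being scanned, or null.
        for (String line : dump.split("\\R")) {
            String s = line.trim();
            if (s.startsWith("\"")) {
                // Header line: the thread name is the text between the first two quotes.
                int end = s.indexOf('"', 1);
                String name = end > 0 ? s.substring(1, end) : s.substring(1);
                cur = name.startsWith(namePrefix) ? name : null;
            } else if (cur != null && s.startsWith("at ") && s.contains(frame)) {
                blocked.add(cur);
                cur = null; // Report each thread only once.
            }
        }
        return blocked;
    }
}
```

For the excerpts above, calling find(dump, "mgmt-", "GridFutureAdapter.get(") would report both mgmt threads, since each stack contains a GridFutureAdapter.get frame.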
      

          People

            Assignee: Pavel Vinokurov (pvinokurov)
            Reporter: Pavel Vinokurov (pvinokurov)
            Votes: 0
            Watchers: 4

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 10m