Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
1.14.0
None
None
Description
We noticed this report of a stuck thread in a test that enabled max-threads in a cache server:
[warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822 <ThreadsMonitor> tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> has been stuck for <46.356 seconds> and number of thread monitor iteration <1>
Thread Name <ServerConnection on port 26188 Thread 5> state <RUNNABLE>
Executor Group <PooledExecutorWithDMStats>
Monitored metric <ResourceManagerStats.numThreadsStuck>
Thread stack:
sun.nio.ch.FileDispatcherImpl.read0(Native Method)
sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
sun.nio.ch.IOUtil.read(IOUtil.java:192)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237)
org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859)
org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698)
org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213)
org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229)
org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816)
org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777)
org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73)
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710)
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown Source)
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown Source)
java.lang.Thread.run(Thread.java:748)
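The stack shows the thread blocked in a socket read (Message.readWrappedHeaders via BaseCommand.readRequest) while waiting for a client request, and the thread monitor reports that wait as a stuck thread. The cache server should suspend thread monitoring before reading from a socket and resume it afterward. An example of this pattern can be found in org.apache.geode.internal.tcp.Connection.java.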
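
The suspend/resume idea proposed above can be illustrated with a small standalone sketch. It does not use Geode's classes: MonitorHandle, its methods, and readWithMonitoringSuspended are hypothetical names invented for this example, and the real code in Connection.java may be structured differently. The sketch only shows the shape of the pattern: mark the thread as unmonitored, do the blocking read, and restore monitoring in a finally block so an exception cannot leave it suspended.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Illustrative sketch only; these types are hypothetical, not Geode's API.
final class MonitoredReadSketch {

  /** Hypothetical per-thread handle that a monitor thread would poll. */
  static final class MonitorHandle {
    private volatile boolean suspended;
    private volatile long startTimeMillis = System.currentTimeMillis();

    void suspendMonitoring() {
      suspended = true;
    }

    void resumeMonitoring() {
      // Restart the clock first so time spent blocked is never counted as stuck time.
      startTimeMillis = System.currentTimeMillis();
      suspended = false;
    }

    /** What a monitor thread might evaluate on each iteration. */
    boolean isStuck(long nowMillis, long limitMillis) {
      return !suspended && nowMillis - startTimeMillis > limitMillis;
    }
  }

  /** Performs a possibly blocking read with monitoring suspended for its duration. */
  static int readWithMonitoringSuspended(ReadableByteChannel channel, ByteBuffer buffer,
      MonitorHandle handle) throws IOException {
    handle.suspendMonitoring();
    try {
      return channel.read(buffer); // may block for as long as the client is idle
    } finally {
      handle.resumeMonitoring(); // always restore monitoring, even if the read throws
    }
  }
}
```

With a wrapper like this, an idle client connection would keep the handle suspended while the server thread waits in read, so a monitor polling isStuck would not flag it; work done after the read returns is monitored again from a fresh start time.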