Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
proton-c-0.32.0
-
None
-
None
Description
As observed in comments on PROTON-2225, the c-threaderciser test can occasionally get stuck.
The following is output after I set static bool debug_enable = true; in the test source, and run the test under valgrind (as it seems to be more likely to get stuck when running under valgrind).
while true; do /usr/bin/python "/home/jdanek/repos/qpid/qpid-proton/scripts/env.py" "--" "TEST_EXE_PREFIX=/usr/bin/valgrind --tool=memcheck --leak-check=full --error-exitcode=42 --quiet --suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/usr/bin/valgrind" "--tool=memcheck" "--leak-check=full" "--error-exitcode=42" "--quiet" --vgdb=yes "--suppressions=/home/jdanek/repos/qpid/qpid-proton/tests/valgrind.supp" "/home/jdanek/repos/qpid/qpid-proton/cmake-build-debug/c/tests/c-threaderciser" | tee cthreadlog.txt; done [...] threaderciser start: threads=8, time=1, actions=[listen, close-listen, connect, close-connect, wake, timeout, cancel-timeout] (4e2ab80) [0x4e2e220] listen (6a2e700) user_thread start (6a2e700) [0x4e2d140] timeout (722f700) proactor_thread start (8a32700) user_thread start (722f700) [0x4e2e220] listening on 0.0.0.0:36767 (722f700) [0x4e31170] connect (8231700) proactor_thread start (7a30700) user_thread start (6a2e700) [0x4e3a480] connect (5a2c700) user_thread start (4e2ab80) shut down (5a2c700) user_thread end (8a32700) [0x4e30840] listen (622d700) proactor_thread start (8231700) [0x4e39b50] listen (7a30700) [0x4e3b360] listen (9233700) proactor_thread start (6a2e700) [0x4e3a480] wake (622d700) [0x4e30840] listening on 0.0.0.0:39411 (622d700) [0x4e44e40] connect (8a32700) user_thread end (6a2e700) [0x4e2d140] timeout (9233700) [0x4e3b360] listening on 0.0.0.0:35627 (9233700) [0x4e4d3f0] connect (6a2e700) user_thread end (7a30700) user_thread end (4e2ab80) disconnect (8231700) [0x4e39b50] listening on 0.0.0.0:45417 (8231700) proactor_thread end [... I've let it run for over an hour at this point, but got no further output ...]
GDB says
(gdb) thread apply all bt Thread 4 (Thread 137904): #0 0x0000000004a1dd49 in epoll_wait (epfd=4, events=0x4e2db50, maxevents=16, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30 #1 0x00000000048595ef in proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2623 #2 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802 #3 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474 #4 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477 #5 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6 Thread 3 (Thread 137900): #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e30304) at ../sysdeps/nptl/futex-internal.h:183 #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e302b0, cond=0x4e302d8) at pthread_cond_wait.c:508 #2 __pthread_cond_wait (cond=cond@entry=0x4e302d8, mutex=mutex@entry=0x4e302b0) at pthread_cond_wait.c:638 #3 0x0000000004859bd4 in suspend (ts=0x4e302b0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512 #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740 #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802 #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474 #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477 #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6 Thread 2 (Thread 137898): #0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x4e43520) at ../sysdeps/nptl/futex-internal.h:183 #1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x4e434d0, cond=0x4e434f8) at pthread_cond_wait.c:508 #2 __pthread_cond_wait (cond=cond@entry=0x4e434f8, mutex=mutex@entry=0x4e434d0) at pthread_cond_wait.c:638 #3 0x0000000004859bd4 in suspend (ts=0x4e434d0, p=0x4e2d140) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:512 #4 proactor_do_epoll (p=0x4e2d140, can_block=can_block@entry=true) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2740 #5 0x0000000004859e6b in pn_proactor_wait (p=<optimized out>) at /home/jdanek/repos/qpid/qpid-proton/c/src/proactor/epoll.c:2802 #6 0x000000000010afc6 in proactor_thread (void_g=0x1ffeffda50) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:474 #7 0x00000000048d881f in start_thread (arg=<optimized out>) at pthread_create.c:477 #8 0x0000000004a1d963 in clone () from /usr/lib64/haswell/libc.so.6 Thread 1 (Thread 137869): #0 __pthread_clockjoin_ex (threadid=102946560, thread_return=0x1ffeffda48, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145 #1 0x000000000010b5ff in main (argc=argc@entry=1, argv=0x1ffeffda50, argv@entry=0x1ffeffdc98) at /home/jdanek/repos/qpid/qpid-proton/c/tests/threaderciser.c:583 #2 0x000000000491d51d in __libc_start_main (main=0x10b225 <main>, argc=1, argv=0x1ffeffdc98, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x1ffeffdc88) at ../csu/libc-start.c:308 #3 0x000000000010a3ce in _start ()
Attachments
Issue Links
- is duplicated by
-
PROTON-2362 c-threaderciser timed out on 32-core machine.
- Closed
- relates to
-
PROTON-2230 Valgrind and Sanitizer issues in c-threaderciser test
- Open
-
PROTON-2225 Set -DTHREADERCISER=ON by default, to prevent regressions
- Closed