Deadlock in session daemon and consumer for short lived apps in live per-pid
I encountered a deadlock of the session daemon when tracing short-lived UST apps in bursts. I was not able to reproduce it.
Here is the session configuration:
lttng create allo --live
lttng enable-channel --buffers-pid -s allo --tracefile-size=5M --tracefile-count=4 -u chan1
lttng enable-event -u -a -c chan1
lttng start allo
After configuring this session, I ran the following for loop, followed by an lttng destroy command:
for i in `seq 1 10`; do ../../../lttng-emojis/main & done; lttng destroy -a
These apps run and exit very quickly. The lttng destroy command never completes.
I attached the source code of the app as well as the backtraces of both lttng session daemon and the lttng consumer daemon.
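For anyone who wants to capture comparable data, backtraces like these can be collected along the following lines. This is only a sketch, not necessarily how the attached files were produced: it assumes gdb and pgrep are installed, that the daemons run under their usual process names (lttng-sessiond, lttng-consumerd), and that the caller is allowed to attach to them. The dump_backtraces function and the output file naming are made up for this example.

```shell
#!/bin/sh
# Sketch: dump the backtraces of all threads of every running process
# that has the given name. Function name and output file names are
# hypothetical, chosen for this example only.
dump_backtraces() {
    name=$1
    # pgrep -x matches the process name exactly; there may be several PIDs
    # (for example one consumer daemon per domain or bitness).
    for pid in $(pgrep -x "$name"); do
        # -batch makes gdb run the command, detach, and exit.
        gdb -p "$pid" -batch -ex "thread apply all bt" \
            > "$name-$pid-bt.txt" 2>&1
    done
}

dump_backtraces lttng-sessiond
dump_backtraces lttng-consumerd
```

Attaching with gdb stops the target briefly, so this is best done once the hang has already occurred.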
Here is what we have gathered so far:
sessiond thread 8: waiting on consumer socket (recv), while holding the consumer socket lock
sessiond thread 11: waiting for socket lock, while holding ust registry lock
sessiond thread 15: waiting for ust registry lock
consumerd thread 4: waiting for stream lock
consumerd thread 7: waiting on consumer metadata socket (recv), while holding stream lock
consumerd thread 8: waiting for stream lock
I was running this lttng-tools branch: https://github.com/PSRCode/lttng-tools/tree/live-per-pid (commit f0e3b9ebe)
Updated by Francis Deslauriers 5 months ago
- File sessiond-bt-2.txt added
- File consumerd-bt-2.txt added
- File relayd-bt-2.txt added
I was able to reproduce it by using the following loop:
for i in `seq 1 40`; do ./setup-trace.sh; ../../../lttng-emojis/main & lttng destroy -a; done
I attached the backtraces of this run.
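Since the symptom is an lttng destroy that never returns, a variant of the loop that gives up after a delay may be easier to leave running unattended. The sketch below rests on several assumptions: setup-trace.sh is not attached, so it is assumed to be equivalent to the four configuration commands from the description; the 60 second limit, the reproduce function, and the LTTNG and APP variables are made up for this example.

```shell
#!/bin/sh
# Sketch of a reproduction loop that reports a hang instead of blocking.
# Assumption: the setup step matches the session configuration from the
# description. LTTNG, APP, reproduce and the 60 s limit are hypothetical.
LTTNG=${LTTNG:-lttng}
APP=${APP:-../../../lttng-emojis/main}

reproduce() {
    runs=$1
    i=1
    while [ "$i" -le "$runs" ]; do
        "$LTTNG" create allo --live
        "$LTTNG" enable-channel --buffers-pid -s allo \
            --tracefile-size=5M --tracefile-count=4 -u chan1
        "$LTTNG" enable-event -u -a -c chan1
        "$LTTNG" start allo
        "$APP" &
        # timeout exits with status 124 if the command exceeds the limit.
        if ! timeout 60 "$LTTNG" destroy -a; then
            echo "run $i: lttng destroy failed or hung" >&2
            return 1
        fi
        i=$((i + 1))
    done
}
```

Calling `reproduce 40` mirrors the loop above and stops at the first iteration where the destroy command fails or does not return in time, leaving the daemons in the hung state for inspection.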