Bug #530
closed
lttng-tools2.2.0rc2: ConsumerD segfault in ustctl_flush_buffer (stream=0x63c4e0, producer_active=0) at ustctl.c:1425
100%
Description
Commit used:
============
userspace-rcu: 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize side-effect in 0.7
lttng-ust    : 83e4321 (HEAD, origin/master, origin/HEAD) Fix: incorrect support for multi-context
lttng-tools  : 1274479 (HEAD, origin/master, origin/HEAD) Fix: check channel subbuf size against page size
babeltrace   : 5bfcad9 (HEAD, origin/master, origin/HEAD) Fix: handling of empty streams

Problem Description:
====================
* While our stability test is running, consumerD segfaults intermittently with the following gdb info:
(gdb) bt
#0 0x00007f79958c72fc in ustctl_flush_buffer (stream=0x63c4e0, producer_active=0) at ustctl.c:1425
#1 0x000000000041c57b in lttng_ustconsumer_on_stream_hangup (stream=0x647590) at ust-consumer.c:1196
#2 0x000000000040a5e9 in consumer_thread_data_poll (data=0x633d10) at consumer.c:2500
#3 0x00007f799549f7b6 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f79951fac5d in clone () from /lib64/libc.so.6
#5 0x0000000000000000 in ?? ()
* This segfault happens very often (about every 10 minutes).

Is problem reproducible ?
=========================
* yes

How to reproduce (if reproducible):
===================================
* We have 4 users, each with a different set of trace actions to perform (e.g. create session, enable-channel, enable-event, start session, stop session, etc.). Once the set of actions is done, the user sleeps for a few seconds and then repeats the same set of actions. While running the above, we encounter this consumerD segfault.

Any other information:
======================
- Included in this bug report are the gdb printout and the "sessiond -vvv --verbose-consumer" printout. The segfault occurred at about 14:10.
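The reproduction steps above could be approximated with a small driver like the one below. This is only a sketch and not the reporter's actual test suite: the session and channel names, the use of threads instead of four separate system users, the choice of enabling all UST events, and the final `lttng destroy` step are all assumptions made for illustration. The helper names (`session_cycle`, `user_loop`, `stress`) are hypothetical.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def run_cli(argv: Sequence[str]) -> int:
    """Run one command for real (requires the lttng CLI and a session daemon)."""
    return subprocess.run(list(argv), check=False).returncode


def session_cycle(session: str, channel: str = "chan0") -> List[List[str]]:
    """Build one create/enable/start/stop cycle for a session.

    The exact commands are an assumption; the report only names the kinds
    of actions (create session, enable-channel, enable-event, start, stop).
    """
    return [
        ["lttng", "create", session],
        ["lttng", "enable-channel", "-u", "-s", session, channel],
        ["lttng", "enable-event", "-u", "-a", "-s", session, "-c", channel],
        ["lttng", "start", session],
        ["lttng", "stop", session],
        ["lttng", "destroy", session],
    ]


def user_loop(name: str, iterations: int, pause_s: float, run: Runner) -> int:
    """Repeat the cycle, sleeping between rounds; return the failed command count."""
    failures = 0
    for i in range(iterations):
        for argv in session_cycle(f"{name}-{i}"):
            if run(argv) != 0:
                failures += 1
        time.sleep(pause_s)
    return failures


def stress(users: int = 4, iterations: int = 3, pause_s: float = 2.0,
           run: Runner = run_cli) -> int:
    """Run several simulated users concurrently; return the total failure count."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(user_loop, f"user{u}", iterations, pause_s, run)
            for u in range(users)
        ]
        return sum(f.result() for f in futures)
```

Calling `stress()` with the default runner issues real `lttng` commands, so it should only be run on a test node with a session daemon started. Passing a different `run` callable makes it possible to dry-run the sequence without touching the tracer.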
Files
Updated by David Goulet over 11 years ago
- Status changed from New to Confirmed
- Assignee set to David Goulet
- Target version set to 2.2
Updated by David Goulet over 11 years ago
- File bug530.patch added
This patch should fix the issue. I'll wait for your ACK before merging it. There is a clear race that the patch fixes.
Updated by Tan le tran over 11 years ago
Hi David,
The old segfault is no longer there, but a new one is observed.
It now occurs on multiple nodes, and it happens about every 3-7 minutes.
We are still running the same test suite described at the top of this
bug report.
New commits used:
=================
lttng-tools: c5854b1 (HEAD, origin/master, origin/HEAD) Fix: use memset instead of poll ...
+ Apply bug530 patch (from update #2)
lttng-ust : 352fce3 (HEAD, origin/master, origin/HEAD) Remove 0.x TODO
rcu : 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize...
babeltrace : 5bfcad9 (HEAD, origin/master, origin/HEAD) Fix: handling of empty streams
New gdb_printout is attached.
I have quickly checked the other coredumps in gdb and they all have very similar backtraces.
Please let us know if further information is needed.
Updated by David Goulet over 11 years ago
Thanks Tan! I'll be merging this fix, and I've opened a new bug for the new issue (#536).
This bug will be closed once the commit is done.
Updated by David Goulet over 11 years ago
- Status changed from Confirmed to Resolved
- % Done changed from 0 to 100
Applied in changeset b31398bb2b3fa91a53dea3b36fd693da4b50e0d3.