Bug #530

closed
lttng-tools2.2.0rc2: ConsumerD segfault in ustctl_flush_buffer (stream=0x63c4e0, producer_active=0) at ustctl.c:1425

Added by Tan le tran over 12 years ago. Updated over 12 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
Start date:
05/13/2013
Due date:
% Done:

100%

Estimated time:

Description

Commit used:
============
userspace: 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize side-effect in 0.7
ust      : 83e4321 (HEAD, origin/master, origin/HEAD) Fix: incorrect support for multi-context
tools    : 1274479 (HEAD, origin/master, origin/HEAD) Fix: check channel subbuf size against page size
babeltrace: 5bfcad9 (HEAD, origin/master, origin/HEAD) Fix: handling of empty streams

Problem Description:
====================
 * While our stability test is running, consumerD segfaults intermittently with the following
   gdb info:
      (gdb) bt
      #0  0x00007f79958c72fc in ustctl_flush_buffer (stream=0x63c4e0, producer_active=0) at ustctl.c:1425
      #1  0x000000000041c57b in lttng_ustconsumer_on_stream_hangup (stream=0x647590) at ust-consumer.c:1196
      #2  0x000000000040a5e9 in consumer_thread_data_poll (data=0x633d10) at consumer.c:2500
      #3  0x00007f799549f7b6 in start_thread () from /lib64/libpthread.so.0
      #4  0x00007f79951fac5d in clone () from /lib64/libc.so.6
      #5  0x0000000000000000 in ?? ()

  * This segfault happens very often (about every 10 minutes).

Is problem reproducible ?
=========================
  * yes 

How to reproduce (if reproducible):
===================================
  * We have 4 users, each with a different set of trace actions to perform (e.g. create session, enable-channel,
    enable-event, start session, stop session, etc.). Once the set of actions has been completed, that user sleeps
    for a few seconds and repeats the same set of actions.

    While running the above, we encounter this consumerD segfault issue.
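
    A minimal sketch of such a stress loop is given here for illustration only. It is not the
    reporter's actual test suite: the session and channel names, the number of rounds, the pause
    length, and the use of one identical action set for every user are all assumptions (the report
    says each user runs a different set). Only the `lttng` subcommands named above, plus `destroy`
    for cleanup, are used.

```python
import subprocess
import threading
import time


def build_commands(session, channel="chan0"):
    """Return one round of lttng CLI calls for one user.

    The session and channel names are hypothetical; the report does not
    give the exact action set each user runs.
    """
    return [
        ["lttng", "create", session],
        ["lttng", "enable-channel", "-u", channel, "-s", session],
        ["lttng", "enable-event", "-u", "-a", "-c", channel, "-s", session],
        ["lttng", "start", session],
        ["lttng", "stop", session],
        ["lttng", "destroy", session],
    ]


def run_user(user_id, rounds, pause_s, runner=subprocess.call):
    """Repeat the action set `rounds` times, sleeping between rounds.

    `runner` is called once per command; it defaults to subprocess.call
    and can be replaced with a recorder for a dry run.
    Returns the number of commands issued.
    """
    session = "stress_user%d" % user_id
    issued = 0
    for _ in range(rounds):
        for cmd in build_commands(session):
            runner(cmd)
            issued += 1
        time.sleep(pause_s)
    return issued


def run_all(users=4, rounds=10, pause_s=3.0, runner=subprocess.call):
    """Run `users` concurrent loops, one thread per simulated user."""
    threads = [
        threading.Thread(target=run_user, args=(uid, rounds, pause_s, runner))
        for uid in range(users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

    Calling `run_all()` on a host with a running session daemon would exercise concurrent session
    setup and teardown; passing `runner=print` gives a dry run that only lists the commands.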

Any other information:
======================
- Attached to this bug report are the gdb printout and the "sessiond -vvv --verbose-consumer" printout.
  The segfault occurred at about 14:10.


Files

gdb_printout.log (12.7 KB): gdb printout. Tan le tran, 05/13/2013 02:44 PM
sessiond_verbose.tar (57.9 KB): "sessiond -vvv --verbose-consumer" output. Tan le tran, 05/13/2013 02:44 PM
bug530.patch (2.34 KB). David Goulet, 05/15/2013 01:17 PM
May17_Pacth1_applied_gdb_printout.log (29.7 KB): gdb printout after applying the patch (from update #2), May 17. Tan le tran, 05/17/2013 09:06 AM

Updated by David Goulet over 12 years ago #1

  • Status changed from New to Confirmed
  • Assignee set to David Goulet
  • Target version set to 2.2

Updated by David Goulet over 12 years ago #2

This patch should fix the issue. I'll wait for your ACK before merging it. There is a clear race that the patch fixes.

Updated by Tan le tran over 12 years ago #3

Hi David,

The old segfault is no longer there, but a new one is observed.
It now occurs on multiple nodes, about every 3-7 minutes.
We are still running the same test suite described at the
top of this bug report.

New commits used:
=================
lttng-tools: c5854b1 (HEAD, origin/master, origin/HEAD) Fix: use memset instead of poll ...
+ Apply bug530 patch (from update #2)

lttng-ust : 352fce3 (HEAD, origin/master, origin/HEAD) Remove 0.x TODO
rcu : 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize...
babeltrace : 5bfcad9 (HEAD, origin/master, origin/HEAD) Fix: handling of empty streams

New gdb_printout is attached.
I have quickly checked the other coredumps in gdb and they all have very similar backtraces.

Please let us know if further info is needed.

Updated by David Goulet over 12 years ago #4

Thanks Tan! I'll be merging this fix and I've opened a new bug with this new issue (#536).

This bug will be closed once the commit is done.

Updated by David Goulet over 12 years ago #5

  • Status changed from Confirmed to Resolved
  • % Done changed from 0 to 100