Bug #546

closed

lttng2.2.0rc2: SessionD hang after receiving lttng_enable_channel() API call

Added by Tan le tran over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: High
Target version:
Start date: 05/29/2013
Due date:
% Done: 100%
Estimated time:

Description

Commit used:
============
babeltrace  : 9eaf254 Version 1.0.3
tools       : 094d169 (HEAD, origin/master, origin/HEAD) Fix: dereference after NULL check
ust         : 996aead (HEAD, origin/master, origin/HEAD) Add parameter -f to rm in Makefile clean target
userspace   : 264716f (HEAD, origin/stable-0.7, stable-0.7) Fix: Use a filled signal mask to disable all signals

Problem Description:
====================
 * sessiond hangs (does not respond) during stability testing.
 * The hangs start after about 30 minutes of the stability run.
 * Our log shows that every hang occurs when the lttng_enable_channel() API is invoked.
     :
     86520 09:30:23 05/29/2013 IN TraceP_SC-1 "DBG: activateSession: invoke lttng_create_session for session STAB001_1 with url net://192.168.0.1:54176:53483/./.
     86521 09:30:23 05/29/2013 IN TraceP_SC-1 "DBG: activateSession: Invoking lttng_create_handle for session STAB001_1
     86522 09:30:23 05/29/2013 IN TraceP_SC-1 "DBG: activateSession: Invoking lttng_set_default_attr for session STAB001_1.
     86523 09:30:23 05/29/2013 IN TraceP_SC-1 "DBG: activateSession: Invoking lttng_enable_channel for session STAB001_1
     :  #--- No reply is received for this call. After a timeout of about 60 sec (+- 5), our process
        #--- starts to "kill -9" sessiond and consumerd.
        #    We have modified the code to use "kill -SIGABRT" on sessiond and consumerd so that
        #    a coredump can be generated. The corresponding gdb printouts are attached to
        #    this report.
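
The roughly 60 second wait for a reply described above can be sketched in C as a bounded wait on a reply socket. This is an illustration only: `wait_for_reply` and `demo_silent_peer` are hypothetical names, the socketpair stands in for a client's connection to sessiond, and nothing here is taken from LTTng or from the reporter's health check code.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the LTTng API): wait up to timeout_ms
 * milliseconds for a reply to become readable on fd.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error.
 */
int wait_for_reply(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	assert(fd >= 0);
	do {
		/* Simplification: the full timeout restarts if a signal interrupts poll(). */
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* timed out: the peer never answered */
	return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * Demo with a stand-in peer: one end of a socketpair that stays silent
 * at first and then answers. Returns 0 if both waits behave as expected.
 */
int demo_silent_peer(void)
{
	int sv[2];
	int silent, answered;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	silent = wait_for_reply(sv[0], 100);	/* nothing written yet */
	if (write(sv[1], "ok", 2) != 2) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	answered = wait_for_reply(sv[0], 100);	/* reply is now pending */

	close(sv[0]);
	close(sv[1]);
	return (silent == 0 && answered == 1) ? 0 : -1;
}
```

A caller would treat a return value of 0 as "no reply within the deadline" and escalate from there, for example by collecting diagnostics before restarting the daemon.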

Is the problem reproducible?
============================
  * Yes

How to reproduce (if reproducible):
===================================
  * Our stability test consists of 4 users. Each user has a different set of trace commands
    (such as create session, activate session, stop session, etc.). Each user then executes
    its set of commands over multiple iterations.
    All sessions are created using streaming and per-UID buffers, tracing userspace only.

    After about 30 minutes of running (each user has finished a few iterations), we start noticing
    that sessiond hangs. Our health check process will "kill -9" sessiond and consumerd
    and start new ones. After a while, sessiond hangs again with the same
    symptom. During an overnight run, this happens many times.

    To gather more data, we have modified our code to use "kill -SIGABRT" to
    kill sessiond and consumerd so that a coredump can be obtained when a hang occurs.
    The corresponding gdb printouts are included in this report.
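
The switch from "kill -9" to "kill -SIGABRT" could look like the following C sketch. It assumes the health check is the parent of the process it aborts, which may not match the reporter's setup. `abort_and_reap` and `spawn_hung_child` are hypothetical names, and the child that blocks in pause() is only a stand-in for a hung sessiond. Whether a core file is actually written also depends on system settings such as the core size limit (ulimit -c).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGABRT to a child process and reap it.
 * Returns the signal that terminated the child, or -1 on error.
 * Assumption: the caller is the parent of pid, otherwise waitpid() fails.
 */
int abort_and_reap(pid_t pid)
{
	int status;
	pid_t ret;

	assert(pid > 0);
	/* Unlike SIGKILL, the default action of SIGABRT is to terminate with a core dump. */
	if (kill(pid, SIGABRT) < 0)
		return -1;

	do {
		ret = waitpid(pid, &status, 0);
	} while (ret < 0 && errno == EINTR);
	if (ret < 0)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}

/* Stand-in for a hung daemon: a child that blocks until a signal kills it. */
pid_t spawn_hung_child(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	return pid;	/* child pid in the parent, -1 if fork() failed */
}
```

Checking that the returned signal equals SIGABRT confirms the process died from the abort request rather than exiting on its own.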

Any other information:
======================
-   

Files

may29_sessiond_consumerd_gdb_printout.log (22.8 KB) - gdb printout for sessiond and consumerd after sending "kill -SIGABRT" - Tan le tran, 05/29/2013 01:04 PM
gdb_printous.log (27.6 KB) - gdb printout of relayd, sessiond and consumerd when hanging occurs - Tan le tran, 05/31/2013 10:58 AM
sessiond_vvv.log.tar (429 KB) - "sessiond -vvv" (from 09:36:04 to when sessiond got restarted) - Tan le tran, 06/04/2013 12:14 PM
netstat_ps_printout.txt (3.63 KB) - netstat printout when hanging occurs - Tan le tran, 06/04/2013 12:14 PM
gdb_printouts.log (15.6 KB) - gdb printout of relayd and sessiond after SIGABRT sent - Tan le tran, 06/04/2013 12:14 PM
sessiond_vvv.log (459 KB) - printout of "sessiond -vvv" - Tan le tran, 06/07/2013 07:50 PM
relayd_vvv.log (129 KB) - printout of "relayd -vvv" - Tan le tran, 06/07/2013 07:50 PM
bug546.diff (1.26 KB) - David Goulet, 06/18/2013 03:51 PM
sessiond_vvv.log.tar (7.72 KB) - sessiond -vvv log when attempting to activate session - Tan le tran, 06/20/2013 03:30 PM
1850_Netstat.log (2.09 KB) - netstat -etanp printout - Tan le tran, 06/25/2013 09:53 AM
1850_relayd_vvv.log (175 KB) - relayd -vvv printout - Tan le tran, 06/25/2013 09:53 AM
1850_extra_printouts.log (3.82 KB) - requested data printout - Tan le tran, 06/25/2013 09:53 AM
socket-timeout-1.patch (5.13 KB) - Mathieu Desnoyers, 07/11/2013 01:49 PM
socket-timeout-2.patch (6.62 KB) - Mathieu Desnoyers, 07/11/2013 01:49 PM
socket-timeout-3.patch (6.05 KB) - Mathieu Desnoyers, 07/11/2013 01:49 PM
sessiond_vvv_combined_1.log.tar (311 KB) - sessiond -vvv log (part1) - Tan le tran, 07/16/2013 09:13 AM
sessiond_vvv_combined_2.log.tar (425 KB) - sessiond -vvv log (part2) - Tan le tran, 07/16/2013 09:13 AM
sessiond_vvv.log.tar (21.6 KB) - sessiond -vvv log - Tan le tran, 07/23/2013 09:59 AM
connect-timeout-2-replacement.patch (8 KB) - Mathieu Desnoyers, 07/23/2013 04:13 PM