Bug #510
closed
lttng2.2.0rc1: ConsumerD coredump once in a while after a few iterations of "TC dealing with killing relayd while session is active"
Start date:
04/23/2013
Due date:
% Done:
0%
Estimated time:
Description
Commit used:
============
rcu : 8a97780 (HEAD) Fix: tests/api.h use cpuset.h
ust : a16877a (HEAD, origin/master, origin/HEAD) Typo fix in README
tools : 6bf5e7c (HEAD, origin/master, origin/HEAD) Fix: remove mention of trace ...
babeltrace : d088404 (HEAD, origin/master, origin/HEAD) Use objstack for AST allocation
Problem Description:
====================
* ConsumerD coredump with "in __assert_fail () from /lib64/libc.so.6"
So far, 2 different flavours (in terms of "gdb> bt" printouts) have been seen:
Case1:
(gdb) bt
#0 0x00007ff900a13b55 in raise () from /lib64/libc.so.6
#1 0x00007ff900a15131 in abort () from /lib64/libc.so.6
#2 0x00007ff900a0ca10 in __assert_fail () from /lib64/libc.so.6
#3 0x000000000041c28f in lttng_ustconsumer_request_metadata (ctx=0x632d10, channel=0x6359b0) at ust-consumer.c:1404
#4 0x000000000040c88f in metadata_switch_timer (ctx=0x632d10, sig=44, si=0x7ff8fe5cdb10, uc=0x0) at consumer-timer.c:68
#5 0x000000000040ce85 in consumer_timer_metadata_thread (data=0x632d10) at consumer-timer.c:215
#6 0x00007ff900d5c7b6 in start_thread () from /lib64/libpthread.so.0
#7 0x00007ff900ab8c6d in clone () from /lib64/libc.so.6
#8 0x0000000000000000 in ?? ()
Case2:
(gdb) bt
#0 0x00007feadea4bb55 in raise () from /lib64/libc.so.6
#1 0x00007feadea4d131 in abort () from /lib64/libc.so.6
#2 0x00007feadea44a10 in __assert_fail () from /lib64/libc.so.6
#3 0x0000000000413a29 in lttng_ht_add_unique_u64 (ht=0x633fa0, node=0x63c138) at hashtable.c:281
#4 0x000000000040ad1e in consumer_thread_channel_poll (data=0x632d10) at consumer.c:2697
#5 0x00007feaded947b6 in start_thread () from /lib64/libpthread.so.0
#6 0x00007feadeaf0c6d in clone () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
Is problem reproducible ?
=========================
* yes
How to reproduce (if reproducible):
===================================
* Running the following scenario a few times can produce a consumerD coredump
(NOTE: all interactions with lttng were done via API calls):
1)_ lttng-relayd -C tcp://0.0.0.0:53000 -D tcp://0.0.0.0:53001 -o <some output path> &
2)_ create a session (using the above ports)
a)_ lttng_create_session()
b)_ lttng_create_handle()
c)_ lttng_channel_set_default_attr() // one for ctf and one for metadata
d)_ lttng_enable_channel() // subbuf_size = 16384 (for both ctf and metadata)
// switch_timer_interval = 1000000 (for both ctf and metadata)
e)_ lttng_add_context(..,LTTNG_EVENT_CONTEXT_VPID,..)
f)_ lttng_enable_event_with_filter() // enable a tracepoint which has only 1 hit per second
3)_ start the instrumented apps to produce events
4)_ lttng_start_tracing()
5)_ sleep 5
6)_ "pkill relayd"
7)_ lttng_stop_tracing_no_wait()
8)_ loop every 1 sec until lttng_data_pending() returns 0 (time out after 60 sec)
9)_ lttng_destroy_session()
10)_ "pkill TestApp" (kill all running instrumented apps)
11)_ pause for a couple of seconds, then repeat steps 1 to 11
Sometimes the coredump happens during the 2nd or 3rd iteration.
* Unfortunately, we could not reproduce the above issue using the equivalent lttng CLI commands.
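For reference, the wait in step 8 can be sketched roughly as below. This is only an illustration, not the actual test code: wait_until_drained(), pending_fn and fake_pending() are hypothetical names, and the callback is assumed to follow the convention that 0 means no data pending, a positive value means data is still pending, and a negative value is an error.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical callback type. Assumed convention: 0 = no data pending,
 * > 0 = data still pending, < 0 = error.
 */
typedef int (*pending_fn)(const char *session_name);

/*
 * Poll up to max_tries times, sleeping interval_sec between polls.
 * Returns 0 once nothing is pending, 1 if data is still pending after
 * max_tries polls (timeout), or the negative error from the callback.
 */
static int wait_until_drained(pending_fn is_pending, const char *session_name,
                              unsigned int max_tries, unsigned int interval_sec)
{
    unsigned int i;

    assert(is_pending != NULL);

    for (i = 0; i < max_tries; i++) {
        int ret = is_pending(session_name);

        if (ret <= 0) {
            return ret;     /* 0: drained, < 0: error */
        }
        sleep(interval_sec);
    }
    return 1;               /* timed out with data still pending */
}

/* Stand-in callback for illustration: pending on the first two polls only. */
static int fake_calls;

static int fake_pending(const char *session_name)
{
    (void) session_name;
    return ++fake_calls <= 2 ? 1 : 0;
}
```

In the scenario above, the call would presumably look like wait_until_drained(lttng_data_pending, "<session name>", 60, 1), with a return value of 1 corresponding to the 60 sec timeout.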
Any other information:
======================
- "sessiond -vvv" logs have been collected for each of the above 2 cases.