Bug #1036
closedChannel creation stops working and returns "Command error" without apparent reason after a few iterations of session/channel create+destroy
0%
Description
Environment: Chroot on an ARM phone running Ubuntu Touch (15.04-based)
Tools:
lttng-ust 2.8.1 built from stable git branch
lttng-tools 2.8.0 built from stable git branch
My workflow is the following:
1) setup session and channel
lttng create
lttng enable-channel --userspace --num-subbuf 8 --subbuf-size 8M big-channel
lttng enable-event -ua --channel big-channel
lttng add-context -u -t vpid -t vtid -t procname -t ip
lttng start
2) run the app in a separate terminal
3) lttng stop + lttng destroy
4) Pull the trace, analyze etc
5) Rinse and repeat
What happens is that at one point (it seems when I wait more than a few minutes between the stop and the destroy commands)
the channel creation command starts failing.
The only way to fix it seem to be rebooting the device.
An example of sessiond log:
DEBUG2 - 13:12:59.823516 [25414/25420]: UST app session created successfully with handle 1 (in create_ust_app_session() at ust-app.c:2246)
DEBUG3 - 13:12:59.823659 [25414/25420]: Created hashtable size 4 at 0xaf0213c8 of type 1 (in lttng_ht_new() at hashtable.c:145)
DEBUG3 - 13:12:59.823745 [25414/25420]: Created hashtable size 4 at 0xaf026bd0 of type 0 (in lttng_ht_new() at hashtable.c:145)
DEBUG3 - 13:12:59.823819 [25414/25420]: UST app channel big-channel allocated (in alloc_ust_app_channel() at ust-app.c:1051)
DEBUG2 - 13:12:59.823888 [25414/25420]: UST app shadow copy of channel big-channel started (in shadow_copy_channel() at ust-app.c:1794)
DEBUG3 - 13:12:59.823966 [25414/25420]: UST app shadow copy of channel big-channel done (in shadow_copy_channel() at ust-app.c:1845)
DEBUG1 - 13:12:59.824041 [25414/25420]: UST app creating channel big-channel with per UID buffers (in create_channel_per_uid() at ust-app.c:2860)
DEBUG3 - 13:12:59.824113 [25414/25420]: Buffer registry per UID find id: 0, ABI: 32, uid: 0 (in buffer_reg_uid_find() at buffer-registry.c:203)
DEBUG2 - 13:12:59.824190 [25414/25420]: UST app creating buffer registry channel for big-channel (in create_buffer_reg_channel() at ust-app.c:2705)
DEBUG3 - 13:12:59.824271 [25414/25420]: Buffer registry channel create with key: 0 (in buffer_reg_channel_create() at buffer-registry.c:380)
DEBUG3 - 13:12:59.824371 [25414/25420]: Created hashtable size 4 at 0xaf021980 of type 0 (in lttng_ht_new() at hashtable.c:145)
DEBUG2 - 13:12:59.824457 [25414/25420]: Asking UST consumer for channel (in ask_channel_creation() at ust-consumer.c:120)
DEBUG3 - 13:12:59.824537 [25414/25420]: mkdir() recursive /home/phablet/lttng-traces/auto-20160704-131253/ust/uid/0/32-bit with mode 504 for uid 32011 and gid 32011 (in run_as_mkdir_recursive() at runas.c:470)
DEBUG1 - 13:12:59.824622 [25414/25420]: Using run_as worker (in run_as() at runas.c:451)
DEBUG3 - 13:12:59.825649 [25414/25420]: UST registry channel finding key 0 (in ust_registry_channel_find() at ust-registry.c:784)
Error: Error creating UST channel "big-channel" on the consumer daemon
An example of LTTNG_UST_DEBUG log (from a different session, sorry)
http://pastebin.ubuntu.com/18448735/
I made sure any session shown by lttng list was destroyed, so lttng list shows nothing
lsof output of consumerd http://pastebin.ubuntu.com/18459042/
Updated by Andrea Bernabei over 8 years ago
Last time it happened I checked /run/shm using "mount" and it still had 350Mb available
Updated by Mathieu Desnoyers over 8 years ago
Without rebooting your system, does killing and restarting lttng-sessiond repair the situation ?
Updated by Mathieu Desnoyers over 8 years ago
Extra relevant info from our IRC discussion: lttng-sessiond is executed as a user, within a container.
Updated by Andrea Bernabei over 8 years ago
I logged into the device this evening, and lttng list this time was showing many sessions, I destroyed them and I could create a new channel again.
I'm quite sure I killed sessiond and started it again with -vvv to provide the log above, so it was still not working.
I will doublecheck both the points above next time I encounter the same problem ;)
Updated by Andrea Bernabei over 8 years ago
After further investigation it appears that the culprit of the problem was that
some of the lttng commands where issues using the liblttng-ctl.so coming from /usr/lib/arm-linux-gnueabihf/
while others were likely being executed after changing the LD_LIBRARY_PATH to point to /usr/local/lib via helper scripts that I wrote. /usr/local/lib hosts a different version of liblttng-ctl.so, so essentially as a consequence some of the lttng commands were being executed using one version of liblttng-ctl, others using another version, and that was obviously causing lttng's misbehaviour.
Proof that shows a differente lttng list output before and after exporting LD_LIBRARY_PATH
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ which lttng
/usr/local/bin/lttng
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ lttng version
lttng version 2.8.0 - Isseki Nicho
The result of a collaboration between "Dieu du Ciel!" and Nagano-based "Shiga Kogen", Isseki Nicho is a strong Imperial Dark Saison offering a rich roasted malt flavor combined with a complex fruity finish typical of Saison yeasts.
Web site: http://lttng.org
lttng is free software and under the GPL license and part LGPL
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ lttng list
Currently no available tracing session
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ ldd /usr/local/bin/lttng
liblttng-ctl.so.0 => /usr/lib/arm-linux-gnueabihf/liblttng-ctl.so.0 (0xb6f16000)
[etc etc]
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ export LD_LIBRARY_PATH=/usr/local/lib:/vendor/lib:/system/lib
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ lttng list
Available tracing sessions:
1) auto-20160704-222355 (/home/phablet/lttng-traces/auto-20160704-222355) [inactive]
Trace path: /home/phablet/lttng-traces/auto-20160704-222355
Use lttng list <session_name> for more details
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ ldd /usr/local/bin/lttng
liblttng-ctl.so.0 => /usr/local/lib/liblttng-ctl.so.0 (0xb6f40000)
[same etc etc libs as above]
phablet@ubuntu-phablet:~/lttng_tracing_scripts$ which lttng
/usr/local/bin/lttng
I'll close this bug as invalid and reopen it in case I have enough evidence that there is actually a bug :)
Thanks for looking into it Mathieu!
Updated by Andrea Bernabei over 8 years ago
line 2: s/where issues/were issued
and "Proof that shows a different"*
Updated by Mathieu Desnoyers over 8 years ago
- Status changed from New to Invalid