LTTng bugs repository: Issueshttps://bugs.lttng.org/https://bugs.lttng.org/themes/lttng/favicon/a.ico?14249722912024-03-13T17:07:08ZLTTng bugs repository
Redmine LTTng-tools - Bug #1411 (Feedback): Memory leak when relay daemon exits before application startshttps://bugs.lttng.org/issues/14112024-03-13T17:07:08ZMikael Beckius
<p>When the relay daemon is shutdown after creating a live session but before applications are started the shared memory allocated for tracing appears to remain and new memory is allocated for every application start.</p>
<p>How to reproduce:<br />host:~# lttng create micke --live<br />Spawning a session daemon<br />Spawning a relayd daemon<br />Live session micke created.<br />Traces will be output to tcp4://127.0.0.1:5342/ [data: 5343]<br />Live timer interval set to 1000000 us</p>
<p>host:~# lttng enable-event --userspace --all<br />All ust events are enabled in channel channel0</p>
<p>host:~# lttng start<br />Tracing started for session micke</p>
<p>host:~# killall -9 lttng-relayd</p>
<p>host:~# free -h<br /> total used free shared buff/cache available<br />Mem: 15Gi 207Mi 15Gi 572Ki 320Mi 15Gi<br />Swap: 0B 0B 0B</p>
<p>host:~# ./micke-lttng<br />Mikael LTTNG 2015 - Starting<br />Mikael LTTNG 2015 - Signing out</p>
<p>host:~# free -h<br /> total used free shared buff/cache available<br />Mem: 15Gi 248Mi 15Gi 40Mi 360Mi 15Gi<br />Swap: 0B 0B 0B</p>
<p>host:~# ./micke-lttng<br />Mikael LTTNG 2015 - Starting<br />Mikael LTTNG 2015 - Signing out</p>
<p>host:~# free -h<br /> total used free shared buff/cache available<br />Mem: 15Gi 288Mi 15Gi 80Mi 400Mi 15Gi<br />Swap: 0B 0B 0B</p>
<p>host:~# lttng destroy micke<br />Destroying session micke..<br />Session micke destroyed</p>
<p>host:~# free -h<br /> total used free shared buff/cache available<br />Mem: 15Gi 289Mi 15Gi 80Mi 400Mi 15Gi<br />Swap: 0B 0B 0B<br />host:~#</p>
<p>Version:<br />lttng-tools 2.13.11</p>
<p>Analyzis:<br />It seems that when the first application of a session starts, after the relay daemon has been shutdown, a failure to transfer streams to the relay deamon triggers a clean up through a call to ust_consumer_destroy_channel. But it appears that the cleanup isn't complete and the channel reference count remains incremented. Decrementing the reference count appears to be blocked in clean_channel_stream_list by stream->monitor = 0; preventing CONSUMER_CHANNEL_DEL from reaching consumer_del_channel(chan);</p>
<p>Information has it that this problem is NOT reproduced on 2.13 but I haven't tested that myself</p> LTTng-tools - Bug #1407 (New): Hang when stopping a live session with a low live valuehttps://bugs.lttng.org/issues/14072023-12-19T20:03:54ZKienan Stewart
<p>While investigating <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Investigate why events are no longer recorded by the live view when `DELAYUS` in `tests/regressio... (Resolved)" href="https://bugs.lttng.org/issues/1403">#1403</a>, I noticed some test runs would intermittently hang when the live delay value was set very low (eg. 1us).</p>
<p>For example, in <code>tests/regression/tools/clear/test_ust</code> the test <code>test_ust_streaming_live</code> would hang executing <code>lttng stop sessionX</code>. After investigating the issue I noted the following:</p>
<ul>
<li>the lttng client is hanging, pending a response from the sessiond</li>
<li>the consumer's consumer_thread_data_poll thread appear to be looping in <code>consumer_timer_signal_thread_qs</code> while trying to remove the monitor timer during channel deletion: <pre>
Thread 7 (Thread 0x7f999bfff680 (LWP 3060295) "lttng-consumerd"):
#0 sigpending (set=0x7f999bffdea0) at ../sysdeps/unix/sysv/linux/sigpending.c:27
#1 0x00005559299f07ec in consumer_timer_signal_thread_qs (signr=47) at consumer/consumer-timer.cpp:325
#2 consumer_channel_timer_stop (timer_id=timer_id@entry=0x7f998c0021c0, signal=47) at consumer/consumer-timer.cpp:422
#3 0x00005559299f122f in consumer_timer_monitor_stop (channel=channel@entry=0x7f998c000f40) at consumer/consumer-timer.cpp:531
#4 0x00005559299dabbf in consumer_del_channel (channel=channel@entry=0x7f998c000f40) at consumer/consumer.cpp:379
#5 0x00005559299ee83a in consumer_stream_destroy (stream=stream@entry=0x7f998c01a830, ht=<optimized out>) at consumer/consumer-stream.cpp:1070
#6 0x00005559299e3c61 in consumer_del_stream (ht=<optimized out>, stream=0x7f998c01a830) at consumer/consumer.cpp:547
#7 consumer_thread_data_poll (data=0x55592b498ba0) at consumer/consumer.cpp:2747
#8 0x00007f99aaaa63ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#9 0x00007f99aab26a5c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
</pre></li>
<li>the consumerd's consumer_timer_thread was often handling the live timer: <pre>
Thread 4 (Thread 0x7f99a991b680 (LWP 3060292) "lttng-consumerd"):
#0 __GI___libc_write (nbytes=124, buf=0x7f99a9919e90, fd=2) at ../sysdeps/unix/sysv/linux/write.c:26
#1 __GI___libc_write (fd=2, buf=0x7f99a9919e90, nbytes=124) at ../sysdeps/unix/sysv/linux/write.c:24
#2 0x00007f99aaa9dbb5 in _IO_new_file_write (f=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=0x7f99a9919e90, n=124) at ./libio/fileops.c:1180
#3 0x00007f99aaa9cf70 in new_do_write (fp=fp@entry=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=data@entry=0x7f99a9919e9
0 "DBG1 - 10:31:33.730698608 [3060287/3060292]: Live timer for channel 17 (in live_timer() at consumer/consumer-timer.cpp:284)\n", to_do=to_do@entry=124) at ./libio/libioP.h:946
#4 0x00007f99aaa9e2a1 in _IO_new_file_xsputn (n=124, data=<optimized out>, f=0x7f99aabf26a0 <_IO_2_1_stderr_>) at ./libio/fileops.c:1254
#5 _IO_new_file_xsputn (f=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=<optimized out>, n=124) at ./libio/fileops.c:1196
#6 0x00007f99aaa71409 in __printf_buffer_flush_to_file (buf=buf@entry=0x7f99a9919e60) at ../libio/libioP.h:946
#7 0x00007f99aaa714c0 in __printf_buffer_to_file_done (buf=buf@entry=0x7f99a9919e60) at ./stdio-common/printf_buffer_to_file.c:120
#8 0x00007f99aaa7ac0d in __vfprintf_internal (s=0x7f99aabf26a0 <_IO_2_1_stderr_>, format=0x555929a57cd8 "DBG1 - %s [%s]: Live timer for channel %lu (in %s() at consumer/consumer-timer.cpp:284)\n", ap=ap@entry=0x7f99a9919f60, mode_
flags=mode_flags@entry=0) at ./stdio-common/vfprintf-internal.c:1475
#9 0x00007f99aaa70296 in __fprintf (stream=<optimized out>, format=<optimized out>) at ./stdio-common/fprintf.c:32
#10 0x00005559299f2280 in live_timer (si=0x7f99a991a160, ctx=0x55592b498ba0) at consumer/consumer-timer.cpp:284
#11 consumer_timer_thread (data=0x55592b498ba0) at consumer/consumer-timer.cpp:789
#12 0x00007f99aaaa63ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#13 0x00007f99aab26a5c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
</pre></li>
</ul>
<p><code>consumer_timer_signal_thread_qs</code> is busy loop until the the the signal in question is no longer pending. In <code>consumer-timer.cpp::consumer_timer_thread</code>, the signals are waited on by <code>sigwaitinfo</code>, and there are some ordering rules (see man 7 signal):</p>
<ul>
<li>for RT signals, by signal priority</li>
<li>for standard signals, an unspecified order</li>
<li>in Linux standard signals are delivered before RT signals</li>
</ul>
<p>The signal priorities are their signal number as defined [[<a class="external" href="https://github.com/lttng/lttng-tools/blob/dff46b736afb7945aaf635de661b920dd0d01d7f/src/common/consumer/consumer-timer.hpp#L17|here">https://github.com/lttng/lttng-tools/blob/dff46b736afb7945aaf635de661b920dd0d01d7f/src/common/consumer/consumer-timer.hpp#L17|here</a>]]. Lower values have a higher priority.</p>
<p>When the live session is configured with a low delay, eg <code>--live=1</code>, the live timer is firing relatively rapidly and can prevent the signal fired by the monitor timer from being handled. This in turn means that the session destruction may hang until (hopefully) the monitor signal is handled and no longer pending.</p>
<p>A few options were discussed:</p>
<ul>
<li>changing signal ordering (eg. reducing the priority of live signal): while it would address this particular case, it then means a session with a low value for the monitor timer could equally starve out the handling of the live timer's signal</li>
<li>using <code>sigpending</code> in combination with <code>sigwaitinfo</code>: while it's possible to get all pending signals as a set, it seems to not be possible to retrieve the details (<code>siginfo_t</code>) which are required some some data (eg. pointers) to properly handle the signal</li>
<li>signalfd: very linux specific</li>
<li>timer_fd: very linux specific</li>
<li>custom scheduler: eg., <a class="external" href="https://github.com/nsec/badge-conf-2023/blob/nsec2023/lib/scheduling/scheduler.hpp">https://github.com/nsec/badge-conf-2023/blob/nsec2023/lib/scheduling/scheduler.hpp</a></li>
<li>switch from using channel pointers in the <code>sigval_ptr</code> to passing IDs that can be used for lookups which may fail gracefully instead; that way the pending signals wouldn't have to all be cleared in <code>consumer_timer_signal_thread_qs</code></li>
</ul>
<p>It seems the most promising path is to implement a small deadline scheduler.</p> LTTng-tools - Bug #1402 (New): Some files are missing REUSE licensing and copyright informationhttps://bugs.lttng.org/issues/14022023-11-14T15:03:06ZPhilippe Proulxeeppeliteloop@gmail.com
<p>Running <code>reuse lint</code> at the root of the source tree reveals many files are missing copyright and/or license information (SPDX):</p>
<blockquote>
<p>Unfortunately, your project is not compliant with version 3.0 of the REUSE Specification :-(</p>
</blockquote>
<p>This includes pretty much all the <code>Makefile.am</code> and <code>*.m4</code> files as well as <code>src/vendor/fmt/*</code>, <code>src/vendor/msgpack/*</code>, and <code>src/vendor/optional.hpp</code>.</p>
<p>Note that to keep vendor files unmodified, you may use the <code>.license</code> or <code>.reuse/dep5</code> <a href="https://reuse.software/spec/" class="external">strategies</a>.</p> LTTng-tools - Bug #1400 (New): Completion of error handlinghttps://bugs.lttng.org/issues/14002023-10-26T20:33:03ZMarkus Elfring
<p>I have looked at a few source files for your current software.<br />I have noticed that some checks for return codes are missing.</p>
Would you like to add more error handling for return values from functions like the following?
<ul>
<li>fprintf ⇒ usage<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-consumerd/lttng-consumerd.cpp#L159-L200">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-consumerd/lttng-consumerd.cpp#L159-L200</a></li>
<li>pthread_mutex_lock ⇒ lttng_thread_create<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/thread.cpp#L98">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/thread.cpp#L98</a></li>
</ul> LTTng-tools - Bug #1399 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13992023-10-26T20:02:18ZMarkus Elfring
<p>I would like to point out that some identifiers do eventually not fit to the expected naming convention of the C++ language standard.<br /><a class="external" href="https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL51-CPP.+Do+not+declare+or+define+a+reserved+identifier">https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL51-CPP.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_LTTNG_BUG_H<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/common/bug.hpp#L8-L9">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/common/bug.hpp#L8-L9</a></li>
<li>_LTT_UTILS_H<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/utils.hpp#L8-L9">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/utils.hpp#L8-L9</a></li>
</ul> LTTng-UST - Bug #1398 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13982023-10-26T19:32:57ZMarkus Elfring
<p>I would like to point out that some identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_UST_COMMON_CLOCK_H<br /> <a class="external" href="https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/clock.h#L7-L8">https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/clock.h#L7-L8</a></li>
<li>__lttng_ust_siov<br /> <a class="external" href="https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/snprintf/fvwrite.h#L18">https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/snprintf/fvwrite.h#L18</a></li>
</ul> Babeltrace - Bug #1397 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13972023-10-26T18:56:27ZMarkus Elfring
<p>I would like to point out that a few identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>__BT_IN_BABELTRACE_H<br /> <a class="external" href="https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/include/babeltrace2/babeltrace.h#L13-L17">https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/include/babeltrace2/babeltrace.h#L13-L17</a></li>
<li>_BABELTRACE_UTC_H<br /> <a class="external" href="https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/src/compat/utc.h#L7-L8">https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/src/compat/utc.h#L7-L8</a></li>
</ul> Userspace RCU - Bug #1396 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13962023-10-26T15:17:27ZMarkus Elfring
<p>I would like to point out that a few identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_URCU_WAIT_H<br /> <a class="external" href="https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/src/urcu-wait.h#L5">https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/src/urcu-wait.h#L5</a></li>
<li>__cds_wfcq_init<br /> <a class="external" href="https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/include/urcu/wfcqueue.h#L136">https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/include/urcu/wfcqueue.h#L136</a></li>
</ul> LTTng-tools - Bug #1392 (Feedback): lttng view: Stream 0 is not declared in metadatahttps://bugs.lttng.org/issues/13922023-09-27T20:38:17ZRicardo Nabinger Sanchezrnsanchez@gmail.com
<p>While trying to capture UST events from a simple program, even though kernel events seem to be properly captured, UST events are not. Trace Compass cannot decode the UST data (but it can decode from the kernel channel).<br />The application is single-threaded, performs only disk and stdout I/O, nothing fancy. Compiled with <code>-finstrument-functions</code>, no optimizations, and with debugging symbols.</p>
<p>Session setup:<br /><pre>
# lttng create
# lttng enable-channel -u --subbuf-size=64M --num-subbuf=8 chan_ust
# lttng enable-channel -k --subbuf-size=64M --num-subbuf=2 chan_kernel
# lttng add-context -u -t vpid -t ip -t procname -t vtid -c chan_ust
# lttng enable-event -u -a -c chan_ust
# lttng enable-event -k -c chan_kernel --syscall --all
# lttng enable-event -k -c chan_kernel lttng_statedump_block_device,lttng_statedump_file_descriptor,lttng_statedump_process_state,mm_page_alloc,mm_page_free,net_dev_xmit,netif_receive_skb,sched_pi_setprio,sched_process_exec,sched_process_fork,sched_switch,sched_wakeup,sched_waking,softirq_entry,softirq_exit,softirq_raise
# lttng start
(possibly excessive preloads, but these are from my notes, collecting data on Varnish Cache)
# LTTNG_UST_BLOCKING_RETRY_TIMEOUT=-1 LD_PRELOAD=liblttng-ust-fd.so:liblttng-ust-fork.so:liblttng-ust-dl.so:liblttng-ust-cyg-profile.so:liblttng-ust-libc-wrapper.so:liblttng-ust-pthread-wrapper.so tests/test-playlist tests/playlist/bug_ffmpeg_1000-x20.m3u
# lttng stop
# lttng view > /dev/null
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
[error] Stream 0 is not declared in metadata.
</pre></p>
Environment:
<ul>
<li>lttng-modules-2.13.10 (from tarball)</li>
<li>lttng-tools-2.13.11 (from tarball)</li>
<li>lttng-ust-2.13.6 (from tarball)</li>
<li>babeltrace (git; at <code>v1.5.11</code>)</li>
<li>Linux kernel 6.6.0-rc1</li>
<li>Slackware64-current, up-to-date as of 2023-09-26</li>
<li>Trace Compass 9.1.0.202309200833</li>
</ul>
<p>IRC snippet:<br /><pre>
Sep/27 16:13:54 rnsanchez I'm trying to find out what I'm doing wrong.. I keep getting "[error] Stream 0 is not declared in metadata." in my captures. I guess this is making tracecompass not being able to show UST events (simple function enter-exit instrumentation)
Sep/27 16:16:46 rnsanchez using my old notes, while checking https://lttng.org/docs/v2.13/#doc-liblttng-ust-cyg-profile and other sections
Sep/27 16:18:25 jgalar rnsanchez: shot in the dark, but are you making sure the session is stopped or destroyed before reading the trace?
Sep/27 16:18:28 rnsanchez if it helps knowing, events from kernel channel are fine and I can see them in tracecompass (the trace is huge)
Sep/27 16:18:52 rnsanchez jgalar: yes, with lttng stop, then I copy the data from root to my user
Sep/27 16:19:10 rnsanchez but I get that Stream 0 regardless.. it's from lttng view
Sep/27 16:20:11 jgalar it sounds like a metadata flushing issue, any chance you can share the trace? (or at least, the metadata files)
Sep/27 16:20:42 rnsanchez yes, sure. just a moment while I upload it
Sep/27 16:20:48 jgalar okay thanks
Sep/27 16:21:43 ebugden not sure if related, but it's triggering deja vu of this issue: https://github.com/efficios/barectf/commit/d024537859c1d869bfa1cedc8abe8e3f7a648faa
Sep/27 16:22:23 ebugden maybe a similar problem in a similar area in lttng view
Sep/27 16:25:28 rnsanchez jgalar: https://rnsanchez.wait4.org/auto-20230927-161138.tar.xz
Sep/27 16:27:33 rnsanchez jgalar: this last one has incomplete events as I was trying to investigate, but I can bundle another one with full events as I used to collect
Sep/27 16:31:51 rnsanchez ebugden: is there anything on my side I can do?
Sep/27 16:34:38 ebugden rnsanchez: i don't feel i'm familiar enough with this situation to say; i was tossing out the association as possible food for thought for jgalar
Sep/27 16:35:12 rnsanchez oh ok :-)
Sep/27 16:43:03 rnsanchez jgalar: just in case, here is with full events: https://rnsanchez.wait4.org/auto-20230927-164014.tar.xz
Sep/27 16:45:53 jgalar ebugden: yup, the errors are indeed similar
Sep/27 16:50:59 jgalar hmm, the user space trace's metadata doesn't declare any stream class, that's... unexpected...
Sep/27 16:51:14 jgalar which versions of lttng-ust and lttng-tools are you running?
Sep/27 16:55:03 rnsanchez ust 2.13.6 (tarball) and tools 2.13.11 (tarball too)
Sep/27 16:55:07 rnsanchez compiled today
Sep/27 16:55:13 jgalar okay
Sep/27 17:17:57 ebugden rnsanchez (cc: jgalar): would you open a bug report on https://bugs.lttng.org/ ?
Sep/27 17:18:16 rnsanchez ebugden: sure
Sep/27 17:19:21 ebugden thanks! (we have a few hypotheses, but we'd need more information about exactly how the trace is generated)
Sep/27 17:19:58 jgalar rnsanchez: it's weird, the user space trace's stream files just have the packet headers... if you can reproduce the problem while running lttng-sessiond with the `-vvv --verbose-consumer` options to get the logs, it would be helpful
Sep/27 17:20:47 rnsanchez I can try. I suppose I have to finish the sessiond already running and manually launch another?
Sep/27 17:21:06 jgalar yep
</pre></p> Babeltrace - Bug #1389 (New): src.ctf.fs crash with invalid tracehttps://bugs.lttng.org/issues/13892023-09-18T20:16:39ZSimon Marchisimon.marchi@polymtl.ca
<p>Using the attached trace, I get:</p>
<pre>
$ ./src/cli/babeltrace2 trace
09-18 16:15:58.978 335803 335803 E PLUGIN/CTF/MSG-ITER ctf_msg_iter_get_next_message@msg-iter.cpp:2715 [auto-disc-source-ctf-fs] Cannot handle state: msg-it-addr=0x613000000580, state=DSCOPE_EVENT_PAYLOAD_CONTINUE
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:63 Babeltrace 2 library postcondition not satisfied.
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:65 ------------------------------------------------------------------------
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:66 Condition ID: `post:message-iterator-class-next-method:no-error-if-no-error-status`.
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:68 Function: bt_message_iterator_class_next_method().
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:69 ------------------------------------------------------------------------
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:70 Error is:
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:72 Current thread has an error, but user function returned a non-error status: status=OK
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:75 Aborting...
</pre> Babeltrace - Bug #1388 (New): Assertion `this->active_stream_iter == 0` failedhttps://bugs.lttng.org/issues/13882023-09-08T18:57:03ZSimon Marchisimon.marchi@polymtl.ca
<p>Steps:</p>
<ol>
<li><code>lttng create my-session --live</code></li>
<li><code>lttng enable-event -u -a</code></li>
<li><code>lttng start</code></li>
<li><code>./src/cli/babeltrace2 'net://127.0.0.1/host/smarchi-efficios/my-session'</code></li>
<li>Run the hello app from the lttng-ust tests: <code>lttng-ust/tests/hello/hello</code></li>
<li>While events print on the babeltrace terminal, type ctrl-C</li>
</ol>
<p>I get:</p>
<p><code> (╯°□°)╯︵ ┻━┻ /home/smarchi/src/babeltrace/src/plugins/ctf/lttng-live/lttng-live.cpp:214: ~lttng_live_msg_iter(): Assertion `this->active_stream_iter == 0` failed.</code></p> LTTng - Bug #1385 (Feedback): [lttng-live] error: recevie a message in the pasthttps://bugs.lttng.org/issues/13852023-08-16T18:53:52ZBin Yuan
<p>My program works well most of time. But in rare case, it reports the following hightling error.<br /><img src="https://bugs.lttng.org/attachments/download/572/clipboard-202308170250-vfley.jpg" alt="" loading="lazy" /></p>
<p>What will lead to those branch? any suggestion to debug?</p> Babeltrace - Bug #1384 (New): Assertion `bt_value_get_type(map) == BT_VALUE_TYPE_MAP` failshttps://bugs.lttng.org/issues/13842023-08-08T18:07:42ZErica Bugden
<p>Hello! I triggered an internal assertion while trying to, presumably incorrectly, query the lttng-live component class.</p>
<a name="Software"></a>
<h1 >Software<a href="#Software" class="wiki-anchor">¶</a></h1>
<ul>
<li>lttng tools: <code>lttng (LTTng Trace Control) 2.14.0-pre - O-Beer - v2.12.0-rc1-1528-g5983bb6d3</code></li>
<li>babeltrace2: <code>Babeltrace 2.1.0-rc1 "Codename TBD" [v1.2.0-3704-gdbea6be2]</code> (configured with plugins, python bindings support)</li>
</ul>
<a name="Procedure"></a>
<h1 >Procedure<a href="#Procedure" class="wiki-anchor">¶</a></h1>
<ul>
<li>Start root session daemon: <code>$sudo lttng-sessiond --daemonize</code></li>
<li>Create live session: <code>$lttng create my-session --live</code></li>
<li>Enable events: <code>$lttng enable-event --kernel sched-switch,sched_process_fork</code></li>
<li>Start tracing <code>$lttng start</code></li>
<li>Run python script: <code>python3 lttng-live.py</code> (see below)
<ul>
<li>Assertion</li>
</ul></li>
</ul>
<a name="Python-script"></a>
<h1 >Python script<a href="#Python-script" class="wiki-anchor">¶</a></h1>
<p>Note: The script is noisy as some code/comments are irrelevant/incorrect/inconsistent. (It's originally based on the query example in the Python bindings documentation.)</p>
<pre>
import bt2
import sys
# Get the `source.ctf.fs` component class from the `ctf` plugin.
# `source.ctf.lttng-live` component class instead
# (list plugins/components with $babeltrace2 list-plugins)
comp_cls = bt2.find_plugin('ctf').source_component_classes['lttng-live']
# The `babeltrace.support-info` query operation expects a `type`
# parameter (set to `directory` here) and an `input` parameter (the
# actual path or string to check, in this case the first command-line
# argument).
#
# See `babeltrace2-query-babeltrace.support-info(7)`.
params = {
'url': 'net://localhost/host/luna/my-session'
}
'''
params = {
'type': 'string',
'input': sys.argv[1],
}
'''
# Create a query executor.
#
# This is the environment in which query operations happens. The
# queried component class has access to this executor, for example to
# retrieve the query operation's logging level.
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
'inputs=net://localhost/host/luna/my-session')
'''
query_exec = bt2.QueryExecutor(comp_cls, 'babeltrace.support-info',
params)
'''
# Query the component class through the query executor.
#
# This method returns the result.
result = query_exec.query()
# Print the result.
print(result)
# Try to iterate on the trace.
'''
for msg in bt2.TraceCollectionMessageIterator('net://localhost/host/luna/my-session'):
if type(msg) is bt2._EventMessageConst:
print(msg.event.name)
'''
</pre>
<p>If I remember correctly, <strong>the script line change that triggered the assertion was expressing the query parameters directly as a string</strong> (rather than a dictionary). As in this:</p>
<pre>
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
'inputs=net://localhost/host/luna/my-session')
</pre>
<p>instead of this:</p>
<pre>
params = {
'url': 'net://localhost/host/luna/my-session'
}
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
params)
</pre>
<a name="Output"></a>
<h1 >Output<a href="#Output" class="wiki-anchor">¶</a></h1>
<pre>
erica@luna:~$ python3 lttng-live.py
(╯°□°)╯︵ ┻━┻ param-validation.c:196: validate_map_value(): Assertion `bt_value_get_type(map) == BT_VALUE_TYPE_MAP` failed.
Aborted (core dumped)
</pre> LTTng-tools - Bug #1383 (New): The "cpu_id" context (available for filters) is not discoverable b...https://bugs.lttng.org/issues/13832023-07-23T16:22:58ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>I recently tried remembering how to filter by cpu_id when enabling an event, and found that there was not clear way to see that $ctx.cpu_id actually exists from the lttng man pages (other than a random example in the lttng-enable-event man page). AFAIU the man pages rely on lttng add-context --list to let the user discover the available contexts. This is likely because the cpu_id context is not available for the "add-context" command because it would be redundant with the implicit context already sampled with the buffers.</p>
<p>We may want to have some way to let users discovers those contexts which are filter-specific.</p>
<p>My use-case is to try to grab only traces for a few logical cpus (0-3) from my 384 logical cpus machine.</p> LTTng-tools - Bug #1381 (New): lttng list-triggers crashing with 2.13.9https://bugs.lttng.org/issues/13812023-06-16T19:23:50ZHeman Muresan
<p>This was discussed offline, but I am opening a ticket here for tracking purposes. With the latest 2.13.9 I'm seeing the following assert:<br /><pre>
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007fb11ad09723 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007fb11acbe876 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007fb11aca97e3 in __GI_abort () at abort.c:79
#4 0x00007fb11aca9719 in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, file=<optimized out>, line=<optimized out>, function=<optimized out>) at assert.c:92
#5 0x00007fb11acb78e6 in __GI___assert_fail (assertion=assertion@entry=0x475fd2 "element_index < array->size", file=file@entry=0x4832e8 "../../../lttng-tools-2.13.9/src/common/dynamic-array.h",
line=line@entry=55, function=function@entry=0x498120 <__PRETTY_FUNCTION__.1> "lttng_dynamic_array_get_element") at assert.c:101
#6 0x0000000000469311 in lttng_dynamic_array_get_element (array=0x4b7f10, element_index=0) at ../../../lttng-tools-2.13.9/src/common/dynamic-array.h:55
#7 lttng_dynamic_array_get_element (element_index=0, array=0x4b7f10) at ../../../lttng-tools-2.13.9/src/common/dynamic-array.h:52
#8 lttng_action_path_copy (src=src@entry=0x4b7f10, dst=dst@entry=0x4b8250) at ../../../lttng-tools-2.13.9/src/common/actions/path.c:109
#9 0x000000000043595c in lttng_error_query_action_create (action_path=0x4b7f10, trigger=0x4b8240) at ../../../lttng-tools-2.13.9/src/common/error-query.c:230
#10 lttng_error_query_action_create (trigger=trigger@entry=0x4b5ea0, action_path=action_path@entry=0x4b7f10) at ../../../lttng-tools-2.13.9/src/common/error-query.c:206
#11 0x000000000042c62a in print_action_errors (action=<optimized out>, action_path_length=0, action_path_indexes=0x0, trigger=0x4b5ea0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:766
#12 print_one_action (trigger=0x4b5ea0, action=<optimized out>, action_path_indexes=0x0, action_path_length=0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:966
#13 0x000000000042d105 in print_one_trigger (trigger=0x4b5ea0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1124
#14 print_sorted_triggers (triggers=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1202
#15 cmd_list_triggers (argc=<optimized out>, argv=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1421
#16 0x0000000000410be0 in handle_command (argv=0x7ffc348fc9c0, argc=1) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:237
#17 parse_args (argv=0x7ffc348fc9b8, argc=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:426
#18 main (argc=<optimized out>, argv=0x7ffc348fc9b8) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:475
</pre></p>
<p>There's nothing particular fancy about the trigger being printed, it's just a buffer usage trigger. Here is it being displayed on an older version of lttng-tools:<br /><pre>
- name: foo_bar_bufferUsageTrigger
owner uid: 0
condition: buffer usage high
session name: foo_bar
channel name: foo_bar-ust
domain: ust
threshold (ratio): 0.70
errors: none
action:notify
errors: none
errors: none
</pre></p>
<p>This trigger is being created with the following pseudocode:<br /><pre>
pCondition = lttng_condition_buffer_usage_high_create();
lttng_condition_buffer_usage_set_session_name(...);
lttng_condition_buffer_usage_set_channel_name(...);
lttng_condition_buffer_usage_set_threshold_ratio(...);
lttng_condition_buffer_usage_set_domain_type();
pAction = lttng_action_notify_create();
pTrigger = lttng_trigger_create(pCondition, pAction);
lttng_register_trigger_with_name(pTrigger, pTriggerName);
</pre></p>
<p>I think this was introduced by this commit:<br /><a class="external" href="https://github.com/lttng/lttng-tools/commit/1ad48d9678f8f123dc7d44dbbfadc9957b7e970e#diff-e9fa6863d3580e2d95137ddaab0280d5391e830657a46f47cde39e0cfabc8fb7R109">https://github.com/lttng/lttng-tools/commit/1ad48d9678f8f123dc7d44dbbfadc9957b7e970e#diff-e9fa6863d3580e2d95137ddaab0280d5391e830657a46f47cde39e0cfabc8fb7R109</a><br />In lttng_action_path_copy the code as is doesn't handle the case of there being no destination action path. A way to fix this is to make this code handle this scenario, attached is a patch that does this. But, perhaps we shouldn't even be hitting this scenario where there is no action path and this is just a bandaid.</p>
<p>Thanks,<br />Heman</p>