LTTng bugs repository: Issueshttps://bugs.lttng.org/https://bugs.lttng.org/themes/lttng/favicon/a.ico?14249722912023-12-19T20:03:54ZLTTng bugs repository
Redmine LTTng-tools - Bug #1407 (New): Hang when stopping a live session with a low live valuehttps://bugs.lttng.org/issues/14072023-12-19T20:03:54ZKienan Stewart
<p>While investigating <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Investigate why events are no longer recorded by the live view when `DELAYUS` in `tests/regressio... (Resolved)" href="https://bugs.lttng.org/issues/1403">#1403</a>, I noticed some test runs would intermittently hang when the live delay value was set very low (eg. 1us).</p>
<p>For example, in <code>tests/regression/tools/clear/test_ust</code> the test <code>test_ust_streaming_live</code> would hang executing <code>lttng stop sessionX</code>. After investigating the issue I noted the following:</p>
<ul>
<li>the lttng client is hanging, pending a response from the sessiond</li>
<li>the consumer's consumer_thread_data_poll thread appear to be looping in <code>consumer_timer_signal_thread_qs</code> while trying to remove the monitor timer during channel deletion: <pre>
Thread 7 (Thread 0x7f999bfff680 (LWP 3060295) "lttng-consumerd"):
#0 sigpending (set=0x7f999bffdea0) at ../sysdeps/unix/sysv/linux/sigpending.c:27
#1 0x00005559299f07ec in consumer_timer_signal_thread_qs (signr=47) at consumer/consumer-timer.cpp:325
#2 consumer_channel_timer_stop (timer_id=timer_id@entry=0x7f998c0021c0, signal=47) at consumer/consumer-timer.cpp:422
#3 0x00005559299f122f in consumer_timer_monitor_stop (channel=channel@entry=0x7f998c000f40) at consumer/consumer-timer.cpp:531
#4 0x00005559299dabbf in consumer_del_channel (channel=channel@entry=0x7f998c000f40) at consumer/consumer.cpp:379
#5 0x00005559299ee83a in consumer_stream_destroy (stream=stream@entry=0x7f998c01a830, ht=<optimized out>) at consumer/consumer-stream.cpp:1070
#6 0x00005559299e3c61 in consumer_del_stream (ht=<optimized out>, stream=0x7f998c01a830) at consumer/consumer.cpp:547
#7 consumer_thread_data_poll (data=0x55592b498ba0) at consumer/consumer.cpp:2747
#8 0x00007f99aaaa63ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#9 0x00007f99aab26a5c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
</pre></li>
<li>the consumerd's consumer_timer_thread was often handling the live timer: <pre>
Thread 4 (Thread 0x7f99a991b680 (LWP 3060292) "lttng-consumerd"):
#0 __GI___libc_write (nbytes=124, buf=0x7f99a9919e90, fd=2) at ../sysdeps/unix/sysv/linux/write.c:26
#1 __GI___libc_write (fd=2, buf=0x7f99a9919e90, nbytes=124) at ../sysdeps/unix/sysv/linux/write.c:24
#2 0x00007f99aaa9dbb5 in _IO_new_file_write (f=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=0x7f99a9919e90, n=124) at ./libio/fileops.c:1180
#3 0x00007f99aaa9cf70 in new_do_write (fp=fp@entry=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=data@entry=0x7f99a9919e9
0 "DBG1 - 10:31:33.730698608 [3060287/3060292]: Live timer for channel 17 (in live_timer() at consumer/consumer-timer.cpp:284)\n", to_do=to_do@entry=124) at ./libio/libioP.h:946
#4 0x00007f99aaa9e2a1 in _IO_new_file_xsputn (n=124, data=<optimized out>, f=0x7f99aabf26a0 <_IO_2_1_stderr_>) at ./libio/fileops.c:1254
#5 _IO_new_file_xsputn (f=0x7f99aabf26a0 <_IO_2_1_stderr_>, data=<optimized out>, n=124) at ./libio/fileops.c:1196
#6 0x00007f99aaa71409 in __printf_buffer_flush_to_file (buf=buf@entry=0x7f99a9919e60) at ../libio/libioP.h:946
#7 0x00007f99aaa714c0 in __printf_buffer_to_file_done (buf=buf@entry=0x7f99a9919e60) at ./stdio-common/printf_buffer_to_file.c:120
#8 0x00007f99aaa7ac0d in __vfprintf_internal (s=0x7f99aabf26a0 <_IO_2_1_stderr_>, format=0x555929a57cd8 "DBG1 - %s [%s]: Live timer for channel %lu (in %s() at consumer/consumer-timer.cpp:284)\n", ap=ap@entry=0x7f99a9919f60, mode_
flags=mode_flags@entry=0) at ./stdio-common/vfprintf-internal.c:1475
#9 0x00007f99aaa70296 in __fprintf (stream=<optimized out>, format=<optimized out>) at ./stdio-common/fprintf.c:32
#10 0x00005559299f2280 in live_timer (si=0x7f99a991a160, ctx=0x55592b498ba0) at consumer/consumer-timer.cpp:284
#11 consumer_timer_thread (data=0x55592b498ba0) at consumer/consumer-timer.cpp:789
#12 0x00007f99aaaa63ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#13 0x00007f99aab26a5c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
</pre></li>
</ul>
<p><code>consumer_timer_signal_thread_qs</code> is busy loop until the the the signal in question is no longer pending. In <code>consumer-timer.cpp::consumer_timer_thread</code>, the signals are waited on by <code>sigwaitinfo</code>, and there are some ordering rules (see man 7 signal):</p>
<ul>
<li>for RT signals, by signal priority</li>
<li>for standard signals, an unspecified order</li>
<li>in Linux standard signals are delivered before RT signals</li>
</ul>
<p>The signal priorities are their signal number as defined [[<a class="external" href="https://github.com/lttng/lttng-tools/blob/dff46b736afb7945aaf635de661b920dd0d01d7f/src/common/consumer/consumer-timer.hpp#L17|here">https://github.com/lttng/lttng-tools/blob/dff46b736afb7945aaf635de661b920dd0d01d7f/src/common/consumer/consumer-timer.hpp#L17|here</a>]]. Lower values have a higher priority.</p>
<p>When the live session is configured with a low delay, eg <code>--live=1</code>, the live timer is firing relatively rapidly and can prevent the signal fired by the monitor timer from being handled. This in turn means that the session destruction may hang until (hopefully) the monitor signal is handled and no longer pending.</p>
<p>A few options were discussed:</p>
<ul>
<li>changing signal ordering (eg. reducing the priority of live signal): while it would address this particular case, it then means a session with a low value for the monitor timer could equally starve out the handling of the live timer's signal</li>
<li>using <code>sigpending</code> in combination with <code>sigwaitinfo</code>: while it's possible to get all pending signals as a set, it seems to not be possible to retrieve the details (<code>siginfo_t</code>) which are required some some data (eg. pointers) to properly handle the signal</li>
<li>signalfd: very linux specific</li>
<li>timer_fd: very linux specific</li>
<li>custom scheduler: eg., <a class="external" href="https://github.com/nsec/badge-conf-2023/blob/nsec2023/lib/scheduling/scheduler.hpp">https://github.com/nsec/badge-conf-2023/blob/nsec2023/lib/scheduling/scheduler.hpp</a></li>
<li>switch from using channel pointers in the <code>sigval_ptr</code> to passing IDs that can be used for lookups which may fail gracefully instead; that way the pending signals wouldn't have to all be cleared in <code>consumer_timer_signal_thread_qs</code></li>
</ul>
<p>It seems the most promising path is to implement a small deadline scheduler.</p> LTTng-tools - Bug #1402 (New): Some files are missing REUSE licensing and copyright informationhttps://bugs.lttng.org/issues/14022023-11-14T15:03:06ZPhilippe Proulxeeppeliteloop@gmail.com
<p>Running <code>reuse lint</code> at the root of the source tree reveals many files are missing copyright and/or license information (SPDX):</p>
<blockquote>
<p>Unfortunately, your project is not compliant with version 3.0 of the REUSE Specification :-(</p>
</blockquote>
<p>This includes pretty much all the <code>Makefile.am</code> and <code>*.m4</code> files as well as <code>src/vendor/fmt/*</code>, <code>src/vendor/msgpack/*</code>, and <code>src/vendor/optional.hpp</code>.</p>
<p>Note that to keep vendor files unmodified, you may use the <code>.license</code> or <code>.reuse/dep5</code> <a href="https://reuse.software/spec/" class="external">strategies</a>.</p> LTTng-tools - Bug #1400 (New): Completion of error handlinghttps://bugs.lttng.org/issues/14002023-10-26T20:33:03ZMarkus Elfring
<p>I have looked at a few source files for your current software.<br />I have noticed that some checks for return codes are missing.</p>
Would you like to add more error handling for return values from functions like the following?
<ul>
<li>fprintf ⇒ usage<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-consumerd/lttng-consumerd.cpp#L159-L200">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-consumerd/lttng-consumerd.cpp#L159-L200</a></li>
<li>pthread_mutex_lock ⇒ lttng_thread_create<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/thread.cpp#L98">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/thread.cpp#L98</a></li>
</ul> LTTng-tools - Bug #1399 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13992023-10-26T20:02:18ZMarkus Elfring
<p>I would like to point out that some identifiers do eventually not fit to the expected naming convention of the C++ language standard.<br /><a class="external" href="https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL51-CPP.+Do+not+declare+or+define+a+reserved+identifier">https://wiki.sei.cmu.edu/confluence/display/cplusplus/DCL51-CPP.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_LTTNG_BUG_H<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/common/bug.hpp#L8-L9">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/common/bug.hpp#L8-L9</a></li>
<li>_LTT_UTILS_H<br /> <a class="external" href="https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/utils.hpp#L8-L9">https://github.com/lttng/lttng-tools/blob/d4333eb996dec315fb721074c2cf49adee04363b/src/bin/lttng-sessiond/utils.hpp#L8-L9</a></li>
</ul> LTTng-UST - Bug #1398 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13982023-10-26T19:32:57ZMarkus Elfring
<p>I would like to point out that some identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_UST_COMMON_CLOCK_H<br /> <a class="external" href="https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/clock.h#L7-L8">https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/clock.h#L7-L8</a></li>
<li>__lttng_ust_siov<br /> <a class="external" href="https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/snprintf/fvwrite.h#L18">https://github.com/lttng/lttng-ust/blob/8bc1125eb851b2c52d3263c2992e6806017e98e7/src/common/snprintf/fvwrite.h#L18</a></li>
</ul> Babeltrace - Bug #1397 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13972023-10-26T18:56:27ZMarkus Elfring
<p>I would like to point out that a few identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>__BT_IN_BABELTRACE_H<br /> <a class="external" href="https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/include/babeltrace2/babeltrace.h#L13-L17">https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/include/babeltrace2/babeltrace.h#L13-L17</a></li>
<li>_BABELTRACE_UTC_H<br /> <a class="external" href="https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/src/compat/utc.h#L7-L8">https://github.com/efficios/babeltrace/blob/8a87cdbd41e2e04b1b64e9959751cdabe72281c1/src/compat/utc.h#L7-L8</a></li>
</ul> Userspace RCU - Bug #1396 (New): reserved identifier violationhttps://bugs.lttng.org/issues/13962023-10-26T15:17:27ZMarkus Elfring
<p>I would like to point out that a few identifiers do eventually not fit to the expected naming convention of the C language standard.<br /><a class="external" href="https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier">https://www.securecoding.cert.org/confluence/display/c/DCL37-C.+Do+not+declare+or+define+a+reserved+identifier</a></p>
<p>Would you like to adjust your selection for unique names?</p>
Examples:
<ul>
<li>_URCU_WAIT_H<br /> <a class="external" href="https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/src/urcu-wait.h#L5">https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/src/urcu-wait.h#L5</a></li>
<li>__cds_wfcq_init<br /> <a class="external" href="https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/include/urcu/wfcqueue.h#L136">https://github.com/urcu/userspace-rcu/blob/008a71ef213f23c8d403e086563155121b286c0c/include/urcu/wfcqueue.h#L136</a></li>
</ul> Babeltrace - Bug #1389 (New): src.ctf.fs crash with invalid tracehttps://bugs.lttng.org/issues/13892023-09-18T20:16:39ZSimon Marchisimon.marchi@polymtl.ca
<p>Using the attached trace, I get:</p>
<pre>
$ ./src/cli/babeltrace2 trace
09-18 16:15:58.978 335803 335803 E PLUGIN/CTF/MSG-ITER ctf_msg_iter_get_next_message@msg-iter.cpp:2715 [auto-disc-source-ctf-fs] Cannot handle state: msg-it-addr=0x613000000580, state=DSCOPE_EVENT_PAYLOAD_CONTINUE
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:63 Babeltrace 2 library postcondition not satisfied.
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:65 ------------------------------------------------------------------------
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:66 Condition ID: `post:message-iterator-class-next-method:no-error-if-no-error-status`.
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:68 Function: bt_message_iterator_class_next_method().
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:69 ------------------------------------------------------------------------
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:70 Error is:
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:72 Current thread has an error, but user function returned a non-error status: status=OK
09-18 16:15:58.978 335803 335803 F LIB/ASSERT-COND bt_lib_assert_cond_failed@assert-cond.c:75 Aborting...
</pre> Babeltrace - Bug #1388 (New): Assertion `this->active_stream_iter == 0` failedhttps://bugs.lttng.org/issues/13882023-09-08T18:57:03ZSimon Marchisimon.marchi@polymtl.ca
<p>Steps:</p>
<ol>
<li><code>lttng create my-session --live</code></li>
<li><code>lttng enable-event -u -a</code></li>
<li><code>lttng start</code></li>
<li><code>./src/cli/babeltrace2 'net://127.0.0.1/host/smarchi-efficios/my-session'</code></li>
<li>Run the hello app from the lttng-ust tests: <code>lttng-ust/tests/hello/hello</code></li>
<li>While events print on the babeltrace terminal, type ctrl-C</li>
</ol>
<p>I get:</p>
<p><code> (╯°□°)╯︵ ┻━┻ /home/smarchi/src/babeltrace/src/plugins/ctf/lttng-live/lttng-live.cpp:214: ~lttng_live_msg_iter(): Assertion `this->active_stream_iter == 0` failed.</code></p> Babeltrace - Bug #1384 (New): Assertion `bt_value_get_type(map) == BT_VALUE_TYPE_MAP` failshttps://bugs.lttng.org/issues/13842023-08-08T18:07:42ZErica Bugden
<p>Hello! I triggered an internal assertion while trying to, presumably incorrectly, query the lttng-live component class.</p>
<a name="Software"></a>
<h1 >Software<a href="#Software" class="wiki-anchor">¶</a></h1>
<ul>
<li>lttng tools: <code>lttng (LTTng Trace Control) 2.14.0-pre - O-Beer - v2.12.0-rc1-1528-g5983bb6d3</code></li>
<li>babeltrace2: <code>Babeltrace 2.1.0-rc1 "Codename TBD" [v1.2.0-3704-gdbea6be2]</code> (configured with plugins, python bindings support)</li>
</ul>
<a name="Procedure"></a>
<h1 >Procedure<a href="#Procedure" class="wiki-anchor">¶</a></h1>
<ul>
<li>Start root session daemon: <code>$sudo lttng-sessiond --daemonize</code></li>
<li>Create live session: <code>$lttng create my-session --live</code></li>
<li>Enable events: <code>$lttng enable-event --kernel sched-switch,sched_process_fork</code></li>
<li>Start tracing <code>$lttng start</code></li>
<li>Run python script: <code>python3 lttng-live.py</code> (see below)
<ul>
<li>Assertion</li>
</ul></li>
</ul>
<a name="Python-script"></a>
<h1 >Python script<a href="#Python-script" class="wiki-anchor">¶</a></h1>
<p>Note: The script is noisy as some code/comments are irrelevant/incorrect/inconsistent. (It's originally based on the query example in the Python bindings documentation.)</p>
<pre>
import bt2
import sys
# Get the `source.ctf.fs` component class from the `ctf` plugin.
# `source.ctf.lttng-live` component class instead
# (list plugins/components with $babeltrace2 list-plugins)
comp_cls = bt2.find_plugin('ctf').source_component_classes['lttng-live']
# The `babeltrace.support-info` query operation expects a `type`
# parameter (set to `directory` here) and an `input` parameter (the
# actual path or string to check, in this case the first command-line
# argument).
#
# See `babeltrace2-query-babeltrace.support-info(7)`.
params = {
'url': 'net://localhost/host/luna/my-session'
}
'''
params = {
'type': 'string',
'input': sys.argv[1],
}
'''
# Create a query executor.
#
# This is the environment in which query operations happens. The
# queried component class has access to this executor, for example to
# retrieve the query operation's logging level.
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
'inputs=net://localhost/host/luna/my-session')
'''
query_exec = bt2.QueryExecutor(comp_cls, 'babeltrace.support-info',
params)
'''
# Query the component class through the query executor.
#
# This method returns the result.
result = query_exec.query()
# Print the result.
print(result)
# Try to iterate on the trace.
'''
for msg in bt2.TraceCollectionMessageIterator('net://localhost/host/luna/my-session'):
if type(msg) is bt2._EventMessageConst:
print(msg.event.name)
'''
</pre>
<p>If I remember correctly, <strong>the script line change that triggered the assertion was expressing the query parameters directly as a string</strong> (rather than a dictionary). As in this:</p>
<pre>
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
'inputs=net://localhost/host/luna/my-session')
</pre>
<p>instead of this:</p>
<pre>
params = {
'url': 'net://localhost/host/luna/my-session'
}
query_exec = bt2.QueryExecutor(comp_cls, 'sessions',
params)
</pre>
<a name="Output"></a>
<h1 >Output<a href="#Output" class="wiki-anchor">¶</a></h1>
<pre>
erica@luna:~$ python3 lttng-live.py
(╯°□°)╯︵ ┻━┻ param-validation.c:196: validate_map_value(): Assertion `bt_value_get_type(map) == BT_VALUE_TYPE_MAP` failed.
Aborted (core dumped)
</pre> LTTng-tools - Bug #1383 (New): The "cpu_id" context (available for filters) is not discoverable b...https://bugs.lttng.org/issues/13832023-07-23T16:22:58ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>I recently tried remembering how to filter by cpu_id when enabling an event, and found that there was not clear way to see that $ctx.cpu_id actually exists from the lttng man pages (other than a random example in the lttng-enable-event man page). AFAIU the man pages rely on lttng add-context --list to let the user discover the available contexts. This is likely because the cpu_id context is not available for the "add-context" command because it would be redundant with the implicit context already sampled with the buffers.</p>
<p>We may want to have some way to let users discovers those contexts which are filter-specific.</p>
<p>My use-case is to try to grab only traces for a few logical cpus (0-3) from my 384 logical cpus machine.</p> LTTng-tools - Bug #1381 (New): lttng list-triggers crashing with 2.13.9https://bugs.lttng.org/issues/13812023-06-16T19:23:50ZHeman Muresan
<p>This was discussed offline, but I am opening a ticket here for tracking purposes. With the latest 2.13.9 I'm seeing the following assert:<br /><pre>
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007fb11ad09723 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007fb11acbe876 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007fb11aca97e3 in __GI_abort () at abort.c:79
#4 0x00007fb11aca9719 in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, file=<optimized out>, line=<optimized out>, function=<optimized out>) at assert.c:92
#5 0x00007fb11acb78e6 in __GI___assert_fail (assertion=assertion@entry=0x475fd2 "element_index < array->size", file=file@entry=0x4832e8 "../../../lttng-tools-2.13.9/src/common/dynamic-array.h",
line=line@entry=55, function=function@entry=0x498120 <__PRETTY_FUNCTION__.1> "lttng_dynamic_array_get_element") at assert.c:101
#6 0x0000000000469311 in lttng_dynamic_array_get_element (array=0x4b7f10, element_index=0) at ../../../lttng-tools-2.13.9/src/common/dynamic-array.h:55
#7 lttng_dynamic_array_get_element (element_index=0, array=0x4b7f10) at ../../../lttng-tools-2.13.9/src/common/dynamic-array.h:52
#8 lttng_action_path_copy (src=src@entry=0x4b7f10, dst=dst@entry=0x4b8250) at ../../../lttng-tools-2.13.9/src/common/actions/path.c:109
#9 0x000000000043595c in lttng_error_query_action_create (action_path=0x4b7f10, trigger=0x4b8240) at ../../../lttng-tools-2.13.9/src/common/error-query.c:230
#10 lttng_error_query_action_create (trigger=trigger@entry=0x4b5ea0, action_path=action_path@entry=0x4b7f10) at ../../../lttng-tools-2.13.9/src/common/error-query.c:206
#11 0x000000000042c62a in print_action_errors (action=<optimized out>, action_path_length=0, action_path_indexes=0x0, trigger=0x4b5ea0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:766
#12 print_one_action (trigger=0x4b5ea0, action=<optimized out>, action_path_indexes=0x0, action_path_length=0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:966
#13 0x000000000042d105 in print_one_trigger (trigger=0x4b5ea0) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1124
#14 print_sorted_triggers (triggers=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1202
#15 cmd_list_triggers (argc=<optimized out>, argv=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/commands/list_triggers.c:1421
#16 0x0000000000410be0 in handle_command (argv=0x7ffc348fc9c0, argc=1) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:237
#17 parse_args (argv=0x7ffc348fc9b8, argc=<optimized out>) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:426
#18 main (argc=<optimized out>, argv=0x7ffc348fc9b8) at ../../../../lttng-tools-2.13.9/src/bin/lttng/lttng.c:475
</pre></p>
<p>There's nothing particular fancy about the trigger being printed, it's just a buffer usage trigger. Here is it being displayed on an older version of lttng-tools:<br /><pre>
- name: foo_bar_bufferUsageTrigger
owner uid: 0
condition: buffer usage high
session name: foo_bar
channel name: foo_bar-ust
domain: ust
threshold (ratio): 0.70
errors: none
action:notify
errors: none
errors: none
</pre></p>
<p>This trigger is being created with the following pseudocode:<br /><pre>
pCondition = lttng_condition_buffer_usage_high_create();
lttng_condition_buffer_usage_set_session_name(...);
lttng_condition_buffer_usage_set_channel_name(...);
lttng_condition_buffer_usage_set_threshold_ratio(...);
lttng_condition_buffer_usage_set_domain_type();
pAction = lttng_action_notify_create();
pTrigger = lttng_trigger_create(pCondition, pAction);
lttng_register_trigger_with_name(pTrigger, pTriggerName);
</pre></p>
<p>I think this was introduced by this commit:<br /><a class="external" href="https://github.com/lttng/lttng-tools/commit/1ad48d9678f8f123dc7d44dbbfadc9957b7e970e#diff-e9fa6863d3580e2d95137ddaab0280d5391e830657a46f47cde39e0cfabc8fb7R109">https://github.com/lttng/lttng-tools/commit/1ad48d9678f8f123dc7d44dbbfadc9957b7e970e#diff-e9fa6863d3580e2d95137ddaab0280d5391e830657a46f47cde39e0cfabc8fb7R109</a><br />In lttng_action_path_copy the code as is doesn't handle the case of there being no destination action path. A way to fix this is to make this code handle this scenario, attached is a patch that does this. But, perhaps we shouldn't even be hitting this scenario where there is no action path and this is just a bandaid.</p>
<p>Thanks,<br />Heman</p> Babeltrace - Bug #1374 (New): Babeltrace fails to read trace with no data stream type ID when the...https://bugs.lttng.org/issues/13742023-04-26T17:57:32ZErica Bugden
<p>Babeltrace is not able to read a trace that does not contain data stream IDs (empty packet header) when the metadata specifies an ID for the stream. Trace Compass is also unable to read this trace. If the lines mentioning the stream ID are removed from the metadata, then babeltrace is able to read the trace.</p>
<a name="Does-this-need-to-be-fixed"></a>
<h3 >Does this need to be fixed?<a href="#Does-this-need-to-be-fixed" class="wiki-anchor">¶</a></h3>
<p>This was first reported as a barectf issue <a class="external" href="https://github.com/efficios/barectf/issues/28">https://github.com/efficios/barectf/issues/28</a> , but after further consideration the issue appears to be in a grey area in the CTF 1.8 specification. Whether it is the CTF generator's (barectf) responsibility or the CTF reader's responsibility is up for debate.</p>
<p>The superfluous metadata information need not prevent the trace from being parsed. For example, yactfr <a class="external" href="https://github.com/eepp/yactfr">https://github.com/eepp/yactfr</a> is able to read the trace. However, considering that:</p>
<ul>
<li>At least two CTF readers (babeltrace and Trace Compass) cannot read the trace, and</li>
<li>It is easy to fix in barectf (Proposed fix: <a class="external" href="https://review.lttng.org/c/barectf/+/9868">https://review.lttng.org/c/barectf/+/9868</a> )</li>
</ul>
<p>this is not a high priority issue in babeltrace.</p>
<a name="Context"></a>
<h1 >Context<a href="#Context" class="wiki-anchor">¶</a></h1>
<p>- Background and reproducer: barectf github issue <a class="external" href="https://github.com/efficios/barectf/issues/28">https://github.com/efficios/barectf/issues/28</a> <br />- Background and specification discussion: barectf fix <a class="external" href="https://review.lttng.org/c/barectf/+/9868">https://review.lttng.org/c/barectf/+/9868</a></p> LTTng-tools - Bug #1371 (New): Cannot have two non-root session daemons in two separate Docker co...https://bugs.lttng.org/issues/13712023-04-18T23:33:11ZChristophe Bedard
<p>I have 2 separate Docker containers. The user in each container is the same user as my non-root host user (same name, same UID, same GID). Users in the 2 Docker containers have the same $HOME, which is a directory mounted from the host.</p>
<p>To set up and configure each container:</p>
<p>1. Start container and pass all necessary info through:</p>
<pre>
host$ mkdir ~/fake-home
host$ docker run -it --env USER=$USER --env GROUP=$USER --env USER_ID=$(id -u) --env GROUP_ID=$(id -g) -v $HOME/fake-home:$HOME ubuntu:22.04 bash
</pre>
<p>2. Inside each container, create user, install lttng-tools, and start shell as $USER:</p>
<pre>
containerN$ groupdel "$GROUP" &>/dev/null || true
containerN$ groupadd -og "$GROUP_ID" "$GROUP"
containerN$ useradd -M -u "$USER_ID" -g "$GROUP_ID" -d "/home/$USER" -s /bin/bash "$USER"
containerN$ usermod -aG sudo "$USER"
containerN$ apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y lttng-tools
containerN$ su - "$USER"
</pre>
<p>Then, I can start a non-root session daemon in container 1 and trace just fine:</p>
<pre>
container1$ lttng list # No session daemon, as expected
Error: No session daemon is available
container1$ lttng-sessiond --daemonize
container1$ pgrep lttng-sessiond # As expected, we have a session daemon
459
container1$ cat $HOME/.lttng/lttng-sessiond.pid # Matches above output
459
container1$ ps -o comm= $(cat $HOME/.lttng/lttng-sessiond.pid) # That PID indeed corresponds to the session daemon
lttng-sessiond
container1$ lttng list # Now this works, as expected
Currently no available recording session
container1$ # Trace, ..., works fine
container1$ babeltrace /home/$USER/lttng-traces/trace-container1 # Shows trace data
...
</pre>
<p>However, in container 2, LTTng thinks that there is already a session daemon, but tracing doesn't collect any data:</p>
<pre>
container2$ pgrep lttng-sessiond # No output, since there is no session daemon, as expected
container2$ lttng-sessiond --daemonize # Doesn't work, which is unexpected, since there isn't supposed to be a session daemon!
Error: A session daemon is already running.
container2$ cat $HOME/.lttng/lttng-sessiond.pid # Matches output from container 1, since they have the same $HOME directory
459
container1$ ps -o comm= 459 # ...but that PID doesn't correspond to any process in this container! (no output)
container2$ lttng list # Works, which is unexpected, since there isn't supposed to be a session daemon!
Currently no available recording session
container2$ # Trace, ..., works fine
container2$ babeltrace /home/$USER/lttng-traces/trace-container2 # Error, babeltrace says it's not a trace, which is unexpected
[error] Cannot open any trace for reading.
[error] opening trace "/home/$USER/lttng-traces/trace-container2" for reading.
[error] none of the specified trace paths could be opened.
container2$ ls /home/$USER/lttng-traces/trace-container2 # The trace directory has a ust/ subdirectory, but it's empty
ust
container2$ export LTTNG_HOME=~/some-other-dir # Setting/using a different LTTNG_HOME fixes everything
</pre>
<p>As mentioned in the last line above, using a different LTTNG_HOME allows container2 to successfully start its own non-root session daemon and collect trace data. However, the issue is that this fails silently: LTTng doesn't detect this weird configuration/state and doesn't give an error.</p>
<p>I can detect this myself by checking if the PID in the $HOME/.lttng/lttng-sessiond.pid file is a 'lttng-sessiond' process:</p>
<pre>
[ $(ps -o comm= $(cat $HOME/.lttng/lttng-sessiond.pid)) = 'lttng-sessiond' ]
</pre>
<p>If not, then that means that the session daemon isn't really "available," and I can report an error in my own tool and tell my user to set LTTNG_HOME to something other than $HOME and try again. I understand that this is a weird corner case, but it would be good if LTTng itself could handle this check, if possible.</p>
<p>Note: in my case, the $HOME directory inside the containers is not the actual host user's $HOME directory. However, if it were, maybe this could be reproducible with the host + 1 Docker container?</p> LTTng-tools - Bug #1367 (New): Sessions wih --no-output don't show lost packetshttps://bugs.lttng.org/issues/13672023-02-02T20:00:13ZOlivier Dion
<p>Creating a session without any output results in the lost packets count to be zero.</p>
<p>Steps to reproduce:</p>
<ol>
<li>lttng create --no-output</li>
<li>lttng enable-channel --userspace --subbuf-size=$(getconf PAGE_SIZE) --num-subbuf=2 channel0</li>
<li>lttng enable-event --userspace --channel=channel0 '*'</li>
<li>lttng start</li>
<li>./spam-ust.sh</li>
<li>lttng status</li>
</ol>
<p>Got `Statistics: Discard events: 0' but should be `Statistics: None' or `Statistics: Disard events: N'.</p>