LTTng bugs repository: Issueshttps://bugs.lttng.org/https://bugs.lttng.org/themes/lttng/favicon/a.ico?14249722912024-02-21T17:04:43ZLTTng bugs repository
Redmine Babeltrace - Feature #1410 (New): Add pretty-print sink options to override how to format floatin...https://bugs.lttng.org/issues/14102024-02-21T17:04:43ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Babeltrace 2, just like its predecessor babeltrace 1.5, uses "%g" to print floating point numbers.</p>
<p>It would be useful to let users optionally override the formatting for floating point numbers, perhaps with a pretty print sink option. They could then choose if exponent notation should be used or not, choose the precision, etc.</p> LTTng-tools - Bug #1379 (New): Document behavior of live mode with per-pid UST buffershttps://bugs.lttng.org/issues/13792023-06-01T18:48:58ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>There is a design limitation of per-pid UST buffers when used in live mode which should be documented.</p>
<p>Basically, because the per-pid buffers are reclaimed soon after the application exits, it may cause the events traced by a short-lived application to never appear in the output of a live trace when per-pid buffers are used.</p>
<p>This is caused by the fact that the live client periodically checks for new buffers, but there is no inherent reference kept on the streams before they disappear.</p>
<p>This means that the information will be available in the disk output of the trace, but it will be missing from the live mode output.</p>
<p>This could eventually be mitigated by keeping references to the streams received by the relay daemon which are of interest to a live client until they are referenced by the live client.</p> LTTng-tools - Bug #1378 (New): Document behavior of snapshot mode with per-pid UST buffershttps://bugs.lttng.org/issues/13782023-06-01T18:01:12ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>We should document a design limitation of the per-pid UST buffers which can lead to unexpected results when used with the "snapshot" feature.</p>
<p>Basically, because the per-pid buffers lifetime is bound by the application using them (and the consumer daemon extracting data from them), they are freed almost immediately after the traced application exits. However, in a scenario where we are interested in capturing a snapshot of the content of flight recorder buffers soon after application exits (e.g. caused by a crash), those buffers are not available anymore a few milliseconds after the application exits.</p>
<p>We should document this design limitation, and eventually think about improvements (feature request ?) to allow a configurable delay after application exit during which the consumer daemon could keep references on the buffers so they are available for snapshot.</p> Userspace RCU - Feature #1368 (New): Integrate RCU implementation from libside into liburcuhttps://bugs.lttng.org/issues/13682023-02-14T14:26:10ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>libside features a multi-domain RCU implementation semantically similar to the Linux kernel's SRCU's implementation. It tracks grace periods with per-cpu counters rather than per-thread state.</p>
<p><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.h">https://github.com/efficios/libside/blob/master/src/rcu.h</a><br /><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.c">https://github.com/efficios/libside/blob/master/src/rcu.c</a></p>
<p>It is somewhat different from other urcu flavors because it supports multiple domains, whereas current liburcu flavors only have a single domain per process.</p>
<p>We would have to think thoroughly about the impacts of supporting multiple domains on other parts of liburcu such as the call_rcu worker thread. The worker thread would either have to be able to synchronize with multiple RCU domains, or we would have to require each worker thread to be associated with a single RCU domain.</p>
<p>Similar concerns arise with respect to data structures such as rculfhash which take a urcu flavor as parameter: with per-domain flavor, those would have to be associated with a specific RCU domain in addition to the flavor.</p> LTTng-tools - Bug #1340 (New): Document behavior of per-uid UST buffers with respect to asynchron...https://bugs.lttng.org/issues/13402021-12-07T20:22:28ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>We should document a known design limitation of the per-uid UST buffers:</p>
<p>When using lttng-ust with per-uid buffers, there is a known design limitation<br />regarding process terminated by non-handled kill signals: if this occurs while<br />the buffer is being written to (between reserve and commit), it will cause the<br />rest of the per-cpu buffer to be unreadable. This applies to all events<br />recorded by all applications of the same user id for that particular CPU.<br />Recovering from this involves destroying and re-creating the tracing session.</p>
<p>If this kind of scenario is likely for you, you might want to consider using per-<br />pid buffers (lttng enable-channel -u --buffers-pid), which do not suffer from<br />this design limitation. The downside of per-pid buffers is that it allocates<br />more memory on your system for buffering, and adds extra overhead when<br />used with processes that have a short life-time.</p> Babeltrace - Feature #1300 (New): babeltrace2 lacks knowledge of bitwise enum produced by lttng-m...https://bugs.lttng.org/issues/13002021-03-23T17:41:47ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>This is a follow up on <a class="external" href="https://review.lttng.org/c/babeltrace/+/3045">https://review.lttng.org/c/babeltrace/+/3045</a> now that end users are starting to contact us wondering why up-to-date lttng and babeltrace2 show incomplete information for bitwise enums.</p>
<p>I think we should implement a better stop-gap solution while awaiting CTF2. I recommend that we specialize the babeltrace2 CTF input plugin to detect traces produced by the lttng-modules kernel tracer, for specific known (hardcoded) events and fields which have a bitwise enum semantic, and use this hardcoded knowledge to print the correct bitwise enum to the end user rather than "<unknown>".</p>
<p>This should ensure the current producer/consumer tools work well together (show complete information about those bitwise enums) without requiring to modify tons of documentation about the specific flags needed for bt2 to handle lttng traces appropriately. And this would not lead to misleading scenarios where we print an unknown value as a bitwise enum by mistake.</p>
<p>Hardcoded producer detection/event/fields is of course a last resort solution awaiting proper self-description in the upcoming CTF2 spec.</p> Babeltrace - Bug #1221 (New): Babeltrace 2 sometimes fail on live trace with lttng-clear featurehttps://bugs.lttng.org/issues/12212020-02-17T20:52:19ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Babeltrace 2 sometimes fail on live trace with lttng-clear feature</p>
<p>This may happen in a scenario where per-pid buffers are used, an application has just quit, and babeltrace 2 observes a stream in make_viewer_streams, but then fails to open the associated index or data file on the first get_next_index because it has been unlinked by clear.</p>
<p>The babeltrace output is:</p>
<pre>
12-04 14:01:48.455 11884 11884 E PLUGIN/CTF/MSG-ITER create_msg_stream_end@msg-iter.c:2464 [lttng-live] Cannot create stream for stream message: notit-addr=0x55b89f957980
12-04 14:01:48.455 11884 11884 W LIB/MSG-ITER bt_self_component_port_input_message_iterator_next@iterator.c:906 Component input port message iterator's "next" method failed: iter-addr=0x55b89f93f510, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:444 [muxer] Error or unsupported status code: status-code=-1
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:975 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x55b89f950950, muxer-upstream-msg-iter-wrap-addr=0x55b89f9415d0
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1371 [muxer] Cannot get next message: comp-addr=0x55b89f93ebc0, muxer-comp-addr=0x55b89f93ec40, muxer-msg-iter-addr=0x55b89f950950, msg-iter-addr=0x55b89f93f3a0, status=ERROR
12-04 14:01:48.455 11884 11884 W LIB/MSG-ITER bt_self_component_port_input_message_iterator_next@iterator.c:906 Component input port message iterator's "next" method failed: iter-addr=0x55b89f93f3a0, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
12-04 14:01:48.455 11884 11884 W LIB/GRAPH consume_graph_sink@graph.c:596 Component's "consume" method failed: status=ERROR, comp-addr=0x55b89f93ed30, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x55b89f94c740, comp-class-so-handle-path="/usr/local/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
12-04 14:01:48.456 11884 11884 E CLI cmd_run@babeltrace2.c:2574 Graph failed to complete successfully
ERROR: [Babeltrace CLI] (babeltrace2.c:2574)
Graph failed to complete successfully
CAUSED BY [Babeltrace library] (graph.c:596)
Component's "consume" method failed: status=ERROR, comp-addr=0x55b89f93ed30,
comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
(`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x55b89f94c740,
comp-class-so-handle-path="/usr/local/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
comp-input-port-count=1, comp-output-port-count=0
CAUSED BY [Babeltrace library] (iterator.c:906)
Component input port message iterator's "next" method failed:
iter-addr=0x55b89f93f3a0, iter-upstream-comp-name="muxer",
iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
iter-upstream-comp-class-name="muxer",
iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [Babeltrace library] (iterator.c:906)
Component input port message iterator's "next" method failed:
iter-addr=0x55b89f93f510, iter-upstream-comp-name="lttng-live",
iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
iter-upstream-comp-class-name="lttng-live",
iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
</pre>
<p>notit->stream is expected not to be NULL when reaching the state STATE_EMIT_MSG_STREAM_END, but if no packet was received, it is indeed still NULL, because it is the packet header being read that moves to the STATE_CHECK_EMIT_MSG_STREAM_BEGINNING state (after_packet_context_state()).</p> LTTng-UST - Bug #1203 (New): Improve documentation of lttng-ust with daemonshttps://bugs.lttng.org/issues/12032019-10-07T15:51:26ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>The lttng-ust(3) man page should be clarified regarding the scenarios where the liblttng-ust-fork.so wrapper is needed.</p>
<p>It should document that it is needed in scenarios where applications using fork(2), clone(2) (with CLONE_VM flag cleared), BSD's rfork(2), daemon(3), without <strong>immediately</strong> following those by an exec(3) family call. This includes scenarios where close(2) is used to close file descriptors before invoking exec(3), even though close(2) is listed as an async-signal-safe function.</p> LTTng - Feature #968 (Feedback): lttng-modules kernel and user callstack contexthttps://bugs.lttng.org/issues/9682015-10-25T20:14:40ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Implementation from Francis Giraldeau reviewed and cleaned up available here:</p>
<p><a class="external" href="https://github.com/compudj/lttng-tools-dev/tree/callstack">https://github.com/compudj/lttng-tools-dev/tree/callstack</a><br /><a class="external" href="https://github.com/compudj/lttng-modules-dev/tree/callstack">https://github.com/compudj/lttng-modules-dev/tree/callstack</a></p>
<p>Documentation and tests are missing.</p> LTTng-UST - Feature #965 (New): Implement UST statedumphttps://bugs.lttng.org/issues/9652015-10-22T20:43:37ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Initial implementation: <a class="external" href="https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier">https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier</a></p>
<p>Missing tests in lttng-tools for this feature before we can merge it into lttng-ust.</p> LTTng-modules - Feature #964 (New): Implement support for persistent memory buffershttps://bugs.lttng.org/issues/9642015-10-22T20:41:43ZMathieu Desnoyersmathieu.desnoyers@efficios.comLTTng-modules - Feature #963 (New): Implement user-space stack dump contexthttps://bugs.lttng.org/issues/9632015-10-22T20:41:10ZMathieu Desnoyersmathieu.desnoyers@efficios.comLTTng-UST - Feature #527 (New): Add git version information to project versionhttps://bugs.lttng.org/issues/5272013-05-10T12:54:46ZMathieu Desnoyersmathieu.desnoyers@efficios.comLTTng-UST - Feature #520 (New): Allow override of /var/run directoryhttps://bugs.lttng.org/issues/5202013-05-04T13:59:36ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Allow override of /var/run directory, as will as per-user $HOME/.lttng/ directory at configure time, and by environment variables. This should match a similar feature in lttng-tools.</p> LTTng-UST - Feature #446 (New): Improve process startup time with many eventshttps://bugs.lttng.org/issues/4462013-02-15T12:54:19ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>J9 VM instrumentation has 16k individual events. Latest UST changes improves the process startup time from about 180s down to 2-4s (depending if tracing is active or not). However, this is still far from the 200ms process startup time normally expected for J9 VM.</p>
<p>There are a couple of ways to improve things:</p>
<p>a) implement a pre-computed hash table and or radix tree within the probe provider. Given the tracepoint names within a provider are known statically, we could construct a data structure to access them efficiently after object compilation, generate C code to construct those structures, and generate an output object, which would be linked with the provider object to create the shared library. The hash table would fit for use-cases where events are enabled by name, and radix tree (or something similar) would be better suited for wildcards.</p>
<p>b) a simpler solution for the disabled tracing case might be to delay addition of tracepoints to the hash table until the moment this data structure is actually needed (lazy initialization).</p>