LTTng bugs repository: Issueshttps://bugs.lttng.org/https://bugs.lttng.org/themes/lttng/favicon/a.ico?14249722912024-02-21T17:04:43ZLTTng bugs repository
Redmine Babeltrace - Feature #1410 (New): Add pretty-print sink options to override how to format floatin...https://bugs.lttng.org/issues/14102024-02-21T17:04:43ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Babeltrace 2, just like its predecessor babeltrace 1.5, uses "%g" to print floating point numbers.</p>
<p>It would be useful to let users optionally override the formatting for floating point numbers, perhaps with a pretty print sink option. They could then choose if exponent notation should be used or not, choose the precision, etc.</p> LTTng-tools - Bug #1383 (New): The "cpu_id" context (available for filters) is not discoverable b...https://bugs.lttng.org/issues/13832023-07-23T16:22:58ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>I recently tried remembering how to filter by cpu_id when enabling an event, and found that there was not clear way to see that $ctx.cpu_id actually exists from the lttng man pages (other than a random example in the lttng-enable-event man page). AFAIU the man pages rely on lttng add-context --list to let the user discover the available contexts. This is likely because the cpu_id context is not available for the "add-context" command because it would be redundant with the implicit context already sampled with the buffers.</p>
<p>We may want to have some way to let users discovers those contexts which are filter-specific.</p>
<p>My use-case is to try to grab only traces for a few logical cpus (0-3) from my 384 logical cpus machine.</p> LTTng-tools - Bug #1379 (New): Document behavior of live mode with per-pid UST buffershttps://bugs.lttng.org/issues/13792023-06-01T18:48:58ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>There is a design limitation of per-pid UST buffers when used in live mode which should be documented.</p>
<p>Basically, because the per-pid buffers are reclaimed soon after the application exits, it may cause the events traced by a short-lived application to never appear in the output of a live trace when per-pid buffers are used.</p>
<p>This is caused by the fact that the live client periodically checks for new buffers, but there is no inherent reference kept on the streams before they disappear.</p>
<p>This means that the information will be available in the disk output of the trace, but it will be missing from the live mode output.</p>
<p>This could eventually be mitigated by keeping references to the streams received by the relay daemon which are of interest to a live client until they are referenced by the live client.</p> Babeltrace - Feature #1300 (New): babeltrace2 lacks knowledge of bitwise enum produced by lttng-m...https://bugs.lttng.org/issues/13002021-03-23T17:41:47ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>This is a follow up on <a class="external" href="https://review.lttng.org/c/babeltrace/+/3045">https://review.lttng.org/c/babeltrace/+/3045</a> now that end users are starting to contact us wondering why up-to-date lttng and babeltrace2 show incomplete information for bitwise enums.</p>
<p>I think we should implement a better stop-gap solution while awaiting CTF2. I recommend that we specialize the babeltrace2 CTF input plugin to detect traces produced by the lttng-modules kernel tracer, for specific known (hardcoded) events and fields which have a bitwise enum semantic, and use this hardcoded knowledge to print the correct bitwise enum to the end user rather than "<unknown>".</p>
<p>This should ensure the current producer/consumer tools work well together (show complete information about those bitwise enums) without requiring to modify tons of documentation about the specific flags needed for bt2 to handle lttng traces appropriately. And this would not lead to misleading scenarios where we print an unknown value as a bitwise enum by mistake.</p>
<p>Hardcoded producer detection/event/fields is of course a last resort solution awaiting proper self-description in the upcoming CTF2 spec.</p> Babeltrace - Bug #1299 (Confirmed): babeltrace2 re-uses prior event string rather than expected e...https://bugs.lttng.org/issues/12992021-03-04T18:49:50ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Trying to reproduce the page fault scenario for an openat() event. Note that the behavior of generating a page fault may be toolchain-specific (linker-specific):</p>
<p>Test program "do-pagefault-open":</p>
<p>#include <stdio.h><br />#include <sys/types.h><br />#include <sys/stat.h><br />#include <fcntl.h></p>
<p>const char *str = "/tmp/blah";<br />const char *str2 = "/tmp/blah2";</p>
<p>int main()
{<br /> open(str, O_RDWR);<br /> perror("first");<br /> open(str2, O_RDWR);<br /> perror("second");<br /> return 0;<br />}</p>
<p>then trace with:</p>
<p>lttng-trace ./do-pagefault-open</p>
<p>Relevant events with babeltrace2 (grepping for 'openat'):</p>
<p>[13:47:12.296015570] (+0.000015421) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/etc/ld.so.cache", flags = ( "O_CLOEXEC" : container = 524288 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296029004] (+0.000013434) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = 3 }<br />[13:47:12.296116608] (+0.000058899) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/lib/x86_64-linux-gnu/libc.so.6", flags = ( "O_CLOEXEC" : container = 524288 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296126244] (+0.000009636) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = 3 }<br />[13:47:12.296706313] (+0.000096888) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/lib/x86_64-linux-gnu/libc.so.6", flags = ( "O_RDWR" : container = 2 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296717797] (+0.000011484) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = -2 }<br />[13:47:12.296954228] (+0.000005481) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/tmp/blah2", flags = ( "O_RDWR" : container = 2 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296960451] (+0.000006223) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = -2 }</p>
<p>Same trace with babeltrace 1:</p>
<p>[warning] Unknown value 0 in enum.<br />[warning] Unknown value 0 in enum.<br />[warning] Unknown value 3 in enum.<br />[warning] Unknown value 5 in enum.<br />[warning] Unknown value 129 in enum.<br />[warning] Unknown value 129 in enum.<br />[warning] Unknown value 3 in enum.<br />[warning] Unknown value 129 in enum.<br />[warning] Unknown value 3 in enum.<br />[warning] Unknown value 3 in enum.<br />[13:47:12.296015570] (+0.000015421) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/etc/ld.so.cache", flags = ( "O_CLOEXEC" : container = 524288 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296029004] (+0.000013434) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = 3 }<br />[13:47:12.296116608] (+0.000058899) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/lib/x86_64-linux-gnu/libc.so.6", flags = ( "O_CLOEXEC" : container = 524288 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296126244] (+0.000009636) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = 3 }<br />[warning] Unknown value 0 in enum.<br />[warning] Unknown value 0 in enum.<br />[13:47:12.296706313] (+0.000096888) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "", flags = ( "O_RDWR" : container = 2 ), mode = ( <unknown> : container = 0 ) }<br />[13:47:12.296717797] (+0.000011484) thinkos syscall_exit_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { ret = -2 }<br />[13:47:12.296954228] (+0.000005481) thinkos syscall_entry_openat: { cpu_id = 3 }, { procname = "do-pagefault-op", vpid = 31576, vtid = 31576 }, { dfd = -100, filename = "/tmp/blah2", flags = ( "O_RDWR" : container = 2 ), mode = ( <unknown> : container = 0 ) }</p>
<p>Notice how filename = "/lib/x86_64-linux-gnu/libc.so.6" appears twice in the bt2 output, where we would expect the empty string (as provided by bt1).</p>
<p>Tested with babeltrace2 commit c429f86d0.</p> Babeltrace - Bug #1221 (New): Babeltrace 2 sometimes fail on live trace with lttng-clear featurehttps://bugs.lttng.org/issues/12212020-02-17T20:52:19ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Babeltrace 2 sometimes fail on live trace with lttng-clear feature</p>
<p>This may happen in a scenario where per-pid buffers are used, an application has just quit, and babeltrace 2 observes a stream in make_viewer_streams, but then fails to open the associated index or data file on the first get_next_index because it has been unlinked by clear.</p>
<p>The babeltrace output is:</p>
<pre>
12-04 14:01:48.455 11884 11884 E PLUGIN/CTF/MSG-ITER create_msg_stream_end@msg-iter.c:2464 [lttng-live] Cannot create stream for stream message: notit-addr=0x55b89f957980
12-04 14:01:48.455 11884 11884 W LIB/MSG-ITER bt_self_component_port_input_message_iterator_next@iterator.c:906 Component input port message iterator's "next" method failed: iter-addr=0x55b89f93f510, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:444 [muxer] Error or unsupported status code: status-code=-1
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:975 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x55b89f950950, muxer-upstream-msg-iter-wrap-addr=0x55b89f9415d0
12-04 14:01:48.455 11884 11884 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1371 [muxer] Cannot get next message: comp-addr=0x55b89f93ebc0, muxer-comp-addr=0x55b89f93ec40, muxer-msg-iter-addr=0x55b89f950950, msg-iter-addr=0x55b89f93f3a0, status=ERROR
12-04 14:01:48.455 11884 11884 W LIB/MSG-ITER bt_self_component_port_input_message_iterator_next@iterator.c:906 Component input port message iterator's "next" method failed: iter-addr=0x55b89f93f3a0, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
12-04 14:01:48.455 11884 11884 W LIB/GRAPH consume_graph_sink@graph.c:596 Component's "consume" method failed: status=ERROR, comp-addr=0x55b89f93ed30, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x55b89f94c740, comp-class-so-handle-path="/usr/local/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
12-04 14:01:48.456 11884 11884 E CLI cmd_run@babeltrace2.c:2574 Graph failed to complete successfully
ERROR: [Babeltrace CLI] (babeltrace2.c:2574)
Graph failed to complete successfully
CAUSED BY [Babeltrace library] (graph.c:596)
Component's "consume" method failed: status=ERROR, comp-addr=0x55b89f93ed30,
comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
(`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x55b89f94c740,
comp-class-so-handle-path="/usr/local/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
comp-input-port-count=1, comp-output-port-count=0
CAUSED BY [Babeltrace library] (iterator.c:906)
Component input port message iterator's "next" method failed:
iter-addr=0x55b89f93f3a0, iter-upstream-comp-name="muxer",
iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
iter-upstream-comp-class-name="muxer",
iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [Babeltrace library] (iterator.c:906)
Component input port message iterator's "next" method failed:
iter-addr=0x55b89f93f510, iter-upstream-comp-name="lttng-live",
iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
iter-upstream-comp-class-name="lttng-live",
iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
</pre>
<p>notit->stream is expected not to be NULL when reaching the state STATE_EMIT_MSG_STREAM_END, but if no packet was received, it is indeed still NULL, because it is the packet header being read that moves to the STATE_CHECK_EMIT_MSG_STREAM_BEGINNING state (after_packet_context_state()).</p> LTTng-tools - Bug #1182 (New): relayd mishandles channels configured with tracefile-count == 0 an...https://bugs.lttng.org/issues/11822019-04-02T17:25:50ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Relayd mistakenly considers that the criterion tracefile-count == 0 means no tracefile rotation should happen for the stream.</p>
<p>However, it's possible that tracefile-size is set to non-zero, but tracefile-count is left at its default value of 0.</p>
<p>The expected behavior would be to have an ever increasing counter (no limit) of tracefiles each "size" in size at most.</p>
<p>I suspect the relayd receiving side may also be broken, because its rotation handling code also has the same incorrect assumptions.</p>
<p>When connecting a live client to such a session, we get on the client side:<br />[error] Remote side has closed connection</p>
<p>And a PERROR on relayd:<br />PERROR - 13:16:28.097800421 [8061/8068]: opening index in read-only: No such file or directory (in lttng_index_file_open() at index.c:261)</p>
<p>Likely caused by confusion between tracefile rotation suffixes in the filename.</p> Babeltrace - Feature #1045 (New): Wire up debug info on lttng_ust_cyg_profile event fieldshttps://bugs.lttng.org/issues/10452016-07-13T14:53:52ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>#lttng paste</p>
<p>09:56 < rnsanchez> is there a default procedure for "hydrating" instrument-functions traces like this?<br />09:56 < rnsanchez> [13:54:09.866652414] (+0.000001178) priminho lttng_ust_cyg_profile:func_exit: { cpu_id = 1 }, { addr = 0x46BB90, call_site = 0x46BF7B }<br />09:56 < rnsanchez> (kind of replacing the addr with their proper symbols)<br />09:57 < milian> rnsanchez: I'm not an lttng dev, but could imagine that one would be able to write that by analyzing mmap + openat to find the offset into a library, which you can then feed into addr2line, or libdw/libbacktrace<br />09:59 < rnsanchez> I could propably pass it (babeltrace) through some script to do that. but since the trace is huge (and this is not even a "real" trace), I was wondering if there is a better way to do that<br />09:59 < milian> I'd also be interested in that<br />10:00 < rnsanchez> well maybe there is one. building a symbol-table cache for the known things (a binary of special interest) and then feeding babeltrace through awk, replacing the symbols found with their names<br />10:01 < rnsanchez> some would miss, of course, but perhaps a good amount would help<br />10:46 < Compudj> rnsanchez, milian: currently, babeltrace is a bit "hardwired" to the "ip" context for symbol resolution<br />10:46 < Compudj> but all the infrastructure code is there<br />10:49 < Compudj> see babeltrace: formats/ctf-text/types/integer.c<br />10:49 < Compudj> there is a call to ctf_text_integer_write_debug_info<br />10:50 < Compudj> implemented in include/babeltrace/trace-debug-info.h<br />10:50 < Compudj> it checks if integer_definition->debug_info_src is non-null<br />10:51 < Compudj> this is wired up in lib/debug-info.c register_event_debug_infos()<br />10:51 < Compudj> it is where it is tied to the "ip" context<br />10:51 < Compudj> it should be extended to be tied to the lttng_ust_cyg_profile event fields too</p> LTTng-tools - Bug #987 (Confirmed): Incomplete string comparison patternhttps://bugs.lttng.org/issues/9872016-01-06T23:23:31ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>In a few instances in lttng-tools, a string comparison pattern involving strncmp and strlen() to limit the length of the comparison is used.</p>
<p>One can quickly find them with:</p>
<p>grep -r -A 3 strncmp . | less -> search for strlen or strnlen</p>
<p>Many of those are OK: they indeed only want to match the beginning of the strings. However, this pattern is incorrectly used in cases where a full match is required, for instance:</p>
<p>ust-registry.c:ht_match_event()<br />./src/bin/lttng/commands/enable_channels.c: if (!strncmp(output_mmap, opt_output, strlen(output_mmap))) {<br />./src/bin/lttng/commands/enable_channels.c: } else if (!strncmp(output_splice, opt_output, strlen(output_splice))) {<br />./src/bin/lttng-sessiond/snapshot.c: if (!strncmp(output->name, name, strlen(name))) {</p> LTTng - Feature #968 (Feedback): lttng-modules kernel and user callstack contexthttps://bugs.lttng.org/issues/9682015-10-25T20:14:40ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Implementation from Francis Giraldeau reviewed and cleaned up available here:</p>
<p><a class="external" href="https://github.com/compudj/lttng-tools-dev/tree/callstack">https://github.com/compudj/lttng-tools-dev/tree/callstack</a><br /><a class="external" href="https://github.com/compudj/lttng-modules-dev/tree/callstack">https://github.com/compudj/lttng-modules-dev/tree/callstack</a></p>
<p>Documentation and tests are missing.</p> LTTng-UST - Feature #965 (New): Implement UST statedumphttps://bugs.lttng.org/issues/9652015-10-22T20:43:37ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>Initial implementation: <a class="external" href="https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier">https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier</a></p>
<p>Missing tests in lttng-tools for this feature before we can merge it into lttng-ust.</p> LTTng-modules - Feature #964 (New): Implement support for persistent memory buffershttps://bugs.lttng.org/issues/9642015-10-22T20:41:43ZMathieu Desnoyersmathieu.desnoyers@efficios.comLTTng-modules - Feature #963 (New): Implement user-space stack dump contexthttps://bugs.lttng.org/issues/9632015-10-22T20:41:10ZMathieu Desnoyersmathieu.desnoyers@efficios.comUserspace RCU - Feature #941 (New): URCU flavor which can be used across processes using shared m...https://bugs.lttng.org/issues/9412015-09-26T16:23:20ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>The appears to be interest for a URCU flavor which can be used across a set of processes communicating through shared memory.</p> LTTng-tools - Bug #833 (Confirmed): memcpy of non-packed struct into packed struct (possible layo...https://bugs.lttng.org/issues/8332014-09-09T13:17:10ZMathieu Desnoyersmathieu.desnoyers@efficios.com
<p>lttng-tools src/lib/lttng-ctl/lttng-ctl.c:</p>
<p>lttng_enable_event_with_exclusions()</p>
<p>memcpy(&lsm.u.enable.event, ev, sizeof(lsm.u.enable.event));</p>
<p>copy "ev" (non-packed) into a packed structure.</p>
<p>We should copy each field one by one (create a copy_event_to_event_packed() helper to do so).</p>