LTTng bugs repository: Issues (https://bugs.lttng.org/)

LTTng-tools - Bug #1383 (New): The "cpu_id" context (available for filters) is not discoverable b...
https://bugs.lttng.org/issues/1383 (2023-07-23T16:22:58Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>I recently tried remembering how to filter by cpu_id when enabling an event, and found that there was no clear way to see from the lttng man pages that $ctx.cpu_id actually exists (other than a random example in the lttng-enable-event man page). AFAIU the man pages rely on "lttng add-context --list" to let the user discover the available contexts. This is likely because the cpu_id context is not available to the "add-context" command, as it would be redundant with the implicit context already sampled with the buffers.</p>
<p>We may want to have some way to let users discover those contexts which are filter-specific.</p>
<p>My use-case is to grab traces for only a few logical cpus (0-3) on my machine with 384 logical cpus.</p>
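<p>For reference, a minimal sketch of that use-case using the $ctx.cpu_id filter syntax from the lttng-enable-event(1) man page (the session name is a placeholder):</p>
<pre>
# Only record user-space events fired on logical CPUs 0-3.
lttng create cpu-filter-demo
lttng enable-event --userspace --all --filter '$ctx.cpu_id < 4'
lttng start
</pre>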

LTTng-tools - Bug #1379 (New): Document behavior of live mode with per-pid UST buffers
https://bugs.lttng.org/issues/1379 (2023-06-01T18:48:58Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>There is a design limitation of per-pid UST buffers when used in live mode which should be documented.</p>
<p>Basically, because per-pid buffers are reclaimed soon after the application exits, the events traced by a short-lived application may never appear in the output of a live trace.</p>
<p>This is caused by the fact that the live client periodically checks for new buffers, but there is no inherent reference kept on the streams before they disappear.</p>
<p>This means that the information will be available in the disk output of the trace, but it will be missing from the live mode output.</p>
<p>This could eventually be mitigated by having the relay daemon keep references on the streams it receives which are of interest to a live client, until the live client has referenced them.</p>
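<p>For illustration, a sketch of a session that exhibits the limitation (session, channel, and application names are placeholders):</p>
<pre>
# Live session with per-PID UST buffers.
lttng create live-perpid --live
lttng enable-channel --userspace --buffers-pid perpid-chan
lttng enable-event --userspace --all --channel perpid-chan
lttng start
./short-lived-app   # if it exits before the live client picks up its
                    # streams, its events reach the disk trace only
</pre>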

LTTng-tools - Bug #1378 (New): Document behavior of snapshot mode with per-pid UST buffers
https://bugs.lttng.org/issues/1378 (2023-06-01T18:01:12Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>We should document a design limitation of the per-pid UST buffers which can lead to unexpected results when used with the "snapshot" feature.</p>
<p>Basically, because the lifetime of per-pid buffers is bound to the application using them (and to the consumer daemon extracting data from them), they are freed almost immediately after the traced application exits. However, in a scenario where we are interested in capturing a snapshot of the content of flight recorder buffers soon after an application exits (e.g. because of a crash), those buffers are no longer available a few milliseconds after the exit.</p>
<p>We should document this design limitation, and eventually think about improvements (feature request?) to allow a configurable delay after application exit during which the consumer daemon keeps references on the buffers so they remain available for snapshots.</p>
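<p>As an illustration, a sketch of the problematic sequence (names are placeholders):</p>
<pre>
# Snapshot (flight recorder) session with per-PID UST buffers.
lttng create snap-perpid --snapshot
lttng enable-channel --userspace --buffers-pid perpid-chan
lttng enable-event --userspace --all --channel perpid-chan
lttng start
./crashing-app          # its buffers are freed moments after exit
lttng snapshot record   # too late: the app's events are already gone
</pre>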

Userspace RCU - Feature #1368 (New): Integrate RCU implementation from libside into liburcu
https://bugs.lttng.org/issues/1368 (2023-02-14T14:26:10Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>libside features a multi-domain RCU implementation semantically similar to the Linux kernel's SRCU implementation. It tracks grace periods with per-cpu counters rather than per-thread state.</p>
<p><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.h">https://github.com/efficios/libside/blob/master/src/rcu.h</a><br /><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.c">https://github.com/efficios/libside/blob/master/src/rcu.c</a></p>
<p>It is somewhat different from other urcu flavors because it supports multiple domains, whereas current liburcu flavors only have a single domain per process.</p>
<p>We would have to think thoroughly about the impacts of supporting multiple domains on other parts of liburcu such as the call_rcu worker thread. The worker thread would either have to be able to synchronize with multiple RCU domains, or we would have to require each worker thread to be associated with a single RCU domain.</p>
<p>Similar concerns arise with respect to data structures such as rculfhash which take a urcu flavor as parameter: with per-domain flavors, those would have to be associated with a specific RCU domain in addition to the flavor.</p>
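<p>As a purely hypothetical sketch of what this could look like (none of these names exist in liburcu today), a per-domain flavor might expose something along these lines:</p>
<pre>
/* Hypothetical API sketch; types and functions are illustrative only. */
struct urcu_domain;

struct urcu_domain *urcu_domain_create(void);
void urcu_domain_read_lock(struct urcu_domain *domain);
void urcu_domain_read_unlock(struct urcu_domain *domain);
void urcu_domain_synchronize_rcu(struct urcu_domain *domain);

/* One option discussed above: tie each call_rcu worker thread to a
 * single domain at creation time. */
struct call_rcu_data *urcu_domain_create_call_rcu_data(struct urcu_domain *domain);
</pre>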

LTTng-tools - Feature #1341 (New): Introduce "--shm-path-ust" to clarify use vs documentation
https://bugs.lttng.org/issues/1341 (2021-12-07T20:53:52Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>The "--shm-path" argument is documented as applying to all domains, but the kernel domain does not currently support it. This leads to a situation where users may budget their NVRAM for the space needed only for UST buffers, then upgrade to a new LTTng version which implements kernel support for shm-path, and face issues given the lack of available space.</p>
<p>I would recommend adding a new "--shm-path-ust" argument as an alias of "--shm-path", but documenting that it applies only to the UST domain. This way, users wishing to budget their NVRAM space only for UST tracing, while still performing kernel tracing in the same session, will be able to do so.</p>
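<p>Proposed usage would look like this (note that --shm-path-ust does not exist yet; this sketch assumes it behaves exactly like today's --shm-path):</p>
<pre>
# Budget NVRAM for UST buffers only, while still doing kernel tracing
# in the same session.
lttng create my-session --shm-path-ust=/mnt/nvram/lttng
lttng enable-event --userspace --all
lttng enable-event --kernel sched_switch
lttng start
</pre>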

LTTng - Feature #1288 (New): Combine all per-cpu shm areas for a channel into a single file
https://bugs.lttng.org/issues/1288 (2020-10-13T15:39:03Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>On the reason for having one shm file per cpu (per channel, per uid, per session): having one file per cpu is mainly done to facilitate NUMA-local memory allocation. That being said, I wonder whether, as a future improvement, we could combine all per-cpu buffers into a single shm file and issue numa_set_preferred piecewise when mmapping regions of the file. This would lessen the number of file descriptors needed by a significant amount. I suspect it would work, but we would need to do some prototyping to validate this approach first.</p>
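<p>A rough sketch of the idea, mapping per-cpu slices of a single shm file and using libnuma's mbind() rather than numa_set_preferred() to express the placement of each mapped range (the slice layout and names are illustrative; error handling elided):</p>
<pre>
#include <numa.h>
#include <numaif.h>
#include <sys/mman.h>

/* Map the slice of CPU `cpu` from a single shm file and prefer
 * allocating its pages on that CPU's NUMA node. `slice_len` is
 * assumed to be page-aligned. */
static void *map_cpu_slice(int shm_fd, int cpu, size_t slice_len)
{
	void *p = mmap(NULL, slice_len, PROT_READ | PROT_WRITE,
		       MAP_SHARED, shm_fd, (off_t) cpu * slice_len);
	unsigned long nodemask = 1UL << numa_node_of_cpu(cpu);

	mbind(p, slice_len, MPOL_PREFERRED, &nodemask,
	      sizeof(nodemask) * 8, 0);
	return p;
}
</pre>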

LTTng-tools - Feature #1287 (New): Use abstract sockets for lttng-consumerd UST shared memory files
https://bugs.lttng.org/issues/1287 (2020-10-13T15:35:32Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>Abstract sockets (see unix(7)) are not tied to the filesystem and have been available since Linux 2.2. They are Linux-specific.</p>
<p>They are the same as regular unix domain sockets, except that the first character of their path is a null byte. They have the benefit of not requiring the unlinking of files left behind.</p>
<p>We could use those abstract sockets in lttng-consumerd on Linux.</p>
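<p>For reference, binding an abstract socket only differs from a regular unix(7) socket in how sun_path is filled in (error handling elided):</p>
<pre>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

static int bind_abstract(const char *name)
{
	struct sockaddr_un addr;
	int fd = socket(AF_UNIX, SOCK_STREAM, 0);

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* Leaving sun_path[0] as '\0' makes the address abstract: no
	 * filesystem entry is created, so there is nothing to unlink. */
	strncpy(&addr.sun_path[1], name, sizeof(addr.sun_path) - 2);
	bind(fd, (struct sockaddr *) &addr,
	     offsetof(struct sockaddr_un, sun_path) + 1 + strlen(name));
	return fd;
}
</pre>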

LTTng-modules - Feature #1265 (New): Turn lttng-probes.c probe_list into a hash table
https://bugs.lttng.org/issues/1265 (2020-05-06T17:14:59Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>This turns the O(n^2) registration cost for n tracing systems into O(n), i.e. O(1) per system.</p>
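<p>A minimal sketch using the kernel's hashtable.h, keyed on the probe name (the entry layout is illustrative, not the actual lttng-probes.c structures):</p>
<pre>
#include <linux/hashtable.h>
#include <linux/jhash.h>
#include <linux/string.h>

static DEFINE_HASHTABLE(probe_table, 8);	/* 256 buckets */

struct probe_entry {		/* illustrative layout */
	const char *name;
	struct hlist_node node;
};

static u32 probe_hash(const char *name)
{
	return jhash(name, strlen(name), 0);
}

static void probe_register(struct probe_entry *e)
{
	hash_add(probe_table, &e->node, probe_hash(e->name));
}

static struct probe_entry *probe_lookup(const char *name)
{
	struct probe_entry *e;

	hash_for_each_possible(probe_table, e, node, probe_hash(name))
		if (!strcmp(e->name, name))
			return e;
	return NULL;
}
</pre>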

LTTng-modules - Feature #1264 (New): NOHZ support for lib ring buffer
https://bugs.lttng.org/issues/1264 (2020-05-06T17:14:02Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>The NOHZ infrastructure in the Linux kernel does not support notifier chains, which does not let LTTng play nicely with low power consumption setups for flight recorder (overwrite mode) live traces. One way to allow integration between NOHZ and LTTng would be to add support for such notifiers to the NOHZ kernel infrastructure.</p>

LTTng-modules - Feature #1263 (New): Re-integrate sample modules from libringbuffer into lttng-mo...
https://bugs.lttng.org/issues/1263 (2020-05-06T17:13:20Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>Re-integrate the sample modules from libringbuffer into lttng-modules. Those modules can be used as examples of how to use libringbuffer in contexts other than LTTng, and are useful for benchmarking the ring buffer library. See: <a class="external" href="https://github.com/compudj/linux-libringbuffer/tree/tip-current-ringbuffer-0.240">https://github.com/compudj/linux-libringbuffer/tree/tip-current-ringbuffer-0.240</a></p>

LTTng-tools - Bug #1256 (New): Check babeltrace1 stderr output in tests
https://bugs.lttng.org/issues/1256 (2020-04-08T19:23:18Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>Currently, babeltrace1's stderr output contains warnings which are not checked by the tests. Those warnings do not fail the overall babeltrace execution, but end up being unexpected regressions when they appear between releases.</p>
<p>We could route all babeltrace invocations through a helper which saves the stderr output and checks that it is empty after babeltrace has completed execution. If it is non-empty, a test error would be reported.</p>
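<p>The helper could be as simple as the following bash sketch (the function name and reporting convention are made up):</p>
<pre>
# Run babeltrace, capture stderr, and report an error if it is non-empty.
bt_check_stderr() {
	local errfile status
	errfile=$(mktemp)
	babeltrace "$@" 2>"$errfile"
	status=$?
	if [ -s "$errfile" ]; then
		echo "babeltrace emitted output on stderr:" >&2
		cat "$errfile" >&2
		status=1
	fi
	rm -f "$errfile"
	return $status
}
</pre>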

LTTng-tools - Feature #1197 (New): Use renameat2() to atomically exchange metadata on metadata re...
https://bugs.lttng.org/issues/1197 (2019-09-23T16:56:22Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>When using shm-path, a use-case is to expect the sessiond or the system to crash at any point during tracing.</p>
<p>If such a crash occurs while regenerating the metadata, lttng-crash may find a truncated file.</p>
<p>It would be nice if we could generate the new metadata in a ".metadata.new" file while keeping the old file around, and then exchange both files using renameat2().</p>
<p>However, renameat2() appeared in Linux 3.15, so we would have to provide a fallback for older kernels.</p>
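<p>The exchange itself is a single call; a sketch, assuming a glibc recent enough to provide the renameat2() wrapper (file names follow the description above, error handling mostly elided):</p>
<pre>
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int exchange_metadata(const char *dir)
{
	int dirfd = open(dir, O_DIRECTORY | O_RDONLY);
	/* Atomically swap the regenerated metadata with the old file. */
	int ret = renameat2(dirfd, ".metadata.new",
			    dirfd, "metadata", RENAME_EXCHANGE);

	if (ret < 0 && (errno == ENOSYS || errno == EINVAL)) {
		/* Fallback for kernels < 3.15: plain rename(), which is
		 * not atomic with respect to a concurrent crash. */
		ret = renameat(dirfd, ".metadata.new", dirfd, "metadata");
	}
	close(dirfd);
	return ret;
}
</pre>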

Babeltrace - Feature #1045 (New): Wire up debug info on lttng_ust_cyg_profile event fields
https://bugs.lttng.org/issues/1045 (2016-07-13T14:53:52Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>Paste from the #lttng IRC channel:</p>
<pre>
09:56 < rnsanchez> is there a default procedure for "hydrating" instrument-functions traces like this?
09:56 < rnsanchez> [13:54:09.866652414] (+0.000001178) priminho lttng_ust_cyg_profile:func_exit: { cpu_id = 1 }, { addr = 0x46BB90, call_site = 0x46BF7B }
09:56 < rnsanchez> (kind of replacing the addr with their proper symbols)
09:57 < milian> rnsanchez: I'm not an lttng dev, but could imagine that one would be able to write that by analyzing mmap + openat to find the offset into a library, which you can then feed into addr2line, or libdw/libbacktrace
09:59 < rnsanchez> I could propably pass it (babeltrace) through some script to do that. but since the trace is huge (and this is not even a "real" trace), I was wondering if there is a better way to do that
09:59 < milian> I'd also be interested in that
10:00 < rnsanchez> well maybe there is one. building a symbol-table cache for the known things (a binary of special interest) and then feeding babeltrace through awk, replacing the symbols found with their names
10:01 < rnsanchez> some would miss, of course, but perhaps a good amount would help
10:46 < Compudj> rnsanchez, milian: currently, babeltrace is a bit "hardwired" to the "ip" context for symbol resolution
10:46 < Compudj> but all the infrastructure code is there
10:49 < Compudj> see babeltrace: formats/ctf-text/types/integer.c
10:49 < Compudj> there is a call to ctf_text_integer_write_debug_info
10:50 < Compudj> implemented in include/babeltrace/trace-debug-info.h
10:50 < Compudj> it checks if integer_definition->debug_info_src is non-null
10:51 < Compudj> this is wired up in lib/debug-info.c register_event_debug_infos()
10:51 < Compudj> it is where it is tied to the "ip" context
10:51 < Compudj> it should be extended to be tied to the lttng_ust_cyg_profile event fields too
</pre>

Userspace RCU - Feature #940 (New): Wire up sys membarrier on each architecture
https://bugs.lttng.org/issues/940 (2015-09-26T16:00:41Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)

LTTng-UST - Bug #525 (Confirmed): new "notifications" from UST do not strictly respect LTTNG_UST_...
https://bugs.lttng.org/issues/525 (2013-05-07T15:25:54Z, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>)
<p>We should eventually find a way to improve notifications from UST to the sessiond so they don't delay .so loading (and thus application startup) when the LTTNG_UST_REGISTER_TIMEOUT environment variable is set to 0. Currently, we work around this issue by giving the notification socket a minimum timeout of 100ms whenever the LTTNG_UST_REGISTER_TIMEOUT value is below 100.</p>
<p>A cleaner fix could involve handling these notifications from a (possibly new) separate thread, and using a semaphore-based scheme to handle the optional wait from the application.</p>
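<p>A sketch of that semaphore-based scheme (thread and symbol names are hypothetical; LTTNG_UST_REGISTER_TIMEOUT semantics: -1 waits forever, 0 does not wait, otherwise a timeout in ms):</p>
<pre>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

static sem_t register_done;	/* sem_init(&register_done, 0, 0) at library init */

static void *notif_thread(void *arg)
{
	/* ... perform sessiond registration and notifications ... */
	sem_post(&register_done);
	return NULL;
}

/* Called from the library constructor: optionally wait for registration
 * without ever blocking .so loading when the timeout is 0. */
static void wait_for_register(int timeout_ms)
{
	struct timespec ts;

	if (timeout_ms == 0)
		return;
	if (timeout_ms < 0) {
		sem_wait(&register_done);
		return;
	}
	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += timeout_ms / 1000;
	ts.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec++;
		ts.tv_nsec -= 1000000000L;
	}
	sem_timedwait(&register_done, &ts);
}
</pre>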