LTTng bugs repository: Issues
https://bugs.lttng.org/
2023-02-14T14:26:10Z
Userspace RCU - Feature #1368 (New): Integrate RCU implementation from libside into liburcu
https://bugs.lttng.org/issues/1368
2023-02-14T14:26:10Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>libside features a multi-domain RCU implementation semantically similar to the Linux kernel's SRCU implementation. It tracks grace periods with per-cpu counters rather than per-thread state.</p>
<p><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.h">https://github.com/efficios/libside/blob/master/src/rcu.h</a><br /><a class="external" href="https://github.com/efficios/libside/blob/master/src/rcu.c">https://github.com/efficios/libside/blob/master/src/rcu.c</a></p>
<p>It is somewhat different from other urcu flavors because it supports multiple domains, whereas current liburcu flavors only have a single domain per process.</p>
<p>We would have to think thoroughly about the impacts of supporting multiple domains on other parts of liburcu such as the call_rcu worker thread. The worker thread would either have to be able to synchronize with multiple RCU domains, or we would have to require each worker thread to be associated with a single RCU domain.</p>
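To make the per-domain coupling concrete, here is a toy, single-threaded model of an SRCU-like domain with two grace-period slots. All names are illustrative; this is neither the libside nor the liburcu API, and a real implementation needs per-cpu counters, atomics, and memory barriers.

```c
#include <assert.h>

/* Toy model of one RCU domain: readers register in the slot for the
 * current period; synchronize flips the period and waits for the old
 * slot to drain. Illustrative only (no atomics, no barriers). */
struct toy_rcu_domain {
	unsigned long count[2];	/* readers per grace-period slot */
	int period;		/* which slot new readers use */
};

static int toy_read_lock(struct toy_rcu_domain *d)
{
	int p = d->period;

	d->count[p]++;
	return p;	/* the reader remembers its period */
}

static void toy_read_unlock(struct toy_rcu_domain *d, int p)
{
	d->count[p]--;
}

static void toy_synchronize(struct toy_rcu_domain *d)
{
	int old = d->period;

	d->period = 1 - old;	/* new readers go to the other slot */
	/* A real implementation would wait here for the old slot to
	 * drain; the wait is per-domain, which is exactly what makes a
	 * single shared call_rcu worker thread hard to keep. */
	assert(d->count[old] == 0);
}
```

Because each domain drains independently, a call_rcu worker would either need to run this wait once per domain it serves, or be dedicated to one domain.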
<p>Similar concerns arise with respect to data structures such as rculfhash, which take a urcu flavor as parameter: with per-domain flavors, those would have to be associated with a specific RCU domain in addition to the flavor.</p>

LTTng-UST - Feature #1100 (New): Add (possibly symbolic) event whenever blocking occurs due to ex...
https://bugs.lttng.org/issues/1100
2017-05-08T17:24:36Z · Ricardo Nabinger Sanchez (rnsanchez@gmail.com)
<p>When using <code>LTTNG_UST_BLOCKING_RETRY_TIMEOUT</code>, it is desirable to know when blocking occurred. This way, an analyst stays informed of periods in the trace where event precision had to be traded for completeness.</p>
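The packet-header-counter idea discussed below could be consumed roughly like this; the struct and field names are hypothetical, not an existing CTF layout:

```c
#include <stdint.h>

/* Hypothetical per-packet summary carrying a cumulative "blocked"
 * counter in the packet header (field names are made up). */
struct packet_info {
	uint64_t ts_begin;	/* timestamp of first event in packet */
	uint64_t ts_end;	/* timestamp of last event in packet */
	uint64_t blocked;	/* cumulative count of blocking occurrences */
};

/* If the counter increased between consecutive packets p and q, the
 * producer blocked somewhere in [p->ts_end, q->ts_begin]; a viewer
 * could render a marker over that interval. Returns 1 if so. */
static int blocked_between(const struct packet_info *p,
			   const struct packet_info *q,
			   uint64_t *from, uint64_t *to)
{
	if (q->blocked <= p->blocked)
		return 0;
	*from = p->ts_end;
	*to = q->ts_begin;
	return 1;
}
```

Since blocking always happens at sub-buffer (packet) boundaries, per-packet counters bracket the affected interval without adding events to the stream.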
<p>This has been discussed informally in the IRC channel:<br /><pre>
[09:08:06] rnsanchez such points could be of interest in my analysis :+)
[09:59:35] Compudj rnsanchez: no
[10:01:19] Compudj rnsanchez: but looking at event clusters in your trace will help
[10:01:36] rnsanchez Compudj: what about throwing a single/occasional event for a corresponding tid/pid that was ungraced due to excessive events?
[10:01:56] Compudj rnsanchez: with multi-session support, that would not be that easy
[10:02:04] Compudj we'd have to know which buffers are affected
[10:02:22] Compudj so we don't generate noise into other sessions
[10:02:36] rnsanchez I see
[10:02:37] rnsanchez sounds fair
[10:03:35] Compudj rnsanchez: if we add such info eventually, I'd be tempted to add a packet header field for that
[10:03:47] Compudj which could counts the "blocking" a buffer has had so far
[10:04:11] Compudj so we don't rely on events to relay information that is transport-level
[10:04:35] Compudj that would be doable (you may want to add a feature request on bugs.lttng.org)
[10:05:26] rnsanchez well that kind of info could be "rendered" into tools such as Trace Compass. it would be informative enough for the analyst to take a convolute period with a grain/spoon of salt
[10:05:57] rnsanchez Compudj: I can do that, sure
[10:06:04] Compudj rnsanchez: yes, and having those counters in the packet header would nicely match the spots where the situation occurs
[10:06:15] Compudj blocking always happen at sub-buffer (packet) boundary
[10:09:06] rnsanchez so it would act (briefly) as a relaxed clock? i.e., "between events En and En+1, we blocked, take that into account"
[10:09:24] rnsanchez (relaxed as in not actually probing clocks for a timestamp)
[10:10:43] Compudj yes, if we detect that the blocking counter increased between two consecutive packets, we can put a marker between timestamp end of the first packet and timestamp begin of the second packet saying that the source has been blocked during that time
[10:11:11] rnsanchez perfect
[10:12:19] Compudj you may want to paste this conversation into the bug tracker feature request, it will make it easier to remember the details when/if we get to implement it
[10:13:26] rnsanchez doing that as we speak
</pre></p>

LTTng-UST - Feature #965 (New): Implement UST statedump
https://bugs.lttng.org/issues/965
2015-10-22T20:43:37Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>Initial implementation: <a class="external" href="https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier">https://github.com/compudj/lttng-ust-dev/tree/statedump-notifier</a></p>
<p>Tests in lttng-tools are still missing for this feature before we can merge it into lttng-ust.</p>

Userspace RCU - Feature #941 (New): URCU flavor which can be used across processes using shared m...
https://bugs.lttng.org/issues/941
2015-09-26T16:23:20Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>There appears to be interest in a URCU flavor which can be used across a set of processes communicating through shared memory.</p>

Userspace RCU - Feature #940 (New): Wire up sys membarrier on each architecture
https://bugs.lttng.org/issues/940
2015-09-26T16:00:41Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)

LTTng-UST - Feature #717 (New): Better validation of tracepoint provider headers?
https://bugs.lttng.org/issues/717
2014-01-15T17:35:27Z · Daniel U. Thibault (daniel.thibault@drdc-rddc.gc.ca)
<p>I fooled around to try and break the LTTng tracepoint provider preparation process at the level of the event field names. Turns out the literals supplied to the <code>ctf_*</code> macros as arguments for <code>TP_FIELDS</code> are pretty robust. Maybe too robust.</p>
<p>If you supply a non-identifier (see ISO/IEC 9899:TC2 (9899:1999) at 6.4.2 and Annex D), such as "<code>named name</code>" or "<code>.name</code>", the tracepoint provider header compiles, links, and packages into an <code>.so</code> without a hitch; the instrumented application likewise. Tracing also works flawlessly, producing a trace on disk. But when <code>babeltrace</code> tries to read it, it complains and gives up, with messages like:</p>
<pre>
[error] at line 110: token "name": syntax error, unexpected IDENTIFIER, expecting SEMICOLON or COMMA
[...]
</pre>
<p>(for "<code>named name</code>") or</p>
<pre>
[error] at line 110: token ".": syntax error, unexpected DOT, expecting SEMICOLON or COMMA
[...]
</pre>
<p>(for "<code>.name</code>")</p>
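A cheap check the provider tooling could perform is validating each field name as a C identifier; a sketch (the function name is made up, and universal character names from 6.4.3 are ignored for brevity):

```c
#include <ctype.h>

/* Sketch: return 1 if 's' is a valid C identifier per ISO/IEC
 * 9899:1999 6.4.2 (letter or underscore, then letters, digits, or
 * underscores); 0 otherwise. */
int is_valid_c_identifier(const char *s)
{
	if (!s || !(isalpha((unsigned char)*s) || *s == '_'))
		return 0;
	for (s++; *s; s++)
		if (!isalnum((unsigned char)*s) && *s != '_')
			return 0;
	return 1;
}
```

Rejecting "<code>.name</code>" or "<code>named name</code>" at provider-header compile time would fail early instead of producing a trace babeltrace cannot read.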
<p>There isn't much that can be done to prevent this. I would recommend merely amending the tracepoint provider samples slightly, like this:</p>
<pre>
/*
* The ctf_string macro takes a C string and writes it into a field
* named "message" (any C identifier will do for the field name)
*/
ctf_string(message, text)
</pre>

LTTng-UST - Feature #710 (New): List event fields in the same order as the TP definition
https://bugs.lttng.org/issues/710
2014-01-09T15:37:19Z · David Goulet
<p>Considering this tracepoint definition taken from tests/hello/</p>
<pre>
TRACEPOINT_EVENT(ust_tests_hello, tptest,
TP_ARGS(int, anint, int, netint, long *, values,
char *, text, size_t, textlen,
double, doublearg, float, floatarg,
bool, boolarg),
TP_FIELDS(
ctf_integer(int, intfield, anint)
ctf_integer_hex(int, intfield2, anint)
ctf_integer(long, longfield, anint)
ctf_integer_network(int, netintfield, netint)
ctf_integer_network_hex(int, netintfieldhex, netint)
ctf_array(long, arrfield1, values, 3)
ctf_array_text(char, arrfield2, text, 10)
ctf_sequence(char, seqfield1, text,
size_t, textlen)
ctf_sequence_text(char, seqfield2, text,
size_t, textlen)
ctf_string(stringfield, text)
ctf_float(float, floatfield, floatarg)
ctf_float(double, doublefield, doublearg)
ctf_integer(bool, boolfield, boolarg)
ctf_integer_nowrite(int, filterfield, anint)
)
)
</pre>
<p>When listing fields with ustctl_tracepoint_field_list() and ustctl_tracepoint_field_list_get() (for instance via lttng list -u -f), the fields are sent back in reverse order, starting at the bottom of TP_FIELDS().</p>
<pre>
ust_tests_hello:tptest (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
field: filterfield (integer) [no write]
field: boolfield (integer)
field: doublefield (float)
field: floatfield (float)
field: stringfield (string)
field: seqfield2 (string)
field: seqfield1 (unknown)
field: arrfield2 (string)
field: arrfield1 (unknown)
field: netintfieldhex (integer)
field: netintfield (integer)
field: longfield (integer)
field: intfield2 (integer)
field: intfield (integer)
</pre>
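Whether or not this is the actual cause in lttng-ust, reversal like this typically comes from building a singly linked list by prepending; a generic illustration (none of these names are the lttng-ust internals):

```c
#include <stddef.h>

/* Minimal singly linked field list. Prepending is O(1) but yields the
 * reverse of insertion order on traversal; reversing once afterwards
 * restores declaration order. */
struct field {
	const char *name;
	struct field *next;
};

static struct field *prepend(struct field *head, struct field *f)
{
	f->next = head;
	return f;
}

static struct field *list_reverse(struct field *head)
{
	struct field *prev = NULL;

	while (head) {
		struct field *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}
```

Reversing once before sending the list back (or appending at insertion) would make the listing match the TP_FIELDS() order.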
<p>No idea whether the fields are stored in a list or in a hash table, but if possible, having them listed in the same order as the definition would be nice.</p>

LTTng-UST - Feature #602 (New): Add session to metadata environment
https://bugs.lttng.org/issues/602
2013-07-24T18:36:42Z · Yannick Brosseau (yannick.brosseau@polymtl.ca)
<p>To aid in identifying the trace, we should add the session name to the metadata (in the environment section).</p>

LTTng-UST - Feature #527 (New): Add git version information to project version
https://bugs.lttng.org/issues/527
2013-05-10T12:54:46Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)

LTTng-UST - Feature #520 (New): Allow override of /var/run directory
https://bugs.lttng.org/issues/520
2013-05-04T13:59:36Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>Allow overriding the /var/run directory, as well as the per-user $HOME/.lttng/ directory, at configure time and via environment variables. This should match a similar feature in lttng-tools.</p>

LTTng-UST - Feature #508 (Feedback): arrays of floats are stored and/or displayed as arrays of ints
https://bugs.lttng.org/issues/508
2013-04-19T20:33:41Z · Sébastien Barthélémy (barthelemy@crans.org)
<p>The attached patch modifies the hello.cxx test (from lttng-ust 2.1.1) to add an array-of-floats argument. As you can see in the following output (babeltrace v1.0.3 with the Python bindings patches), the field floatarrfield shows ints instead of the expected floats:</p>
<pre>
$ rm -rf ~/lttng-traces/ ; lttng create && lttng enable-event -a -u && lttng start && ./run && lttng stop && lttng destroy && babeltrace ~/lttng-traces | head -n 2
Session auto-20130419-222314 created.
Traces will be written in /home/sbarthelemy/lttng-traces/auto-20130419-222314
All UST events are enabled in channel channel0
Tracing started for session auto-20130419-222314
Hello, World!
Tracing... done.
Waiting for data availability
Tracing stopped for session auto-20130419-222314
Session auto-20130419-222314 destroyed
[22:23:14.605465399] (+?.?????????) ald-0987-de:hello:30232 ust_tests_hello:tptest: { cpu_id = 5 }, { intfield = 0, intfield2 = 0x0, longfield = 0, netintfield = 0, netintfieldhex = 0x0, arrfield1 = [ [0] = 1, [1] = 2, [2] = 3 ], arrfield2 = "test", _seqfield1_length = 4, seqfield1 = [ [0] = 116, [1] = 101, [2] = 115, [3] = 116 ], _seqfield2_length = 4, seqfield2 = "test", stringfield = "test", floatfield = 2222.1, doublefield = 2.1, floatarrfield = [ [0] = 1066192077, [1] = 1074580685, [2] = 1079194419 ] }
[22:23:14.605470207] (+0.000004808) ald-0987-de:hello:30232 ust_tests_hello:tptest: { cpu_id = 5 }, { intfield = 1, intfield2 = 0x1, longfield = 1, netintfield = 1, netintfieldhex = 0x1, arrfield1 = [ [0] = 1, [1] = 2, [2] = 3 ], arrfield2 = "test", _seqfield1_length = 4, seqfield1 = [ [0] = 116, [1] = 101, [2] = 115, [3] = 116 ], _seqfield2_length = 4, seqfield2 = "test", stringfield = "test", floatfield = 2222.1, doublefield = 2.1, floatarrfield = [ [0] = 1066192077, [1] = 1074580685, [2] = 1079194419 ] }
</pre>
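The integers shown for floatarrfield look like raw IEEE-754 single-precision bit patterns. A C equivalent of the Python workaround below (the helper name is made up):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as the float sharing its representation
 * (memcpy avoids strict-aliasing undefined behavior). */
static float bits_to_float(uint32_t bits)
{
	float f;

	memcpy(&f, &bits, sizeof f);
	return f;
}
/* e.g. bits_to_float(1066192077) is approximately 1.1f */
```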
<p>as a workaround, the floating point value can be retrieved in python using:<br /><pre>
In [1]: import struct
In [2]: struct.unpack('f', struct.pack('i', 1066192077))
Out[2]: (1.100000023841858,)
</pre></p>

LTTng-UST - Feature #483 (New): Use "man 3 backtrace" to dump the stack state at record start (at...
https://bugs.lttng.org/issues/483
2013-03-26T11:32:44Z · Paul Woegerer (paul_woegerer@mentor.com)
<p>When an already running application gets traced with liblttng-ust-cyg-profile function entry/exit instrumentation, we should provide a way to reconstruct the stack state at connection time. This can be achieved by using the backtrace feature of glibc.</p>
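A minimal sketch of what such a dump could do with glibc's backtrace(3) at attach time (the function name is illustrative, and a real implementation would emit trace events rather than print):

```c
#include <execinfo.h>	/* glibc-specific backtrace API */
#include <stdio.h>
#include <stdlib.h>

/* Capture and print the current call stack, one line per frame.
 * Emitting these frames as (possibly symbolic) events at attach time
 * would let a viewer pair later func_exit events with a known stack. */
static void dump_stack_state(void)
{
	void *frames[64];
	int n = backtrace(frames, 64);
	char **syms = backtrace_symbols(frames, n);

	if (!syms)
		return;
	for (int i = 0; i < n; i++)
		printf("frame %d: %s\n", i, syms[i]);
	free(syms);
}
```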
<p>The following conversation on IRC motivated this feature request:</p>
<pre>
[09:51] <pwoegere> Compudj: Regarding <a class="external" href="http://git.lttng.org/?p=lttng-ust.git;a=blob;f=liblttng-ust-cyg-profile/lttng-ust-cyg-profile.c;h=d772e76b961a148d19bf04d56ae9481b697d99b5;hb=70d654f22a6b52beddfb86ec3daa453073c356d2#l39">http://git.lttng.org/?p=lttng-ust.git;a=blob;f=liblttng-ust-cyg-profile/lttng-ust-cyg-profile.c;h=d772e76b961a148d19bf04d56ae9481b697d99b5;hb=70d654f22a6b52beddfb86ec3daa453073c356d2#l39</a>
[09:52] <pwoegere> Compudj: There is a disadvantage not to pass the return address on lttng_ust_cyg_profile:func_exit
[09:52] <pwoegere> Compudj: Think about the use case where you start recording in the middle of the application ...
[09:53] <pwoegere> All the lttng_ust_cyg_profile:func_exit events where there is no corresponding func_entry (because it was emitted before the attach happened) are basically worthless.
[09:56] <pwoegere> Compudj: If you also pass the call_site to func_exit you will have useful func_exit events even when you don't have the corresponding func_entry
[11:40] <Compudj> pwoegere: yes, it's a question of trade-off
[11:41] <Compudj> pwoegere: is it worth it to almost double the size of the traces (and thus double the throughput needed) in order to handle the few func_exit events that would happen to be there at trace start without matching func_entry ?
[11:41] <Compudj> pwoegere: in my opinion, the saving in trace bandwidth is far more important
[12:20] <pwoegere> Compudj: We could use something like "man 3 backtrace" to dump the stack state at record start (attach) time. This would allow reconstructing the missed stack state.
[12:24] <Compudj> pwoegere: it sounds like an excellent idea!
[12:24] <Compudj> pwoegere: could you open a feature request on bugs.lttng.org along with this reference ?
</pre>

LTTng-UST - Feature #447 (New): Support dlopen/dlclose of probe providers
https://bugs.lttng.org/issues/447
2013-02-15T13:01:48Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>Since lttng-ust 2.1, the glibc deadlocks documented in the lttng-ust(3) manpage have been worked around with a "TLS fixup" within-constructor trick. However, it is still not safe to call dlclose() on a provider shared object that is actively used for tracing, due to the lack of reference counting from lttng-ust to the shared objects it uses. This leads to one of:</p>
<p>- segmentation fault while tracing, since events could use a serialization function located within an unloaded shared object.<br />- segmentation fault while destroying trace session or exiting process, trying to access the event description located within unloaded shared object.<br />- segmentation fault while dumping metadata for a trace session, since it would be trying to access event description located within unloaded .so.</p>
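One way to keep a provider object alive is to look up its path with dladdr() and take an extra dlopen() reference on it; a hedged sketch (the helper name is made up, and error handling is minimal):

```c
#define _GNU_SOURCE	/* for dladdr() */
#include <dlfcn.h>
#include <stddef.h>

/* Take an extra reference on the shared object containing 'addr'
 * (e.g. a probe provider's event description), so a dlclose() elsewhere
 * cannot unmap it while a tracing session still uses it. Pair with
 * dlclose() on the returned handle at session destruction. */
static void *pin_object_containing(void *addr)
{
	Dl_info info;

	if (!dladdr(addr, &info) || !info.dli_fname)
		return NULL;
	/* RTLD_NOLOAD: only bump the refcount of an already-loaded object. */
	return dlopen(info.dli_fname, RTLD_NOW | RTLD_NOLOAD);
}
```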
<p>Since we don't want to mess with metadata description nor with synchronization of tracing while unloading a module, the best solution I can think of (and the one most similar to the approach taken with lttng-modules for kernel tracing) is to hold a reference count on the provider .so while it is used by a tracing session. One way to do this would be to use dladdr() to look up the library path, and use a pair of dlopen()/dlclose() calls at tracing session creation/destruction within lttng-ust to take an extra reference count on each shared object used.</p>

LTTng-UST - Feature #446 (New): Improve process startup time with many events
https://bugs.lttng.org/issues/446
2013-02-15T12:54:19Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>J9 VM instrumentation has 16k individual events. The latest UST changes improve the process startup time from about 180s down to 2-4s (depending on whether tracing is active). However, this is still far from the 200ms process startup time normally expected for the J9 VM.</p>
<p>There are a couple of ways to improve things:</p>
<p>a) implement a pre-computed hash table and/or radix tree within the probe provider. Given that the tracepoint names within a provider are known statically, we could construct a data structure to access them efficiently after object compilation, generate C code to construct those structures, and generate an output object, which would be linked with the provider object to create the shared library. The hash table would fit use-cases where events are enabled by name, and a radix tree (or something similar) would be better suited for wildcards.</p>
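Option (a) could start as simply as a build-time-generated sorted name table with binary search; the names and layout here are invented for illustration (a real generator would emit this from the provider's TRACEPOINT_EVENT list):

```c
#include <stdlib.h>
#include <string.h>

/* Generated at provider build time, sorted by strcmp(). */
static const char * const event_names[] = {
	"my_provider:alpha",
	"my_provider:beta",
	"my_provider:gamma",
};

static int cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char * const *)elem);
}

/* O(log n) lookup by exact name; returns the index or -1. A hash
 * table or radix tree would additionally support wildcard enabling. */
static int event_index(const char *name)
{
	const char * const *hit = bsearch(name, event_names,
			sizeof(event_names) / sizeof(event_names[0]),
			sizeof(event_names[0]), cmp_name);

	return hit ? (int)(hit - event_names) : -1;
}
```

With 16k events, replacing a linear registration scan with per-name lookups in such a structure is where the startup-time savings would come from.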
<p>b) a simpler solution for the disabled-tracing case might be to delay addition of tracepoints to the hash table until the moment this data structure is actually needed (lazy initialization).</p>

LTTng-UST - Feature #327 (On pause): Implement missing hostname context
https://bugs.lttng.org/issues/327
2012-08-26T23:22:47Z · Mathieu Desnoyers (mathieu.desnoyers@efficios.com)
<p>To match features of lttng-modules.</p>