Project

General

Profile

Bug #612

Duplicate events for tracepoints emitted from dlopened shared objects

Added by Paul Woegerer about 7 years ago. Updated about 7 years ago.

Status:
Won't fix
Priority:
High
Target version:
Start date:
08/06/2013
Due date:
% Done:

0%

Estimated time:

Description

Since LTTng 2.2 the behavior for lttng enable-event/enable-channel changed in a subtle but annoying way for tracepoints emitted from dlopened shared objects.

The attached sample executable caused exactly one event to get recorded up until LTTng-2.1 when the following lttng command sequence is used:

lttng create
lttng enable-channel ch1 -u
lttng enable-event 'my_events:*' -u -c ch1
lttng enable-channel ch2 -u
lttng enable-event -u -a -c ch2
lttng start
./bin/tpproxytest
lttng stop
lttng destroy

for LTTng <=2.1 babeltrace give us just one event (which is what users would expect):
[17:01:20.816055508] (+?.?????????) atv-pwoegere-l3.atv.mentorg.com:tpproxytest:23702 my_events:event1: { cpu_id = 6 }, { }

for LTTng >=2.2 babeltrace shows that the event got recoded twice (once in ch1 and once again in ch2 ?):
[16:57:08.342747191] (+?.
???????) atv-pwoegere-l3.atv.mentorg.com:tpproxytest:5458 my_events:event1: { cpu_id = 5 }, { }
[16:57:08.342753041] (+0.000005850) atv-pwoegere-l3.atv.mentorg.com:tpproxytest:5458 my_events:event1: { cpu_id = 5 }, { }

As already stated, this effect only occurs when the tracepoint gets emitted from a dlopened shared lib. For all other emitted tracepoints the behavior is as it was until LTTng 2.1.

Not sure what could have caused this strange regression but the following commits might have to do with it:
99234da69ab197050d3d28ea428c54e08539667c Fix: tracepoint.h incorrect assumption about constructor order
46c881a7d6839761e3371a93ba29e7a0202bcd03 Fix: tracepoint instrumentation constructor order issue
558b9d86247004f8e9bbaf8c982f3b2b182093d1 Fix: ensure all probe providers have their symbols


Files

tpproxy.tar.gz (20 KB) tpproxy.tar.gz sample program that emits tracepoint from dlopened shared lib Paul Woegerer, 08/06/2013 11:30 AM
#1

Updated by Paul Woegerer about 7 years ago

I assume this is the critical part where things go wrong:

libust[11311/11311]: Provider "my_events" accepted, version 1.0 is compatible with LTTng UST provider version 1.0. (in check_provider_version() at lttng-probes.c:174)
libust[11311/11311]: adding probe my_events containing 1 events to lazy registration list (in lttng_probe_register() at lttng-probes.c:216)
libust[11311/11311]: just registered probe my_events containing 1 events (in lttng_lazy_probe_register() at lttng-probes.c:115)
libust[11311/11311]: Sent register event notification for name "my_events:event1": ret_code 0, event_id 0
 (in ustcomm_register_event() at lttng-ust-comm.c:991)
libust[11311/11311]: Sent register event notification for name "my_events:event1": ret_code 0, event_id 0
 (in ustcomm_register_event() at lttng-ust-comm.c:991)
liblttng_ust_tracepoint[11311/11311]: Registering probe to tracepoint my_events:event1 (in __tracepoint_probe_register() at tracepoint.c:541)
liblttng_ust_tracepoint[11311/11311]: Registering probe to tracepoint my_events:event1 (in __tracepoint_probe_register() at tracepoint.c:541)
liblttng_ust_tracepoint[11311/11311]: just registered a tracepoints section from 0x7f744d23c060 and having 1 tracepoints (in tracepoint_register_lib() at tracepoint.c:759)

#2

Updated by Mathieu Desnoyers about 7 years ago

  • Status changed from New to Won't fix
09:37 < Compudj> pwoegere: you enable events in two channels
09:37 < Compudj> this would enabled events both in .so and in the executable twice, right ?
09:38 < Compudj> "As already stated, this effect only occurs when the tracepoint gets emitted from a dlopened shared lib. For all other emitted tracepoints the behavior is as it was until LTTng 2.1." -> ah, it seems not
09:39 < pwoegere> Compudj: My understanding was that if an event was enabled for channel A, it should not go also in channel B (even if B collects all remaining userspace events)
09:40 < Compudj> pwoegere: I think early versions were doing this, but it was more an unfortunate design issue than done on purpose
09:40 < pwoegere> Compudj: otherwhise how can you reasonably control which events go where ?
09:40 < Compudj> you don't enable "all" events
09:41 < Compudj> or eventually, we'd introduce "all except A B and C" 
09:41 < pwoegere> Compudj: lttng enable-event -u -a -c ch2
09:41 < Compudj> I mean: if you enable all events, you will have duplicates
09:41 < Compudj> (if an event ends up matching for 2 channels)
09:41 < Compudj> e.g.:
09:41 < Compudj> lttng enable-event -u -a -c ch1
09:41 < Compudj> lttng enable-event -u -a -c ch2
09:41 < Compudj> you will have all events in each channel
09:42 < pwoegere> Compudj: Hmm
09:42 < pwoegere> Compudj: I would not have expected that
09:42 < pwoegere> Compudj: what sense could it make to get the same event persisted in two channels ?
09:43 < Compudj> each channel can have its own config
09:43 < Compudj> e.g. different buffer size, timers, behavior in general
09:43 < Compudj> so we might want to put important info into one
09:44 < Compudj> eventually, also, we might want to put an event with a specific loglevel or filter associated into a channel
09:44 < Compudj> so, an example of valid use-case:
09:44 < Compudj> enable all events of loglevel debug with filter on TID 123 into channel A
09:44 < Compudj> and enable some events of loglevel critical into channel B
09:45 < Compudj> so the "all events" set is made more specific by the loglevel and filter info
09:45 < Compudj> so someone could very well want to enable a "duplicate" of an event with different loglevel and filter info
09:46 < Compudj> (into another channel, _or_ into the same channel)
09:46 < pwoegere> Compudj: Hmmm, I though of it more like the unix pipe. data flows from one place to another. If you want duplicate use explicit tee
09:47 < Compudj> events can be written into all the sessions dans are interested in them
09:47 < Compudj> and can also be written in all channels that are interested in them
09:47 < Compudj> however, let's say that you enable an event twice for the same channel
09:47 < Compudj> with a wildcard and the specific event name
09:47 < Compudj> it's only going to be saved once
09:47 < Compudj> no duplicate
09:47 < Compudj> but if you do so for multiple channels, how should we know which one to keep ?
09:48 < pwoegere> Compudj: by using the mental model of a pipe
09:49 < pwoegere> Compudj: e.g.
09:49 < pwoegere> Compudj: lttng enable-event 'my_events:*' -u -c ch1 --dontConsume
09:50 < Compudj> what is --dontConsume ?
09:50 < pwoegere> Compudj: I made it up for demonstration purpose
09:51 < Compudj> it puts a ring buffer in overwrite mode, without consumer attached ?
09:51 < pwoegere> lttng enable-event 'my_events:*' -u -c ch1
09:52 < pwoegere> with --dontConsume would mean capute that kind of event in ch1 but do not take it from the pipe so that a later ...
09:52 < pwoegere> lttng enable-event -u -a -c ch2
09:52 < pwoegere> would also hava a chance to capture it
09:53 < pwoegere> Compudj: but I guess thats already too late, since the semantic is already defined differently
09:54 < Compudj> indeed
09:55 < Compudj> if you want close control on which event goes where, you might want to rethink you use of the wildcards (*)
09:55 < Compudj> so an event won't match for two channels
09:56 < Compudj> by the way, it also behaves like this if the event is in the program main executable
09:57 < Compudj> so the behavior seems coherent
09:59 < pwoegere> Compudj: Ok, for now I will adjust my script to not make use of multiple userspace channels anymore. this has to wait until we have event exclusion. Thanks !
10:00 < Compudj> pwoegere: sorry about that.. this slight change of behavior has been required by the addition of loglevel and filter features

Also available in: Atom PDF