Bug #1018
Crash while trying to print debug info
Status: Closed
Description
The attached trace, recorded with LTTng-UST, crashes babeltrace (b1d10d85a8fce649045532003b2ddae8d98e47a1, today's master).
I think the bug is triggered by a very particular sequence of events combined with CPU migration.
First, we have this event:
[18:02:40.873741190] (+0.000004995) simark lttng_ust_statedump:end: { cpu_id = 1 }, { vpid = 29015, ip = 0x7FC6FDF98867, debug_info = { bin = "liblttng-ust.so.0.0.0+0x35867", func = "trace_end_cb+0x86" } }, { }
This event is not handled in a special way in debug_info_handle_event, so babeltrace calls register_event_debug_infos on it. The debug_info_src field of the _ip field's definition gets assigned a valid pointer to a struct debug_info_source. This happens on CPU 1, which is important here because there is a single struct definition_integer for the _ip field for the whole CPU 1 stream file.
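To make the ownership clear, here is a rough sketch of the two structures involved (simplified; the field layout is illustrative, not the exact babeltrace definitions):

struct debug_info_source {
	/* Resolved source location for one address, e.g.
	 * bin = "liblttng-ust.so.0.0.0+0x35867",
	 * func = "trace_end_cb+0x86". Owned by the debug-info cache. */
	char *bin_path;
	char *func;
	/* ... */
};

struct definition_integer {
	uint64_t value;	/* last _ip value read from this stream */
	/* Borrowed pointer into the debug-info cache, overwritten by
	 * register_event_debug_infos. There is one definition_integer
	 * per stream file, i.e. per CPU, so this pointer is shared by
	 * every event of that CPU. */
	struct debug_info_source *debug_info_src;
	/* ... */
};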
Then, the program execs, so a new statedump is started. When that happens, babeltrace discards all the cached debug_info_source objects, because the mappings are no longer valid. However, the debug_info_src field of the _ip definition for CPU 1 is still set, so it becomes a dangling pointer.
The very next event on CPU 1 is a dlopen one. Because it is handled in a special way in debug_info_handle_event, register_event_debug_infos is not called on it, so debug_info_src is not updated: it still points to the now-freed debug_info_source of the first event. When babeltrace tries to print it, it crashes.
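Summarized as pseudocode, the whole sequence looks like this (debug_info_handle_event and register_event_debug_infos are the real function names; the other helpers are made up for the sketch):

static void debug_info_handle_event(struct debug_info *di,
		struct ctf_event_definition *event)
{
	if (is_statedump_start(event)) {
		/* exec detected: every cached debug_info_source is
		 * freed, leaving CPU 1's _ip definition with a
		 * dangling debug_info_src. */
		purge_debug_info_cache(di);
	} else if (is_special_event(event)) {	/* dlopen, statedump, ... */
		/* Updates the address mappings but never calls
		 * register_event_debug_infos, so the dangling pointer
		 * is left in place. */
		handle_special_event(di, event);
	} else {
		/* Ordinary events overwrite debug_info_src with a
		 * fresh, valid pointer, which is why the bug needs
		 * this exact event ordering to show up. */
		register_event_debug_infos(di, event);
	}
}
/* Printing the dlopen event later dereferences the stale
 * debug_info_src: use-after-free, hence the crash. */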
One solution I see is to call register_event_debug_infos on all events (see the attached patch). dlopen and the other "special" events all have a source location we could resolve, so I don't see why we wouldn't call it for them too. Otherwise, we would at least need to clear debug_info_src when handling a special event.
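In terms of the sketch above, the two options look like this (same made-up helper names; the attached patch corresponds to the first):

/* Option 1: resolve debug info for special events too. */
if (is_special_event(event))
	handle_special_event(di, event);
register_event_debug_infos(di, event);	/* now unconditional */

/* Option 2: minimal alternative, clear the stale pointer instead. */
if (is_special_event(event)) {
	ip_def->debug_info_src = NULL;	/* ip_def: the _ip definition */
	handle_special_event(di, event);
}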
Updated by Simon Marchi almost 8 years ago
Here's a way to generate a trace that reproduces the problem with higher probability:
$ lttng create && lttng enable-event -u -a && lttng add-context -t vpid -t ip -u && lttng start
$ LD_PRELOAD=liblttng-ust.so:liblttng-ust-dl.so /usr/bin/env LOL=1 /usr/bin/env LOL=1 /usr/bin/env LOL=1 [... "/usr/bin/env LOL=1" repeated a few dozen times in total ...] ls
$ lttng stop
$ lttng view
With that many statedumps for the same PID, there is a good chance the problematic situation will occur. I took a few traces like this, and they crashed babeltrace every time.
Updated by Jonathan Rajotte Julien about 4 years ago
- Status changed from New to Invalid
The state of babeltrace has changed a lot since this was reported.
Closing this ticket as invalid. Reopen it if it still applies to Babeltrace 2.