Bug #1033
lttng load does not preserve event or channel ordering
Description
lttng load builds the channel and event lists as if they were entered using enable-channel and enable-event commands, in the order that they appear in the .lttng file. Internally, these lists are built in stack-like fashion, each addition pushing the already-entered channels or events down and inserting the new channel or event at the head of the list. lttng save, on the other hand, writes the channels and events to the .lttng file in the same order that they appear in the internal lists, starting at each list's head. This means an lttng load / lttng save cycle will completely invert the channel and event ordering within the .lttng file.
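To make the inversion concrete, here is a minimal sketch (hypothetical names, not the actual lttng-tools code) of a head-inserting list being written back out from its head:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct channel {
        char name[32];
        struct channel *next;
    };

    /* Insert at the head of the list, as lttng load effectively does. */
    static struct channel *channel_push(struct channel *head, const char *name)
    {
        struct channel *chan = calloc(1, sizeof(*chan));

        if (!chan)
            abort();
        strncpy(chan->name, name, sizeof(chan->name) - 1);
        chan->next = head;
        return chan;
    }

    int main(void)
    {
        const char *file_order[] = { "chan0", "chan1", "chan2" };
        struct channel *head = NULL;
        size_t i;

        /* "Load": push the entries in .lttng file order. */
        for (i = 0; i < 3; i++)
            head = channel_push(head, file_order[i]);

        /* "Save": walking from the head prints chan2, chan1, chan0,
         * i.e. the file order inverted. */
        for (struct channel *c = head; c; c = c->next)
            printf("%s\n", c->name);
        return 0;
    }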
This normally matters little, but it should be noted that an event list (for instance) can be optimized for lookup by ensuring the most-frequently requested events appear at the head of the list. A user who fine-tuned his .lttng files in this way would understandably be upset that lttng load inverts the order.
This can be fixed by having the lttng load routine read each XML enumeration into a buffer and issue the LTTng list-addition commands in last-to-first order. Since the XML enumerations in the .lttng files are nested, recursion is required, and the desired corrected behaviour can be achieved by having the list-addition commands appear in the popping phase of the recursion, as in the sketch below.
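A minimal sketch of this approach, assuming libxml2-style sibling links (node->next); add_channel_from_xml is a hypothetical stand-in for the existing list-addition path:

    #include <libxml/tree.h>

    /* Hypothetical stand-in for the existing per-channel load routine. */
    int add_channel_from_xml(xmlNodePtr node);

    /*
     * Recurse to the end of the sibling chain first, then issue each
     * list-addition while the recursion unwinds. Because the internal
     * list inserts at its head, adding the last channel first leaves
     * the list in the original .lttng file order.
     */
    static int load_channels_in_order(xmlNodePtr node)
    {
        int ret;

        if (!node)
            return 0;

        /* Pushing phase: descend to the last sibling. */
        ret = load_channels_in_order(node->next);
        if (ret)
            return ret;

        /* Popping phase: additions happen last-to-first. */
        return add_channel_from_xml(node);
    }

The same pattern applies to the nested event enumerations inside each channel.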
Note that domain ordering is preserved by lttng load.
Updated by Jérémie Galarneau about 8 years ago
I'm not sure what you mean by "it should be noted that an event list (for instance) can be optimized against lookup by ensuring the most-frequently requested events appear at the head of the list."
Do you mean improving the tracers' performance by ordering the events in a particular way?
Updated by Daniel U. Thibault about 8 years ago
Yes. Suppose 90% of my events are of a certain type and are assigned to a certain channel while the remaining 10% are spread across a dozen or so other channels. If the majority event's channel is the first in the channel list, every lookup for that channel will terminate faster than if it is at the end of the list. That's just a few instruction cycles, but every little bit helps sometimes.
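To illustrate, the lookup I have in mind is essentially a linear scan (a hypothetical sketch using the same throwaway struct channel as above, not the actual LTTng code):

    #include <string.h>

    struct channel {
        char name[32];
        struct channel *next;
    };

    /* Linear scan: a channel near the head of the list is found after
     * fewer string comparisons than one near the tail. */
    static struct channel *channel_find(struct channel *head, const char *name)
    {
        struct channel *c;

        for (c = head; c; c = c->next) {
            if (strcmp(c->name, name) == 0)
                return c;
        }
        return NULL;
    }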
If the channel were looked up every time the event occurs, it would mean large savings, but I don't think that's how LTTng works (I suspect the session daemon looks up the event type once at session start and sets up the tracepoint providers so they have the channel pointer in persistent memory). It may be more important in user-space, where events can register and unregister repeatedly during a session.
The annoyance of having the channel list invert with every session load/save cycle would also cause problems during trace analysis, as the babeltrace API recovers channel IDs and not channel names: a user creating a collection of several sessions, some of which are using an inverted channel list with respect to the other sessions, would be puzzled by the changing channel IDs (TRACE_PACKET_HEADER stream_id) of his events.
Updated by Jérémie Galarneau about 8 years ago
Daniel U. Thibault wrote:
> Yes. Suppose 90% of my events are of a certain type and are assigned to a certain channel while the remaining 10% are spread across a dozen or so other channels. If the majority event's channel is the first in the channel list, every lookup for that channel will terminate faster than if it is at the end of the list. That's just a few instruction cycles, but every little bit helps sometimes.
I will need to see the difference in a benchmark to justify changing this on the grounds of performance concerns. I'm not convinced that it makes a noticeable difference in practice, but I'm open to being proven wrong.
> If the channel were looked up every time the event occurs, it would mean large savings, but I don't think that's how LTTng works (I suspect the session daemon looks up the event type once at session start and sets up the tracepoint providers so they have the channel pointer in persistent memory). It may be more important in user-space, where events can register and unregister repeatedly during a session.
There are no such look-ups happening at run-time, as you pointed out.
> The annoyance of having the channel list invert with every session load/save cycle would also cause problems during trace analysis, as the babeltrace API recovers channel IDs and not channel names: a user creating a collection of several sessions, some of which are using an inverted channel list with respect to the other sessions, would be puzzled by the changing channel IDs (TRACE_PACKET_HEADER stream_id) of his events.
There should indeed be a way to recover a human-readable channel name (or stream name, in CTF parlance). However, the tracer offers no guarantee that channel IDs will be preserved from one session to the next. The fact that this id is generated in such a "predictable" way is an implementation detail that is not guaranteed to remain unchanged in the future.
Updated by Daniel U. Thibault about 8 years ago
Jérémie Galarneau wrote:
> Daniel U. Thibault wrote:
> > The annoyance of having the channel list invert with every session load/save cycle would also cause problems during trace analysis, as the babeltrace API recovers channel IDs and not channel names: a user creating a collection of several sessions, some of which are using an inverted channel list with respect to the other sessions, would be puzzled by the changing channel IDs (TRACE_PACKET_HEADER stream_id) of his events.
> There should indeed be a way to recover a human-readable channel name (or stream name, in CTF parlance). However, the tracer offers no guarantee that channel IDs will be preserved from one session to the next. The fact that this id is generated in such a "predictable" way is an implementation detail that is not guaranteed to remain unchanged in the future.
Until the babeltrace API offers a way of recovering the channel names, users have no choice but to fall back on channel IDs. We're just lucky that (for now) the IDs are generated predictably. It's true one should not rely on this, but right now there is no other way.