Feature #1226

Allow making graph execution more "event-based"

Added by Simon Marchi 10 months ago. Updated 9 months ago.


Currently, when using the lttng-live source component class, the source component returns TRY_AGAIN if it has no messages to return but the connection is still open. The graph returns TRY_AGAIN to the user, who can then run the graph again to retry. At that point the source component might have some data available, or it might return TRY_AGAIN again. This effectively constitutes a busy loop, consuming CPU just to check whether new data is available.
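To make the cost concrete, here is a minimal sketch of that retry pattern. It does not use the real Babeltrace API: `FakeGraph`, `Status` and `busy_run` are hypothetical stand-ins, and the scripted list of statuses simply plays the role of a live source that is idle most of the time.

```python
import enum


class Status(enum.Enum):
    OK = "OK"
    TRY_AGAIN = "TRY_AGAIN"
    END = "END"


class FakeGraph:
    """Hypothetical stand-in for a graph whose live source is often idle."""

    def __init__(self, script):
        # `script` is the sequence of statuses successive runs will return.
        self._script = iter(script)

    def run_once(self):
        # Once the script is exhausted, report that the graph has ended.
        return next(self._script, Status.END)


def busy_run(graph):
    """Retry immediately on TRY_AGAIN.

    Returns (consumed, wasted): the number of runs that did useful work and
    the number of runs that only discovered there was nothing to do.
    """
    consumed = wasted = 0
    while True:
        status = graph.run_once()
        if status is Status.END:
            return consumed, wasted
        if status is Status.TRY_AGAIN:
            wasted += 1  # nothing available, yet we spin straight back
        else:
            consumed += 1
```

With a script of three TRY_AGAIN, one OK, two TRY_AGAIN and one OK, `busy_run` reports 2 useful runs and 5 wasted ones. In a real process each wasted run is CPU time spent polling.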

It would be better to have the process block on a syscall if there's nothing to consume, and be woken up when some data arrives. Here's how I would see it:

1. A source that may return TRY_AGAIN would register a file descriptor to be used as an asynchronous event source, along with a callback.
2. When a sink's consume method returns TRY_AGAIN, that sink is placed in the "inactive sinks" list.
3. If there are no active sinks, bt_graph_run (or some new function) would poll all event sources, waiting for one to become readable.
4. It would call the callback associated with each event source that became readable.
5. The source component would send some kind of signal downstream that would trickle down to the sinks that previously became inactive. Those sinks would inform the graph, which would put them back in the "active" list.
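The five steps above can be sketched as a small toy model. Everything here is an assumption made for illustration: `Graph`, `LiveSource`, `Sink`, `register_event_source` and `reactivate` are hypothetical names, not existing Babeltrace functions, and Python's `selectors` module stands in for whatever portable wait primitive a real implementation would use.

```python
import selectors
import socket

TRY_AGAIN, OK = "TRY_AGAIN", "OK"


class Graph:
    """Hypothetical graph keeping active and inactive sink lists."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.active = []
        self.inactive = []

    def register_event_source(self, fileobj, callback):
        # Step 1: a source registers something waitable plus a callback.
        self.selector.register(fileobj, selectors.EVENT_READ, callback)

    def reactivate(self, sink):
        # Step 5: a previously inactive sink goes back to the active list.
        if sink in self.inactive:
            self.inactive.remove(sink)
            self.active.append(sink)

    def run_once(self, timeout=None):
        if not self.active:
            # Step 3: nothing to do, so block until an event source is readable.
            for key, _events in self.selector.select(timeout):
                key.data()  # Step 4: call the registered callback.
        for sink in list(self.active):
            if sink.consume() == TRY_AGAIN:
                # Step 2: park the sink until its source has data again.
                self.active.remove(sink)
                self.inactive.append(sink)


class LiveSource:
    """Hypothetical source fed by a socket."""

    def __init__(self, graph, sock):
        self.graph, self.sock = graph, sock
        self.pending, self.sinks = [], []
        graph.register_event_source(sock, self.on_readable)

    def on_readable(self):
        self.pending.append(self.sock.recv(4096))
        # Stand-in for the "signal trickling downstream" of step 5.
        for sink in self.sinks:
            self.graph.reactivate(sink)

    def next_message(self):
        return self.pending.pop(0) if self.pending else None


class Sink:
    def __init__(self, graph, source):
        self.source, self.received = source, []
        source.sinks.append(self)
        graph.active.append(self)

    def consume(self):
        message = self.source.next_message()
        if message is None:
            return TRY_AGAIN
        self.received.append(message)
        return OK


# Usage: the second run blocks in select() instead of spinning.
reader, writer = socket.socketpair()
graph = Graph()
sink = Sink(graph, LiveSource(graph, reader))
graph.run_once()            # no data yet: the sink becomes inactive
writer.sendall(b"event")
graph.run_once(timeout=1)   # wakes on the socket, then the sink consumes
reader.close()
writer.close()
```

This sketch wires a single source directly to its sinks and ignores the starvation and API-compatibility questions. It only shows where the blocking wait would sit relative to the two sink lists.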

Of course, many problems come to mind, such as how to deal with portability, how to handle multiple sinks where one keeps producing while another becomes inactive and then active again (how do you make sure not to starve the one that became inactive), how to evolve the API to include this without breaking any existing behavior, etc.


Updated by Jonathan Rajotte Julien 10 months ago

  • Author changed from 215 to 8

Updated by Jonathan Rajotte Julien 10 months ago

Migrated from internal bug tracker.


Updated by Francis Deslauriers 9 months ago

  • Tracker changed from Bug to Feature

Updated by Mathieu Desnoyers 9 months ago

This would depend on implementing an extension to the lttng-live protocol. It was initially planned at the design phase but never implemented: add a second "notification" socket which would allow the relayd to notify viewers that new information is available.

With the current lttng-live protocol, the client needs to actively poll, so there is no point in adding underlying support for event-driven wakeup as long as the lttng-live source cannot do it due to protocol limitations.


Updated by Jérémie Galarneau 9 months ago

Just chiming in to say that I like Simon's overall proposal, but it can't be a literal fd and poll() as babeltrace is a cross-platform project. Nonetheless, all supported platforms offer a somewhat similar mechanism (e.g. WSAWaitForMultipleEvents).

As for the second notification socket, it is not necessary if we do change the lttng-live protocol. A similar strategy to the lttng_notification_channel protocol can be adopted. That protocol allows for both commands and "spontaneous" messages (notifications) to be multiplexed in a single socket's stream.
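As a rough illustration of multiplexing replies and notifications on one stream, here is a sketch with a made-up framing. The 1-byte kind plus 4-byte length header, the `REPLY`/`NOTIFICATION` constants and the `encode`/`demux` helpers are all hypothetical; this is not the actual lttng_notification_channel or lttng-live wire format.

```python
import struct

# Hypothetical frame header: 1-byte message kind + 4-byte big-endian length.
HEADER = struct.Struct("!BI")
REPLY, NOTIFICATION = 0, 1


def encode(kind, payload):
    """Prefix `payload` with the hypothetical header."""
    return HEADER.pack(kind, len(payload)) + payload


def demux(buffer):
    """Split every complete frame off the front of `buffer`.

    Returns (replies, notifications, remaining), where `remaining` holds the
    bytes of a trailing incomplete frame, to be retried once more data arrives.
    """
    replies, notifications = [], []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        payload = buffer[offset + HEADER.size:end]
        if kind == NOTIFICATION:
            notifications.append(payload)
        else:
            replies.append(payload)
        offset = end
    return replies, notifications, buffer[offset:]
```

A viewer using such a scheme could wait on the single socket for readability, run the received bytes through `demux`, route replies to whichever command is pending, and treat notifications as the "new data is available" wakeup.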
