Create an lttng-collectd skeleton for VM tracking feature
Geneviève Bastien wants to contribute a VM-tracking statedump notifier (based on libudev). However, her current pull request spawns a "statedump" thread in lttng-sessiond. Since that thread has to use tracepoints, it is proving hard to integrate the feature inside the session daemon for various reasons:
- we can't link on liblttng-ust directly since the registration performed at the construction will fail
- we need a clear way to know when the sessiond is ready to trace itself before we allow the creation of sessions
- it is not clear that it is the sessiond's role to interact with external programs to collect system information anyway
- lttng-daemonize() closes all file descriptors, lttng-ust's included...
The solution we chose consists in introducing an external `lttng-collectd` process that can be linked to lttng-ust directly, thus saving us the trouble of `dlopen()`-ing the lttng-ust provider after the
Right now, we have this hierarchy of processes:

    $ lttng-sessiond
    bash
    └─── lttng-sessiond

    $ lttng-sessiond -d/-b
    bash
    └─── lttng-sessiond, forks and waits for SIGUSR1 from child to exit(), more of a launcher
         └─── lttng-sessiond (the "real" daemonized lttng-sessiond process)

The SIGUSR1 signal is sent from the "real" process to the launcher when all threads have been launched. We consider that the threads have been launched when they have all called `sessiond_notify_ready()`. The last thread to call `sessiond_notify_ready()` will be the one that actually sends that signal.
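The "last caller sends the signal" behaviour described above can be sketched with an atomic countdown. This is only an illustration, not the actual lttng-tools code: `ready_init()`, `notify_ready_sketch()`, `threads_pending` and `launcher_pid` are hypothetical names, and C11 atomics are assumed here for the sake of a self-contained example.

```c
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical state: number of threads that have not reported ready yet. */
static atomic_int threads_pending;
/* Hypothetical: PID of the launcher process, or -1 when not daemonized. */
static pid_t launcher_pid = -1;

/* Must be called once, before any thread is started. */
void ready_init(int nr_threads, pid_t launcher)
{
	atomic_store(&threads_pending, nr_threads);
	launcher_pid = launcher;
}

/*
 * Called by each thread once it is operational. Returns true only for
 * the caller that brought the counter down to zero; that caller is the
 * one that signals the launcher.
 */
bool notify_ready_sketch(void)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&threads_pending, 1) == 1) {
		if (launcher_pid > 0) {
			kill(launcher_pid, SIGUSR1);
		}
		return true;
	}
	return false;
}
```

Because the decrement is atomic, exactly one thread observes the transition to zero, whatever the order in which the threads finish their initialization.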
We want the `lttng-collectd` to live as a child process under the session daemon.

    $ lttng-sessiond
    bash
    └─── lttng-sessiond
         └─── lttng-collectd

    $ lttng-sessiond -d/-b
    bash
    └─── lttng-sessiond, forks and waits for SIGUSR1 from child to exit(), more of a launcher
         └─── lttng-sessiond (the "real" daemonized lttng-sessiond process)
              └─── lttng-collectd

To make this feature reliable, we need to ensure the sessiond does not allow the creation of tracing sessions before the `lttng-collectd` has been launched and is ready to react to statedump commands. For now, this means that libudev initialization has been completed and that statedump commands initiated by liblttng-ust will result in a correct statedump.
As long as `lttng-collectd` is launched after the registration thread, the start of its process should be delayed by liblttng-ust's constructor until the registration is completed.
To make sure we don't end up in situations where statedumps are unexpectedly not produced, we should launch `lttng-collectd` with the environment variable `LTTNG_UST_REGISTER_TIMEOUT=-1`. Otherwise, the `lttng-collectd`'s registration could time out.
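One way to pass that variable to the child is to build a dedicated environment array and hand it to `execve()`. The sketch below is an assumption about how this could be done, not code from the pull request; `env_with_register_timeout()` is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

#define REGISTER_TIMEOUT_KEY "LTTNG_UST_REGISTER_TIMEOUT="

/*
 * Hypothetical helper: return a NULL-terminated copy of base_env in
 * which LTTNG_UST_REGISTER_TIMEOUT is forced to -1. Only the array is
 * allocated; the strings are shared with base_env. The caller frees
 * the returned array with free().
 */
char **env_with_register_timeout(char *const base_env[])
{
	static char entry[] = REGISTER_TIMEOUT_KEY "-1";
	const size_t key_len = strlen(REGISTER_TIMEOUT_KEY);
	size_t count = 0, out = 0;
	char **env;

	while (base_env[count]) {
		count++;
	}

	/* Room for every existing entry, the forced one and the NULL. */
	env = calloc(count + 2, sizeof(*env));
	if (!env) {
		return NULL;
	}

	for (size_t i = 0; i < count; i++) {
		/* Drop any value inherited from the user's environment. */
		if (strncmp(base_env[i], REGISTER_TIMEOUT_KEY, key_len) == 0) {
			continue;
		}
		env[out++] = base_env[i];
	}
	env[out++] = entry;
	env[out] = NULL;
	return env;
}
```

Building the array in the parent before `fork()` avoids calling `setenv()` between `fork()` and `execve()` in a multi-threaded daemon.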
In practice, that means `thread_registration_apps()` has to be ready to accept the registration of `lttng-collectd` before it is launched.
I would add a function here that:
- waits on a `registration_thread_ready` semaphore (see explanation below)
- creates/opens a fifo in the rundir (see
- `fork()`+`execve()` the `lttng-collectd` with the path to the fifo as argument
  - in the parent, block on 1-byte `read()` on the fifo (this is similar to run-as)
  - in the child, `write()` a byte on the fifo
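The fifo handshake from the list above could look roughly like the following. This is a minimal sketch under stated assumptions: `spawn_and_wait_ready()` and `collectd_notify_ready()` are hypothetical names, the semaphore wait is left out, and the fifo path and arguments are supplied by the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side (hypothetical): write one byte on the fifo when ready. */
int collectd_notify_ready(const char *fifo_path)
{
	const char byte = 1;
	ssize_t ret;
	int fd = open(fifo_path, O_WRONLY); /* blocks until a reader exists */

	if (fd < 0) {
		return -1;
	}
	do {
		ret = write(fd, &byte, 1);
	} while (ret < 0 && errno == EINTR);
	close(fd);
	return ret == 1 ? 0 : -1;
}

/*
 * Parent side (hypothetical): create the fifo, fork()+execve() the
 * child, then block on a 1-byte read(). argv is expected to carry the
 * fifo path. Returns the child's PID once the ready byte has been
 * received, -1 on error.
 */
pid_t spawn_and_wait_ready(const char *exe, char *const argv[],
		char *const envp[], const char *fifo_path)
{
	char byte;
	ssize_t ret;
	pid_t pid;
	int fd;

	if (mkfifo(fifo_path, 0600) < 0 && errno != EEXIST) {
		return -1;
	}
	pid = fork();
	if (pid < 0) {
		return -1;
	}
	if (pid == 0) {
		execve(exe, argv, envp);
		_exit(127); /* only reached if execve() failed */
	}

	/* Blocks until the child opens the fifo for writing. */
	do {
		fd = open(fifo_path, O_RDONLY);
	} while (fd < 0 && errno == EINTR);
	if (fd < 0) {
		return -1;
	}
	do {
		ret = read(fd, &byte, 1);
	} while (ret < 0 && errno == EINTR);
	close(fd);

	if (ret != 1) {
		/* No ready byte: reap the child if it has already exited. */
		waitpid(pid, NULL, WNOHANG);
		return -1;
	}
	return pid;
}
```

Note that in this sketch the parent stays blocked in `open()` if the child dies before opening the fifo; a real implementation would need a way to handle that case.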
Look at how `sem_t notification_thread_ready` is initialized, posted when the notification-thread is ready, waited on by the rotation thread and destroyed. The semaphore should be posted here to signal that the app registration thread is ready.
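The same semaphore life cycle (initialize, post, wait, destroy) applied to a `registration_thread_ready` semaphore could be sketched as follows. The thread body and `launch_after_registration_ready()` are hypothetical stand-ins; the real registration thread obviously does much more than set a flag.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t registration_thread_ready;
/* Stand-in for "the registration socket is set up and listening". */
static int registration_socket_ready;

/* Hypothetical, simplified registration thread. */
static void *registration_thread_sketch(void *arg)
{
	(void) arg;
	registration_socket_ready = 1;
	/* Signal that registrations can now be accepted. */
	sem_post(&registration_thread_ready);
	return NULL;
}

/*
 * Hypothetical launcher: waits for the registration thread before
 * doing anything else. Returns the readiness flag it observed after
 * the wait (1 on success), or -1 on error.
 */
int launch_after_registration_ready(void)
{
	pthread_t tid;
	int observed;

	if (sem_init(&registration_thread_ready, 0, 0) != 0) {
		return -1;
	}
	if (pthread_create(&tid, NULL, registration_thread_sketch, NULL) != 0) {
		sem_destroy(&registration_thread_ready);
		return -1;
	}

	/* Retry if the wait is interrupted by a signal. */
	while (sem_wait(&registration_thread_ready) != 0) {
		if (errno != EINTR) {
			pthread_join(tid, NULL);
			sem_destroy(&registration_thread_ready);
			return -1;
		}
	}
	observed = registration_socket_ready;

	/* This is where lttng-collectd would be launched. */

	pthread_join(tid, NULL);
	sem_destroy(&registration_thread_ready);
	return observed;
}
```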
As far as the teardown is concerned, killing/terminating `lttng-sessiond` should result in SIGPIPE being received by `lttng-collectd`. Since SIGPIPE is only received if a process tries to write to a closed pipe, `lttng-collectd` should just loop on `write()`. Everything we want it to do will happen in the lttng-ust thread anyway.
Updated by Michael Jeanson 10 months ago
Here is a branch with a PoC: https://github.com/mjeanson/lttng-tools/tree/collectd
There is still some work and cleanup to do but it is working.