Bug #471
lttng poorly handles some trace-provider errors
Status: closed
% Done: 100%
Description
First I modify two files of the lttng-ust/doc/examples/easy-ust project:
sample.c:
    int main(int argc, char **argv)
    {
        int i = 0;
        short s[3] = { -1, -1, -1 };

        for (i = 0; i < 2000; i++) {
            tracepoint(sample_component, message, "Hello World", s, 3);
            usleep(10);
        }
        return 0;
    }
sample_component_provider.h:
    TP_ARGS(char *, text, short *, shortvalues, short, length),
    TP_FIELDS(
        ctf_string(message, text)
        ctf_sequence(short, shortseq, shortvalues, short, length)
    )
As far as I can tell, the above two changes should correctly produce a sequence of shorts in the UST event payload. It certainly builds correctly with make. There's probably an error or two lurking in there (I should point out that I've previously managed to get sample to generate traces with arrays of longs, so the error is specific to my (mis)use of ctf_sequence), because when I get the lttng session going, this happens:
    $ lttng create mylocalsession
    Session mylocalsession created.
    Traces will be written in /home/username/lttng-traces/mylocalsession-20130311-140014
    $ lttng enable-event -u --all
    All UST events are enabled in channel channel0
    $ lttng start
    Tracing started for session mylocalsession
At this point I run ./sample. The application does not complain, segfault, or otherwise misbehave. But the user's lttng-consumerd and lttng-sessiond both quit unexpectedly. So when I ask for lttng list, I get:
    $ lttng -vvv list
    Spawning a session daemon
    DEBUG1 [8583/8583]: SIGUSR1 caught (in sighandler() at lttng.c:196)
    DEBUG2 [8583/8583]: Session name: (null) (in cmd_list() at commands/list.c:718)
    DEBUG1 [8583/8583]: LSM cmd type : 13 (in send_session_msg() at lttng-ctl.c:261)
    DEBUG1 [8583/8583]: Session count 0 (in list_sessions() at commands/list.c:586)
    Currently no available tracing session
The lttng-traces folder contains what looks like a proper trace, but it is corrupt, as babeltrace is unable to open it ('the metadata is empty').
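A quick way to confirm this symptom without babeltrace is to check the size of the trace's metadata file directly. The following is only a minimal sketch: it assumes that 'the metadata is empty' corresponds to a zero-length metadata file on disk, which I have not verified, and file_is_empty is a hypothetical helper name, not part of any LTTng tool.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of LTTng or babeltrace).
 * Returns 1 if the file at `path` exists and has zero length,
 * 0 if it exists and is non-empty, and -1 if it cannot be stat'ed.
 */
int file_is_empty(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;              /* missing file or permission problem */
    return st.st_size == 0 ? 1 : 0;
}
```

Calling it on the metadata file inside the session's output directory would distinguish "the file was created but never written" from "the file is missing altogether".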
Now, besides what I did wrong in setting up my trace provider (which isn't an LTTng bug, obviously), the problem I'm reporting here is that the failure was completely silent: no errors of any kind were reported by lttng. The daemons just quit.
An error-reporting mechanism is needed. It could be as simple as the "current session" .lttngrc trick: have the session daemon write a "crash report" to .lttngerror or some such (the consumer daemon would simply forward its crash report to the session daemon). The next time the lttng session daemon is started, it would spot the error report file and warn the user about it (e.g. "An error log exists: consult %s", where %s is the fully qualified path of .lttngerror).
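To make the proposal concrete, here is a rough sketch of the two halves of such a mechanism: one function a daemon would call just before quitting on an error, and one the session daemon would call at startup. Everything here is an assumption for illustration: the function names, the one-line report format, and the way the path is passed in are all hypothetical and do not exist in lttng-tools.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical: called by a daemon just before it quits on an error.
 * Writes a one-line reason to `path`, replacing any earlier report.
 * Returns 0 on success, -1 on failure.
 */
int write_error_report(const char *path, const char *reason)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (!fp)
        return -1;
    ok = fprintf(fp, "%s\n", reason) >= 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hypothetical: called when the session daemon starts.
 * If a report file exists at `path`, formats the suggested warning
 * into `warning` and returns 1; returns 0 when there is no report.
 */
int check_error_report(const char *path, char *warning, size_t len)
{
    FILE *fp = fopen(path, "r");

    if (!fp)
        return 0;               /* no report left behind */
    fclose(fp);
    snprintf(warning, len, "An error log exists: consult %s", path);
    return 1;
}
```

A real implementation would also have to decide when the report is cleared and how the consumer daemon hands its report to the session daemon; the sketch leaves both open.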