Bug #972
closed
Babeltrace terminates unexpectedly during lttng-live trace reading
100%
Description
Problem description:
While viewing a trace recorded on another host, using lttng-relayd and babeltrace as the lttng-live viewer, babeltrace unexpectedly stops during live tracing without reporting any error.
How can the problem be reproduced:
First, execute the following two commands on an Ubuntu host.
1). lttng-relayd -d -o /tmp/lttng/live/
2). lttng-sessiond -d --no-kernel
Then execute the following commands on another Ubuntu host.
3). lttng-relayd -d -o /tmp/lttng/live/
4). lttng-sessiond -d --no-kernel
5). lttng create s1 --live -U net://ip (ip: the address of the first Ubuntu host)
6). lttng enable-event -u -a
7). lttng start
8). ./hello (a test application can be found on the LTTng web site: http://lttng.org/docs/#doc-tracing-your-own-user-application)
Finally, view the trace on the first Ubuntu host.
9). babeltrace --clock-date --no-delta /tmp/lttng/live/localhost/s1-20000107-015309/
After about twenty minutes, babeltrace stops without reporting any error.
The debug log is as follows:
[04:36:03.081401633] (+0.010076186) localhost hello_world:my_first_tracepoint: { timestamp_begin = 678972918036556, timestamp_end = 678973917652555, content_size = 20768, packet_size = 32768, events_discarded = 0, cpu_id = 0 }, { id = ( "compact" : container = 0 ), v = { compact = { timestamp = 109183534 } } }, { my_string_field = "i'm there", my_integer_field = 3 }
[debug] ctf_pos_get_event offset 20608content_size 20768
[debug] ctf_move_pos test EOF: 20608
[debug] ctf_move_pos after increment: 20608
[debug] ctf_move_pos test EOF: 20608
[debug] ctf_move_pos after increment: 20608
[debug] ctf_move_pos test EOF: 20608
[debug] ctf_move_pos after increment: 20613
[debug] ctf_move_pos test EOF: 20613
[debug] ctf_move_pos after increment: 20613
[debug] ctf_move_pos test EOF: 20613
[debug] ctf_move_pos after increment: 20613
[debug] ctf_move_pos test EOF: 20613
[debug] ctf_move_pos after increment: 20613
[debug] ctf_move_pos test EOF: 20613
[debug] ctf_move_pos after increment: 20640
[debug] ctf_move_pos test EOF: 20640
[debug] ctf_move_pos after increment: 20640
[debug] ctf_move_pos test EOF: 20640
[debug] ctf_move_pos after increment: 20640
[debug] CTF string read i'm there
[debug] ctf_move_pos test EOF: 20640
[debug] ctf_move_pos after increment: 20720
[debug] ctf_move_pos test EOF: 20720
[debug] ctf_move_pos after increment: 20736
[debug] ctf_move_pos test EOF: 20736
[debug] ctf_move_pos after increment: 20768
[04:36:03.091477819] (+0.010076186) localhost hello_world:my_first_tracepoint: { timestamp_begin = 678972918036556, timestamp_end = 678973917652555, content_size = 20768, packet_size = 32768, events_discarded = 0, cpu_id = 0 }, { id = ( "compact" : container = 0 ), v = { compact = { timestamp = 119259720 } } }, { my_string_field = "i'm there", my_integer_field = 3 }
[debug] ctf_packet_seek (before call): 20768
[debug] ctf_packet_seek (after call): -1
[verbose] finished converting. Output written to:
<stdout>
Updated by Jonathan Rajotte Julien about 9 years ago
Hi Li,
As far as I remember, running babeltrace on a live trace without the "--input-format lttng-live" argument does not guarantee synchronized reading, so if babeltrace reads an EOF for the trace, it exits. This can be seen in this debug line:
[verbose] finished converting
See https://lttng.org/docs/#doc-lttng-live for proper usage of the live mode.
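The early exit described above can be spotted mechanically in a saved debug log. The following is a minimal Python sketch, assuming the log format shown in the report and assuming that a `ctf_packet_seek (after call)` value of -1 marks the end of the stream. The function name `find_eof_exit` is hypothetical and not part of babeltrace.

```python
import re

# Matches the debug line printed after the seek in the reported log.
SEEK_AFTER = re.compile(r"ctf_packet_seek \(after call\): (-?\d+)")


def find_eof_exit(log_lines):
    """Return True if a seek result of -1 is followed by 'finished converting'.

    Assumption: -1 after the seek means babeltrace hit EOF on the trace file.
    """
    saw_failed_seek = False
    for line in log_lines:
        match = SEEK_AFTER.search(line)
        if match:
            # Remember only the most recent seek result.
            saw_failed_seek = int(match.group(1)) == -1
        elif saw_failed_seek and "finished converting" in line:
            return True
    return False
```

Fed the last three lines of the reported log, this returns True, which matches the behaviour of reading the trace directory as a finished trace rather than as a live stream.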
Also, there are some significant inconsistencies in the way you expect live mode to work.
The proper scenario would be:
On the first Ubuntu host:
lttng-relayd -d -o /tmp/lttng/live/
Starting a sessiond is unnecessary here.
On the second host:
lttng-sessiond -d --no-kernel
lttng create s1 --live -U net://ip (ip: the address of the first Ubuntu host)
lttng enable-event -u -a
lttng start
./hello
No need to start a relayd here since we send data to a remote relayd.
On the first Ubuntu host:
First, list the available live sessions for localhost:
babeltrace --clock-date --no-delta -i lttng-live net://localhost
Then hook babeltrace to the full URL (in my test it was):
babeltrace -i lttng-live --clock-date --no-delta net://localhost/host/myhostname/s1
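The URL above follows the pattern net://RELAYD/host/HOSTNAME/SESSION. As a small illustration, here is a Python sketch that assembles that URL and the matching babeltrace argument list. The helper names `live_url` and `babeltrace_command` are hypothetical, and the pattern is inferred only from the example in this comment.

```python
def live_url(relayd_host, traced_hostname, session_name):
    """Build net://<relayd>/host/<hostname>/<session>, as in the example above."""
    for part in (relayd_host, traced_hostname, session_name):
        # Reject empty parts and parts that would add extra path segments.
        if not part or "/" in part:
            raise ValueError(f"invalid URL component: {part!r}")
    return f"net://{relayd_host}/host/{traced_hostname}/{session_name}"


def babeltrace_command(url):
    """Return the argument list for the live viewer, using the flags shown above."""
    return ["babeltrace", "-i", "lttng-live", "--clock-date", "--no-delta", url]


if __name__ == "__main__":
    print(" ".join(babeltrace_command(live_url("localhost", "myhostname", "s1"))))
```

The hostname component is the name of the traced machine as listed by the relay daemon, so it is best copied from the output of the listing command rather than guessed.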
Could you redo your experiment this way and report your findings?
Cheers!
Updated by Jonathan Rajotte Julien about 9 years ago
- Status changed from New to Feedback
- Priority changed from Normal to Low
Updated by Li Liguang about 9 years ago
Hi,
Thanks for the feedback. I did not clearly understand the remote live mode at that time.
After following the steps you provided, the problem no longer appears. It was my error; please close this issue. Thanks.
Regards.
Updated by Jonathan Rajotte Julien about 9 years ago
- Status changed from Feedback to Resolved
- % Done changed from 0 to 100