Bug #1406
closedMissing events/crash with babeltrace2 lttng-live plugin with session that uses per-pid buffers
100%
Description
While investigating #1403 I noticed one pattern of test failures that had babeltrace2 crashes or hangs. The problem seemed intermittent, but I have been able to develop a test case that's reproducible for me. This affects for babeltrace2 stable-2.0 and master while using lttng-tools master.
The test case (see attached script) performs the following steps:
- Start a ust application and leave it running
- Configure and then start an lttng live session
- Connect a live viewer (babeltrace)
- Run a second ust application
- Wait for the expected number of events
- In the failing case, no events are seen by babeltrace
Using per-uid buffers, the test typically completes normally. With per-pid buffers the test fails, hanging indefinitely if waiting for the specified number of events. While "hanging", babeltrace2 is polling the relayd:
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, NULL) = 0 sendto(3, "\0\0\0\0\0\0\0\10\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\6", 24, MSG_NOSIGNAL, NULL, 0) = 24 recvfrom(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64, 0, NULL, NULL) = 64 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, NULL) = 0 sendto(3, "\0\0\0\0\0\0\0\10\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\6", 24, MSG_NOSIGNAL, NULL, 0) = 24 recvfrom(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64, 0, NULL, NULL) = 64
While the test as written stops the session before the first gen-ust-events app when the timeout is reached, if the test is invoked with LIVEVIEWER_TIMEOUT=NO
we can try to different operations:
- killing the early gen-ust-events app causes babeltrace2 to exit unsuccessfully (I don't think this should happen)
- killing the late gen-ust-events app leaves babeltrace2 running
- stopping the session (as in the test), causes babeltrace2 to exit unsuccessfully (I think this should exit gracefully, but I could be wrong)
Files
Updated by Kienan Stewart about 1 year ago
- Related to Bug #1403: Investigate why events are no longer recorded by the live view when `DELAYUS` in `tests/regression/tools/clear/test_ust` is reduced too low added
Updated by Kienan Stewart about 1 year ago · Edited
The per-pid buffer portion of the test also fails with lttng-tools stable-2.13.
I believe I understand the mechanism of failure, and I think changes to both lttng-tools (relayd) and babeltrace can address the issue.
The first application is running and registered with the sessiond and relayd, and it's events have all be emitted. This application continues to run and remained registered, waiting for the before exit file to be touched. The live viewer then connects and has the first set of streams from the early app registered. It begins to regularly poll the relayd with VIEWER_GET_NEXT_INDEX
commands1.
When the second UST application runs and registers, the sessiond notifies the relayd (RELAYD_ADD_STREAM
). Once all streams are added, RELAYD_STREAMS_SENT
is sent by the sessiond to the relayd. RELAYD_STREAMS_SENT
will cause session->new_stream
to be set for the relevant session.
The new streams are registered with the relayd:
DBG1 - 14:29:10.865071340 [2589952/2589964]: Done receiving control command header: fd = 19, cmd = RELAYD_STREAMS_SENT, cmd_version = 0, payload size = 0 bytes (in relay_pro cess_control_receive_header() at main.cpp:3508) DBG1 - 14:29:10.865143952 [2589952/2589964]: Done receiving control command payload: fd = 19, payload size = 0 bytes (in relay_process_control_receive_payload() at main.cpp: 3423) DBG3 - 14:29:10.865201980 [2589952/2589964]: Processing "RELAYD_STREAMS_SENT" command for socket 19 (in relay_process_control_command() at main.cpp:3299)
This last variable is checked in live.cpp::check_new_streams
[2]. If set, the results index flags should have the bit for LTTNG_VIEWER_FLAG_NEW_STREAM
set to indicate that the viewer should run LTTNG_VIEWER_GET_NEW_STREAMS
to update it's streams. However, the next LTTNG_VIEWER_GET_NEW_STREAMS_COMMAND
doesn't happen.
New streams are only fetched once:
$ grep GET_NEW_STREAMS /tmp/relayd.log DBG1 - 14:29:13.153878107 [2589952/2589967]: Processing GET_NEW_STREAMS viewer command from connection 31 (in process_control() at live.cpp:2522)
As there are no new events causing changes to the indices of streams (data + metadata streams from the the early UST application), the GET_NEXT_INDEX
command is returning early3 with the flags set by check_index_status
[4] with no verification at all of the session->new_stream
status.
The next GET_NEXT_INDEX
commands after the new streams are registered with the relayd:
DBG1 - 14:29:10.896364295 [2589952/2589967]: Processing GET_NEXT_INDEX viewer command from connection 31 (in process_control() at live.cpp:2522) DBG1 - 14:29:10.896469038 [2589952/2589967]: Transition to latest chunk check (0 -> 0): Same chunk, no need to rotate (in viewer_get_next_index() at live.cpp:1785) DBG1 - 14:29:10.896504900 [2589952/2589967]: Check index status: index_received_seqcount 1 index_sent_seqcount 1 for stream 1 (in check_index_status() at live.cpp:1499) DBG1 - 14:29:10.896537507 [2589952/2589967]: Check index status: inactive with beacon, for stream 1, returning status=INDEX_INACTIVE (in check_index_status() at live.cpp:1542) 2)
As a result, the live viewer won't be able to start the process to register new streams until there are changes on the first set of registered streams. In this test case, there are no further events from the early UST application.
However, it is not sufficient to ensure that the LTTNG_VIEWER_FLAG_NEW_STREAM
is sent when the relayd replies with LTTNG_VIEWER_INDEX_INACTIVE
. When the command's return status is LTTNG_VIEWER_INDEX_INACTIVE
, the index flag LTTNG_VIEWER_FLAG_NEW_STREAM
is never verified5. The new stream and new metadata flags are only being checked on the babeltrace2 side when command's return code is LTTNG_VIEWER_INDEX_OK
[6].
To resolve this I would propose two changes:
- In the relayd, ensure that the index flags have
LTTNG_VIEWER_FLAG_NEW_STREAM
have checked, set, and returned on all the non-error code paths. - In the babeltrace2 live viewer, check the same flags on all non-error code paths so the subsequent commands can be issued to update the streams.
I believe similar issues also apply to LTTNG_VIEWER_FLAG_NEW_METADATA
.
[1]: https://github.com/lttng/lttng-tools/blob/7d46777ba0eaba7916ae822a7133f39740cdf9e5/doc/live-reading-protocol.txt#L120
[2]: https://github.com/lttng/lttng-tools/blob/7d46777ba0eaba7916ae822a7133f39740cdf9e5/src/bin/lttng-relayd/live.cpp#L1902
[3]: https://github.com/lttng/lttng-tools/blob/7d46777ba0eaba7916ae822a7133f39740cdf9e5/src/bin/lttng-relayd/live.cpp#L1822
[4]: https://github.com/lttng/lttng-tools/blob/7d46777ba0eaba7916ae822a7133f39740cdf9e5/src/bin/lttng-relayd/live.cpp#L1562
[5]: https://github.com/efficios/babeltrace/blob/fca1d0f55bf24741f07d2a86d70ad7274d3ec851/src/plugins/ctf/lttng-live/viewer-connection.cpp#L1463
[6]: https://github.com/efficios/babeltrace/blob/fca1d0f55bf24741f07d2a86d70ad7274d3ec851/src/plugins/ctf/lttng-live/viewer-connection.cpp#L1448
Updated by Kienan Stewart about 1 year ago
- Status changed from New to In Progress
- Assignee set to Kienan Stewart
For a potential patch check LTTNG_VIEWER_FLAG_NEW_STREAM
(and logically _NEW_METADATA
) for the INACTIVE
and HUP
cases. Eg. check_new_streams
before check_index_status
, and have check_index_status
not fully override the index flags (eg. use |=
).
Updated by Mathieu Desnoyers about 1 year ago
Kienan Stewart wrote in #note-4:
For a potential patch check
LTTNG_VIEWER_FLAG_NEW_STREAM
(and logically_NEW_METADATA
) for theINACTIVE
andHUP
cases. Eg.check_new_streams
beforecheck_index_status
, and havecheck_index_status
not fully override the index flags (eg. use|=
).
Sorry I mislead you in our IRC discussion: moving check_new_streams before check_index_status is fine, but we don't need to change anything in check_index_status because it only modifies "index->status" and not "index->flags".
Updated by Jérémie Galarneau about 1 year ago
First thanks for the write-up, it's awesome!
As a result, the live viewer won't be able to start the process to register new streams until there are changes on the first set of registered streams. In this test case, there are no further events from the early UST application.
This part surprises me. I would expect the live beacon (effectively a new index entry sent from the consumer to the relay daemon when the live timer fires) to cause "new activity" to be perceived on the streams of the original application.
Have you noticed something that prevents the live beacon from doing its job?
Updated by Jérémie Galarneau about 1 year ago · Edited
Just talked with Mathieu and I hadn't realized the live beacons resulted in the INACTIVE
status being returned. Your explanation makes a lot of sense.
Updated by Jérémie Galarneau about 1 year ago
I took a shot at implementing a fix against both LTTng-tools and Babeltrace2 master branches:
https://review.lttng.org/c/lttng-tools/+/11588
https://review.lttng.org/c/babeltrace/+/11587
They are not tested yet, I'm setting up a reproducer to confirm on my end.
Let me know if you get a chance to test it on your end!
Updated by Kienan Stewart about 1 year ago · Edited
I also have proposed fixes from last week here: https://review.lttng.org/q/topic:%22bugs1406-inactive_indices_miss_new_streams%22 - I just needed to finish the write up + per-uid tests
Updated by Jérémie Galarneau about 1 year ago
Okay great! Sorry, I didn't think of looking on gerrit.
They look very close functionality-wise so I don't mind abandoning my changes. I'll comment on yours.
Updated by Kienan Stewart about 1 year ago
My bad for not leaving a comment here on Friday to say where I was at. I could also abandon my changes and bring in the tests to your patch on live. Whichever you prefer.
What was your motivation for adding the SAMPLE_AND_CLEAR
and SAMPLE
modes to the check_new_streams
function?
Updated by Jérémie Galarneau about 1 year ago
Kienan Stewart wrote in #note-11:
My bad for not leaving a comment here on Friday to say where I was at. I could also abandon my changes and bring in the tests to your patch on live. Whichever you prefer.
I'm good going with yours, you did the heavy lifting of finding the cause 🤘
What was your motivation for adding the
SAMPLE_AND_CLEAR
andSAMPLE
modes to thecheck_new_streams
function?
It's an attempt to maintain compatibility with unfixed viewers as much as possible.
When check_new_streams
is invoked, it searches for a session with new streams. If one is found, it clears the flag by using a cmpxchg and exits with 1
. In that sense, the check is "destructive".
Since unfixed viewers don't check the flags for new streams when the index reply uses the [...]_INACTIVE
status, we would clear the session's flag (on the relay daemon end) and they would simply ignore it. Then, the viewer would never be informed that this session has new streams even if the existing streams start to see activity again (causing the index queries to return "active" indexes).
The two sampling modes are introduced to have destructive and non-destructive versions of the check. When an "active" index reply is delivered, we use the destructive mode as we always did.
When the index reply is "inactive", we use the non-destructive version so that:- We give the "hint" that new streams are available to "fixed" viewers so that they fetch the new streams and resume consumption even though the current streams are "inactive",
- We allow unfixed viewers a chance to see the
[...]_NEW_STREAMS
flag whenever the index query happens to succeed at delivering a proper "active" index.
Does that make sense?
Updated by Kienan Stewart about 1 year ago
Jérémie Galarneau wrote in #note-12:
It's an attempt to maintain compatibility with unfixed viewers as much as possible.
When
check_new_streams
is invoked, it searches for a session with new streams. If one is found, it clears the flag by using a cmpxchg and exits with1
. In that sense, the check is "destructive".Since unfixed viewers don't check the flags for new streams when the index reply uses the
[...]_INACTIVE
status, we would clear the session's flag (on the relay daemon end) and they would simply ignore it. Then, the viewer would never be informed that this session has new streams even if the existing streams start to see activity again (causing the index queries to return "active" indexes).The two sampling modes are introduced to have destructive and non-destructive versions of the check. When an "active" index reply is delivered, we use the destructive mode as we always did.
When the index reply is "inactive", we use the non-destructive version so that:
- We give the "hint" that new streams are available to "fixed" viewers so that they fetch the new streams and resume consumption even though the current streams are "inactive",
- We allow unfixed viewers a chance to see the
[...]_NEW_STREAMS
flag whenever the index query happens to succeed at delivering a proper "active" index.Does that make sense?
I hadn't realized that check_new_streams
was a destructive compxchg. Would it make sense to replace the compxchg with the non-destructive sampling version only, and then in viewer_get_new_streams
have a corresponding compxchg or similar to reset the value back to 0?
Based on my new understanding the current implementation would only send a packet with the viewer_flags LTTNG_VIEWER_FLAG_NEW_STREAM
once and assume that the request will be (at some point) handled. If the expectation of the protocol is that new streams have to be fetched with the LTTNG_VIEWER_GET_NEW_STREAMS
command, wouldn't it be easier to assume that the request isn't actually treated until that command happens? While it would mean that a client could potentially receive multiple packets with the LTTNG_VIEWER_FLAG_NEW_STREAM
in a row if they don't immediately treat the update to the session state, I think it would muddy less the expectations around check_new_streams
and be clear for clients that the session state update is still pending. I don't think this change would impact how babeltrace2 operates.
From https://github.com/lttng/lttng-tools/blob/master/doc/live-reading-protocol.txt#L121 there doesn't seem to be any indication that the flag LTTNG_VIEWER_FLAG_NEW_STREAM
would only be set once.
Updated by Jérémie Galarneau about 1 year ago · Edited
Kienan Stewart wrote in #note-13:
I hadn't realized that
check_new_streams
was a destructive compxchg. Would it make sense to replace the compxchg with the non-destructive sampling version only, and then inviewer_get_new_streams
have a corresponding compxchg or similar to reset the value back to 0?
Yes, it's one of the alternatives I had noted on Friday. I had not pursued it because of the locking context in which all of this happens.
The trace receiving end (an event loop that handles incoming trace data from various consumer daemons) and the live protocol event loop (which interacts with viewers) are in concurrent threads that share some data locklessly; the relay_session
(and the objects it owns) in this case.
That's why an atomic cmpxchg is used to clear the new_session
flag.
get_new_streams
command, we have to be mindful of the inherent "race" between:
- The publication of new streams (from the receiving end)
- The clearing of the flag (from the viewer end).
- The receiving end sets the
new_stream
flag while processingrelay_streams_sent
- The live end starts to send new streams in response to a
get_new_streams
command - New streams are published on the receiving end, setting the
new_stream
flag - The
new_stream
flag is cleared by the viewer end
In such a scenario, some streams would never be announced to the viewer.
Just generally, we must guarantee that new_stream
is set if some streams could have not been seen by the viewer.
new_stream
flag in two places:
To be honest, I'm not quite sure of the reasoning behind the second assignation. I need to look at it more carefully.
Regardless, in both cases we set the flag after making the streams visible (publishing) to the trace's stream_list
.
On the live end, we need to ensure we clear the new_streams
flag before iterating on the stream list.
I need to think about the appropriate synchronization (memory barriers) to put in place considering those already implied by the uses of the liburcu
API 🤔.
Updated by Kienan Stewart 11 months ago
The current proposed patch sets are
Updated by Jérémie Galarneau 11 months ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset lttng-tools|283a96e49f066a4263726271bc64aa7e94ae0e92.