Project

General

Profile

Bug #1183

LTTng-UST and modules ring buffers timestamp_end value may not include last events in the packet (LTTng 2.2+)

Added by Mathieu Desnoyers 12 months ago. Updated 12 months ago.

Status:
Resolved
Priority:
Normal
Target version:
Start date:
04/29/2019
Due date:
% Done:

100%

Estimated time:

Description

In the following scenario, lttng-ust and lttng-modules ring buffer code may not strictly respect the guarantee about packet header [timestamp_begin, timestamp_end] range including timestamps of all events contained in the packet.

Indeed, if we have the prior-to-last event (with timestamp T) written in the packet interrupted or preempted before it increments the commit counter, and then the last event for the packet with timestamp (T+1) written and committed, it will be up to the commit counter increment of event with timestamp T to perform the packet delivery, and therefore set the value for the timestamp_end.

However, that timestamp is taken from the event reservation, and will not cover T+1.

This issue was introduced in LTTng 2.2:

commit 969771a1536069d8f3f05e4836f5ef746d9b9a11
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date:   Sun Nov 24 04:11:16 2013 -0500

    Fix: eliminate timestamp overlap between packets

    By using the timestamp sampled at space reservation when the packet is
    being filled as "end timestamp" for a packet, we can ensure there is no
    overlap between packet timestamp ranges, so that packet timestamp end <=
    following packets timestamp begin.

    Overlap between consecutive packets becomes an issue when the end
    timestamp of a packet is greater than the end timestamp of a following
    packet, IOW a packet completely contains the timestamp range of a
    following packet.  This kind of situation does not allow trace viewers
    to do binary search within the packet timestamps. This kind of situation
    will typically never occur if packets are significantly larger than
    event size, but this fix ensures it can never even theoretically happen.

    The only case where packets can still theoretically overlap is if they
    have equal begin and end timestamps, which is valid.

    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Files

#1

Updated by Mathieu Desnoyers 12 months ago

  • File 0001-Fix-timestamp_end-field-should-include-all-events-wi.patch added
  • Status changed from New to Feedback
  • Assignee set to Mathieu Desnoyers
  • Target version set to 2.9

I think I've been able to come up with a clean fix for the issue. The idea is to sample the timestamp when doing the last space reservation in a sub-buffer (which necessarily happens before doing the delivery after its last commit). We can save this timestamp temporarily in a per-sub-buffer control area (we have exclusive access to that area until we increment the commit counter).

Then, that timestamp value will be read when delivering the sub-buffer, whichever event or switch happens to be the last to increment the commit counter to perform delivery. The timestamp value can be read without worry about concurrent access, because at that point sub-buffer delivery has exclusive access to the sub-buffer.

This ensures the timestamp_end value is always equal to the timestamp of the last event.

I'm attaching the fix for lttng-modules for feedback. If we decide to go that way, I'll port it to lttng-ust.

#2

Updated by Mathieu Desnoyers 12 months ago

Updated patches for ust and modules.

#3

Updated by Mathieu Desnoyers 12 months ago

  • File deleted (0001-Fix-timestamp_end-field-should-include-all-events-wi.patch)
#4

Updated by Mathieu Desnoyers 12 months ago

  • File deleted (0001-Fix-timestamp_end-field-should-include-all-events-wi.patch)
#6

Updated by Mathieu Desnoyers 12 months ago

We needed to bump the ABI major for lttng-ust because it changes the ring buffer layout in shared memory.

Therefore, the fix is backported to 2.11 (in rc stage), 2.10, 2.9 for lttng-modules only.

It is backported into 2.11 in lttng-ust (still in rc stage).

#7

Updated by Mathieu Desnoyers 12 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100

Also available in: Atom PDF