Bug #479

closed

Tracing of a real-time application drops a large number of events at the end of the trace

Added by Raphaël Beamonte over 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
Low
Assignee:
-
Target version:
-
Start date:
03/22/2013
Due date:
% Done:

0%

Estimated time:

Description

This behavior appears with or without using the read-timer option.
It first appeared with commit 34a91bdb42a2a3b01b687ab5e1ba7638401e6dfc, which implements the read-timer for real-time.

Tests and results

npt

The tests were made with npt, which is available in the DORSAL git repository.

Test procedure

  • We have 4 CPUs. We isolate them with cpusets
  • We keep the system on CPU 0.
  • We start a sessiond daemon on CPU 2
  • We isolate a shell session on CPU 1 and then run the following commands as root:
    lttng create lttngrt
    lttng enable-channel u -u --num-subbuf 32 --subbuf-size 1048576
    lttng enable-event -c u -u "npt:loop","npt:start","npt:stop" 
    ./npt --eval-cpu-speed -l10000000
    lttng destroy
    
  • We verify the trace: babeltrace lttngrt-trace > /dev/null
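The verification step can be wrapped in a small helper. This is a sketch, assuming babeltrace emits a warning containing the word "discarded" on stderr when events were lost (the exact wording may vary between versions):

```shell
# Hedged sketch: check a trace for discarded events.

# Parse babeltrace's stderr (read from stdin); succeed if any
# discarded-event warning is found.
has_discards() {
    grep -qi 'discarded'
}

# Run babeltrace on a trace directory, discard the event output,
# and report whether any events were discarded.
check_trace() {
    trace_dir=$1
    if babeltrace "$trace_dir" 2>&1 >/dev/null | has_discards; then
        echo "events discarded in $trace_dir"
        return 1
    fi
    echo "trace complete"
}
```

For example, `check_trace lttngrt-trace` prints "trace complete" only when babeltrace reports no discarded events.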

Results

With commit 34a91bdb42a2a3b01b687ab5e1ba7638401e6dfc of LTTng-UST, regardless of the commit used for LTTng-tools (just keep them in lockstep), we observe the same behavior in every test, whether or not a read-timer option is added to the enable-channel line shown in the previous section. This behavior is that latency during the trace is relatively low, but many events are dropped at the end of the trace. (The read-timer only affected whether events were also dropped during the trace in addition to those at the end; with a low read-timer value, that other problem disappeared.) In fact, it is always the same number of events for the same number of loops passed to npt: 8602021.
For commits prior to this one, this behavior does not appear and the trace is complete.

However, this behavior was also encountered with the earlier commits during tests where I made a mistake and forgot to isolate the sessiond on CPU 2. Since I had forgotten to start the sessiond, it was started during the call to "lttng create" and therefore ran on the same CPU as npt. In that situation, the number of dropped events was exactly the same.

Actions #1

Updated by Raphaël Beamonte over 11 years ago

  • Description updated (diff)
Actions #2

Updated by Mathieu Desnoyers over 11 years ago

In your test procedure, a couple of operations are not detailed. How do you perform the preparation steps?

"We have 4 CPUs. We isolate them with cpusets
We keep the system on CPU 0.
We start a sessiond daemon on CPU 2
We isolate a shell session on CPU 1 and then run the following commands as root :"

Also: what is your computer configuration? How many processors, and how many threads per processor (hyperthreading?) Have you made sure hyperthreading is disabled in your BIOS?

Actions #3

Updated by Mathieu Desnoyers over 11 years ago

  • Status changed from New to Feedback
Actions #4

Updated by Mathieu Desnoyers over 11 years ago

I ran your test case without any preparation steps (no CPU pinning performed), and no events seem to be lost, either with or without --read-timer:

lttng-sessiond (as root)

# with read-timer:

lttng create lttngrt
lttng enable-channel u -u --num-subbuf 32 --subbuf-size 1048576 --read-timer 200
lttng enable-event -c u -u "npt:loop","npt:start","npt:stop" 
lttng start

as root:

./npt --eval-cpu-speed -l10000000
# CPU affinity set on CPU 1
# Application priority set to 99
# CPU frequency (evaluation): 2893.41 MHz
10000000 cycles done over 10000000.
Cycles duration:
        min:            1.093518 us
        max:            35.203394 us
        mean:           1.901982 us
        sum:            19019823.487977 us
        variance:       0.800043 us
        std dev:        0.894451 us
--------------------------
duration (us)   nb. cycles
--------------------------
1               7332645
2               2150347
3               400108
4               22521
5               47592
6               508
7               346
8               208
9               1064
10              2522
11              15075
12              23375
13              3240
14              246
15              73
16              42
17              29
18              24
19              17
20              4
21              6
22              2
23              4
32              1
35              1
--------------------------
Overruns (1000000+): 0

lttng destroy

babeltrace /home/compudj/lttng-traces/lttngrt-20130410-205542 > /dev/null
(no event discarded)
babeltrace /home/compudj/lttng-traces/lttngrt-20130410-205542 |wc -l
10000007

# without read-timer:

lttng create lttngrt
lttng enable-channel u -u --num-subbuf 32 --subbuf-size 1048576
lttng enable-event -c u -u "npt:loop","npt:start","npt:stop" 
lttng start

as root:

./npt --eval-cpu-speed -l10000000
# CPU affinity set on CPU 1
# Application priority set to 99
# CPU frequency (evaluation): 2893.41 MHz

10000000 cycles done over 10000000.
Cycles duration:
        min:            1.051353 us
        max:            354.635387 us
        mean:           3.081866 us
        sum:            30818657.399807 us
        variance:       0.408903 us
        std dev:        0.639455 us
--------------------------
duration (us)   nb. cycles
--------------------------
1               749444
2               973
3               9040419
4               158954
5               25511
6               2639
7               3573
8               8107
9               6541
10              1116
11              957
12              888
13              356
14              190
15              146
16              89
17              40
18              21
19              12
20              2
21              3
22              1
23              3
24              4
25              5
26              2
27              2
29              1
354             1
--------------------------
Overruns (1000000+): 0

lttng destroy

babeltrace /home/compudj/lttng-traces/lttngrt-20130410-210519 > /dev/null
(no event discarded)
babeltrace /home/compudj/lttng-traces/lttngrt-20130410-210519 |wc -l
10000007
Actions #5

Updated by Raphaël Beamonte over 11 years ago

Since the bug report, I worked with Yannick on the bug and discovered new elements that could help target the problem.
I'll try here to answer your questions and give the new elements.

OS and system

xaf@station11-64:~$ uname -a
Linux station11-64 3.2.0-4-rt-amd64 #1 SMP PREEMPT RT Debian 3.2.32-1 x86_64 GNU/Linux
xaf@station11-64:~$ cat /proc/cpuinfo | grep "^model name" 
model name    : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
model name    : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
model name    : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
model name    : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz

Hyperthreading off (working on real time...)

npt

Forgot to mention: the version used for these tests is v0.02 (tag). The new version I'm working on is not mature enough yet and needs some configure options to behave like v0.02 (it shouldn't have been pushed online yet... my bad). Sorry about that.

CPU shielding

# disable NMI watchdog
/sbin/sysctl -q -e -w "kernel.nmi_watchdog=0" 
# Restrict stopmachine to cpu 0
/sbin/sysctl -q -e -w "kernel.stopmachine_cpumask=0x1" 
# Restrict pagedrain to cpu 0
/sbin/sysctl -q -e -w "vm.pagedrain_cpumask=0x1" 
# Disable scheduler RT throttling
/sbin/sysctl -q -e -w "kernel.sched_rt_runtime_us=-1" 
# Disable softlockup detection.
/sbin/sysctl -q -e -w "kernel.softlockup_thresh=-1" 

# Set default IRQ affinity to cpu 0
echo 1 > /proc/irq/default_smp_affinity
# Set affinity for all active IRQs
for i in /proc/irq/[0-9]*; do
    echo 1 > $i/smp_affinity 2>/dev/null
done

# Mount cgroups
mkdir /cpusets
mount -t cgroup -ocpuset cpuset /cpusets

# Disable systemwide load balancing
cd /cpusets
echo 0 > cpuset.sched_load_balance

# Create system cpuset
mkdir system
echo 0 > system/cpuset.cpus
echo 0 > system/cpuset.mems
echo 0 > system/cpuset.sched_load_balance
echo 1 > system/cpuset.cpu_exclusive

# Move all tasks
for t in `cat tasks`; do
    echo $t > system/tasks
done

# Create NPT cpuset
mkdir npt
echo 1 > npt/cpuset.cpus
echo 0 > npt/cpuset.mems
echo 0 > npt/cpuset.sched_load_balance
echo 1 > npt/cpuset.cpu_exclusive

# Create LTTng cpuset
mkdir lttng
echo 2 > lttng/cpuset.cpus
echo 0 > lttng/cpuset.mems
echo 0 > lttng/cpuset.sched_load_balance
echo 1 > lttng/cpuset.cpu_exclusive
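The masks written above (kernel.stopmachine_cpumask=0x1, smp_affinity values) are per-CPU bitmasks in which bit N set means CPU N is allowed. A hypothetical helper, not part of the original script, makes the correspondence explicit:

```shell
# Hypothetical helper: compute the hex affinity bitmask for a single
# CPU number, as used in /proc/irq/*/smp_affinity and the *_cpumask
# sysctls above (bit N set = CPU N allowed).
cpu_mask() {
    printf '0x%x\n' $((1 << $1))
}
```

For instance, cpu_mask 0 gives 0x1 (CPU 0, the system CPU above) and cpu_mask 2 gives 0x4 (CPU 2, the LTTng cpuset).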

UST and kernel trace

We discovered that events can be dropped in UST traces but also in kernel traces, depending on the system configuration. In all the cases where events were dropped, kernel tracing was also active. I forgot to mention that in my bug report, as I didn't see that it could be linked. By doing more tests and deactivating it, we found that it was indeed linked.
In all cases, the test procedure was the following (in root/with sudo):

lttng create lttngrt
lttng enable-channel k -k --num-subbuf 32 --subbuf-size 1048576
lttng enable-event -c k -k -a
lttng enable-channel u -u --num-subbuf 32 --subbuf-size 1048576
lttng enable-event -c u -u "npt:loop","npt:start","npt:stop" 
./npt --eval-cpu-speed -l10000000
lttng destroy

This test procedure should be run in a shell placed in the npt cpuset:

echo $$ > /cpusets/npt/tasks

It should be run after starting a sessiond in another shell placed in the lttng cpuset:

echo $$ > /cpusets/lttng/tasks
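The two "echo $$ > .../tasks" steps can be factored into a small helper. This is a sketch assuming the cpuset hierarchy is mounted as in the shielding script above; the mount point is configurable in case it lives elsewhere:

```shell
# Hedged sketch: move the current shell into a named cpuset.
# CPUSET_ROOT defaults to /cpusets, the mount point used in the
# shielding script; override it if your hierarchy lives elsewhere.
join_cpuset() {
    echo $$ > "${CPUSET_ROOT:-/cpusets}/$1/tasks"
}

# e.g. in the sessiond shell:  join_cpuset lttng
#      in the test shell:      join_cpuset npt
```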

Here is the information about the different tests made and the results obtained.

Without CPU shielding

LTTng v2.1.0

No drops of events were encountered with this configuration.
Another test we made did not use the cli() and sti() calls; in that case we encountered event drops in the UST trace (but not in the kernel trace). The following tests therefore only use npt with active cli() and sti() calls, to analyze the LTTng regression.

LTTng-tools (bd2d5dd) & UST (34a91bd)

Here, the number of dropped events is always 8602021 for UST. No events are dropped for the kernel.

With CPU shielding

LTTng v2.1.0

No drops of events were encountered with this configuration.

LTTng-tools (bd2d5dd) & UST (34a91bd)

The drops described in the bug report were encountered here (8602021 for UST).
There were two situations in which we managed to avoid UST event drops:
  1. When we added a new CPU to the LTTng cpuset (echo 2-3 > lttng/cpuset.cpus); but then kernel events were dropped.
  2. When we used a much bigger sub-buffer size for the kernel channel (4M instead of 1M here). In this case, the kernel consumer never had to reuse the buffers after filling them once, since it had as much space as needed for the whole kernel trace of the npt run. No kernel events were dropped in this case either.
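For reference, the per-channel buffer capacities implied by these settings can be computed directly (sizes are per CPU):

```shell
# 32 sub-buffers of 1 MiB each, as in the enable-channel lines above:
echo $((32 * 1048576))    # 33554432 bytes = 32 MiB per channel
# With 4 MiB sub-buffers, as in the workaround in point 2:
echo $((32 * 4194304))    # 134217728 bytes = 128 MiB per channel
```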

The idea that came out of this is that the lttng consumerds were interfering with each other: when the kernel consumerd needs to run, the UST consumerd has to wait, so if the kernel consumerd cannot keep up, it simply monopolizes the CPU time. This situation seems weird to me, as npt does not generate any kernel events, and UST should not generate such events either.

Number of events expected for npt UST tracepoints

10000007 events are expected. This number breaks down as:
  • 1 tracepoint npt:start
  • 5 tracepoints npt:loop to prepare the npt work (before calculating loops time)
  • 10000000 tracepoints npt:loop (-l option)
  • 1 tracepoint npt:stop
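That breakdown can be checked arithmetically:

```shell
# Expected UST event count for ./npt --eval-cpu-speed -l10000000:
# 1 npt:start + 5 preparation npt:loop + 10000000 npt:loop + 1 npt:stop
echo $((1 + 5 + 10000000 + 1))    # 10000007
```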
Actions #6

Updated by Raphaël Beamonte about 11 years ago

With d1a996f5524540ebc8caa7c3c1bd3049018d212c (ust) and dec56f6cc894de41b312354d360b6a4c09fc199d (tools), and without the read-timer option, the behavior of LTTng seems OK.
But there are still many drops with the read-timer option.

Actions #7

Updated by Mathieu Desnoyers about 11 years ago

  • Status changed from Feedback to Resolved
  • Priority changed from Normal to Low

OK, so this appears more expected now: when using the --read-timer feature, there is a certain delay between the moment the data is ready to be read and the moment it is picked up by the consumerd. Losing more data than without --read-timer is therefore expected.

Moreover, the information you gave me indicates that the UST consumerd only shows starvation when a lot of kernel tracing is happening concurrently. So perhaps the kernel chooses not to run the UST consumerd timer-handling thread very often while a kernel consumerd is actively running (starvation).

It appears that the new feature (--read-timer) behaves differently from the synchronous wakeup, so this is no longer considered a regression.

We recently fixed a bug in lttng-tools where the default behavior of UST was to use a read timer of 200 microseconds. That was indeed a regression. It has been fixed in commit:

commit d92ff3ef25e70f4ffad582e24bfb581cfdcea04f
Author: David Goulet <dgoulet@efficios.com>
Date:   Fri Apr 12 10:21:51 2013 -0400

    Fix: use channel per domain default values

    This is related to bug497 but does not fix it. Using the command line and the
    lttng_channel_set_default_attr() API call, the default read and switch timer
    were not correct for UST and Kernel domain since we now use specific default
    values for each of them.

    Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: David Goulet <dgoulet@efficios.com>