Bug #415

closed

Incomplete CTF and metadata files after session has been stopped

Added by Jesus Garcia over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Critical
Assignee:
Target version:
Start date: 12/18/2012
Due date:
% Done: 0%
Estimated time:

Description

consumerd has not finished flushing all the metadata and CTF files to relayd, even after sessiond has replied to the lttng_stop_tracing call.
The scenario I'm running is as follows:
- 4-node cluster (SC-1/SC-2/PL-3/PL-4) with one sessiond/consumerd pair per node
- One relayd on node SC-1, which receives data from the consumerd on each of the 4 nodes
- 3 TestApps running at all times on each node (4 processes total, since one of the apps forks a child)
- One cluster-wide session (the same session is started on each node) is active for 10 seconds, with one event that produces 10 hits per second from one of the apps

The result is that most of the files have file size 0.
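To make the symptom easier to quantify between runs, the zero-size files can be listed with a small script. This is only a minimal sketch in Python: the function name `find_empty_files` and the idea of pointing it at the relayd output directory are assumptions for illustration, not part of LTTng or of our code.

```python
import os


def find_empty_files(trace_root):
    """Return sorted paths (relative to trace_root) of zero-byte regular files.

    trace_root is assumed to be the directory where relayd wrote the
    session's CTF and metadata files; the layout below it is not assumed.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(trace_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only count regular files whose size is exactly 0 bytes.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(os.path.relpath(path, trace_root))
    return sorted(empty)
```

Running it once right after lttng_stop_tracing returns and once after the delay shows which files were still empty at each point.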

For troubleshooting purposes only, I have introduced a 30-second delay in our code between the return of lttng_stop_tracing from sessiond and the killing of relayd. (We start a new relayd when each cluster-wide session is created and kill it when the session is stopped.) With the delay we get more data, but the data is still incomplete.
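As a troubleshooting alternative to a fixed delay, the output directory could be polled until the file sizes stop changing before relayd is killed. The sketch below is hypothetical: `snapshot_sizes`, `wait_until_stable`, and their parameters are made-up names, and unchanged sizes are only a heuristic; they do not prove that consumerd has finished flushing.

```python
import os
import time


def snapshot_sizes(trace_root):
    """Map every file path under trace_root to its current size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(trace_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                pass  # file disappeared between listing and stat
    return sizes


def wait_until_stable(trace_root, quiet_period=2.0, poll_interval=0.5, timeout=60.0):
    """Return True once no file size changed for quiet_period seconds.

    Returns False if timeout seconds elapse first. Heuristic only:
    stable sizes do not guarantee that all data has been flushed.
    """
    deadline = time.monotonic() + timeout
    last = snapshot_sizes(trace_root)
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        time.sleep(poll_interval)
        current = snapshot_sizes(trace_root)
        now = time.monotonic()
        if current != last:
            # Something grew or appeared: restart the quiet-period clock.
            last, last_change = current, now
        elif now - last_change >= quiet_period:
            return True
    return False
```

A False return after a long timeout would suggest that data is still arriving well after the stop call has returned.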

I have attached a log from the 30-second delay scenario. If required, I can also provide the log from the normal run without the delay, in which most of the files have zero size. The issue is easily reproducible. I have not yet tried verbose mode, but I will do so later and attach the output to this bug report.

I have assigned high priority to this bug because our code-freeze date is close.

Here is my software level:
CURRENT HEAD: foss/babeltrace d0acc96 (HEAD, origin/master, origin/HEAD) Provides a basic pkg-config file for libbabeltrace
CURRENT HEAD: foss/lttng-tools bb63afd (HEAD, origin/master, origin/HEAD) Fix: for librelayd, fix negative reply ret code
CURRENT HEAD: foss/lttng-ust 45f399e (HEAD, origin/master, origin/HEAD) Adapt internal files and examples to TRACEPOINT_INCLUDE
CURRENT HEAD: foss/userspace-rcu b1f69e8 (HEAD, origin/master, origin/HEAD) documentation: fix rcu-api.txt duplicates


Files

Incomplete_data_30sec_delay_before_killing_relayd.txt (59 KB), Jesus Garcia, 12/19/2012 12:24 AM
bug415_verbose_no_tc.txt (439 KB), Jesus Garcia, 12/19/2012 11:08 AM
metadata (4 KB), Jesus Garcia, 12/19/2012 11:08 AM
bug415_verbose_no_tc_40sec_delay.txt (208 KB), Jesus Garcia, 12/19/2012 12:27 PM
bug415-relayd.diff (514 Bytes), David Goulet, 12/19/2012 01:32 PM