Project

General

Profile

Bug #1274

"[error] Event id ____ is outside range." when reading a trace

Added by Christophe Bedard 13 days ago. Updated 7 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Start date:
06/23/2020
Due date:
% Done:

0%

Estimated time:

Description

I'm using custom LTTng branches off of stable-2.11 inside a Ubuntu 18.04 Docker container (5.3.0-53-generic).

This worked fine with commit 6f84b7ea107c913475e789ab1c86c254c0e35e0d from the custom lttng-ust branch above, but the next commit defines a lot more events, and seems to break something. I looked over the last commit but I don't see any problems.

I've attached a userspace trace that is generated when LD_PRELOADing the custom liblttng-ust-libc-wrapper.so and running an application. Reading it with babeltrace prints some events but stops at some point and gives the following output:

[error] Event id 19144 is outside range.
[error] Reading event failed.
Error printing trace.

in case it's helpful, babeltrace2 also errors out (although I'm not using bt2 and currently cannot use it):

06-23 10:40:15.487 24237 24237 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1415 [muxer] Cannot get next message: comp-addr=0xb85980, muxer-comp-addr=0xb7f2e0, muxer-msg-iter-addr=0xb80d60, msg-iter-addr=0xb80c80, status=ERROR
06-23 10:40:15.487 24237 24237 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0xb80c80, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
06-23 10:40:15.487 24237 24237 W LIB/GRAPH consume_graph_sink@graph.c:473 Component's "consume" method failed: status=ERROR, comp-addr=0xb7f4f0, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0xa2d230, comp-class-so-handle-path="/usr/lib/x86_64-linux-gnu/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
06-23 10:40:15.487 24237 24237 E CLI cmd_run@babeltrace2.c:2548 Graph failed to complete successfully

ERROR:    [Babeltrace CLI] (babeltrace2.c:2548)
  Graph failed to complete successfully
CAUSED BY [libbabeltrace2] (graph.c:473)
  Component's "consume" method failed: status=ERROR, comp-addr=0xb7f4f0, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty",
  comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0xa2d230,
  comp-class-so-handle-path="/usr/lib/x86_64-linux-gnu/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
CAUSED BY [libbabeltrace2] (iterator.c:864)
  Component input port message iterator's "next" method failed: iter-addr=0xb80c80, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING,
  iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
  iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:991)
  Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0xb80d60, muxer-upstream-msg-iter-wrap-addr=0xb82450
CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:454)
  Upstream iterator's next method returned an error: status=ERROR
CAUSED BY [libbabeltrace2] (iterator.c:864)
  Component input port message iterator's "next" method failed: iter-addr=0xb81f00, iter-upstream-comp-name="auto-disc-source-ctf-fs", iter-upstream-comp-log-level=WARNING,
  iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="fs", iter-upstream-comp-class-partial-descr="Read CTF traces from the file sy", iter-upstream-port-type=OUTPUT,
  iter-upstream-port-name="06ec4ff1-bba0-4773-a749-ef866a19f1d7 | 0 | 11", status=ERROR
CAUSED BY [auto-disc-source-ctf-fs (06ec4ff1-bba0-4773-a749-ef866a19f1d7 | 0 | 11): 'source.ctf.fs'] (fs.c:109)
  Failed to get next message from CTF message iterator.
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (msg-iter.c:2899)
  Cannot handle state: msg-it-addr=0xb820f0, state=AFTER_EVENT_HEADER
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (msg-iter.c:1282)
  No event class with ID of event class ID to use in stream class: msg-it-addr=0xb820f0, stream-class-id=0, event-class-id=19144, trace-class-addr=0xa23f50

Perhaps this is unrelated, but the metadata file looks weird: event IDs go from 0 to 31 then reset to 17 and go up to 23, all on the same stream (+ there is an event that has its metadata cut in two).

I assume this is more of a lttng-ust problem rather than a babeltrace problem, but this is where I encounter it. There are a lot of similarities with bug #648, but since that one is quite old and was closed without any fix, here is another issue.


Files

ust_trace.tar.xz (31.6 KB) ust_trace.tar.xz Christophe Bedard, 06/23/2020 10:38 AM
#1

Updated by Christophe Bedard 13 days ago

using commit 6f84b7ea107c913475e789ab1c86c254c0e35e0d instead of 269ea68b2c01d21c7711d0376b89989389301675 for lttng-ust (from this branch: https://github.com/ApexAI/lttng-ust/commits/ust-libc-wrapper-additions) indeed works, so perhaps it has something to do with having a lot of events (30 total events in the modified lttng_ust_libc_wrapper shared lib).

#2

Updated by Mathieu Desnoyers 13 days ago

The usual cause of this kind of issue is a corrupted trace. Can you reproduce without the custom commits on top of our supported stable lttng-ust branch ?

#3

Updated by Christophe Bedard 13 days ago

i.e. without the added callstack userspace context but with the added libc wrapper tracepoints (since this should be a normal use of lttng-ust and could be split out)?

#4

Updated by Mathieu Desnoyers 13 days ago

Yes, this is what I mean.

#5

Updated by Christophe Bedard 7 days ago

That seems to work. But from what I understand, the custom commits can work if I do not have too many events (either simply tracepoints that are enabled or events that get generated, I'm not sure).

Would you mind explaining (just briefly) how the custom commits might corrupt a trace? I'm just wondering if it's a defect/bug in the custom code, or if it's a limitation of lttng, etc. I'm just trying to figure out what my next steps should be, given that I'm using an unsupported/custom feature and that I kind of need it.

#6

Updated by Mathieu Desnoyers 7 days ago

The first thing to do is to double-check the custom commits to ensure there is no mismatch between the field description layout (which ends up in the metadata) and the field size/alignment. The field alignment would be the first thing to double-check, as it's forced to 1 byte on x86 (which is very likely the only architecture on which the custom callstack feature was tested), so alignment issues don't show up.

On what architecture are you reproducing this issue ?

#7

Updated by Christophe Bedard 7 days ago

I'm on an x86_64/amd64 system, so I assume it should be fine, but I'll double-check the layout.

Also available in: Atom PDF