Bug #1412

closed

Trace decoding fails for trace generated on aarch64 with ctf_sequence using _length_type of uint32_t

Added by Christophe Bedard about 2 months ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
Start date: 04/05/2024
Due date:
% Done: 0%
Estimated time:
Description

Context:

  1. Tracepoint with a `ctf_sequence` field using a `_length_type` of `uint32_t`
  2. No need to actually pass any values to run into the issue (i.e., NULL and 0U)
  3. Changing the `_length_type` from `uint32_t` to `uint64_t` fixes the issue on aarch64, but both `uint32_t` and `uint64_t` work fine on an x86_64 system
  4. Issue surfaces when reading the trace, e.g., with `babeltrace` or Trace Compass (with `babeltrace2`, the decoding error seems to be ignored)

Reproducer: https://github.com/christophebedard/lttng-sequence-len-type-mwe

Environment info:

aarch64 system:
  1. CLI version: 2.13.12 (from `lttng --version`)
  2. LTTng-UST version: 2.13.5 (built from source; from /usr/include/lttng/ust-version.h)
  3. `uname -a`: Linux host 5.10.120-rt70-tegra #1 SMP PREEMPT RT Mon Apr 1 23:53:36 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
  4. `nproc`: 8
x86_64 system:
  1. CLI version: 2.13.10 (from `lttng --version`)
  2. LTTng-UST version: 2.13.7 (from LTTng stable-2.13 Ubuntu PPA; according to /usr/include/x86_64-linux-gnu/lttng/ust-version.h)
  3. `uname -a`: Linux host 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  4. `nproc`: 16

This seems similar to https://review.lttng.org/c/lttng-ust/+/12122, but the length type here is already an unsigned integer type (`uint32_t`, although not literally `unsigned int`).



Actions #1

Updated by Mathieu Desnoyers about 1 month ago

Please attach the resulting trace binary file and metadata to this ticket.

Actions #2

Updated by Mathieu Desnoyers about 1 month ago

Also, can you reproduce with updated versions of lttng-ust and lttng-tools 2.13?

Actions #3

Updated by Christophe Bedard about 1 month ago

I've attached the following traces:

  1. issue1412_mwe-20240410-211856_2-13-5.tar.xz: generated on the aarch64 system that I described in the issue description above (lttng-ust 2.13.5)
  2. issue1412_mwe-20240410-213544_2-13-7-9261aea.tar.xz: generated on the same aarch64 system, but with lttng-ust built from source using stable-2.13 (latest commit: 9261aea)

The second trace (with the latest lttng-ust version) leads to a decoding issue too.

Actions #4

Updated by Christophe Bedard about 1 month ago

Sorry, I re-read your comment and realized you were asking me to try using the latest version of lttng-tools too.

  1. issue1412_mwe-20240410-220758_ust2-13-7-9261aea_tools2-13-13-e6681c6.tar.xz: same as trace number 2 above, but with lttng-tools built from source using stable-2.13 (latest commit: e6681c6 aka 2.13.13)

Same issue.

Actions #5

Updated by Mathieu Desnoyers about 1 month ago

Which version of Babeltrace do you use to reproduce? Does it reproduce with Babeltrace 2?

I suspect an issue with handling of alignment of 0-length sequences in Babeltrace 1.5 and Trace Compass. This does not reproduce on x86-64 because everything is packed on that architecture (efficient unaligned accesses).
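To make the alignment argument concrete, here is a minimal sketch (not code from LTTng-UST or Babeltrace) that computes the padding between the length field and the first sequence element. The function names are hypothetical, and the 8-byte element alignment is an assumption, since the ticket does not state the element type of the sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `align` (must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Number of padding bytes between the end of the length field and the
 * first sequence element.
 *
 * start:      offset at which the length field would be written
 * len_size:   size of the length type in bytes (4 for uint32_t, 8 for uint64_t)
 * elem_align: natural alignment of the sequence element type (assumed)
 * packed:     non-zero models a tracer that packs every field on a byte
 *             boundary, as described above for x86-64
 */
size_t padding_after_length(size_t start, size_t len_size,
                            size_t elem_align, int packed)
{
    size_t len_align = packed ? 1 : len_size;
    size_t seq_align = packed ? 1 : elem_align;
    size_t after_len = align_up(start, len_align) + len_size;

    return align_up(after_len, seq_align) - after_len;
}
```

Under these assumptions, `padding_after_length(0, 4, 8, 0)` yields 4 bytes of padding, while a `uint64_t` length (`len_size` of 8) or a packed layout yields 0. That would be consistent with the problem appearing only on aarch64 with a `uint32_t` length type: a reader has to account for that padding even when the sequence is empty.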

Actions #6

Updated by Mathieu Desnoyers about 1 month ago

I reproduced on x86-64 by applying this diff to lttng-ust.

Actions #7

Updated by Mathieu Desnoyers about 1 month ago

Patch fixing the issue for Babeltrace 1.5. It is EOL, so do not expect any release with this fix. Please use Babeltrace 2 instead.

Actions #8

Updated by Mathieu Desnoyers about 1 month ago

Which version of Trace Compass did you try it with? Can you try with the latest version to see if there is still an issue?

Actions #9

Updated by Christophe Bedard about 1 month ago

babeltrace2 (2.0.5) indeed works fine. I'm of course not expecting a fix/release for babeltrace 1.5!

I can confirm that changing the reproducer to trigger the tracepoint with a non-empty array indeed works fine (babeltrace 1.5 and babeltrace2).

I was using babeltrace (1.5.8) because it seemed to have the same issue as Trace Compass (8.2.0), so it was easier to reproduce. I tried with the latest Trace Compass version (9.3.0) and I still get the error. Since I understand that this is a parsing issue and not an lttng-ust or lttng-tools issue, I think we can close this issue. I'll try to fix Trace Compass.

Thank you for your time!

Actions #10

Updated by Mathieu Desnoyers about 1 month ago

  • Status changed from New to Resolved

Marking as resolved on the LTTng-UST side.

This ends up being an issue with how alignment is handled for 0-length arrays/sequences (for instance, in traces generated on the aarch64 architecture). It affects Babeltrace 1.5 (now EOL) and Trace Compass. Babeltrace 2 works fine.
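The failure mode can be illustrated with a small self-contained sketch. It is not the Babeltrace or Trace Compass code. It assumes a hypothetical payload layout (a `uint32_t` length, a sequence of 8-byte-aligned `uint64_t` elements, then a `uint32_t` marker field) and assumes that the faulty reader skips the alignment step when the sequence is empty. All function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round `offset` up to the next multiple of `align` (must be a power of two). */
size_t round_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Read a native-endian uint32_t at byte offset `off`. */
uint32_t read_u32(const uint8_t *buf, size_t off)
{
    uint32_t v;
    memcpy(&v, buf + off, sizeof(v));
    return v;
}

/*
 * Write the hypothetical payload the way an aligning tracer would:
 * the padding before the sequence is emitted even when len == 0.
 * The caller provides a buffer that is large enough.
 * Returns the number of bytes written.
 */
size_t encode_payload(uint8_t *buf, uint32_t len,
                      const uint64_t *elems, uint32_t marker)
{
    size_t off = 0;
    size_t aligned;

    memcpy(buf + off, &len, sizeof(len));
    off += sizeof(len);

    aligned = round_up(off, 8);
    memset(buf + off, 0, aligned - off);   /* zero-filled padding */
    off = aligned;

    if (len > 0) {
        memcpy(buf + off, elems, (size_t)len * sizeof(uint64_t));
        off += (size_t)len * sizeof(uint64_t);
    }

    memcpy(buf + off, &marker, sizeof(marker));
    return off + sizeof(marker);
}

/*
 * Decode the marker field that follows the sequence.
 * With align_empty_sequence == 0, the reader only aligns when there is
 * at least one element (the assumed faulty behavior).
 */
uint32_t decode_marker(const uint8_t *buf, int align_empty_sequence)
{
    size_t off = 0;
    uint32_t len = read_u32(buf, off);

    off += sizeof(uint32_t);
    if (len > 0 || align_empty_sequence)
        off = round_up(off, 8);
    off += (size_t)len * sizeof(uint64_t);

    return read_u32(buf, off);
}
```

With an empty sequence, the reader that skips alignment reads the padding bytes instead of the marker, so every later field is decoded from the wrong offset. With a non-empty sequence both readers agree, which matches the observation above that triggering the tracepoint with a non-empty array works fine.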
