Bug #1412
closed
Trace decoding fails for trace generated on aarch64 with ctf_sequence using _length_type of uint32_t
Added by Christophe Bedard 8 months ago.
Updated 7 months ago.
Description
Context:
- Tracepoint with a `ctf_sequence` field using a `_length_type` of `uint32_t`
- No need to actually pass any values to run into the issue (i.e., NULL and 0U)
- Changing the `_length_type` from `uint32_t` to `uint64_t` fixes the issue on aarch64, but both `uint32_t` and `uint64_t` work fine on an x86_64 system
- Issue surfaces when reading the trace, e.g., with `babeltrace` or Trace Compass (it seems that `babeltrace2` ignores the decoding error)
Reproducer: https://github.com/christophebedard/lttng-sequence-len-type-mwe
Environment info:
aarch64 system:
- CLI version: 2.13.12 (from `lttng --version`)
- LTTng-UST version: 2.13.5 (built from source; from /usr/include/lttng/ust-version.h)
- `uname -a`: Linux host 5.10.120-rt70-tegra #1 SMP PREEMPT RT Mon Apr 1 23:53:36 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
- `nproc`: 8
x86_64 system:
- CLI version: 2.13.10 (from `lttng --version`)
- LTTng-UST version: 2.13.7 (from LTTng stable-2.13 Ubuntu PPA; according to /usr/include/x86_64-linux-gnu/lttng/ust-version.h)
- `uname -a`: Linux host 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
- `nproc`: 16
This seems similar to https://review.lttng.org/c/lttng-ust/+/12122, but the length type here is already an unsigned integer (although not literally `unsigned int`).
Please attach the resulting trace binary file and metadata to this ticket.
Also, can you reproduce with updated versions of lttng-ust and lttng-tools 2.13?
I've attached the following traces:
- issue1412_mwe-20240410-211856_2-13-5.tar.xz: generated on the aarch64 system that I described in the issue description above (lttng-ust 2.13.5)
- issue1412_mwe-20240410-213544_2-13-7-9261aea.tar.xz: generated on the same aarch64 system, but with lttng-ust built from source using stable-2.13 (latest commit: 9261aea)
The second trace (with the latest lttng-ust version) leads to a decoding issue too.
Sorry, I re-read your comment and realized you were asking me to try using the latest version of lttng-tools too.
- issue1412_mwe-20240410-220758_ust2-13-7-9261aea_tools2-13-13-e6681c6.tar.xz: same as trace number 2 above, but with lttng-tools built from source using stable-2.13 (latest commit: e6681c6 aka 2.13.13)
Same issue.
Which version of Babeltrace do you use to reproduce? Does it reproduce with Babeltrace 2?
I suspect an issue with the handling of alignment for 0-length sequences in Babeltrace 1.5 and Trace Compass. This does not reproduce on x86-64 because everything is packed on that architecture (efficient unaligned accesses).
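To illustrate the suspected failure mode, here is a minimal sketch of how a reader's offset can drift after an empty sequence. It is not the Babeltrace or Trace Compass code: the helper names, the `align_when_empty` flag, and the layout (a 4-byte length field followed by 8-byte-aligned elements) are assumptions made for the example. It also assumes that the tracer writes the alignment padding even when the length is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `align` (must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model of a trace reader: given the byte offset just after
 * the sequence's length field, return the offset of the next field.
 *
 * align_when_empty != 0: padding is skipped even for a 0-length sequence.
 * align_when_empty == 0: padding is only skipped when there is at least
 *                        one element (the suspected faulty behaviour).
 */
static size_t offset_after_sequence(size_t offset, size_t len,
                                    size_t elem_size, size_t elem_align,
                                    int align_when_empty)
{
    if (len > 0 || align_when_empty)
        offset = align_up(offset, elem_align);
    return offset + len * elem_size;
}
```

With a `uint32_t` length field starting at offset 0, `offset_after_sequence(4, 0, 8, 8, 1)` yields 8 while `offset_after_sequence(4, 0, 8, 8, 0)` yields 4, so the second reader would decode the following field from padding bytes. With a `uint64_t` length field the starting offset is already 8 and both variants agree, which would be consistent with the workaround described above. On a packed layout (alignment of 1) the two variants never differ.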
I reproduced on x86-64 by applying this diff to lttng-ust.
Patch fixing the issue for Babeltrace 1.5. It is EOL, so do not expect any release with this fix. Please use Babeltrace 2 instead.
Which version of Trace Compass did you try it with? Can you try with the latest version to see if there is still an issue?
babeltrace2 (2.0.5) indeed works fine. I'm of course not expecting a fix/release for babeltrace 1.5!
I can confirm that changing the reproducer to trigger the tracepoint with a non-empty array indeed works fine (babeltrace 1.5 and babeltrace2).
I was using babeltrace (1.5.8) because it seemed to have the same issue as Trace Compass (8.2.0), so it was easier to reproduce. I tried with the latest Trace Compass version (9.3.0) and I still get the error. Since I understand that this is a parsing issue and not an lttng-ust or lttng-tools issue, I think we can close this issue. I'll try to fix Trace Compass.
Thank you for your time!
- Status changed from New to Resolved
Marking as resolved on the LTTng-UST side.
This ends up being an issue with 0-length arrays/sequences alignment handling (for instance generated by the aarch64 architecture) affecting Babeltrace 1.5 (now EOL) and Trace Compass. Babeltrace 2 works fine.