Bug #1438: Big memory usage when RLIMIT_NOFILE is high

Added by Romain Reignier about 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
Start date: 04/01/2026
Due date:
% Done: 0%
Estimated time:
Description

Dear lttng community,

I am not a direct lttng user but I use a library (ROS 2) that links to lttng-ust for tracing.

First, after searching the lttng website, mailing list and source code to investigate my issue, I wanted to congratulate you on the nice naming scheme of the lttng versions, I really like it! :)

While using ROS 2 applications in a Docker container on Debian 13, I noticed high memory consumption (> 140 MB per process), while my colleagues on Ubuntu 24.04 see only 15 MB per process.

After an analysis with heaptrack, it appears that lttng_ust_fd_tracker_init() allocates a 128 MB buffer.

From the lttng_ust_fd_tracker_init() source code, it is quite clear that the malloc() size depends on rlim.rlim_max.

The rlim.rlim_max value is 1073741816 in my environment, and 1073741816 / 8 = 134217727 bytes, i.e. about 128 MiB.
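The arithmetic above can be checked with a short sketch. It assumes the buffer holds one bit per possible file descriptor, which is what the division by 8 implies; the helper names `fd_bitmap_bytes` and `to_mib` are made up for illustration and are not taken from the lttng-ust source.

```python
BITS_PER_BYTE = 8


def fd_bitmap_bytes(max_fds):
    """Bytes needed for a bitmap with one bit per possible descriptor.

    Assumption: the allocation scales as max_fds / 8, as in the report.
    """
    return max_fds // BITS_PER_BYTE


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


# Hard limits quoted in this report.
HOST_HARD_LIMIT = 524288
DOCKER_HARD_LIMIT = 1073741816

for label, limit in (("host", HOST_HARD_LIMIT), ("docker", DOCKER_HARD_LIMIT)):
    size = fd_bitmap_bytes(limit)
    print(f"{label}: {limit} fds -> {size} bytes ({to_mib(size):.2f} MiB)")
```

Under that assumption, the host limit of 524288 costs only 64 KiB, while the Docker value costs roughly 2048 times more.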

This huge value has already caused a lot of issues in several software projects.

For context, systemd v240 raised the limit in 2018. Debian added a build flag to avoid the increase, but removed this flag for systemd 256, shipped with Debian 13.
On the host OS, systemd limits the value to 524288 by default through a meson variable, but Docker sets it to unlimited.

$ ulimit -Hn
524288
$ docker run --rm ubuntu:noble bash -c "ulimit -Hn" 
1073741816
$ docker run --rm ubuntu:noble bash -c "sysctl fs.nr_open" 
fs.nr_open = 1073741816

I know that there are several workarounds:
  • ROS 2 can use dlopen() to avoid always linking against lttng (this has been done recently for another reason in this PR)
  • The Docker daemon or an individual Docker container can be started with a limit on the nofile ulimit. Doc here and here

But sizing a memory allocation from fs.nr_open seems like bad practice and should be fixed.

Several projects have fixed this, such as MariaDB here, after it was discussed here.

There are several resources online about this kind of issue:

I do not know lttng well enough to try to provide a patch.

MariaDB's solution of capping the fd_set size at a reasonable maximum might be a quick fix, but I don't know what implications this would have for lttng.
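To make the capping idea concrete, here is a minimal sketch of the sizing logic. It is not MariaDB's or lttng-ust's actual code: `MAX_TRACKED_FDS`, `capped_fd_limit` and `tracker_bytes` are hypothetical names, the cap value is arbitrary, and the one-bit-per-descriptor sizing is an assumption based on the division by 8 above.

```python
import resource

# Hypothetical ceiling, chosen only for illustration.
MAX_TRACKED_FDS = 65536


def capped_fd_limit(hard_limit, cap=MAX_TRACKED_FDS):
    """Return the descriptor count to size for, never more than cap."""
    if hard_limit == resource.RLIM_INFINITY or hard_limit > cap:
        return cap
    return hard_limit


def tracker_bytes(hard_limit, cap=MAX_TRACKED_FDS):
    """Bitmap size in bytes, assuming one bit per tracked descriptor."""
    return capped_fd_limit(hard_limit, cap) // 8


# Compare the uncapped and capped sizes for the current process.
_, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"hard limit: {hard}")
print(f"capped table size: {tracker_bytes(hard)} bytes")
```

With this cap, the Docker hard limit of 1073741816 would lead to an 8 KiB table instead of about 128 MiB. The open question remains: descriptors numbered above the cap would fall outside the table and would need separate handling.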

Steps to reproduce

We use the lttng-ust demo easy-ust as an example executable that uses lttng.

Start the sample executable with prlimit to set the nofile limits to the defaults in my environment (Docker container on Debian 13): 1024 as the soft limit and 1073741816 as the hard limit:

sudo prlimit --nofile=1024:1073741816 ./sample

Then check the RSS with ps aux:

$ ps aux | grep sample

root         540  4.8  0.4 150920 133596 pts/13  Sl+  10:10   0:00 ./sample

The RSS value is 133596 KiB, i.e. 130.46 MiB!

#1 Updated by Kienan Stewart about 2 months ago

Hi Romain,

thanks for the very clear report! Some folks here are taking a look at this.

thanks,
kienan
