Bug #1411

closed

Memory leak when relay daemon exits before application starts

Added by Mikael Beckius 10 months ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Target version:
Start date: 03/13/2024
Due date:
% Done: 0%
Estimated time:

Description

When the relay daemon is shut down after a live session has been created but before any application is started, the shared memory allocated for tracing appears to remain allocated, and new memory is allocated on every application start.

How to reproduce:
host:~# lttng create micke --live
Spawning a session daemon
Spawning a relayd daemon
Live session micke created.
Traces will be output to tcp4://127.0.0.1:5342/ [data: 5343]
Live timer interval set to 1000000 us

host:~# lttng enable-event --userspace --all
All ust events are enabled in channel channel0

host:~# lttng start
Tracing started for session micke

host:~# killall -9 lttng-relayd

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       207Mi        15Gi       572Ki       320Mi        15Gi
Swap:            0B          0B          0B

host:~# ./micke-lttng
Mikael LTTNG 2015 - Starting
Mikael LTTNG 2015 - Signing out

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       248Mi        15Gi        40Mi       360Mi        15Gi
Swap:            0B          0B          0B

host:~# ./micke-lttng
Mikael LTTNG 2015 - Starting
Mikael LTTNG 2015 - Signing out

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       288Mi        15Gi        80Mi       400Mi        15Gi
Swap:            0B          0B          0B

host:~# lttng destroy micke
Destroying session micke..
Session micke destroyed

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       289Mi        15Gi        80Mi       400Mi        15Gi
Swap:            0B          0B          0B
host:~#
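
The growth of the "shared" column in the transcript above can be quantified with a small script. This is only a sketch for reading the numbers: the helper names (parse_size, shared_bytes, shared_growth) are made up for illustration, and the input lines are copied from the `free -h` output above rather than measured anew.

```python
import re

# Binary unit suffixes as printed by `free -h` (e.g. "572Ki", "40Mi", "15Gi").
_UNITS = {"B": 1, "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_size(text):
    """Convert a `free -h` style value such as "40Mi" or "0B" to bytes."""
    match = re.fullmatch(r"([0-9]+(?:\.[0-9]+)?)(B|Ki|Mi|Gi|Ti)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def shared_bytes(mem_line):
    """Return the 'shared' column of a `Mem:` line, in bytes."""
    # Fields: Mem:, total, used, free, shared, buff/cache, available
    fields = mem_line.split()
    if len(fields) != 7 or fields[0] != "Mem:":
        raise ValueError(f"not a Mem: line: {mem_line!r}")
    return parse_size(fields[4])


def shared_growth(mem_lines):
    """Return the change in 'shared' between consecutive samples, in bytes."""
    values = [shared_bytes(line) for line in mem_lines]
    return [later - earlier for earlier, later in zip(values, values[1:])]


# The four `Mem:` lines from the transcript above, in order:
# before the first run, after each of the two runs, after `lttng destroy`.
SAMPLES = [
    "Mem: 15Gi 207Mi 15Gi 572Ki 320Mi 15Gi",
    "Mem: 15Gi 248Mi 15Gi 40Mi 360Mi 15Gi",
    "Mem: 15Gi 288Mi 15Gi 80Mi 400Mi 15Gi",
    "Mem: 15Gi 289Mi 15Gi 80Mi 400Mi 15Gi",
]

if __name__ == "__main__":
    for delta in shared_growth(SAMPLES):
        print(f"{delta / 1024**2:+.1f} MiB")
```

Read this way, each application run adds roughly 40 MiB of shared memory, and destroying the session does not give any of it back.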

Version:
lttng-tools 2.13.11

Analysis:
It seems that when the first application of a session starts after the relay daemon has been shut down, a failure to transfer streams to the relay daemon triggers a cleanup through a call to ust_consumer_destroy_channel. However, the cleanup appears to be incomplete and the channel reference count remains incremented. Decrementing the reference count appears to be blocked in clean_channel_stream_list by the line stream->monitor = 0; which prevents CONSUMER_CHANNEL_DEL from reaching consumer_del_channel(chan);
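
To make the suspected mechanism concrete, here is a toy model of it. It is not lttng-tools code: the classes and functions (Channel, Stream, destroy_stream, clean_stream_list, simulate) are hypothetical, and the model simply assumes, as the analysis above suggests, that a stream only drops its channel reference while it is still flagged as monitored.

```python
class Channel:
    """Toy channel holding one reference per attached stream (model assumption)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.shm_released = False

    def release_if_unused(self):
        # The shared memory is only given back once nothing references the channel.
        if self.refcount == 0:
            self.shm_released = True


class Stream:
    def __init__(self, channel):
        self.channel = channel
        self.monitor = True
        channel.refcount += 1


def destroy_stream(stream):
    """Hypothetical stand-in: only monitored streams drop their channel reference."""
    if stream.monitor:
        stream.channel.refcount -= 1


def clean_stream_list(streams, clear_monitor_flag):
    """Destroy every stream, optionally clearing the monitor flag first."""
    for stream in streams:
        if clear_monitor_flag:
            stream.monitor = False  # mirrors `stream->monitor = 0;`
        destroy_stream(stream)


def simulate(clear_monitor_flag, stream_count=4):
    """Return (remaining refcount, whether shared memory was released)."""
    channel = Channel("channel0")
    streams = [Stream(channel) for _ in range(stream_count)]
    clean_stream_list(streams, clear_monitor_flag)
    channel.release_if_unused()
    return channel.refcount, channel.shm_released


if __name__ == "__main__":
    print("flag cleared:", simulate(clear_monitor_flag=True))
    print("flag kept:   ", simulate(clear_monitor_flag=False))
```

In this model, clearing the flag before destruction leaves every reference in place, so the channel is never released. That matches the symptom of shared memory accumulating on each application start, though the real code paths may differ.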

Information has it that this problem is NOT reproduced on 2.13 but I haven't tested that myself

Actions #1

Updated by Kienan Stewart 10 months ago

  • Status changed from New to Feedback
  • Assignee set to Mikael Beckius

Hi Mikael,

Could you clarify what "./micke-lttng" is?

Could you also clarify which versions of lttng-tools, lttng-ust, and liburcu are being used?

> Information has it that this problem is NOT reproduced on 2.13 but I haven't tested that myself

Earlier you wrote:

> Version:
> lttng-tools 2.13.11

thanks,
kienan

Actions #2

Updated by Mikael Beckius 10 months ago

I meant to mention that the test application is just a simple app that emits an LTTng-UST tracepoint:
lttng_ust_tracepoint(micke_lttng, micke_tracepoint_count, timeSinceBoot, "Time since boot: ");

I forgot to include that, and I couldn't find a way to edit the description.

> could you also clarify which versions of lttng-tools, lttng-ust, and liburcu are being used?

> Information has it that this problem is NOT reproduced on 2.13 but I haven't tested that myself

Ah, that's a typo. It should say 2.12.

UST: 2.13.6
URCU: 0.13.3

Micke

Actions #3

Updated by Jérémie Galarneau 9 months ago

  • Assignee changed from Mikael Beckius to Jérémie Galarneau
  • Target version set to 2.13

Hi Mikael,

Can you test this fix on your end?

Thanks!
Jérémie

Actions #4

Updated by Mikael Beckius 9 months ago

Hello Jérémie!

Removing that line was the first thing I tried after understanding the logic of the cleanup. Just to be sure, here is the result after updating lttng-tools on a clean build:
host:~# lttng create micke --live
Spawning a session daemon
Spawning a relayd daemon
Live session micke created.
Traces will be output to tcp4://127.0.0.1:5342/ [data: 5343]
Live timer interval set to 1000000 us

host:~# lttng enable-event --userspace --all
All ust events are enabled in channel channel0

host:~# lttng start
Tracing started for session micke

host:~# killall -9 lttng-relayd

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       133Mi        15Gi       0.0Ki       444Mi        15Gi
Swap:            0B          0B          0B

host:~# ./micke-lttng
Mikael LTTNG 2015 - Starting
Mikael LTTNG 2015 - Signing out

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       132Mi        15Gi       0.0Ki       444Mi        15Gi
Swap:            0B          0B          0B

host:~# ./micke-lttng
Mikael LTTNG 2015 - Starting
Mikael LTTNG 2015 - Signing out

host:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       132Mi        15Gi       0.0Ki       444Mi        15Gi
Swap:            0B          0B          0B

host:~#

Now "shared" is always at zero and "used" remains the same.

Micke

Actions #5

Updated by Jérémie Galarneau 9 months ago

That's good to hear, thanks for confirming!

Actions #6

Updated by Erica Bugden 8 months ago

  • Status changed from Feedback to Resolved