Bug #539
closed
lttng-tools 2.2.0rc2: Possible memory leak from sessionD?
Start date: 05/22/2013
% Done: 0%
Description
Commit used:
============
userspace-rcu : 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize side-effect in 0.7
lttng-ust : 352fce3 (HEAD, origin/master, origin/HEAD) Remove 0.x TODO
lttng-tools : b31398b (HEAD, origin/master, origin/HEAD) Fix: increment UST channel refcount at stream creation
babeltrace : 9eaf254 (HEAD, tag: v1.0.3, origin/stable-1.0) Version 1.0.3
Problem Description:
====================
* There seems to be a memory leak in sessionD when many short-lived instrumented apps are launched
over a period of time. At any given time, about 700 instances of the instrumented app are present;
each instance has a lifespan of only 3 seconds.
After a run of about 3 minutes, the value in the "%MEM" column for sessionD increases
from 0.0 to 0.4:
[SC-1:Xen43 Wed May 22 06:29:29/cluster/temp/tdlt] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |grep -i sessiond
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13616 root 20 0 85160 872 604 S 0 0.0 0:00.00 lttng-sessiond
:
:
[SC-1:Xen43 Wed May 22 06:32:34/cluster/temp/tdlt] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |grep -i sessiond
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13616 root 20 0 190m 7292 688 S 7 0.4 0:05.73 lttng-sessiond
:
:
[SC-1:Xen43 Wed May 22 06:37:19/cluster/temp/tdlt] # top
top - 06:37:25 up 1 day, 16:20, 2 users, load average: 2.60, 3.14, 1.55
Tasks: 187 total, 1 running, 186 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.5%sy, 0.0%ni, 99.2%id, 0.1%wa, 0.1%hi, 0.0%si, 0.0%st
Mem: 2049828k total, 210928k used, 1838900k free, 2808k buffers
Swap: 4200992k total, 784808k used, 3416184k free, 42348k cached
:
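To quantify the growth more precisely than top's rounded "%MEM" column, the resident set size can be sampled straight from /proc. A minimal sketch, assuming Linux and that pidof resolves the daemon name; the helper name, sample count, and interval are illustrative, not from the original report:

```shell
# sample_rss PID COUNT INTERVAL: print the VmRSS of PID, COUNT
# times, sleeping INTERVAL seconds between samples.
sample_rss() {
    pid=$1; count=$2; interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # VmRSS in /proc/PID/status is the resident set size in kB.
        awk '/^VmRSS:/ { print $2 " kB" }' "/proc/$pid/status"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: watch sessiond's RSS ten times, one minute apart.
# sample_rss "$(pidof lttng-sessiond)" 10 60
```

A steadily rising VmRSS with no instrumented apps left running would point at the same leak as the %MEM numbers above.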
* On a different machine, we observed the following (note that the total memory size of
this machine differs from the one above):
[SC-1:Node16 Tue May 21 13:22:33/cluster/temp/tdlt/Stability/May12] # top
top - 13:23:15 up 11 days, 4:41, 2 users, load average: 0.00, 0.01, 0.05
Tasks: 288 total, 1 running, 287 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24149M total, 3461M used, 20687M free, 372M buffers
Swap: 4102M total, 0M used, 4102M free, 2411M cached
:
[SC-1:Node16 Tue May 21 13:32:09/cluster/temp/tdlt/Stability/May12] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |egrep -i "lttng|trace"
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15012 root 20 0 311m 137m 912 S 0 0.6 2:46.43 lttng-sessiond
15373 root 20 0 82128 1484 956 S 0 0.0 1:06.09 lttng-consumerd
18322 root 20 0 42264 2268 1808 S 0 0.0 1:11.81 TraceEa
18342 root 20 0 42264 912 444 S 0 0.0 0:45.82 TraceEa
20874 root 20 0 190m 3808 2596 S 0 0.0 0:40.07 trace_c
20880 root 20 0 47220 2948 2040 S 0 0.0 0:10.89 trace_p
Is the problem reproducible?
============================
* Yes.
How to reproduce (if reproducible):
===================================
1)_ Kill the current sessionD and launch a new one.
2)_ top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 | grep -i sessiond
3)_ lttng list -u; lttng list #--- no instrumented app running, no session available
4)_ Run multiple instances of the instrumented app (each instance has a lifespan of only 3 seconds)
so that around 700 instances are present at any time.
ex:
for a in $(seq 1 1234); do (for n in $(seq 1 100); do (/home/test_apps/TestApp_100perSecOnly 3 np > /dev/null &); done; usleep 200000); done &
5)_ Once in a while, repeat step 2 above.
You will see that the number under "%MEM" keeps increasing for sessionD.
6)_ Stop the process from step 4. No more instrumented apps should be present.
7)_ Repeat step 2; the %MEM usage still remains.
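The launch loop in step 4 can also be wrapped in a small helper so the burst size and number of waves are easy to vary when narrowing down the leak. A sketch only: the helper name is hypothetical, and the 0.2 s sleep stands in for the report's usleep 200000 (usleep is not available everywhere):

```shell
# run_load CMD WAVES PER_WAVE: start WAVES bursts of PER_WAVE
# background instances of CMD, pausing 0.2 s between bursts,
# then wait for every instance to exit.
run_load() {
    cmd=$1; waves=$2; per_wave=$3
    w=0
    while [ "$w" -lt "$waves" ]; do
        n=0
        while [ "$n" -lt "$per_wave" ]; do
            # Launch one short-lived instrumented app in the background.
            $cmd > /dev/null 2>&1 &
            n=$((n + 1))
        done
        w=$((w + 1))
        sleep 0.2
    done
    wait
}

# Equivalent to the load from step 4:
# run_load "/home/test_apps/TestApp_100perSecOnly 3 np" 1234 100
```

Running it with a smaller wave count while sampling sessiond's memory between runs makes it easier to see whether the growth scales with the number of app instances.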
Any other information:
======================
-