Bug #539

closed

lttng-tools2.2.0rc2: Possible memory leak from sessionD ?

Added by Tan le tran almost 11 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Critical
Assignee: -
Target version:
Start date: 05/22/2013
Due date:
% Done: 0%
Estimated time:
Description


Commit used:
============
userspace-rcu : 56e676d (HEAD, origin/stable-0.7) Document: rculfhash destroy and resize side-effect in 0.7
lttng-ust     : 352fce3 (HEAD, origin/master, origin/HEAD) Remove 0.x TODO
lttng-tools   : b31398b (HEAD, origin/master, origin/HEAD) Fix: increment UST channel refcount at stream creation
babeltrace    : 9eaf254 (HEAD, tag: v1.0.3, origin/stable-1.0) Version 1.0.3

Problem Description:
====================
 * There seems to be a memory leak in sessionD when many short-lived instrumented apps are launched
   over a period of time. At any time, about 700 instances of the instrumented app are present.
   Each of those instances has a lifespan of only 3 sec.

   After a run of about 3 minutes, the value under the "%MEM" column increases from 0.0 to 0.4 for
   sessionD:

   [SC-1:Xen43 Wed May 22 06:29:29/cluster/temp/tdlt] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |grep -i sessiond
     PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
   13616 root      20   0 85160  872  604 S      0  0.0   0:00.00 lttng-sessiond
   :
   :
   [SC-1:Xen43 Wed May 22 06:32:34/cluster/temp/tdlt] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |grep -i sessiond
     PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
   13616 root      20   0  190m 7292  688 S      7  0.4   0:05.73 lttng-sessiond
   :
   :
   [SC-1:Xen43 Wed May 22 06:37:19/cluster/temp/tdlt] # top
   top - 06:37:25 up 1 day, 16:20,  2 users,  load average: 2.60, 3.14, 1.55
   Tasks: 187 total,   1 running, 186 sleeping,   0 stopped,   0 zombie
   Cpu(s):  0.1%us,  0.5%sy,  0.0%ni, 99.2%id,  0.1%wa,  0.1%hi,  0.0%si,  0.0%st
   Mem:   2049828k total,   210928k used,  1838900k free,     2808k buffers
   Swap:  4200992k total,   784808k used,  3416184k free,    42348k cached
   :
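
For reference, the growth between the first two samples above (RES 872 at 06:29:29, RES 7292 at 06:32:34, i.e. 185 seconds apart) can be worked out with a small sketch like the one below. The helper names to_kib and growth_kib_per_s are hypothetical and not part of lttng-tools. The sketch assumes top reports RES in KiB, with an "m" or "g" suffix for MiB or GiB, and that the values are whole numbers as in the printouts in this report.

```shell
# Hypothetical helper: convert a top(1) memory field to KiB.
# Assumes whole numbers only: plain = KiB, "m" suffix = MiB, "g" suffix = GiB.
to_kib() {
    case $1 in
        *m) echo $(( ${1%m} * 1024 )) ;;
        *g) echo $(( ${1%g} * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical helper: average growth in KiB per second between two samples.
# Arguments: RES before, RES after, elapsed seconds (integer division).
growth_kib_per_s() {
    before=$(to_kib "$1")
    after=$(to_kib "$2")
    echo $(( (after - before) / $3 ))
}

# Samples taken from the printouts above: 872 KiB -> 7292 KiB in 185 s.
growth_kib_per_s 872 7292 185    # prints 34
```

That is roughly 34 KiB/s of resident memory growth while the instrumented apps keep registering and unregistering.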

 * On a different machine, we observed the following (note the total mem size of
   this machine is different than the one above):

   [SC-1:Node16 Tue May 21 13:22:33/cluster/temp/tdlt/Stability/May12] # top
   top - 13:23:15 up 11 days,  4:41,  2 users,  load average: 0.00, 0.01, 0.05
   Tasks: 288 total,   1 running, 287 sleeping,   0 stopped,   0 zombie
   Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
   Mem:     24149M total,     3461M used,    20687M free,      372M buffers
   Swap:     4102M total,        0M used,     4102M free,     2411M cached
   :

   [SC-1:Node16 Tue May 21 13:32:09/cluster/temp/tdlt/Stability/May12] # top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |egrep -i "lttng|trace" 
      PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
    15012 root      20   0  311m 137m  912 S      0  0.6   2:46.43 lttng-sessiond
    15373 root      20   0 82128 1484  956 S      0  0.0   1:06.09 lttng-consumerd
    18322 root      20   0 42264 2268 1808 S      0  0.0   1:11.81 TraceEa
    18342 root      20   0 42264  912  444 S      0  0.0   0:45.82 TraceEa
    20874 root      20   0  190m 3808 2596 S      0  0.0   0:40.07 trace_c
    20880 root      20   0 47220 2948 2040 S      0  0.0   0:10.89 trace_p

Is problem reproducible ?
=========================
  * yes 

How to reproduce (if reproducible):
===================================
  1)_ Kill the current sessionD and relaunch a new one
  2)_ top -b -n 1 | head -n 7 | tail -n 1; top -b -n 1 |grep -i sessiond
  3)_ lttng list -u; lttng list    #--- no instrumented app running, no session available
  4)_ run multiple instances of instrumented app (each instance has only 3 sec lifespan)
      so that there are around 700 instances present at any time.
      ex:
          for a in $(seq 1 1234); do (for n in $(seq 1 100); do (/home/test_apps/TestApp_100perSecOnly 3 np > /dev/null &); done; usleep 200000); done &
  5)_ Once in a while, repeat step-2 above.
      The number under "%MEM" starts increasing for sessionD.
  6)_ Stop the process from step-4. No more instrumented apps should be present.
  7)_ Repeat step-2. The %MEM used still remains (it does not go back down).
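
The repeated checks in step-2, step-5 and step-7 could also be automated. The following is only a sketch under stated assumptions: it reads VmRSS from /proc/<pid>/status (Linux only) instead of parsing top, and the function names rss_kib and sample_rss are hypothetical, not part of lttng-tools.

```shell
# Hypothetical helper: print the resident set size (VmRSS, in KiB) of a PID,
# read from the Linux /proc filesystem. Fails if the PID has no VmRSS entry.
rss_kib() {
    [ -r "/proc/$1/status" ] || return 1
    awk '/^VmRSS:/ { print $2; found = 1 } END { exit !found }' "/proc/$1/status"
}

# Hypothetical helper: sample the RSS of a PID.
# Arguments: PID, interval in seconds, number of samples.
# Prints one "epoch_seconds rss_kib" line per sample.
sample_rss() {
    pid=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        rss=$(rss_kib "$pid") || return 1
        echo "$(date +%s) $rss"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_rss "$(pgrep -o -x lttng-sessiond)" 10 30` would log the sessionD RSS every 10 seconds for about 5 minutes, assuming pgrep is available. If the leak is present, the second column keeps growing during step-4 and does not drop after step-6.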

Any other information:
======================
-   

Files

terminal.log (13.4 KB): terminal log. Tan le tran, 05/22/2013 07:14 AM
bug539.diff (5.13 KB). David Goulet, 06/06/2013 03:50 PM
Jun10_retest.log (11.5 KB): terminal log when the above test was carried out (update #14). Tan le tran, 06/10/2013 07:42 AM
patch539_jun10_gitdiff.log (7.46 KB): git diff after applying the patch. Tan le tran, 06/10/2013 11:44 AM
patch539_jun10_terminal.log (14.7 KB): terminal log showing ps printout when the TC is executed. Tan le tran, 06/10/2013 11:44 AM