Bug #653

LTTng memory allocation failure goes unreported

Added by Daniel U. Thibault about 7 years ago. Updated about 5 years ago.

Status: Confirmed
Priority: Low
Assignee: -
Target version:
Start date: 10/21/2013
Due date:
% Done: 0%
Estimated time:

Description

Consider the following (LTTng 2.3.0 running in a single-processor 1 GiB virtual machine):

$ lttng create test
Session test created.
Traces will be written in /home/daniel/lttng-traces/test-20131018-122002
$ lttng enable-channel ch -u --discard --num-subbuf 0x800
UST channel ch enabled for session test
$ lttng enable-event -c ch -u -a
All UST events are enabled in channel ch
$ lttng start
Tracing started for session test
$ lttng list test
Tracing session test: [active]
    Trace path: /home/daniel/lttng-traces/test-20131018-122002

=== Domain: UST global ===

Buffer type: per UID

Channels:
-------------
- ch: [enabled]

    Attributes:
      overwrite mode: 0
      subbufers size: 131072
      number of subbufers: 2048
      switch timer interval: 0
      read timer interval: 0
      output: mmap()

    Events:
      * (type: tracepoint) [enabled]

$ lttng destroy
Session test destroyed

$ lttng create test
Session test created.
Traces will be written in /home/daniel/lttng-traces/test-20131018-122241
$ lttng enable-channel ch -u --discard --num-subbuf 0x1000
UST channel ch enabled for session test
$ lttng enable-event -c ch -u -a
All UST events are enabled in channel ch
$ lttng start
Tracing started for session test
$ lttng list test
Tracing session test: [active]
    Trace path: /home/daniel/lttng-traces/test-20131018-122241

=== Domain: UST global ===

Buffer type: per UID

Channels:
-------------
- ch: [enabled]

    Attributes:
      overwrite mode: 0
      subbufers size: 131072
      number of subbufers: 4096
      switch timer interval: 0
      read timer interval: 0
      output: mmap()

    Events:
      * (type: tracepoint) [enabled]

$ lttng destroy
Session test destroyed

The first session generates a trace as expected. The second does not: the trace folder remains stubbornly empty.

This is puzzling from a memory management point of view: the first session allocates 256 MiB of buffers (2048 sub-buffers of 128 KiB each) and the second should allocate 512 MiB (4096 sub-buffers of 128 KiB each), which apparently does not exceed the capacity of the 1 GiB system.
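The buffer sizes quoted above follow directly from the `lttng list` output. A minimal sketch of the arithmetic, assuming one ring buffer per CPU (a single CPU here) and ignoring any metadata channel or header overhead; `channel_buffer_bytes` is a hypothetical helper, not an LTTng function:

```python
MIB = 1024 * 1024

def channel_buffer_bytes(subbuf_size, num_subbuf, n_cpus=1):
    """Ring-buffer memory for one channel, assuming one buffer per CPU."""
    return subbuf_size * num_subbuf * n_cpus

# Values taken from the two sessions above (sub-buffer size: 131072 bytes).
first = channel_buffer_bytes(131072, 0x800)    # 2048 sub-buffers
second = channel_buffer_bytes(131072, 0x1000)  # 4096 sub-buffers

print(first // MIB, second // MIB)  # prints "256 512"
```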

The one error captured in the session log (attached) is:

libringbuffer[14690/14695]: Error: zero_file: No space left on device (in _shm_object_table_alloc_shm() at shm.c:173)

The error needs to be properly captured and passed on to the lttng client.
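One possible explanation, offered here only as an assumption: the `zero_file` message suggests the buffers are backed by files on a shared-memory filesystem, and such a filesystem (often a tmpfs mounted at `/dev/shm`) can have a size limit well below total RAM. The sketch below compares a requested buffer size against the free space of a given directory; `shm_has_room` and the `/dev/shm` default are hypothetical, and the actual backing location on the reporter's machine is not stated in this report.

```python
import os

def shm_has_room(requested_bytes, shm_dir="/dev/shm"):
    """Return (fits, free_bytes) for the filesystem that contains shm_dir."""
    st = os.statvfs(shm_dir)
    free_bytes = st.f_bavail * st.f_frsize  # space available to unprivileged users
    return requested_bytes <= free_bytes, free_bytes

# Example: would the second session's 512 MiB fit? (Only if /dev/shm exists.)
if os.path.isdir("/dev/shm"):
    fits, free_bytes = shm_has_room(512 * 1024 * 1024)
    print("fits:", fits, "free MiB:", free_bytes // (1024 * 1024))
```

A check of this kind would only be advisory, since free space can change between the check and the allocation; the allocation error itself still has to be reported.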


Files

lttng-sessiond.log (24.4 KB): lttng-sessiond log, Daniel U. Thibault, 10/21/2013 09:32 AM
#1

Updated by Mathieu Desnoyers almost 7 years ago

  • Project changed from LTTng to LTTng-tools

We might be able to do something about it for per-uid buffers (report to the user). I doubt we'll be able to do anything for per-pid buffers though.

#2

Updated by David Goulet almost 7 years ago

  • Status changed from New to Confirmed
  • Priority changed from Normal to Low
  • Target version changed from 2.3 to 2.5

This is a bit tricky since this can happen in a non-interactive way. Furthermore, with the way the session daemon currently interacts with the consumer, the actual error code is unfortunately not brought back up to the caller; the caller only gets -1.
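The difference between returning a bare -1 and returning the underlying error code can be illustrated with a small sketch. Everything here is hypothetical (`allocate_buffers`, `report_to_client`, and the simulated failure are not LTTng functions, and the messages are invented); it only shows the general pattern of keeping the errno intact until it reaches the layer that talks to the user.

```python
import errno
import os

def allocate_buffers(zero_file):
    """Hypothetical consumer-side step: returns 0 or a negative errno."""
    try:
        zero_file()
    except OSError as exc:
        return -exc.errno  # keep the cause instead of collapsing it to -1
    return 0

def report_to_client(ret):
    """Hypothetical client-side mapping from return code to a message."""
    if ret == 0:
        return "UST channel enabled"
    return "Error: buffer allocation failed: " + os.strerror(-ret)

def failing_zero_file():
    # Simulates the shared-memory zero-fill running out of space.
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

ret = allocate_buffers(failing_zero_file)
print(report_to_client(ret))
```

With a bare -1, the client can only say that something failed; with the errno preserved, it can tell the user that the device holding the buffers ran out of space.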

This would require a bit of work in error management, so I'm not comfortable doing it right now for the 2.3 or 2.4 stable releases. I'll flag this one for 2.5 and see if I (or anyone else) have time to fix it in the next version, which would be nice to have!

#3

Updated by David Goulet over 6 years ago

  • Target version deleted (2.5)

Moving this one out of 2.5 because this needs quite some work and we do not have resources to fix that before 2.5-rc1.

#4

Updated by Jérémie Galarneau about 5 years ago

  • Target version set to Wishlist
