Bug #900
Double free or corruption crash on relay daemon (closed)
Description
A crash was observed on the relay daemon when 24 streaming sessions with the same name were created from 24 different targets towards a relay daemon running on a remote host.
The LTTng-tools version used on the targets is 2.6.
PERROR - 09:24:33.402270 [31294/31310]: Relay index to close fd 32575: Bad file descriptor (in deferred_free_relay_index() at index.c:41)
*** Error in `./lttng-relayd': double free or corruption (out): 0x00007f3fdc04d930 ***
Are there any known issues similar to the above in the relay daemon of LTTng-tools 2.6?
If so, kindly let us know.
Updated by Julien Desfossez over 9 years ago
- Project changed from LTTngTop to LTTng-tools
Updated by Jérémie Galarneau over 9 years ago
How long does it take for the corruption to occur?
Updated by anusha mahamkali over 9 years ago
- File lttng-relayd-crash.txt added
The corruption is observed immediately after creating 24 (mostly 21) sessions from 24 different trace-generating systems
towards a single relayd on a remote host.
We reproduced the problem in verbose mode and collected the logs. Please find them attached to the case.
The traces below are of some interest:
PERROR - 09:24:33.380611 [31294/31297]: pipe: Too many open files (in run_as_clone() at runas.c:204)
PERROR - 09:24:33.380624 [31294/31297]: Index trace directory creation error: Too many open files (in index_create_file() at index.c:56)
.
.
.
PERROR - 09:24:33.393093 [31294/31297]: close stream: Bad file descriptor (in stream_close() at stream.c:91)
.
.
.
PERROR - 09:24:33.402270 [31294/31310]: Relay index to close fd 32575: Bad file descriptor (in deferred_free_relay_index() at index.c:41)
*** Error in `./lttng-relayd': double free or corruption (out): 0x00007f3fdc04d930 ***
Updated by Jérémie Galarneau over 9 years ago
- Status changed from New to Feedback
- Priority changed from Critical to Normal
This does look like the number of file descriptors has been exhausted. Can you list the open file descriptors just before the process crashes?
$ ls -l /proc/your_relayd_pid/fd
As a workaround, you could try increasing the number of file descriptors available per-process.
Can you provide the output of
$ ulimit -n
You can try increasing this value to, say, 4096 and see if the problem persists.
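The two checks above can be combined into one small script. This is only a minimal sketch, assuming a Linux host where `/proc/<pid>/fd` and `/proc/<pid>/limits` are readable; the helper names `fd_count`, `fd_soft_limit` and `fd_report` are hypothetical and not part of LTTng-tools. It prints how many descriptors a process currently holds next to its soft limit, which makes it easy to see how close the relay daemon is to exhaustion.

```shell
#!/bin/sh
# Hypothetical helpers (not part of LTTng-tools); assumes the Linux /proc layout.

# Count the file descriptors currently open in process $1.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit of process $1.
# In /proc/<pid>/limits the soft limit is the 4th whitespace-separated field.
fd_soft_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Report usage for a PID; defaults to the current shell ($$) for demonstration.
fd_report() {
    pid="${1:-$$}"
    echo "pid=$pid open=$(fd_count "$pid") soft_limit=$(fd_soft_limit "$pid")"
}

fd_report "$@"
```

Passing the relay daemon's PID as the first argument reports on that process instead of the current shell. Since a limit set with `ulimit -n` applies to the shell and the processes it starts, the daemon has to be launched from the same shell in which the limit was raised.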
Updated by anusha mahamkali over 9 years ago
We increased the number of file descriptors per process with the ulimit command
and it worked for us. The problem is not seen anymore.
Updated by Jérémie Galarneau over 9 years ago
- Status changed from Feedback to Confirmed
Good to hear. I'll keep the bug open to make sure this type of error is handled gracefully.
Updated by Jérémie Galarneau about 9 years ago
- Status changed from Confirmed to In Progress
- Assignee set to Mathieu Desnoyers
- Target version changed from 2.6 to 2.7
Updated by Jérémie Galarneau about 9 years ago
- Target version changed from 2.7 to 2.6
Updated by Jérémie Galarneau about 9 years ago
- Status changed from In Progress to Resolved