Bug #426
Closed
Instrumented application segfault (in lttng_enabler_release (objd=<optimized out>) at lttng-ust-abi.c:1033)
Start date:
01/21/2013
Due date:
% Done:
0%
Estimated time:
Description
We hit this coredump very often while running the so-called "stability" testing. This activity consists of 3 users, each with a set of trace commands to execute. Each user executes one command (such as create, stop, destroy, list, etc.) after the other and keeps repeating until we stop the activity.

At all times, there are 4 instances of an instrumented app running (without exiting). Once in a while, we launch another 10 new instances of "TestApp_Mini1" (an instrumented app that simply waits for 5 sec, prints a line and exits). Once in a while, one of these TestApp_Mini1 instances segfaults.

TestApp_Mini1 was launched with "LTTNG_UST_DEBUG=1" to get more data about the segfault. The "gdb bt" printout is also provided in the attachment.

Please note that we are using a separate relayD for each session.

From our internal log, this is what we have observed (put into time order based on the time stamps). In the scenario below, user2 is using session STAB002_2 and user3 is using session STAB003_b_1:

21:27:26 Launch 10 TestApp_Mini (each instance has a 5 sec lifetime) by using:
    for a in $(seq 1 10); do (LTTNG_UST_DEBUG=1 /home/test_apps/TestApp_Mini1 5 &); done
(user-3) 21:27:26 Invoke lttng_list_session (STAB003_b_1)
(user-2) 21:27:27 RelayD is killed for session STAB002_2
(user-3) 21:27:28 Reply received from lttng_list_session with count 2
(user-3) 21:27:28 Invoke lttng_stop_tracing_no_wait for session STAB003_b_1
(user-3) 21:27:28 Reply received for lttng_stop_tracing_no_wait for session STAB003_b_1
(user-2) 21:27:28 Invoke lttng_list_session (for session STAB002_2)
(user-2) 21:27:28 Reply received from lttng_list_session (for session STAB002_2)
(user-2) 21:27:28 Invoke lttng_stop_tracing_no_wait for session STAB002_2
(user-2) 21:27:28 Reply received for lttng_stop_tracing_no_wait for session STAB002_2
(user-2) 21:27:28 Invoke lttng_list_session (for session STAB002_2)
(user-2) 21:27:28 Reply received from lttng_list_session (for session STAB002_2)
(user-2) 21:27:28 Invoke lttng_destroy_session for session STAB002_2
(user-2) 21:27:28 Reply received for lttng_destroy_session for session STAB002_2
(user-3) 21:27:28 Invoke lttng_data_pending for session STAB003_b_1
(user-3) 21:27:29 Reply received for lttng_data_pending for session STAB003_b_1: 0
(user-3) 21:27:29 Invoke lttng_destroy_session for session STAB003_b_1
******
Jan 20 21:27:34 SC-1 kernel: [133553.807697] TestApp_Mini1[24519]: segfault at 7fe883e5d2d8 ip 00007fe8835f39be sp 00007fff67488b40 error 4 in liblttng-ust.so.0.0.0[7fe8835e3000+38000]
******
(user-3) 21:28:03 Reply received for lttng_destroy_session for session STAB003_b_1

We are not sure what caused the instrumented app (TestApp_Mini1) to segfault. Regardless of where the fault lies, the instrumented application should not be affected and should not get a segmentation fault.

Lttng version used:
===================
* userspace: da9bed2 (HEAD, tag: v0.7.6) Version 0.7.6
* lttng-ust: 05356aa (HEAD, tag: v2.1.1) Version 2.1.1
* lttng-tools: 959f036 (HEAD, tag: v2.1.1) Update version to v2.1.1
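For triage, the kernel segfault line in the log above can be decoded into an offset inside liblttng-ust.so.0.0.0 and a description of the faulting access. The following is a minimal sketch, not part of the original report: the helper name decode_segfault is hypothetical, and the interpretation of the error code assumes the x86 page-fault bit layout (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

# Kernel log line copied verbatim from the report.
LOG_LINE = (
    "TestApp_Mini1[24519]: segfault at 7fe883e5d2d8 ip 00007fe8835f39be "
    "sp 00007fff67488b40 error 4 in liblttng-ust.so.0.0.0[7fe8835e3000+38000]"
)

# All numeric fields in this kernel message are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into useful fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        # Offset of the faulting instruction from the start of the mapping.
        "ip_offset": ip - base,
        "ip_in_mapping": base <= ip < base + size,
        "fault_addr_in_mapping": base <= addr < base + size,
        # Assumed x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LOG_LINE)
print(info["object"], hex(info["ip_offset"]))
```

Under these assumptions, the faulting instruction sits at offset 0x109be from the start of the liblttng-ust.so.0.0.0 mapping, and error 4 reads as a user-mode read of a page that is not mapped. The faulting address 7fe883e5d2d8 lies outside the mapping shown in brackets. The offset may be usable with a symbolizer such as addr2line against a matching build of the library, provided the mapping starts at file offset 0, which the report does not confirm.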