Project

General

Profile

Actions

Bug #1373

closed

lttng-sessiond: Assertion fails when running 'lttng list'

Added by Bogdan Codres 12 months ago. Updated 10 months ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
Start date:
04/25/2023
Due date:
% Done:

100%

Estimated time:

Description

Hello guys !

We received a crash from one of our customer. Please see the detailed backtrace below,
The version used are:

Select desired LTTng component versions
PREFERRED_VERSION_liburcu = "0.9.7"
PREFERRED_VERSION_lttng-tools = "2.12.12"
PREFERRED_VERSION_lttng-ust = "2.12.6"
PREFERRED_VERSION_babeltrace2 = "2.0.4"
PREFERRED_VERSION_nativesdk-liburcu = "0.9.7"
PREFERRED_VERSION_nativesdk-lttng-tools = "2.12.12"
PREFERRED_VERSION_nativesdk-lttng-ust = "2.12.6"
PREFERRED_VERSION_nativesdk-babeltrace2 = "2.0.4"

However, I manage to reproduce myself on version lttng-tools = 2.13.9
As you can see the crash occurs at this line:

LTTNG_ASSERT(node_ptr  &event->node.node);  ---> from add_unique_ust_app_event

Root cause is this:

add_unique_ust_app_event will create the same key hash
for both --loglevel and --loglevel-only, resulting in:

add_unique_ust_app_event: Assertion `node_ptr &event->node.node' failed.

We did some changes in source code to prevent the crash ..meaning
we used ltt_ust_ht_key instead of ust_app_ht_key in some places.
But, we are not sure if our changes covers all the scenarios ..corner
cases, so we would prefer to do an analysis on your side too.
I'm attaching a script with whom I can reproduce the issue.

Best regards, Bogdan.

#0 libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1 0xf7a24a82 in _libc_signal_restore_set (set=0xf43f98f0) at ../sysdeps/unix/sysv/linux/internal-signals.h:105
#2 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:47
#3 0xf7a15f06 in __GI_abort () at abort.c:79
#4 0xf7a1fe98 in __assert_fail_base (fmt=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x8e360 "node_ptr == &event->node.node",
assertion@entry=0x0, file=file@entry=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", line=line@entry=197,
function=0xb1364 <
_PRETTY_FUNCTION
.86> "add_unique_ust_app_event", function@entry=0xf43fd280 "\001") at assert.c:92
#5 0xf7a1ff1a in GIassert_fail (assertion=0x0, assertion@entry=0x8e360 "node_ptr == &event->node.node",
file=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", file@entry=0xaa59c "../../../../git/src/bin/lttng-sessiond/ust-app.c",
line=line@entry=197, function=0xf43fd280 "\001", function@entry=0xb1364 <
_PRETTY_FUNCTION__.86> "add_unique_ust_app_event") at assert.c:101
#6 0x00055384 in add_unique_ust_app_event (event=0xf43fcf30, ua_chan=0xeec70628) at ../../../../git/src/bin/lttng-sessiond/ust-app.c:197
#7 create_ust_app_event (ua_chan=0xeec70628, uevent=uevent@entry=0xf389d4b8, app=0xf7ad2ca0, app@entry=0xeec671f8, ua_sess=0xeec6d010)
at ../../../../git/src/bin/lttng-sessiond/ust-app.c:3244
#8 0x000583b0 in ust_app_create_event_glb (usess=usess@entry=0xf38039e0, uchan=uchan@entry=0xf381de78, uevent=0xf389d4b8)
at ../../../../git/src/bin/lttng-sessiond/ust-app.c:4383
#9 0x00021e7e in event_ust_enable_tracepoint (usess=usess@entry=0xf38039e0, uchan=0xf381de78, event=event@entry=0xf389d0a8,
filter_expression=filter_expression@entry=0x0, filter=filter@entry=0x0, exclusion=exclusion@entry=0x0, internal_event=internal_event@entry=false)
at ../../../../git/src/bin/lttng-sessiond/event.c:202
#10 0x0002c684 in _cmd_enable_event (session=session@entry=0xf38006a0, domain=domain@entry=0xf43fa154,
channel_name=channel_name@entry=0xf380e618 "", event=0xf389d0a8, filter_expression=0x0, filter=0x0, exclusion=0x0, wpipe=wpipe@entry=28,
internal_event=internal_event@entry=false) at ../../../../git/src/bin/lttng-sessiond/cmd.c:2353
#11 0x0002e59c in cmd_enable_event (cmd_ctx=cmd_ctx@entry=0xf381d528, sock=<optimized out>, wpipe=28)
at ../../../../git/src/bin/lttng-sessiond/cmd.c:2555
#12 0x0004844c in process_client_msg (cmd_ctx=0xf381d528, sock=sock@entry=0xf43fc450, sock_error=sock_error@entry=0xf43fc454)
at ../../../../git/src/bin/lttng-sessiond/client.c:1190
#13 0x00049ad2 in thread_manage_clients (data=<optimized out>) at ../../../../git/src/bin/lttng-sessiond/client.c:2018
#14 0x00046074 in launch_thread (data=0x19c3b48) at ../../../../git/src/bin/lttng-sessiond/thread.c:65
#15 0xf7afc2ba in start_thread (arg=0x4b62c1c0) at pthread_create.c:481
?)

(gdb) bt full
#0 _libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
No locals.
#1 0xf7a24a82 in __libc_signal_restore_set (set=0xf43f98f0) at ../sysdeps/unix/sysv/linux/internal-signals.h:105
_a2tmp = -197158672
_a2 = -197158672
_nametmp = 175
_a3tmp = 0
_a3 = 0
_a1 = 0
_a4tmp = 8
_a1tmp = 2
_a4 = 8
_name = 175
#2 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:47
set = {
_val = {8192, 30734, 0 <repeats 26 times>, 197, 4154786469, 4097824712, 18}}
pid = <optimized out>
tid = <optimized out>
ret = 0
#3 0xf7a15f06 in _GI_abort () at abort.c:79
save_stage = 1
act = {
_sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {4155506755, 4294967295, 1, 4294967295, 1, 0, 4097809072, 582496, 725860, 4154738487, 4097809136, 4155346536, 4097824704, 0, 4155428388, 1, 0, 197, 582496, 725860, 4005982716, 4085882904, 4097824704, 11, 197, 582496, 725860, 4005982716, 4155431592, 4154803079, 4155433208, 197}}, sa_flags = -137093120, sa_restorer = 0xf43f9b34}
sigs = {__val = {32, 0, 4155428388, 4155431592, 4154803079, 146, 146, 146, 146, 4097809200, 4154764535, 0, 0, 4222451712, 4085893872, 4085893973, 4085893872, 4085893872, 4085894018, 4085894172, 4085893872, 4085894172, 0, 0, 0, 0, 0, 4157924044, 128, 4097809064, 4154589184, 0}}

[**ALERT: The abort() might not be exactly invoked from the following function line.
If the trail function contains multiple abort() calls, then you should cross check by other means to get correct abort() call location.
This is due to the optimized compilation which hides the debug info for multiple abort() calls in a given function.
Refer TR HU16995 for more information]

--Type <RET> for more, q to quit, c to continue without paging--
#4 0xf7a1fe98 in assert_fail_base (fmt=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x8e360 "node_ptr == &event->node.node", assertion@entry=0x0, file=file@entry=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", line=line@entry=197, function=0xb1364 <__PRETTY_FUNCTION.86> "add_unique_ust_app_event", function@entry=0xf43fd280 "\001") at assert.c:92
str = 0xf389a018 ""
total = 4096
#5 0xf7a1ff1a in GIassert_fail (assertion=0x0, assertion@entry=0x8e360 "node_ptr == &event->node.node", file=0xf7ad2ca0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", file@entry=0xaa59c "../../../../git/src/bin/lttng-sessiond/ust-app.c", line=line@entry=197, function=0xf43fd280 "\001", function@entry=0xb1364 <_PRETTY_FUNCTION__.86> "add_unique_ust_app_event") at assert.c:101
No locals.
#6 0x00055384 in add_unique_ust_app_event (event=0xf43fcf30, ua_chan=0xeec70628) at ../../../../git/src/bin/lttng-sessiond/ust-app.c:197
node_ptr = <optimized out>
key = {name = 0xf389d738 "", filter = 0x0, loglevel_type = (LTTNG_UST_LOGLEVEL_SINGLE | unknown: 0xc), exclusion = 0x0}
ht = <optimized out>
node_ptr = <optimized out>
key = {name = <optimized out>, filter = <optimized out>, loglevel_type = <optimized out>, exclusion = <optimized out>}
ht = <optimized out>
PRETTY_FUNCTION = "add_unique_ust_app_event"
#7 create_ust_app_event (ua_chan=0xeec70628, uevent=uevent@entry=0xf389d4b8, app=0xf7ad2ca0, app@entry=0xeec671f8, ua_sess=0xeec6d010) at ../../../../git/src/bin/lttng-sessiond/ust-app.c:3244
ret = <optimized out>
ua_event = 0xf43fcf30
end = <optimized out>
func = "create_ust_app_event"
#8 0x000583b0 in ust_app_create_event_glb (usess=usess@entry=0xf38039e0, uchan=uchan@entry=0xf381de78, uevent=0xf389d4b8) at ../../../../git/src/bin/lttng-sessiond/ust-app.c:4383
ret = <optimized out>
iter = {iter = {node = 0xeec67280, next = 0xf0017190}}
uiter = {iter = {node = 0xeec707d0, next = 0xeec70600}}
ua_chan_node = <optimized out>
app = 0xeec671f8
ua_sess = 0xeec6d010
ua_chan = <optimized out>
PRETTY_FUNCTION = "ust_app_create_event_glb"
--Type <RET> for more, q to quit, c to continue without paging--
func = "ust_app_create_event_glb"
#9 0x00021e7e in event_ust_enable_tracepoint (usess=usess@entry=0xf38039e0, uchan=0xf381de78, event=event@entry=0xf389d0a8, filter_expression=filter_expression@entry=0x0, filter=filter@entry=0x0, exclusion=exclusion@entry=0x0, internal_event=internal_event@entry=false) at ../../../../git/src/bin/lttng-sessiond/event.c:202
ret = 10
to_create = 1
uevent = 0xf389d4b8
PRETTY_FUNCTION = "event_ust_enable_tracepoint"
func = "event_ust_enable_tracepoint"
#10 0x0002c684 in _cmd_enable_event (session=session@entry=0xf38006a0, domain=domain@entry=0xf43fa154, channel_name=channel_name@entry=0xf380e618 "", event=0xf389d0a8, filter_expression=0x0, filter=0x0, exclusion=0x0, wpipe=wpipe@entry=28, internal_event=internal_event@entry=false) at ../../../../git/src/bin/lttng-sessiond/cmd.c:2353
uchan = <optimized out>
usess = 0xf38039e0
ret = <optimized out>
channel_created = 0
attr = 0x0
PRETTY_FUNCTION = "_cmd_enable_event"
func = "_cmd_enable_event"
#11 0x0002e59c in cmd_enable_event (cmd_ctx=cmd_ctx@entry=0xf381d528, sock=<optimized out>, wpipe=28) at ../../../../git/src/bin/lttng-sessiond/cmd.c:2555
ret = <optimized out>
event_len = 67
sock_recv_len = <optimized out>
event = 0xf389d0a8
view = {data = 0xf38998e8 "", size = 67}
event_buffer = {data = 0xf38998e8 "", size = 67, _capacity = 128}
exclusion = 0x0
bytecode = 0x0
filter_expression = 0x0
#12 0x0004844c in process_client_msg (cmd_ctx=0xf381d528, sock=sock@entry=0xf43fc450, sock_error=sock_error@entry=0xf43fc454) at ../../../../git/src/bin/lttng-sessiond/client.c:1190
ret = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
need_tracing_session = 1
need_domain = <optimized out>
payload = {data = 0x0, size = 0, _capacity = 0}
func = "process_client_msg"
PRETTY_FUNCTION = "process_client_msg"
#13 0x00049ad2 in thread_manage_clients (data=<optimized out>) at ../../../../git/src/bin/lttng-sessiond/client.c:2018
cmd_completion_handler = <optimized out>
sock = 906
ret = <optimized out>
i = <optimized out>
pollfd = <optimized out>
err = 1
sock_error = 0
revents = <optimized out>
nb_fd = <optimized out>
cmd_ctx = 0xf381d528
events = {epfd = 47, nb_fd = 2, alloc_size = 2, init_size = 2, events = 0xf38005f8}
client_sock = 45
quit_pipe = <optimized out>
thread_quit_pipe_fd = 43
func = "thread_manage_clients"
restart = <optimized out>
#14 0x00046074 in launch_thread (data=0x19c3b48) at ../../../../git/src/bin/lttng-sessiond/thread.c:65
ret = <optimized out>
thread = 0x19c3b48
func = "launch_thread"
#15 0xf7afc2ba in start_thread (arg=0x4b62c1c0) at pthread_create.c:481
ret = <optimized out>
pd = 0x4b62c1c0
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {1223872447, 1264763328, -197143936, 944208, -378662, -197145576, -197143936, -137084864, 0, -197145280, 0 <repeats 54 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
robust = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging-


Files

script_to_reproduce_LINCCM-21085_NEW.sh (493 Bytes) script_to_reproduce_LINCCM-21085_NEW.sh script to reproduce the issue Bogdan Codres, 04/25/2023 10:28 AM
Actions #1

Updated by Bogdan Codres 11 months ago

Hello ! Can you provide some feedback about the proposed patch ?

https://lists.lttng.org/pipermail/lttng-dev/2023-May/030453.html

Actions #2

Updated by Erica Bugden 10 months ago

  • Subject changed from lttng-sessiond crash to lttng-sessiond: Assertion fails when running 'lttng list'
  • Status changed from New to Confirmed
  • Assignee set to Erica Bugden

Hello Bogdan,

Thank you for taking the time to report this issue! We've recently reproduced the bug on our end and are currently looking into it.

Regards,
Erica

Actions #3

Updated by Jérémie Galarneau 10 months ago

  • Status changed from Confirmed to Feedback

Hi,

We published an alternative fix to the master branch.

Can you confirm it fixes the problem on your end?

Thanks!

Actions #4

Updated by Jérémie Galarneau 10 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100
Actions

Also available in: Atom PDF