Bug #1199

close() implementation of ust-fd is not async-signal-safe causing child processes to hang forever after fork()

Added by Tai Dinh 9 months ago. Updated 9 months ago.

According to the link below, every implementation of close() should be async-signal-safe. The ust-fd implementation is not, because it calls several functions that are not async-signal-safe, most notably pthread_mutex_lock().

This causes a serious problem for any application that calls fork() and then close() on file descriptors in the child before calling one of the exec* family of functions, which is a common pattern.
With ust-fd preloaded, close() ends up in lttng_ust_safe_close_fd. If the fork() happens right after ust_safe_guard_fd_mutex has been locked in the parent (due to some trace events), the child inherits the mutex in its locked state and can never lock or unlock it again, so it ends up in a deadlock.

(gdb) bt
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f2f3ae4c023 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f2f3a843940 <ust_safe_guard_fd_mutex>) at ../nptl/pthread_mutex_lock.c:78
#2 0x00007f2f3a5e6eaa in lttng_ust_lock_fd_tracker () at lttng-ust-fd-tracker.c:133
#3 0x00007f2f3a5e71a5 in lttng_ust_safe_close_fd (fd=1, close_cb=0x7f2f3ae53410 <__close>) at lttng-ust-fd-tracker.c:300
#4 0x000055b014499c72 in do_thread ()
#5 0x00007f2f3ae496db in start_thread (arg=0x7f2f38891700) at pthread_create.c:463
#6 0x00007f2f3a96e88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
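
The hang in the backtrace above can be reproduced in miniature without lttng-ust. The following is only a sketch under assumptions: demo_guard is a hypothetical stand-in for ust_safe_guard_fd_mutex, and the helper names are made up for illustration. A helper thread holds the mutex while the main thread forks. The child uses pthread_mutex_trylock() instead of pthread_mutex_lock() so that the demo reports the problem instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for ust_safe_guard_fd_mutex. */
static pthread_mutex_t demo_guard = PTHREAD_MUTEX_INITIALIZER;
static sem_t holder_ready;   /* posted once the helper thread holds demo_guard */
static sem_t holder_release; /* posted to let the helper thread unlock */

/* Helper thread: takes the mutex and keeps it until told to release it. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_guard);
    sem_post(&holder_ready);
    sem_wait(&holder_release);
    pthread_mutex_unlock(&demo_guard);
    return NULL;
}

/* Returns 1 if the child found the mutex locked, 0 if not, -1 on error. */
int child_sees_locked_mutex(void)
{
    pthread_t tid;
    pid_t pid;
    int status = 0, result = -1, rc;

    rc = sem_init(&holder_ready, 0, 0);
    assert(rc == 0);
    rc = sem_init(&holder_release, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, holder, NULL);
    assert(rc == 0);
    (void)rc;

    sem_wait(&holder_ready); /* the helper thread now owns demo_guard */

    pid = fork();
    if (pid == 0) {
        /* Only the forking thread exists in the child. The helper thread
         * is gone, but the copied mutex is still marked as locked. */
        _exit(pthread_mutex_trylock(&demo_guard) == EBUSY ? 1 : 0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    sem_post(&holder_release); /* let the helper thread finish in the parent */
    pthread_join(tid, NULL);
    sem_destroy(&holder_ready);
    sem_destroy(&holder_release);
    return result;
}
```

Calling child_sees_locked_mutex() from main() should return 1. With a blocking pthread_mutex_lock() in the child, as in lttng_ust_lock_fd_tracker, the child would wait forever instead.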

It is acceptable if this remains an exception and a fully POSIX-conforming implementation is not possible, but at least the deadlock should be fixed, since the problem can always be reproduced and blocks this kind of application.
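
One common way to avoid this kind of deadlock is to take the mutex around fork() with pthread_atfork() handlers, so that the child always starts with the mutex in a known, unlocked state. The sketch below illustrates that general technique only. It is not taken from patch-1199-1.diff, fix_guard is a hypothetical stand-in for ust_safe_guard_fd_mutex, and all function names are made up. It does not make close() async-signal-safe; it only removes the fork-time deadlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for ust_safe_guard_fd_mutex. */
static pthread_mutex_t fix_guard = PTHREAD_MUTEX_INITIALIZER;
static sem_t fix_holder_ready;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;
static int handlers_rc = -1;

/* The forking thread takes the mutex before fork() and releases it on
 * both sides afterwards, so no other thread can hold it across fork(). */
static void before_fork(void)       { pthread_mutex_lock(&fix_guard); }
static void after_fork_parent(void) { pthread_mutex_unlock(&fix_guard); }
static void after_fork_child(void)  { pthread_mutex_unlock(&fix_guard); }

static void do_install(void)
{
    handlers_rc = pthread_atfork(before_fork, after_fork_parent,
                                 after_fork_child);
}

/* Registers the handlers exactly once; returns 0 on success. */
int install_fork_handlers(void)
{
    pthread_once(&handlers_once, do_install);
    return handlers_rc;
}

/* Helper thread: holds the mutex for about 50 ms, then releases it. */
static void *brief_holder(void *arg)
{
    struct timespec delay = { 0, 50 * 1000 * 1000 };

    (void)arg;
    pthread_mutex_lock(&fix_guard);
    sem_post(&fix_holder_ready);
    nanosleep(&delay, NULL);
    pthread_mutex_unlock(&fix_guard);
    return NULL;
}

/* Returns 1 if the child could lock the mutex, 0 if not, -1 on error. */
int child_can_lock_after_fork(void)
{
    pthread_t tid;
    pid_t pid;
    int status = 0, result = -1, rc;

    rc = sem_init(&fix_holder_ready, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, brief_holder, NULL);
    assert(rc == 0);
    (void)rc;

    sem_wait(&fix_holder_ready); /* the helper thread holds fix_guard now */

    pid = fork(); /* before_fork() waits here until the helper unlocks */
    if (pid == 0)
        _exit(pthread_mutex_trylock(&fix_guard) == 0 ? 1 : 0);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    pthread_join(tid, NULL);
    sem_destroy(&fix_holder_ready);
    return result;
}
```

After install_fork_handlers() returns 0, child_can_lock_after_fork() should return 1: the fork() is delayed until the helper thread has released the mutex, and the child handler unlocks the copy that the child inherits.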



1199.c (2.31 KB) Tai Dinh, 10/03/2019 03:43 PM
ust_pipe_and_fork.h (631 Bytes) Tai Dinh, 10/03/2019 04:05 PM
tp.c (64 Bytes) Tai Dinh, 10/03/2019 04:05 PM
patch-1199-1.diff (609 Bytes) Mathieu Desnoyers, 10/03/2019 04:41 PM

