Fixed by the following commits in master:
commit 9fd30396a597942084b007f33cc7f2c279f746e9
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: Thu Sep 19 10:10:31 2019 -0400
Fix: provide errno as argument to urcu_die()
commit 1a990de3add "Fix: rculfhash worker needs to unblock to SIGRCU"
provides "ret" (-1) as argument to urcu_die(), but should rather provide
errno.
Reported by Coverity:
** CID 1405700: Error handling issues (NEGATIVE_RETURNS) /src/rculfhash.c: 2171 in cds_lfht_worker_init()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
commit 1a990de3addad89fc397f57bb359175d307e6960
Author: hewenliang <hewenliang4@huawei.com>
Date: Tue Sep 17 10:59:18 2019 -0400
Fix: rculfhash worker needs to unblock to SIGRCU
In the urcu-signal flavor, call_rcu_thread calls synchronize_rcu, which
sends the SIGRCU signal to all registered threads and then loops,
waiting for need_mb to be cleared. However, the registered workqueue_thread
blocks SIGRCU, so it never handles the signal and never clears need_mb.
As a result, call_rcu_thread and workqueue_thread deadlock:
call_rcu_thread, which holds rcu_registry_lock, waits for
workqueue_thread to issue cmm_smp_mb, while workqueue_thread, which never
issues cmm_smp_mb because the signal is blocked, eventually waits for
rcu_registry_lock in do_resize_cb.
The resulting hang is easy to trigger and looks as follows:
(gdb) t 2
[Switching to thread 2 (Thread 0xffff83c3b080 (LWP 27116))]
#0 0x0000ffff845296c4 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x0000ffff845296c4 in poll () from /lib64/libc.so.6
#1 0x0000ffff8461b93c in force_mb_all_readers () at urcu.c:241
#2 0x0000ffff8461c748 in smp_mb_master () at urcu.c:249
#3 urcu_signal_synchronize_rcu () at urcu.c:445
#4 0x0000ffff8461d004 in call_rcu_thread at urcu-call-rcu-impl.h:364
#5 0x0000ffff845eb8bc in start_thread () from /lib64/libpthread.so.0
#6 0x0000ffff845335cc in thread_start () from /lib64/libc.so.6
(gdb) t 3
[Switching to thread 3 (Thread 0xffff8443c080 (LWP 27191))]
#0 0x0000ffff845f51c4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0000ffff845f51c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000ffff845ee048 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x0000ffff8461b814 in mutex_lock ( <rcu_registry_lock>) at urcu.c:157
#3 0x0000ffff8461b9e4 in urcu_signal_unregister_thread () at urcu.c:564
#4 0x0000ffff8463e62c in do_resize_cb (work=0x11e2e790) at rculfhash.c:2042
#5 0x0000ffff8463c940 in workqueue_thread (arg=0x11e1d260) at workqueue.c:228
#6 0x0000ffff845eb8bc in start_thread () from /lib64/libpthread.so.0
#7 0x0000ffff845335cc in thread_start () from /lib64/libc.so.6
Therefore, the workqueue thread must not block SIGRCU. Otherwise, with
the urcu-signal flavor, the grace period waits forever on the worker
thread.
Signed-off-by: hewenliang <hewenliang4@huawei.com>
Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Backported into stable-0.10 as:
commit ef728ceea316503bdfd75c386512045fc8aa8285
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: Thu Sep 19 10:10:31 2019 -0400
Fix: provide errno as argument to urcu_die()
commit 1a990de3add "Fix: rculfhash worker needs to unblock to SIGRCU"
provides "ret" (-1) as argument to urcu_die(), but should rather provide
errno.
Reported by Coverity:
** CID 1405700: Error handling issues (NEGATIVE_RETURNS) /src/rculfhash.c: 2171 in cds_lfht_worker_init()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
commit 23da24d2d03a22841e7f76bf25a52a803b403362
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: Wed Sep 18 11:13:24 2019 -0400
Fix: include urcu-signal-nr.h
Erroneous include file name in backport of commit 1a990de3add
"Fix: rculfhash worker needs to unblock to SIGRCU".
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
commit 22e3a77f65d53b84dbe653fa6cb83181cafe2482
Author: hewenliang <hewenliang4@huawei.com>
Date: Tue Sep 17 10:59:18 2019 -0400
Fix: rculfhash worker needs to unblock to SIGRCU
In the urcu-signal flavor, call_rcu_thread calls synchronize_rcu, which
sends the SIGRCU signal to all registered threads and then loops,
waiting for need_mb to be cleared. However, the registered workqueue_thread
blocks SIGRCU, so it never handles the signal and never clears need_mb.
As a result, call_rcu_thread and workqueue_thread deadlock:
call_rcu_thread, which holds rcu_registry_lock, waits for
workqueue_thread to issue cmm_smp_mb, while workqueue_thread, which never
issues cmm_smp_mb because the signal is blocked, eventually waits for
rcu_registry_lock in do_resize_cb.
The resulting hang is easy to trigger and looks as follows:
(gdb) t 2
[Switching to thread 2 (Thread 0xffff83c3b080 (LWP 27116))]
#0 0x0000ffff845296c4 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x0000ffff845296c4 in poll () from /lib64/libc.so.6
#1 0x0000ffff8461b93c in force_mb_all_readers () at urcu.c:241
#2 0x0000ffff8461c748 in smp_mb_master () at urcu.c:249
#3 urcu_signal_synchronize_rcu () at urcu.c:445
#4 0x0000ffff8461d004 in call_rcu_thread at urcu-call-rcu-impl.h:364
#5 0x0000ffff845eb8bc in start_thread () from /lib64/libpthread.so.0
#6 0x0000ffff845335cc in thread_start () from /lib64/libc.so.6
(gdb) t 3
[Switching to thread 3 (Thread 0xffff8443c080 (LWP 27191))]
#0 0x0000ffff845f51c4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0000ffff845f51c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000ffff845ee048 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x0000ffff8461b814 in mutex_lock ( <rcu_registry_lock>) at urcu.c:157
#3 0x0000ffff8461b9e4 in urcu_signal_unregister_thread () at urcu.c:564
#4 0x0000ffff8463e62c in do_resize_cb (work=0x11e2e790) at rculfhash.c:2042
#5 0x0000ffff8463c940 in workqueue_thread (arg=0x11e1d260) at workqueue.c:228
#6 0x0000ffff845eb8bc in start_thread () from /lib64/libpthread.so.0
#7 0x0000ffff845335cc in thread_start () from /lib64/libc.so.6
Therefore, the workqueue thread must not block SIGRCU. Otherwise, with
the urcu-signal flavor, the grace period waits forever on the worker
thread.
Signed-off-by: hewenliang <hewenliang4@huawei.com>
Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>