Bug #1201

ust_fd close() causes deadlock for any multithreaded application

Added by Tai Dinh 8 months ago. Updated 8 months ago.

Status: Resolved
Priority: High
Assignee: -
Target version:
Start date: 10/04/2019
Due date:
% Done: 100%
Estimated time:
Description

Hi,

I discovered today an even more serious problem with the current implementation of close()/fclose().
It neither handles thread cancellation nor makes the mutex robust.
This means that if any thread of a multithreaded application is cancelled while calling close()/fclose(), the cancelled thread terminates and leaves the mutex permanently locked.
Any later call to close()/fclose() from any other thread will then block forever.
What makes this really serious is that the scenario is so simple: a single thread cancellation during close() blocks every other close() forever.
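
The failure mode can be illustrated with a small standalone sketch. This is not the attached reproducer and does not use liblttng-ust-fd; the names `lock`, `holding`, `victim` and `demo_stuck_mutex` are hypothetical, and the plain mutex only stands in for the one taken around close()/fclose(). It assumes the default deferred cancellation type, with pause() acting as the cancellation point that close() would be in the real case.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* Hypothetical stand-in for the mutex taken around close()/fclose(). */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t holding;

/* Takes the lock, then sits in a cancellation point while owning it. */
void *victim(void *arg)
{
	(void) arg;
	pthread_mutex_lock(&lock);
	sem_post(&holding);		/* tell the caller the lock is held */
	for (;;)
		pause();		/* cancellation point, like close() */
	return NULL;
}

/*
 * Cancels the victim while it owns the lock, then reports what another
 * thread sees when it tries to take the same lock.
 */
int demo_stuck_mutex(void)
{
	pthread_t tid;

	sem_init(&holding, 0, 0);
	pthread_create(&tid, NULL, victim, NULL);
	while (sem_wait(&holding) == -1 && errno == EINTR)
		;			/* retry if interrupted by a signal */
	pthread_cancel(tid);
	pthread_join(tid, NULL);	/* the owner is gone at this point */

	/* A blocking pthread_mutex_lock() here would never return. */
	return pthread_mutex_trylock(&lock);
}
```

With a non-robust mutex, `demo_stuck_mutex()` is expected to return `EBUSY`: nobody is left to unlock the mutex, which is why every later close() blocks.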

We need to do one of the following:
- Disable thread cancellation at the start of lttng_ust_safe_close_fd and re-enable it afterwards.
- Set up a thread cancellation cleanup handler that unlocks the mutex if needed.
- Set the mutex as PTHREAD_MUTEX_ROBUST, then check the return code and call pthread_mutex_consistent when needed.

Reproducer:

LD_PRELOAD=/usr/local/lib/liblttng-ust-fd.so ./ust_fd_mutex_battle

/Tai


Files

ust_fd_mutex_battle.c (961 Bytes), Tai Dinh, 10/04/2019 09:44 AM
fix-1201-1.diff (1.97 KB), Mathieu Desnoyers, 10/04/2019 10:16 AM

#1

Updated by Mathieu Desnoyers 8 months ago

The attached patch implements the first approach (disable pthread cancellation around lock). Does it help ?
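
For readers without the attachment, the general shape of that first approach can be sketched as below. This is not the content of fix-1201-1.diff: `fd_tracker_lock` and `safe_close_fd_sketch` are hypothetical names used only for illustration, and the real lttng_ust_safe_close_fd does more than this.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the lock protecting the fd tracker. */
static pthread_mutex_t fd_tracker_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Run close_cb(fd) with the lock held and cancellation disabled, so
 * the calling thread cannot be cancelled while it owns the mutex.
 * Returns the callback's result, or -1 if the cancel state could not
 * be changed.
 */
int safe_close_fd_sketch(int fd, int (*close_cb)(int))
{
	int oldstate, unused, ret;

	/* A cancellation request arriving from here on stays pending. */
	if (pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate))
		return -1;

	pthread_mutex_lock(&fd_tracker_lock);
	ret = close_cb(fd);	/* close() is a cancellation point */
	pthread_mutex_unlock(&fd_tracker_lock);

	/* Restore whatever cancel state the caller had before. */
	pthread_setcancelstate(oldstate, &unused);
	return ret;
}
```

A pending cancellation is not lost with this pattern: once the previous state is restored, the request is acted upon at the thread's next cancellation point, by which time the mutex has been released.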

#2

Updated by Mathieu Desnoyers 8 months ago

The patch appears to fix the issue with the reproducer here.

#3

Updated by Tai Dinh 8 months ago

Mathieu Desnoyers wrote:

The attached patch implements the first approach (disable pthread cancellation around lock). Does it help ?

Yes, I'll try it now.
Thanks a lot, Mathieu, for your quick solution.

Anyway, are you considering the other options as well, or do you think this one has more advantages than the alternatives?

/Tai

#4

Updated by Tai Dinh 8 months ago

It works for me as well.

/Tai

#5

Updated by Mathieu Desnoyers 8 months ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
