Bug #411
closed
health check reporting should use TLS
100%
Description
In order to be able to put health check reporting into code shared between threads, we should use TLS variables to store the status (counters), and register those memory areas through a "register" method called at the beginning of each thread's life, with a matching "unregister" method called when the thread exits.
As a side-benefit, this will ensure that a thread never updates the health check counter of another thread by mistake, and therefore removes a risk of programming error that would prevent detection of health state of some threads.
Updated by Mathieu Desnoyers almost 12 years ago
I see two possible models for this:
1) linked list of per-thread nodes stored in TLS, similar to what is done in the userspace RCU library for per-thread tracking of grace periods. The state tracking of a thread would be within the TLS of that thread, along with the "task type" of the thread (an enum provided as parameter to registration). Health check "poke" actions would need to iterate over the linked list to find each node they are querying. Note: eventually, if iteration over the linked list becomes an issue, we could put the nodes in a hash table indexed by "task type".
2) registration could return an index within a global array. This index would then be stored within the TLS, and used to access the appropriate entry within the global storage each time we need to update the state for a thread.
The advantage of (1) over (2) is that we can extend (1) to thread pools without requiring a hard limit on the size of the global storage needed for (2) (or dynamic memory reallocation). Also, (1) requires fewer levels of indirection on the fast path than (2), the fast path being the health reporting actions executed by each thread.
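To make the trade-off concrete, here is a minimal sketch of what model (2) could look like, assuming C11 `_Thread_local` and `<stdatomic.h>`. All names (`HEALTH_MAX_THREADS`, `health_register_indexed`, `health_poke_indexed`, `health_read_indexed`) are hypothetical and are not taken from the actual code base. It shows the two drawbacks mentioned above: the fixed-size global array, and the extra indirection (TLS load, then array index) on every update.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical hard limit: the drawback of model (2). */
#define HEALTH_MAX_THREADS 64

/* Global storage: one counter slot per registered thread. */
static atomic_ulong health_slots[HEALTH_MAX_THREADS];
static atomic_int health_next_slot;

/* In model (2), the only per-thread state is the slot index. */
static _Thread_local int health_slot_idx = -1;

/* Returns 0 on success, -1 when the global array is full. */
int health_register_indexed(void)
{
    int idx = atomic_fetch_add(&health_next_slot, 1);

    if (idx >= HEALTH_MAX_THREADS)
        return -1;
    health_slot_idx = idx;
    return 0;
}

/* Fast path: load the index from TLS, then index the global array. */
void health_poke_indexed(void)
{
    assert(health_slot_idx >= 0);   /* thread must be registered */
    atomic_fetch_add(&health_slots[health_slot_idx], 1);
}

/* Read the counter stored in a given slot. */
unsigned long health_read_indexed(int idx)
{
    return atomic_load(&health_slots[idx]);
}
```

This sketch never recycles slots on unregistration, so a thread pool that creates and destroys threads would eventually exhaust the array. Avoiding that requires either a free-slot allocator or reallocation, which is the extra complexity model (1) sidesteps.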
Updated by Mathieu Desnoyers almost 12 years ago
With approach (1), you'll need a mutex protecting registration/unregistration from list traversal (unless list traversal is RCU, but I don't recommend going for it in the first implementation).
Each (frequent) access to update the state of per-thread health can be performed with an atomic operation, without need for any mutex.
Reading the state of each thread can be performed with an atomic read, while holding the list-protection mutex.
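The locking scheme described above for approach (1) could be sketched roughly as follows, assuming POSIX threads plus C11 `_Thread_local` and `<stdatomic.h>`. The names (`health_node`, `health_register`, `health_poke`, `health_read`, and the `health_type` values) are hypothetical and only illustrate the idea; the merged commits may differ. Summing the counters per task type is likewise an assumption made to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical "task type" passed at registration. */
enum health_type { HEALTH_TYPE_CMD, HEALTH_TYPE_MANAGE };

struct health_node {
    atomic_ulong counter;        /* updated by the owning thread only */
    enum health_type type;
    struct health_node *next;    /* protected by health_lock */
};

/* The mutex protects list membership and traversal, not the counters. */
static pthread_mutex_t health_lock = PTHREAD_MUTEX_INITIALIZER;
static struct health_node *health_list;

/* Each thread owns exactly one node, living in its TLS. */
static _Thread_local struct health_node health_self;

/* Call at the beginning of the thread's life. */
void health_register(enum health_type type)
{
    health_self.type = type;
    atomic_store(&health_self.counter, 0);
    pthread_mutex_lock(&health_lock);
    health_self.next = health_list;
    health_list = &health_self;
    pthread_mutex_unlock(&health_lock);
}

/* Call before the thread exits, while its TLS is still valid. */
void health_unregister(void)
{
    struct health_node **p;

    pthread_mutex_lock(&health_lock);
    for (p = &health_list; *p != NULL; p = &(*p)->next) {
        if (*p == &health_self) {
            *p = health_self.next;
            break;
        }
    }
    pthread_mutex_unlock(&health_lock);
}

/* Fast path: one atomic increment on the thread's own node, no mutex. */
void health_poke(void)
{
    atomic_fetch_add_explicit(&health_self.counter, 1,
                              memory_order_relaxed);
}

/* Slow path: atomic reads while holding the list-protection mutex. */
unsigned long health_read(enum health_type type)
{
    unsigned long total = 0;
    struct health_node *n;

    pthread_mutex_lock(&health_lock);
    for (n = health_list; n != NULL; n = n->next) {
        if (n->type == type)
            total += atomic_load(&n->counter);
    }
    pthread_mutex_unlock(&health_lock);
    return total;
}
```

Because `health_poke()` can only reach the calling thread's own TLS node, a thread cannot update another thread's counter by mistake, which is the side benefit mentioned in the description.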
Updated by David Goulet almost 12 years ago
- Status changed from Confirmed to Resolved
- % Done changed from 0 to 100
Applied in changeset 927ca06aed61ff6dd3f64ae71854f2d7f9acebe5.
Updated by Mathieu Desnoyers almost 12 years ago
For tracking purposes, I have been actively involved in the development of the fix (commit 769b7d7ec9067a192a01a8d0c884256e9fa25165) and the following cleanup (commit 227e824a28deb5b5d31955908827426a03f97802), and carefully reviewed both commits before they were merged.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Updated by Christian Babeux almost 12 years ago
Reviewed-by: Christian Babeux <christian.babeux@efficios.com>
Updated by Christian Babeux almost 12 years ago
- Subject changed from heath check reporting should use TLS to health check reporting should use TLS