Bug #374 (closed)
Streaming: when using non-default ports, ports remain hanging in CLOSE_WAIT state
Start date: 10/15/2012
% Done: 100%
Description
Description:
============
When using non-default ports for streaming, the ports remain hanging in CLOSE_WAIT state forever, even after "lttng destroy" has been run and the remote side has killed the corresponding relayd process.

Commits used:
=============
userspace-rcu: (oct12) 1fe734e wfcqueue: clarify locking usage
lttng-ust:     (oct09) 1c7b4a9 Fix: memcpy of string is larger than source
lttng-tools:   (oct02) 4dbc372 ABI with support for compat 32/64 bits
babeltrace:    (oct02) 052606b Document that list.h is LGPLv2.1, but entirely trivial

Scenario:
=========

On Node-1:

1a) netstat -etan
1b) lttng-relayd -vvv -C tcp://0.0.0.0:51234 -D tcp://0.0.0.0:51235 &
1c) Once the session has been started on Node-2, netstat -etan gives:

Proto Recv-Q Send-Q Local Address       Foreign Address     State       User Inode
tcp   0      0      0.0.0.0:51234       0.0.0.0:*           LISTEN      0    1625723
tcp   0      0      0.0.0.0:51235       0.0.0.0:*           LISTEN      0    1625724
tcp   0      0      192.168.0.1:51235   192.168.0.4:58137   ESTABLISHED 0    1625875
tcp   0      0      192.168.0.1:51235   192.168.0.4:58139   ESTABLISHED 0    1625975
tcp   0      0      192.168.0.1:51234   192.168.0.4:41174   ESTABLISHED 0    1625974
tcp   0      0      192.168.0.1:51234   192.168.0.4:41172   ESTABLISHED 0    1625874

1d) Once the session has been destroyed from Node-2, pkill lttng-relayd; netstat -etan then gives:

Proto Recv-Q Send-Q Local Address       Foreign Address     State       User Inode
tcp   0      0      192.168.0.1:51235   192.168.0.4:58137   FIN_WAIT2   0    1625875
tcp   0      0      192.168.0.1:51235   192.168.0.4:58139   FIN_WAIT2   0    1625975
tcp   0      0      192.168.0.1:51234   192.168.0.4:41174   FIN_WAIT2   0    1625974
tcp   0      0      192.168.0.1:51234   192.168.0.4:41172   FIN_WAIT2   0    1625874

1e) About 30 seconds later, the above port pairs no longer appear in netstat.

On Node-2:

2a) <run an instrumented app>
2b) netstat -etan
2c) lttng create ses1
2d) lttng enable-event an_event_from_the_running_instrumented_app -u
2e) lttng enable-consumer -u -s ses1 -C tcp://192.168.0.1:51234 -D tcp://192.168.0.1:51235 -e
2f) lttng start; sleep 20; lttng stop
2g) netstat -etan:

Proto Recv-Q Send-Q Local Address       Foreign Address     State       User Inode
tcp   0      0      192.168.0.4:1128    0.0.0.0:*           LISTEN      0    22611
tcp   0      0      192.168.0.4:1130    0.0.0.0:*           LISTEN      0    20916
tcp   0      0      192.168.0.4:1131    0.0.0.0:*           LISTEN      0    22141
tcp   0      0      0.0.0.0:111         0.0.0.0:*           LISTEN      0    20739
tcp   0      0      0.0.0.0:22          0.0.0.0:*           LISTEN      0    22237
tcp   0      0      192.168.0.4:1022    0.0.0.0:*           LISTEN      0    20952
tcp   0      0      192.168.0.4:41172   192.168.0.1:51234   ESTABLISHED 0    1474170
tcp   0      0      192.168.0.4:58139   192.168.0.1:51235   ESTABLISHED 0    1477457
tcp   0      0      192.168.0.4:41174   192.168.0.1:51234   ESTABLISHED 0    1477456
tcp   0      0      192.168.0.4:58137   192.168.0.1:51235   ESTABLISHED 0    1477398

2h) lttng destroy; lttng list
2i) From Node-1, kill the corresponding relayd.
2j) netstat -etan:

Proto Recv-Q Send-Q Local Address       Foreign Address     State       User Inode
tcp   0      0      192.168.0.4:1128    0.0.0.0:*           LISTEN      0    22611
tcp   0      0      192.168.0.4:1130    0.0.0.0:*           LISTEN      0    20916
tcp   0      0      192.168.0.4:1131    0.0.0.0:*           LISTEN      0    22141
tcp   0      0      0.0.0.0:111         0.0.0.0:*           LISTEN      0    20739
tcp   0      0      0.0.0.0:22          0.0.0.0:*           LISTEN      0    22237
tcp   0      0      192.168.0.4:1022    0.0.0.0:*           LISTEN      0    20952
tcp   0      0      192.168.0.4:41172   192.168.0.1:51234   CLOSE_WAIT  0    1474170
tcp   0      0      192.168.0.4:58139   192.168.0.1:51235   CLOSE_WAIT  0    1477457
tcp   0      0      192.168.0.4:41174   192.168.0.1:51234   CLOSE_WAIT  0    1477456
tcp   0      0      192.168.0.4:58137   192.168.0.1:51235   CLOSE_WAIT  0    1477398

2k) The above port pairs remain in CLOSE_WAIT forever (still there after 24 hours). When a new session is created using the same -C and -D URLs, a new set of port pairs is created, and those eventually hang in CLOSE_WAIT as well.
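For readers unfamiliar with the TCP state involved: CLOSE_WAIT means the remote end has sent its FIN, but the local process has never called close() on its own descriptor, so the kernel keeps the socket alive indefinitely. The following minimal standalone C sketch (loopback only, not LTTng code; error checks omitted for brevity) reproduces that state:

/*
 * Minimal sketch showing how a socket gets stuck in CLOSE_WAIT:
 * the peer closes its end, and the local process never calls close().
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd, afd;
    char buf[16];

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    bind(lfd, (struct sockaddr *) &addr, sizeof(addr));
    listen(lfd, 1);
    getsockname(lfd, (struct sockaddr *) &addr, &len);

    /* "Remote" side: connect, then close, like killing relayd. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    connect(cfd, (struct sockaddr *) &addr, sizeof(addr));
    afd = accept(lfd, NULL, NULL);
    close(cfd);

    /*
     * read() returning 0 means the peer sent FIN. Until close(afd)
     * is called, netstat -tan shows this socket in CLOSE_WAIT.
     */
    printf("read() returned %zd\n", read(afd, buf, sizeof(buf)));
    pause(); /* keep the fd open on purpose; observe with netstat */
    return 0;
}

Running this and checking netstat -tan from another shell shows one connection stuck in CLOSE_WAIT for as long as the process lives, which is the same condition the consumer side on Node-2 is left in after relayd is killed on Node-1.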
Updated by David Goulet almost 12 years ago
- Status changed from New to Confirmed
- Assignee set to David Goulet
- Target version set to 2.1 stable
Updated by David Goulet almost 12 years ago
Is this a show stopper for you right now?
If so, I can work on a quick patch today; otherwise, I'll try to fix it this week.
Thanks!
Updated by Tan le tran almost 12 years ago
Hi David,
Thank you for looking into this issue.
We would prefer a permanent fix; it is OK if we get it later this week.
Currently, we use a node reboot as the remedy to clear all the hanging ports.
Regards,
Tan
Updated by David Goulet almost 12 years ago
- Status changed from Confirmed to Resolved
- % Done changed from 0 to 100
Applied in changeset e09272cd700e3a0a59100cdf6c3ad2332341109c.
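The changeset body is not quoted in this ticket, but the general pattern for this class of leak is to detect the relayd hangup and close the consumer-side descriptors rather than leaving them open. A hypothetical sketch of that pattern (the function name and relayd_fd parameter are illustrative, not taken from the actual patch):

/*
 * Hypothetical sketch, NOT the actual changeset: when poll() reports
 * that the peer hung up, release the descriptor so the kernel can
 * finish the TCP teardown (CLOSE_WAIT -> LAST_ACK -> gone).
 */
#include <poll.h>
#include <unistd.h>

static void close_on_hangup(int relayd_fd)
{
    struct pollfd pfd = { .fd = relayd_fd, .events = POLLIN };

    /* Non-blocking check; POLLHUP/POLLERR are reported even if not requested. */
    if (poll(&pfd, 1, 0) > 0 && (pfd.revents & (POLLHUP | POLLERR)))
        close(relayd_fd);
}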