
Bug #374: Streaming: when using non-default ports, ports remain hanging in CLOSE_WAIT state

Added by Tan le tran about 13 years ago. Updated about 13 years ago.

Status: Resolved
Priority: High
Assignee: David Goulet
Target version: 2.1 stable
Start date: 10/15/2012
% Done: 100%

Description

When using non-default ports for streaming, the ports remain hanging in the CLOSE_WAIT state forever, even after "lttng destroy" and even after the remote side has killed the corresponding relayd process.
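
For context on the symptom: a TCP socket enters CLOSE_WAIT when the peer has sent its FIN but the local process has not yet called close() on its own descriptor, and the kernel keeps it in that state for as long as the process holds the descriptor open. A minimal standalone sketch of how a leaked descriptor produces exactly this state (hypothetical example code, not LTTng source; the address and port are taken from the scenario below):

    /* close_wait_demo.c: connect to a peer, let the peer close its end,
     * then never close() locally -- the connection sticks in CLOSE_WAIT. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        struct sockaddr_in peer;
        char buf[64];
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0) {
            perror("socket");
            return 1;
        }
        memset(&peer, 0, sizeof(peer));
        peer.sin_family = AF_INET;
        peer.sin_port = htons(51234);   /* relayd control port from the report */
        inet_pton(AF_INET, "192.168.0.1", &peer.sin_addr);
        if (connect(fd, (struct sockaddr *) &peer, sizeof(peer)) < 0) {
            perror("connect");
            return 1;
        }
        /* When the remote side dies (e.g. relayd is killed), read()
         * returns 0: the peer's FIN has arrived and this socket is
         * now in CLOSE_WAIT. */
        while (read(fd, buf, sizeof(buf)) > 0)
            ;
        /* Missing close(fd): netstat shows the connection in CLOSE_WAIT
         * until this process exits, which matches the symptom above. */
        pause();
        return 0;
    }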

Commits used:
=============
userspace-rcu: (oct12) 1fe734e wfcqueue: clarify locking usage
lttng-ust:     (oct09) 1c7b4a9 Fix: memcpy of string is larger than source
lttng-tools:   (oct02) 4dbc372 ABI with support for compat 32/64 bits
babeltrace:    (oct02) 052606b Document that list.h is LGPLv2.1, but entirely trivial

Scenario: 
========= 
  On Node-1:
    1a)_ netstat -etan
    1b)_ lttng-relayd -vvv -C tcp://0.0.0.0:51234 -D tcp://0.0.0.0:51235  &
    1c)_ Once the session has been started on Node-2:
         netstat -etan gives:
         Proto Recv-Q Send-Q Local Address     Foreign Address          State       User Inode
         tcp        0      0 0.0.0.0:51234      0.0.0.0:*               LISTEN      0    1625723
         tcp        0      0 0.0.0.0:51235      0.0.0.0:*               LISTEN      0    1625724
         tcp        0      0 192.168.0.1:51235  192.168.0.4:58137       ESTABLISHED 0    1625875
         tcp        0      0 192.168.0.1:51235  192.168.0.4:58139       ESTABLISHED 0    1625975
         tcp        0      0 192.168.0.1:51234  192.168.0.4:41174       ESTABLISHED 0    1625974
         tcp        0      0 192.168.0.1:51234  192.168.0.4:41172       ESTABLISHED 0    1625874

    1d)_ Once the session has been destroyed from Node-2,
         pkill lttng-relayd
         netstat -etan gives:
         Proto Recv-Q Send-Q Local Address     Foreign Address          State       User Inode
         tcp        0      0 192.168.0.1:51235  192.168.0.4:58137       FIN_WAIT2   0    1625875
         tcp        0      0 192.168.0.1:51235  192.168.0.4:58139       FIN_WAIT2   0    1625975
         tcp        0      0 192.168.0.1:51234  192.168.0.4:41174       FIN_WAIT2   0    1625974
         tcp        0      0 192.168.0.1:51234  192.168.0.4:41172       FIN_WAIT2   0    1625874

    1e)_ About 30 sec later:
         The above port pairs no longer appear in netstat
         (presumably reclaimed by the kernel's FIN_WAIT2 timeout).

  On Node-2:
    2a)_ <run an instrumented app>
    2b)_ netstat -etan
    2c)_ lttng create ses1
    2d)_ lttng enable-event an_event_from_the_running_instrumented_app -u
    2e)_ lttng enable-consumer -u -s ses1 -C tcp://192.168.0.1:51234 -D tcp://192.168.0.1:51235 -e
    2f)_ lttng start; sleep 20; lttng stop
    2g)_ netstat -etan
         Proto Recv-Q Send-Q Local Address     Foreign Address    State       User Inode
         tcp        0      0 192.168.0.4:1128  0.0.0.0:*          LISTEN      0    22611
         tcp        0      0 192.168.0.4:1130  0.0.0.0:*          LISTEN      0    20916
         tcp        0      0 192.168.0.4:1131  0.0.0.0:*          LISTEN      0    22141
         tcp        0      0 0.0.0.0:111       0.0.0.0:*          LISTEN      0    20739
         tcp        0      0 0.0.0.0:22        0.0.0.0:*          LISTEN      0    22237
         tcp        0      0 192.168.0.4:1022  0.0.0.0:*          LISTEN      0    20952
         tcp        0      0 192.168.0.4:41172 192.168.0.1:51234  ESTABLISHED 0    1474170
         tcp        0      0 192.168.0.4:58139 192.168.0.1:51235  ESTABLISHED 0    1477457
         tcp        0      0 192.168.0.4:41174 192.168.0.1:51234  ESTABLISHED 0    1477456
         tcp        0      0 192.168.0.4:58137 192.168.0.1:51235  ESTABLISHED 0    1477398

    2h)_ lttng destroy; lttng list 
    2i)_ From Node-1, kill the corresponding relayd.

    2j)_ netstat -etan
         Proto Recv-Q Send-Q Local Address     Foreign Address    State       User Inode
         tcp        0      0 192.168.0.4:1128  0.0.0.0:*          LISTEN      0    22611
         tcp        0      0 192.168.0.4:1130  0.0.0.0:*          LISTEN      0    20916
         tcp        0      0 192.168.0.4:1131  0.0.0.0:*          LISTEN      0    22141
         tcp        0      0 0.0.0.0:111       0.0.0.0:*          LISTEN      0    20739
         tcp        0      0 0.0.0.0:22        0.0.0.0:*          LISTEN      0    22237
         tcp        0      0 192.168.0.4:1022  0.0.0.0:*          LISTEN      0    20952
         tcp        0      0 192.168.0.4:41172 192.168.0.1:51234  CLOSE_WAIT  0    1474170
         tcp        0      0 192.168.0.4:58139 192.168.0.1:51235  CLOSE_WAIT  0    1477457
         tcp        0      0 192.168.0.4:41174 192.168.0.1:51234  CLOSE_WAIT  0    1477456
         tcp        0      0 192.168.0.4:58137 192.168.0.1:51235  CLOSE_WAIT  0    1477398

    2k)_ The above port pairs remain in CLOSE_WAIT forever (still there after 24 hrs).
         When a new session is created using the same -C and -D ports,
         a new set of port pairs is created, and they eventually hang in CLOSE_WAIT as well.
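
The fact that the sockets stay in CLOSE_WAIT after "lttng destroy" suggests the consumer side keeps its relayd control and data descriptors open once the session is gone. As an illustration only (the struct and function names below are hypothetical, not the actual lttng-tools types), the kind of per-session teardown that avoids this state looks like:

    #include <sys/socket.h>
    #include <unistd.h>

    /* Hypothetical per-session relayd connection state; the real
     * lttng-tools structures differ. */
    struct relayd_conn {
        int control_fd;     /* socket for -C tcp://...:51234 */
        int data_fd;        /* socket for -D tcp://...:51235 */
    };

    /* On session destroy, both descriptors must be shut down and
     * closed; otherwise the kernel parks them in CLOSE_WAIT as soon
     * as the remote relayd goes away. */
    void relayd_conn_teardown(struct relayd_conn *conn)
    {
        if (conn->control_fd >= 0) {
            shutdown(conn->control_fd, SHUT_RDWR);
            close(conn->control_fd);
            conn->control_fd = -1;
        }
        if (conn->data_fd >= 0) {
            shutdown(conn->data_fd, SHUT_RDWR);
            close(conn->data_fd);
            conn->data_fd = -1;
        }
    }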

#1 - Updated by David Goulet about 13 years ago

  • Status changed from New to Confirmed
  • Assignee set to David Goulet
  • Target version set to 2.1 stable

#2 - Updated by David Goulet about 13 years ago

Is this a showstopper for you right now?

If so, I can work on a quick patch to fix it today. Otherwise, I'll try to fix it this week.

Thanks!

#3 - Updated by Tan le tran about 13 years ago

Hi David,

Thank you for looking into this issue.

We would prefer a permanent fix; it is fine if it arrives later this week.
Currently we reboot the node as a workaround to clear all the hanging ports.

Regards,
Tan

#4 - Updated by David Goulet about 13 years ago

  • Status changed from Confirmed to Resolved
  • % Done changed from 0 to 100