Bug #892

lttng commands stuck when option --set-url is used to send the traces over network

Added by vinay kambam about 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Target version:
Start date: 07/07/2015
Due date:
% Done: 0%
Estimated time:

Description

lttng commands get stuck when the --set-url option is used to send traces over the network. We suspect that the relay daemon is unable to process requests from the session daemon. After we restarted the relay daemon, we have not seen the problem again.

I have a few questions regarding this:
- Does the relay daemon running on the remote machine have a limit on the number of connections it can handle?
- Does the relay daemon clean up the connection if the target on which the session daemon runs crashes unexpectedly?
- The netstat output shows packets queued on the relay daemon side; interestingly, there are no lttng sessions on the target.

The netstat output is as follows:

tcp 0 0 0.0.0.0:5342 0.0.0.0:* LISTEN
tcp 32 0 137.58.215.17:5342 10.74.59.170:36740 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:40543 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:43537 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:43347 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:45642 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:45433 ESTABLISHED
tcp 33 0 137.58.215.17:5342 10.74.58.20:57832 CLOSE_WAIT
tcp 32 0 137.58.215.17:5342 10.67.29.116:53648 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:54402 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:55397 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:34960 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:37913 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:43167 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:51673 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:48492 ESTABLISHED
tcp 32 0 137.58.215.17:5342 172.31.98.221:39518 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:47521 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:51916 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.58.20:48219 ESTABLISHED
tcp 33 0 137.58.215.17:5342 10.74.58.20:44167 CLOSE_WAIT
tcp 32 0 137.58.215.17:5342 10.74.59.170:47477 ESTABLISHED
tcp 32 0 137.58.215.17:5342 137.58.191.89:34615 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.67.30.52:50288 ESTABLISHED
tcp 32 0 137.58.215.17:5342 10.74.59.170:55543 ESTABLISHED
tcp 89 0 137.58.215.17:5342 10.74.58.20:37882 CLOSE_WAIT
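
To make the pattern in the output above easier to see, here is a minimal Python sketch that counts connections with unread data per TCP state and per peer host. It assumes the standard netstat column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), since the header line was not included in the paste. The `summarize` function and the `SAMPLE` constant are illustrative names only, and the sample is a subset of the lines above.

```python
from collections import Counter

# A few lines copied from the netstat output above (subset, for brevity).
SAMPLE = """\
tcp 0 0 0.0.0.0:5342 0.0.0.0:* LISTEN
tcp 32 0 137.58.215.17:5342 10.74.59.170:36740 ESTABLISHED
tcp 33 0 137.58.215.17:5342 10.74.58.20:57832 CLOSE_WAIT
tcp 89 0 137.58.215.17:5342 10.74.58.20:37882 CLOSE_WAIT
"""


def summarize(netstat_text):
    """Count connections with Recv-Q > 0, grouped by TCP state and by peer host."""
    by_state = Counter()
    by_peer = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto, Recv-Q, Send-Q, local address, foreign address, state.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        if int(fields[1]) == 0:
            continue  # nothing waiting to be read on this socket
        peer_host = fields[4].rsplit(":", 1)[0]  # strip the port
        by_state[fields[5]] += 1
        by_peer[peer_host] += 1
    return by_state, by_peer


by_state, by_peer = summarize(SAMPLE)
print(dict(by_state))
print(dict(by_peer))
```

A non-zero Recv-Q on every connection means that data has arrived but the listening process has not read it yet, which would be consistent with the relay daemon no longer servicing its sockets.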

- When the relay daemon is in this state, we are unable to create new sessions that send traces to the same daemon, and lttng commands get stuck.
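
As a stopgap on the client side only, a script that calls lttng could guard each call with a timeout so that it does not block forever. This is just a sketch and does not address whatever is wrong in the relay daemon. `run_with_timeout` is a hypothetical helper name, not part of LTTng.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it did not finish within timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising TimeoutExpired.
        return None
```

For example, `run_with_timeout(["lttng", "create", "demo-session", "--set-url=net://relay.example.com"], 30)` would return None if the command is still stuck after 30 seconds. The session name and relay host here are placeholders.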

In our case, the relay daemon had been started a few days earlier. It was working fine, collecting traces from sessions on different machines, and somehow it got into this "bad" state.

- Could this be because the relay daemon had been running for a few days?

We are using LTTng version 2.6:

  # which lttng
  /usr/bin/lttng
  # lttng --version
  lttng (LTTng Trace Control) 2.6.0 - Gaia - v2.6.0

Could you please help identify the root cause of this issue and suggest a workaround?


Files

relayd-logs.zip (21.5 KB): logs of the relay daemon run in verbose mode when the issue occurred (vinay kambam, 08/27/2015 05:15 AM)
