Bug #389

closed

API: lttng_health_check() keeps returning 1 when a session is created with event enabled

Added by Tan le tran over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Critical
Assignee:
Target version:
Start date: 10/30/2012
Due date:
% Done: 100%
Estimated time:

Description

Description: 
============
  When using the API lttng_health_check(LTTNG_HEALTH_CONSUMER), it keeps returning 1
  once a session has been created/started (after about 30 seconds).

Commit used: 
============ 
  userspace:   e7e6ff7 rculfhash test: fix trivial memleak and return node leak and errors
  lttng-ust:   1c7b4a9 Fix: memcpy of string is larger than source
  lttng-tools: dda67f6 Fix: Error handling when sending relayd sockets to consumer
  babeltrace:  6ca30a4 Cleanup: fix cppcheck warning

Scenario: 
========= 
  1)_ No instrumented application is running.
  2)_ In the background, we have a process that calls the API
      lttng_health_check(LTTNG_HEALTH_CONSUMER) every 5 seconds.
      If a non-zero value is returned, sessionD is restarted.
  3)_ lttng create test
  4)_ lttng enable-event "com*" -u

      About 30 seconds later (after 5 successful health checks),
      the health check returns 1, causing sessionD to be restarted
      by our background process.
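
The polling setup described in the scenario can be sketched roughly as below. This is only an illustration, not the reporter's actual watchdog: watchdog_run, fake_health_check, fake_restart and struct fake_daemon are hypothetical names, and the health check is passed in as a callback so that the block compiles without liblttng-ctl. In a real setup the callback would presumably wrap lttng_health_check(LTTNG_HEALTH_CONSUMER), and the restart callback would restart sessionD.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback types (not part of the LTTng API).
 * A health check returns 0 when healthy and non-zero otherwise. */
typedef int (*health_check_fn)(void *ctx);
typedef void (*restart_fn)(void *ctx);

/* Poll `check` up to `max_polls` times, sleeping `interval_sec` between polls.
 * On the first non-zero result, call `restart` and stop.
 * Returns the 1-based number of the poll that triggered the restart,
 * or 0 if every poll reported healthy. */
int watchdog_run(health_check_fn check, restart_fn restart, void *ctx,
                 int max_polls, unsigned int interval_sec)
{
    assert(check != NULL && restart != NULL);

    for (int poll = 1; poll <= max_polls; poll++) {
        if (check(ctx) != 0) {
            /* Policy from the report: any non-zero value means "BAD". */
            restart(ctx);
            return poll;
        }
        if (interval_sec > 0)
            sleep(interval_sec);
    }
    return 0;
}

/* Stand-in for the session daemon, used only to exercise the loop. */
struct fake_daemon {
    int polls_seen;     /* health checks performed so far */
    int healthy_polls;  /* checks that succeed before the fake starts failing */
    int restarts;       /* how many times the restart callback ran */
};

/* Assumption: mimics the reported symptom (N good checks, then 1). */
int fake_health_check(void *ctx)
{
    struct fake_daemon *d = ctx;
    d->polls_seen++;
    return d->polls_seen > d->healthy_polls ? 1 : 0;
}

void fake_restart(void *ctx)
{
    struct fake_daemon *d = ctx;
    d->restarts++;
}
```

With a fake daemon configured for 5 healthy polls, watchdog_run() triggers the restart on the 6th poll, which mirrors the "5 successful health checks, then 1" sequence reported here. Passing 5 as interval_sec would reproduce the 5-second polling period.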

We interpret lttng_health_check() as follows: whenever a non-zero value is returned,
a "BAD" situation (a non-recoverable error) has been encountered, and therefore we restart
SessionD to recover from that bad situation.

NOTE: When we do not run the process that performs the periodic health check,
      the session can be created, activated, and stopped, and a proper log can be
      obtained. That suggests that whatever problem the health check
      has encountered is not that severe, and therefore it should not return
      a failure in this case.

      This bug has been set to critical because it makes our trace environment
      completely unusable when running with the health check (which is a requirement
      for our product).


Files

bugX_HealthCheckConsummer_Failed.log (107 KB): log with lttng-sessiond -vvv --verbose-consumer. Tan le tran, 10/30/2012 10:28 PM
#1

Updated by David Goulet over 11 years ago

  • Status changed from New to In Progress
  • Assignee set to David Goulet
  • Target version set to 2.1 stable
#2

Updated by David Goulet over 11 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100