Project

General

Profile

Actions

Bug #1213

closed

Cannot print trace under chroot

Added by jinming liao almost 5 years ago. Updated over 2 years ago.

Status:
Invalid
Priority:
Normal
Assignee:
-
Target version:
Start date:
01/13/2020
Due date:
% Done:

0%

Estimated time:

Description

I want to use lttng under chroot, here is what I did:

  1. The root directory of chroot is / var / chroot.
  2. Install userspace-rcu-0.10.1-1.el7.x86_64.rpm, lttng-ust-2.10.3-1.el7.x86_64.rpm, lttng-tools-2.10.4-1.el7.x86_64.rpm And babeltrace-1.5.5-1.el7.x86_64.rpm to / var / chroot.
  3. Copy the corresponding dependent libraries to / var / chroot.
  4. mount --bind / dev / shm / var / chroot / dev / shm

This is fine at first, and can print the trace, but it won't work later, I don't know the reason. Here are the commands I executed under chroot:

lttng-sessiond --no-kernel --daemonize
lttng create
lttng enable-event -u sal _ *, sql *
lttng add-context --type = vpid --type = pthread_id -u
lttng start
lttng stop
lttng view

And I found that if lttng add-context --type = vpid --type = pthread_id -u is not executed, the trace can be printed.

I read the source code of lttng-ust, and printed some logs, and found that the problem was in the frontend_api.h: lib_ring_buffer_reserve function:

static inline
int lib_ring_buffer_reserve (const struct lttng_ust_lib_ring_buffer_config * config,
          struct lttng_ust_lib_ring_buffer_ctx * ctx)
{
  struct channel * chan = ctx-> chan;
  struct lttng_ust_shm_handle * handle = ctx-> handle;
  struct lttng_ust_lib_ring_buffer * buf;
  unsigned long o_begin, o_end, o_old;
  size_t before_hdr_pad = 0;

  if (caa_unlikely (uatomic_read (& chan-> record_disabled)))
    return -EAGAIN;

  if (config-> alloc == RING_BUFFER_ALLOC_PER_CPU)
    buf = shmp (handle, chan-> backend.buf [ctx-> cpu] .shmp);
  else
    buf = shmp (handle, chan-> backend.buf [0] .shmp);
  if (caa_unlikely (! buf))
    return -EIO;
  if (caa_unlikely (uatomic_read (& buf-> record_disabled)))
    return -EAGAIN;
  ctx-> buf = buf;
    ...

I found that the function would return in if (caa_unlikely (! Buf)) and if (caa_unlikely (uatomic_read (& buf-> record_disabled))), resulting in the code that generates the trace cannot be reached, so the trace cannot be printed.

I researched for a while, but I couldn't find the reason, so I came here for help.


Files

log.txt (133 KB) log.txt lttng-sessiond's log jinming liao, 01/14/2020 07:02 AM
hello.log (25.6 KB) hello.log jinming liao, 01/14/2020 10:35 PM
lttng.log (134 KB) lttng.log jinming liao, 01/14/2020 10:37 PM
Actions #1

Updated by Jonathan Rajotte Julien almost 5 years ago

  • Status changed from New to Feedback
  • Priority changed from High to Normal

Hi,

This seems to be a failure on add context that prevent buffers allocation. Let's take a step back and look into the lttng-sessiond process and logs. Errors management might be lacking in these part.

Starts the lttng-sessiond daemon manually with the verbose options:

lttng-sessiond --no-kernel --no-kernel -vvv 2>&1 | tee log.txt &

Run the same commands and upload the content of log.txt.

Cheers

Actions #2

Updated by jinming liao almost 5 years ago

If chroot is not used, the trace can be printed. By executing df -h, I found that the size of /dev/shm is 81M. With chroot, the size of /dev/shm is only 11M, and sometimes smaller.

Therefore, I think it may be that insufficient shared memory is requested under the chroot.

Actions #3

Updated by Jonathan Rajotte Julien almost 5 years ago

Sure that might be it but the log from the lttng-sessiond deamon should point you toward something.

Also I forgot, user the --verbose-consumer option also:

lttng-sessiond --no-kernel --no-kernel --verbose-consumer -vvv 2>&1 | tee log.txt &
Actions #4

Updated by jinming liao almost 5 years ago

Sorry, because I am in the company, I cannot upload the entire log file directly.

I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application and got the logs according to the method you provided, I found this Error, consistent with the code I analyzed:

Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer () at ust-consumer.c: 2570)

Forgive me for not being able to upload the entire log file in the company. At night I will upload the entire log file. Here are some of the logs.

DEBUG1 - 16:19:25.410466269 [19216/19222]: Channel poll wait (in consumer_thread_channel_poll() at consumer.c:2875)
DEBUG1 - 16:19:25.410488273 [19216/19224]: fd 38 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410506395 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410509190 [19216/19224]: fd 40 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410511743 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410514269 [19216/19224]: polling on 6 fd (in consumer_thread_data_poll() at consumer.c:2544)
DEBUG1 - 16:19:25.410518047 [19216/19224]: poll num_rdy : 4 (in consumer_thread_data_poll() at consumer.c:2551)
DEBUG1 - 16:19:25.410520395 [19216/19224]: Normal read on fd 38 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410522908 [19216/19224]: In UST read_subbuffer (wait_fd: 38, name: channel0_0) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410526249 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410528692 [19216/19224]: Normal read on fd 40 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410530931 [19216/19224]: In UST read_subbuffer (wait_fd: 40, name: channel0_1) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410533448 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410535804 [19216/19224]: fd 42 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410538134 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410540384 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410544709 [19216/19224]: Consumer stream destroy monitored key: 38 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410559396 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410562144 [19216/19224]: Consumer stream destroy monitored key: 40 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410569506 [19216/19224]: fd 44 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410571873 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410575023 [19216/19224]: Updating poll fd array (in update_poll_array() at consumer.c:1091)
DEBUG1 - 16:19:25.410578087 [19216/19224]: polling on 4 fd (in consumer_thread_data_poll() at consumer.c:2544)
DEBUG1 - 16:19:25.410581009 [19216/19224]: poll num_rdy : 2 (in consumer_thread_data_poll() at consumer.c:2551)
DEBUG1 - 16:19:25.410583307 [19216/19224]: Normal read on fd 42 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410585591 [19216/19224]: In UST read_subbuffer (wait_fd: 42, name: channel0_2) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410588001 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410590231 [19216/19224]: Normal read on fd 44 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410592384 [19216/19224]: In UST read_subbuffer (wait_fd: 44, name: channel0_3) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410594829 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410597023 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410599553 [19216/19224]: Consumer stream destroy monitored key: 42 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410606620 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410609179 [19216/19224]: Consumer stream destroy monitored key: 44 (in destroy_close_stream() at consumer-stream.c:265)
Actions #5

Updated by jinming liao almost 5 years ago

I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application.

I execute commands in a chroot as follows:

lttng-sessiond --no-kernel --verbose-consumer -vvv 2>&1 | tee /tmp/log.txt
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
lttng destroy

Attached file is the complete log, and found from the log: "Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11]". The return value is the same as the return value of the lib_ring_buffer_reserve function -EAGAIN. -EAGAIN means: channel is disabled, did the create channel fail?

Actions #6

Updated by Jonathan Rajotte Julien almost 5 years ago

Please rerun you experiment using LTTNG_UST_DEBUG=y while running the instrumented application.

LTTNG_UST_DEBUG=y ./a.out

And send us both the lttng-sessiond log and the log from the application.

Updated by jinming liao almost 5 years ago

The command I executed is as follows:

  1. Run the hello program:
LTTNG_UST_DEBUG=y ./hello world and beyond 2>&1 | tee /tmp/hello.log
  1. Run lttng:
lttng-sessiond --no-kernel --verbose-consumer -vvv 2?&1 | tee /tmp/lttng.log
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
  1. Go back to the running hello application and press Enter.
  1. destroy lttng
lttng destroy

Attached files are the complete log.

Actions #8

Updated by Jonathan Rajotte Julien almost 5 years ago

Hi,

What does nproc print under your chroot? Is it the same count as on the physical machine?

Cheers

Actions #9

Updated by jinming liao almost 5 years ago

Under my chroot, there is no nproc command because I did not copy it to the chroot.

In addition, your reminder made me discover a problem point: I am using a container. If chroot is not used, the result of nproc is the same as the host. In addition, I executed: mount --bind /proc /var/chroot/proc.

Actions #10

Updated by Michael Jeanson over 2 years ago

  • Status changed from Feedback to Invalid
Actions

Also available in: Atom PDF