Project

General

Profile

Actions

Bug #1213

closed
JL

Cannot print trace under chroot

Bug #1213: Cannot print trace under chroot

Added by jinming liao almost 6 years ago. Updated almost 4 years ago.

Status:
Invalid
Priority:
Normal
Assignee:
-
Target version:
Start date:
01/13/2020
Due date:
% Done:

0%

Estimated time:

Description

I want to use lttng under chroot, here is what I did:

  1. The root directory of chroot is / var / chroot.
  2. Install userspace-rcu-0.10.1-1.el7.x86_64.rpm, lttng-ust-2.10.3-1.el7.x86_64.rpm, lttng-tools-2.10.4-1.el7.x86_64.rpm And babeltrace-1.5.5-1.el7.x86_64.rpm to / var / chroot.
  3. Copy the corresponding dependent libraries to / var / chroot.
  4. mount --bind / dev / shm / var / chroot / dev / shm

This is fine at first, and can print the trace, but it won't work later, I don't know the reason. Here are the commands I executed under chroot:

lttng-sessiond --no-kernel --daemonize
lttng create
lttng enable-event -u sal _ *, sql *
lttng add-context --type = vpid --type = pthread_id -u
lttng start
lttng stop
lttng view

And I found that if lttng add-context --type = vpid --type = pthread_id -u is not executed, the trace can be printed.

I read the source code of lttng-ust, and printed some logs, and found that the problem was in the frontend_api.h: lib_ring_buffer_reserve function:

static inline
int lib_ring_buffer_reserve (const struct lttng_ust_lib_ring_buffer_config * config,
          struct lttng_ust_lib_ring_buffer_ctx * ctx)
{
  struct channel * chan = ctx-> chan;
  struct lttng_ust_shm_handle * handle = ctx-> handle;
  struct lttng_ust_lib_ring_buffer * buf;
  unsigned long o_begin, o_end, o_old;
  size_t before_hdr_pad = 0;

  if (caa_unlikely (uatomic_read (& chan-> record_disabled)))
    return -EAGAIN;

  if (config-> alloc == RING_BUFFER_ALLOC_PER_CPU)
    buf = shmp (handle, chan-> backend.buf [ctx-> cpu] .shmp);
  else
    buf = shmp (handle, chan-> backend.buf [0] .shmp);
  if (caa_unlikely (! buf))
    return -EIO;
  if (caa_unlikely (uatomic_read (& buf-> record_disabled)))
    return -EAGAIN;
  ctx-> buf = buf;
    ...

I found that the function would return in if (caa_unlikely (! Buf)) and if (caa_unlikely (uatomic_read (& buf-> record_disabled))), resulting in the code that generates the trace cannot be reached, so the trace cannot be printed.

I researched for a while, but I couldn't find the reason, so I came here for help.


Files

log.txt (133 KB) log.txt lttng-sessiond's log jinming liao, 01/14/2020 07:02 AM
hello.log (25.6 KB) hello.log jinming liao, 01/14/2020 10:35 PM
lttng.log (134 KB) lttng.log jinming liao, 01/14/2020 10:37 PM

JR Updated by Jonathan Rajotte Julien almost 6 years ago Actions #1

  • Status changed from New to Feedback
  • Priority changed from High to Normal

Hi,

This seems to be a failure on add context that prevent buffers allocation. Let's take a step back and look into the lttng-sessiond process and logs. Errors management might be lacking in these part.

Starts the lttng-sessiond daemon manually with the verbose options:

lttng-sessiond --no-kernel --no-kernel -vvv 2>&1 | tee log.txt &

Run the same commands and upload the content of log.txt.

Cheers

JL Updated by jinming liao almost 6 years ago Actions #2

If chroot is not used, the trace can be printed. By executing df -h, I found that the size of /dev/shm is 81M. With chroot, the size of /dev/shm is only 11M, and sometimes smaller.

Therefore, I think it may be that insufficient shared memory is requested under the chroot.

JR Updated by Jonathan Rajotte Julien almost 6 years ago Actions #3

Sure that might be it but the log from the lttng-sessiond deamon should point you toward something.

Also I forgot, user the --verbose-consumer option also:

lttng-sessiond --no-kernel --no-kernel --verbose-consumer -vvv 2>&1 | tee log.txt &

JL Updated by jinming liao almost 6 years ago Actions #4

Sorry, because I am in the company, I cannot upload the entire log file directly.

I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application and got the logs according to the method you provided, I found this Error, consistent with the code I analyzed:

Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer () at ust-consumer.c: 2570)

Forgive me for not being able to upload the entire log file in the company. At night I will upload the entire log file. Here are some of the logs.

DEBUG1 - 16:19:25.410466269 [19216/19222]: Channel poll wait (in consumer_thread_channel_poll() at consumer.c:2875)
DEBUG1 - 16:19:25.410488273 [19216/19224]: fd 38 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410506395 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410509190 [19216/19224]: fd 40 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410511743 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410514269 [19216/19224]: polling on 6 fd (in consumer_thread_data_poll() at consumer.c:2544)
DEBUG1 - 16:19:25.410518047 [19216/19224]: poll num_rdy : 4 (in consumer_thread_data_poll() at consumer.c:2551)
DEBUG1 - 16:19:25.410520395 [19216/19224]: Normal read on fd 38 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410522908 [19216/19224]: In UST read_subbuffer (wait_fd: 38, name: channel0_0) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410526249 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410528692 [19216/19224]: Normal read on fd 40 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410530931 [19216/19224]: In UST read_subbuffer (wait_fd: 40, name: channel0_1) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410533448 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410535804 [19216/19224]: fd 42 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410538134 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410540384 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410544709 [19216/19224]: Consumer stream destroy monitored key: 38 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410559396 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410562144 [19216/19224]: Consumer stream destroy monitored key: 40 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410569506 [19216/19224]: fd 44 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683)
DEBUG1 - 16:19:25.410571873 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410575023 [19216/19224]: Updating poll fd array (in update_poll_array() at consumer.c:1091)
DEBUG1 - 16:19:25.410578087 [19216/19224]: polling on 4 fd (in consumer_thread_data_poll() at consumer.c:2544)
DEBUG1 - 16:19:25.410581009 [19216/19224]: poll num_rdy : 2 (in consumer_thread_data_poll() at consumer.c:2551)
DEBUG1 - 16:19:25.410583307 [19216/19224]: Normal read on fd 42 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410585591 [19216/19224]: In UST read_subbuffer (wait_fd: 42, name: channel0_2) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410588001 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410590231 [19216/19224]: Normal read on fd 44 (in consumer_thread_data_poll() at consumer.c:2658)
DEBUG1 - 16:19:25.410592384 [19216/19224]: In UST read_subbuffer (wait_fd: 44, name: channel0_3) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519)
DEBUG1 - 16:19:25.410594829 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570)
DEBUG1 - 16:19:25.410597023 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410599553 [19216/19224]: Consumer stream destroy monitored key: 42 (in destroy_close_stream() at consumer-stream.c:265)
DEBUG1 - 16:19:25.410606620 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694)
DEBUG1 - 16:19:25.410609179 [19216/19224]: Consumer stream destroy monitored key: 44 (in destroy_close_stream() at consumer-stream.c:265)

JL Updated by jinming liao almost 6 years ago Actions #5

I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application.

I execute commands in a chroot as follows:

lttng-sessiond --no-kernel --verbose-consumer -vvv 2>&1 | tee /tmp/log.txt
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
lttng destroy

Attached file is the complete log, and found from the log: "Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11]". The return value is the same as the return value of the lib_ring_buffer_reserve function -EAGAIN. -EAGAIN means: channel is disabled, did the create channel fail?

JR Updated by Jonathan Rajotte Julien almost 6 years ago Actions #6

Please rerun you experiment using LTTNG_UST_DEBUG=y while running the instrumented application.

LTTNG_UST_DEBUG=y ./a.out

And send us both the lttng-sessiond log and the log from the application.

JL Updated by jinming liao almost 6 years ago Actions #7

The command I executed is as follows:

  1. Run the hello program:
LTTNG_UST_DEBUG=y ./hello world and beyond 2>&1 | tee /tmp/hello.log
  1. Run lttng:
lttng-sessiond --no-kernel --verbose-consumer -vvv 2?&1 | tee /tmp/lttng.log
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
  1. Go back to the running hello application and press Enter.
  1. destroy lttng
lttng destroy

Attached files are the complete log.

JR Updated by Jonathan Rajotte Julien almost 6 years ago Actions #8

Hi,

What does nproc print under your chroot? Is it the same count as on the physical machine?

Cheers

JL Updated by jinming liao almost 6 years ago Actions #9

Under my chroot, there is no nproc command because I did not copy it to the chroot.

In addition, your reminder made me discover a problem point: I am using a container. If chroot is not used, the result of nproc is the same as the host. In addition, I executed: mount --bind /proc /var/chroot/proc.

MJ Updated by Michael Jeanson almost 4 years ago Actions #10

  • Status changed from Feedback to Invalid
Actions

Also available in: PDF Atom