Bug #1213
closedCannot print trace under chroot
0%
Description
I want to use lttng under chroot, here is what I did:
- The root directory of chroot is / var / chroot.
- Install userspace-rcu-0.10.1-1.el7.x86_64.rpm, lttng-ust-2.10.3-1.el7.x86_64.rpm, lttng-tools-2.10.4-1.el7.x86_64.rpm And babeltrace-1.5.5-1.el7.x86_64.rpm to / var / chroot.
- Copy the corresponding dependent libraries to / var / chroot.
- mount --bind / dev / shm / var / chroot / dev / shm
This is fine at first, and can print the trace, but it won't work later, I don't know the reason. Here are the commands I executed under chroot:
lttng-sessiond --no-kernel --daemonize
lttng create
lttng enable-event -u sal _ *, sql *
lttng add-context --type = vpid --type = pthread_id -u
lttng start
lttng stop
lttng view
And I found that if lttng add-context --type = vpid --type = pthread_id -u is not executed, the trace can be printed.
I read the source code of lttng-ust, and printed some logs, and found that the problem was in the frontend_api.h: lib_ring_buffer_reserve function:
static inline
int lib_ring_buffer_reserve (const struct lttng_ust_lib_ring_buffer_config * config,
struct lttng_ust_lib_ring_buffer_ctx * ctx)
{
struct channel * chan = ctx-> chan;
struct lttng_ust_shm_handle * handle = ctx-> handle;
struct lttng_ust_lib_ring_buffer * buf;
unsigned long o_begin, o_end, o_old;
size_t before_hdr_pad = 0;
if (caa_unlikely (uatomic_read (& chan-> record_disabled)))
return -EAGAIN;
if (config-> alloc == RING_BUFFER_ALLOC_PER_CPU)
buf = shmp (handle, chan-> backend.buf [ctx-> cpu] .shmp);
else
buf = shmp (handle, chan-> backend.buf [0] .shmp);
if (caa_unlikely (! buf))
return -EIO;
if (caa_unlikely (uatomic_read (& buf-> record_disabled)))
return -EAGAIN;
ctx-> buf = buf;
...
I found that the function would return in if (caa_unlikely (! Buf)) and if (caa_unlikely (uatomic_read (& buf-> record_disabled))), resulting in the code that generates the trace cannot be reached, so the trace cannot be printed.
I researched for a while, but I couldn't find the reason, so I came here for help.
Files
Updated by Jonathan Rajotte Julien almost 5 years ago
- Status changed from New to Feedback
- Priority changed from High to Normal
Hi,
This seems to be a failure on add context that prevent buffers allocation. Let's take a step back and look into the lttng-sessiond process and logs. Errors management might be lacking in these part.
Starts the lttng-sessiond daemon manually with the verbose options:
lttng-sessiond --no-kernel --no-kernel -vvv 2>&1 | tee log.txt &
Run the same commands and upload the content of log.txt.
Cheers
Updated by jinming liao almost 5 years ago
If chroot is not used, the trace can be printed. By executing df -h, I found that the size of /dev/shm is 81M. With chroot, the size of /dev/shm is only 11M, and sometimes smaller.
Therefore, I think it may be that insufficient shared memory is requested under the chroot.
Updated by Jonathan Rajotte Julien almost 5 years ago
Sure that might be it but the log from the lttng-sessiond deamon should point you toward something.
Also I forgot, user the --verbose-consumer option also:
lttng-sessiond --no-kernel --no-kernel --verbose-consumer -vvv 2>&1 | tee log.txt &
Updated by jinming liao almost 5 years ago
Sorry, because I am in the company, I cannot upload the entire log file directly.
I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application and got the logs according to the method you provided, I found this Error, consistent with the code I analyzed:
Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer () at ust-consumer.c: 2570)
Forgive me for not being able to upload the entire log file in the company. At night I will upload the entire log file. Here are some of the logs.
DEBUG1 - 16:19:25.410466269 [19216/19222]: Channel poll wait (in consumer_thread_channel_poll() at consumer.c:2875) DEBUG1 - 16:19:25.410488273 [19216/19224]: fd 38 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683) DEBUG1 - 16:19:25.410506395 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410509190 [19216/19224]: fd 40 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683) DEBUG1 - 16:19:25.410511743 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410514269 [19216/19224]: polling on 6 fd (in consumer_thread_data_poll() at consumer.c:2544) DEBUG1 - 16:19:25.410518047 [19216/19224]: poll num_rdy : 4 (in consumer_thread_data_poll() at consumer.c:2551) DEBUG1 - 16:19:25.410520395 [19216/19224]: Normal read on fd 38 (in consumer_thread_data_poll() at consumer.c:2658) DEBUG1 - 16:19:25.410522908 [19216/19224]: In UST read_subbuffer (wait_fd: 38, name: channel0_0) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519) DEBUG1 - 16:19:25.410526249 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570) DEBUG1 - 16:19:25.410528692 [19216/19224]: Normal read on fd 40 (in consumer_thread_data_poll() at consumer.c:2658) DEBUG1 - 16:19:25.410530931 [19216/19224]: In UST read_subbuffer (wait_fd: 40, name: channel0_1) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519) DEBUG1 - 16:19:25.410533448 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570) DEBUG1 - 16:19:25.410535804 [19216/19224]: fd 42 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683) DEBUG1 - 16:19:25.410538134 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410540384 [19216/19224]: Polling fd 38 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410544709 [19216/19224]: Consumer stream destroy monitored key: 38 (in destroy_close_stream() at consumer-stream.c:265) DEBUG1 - 16:19:25.410559396 [19216/19224]: Polling fd 40 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410562144 [19216/19224]: Consumer stream destroy monitored key: 40 (in destroy_close_stream() at consumer-stream.c:265) DEBUG1 - 16:19:25.410569506 [19216/19224]: fd 44 is hup|err|nval. Attempting flush and read. (in consumer_thread_data_poll() at consumer.c:2683) DEBUG1 - 16:19:25.410571873 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410575023 [19216/19224]: Updating poll fd array (in update_poll_array() at consumer.c:1091) DEBUG1 - 16:19:25.410578087 [19216/19224]: polling on 4 fd (in consumer_thread_data_poll() at consumer.c:2544) DEBUG1 - 16:19:25.410581009 [19216/19224]: poll num_rdy : 2 (in consumer_thread_data_poll() at consumer.c:2551) DEBUG1 - 16:19:25.410583307 [19216/19224]: Normal read on fd 42 (in consumer_thread_data_poll() at consumer.c:2658) DEBUG1 - 16:19:25.410585591 [19216/19224]: In UST read_subbuffer (wait_fd: 42, name: channel0_2) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519) DEBUG1 - 16:19:25.410588001 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570) DEBUG1 - 16:19:25.410590231 [19216/19224]: Normal read on fd 44 (in consumer_thread_data_poll() at consumer.c:2658) DEBUG1 - 16:19:25.410592384 [19216/19224]: In UST read_subbuffer (wait_fd: 44, name: channel0_3) (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2519) DEBUG1 - 16:19:25.410594829 [19216/19224]: Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11] (in lttng_ustconsumer_read_subbuffer() at ust-consumer.c:2570) DEBUG1 - 16:19:25.410597023 [19216/19224]: Polling fd 42 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410599553 [19216/19224]: Consumer stream destroy monitored key: 42 (in destroy_close_stream() at consumer-stream.c:265) DEBUG1 - 16:19:25.410606620 [19216/19224]: Polling fd 44 tells it has hung up. (in consumer_thread_data_poll() at consumer.c:2694) DEBUG1 - 16:19:25.410609179 [19216/19224]: Consumer stream destroy monitored key: 44 (in destroy_close_stream() at consumer-stream.c:265)
Updated by jinming liao almost 5 years ago
I used the example listed here: https://lttng.org/docs/v2.11/#doc-tracing-your-own-user-application.
I execute commands in a chroot as follows:
lttng-sessiond --no-kernel --verbose-consumer -vvv 2>&1 | tee /tmp/log.txt
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
lttng destroy
Attached file is the complete log, and found from the log: "Reserving sub buffer failed (everything is normal, it is due to concurrency) [ret: -11]". The return value is the same as the return value of the lib_ring_buffer_reserve function -EAGAIN. -EAGAIN means: channel is disabled, did the create channel fail?
Updated by Jonathan Rajotte Julien almost 5 years ago
Please rerun you experiment using LTTNG_UST_DEBUG=y while running the instrumented application.
LTTNG_UST_DEBUG=y ./a.out
And send us both the lttng-sessiond log and the log from the application.
Updated by jinming liao almost 5 years ago
The command I executed is as follows:
- Run the hello program:
LTTNG_UST_DEBUG=y ./hello world and beyond 2>&1 | tee /tmp/hello.log
- Run lttng:
lttng-sessiond --no-kernel --verbose-consumer -vvv 2?&1 | tee /tmp/lttng.log
lttng create
lttng enable-event -u hello_world:my_first_tracepoint
lttng add-context --type=vpid --type=pthread_id -u
lttng start
- Go back to the running hello application and press Enter.
- destroy lttng
lttng destroy
Attached files are the complete log.
Updated by Jonathan Rajotte Julien almost 5 years ago
Hi,
What does nproc print under your chroot? Is it the same count as on the physical machine?
Cheers
Updated by jinming liao almost 5 years ago
Under my chroot, there is no nproc command because I did not copy it to the chroot.
In addition, your reminder made me discover a problem point: I am using a container. If chroot is not used, the result of nproc is the same as the host. In addition, I executed: mount --bind /proc /var/chroot/proc.
Updated by Michael Jeanson over 2 years ago
- Status changed from Feedback to Invalid