Bug #1296

closed

tracepoint symbols should have public visibility

Added by Jeff Trull almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
Start date: 12/18/2020
Due date:
% Done: 100%
Estimated time:

Description

I have an application with two providers: one from a shared library, the other from the main program. I followed the build instructions in the documentation; everything links and runs OK.

The problem is that when I run "lttng list --userspace" I see only the events from the provider in the shared library. The debug log has some interesting information (my providers are HPX and HPX_ALG):

...
Provider "HPX" accepted, version 1.0 is compatible with...
...
just registered a tracepoints section from ...
registered tracepoint: HPX:thread_create_highprio
...

etc. (the HPX provider's tracepoints are shown)
...
just registered a tracepoints section from ...
...

(BOTH the HPX and HPX_ALG provider tracepoints are shown). Note that only HPX is shown as an "accepted" provider, and there are two tracepoint sections, with the HPX ones duplicated.

Desired result: lttng list --userspace shows events from both HPX and HPX_ALG providers
Actual result: only HPX (the shared library) events are shown.

I'm pretty much certain this is my bug but I'm stumped. I'm hoping the odd symptoms displayed in my debug log will point to the underlying issue.


Files

lttng_debug.log (10.2 KB), LTTNG_UST_DEBUG=1 output from app, Jeff Trull, 12/18/2020 12:53 PM
tracepoints.h (542 Bytes), Jeff Trull, 12/18/2020 03:07 PM
0001-Fix-Use-default-visibility-for-tracepoint-provider-s.patch (1.56 KB), Mathieu Desnoyers, 12/27/2020 03:00 PM
Actions #1

Updated by Mathieu Desnoyers almost 4 years ago

Can you show us, for each compile unit where the HPX_ALG tracepoint header file is included, exactly how it is included?

This should show whether TRACEPOINT_DEFINE and/or TRACEPOINT_CREATE_PROBES and/or _LGPL_SOURCE and/or TRACEPOINT_PROBE_DYNAMIC_LINKAGE is defined before its inclusion for each compile unit.

Then providing a bit of context showing how each compile unit ends up being linked into the application would be useful.

Thanks,

Mathieu

Actions #2

Updated by Mathieu Desnoyers almost 4 years ago

  • Status changed from New to Feedback
Actions #3

Updated by Jeff Trull almost 4 years ago

Sure. The HPX_ALG header file is attached. It is included in two source files: the tracepoint provider, which looks like this:

#define TRACEPOINT_CREATE_PROBES

// include header
#include "tracepoints.h" 

and the main app, which looks like this:

#define TRACEPOINT_DEFINE
#define TRACEPOINT_PROBE_DYNAMIC_LINKAGE
...
other header files, including one that pulls in the HPX provider
#include "tracepoints.h" 
...
rest of the code, including calls to tracepoint(HPX_ALG, ...)

The order of the HPX vs. HPX_ALG inclusion doesn't seem to affect the result, FWIW.

The HPX provider, and a bunch of code that uses its events, are linked in as a shared library. The HPX_ALG provider is also a shared library, for consistency.
The main application makes calls to both providers' events.

Actions #4

Updated by Mathieu Desnoyers almost 4 years ago

How is the shared library containing the HPX_ALG provider linked into the application? Is it statically linked (.a) or dynamically loaded (.so)?

Actions #5

Updated by Mathieu Desnoyers almost 4 years ago

If the probe provider for HPX_ALG is statically linked, then TRACEPOINT_PROBE_DYNAMIC_LINKAGE should not be defined at the point where the HPX_ALG header is included after TRACEPOINT_DEFINE.

If in your case you have e.g. one provider dynamically loaded (HPX), and one provider statically linked (HPX_ALG), you would need to do something like this in your app:

#define TRACEPOINT_DEFINE
#include "tracepoints.h"

and in tracepoints.h:

/* This probe is dynamically loaded */
#define TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#include "tracepoint-hpx.h"

/* This probe is statically linked */
#undef TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#include "tracepoint-hpx-alg.h"

This is explained by this comment in include/lttng/tracepoint.h:

/*
 * When TRACEPOINT_PROBE_DYNAMIC_LINKAGE is defined, we do not emit a
 * unresolved symbol that requires the provider to be linked in. When
 * TRACEPOINT_PROBE_DYNAMIC_LINKAGE is not defined, we emit an
 * unresolved symbol that depends on having the provider linked in,
 * otherwise the linker complains. This deals with use of static
 * libraries, ensuring that the linker does not remove the provider
 * object from the executable.
 */

I suspect that because you have TRACEPOINT_PROBE_DYNAMIC_LINKAGE defined in a situation
where a probe provider is statically linked, the linker discards all the content of the
probe because there is no dependency on its symbols.

Actions #6

Updated by Jeff Trull almost 4 years ago

Yes, the shared library containing the HPX_ALG provider is... a .so :) - I probably misunderstand your question - anyway that's an interesting hint; I can see from ldd that only the HPX provider is listed as a runtime loaded library. Does that mean it's a shared library that's statically linked?

Anyway adding #undef TRACEPOINT_PROBE_DYNAMIC_LINKAGE before the HPX_ALG inclusion doesn't change the result.

I also just now tried disabling LTO, in case that was removing symbols. No help.

Are there any hints in the symbols in the binaries that might help explain what's going on? I can mess around with nm and see what's there and the visibility.

Actions #7

Updated by Jeff Trull almost 4 years ago

Like, maybe there are some clues in the libraries that can explain why there are two tracepoint section messages.

Actions #8

Updated by Jeff Trull almost 4 years ago

Update: I can get the application to accept the HPX_ALG provider by either of these two hacks:
1) Add an artificial function dependency inside the tracepoint provider so the linker records it and loads the provider .so
2) Use LD_PRELOAD to force loading the provider .so

Without either of those, the linker drops the HPX_ALG provider's .so
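
Hack 1 can be sketched roughly as follows. This is an illustration only: the names `hpx_alg_provider_anchor` and `app_reference_provider` are hypothetical and do not come from the MWE. In a real build the first function would live in the provider's shared library and the second in the application; both are shown in one file here so the snippet compiles on its own.

```c
/*
 * Provider side (would be compiled into the HPX_ALG provider .so).
 * Hypothetical anchor function: it does nothing useful, it only gives
 * the application a symbol to reference.
 */
int hpx_alg_provider_anchor(void)
{
    return 1;
}

/*
 * Application side. Calling the anchor creates an undefined reference
 * that only the provider library can satisfy, so the link editor has a
 * reason to record that library as a dependency of the executable.
 */
int app_reference_provider(void)
{
    return hpx_alg_provider_anchor();
}
```

Calling `app_reference_provider()` once during startup would be enough; the return value is irrelevant and only exists so the call is not trivially optimized away.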

Actions #9

Updated by Jeff Trull almost 4 years ago

I've managed to create a nearly-minimal MWE for this issue: https://github.com/jefftrull/lttng_1296. It's very little code but illustrates the problem.

Actions #10

Updated by Mathieu Desnoyers almost 4 years ago

Quick comment (I did not look at the example yet):

You won't need to define TRACEPOINT_PROBE_DYNAMIC_LINKAGE, I think. Leaving it undefined keeps the dependency on the provider in place. TRACEPOINT_PROBE_DYNAMIC_LINKAGE is only for use-cases where you want to LD_PRELOAD your probe provider. If you want to link to it with -lmyprovider you should not use TRACEPOINT_PROBE_DYNAMIC_LINKAGE.

Actions #11

Updated by Jeff Trull almost 4 years ago

I see... definitely a misconception on my part. Well, removing that makes my MWE work - but my original problem changes to a link error. Probably that is why I added TRACEPOINT_PROBE_DYNAMIC_LINKAGE in the first place.

Without it, in my original code I get a bunch of undefined references to `__tracepoint_provider_HPX' when I link the main program.

Actions #12

Updated by Jeff Trull almost 4 years ago

More on those linker errors... The HPX_ALG provider (built in its own library as part of my app) looks like this via nm:

00000000000041a0 B __tracepoint_provider_HPX_ALG
0000000000001591 t __tracepoint_provider_check_HPX_ALG()
000000000000158a t __tracepoint_provider_mismatch_HPX_ALG()

The HPX provider looks like this in its own .so:

0000000001991ea0 b __tracepoint_provider_HPX
00000000010ab666 t __tracepoint_provider_check_HPX()
00000000010ab65f t __tracepoint_provider_mismatch_HPX()

However, fixing the visibility with objcopy so it matches doesn't change the link errors, to my surprise.

Actions #13

Updated by Jeff Trull almost 4 years ago

I think I have it. The HPX library is built with -fvisibility=hidden, and this causes the tracepoint provider symbols to be both "local" (not externally visible) and "normal" (that is, not "dynamic" and so unavailable for linking out of shared libraries). The latter problem is the reason I could not fix things with objcopy --globalize-symbol.

I have updated my MWE to demonstrate the issue; it may be appropriate to change the title of this bug to "tracepoint symbols should have public visibility", as I think you could support libraries built this way with some extra attributes at the right points.
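
To illustrate the effect described above, here is a minimal sketch of how individual symbols can be kept exported from a library compiled with -fvisibility=hidden, using the GCC/Clang visibility attribute. It is neither the attached patch nor lttng-ust code; `EXAMPLE_EXPORT`, `example_provider_symbol`, `example_internal_helper`, and `example_public_entry` are hypothetical names chosen for the example.

```c
/* Hypothetical export macro; GCC and Clang both define __GNUC__. */
#if defined(__GNUC__)
#define EXAMPLE_EXPORT __attribute__((visibility("default")))
#else
#define EXAMPLE_EXPORT
#endif

/*
 * Explicitly marked with default visibility, so it remains available
 * to other objects even when the file is built with -fvisibility=hidden.
 */
EXAMPLE_EXPORT int example_provider_symbol = 0;

/*
 * No attribute: under -fvisibility=hidden this function becomes hidden
 * and cannot be referenced from outside the shared library.
 */
int example_internal_helper(void)
{
    return 42;
}

/* Exported entry point that forwards to the hidden helper. */
EXAMPLE_EXPORT int example_public_entry(void)
{
    return example_internal_helper();
}
```

If this were built into a shared library with -fvisibility=hidden, `nm -D` on the result should list `example_provider_symbol` and `example_public_entry` but not `example_internal_helper`, which is the situation the HPX provider symbols were in before any fix.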

Actions #14

Updated by Mathieu Desnoyers almost 4 years ago

  • Subject changed from events not listed though "registered" in debug log to tracepoint symbols should have public visibility
Actions #16

Updated by Jeff Trull almost 4 years ago

It fixes it! Thank you.

(Somehow I didn't get notified about the update - sorry about the delay)

Actions #17

Updated by Mathieu Desnoyers almost 4 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100