Bug #1296
tracepoint symbols should have public visibility
Status: Closed
% Done: 100%
Description
I have an application with two providers: one from a shared library, the other from the main program. I followed the build instructions in the documentation; everything links and runs OK.
The problem is that when I run "lttng list --userspace" I see only the events from the provider in the shared library. The debug log has some interesting information (my providers are HPX and HPX_ALG):
... Provider "HPX" accepted, version 1.0 is compatible with...
... just registered a tracepoints section from ...
registered tracepoint: HPX:thread_create_highprio
...
etc. (the HPX provider's tracepoints are shown)
... just registered a tracepoints section from ... ...
(BOTH the HPX and HPX_ALG provider tracepoints are shown). Note that only HPX is shown as an "accepted" provider, and there are two tracepoint sections, with the HPX ones duplicated.
Desired result: lttng list --userspace shows events from both HPX and HPX_ALG providers
Actual result: only HPX (the shared library) events are shown.
I'm pretty much certain this is my bug but I'm stumped. I'm hoping the odd symptoms displayed in my debug log will point to the underlying issue.
Updated by Mathieu Desnoyers almost 4 years ago
Can you show us, for each compile unit where the HPX_ALG tracepoint header file is included, exactly how it is included ?
This should show whether TRACEPOINT_DEFINE and/or TRACEPOINT_CREATE_PROBES and/or _LGPL_SOURCE and/or TRACEPOINT_PROBE_DYNAMIC_LINKAGE is defined before its inclusion for each compile unit.
Then providing a bit of context showing how each compile unit ends up being linked into the application would be useful.
Thanks,
Mathieu
Updated by Mathieu Desnoyers almost 4 years ago
- Status changed from New to Feedback
Updated by Jeff Trull almost 4 years ago
- File tracepoints.h added
Sure. The HPX_ALG header file is attached. It is included in two source files: the tracepoint provider, which looks like this:
#define TRACEPOINT_CREATE_PROBES
// include header
#include "tracepoints.h"
and the main app, which looks like this:
#define TRACEPOINT_DEFINE
#define TRACEPOINT_PROBE_DYNAMIC_LINKAGE
... other header files, including one that pulls in the HPX provider
#include "tracepoints.h"
... rest of the code, including calls to tracepoint(HPX_ALG, ...)
The order of the HPX vs. HPX_ALG inclusion doesn't seem to affect the result, FWIW.
The HPX provider, and a bunch of code that uses its events, are linked in as a shared library. The HPX_ALG provider is also a shared library, for consistency.
The main application makes calls to both providers' events.
Updated by Mathieu Desnoyers almost 4 years ago
How is the shared library containing the HPX_ALG provider linked into the application ? Is it statically linked (.a) or dynamically loaded (.so) ?
Updated by Mathieu Desnoyers almost 4 years ago
If the probe provider for HPX_ALG is statically linked, then TRACEPOINT_PROBE_DYNAMIC_LINKAGE should not be defined at the location where the inclusion following the TRACEPOINT_DEFINE is performed.
If in your case you have e.g. one provider dynamically loaded (HPX), and one provider statically linked (HPX_ALG), you would need to do something like this in your app:
#define TRACEPOINT_DEFINE
#include "tracepoints.h"
and in tracepoints.h:
/* This probe is dynamically loaded */
#define TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#include "tracepoint-hpx.h"
/* This probe is statically linked */
#undef TRACEPOINT_PROBE_DYNAMIC_LINKAGE
#include "tracepoint-hpx-alg.h"
This is explained by this comment in include/lttng/tracepoint.h:
/*
* When TRACEPOINT_PROBE_DYNAMIC_LINKAGE is defined, we do not emit a
* unresolved symbol that requires the provider to be linked in. When
* TRACEPOINT_PROBE_DYNAMIC_LINKAGE is not defined, we emit an
* unresolved symbol that depends on having the provider linked in,
* otherwise the linker complains. This deals with use of static
* libraries, ensuring that the linker does not remove the provider
* object from the executable.
*/
I suspect that because you have TRACEPOINT_PROBE_DYNAMIC_LINKAGE defined in a situation
where a probe provider is statically linked, the linker discards all the content of the
probe because there is no dependency on its symbols.
Updated by Jeff Trull almost 4 years ago
Yes, the shared library containing the HPX_ALG provider is... a .so :) - I probably misunderstand your question - anyway that's an interesting hint; I can see from ldd that only the HPX provider is listed as a runtime loaded library. Does that mean it's a shared library that's statically linked?
Anyway adding #undef TRACEPOINT_PROBE_DYNAMIC_LINKAGE before the HPX_ALG inclusion doesn't change the result.
I also just now tried disabling LTO, in case that was removing symbols. No help.
Are there any hints in the symbols in the binaries that might help explain what's going on? I can mess around with nm and see what's there and the visibility.
Updated by Jeff Trull almost 4 years ago
Like, maybe there are some clues in the libraries that can explain why there are two "tracepoints section" messages.
Updated by Jeff Trull almost 4 years ago
Update: I can get the application to accept the HPX_ALG provider by either of these two hacks:
1) Add an artificial function dependency inside the tracepoint provider so the linker records it and loads the provider .so
2) Use LD_PRELOAD to force loading the provider .so
Without either of those, the linker drops the HPX_ALG provider's .so
Updated by Jeff Trull almost 4 years ago
I've managed to create a nearly-minimal MWE for this issue: https://github.com/jefftrull/lttng_1296. It's very little code but illustrates the problem.
Updated by Mathieu Desnoyers almost 4 years ago
Quick comment (I did not look at the example yet):
You won't need to define TRACEPOINT_PROBE_DYNAMIC_LINKAGE I think. It will leave the dependency in place. TRACEPOINT_PROBE_DYNAMIC_LINKAGE is only for use-cases where you want to LD_PRELOAD your probe provider. If you want to link to it with -lmyprovider you should not use TRACEPOINT_PROBE_DYNAMIC_LINKAGE.
Updated by Jeff Trull almost 4 years ago
I see... definitely a misconception on my part. Well, removing that makes my MWE work - but my original problem changes to a link error. Probably that is why I added TRACEPOINT_PROBE_DYNAMIC_LINKAGE in the first place.
Without it, in my original code I get a bunch of undefined reference to `__tracepoint_provider_HPX' errors when I link the main program.
Updated by Jeff Trull almost 4 years ago
More on those linker errors... The HPX_ALG provider (built in its own library as part of my app) looks like this via nm:
00000000000041a0 B __tracepoint_provider_HPX_ALG
0000000000001591 t __tracepoint_provider_check_HPX_ALG()
000000000000158a t __tracepoint_provider_mismatch_HPX_ALG()
The HPX provider looks like this in its own .so:
0000000001991ea0 b __tracepoint_provider_HPX
00000000010ab666 t __tracepoint_provider_check_HPX()
00000000010ab65f t __tracepoint_provider_mismatch_HPX()
However, fixing the visibility with objcopy so it matches doesn't change the link errors, to my surprise.
Updated by Jeff Trull almost 4 years ago
I think I have it. The HPX library is built with -fvisibility=hidden, and this causes the tracepoint provider symbols to be both "local" (not externally visible) and "normal" (that is, not "dynamic" and so unavailable for linking out of shared libraries). The latter problem is the reason I could not fix things with objcopy --globalize-symbol.
I have updated my MWE to demonstrate the issue; it may be appropriate to change the title of this bug to "tracepoint symbols should have public visibility", as I think you could support libraries built this way with some extra attributes at the right points.
Updated by Mathieu Desnoyers almost 4 years ago
- Subject changed from events not listed though "registered" in debug log to tracepoint symbols should have public visibility
Updated by Mathieu Desnoyers almost 4 years ago
- File 0001-Fix-Use-default-visibility-for-tracepoint-provider-s.patch added
Can you please try the attached patch ?
Updated by Jeff Trull almost 4 years ago
It fixes it! Thank you.
(Somehow I didn't get notified about the update - sorry about the delay)
Updated by Mathieu Desnoyers almost 4 years ago
- Status changed from Feedback to Resolved
- % Done changed from 0 to 100
Applied in changeset lttng-ust|6c8cfec756dd209da46aa278133d4b2baaa5ea90.