Bug #1359

closed

lttng can reap wrong child and get wrong status in get_wait_shm

Added by Tomas Weinfurt 3 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Target version: -
Start date: 09/13/2022
Due date:
% Done: 100%
Estimated time:

Description

That code essentially does:


pid = fork();
if (pid > 0) {
        /* Parent: waits for any child, not specifically pid. */
        wait(&status);
}

https://github.com/lttng/lttng-ust/blob/2d2d38713aea27077b690f2756a901c2a0c06f8c/src/lib/lttng-ust/lttng-ust-comm.c#L1584-L1597

This is problematic because wait() is equivalent to waitpid(-1, &wstatus, 0) on Linux, so it can reap any unrelated child of the process that terminates during that window.
While the window is narrow, the reaped child can be anything started before lttng was called, or something started from an unrelated thread.
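
To make the race concrete, here is a minimal sketch (not lttng-ust code) that forces the bad ordering. The function name `wait_reaps_unrelated_child`, the pipe used as a gate, and the exit code 7 are all assumptions made for illustration. An "unrelated" child has already exited when the intended child is forked, so the bare wait() collects the unrelated one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 1 if wait() reaped the unrelated child
 * instead of the child that was just forked, 0 otherwise, -1 on error.
 */
static int wait_reaps_unrelated_child(void)
{
	int gate[2], status;
	pid_t unrelated, intended, reaped;
	siginfo_t info;

	if (pipe(gate) < 0)
		return -1;

	/* "Unrelated" child, e.g. started by another thread: exits at once. */
	unrelated = fork();
	if (unrelated < 0)
		return -1;
	if (unrelated == 0)
		_exit(7);

	/* Block until it is a zombie, but leave it unreaped (WNOWAIT). */
	if (waitid(P_PID, (id_t) unrelated, &info, WEXITED | WNOWAIT) < 0)
		return -1;
	assert(info.si_pid == unrelated);

	/* The child we actually care about: stays alive until the gate closes. */
	intended = fork();
	if (intended < 0)
		return -1;
	if (intended == 0) {
		char c;

		close(gate[1]);
		if (read(gate[0], &c, 1) < 0)	/* Returns 0 at EOF. */
			_exit(1);
		_exit(0);
	}
	close(gate[0]);

	/* Same as waitpid(-1, &status, 0): collects whichever child is ready. */
	reaped = wait(&status);

	/* Release the intended child and reap it explicitly by pid. */
	close(gate[1]);
	(void) waitpid(intended, NULL, 0);

	return reaped == unrelated && WIFEXITED(status)
		&& WEXITSTATUS(status) == 7;
}
```

In this forced ordering the caller sees exit status 7, which belongs to the unrelated child, while the intended child is still running.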

The code really should use
pid = waitpid(pid, &status, 0);
to avoid collecting an unrelated child's status.
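
A minimal sketch of the suggested pattern follows. It is not the actual lttng-ust patch: the helper `run_child_and_wait` and the sample child functions are hypothetical names introduced for illustration, and the retry on EINTR is an added assumption about how a caller might want to handle signals.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that runs child_fn() and wait for
 * that specific child only. Returns the child's exit code, or -1 if
 * fork/waitpid failed or the child did not exit normally.
 */
static int run_child_and_wait(int (*child_fn)(void))
{
	int status;
	pid_t pid, ret;

	assert(child_fn != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(child_fn());	/* Child: never returns to the caller. */

	/* Parent: wait for *this* child only, retrying if interrupted. */
	do {
		ret = waitpid(pid, &status, 0);
	} while (ret < 0 && errno == EINTR);

	if (ret != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Hypothetical child bodies used for illustration. */
static int child_ok(void)
{
	return 0;
}

static int child_fail(void)
{
	return 3;
}
```

Calling `run_child_and_wait(child_ok)` yields 0 and `run_child_and_wait(child_fail)` yields 3, regardless of any other children of the process that terminate in the meantime, because waitpid() is given the exact pid returned by fork().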

The long saga and traces are captured here: https://github.com/dotnet/runtime/issues/74795

lttng is loaded and initialized indirectly via the msquic library, and the code above interferes with test runs.
It showed up only on arm64, but that is probably just a matter of timing.

Actions #1

Updated by Jérémie Galarneau 3 months ago

  • Status changed from New to In Progress
  • Assignee set to Mathieu Desnoyers

Thanks for the great report!

I have submitted a patch for review (and automated testing).

https://review.lttng.org/c/lttng-ust/+/8779

Actions #2

Updated by Jérémie Galarneau 3 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100