Project

General

Profile

Bug #1155

BUG: scheduling while atomic and BUG: sleeping function called from invalid context

Added by Fredrik Gustavsson almost 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Target version:
-
Start date:
02/21/2018
Due date:
% Done:

0%

Estimated time:

Description

Hi,

Preempt-rt kernel 4.1.21-rt13 with lttng-modules 2.9.5 on a arm32.

I would get some stack traces of this kind and if I provoke the system by loading it, the fault will appear quite quickly.
I see the lttng-tracer in the printout.

The sleeping function called from invalid context is much more common than the scheduling while atomic but the stack traces will be the same.

We previously used 2.8.7 and did not get this error then.

If we revert a6370148ab4a6f210c29372febcffd87b440cc50
"select, poll and epoll_wait overrides on ARM 32-bit"
We do not see the error.

[ 186.864953] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:925
[ 186.864959] in_atomic(): 1, irqs_disabled(): 0, pid: 6756, name: hackbench
[ 186.864963] no locks held by hackbench/6756.
[ 186.864970] Preemption disabled at:[< (null)>] (null)

[ 186.864981] CPU: 8 PID: 6756 Comm: hackbench Tainted: G C 4.1.21-rt13 #1
[ 186.864984] Hardware name:
[ 186.865006] [<c041b028>] (unwind_backtrace) from [<c0414c88>] (show_stack+0x20/0x24)
[ 186.865020] [<c0414c88>] (show_stack) from [<c0a62b0c>] (dump_stack+0xb4/0xe4)
[ 186.865032] [<c0a62b0c>] (dump_stack) from [<c0460d9c>] (might_sleep+0x1b8/0x2a8)
[ 186.865041] [<c0460d9c>] (
_might_sleep) from [<c0a67d4c>] (rt_spin_lock+0x34/0x6c)
[ 186.865053] [<c0a67d4c>] (rt_spin_lock) from [<c054c4b0>] (get_page_from_freelist+0x284/0xa0c)
[ 186.865065] [<c054c4b0>] (get_page_from_freelist) from [<c054ce28>] (
_alloc_pages_nodemask+0x1f0/0xa7c)
[ 186.865075] [<c054ce28>] (_alloc_pages_nodemask) from [<c0587ce8>] (new_slab+0x98/0x2f4)
[ 186.865085] [<c0587ce8>] (new_slab) from [<c058a7ec>] (
_slab_alloc.constprop.9+0x3ac/0x488)
[ 186.865093] [<c058a7ec>] (_slab_alloc.constprop.9) from [<c058ac48>] (_kmalloc+0x380/0x4f4)
[ 186.865170] [<c058ac48>] (_kmalloc) from [<bf081338>] (_event_probe__syscall_entry_poll+0xf4/0x480 [lttng_tracer])
[ 186.865263] [<bf081338>] (_event_probe_syscall_entry_poll [lttng_tracer]) from [<bf093200>] (syscall_entry_probe.part.0+0x158/0x2a0 [lttng_tracer])
[ 186.865349] [<bf093200>] (syscall_entry_probe.part.0 [lttng_tracer]) from [<bf0933a8>] (syscall_entry_probe+0x60/0x64 [lttng_tracer])
[ 186.865406] [<bf0933a8>] (syscall_entry_probe [lttng_tracer]) from [<c0412a94>] (syscall_trace_enter+0x114/0x3c4)
[ 186.865421] [<c0412a94>] (syscall_trace_enter) from [<c040fff8>] (__sys_trace+0xc/0x38)


Files

lttng_kernel_load (2.23 KB) lttng_kernel_load Fredrik Gustavsson, 02/24/2018 09:51 AM
stresstest (2.72 KB) stresstest Fredrik Gustavsson, 02/24/2018 09:51 AM
#1

Updated by Mathieu Desnoyers almost 3 years ago

  • Status changed from New to Feedback

Can you provide your kernel .config ?

Also, I cannot find the source code for the 4.1.21-rt13 kernel anywhere (neither linux-rt stable git tree nor linux-rt patches). The only mentions of it based on a quick Google search seem to indicate that it is a Wind River kernel. You may want to ask them for help if it's the case.

Moreover, does the issue reproduce rather quickly after tracing is activated, or it takes a while ? How is your tracing configured ?

It's really unexpected that a kmalloc() with GFP_ATOMIC | GFP_NOWAIT end up calling the page allocator. It sounds like there may be something fishy in your kernel. Can you reproduce with a maintained, updated, version of the preempt-rt kernel ?

#2

Updated by Mathieu Desnoyers almost 3 years ago

irc discussion about this:

11:10 < Compudj> a kmalloc with GFP_ATOMIC | GFP_NOWAIT flags calling the page allocator on a 4.1.21-rt13 kernel: how can this happen ?
11:11 < Compudj> I wonder if there is a flag I should add in my code, or blame it on that specific kernel
11:11 < Compudj> it's called from preempt off context, so complaining about scheduling in atomic ctx
11:12 < Compudj> ref. https://bugs.lttng.org/issues/1155
11:14 < bigeasy> Compudj: where is this tree from?
11:14 < Compudj> bigeasy: I cannot find it anywhere :(
11:14 < bigeasy> Compudj: but to keep it brief, you can't kmalloc in preempt disable region
11:14 < Compudj> bigeasy: the only leads I get from a gurgle search is windriver
11:15 < Compudj> bigeasy: even with GFP_ATOMIC flag ?
11:15 < bigeasy> I have v4.1.12-rt13 
11:15 < bigeasy> yup, never
11:16 < bigeasy> we always try to breakout somehow from preempt disable section and local_irq_disable() sections to do kmalloc()
11:16 < Compudj> not really possible within a tracepoint :-(
11:17 < Compudj> because it's using preempt off to ensure consistency of probe callbacks
11:17 < Compudj> I just need to try to allocate memory for a temporary buffer when tracing a system call
11:17 < bigeasy> I think the i915 did this and I just disabled that tracepoint
11:17 < Compudj> and it's OK if it fails due to lack of memory
11:17 < Compudj> (rather than invoke page allocator)
11:19 < Compudj> bigeasy: GFP_ATOMIC is documented as "the caller cannot reclaim or sleep and is high priority. Users are typically interrupt 
                 handlers." 
11:19 < Compudj> so how comes it cannot be called from preempt-off ?
11:19 < Compudj> is there a documentation source describing this ?
11:22 < bigeasy> not sure. probably not. On RT the interrupt handler runs in threaded context so you can allocate memory even without GFP_ATOMIC 
                 :)
11:22 < Compudj> mergh...
11:22 < bigeasy> but the thing is, in page allocater you go after sleeping locks so you must be preemptible
11:22 < Compudj> ok, so it's preempt-rt specific
11:22 < bigeasy> yes
11:23 < Compudj> is there a GFP_MAGICWAND in preempt-rt that allows allocating memory with preempt disabled from an emergency pool ?
11:23 < bigeasy> we don't have this
11:23 < bigeasy> we have alsmost everything in preemptible context
11:23 < Compudj> except tracepoint handlers
11:24 < bigeasy> and keep the preempt disabled section small as possible
11:24 < Compudj> we should really introduce preemptible tracepoint handlers
11:24 < Compudj> there is srcu for this kind of stuff
11:24 < wildea01> bigeasy: can you use a kmem_cache?
11:24 < bigeasy> this reminds me of the amd-iommu patches I have somewhere solving a similar problem there
11:24 < rostedt> Compudj: what we've done in the histogram code is to have an allocated pool
11:24 < bigeasy> yes but in preempt-disable sectiions
11:25 < bigeasy> wildea01: ^
11:25 < rostedt> when the pool gets low, we kick off a irq work to do more allocation
11:25 < Compudj> rostedt: right, that would work
11:25 < rostedt> and in -rt that irq work runs in thread context
11:25 < bigeasy> wildea01: again. yes but _not_ in preempt-disabled sections
11:26 < wildea01> bigeasy: ok, thanks
11:26 < Compudj> rostedt: ok I'll try something along the same lines, thanks
11:26 < rostedt> np
#3

Updated by Julien Desfossez almost 3 years ago

We just merged two patches in the master branch that will help:
commit f771eda68a28a432a87a7a71b23aab9692828700
Create a memory pool for temporary tracepoint probes storage

commit ec85ce1db0ca20e17fedbc85dde39613d830cbab
Use the memory pool instead of kmalloc

Can you try with those and let us know if it fixes the problem ?

Thanks,

Julien

#4

Updated by Fredrik Gustavsson almost 3 years ago

Julien Desfossez wrote:

We just merged two patches in the master branch that will help:
commit f771eda68a28a432a87a7a71b23aab9692828700
Create a memory pool for temporary tracepoint probes storage

commit ec85ce1db0ca20e17fedbc85dde39613d830cbab
Use the memory pool instead of kmalloc

Can you try with those and let us know if it fixes the problem ?

Thanks,

Julien

Hi,

Thanks for the quick support. Testing it out at the moment.

The arises quickly, within minutes (even seconds sometimes) when provoking the system.

The lttng_kernel_load will show you the lttng config.

#5

Updated by Fredrik Gustavsson almost 3 years ago

Fredrik Gustavsson wrote:

Julien Desfossez wrote:

We just merged two patches in the master branch that will help:
commit f771eda68a28a432a87a7a71b23aab9692828700
Create a memory pool for temporary tracepoint probes storage

commit ec85ce1db0ca20e17fedbc85dde39613d830cbab
Use the memory pool instead of kmalloc

Can you try with those and let us know if it fixes the problem ?

Thanks,

Julien

Hi,

Thanks for the quick support. Testing it out at the moment.

The arises quickly, within minutes (even seconds sometimes) when provoking the system.

The lttng_kernel_load will show you the lttng config.

I have try to reproduce it during the weekend but I have not been able to. So it seems like a these two commits did the trick.

#6

Updated by Mathieu Desnoyers almost 3 years ago

  • Status changed from Feedback to Resolved

Thanks for the feedback!

Also available in: Atom PDF