Bug #1155 (Closed)
BUG: scheduling while atomic and BUG: sleeping function called from invalid context
Description
Hi,
Preempt-rt kernel 4.1.21-rt13 with lttng-modules 2.9.5 on a arm32.
I get stack traces of this kind, and if I provoke the system by putting it under load, the fault appears quite quickly.
I see the lttng-tracer in the printout.
The "sleeping function called from invalid context" variant is much more common than "scheduling while atomic", but the stack traces are the same.
We previously used 2.8.7 and did not get this error then.
If we revert commit a6370148ab4a6f210c29372febcffd87b440cc50 ("select, poll and epoll_wait overrides on ARM 32-bit"), we do not see the error.
[ 186.864953] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:925
[ 186.864959] in_atomic(): 1, irqs_disabled(): 0, pid: 6756, name: hackbench
[ 186.864963] no locks held by hackbench/6756.
[ 186.864970] Preemption disabled at:[< (null)>] (null)
[ 186.864981] CPU: 8 PID: 6756 Comm: hackbench Tainted: G C 4.1.21-rt13 #1
[ 186.864984] Hardware name:
[ 186.865006] [<c041b028>] (unwind_backtrace) from [<c0414c88>] (show_stack+0x20/0x24)
[ 186.865020] [<c0414c88>] (show_stack) from [<c0a62b0c>] (dump_stack+0xb4/0xe4)
[ 186.865032] [<c0a62b0c>] (dump_stack) from [<c0460d9c>] (___might_sleep+0x1b8/0x2a8)
[ 186.865041] [<c0460d9c>] (___might_sleep) from [<c0a67d4c>] (rt_spin_lock+0x34/0x6c)
[ 186.865053] [<c0a67d4c>] (rt_spin_lock) from [<c054c4b0>] (get_page_from_freelist+0x284/0xa0c)
[ 186.865065] [<c054c4b0>] (get_page_from_freelist) from [<c054ce28>] (__alloc_pages_nodemask+0x1f0/0xa7c)
[ 186.865075] [<c054ce28>] (__alloc_pages_nodemask) from [<c0587ce8>] (new_slab+0x98/0x2f4)
[ 186.865085] [<c0587ce8>] (new_slab) from [<c058a7ec>] (__slab_alloc.constprop.9+0x3ac/0x488)
[ 186.865093] [<c058a7ec>] (__slab_alloc.constprop.9) from [<c058ac48>] (__kmalloc+0x380/0x4f4)
[ 186.865170] [<c058ac48>] (__kmalloc) from [<bf081338>] (__event_probe__syscall_entry_poll+0xf4/0x480 [lttng_tracer])
[ 186.865263] [<bf081338>] (__event_probe__syscall_entry_poll [lttng_tracer]) from [<bf093200>] (syscall_entry_probe.part.0+0x158/0x2a0 [lttng_tracer])
[ 186.865349] [<bf093200>] (syscall_entry_probe.part.0 [lttng_tracer]) from [<bf0933a8>] (syscall_entry_probe+0x60/0x64 [lttng_tracer])
[ 186.865406] [<bf0933a8>] (syscall_entry_probe [lttng_tracer]) from [<c0412a94>] (syscall_trace_enter+0x114/0x3c4)
[ 186.865421] [<c0412a94>] (syscall_trace_enter) from [<c040fff8>] (__sys_trace+0xc/0x38)
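For reference, a hypothetical reduction of the failure mode shown in this trace (the probe name and signature below are illustrative, not the actual lttng-modules code): the tracepoint infrastructure disables preemption around its probe callbacks, and on PREEMPT_RT the SLUB slow path (new_slab -> __alloc_pages_nodemask) takes zone locks that are sleeping rt_mutex-based spinlocks, so even a GFP_ATOMIC | GFP_NOWAIT allocation can trigger the splat.

#include <linux/poll.h>
#include <linux/ptrace.h>
#include <linux/slab.h>

/* Illustrative sketch only: a syscall-entry probe allocating a temporary
 * buffer for the user-supplied pollfd array. Preemption is already disabled
 * by the tracepoint infrastructure when this runs. */
static void syscall_entry_poll_probe_sketch(void *__data, struct pt_regs *regs, long id)
{
	void *tmp;

	/* On PREEMPT_RT this can reach new_slab() -> __alloc_pages_nodemask(),
	 * whose zone locks are sleeping rt_spin_lock()s: hence the BUG above. */
	tmp = kmalloc(64 * sizeof(struct pollfd), GFP_ATOMIC | GFP_NOWAIT);
	if (!tmp)
		return;
	/* ... copy the pollfd array from user space into tmp and trace it ... */
	kfree(tmp);
}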
Updated by Mathieu Desnoyers about 7 years ago
- Status changed from New to Feedback
Can you provide your kernel .config ?
Also, I cannot find the source code for the 4.1.21-rt13 kernel anywhere (neither the linux-rt stable git tree nor the linux-rt patches). The only mentions of it based on a quick Google search seem to indicate that it is a Wind River kernel. You may want to ask them for help if that's the case.
Moreover, does the issue reproduce rather quickly after tracing is activated, or does it take a while ? How is your tracing configured ?
It's really unexpected that a kmalloc() with GFP_ATOMIC | GFP_NOWAIT ends up calling the page allocator. It sounds like there may be something fishy in your kernel. Can you reproduce with a maintained, up-to-date version of the preempt-rt kernel ?
Updated by Mathieu Desnoyers about 7 years ago
IRC discussion about this:
11:10 < Compudj> a kmalloc with GFP_ATOMIC | GFP_NOWAIT flags calling the page allocator on a 4.1.21-rt13 kernel: how can this happen ?
11:11 < Compudj> I wonder if there is a flag I should add in my code, or blame it on that specific kernel
11:11 < Compudj> it's called from preempt off context, so complaining about scheduling in atomic ctx
11:12 < Compudj> ref. https://bugs.lttng.org/issues/1155
11:14 < bigeasy> Compudj: where is this tree from?
11:14 < Compudj> bigeasy: I cannot find it anywhere :(
11:14 < bigeasy> Compudj: but to keep it brief, you can't kmalloc in preempt disable region
11:14 < Compudj> bigeasy: the only leads I get from a gurgle search is windriver
11:15 < Compudj> bigeasy: even with GFP_ATOMIC flag ?
11:15 < bigeasy> I have v4.1.12-rt13
11:15 < bigeasy> yup, never
11:16 < bigeasy> we always try to breakout somehow from preempt disable section and local_irq_disable() sections to do kmalloc()
11:16 < Compudj> not really possible within a tracepoint :-(
11:17 < Compudj> because it's using preempt off to ensure consistency of probe callbacks
11:17 < Compudj> I just need to try to allocate memory for a temporary buffer when tracing a system call
11:17 < bigeasy> I think the i915 did this and I just disabled that tracepoint
11:17 < Compudj> and it's OK if it fails due to lack of memory
11:17 < Compudj> (rather than invoke page allocator)
11:19 < Compudj> bigeasy: GFP_ATOMIC is documented as "the caller cannot reclaim or sleep and is high priority. Users are typically interrupt handlers."
11:19 < Compudj> so how comes it cannot be called from preempt-off ?
11:19 < Compudj> is there a documentation source describing this ?
11:22 < bigeasy> not sure. probably not. On RT the interrupt handler runs in threaded context so you can allocate memory even without GFP_ATOMIC :)
11:22 < Compudj> mergh...
11:22 < bigeasy> but the thing is, in page allocater you go after sleeping locks so you must be preemptible
11:22 < Compudj> ok, so it's preempt-rt specific
11:22 < bigeasy> yes
11:23 < Compudj> is there a GFP_MAGICWAND in preempt-rt that allows allocating memory with preempt disabled from an emergency pool ?
11:23 < bigeasy> we don't have this
11:23 < bigeasy> we have alsmost everything in preemptible context
11:23 < Compudj> except tracepoint handlers
11:24 < bigeasy> and keep the preempt disabled section small as possible
11:24 < Compudj> we should really introduce preemptible tracepoint handlers
11:24 < Compudj> there is srcu for this kind of stuff
11:24 < wildea01> bigeasy: can you use a kmem_cache?
11:24 < bigeasy> this reminds me of the amd-iommu patches I have somewhere solving a similar problem there
11:24 < rostedt> Compudj: what we've done in the histogram code is to have an allocated pool
11:24 < bigeasy> yes but in preempt-disable sectiions
11:25 < bigeasy> wildea01: ^
11:25 < rostedt> when the pool gets low, we kick off a irq work to do more allocation
11:25 < Compudj> rostedt: right, that would work
11:25 < rostedt> and in -rt that irq work runs in thread context
11:25 < bigeasy> wildea01: again. yes but _not_ in preempt-disabled sections
11:26 < wildea01> bigeasy: ok, thanks
11:26 < Compudj> rostedt: ok I'll try something along the same lines, thanks
11:26 < rostedt> np
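A minimal sketch of the preallocated-pool idea from this discussion (all names, sizes, and the locking scheme are assumptions for illustration, not the code later merged into lttng-modules; rostedt's fuller scheme would additionally queue an irq_work, which runs in thread context on -rt, to grow the pool when it gets low):

#include <linux/cpumask.h>
#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

#define POOL_ENTRIES	4
#define ENTRY_SIZE	256

struct tp_pool {
	raw_spinlock_t lock;		/* raw lock: safe to take with preemption disabled on -rt */
	void *entry[POOL_ENTRIES];
	unsigned int free;		/* entries currently available */
};

static DEFINE_PER_CPU(struct tp_pool, tp_pool);

/* Called once at module load, from a fully preemptible context where
 * GFP_KERNEL allocations are allowed. */
static int tp_pool_init(void)
{
	int cpu, i;

	for_each_possible_cpu(cpu) {
		struct tp_pool *pool = per_cpu_ptr(&tp_pool, cpu);

		raw_spin_lock_init(&pool->lock);
		for (i = 0; i < POOL_ENTRIES; i++) {
			pool->entry[i] = kmalloc(ENTRY_SIZE, GFP_KERNEL);
			if (!pool->entry[i])
				return -ENOMEM;	/* a real implementation would unwind */
			pool->free++;
		}
	}
	return 0;
}

/* Called from the probe with preemption disabled; never enters the allocator. */
static void *tp_pool_get(void)
{
	struct tp_pool *pool = this_cpu_ptr(&tp_pool);
	unsigned long flags;
	void *buf = NULL;

	raw_spin_lock_irqsave(&pool->lock, flags);
	if (pool->free)
		buf = pool->entry[--pool->free];
	raw_spin_unlock_irqrestore(&pool->lock, flags);
	return buf;			/* may be NULL: the probe then drops the payload */
}

/* Return the buffer once the event has been recorded. Get and put happen on
 * the same CPU within one preempt-off section, so the pool cannot overflow. */
static void tp_pool_put(void *buf)
{
	struct tp_pool *pool = this_cpu_ptr(&tp_pool);
	unsigned long flags;

	raw_spin_lock_irqsave(&pool->lock, flags);
	pool->entry[pool->free++] = buf;
	raw_spin_unlock_irqrestore(&pool->lock, flags);
}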
Updated by Julien Desfossez about 7 years ago
We just merged two patches in the master branch that will help:
commit f771eda68a28a432a87a7a71b23aab9692828700
Create a memory pool for temporary tracepoint probes storage
commit ec85ce1db0ca20e17fedbc85dde39613d830cbab
Use the memory pool instead of kmalloc
Can you try with those and let us know if it fixes the problem ?
Thanks,
Julien
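Purely for illustration, using the hypothetical tp_pool_get()/tp_pool_put() helpers from the sketch in the IRC discussion above (the helpers these commits actually introduce may be named and structured differently), the probe side of such a change would look roughly like this:

/* The temporary buffer now comes from a pool preallocated in a preemptible
 * context, so the preempt-disabled probe never enters the page allocator. */
static void syscall_entry_poll_probe_sketch(void *__data, struct pt_regs *regs, long id)
{
	void *tmp = tp_pool_get();	/* never sleeps, may return NULL */

	if (!tmp)
		return;			/* drop this event rather than risk sleeping */
	/* ... copy the pollfd array from user space into tmp and record the event ... */
	tp_pool_put(tmp);
}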
Updated by Fredrik Gustavsson about 7 years ago
- File lttng_kernel_load lttng_kernel_load added
- File stresstest stresstest added
Julien Desfossez wrote:
We just merged two patches in the master branch that will help:
commit f771eda68a28a432a87a7a71b23aab9692828700
Create a memory pool for temporary tracepoint probes storage
commit ec85ce1db0ca20e17fedbc85dde39613d830cbab
Use the memory pool instead of kmalloc
Can you try with those and let us know if it fixes the problem ?
Thanks,
Julien
Hi,
Thanks for the quick support. Testing it out at the moment.
The error arises quickly, within minutes (sometimes even seconds) when provoking the system.
The attached lttng_kernel_load file will show you the lttng config.
Updated by Fredrik Gustavsson about 7 years ago
I have tried to reproduce it over the weekend but have not been able to, so it seems these two commits did the trick.
Updated by Mathieu Desnoyers about 7 years ago
- Status changed from Feedback to Resolved
Thanks for the feedback!