Project

General

Profile

Actions

Bug #1149

closed

Kernel soft lockup on trace start on Fedora 4.14.13-300

Added by Loïc Gelle over 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Target version:
-
Start date:
01/19/2018
Due date:
% Done:

0%

Estimated time:

Description

Hello,

I experience a new bug with lttng-modules on Fedora 27, kernel 4.14.13-300, custom built with added tracepoints and locks verification. Lttng-modules is built from source (latest version from github), only change is in kernel version checks in kvm.h for Fedora support.

Kernel soft lockup happens when starting tracing with a fresh session, all kernel events enabled. dmesg outputs:

[  151.593022] LTTng: Loaded modules v2.11.0-pre (L-Beer)
[  151.765229] ------------[ cut here ]------------
[  151.765296] WARNING: CPU: 1 PID: 4042 at /home/loicgelle/Data_Drive/kernel-cgroup-instr/lttng-modules/lttng-probes.c:94 fixup_lazy_probes+0xc6/0x240 [lttng_tracer]
[  151.765297] Modules linked in: lttng_probe_compaction(O+) lttng_probe_btrfs(O) lttng_probe_block(O) lttng_probe_asoc(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_statedump(O) lttng_ftrace(O) lttng_kprobes(O) lttng_clock(O) lttng_lib_ring_buffer(O) lttng_kretprobes(O) fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xt_addrtype br_netfilter overlay nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
[  151.765365]  ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec irqbypass snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep intel_cstate snd_seq iTCO_wdt iTCO_vendor_support snd_seq_device snd_pcm intel_uncore snd_timer intel_rapl_perf snd eeepc_wmi asus_wmi sparse_keymap rfkill wmi_bmof i2c_i801 soundcore ioatdma tpm_tis mei_me tpm_tis_core joydev mei tpm shpchp dca nouveau video i2c_algo_bit drm_kms_helper e1000e ttm nvme drm mxm_wmi crc32c_intel
[  151.765471]  ptp nvme_core pps_core wmi
[  151.765481] CPU: 1 PID: 4042 Comm: modprobe Tainted: G           O    4.14.13-300.fc27.x86_64.LG #1
[  151.765483] Hardware name: System manufacturer System Product Name/TUF X299 MARK 2, BIOS 0402 06/13/2017
[  151.765485] task: ffff9673d8dc0000 task.stack: ffffa6374991c000
[  151.765499] RIP: 0010:fixup_lazy_probes+0xc6/0x240 [lttng_tracer]
[  151.765501] RSP: 0018:ffffa6374991fbe0 EFLAGS: 00010286
[  151.765505] RAX: 00000000ffffffff RBX: ffffffffc0cb033b RCX: 0000000000000061
[  151.765507] RDX: 0000000000000005 RSI: ffffffffc0cb01ac RDI: ffffffffc0cb033b
[  151.765509] RBP: ffffa6374991fc30 R08: 0000000000000062 R09: 0000000000000000
[  151.765511] R10: ffffa6374991fc30 R11: ffffffffc0c31100 R12: 0000000000000010
[  151.765512] R13: 0000000000000005 R14: ffffffffc0cb4000 R15: ffffffffc0cb4040
[  151.765515] FS:  00007fca36e7c0c0(0000) GS:ffff9673fe840000(0000) knlGS:0000000000000000
[  151.765517] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  151.765520] CR2: 0000559517d3e000 CR3: 0000000854f88005 CR4: 00000000003606e0
[  151.765523] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  151.765525] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  151.765527] Call Trace:
[  151.765550]  lttng_get_probe_list_head.part.0+0x1d/0x30 [lttng_tracer]
[  151.765563]  lttng_probe_register+0xf7/0x100 [lttng_tracer]
[  151.765568]  ? __event_probe__compaction_migratepages+0x2a0/0x2a0 [lttng_probe_compaction]
[  151.765573]  __lttng_events_init__compaction+0x28/0x70 [lttng_probe_compaction]
[  151.765578]  do_one_initcall+0x50/0x192
[  151.765585]  ? rcu_read_lock_sched_held+0x64/0x70
[  151.765590]  ? kmem_cache_alloc_trace+0x238/0x2a0
[  151.765594]  ? do_init_module+0x27/0x1e9
[  151.765600]  do_init_module+0x5f/0x1e9
[  151.765605]  load_module+0x24b8/0x2c00
[  151.765616]  ? vfs_read+0x122/0x150
[  151.765634]  SYSC_finit_module+0xfc/0x120
[  151.765637]  ? SYSC_finit_module+0xfc/0x120
[  151.765651]  SyS_finit_module+0xe/0x10
[  151.765656]  entry_SYSCALL_64_fastpath+0x25/0x9c
[  151.765659] RIP: 0033:0x7fca3634cb89
[  151.765661] RSP: 002b:00007ffe37f749b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  151.765665] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fca3634cb89
[  151.765666] RDX: 0000000000000000 RSI: 0000559515ece85c RDI: 0000000000000004
[  151.765668] RBP: 00007ffe37f739b0 R08: 0000000000000000 R09: 0000000000000000
[  151.765670] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[  151.765671] R13: 0000559517d3adf0 R14: 00007ffe37f73990 R15: 0000000000000005
[  151.765685] Code: 01 44 39 65 cc 0f 84 11 01 00 00 49 63 c4 4c 89 ea 4c 89 f6 49 8b 04 c7 48 8b 18 48 89 df e8 62 02 e7 d9 85 c0 74 c1 4c 8b 75 c0 <0f> ff 48 8b 35 c9 8a 12 00 48 81 fe 50 11 c3 c0 4c 8d 66 e8 48 
[  151.765779] ---[ end trace 720a2bf9e01d856d ]---
[  178.026672] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
[  178.054936] LTTng: global_dirty_limit symbol lookup failed.
[  178.466675] BUG: unable to handle kernel NULL pointer dereference at 0000000000000448
[  178.466695] IP: memcg_stat_show+0x65/0x4e0
[  178.466700] PGD 0 P4D 0 
[  178.466704] Oops: 0000 [#1] SMP PTI
[  178.466709] Modules linked in: lttng_probe_cgroup(O) lttng_probe_x86_exceptions(O) lttng_probe_x86_irq_vectors(O) lttng_probe_writeback(O) lttng_probe_workqueue(O) lttng_probe_v4l2(O) lttng_probe_vmscan(O) lttng_probe_udp(O) lttng_probe_timer(O) lttng_probe_sunrpc(O) lttng_probe_statedump(O) lttng_probe_sock(O) lttng_probe_skb(O) lttng_probe_signal(O) lttng_probe_scsi(O) lttng_probe_sched(O) lttng_probe_regmap(O) lttng_probe_rcu(O) lttng_probe_random(O) lttng_probe_printk(O) lttng_probe_power(O) lttng_probe_net(O) lttng_probe_napi(O) lttng_probe_module(O) lttng_probe_kvm_x86_mmu(O) lttng_probe_kvm_x86(O) lttng_probe_kvm(O) lttng_probe_kmem(O) lttng_probe_jbd2(O) lttng_probe_irq(O) lttng_probe_i2c(O) lttng_probe_gpio(O) lttng_probe_ext4(O) lttng_probe_compaction(O) lttng_probe_btrfs(O) lttng_probe_block(O)
[  178.466772]  lttng_probe_asoc(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_statedump(O) lttng_ftrace(O) lttng_kprobes(O) lttng_clock(O) lttng_lib_ring_buffer(O) lttng_kretprobes(O) fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xt_addrtype br_netfilter overlay nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
[  178.466837]  nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec irqbypass snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep intel_cstate snd_seq iTCO_wdt iTCO_vendor_support snd_seq_device snd_pcm intel_uncore snd_timer intel_rapl_perf snd eeepc_wmi asus_wmi sparse_keymap rfkill wmi_bmof i2c_i801 soundcore ioatdma tpm_tis mei_me tpm_tis_core joydev mei tpm shpchp dca nouveau video i2c_algo_bit drm_kms_helper e1000e ttm nvme drm mxm_wmi crc32c_intel ptp nvme_core pps_core wmi
[  178.466919] CPU: 3 PID: 4117 Comm: lttng-sessiond Tainted: G        W  O    4.14.13-300.fc27.x86_64.LG #1
[  178.466926] Hardware name: System manufacturer System Product Name/TUF X299 MARK 2, BIOS 0402 06/13/2017
[  178.466934] task: ffff9673e514c000 task.stack: ffffa63749cf4000
[  178.466939] RIP: 0010:memcg_stat_show+0x65/0x4e0
[  178.466944] RSP: 0018:ffffa63749cf7a68 EFLAGS: 00010297
[  178.466949] RAX: 0000000000000000 RBX: 00000000000000d8 RCX: 0000000000000000
[  178.466955] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9b3b5520
[  178.466960] RBP: ffffa63749cf7aa8 R08: ffffffffffffffff R09: ffffffffffffffff
[  178.466966] R10: ffffffff9b07d398 R11: ffffffff9b07d728 R12: ffffffff9b3b5520
[  178.466972] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  178.466978] FS:  00007f5cedfd6700(0000) GS:ffff9673fe8c0000(0000) knlGS:0000000000000000
[  178.466984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  178.466989] CR2: 0000000000000448 CR3: 000000086447e004 CR4: 00000000003606e0
[  178.466995] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  178.467001] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  178.467006] Call Trace:
[  178.467013]  lttng_statedump_start+0x186e/0x1f50 [lttng_statedump]
[  178.467044]  lttng_session_enable+0xf7/0x110 [lttng_tracer]
[  178.467055]  lttng_session_ioctl+0x23a/0x2f0 [lttng_tracer]
[  178.467062]  ? sock_has_perm+0x84/0xb0
[  178.467068]  ? finish_wait+0x80/0x80
[  178.467074]  ? ___sys_recvmsg+0x1a4/0x240
[  178.467080]  ? rcu_read_lock_sched_held+0x64/0x70
[  178.467085]  ? kfree+0x222/0x290
[  178.467090]  ? __lock_acquire+0x2ba/0x1300
[  178.467096]  ? find_held_lock+0x35/0xa0
[  178.467102]  ? __fget+0xf2/0x1f0
[  178.467107]  ? find_held_lock+0x35/0xa0
[  178.467111]  ? debug_lockdep_rcu_enabled+0x1d/0x30
[  178.467117]  ? selinux_file_ioctl+0x3c/0x1e0
[  178.467122]  do_vfs_ioctl+0xa6/0x6b0
[  178.467128]  SyS_ioctl+0x79/0x90
[  178.467134]  entry_SYSCALL_64_fastpath+0x25/0x9c
[  178.467138] RIP: 0033:0x7f5cf5407877
[  178.467142] RSP: 002b:00007f5cedfc58a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  178.467148] RAX: ffffffffffffffda RBX: 00007f5cedfc27b0 RCX: 00007f5cf5407877
[  178.467154] RDX: 0000000100000000 RSI: 000000000000f656 RDI: 000000000000002d
[  178.467159] RBP: 00007f5cd8003f50 R08: 0000000000000000 R09: 0000000000400000
[  178.467165] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  178.467170] R13: 0000000000000000 R14: 0000000000000045 R15: 0000000000000000
[  178.467178] Code: 49 83 c5 01 49 83 fd 08 74 7b 48 c7 c0 80 0d 2b 9b 4a 63 1c a8 83 fb 1e 74 e1 48 c1 e3 03 b8 ff ff ff ff 45 31 ff eb 19 48 63 f0 <49> 8b 96 48 04 00 00 48 8b 34 f5 e0 33 16 9b 48 01 da 4c 03 3c 
[  178.467219] RIP: memcg_stat_show+0x65/0x4e0 RSP: ffffa63749cf7a68
[  178.467223] CR2: 0000000000000448
[  178.468597] ---[ end trace 720a2bf9e01d856e ]---

I am available and happy to help for patch testing.

Thanks,
Loïc.


Files

WIP-linux-cgroup-tracepoints.patch (4.95 KB) WIP-linux-cgroup-tracepoints.patch Loïc Gelle, 01/19/2018 11:45 AM
Actions

Also available in: Atom PDF