General Protection Fault in tcp_estats_read_connection_spec.isra #10

Open
@lathspell

We sometimes experience general protection faults, after which all invocations of web10g-listconns hang forever. I'm not 100% sure whether the call to web10g-listconns actually triggers the fault, or whether the fault simply goes unnoticed for a while because web10g-listconns is only called occasionally by a script.

The system runs Debian 8.0 with kernel 3.18.0-trunk-web10g-amd64 (the Debian linux-3.18.5-1~exp1 kernel patched with web10g-0.11-3.18.tar.gz).

Here are two examples of the error message from syslog:

Apr 07 18:20:38 tacho kernel: general protection fault: 0000 [#1] SMP 
Apr 07 18:20:38 tacho kernel: Modules linked in: btrfs xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs cpuid xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG nf_nat_ftp nf_conntrack_ftp ip
Apr 07 18:20:38 tacho kernel:  uhci_hcd ehci_pci ehci_hcd libata megaraid_sas usbcore scsi_mod usb_common bnx2
Apr 07 18:20:38 tacho kernel: CPU: 3 PID: 10760 Comm: web10g-listconn Tainted: G          I    3.18.0-trunk-web10g-amd64 #1 Debian 3.18.5-1~exp1a~web10g
Apr 07 18:20:38 tacho kernel: Hardware name: Dell Inc. PowerEdge R610/086HF8, BIOS 6.4.0 07/23/2013
Apr 07 18:20:38 tacho kernel: task: ffff88003682abc0 ti: ffff8800c9a88000 task.ti: ffff8800c9a88000
Apr 07 18:20:38 tacho kernel: RIP: 0010:[<ffffffffa0335db3>]  [<ffffffffa0335db3>] tcp_estats_read_connection_spec.isra.8+0x13/0x70 [tcp_estats_nl]
Apr 07 18:20:38 tacho kernel: RSP: 0018:ffff8800c9a8bad8  EFLAGS: 00010286
Apr 07 18:20:38 tacho kernel: RAX: 0000000061747365 RBX: ffff880036a76b00 RCX: 0000000000000000
Apr 07 18:20:38 tacho kernel: RDX: 0000000061747365 RSI: 00041080f8efd594 RDI: ffff8800c9a8bb18
Apr 07 18:20:38 tacho kernel: RBP: 0000000000000010 R08: 0000000000000000 R09: 00000000ffffffff
Apr 07 18:20:38 tacho kernel: R10: 0000000000ffffff R11: 0000000000000001 R12: 0000000000000000
Apr 07 18:20:38 tacho kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff88012f845000
Apr 07 18:20:38 tacho kernel: FS:  00007fb70c116700(0000) GS:ffff88012fc20000(0000) knlGS:0000000000000000
Apr 07 18:20:38 tacho kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 07 18:20:38 tacho kernel: CR2: 00007fb70c123000 CR3: 00000000b7e70000 CR4: 00000000000007e0
Apr 07 18:20:38 tacho kernel: Stack:
Apr 07 18:20:38 tacho kernel:  ffff880036a76b00 ffffffffa0336454 ffff88012ac50c00 0000000000000000
Apr 07 18:20:38 tacho kernel:  ffff8800c9a8bba0 ffff88012aa3bc00 0000000100000000 ffffffff81153fff
Apr 07 18:20:38 tacho kernel:  ffffffff818cdcc0 0000000000000246 00000000c9a8bbf0 ffffffff000280da
Apr 07 18:20:38 tacho kernel: Call Trace:
Apr 07 18:20:38 tacho kernel:  [<ffffffffa0336454>] ? genl_list_conns+0x164/0x430 [tcp_estats_nl]
Apr 07 18:20:38 tacho kernel:  [<ffffffff81153fff>] ? get_page_from_freelist+0x62f/0xa30
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147c193>] ? genl_family_rcv_msg+0x193/0x360
Apr 07 18:20:38 tacho kernel:  [<ffffffff8143b7e7>] ? __alloc_skb+0x47/0x1e0
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147c360>] ? genl_family_rcv_msg+0x360/0x360
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147c3d9>] ? genl_rcv_msg+0x79/0xc0
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147baa0>] ? netlink_rcv_skb+0xb0/0xd0
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147bfe4>] ? genl_rcv+0x24/0x40
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147b1e7>] ? netlink_unicast+0x107/0x1a0
Apr 07 18:20:38 tacho kernel:  [<ffffffff8147b59b>] ? netlink_sendmsg+0x31b/0x660
Apr 07 18:20:38 tacho kernel:  [<ffffffff8117dba9>] ? vma_adjust+0x3f9/0x7d0
Apr 07 18:20:38 tacho kernel:  [<ffffffff81432d13>] ? sock_sendmsg+0x83/0xc0
Apr 07 18:20:38 tacho kernel:  [<ffffffff81433244>] ? SYSC_sendto+0xf4/0x180
Apr 07 18:20:38 tacho kernel:  [<ffffffff8117e65e>] ? SyS_mmap_pgoff+0xfe/0x280
Apr 07 18:20:38 tacho kernel:  [<ffffffff81546fed>] ? system_call_fast_compare_end+0xc/0x11
Apr 07 18:20:38 tacho kernel: Code: e8 33 ff ff ff 48 83 c4 48 89 c5 89 e8 5b 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 83 ec 08 48 85 f6 74 3e 48 85 ff 74 48 <48> 8b 46 14 48 89 47 04 48 8b 46 1c 48 89 47 0c 0f b7 46 26 66 
Apr 07 18:20:38 tacho kernel: RIP  [<ffffffffa0335db3>] tcp_estats_read_connection_spec.isra.8+0x13/0x70 [tcp_estats_nl]
Apr 07 18:20:38 tacho kernel:  RSP <ffff8800c9a8bad8>
Apr 07 18:20:38 tacho kernel: ---[ end trace 5001876b82b2bcd7 ]---
Mai 19 13:05:06 tacho kernel: general protection fault: 0000 [#1] SMP 
Mai 19 13:05:06 tacho kernel: Modules linked in: binfmt_misc nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG nf_nat_ftp nf_conntrack_ftp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables joydev iTCO_wdt coretemp kvm_intel kvm ttm drm_kms_helper drm iTCO_vendor_support lpc_ich evdev i2c_algo_bit i2c_core psmouse serio_raw acpi_power_meter mfd_core dcdbas i7core_edac shpchp edac_core 8250_fintek wmi pcspkr tpm_tis tpm button processor thermal_sys tcp_estats_nl loop ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse autofs4 xfs libcrc32c ext4 crc16 mbcache jbd2 dm_mod sr_mod cdrom ata_generic hid_generic usbhid hid sg sd_mod crc32c_intel ata_piix uhci_hcd
Mai 19 13:05:06 tacho kernel:  ehci_pci ehci_hcd libata megaraid_sas usbcore scsi_mod usb_common bnx2
Mai 19 13:05:06 tacho kernel: CPU: 0 PID: 5359 Comm: web10g-listconn Tainted: G          I    3.18.0-trunk-web10g-amd64 #1 Debian 3.18.5-1~exp1a~web10g
Mai 19 13:05:06 tacho kernel: Hardware name: Dell Inc. PowerEdge R610/086HF8, BIOS 6.4.0 07/23/2013
Mai 19 13:05:06 tacho kernel: task: ffff88012adaacc0 ti: ffff8800362a4000 task.ti: ffff8800362a4000
Mai 19 13:05:06 tacho kernel: RIP: 0010:[<ffffffffa033ddb3>]  [<ffffffffa033ddb3>] tcp_estats_read_connection_spec.isra.8+0x13/0x70 [tcp_estats_nl]
Mai 19 13:05:06 tacho kernel: RSP: 0018:ffff8800362a7ad8  EFLAGS: 00010286
Mai 19 13:05:06 tacho kernel: RAX: 000000006572662f RBX: ffff88020631c700 RCX: 0000000000000000
Mai 19 13:05:06 tacho kernel: RDX: 000000006572662f RSI: 5e011080556d8263 RDI: ffff8800362a7b18
Mai 19 13:05:06 tacho kernel: RBP: 0000000000000010 R08: 0000000000000000 R09: 00000000ffffffff
Mai 19 13:05:06 tacho kernel: R10: 0000000000ffffff R11: 0000000000000001 R12: 0000000000000000
Mai 19 13:05:06 tacho kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff88003629c000
Mai 19 13:05:06 tacho kernel: FS:  00007fe0d8300700(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
Mai 19 13:05:06 tacho kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 19 13:05:06 tacho kernel: CR2: 00007fe0d830d000 CR3: 0000000202fb2000 CR4: 00000000000007f0
Mai 19 13:05:06 tacho kernel: Stack:
Mai 19 13:05:06 tacho kernel:  ffff88020631c700 ffffffffa033e454 ffff88012ac50c00 0000000000000000
Mai 19 13:05:06 tacho kernel:  ffff8800362a7ba0 ffff88020631c900 0000000100000000 ffffffff81153fff
Mai 19 13:05:06 tacho kernel:  ffffffff818cdcc0 0000000000000246 00000000362a7bf0 ffffffff000280da
Mai 19 13:05:06 tacho kernel: Call Trace:
Mai 19 13:05:06 tacho kernel:  [<ffffffffa033e454>] ? genl_list_conns+0x164/0x430 [tcp_estats_nl]
Mai 19 13:05:06 tacho kernel:  [<ffffffff81153fff>] ? get_page_from_freelist+0x62f/0xa30
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147c193>] ? genl_family_rcv_msg+0x193/0x360
Mai 19 13:05:06 tacho kernel:  [<ffffffff8143b7e7>] ? __alloc_skb+0x47/0x1e0
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147c360>] ? genl_family_rcv_msg+0x360/0x360
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147c3d9>] ? genl_rcv_msg+0x79/0xc0
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147baa0>] ? netlink_rcv_skb+0xb0/0xd0
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147bfe4>] ? genl_rcv+0x24/0x40
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147b1e7>] ? netlink_unicast+0x107/0x1a0
Mai 19 13:05:06 tacho kernel:  [<ffffffff8147b59b>] ? netlink_sendmsg+0x31b/0x660
Mai 19 13:05:06 tacho kernel:  [<ffffffff8117dba9>] ? vma_adjust+0x3f9/0x7d0
Mai 19 13:05:06 tacho kernel:  [<ffffffff81432d13>] ? sock_sendmsg+0x83/0xc0
Mai 19 13:05:06 tacho kernel:  [<ffffffff81433244>] ? SYSC_sendto+0xf4/0x180
Mai 19 13:05:06 tacho kernel:  [<ffffffff8117e65e>] ? SyS_mmap_pgoff+0xfe/0x280
Mai 19 13:05:06 tacho kernel:  [<ffffffff81546fed>] ? system_call_fast_compare_end+0xc/0x11
Mai 19 13:05:06 tacho kernel: Code: e8 33 ff ff ff 48 83 c4 48 89 c5 89 e8 5b 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 83 ec 08 48 85 f6 74 3e 48 85 ff 74 48 <48> 8b 46 14 48 89 47 04 48 8b 46 1c 48 89 47 0c 0f b7 46 26 66 
Mai 19 13:05:06 tacho kernel: RIP  [<ffffffffa033ddb3>] tcp_estats_read_connection_spec.isra.8+0x13/0x70 [tcp_estats_nl]
Mai 19 13:05:06 tacho kernel:  RSP <ffff8800362a7ad8>
Mai 19 13:05:06 tacho kernel: ---[ end trace 2799a13a2e5b595c ]---
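
A few things can be read off the two dumps directly. In both, the `Comm:` field names `web10g-listconn` (the kernel truncates task names to 15 characters) and the call trace goes through `genl_list_conns`, so the fault does appear to happen inside the listconns request itself. The faulting bytes `<48> 8b 46 14` decode to `mov 0x14(%rsi),%rax`, and the bytes just before them are NULL checks on `%rsi` and `%rdi`, so the pointer in RSI is non-NULL but apparently garbage. The small Python sketch below checks the register values copied from the dumps: whether each address is canonical (assuming 48-bit virtual addressing, which is my assumption for this CPU) and what the low 32 bits of RAX look like as ASCII. The helper names are made up for illustration.

```python
# Register values copied verbatim from the two oops dumps above.
DUMPS = {
    "Apr 07": {"RSI": 0x00041080F8EFD594, "RDI": 0xFFFF8800C9A8BB18, "RAX": 0x61747365},
    "Mai 19": {"RSI": 0x5E011080556D8263, "RDI": 0xFFFF8800362A7B18, "RAX": 0x6572662F},
}


def is_canonical(addr: int) -> bool:
    """Return True if addr is canonical on x86-64 with 48-bit virtual addresses.

    Canonical means bits 63..47 are all zeros or all ones. A memory access
    through a non-canonical address raises a general protection fault
    rather than a page fault.
    """
    top = addr >> 47  # the upper 17 bits of a 64-bit value
    return top in (0, 0x1FFFF)


def as_ascii(value: int, width: int = 4) -> str:
    """Render the low `width` bytes of value in memory (little-endian) order."""
    raw = (value & ((1 << (8 * width)) - 1)).to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    for date, regs in DUMPS.items():
        print(date)
        for name in ("RSI", "RDI"):
            state = "canonical" if is_canonical(regs[name]) else "NON-canonical"
            print(f"  {name} = {regs[name]:#018x}  {state}")
        print(f"  RAX low bytes as ASCII: {as_ascii(regs['RAX'])!r}")
```

In both dumps RSI comes out non-canonical while RDI is an ordinary kernel address, which would explain why this shows up as a general protection fault and not as a NULL pointer dereference. RAX decodes to `esta` in the first dump and `/fre` in the second. That might mean the structure being read had already been freed and reused for string data, but that is only a guess on my part.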

Any idea how to further track down the bug or what may have caused it?

best regards,

-christian-
