Limit netlink communication length #8

dmah42 · 2014-02-03T22:15:54Z

In some cases, where lots of connections are present, netlink can run out of buffer space. By providing a max number of sockets to return, and a start connection id, the userspace can manage the netlink buffer better, and can keep receiving messages until none are returned (or the kernel can send a special end-of-list message).

…king Daniel Borkmann reported a VM_BUG_ON assertion failing: ------------[ cut here ]------------ kernel BUG at mm/mlock.c:528! invalid opcode: 0000 [#1] SMP Modules linked in: ccm arc4 iwldvm [...] video CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8 Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012 task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000 RIP: 0010:[<ffffffff81171ad0>] [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0 Call Trace: do_munmap+0x18f/0x3b0 vm_munmap+0x41/0x60 SyS_munmap+0x22/0x30 system_call_fastpath+0x1a/0x1f RIP munlock_vma_pages_range+0x2e0/0x2f0 ---[ end trace a0088dcf07ae10f2 ]--- because munlock_vma_pages_range() thinks it's unexpectedly in the middle of a THP page. This can be reproduced with default config since 3.11 kernels. A reproducer can be found in the kernel's selftest directory for networking by running ./psock_tpacket. The problem is that an order=2 compound page (allocated by alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped by packet_mmap()) and mistaken for a THP page and assumed to be order=9. The checks for THP in munlock came with commit ff6a6da ("mm: accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did not trigger a bug. It just makes munlock_vma_pages_range() skip such compound pages until the next 512-pages-aligned page, when it encounters a head page. This is however not a problem for vma's where mlocking has no effect anyway, but it can distort the accounting. Since commit 7225522 ("mm: munlock: batch non-THP page isolation and munlock+putback using pagevec") this can trigger a VM_BUG_ON in PageTransHuge() check. This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a list of flags that make vma's non-mlockable and non-mergeable. The reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is already on the VM_SPECIAL list, and both are intended for non-LRU pages where mlocking makes no sense anyway. Related Lkml discussion can be found in [2]. [1] tools/testing/selftests/net/psock_tpacket [2] https://lkml.org/lkml/2014/1/10/427 Signed-off-by: Vlastimil Babka <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reported-by: Daniel Borkmann <[email protected]> Tested-by: Daniel Borkmann <[email protected]> Cc: Thomas Hellstrom <[email protected]> Cc: John David Anglin <[email protected]> Cc: HATAYAMA Daisuke <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Carsten Otte <[email protected]> Cc: Jared Hulbert <[email protected]> Tested-by: Hannes Frederic Sowa <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: <[email protected]> [3.11.x+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

vmxnet3's netpoll driver is incorrectly coded. It directly calls vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll controller method doesn't block real napi polls in any way, there is a potential for race conditions in which the netpoll controller method and the napi poll method run concurrently. The result is data corruption causing panics such as this one recently observed: PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg" #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92 #2 [ffff88023abd58b0] oops_end at ffffffff8152b570 #3 [ffff88023abd58e0] die at ffffffff81010e0b #4 [ffff88023abd5910] do_trap at ffffffff8152add4 #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95 #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b [exception RIP: vmxnet3_rq_rx_complete+1968] RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0 RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0 RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048 R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8 R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3] #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3] #9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7 The fix is to do as other drivers do, and have the poll controller call the top half interrupt handler, which schedules a napi poll properly to recieve frames Tested by myself, successfully. Signed-off-by: Neil Horman <[email protected]> CC: Shreyas Bhatewara <[email protected]> CC: "VMware, Inc." <[email protected]> CC: "David S. Miller" <[email protected]> CC: [email protected] Reviewed-by: Shreyas N Bhatewara <[email protected]> Signed-off-by: David S. Miller <[email protected]>

If recovery failed ath10k returned 0 (success) and mac80211 continued to call other driver callbacks. This caused null dereference. This is how the failure looked like: ath10k: ctl_resp never came in (-110) ath10k: failed to connect to HTC: -110 ath10k: could not init core (-110) BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffa0b355c1>] ath10k_ce_send+0x1d/0x15d [ath10k_pci] PGD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ath10k_pci ath10k_core ath5k ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nf_nat_ipv4 ] CPU: 1 PID: 36 Comm: kworker/1:1 Tainted: G WC 3.13.0-rc8-wl-ath+ #8 Hardware name: To be filled by O.E.M. To be filled by O.E.M./HURONRIVER, BIOS 4.6.5 05/02/2012 Workqueue: events ieee80211_restart_work [mac80211] task: ffff880215b521c0 ti: ffff880215e18000 task.ti: ffff880215e18000 RIP: 0010:[<ffffffffa0b355c1>] [<ffffffffa0b355c1>] ath10k_ce_send+0x1d/0x15d [ath10k_pci] RSP: 0018:ffff880215e19af8 EFLAGS: 00010292 RAX: ffff880215e19b10 RBX: 0000000000000000 RCX: 0000000000000018 RDX: 00000000d9ccf800 RSI: ffff8800c965ad00 RDI: 0000000000000000 RBP: ffff880215e19b58 R08: 0000000000000002 R09: 0000000000000000 R10: ffffffff812e1a23 R11: 0000000000000292 R12: 0000000000000018 R13: 0000000000000000 R14: 0000000000000002 R15: ffff88021562d700 FS: 0000000000000000(0000) GS:ffff88021fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001a0d000 CR4: 00000000000407e0 Stack: d9ccf8000d47df40 ffffffffa0b367a0 ffff880215e19b10 0000000000000010 ffff880215e19b68 ffff880215e19b28 0000000000000018 ffff8800c965ad00 0000000000000018 0000000000000000 0000000000000002 ffff88021562d700 Call Trace: [<ffffffffa0b3251d>] ath10k_pci_hif_send_head+0xa7/0xcb [ath10k_pci] [<ffffffffa0b16cbe>] ath10k_htc_send+0x23d/0x2d0 [ath10k_core] [<ffffffffa0b1a169>] ath10k_wmi_cmd_send_nowait+0x5d/0x85 [ath10k_core] [<ffffffffa0b1aaef>] ath10k_wmi_cmd_send+0x62/0x115 [ath10k_core] [<ffffffff814e8abd>] ? __netdev_alloc_skb+0x4b/0x9b [<ffffffffa0b1c438>] ath10k_wmi_vdev_set_param+0x91/0xa3 [ath10k_core] [<ffffffffa0b0e0d5>] ath10k_mac_set_rts+0x3e/0x40 [ath10k_core] [<ffffffffa0b0e1d0>] ath10k_set_frag_threshold+0x5e/0x9c [ath10k_core] [<ffffffffa09d60eb>] ieee80211_reconfig+0x12a/0x7b3 [mac80211] [<ffffffff815a8069>] ? mutex_unlock+0x9/0xb [<ffffffffa09b3a40>] ieee80211_restart_work+0x5e/0x68 [mac80211] [<ffffffff810c01d0>] process_one_work+0x1d7/0x2fc [<ffffffff810c0166>] ? process_one_work+0x16d/0x2fc [<ffffffff810c06c8>] worker_thread+0x12e/0x1fb [<ffffffff810c059a>] ? rescuer_thread+0x27b/0x27b [<ffffffff810c5aee>] kthread+0xb5/0xbd [<ffffffff815a9220>] ? _raw_spin_unlock_irq+0x28/0x42 [<ffffffff810c5a39>] ? __kthread_parkme+0x5c/0x5c [<ffffffff815ae04c>] ret_from_fork+0x7c/0xb0 [<ffffffff810c5a39>] ? __kthread_parkme+0x5c/0x5c Code: df ff d0 48 83 c4 18 5b 41 5c 41 5d 5d c3 55 48 89 e5 41 57 41 56 45 89 c6 41 55 41 54 41 89 cc 53 48 89 RIP [<ffffffffa0b355c1>] ath10k_ce_send+0x1d/0x15d [ath10k_pci] RSP <ffff880215e19af8> CR2: 0000000000000000 Reported-By: Ben Greear <[email protected]> Signed-off-by: Michal Kazior <[email protected]> Signed-off-by: Kalle Valo <[email protected]>

When plugging a specific micro SD card at MMC socket of a custom i.MX28 board, we get the following kernel warning: WARNING: CPU: 0 PID: 30 at drivers/mmc/host/mxs-mmc.c:342 mxs_mmc_start_cmd+0x34c/0x378() Modules linked in: CPU: 0 PID: 30 Comm: kworker/u2:1 Not tainted 3.14.0-rc5 #8 Workqueue: kmmcd mmc_rescan [<c0015420>] (unwind_backtrace) from [<c0012cb0>] (show_stack+0x10/0x14) [<c0012cb0>] (show_stack) from [<c001daf8>] (warn_slowpath_common+0x6c/0x8c) [<c001daf8>] (warn_slowpath_common) from [<c001db34>] (warn_slowpath_null+0x1c/0x24) [<c001db34>] (warn_slowpath_null) from [<c0349478>] (mxs_mmc_start_cmd+0x34c/0x378) [<c0349478>] (mxs_mmc_start_cmd) from [<c0338fa0>] (mmc_start_request+0xc4/0xf4) [<c0338fa0>] (mmc_start_request) from [<c03390b4>] (mmc_wait_for_req+0x50/0x164) [<c03390b4>] (mmc_wait_for_req) from [<c03405b8>] (mmc_app_send_scr+0x158/0x1c8) [<c03405b8>] (mmc_app_send_scr) from [<c033ee1c>] (mmc_sd_setup_card+0x80/0x3c8) [<c033ee1c>] (mmc_sd_setup_card) from [<c033f788>] (mmc_sd_init_card+0x124/0x66c) [<c033f788>] (mmc_sd_init_card) from [<c033fd7c>] (mmc_attach_sd+0xac/0x174) [<c033fd7c>] (mmc_attach_sd) from [<c033a658>] (mmc_rescan+0x25c/0x2d8) [<c033a658>] (mmc_rescan) from [<c003597c>] (process_one_work+0x1b4/0x4ec) [<c003597c>] (process_one_work) from [<c0035de4>] (worker_thread+0x130/0x464) [<c0035de4>] (worker_thread) from [<c003c824>] (kthread+0xb4/0xd0) [<c003c824>] (kthread) from [<c000f420>] (ret_from_fork+0x14/0x34) The error is due to an invalid value in CSD register of a specific 2GB micro SD card. The CSD version of this card is 1.0 but the TACC field has the invalid value 0. cid:0000005553442020000000000000583f csd:00000032535a83bfedb7ffbf1680003f date:08/2005 erase_size:512 fwrev:0x0 hwrev:0x0 manfid:0x000000 name:USD oemid:0x0000 preferred_erase_size:4194304 scr:0225000000000000 serial:0x00000000 type:SD Since the kernel is making use of this TACC field to calculate the SD card timeout, an invalid value 0 leads to a warning at mxs_ns_to_ssp_ticks() and later the following misleading error message appears in a loop: mxs-mmc 80010000.ssp: card claims to support voltages below defined range mxs-mmc 80010000.ssp: no support for card's volts mmc0: error -22 whilst initialising MMC card This error is only found on this 2GB SD card on mxs platform. On x86 this card works without any problems. The following patch based on the work of Peter Chan and Otavio Salvador. It catches the case that the determined timeout is still 0 and sets it to a valid value. Successful tested on a i.MX28 board. Signed-off-by: Stefan Wahren <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Chris Ball <[email protected]>

All tests should pass with and without JIT. Example output: test_bpf: #0 TAX 35 16 16 PASS test_bpf: #1 TXA 7 7 7 PASS test_bpf: #2 ADD_SUB_MUL_K 10 PASS test_bpf: #3 DIV_KX 33 PASS test_bpf: #4 AND_OR_LSH_K 10 10 PASS test_bpf: #5 LD_IND 8 8 8 PASS test_bpf: #6 LD_ABS 8 8 8 PASS test_bpf: #7 LD_ABS_LL 13 14 PASS test_bpf: #8 LD_IND_LL 12 12 12 PASS test_bpf: #9 LD_ABS_NET 10 12 PASS test_bpf: #10 LD_IND_NET 11 12 12 PASS ... Numbers are times in nsec per filter for given input data. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>

This patch tries to fix this crash: #5 [ffff88003c1cd690] do_invalid_op at ffffffff810166d5 #6 [ffff88003c1cd730] invalid_op at ffffffff8159b2de [exception RIP: ocfs2_direct_IO_get_blocks+359] RIP: ffffffffa05dfa27 RSP: ffff88003c1cd7e8 RFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff88003c1cdaa8 RCX: 0000000000000000 RDX: 000000000000000c RSI: ffff880027a95000 RDI: ffff88003c79b540 RBP: ffff88003c1cd858 R8: 0000000000000000 R9: ffffffff815f6ba0 R10: 00000000000001c9 R11: 00000000000001c9 R12: ffff88002d271500 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000001000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff88003c1cd860] do_direct_IO at ffffffff811cd31b #8 [ffff88003c1cd950] direct_IO_iovec at ffffffff811cde9c #9 [ffff88003c1cd9b0] do_blockdev_direct_IO at ffffffff811ce764 #10 [ffff88003c1cdb80] __blockdev_direct_IO at ffffffff811ce7cc #11 [ffff88003c1cdbb0] ocfs2_direct_IO at ffffffffa05df756 [ocfs2] #12 [ffff88003c1cdbe0] generic_file_direct_write_iter at ffffffff8112f935 #13 [ffff88003c1cdc40] ocfs2_file_write_iter at ffffffffa0600ccc [ocfs2] #14 [ffff88003c1cdd50] do_aio_write at ffffffff8119126c #15 [ffff88003c1cddc0] aio_rw_vect_retry at ffffffff811d9bb4 #16 [ffff88003c1cddf0] aio_run_iocb at ffffffff811db880 #17 [ffff88003c1cde30] io_submit_one at ffffffff811dc238 #18 [ffff88003c1cde80] do_io_submit at ffffffff811dc437 #19 [ffff88003c1cdf70] sys_io_submit at ffffffff811dc530 #20 [ffff88003c1cdf80] system_call_fastpath at ffffffff8159a159 It crashes at BUG_ON(create && (ext_flags & OCFS2_EXT_REFCOUNTED)); in ocfs2_direct_IO_get_blocks. ocfs2_direct_IO_get_blocks is expecting the OCFS2_EXT_REFCOUNTED be removed in ocfs2_prepare_inode_for_write() if it was there. But no cluster lock is taken during the time before (or inside) ocfs2_prepare_inode_for_write() and after ocfs2_direct_IO_get_blocks(). It can happen in this case: Node A(which crashes) Node B ------------------------ --------------------------- ocfs2_file_aio_write ocfs2_prepare_inode_for_write ocfs2_inode_lock ... ocfs2_inode_unlock #no refcount found .... ocfs2_reflink ocfs2_inode_lock ... ocfs2_inode_unlock #now, refcount flag set on extent ... flush change to disk ocfs2_direct_IO_get_blocks ocfs2_get_clusters #extent map miss #buffer_head miss read extents from disk found refcount flag on extent crash.. Fix: Take rw_lock in ocfs2_reflink path Signed-off-by: Wengang Wang <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

When performing a consuming read, the ring buffer swaps out a page from the ring buffer with a empty page and this page that was swapped out becomes the new reader page. The reader page is owned by the reader and since it was swapped out of the ring buffer, writers do not have access to it (there's an exception to that rule, but it's out of scope for this commit). When reading the "trace" file, it is a non consuming read, which means that the data in the ring buffer will not be modified. When the trace file is opened, a ring buffer iterator is allocated and writes to the ring buffer are disabled, such that the iterator will not have issues iterating over the data. Although the ring buffer disabled writes, it does not disable other reads, or even consuming reads. If a consuming read happens, then the iterator is reset and starts reading from the beginning again. My tests would sometimes trigger this bug on my i386 box: WARNING: CPU: 0 PID: 5175 at kernel/trace/trace.c:1527 __trace_find_cmdline+0x66/0xaa() Modules linked in: CPU: 0 PID: 5175 Comm: grep Not tainted 3.16.0-rc3-test+ #8 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 00000000 00000000 f09c9e1c c18796b3 c1b5d74c f09c9e4c c103a0e3 c1b5154b f09c9e78 00001437 c1b5d74c 000005f7 c10bd85a c10bd85a c1cac57c f09c9eb0 ed0e0000 f09c9e64 c103a185 00000009 f09c9e5c c1b5154b f09c9e78 f09c9e80^M Call Trace: [<c18796b3>] dump_stack+0x4b/0x75 [<c103a0e3>] warn_slowpath_common+0x7e/0x95 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa [<c103a185>] warn_slowpath_fmt+0x33/0x35 [<c10bd85a>] __trace_find_cmdline+0x66/0xaa^M [<c10bed04>] trace_find_cmdline+0x40/0x64 [<c10c3c16>] trace_print_context+0x27/0xec [<c10c4360>] ? trace_seq_printf+0x37/0x5b [<c10c0b15>] print_trace_line+0x319/0x39b [<c10ba3fb>] ? ring_buffer_read+0x47/0x50 [<c10c13b1>] s_show+0x192/0x1ab [<c10bfd9a>] ? s_next+0x5a/0x7c [<c112e76e>] seq_read+0x267/0x34c [<c1115a25>] vfs_read+0x8c/0xef [<c112e507>] ? seq_lseek+0x154/0x154 [<c1115ba2>] SyS_read+0x54/0x7f [<c188488e>] syscall_call+0x7/0xb ---[ end trace 3f507febd6b4cc83 ]--- >>>> ##### CPU 1 buffer started #### Which was the __trace_find_cmdline() function complaining about the pid in the event record being negative. After adding more test cases, this would trigger more often. Strangely enough, it would never trigger on a single test, but instead would trigger only when running all the tests. I believe that was the case because it required one of the tests to be shutting down via delayed instances while a new test started up. After spending several days debugging this, I found that it was caused by the iterator becoming corrupted. Debugging further, I found out why the iterator became corrupted. It happened with the rb_iter_reset(). As consuming reads may not read the full reader page, and only part of it, there's a "read" field to know where the last read took place. The iterator, must also start at the read position. In the rb_iter_reset() code, if the reader page was disconnected from the ring buffer, the iterator would start at the head page within the ring buffer (where writes still happen). But the mistake there was that it still used the "read" field to start the iterator on the head page, where it should always start at zero because readers never read from within the ring buffer where writes occur. I originally wrote a patch to have it set the iter->head to 0 instead of iter->head_page->read, but then I questioned why it wasn't always setting the iter to point to the reader page, as the reader page is still valid. The list_empty(reader_page->list) just means that it was successful in swapping out. But the reader_page may still have data. There was a bug report a long time ago that was not reproducible that had something about trace_pipe (consuming read) not matching trace (iterator read). This may explain why that happened. Anyway, the correct answer to this bug is to always use the reader page an not reset the iterator to inside the writable ring buffer. Cc: [email protected] # 2.6.28+ Fixes: d769041 "ring_buffer: implement new locking" Signed-off-by: Steven Rostedt <[email protected]>

For commit ocfs2 journal, ocfs2 journal thread will acquire the mutex osb->journal->j_trans_barrier and wake up jbd2 commit thread, then it will wait until jbd2 commit thread done. In order journal mode, jbd2 needs flushing dirty data pages first, and this needs get page lock. So osb->journal->j_trans_barrier should be got before page lock. But ocfs2_write_zero_page() and ocfs2_write_begin_inline() obey this locking order, and this will cause deadlock and hung the whole cluster. One deadlock catched is the following: PID: 13449 TASK: ffff8802e2f08180 CPU: 31 COMMAND: "oracle" #0 [ffff8802ee3f79b0] __schedule at ffffffff8150a524 #1 [ffff8802ee3f7a58] schedule at ffffffff8150acbf #2 [ffff8802ee3f7a68] rwsem_down_failed_common at ffffffff8150cb85 #3 [ffff8802ee3f7ad8] rwsem_down_read_failed at ffffffff8150cc55 #4 [ffff8802ee3f7ae8] call_rwsem_down_read_failed at ffffffff812617a4 #5 [ffff8802ee3f7b50] ocfs2_start_trans at ffffffffa0498919 [ocfs2] #6 [ffff8802ee3f7ba0] ocfs2_zero_start_ordered_transaction at ffffffffa048b2b8 [ocfs2] #7 [ffff8802ee3f7bf0] ocfs2_write_zero_page at ffffffffa048e9bd [ocfs2] #8 [ffff8802ee3f7c80] ocfs2_zero_extend_range at ffffffffa048ec83 [ocfs2] #9 [ffff8802ee3f7ce0] ocfs2_zero_extend at ffffffffa048edfd [ocfs2] #10 [ffff8802ee3f7d50] ocfs2_extend_file at ffffffffa049079e [ocfs2] #11 [ffff8802ee3f7da0] ocfs2_setattr at ffffffffa04910ed [ocfs2] #12 [ffff8802ee3f7e70] notify_change at ffffffff81187d29 #13 [ffff8802ee3f7ee0] do_truncate at ffffffff8116bbc1 #14 [ffff8802ee3f7f50] sys_ftruncate at ffffffff8116bcbd #15 [ffff8802ee3f7f80] system_call_fastpath at ffffffff81515142 RIP: 00007f8de750c6f7 RSP: 00007fffe786e478 RFLAGS: 00000206 RAX: 000000000000004d RBX: ffffffff81515142 RCX: 0000000000000000 RDX: 0000000000000200 RSI: 0000000000028400 RDI: 000000000000000d RBP: 00007fffe786e040 R8: 0000000000000000 R9: 000000000000000d R10: 0000000000000000 R11: 0000000000000206 R12: 000000000000000d R13: 00007fffe786e710 R14: 00007f8de70f8340 R15: 0000000000028400 ORIG_RAX: 000000000000004d CS: 0033 SS: 002b crash64> bt PID: 7610 TASK: ffff88100fd56140 CPU: 1 COMMAND: "ocfs2cmt" #0 [ffff88100f4d1c50] __schedule at ffffffff8150a524 #1 [ffff88100f4d1cf8] schedule at ffffffff8150acbf #2 [ffff88100f4d1d08] jbd2_log_wait_commit at ffffffffa01274fd [jbd2] #3 [ffff88100f4d1d98] jbd2_journal_flush at ffffffffa01280b4 [jbd2] #4 [ffff88100f4d1dd8] ocfs2_commit_cache at ffffffffa0499b14 [ocfs2] #5 [ffff88100f4d1e38] ocfs2_commit_thread at ffffffffa0499d38 [ocfs2] #6 [ffff88100f4d1ee8] kthread at ffffffff81090db6 #7 [ffff88100f4d1f48] kernel_thread_helper at ffffffff81516284 crash64> bt PID: 7609 TASK: ffff88100f2d4480 CPU: 0 COMMAND: "jbd2/dm-20-86" #0 [ffff88100def3920] __schedule at ffffffff8150a524 #1 [ffff88100def39c8] schedule at ffffffff8150acbf #2 [ffff88100def39d8] io_schedule at ffffffff8150ad6c #3 [ffff88100def39f8] sleep_on_page at ffffffff8111069e #4 [ffff88100def3a08] __wait_on_bit_lock at ffffffff8150b30a #5 [ffff88100def3a58] __lock_page at ffffffff81110687 #6 [ffff88100def3ab8] write_cache_pages at ffffffff8111b752 #7 [ffff88100def3be8] generic_writepages at ffffffff8111b901 #8 [ffff88100def3c48] journal_submit_data_buffers at ffffffffa0120f67 [jbd2] #9 [ffff88100def3cf8] jbd2_journal_commit_transaction at ffffffffa0121372[jbd2] #10 [ffff88100def3e68] kjournald2 at ffffffffa0127a86 [jbd2] #11 [ffff88100def3ee8] kthread at ffffffff81090db6 #12 [ffff88100def3f48] kernel_thread_helper at ffffffff81516284 Signed-off-by: Junxiao Bi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Alex Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

This patch wires up the new syscall sys_bpf() on powerpc. Passes the tests in samples/bpf: #0 add+sub+mul OK #1 unreachable OK #2 unreachable2 OK #3 out of range jump OK #4 out of range jump2 OK #5 test1 ld_imm64 OK #6 test2 ld_imm64 OK #7 test3 ld_imm64 OK #8 test4 ld_imm64 OK #9 test5 ld_imm64 OK #10 no bpf_exit OK #11 loop (back-edge) OK #12 loop2 (back-edge) OK #13 conditional loop OK #14 read uninitialized register OK #15 read invalid register OK #16 program doesn't init R0 before exit OK #17 stack out of bounds OK #18 invalid call insn1 OK #19 invalid call insn2 OK #20 invalid function call OK #21 uninitialized stack1 OK #22 uninitialized stack2 OK #23 check valid spill/fill OK #24 check corrupted spill/fill OK #25 invalid src register in STX OK #26 invalid dst register in STX OK #27 invalid dst register in ST OK #28 invalid src register in LDX OK #29 invalid dst register in LDX OK #30 junk insn OK #31 junk insn2 OK #32 junk insn3 OK #33 junk insn4 OK #34 junk insn5 OK #35 misaligned read from stack OK #36 invalid map_fd for function call OK #37 don't check return value before access OK #38 access memory with incorrect alignment OK #39 sometimes access memory with incorrect alignment OK #40 jump test 1 OK #41 jump test 2 OK #42 jump test 3 OK #43 jump test 4 OK Signed-off-by: Pranith Kumar <[email protected]> [mpe: test using samples/bpf] Signed-off-by: Michael Ellerman <[email protected]>

File /proc/sys/kernel/numa_balancing_scan_size_mb allows writing of zero. This bash command reproduces problem: $ while :; do echo 0 > /proc/sys/kernel/numa_balancing_scan_size_mb; \ echo 256 > /proc/sys/kernel/numa_balancing_scan_size_mb; done divide error: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 24112 Comm: bash Not tainted 3.17.0+ #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88013c852600 ti: ffff880037a68000 task.ti: ffff880037a68000 RIP: 0010:[<ffffffff81074191>] [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP: 0000:ffff880037a6bce0 EFLAGS: 00010246 RAX: 0000000000000a00 RBX: 00000000000003e8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013c852600 RBP: ffff880037a6bcf0 R08: 0000000000000001 R09: 0000000000015c90 R10: ffff880239bf6c00 R11: 0000000000000016 R12: 0000000000003fff R13: ffff88013c852600 R14: ffffea0008d1b000 R15: 0000000000000003 FS: 00007f12bb048700(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000001505678 CR3: 0000000234770000 CR4: 00000000000006f0 Stack: ffff88013c852600 0000000000003fff ffff880037a6bd18 ffffffff810741d1 ffff88013c852600 0000000000003fff 000000000002bfff ffff880037a6bda8 ffffffff81077ef7 ffffea0008a56d40 0000000000000001 0000000000000001 Call Trace: [<ffffffff810741d1>] task_scan_max+0x11/0x40 [<ffffffff81077ef7>] task_numa_fault+0x1f7/0xae0 [<ffffffff8115a896>] ? migrate_misplaced_page+0x276/0x300 [<ffffffff81134a4d>] handle_mm_fault+0x62d/0xba0 [<ffffffff8103e2f1>] __do_page_fault+0x191/0x510 [<ffffffff81030122>] ? native_smp_send_reschedule+0x42/0x60 [<ffffffff8106dc00>] ? check_preempt_curr+0x80/0xa0 [<ffffffff8107092c>] ? wake_up_new_task+0x11c/0x1a0 [<ffffffff8104887d>] ? do_fork+0x14d/0x340 [<ffffffff811799bb>] ? get_unused_fd_flags+0x2b/0x30 [<ffffffff811799df>] ? __fd_install+0x1f/0x60 [<ffffffff8103e67c>] do_page_fault+0xc/0x10 [<ffffffff8150d322>] page_fault+0x22/0x30 RIP [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP <ffff880037a6bce0> ---[ end trace 9a826d16936c04de ]--- Also fix race in task_scan_min (it depends on compiler behaviour). Signed-off-by: Kirill Tkhai <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dario Faggioli <[email protected]> Cc: David Rientjes <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Rik van Riel <[email protected]> Link: http://lkml.kernel.org/r/1413455977.24793.78.camel@tkhai Signed-off-by: Ingo Molnar <[email protected]>

The function chandef_to_chanspec() failed when converting a chandef with bandwidth set to NL80211_CHAN_WIDTH_20_NOHT. This was reported by user running the device in AP mode. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 304 at drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c:381 chandef_to_chanspec.isra.11+0x158/0x184() Modules linked in: CPU: 0 PID: 304 Comm: hostapd Not tainted 3.16.0-rc7-abb+g64aa90f #8 [<c0014bb4>] (unwind_backtrace) from [<c0012314>] (show_stack+0x10/0x14) [<c0012314>] (show_stack) from [<c001d3f8>] (warn_slowpath_common+0x6c/0x8c) [<c001d3f8>] (warn_slowpath_common) from [<c001d4b4>] (warn_slowpath_null+0x1c/0x24) [<c001d4b4>] (warn_slowpath_null) from [<c03449a4>] (chandef_to_chanspec.isra.11+0x158/0x184) [<c03449a4>] (chandef_to_chanspec.isra.11) from [<c0348e00>] (brcmf_cfg80211_start_ap+0x1e4/0x614) [<c0348e00>] (brcmf_cfg80211_start_ap) from [<c04d1468>] (nl80211_start_ap+0x288/0x414) [<c04d1468>] (nl80211_start_ap) from [<c043d144>] (genl_rcv_msg+0x21c/0x38c) [<c043d144>] (genl_rcv_msg) from [<c043c740>] (netlink_rcv_skb+0xac/0xc0) [<c043c740>] (netlink_rcv_skb) from [<c043cf14>] (genl_rcv+0x20/0x34) [<c043cf14>] (genl_rcv) from [<c043c0a0>] (netlink_unicast+0x150/0x20c) [<c043c0a0>] (netlink_unicast) from [<c043c4b8>] (netlink_sendmsg+0x2b8/0x398) [<c043c4b8>] (netlink_sendmsg) from [<c04066a4>] (sock_sendmsg+0x84/0xa8) [<c04066a4>] (sock_sendmsg) from [<c0407c5c>] (___sys_sendmsg.part.29+0x268/0x278) [<c0407c5c>] (___sys_sendmsg.part.29) from [<c0408bdc>] (__sys_sendmsg+0x4c/0x7c) [<c0408bdc>] (__sys_sendmsg) from [<c000ec60>] (ret_fast_syscall+0x0/0x44) ---[ end trace 965ee2158c9905a2 ]--- Cc: [email protected] # v3.17 Reported-by: Pontus Fuchs <[email protected]> Reviewed-by: Hante Meuleman <[email protected]> Reviewed-by: Daniel (Deognyoun) Kim <[email protected]> Reviewed-by: Franky (Zhenhui) Lin <[email protected]> Reviewed-by: Pieter-Paul Giesberts <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: John W. Linville <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Limit netlink communication length #8

Limit netlink communication length #8

dmah42 commented Feb 3, 2014

Limit netlink communication length #8

Limit netlink communication length #8

Comments

dmah42 commented Feb 3, 2014