@opsiff opsiff commented Oct 10, 2025

mainline inclusion
from mainline-v6.14-rc1
category: performance

On busy multi-threaded workloads, there can be significant contention
on the mm_cpumask at context switch time.

Reduce that contention by updating mm_cpumask lazily, setting the CPU bit
at context switch time (if not already set), and clearing the CPU bit at
the first TLB flush sent to a CPU where the process isn't running.

When a flurry of TLB flushes for a process happens, only the first one
will be sent to CPUs where the process isn't running. The others will
be sent only to CPUs where the process is currently running.
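
As an illustration only, the lazy scheme described above can be modelled in user space. This is a minimal sketch, not the kernel code: `sim_mm`, `sim_switch_mm()`, `sim_flush_mm()` and `SIM_NR_CPUS` are hypothetical names, the mask is a plain integer rather than a `cpumask_t`, and an "IPI" is simply counted.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NR_CPUS 8   /* hypothetical machine size for the model */

struct sim_mm {
    uint32_t cpumask;   /* bit n set: CPU n may still hold TLB entries */
};

/* Which mm each simulated CPU is currently running (NULL: none). */
static const struct sim_mm *sim_loaded_mm[SIM_NR_CPUS];

/*
 * Context switch: set the CPU bit in the next mm only if it is not
 * already set. The previous mm's bit is deliberately left alone.
 */
static void sim_switch_mm(struct sim_mm *next, unsigned int cpu)
{
    assert(cpu < SIM_NR_CPUS);
    if (!(next->cpumask & (UINT32_C(1) << cpu)))
        next->cpumask |= UINT32_C(1) << cpu;
    sim_loaded_mm[cpu] = next;
}

/*
 * TLB flush for mm: every CPU in the mask gets an "IPI". A CPU that
 * receives one while not running mm clears its own bit, so later
 * flushes skip it. Returns the number of IPIs sent.
 */
static unsigned int sim_flush_mm(struct sim_mm *mm)
{
    unsigned int ipis = 0;

    for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
        uint32_t bit = UINT32_C(1) << cpu;

        if (!(mm->cpumask & bit))
            continue;
        ipis++;
        if (sim_loaded_mm[cpu] != mm)
            mm->cpumask &= ~bit;    /* lazy clear on first stray flush */
    }
    return ipis;
}
```

With one CPU still running the mm and a second CPU having switched away from it, the first `sim_flush_mm()` call reaches both CPUs and the second call reaches only the CPU that is still running the mm.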

On an AMD Milan system with 36 cores, there is a noticeable difference:
$ hackbench --groups 20 --loops 10000

  Before: ~4.5s +/- 0.1s
  After:  ~4.2s +/- 0.1s

Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 209954c)
Signed-off-by: Wentao Guan <[email protected]>

mainline inclusion
from mainline-v6.14-rc1
category: performance

Add a tracepoint when we send a TLB flush IPI to a CPU that used
to be in the mm_cpumask, but isn't any more.
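
A rough user-space sketch of what the new trace reason distinguishes. `TLB_REMOTE_WRONG_CPU` is the name used in this series; the other enumerators and their order are assumed to match `enum tlb_flush_reason` in include/linux/mm_types.h and should be checked against the tree. `sim_trace_flush()`, `sim_handle_remote_flush()` and the counter array are hypothetical stand-ins for the real tracepoint.

```c
#include <assert.h>
#include <stdbool.h>

/* Modelled on enum tlb_flush_reason; order is an assumption. */
enum sim_flush_reason {
    TLB_FLUSH_ON_TASK_SWITCH,
    TLB_REMOTE_SHOOTDOWN,
    TLB_LOCAL_SHOOTDOWN,
    TLB_LOCAL_MM_SHOOTDOWN,
    TLB_REMOTE_SEND_IPI,
    TLB_REMOTE_WRONG_CPU,       /* the reason this commit adds */
    NR_TLB_FLUSH_REASONS
};

/* Stand-in for the tracepoint: count events per reason. */
static unsigned long sim_flush_events[NR_TLB_FLUSH_REASONS];

static void sim_trace_flush(enum sim_flush_reason reason)
{
    assert(reason < NR_TLB_FLUSH_REASONS);
    sim_flush_events[reason]++;
}

/*
 * Receiver side of a flush IPI: if this CPU is no longer running the
 * mm being flushed, record the "wrong CPU" reason instead of a normal
 * remote shootdown.
 */
static void sim_handle_remote_flush(bool mm_loaded_here)
{
    sim_trace_flush(mm_loaded_here ? TLB_REMOTE_SHOOTDOWN
                                   : TLB_REMOTE_WRONG_CPU);
}
```

Counting the two reasons separately is what lets a trace show how often an IPI landed on a CPU that had already moved on.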

Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 2815a56)
Signed-off-by: Wentao Guan <[email protected]>

@sourcery-ai sourcery-ai bot left a comment

Sorry @opsiff, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from opsiff. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mainline inclusion
from mainline-v6.16-rc1
category: bugfix

The CONFIG_DEBUG_VM=y warning in switch_mm_irqs_off() started
triggering in testing:

	VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(prev)));

AFAIU what happens is that unuse_temporary_mm() clears the mm_cpumask()
for the current CPU, while switch_mm_irqs_off() then checks that the
mm_cpumask() bit is set for the current CPU.

This behaviour hasn't really changed since the following commit introduced
both:

  209954c ("x86/mm/tlb: Update mm_cpumask lazily")

Nevertheless, the warning is wrong, so remove it.
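
To make the conflict concrete, here is a small user-space model of the condition quoted above. It is a sketch under stated assumptions: `sim_prev_mm`, `sim_removed_warning_fires()` and `sim_clear_cpu()` are hypothetical names, the mask is a plain integer, and the exact call ordering inside the kernel is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_WARN_NR_CPUS 8  /* hypothetical machine size for the model */

struct sim_prev_mm {
    unsigned long cpumask;  /* one bit per CPU */
    bool is_init_mm;        /* stands in for the prev == &init_mm test */
};

/* Mirrors the removed check: prev is not init_mm and the CPU bit is clear. */
static bool sim_removed_warning_fires(const struct sim_prev_mm *prev,
                                      unsigned int cpu)
{
    assert(cpu < SIM_WARN_NR_CPUS);
    return !prev->is_init_mm && !(prev->cpumask & (1UL << cpu));
}

/* Stand-in for the mask clearing done on behalf of the temporary mm. */
static void sim_clear_cpu(struct sim_prev_mm *mm, unsigned int cpu)
{
    assert(cpu < SIM_WARN_NR_CPUS);
    mm->cpumask &= ~(1UL << cpu);
}
```

Once the bit for the current CPU has been cleared for a non-init mm, the check evaluates to true even though nothing is actually wrong, which is the situation the commit message describes.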

[ mingo: Patchified Peter's email. ]

Reported-by: [email protected]
Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 52ebfe7)
Signed-off-by: Wentao Guan <[email protected]>

Conflicts:
	arch/x86/mm/tlb.c

@deepin-ci-robot

deepin pr auto review

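This code modifies memory-management (MM) and TLB (translation lookaside buffer) handling on the x86 architecture. Below is an analysis of the changes along with review comments: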

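1. Overview of the code changes

  • In alternative.c:

    • Moved the declarations of the poking_mm and poking_addr variables
    • Added an operation that clears the CPU mask in the unuse_temporary_mm function
    • Changed the function order, moving the variable declarations after the functions that use them
  • In tlb.c:

    • Removed the check on prev != &init_mm and the related code
    • Changed the condition under which the CPU mask bit for next is set
    • Added code to flush_tlb_func to handle the TLB_REMOTE_WRONG_CPU case
  • In mm_types.h:

    • Added the new TLB flush reason TLB_REMOTE_WRONG_CPU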

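2. Syntax and logic analysis

  • The code is syntactically correct, with no obvious syntax errors
  • The logic flow is broadly reasonable, but some changes deserve attention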

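3. Suggestions for improving code quality

  1. Variable declaration placement

    • The declarations of poking_mm and poking_addr were moved after the functions that use them, which goes against the usual principles of code organization; consider moving them back to the top of the file
    • These variables are marked __ro_after_init, meaning they are read-only after initialization; that usage is appropriate
  2. Comment completeness

    • The newly added TLB handling logic lacks sufficient comments; consider adding more detailed comments explaining why the changes are needed
    • In particular, the purpose and effect of the cpumask_clear_cpu operation need a clearer explanation
  3. Code organization

    • The change in unuse_temporary_mm should carry a clearer comment explaining why the CPU mask needs to be cleared at that point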

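4. Performance analysis

  1. CPU mask operations

    • The added cpumask_clear_cpu and cpumask_set_cpu operations may affect performance, because they involve atomic operations and cache-coherence traffic
    • Confirm whether these operations are really necessary, or whether the same result can be achieved more efficiently
  2. Simplifying the conditional

    • The condition if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next))) could be split into two separate conditions to improve readability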
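
On the cost of the mask operations: the condition quoted above tests the bit before setting it, so the shared mask is written only when the bit is actually missing. A minimal user-space sketch of that test-before-set pattern with C11 atomics follows; `sim_lazy_set_cpu()` is a hypothetical helper, not a kernel API, and the kernel's `cpumask_test_cpu()`/`cpumask_set_cpu()` are not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Set the bit for cpu in a shared mask, but only issue the atomic
 * read-modify-write when a plain load shows the bit is missing.
 * Returns true if a write was needed.
 */
static bool sim_lazy_set_cpu(atomic_ulong *mask, unsigned int cpu)
{
    unsigned long bit;

    assert(cpu < sizeof(unsigned long) * CHAR_BIT);
    bit = 1UL << cpu;

    /* Common case: bit already set, so the mask is only read. */
    if (atomic_load(mask) & bit)
        return false;

    /* Rare case: first time this CPU runs the mm since the bit was cleared. */
    atomic_fetch_or(mask, bit);
    return true;
}
```

In the common case the function only reads the mask, so repeated context switches on the same CPU avoid the atomic write that the review comment is concerned about.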

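5. Safety analysis

  1. Race conditions

    • The modified code involves synchronization between multiple CPUs, so it must be verified to be free of race conditions
    • In particular, when flush_tlb_func handles the TLB_REMOTE_WRONG_CPU case, all related operations must be atomic
  2. Error handling

    • The new TLB flush reason TLB_REMOTE_WRONG_CPU should have a corresponding error-handling mechanism rather than only recording a trace event
    • Consider adding a suitable error-recovery mechanism, or at least logging more detailed error information
  3. Boundary conditions

    • Make sure the code works correctly under all boundary conditions, especially when handling init_mm versus an ordinary mm_struct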

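6. Overall recommendations

  1. Add more detailed comments explaining the motivation and purpose of the changes
  2. Consider moving the variable declarations back to the top of the file, in line with good code organization
  3. Evaluate the performance impact of the CPU mask operations and confirm that they are necessary
  4. Strengthen error handling, especially for abnormal TLB flush cases
  5. Add more boundary-condition checks to make the code more robust

These changes appear to address TLB flush issues, in particular TLB flushes on remote CPUs. The goal is reasonable, but the implementation may need more testing and validation to make sure it does not introduce new problems.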

Copilot AI left a comment

Pull Request Overview

This PR backports x86 optimizations to TLB (Translation Lookaside Buffer) handling at context switch time. The changes make mm_cpumask updates lazy: the CPU bit is set at context switch only if it is not already set, and it is cleared on the first TLB flush that reaches a CPU where the process isn't running. They also add a trace reason for that remote-flush case.

  • Adds new TLB flush reason TLB_REMOTE_WRONG_CPU for better debugging/tracing
  • Optimizes CPU mask management by deferring CPU removal until actually needed
  • Improves temporary memory management for code patching operations

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File | Description
--- | ---
include/linux/mm_types.h | Adds new TLB flush reason enum for wrong CPU scenarios
arch/x86/mm/tlb.c | Optimizes context switching by deferring cpumask operations and adding CPU clearing logic
arch/x86/kernel/alternative.c | Moves variable declarations and adds cpumask clearing for temporary mm operations
