[Deepin-Kernel-SIG] [linux 6.12-y] [Upstream] Merge x86,tlb: context switch optimizations #1214

opsiff · 2025-10-10T12:49:25Z

x86,tlb: context switch optimizations
https://lore.kernel.org/all/[email protected]/

mainline inclusion
from mainline-v6.14-rc1
category: performance

On busy multi-threaded workloads, there can be significant contention on the mm_cpumask at context switch time.

Reduce that contention by updating mm_cpumask lazily, setting the CPU bit at context switch time (if not already set), and clearing the CPU bit at the first TLB flush sent to a CPU where the process isn't running.

When a flurry of TLB flushes for a process happen, only the first one will be sent to CPUs where the process isn't running. The others will be sent to CPUs where the process is currently running.

On an AMD Milan system with 36 cores, there is a noticeable difference:

    $ hackbench --groups 20 --loops 10000

    Before: ~4.5s +/- 0.1s
    After:  ~4.2s +/- 0.1s

Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 209954c)
Signed-off-by: Wentao Guan <[email protected]>
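The lazy update described in this commit message can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: `toy_mm`, `toy_switch_to`, `toy_switch_away`, and `toy_flush` are hypothetical names, the mask is a plain 64-bit word rather than a kernel cpumask, and the clearing is done by the flusher for brevity, whereas the commit message describes it as happening at the first TLB flush sent to the CPU. No locking, memory ordering, or IPI delivery is represented.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 64

/* Hypothetical stand-in for an mm: one bit per CPU in each word. */
struct toy_mm {
    uint64_t cpumask;  /* CPUs that may still need TLB flushes for this mm */
    uint64_t running;  /* CPUs currently executing this mm */
};

/* Switch in: touch the shared mask only if the bit is not already set. */
static void toy_switch_to(struct toy_mm *mm, int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    uint64_t bit = UINT64_C(1) << cpu;

    if (!(mm->cpumask & bit))
        mm->cpumask |= bit;
    mm->running |= bit;
}

/* Switch out: the mask bit is deliberately left set (the lazy part). */
static void toy_switch_away(struct toy_mm *mm, int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    mm->running &= ~(UINT64_C(1) << cpu);
}

/*
 * Flush: target every CPU in the mask and return how many were targeted.
 * A CPU that is no longer running the mm is dropped from the mask, so
 * later flushes in the same burst skip it.
 */
static int toy_flush(struct toy_mm *mm)
{
    int sent = 0;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        uint64_t bit = UINT64_C(1) << cpu;

        if (!(mm->cpumask & bit))
            continue;
        sent++;
        if (!(mm->running & bit))
            mm->cpumask &= ~bit;
    }
    return sent;
}
```

In this model, a process that ran on CPUs 0 and 1 and then left CPU 1 sees its first flush go to both CPUs and every following flush go to CPU 0 only, which is the "only the first one" behaviour the commit message describes.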

mainline inclusion
from mainline-v6.14-rc1
category: performance

Add a tracepoint when we send a TLB flush IPI to a CPU that used to be in the mm_cpumask, but isn't any more.

Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Rik van Riel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 2815a56)
Signed-off-by: Wentao Guan <[email protected]>

sourcery-ai

Sorry @opsiff, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

deepin-ci-robot · 2025-10-10T12:49:34Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from opsiff. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

deepin/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mainline inclusion
from mainline-v6.16-rc1
category: bugfix

The CONFIG_DEBUG_VM=y warning in switch_mm_irqs_off() started triggering in testing:

    VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(prev)));

AFAIU what happens is that unuse_temporary_mm() clears the mm_cpumask() for the current CPU, while switch_mm_irqs_off() then checks that the mm_cpumask() bit is set for the current CPU.

While this behaviour hasn't really changed since the following commit:

    209954c ("x86/mm/tlb: Update mm_cpumask lazily")

introduced both, but the warning is wrong, so remove it.

[ mingo: Patchified Peter's email. ]

Reported-by: [email protected]
Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 52ebfe7)
Signed-off-by: Wentao Guan <[email protected]>

Conflicts:
    arch/x86/mm/tlb.c
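The interaction this commit message describes can be reproduced in a toy model. The sketch below follows the message's own (explicitly tentative) reading: the mask bit for the current CPU is cleared when the temporary mm is dropped, and the removed check then finds it unset. All `toy_` names are hypothetical, the mask is a plain 64-bit word, and `toy_removed_check_fires` only mirrors the boolean expression quoted in the warning. It is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an mm with a one-word CPU mask. */
struct toy_mm {
    uint64_t cpumask;
};

/* Model of switching to the temporary mm: mark this CPU in its mask. */
static void toy_use_temporary_mm(struct toy_mm *temp, int cpu)
{
    temp->cpumask |= UINT64_C(1) << cpu;
}

/* Model of dropping the temporary mm: the current CPU's bit is cleared. */
static void toy_unuse_temporary_mm(struct toy_mm *temp, int cpu)
{
    temp->cpumask &= ~(UINT64_C(1) << cpu);
}

/*
 * Mirrors the removed condition:
 *   prev != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(prev))
 * Returns true when the warning would have fired.
 */
static bool toy_removed_check_fires(const struct toy_mm *prev,
                                    const struct toy_mm *init_mm, int cpu)
{
    return prev != init_mm && !(prev->cpumask & (UINT64_C(1) << cpu));
}
```

With the temporary mm as `prev`, the check is quiet while the bit is set and fires once the bit has been cleared, even though nothing is actually wrong in that state. That matches the commit's conclusion that the warning itself was incorrect.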

deepin-ci-robot · 2025-10-10T12:58:34Z

deepin pr auto review

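This code modifies memory management (MM) and TLB (translation lookaside buffer) handling on the x86 architecture. Below is an analysis of the changes, followed by review comments:

1. Overview of code changes

In alternative.c:
- Moved the declarations of the poking_mm and poking_addr variables
- Added an operation in unuse_temporary_mm that clears the CPU mask
- Changed the function order, so that the variable declarations now come after the functions that use them
In tlb.c:
- Removed the check on prev != &init_mm and the related code
- Changed the condition under which the CPU mask of next is set
- Added code in flush_tlb_func to handle the TLB_REMOTE_WRONG_CPU case
In mm_types.h:
- Added the new TLB flush reason TLB_REMOTE_WRONG_CPU

2. Syntax and logic analysis

- The code is syntactically correct, with no obvious syntax errors
- The logic flow is basically sound, but some of the changes deserve attention

3. Code quality suggestions

Variable declaration placement:
- The declarations of poking_mm and poking_addr were moved to after the functions that use them, which goes against the usual principles of code organization; consider moving the declarations back to the top of the file
- These variables are marked __ro_after_init, meaning they are read-only after initialization; this usage is appropriate
Comment completeness:
- The newly added TLB handling logic lacks sufficient comments; consider adding more detailed comments explaining why these changes are needed
- In particular, the purpose and effect of the cpumask_clear_cpu operation need a clearer explanation
Code organization:
- The change in unuse_temporary_mm should carry a clearer comment explaining why the CPU mask needs to be cleared at that point

4. Performance analysis

CPU mask operations:
- The added cpumask_clear_cpu and cpumask_set_cpu operations may affect performance, because they involve atomic operations and cache coherency
- It should be confirmed whether these operations are really necessary, or whether the same result can be achieved more efficiently
Conditional checks:
- The condition if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next))) could be split into two separate conditions to improve readability

5. Security analysis

Race conditions:
- The modified code involves synchronization between multiple CPUs, so it must be verified to be free of race conditions
- In particular, when flush_tlb_func handles the TLB_REMOTE_WRONG_CPU case, all related operations must be atomic
Error handling:
- The new TLB flush reason TLB_REMOTE_WRONG_CPU should have a corresponding error handling mechanism, rather than only recording a trace event
- Consider adding a suitable error recovery mechanism, or at least logging more detailed error information
Boundary conditions:
- The code must work correctly under all boundary conditions, especially when handling init_mm versus an ordinary mm_struct

6. Overall recommendations

- Add more detailed comments explaining the motivation and purpose of the changes
- Consider moving the variable declarations back to the top of the file, in line with good code organization
- Evaluate the performance impact of the CPU mask operations and confirm that they are necessary
- Strengthen error handling, especially for abnormal TLB flush cases
- Add more boundary condition checks to make the code more robust

These changes appear intended to address TLB flush issues, in particular the handling of TLB flushes on remote CPUs. The goal of the changes is reasonable, but the implementation may need more testing and validation to ensure it does not introduce new problems.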

Copilot

Pull Request Overview

This PR merges x86 TLB context switch optimizations that improve TLB (Translation Lookaside Buffer) management during process context switches. The changes optimize when CPUs are added to and removed from a process's mm_cpumask and add better tracking for remote TLB flush scenarios.

- Adds new TLB flush reason TLB_REMOTE_WRONG_CPU for better debugging/tracing
- Optimizes CPU mask management by deferring CPU removal until actually needed
- Improves temporary memory management for code patching operations

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| include/linux/mm_types.h | Adds new TLB flush reason enum for wrong CPU scenarios |
| arch/x86/mm/tlb.c | Optimizes context switching by deferring cpumask operations and adding CPU clearing logic |
| arch/x86/kernel/alternative.c | Moves variable declarations and adds cpumask clearing for temporary mm operations |

rikvanriel added 2 commits October 10, 2025 20:43

sourcery-ai bot reviewed Oct 10, 2025

deepin-ci-robot requested review from Wenlp and chenchongbiao October 10, 2025 12:49

Avenger-285714 requested review from Avenger-285714 and Copilot October 11, 2025 01:26

Copilot AI reviewed Oct 11, 2025
