-
Notifications
You must be signed in to change notification settings - Fork 102
[Deepin-Kernel-SIG] [linux 6.12-y] [Deepin] deepin: arm64: cpufeature: disable LSE on some cpus #1333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: linux-6.12.y
Are you sure you want to change the base?
[Deepin-Kernel-SIG] [linux 6.12-y] [Deepin] deepin: arm64: cpufeature: disable LSE on some cpus #1333
Conversation
Reviewer's GuideGate ARM64 LSE atomic instruction support behind a runtime CPU-model check that can automatically disable LSE on known-slow microarchitectures, controlled by a new boot-time kernel parameter, and wire this check into the existing ARM64 LSE capability detection and documentation. Sequence diagram for ARM64 LSE capability check with lse_disable_check parametersequenceDiagram
actor Admin
participant Bootloader
participant Kernel
participant arm64_lse_disable_check
participant has_lse_capability_check
participant has_cpuid_feature
Admin->>Bootloader: Configure kernel cmdline (lse_disable_check=0 or 1)
Bootloader->>Kernel: Pass kernel cmdline
Kernel->>arm64_lse_disable_check: early_param lse_disable_check
arm64_lse_disable_check->>Kernel: Set lse_disable_check (default 0 if absent)
Kernel->>has_lse_capability_check: Evaluate ARM64_HAS_LSE_ATOMICS
has_lse_capability_check->>has_lse_capability_check: Read lse_disable_check
has_lse_capability_check->>has_lse_capability_check: Check CPU midr in lse_disable_list
alt lse_disable_check == 0 and CPU in lse_disable_list
has_lse_capability_check-->>Kernel: Return false (disable LSE)
Kernel->>Kernel: Log info "LSE atomics: use llsc for performance..." (system scope)
else lse_disable_check == 1 or CPU not in list
has_lse_capability_check->>has_cpuid_feature: Delegate capability test
has_cpuid_feature-->>has_lse_capability_check: Return CPUID-based result
has_lse_capability_check-->>Kernel: Return CPUID-based result
end
Kernel->>Kernel: Configure atomic implementation (LSE or LL/SC)
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The logic around
lse_disable_checkis a bit confusing: the variable name, the default value of0, the in-code comment (/* Disable LSE when lse_disable_check is enabled */), and thepr_infomessage don’t all line up with what the code actually does; consider inverting the boolean or renaming it and updating the comment/log message so it’s clear that1means “skip the auto-disable check and keep LSE enabled.” - Since
lse_disable_checkis only mutated during early boot viaearly_paramand then treated as read-only, it would be more appropriate to use__ro_after_initinstead of__read_mostlyfor stronger protection and clearer intent. - The
lse_disable_listcould be moved to file scope as astatic consttable rather than being re-declared on each call tohas_lse_capability_check, which would make it easier to extend and slightly reduce per-call overhead.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The logic around `lse_disable_check` is a bit confusing: the variable name, the default value of `0`, the in-code comment (`/* Disable LSE when lse_disable_check is enabled */`), and the `pr_info` message don’t all line up with what the code actually does; consider inverting the boolean or renaming it and updating the comment/log message so it’s clear that `1` means “skip the auto-disable check and keep LSE enabled.”
- Since `lse_disable_check` is only mutated during early boot via `early_param` and then treated as read-only, it would be more appropriate to use `__ro_after_init` instead of `__read_mostly` for stronger protection and clearer intent.
- The `lse_disable_list` could be moved to file scope as a `static const` table rather than being re-declared on each call to `has_lse_capability_check`, which would make it easier to extend and slightly reduce per-call overhead.
## Individual Comments
### Comment 1
<location> `Documentation/admin-guide/kernel-parameters.txt:3261` </location>
<code_context>
+ (Load-Link/Store-Conditional) is substantially faster.
+
+ The default value is 0 (enabled), which automatically
+ disables LSE on some systems. Set to 1 to bypassing
+ the automatic disabling of LSE on affected systems.
+
</code_context>
<issue_to_address>
**issue (typo):** Fix the grammatical issue in "Set to 1 to bypassing"
Please change "Set to 1 to bypassing the automatic disabling of LSE on affected systems" to "Set to 1 to bypass the automatic disabling of LSE on affected systems."
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
deepin inclusion category: performance Disable LSE (Large System Extensions) atomic instructions on some systems to improve performance of per-CPU atomic operations. LSE atomics can exhibit significant overhead on certain microarchitectures (e.g., TSV110) due to "far atomic" implementations bypassing L1 cache. LL/SC (Load-Link/Store-Conditional) is substantially faster. The default value is 0 (enabled), which automatically disables LSE on some systems. Set to 1 to skip the check enablement on our test systems regardless of performance impact. When this feature is active, the kernel logs: "LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature." PS: Test with byte-unixbench6 in kp920 24c and 64GB memory, improve whole scores by 3.8%. Test with https://github.com/leitao/debug/tree/main/LSE percpu_bench CPU0: LSE (stadd) (c 0, d 0): p50: 029.00 ns p95: 029.09 ns p99: 029.12 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.55 ns p95: 069.57 ns p99: 069.71 ns CPU1: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 029.00 ns p99: 029.01 ns LL/SC (c 0, d 0): p50: 006.56 ns p95: 006.58 ns p99: 006.58 ns LDADD (c 0, d 0): p50: 010.04 ns p95: 010.06 ns p99: 010.06 ns CPU2: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.53 ns p95: 069.56 ns p99: 069.58 ns ... CPU23: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 064.93 ns p95: 064.95 ns p99: 064.97 ns Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop [1] Link: deepin-community#1320 Link: deepin-community#1302 Signed-off-by: Wentao Guan <[email protected]> (cherry picked from commit 6afecf6) Conflicts: Documentation/admin-guide/kernel-parameters.txt
192b2af to
2e0e00f
Compare
deepin pr auto review我来对这个Git diff进行详细的代码审查:
改进建议: static bool force_enable_lse __read_mostly;
static int __init arm64_force_enable_lse(char *str)
{
return kstrtobool(str, &force_enable_lse);
}
early_param("lse_disable_check", arm64_force_enable_lse);
static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap,
int scope)
{
/* List of CPUs where LSE is slower than LL/SC */
static const struct midr_range lse_disable_list[] = {
MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
{ /* sentinel */ }
};
/* If LSE is forced enabled, skip the check */
if (force_enable_lse)
return has_cpuid_feature(cap, scope);
/* Check if current CPU is in the disable list */
if (is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) {
if (scope == SCOPE_SYSTEM)
pr_info("LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature.\n");
return false;
}
return has_cpuid_feature(cap, scope);
}这些改进主要是:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a mechanism to selectively disable LSE (Large System Extensions) atomic instructions on ARM64 systems where LL/SC (Load-Link/Store-Conditional) operations perform better than LSE atomics. The change addresses performance issues on certain microarchitectures (e.g., HiSilicon TSV110) where LSE "far atomic" implementations bypass L1 cache, resulting in significant overhead for per-CPU atomic operations.
Key Changes:
- Add runtime CPU model detection to conditionally disable LSE atomics on affected systems
- Introduce
lse_disable_checkboot parameter to override automatic LSE disabling behavior - Document the new parameter with usage guidelines and performance context
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| arch/arm64/kernel/cpufeature.c | Implements has_lse_capability_check() function to detect affected CPU models (TSV110) and conditionally disable LSE atomics based on boot parameter and CPU blacklist |
| Documentation/admin-guide/kernel-parameters.txt | Documents the new lse_disable_check boot parameter including default behavior, usage examples, and performance rationale |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| early_param("lse_disable_check", arm64_lse_disable_check); | ||
|
|
||
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | ||
| int scope) |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect indentation: The function parameters should be aligned properly. The second line should use tabs to align with the opening parenthesis of the first parameter, not excessive spaces. This is inconsistent with the coding style used elsewhere in the file (see has_useable_gicv3_cpuif at line 1687 for reference).
| int scope) | |
| int scope) |
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | ||
| int scope) | ||
| { | ||
| /* List of CPUs that LSE are slow more than llsc */ |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grammar error: "LSE are slow" should be "LSE is slower" or "LSE atomics are slower". The comment is grammatically incorrect as written.
| /* List of CPUs that LSE are slow more than llsc */ | |
| /* List of CPUs where LSE atomics are slower than llsc */ |
| }; | ||
|
|
||
| /* Disable LSE when lse_disable_check is 0 and in lse_disable_list */ | ||
| if (lse_disable_check == 0 && is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) { |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comparison lse_disable_check == 0 should use !lse_disable_check for consistency with idiomatic C boolean checks. This is more readable and follows common kernel coding patterns.
| if (lse_disable_check == 0 && is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) { | |
| if (!lse_disable_check && is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) { |
| The default value is 0 (enabled), which automatically | ||
| disables LSE on some systems. Set to 1 to bypass | ||
| the automatic disabling of LSE on affected systems. | ||
|
|
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation is confusing. Line 3260 states "default value is 0 (enabled)" but the variable name is lse_disable_check, and when it's 0 (false), the code disables LSE on affected CPUs. The phrasing should clarify what is being "enabled" - the automatic disabling feature, not LSE itself. Consider rewording to: "The default value is 0, which enables automatic disabling of LSE on affected systems. Set to 1 to skip this check and keep LSE enabled regardless of CPU model."
| The default value is 0 (enabled), which automatically | |
| disables LSE on some systems. Set to 1 to bypass | |
| the automatic disabling of LSE on affected systems. | |
| The default value is 0, which enables automatic disabling of LSE on affected systems. | |
| Set to 1 to skip this check and keep LSE enabled regardless of CPU model. |
| static bool lse_disable_check __read_mostly; | ||
|
|
||
| static int __init arm64_lse_disable_check(char *str) | ||
| { | ||
| return kstrtobool(str, &lse_disable_check); | ||
| } | ||
| early_param("lse_disable_check", arm64_lse_disable_check); |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The parameter name lse_disable_check is confusing and has inverted logic. When set to 1, it actually enables LSE (by disabling the check that would disable LSE). A clearer name would be lse_force_enable or lse_skip_check, where the meaning is more intuitive (1 = force enable LSE, 0 = allow automatic disabling). The current name requires double-negative reasoning ("disable" the "disable check").
deepin inclusion
category: performance
Disable LSE (Large System Extensions) atomic instructions on some systems to improve performance of per-CPU atomic operations. LSE atomics can exhibit significant overhead on certain microarchitectures (e.g., TSV110) due
to "far atomic" implementations bypassing L1 cache. LL/SC (Load-Link/Store-Conditional) is substantially faster.
The default value is 0 (enabled), which automatically disables LSE on some systems. Set to 1 to skip the check enablement on our test systems regardless of performance impact.
When this feature is active, the kernel logs:
"LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature."
PS:
Test with byte-unixbench6 in kp920 24c and 64GB memory, improve whole scores by 3.8%.
Test with https://github.com/leitao/debug/tree/main/LSE percpu_bench CPU0:
LSE (stadd) (c 0, d 0): p50: 029.00 ns p95: 029.09 ns p99: 029.12 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.55 ns p95: 069.57 ns p99: 069.71 ns CPU1:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 029.00 ns p99: 029.01 ns LL/SC (c 0, d 0): p50: 006.56 ns p95: 006.58 ns p99: 006.58 ns LDADD (c 0, d 0): p50: 010.04 ns p95: 010.06 ns p99: 010.06 ns CPU2:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.53 ns p95: 069.56 ns p99: 069.58 ns ...
CPU23:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 064.93 ns p95: 064.95 ns p99: 064.97 ns
Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop [1]
Link: #1320
Link: #1302
(cherry picked from commit 6afecf6) Conflicts:
Documentation/admin-guide/kernel-parameters.txt
Summary by Sourcery
Gate enabling of ARM64 LSE atomic instructions behind a CPU-specific performance check and optional boot-time override to prefer LL/SC atomics on affected systems.
New Features:
Enhancements: