-
Notifications
You must be signed in to change notification settings - Fork 103
[Deepin-Kernel-SIG] [linux 6.6-y] [Upstream] Sync maple_tree patch series from mainline #1127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: linux-6.6.y
Are you sure you want to change the base?
[Deepin-Kernel-SIG] [linux 6.6-y] [Upstream] Sync maple_tree patch series from mainline #1127
Conversation
mainline inclusion from mainline-v6.8-rc1 category: performance Patch series "maple_tree: iterator state changes". These patches have some general cleanup and a change to separate the maple state status tracking from the maple state node. The maple state status change allows for walks to continue from previous places when the status needs to be recorded to make logical sense for the next call to the maple state. For instance, it allows for prev/next to function in a way that better resembles the linked list. It also allows switch statements to be used to detect missed states during compile, and the addition of fast-path "active" state is cleaner as an enum. While making the status change, perf showed some very small (one line) functions that were not inlined even with the inline key word. Making these small functions __always_inline is less expensive according to perf. As part of that change, some inlines have been dropped from larger functions. Perf also showed that the commonly used mas_for_each() iterator was spending a lot of time finding the end of the node. This series introduces caching of the end of the node in the maple state (and updating it during writes). This caching along with the inline changes yielded at 23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test framework benchmark. I've also included a change to mtree_range_walk and mtree_lookup_walk to take advantage of Peng's change [1] to the initial pivot setup. mmtests did not produce any significant gains. [1] https://lore.kernel.org/all/[email protected]/T/#u This patch (of 12): Removing the default types from the switch statements will cause compile warnings on missing cases. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Suggested-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 37a8ab2) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance mas_erase() may not deal correctly with all maple states. Make the function more robust by ensuring the state is in one of the two acceptable states. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit f7a5901) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance __mas_set_range() was created to shortcut resetting the maple state and a debug check was added to the caller (the vma iterator) to ensure the internal maple state remains safe to use. Move the debug check from the vma iterator into the maple tree itself so other users do not incorrectly use the advanced maple state modification. Fallout from this change include a large amount of debug setup needed to be moved to earlier in the header, and the maple_tree.h radix-tree test code needed to move the inclusion of the header to after the atomic define. None of those changes have functional changes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit bf857dd) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance Analysis of the mas_for_each() iteration showed that there is a significant time spent finding the end of a node. This time can be greatly reduced if the end of the node is cached in the maple state. Care must be taken to update & invalidate as necessary. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 31c532a) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance When looking for the next entry, don't recalculate the node end as it is now tracked in the maple state. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit e9c52d8) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance The node end is set during the walk, so use the resulting end instead of re-fetching it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 1f41ef1) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance There are a few functions which were inlined but are somewhat too large to inline, so remove the inline key word. There are also several very small functions which are used in critical code sections which gcc was not inlining, so make this more strict and use __always_line for these functions. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 271f61a) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance The maple tree node is overloaded to keep status as well as the active node. This, unfortunately, results in a re-walk on underflow or overflow. Since the maple state has room, the status can be placed in its own enum in the structure. Once an underflow/overflow is detected, certain modes can restore the status to active and others may need to re-walk just that one node to see the entry. The status being an enum has the benefit of detecting unhandled status in switch statements. [[email protected]: fix comments about MAS_*] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: update forking to separate maple state and node] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix mas_prev() state separation code] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 067311d) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance Now that the status of the maple state is outside of the node, the mas_searchable() function can be dropped for easier open-coding of what is going on. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 9a40d45) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance ma_wr_state was previously tracking the end of the node for writing. Since the implementation of the ma_state end tracking, this is duplicated work. This patch removes the maple write state tracking of the end of the node and uses the maple state end instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 0de56e3) [Conflict: because ("maple_tree: correct tree corruption on spanning store") merged before] Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance Since the pivot being set is now reliable, the optimized loop no longer needs to find the node end. The redundant check for a dead node can also be avoided as there is no danger of using the wrong pivot since the results will be thrown out in the case of a dead node by the later check. This patch also adds a benchmark test for the function to the maple tree test framework. The benchmark shows an average increase performance of 5.98% over 3 runs with this commit. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 24662de) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance mtree_range_walk() needed to be updated to avoid checking if there was a pivot value. On closer examination, the code could avoid setting min or max in certain scenarios. The commit removes the extra check for pivot[offset] before setting max and only sets max when necessary. It also only sets min if it is necessary by checking offset 0 prior to the loop (as it has always done). The commit also drops a dead node check since the end of the node will return the array size when the last slot is occupied (by a potential reuse in a dead node). The data will be discarded later if the node is marked dead. Benchmarking these changes results in an increase in performance of 5.45% using the BENCH_WALK in the maple tree test code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Cc: Peng Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit a3c63c8) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.8-rc1 category: performance Commit 0de56e3 ("maple_tree: use maple state end for write operations") was broken by a later patch "maple_tree: do not preallocate nodes for slot stores". But the later patch was scheduled ahead of 0de56e3, for 6.7-rc. This fixlet undoes the damage. Fixes: 0de56e3 ("maple_tree: use maple state end for write operations") Cc: Liam R. Howlett <[email protected]> Cc: Sidhartha Kumar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 5143eec) Signed-off-by: Wentao Guan <[email protected]>
mainline inclusion from mainline-v6.15-rc1 category: performance Remove a BUG_ON() right before a WARN_ON() with the same condition. Calling WARN_ON() and BUG_ON() here is definitely wrong. Since the goal is generally to remove BUG_ON() invocations from the kernel, keep only the WARN_ON(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 067311d ("maple_tree: separate ma_state node from status") Signed-off-by: Petr Tesarik <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 18ea595) Signed-off-by: Wentao Guan <[email protected]>
Reviewer's GuideThis PR backports and integrates the upstream maple_tree iterator state overhaul: sentinel node pointers are replaced by an explicit status enum in ma_state, each node’s end offset is cached for faster prev/next walks, performance‐critical small helpers are forced __always_inline, switch statements are cleaned up, Peng’s new pivot setup is leveraged in range and lookup walks, and a build error plus an invalid BUG_ON are fixed. Class diagram for updated maple_tree iterator state (ma_state)classDiagram
class ma_state {
+maple_tree* tree
+maple_enode* node
+unsigned long index
+unsigned long last
+unsigned long min
+unsigned long max
+maple_alloc* alloc
+maple_status status
+unsigned char depth
+unsigned char offset
+unsigned char mas_flags
+unsigned char end
}
class maple_status {
ma_active
ma_start
ma_root
ma_none
ma_pause
ma_overflow
ma_underflow
ma_error
}
ma_state --> maple_status : status
class ma_wr_state {
+ma_state* mas
+unsigned long r_min
+unsigned long r_max
+maple_type type
+unsigned char offset_end
+unsigned long* pivots
+unsigned long end_piv
+void __rcu** slots
+void* entry
+void* content
}
ma_wr_state --> ma_state : mas
Class diagram for removal of sentinel node pointers in ma_stateclassDiagram
class ma_state {
-maple_enode* node
-MAS_START
-MAS_ROOT
-MAS_NONE
-MAS_PAUSE
-MAS_OVERFLOW
-MAS_UNDERFLOW
+enum maple_status status
+maple_enode* node
}
class maple_status {
ma_active
ma_start
ma_root
ma_none
ma_pause
ma_overflow
ma_underflow
ma_error
}
ma_state --> maple_status : status
Class diagram for updated debug macros and counters (CONFIG_DEBUG_MAPLE_TREE)classDiagram
class maple_tree_tests_run {
atomic_t maple_tree_tests_run
}
class maple_tree_tests_passed {
atomic_t maple_tree_tests_passed
}
class mt_dump_format {
mt_dump_dec
mt_dump_hex
}
class mt_dump {
+void mt_dump(const maple_tree*, mt_dump_format)
}
class mas_dump {
+void mas_dump(const ma_state*)
}
class mas_wr_dump {
+void mas_wr_dump(const ma_wr_state*)
}
class mt_validate {
+void mt_validate(maple_tree*)
}
class mt_cache_shrink {
+void mt_cache_shrink()
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
deepin pr auto review代码审查分析总体概述这个补丁主要是对 Maple Tree 实现进行了重构和优化,主要改进包括:
具体分析1. 状态管理改进优点:
改进建议:
2. 错误处理改进优点:
改进建议:
3. 性能优化优点:
改进建议:
4. 安全性改进优点:
改进建议:
5. 代码结构改进优点:
改进建议:
具体问题
总结这个补丁总体上是积极的改进,提高了代码的可读性、安全性和性能。主要的改进集中在状态管理、错误处理和性能优化方面。建议在合并前进行充分的测试,确保所有功能正常工作,并且性能测试通过。 对于后续的开发,建议:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @opsiff, you have reached your weekly rate limit for Sourcery. Please try again later
Merge patch series "maple_tree: iterator state changes"
These patches have some general cleanup and a change to separate the
maple state status tracking from the maple state node.
The maple state status change allows for walks to continue from previous
places when the status needs to be recorded to make logical sense for
the next call to the maple state. For instance, it allows for prev/next
to function in a way that better resembles the linked list. It also
allows switch statements to be used to detect missed states during
compile, and the addition of fast-path "active" state is cleaner as an
enum.
While making the status change, perf showed some very small (one line)
functions that were not inlined even with the inline key word. Making
these small functions __always_inline is less expensive according to
perf. As part of that change, some inlines have been dropped from
larger functions.
Perf also showed that the commonly used mas_for_each() iterator was
spending a lot of time finding the end of the node. This series
introduces caching of the end of the node in the maple state (and
updating it during writes). This caching along with the inline changes
yielded at 23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test
framework benchmark.
I've also included a change to mtree_range_walk and mtree_lookup_walk to
take advantage of Peng's change [1] to the initial pivot setup.
mmtests did not produce any significant gains.
[1] https://lore.kernel.org/all/[email protected]/T/#u
Liam R. Howlett (12):
maple_tree: Remove unnecessary default labels from switch statements
maple_tree: Make mas_erase() more robust
maple_tree: Move debug check to __mas_set_range()
maple_tree: Add end of node tracking to the maple state
maple_tree: Use cached node end in mas_next()
maple_tree: Use cached node end in mas_destroy()
maple_tree: Clean up inlines for some functions
maple_tree: Separate ma_state node from status.
maple_tree: Remove mas_searchable()
maple_tree: Use maple state end for write operations
maple_tree: Don't find node end in mtree_lookup_walk()
maple_tree: mtree_range_walk() clean up
lib/maple_tree.c: fix build error due to hotfix alteration
maple_tree: remove a BUG_ON() in mas_alloc_nodes()
Summary by Sourcery
Sync Maple Tree patch series from upstream: refactor iterator state status tracking, add cached node end, improve performance, and align code with mainline.
New Features:
Bug Fixes:
Enhancements:
Documentation:
Tests: