Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "The various patchsets are summarized below. Plus of course many
  individual patches which are described in their changelogs.

   - "Allocate and free frozen pages" from Matthew Wilcox reorganizes
     the page allocator so we end up with the ability to allocate and
     free zero-refcount pages. So that callers (ie, slab) can avoid a
     refcount inc & dec

   - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
     use large folios other than PMD-sized ones

   - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
     and fixes for this small built-in kernel selftest

   - "mas_anode_descend() related cleanup" from Wei Yang tidies up part
     of the mapletree code

   - "mm: fix format issues and param types" from Keren Sun implements a
     few minor code cleanups

   - "simplify split calculation" from Wei Yang provides a few fixes and
     a test for the mapletree code

   - "mm/vma: make more mmap logic userland testable" from Lorenzo
     Stoakes continues the work of moving vma-related code into the
     (relatively) new mm/vma.c

   - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
     Hildenbrand cleans up and rationalizes handling of gfp flags in the
     page allocator

   - "readahead: Reintroduce fix for improper RA window sizing" from Jan
     Kara is a second attempt at fixing a readahead window sizing issue.
     It should reduce the amount of unnecessary reading

   - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
     addresses an issue where "huge" amounts of pte pagetables are
     accumulated:

       https://lore.kernel.org/lkml/[email protected]/

     Qi's series addresses this windup by synchronously freeing PTE
     memory within the context of madvise(MADV_DONTNEED)

   - "selftest/mm: Remove warnings found by adding compiler flags" from
     Muhammad Usama Anjum fixes some build warnings in the selftests
     code when optional compiler warnings are enabled

   - "mm: don't use __GFP_HARDWALL when migrating remote pages" from
     David Hildenbrand tightens the allocator's observance of
     __GFP_HARDWALL

   - "pkeys kselftests improvements" from Kevin Brodsky implements
     various fixes and cleanups in the MM selftests code, mainly
     pertaining to the pkeys tests

   - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
     estimate application working set size

   - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
     provides some cleanups to memcg's hugetlb charging logic

   - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
     removes the global swap cgroup lock. A speedup of 10% for a
     tmpfs-based kernel build was demonstrated

   - "zram: split page type read/write handling" from Sergey Senozhatsky
     has several fixes and cleanups for zram in the area of
     zram_write_page(). A watchdog softlockup warning was eliminated

   - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
     Brodsky cleans up the pagetable destructor implementations. A rare
     use-after-free race is fixed

   - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
     simplifies and cleans up the debugging code in the VMA merging
     logic

   - "Account page tables at all levels" from Kevin Brodsky cleans up
     and regularizes the pagetable ctor/dtor handling. This results in
     improvements in accounting accuracy

   - "mm/damon: replace most damon_callback usages in sysfs with new
     core functions" from SeongJae Park cleans up and generalizes
     DAMON's sysfs file interface logic

   - "mm/damon: enable page level properties based monitoring" from
     SeongJae Park increases the amount of information which is
     presented in response to DAMOS actions

   - "mm/damon: remove DAMON debugfs interface" from SeongJae Park
     removes DAMON's long-deprecated debugfs interfaces. Thus the
     migration to sysfs is completed

   - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
     Peter Xu cleans up and generalizes the hugetlb reservation
     accounting

   - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
     removes a never-used feature of the alloc_pages_bulk() interface

   - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
     extends DAMOS filters to support not only exclusion (rejecting),
     but also inclusion (allowing) behavior

   - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
     introduces a new memory descriptor for zswap.zpool that currently
     overlaps with struct page for now. This is part of the effort to
     reduce the size of struct page and to enable dynamic allocation of
     memory descriptors

   - "mm, swap: rework of swap allocator locks" from Kairui Song redoes
     and simplifies the swap allocator locking. A speedup of 400% was
     demonstrated for one workload. As was a 35% reduction for kernel
     build time with swap-on-zram

   - "mm: update mips to use do_mmap(), make mmap_region() internal"
     from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
     mmap_region() can be made MM-internal

   - "mm/mglru: performance optimizations" from Yu Zhao fixes a few
     MGLRU regressions and otherwise improves MGLRU performance

   - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
     Park updates DAMON documentation

   - "Cleanup for memfd_create()" from Isaac Manjarres does that thing

   - "mm: hugetlb+THP folio and migration cleanups" from David
     Hildenbrand provides various cleanups in the areas of hugetlb
     folios, THP folios and migration

   - "Uncached buffered IO" from Jens Axboe implements the new
     RWF_DONTCACHE flag which provides synchronous dropbehind for
     pagecache reading and writing. To permit userspace to address
     issues with massive buildup of useless pagecache when
     reading/writing fast devices

   - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
     Weißschuh fixes and optimizes some of the MM selftests"

* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm/compaction: fix UBSAN shift-out-of-bounds warning
  s390/mm: add missing ctor/dtor on page table upgrade
  kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
  tools: add VM_WARN_ON_VMG definition
  mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
  seqlock: add missing parameter documentation for raw_seqcount_try_begin()
  mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
  mm/page_alloc: remove the incorrect and misleading comment
  zram: remove zcomp_stream_put() from write_incompressible_page()
  mm: separate move/undo parts from migrate_pages_batch()
  mm/kfence: use str_write_read() helper in get_access_type()
  selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
  kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
  selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
  selftests/mm: vm_util: split up /proc/self/smaps parsing
  selftests/mm: virtual_address_range: unmap chunks after validation
  selftests/mm: virtual_address_range: mmap() without PROT_WRITE
  selftests/memfd/memfd_test: fix possible NULL pointer dereference
  mm: add FGP_DONTCACHE folio creation flag
  mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
  ...
torvalds committed Jan 27, 2025
2 parents c159dfb + d1366e7 commit 9c5968d
Showing 334 changed files with 7,962 additions and 7,921 deletions.
26 changes: 22 additions & 4 deletions Documentation/ABI/testing/sysfs-kernel-mm-damon
@@ -355,10 +355,15 @@ Description: If 'target' is written to the 'type' file, writing to or
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/matching
Date: Dec 2022
Contact: SeongJae Park <[email protected]>
- Description: Writing 'Y' or 'N' to this file sets whether to filter out
- pages that do or do not match to the 'type' and 'memcg_path',
- respectively. Filter out means the action of the scheme will
- not be applied to.
+ Description: Writing 'Y' or 'N' to this file sets whether the filter is for
+ the memory of the 'type', or all except the 'type'.
+
+ What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/allow
+ Date: Jan 2025
+ Contact: SeongJae Park <[email protected]>
+ Description: Writing 'Y' or 'N' to this file sets whether to allow or reject
+ applying the scheme's action to the memory that satisfies the
+ 'type' and the 'matching' of the directory.

What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/stats/nr_tried
Date: Mar 2022
@@ -384,6 +389,12 @@ Contact: SeongJae Park <[email protected]>
Description: Reading this file returns the total size of regions that the
action of the scheme has successfully applied in bytes.

+ What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/stats/sz_ops_filter_passed
+ Date: Dec 2024
+ Contact: SeongJae Park <[email protected]>
+ Description: Reading this file returns the total size of memory that passed
+ DAMON operations layer-handled filters of the scheme in bytes.

What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/stats/qt_exceeds
Date: Mar 2022
Contact: SeongJae Park <[email protected]>
@@ -424,3 +435,10 @@ Contact: SeongJae Park <[email protected]>
Description: Reading this file returns the 'age' of a memory region that
corresponding DAMON-based Operation Scheme's action has tried
to be applied.

+ What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/tried_regions/<R>/sz_filter_passed
+ Date: Dec 2024
+ Contact: SeongJae Park <[email protected]>
+ Description: Reading this file returns the size of the memory in the region
+ that passed DAMON operations layer-handled filters of the
+ scheme in bytes.
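
The 'matching' and 'allow' files described in this diff can be read as two independent switches on one filter. The following is a minimal Python sketch of that reading, not the kernel's implementation. The function name `filter_decision` and the use of `None` for "this filter says nothing about this memory" are assumptions made for illustration; how several filters of one scheme are combined is not covered by the text above and is left out.

```python
from typing import Optional


def filter_decision(is_of_type: bool, matching: bool, allow: bool) -> Optional[bool]:
    """Return what a single filter says about one piece of memory.

    is_of_type: whether the memory is of the filter's 'type'.
    matching:   value written to the 'matching' file (True for 'Y').
    allow:      value written to the 'allow' file (True for 'Y').

    Returns True (apply the scheme's action), False (do not apply it), or
    None when the filter does not cover this memory. The None case is an
    assumption of this sketch, not something the ABI text specifies.
    """
    # 'matching' = Y covers memory of the type; 'matching' = N covers
    # everything except the type.
    covered = is_of_type == matching
    if not covered:
        return None
    # 'allow' = Y allows the action for covered memory; 'allow' = N rejects it.
    return allow
```

For example, a filter with 'matching' set to 'N' and 'allow' set to 'Y' allows the action for all memory that is not of the filter's type, and has no opinion on memory that is.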
11 changes: 9 additions & 2 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -3495,8 +3495,8 @@
[KNL] Set the initial state for the memory hotplug
onlining policy. If not specified, the default value is
set according to the
- CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
- option.
+ CONFIG_MHP_DEFAULT_ONLINE_TYPE kernel config
+ options.
See Documentation/admin-guide/mm/memory-hotplug.rst.

memmap=exactmap [KNL,X86,EARLY] Enable setting of an exact
@@ -7303,6 +7303,13 @@
See Documentation/admin-guide/mm/transhuge.rst
for more details.

+ transparent_hugepage_tmpfs= [KNL]
+ Format: [always|within_size|advise|never]
+ Can be used to control the default hugepage allocation policy
+ for the tmpfs mount.
+ See Documentation/admin-guide/mm/transhuge.rst
+ for more details.
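
To make the accepted format of the new parameter concrete, here is a small Python sketch that extracts and validates `transparent_hugepage_tmpfs=` from a kernel command line string, such as the contents of /proc/cmdline. It is an illustration only. The function name `tmpfs_thp_policy` is hypothetical, and letting the last occurrence win is an assumption of this sketch rather than documented behavior.

```python
from typing import Optional

# The four values listed in the "Format:" line of the new entry.
ALLOWED_POLICIES = ("always", "within_size", "advise", "never")


def tmpfs_thp_policy(cmdline: str) -> Optional[str]:
    """Return the tmpfs hugepage policy named on a kernel command line.

    Returns None when the parameter is absent and raises ValueError when
    its value is not one of the documented choices.
    """
    policy = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != "transparent_hugepage_tmpfs" or not sep:
            continue
        if value not in ALLOWED_POLICIES:
            raise ValueError(f"unsupported tmpfs hugepage policy: {value!r}")
        # Assumption of this sketch: a later occurrence overrides an earlier one.
        policy = value
    return policy
```

A typical use would be `tmpfs_thp_policy(open("/proc/cmdline").read())` on a running system.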

trusted.source= [KEYS]
Format: <string>
This parameter identifies the trust source as a backend
67 changes: 40 additions & 27 deletions Documentation/admin-guide/mm/damon/start.rst
@@ -42,32 +42,45 @@ the execution. ::

$ git clone https://github.com/sjp38/masim; cd masim; make
$ sudo damo start "./masim ./configs/stairs.cfg --quiet"
- $ sudo ./damo show
- 0 addr [85.541 TiB , 85.541 TiB ) (57.707 MiB ) access 0 % age 10.400 s
- 1 addr [85.541 TiB , 85.542 TiB ) (413.285 MiB) access 0 % age 11.400 s
- 2 addr [127.649 TiB , 127.649 TiB) (57.500 MiB ) access 0 % age 1.600 s
- 3 addr [127.649 TiB , 127.649 TiB) (32.500 MiB ) access 0 % age 500 ms
- 4 addr [127.649 TiB , 127.649 TiB) (9.535 MiB ) access 100 % age 300 ms
- 5 addr [127.649 TiB , 127.649 TiB) (8.000 KiB ) access 60 % age 0 ns
- 6 addr [127.649 TiB , 127.649 TiB) (6.926 MiB ) access 0 % age 1 s
- 7 addr [127.998 TiB , 127.998 TiB) (120.000 KiB) access 0 % age 11.100 s
- 8 addr [127.998 TiB , 127.998 TiB) (8.000 KiB ) access 40 % age 100 ms
- 9 addr [127.998 TiB , 127.998 TiB) (4.000 KiB ) access 0 % age 11 s
- total size: 577.590 MiB
- $ sudo ./damo stop
+ $ sudo damo report access
+ heatmap: 641111111000000000000000000000000000000000000000000000[...]33333333333333335557984444[...]7
+ # min/max temperatures: -1,840,000,000, 370,010,000, column size: 3.925 MiB
+ 0 addr 86.182 TiB size 8.000 KiB access 0 % age 14.900 s
+ 1 addr 86.182 TiB size 8.000 KiB access 60 % age 0 ns
+ 2 addr 86.182 TiB size 3.422 MiB access 0 % age 4.100 s
+ 3 addr 86.182 TiB size 2.004 MiB access 95 % age 2.200 s
+ 4 addr 86.182 TiB size 29.688 MiB access 0 % age 14.100 s
+ 5 addr 86.182 TiB size 29.516 MiB access 0 % age 16.700 s
+ 6 addr 86.182 TiB size 29.633 MiB access 0 % age 17.900 s
+ 7 addr 86.182 TiB size 117.652 MiB access 0 % age 18.400 s
+ 8 addr 126.990 TiB size 62.332 MiB access 0 % age 9.500 s
+ 9 addr 126.990 TiB size 13.980 MiB access 0 % age 5.200 s
+ 10 addr 126.990 TiB size 9.539 MiB access 100 % age 3.700 s
+ 11 addr 126.990 TiB size 16.098 MiB access 0 % age 6.400 s
+ 12 addr 127.987 TiB size 132.000 KiB access 0 % age 2.900 s
+ total size: 314.008 MiB
+ $ sudo damo stop

The first command of the above example downloads and builds an artificial
memory access generator program called ``masim``. The second command asks DAMO
- to execute the artificial generator process start via the given command and
- make DAMON monitors the generator process. The third command retrieves the
- current snapshot of the monitored access pattern of the process from DAMON and
- shows the pattern in a human readable format.
-
- Each line of the output shows which virtual address range (``addr [XX, XX)``)
- of the process is how frequently (``access XX %``) accessed for how long time
- (``age XX``). For example, the fifth region of ~9 MiB size is being most
- frequently accessed for last 300 milliseconds. Finally, the fourth command
- stops DAMON.
+ to start the program via the given command and make DAMON monitor the newly
+ started process. The third command retrieves the current snapshot of the
+ monitored access pattern of the process from DAMON and shows the pattern in a
+ human readable format.
+
+ The first line of the output shows the relative access temperature (hotness) of
+ the regions in a single row heatmap format. Each column on the heatmap
+ represents regions of same size on the monitored virtual address space. The
+ position of the column on the row and the number on the column represents the
+ relative location and access temperature of the region. ``[...]`` means
+ unmapped huge regions on the virtual address spaces. The second line shows
+ additional information for better understanding the heatmap.
+
+ Each line of the output from the third line shows which virtual address range
+ (``addr XX size XX``) of the process is how frequently (``access XX %``)
+ accessed for how long time (``age XX``). For example, the eleventh region of
+ ~9.5 MiB size is being most frequently accessed for last 3.7 seconds. Finally,
+ the fourth command stops DAMON.

Note that DAMON can monitor not only virtual address spaces but multiple types
of address spaces including the physical address space.
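
The per-region lines of the ``damo report access`` output shown in this hunk follow a regular ``<index> addr <X> size <X> access <X> % age <X>`` shape, so they are easy to post-process. The Python sketch below parses such lines and picks the most frequently accessed region. It is based only on the sample output in this diff: the names ``parse_region`` and ``hottest`` are hypothetical, the unit tables list only the units that appear in the samples, and ``damo`` may print other units or layouts that this sketch does not handle.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Only the units visible in the sample output; other units are not handled.
UNIT_BYTES = {"KiB": 2**10, "MiB": 2**20, "TiB": 2**40}
UNIT_SECONDS = {"ns": 1e-9, "ms": 1e-3, "s": 1.0}

REGION_LINE = re.compile(
    r"^\s*(?P<idx>\d+)\s+addr\s+(?P<addr>[\d.]+)\s+(?P<addr_unit>\w+)"
    r"\s+size\s+(?P<size>[\d.]+)\s+(?P<size_unit>\w+)"
    r"\s+access\s+(?P<access>[\d.]+)\s*%"
    r"\s+age\s+(?P<age>[\d.]+)\s+(?P<age_unit>\w+)\s*$"
)


class Region(NamedTuple):
    index: int
    size_bytes: float
    access_pct: float
    age_seconds: float


def parse_region(line: str) -> Optional[Region]:
    """Parse one region line; return None for heatmap, comment, or total lines."""
    m = REGION_LINE.match(line)
    if m is None:
        return None
    return Region(
        index=int(m["idx"]),
        size_bytes=float(m["size"]) * UNIT_BYTES[m["size_unit"]],
        access_pct=float(m["access"]),
        age_seconds=float(m["age"]) * UNIT_SECONDS[m["age_unit"]],
    )


def hottest(lines: Iterable[str]) -> Optional[Region]:
    """Return the region with the highest access percentage, if any."""
    regions = [r for r in map(parse_region, lines) if r is not None]
    return max(regions, key=lambda r: r.access_pct, default=None)
```

Applied to the sample output, ``hottest`` selects the line with index 10, which is the eleventh region that the text singles out.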
@@ -95,7 +108,7 @@ Visualizing Recorded Patterns
You can visualize the pattern in a heatmap, showing which memory region
(x-axis) got accessed when (y-axis) and how frequently (number).::

- $ sudo damo report heats --heatmap stdout
+ $ sudo damo report heatmap
22222222222222222222222222222222222222211111111111111111111111111111111111111100
44444444444444444444444444444444444444434444444444444444444444444444444444443200
44444444444444444444444444444444444444433444444444444444444444444444444444444200
@@ -160,6 +173,6 @@ Data Access Pattern Aware Memory Management
Below command makes every memory region of size >=4K that has not accessed for
>=60 seconds in your workload to be swapped out. ::

- $ sudo damo schemes --damos_access_rate 0 0 --damos_sz_region 4K max \
- --damos_age 60s max --damos_action pageout \
- <pid of your workload>
+ $ sudo damo start --damos_access_rate 0 0 --damos_sz_region 4K max \
+ --damos_age 60s max --damos_action pageout \
+ <pid of your workload>
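
The three range options of the command above together describe which regions the ``pageout`` action targets. As a reading aid, the Python sketch below expresses the same selection rule as a predicate. It models only the criteria stated in the text (size >=4K, access rate of 0, age >=60 seconds) and is not how DAMON or ``damo`` implement scheme matching. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageoutCriteria:
    """Selection rule mirroring the option values used in the command."""

    min_size_bytes: int = 4 * 1024   # --damos_sz_region 4K max
    min_access_pct: float = 0.0      # --damos_access_rate 0 0 (lower bound)
    max_access_pct: float = 0.0      # --damos_access_rate 0 0 (upper bound)
    min_age_seconds: float = 60.0    # --damos_age 60s max

    def selects(self, size_bytes: float, access_pct: float, age_seconds: float) -> bool:
        """Return True if a region with these properties would be paged out."""
        return (
            size_bytes >= self.min_size_bytes
            and self.min_access_pct <= access_pct <= self.max_access_pct
            and age_seconds >= self.min_age_seconds
        )
```

With the defaults, a 8 KiB region that has shown no access for 61 seconds is selected, while the same region with any observed access, or one that has been idle for only 59 seconds, is not.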