Conversation

@mkroening (Member) commented Oct 21, 2025

Similar to #1994, this makes the internals of physical and virtual memory allocation private. Note that this will slow down parts of paging for now, since it is no longer possible to hold a global FrameAlloc lock. Instead, the lock has to be reacquired again and again until we improve the paging code or the FrameAlloc code.

Depends on #2008.
Closes #1994.
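
To make the lock-reacquisition point concrete, here is a minimal userspace sketch; the std Mutex, the toy free list, and this FrameAlloc facade are stand-ins, not the kernel's actual types:

use std::sync::Mutex;

// Toy stand-in for the kernel's physical-memory free list.
struct FreeList {
    next_free: usize,
}

impl FreeList {
    fn allocate(&mut self, len: usize) -> usize {
        let frame = self.next_free;
        self.next_free += len;
        frame
    }
}

static PHYSICAL_FREE_LIST: Mutex<FreeList> = Mutex::new(FreeList { next_free: 0x1000 });

// Facade in the spirit of this PR's FrameAlloc: the lock is an
// internal detail, so callers can no longer hold it across a loop.
struct FrameAlloc;

impl FrameAlloc {
    fn allocate(len: usize) -> usize {
        // The lock is taken and released on every call; in a tight
        // paging loop this is the repeated reacquisition described above.
        PHYSICAL_FREE_LIST.lock().unwrap().allocate(len)
    }
}

fn main() {
    for _ in 0..4 {
        println!("frame at {:#x}", FrameAlloc::allocate(0x1000));
    }
}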

@mkroening self-assigned this Oct 21, 2025
@mkroening force-pushed the page_alloc branch 4 times, most recently from 374c53b to d32f419 on October 21, 2025 at 15:11
@github-actions bot (Contributor) left a comment:

Benchmark Results

| Benchmark | Current: 82ea5fa | Previous: cb0d623 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 117.40 s | 111.98 s | 1.05 |
| startup_benchmark File Size | 0.91 MB | 0.89 MB | 1.01 |
| Startup Time - 1 core | 0.92 s (±0.04 s) | 0.91 s (±0.05 s) | 1.02 |
| Startup Time - 2 cores | 0.92 s (±0.02 s) | 0.92 s (±0.04 s) | 1.00 |
| Startup Time - 4 cores | 0.93 s (±0.04 s) | 0.92 s (±0.03 s) | 1.01 |
| multithreaded_benchmark Build Time | 111.51 s | 114.79 s | 0.97 |
| multithreaded_benchmark File Size | 1.01 MB | 1.00 MB | 1.01 |
| Multithreaded Pi Efficiency - 2 Threads | 87.01 % (±8.29 %) | 90.02 % (±8.17 %) | 0.97 |
| Multithreaded Pi Efficiency - 4 Threads | 42.65 % (±3.92 %) | 44.54 % (±3.42 %) | 0.96 |
| Multithreaded Pi Efficiency - 8 Threads | 24.76 % (±2.15 %) | 25.97 % (±2.04 %) | 0.95 |
| micro_benchmarks Build Time | 268.31 s | 261.23 s | 1.03 |
| micro_benchmarks File Size | 1.01 MB | 1.01 MB | 1.01 |
| Scheduling time - 1 thread | 137.72 ticks (±38.01 ticks) | 140.01 ticks (±34.10 ticks) | 0.98 |
| Scheduling time - 2 threads | 66.64 ticks (±21.23 ticks) | 85.24 ticks (±13.44 ticks) | 0.78 |
| Micro - Time for syscall (getpid) | 7.98 ticks (±3.83 ticks) | 8.13 ticks (±3.54 ticks) | 0.98 |
| Memcpy speed - (built_in) block size 4096 | 58267.02 MByte/s (±41456.38 MByte/s) | 58334.21 MByte/s (±41243.72 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 1048576 | 18562.17 MByte/s (±15848.52 MByte/s) | 21033.07 MByte/s (±18225.52 MByte/s) | 0.88 |
| Memcpy speed - (built_in) block size 16777216 | 9980.77 MByte/s (±8442.52 MByte/s) | 13267.84 MByte/s (±10774.63 MByte/s) | 0.75 |
| Memset speed - (built_in) block size 4096 | 58343.14 MByte/s (±41504.33 MByte/s) | 58219.36 MByte/s (±41164.23 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 1048576 | 18945.24 MByte/s (±16015.81 MByte/s) | 21368.76 MByte/s (±18349.64 MByte/s) | 0.89 |
| Memset speed - (built_in) block size 16777216 | 10202.92 MByte/s (±8573.73 MByte/s) | 13479.15 MByte/s (±10870.12 MByte/s) | 0.76 |
| Memcpy speed - (rust) block size 4096 | 52549.24 MByte/s (±39182.08 MByte/s) | 52819.55 MByte/s (±38638.93 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 1048576 | 19277.35 MByte/s (±16375.41 MByte/s) | 23393.82 MByte/s (±21095.61 MByte/s) | 0.82 |
| Memcpy speed - (rust) block size 16777216 | 9441.37 MByte/s (±7789.00 MByte/s) | 13873.33 MByte/s (±11633.55 MByte/s) | 0.68 |
| Memset speed - (rust) block size 4096 | 52880.95 MByte/s (±39398.72 MByte/s) | 54298.23 MByte/s (±39297.25 MByte/s) | 0.97 |
| Memset speed - (rust) block size 1048576 | 19478.64 MByte/s (±16458.84 MByte/s) | 23892.19 MByte/s (±21294.83 MByte/s) | 0.82 |
| Memset speed - (rust) block size 16777216 | 9745.15 MByte/s (±8043.22 MByte/s) | 13997.13 MByte/s (±11673.08 MByte/s) | 0.70 |
| alloc_benchmarks Build Time | 253.43 s | 250.66 s | 1.01 |
| alloc_benchmarks File Size | 0.97 MB | 0.96 MB | 1.01 |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 10995.00 Ticks (±2766.35 Ticks) | 8836.57 Ticks (±1388.50 Ticks) | 1.24 |
| Allocations - Average Allocation time (no fail) | 10995.00 Ticks (±2766.35 Ticks) | 8836.57 Ticks (±1388.50 Ticks) | 1.24 |
| Allocations - Average Deallocation time | 2069.86 Ticks (±340.76 Ticks) | 1396.33 Ticks (±193.31 Ticks) | 1.48 |
| mutex_benchmark Build Time | 243.91 s | 245.80 s | 0.99 |
| mutex_benchmark File Size | 1.02 MB | 1.01 MB | 1.01 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 27.98 ns (±6.03 ns) | 27.14 ns (±5.97 ns) | 1.03 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 25.12 ns (±2.97 ns) | 25.02 ns (±3.35 ns) | 1.00 |

This comment was automatically generated by a workflow using github-action-benchmark.

@mkroening (Member, Author) commented:

@m-mueller678, what do you think about PageAlloc and FrameAlloc instead of going back to free-standing functions? Feel free to review. :)

@m-mueller678 (Contributor) left a comment:

Looks good to me, save for the few split unsafe blocks. I like the change from free functions to methods.

pub fn init() {
    crate::mm::physicalmem::init();
    crate::mm::virtualmem::init();
    unsafe {
@m-mueller678 (Contributor) commented Oct 22, 2025:

could be merged into a single unsafe block

@mkroening (Member, Author) replied:

Having these operations in separate blocks was on purpose. See clippy::multiple_unsafe_ops_per_block. Arguably, the situation in the kernel is still far from ideal, but I think this is a step in the right direction.

This function itself should be unsafe too, I just noticed. Let's change that.
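
For readers unfamiliar with the lint, here is a small self-contained sketch of the style it enforces, with hypothetical init functions in place of the kernel's (one unsafe operation per block, and the function itself marked unsafe as suggested above):

// Hypothetical placeholders for the kernel's init routines.
unsafe fn init_physical_memory() {}
unsafe fn init_virtual_memory() {}

/// # Safety
/// Must be called exactly once during early boot.
unsafe fn init() {
    // One operation per `unsafe` block, so each block can carry its
    // own safety comment (see clippy::multiple_unsafe_ops_per_block).
    // SAFETY: called once, before any allocation happens.
    unsafe { init_physical_memory() };
    // SAFETY: physical memory has just been initialized.
    unsafe { init_virtual_memory() };
}

fn main() {
    // SAFETY: this is the sole call site.
    unsafe { init() };
}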

        Ok(Self(range, PhantomData))
    }

    pub unsafe fn from_raw(range: PageRange) -> Self {
@m-mueller678 (Contributor) commented:

Feels like there should be an into_raw too for symmetry.

@mkroening (Member, Author) replied:

Makes sense. I added it. 👍
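
A sketch of what such a from_raw/into_raw pair might look like, using hypothetical stand-in types (the merged code may differ):

use core::marker::PhantomData;
use core::mem::ManuallyDrop;

#[derive(Clone, Copy, Debug, PartialEq)]
struct PageRange {
    start: usize,
    end: usize,
}

struct PageRangeBox<A>(PageRange, PhantomData<A>);

impl<A> PageRangeBox<A> {
    // Unsafe in the kernel because the caller must guarantee that
    // the range was actually allocated by the allocator A.
    unsafe fn from_raw(range: PageRange) -> Self {
        Self(range, PhantomData)
    }

    // The symmetric counterpart: give the raw range back to the
    // caller. ManuallyDrop suppresses the box's Drop, which in the
    // real type would deallocate the range.
    fn into_raw(self) -> PageRange {
        ManuallyDrop::new(self).0
    }
}

fn main() {
    let range = PageRange { start: 0x1000, end: 0x5000 };
    let boxed = unsafe { PageRangeBox::<()>::from_raw(range) };
    assert_eq!(boxed.into_raw(), range); // round-trips without freeing
}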

    unsafe fn deallocate(range: PageRange);
}

pub struct PageRangeBox<A: PageRangeAllocator>(PageRange, PhantomData<A>);
@m-mueller678 (Contributor) commented:

I very much like the idea of PageRangeBox 👍
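
Presumably the appeal is RAII: the box returns its range to the allocator when dropped, like Box does for heap memory. A minimal self-contained sketch under that assumption (toy allocator, hypothetical names, not the kernel's code):

use core::marker::PhantomData;

#[derive(Clone, Copy, Debug)]
struct PageRange {
    start: usize,
    end: usize,
}

trait PageRangeAllocator {
    fn allocate(len: usize) -> Option<PageRange>;
    unsafe fn deallocate(range: PageRange);
}

// Toy allocator standing in for the kernel's PageAlloc/FrameAlloc.
struct ToyAlloc;

impl PageRangeAllocator for ToyAlloc {
    fn allocate(len: usize) -> Option<PageRange> {
        Some(PageRange { start: 0x1000, end: 0x1000 + len })
    }
    unsafe fn deallocate(range: PageRange) {
        println!("freed {range:?}");
    }
}

struct PageRangeBox<A: PageRangeAllocator>(PageRange, PhantomData<A>);

impl<A: PageRangeAllocator> PageRangeBox<A> {
    fn new(len: usize) -> Option<Self> {
        Some(Self(A::allocate(len)?, PhantomData))
    }
}

impl<A: PageRangeAllocator> Drop for PageRangeBox<A> {
    fn drop(&mut self) {
        // SAFETY (sketch): this box owns the range it allocated.
        unsafe { A::deallocate(self.0) };
    }
}

fn main() {
    let _pages = PageRangeBox::<ToyAlloc>::new(0x4000);
    // `_pages` is deallocated automatically at the end of scope.
}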

    InterruptTicketMutex::new(FreeList::new());
pub static TOTAL_MEMORY: AtomicUsize = AtomicUsize::new(0);

pub struct FrameAlloc;
@m-mueller678 (Contributor) commented:

I like the idea of having these as methods rather than free-standing functions, both to enable PageRangeBox and to have something natural to implement x86_64's FrameAllocator on.
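
For example (hedged: the trait is the x86_64 crate's real FrameAllocator, but FrameAlloc::allocate below is a hypothetical signature, not necessarily the kernel's; requires the x86_64 crate as a dependency):

use x86_64::structures::paging::{FrameAllocator, PhysFrame, Size4KiB};
use x86_64::PhysAddr;

struct FrameAlloc;

impl FrameAlloc {
    // Hypothetical stand-in returning the start address of a fresh
    // 4 KiB physical frame.
    fn allocate(_len: usize) -> Option<u64> {
        Some(0x8000)
    }
}

// SAFETY (sketch): a real implementation must only hand out frames
// that are unused and never hand out the same frame twice.
unsafe impl FrameAllocator<Size4KiB> for FrameAlloc {
    fn allocate_frame(&mut self) -> Option<PhysFrame<Size4KiB>> {
        let addr = Self::allocate(4096)?;
        PhysFrame::from_start_address(PhysAddr::new(addr)).ok()
    }
}

fn main() {
    let mut frame_alloc = FrameAlloc;
    println!("{:?}", frame_alloc.allocate_frame());
}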

mkroening and others added 3 commits on October 23, 2025 at 16:46
This replaces uses of KERNEL_FREE_LIST with PageAlloc.
This replaces uses of PHYSICAL_FREE_LIST with FrameAlloc.

Note that this will slow down parts of x86-64 paging right now, since it is not possible to hold a global `FrameAlloc` lock anymore.
This means the lock has to be reacquired again and again until we improve the paging code or the `FrameAlloc` code.

Co-authored-by: m-mueller678 <[email protected]>
@mkroening (Member, Author) commented:

With #2008 merged, this should now be ready too.

@mkroening marked this pull request as ready for review on October 23, 2025 at 15:09
@mkroening added this pull request to the merge queue on Oct 27, 2025
Merged via the queue into main with commit 6019ba4 on Oct 27, 2025 (16 of 17 checks passed)
@mkroening deleted the page_alloc branch on October 27, 2025 at 12:47