Fix locking in qmemman on Python 3.13 #629
Conversation
Experiments show that using memory hotplug or populate-on-demand makes no difference in required memory at startup. But PV vs PVH/HVM does make a difference - PV doesn't need extra per-MB overhead at all.

On top of that, experimentally find the correct factor. Do it by starting a VM (paused) with different parameters and comparing `xl info free_memory` before and after:

memory / maxmem: difference
400 / 4000: 434
600 / 4000: 634
400 / 400: 405
400 / 600: 407
400 / 2000: 418
2000 / 2000: 2018
600 / 600: 607

All of the above are with 2 vcpus. Testing with other vcpu counts shows the 1.5MB per vcpu is quite accurate.

As seen above, the initial memory doesn't affect the overhead; the maxmem counts. Applying linear regression to that shows it's about 0.008MB per MB of maxmem, so round it up to 8192 bytes. The base overhead of 4MB doesn't match exactly, but since the calculated number is smaller, leave it at 4MB as a safety margin.

Fixes QubesOS/qubes-issues#9431
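The overhead model described above can be sketched as a small function. This is an illustration of the arithmetic only, not the actual qmemman code; the constant names are made up:

```python
MB = 1024 * 1024

# Constants as described in the commit message (names are hypothetical):
BASE_OVERHEAD = 4 * MB            # measured base was a bit lower; 4MB kept as safety margin
PER_MAXMEM_MB_OVERHEAD = 8192     # ~0.008MB per MB of maxmem, rounded up to 8192 bytes
PER_VCPU_OVERHEAD = int(1.5 * MB) # 1.5MB per vcpu, measured to be quite accurate

def estimate_overhead(maxmem_mb: int, vcpus: int) -> int:
    """Estimated per-domain memory overhead, in bytes, for PVH/HVM."""
    return (BASE_OVERHEAD
            + maxmem_mb * PER_MAXMEM_MB_OVERHEAD
            + vcpus * PER_VCPU_OVERHEAD)

# Compare with the measured data point memory=400 / maxmem=4000 (2 vcpus),
# where the observed difference was 434 - 400 = 34MB:
print(estimate_overhead(4000, 2) / MB)  # → 38.25 (slight overestimate, by design)
```

The estimate deliberately comes out a few MB above the measured 34MB, since both the base overhead and the per-maxmem factor are rounded up.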
Any memory adjustments must be done while holding a lock, so as not to interfere with client request handling. This is critical to prevent memory just freed for a new VM from being re-allocated elsewhere. The domain_list_changed() function failed to do that: the do_balance call was made after releasing the lock. This wasn't a problem for a long time because of Python's global interpreter lock, but Python 3.13 is finally starting to support truly parallel thread execution, and that revealed this bug.

Fixes QubesOS/qubes-issues#9431
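The bug pattern and its fix can be sketched as follows. This is a simplified illustration with hypothetical names, not the actual qmemman code:

```python
import threading

class BalanceState:
    """Minimal sketch of the locking pattern described above."""

    def __init__(self):
        self.lock = threading.Lock()

    def do_balance(self):
        # Redistributes memory between domains. Must only run while
        # self.lock is held; otherwise memory just freed for a new VM
        # could be re-allocated elsewhere by a concurrent client request.
        assert self.lock.locked(), "do_balance called without the lock held"

    def domain_list_changed_buggy(self):
        with self.lock:
            pass  # refresh the domain list
        # BUG: the lock is already released here, so a client-request
        # thread can interleave before balancing happens.
        self.do_balance()

    def domain_list_changed_fixed(self):
        with self.lock:
            # refresh the domain list
            self.do_balance()  # balance while still holding the lock
```

Under the GIL the buggy ordering rarely mattered in practice, because the window between releasing the lock and balancing was almost never interleaved with another thread; with free-threaded Python 3.13 that window becomes a real race.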
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main     #629     +/-  ##
=========================================
- Coverage   69.32%   69.31%   -0.01%
=========================================
  Files          58       58
  Lines       11993    11994       +1
=========================================
  Hits         8314     8314
- Misses       3679     3680       +1
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024102822-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable
Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024091704-4.3&flavor=update

Failed tests: 29 failures

Fixed failures
Compared to: https://openqa.qubes-os.org/tests/112766#dependencies
201 fixed

Unstable tests
Is this bug actually specific to Python 3.13+, or just much easier to trigger there? To me this change ought to be backported.
Yes, this may be backported (once it really lands in "main").
See commit messages for details.
Fixes QubesOS/qubes-issues#9431