Skip to content

cache cpu info in a file at boot. #261

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 63 commits into
base: main
Choose a base branch
from

Conversation

bunnitha
Copy link

@bunnitha bunnitha commented Sep 19, 2024

Instrumented cpuinfo to read the populated cpu info data from cpuid.info file during initialization.
Consumers of the library or OEMs can trigger a oneshot service - cpuinfo-svc - during device boot to link libcpuinfo and
dump all cpu info related to processor, model, arch and isa capabilities into cpuid.info file.
This arrangement will limit cpuid() intrinsic calls (which causes VMX events and performance overhead) to only once during init of libcpuinfo at boot by cpuinfo-svc.

Its little tricky to count the exact number of VM-Exits caused by CPUID calls.
For this effect, we have written a script to run an Android (dummy) service
that loads libcpuinfo - first in original version and next in modified version.
libcpuinfo makes around 4 __cpuid() calls on each initialization.
So, to amplify the effect, we have run the dummy service 1000 times and timed the total
execution (4000 __cpuid() calls and 8000 VMX events) in nano second precision.
This test shows that there is a reduction of approx. 0.6% in time to execute libcpuinfo.

Also, we attempted to record the KVM event statistics using perf tool.
Perf records the stats during the execution of same script.
The data shows that approx. 20,000 kvm_cpuid events reduced in modified libcpuinfo calls.
We cannot assume that all the CPUID calls are contributed by our script.
We are only making a safe assumption that the proportional reduction is affected
by the modification we did. We had made all efforts to keep the test conditions
similar in both cases.

Please find attached the excel sheet containing the test data.
libcpuinfo_enhancement_test_data.xlsx

@bunnitha
Copy link
Author

Hi, requesting concerned moderator to take a look at this proposal and review the patch.
Intel wants Pytorch to adopt this solution so that it can later land in Chrome ARCVM tree.

Copy link
Contributor

@digantdesai digantdesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution.

(1) tlb refactor should be in a different commit
(2) please add some tests for cpuid features
(3) update PR summary for performance data and rationale for Android vm-exit overhead reduction.

tlb variables are moved into a struct for clean and readable code.

Signed-off-by: Balakrishnan Unnithan <[email protected]>
@bunnitha bunnitha changed the title read cpu info from cpuid.info file. cache cpu info in a file at boot. Feb 4, 2025
@bunnitha
Copy link
Author

bunnitha commented Feb 4, 2025

separated tlb refactor codes and cpu info caching.

@bunnitha bunnitha requested a review from digantdesai February 13, 2025 04:13
Copy link
Contributor

@digantdesai digantdesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, apologies for a delay, can you help me understand,

(1) why does it have to be __ANDROID__ specific. And Why can't a Linux user use cpuinfo-svc?
(1.1) Does it have to be x86 specific? We don't have to use it for other ISAs, I don't have a too strong preference here.
(2) why a user facing "app" is in the src/x86/linux/ vs cpuinfo/tools?

@bunnitha
Copy link
Author

Thank you @digantdesai for your review.

Answer to (1) and (2):
The solution can be generic.
We have had it devised for ARCVM (Virtualized Android) on CrOS.
The deployment and validation have only been done on Android in x86 target (others are out of our scope).
To make sure the rest of the world don't break, we applied the ANDROID restriction.

@digantdesai
Copy link
Contributor

(2) why a user facing "app" is in the src/x86/linux/ vs cpuinfo/tools?
I am ok if you restrict caching to android + x86, but please move the "app" out of the src dir.

@bunnitha
Copy link
Author

bunnitha commented Mar 3, 2025

(2) why a user facing "app" is in the src/x86/linux/ vs cpuinfo/tools?
I am ok if you restrict caching to android + x86, but please move the "app" out of the src dir.

Agreed!
As suggested, app moved into tools/ folder

@bunnitha bunnitha requested a review from digantdesai March 3, 2025 04:32
Copy link
Contributor

@digantdesai digantdesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are getting pretty close. Left some minor comments. Thanks!

@bunnitha bunnitha requested a review from digantdesai March 21, 2025 13:55
@digantdesai
Copy link
Contributor

digantdesai commented Mar 21, 2025

PR LGTM. Left a comment about validation after loading from the file.
Please also check and make sure the CI is green. Thanks.

@bunnitha
Copy link
Author

0001-Cache-cpu-info-in-a-file-at-boot.patch
0001-Trigger-cpuinfo-svc-at-boot.patch
Complete patch to implement the solution in Android platform.

@digantdesai
Copy link
Contributor

Can we make sure CI is green?

@digantdesai
Copy link
Contributor

Rebase?

@bunnitha
Copy link
Author

Rebase?

Hi Digant,
Apologies for the delay in taking this forward. A rebase is absolutely needed and moreover, some tweaking of cmake/bazel is also needed to make CI green. I'll need some time to finish a higher priority task before working on this one. Thank you.

@enh-google
Copy link
Contributor

is this different from https://www.phoronix.com/news/Linux-Intel-KVM-Cache-CPUID ?

when we talked about this at bootcamp, i didn't really mean that you should upstream this to cpuinfo (because i don't think anyone will use it) --- i was just saying "Android won't take non-upstream patches to cpuinfo".

this still feels like something that should be fixed in the kernel, not hacked around in callers...

rrwinterton and others added 26 commits June 27, 2025 16:45
Update spaces to tabs to fix alignment issues.
fix the spaces to tab at line 50 to fix indenting issue.
[`PROJECT_NAME`][1] refers to the most recent `project()` call, while
[`CMAKE_PROJECT_NAME`][2] refers to the top-most `project()` call.

In the cases where cpuinfo is installed as a standalone project, this is
perfectly fine and works as intended where the installed pkg-config file
contains libcpuinfo as the name. However, if cpuinfo is used as a
subproject, such as when using FetchContent, the name of the calling
project would be used instead, leading to something like libOuterProject
rather than libcpuinfo.

[1]: https://cmake.org/cmake/help/latest/variable/PROJECT_NAME.html#variable:PROJECT_NAME
[2]: https://cmake.org/cmake/help/latest/variable/CMAKE_PROJECT_NAME.html#variable:CMAKE_PROJECT_NAME

Signed-off-by: Christopher Degawa <[email protected]>
While casting function pointers is allowed in C, the function must
ultimately be called through a pointer with the same type signature as
the function itself. Type signature mismatches, even decaying T* to
void* is undefined behavior.

UBSan flags this with -fsanitize=function. The easiest way I found to
repro this was:

    CC=clang-18 CXX=clang++-18 \
    CFLAGS="-fsanitize=function -fno-sanitize-recover=function" \
    CXXFLAGS="-fsanitize=function -fno-sanitize-recover=function" \
    cmake -GNinja -B build -DCPUINFO_BUILD_BENCHMARKS=OFF

    ninja -C build

    ./build/cpu-info

That gives the following error:

    [...]/src/linux/multiline.c:85:11: runtime error: call to function parse_line through pointer to incorrect function type 'bool (*)(const char *, const char *, void *, unsigned long)'
    cpuinfo.c: note: parse_line defined here
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior [...]/src/linux/multiline.c:85:11

The fix is fairly straightforward: just keep the function at the type
signature the expected, and cast void* instead the function instead.
…ding hw.machine

This functionality was implemented in pytorch#65 ("Updated package.name to also
query machdep.cpu.brand_string if decode of hw.machine fails"), but then it was
omitted from the subsequent pytorch#100, probably inadvertently.

Adding that functionality back here, so that the package/device name can be
shown correctly on recent devices and macOS/iOS versions.  I have reversed
the order so that `machdep.cpu.brand_string` is checked before attempting to
decode `hw.machine`, since the former appears to be more future-proof.

Before this change, on a recent MacBook Pro:

    $ cpu-info
    ...
    Debug (cpuinfo): hw.machine: arm64
    Warning in cpuinfo: parsing "hw.machine" failed: Undefined error: 0
    ...
    Packages:
    	0:

After this change:

    $ cpu-info
    ...
    Debug (cpuinfo): machdep.cpu.brand_string: Apple M2 Pro
    ...
    Packages:
    	0: Apple M2 Pro
- cpuinfo_get_max_arm_sme_length() returns svl vector length in bits
- Display length of SME vectors in isa-tool
- Upgrade cmake-linux-riscv64 ubuntu-22.04 runners to ubuntu-24.04

SME may be enabled on cpus that do not have SVE
Xcode on macOS cannot handle multiple .c files having the same name as
it produces an object file like init-<hash>.o, but tries to link to
init.o

https://gitlab.kitware.com/cmake/cmake/-/issues/20501

Signed-off-by: Christopher Degawa <[email protected]>
Tried a few things, but looks like all one needs to do is to add `--platform linux/riscv64` flag to `docker run` command
syscall() doesn't care about these types at all, and the kernel uses cpu_set_t, so we're better off just removing the cast entirely.
* [WIP] update apple soc info

Summary:
Added support for A16, A17, A18, A18 pro. Reg values are found from ncnn and needs validation.

Additional source
Constants are taken from https://github.com/apple-oss-distributions/xnu/blob/e3723e1f17661b24996789d8afc084c0c3303b26/osfmk/mach/machine.h#L449

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

* Update on "[WIP] update apple soc info"


Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

* Update on "[WIP] update apple soc info"


Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

* Update on "[WIP] update apple soc info"


Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

* Update on "update apple soc info"


Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
Darkmont is the uarch used in Sierra Forest
Tested:
make cpu-info
sde -srf -- ./cpu-info
* Add Intel Darkmont uarch

- Darkmont is the uarch used in Clearwater Forest

* Add Intel Darkmont uarch

- Darkmont is the uarch used in Clearwater Forest

* Add Intel Darkmont uarch

- Darkmont is the uarch used in Clearwater Forest
* Added  Willow Cove

* Update uarch.c

removed incomplete modification in naming

* Update cpu-info.c

moved willow_cove up two lines to be next to sunny_cove

* Update cpuinfo.h

made the enum consistent with uarch naming following sunny_cove

* Update cpuinfo.h

updated comments to refer to intel microarchitecture per request to keep consistent.
@bunnitha
Copy link
Author

Rebase?

Hi @digantdesai
The original commit is far behind in the tree and attempts to rebase has messed up the PR.
I have created another one (#304).
Could you please continue the review there? (this one needs to be abandoned)
#304

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.