@Avenger-285714

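To support the CCA feature, the kernel needs both guest-side and host-side changes. This submission includes both parts.

[Feature description]
Confidential computing changes the trust relationships of the traditional trust model, reducing the trust that users must place in the computing infrastructure, such as the OS and the hypervisor. It performs computation in a secure environment based on trusted hardware, aiming to protect data in use and to protect data and code from observation and modification by privileged software and hardware agents. The Arm Confidential Compute Architecture (CCA) is the confidential computing extension that Arm introduced at the architecture level. Its main features are:
1) Introduces the Realm confidential computing environment to protect in-use data and code
2) Allows any third-party developer to protect their VM or application
3) Supports dynamic memory allocation
4) Supports remote attestation
Compared with the earlier TrustZone technology, it provides security protection at the confidential-VM level and supports seamless migration of large applications.
For more details, see the official Arm page: https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture

Host/KVM must use the RMI (Realm Management Interface) to manage the Realm life cycle, allocate and reclaim Realm resources, and schedule Realms. The kernel/KVM needs the corresponding patches to adapt to the CCA feature.

[Repositories involved]
kernel, libvirt, QEMU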

Link: https://gitee.com/OpenCloudOS/OpenCloudOS-Kernel/pulls/489
Link: https://gitee.com/OpenCloudOS/OpenCloudOS-Kernel/issues/ICZ6R2

willdeacon and others added 30 commits November 23, 2025 08:23
mainline inclusion
from mainline-v6.12-rc1
commit e7bafbf
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e7bafbf7177750e6643941473b343ed72fc5a100

--------------------------------

Implementing the internal mem_encrypt API for arm64 depends entirely on
the Confidential Computing environment in which the kernel is running.

Introduce a simple dispatcher so that backend hooks can be registered
depending upon the environment in which the kernel finds itself.
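
A minimal user-space sketch of such a dispatcher, for illustration only. The names `crypt_ops`, `register_crypt_ops`, `mem_encrypt`, `mem_decrypt` and the demo backend are hypothetical and not taken from the patch; the sketch assumes that a single backend registers its hooks and that calls are no-ops until it does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table that a Confidential Computing backend supplies. */
struct crypt_ops {
	int (*encrypt)(unsigned long addr, int numpages);
	int (*decrypt)(unsigned long addr, int numpages);
};

static const struct crypt_ops *active_ops;

/* Accept one backend; a second registration is rejected. */
int register_crypt_ops(const struct crypt_ops *ops)
{
	assert(ops != NULL);
	if (active_ops)
		return -1;
	active_ops = ops;
	return 0;
}

/* Dispatch to the backend, or do nothing when none is registered. */
int mem_encrypt(unsigned long addr, int numpages)
{
	if (active_ops && active_ops->encrypt)
		return active_ops->encrypt(addr, numpages);
	return 0;
}

int mem_decrypt(unsigned long addr, int numpages)
{
	if (active_ops && active_ops->decrypt)
		return active_ops->decrypt(addr, numpages);
	return 0;
}

/* Demo backend: only counts how many pages are currently shared. */
static int demo_shared_pages;

static int demo_encrypt(unsigned long addr, int numpages)
{
	(void)addr;
	demo_shared_pages -= numpages;
	return 0;
}

static int demo_decrypt(unsigned long addr, int numpages)
{
	(void)addr;
	demo_shared_pages += numpages;
	return 0;
}

const struct crypt_ops demo_ops = {
	.encrypt = demo_encrypt,
	.decrypt = demo_decrypt,
};
```

Before `register_crypt_ops(&demo_ops)` is called, `mem_decrypt()` returns 0 without touching any state, which mirrors a kernel that is not running in a Confidential Computing environment.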

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/set_memory.h
[Only context conflicts]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.12-rc1
commit c86fa34
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c86fa3470c1026e9f63a93e8885ea51ef99fae35

--------------------------------

Confidential Computing environments such as pKVM and Arm's CCA
distinguish between shared (i.e. emulated) and private (i.e. assigned)
MMIO regions.

Introduce a hook into our implementation of ioremap_prot() so that MMIO
regions can be shared if necessary.

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit b880a80
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b880a80011f56880f32bde47fc6af313359f926b

--------------------------------

The RMM (Realm Management Monitor) provides functionality that can be
accessed by a realm guest through SMC (Realm Services Interface) calls.

The SMC definitions are based on DEN0137[1] version 1.0-rel0.

[1] https://developer.arm.com/documentation/den0137/1-0rel0/

Acked-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit c077711
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c077711f718be7cebcc8b987eac2ebfd17447e9f

--------------------------------

Detect that the VM is a realm guest by the presence of the RSI
interface. This is done after PSCI has been initialised so that we can
check the SMCCC conduit before making any RSI calls.

If in a realm then iterate over all memory ensuring that it is marked as
RIPAS RAM. The loader is required to do this for us, however if some
memory is missed this will cause the guest to receive a hard to debug
external abort at some random point in the future. So for a
belt-and-braces approach set all memory to RIPAS RAM. Any failure here
implies that the RAM regions passed to Linux are incorrect so panic()
promptly to make the situation clear.

Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Co-developed-by: Steven Price <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Conflicts:
	arch/arm64/kernel/Makefile
	arch/arm64/kernel/setup.c
[Only context conflicts]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 3993069
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=399306954996be58ac20b4b29f6334e3d55a2ce7

--------------------------------

The top bit of the configured IPA size is used as an attribute to
control whether the address is protected or shared. Query the
configuration from the RMM to ascertain which bit this is.
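
A small sketch of the arithmetic, assuming the RMM reports the IPA width in bits and that a set top bit selects the shared alias. The function names `shared_attr_bit` and `ipa_is_shared` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The shared/protected attribute is the top bit of the configured IPA size. */
uint64_t shared_attr_bit(unsigned int ipa_bits)
{
	assert(ipa_bits >= 1 && ipa_bits <= 64);
	return UINT64_C(1) << (ipa_bits - 1);
}

/* Assumption: a set attribute bit means the address is the shared alias. */
int ipa_is_shared(uint64_t ipa, unsigned int ipa_bits)
{
	return (ipa & shared_attr_bit(ipa_bits)) != 0;
}
```

With a 48-bit IPA space, for example, the attribute would be bit 47.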

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Co-developed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 3715894
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=371589437616fbb03590d8ff505f8a4c95c8a031

--------------------------------

On Arm CCA, with RMM-v1.0, all MMIO regions are shared. However, in
the future, an Arm CCA-v1.0 compliant guest may be run in a lesser
privileged partition in the Realm World (with Arm CCA-v1.1 Planes
feature). In this case, some of the MMIO regions may be emulated
by a higher privileged component in the Realm world, i.e., protected.

Thus the guest must decide today, whether a given MMIO region is shared
vs Protected and create the stage1 mapping accordingly. On Arm CCA, this
detection is based on the "IPA State" (RIPAS == RIPAS_IO). Provide a
helper to run this check on a given range of MMIO.

Also, provide an arm64 helper which may be hooked in by other solutions.

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 3c6c706
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c6c706139564f74ec48229378873c1d930a8bc8

--------------------------------

Instead of marking every MMIO as shared, check if the given region is
"Protected" and apply the permissions accordingly.

Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 491db21
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=491db21d8256992ab9fe11c42744eb3044315d14

--------------------------------

Device mappings need to be emulated by the VMM so must be mapped shared
with the host.

Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit fbf979a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fbf979a01375704fa87c559763209c658593b6f8

--------------------------------

Within a realm guest it's not possible for a device emulated by the VMM
to access arbitrary guest memory. So force the use of bounce buffers to
ensure that the memory the emulated devices are accessing is in memory
which is explicitly shared with the host.

This adds a call to swiotlb_update_mem_attributes() which calls
set_memory_decrypted() to ensure the bounce buffer memory is shared with
the host. For non-realm guests or hosts this is a no-op.

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Co-developed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Conflicts:
            arch/arm64/mm/init.c
[Commit 6503357 ("arm64: swiotlb: Reduce the default size if no ZONE_DMA bouncing needed") is not merged,
resulting in context conflicts. Virtcca commit caaefd56addf ("mm: enable swiotlb alloc for cvm share mem") calls
swiotlb_cvm_update_mem_attributes after swiotlb_init, which also results in context conflicts.]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 0e9cb59
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0e9cb5995b2539a332fe65ada6a28a6be55f6e40

--------------------------------

When __change_memory_common() is purely setting the valid bit on a PTE
(e.g. via the set_memory_valid() call) there is no need for a TLBI as
either the entry isn't changing (the valid bit was already set) or the
entry was invalid and so should not have been cached in the TLB.
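
The condition can be sketched as follows. This is an illustration, not the patch itself: `needs_tlbi` is a hypothetical helper, and the position of the valid bit (bit 0) is stated here as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_VALID UINT64_C(1)	/* assumed: valid bit is bit 0 */

/*
 * A TLB invalidation is skipped only when the change is exactly
 * "set the valid bit and clear nothing". Any other combination may
 * modify an entry that is already cached in the TLB.
 */
int needs_tlbi(uint64_t set_mask, uint64_t clear_mask)
{
	return set_mask != PTE_VALID || clear_mask != 0;
}
```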

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 42be24a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=42be24a4178fe51e6f47d91d8621b2f53820f88b

--------------------------------

Use the memory encryption APIs to trigger an RSI call to request a
transition between protected memory and shared memory (or vice versa)
and update the kernel's linear map of modified pages to flip the top
bit of the IPA. This requires that block mappings are not used in the
direct map for realm guests.

Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Co-developed-by: Steven Price <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Conflicts:
	arch/arm64/Kconfig
[Only context conflicts]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit b08e2f4
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b08e2f42e86b5848add254da45b56fc672e2bced

--------------------------------

Within a realm guest the ITS is emulated by the host. This means the
allocations must have been made available to the host by a call to
set_memory_decrypted(). Introduce an allocation function which performs
this extra call.

For the ITT use a custom genpool-based allocator that calls
set_memory_decrypted() for each page allocated, but then suballocates the
size needed for each ITT. Note that there is no mechanism implemented to
return pages from the genpool, but it is unlikely that the peak number of
devices will be much larger than the normal level - so this isn't expected
to be an issue.
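
The ITT scheme can be sketched in user space as below. This is not the kernel genpool API: `itt_pool`, `itt_pool_alloc` and `share_page_with_host` are hypothetical, the 4 KiB page size and 256-byte alignment are assumed values, and a simple bump allocator stands in for genpool. As in the description above, pages are never returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ   4096u	/* assumed page size */
#define ITT_ALIGN 256u	/* assumed ITT alignment */

struct itt_pool {
	unsigned char *page;		/* page currently being carved up */
	size_t used;			/* bytes handed out from that page */
	unsigned int pages_shared;	/* pages made visible to the host */
};

/* Stand-in for set_memory_decrypted(): only counts the shared pages. */
static void share_page_with_host(struct itt_pool *pool)
{
	pool->pages_shared++;
}

/* Suballocate an aligned chunk; share a fresh page only when needed. */
void *itt_pool_alloc(struct itt_pool *pool, size_t size)
{
	size_t need = (size + ITT_ALIGN - 1) & ~(size_t)(ITT_ALIGN - 1);
	void *chunk;

	if (size == 0 || need > PAGE_SZ)
		return NULL;	/* larger requests are out of scope here */

	if (!pool->page || pool->used + need > PAGE_SZ) {
		unsigned char *page = aligned_alloc(PAGE_SZ, PAGE_SZ);

		if (!page)
			return NULL;
		pool->page = page;	/* the previous page stays allocated */
		pool->used = 0;
		share_page_with_host(pool);
	}

	chunk = pool->page + pool->used;
	pool->used += need;
	return chunk;
}
```

Sixteen 256-byte ITTs fit in one shared page under these assumptions, so the expensive sharing call runs once per page rather than once per device.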

Co-developed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Conflicts:
	drivers/irqchip/irq-gic-v3-its.c
[Context conflicts caused by the hisi inclusion commit 8207b85ea7207 ("kvm: hisi: print error for IPIV") and commit 7b39fd06a3912
("irqchip/gic-v3-its: Alloc/Free device id from pools for virtual devices").]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit bc88d44
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bc88d44bd7e45b992cf8c2c2ffbc7bb3e24db4a7

--------------------------------

itt_alloc_pool() calls its_alloc_pages_node() to allocate an individual
page to add to the pool (for allocations <PAGE_SIZE). However the final
argument of its_alloc_pages_node() is the page order not the number of
pages. Currently it allocates two pages and leaks the second page.
Fix it by passing 0 instead (1 << 0 = 1 page).
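
The order-versus-count confusion comes down to one shift. `pages_for_order` is a hypothetical helper used only to show the arithmetic.

```c
#include <assert.h>

/* An allocation of a given page order covers 2^order contiguous pages. */
unsigned long pages_for_order(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	return 1UL << order;
}
```

Passing 1 as the order therefore requests two pages, while passing 0 requests the single page that was intended.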

Fixes: b08e2f4 ("irqchip/gic-v3-its: Share ITS tables with a non-trusted hypervisor")
Reported-by: Shanker Donthineni <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Closes: https://lore.kernel.org/r/ed65312a-245c-4fa5-91ad-5d620cab7c6b%40nvidia.com
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit e36d416
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e36d4165f0796536b338521ef714551be0feb706

--------------------------------

its_create_device() over-allocated by ITS_ITT_ALIGN - 1 bytes to ensure
that an aligned area was available within the allocation. The new genpool
allocator has its min_alloc_order set to get_order(ITS_ITT_ALIGN) so all
allocations from it should be appropriately aligned.

Remove the over-allocation from its_create_device() and alignment from
its_build_mapd_cmd().
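
For reference, the arithmetic behind the removed over-allocation can be sketched as follows. The 256-byte value for the alignment and the helper names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITT_ALIGN 256u	/* assumed value of ITS_ITT_ALIGN */

/* Old scheme: pad the request so an aligned window always fits inside. */
size_t overalloc_size(size_t size)
{
	return size + ITT_ALIGN - 1;
}

/* Round an address up to the next ITT_ALIGN boundary. */
uintptr_t align_up(uintptr_t addr)
{
	return (addr + ITT_ALIGN - 1) & ~(uintptr_t)(ITT_ALIGN - 1);
}

/* The aligned window of 'size' bytes never runs past the padded buffer. */
int fits_after_align(uintptr_t base, size_t size)
{
	return align_up(base) + size <= base + overalloc_size(size);
}
```

Once the allocator itself only returns aligned addresses, `align_up()` is the identity on every result and the padding is wasted space.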

Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.10-rc1
commit 91a1d97
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91a1d97ef482c1e4c9d4c1c656a53b0f6b16d0ed

--------------------------------

When a static_key is marked ro_after_init, its state will never change
(after init), therefore jump_label_update() will never need to iterate
the entries, and thus module load won't actually need to track this --
avoiding the static_key::next write.

Therefore, mark these keys such that jump_label_add_module() might
recognise them and avoid the modification.

Use the special state: 'static_key_linked(key) && !static_key_mod(key)'
to denote such keys.
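
A loose model of that state, not the kernel's definitions: the sketch assumes the key packs a pointer and two low type bits into one word, with bit 0 as the initial true/false value and bit 1 as the "linked" flag. `demo_key` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TYPE_TRUE   UINT64_C(1)	/* assumed: initial-value bit */
#define TYPE_LINKED UINT64_C(2)	/* assumed: "linked" flag */
#define TYPE_MASK   UINT64_C(3)

struct demo_key {
	uint64_t word;	/* module-list pointer plus low type bits */
};

static int key_linked(const struct demo_key *key)
{
	return (key->word & TYPE_LINKED) != 0;
}

static uint64_t key_mod(const struct demo_key *key)
{
	return key->word & ~TYPE_MASK;	/* pointer part, 0 if no list */
}

/* Linked but with no module list: the special read-only state. */
int key_is_sealed(const struct demo_key *key)
{
	return key_linked(key) && !key_mod(key);
}

/* Mark a key as read-only after init, keeping its initial value bit. */
void key_seal(struct demo_key *key)
{
	key->word = TYPE_LINKED | (key->word & TYPE_TRUE);
}
```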

jump_label_add_module() does not exist under CONFIG_JUMP_LABEL=n, so the
newly-introduced jump_label_init_ro() can be defined as a nop for that
configuration.

[ mingo: Renamed jump_label_ro() to jump_label_init_ro() ]

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Conflicts:
	init/main.c
[Only context conflicts]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.11-rc7
commit 213aa67
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=213aa670153ed675a007c1f35c5db544b0fefc94

--------------------------------

Do not write-protect the kernel read-only and __ro_after_init sections
earlier than before mark_rodata_ro() is called.  This fixes a boot issue on
parisc which is triggered by commit 91a1d97 ("jump_label,module: Don't
alloc static_key_mod for __ro_after_init keys"). That commit may modify
static key contents in the __ro_after_init section at bootup, so this
section needs to be writable at least until mark_rodata_ro() is called.

Signed-off-by: Helge Deller <[email protected]>
Reported-by: matoro <[email protected]>
Reported-by: Christoph Biedl <[email protected]>
Tested-by: Christoph Biedl <[email protected]>
Link: https://lore.kernel.org/linux-parisc/[email protected]/#r
Fixes: 91a1d97 ("jump_label,module: Don't alloc static_key_mod for __ro_after_init keys")
Cc: [email protected] # v6.10+
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24
CVE: NA

Reference: https://lore.kernel.org/all/[email protected]/

----------------------------------------------------------------------

For ioremap(), so far we only checked if it was a device (RIPAS_DEV) to choose
an encrypted vs decrypted mapping. However, we may have firmware reserved memory
regions exposed to the OS (e.g., EFI Coco Secret Securityfs, ACPI CCEL).
We need to make sure that anything that is RIPAS_RAM (i.e., Guest
protected memory with RMM guarantees) is also mapped as encrypted.

Rephrasing the above, anything that is not RIPAS_EMPTY is guaranteed to be
protected by the RMM. Thus we choose encrypted mapping for anything that is not
RIPAS_EMPTY. While at it, rename the helper function

  __arm64_is_protected_mmio => arm64_rsi_is_protected

to clearly indicate that this is not an arm64 generic helper, but something to do
with Realms.
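
The change in the decision rule can be sketched as below. Only the three states named above are modeled; the enum values and the function names are illustrative assumptions, not the RSI definitions.

```c
#include <assert.h>

/* Illustrative subset of RIPAS states; numeric values are arbitrary here. */
enum ripas {
	RIPAS_EMPTY,
	RIPAS_RAM,
	RIPAS_DEV,
};

/* Previous rule: only device regions were treated as protected. */
int protected_before(enum ripas state)
{
	return state == RIPAS_DEV;
}

/* New rule: everything that is not RIPAS_EMPTY is protected by the RMM. */
int protected_after(enum ripas state)
{
	return state != RIPAS_EMPTY;
}
```

The two rules differ only for RIPAS_RAM, which is the case that firmware-reserved regions fall into.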

Cc: Sami Mujawar <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Steven Price <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
cca inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

--------------------------------

maillist:
Aneesh pointed out that this call to is_realm_world() is now too early
since the decision to delay the RSI detection. The upshot is that a
realm guest which doesn't have page granularity forced for other reasons
will fail to share pages with the host.

At the moment I can think of a couple of options:

(1) Make rodata_full a requirement for realm guests.
    CONFIG_RODATA_FULL_DEFAULT_ENABLED is already "default y" so this
    isn't a big ask.

(2) Revisit the idea of detecting when running as a realm guest early.
    This has the advantage of also "fixing" earlycon (no need to
    manually specify the shared-alias of an unprotected UART).

I'm currently leaning towards (1) because it's the default anyway. But
if we're going to need to fix earlycon (or indeed find other similar
issues) then (2) would obviously make sense.

The CCA context is as follows:
CCA guest must run at page granularity. Thus, we need to make
sure NO_BLOCK_MAPPINGS and NO_CONT_MAPPINGS flags are set during
paging_init. However, at that time, is_realm_world has not been
initialized yet. Therefore, rodata=full is forced currently while
booting a realm guest[1].

Link: https://patchwork.kernel.org/project/kvm/patch/[email protected]/

Signed-off-by: Yiwei Zhuang <[email protected]>

Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.15-rc1
commit c380931
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c380931712d16e23f6aa90703f438330139e9731

--------------------------------

phys_to_dma() sets the encryption bit on the translated DMA address. But
dma_to_phys() clears the encryption bit after it has been translated back
to the physical address, which could fail if the device uses DMA ranges.
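
A sketch of why the order matters, using a single hypothetical DMA range. The bit position, the `dma_range` layout and all function names are assumptions for this illustration; the range lookup rejects any address outside the window, which is where an address that still carries the encryption bit fails.

```c
#include <assert.h>
#include <stdint.h>

#define ENC_BIT  (UINT64_C(1) << 47)	/* assumed encryption bit */
#define BAD_ADDR UINT64_MAX

struct dma_range {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
};

/* Range-based translation: fails for addresses outside the window. */
static uint64_t range_dma_to_phys(const struct dma_range *r, uint64_t dma)
{
	if (dma < r->dma_start || dma - r->dma_start >= r->size)
		return BAD_ADDR;
	return dma - r->dma_start + r->cpu_start;
}

/* Translate first, then set the encryption bit on the DMA address. */
uint64_t demo_phys_to_dma(const struct dma_range *r, uint64_t phys)
{
	assert(phys >= r->cpu_start && phys - r->cpu_start < r->size);
	return (phys - r->cpu_start + r->dma_start) | ENC_BIT;
}

/* Old order: translate with the bit still set, clear it afterwards. */
uint64_t dma_to_phys_before(const struct dma_range *r, uint64_t dma)
{
	uint64_t phys = range_dma_to_phys(r, dma);

	return phys == BAD_ADDR ? BAD_ADDR : (phys & ~ENC_BIT);
}

/* Fixed order: clear the bit first, then translate. */
uint64_t dma_to_phys_after(const struct dma_range *r, uint64_t dma)
{
	return range_dma_to_phys(r, dma & ~ENC_BIT);
}
```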

AMD SME doesn't use the DMA ranges and thus this is harmless. But as we
are about to add support for other architectures, let us fix this.

Reported-by: Aneesh Kumar K.V <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Cc: Will Deacon <[email protected]>
Cc: Jean-Philippe Brucker <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Tom Lendacky <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Acked-by: Tom Lendacky <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Acked-by: Marek Szyprowski <[email protected]>
Fixes: 42be24a ("arm64: Enable memory encrypt for Realms")
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.15-rc1
commit b66e2ee
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b66e2ee7b6c8d45bbe4b6f6885ee27511506812c

--------------------------------

AMD SME added __sme_set/__sme_clr primitives to modify the DMA address for
encrypted/decrypted traffic. However this doesn't fit in with other models,
e.g., Arm CCA where the meanings are the opposite. i.e., "decrypted" traffic
has a bit set and "encrypted" traffic has the top bit cleared.

In preparation for adding the support for Arm CCA DMA conversions, convert the
existing primitives to more generic ones that can be provided by the backends.
i.e., add helpers to
 1. dma_addr_encrypted - Convert a DMA address to "encrypted" [ == __sme_set() ]
 2. dma_addr_unencrypted - Convert a DMA address to "decrypted" [ None exists today ]
 3. dma_addr_canonical - Clear any "encryption"/"decryption" bits from DMA
    address [ SME uses __sme_clr() ] and convert to a canonical DMA address.

Since the original __sme_xxx helpers come from linux/mem_encrypt.h, use that
as the home for the new definitions and provide dummy ones when none is provided
by the architectures.

With the above, phys_to_dma_unencrypted() uses the newly added dma_addr_unencrypted()
helper and, to make it a bit easier to read and avoid double conversion,
provide __phys_to_dma().
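
The three conversions can be sketched for both models as below. The kernel helpers take no model argument; `enc_model` and the `demo_` functions are hypothetical and exist only to put the SME-style and CCA-style meanings side by side.

```c
#include <assert.h>
#include <stdint.h>

struct enc_model {
	uint64_t bit;			/* attribute bit in the DMA address */
	int set_means_encrypted;	/* 1: SME-like, 0: CCA-like */
};

/* Strip the attribute bit to get the canonical DMA address. */
uint64_t demo_dma_addr_canonical(const struct enc_model *m, uint64_t addr)
{
	return addr & ~m->bit;
}

/* Convert a DMA address to its "encrypted" form for this model. */
uint64_t demo_dma_addr_encrypted(const struct enc_model *m, uint64_t addr)
{
	assert(m->bit != 0);
	return m->set_means_encrypted ? (addr | m->bit) : (addr & ~m->bit);
}

/* Convert a DMA address to its "decrypted" form for this model. */
uint64_t demo_dma_addr_unencrypted(const struct enc_model *m, uint64_t addr)
{
	assert(m->bit != 0);
	return m->set_means_encrypted ? (addr & ~m->bit) : (addr | m->bit);
}
```

In the SME-like model the encrypted form carries the bit; in the CCA-like model the decrypted (shared) form does, which is why a single set/clear pair of primitives cannot serve both.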

Suggested-by: Robin Murphy <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Jean-Philippe Brucker <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Acked-by: Marek Szyprowski <[email protected]>
Fixes: 42be24a ("arm64: Enable memory encrypt for Realms")
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.15-rc1
commit 7d953a0
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7d953a06241624ee2efb172d037a4168978f4147

--------------------------------

When a device performs DMA to a shared buffer using physical addresses,
(without Stage1 translation), the device must use the "{I}PA address" with the
top bit set in Realm. This is to make sure that a trusted device will be able
to write to shared buffers as well as the protected buffers. Thus, a Realm must
always program the full address including the "protection" bit, like AMD SME
encryption bits.

Enable this by providing arm64 specific dma_addr_{encrypted, canonical}
helpers for Realms. Please note that the VMM needs to similarly make sure that
the SMMU Stage2 in the Non-secure world is setup accordingly to map IPA at the
unprotected alias.

Cc: Will Deacon <[email protected]>
Cc: Jean-Philippe Brucker <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Acked-by: Marek Szyprowski <[email protected]>
Fixes: 42be24a ("arm64: Enable memory encrypt for Realms")
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit ec51ffc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec51ffcf263016111f090b9440a3c5a8338648e8

--------------------------------

In preparation for adding another coco build target, relieve
drivers/virt/Makefile of the responsibility to track new compilation
unit additions to drivers/virt/coco/, and do the same for
drivers/virt/Kconfig.

Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Tested-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Tom Lendacky <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Conflicts:
	drivers/virt/Kconfig
	drivers/virt/Makefile
	drivers/virt/coco/Kconfig
	drivers/virt/coco/Makefile
[Hygon inclusion commit be5ee944496f8 ("driver/virt/coco: Add HYGON CSV Guest dirver.") added the csv-guest files; adapt
these patches by moving the csv-guest entries to the coco Kconfig/Makefile.]

Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit 70e6f7e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=70e6f7e2b98575621019aa40ac616be58ff984e0

--------------------------------

One of the common operations of a TSM (Trusted Security Module) is to
provide a way for a TVM (confidential computing guest execution
environment) to take a measurement of its launch state, sign it and
submit it to a verifying party. Upon successful attestation that
verifies the integrity of the TVM additional secrets may be deployed.
The concept is common across TSMs, but the implementations are
unfortunately vendor specific. While the industry grapples with a common
definition of this attestation format [1], Linux need not make this
problem worse by defining a new ABI per TSM that wants to perform a
similar operation. The current momentum has been to invent new ioctl-ABI
per TSM per function which at best is an abdication of the kernel's
responsibility to make common infrastructure concepts share common ABI.

The proposal, targeted to conceptually work with TDX, SEV-SNP, COVE if
not more, is to define a configfs interface to retrieve the TSM-specific
blob.

    report=/sys/kernel/config/tsm/report/report0
    mkdir $report
    dd if=binary_userdata_plus_nonce > $report/inblob
    hexdump $report/outblob

This approach later allows for the standardization of the attestation
blob format without needing to invent a new ABI. Once standardization
happens the standard format can be emitted by $report/outblob and
indicated by $report/provider, or a new attribute like
"$report/tcg_coco_report" can emit the standard format alongside the
vendor format.

Review of previous iterations of this interface identified that there is
a need to scale report generation for multiple container environments
[2]. Configfs enables a model where each container can bind mount one or
more report generation item instances. Still, within a container only a
single thread can be manipulating a given configuration instance at a
time. A 'generation' count is provided to detect conflicts between
multiple threads racing to configure a report instance.

The SEV-SNP concepts of "extended reports" and "privilege levels" are
optionally enabled by selecting 'tsm_report_ext_type' at register_tsm()
time. The expectation is that those concepts are generic enough that
they may be adopted by other TSM implementations. In other words,
configfs-tsm aims to address a superset of TSM specific functionality
with a common ABI where attributes may appear, or not appear, based on
the set of concepts the implementation supports.

Link: http://lore.kernel.org/r/[email protected] [1]
Link: http://lore.kernel.org/r/[email protected] [2]
Cc: Kuppuswamy Sathyanarayanan <[email protected]>
Cc: Dionna Amalie Glaze <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Peter Gonda <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Samuel Ortiz <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Tested-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Tom Lendacky <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit a67d74a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a67d74a4b163878a3c0537033ed1b20db92ebfc5

--------------------------------

Allow for the declaration of variables that trigger kvfree() when they
go out of scope. The check for NULL and call to kvfree() can be elided
by the compiler in most cases, otherwise without the NULL check an
unnecessary call to kvfree() may be emitted. Peter proposed a comment
for this detail [1].
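
As a rough userspace illustration of the idea, the sketch below uses the
GCC/Clang cleanup attribute with free() standing in for kvfree(). The
auto_free macro, demo_free() and copy_len() are hypothetical names made
up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Counts calls to the free routine so the effect is observable. */
static int free_calls;

/* Userspace stand-in for kvfree(). */
static void demo_free(void *p)
{
	free_calls++;
	free(p);
}

/* A cleanup handler receives a pointer to the annotated variable. */
static void cleanup_free(void *p)
{
	void *ptr = *(void **)p;

	if (ptr)		/* the NULL check the text above refers to */
		demo_free(ptr);
}

#define auto_free __attribute__((cleanup(cleanup_free)))

/* Returns the length of a temporary copy of src, or -1 on failure. */
static int copy_len(const char *src, int fail_early)
{
	auto_free char *buf = NULL;

	if (fail_early)
		return -1;	/* buf is still NULL: demo_free() is not called */

	buf = malloc(strlen(src) + 1);
	if (!buf)
		return -1;
	strcpy(buf, src);
	return (int)strlen(buf);	/* buf is freed when it goes out of scope */
}
```

Every return path releases the buffer without an explicit free call, and
the early-return path skips the free routine because the pointer is NULL.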

Link: http://lore.kernel.org/r/[email protected] [1]
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Acked-by: Pankaj Gupta <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Tested-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Tom Lendacky <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 7999edc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7999edc484ca376f803562edb2d43ec921642c2a

--------------------------------

Introduce an arm-cca-guest driver that registers with
the configfs-tsm module to provide user interfaces for
retrieving an attestation token.

When a new report is requested the arm-cca-guest driver
invokes the appropriate RSI interfaces to query an
attestation token.

The steps to retrieve an attestation token are as follows:
  1. Mount the configfs filesystem if not already mounted
     mount -t configfs none /sys/kernel/config
  2. Generate an attestation token
     report=/sys/kernel/config/tsm/report/report0
     mkdir $report
     dd if=/dev/urandom bs=64 count=1 > $report/inblob
     hexdump -C $report/outblob
     rmdir $report

Signed-off-by: Sami Mujawar <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Conflicts:
	drivers/virt/coco/Kconfig
	drivers/virt/coco/Makefile
[The commit be5ee944496f8 ("driver/virt/coco: Add HYGON CSV Guest dirver.") adds csv-guest files resulting in context
conflicts]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc1
commit 972d755
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=972d755f01954bd0e36d8696f0d7dc6466072c21

--------------------------------

Add some documentation on Arm CCA and the requirements for running Linux
as a Realm guest. Also update booting.rst to describe the requirement
for RIPAS RAM.

Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.16-rc1
commit fba4cea
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fba4ceaa242d2bdf4c04b77bda41d32d02d3925d

--------------------------------

Unlike sysfs, the lifetime of configfs objects is controlled by
userspace. There is no mechanism for the kernel to find and delete all
created config-items. Instead, the configfs-tsm-report mechanism has an
expectation that tsm_unregister() can happen at any time and cause
established config-item access to start failing.

That expectation is not fully satisfied. While tsm_report_read(),
tsm_report_{is,is_bin}_visible(), and tsm_report_make_item() safely fail
if tsm_ops have been unregistered, tsm_report_privlevel_store() and
tsm_report_provider_show() fail to check for ops registration. Add the
missing checks for tsm_ops having been removed.

Now, in supporting the ability for tsm_unregister() to always succeed,
it leaves the problem of what to do with lingering config-items. The
expectation is that the admin that arranges for the ->remove() (unbind)
of the ${tsm_arch}-guest driver is also responsible for deletion of all
open config-items. Until that deletion happens, ->probe() (reload /
bind) of the ${tsm_arch}-guest driver fails.

This allows for emergency shutdown / revocation of attestation
interfaces, and requires coordinated restart.

Fixes: 70e6f7e ("configfs-tsm: Introduce a shared ABI for attestation reports")
Cc: [email protected]
Cc: Suzuki K Poulose <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Sami Mujawar <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Tom Lendacky <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reported-by: Cedric Xing <[email protected]>
Reviewed-by: Kai Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Dan Williams <[email protected]>
Conflicts:
	drivers/virt/coco/tsm.c
[Because commit 20dfee9 ("x86/sev: Take advantage of configfs
visibility support in TSM") is not merged, there are context conflicts.]
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.13-rc2
commit 9223059
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBWQ24

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=92230596252ab6155f2d7f7ff9fa61425800a13f

--------------------------------

Commits 7999edc ("virt: arm-cca-guest: TSM_REPORT support for
realm") and a06c3fa ("drivers/virt: pkvm: Add initial support for
running as a protected guest") added arm64 guest-side support for
running in CCA and pKVM confidential computing environments
respectively.

Unfortunately, these changes were not accompanied by a MAINTAINERS
entry and so aren't automatically picked up by the get_maintainer.pl
script. Since the initial support was merged via the arm64 tree, extend
the ARM64 entry to cover the two new directories.

Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Acked-by: Suzuki K Poulose <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Cai Xinchen <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit 57fc267
category: cleanup
bugzilla: https://gitee.com/openeuler/kernel/issues/I9Q7QP
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=57fc267f1b5caa56dfb46f60e20e673cbc4cc4a8

--------------------------------

Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
reads a vCPU's PMCR_EL0.

Currently, the PMCR_EL0 value is tracked per vCPU. The following
patches will make (only) PMCR_EL0.N track per guest. Having the
new helper will be useful to combine the PMCR_EL0.N field
(tracked per guest) and the other fields (tracked per vCPU)
to provide the value of PMCR_EL0.
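
A minimal sketch of how such a helper could combine the two sources,
assuming PMCR_EL0.N occupies bits [15:11]. The demo_* types and names
are hypothetical and do not reflect KVM's actual structures.

```c
#include <stdint.h>

#define DEMO_PMCR_N_SHIFT	11
#define DEMO_PMCR_N_MASK	(0x1fULL << DEMO_PMCR_N_SHIFT)

/* Hypothetical per-guest state: the number of event counters. */
struct demo_vm {
	uint8_t pmcr_n;
};

/* Hypothetical per-vCPU state: the remaining PMCR_EL0 fields. */
struct demo_vcpu {
	struct demo_vm *vm;
	uint64_t pmcr;
};

/* Single read path: per-vCPU fields plus the per-guest N field. */
static uint64_t demo_vcpu_read_pmcr(const struct demo_vcpu *vcpu)
{
	uint64_t pmcr = vcpu->pmcr & ~DEMO_PMCR_N_MASK;

	return pmcr | ((uint64_t)vcpu->vm->pmcr_n << DEMO_PMCR_N_SHIFT);
}
```

With every reader going through one helper, changing where N is stored
only requires touching that helper.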

No functional change intended.

Reviewed-by: Sebastian Ott <[email protected]>
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.8-rc1
commit 62e1f21
category: cleanup
bugzilla: https://gitee.com/openeuler/kernel/issues/I9Q7QP
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=62e1f212e5fe7624249212813ee96202e0c31430

--------------------------------

This is so that FIELD_GET and FIELD_PREP can be used and that the fields
are in a format consistent with arm64/tools/sysreg.
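
For illustration, a userspace approximation of the mask-based field
access pattern. The DEMO_* macros are simplified stand-ins written for
this sketch (using the GCC/Clang __builtin_ctzll builtin), not the
kernel's own definitions, and the PMCR_EL0.N position of bits [15:11] is
used only as an example field.

```c
#include <stdint.h>

/* Contiguous mask covering bits h..l (inclusive) of a 64-bit value. */
#define DEMO_GENMASK(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Extract the field selected by mask, shifted down to bit 0. */
#define DEMO_FIELD_GET(mask, reg) \
	(((reg) & (mask)) >> __builtin_ctzll(mask))

/* Shift val into the position selected by mask. */
#define DEMO_FIELD_PREP(mask, val) \
	(((uint64_t)(val) << __builtin_ctzll(mask)) & (mask))

/* Example field: PMCR_EL0.N, bits [15:11]. */
#define DEMO_PMCR_N	DEMO_GENMASK(15, 11)

static uint64_t demo_pmcr_get_n(uint64_t pmcr)
{
	return DEMO_FIELD_GET(DEMO_PMCR_N, pmcr);
}

static uint64_t demo_pmcr_set_n(uint64_t pmcr, uint64_t n)
{
	return (pmcr & ~DEMO_PMCR_N) | DEMO_FIELD_PREP(DEMO_PMCR_N, n);
}
```

Defining each field as a single shifted mask means no separate shift
constant has to be kept in sync with it.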

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Junhao He <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
Steven Price and others added 19 commits November 23, 2025 08:30
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Physical device assignment is not yet supported by the RMM, so it
doesn't make much sense to allow device mappings within the realm.
Prevent them when the guest is a realm.

Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Arm CCA assigns the physical PMU device to the guest running in realm
world, however the IRQs are routed via the host. To enter a realm guest
while a PMU IRQ is pending it is necessary to block the physical IRQ to
prevent an immediate exit. Provide a mechanism in the PMU driver for KVM
to control the physical IRQ.

Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit 4d20deb
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d20debf9ca160720a0b01ba4f2dc3d62296c4d1

--------------------------------

The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, the value is set to the same
value as the current PE on every vCPU reset.  Unless the vCPU is
pinned to PEs that have the PMU associated to the guest from the
initial vCPU reset, the value might be different from the PMU's
PMCR_EL0.N on heterogeneous PMU systems.

Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
value. Track the PMCR_EL0.N per guest, as only one PMU can be set
for the guest (PMCR_EL0.N must be the same for all vCPUs of the
guest), and it is convenient for updating the value.

To achieve this, the patch introduces a helper,
kvm_arm_pmu_get_max_counters(), that reads the maximum number of
counters from the arm_pmu associated to the VM. Make the function
global as upcoming patches will be interested to know the value
while setting the PMCR.N of the guest from userspace.

KVM does not yet support userspace modifying PMCR_EL0.N.
The following patch will add support for that.

Reviewed-by: Sebastian Ott <[email protected]>
Co-developed-by: Marc Zyngier <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
Conflicts:
	arch/arm64/kvm/pmu-emul.c
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit ea9ca90
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ea9ca904d24ff15ded92fd76c16462c47bcae2f8

--------------------------------

KVM does not yet support userspace modifying PMCR_EL0.N (With
the previous patch, KVM ignores what is written by userspace).
Add support for userspace limiting PMCR_EL0.N.

Disallow userspace to set PMCR_EL0.N to a value that is greater
than the host value as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns a success for these two cases.

Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).
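
The write policy for the N field described above can be modelled
roughly as follows. This is a sketch under the stated rules only; the
demo_* names are hypothetical and the handling of the other register
bits is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-guest state. */
struct demo_vm {
	uint8_t pmcr_n;		/* current PMCR_EL0.N for the guest */
	uint8_t host_max_n;	/* counters implemented by the host PMU */
	bool has_run;		/* set once the VM has started running */
};

/*
 * Model of a userspace write to PMCR_EL0: N (bits [15:11]) is updated
 * only when the VM has not run yet and the value does not exceed the
 * host limit. Rejected writes are ignored and still report success.
 */
static int demo_set_pmcr_n(struct demo_vm *vm, uint64_t val)
{
	uint8_t new_n = (val >> 11) & 0x1f;

	if (!vm->has_run && new_n <= vm->host_max_n)
		vm->pmcr_n = new_n;

	return 0;
}
```

Since a rejected write is not reported, a VMM that needs to know the
outcome would have to read the register back.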

Co-developed-by: Marc Zyngier <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>

Conflicts:
	arch/arm64/kvm/sys_regs.c
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit 1616ca6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1616ca6f3c10723c1b60ae44724212fae88f502d

--------------------------------

Introduce new helper functions to set the guest's PMU
(kvm->arch.arm_pmu) either to a default probed instance or to a
caller requested one, and use it when the guest's PMU needs to
be set. These helpers will make it easier for the following
patches to modify the relevant code.

No functional change intended.

Reviewed-by: Sebastian Ott <[email protected]>
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
[arch/arm64/kvm/pmu-emul.c: remove duplicate kvm_arm_set_pmu]
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Use the PMU registers from the RmiRecExit structure to identify when an
overflow interrupt is due and inject it into the guest. Also hook up the
configuration option for enabling the PMU within the guest.

When entering a realm guest with a PMU interrupt pending, it is
necessary to disable the physical interrupt. Otherwise when the RMM
restores the PMU state the physical interrupt will trigger causing an
immediate exit back to the host. The guest is expected to acknowledge
the interrupt causing a host exit (to update the GIC state) which gives
the opportunity to re-enable the physical interrupt before the next PMU
event.

The number of PMU counters is configured by the VMM by writing to PMCR.N.

Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Read-only mappings of protected memory are not supported by the RMM.
While it may be possible to support read-only mappings of unprotected
memory, this isn't supported at the present time.

Note that this does mean that ROM (or flash) data cannot be emulated
correctly by the VMM as the stage 2 mappings are either always
read/write or are trapped as MMIO (so don't support operations where the
syndrome information doesn't allow emulation, e.g. load/store pair).

This restriction can be lifted in the future by allowing the stage 2
mappings to be made read only.

Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

The RMM describes the maximum number of BPs/WPs available to the guest
in the Feature Register 0. Propagate those numbers into ID_AA64DFR0_EL1,
which is visible to userspace. A VMM needs this information in order to
set up realm parameters.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>

Signed-off-by: Yiwei Zhuang <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://patchwork.kernel.org/project/kvm/patch/[email protected]/

--------------------------------

Allow userspace to configure the number of breakpoints and watchpoints
of a Realm VM through KVM_SET_ONE_REG ID_AA64DFR0_EL1.

The KVM sys_reg handler checks the user value against the maximum value
given by RMM (arm64_check_features() gets it from the
read_sanitised_id_aa64dfr0_el1() reset handler).

Userspace discovers that it can write these fields by issuing a
KVM_ARM_GET_REG_WRITABLE_MASKS ioctl.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Conflicts:
	arch/arm64/kvm/rme.c
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://patchwork.kernel.org/project/kvm/patch/[email protected]/

--------------------------------

Provide an accurate number of available PMU counters to userspace when
setting up a Realm.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>

Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

RMM provides the maximum vector length it supports for a guest in its
feature register. Make it visible to the rest of KVM and to userspace
via KVM_REG_ARM64_SVE_VLS.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>

Signed-off-by: Yiwei Zhuang <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Obtain the max vector length configured by userspace on the vCPUs, and
write it into the Realm parameters. By default the vCPU is configured
with the max vector length reported by RMM, and userspace can reduce it
with a write to KVM_REG_ARM64_SVE_VLS.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

KVM_GET_REG_LIST should not be called before SVE is finalized. The ioctl
handler currently returns -EPERM in this case. But because it uses
kvm_arm_vcpu_is_finalized(), it now also rejects the call for
unfinalized REC even though finalizing the REC can only be done late,
after Realm descriptor creation.

Move the check to copy_sve_reg_indices(). One adverse side effect of
this change is that a KVM_GET_REG_LIST call that only probes for the
array size will now succeed even if SVE is not finalized, but that seems
harmless since the following KVM_GET_REG_LIST with the full array will
fail.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>

Signed-off-by: Yiwei Zhuang <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Userspace can set a few registers with KVM_SET_ONE_REG (9 GP registers
at runtime, and 3 system registers during initialization). Update the
register list returned by KVM_GET_REG_LIST.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Increment KVM_VCPU_MAX_FEATURES to expose the new capability to user
space.

Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY08N
Reference: https://lore.kernel.org/kvm/[email protected]/T/

--------------------------------

Add the ioctl to activate a realm and set the static branch to enable
access to the realm functionality if the RMM is detected.

Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>

Signed-off-by: Yiwei Zhuang <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
mainline inclusion
from mainline-v6.7-rc1
commit 4277335
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/ICWPIF
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=427733579744ef22ee6d0da9907560d79d937458

--------------------------------

Future changes to KVM's sysreg emulation will rely on having a valid PMU
instance to determine the number of implemented counters (PMCR_EL0.N).
This is earlier than when userspace is expected to modify the vPMU
device attributes, where the default is selected today.

Select the default PMU when handling KVM_ARM_VCPU_INIT such that it is
available in time for sysreg emulation.

Reviewed-by: Sebastian Ott <[email protected]>
Co-developed-by: Marc Zyngier <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[Oliver: rewrite changelog]
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…tomic section

community inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/ICX7FX?from=project-issue

Reference: https://patchew.org/linux/[email protected]/[email protected]

--------------------------------

Entering a realm is done using a SMC call to the RMM. On exit the
exit-codes need to be handled slightly differently to the normal KVM
path so define our own functions for realm enter/exit and hook them
in if the guest is a realm guest.

Fixes: a9b2e8a67446 ("[v8-16-43]arm64: RME: Handle realm enter/exit")
Signed-off-by: Steven Price <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>
community inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/ICX7FX?from=project-issue

Reference: https://patchew.org/linux/[email protected]/[email protected]

------------------------

Each page within the protected region of the realm guest can be marked
as either RAM or EMPTY. Allow the VMM to control this before the guest
has started and provide the equivalent functions to change this (with
the guest's approval) at runtime.

When transitioning from RIPAS RAM (1) to RIPAS EMPTY (0) the memory is
unmapped from the guest and undelegated allowing the memory to be reused
by the host. When transitioning to RIPAS RAM the actual population of
the leaf RTTs is done later on stage 2 fault, however it may be
necessary to allocate additional RTTs to allow the RMM to track the RIPAS
for the requested range.

When freeing a block mapping it is necessary to temporarily unfold the
RTT which requires delegating an extra page to the RMM, this page can
then be recovered once the contents of the block mapping have been
freed.

Fixes: 4afc64441759 ("[v8-15-43]arm64: RME: Allow VMM to set RIPAS")
Signed-off-by: Steven Price <[email protected]>
Signed-off-by: Xu Raoqing <[email protected]>
Signed-off-by: WangYuli <[email protected]>

@sourcery-ai sourcery-ai bot left a comment

Sorry @Avenger-285714, your pull request is larger than the review limit of 150000 diff characters

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from avenger-285714. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deepin-ci-robot

deepin pr auto review

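This diff appears to be a series of changes adding support for ARM CCA (Confidential Compute Architecture) on the ARM64 architecture. It mainly covers the following areas:

  1. Memory encryption/decryption support
  2. Realm Management Monitor (RMM) interface
  3. KVM support for Realms
  4. TSM (Trusted Security Module) report interface
  5. Configuration file and documentation updates

From a code review perspective, I have the following suggestions:

  1. Security:
  • rsi_attestation_token_init() needs stricter input validation. It currently checks only the length (32-64 bytes); consider also validating the content format
  • The memory encryption/decryption operations (set_memory_encrypted/decrypted) should perform additional permission checks
  • RMI calls need return-value checks to prevent a malicious guest from leaking information through these interfaces
  2. Performance:
  • The memory allocations in realm_map_protected() and realm_map_non_secure() could be optimized into batch allocations
  • Consider a caching mechanism for creating and destroying RTTs (Realm Translation Tables)
  • Consider making attestation token generation asynchronous to avoid blocking
  3. Code quality:
  • Some functions lack detailed documentation comments, especially the RMI-related interface functions
  • Error-handling paths need to be more consistent and complete
  • Some magic numbers (such as 0x1000) should be replaced with macro definitions
  4. Specific suggested changes:
// Add format validation in rsi_attestation_token_init()
static int validate_challenge(const u8 *challenge, unsigned long size)
{
    if (size < 32 || size > 64)
        return -EINVAL;

    // Check the format of the challenge contents
    for (unsigned long i = 0; i < size; i++) {
        if (!is_valid_challenge_byte(challenge[i]))
            return -EINVAL;
    }
    return 0;
}

// Add a permission check to the memory encryption operation
int set_memory_encrypted(unsigned long addr, int numpages)
{
    if (!capable(CAP_SYS_ADMIN))
        return -EPERM;

    // Existing code...
}

// Unify error handling for RMI calls
static int handle_rmi_error(int ret, const char *op)
{
    switch (RMI_RETURN_STATUS(ret)) {
    case RMI_SUCCESS:
        return 0;
    case RMI_ERROR_INPUT:
        pr_err("%s: Invalid input\n", op);
        return -EINVAL;
    case RMI_ERROR_REALM:
        pr_err("%s: Realm error\n", op);
        return -EPERM;
    default:
        pr_err("%s: Unknown error %d\n", op, ret);
        return -EIO;
    }
}
  5. Other suggestions:
  • Consider adding more debugging and monitoring interfaces to ease problem diagnosis
  • Add tracepoints on critical paths for performance analysis
  • Consider adding more unit tests, especially for security-related code paths

These suggestions aim to improve the security, performance, and maintainability of the code. Consider implementing them in phases, addressing the security-related issues first.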

Copilot AI left a comment

Pull request overview

This pull request adds support for ARM Confidential Compute Architecture (CCA/RME) to the Linux 6.6 kernel, enabling confidential virtual machines (Realms) on ARM64 systems. The implementation includes both guest-side and host-side (KVM) support for running and managing Realm VMs, which provide hardware-enforced isolation from the hypervisor.

Key Changes:

  • Adds Realm Management Monitor (RMM) and Realm Services Interface (RSI) support for ARM64 guest environments
  • Implements KVM extensions to create, manage, and execute Realm VMs with protected memory
  • Introduces memory encryption/decryption infrastructure for ARM64 with bounce buffer support for I/O

Reviewed changes

Copilot reviewed 78 out of 78 changed files in this pull request and generated no comments.

File                              Description
kernel/jump_label.c               Adds jump label sealing mechanism for ro_after_init static keys
init/main.c                       Calls jump_label_init_ro() during boot
virt/kvm/kvm_main.c               Filters HVA notifications for private mappings
include/linux/tsm.h               New TSM (Trusted Security Module) attestation interface
include/uapi/linux/kvm.h          Adds KVM_VM_TYPE_ARM_REALM and RME capability definitions
drivers/virt/coco/tsm.c           New TSM attestation report generation framework
drivers/virt/coco/arm-cca-guest/  New ARM CCA guest driver for attestation
arch/arm64/kernel/rsi.c           New RSI implementation for Realm guests
arch/arm64/mm/*.c                 Memory encryption/decryption and page attribute support
arch/arm64/kvm/rme.c              New RME support for KVM (1723 lines)
arch/arm64/kvm/rme-exit.c         Realm exit handling
arch/arm64/kvm/*.c                Various KVM changes for Realm support
drivers/perf/                     PMU counter field access updates
drivers/irqchip/irq-gic-v3-its.c  Memory encryption support for ITS tables

@Avenger-285714 Avenger-285714 changed the title [Deepin-Kernel-SIG] [linux 6.6-y] [Backport] [ARM] Add ARM-CCA confidential computing feature support [Deepin-Kernel-SIG] [linux 6.6-y] [Backport] [Security] [ARM] Add ARM-CCA confidential computing feature support Nov 23, 2025