
Commit d129377

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix the guest view of the ID registers, making the relevant fields
     writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1)

   - Correctly expose S1PIE to guests, fixing a regression introduced in
     6.12-rc1 with the S1POE support

   - Fix the recycling of stage-2 shadow MMUs by tracking the context
     (are we allowed to block or not) as well as the recycling state

   - Address a couple of issues with the vgic when userspace misconfigures
     the emulation, resulting in various splats. Headaches courtesy of our
     Syzkaller friends

   - Stop wasting space in the HYP idmap, as we are dangerously close to
     the 4kB limit, and this has already exploded in -next

   - Fix another race in vgic_init()

   - Fix a UBSAN error when faking the cache topology with MTE enabled

  RISCV:

   - RISCV: KVM: use raw_spinlock for critical section in imsic

  x86:

   - A bandaid for lack of XCR0 setup in selftests, which causes trouble
     if the compiler is configured to have x86-64-v3 (with AVX) as the
     default ISA. Proper XCR0 setup will come in the next merge window.

   - Fix an issue where KVM would not ignore low bits of the nested CR3
     and potentially leak up to 31 bytes out of the guest memory's bounds

   - Fix case in which an out-of-date cached value for the segments could
     be returned by KVM_GET_SREGS.

   - More cleanups for KVM_X86_QUIRK_SLOT_ZAP_ALL

   - Override MTRR state for KVM confidential guests, making it WB by
     default as is already the case for Hyper-V guests.

  Generic:

   - Remove a couple of unused functions"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
  RISCV: KVM: use raw_spinlock for critical section in imsic
  KVM: selftests: Fix out-of-bounds reads in CPUID test's array lookups
  KVM: selftests: x86: Avoid using SSE/AVX instructions
  KVM: nSVM: Ignore nCR3[4:0] when loading PDPTEs from memory
  KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset()
  KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL
  KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range()
  KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot
  x86/kvm: Override default caching mode for SEV-SNP and TDX
  KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic
  KVM: Remove unused kvm_vcpu_gfn_to_pfn
  KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration
  KVM: arm64: vgic: Don't check for vgic_ready() when setting NR_IRQS
  KVM: arm64: Fix shift-out-of-bounds bug
  KVM: arm64: Shave a few bytes from the EL2 idmap code
  KVM: arm64: Don't eagerly teardown the vgic on init error
  KVM: arm64: Expose S1PIE to guests
  KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule
  KVM: arm64: nv: Punt stage-2 recycling to a vCPU request
  KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed
  ...
2 parents c1bc09d + e9001a3 commit d129377
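The nSVM item above ("Ignore nCR3[4:0] when loading PDPTEs from memory") is not part of the hunks shown below, but the class of bug is easy to illustrate: in PAE paging the PDPT address is taken from CR3 with bits 4:0 ignored, so reading the PDPTEs without masking those bits can stray up to 31 bytes past the intended 32-byte table. The helper below is a hedged sketch of that idea only; the function name and structure are illustrative, not the actual KVM change.

/*
 * Illustrative sketch (not the actual patch): mask nCR3[4:0] before
 * reading the four PAE PDPTEs from guest memory, so a guest-controlled
 * low-bit pattern cannot shift the read past the 32-byte table.
 */
static int example_nested_load_pdptrs(struct kvm_vcpu *vcpu, u64 nested_cr3,
				      u64 pdptrs[4])
{
	gpa_t pdpt_gpa = nested_cr3 & ~0x1fULL;		/* ignore nCR3[4:0] */

	/* kvm_vcpu_read_guest() is the existing generic accessor. */
	return kvm_vcpu_read_guest(vcpu, pdpt_gpa, pdptrs, 4 * sizeof(u64));
}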

File tree

25 files changed: +277 -103 lines changed


Documentation/virt/kvm/api.rst (+9 -7)

@@ -8098,13 +8098,15 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if
 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is
 disabled.

-KVM_X86_QUIRK_SLOT_ZAP_ALL By default, KVM invalidates all SPTEs in
-fast way for memslot deletion when VM type
-is KVM_X86_DEFAULT_VM.
-When this quirk is disabled or when VM type
-is other than KVM_X86_DEFAULT_VM, KVM zaps
-only leaf SPTEs that are within the range of
-the memslot being deleted.
+KVM_X86_QUIRK_SLOT_ZAP_ALL By default, for KVM_X86_DEFAULT_VM VMs, KVM
+invalidates all SPTEs in all memslots and
+address spaces when a memslot is deleted or
+moved. When this quirk is disabled (or the
+VM type isn't KVM_X86_DEFAULT_VM), KVM only
+ensures the backing memory of the deleted
+or moved memslot isn't reachable, i.e KVM
+_may_ invalidate only SPTEs related to the
+memslot.
 =================================== ============================================

 7.32 KVM_CAP_MAX_VCPU_ID
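As a usage note (not part of this diff): the quirk documented above is a per-VM opt-out, controlled through the existing KVM_CAP_DISABLE_QUIRKS2 capability. A minimal sketch, assuming vm_fd is an already-created VM file descriptor:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: disable KVM_X86_QUIRK_SLOT_ZAP_ALL so that deleting or moving
 * a memslot only invalidates SPTEs related to that memslot. */
static int disable_slot_zap_all(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap  = KVM_CAP_DISABLE_QUIRKS2,
		.args = { KVM_X86_QUIRK_SLOT_ZAP_ALL },
	};

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}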

Documentation/virt/kvm/locking.rst (+1 -1)

@@ -136,7 +136,7 @@ For direct sp, we can easily avoid it since the spte of direct sp is fixed
 to gfn. For indirect sp, we disabled fast page fault for simplicity.

 A solution for indirect sp could be to pin the gfn, for example via
-kvm_vcpu_gfn_to_pfn_atomic, before the cmpxchg. After the pinning:
+gfn_to_pfn_memslot_atomic, before the cmpxchg. After the pinning:

 - We have held the refcount of pfn; that means the pfn can not be freed and
   be reused for another gfn.

arch/arm64/include/asm/kvm_asm.h (+1)

@@ -178,6 +178,7 @@ struct kvm_nvhe_init_params {
 	unsigned long hcr_el2;
 	unsigned long vttbr;
 	unsigned long vtcr;
+	unsigned long tmp;
 };

 /*

arch/arm64/include/asm/kvm_host.h (+7)

@@ -51,6 +51,7 @@
 #define KVM_REQ_RELOAD_PMU	KVM_ARCH_REQ(5)
 #define KVM_REQ_SUSPEND		KVM_ARCH_REQ(6)
 #define KVM_REQ_RESYNC_PMU_EL0	KVM_ARCH_REQ(7)
+#define KVM_REQ_NESTED_S2_UNMAP	KVM_ARCH_REQ(8)

 #define KVM_DIRTY_LOG_MANUAL_CAPS	(KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
					 KVM_DIRTY_LOG_INITIALLY_SET)
@@ -211,6 +212,12 @@ struct kvm_s2_mmu {
 	 */
 	bool	nested_stage2_enabled;

+	/*
+	 * true when this MMU needs to be unmapped before being used for a new
+	 * purpose.
+	 */
+	bool	pending_unmap;
+
 	/*
	 * 0: Nobody is currently using this, check vttbr for validity
	 * >0: Somebody is actively using this.
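Taken together, KVM_REQ_NESTED_S2_UNMAP and the pending_unmap flag split shadow-MMU recycling into two steps: mark the MMU and kick the vCPU, then let the vCPU perform the unmap in a context where blocking is allowed. The snippet below is only a sketch of that interaction; the function name and exact locking are assumptions, not code from this series.

/* Sketch: defer the stage-2 teardown to the owning vCPU instead of
 * unmapping from a context that may not be allowed to block. */
static void example_defer_shadow_s2_unmap(struct kvm_vcpu *vcpu,
					  struct kvm_s2_mmu *mmu)
{
	write_lock(&vcpu->kvm->mmu_lock);
	mmu->pending_unmap = true;		/* unmap before next use */
	write_unlock(&vcpu->kvm->mmu_lock);

	/* Picked up (via check_nested_vcpu_requests()) before guest entry. */
	kvm_make_request(KVM_REQ_NESTED_S2_UNMAP, vcpu);
}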

arch/arm64/include/asm/kvm_mmu.h (+2 -1)

@@ -166,7 +166,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
 int create_hyp_stack(phys_addr_t phys_addr, unsigned long *haddr);
 void __init free_hyp_pgds(void);

-void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size);
+void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
+			    u64 size, bool may_block);
 void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);
 void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);

arch/arm64/include/asm/kvm_nested.h (+3 -1)

@@ -78,6 +78,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid,
 extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu);
 extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu);

+extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu);
+
 struct kvm_s2_trans {
 	phys_addr_t output;
 	unsigned long block_size;
@@ -124,7 +126,7 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu,
 				    struct kvm_s2_trans *trans);
 extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2);
 extern void kvm_nested_s2_wp(struct kvm *kvm);
-extern void kvm_nested_s2_unmap(struct kvm *kvm);
+extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block);
 extern void kvm_nested_s2_flush(struct kvm *kvm);

 unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val);

arch/arm64/kernel/asm-offsets.c (+1)

@@ -146,6 +146,7 @@ int main(void)
   DEFINE(NVHE_INIT_HCR_EL2, offsetof(struct kvm_nvhe_init_params, hcr_el2));
   DEFINE(NVHE_INIT_VTTBR, offsetof(struct kvm_nvhe_init_params, vttbr));
   DEFINE(NVHE_INIT_VTCR, offsetof(struct kvm_nvhe_init_params, vtcr));
+  DEFINE(NVHE_INIT_TMP, offsetof(struct kvm_nvhe_init_params, tmp));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP, offsetof(struct cpu_suspend_ctx, sp));

arch/arm64/kvm/arm.c (+5)

@@ -997,6 +997,9 @@ static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu)
 static int check_vcpu_requests(struct kvm_vcpu *vcpu)
 {
 	if (kvm_request_pending(vcpu)) {
+		if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu))
+			return -EIO;
+
 		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
 			kvm_vcpu_sleep(vcpu);

@@ -1031,6 +1034,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)

 		if (kvm_dirty_ring_check_request(vcpu))
 			return 0;
+
+		check_nested_vcpu_requests(vcpu);
 	}

 	return 1;
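The new KVM_REQ_VM_DEAD check pairs with the generic kvm_vm_dead() helper: an init-error path (for example the vgic misconfiguration cases mentioned in the merge message) can mark the VM dead instead of tearing everything down eagerly, and each vCPU then fails its next entry with -EIO. A hedged sketch; the function name is illustrative:

/* Sketch: abort a misconfigured VM; every vCPU sees KVM_REQ_VM_DEAD in
 * check_vcpu_requests() above and returns -EIO to userspace. */
static void example_mark_vm_broken(struct kvm *kvm)
{
	kvm_vm_dead(kvm);	/* sets kvm->vm_dead, kicks all vCPUs */
}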

arch/arm64/kvm/hyp/nvhe/hyp-init.S (+29 -23)

@@ -24,28 +24,25 @@
 	.align 11

 SYM_CODE_START(__kvm_hyp_init)
-	ventry	__invalid		// Synchronous EL2t
-	ventry	__invalid		// IRQ EL2t
-	ventry	__invalid		// FIQ EL2t
-	ventry	__invalid		// Error EL2t
+	ventry	.			// Synchronous EL2t
+	ventry	.			// IRQ EL2t
+	ventry	.			// FIQ EL2t
+	ventry	.			// Error EL2t

-	ventry	__invalid		// Synchronous EL2h
-	ventry	__invalid		// IRQ EL2h
-	ventry	__invalid		// FIQ EL2h
-	ventry	__invalid		// Error EL2h
+	ventry	.			// Synchronous EL2h
+	ventry	.			// IRQ EL2h
+	ventry	.			// FIQ EL2h
+	ventry	.			// Error EL2h

 	ventry	__do_hyp_init		// Synchronous 64-bit EL1
-	ventry	__invalid		// IRQ 64-bit EL1
-	ventry	__invalid		// FIQ 64-bit EL1
-	ventry	__invalid		// Error 64-bit EL1
+	ventry	.			// IRQ 64-bit EL1
+	ventry	.			// FIQ 64-bit EL1
+	ventry	.			// Error 64-bit EL1

-	ventry	__invalid		// Synchronous 32-bit EL1
-	ventry	__invalid		// IRQ 32-bit EL1
-	ventry	__invalid		// FIQ 32-bit EL1
-	ventry	__invalid		// Error 32-bit EL1
-
-__invalid:
-	b	.
+	ventry	.			// Synchronous 32-bit EL1
+	ventry	.			// IRQ 32-bit EL1
+	ventry	.			// FIQ 32-bit EL1
+	ventry	.			// Error 32-bit EL1

 /*
  * Only uses x0..x3 so as to not clobber callee-saved SMCCC registers.
@@ -76,6 +73,13 @@ __do_hyp_init:
 	eret
 SYM_CODE_END(__kvm_hyp_init)

+SYM_CODE_START_LOCAL(__kvm_init_el2_state)
+	/* Initialize EL2 CPU state to sane values. */
+	init_el2_state			// Clobbers x0..x2
+	finalise_el2_state
+	ret
+SYM_CODE_END(__kvm_init_el2_state)
+
 /*
  * Initialize the hypervisor in EL2.
  *
@@ -102,9 +106,12 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init)
 	// TPIDR_EL2 is used to preserve x0 across the macro maze...
 	isb
 	msr	tpidr_el2, x0
-	init_el2_state
-	finalise_el2_state
+	str	lr, [x0, #NVHE_INIT_TMP]
+
+	bl	__kvm_init_el2_state
+
 	mrs	x0, tpidr_el2
+	ldr	lr, [x0, #NVHE_INIT_TMP]

 1:
 	ldr	x1, [x0, #NVHE_INIT_TPIDR_EL2]
@@ -199,9 +206,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)

 2:	msr	SPsel, #1		// We want to use SP_EL{1,2}

-	/* Initialize EL2 CPU state to sane values. */
-	init_el2_state			// Clobbers x0..x2
-	finalise_el2_state
+	bl	__kvm_init_el2_state
+
 	__init_el2_nvhe_prepare_eret

 	/* Enable MMU, set vectors and stack. */

arch/arm64/kvm/hypercalls.c (+6 -6)

@@ -317,7 +317,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu)
			 * to the guest, and hide SSBS so that the
			 * guest stays protected.
			 */
-			if (cpus_have_final_cap(ARM64_SSBS))
+			if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))
				break;
			fallthrough;
		case SPECTRE_UNAFFECTED:
@@ -428,7 +428,7 @@ int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
  * Convert the workaround level into an easy-to-compare number, where higher
  * values mean better protection.
  */
-static int get_kernel_wa_level(u64 regid)
+static int get_kernel_wa_level(struct kvm_vcpu *vcpu, u64 regid)
 {
 	switch (regid) {
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
@@ -449,7 +449,7 @@
		 * don't have any FW mitigation if SSBS is there at
		 * all times.
		 */
-		if (cpus_have_final_cap(ARM64_SSBS))
+		if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))
			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
		fallthrough;
	case SPECTRE_UNAFFECTED:
@@ -486,7 +486,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
-		val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
+		val = get_kernel_wa_level(vcpu, reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
		break;
	case KVM_REG_ARM_STD_BMAP:
		val = READ_ONCE(smccc_feat->std_bmap);
@@ -588,7 +588,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
		if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
			return -EINVAL;

-		if (get_kernel_wa_level(reg->id) < val)
+		if (get_kernel_wa_level(vcpu, reg->id) < val)
			return -EINVAL;

		return 0;
@@ -624,7 +624,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
		 * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
		 * other way around.
		 */
-		if (get_kernel_wa_level(reg->id) < wa_level)
+		if (get_kernel_wa_level(vcpu, reg->id) < wa_level)
			return -EINVAL;

		return 0;
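The switch from cpus_have_final_cap(ARM64_SSBS) to kvm_has_feat() matters because ID_AA64PFR1_EL1 is now writable from userspace (see the merge message): the advertised WORKAROUND_2 level follows the guest's view of the register rather than the host capability. A hedged userspace sketch of that scenario, assuming the usual encoding of ID_AA64PFR1_EL1 (op0=3, op1=0, CRn=0, CRm=4, op2=1) with SSBS in bits [7:4]:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: hide SSBS in the guest's ID_AA64PFR1_EL1, which with this
 * change also downgrades the reported WORKAROUND_2 level for the VM. */
static int example_hide_ssbs(int vcpu_fd)
{
	__u64 val;
	struct kvm_one_reg reg = {
		.id   = ARM64_SYS_REG(3, 0, 0, 4, 1),	/* ID_AA64PFR1_EL1 */
		.addr = (__u64)(unsigned long)&val,
	};

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
		return -1;

	val &= ~0xf0ULL;	/* clear the SSBS field, bits [7:4] */
	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}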

arch/arm64/kvm/mmu.c (+8 -7)

@@ -328,9 +328,10 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
				   may_block));
 }

-void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
+void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
+			    u64 size, bool may_block)
 {
-	__unmap_stage2_range(mmu, start, size, true);
+	__unmap_stage2_range(mmu, start, size, may_block);
 }

 void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)
@@ -1015,7 +1016,7 @@ static void stage2_unmap_memslot(struct kvm *kvm,

		if (!(vma->vm_flags & VM_PFNMAP)) {
			gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
-			kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start);
+			kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start, true);
		}
		hva = vm_end;
	} while (hva < reg_end);
@@ -1042,7 +1043,7 @@ void stage2_unmap_vm(struct kvm *kvm)
	kvm_for_each_memslot(memslot, bkt, slots)
		stage2_unmap_memslot(kvm, memslot);

-	kvm_nested_s2_unmap(kvm);
+	kvm_nested_s2_unmap(kvm, true);

	write_unlock(&kvm->mmu_lock);
	mmap_read_unlock(current->mm);
@@ -1912,7 +1913,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
			     (range->end - range->start) << PAGE_SHIFT,
			     range->may_block);

-	kvm_nested_s2_unmap(kvm);
+	kvm_nested_s2_unmap(kvm, range->may_block);
	return false;
 }

@@ -2179,8 +2180,8 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
	phys_addr_t size = slot->npages << PAGE_SHIFT;

	write_lock(&kvm->mmu_lock);
-	kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size);
-	kvm_nested_s2_unmap(kvm);
+	kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true);
+	kvm_nested_s2_unmap(kvm, true);
	write_unlock(&kvm->mmu_lock);
 }
