From: Michael Roth Date: Wed, 6 Dec 2023 15:58:21 +0000 (-0600) Subject: i386/sev: Avoid SEV-ES crash due to missing MSR_EFER_LMA bit X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=5746f70d685be1e0156e6e7f0d4a1e9bdca2e2e8;p=qemu.git i386/sev: Avoid SEV-ES crash due to missing MSR_EFER_LMA bit Commit 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it exposed a long-running bug in current KVM support for SEV-ES where the kernel assumes that MSR_EFER_LMA will be set explicitly by the guest kernel, in which case EFER write traps would result in KVM eventually seeing MSR_EFER_LMA get set and recording it in such a way that it would be subsequently visible when accessing it via KVM_GET_SREGS/etc. However, guest kernels currently rely on MSR_EFER_LMA getting set automatically when MSR_EFER_LME is set and paging is enabled via CR0_PG_MASK. As a result, the EFER write traps don't actually expose the MSR_EFER_LMA bit, even though it is set internally, and when QEMU subsequently tries to pass this EFER value back to KVM via KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL, which is now considered fatal due to the aforementioned QEMU commit. This can be addressed by inferring the MSR_EFER_LMA bit being set when paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure the expected bits are all present in subsequent handling on the host side. Ultimately, this handling will be implemented in the host kernel, but to avoid breaking QEMU's SEV-ES support when using older host kernels, the same handling can be done in QEMU just after fetching the register values via KVM_GET_SREGS*. Implement that here. Cc: Paolo Bonzini Cc: Marcelo Tosatti Cc: Tom Lendacky Cc: Akihiko Odaki Cc: Philippe Mathieu-Daudé Cc: Lara Lazier Cc: Vitaly Kuznetsov Cc: Maxim Levitsky Cc: Fixes: 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") Signed-off-by: Michael Roth Acked-by: Paolo Bonzini Signed-off-by: Stefan Hajnoczi Message-ID: <20231206155821.1194551-1-michael.roth@amd.com> --- diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 11b8177eff..4ce80555b4 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -3643,6 +3643,10 @@ static int kvm_get_sregs(X86CPU *cpu) env->cr[4] = sregs.cr4; env->efer = sregs.efer; + if (sev_es_enabled() && env->efer & MSR_EFER_LME && + env->cr[0] & CR0_PG_MASK) { + env->efer |= MSR_EFER_LMA; + } /* changes to apic base and cr8/tpr are read back via kvm_arch_post_run */ x86_update_hflags(env); @@ -3682,6 +3686,10 @@ static int kvm_get_sregs2(X86CPU *cpu) env->cr[4] = sregs.cr4; env->efer = sregs.efer; + if (sev_es_enabled() && env->efer & MSR_EFER_LME && + env->cr[0] & CR0_PG_MASK) { + env->efer |= MSR_EFER_LMA; + } env->pdptrs_valid = sregs.flags & KVM_SREGS2_FLAGS_PDPTRS_VALID;