From 7d9589239ec068c944190408b9838774d5ec1f8f Mon Sep 17 00:00:00 2001
From: Andrew Cooper
Date: Thu, 24 Feb 2022 12:18:00 +0000
Subject: [PATCH] x86/CET: Fix S3 resume with shadow stacks active
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

The original shadow stack support has an error on S3 resume with very bizarre
fallout. The BSP comes back up, but APs fail with:

  (XEN) Enabling non-boot CPUs ...
  (XEN) Stuck ??
  (XEN) Error bringing CPU1 up: -5

and then later (on at least two Intel TigerLake platforms), the next HVM vCPU
to be scheduled on the BSP dies with:

  (XEN) d1v0 Unexpected vmexit: reason 3
  (XEN) domain_crash called from vmx.c:4304
  (XEN) Domain 1 (vcpu#0) crashed on cpu#0:

The VMExit reason is EXIT_REASON_INIT, which has nothing to do with the
scheduled vCPU, and will be addressed in a subsequent patch. It is a
consequence of the APs triple faulting.

The APs triple fault because we don't tear down the stacks on suspend. The
idle/play_dead loop is killed in the middle of running, meaning that the
supervisor token is left busy.

On resume, SETSSBSY finds the busy bit set, suffers #CP and triple faults
because the IDT isn't configured this early.

Rework the AP bring-up path to (re)create the supervisor token. This ensures
the primary stack is non-busy before use.

Note: There are potential issues with the IST shadow stacks too, but fixing
those is more involved.

Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks")
Link: https://github.com/QubesOS/qubes-issues/issues/7283
Reported-by: Thiner Logoer
Reported-by: Marek Marczykowski-Górecki
Signed-off-by: Andrew Cooper
Tested-by: Thiner Logoer
Tested-by: Marek Marczykowski-Górecki
Reviewed-by: Jan Beulich
---
 xen/arch/x86/boot/x86_64.S | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
index fa41990dde..5d12937a0e 100644
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -51,13 +51,21 @@ ENTRY(__high_start)
         test    $CET_SHSTK_EN, %al
         jz      .L_ap_cet_done
 
-        /* Derive MSR_PL0_SSP from %rsp (token written when stack is allocated). */
-        mov     $MSR_PL0_SSP, %ecx
+        /* Derive the supervisor token address from %rsp. */
         mov     %rsp, %rdx
+        and     $~(STACK_SIZE - 1), %rdx
+        or      $(PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8, %rdx
+
+        /*
+         * Write a new supervisor token. Doesn't matter on boot, but for S3
+         * resume this clears the busy bit.
+         */
+        wrssq   %rdx, (%rdx)
+
+        /* Point MSR_PL0_SSP at the token. */
+        mov     $MSR_PL0_SSP, %ecx
+        mov     %edx, %eax
         shr     $32, %rdx
-        mov     %esp, %eax
-        and     $~(STACK_SIZE - 1), %eax
-        or      $(PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8, %eax
         wrmsr
         setssbsy
 
-- 
2.39.5
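
Illustration only, not part of the patch: a minimal user-space C sketch of the
token arithmetic and of the busy-bit check described in the commit message.
The numeric values of PAGE_SIZE, STACK_SIZE and PRIMARY_SHSTK_SLOT are
assumptions chosen for the example (the patch does not state them), and
token_addr() and setssbsy_ok() are hypothetical helper names. setssbsy_ok() is
a simplified model of the SETSSBSY check, not the full architectural
behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the patch only uses the symbolic names. */
#define PAGE_SIZE          4096ULL
#define STACK_SIZE         (PAGE_SIZE << 3)  /* assumed 8-page per-CPU stack block */
#define PRIMARY_SHSTK_SLOT 5ULL              /* assumed slot index within the block */
#define SHSTK_BUSY         1ULL              /* bit 0 of a supervisor token */

/*
 * Mirrors the and/or pair in the patch: round %rsp down to the base of the
 * stack block, then select the last 8 bytes of the primary shadow stack page.
 */
static uint64_t token_addr(uint64_t rsp)
{
    return (rsp & ~(STACK_SIZE - 1)) |
           ((PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8);
}

/*
 * Simplified model of SETSSBSY: it succeeds only when the token holds its own
 * address with the busy bit clear, and then marks the token busy. A false
 * return stands in for the #CP that real hardware would raise.
 */
static bool setssbsy_ok(uint64_t *token, uint64_t addr)
{
    if (*token != addr)
        return false;

    *token = addr | SHSTK_BUSY;
    return true;
}
```

In this model, a token left busy across suspend makes setssbsy_ok() fail.
Storing the token's own address back into it first, which is what
"wrssq %rdx, (%rdx)" does, clears bit 0 because the address is 8-byte aligned,
so the check then succeeds.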