The list p2m->pages contain all the pages used by the P2M. On large
instance this can be quite large and the time spent to call
d->arch.paging.free_page() will take more than 1ms for a 80GB guest
on a Xen running in nested environment on a c5.metal.
By extrapolation, it would take > 100ms for a 8TB guest (what we
current security support). So add some preemption in p2m_teardown()
and propagate to the callers. Note there are 3 places where
the preemption is not enabled:
- hap_final_teardown()/shadow_final_teardown(): We are
preventing update the P2M once the domain is dying (so
no more pages could be allocated) and most of the P2M pages
will be freed in preemptive manneer when relinquishing the
resources. So this is fine to disable preemption.
- shadow_enable(): This is fine because it will undo the allocation
that may have been made by p2m_alloc_table() (so only the root
page table).
The preemption is arbitrarily checked every 1024 iterations.
We now need to include <xen/event.h> in p2m-basic in order to
import the definition for local_events_need_delivery() used by
general_preempt_check(). Ideally, the inclusion should happen in
xen/sched.h but it opened a can of worms.
Note that with the current approach, Xen doesn't keep track on whether
the alt/nested P2Ms have been cleared. So there are some redundant work.
However, this is not expected to incurr too much overhead (the P2M lock
shouldn't be contended during teardown). So this is optimization is
left outside of the security event.
This is part of CVE-2022-33746 / XSA-410.
Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Jan Beulich <jbeulich@suse.com>
----
Changes since v12:
- Correct altp2m preemption check placement.
Changes since v9:
- Integrate patch into series.
Changes since v2:
- Rework the loop doing the preemption
- Add a comment in shadow_enable() to explain why p2m_teardown()
doesn't need to be preemptible.
Changes since v1:
- Update the commit message
- Rebase on top of Roger's v8 series
- Fix preemption check
- Use 'unsigned int' rather than 'unsigned long' for the counter