On detection of a potential L1TF issue, most validation code returns
-ERESTART to allow the switch to shadow mode to happen and cause the
original operation to be restarted.
However, in the validation code, the return value -ERESTART has been
repurposed to indicate 1) the function has partially completed
something which needs to be undone, and 2) calling put_page_type()
should cleanly undo it. This causes problems in several places.
For L1 tables, on receiving an -ERESTART return from alloc_l1_table(),
alloc_page_type() will set PGT_partial on the page. If for some
reason the original operation never restarts, then on domain
destruction, relinquish_memory() will call free_page_type() on the
page.
Unfortunately, alloc_ and free_l1_table() aren't set up to deal with
PGT_partial. When returning a failure, alloc_l1_table() always
de-validates whatever it's validated so far, and free_l1_table()
always devalidates the whole page. This means that if
relinquish_memory() calls free_page_type() on an L1 that didn't
complete due to an L1TF issue, it will call put_page_from_l1e() on
"page entries" that have never been validated.
For L2+ tables, setting rc to -ERESTART causes the rest of the
alloc_lN_table() function to *think* that the entry in question will
have PGT_partial set. This will cause it to set partial_pte = 1. If
relinquish_memory() then calls free_page_type() on one of those pages,
then free_lN_table() will call put_page_from_lNe() on the entry when
it shouldn't.
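
The L2+ failure mode described above can be illustrated with a toy model. This is only a sketch, not the hypervisor code: every toy_* name, the struct layout, and the numeric error values are hypothetical, and the bookkeeping is reduced to a validated-entry count plus a partial_pte flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hypervisor's error codes. */
enum { TOY_EINTR = 4, TOY_ERESTART = 85 };

#define TOY_NR_ENTRIES 4

/* Toy higher-level table with simplified validation bookkeeping. */
struct toy_ln_page {
    bool ref_taken[TOY_NR_ENTRIES]; /* entry really holds a reference */
    int nr_validated;               /* entries fully validated so far */
    bool partial_pte;               /* next entry believed to be partial */
};

/*
 * Toy analogue of alloc_lN_table(). Entry 'stop_at' is not present and
 * trips the simulated L1TF check, so no reference is taken on it and
 * 'err' is reported to the caller.
 */
static int toy_alloc_ln(struct toy_ln_page *pg, int stop_at, int err)
{
    for (int i = 0; i < TOY_NR_ENTRIES; i++) {
        if (i == stop_at) {
            pg->nr_validated = i;
            /* -ERESTART is read as "entry i is partially validated". */
            pg->partial_pte = (err == TOY_ERESTART);
            return -err;
        }
        pg->ref_taken[i] = true;
    }
    pg->nr_validated = TOY_NR_ENTRIES;
    pg->partial_pte = false;
    return 0;
}

/*
 * Toy analogue of free_lN_table(): drop every validated entry plus the
 * partial one, if flagged. Returns the number of puts issued on entries
 * that never took a reference.
 */
static int toy_free_ln(struct toy_ln_page *pg)
{
    int bogus = 0;
    int last = pg->nr_validated + (pg->partial_pte ? 1 : 0);

    for (int i = 0; i < last; i++) {
        if (!pg->ref_taken[i])
            bogus++;
        pg->ref_taken[i] = false;
    }
    pg->nr_validated = 0;
    pg->partial_pte = false;
    return bogus;
}

/* Compare the two return codes for a check that trips on entry 2. */
static void toy_ln_selfcheck(void)
{
    struct toy_ln_page a = {0}, b = {0};

    assert(toy_alloc_ln(&a, 2, TOY_ERESTART) == -TOY_ERESTART);
    assert(a.partial_pte && toy_free_ln(&a) == 1); /* one bogus put */

    assert(toy_alloc_ln(&b, 2, TOY_EINTR) == -TOY_EINTR);
    assert(!b.partial_pte && toy_free_ln(&b) == 0); /* nothing bogus */
}
```

In this model the only difference between the two runs is the error code reported for the non-present entry; the extra put comes purely from the partial_pte flag being set for an entry that never took a reference.
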
Rather than indicating -ERESTART, indicate -EINTR. This is the code
to indicate that nothing has changed from when you started the call
(which is effectively how alloc_l1_table() handles errors).
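
For the L1 case, the difference between the two return codes can be sketched with another toy model. Again this is an illustration under stated assumptions rather than the real code: the toy_* functions, the struct, and the error values are hypothetical, and the free path is assumed to run only for pages flagged partial.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hypervisor's error codes. */
enum { TOY_EINTR = 4, TOY_ERESTART = 85 };

#define TOY_L1_ENTRIES 4

/* Toy L1 page: per-entry validation state plus a PGT_partial analogue. */
struct toy_l1_page {
    bool validated[TOY_L1_ENTRIES];
    bool pgt_partial;
};

/*
 * Toy analogue of alloc_l1_table(): validate entries in order. If entry
 * 'bad' trips the simulated L1TF check, de-validate everything done so
 * far and return -err, leaving the page as it was on entry.
 */
static int toy_alloc_l1(struct toy_l1_page *pg, int bad, int err)
{
    for (int i = 0; i < TOY_L1_ENTRIES; i++) {
        if (i == bad) {
            while (i-- > 0)
                pg->validated[i] = false;
            return -err;
        }
        pg->validated[i] = true;
    }
    return 0;
}

/* Toy analogue of alloc_page_type(): only -ERESTART flags the page. */
static int toy_alloc_page_type(struct toy_l1_page *pg, int bad, int err)
{
    int rc = toy_alloc_l1(pg, bad, err);

    if (rc == -TOY_ERESTART)
        pg->pgt_partial = true;
    return rc;
}

/*
 * Toy analogue of relinquish_memory() followed by free_l1_table(): a
 * partial page has its whole table de-validated. Returns the number of
 * puts issued on entries that were never validated.
 */
static int toy_relinquish(struct toy_l1_page *pg)
{
    int bogus = 0;

    if (!pg->pgt_partial)
        return 0;
    for (int i = 0; i < TOY_L1_ENTRIES; i++) {
        if (!pg->validated[i])
            bogus++;
        pg->validated[i] = false;
    }
    pg->pgt_partial = false;
    return bogus;
}

/* Compare the two return codes for a check that trips on entry 2. */
static void toy_l1_selfcheck(void)
{
    struct toy_l1_page a = {0}, b = {0};

    assert(toy_alloc_page_type(&a, 2, TOY_ERESTART) == -TOY_ERESTART);
    assert(toy_relinquish(&a) == TOY_L1_ENTRIES); /* every put is bogus */

    assert(toy_alloc_page_type(&b, 2, TOY_EINTR) == -TOY_EINTR);
    assert(toy_relinquish(&b) == 0);              /* page left untouched */
}
```

With -EINTR the page is never flagged, which matches the "nothing has changed" meaning: the allocator has already undone its own work, so there is nothing left for a later free to undo.
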
mod_lN_entry() shouldn't have any of these types of problems, so leave
potential changes there for a clean-up patch later.
This is part of XSA-299.
Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 3165ffef09e89d38f84d26051f606d2c1421aea3
master date: 2019-10-31 16:11:12 +0100
int rc;
if ( !(l2e_get_flags(l2e) & _PAGE_PRESENT) )
- return pv_l1tf_check_l2e(d, l2e) ? -ERESTART : 1;
+ return pv_l1tf_check_l2e(d, l2e) ? -EINTR : 1;
if ( unlikely((l2e_get_flags(l2e) & L2_DISALLOW_MASK)) )
{
int rc;
if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
- return pv_l1tf_check_l3e(d, l3e) ? -ERESTART : 1;
+ return pv_l1tf_check_l3e(d, l3e) ? -EINTR : 1;
if ( unlikely((l3e_get_flags(l3e) & l3_disallow_mask(d))) )
{
int rc;
if ( !(l4e_get_flags(l4e) & _PAGE_PRESENT) )
- return pv_l1tf_check_l4e(d, l4e) ? -ERESTART : 1;
+ return pv_l1tf_check_l4e(d, l4e) ? -EINTR : 1;
if ( unlikely((l4e_get_flags(l4e) & L4_DISALLOW_MASK)) )
{
{
if ( !(l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) )
{
- ret = pv_l1tf_check_l1e(d, pl1e[i]) ? -ERESTART : 0;
+ ret = pv_l1tf_check_l1e(d, pl1e[i]) ? -EINTR : 0;
if ( ret )
goto out;
}