AMD/IOMMU: Cease using a dynamic height for the IOMMU pagetables

update_paging_mode() has multiple bugs:
1) Booting with iommu=debug will cause it to inform you that it was called
without the pdev_list lock held.
2) When growing by more than a single level, it leaks the newly allocated
table(s) in the case of a further error.
Furthermore, the choice of default level for a domain has issues:
1) All HVM guests grow from 2 to 3 levels during construction because of the
position of the VRAM just below the 4G boundary, so defaulting to 2 is a
waste of effort.
2) The limit for PV guests doesn't take memory hotplug into account, and
isn't dynamic at runtime like HVM guests. This means that a PV guest may
get RAM which it can't map in the IOMMU.
The dynamic height is a property unique to AMD, and adds substantial
complexity for what is a marginal performance improvement. Remove the
complexity by removing the dynamic height.
PV guests now get 3 or 4 levels based on any hotplug regions in the host.
This only makes a difference for hardware which previously had all RAM below
the 512G boundary, and a hotplug region above.
HVM guests now get 4 levels (which will be sufficient until 256TB guests
become a thing), because we don't currently have the information to know when
3 would be safe to use.
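For reference, the boundaries quoted above (4G, 512G, 256TB) follow from the
pagetable geometry. The sketch below is illustrative only and is not code from
this patch: it assumes 4k pages and 512 entries (9 address bits) per table
level, and levels_for_frames() is a hypothetical helper name.

```c
#include <stdint.h>

#define PTE_BITS_PER_LEVEL 9   /* assumed: 512 entries per table */
#define MAX_SKETCH_LEVELS  6   /* guard so the shift below stays well-defined */

/*
 * Hypothetical helper: smallest number of pagetable levels able to map
 * frames [0, nr_frames) of 4k each.  N levels cover 2^(9*N) frames, i.e.
 * 2 levels -> 1G, 3 levels -> 512G, 4 levels -> 256TB of address space.
 */
static unsigned int levels_for_frames(uint64_t nr_frames)
{
    unsigned int level = 1;

    while ( level < MAX_SKETCH_LEVELS &&
            nr_frames > (1ULL << (PTE_BITS_PER_LEVEL * level)) )
        level++;

    return level;
}
```

With these assumptions, anything reaching up to the 4G boundary (2^20 frames)
needs 3 levels, RAM or hotplug regions ending at or below 512G (2^27 frames)
still fit in 3, and anything above that, up to 256TB (2^36 frames), needs 4.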
The overhead of this extra level is not expected to be noticeable. It costs
one page (4k) per domain, and one extra IO-TLB paging structure cache entry
which is very hot and less likely to be evicted.
This is XSA-311.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: b4f042236ae0bb6725b3e8dd40af5a2466a6f971
master date: 2019-12-11 14:55:32 +0100