Andrea Bolognani [Tue, 16 Jan 2024 15:36:46 +0000 (16:36 +0100)]
qemu: Stop checking QEMU_CAPS_OBJECT_GPEX
For all versions of QEMU that we support, the virt machine type
has a hard dependency on this device, so we can stop checking
whether the capability is present and just use it unconditionally.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 14:44:44 +0000 (15:44 +0100)]
tests: Request virtio-mmio for balloon-mmio-deflate
For all supported QEMU version, the virt machine type has a hard
dependency on PCI support, so if we want to test virtio-balloon
together with virtio-mmio we have to either request that
explicitly or trick libvirt by masking capabilities. Do the
former instead of the latter.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 14:52:52 +0000 (15:52 +0100)]
tests: Drop various redundant tests
All of these are either a subset of other tests, or provide
coverage for scenarios that are not really possible: for all
versions of QEMU that we support, the virt machine type has a
hard dependency on the generic PCIe controller, which means
that we will never need to fall back to virtio-mmio.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 14:51:31 +0000 (15:51 +0100)]
tests: Add {aarch64,riscv64}-virt-headless-mmio
Even though virtio-mmio is no longer the default on either
architecture, and likely nobody is using it at this point, we
still provide a way to opt into virtio-mmio usage and want to
keep existing guests working. Add explicit test suite coverage
for this scenario.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 15:21:52 +0000 (16:21 +0100)]
tests: Drop aarch64-virtio-pci-default
After commit 1d8454639f40 (libvirt 3.0.0), the default address
type for aarch64/virt guests is PCI. These tests are then
pointless, as they are just a subset of other tests, and the
comment attached to them inaccurate.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 16:19:15 +0000 (17:19 +0100)]
qemu: Fix handling of user aliases for default PHB
The bus name for the default PHB is always "pci.0".
Fixes: 937f319536723fec57ad472b002a159d0f67a77c Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 16:05:07 +0000 (17:05 +0100)]
tests: Add pseries-phb-user-alias
This is the same as the existing pseries-phb-simple, except that
each of the controllers is given a user alias. If we tried to
start the resulting guest, we'd get an error:
Bus 'ua-phb0' not found
This is because, at the QEMU command line level, the default PHB
is not represented and so it can't be given a custom alias. We're
going to address this issue in a follow-up commit.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 16 Jan 2024 16:08:49 +0000 (17:08 +0100)]
tests: Add devices to pseries-phb-simple
We want to make sure that not only the controllers themselves
are added correctly, but also that devices attached to them
get assigned the expected bus value. In order to do that add
some devices, one per controller.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 15:38:46 +0000 (16:38 +0100)]
qemuxml2conftest: Test re-parsing of formatted XML
Re-parse and re-format the output XML to validate that the auto-added
bits and the formatter always agree. There's no way to specify an
alternative output file as a libvirt-formatted XML must be reformatted
identically.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 15:41:01 +0000 (16:41 +0100)]
virDomainDefAddConsoleCompat: Fix numbering of console targets after modification
The XML parser for consoles sets the 'port=' attribute of '<target' to
be always the index of the console.
Thus when the "really crazy backcompat stuff for consoles" function
modifies the order of consoles by inserting the default one for a serial
port it must re-number the ports to ensure that the value will not
change on subsequent parse.
This luckily didn't cause any visible changes to the VM as the port
number isn't used for anything.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 21:23:35 +0000 (22:23 +0100)]
qemuDomainAssignPCIAddresses: Assign extension addresses when auto-assigning PCI address
Assigning a PCI address needs to also assign any extension addresses
right away. Otherwise they'd be assigned only after subsequent
format->parse cycle and thus be potentially missing on first run after
defining the VM and thus could change.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 23:16:29 +0000 (00:16 +0100)]
qemu: Move 'shmem' device size validation to qemu_validate
The 'size' of a 'shmem' device is parsed and formatted as a "scaled"
value, stored in bytes, but the formatting scale is mebibytes. This
precission loss combined with the fact that the value was validated only
when starting and the size is formatted only when non-zero meant that
on first parse a value < 1 MiB would be accepted, but would be formatted
to the XML as 0 MiB as it was non-zero but truncated and a subsequent
parse would parse of such XML would parse it as 0 bytes, which in turn
would be interpreted as 'default' size.
Fix the issue by moving the validator, which ensures that the number is
a power of two and more than 1 MiB to the validator code so that it'll
be rejected at XML parsing time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 21:10:35 +0000 (22:10 +0100)]
virDomainAssignControllerIndexes: Ensure controller ordering after assigning indexes
Similarly to auto-adding of controllers, the assignment of indexes can
cause them to be considered in different ordering according to the logic
in 'virDomainControllerInsert' than they currently are.
To prevent changes in commandline between first run after defining a VM
xml and any subsequent run or restart of the daemon, we need to reorder
them when assigning the index.
The simplest method is to assign indexes and then create a new list of
controllers and re-instert them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 17:32:57 +0000 (18:32 +0100)]
conf: domain: Insert auto-added controllers in same order as in XML parser
'virDomainDefAddController' which is used in code-paths which auto-add
controllers to the definition such as 'virDomainDefMaybeAddController',
'virDomainDefAddUSBController', 'qemuDomainDefAddDefaultDevices' was
adding the controller at the end of the list. However that is not how
the XML parser would order the controller in the list as it uses
virDomainControllerInsert grouping them by type and additional
properties.
This would cause that auto-added controllers would re-order:
- between first and any subsequent run of the VM (even on commandline)
- after a libvirtd/virtqemud restart
- after any update of the definition based on the 'define' operation
(e.g. virsh edit)
To ensure that the ordering of controllers is identical always use
virDomainControllerInsert.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Sat, 16 Dec 2023 16:13:35 +0000 (17:13 +0100)]
qemuxml2argvtest: Test (inactive) def -> xml conversion
This is an intermediate step to merge qemuxml2xmltest into this common
helper. This eliminates double setup/parsing of the input data as well
as will ensure that all input XMLs are tested both for ARGV as well as
XML output. For now we skip tests that don't have an output XML to show
that the this does everything that qemuxml2xmltest does.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 15:02:33 +0000 (16:02 +0100)]
qemuxml2argvtest: Add parsing of the input XML as separate test
Get clean separation between the parsing and argv conversion so that
it's obvious in the test output:
2409) QEMU XML def parse s390-async-teardown.s390x-6.0.0 ... libvirt: QEMU Driver error : unsupported configuration: asynchronous teardown is not available with this QEMU binary
OK
2410) QEMU XML def -> ARGV s390-async-teardown.s390x-6.0.0 ... SKIP
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Sat, 16 Dec 2023 15:30:57 +0000 (16:30 +0100)]
qemuxml2argvtest: Extract setup/parse step
Extract the common setup and parsing of the input XML into a separate
helper testQemuConfXMLCommon(). The helper has semantics which will
allow us to call it from multiple places so that VIR_TEST_RANGE will
still work properly even when we'll add multiple steps reusing the
prepared data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Sat, 16 Dec 2023 07:48:21 +0000 (08:48 +0100)]
qemuxml2argvtest: Setup fake driver only once
Move the setup of the fake driver from testCompareXMLToArgv to 'mymain'.
With this we also won't need to reset the fake drivers which was done
only partially.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Sat, 16 Dec 2023 07:58:55 +0000 (08:58 +0100)]
qemuxml2argvtest: Remove unused separate parsing of arch
Prior to all tests being converted to "DO_TEST_CAPS*" invocation the
fake-caps tests required knowing the architecture, which was pre-parsed
in qemuxml2argvtest. This code was now removed, but the arch parser was
forgotten.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virDomainDefFormatInternalSetRootName' which is the top level XML
formatter function has the following condition as the very first thing:
if (def->id == -1)
flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
This makes it pointless to separately do inactive->active and
inactive->inactive XML -> XML testing as both will be in the end treated
as inactive->inactive.
This patch adds a warning to virDomainDefFormatInternalSetRootName and
removes the second pointless invocation of the test from
qemuxml2xmtest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 15 Dec 2023 15:30:32 +0000 (16:30 +0100)]
qemuxml2xmltest: Parse all input files as inactive
In previous patches we've added testing of XML's explicitly parsed as
active (ensuring that it e.g. has a domain id) formatted into both
active and inactive versions.
Now qemuxml2xmltest can be simplified by making it test only XMLs parsed
as inactive.
To do this we pass VIR_DOMAIN_DEF_PARSE_INACTIVE in parseFlags. This
will also cause that all output files will become identical so the setup
of the test cases can be simplified by using the non-split output file
name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 18 Dec 2023 12:36:45 +0000 (13:36 +0100)]
qemuxmlactivetest: Add qemu active XML to active/inactive XML tests
Add explicit test cases for XMLs from qemuxml2argvdata which
historically had different output in qemuxml2xmltest.
qemuxmlactivetest explicitly ensures that the input XMLs are parsed in
'live' state and formatted both in inactive as well as live state,
rather than the previously present inactive->inactive, live->live tests
only.
The XMLs picked in this case are those which had separate output files
in qemuxml2argvtest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 15 Dec 2023 15:30:20 +0000 (16:30 +0100)]
qemuxmlactivetest: Prepare for proper active/inactive -> active/inactive testing
Currently the xml->xml testing we have in qemuxml2xmltest covers only 3
of the 4 possibilities:
By invocation:
active -> active;
inactive -> inactive;
by unintentionally:
active -> inactive (for configs which don't set an 'id' as the
formatter assumes it's inactive)
To do it better introduce proper active -> inactive/active testing into
qemuxmlactivetest. It's chosen such as we only really parse an XML as
live when restoring a status XML. To give users possibility to avoid
constructing a full status XML add a simpler variant. As of such it will
be used only for configs where we specifically cared about parsing live
data.
To ensure that the formatter doesn't decide that a config is inactive
because it doesn't have an ID we fill in a domain ID if it was not
present in the source.
In this patch the tests are not yet added.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 4 Dec 2023 16:00:54 +0000 (17:00 +0100)]
qemu*xml2*test: Invoke tests from a function
Refactor the code so that the test macros invoke a helper function with
no additional steps. This change prevents regressions in compilation
time when adding extra steps for the tests, which happen when the test
macro gets too complicated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 15 Dec 2023 20:47:17 +0000 (21:47 +0100)]
virschematest: Add possibility to have exceptions from the '-invalid' suffix
The exception is needed in qemuxml2xmltest which is in one instance
testing update from an invalid config to a valid one. Currently the
compliance with the test is achieved via a hack.
As further patches will be simpler without the hack present we need a
way to invert the expected output in specific cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
On the guest configuration side, mention that support for the
"dies" attribute was introduced in libvirt 6.1.0 and clarify
that the ability to use non-default values is subject to
architecture and machine limitations.
On the host capabilities side, the documentation was pretty
much entirely missing. It's still far from perfect, but anything
is better than having no information at all.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This makes it so libvirt can obtain accurate information about
guest CPUs from QEMU, and should make it possible to correctly
perform operations such as CPU hotplug.
Of course this is mostly moot at the moment: only aarch64 can use
CPU clusters, and CPU hotplug is not yet implemented on that
architecture.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
tests: Add hostcpudata for machine with CPU clusters
The data is taken from an HPE Apollo 70 machine, which uses
aarch64 CPUs. It is interesting for us because non-dummy
information about CPU clusters is exposed through sysfs.
In order to keep things reasonable, the data was manually
modified so that only 8 of the original 224 CPUs are included.
Care has been taken to ensure that the topology is otherwise
unaltered.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Andrea Bolognani [Fri, 12 Jan 2024 10:02:56 +0000 (11:02 +0100)]
ci: Do more as part of .qemu-build-template
Entering $SCRATCH_DIR, going back to the original directory and
setting SELinux labels for the newly-installed QEMU binaries
are all steps that logically belong to this template rather
than its callers.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Fri, 12 Jan 2024 10:01:34 +0000 (11:01 +0100)]
ci: Fix .integration_tests_upstream_qemu
We enter $SCRATCH_DIR before going through the process of
cloning QEMU's upstream repo and building it, but once we're
done we don't get back to libvirt's sources, so the very next
step fails with
/tmp/script.: line 188: ci/jobs.sh: No such file or directory
Use pushd/popd to ensure that we're back to the correct place
once QEMU has been built and installed.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
apparmor: Allow access to /sys/devices/system/node/*/cpumap for libnuma
A QEMU change (10218ae6d006f76410804cc4dc690085b3d008b5) introduced
some libnuma calls that require read access to
/sys/devices/system/node/*/cpumap, which currently is forbidden by the
standard apparmor profile.
This commit allows read-only access to the file specified above.
qemu: Be less aggressive when dropping channel source paths
In v9.7.0-rc1~130 I've shortened the path that's generated for
<channel/> source. With that, I had to adjust regex that matches
all versions of paths we have ever generated so that we can drop
them (see comment around qemuDomainChrDefDropDefaultPath()). But
as it is usually the case with regexes - they are write only. And
while I attempted to make one portion of the path optional
("/target/") I accidentally made regex accept more, which
resulted in libvirt dropping the user provided path and
generating our own instead.
Fixes: d3759d3674ab9453e5fb5a27ab6c28b8ff8d5569
Resolves: https://issues.redhat.com/browse/RHEL-20807 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
If a second thread arrives with an RPC request, it will queue it
for the first thread to process. To inform the first thread that
there's a new request it calls g_main_loop_quit() to break it out
of the main loop.
This works if the first thread is in g_main_loop_run() at that
time. There is a small window of opportunity, however, where
the first thread has released the client lock, but not yet got
into g_main_loop_run(). If that happens, the wakeup from the
second thread is lost.
This patch deals with that by changing the way the wakeup is
performed. Instead of directly calling g_main_loop_quit(), the
second thread creates an idle source to run the quit function
from within the first thread. This guarantees that the first
thread will see the wakeup.
Tested by: Fima Shevrin <efim.shevrin@virtuozzo.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qemu: tighten semantics of 'size' when resizing block devices
When VIR_DOMAIN_BLOCK_RESIZE_CAPACITY is set, the 'size' parameter
is currently ignored. Since applications must none the less pass a
value for this parameter, it is preferrable to declare some explicit
semantics for it.
This declare that the parameter must be 0, or the exact size of the
underlying block device. The latter gives the management application
the ability to sanity check that the block device size matches what
they think it should be.
Reviewed-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
These are special in that, when a new target is introduced, some
preparation is needed before the changes can be merged. Since
this only happens every six months or so, it's unsurprising that
we keep messing it up and forgetting some steps. Having notes
right in the file will hopefully help going forward.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
These are jobs are supposed to be running tests using a QEMU
binary built from the latest upstream sources, but right now
they're just doing the same thing as the other jobs for the
target. Use the correct job templates.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The vexpress machine has never supported ACPI. This fact has
been silently ignored by QEMU so far, but recent versions have
started reporting attempts to use the combination as an error.
The other features (APIC, PAE) are also not relevant to the
vexpress machine, or the QEMU driver.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
During post-copy migration (once it actually switches to post-copy mode)
dirty memory pages are continued to be migrated iteratively, while the
destination can explicitly request a specific page to be migrated before
the iterative process gets to it (which happens when a guest wants to
read a page that was not migrated yet). Without the postcopy-preempt
capability enabled such pages need to wait until all other pages already
queued are transferred. Enabling this capability will instruct the
hypervisor to create a separate migration channel for explicitly
requested pages so that they can preempt the queue.
The only requirement for the feature to work is running a migration over
a protocol that supports multiple connections. In other words, we can't
pre-create the connection and pass its file descriptor to QEMU (i.e.,
using MIGRATION_DEST_CONNECT_SOCKET), but we have to let QEMU open the
connections itself (using MIGRATION_DEST_SOCKET). This change is applied
to all post-copy migrations even if postcopy-preempt is not supported to
avoid making the code even uglier than it is now. There's no real
difference between the two methods with modern QEMU (which can properly
report connection failures) anyway.
This capability is enabled for all post-copy migration as long as the
capability is supported on both sides of migration.
https://issues.redhat.com/browse/RHEL-7100
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Jiri Denemark [Mon, 8 Jan 2024 15:31:29 +0000 (16:31 +0100)]
qemu: Add support for optional migration capabilities
We enable various migration capabilities according to the flags passed
to a migration API. Missing support for such capabilities results in an
error because they are required by the corresponding flag. This patch
adds support for additional optional capability we may want to enable
for a given API flag in case it is supported. This is useful for
capabilities which are not critical for the flags to be supported, but
they can make things work better in some way.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Jiri Denemark [Mon, 8 Jan 2024 14:58:09 +0000 (15:58 +0100)]
qemu: Rename remoteCaps parameter in qemuMigrationParamsCheck
The migration cookie contains two bitmaps of migration capabilities:
supported and automatic. qemuMigrationParamsCheck expects the letter so
lets make it more obvious by renaming the parameter as remoteAuto.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Peter Krempa [Thu, 23 Feb 2023 15:25:18 +0000 (16:25 +0100)]
conf: Add possibility to configure multiple iothreads per disk
Introduce a new <iothreads> sub-element of disk's <driver> which will
allow configuring multiple iothreads and also map them to specific
virt-queues of virtio devices.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Laine Stump [Fri, 5 Jan 2024 01:12:51 +0000 (20:12 -0500)]
qemu: automatically bind to a vfio variant driver, if available
Rather than always binding to the vfio-pci driver, use the new
function virPCIDeviceFindBestVFIOVariant() to see if the running
kernel has a VFIO variant driver available that is a better match for
the device, and if one is found, use that instead.
virPCIDeviceFindBestVFIOVariant() function reads the modalias file for
the given device from sysfs, then looks through
/lib/modules/${kernel_release}/modules.alias for the vfio_pci alias
that matches with the least number of wildcard ('*') fields.
The appropriate "VFIO variant" driver for a device will be the PCI
driver implemented by the discovered module - these drivers are
compatible with (and provide the entire API of) the standard vfio-pci
driver, but have additional device-specific APIs that can be useful
for, e.g., saving/restoring state for migration.
If a specific driver is named (using <driver model='blah'/> in the
device XML), that will still be used rather than searching
modules.alias; this makes it possible to force binding of vfio-pci if
there is an issue with the auto-selected variant driver.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Laine Stump [Fri, 5 Jan 2024 01:12:51 +0000 (20:12 -0500)]
conf: support manually specifying VFIO variant driver in <hostdev> XML
This patch makes it possible to manually specify which VFIO variant
driver to use for PCI hostdev device assignment, so that, e.g. you
could force use of a VFIO "variant" driver, with e.g.
<driver model='mlx5_vfio_pci'/>
or alternately to force use of the generic vfio-pci driver with
<driver model='vfio-pci'/>
when libvirt would have normally (after applying a subsequent patch)
found a "better match" for a device in the active kernel's
modules.alias file. (The main potential use of this manual override
would probably be to work around a bug in a new VFIO variant driver by
temporarily not using that driver).
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Laine Stump [Fri, 5 Jan 2024 01:12:51 +0000 (20:12 -0500)]
tests: remove explicit <driver name='vfio'/> from hostdev test cases
The long-deprecated use of <driver name='vfio|xen|kvm'/> in domain xml
for <hostdev> devices was only ever necessary during the period when
libvirt (and the Linux kernel) supported both VFIO and "legacy KVM"
styles of hostdev device assignment for QEMU. This became pointless
many years ago when legacy KVM device assignment was removed from the
kernel, and support for that style of device assignment was completely
disabled in the libvirt source in 2019 (commit v5.6.0-316-g2e7225ea8c).
Nevertheless, there were instances of <driver name='vfio'/> in the
unit test data that were then (unnecessarily) propagated to several
more tests over the years. This patch cleans out those unnecessary
explicit settings of driver name='vfio' in all QEMU unit test data,
proving that the attribute is no longer (externally) needed. (A later
patch which adds a 2nd attribute to the <driver> element will include
a test case that explicitly exercises the driver name attribute).
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Laine Stump [Fri, 5 Jan 2024 01:12:51 +0000 (20:12 -0500)]
xen: explicitly set hostdev driver.name at runtime, not in postparse
Xen only supports a single type of PCI hostdev assignment, so it is
superfluous to have <driver name='xen'/> peppered throughout the
config. It *is* necessary to have the driver type explicitly set in
the hostdev object before calling into the hypervisor-agnostic "hostdev
manager" though (otherwise the hostdev manager doesn't know whether it
should do Xen-specific setup, or VFIO-specific setup).
Historically, the Xen driver has checked for "default" driver name
(i.e. not set in the XML), and set it to "xen', during the XML
postparse, thus guaranteeing that it will be set by the time the
object is sent to the hostdev manager at runtime, but also setting it
so early that a simple round-trip of parse-format results in the XML
always containing an explicit <driver name='xen'/>, even if that
wasn't specified in the original XML.
The QEMU driver *doesn't* set driver.name during postparse though;
instead, it waits until domain startup time (or device attach time for
hotplug), and sets the driver.name then. The result is that a
parse-format round trip of the XML in the QEMU driver *doesn't* add in
the <driver name='vfio'/>.
This patch modifies the Xen driver to behave similarly to the QEMU
driver - the PostParse just checks for a driver.name that isn't
supported by the Xen driver, and any explicit setting to "xen" is
deferred until domain runtime rather than during the postparse, thus
Xen domain XML also doesn't get extraneous <driver name='xen'/>.
This delayed setting of driver.name of course results in slightly
different xml2xml parse-format results, so the unit test data is
modified accordingly.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Laine Stump [Fri, 5 Jan 2024 01:12:51 +0000 (20:12 -0500)]
conf: replace virHostdevIsVFIODevice with virHostdevIsPCIDevice
virHostdevIsVFIODevice() and virDomainDefHasVFIOHostdev() are only ever
called from the QEMU driver, and in the case of the QEMU driver, any
PCI hostdev by definition uses VFIO, so really all these callers only
need to know if the device is a PCI hostdev.
(It turned out that the less specific virHostdevIsPCIDevice() already
existed in hypervisor/virhostdev.c, so I had to remove one of them;
since conf is a lower level directory than hypervisor, and the
function is called from conf, keeping the copy in hypervisor would
have required moving its caller (virDomainDefHasPCIHostdev()) into
hypervisor as well, so I just removed the copy in hypervisor.)
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>