virtio-iommu needs to be an integrated device, and our address
assignment code will make sure that is the case. If the user has
provided an explicit address, however, we should make sure any
addresses pointing to a different bus are rejected.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Thu, 23 Sep 2021 14:28:15 +0000 (16:28 +0200)]
conf: Add virDomainDeviceInfo to virDomainIOMMUDef
This is needed so that IOMMU devices can have addresses.
Existing IOMMU devices (intel-iommu and SMMUv3) are system
devices and as such don't have an address associated to them, but
virtio-iommu is a PCI device and needs one.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
This capability detects the availability of the boot-bypass
property of the virtio-iommu-pci device.
This property was only introduced in QEMU 7.0 but, since the
device has been around for much longer, we end up querying its
properties for several more releases. As I don't have convenient
access to the 10+ binaries necessary to regenerate the replies,
I just put some fake data in there.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Wed, 22 Sep 2021 17:15:01 +0000 (19:15 +0200)]
qemu: Introduce QEMU_CAPS_DEVICE_VIRTIO_IOMMU_PCI
This capability detects the availability of the virtio-iommu-pci
device.
Note that, while this device is present even in somewhat old
versions of QEMU, it's only some recent changes that made it
actually usable for our purposes.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Thu, 23 Sep 2021 14:44:42 +0000 (16:44 +0200)]
qemu: Tweak some code
The altered code is functionally equivalent to the previous one,
but it's already laid down in a way that will make further
changes easier and less messy.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Mon, 27 Sep 2021 13:49:23 +0000 (15:49 +0200)]
conf: Introduce VIR_PCI_CONNECT_INTEGRATED
This new flag can be used to convince the PCI address assignment
algorithm to place a device directly on the root bus. It will be
used to implement support for virtio-iommu, which needs to be an
integrated device in order to work correctly.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Tue, 29 Mar 2022 15:36:31 +0000 (17:36 +0200)]
tests: Update capabilities for QEMU 7.0.0 on ppc64
The QEMU binary is built from the v7.0.0-rc2 tag.
Some of the additional capabilities that show up are a
consequence of more features being enabled in this build than
in the one used to generate the replies initially.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Thu, 31 Mar 2022 09:01:24 +0000 (11:01 +0200)]
qemu: Dissolve virQEMUCapsFindBinaryForArch()
With the recent changes, virQEMUCapsGetDefaultEmulator() has
become a trivial wrapper around this function, as well as its
only caller. Clean up the situation by merging the two.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Jim Fehlig <jfehlig@suse.com>
Andrea Bolognani [Thu, 31 Mar 2022 09:00:38 +0000 (11:00 +0200)]
qemu: Don't assume that /usr/libexec/qemu-kvm exists
On a machine where no QEMU binary is installed, we end up logging
libvirtd: Cannot check QEMU binary /usr/libexec/qemu-kvm:
No such file or directory
which is not very useful in general, and downright misleading in
the case of operating systems that are not derived from RHEL.
This is a consequence of treating that specific path in a different
way from all other possible QEMU binary paths, and specifically of
not checking whether the file actually exists but sort of assuming
that it must do if we haven't found another QEMU binary earlier.
Address the issue by trying this path out in
virQEMUCapsFindBinaryForArch(), along with all the other possible
ones, and making sure it exists before returning it.
Reported-by: Jim Fehlig <jfehlig@suse.com> Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Jim Fehlig <jfehlig@suse.com>
Andrea Bolognani [Thu, 31 Mar 2022 08:56:55 +0000 (10:56 +0200)]
qemu: Clean up virQEMUCapsFindBinaryForArch()
If we get to the bottom of the function we know that none of the
attempts to locate a QEMU binary has been successful, so we can
simply return NULL directly.
This makes it unnecessary variable used to store the path, for
which we can use a more descriptive name.
Lastly, comparing with NULL explicitly is somewhat uncommon in
libvirt and more verbose than the equivalent implicit comparison,
so get rid of it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Jim Fehlig <jfehlig@suse.com>
Andrea Bolognani [Tue, 29 Mar 2022 09:43:36 +0000 (11:43 +0200)]
meson: Use dicts to initialize cfg_data objects
Instead of creating an empty object and then setting keys one
at a time, it is possible to pass a dict object to
configuration_data(). This is nicer because it doesn't require
repeating the name of the cfg_data object over and over.
There is one exception: the 'conf' object, where we store values
that are used directly by C code. In that case, using a dict
object is not feasible for two reasons: first of all, replacing
the set_quoted() calls would result in awkward code with a lot
of calls to format(); moreover, since code that modifies it is
sprinkled all over the place, refactoring it would probably
make things more complicated rather than simpler.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Andrea Bolognani [Mon, 28 Mar 2022 13:26:11 +0000 (15:26 +0200)]
qemu: Use real defaults for user and group in qemu.conf
The default values used by the library are determined at configure
time based on a number of factors, and we should reflect them in
the installed configuration file to make the comments it contains
more useful.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/263 Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Mon, 28 Mar 2022 15:24:47 +0000 (17:24 +0200)]
util: Improve macOS workaround
Since the workaround is specific to macOS, only disable compiler
warnings when building on that platform.
While at it, update the comment to reflect the fact that the
workaround is needed for all versions of the OS, including the
modern ones that we currently target.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Recent refactor (v8.1.0-217-ga193f4bef6) generalized job related enums
and functions by changing "qemu" prefix to "vir" and moving them to
src/hypervisor/domain_job.[ch]. This was in most cases a good thing, but
async job phases are driver specific and the corresponding functions
remained in src/qemu/qemu_domainjob.[ch], but still their prefix was
changed to "vir". Let's change it back to "qemu".
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Wed, 30 Mar 2022 11:54:50 +0000 (13:54 +0200)]
virsh: Fix integer overflow in allocpages
I've came across an aarch64 system which supports hugepages up to
16GiB of size. However, I was unable to allocate them using
virsh allocpages. This is because cmdAllocpages() uses
vshCommandOptScaledInt(), which scales passed value into bytes,
but since the virNodeAllocPages() expects size in KiB the
variable holding bytes is then divided by 1024. However, the
limit for the biggest value passed to vshCommandOptScaledInt() is
UINT_MAX which is now obviously wrong, as it needs to be UINT_MAX
* 1024.
The same bug is in completer. But here, let's use ULLONG_MAX so
that we don't have to care about it anymore.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jonathon Jongsma [Tue, 29 Mar 2022 19:24:36 +0000 (14:24 -0500)]
qemu: fix hotplug for multiqueue vdpa net device
While commit a5e659f0 removed the restriction against multiple queues
for the vdpa net device, there were some missing pieces. Configuring a
device statically and then starting the domain worked as expected, but
hotplugging a device didn't have the expected multiqueue support
enabled. Add the missing bits.
Consider the following device xml:
<interface type="vdpa">
<mac address="00:11:22:33:44:03" />
<source dev="/dev/vhost-vdpa-0" />
<model type="virtio" />
<driver queues='2' />
</interface>
Without this patch, hotplugging the above XML description resulted in
the following:
{"execute":"netdev_add","arguments":{"type":"vhost-vdpa","vhostdev":"/dev/fdset/0","id":"hostnet1"},"id":"libvirt-392"}
{"execute":"device_add","arguments":{"driver":"virtio-net-pci","netdev":"hostnet1","id":"net1","mac":"00:11:22:33:44:03","bus":"pci.5","addr":"0x0"},"id":"libvirt-393"}
With the patch, hotplugging results in the following:
{"execute":"netdev_add","arguments":{"type":"vhost-vdpa","vhostdev":"/dev/fdset/0","queues":2,"id":"hostnet1"},"id":"libvirt-392"}
{"execute":"device_add","arguments":{"driver":"virtio-net-pci","mq":true,"vectors":6,"netdev":"hostnet1","id":"net1","mac":"00:11:22:33:44:03","bus":"pci.5","addr":"0x0"},"id":"libvirt-393"}
John Levon [Tue, 29 Mar 2022 17:38:11 +0000 (18:38 +0100)]
fix documentation for sockets topology
In 0895a0e, it was noted that the "sockets" value in the topology
section of capabilities reflects not the number of sockets per NUMA
node, not the total number.
Unfortunately, the fix was applied to the wrong place: the domain XML
format documentation, not that for the capabilities output. And, in
fact, the domain XML interprets "sockets" as the total number, not a
per-node value.
Back out this change in favour of a note in the capabilities
documentation instead.
Fixes: 0895a0e75d13874254218e16dc66dcad673671d3 Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: John Levon <john.levon@nutanix.com>
Michal Privoznik [Mon, 28 Mar 2022 11:29:19 +0000 (13:29 +0200)]
virfile: Report error when changing pipe size fails
When changing the size of pipe that virFileWrapperFdNew() creates
we start at 1MiB and if that fails because it's above the system
wide limit we get EPERM and continue with half of the size.
However, we might get another error in which case we should
report proper system error and return failure from
virFileWrapperFdNew().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Mon, 21 Mar 2022 14:17:47 +0000 (15:17 +0100)]
qemu: command: Override device definition according to the namespace config
Apply the user-requested changes to the device definition as requested
by the <qemu:deviceOverride> element from the custom qemu XML namespace.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/287 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Mon, 21 Mar 2022 11:02:53 +0000 (12:02 +0100)]
docs: drvqemu: Document overriding of device properties
Upcoming patches will add possibility to override configuration of a
device with custom properties as a more versatile replacement to using
QEMU's '-set' parameter, which doesn't work when we use JSON to
instantiate devices.
Describe the XML used for the override as well as expectations of
upstream support in case something breaks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Claudio Fontana [Fri, 25 Mar 2022 15:10:19 +0000 (16:10 +0100)]
virfile: set pipe size in virFileWrapperFdNew to improve throughput
currently the only user of virFileWrapperFdNew is the qemu driver;
virsh save is very slow with a default pipe size.
This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also
the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 21 Mar 2022 12:33:06 +0000 (13:33 +0100)]
qemu_tpm: Do async IO when starting swtpm emulator
When vTPM is secured via virSecret libvirt passes the secret
value via an FD when swtpm is started (arguments --key and
--migration-key). The writing of the secret into the FDs is
handled via virCommand, specifically qemu_tpm calls
virCommandSetSendBuffer()) and then virCommandRunAsync() spawns a
thread to handle writing into the FD via
virCommandDoAsyncIOHelper. But the thread is not created unless
VIR_EXEC_ASYNC_IO flag is set, which it isn't. In order to fix
it, virCommandDoAsyncIO() must be called.
The credit goes to Marc-André Lureau
<marcandre.lureau@redhat.com> who has done all the debugging and
proposed fix in the bugzilla.
Paolo Bonzini [Thu, 24 Mar 2022 09:48:39 +0000 (10:48 +0100)]
qemu: add support for tsc.on_reboot element
QEMU 7.0.0 adds a new property tsc-clear-on-reset to x86 CPU, corresponding
to Libvirt's <tsc on_reboot="clear"/> element. Plumb it in the validation,
command line handling and tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Paolo Bonzini [Thu, 24 Mar 2022 09:48:38 +0000 (10:48 +0100)]
domain: add tsc.on_reboot element
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The workaround is to reset the TSC to a small value. Add
to the domain configuration an attribute for this. It can be used
by QEMU and in principle also by ESXi, which has a property called
monitor_control.enable_softResetClearTSC as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Paolo Bonzini [Thu, 24 Mar 2022 09:36:56 +0000 (10:36 +0100)]
tests: add dependencies to meson declaration
Make sure that all tests are run after the helpers and mocks are
(re)built. This enables for example using "meson test" as the
command line passed to "git bisect run".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Paolo Bonzini [Thu, 24 Mar 2022 10:53:04 +0000 (11:53 +0100)]
meson: do not look for libparted if not requested
libparted_dep is not used if -Dstorage_disk=disabled. Do not
bother looking for this library if the disk storage backend was
not requested.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These enums are essentially the same and always sorted in the
same order in every hypervisor with jobs. They can be generalized
by using the qemu enums as the main ones as they are the most
extensive.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The prealloc-threads is property of memory-backend class which is
parent to the other three classes memory-backend-{ram,file,memfd}.
Therefore the property is present for all, or none if QEMU is
older than v5.0.0-rc0~75^2~1^2~3 which introduced the property.
Anyway, the .reserve property is the same story, and we chose
memory-backend-file to detect it, so stick with our earlier
decision and use the same backend to detect this new property.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal Privoznik [Mon, 21 Mar 2022 15:49:25 +0000 (16:49 +0100)]
conf: Introduce memory allocation threads
Since its v5.0.0 release QEMU is capable of specifying number of
threads used to allocate memory. It defaults to 1, which may be
too low for humongous guests with gigantic pages.
In general, on QEMU cmd line level it is possible to use
different number of threads per each memory-backend-* object, in
practical terms it's not useful. Therefore, use <memoryBacking/>
to set guest wide value and let all memory devices 'inherit' it,
silently. IOW, don't introduce per device knob because that would
only complicate things for a little or no benefit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Erik Skultety [Tue, 22 Mar 2022 11:31:49 +0000 (12:31 +0100)]
ci: integration: Rename all Avocado standard stream log files to *.log
By default, stdout/stderr Avocado test log files do not have any file
extension which confuses GitLab's web UI to mangle the MIME type for
these and so the browser will never offer the option to open such file
from in a text editor rather than dowloading it.
Since GitLab sets a proper MIME for .txt and .log file extensions,
rename all Avocado log files without an extension to *.log . This pairs
nicely with the coredumpctl info file which we already name as
'coredumpctl.txt' because of this.
Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Erik Skultety [Mon, 21 Mar 2022 17:05:16 +0000 (18:05 +0100)]
ci: integration: Collect stack traces with coredumpctl
Some Red Hat-like distros have cores limited with a soft limit of 0
which means that neither a stack trace nor a core file will be
available. Since we want the stack trace we need to set the core limit
with systemd globally to unlimited/infinity.
Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Erik Skultety [Mon, 21 Mar 2022 12:51:53 +0000 (13:51 +0100)]
ci: Define the integration job tag dynamically via a variable
Custom runners are private to a project, so naturally forks cannot run
any workloads on these. The integration test suite which requires
access to our custom runner is naturally disabled on forks and can be
enabled by setting LIBVIRT_CI_INTEGRATION=1.
The problem is that the current integration jobs definitions have tags
statically defined as 'redhat-vm-host'. If users are going to supply
their own private runners for their forks, they can define whatever
tags they want with it and so unless they add 'redhat-vm-host' to their
own runner's tags, the pipeline won't run.
To solve this, define the integration job tag using a variable. The
repo config will use the value defined in the job for the variable
while users can override the value easily on a project/pipeline level
thanks to GitLab's CI variable precedence [1].