Andrea Bolognani [Fri, 12 Jul 2024 12:18:20 +0000 (14:18 +0200)]
cpu_map: Add pauth Arm CPU feature
This CPU feature can be used to explicitly enable or disable
support for pointer authentication. By default, it will be
enabled if the host supports it.
https://issues.redhat.com/browse/RHEL-7044
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemu: Don't leave beingDestroyed=true on inactive domain
Recent commit v10.4.0-87-gd9935a5c4f made a reasonable change to only
reset beingDestroyed back to false when vm->def->id is reset to make
sure other code can detect a domain is (about to become) inactive. It
even added a comment saying any caller of qemuProcessBeginStopJob is
supposed to call qemuProcessStop to clear beingDestroyed. But not every
caller really does so because they first call qemuProcessBeginStopJob
and then check whether a domain is still running. If not the
qemuProcessStop call is skipped leaving beingDestroyed=true. In case of
a persistent domain this may block incoming migrations of such domain as
the migration code would think the domain died unexpectedly (even though
it's still running).
The qemuProcessBeginStopJob function is a wrapper around
virDomainObjBeginJob, but virDomainObjEndJob was used directly for
cleanup. This patch introduces a new qemuProcessEndStopJob wrapper
around virDomainObjEndJob to properly undo everything
qemuProcessBeginStopJob did.
https://issues.redhat.com/browse/RHEL-43309
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Thu, 11 Jul 2024 07:32:40 +0000 (09:32 +0200)]
virt-host-validate: Drop extra "PASS"
If virt-host-validate is ran on a SEV-SNP capable machine, an
extra "PASS" is printed out. This is because
virHostValidateAMDSev() prints "PASS" and then returns 1
(indicating success) which in turn makes the caller
(virHostValidateSecureGuests()) print "PASS" again. Just drop the
extra printing in the caller and let virHostValidateAMDSev() do
all the printing.
Fixes: 1a8f646f291775d2423ce4e4df62ad69f06ab827
Resolves: https://issues.redhat.com/browse/RHEL-46868 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Ján Tomko [Thu, 4 Jul 2024 13:54:27 +0000 (15:54 +0200)]
tests: qemuxmlconf: adjust test case to new virtiofsd
Now that we have a fake virtiofsd json descriptor in our vhost-user
test data, we can remove the explicitly specified binary and our
mocking will ensure this test won't be affected by the host state.
Also remove the locking options, since they were never supported
by the Rust version of virtiofsd.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As of now, libvirt supports few essential stats as
part of virDomainGetJobStats for Live Migration such
as memory transferred, dirty rate, number of iteration
etc. Currently it does not have support for the vfio
stats returned via QEMU. This patch adds support for that.
Signed-off-by: Kshitij Jha <kshitij.jha@nutanix.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Adam Julis [Tue, 9 Jul 2024 15:23:31 +0000 (17:23 +0200)]
network: allow "modify" option for DNS-Txt records
The "modify" command allows to replace an existing record (its
text value). The primary key is the name of the record. If
duplicity or missing record detected, throw error.
Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-text record and failure due to non-existing record.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639 Signed-off-by: Adam Julis <ajulis@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Adam Julis [Tue, 9 Jul 2024 15:23:18 +0000 (17:23 +0200)]
network: allow "modify" option for DNS-Srv records
The "modify" command allows to replace an existing Srv record
(some of its elements respectively: port, priority and weight).
The primary key used to choose the modify record is the remaining
parameters, only one of them is required. Not using some of these
parameters may cause duplicate records and error message. This
logic is there because of the previous implementation (Add and
Delete options) in the function.
Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-Srv record and failure due to non-existing record.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639 Signed-off-by: Adam Julis <ajulis@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Adam Julis [Tue, 9 Jul 2024 15:23:02 +0000 (17:23 +0200)]
network: allow "modify" option for DNS hostname
The "modify" command allows you to replace an existing record
(its hostname, sub-elements). IP address acts as the primary key.
If it is not found, the attempt ends with an error message. If
the XML contains a duplicate address, it will select the last
one.
Tests in networkxml2xmlupdatetest.c contain replacements of an
existing DNS-Host record and failure due to non-existing record.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/639 Signed-off-by: Adam Julis <ajulis@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When generating paths for a domain specific AppArmor profile each
path undergoes a validation where it's matched against an array
of well known prefixes (among other things). Now, for
OVMF/AAVMF/... images we have a list and some entries have
comments to which type of image the entry belongs to. For
instance:
"/usr/share/OVMF/", /* for OVMF images */
"/usr/share/AAVMF/", /* for AAVMF images */
But these comments are pretty useless. The path itself already
gives away the image type. Drop them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
This commit removes the redundant call to qemuSecurityGetNested() in
qemuStateInitialize(). In qemuSecurityGetModel(), the first security manager
in the stack is already used by default, so this change helps to
simplify the code.
Signed-off-by: hongmianquan <hongmianquan@bytedance.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
security_manager: Ensure top lock is acquired before nested locks
Fix libvirtd hang since fork() was called while another thread had
security manager locked.
We have the stack security driver, which internally manages other security drivers,
just call them "top" and "nested".
We call virSecurityStackPreFork() to lock the top one, and it also locks
and then unlocks the nested drivers prior to fork. Then in qemuSecurityPostFork(),
it unlocks the top one, but not the nested ones. Thus, if one of the nested
drivers ("dac" or "selinux") is still locked, it will cause a deadlock. If we always
surround nested locks with top lock, it is always secure. Because we have got top lock
before fork child libvirtd.
However, it is not always the case in the current code, We discovered this case:
the nested list obtained through the qemuSecurityGetNested() will be locked directly
for subsequent use, such as in virQEMUDriverCreateCapabilities(), where the nested list
is locked using qemuSecurityGetDOI, but the top one is not locked beforehand.
In this commit, we ensure that the top lock is acquired before the nested lock,
so during fork, it's not possible for another task to acquire the nested lock.
qemuDomainChangeNet: check virtio options for non-virtio models
In a domain created with an interface with a <driver> subelement,
the device contains a non-NULL virDomainVirtioOptions struct, even
for non-virtio NIC models. The subelement need not be present again
after libvirt restarts, or when the interface is passed to clients.
When clients such as virsh domif-setlink put back the modified
interface XML, the new device's virtio attribute is NULL. This may
fail the equality checks for virtio options in qemuDomainChangeNet,
depending on whether libvird was restarted since define or not.
This patch modifies the check for non-virtio models, to ignore olddev
value of virtio (assumed valid), and to allow either NULL or a struct
with all values ABSENT in the new virtio options.
Signed-off-by: Miroslav Los <mirlos@cisco.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
vmx: Do not require all ID data for VMWare Distributed Switch
Similarly to commit 2482801608b8 we can safely ignore connectionId,
portId and portgroupId in both XML and VMX as they are only a blind
pass-through between XML and VMX and an ethernet without such parameters
was spotted in the wild. On top of that even our documentation says the
whole VMWare Distrubuted Switch configuration is a best-effort.
virt-aa-helper: Allow RO access to /usr/share/edk2-ovmf
When binary version of edk2 is distributed, the files reside
under /usr/share/edk2-ovmf as can be seen from Gentoo's ebuild
[1]. Allow virt-aa-helper to generate paths under that dir.
1: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-firmware/edk2-ovmf-bin/edk2-ovmf-bin-202202.ebuild
Resolves: https://bugs.gentoo.org/911786 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
John Levon [Thu, 4 Jul 2024 12:59:46 +0000 (13:59 +0100)]
test_driver: support VIR_DOMAIN_AFFECT_LIVE in testUpdateDeviceFlags()
Pick up some more of the qemu_driver.c code so this function supports
both CONFIG and LIVE updates.
Note that qemuDomainUpdateDeviceFlags() passed vm->def to
virDomainDeviceDefParse() for the VIR_DOMAIN_AFFECT_CONFIG case, which
is technically incorrect; in the test driver code we'll fix this.
Signed-off-by: John Levon <john.levon@nutanix.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
conf: Fix out-of-bounds write during cleanup of virDomainNumaDefNodeDistanceParseXML
mem_nodes[i].ndistances is written outside the loop causing an out-of-bounds
write leading to heap corruption.
While we are at it, the entire cleanup portion can be removed as it can be
handled in virDomainNumaFree. One instance of VIR_FREE is also removed and
replaced with g_autofree.
This patch also adds a testcase which would be picked up by ASAN, if this
portion regresses.
Fixes: 742494eed8dbdde8b1d05a306032334e6226beea Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
tests: Move domainEventState initialization to qemuTestDriverInit
Under the test environment, driver->domainEventState is uninitialized. If a
disk gets dropped, it will attempt to queue an event which will cause a
segmentation fault. This crash does not occur during normal use.
This patch moves driver->domainEventState initialization from qemuhotplugtest
to qemuTestDriverInit in testutilsqemu (Credit goes to Michal Privoznik as he
had already provided the diff).
An additional test case is added to test dropping of disks with startupPolicy
set as optional.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Adam Julis [Mon, 1 Jul 2024 11:17:22 +0000 (13:17 +0200)]
qemuDomainChangeNet: forbid changing portgroup
Changing the postgroup attribute caused unexpected behavior.
Although it can be implemented, it has a non-trivial solution.
No requirement or use has yet been found for implementing this
feature, so it has been disabled for hot-plug.
Resolves: https://issues.redhat.com/browse/RHEL-7299 Signed-off-by: Adam Julis <ajulis@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
conf: Fix rawio/sgio checks for non-scsi hostdev devices
The current hostdev parsing logic sets rawio or sgio even if the hostdev type
is not 'scsi'. The rawio field in virDomainHostdevSubsysSCSI overlaps with
wwpn field in virDomainHostdevSubsysSCSIVHost, consequently setting a bogus
pointer value such as 0x1 or 0x2 from virDomainHostdevSubsysSCSIVHost's
point of view. This leads to a segmentation fault when it attempts to free
wwpn.
While setting sgio does not appear to crash, it shares the same flawed logic
as setting rawio.
Instead, we ensure these are set only after the hostdev type check succeeds.
This patch also adds two test cases to exercise both scenarios.
Fixes: bdb95b520c53f9bacc6504fc51381bac4813be38 Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add basic coverage of device update; for now, only support disk updates
until other types are needed or tested.
Signed-off-by: John Levon <john.levon@nutanix.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Thu, 27 Jun 2024 14:57:13 +0000 (16:57 +0200)]
qemu_capabilities: Retire QEMU_CAPS_VXHS
The support for VXHS device was removed in QEMU commit
v5.1.0-rc1~16^2~10. Since we require QEMU-5.2.0 at least there's
no QEMU that has the device and thus the corresponding capability
can be retired.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Thu, 27 Jun 2024 14:15:03 +0000 (16:15 +0200)]
qemu_capabilities: Drop version check for QEMU_CAPS_ENABLE_FIPS and QEMU_CAPS_NETDEV_USER
Now that the minimal required version of QEMU is 5.2.0 the
conditional setting of QEMU_CAPS_ENABLE_FIPS and
QEMU_CAPS_NETDEV_USER is effectively a dead code. Drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Fri, 28 Jun 2024 07:56:46 +0000 (09:56 +0200)]
qemu_domain: Set 'passt' net backend if 'default' is unsupported
It may happen that QEMU is compiled without SLIRP but with
support for passt. In such case it is acceptable to alter user
provided configuration and switch backend to passt as it offers
all the features as SLIRP.
Resolves: https://issues.redhat.com/browse/RHEL-45518 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Fri, 28 Jun 2024 07:53:10 +0000 (09:53 +0200)]
qemu_validate: Use domaincaps to validate supported net backend type
Now that the logic for detecting supported net backend types has
been moved to domain capabilities generation, we can just use it
when validating net backend type. Just like we do for device
models and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Fri, 28 Jun 2024 07:36:24 +0000 (09:36 +0200)]
conf: Accept 'default' backend type for <interface type='user'/>
After previous commits, domain capabilities XML reports basically
two possible values for backend type: 'default' and 'passt'.
Despite its misleading name, 'default' really means 'use
hypervisor's builtin SLIRP'. Since it's reported in domain
capabilities as a value accepted, make our parser and XML schema
accept it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If mgmt apps on top of libvirt want to make a decision on the
backend type for <interface type='user'/> (e.g. whether past is
supported) we currently offer them no way to learn this fact.
Domain capabilities were invented exactly for this reason. Report
supported net backend types there.
Now, because of backwards compatibility, specifying no backend
type (which translates to VIR_DOMAIN_NET_BACKEND_DEFAULT) means
"use hyperviosr's builtin SLIRP". That behaviour can not be
changed. But it may happen that the hypervisor has no support for
SLIRP. So we have to report it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since -netdev user can be disabled during QEMU compilation, we
can't blindly expect it to just be there. We need a capability
that tracks its presence.
For qemu-4.2.0 we are not able to detect the capability so do the
next best thing - assume the capability is there. This is
consistent with our current behaviour where we blindly assume the
capability, anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Jon Kohler [Thu, 27 Jun 2024 18:11:56 +0000 (11:11 -0700)]
qemu: fix switchover-ack regression for old qemu
When enabling switchover-ack on qemu from libvirt, the .party value
was set to both source and target; however, qemuMigrationParamsCheck()
only takes that into account to validate that the remote side of the
migration supports the flag if it is marked optional or auto/always on.
In the case of switchover-ack, when enabled on only the dst and not
the src, the migration will fail if the src qemu does not support
switchover-ack, as the dst qemu will issue a switchover-ack msg:
qemu/migration/savevm.c ->
loadvm_process_command ->
migrate_send_rp_switchover_ack(mis) ->
migrate_send_rp_message(mis, MIG_RP_MSG_SWITCHOVER_ACK, 0, NULL)
Since the src qemu doesn't understand messages with header_type ==
MIG_RP_MSG_SWITCHOVER_ACK, qemu will kill the migration with error:
qemu-kvm: RP: Received invalid message 0x0007 length 0x0000
qemu-kvm: Unable to write to socket: Bad file descriptor
Looking at the original commit [1] for optional migration capabilities,
it seems that the spirit of optional handling was to enhance a given
existing capability where possible. Given that switchover-ack
exclusively depends on return-path, adding it as optional to that cap
feels right.
[1] 61e34b08568 ("qemu: Add support for optional migration capabilities")
Fixes: 1cc7737f69e ("qemu: add support for qemu switchover-ack") Signed-off-by: Jon Kohler <jon@nutanix.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Avihai Horon <avihaih@nvidia.com> Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: YangHang Liu <yanghliu@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Michal Privoznik [Fri, 14 Jun 2024 11:18:25 +0000 (13:18 +0200)]
remote_daemon_dispatch: Unref sasl session when closing client connection
In ideal world, where clients close connection gracefully their
SASL session is freed in virNetServerClientDispose() as it's
stored in client->sasl. Unfortunately, if client connection is
closed prematurely (e.g. the moment virsh asks for credentials),
the _virNetServerClient member is never set and corresponding
SASL session is never freed. The handler is still stored in
client private data, so free it in remoteClientCloseFunc().
20,862 (288 direct, 20,574 indirect) bytes in 3 blocks are definitely lost in loss record 1,763 of 1,772
at 0x50390C4: g_type_create_instance (in /usr/lib64/libgobject-2.0.so.0.7800.6)
by 0x501BDAF: g_object_new_internal.part.0 (in /usr/lib64/libgobject-2.0.so.0.7800.6)
by 0x501D43D: g_object_new_with_properties (in /usr/lib64/libgobject-2.0.so.0.7800.6)
by 0x501E318: g_object_new (in /usr/lib64/libgobject-2.0.so.0.7800.6)
by 0x49BAA63: virObjectNew (virobject.c:252)
by 0x49BABC6: virObjectLockableNew (virobject.c:274)
by 0x4B0526C: virNetSASLSessionNewServer (virnetsaslcontext.c:230)
by 0x18EEFC: remoteDispatchAuthSaslInit (remote_daemon_dispatch.c:3696)
by 0x15E128: remoteDispatchAuthSaslInitHelper (remote_daemon_dispatch_stubs.h:74)
by 0x4B0FA5E: virNetServerProgramDispatchCall (virnetserverprogram.c:423)
by 0x4B0F591: virNetServerProgramDispatch (virnetserverprogram.c:299)
by 0x4B18AE3: virNetServerProcessMsg (virnetserver.c:135)
Resolves: https://issues.redhat.com/browse/RHEL-22574 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Mon, 24 Jun 2024 07:31:09 +0000 (09:31 +0200)]
virt-host-validate: Detect SEV-ES and SEV-SNP
With a simple cpuid (Section "E.4.17 Function
8000_001Fh—Encrypted Memory Capabilities" in "AMD64 Architecture
Programmer’s Manual Vol. 3") we can detect whether CPU is capable
of running SEV-ES and/or SEV-SNP guests. Report these in
virt-host-validate tool.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Mon, 24 Jun 2024 07:22:16 +0000 (09:22 +0200)]
virt-host-validate: Move AMD SEV into a separate func
The code that validates AMD SEV is going to be expanded soon.
Move it into its own function to avoid lengthening
virHostValidateSecureGuests() where the code lives now, even
more.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 25 Jun 2024 08:51:55 +0000 (10:51 +0200)]
qemu_validate: Use domaincaps to validate supported launchSecurity type
Now that the logic for detecting supported launchSecurity types
has been moved to domain capabilities generation, we can just use
it when validating launchSecurity type. Just like we do for
device models and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 25 Jun 2024 08:45:43 +0000 (10:45 +0200)]
qemu: Fill launchSecurity in domaincaps
The inspiration for these rules comes from
qemuValidateDomainDef().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 25 Jun 2024 07:53:57 +0000 (09:53 +0200)]
domcaps: Report launchSecurity
In order to learn what types of <launchSecurity/> are supported
users can turn to domain capabilities and find <sev/> and
<s390-pv/> elements. While these may expose some additional info
on individual launchSecurity types, we are lacking clean
enumeration (like we do for say device models). And given that
SEV and SEV SNP share the same basis (info found under <sev/> is
applicable to SEV SNP too) we have no other way to report SEV SNP
support.
Therefore, report supported launchSecurity types in domain
capabilities.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Fri, 21 Jun 2024 12:00:32 +0000 (14:00 +0200)]
qemu_capabilities: Probe SEV capabilities even for QEMU_CAPS_SEV_SNP_GUEST
While it's very unlikely to have QEMU that supports SEV-SNP but
doesn't support plain SEV, for completeness sake we ought to
query SEV capabilities if QEMU supports either. And similarly to
QEMU_CAPS_SEV_GUEST we need to clear the capability if talking to
QEMU proves SEV is not really supported.
This in turn removes the 'sev-snp-guest' capability from one of
our test cases as Peter's machine he uses to refresh capabilities
is not SEV capable. But that's okay. It's consistent with
'sev-guest' capability.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 25 Jun 2024 07:58:43 +0000 (09:58 +0200)]
qemuxmlconftest; Explicitly enable QEMU_CAPS_SEV_SNP_GUEST for "launch-security-sev-snp"
Soon, QEMU_CAPS_SEV_SNP_GUEST is going to be dependant on more
than plain presence of "sev-snp-guest" object in QEMU. Explicitly
enable the capability for "launch-security-sev-snp" test so that
we can continue testing cmd line and xml2xml.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Rayhan Faizel [Thu, 6 Jun 2024 14:27:51 +0000 (19:57 +0530)]
qemu_block: Validate number of hosts for iSCSI disk device
An iSCSI device with zero hosts will result in a segmentation fault. This patch
adds a check for the number of hosts, which must be one in the case of iSCSI.
Jon Kohler [Mon, 24 Jun 2024 17:38:51 +0000 (10:38 -0700)]
qemu: add support for qemu switchover-ack
Add plumbing for QEMU's switchover-ack migration capability, which
helps lower the downtime during VFIO migrations. This capability is
enabled by default as long as both the source and destination support
it.
Note: switchover-ack depends on the return path capability, so this may
not be used when VIR_MIGRATE_TUNNELLED flag is set.
Extensive details about the qemu switchover-ack implementation are
available in the qemu series v6 cover letter [1] where the highlight is
the extreme reduction in guest visible downtime. In addition to the
original test results below, I saw a roughly ~20% reduction in downtime
for VFIO VGPU devices at minimum.
=== Test results ===
The below table shows the downtime of two identical migrations. In the
first migration swithcover ack is disabled and in the second it is
enabled. The migrated VM is assigned with a mlx5 VFIO device which has
300MB of device data to be migrated.
Switchover ack gives a roughly 4.5 times improvement in downtime.
The 1480ms difference is time that is used for resource allocation for
the VFIO device in the destination. Without switchover ack, this time is
spent when the source VM is stopped and thus the downtime is much
higher. With switchover ack, the time is spent when the source VM is
still running.
Jiri Denemark [Wed, 12 Jun 2024 14:44:28 +0000 (16:44 +0200)]
qemu: Fix migration with disabled vmx-* CPU features
When starting a domain on a host which lacks a vmx-* CPU feature which
is expected to be enabled by the CPU model specified in the domain XML,
libvirt properly marks such feature as disabled in the active domain
XML. But migrating the domain to a similar host which lacks the same
vmx-* feature will fail with libvirt reporting the feature as missing.
This is because of a bug in the hack ensuring backward compatibility
libvirt running on the destination thinks the missing feature is
expected to be enabled.
https://issues.redhat.com/browse/RHEL-40899
Fixes: v10.1.0-85-g5fbfa5ab8a Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Jonathon Jongsma [Wed, 12 Jun 2024 17:18:49 +0000 (12:18 -0500)]
qemu: Don't specify vfio-pci.ramfb when ramfb is false
Commit 7c8e606b64c73ca56d7134cb16d01257f39c53ef attempted to fix
the specification of the ramfb property for vfio-pci devices, but it
failed when ramfb is explicitly set to 'off'. This is because only the
'vfio-pci-nohotplug' device supports the 'ramfb' property. Since we use
the base 'vfio-pci' device unless ramfb is enabled, attempting to set
the 'ramfb' parameter to 'off' this will result in an error like the
following:
error: internal error: QEMU unexpectedly closed the monitor
(vm='rhel'): 2024-06-06T04:43:22.896795Z qemu-kvm: -device
{"driver":"vfio-pci","host":"0000:b1:00.4","id":"hostdev0","display":"on
","ramfb":false,"bus":"pci.7","addr":"0x0"}: Property 'vfio-pci.ramfb'
not found.
This also more closely matches what is done for mdev devices.
Laine Stump [Fri, 21 Jun 2024 12:17:58 +0000 (08:17 -0400)]
network: add more firewall test cases
This patch adds some previously missing test cases that test for
proper firewall rule creation when the following are included in the
network definition:
* <forward dev='blah'>
* no forward element (an "isolated" network)
* nat port range when only ipv4 is nat-ed
* nat port range when both ipv4 & ipv6 are nated
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Laine Stump <laine@redhat.com>
Laine Stump [Wed, 12 Jun 2024 19:25:46 +0000 (15:25 -0400)]
tests: fix broken nftables test data so that individual tests are successful
When the chain names and table name used by the nftables firewall
backend were changed in commit 958aa7f274904eb8e4678a43eac845044f0dcc38, I forgot to change the test
data file base.nftables, which has the extra "list" and "add
chain/table" commands that are generated for the first test case of
networkxml2firewalltest.c. When the full set of tests is run, the
first test will be an iptables test case, so those extra commands
won't be added to any of the nftables cases, and so the data in
base.nftables never matches, and the tests are all successful.
However, if the test are limited with, e.g. VIR_TEST_RANGE=2 (test #2
will be the nftables version of the 1st test case), then the commands
to add nftables table/chains *will* be generated in the test output,
and so the test will fail. Because I was only running the entire test
series after the initial commits of nftables tests, I didn't notice
this. Until now.
base.nftables has now been updated to reflect the current names for
chains/table, and running individual test cases is once again
successful.
Fixes: 958aa7f274904eb8e4678a43eac845044f0dcc38 Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Laine Stump <laine@redhat.com>
Adam Julis [Fri, 21 Jun 2024 16:16:55 +0000 (18:16 +0200)]
qemuDomainDiskChangeSupported: Fill in missing check
The attribute 'discard_no_unref' of <disk/> is not allowed to be
changed while the virtual machine is running.
Resolves: https://issues.redhat.com/browse/RHEL-37542 Signed-off-by: Adam Julis <ajulis@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Fri, 7 Jun 2024 16:46:34 +0000 (12:46 -0400)]
network: allow for forward dev to be a transient interface
A user reported that if they set <forward mode='nat|route' dev='blah'>
starting the network would fail if the device 'blah' didn't already
exist.
This is caused by using "iif" and "oif" in nftables rules to check for
the forwarding device - these two commands work by saving the named
interface's ifindex (an unsigned integer) when the rule is added, and
comparing it to the ifindex associated with the packet's path at
runtime. This works great if the interface both 1) exists when the
rule is added, and 2) is never deleted and re-created after the rule
is added (since it would end up with a different ifindex).
When checking for the network's bridge device, it is okay for us to
use "iif" and "oif", because the bridge device is created before the
firewall rules are added, and will continue to exist until just after
the firewall rules are deleted when the network is shutdown.
But since the forward device might be deleted/re-added during the
lifetime of the network's firewall rules, we must instead us "oifname"
and "iifname" - these are much less efficient than "Xif" because they
do a string compare of the interface's name rather than just comparing
two integers (ifindex), but they don't require the interface to exist
when the rule is added, and they can properly cope with the named
interface being deleted and re-added later.
Fixes: a4f38f6ffe6a9edc001d18890ccfc3f38e72fb94 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Fri, 21 Jun 2024 08:37:35 +0000 (10:37 +0200)]
domain_validate: Add missing 'break' in virDomainDefLaunchSecurityValidate()
A few commits ago (v10.4.0-101-gc65eba1f57) I've introduced
virDomainDefLaunchSecurityValidate() and a switch() statement in
it. Some cases are empty but are lacking 'break' statement which
is not valid. Provide missing 'break' statement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Thu, 13 Jun 2024 12:35:57 +0000 (14:35 +0200)]
qemu_firmware: Pick the right firmware for SEV-SNP guests
The firmware descriptors have 'amd-sev-snp` feature which
describes whether firmware is suitable for SEV-SNP guests.
Provide necessary implementation to detect the feature and pick
the right firmware if guest is SEV-SNP enabled.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Wed, 12 Jun 2024 13:22:00 +0000 (15:22 +0200)]
qemu: Build cmd line for SEV-SNP
Pretty straightforward as qemu has 'sev-snp-guest' object which
attributes maps pretty much 1:1 to our XML model. Except for
@vcek where QEMU has 'vcek-disabled`, an inverted boolean, while
we model it as virTristateBool. But that's easy to map too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Tue, 11 Jun 2024 09:58:41 +0000 (11:58 +0200)]
conf: Introduce SEV-SNP support
SEV-SNP is an enhancement of SEV/SEV-ES and thus it shares some
fields with it. Nevertheless, on XML level, it's yet another type
of <launchSecurity/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Michal Privoznik [Mon, 10 Jun 2024 14:17:26 +0000 (16:17 +0200)]
qemu_monitor: Allow querying SEV-SNP state in 'query-sev'
In QEMU commit v9.0.0-1155-g59d3740cb4 the return type of
'query-sev' monitor command changed to accommodate SEV-SNP. Even
though we currently support launching plain SNP guests, this will
soon change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>