Andrea Bolognani [Mon, 12 Aug 2024 15:07:54 +0000 (17:07 +0200)]
security: Allow skipping locking when labeling lock files
This is needed when migrating a guest that has persistent TPM
state: relabeling (which implies locking) needs to happen
before the swtpm process is started on the destination host,
but the lock file won't be released by the swtpm process
running on the source host before a handshake with the target
process has happened, creating a catch-22 scenario.
In order to make migration possible, make it so that locking
for lock files can be explicitly skipped. All other state
files are handled as usual.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Andrea Bolognani [Fri, 30 Aug 2024 12:25:25 +0000 (14:25 +0200)]
security: Always forget labels for TPM state directory
In the case of outgoing migration, we avoid restoring the
remembered labels for the TPM state directory because doing so
would risk cutting off storage access for the target node.
Even in that case though, we should still forget (unref) the
remembered labels: if we don't, the source node will keep
thinking that the state directory is in use.
Note that this change only affects the SELinux driver because
the DAC driver doesn't currently implement label remembering
for TPM state at all.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Peter Krempa [Fri, 2 Aug 2024 13:23:44 +0000 (15:23 +0200)]
qemu: migration: Don't remember seclabel for images shared from current host
In case when the user exports images from current host and there is an
incoming migration from a remote host, security label remembering would
be possible but would attempt to remember the label allowing access to
the image as the image is already used by a VM on remote host.
To prevent remembering the wrong label, we'll skip the remembering of
the label for any shared resource, so that the code behaves identically
regardless of how the image is accessed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Fri, 2 Aug 2024 13:23:43 +0000 (15:23 +0200)]
storage_source: Add field for skipping seclabel remembering
In case of incoming migration where a local directory is shared to other
hosts we'll need to avoid seclabel remembering as the code would
remember the seclabel already allowing access to the image.
As the decision requires a lot of information not available in the
security driver it would either require plumbing in unpleasant callbacks
able to pass in the data or alternatively we can mark this in the
'virStorageSource' struct.
This patch chose to do the latter approach by adding a field called
'seclabelSkipRemember' which will be filled before starting the process
in cases when it will be required.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Fri, 2 Aug 2024 13:23:42 +0000 (15:23 +0200)]
security_(dac|selinux): Unref remembered security labels on outgoing migration
When 'qemuSecurityRestoreAllLabel' is called on outgoing migration it
skips the actual relabeling part of the images in dac/selinux drivers in
order to avoid cutting off access to the image.
As shared filesystems don't really support the trusted XATTR groups,
remembering of security labels never worked on those paths so we never
actually had remembered seclabels for images that could be migrated.
With recent changes we now support migration from local storage to
remote in case the admin declares it as shared. This means that in case
when the VM is started on local storage we'd actually store seclabels,
but when migrating out the XATTRs remembering the seclabels would not
actually be unref'd and thus the seclabels would leak.
As we can't know whether a remote host will be able to use the XATTRs or
not (but really it won't) and at the same time the destination side of
migration will actually call 'qemuSecuritySetAllLabel' setting/refing
it's own seclabels we really need to unref them on our side.
This patch adds the appropriate *RecallLabel() calls on the code paths
in which relabelling is skipped due to migration.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Fri, 2 Aug 2024 13:23:41 +0000 (15:23 +0200)]
virSecuritySELinuxRestoreImageLabelInt: Move FD image relabeling after 'migrated' check
Reorganize the code so that the 'migrated' flag isn't checked multiple
times and thus that it's more obvious what is happening when the
'migrated' flag is asserted.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Up until this point, we have avoided setting labels for
incoming migration when the TPM state is stored on a shared
filesystem. This seems to make sense, because since the
underlying storage is shared surely the labels will be as
well.
There's one problem, though: when a guest is migrated, the
SELinux context for the destination process is different from
the one of the source process.
We haven't hit any issues with the current approach so far
because NFS doesn't support SELinux, so effectively it doesn't
matter whether relabeling happens or not: even if the SELinux
contexts of the source and target processes are different,
both will be able to access the storage.
Now that it's possible for the local admin to manually mark
exported directories as shared filesystems, however, things
can get problematic.
Consider the case in which one host (mig-one) exports its
local filesystem /srv/nfs/libvirt/swtpm via NFS, and at the
same time bind-mounts it to /var/lib/libvirt/swtpm; another
host (mig-two) mounts the same filesystem to the same
location, this time via NFS. Additionally, in order to
allow migration in both directions, on mig-one the
/var/lib/libvirt/swtpm directory is listed in the
shared_filesystems qemu.conf option.
When migrating from mig-one to mig-two, things work just fine;
going in the opposite direction, however, results in an error:
# virsh migrate cirros qemu+ssh://mig-one/system
error: internal error: QEMU unexpectedly closed the monitor (vm='cirros'):
qemu-system-x86_64: tpm-emulator: Setting the stateblob (type 1) failed with a TPM error 0x1f
qemu-system-x86_64: error while loading state for instance 0x0 of device 'tpm-emulator'
qemu-system-x86_64: load of migration failed: Input/output error
This is because the directory on mig-one is considered a
shared filesystem and thus labeling is skipped, resulting in
a SELinux denial.
The solution is quite simple: remove the check and always
relabel. We know that it's okay to do so not just because it
makes the error seen above go away, but also because no such
check currently exists for disks and other types of persistent
storage such as NVRAM files, which always get relabeled.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
If the local admin has explicitly declared that a certain
filesystem is to be considered shared, we should treat it as
such.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virFileIsSharedFS() is the function that ultimately decides
whether a filesystem should be considered shared, but the list
of manually configured shared filesystems is part of the QEMU
driver's configuration, so we need to pass the information
through several layers in order to make use of it.
Note that with this change the list is propagated all the way
through, but its contents are still ignored, so the behavior
remains the same for now.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
As explained in the comment, this can help in scenarios where
a shared filesystem can't be detected as such by libvirt, by
giving the admin the opportunity to provide this information
manually.
https://issues.redhat.com/browse/RHEL-35752
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
libxl_conf: Add check for unsupported graphics type
libxlMakeVfb always succeeds regardless of if the graphics type is
actually supported or not.
libxl_defbool_val is called in libxlMakeBuildInfoVfb which besides returning
the boolean value of the defbool also has an assertion that the defbool value
is not set to default. It is possible to fail this assertion if an
unsupported graphics type is used. In libxlMakeVfb, the VNC and SDL enable
defbools are still left in their default state if the graphics type falls
outside the two, which leads to this issue.
This patch adds a check to reject graphics types outside of SDL, VNC, and SPICE
very early on in libxlMakeVfb. As a safeguard, we also initialize both vnc
enable and sdl enable defbools as false early.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
libxl_conf: Fix config generation for multiple serial devices
Currently, an array of libxl_string_list (char **) or in other words,
a triple char pointer is initialized. This is dereferenced to a char ** type
and stored in serial_list, which is NULL at this point. There is an attempt to
reference an element of this serial_list when making a call to
libxlMakeChrdevStr which causes a segmentation fault.
To fix this, we simply allocate an array of char * instead of
libxl_string_list.
This patch also adds testcases to extend coverage over both single serial and
multiple serial cases.
Signed-off-by: Rayhan Faizel <rayhan.faizel@gmail.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 07:51:20 +0000 (09:51 +0200)]
qemu: Introduce and wire in 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES'
The new 'VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES' migration
parameter allows users of migration to pass in a list of disks where
zero-detection (which avoids transferring the zeroed-blocks) should be
enabled for the migration connection. This comes at the cost of extra
CPU cycles needed to check each block if it's all-zero.
This is useful for storage backends where information about the
allocation state of a block is not available and thus without this the
image would become fully allocated on the destination.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Mon, 30 Sep 2024 08:38:58 +0000 (10:38 +0200)]
qemu: migration: Extract validation of disk target list
The migration code is checking the disk list provided via
VIR_MIGRATE_PARAM_MIGRATE_DISKS against existing disks. Extract it to a
helper function as we'll be passing another list of disk targets soon.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 11:56:06 +0000 (13:56 +0200)]
qemu: migration: Avoid use of 'nmigration_disks'
'migration_disks' is a NULL-terminated string list, so the code can be
converted to either iterate the string-list, use existing accessors or
check the presence of the pointers instead of checking the count.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 11:40:15 +0000 (13:40 +0200)]
qemu: migration: Don't log 'nmigrate_disks'
The actual number of disks to migrate is not important. The presence of
disks to migrate can be inferred from presence of the 'migrate_disks'
pointer which is logged.
Since 'nmigrate_disks' will eventually be removed remove the logging
right now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 11:33:40 +0000 (13:33 +0200)]
qemuMigrationSrcBeginPhaseBlockDirtyBitmaps: Use qemuMigrationAnyCopyDisk()
The function open-coded the checking whether a disk is being migrated
with non-shared storage and did so badly (not taking into account if
user doesn't explicitly provide list of disks to migrate).
Use the existing helper instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 11:01:22 +0000 (13:01 +0200)]
virTypedParamsGetStringList: Ensure that returned array is NULL if there are no matching fields
'virTypedParamsGetStringList' fills the returned array only with string
parameters with matching name. The filtering code though leaves the
possibility that all items are filtered out but the return array is
still (over)allocated.
Since 'virTypedParamsFilter()' now also allows filtering by type we can
move the filtering there ensuring that we always allocate the right
number of elements and more importantly the returned array will be NULL
if none elements are present.
Rework the code and adjust docs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 09:15:06 +0000 (11:15 +0200)]
virTypedParamsGetStringList: Refactor and adjust docs
Use automatic freeing, declare one variable per line and return early
when possible. As this is an internal helper there's no need to check
that the caller passed non-NULL @values.
Modify the documentation to be accurate and warn callers to not free the
strings just the array.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 08:55:00 +0000 (10:55 +0200)]
virTypedParamsFilter: Adjust return type and docs
The 'virTypedParamsFilter' function can't fail and thus it never returns
negative values. Change the return type to 'size_t' and adjust callers
to not check the return value for being negative.
Adjust the docs to hilight this and also the fact that the filtered
typed param list returned via @ret is not a deep copy and thus callers
must not use the common function to free it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Mon, 30 Sep 2024 12:16:58 +0000 (14:16 +0200)]
qemu: migration: Pre-create QCOW2 images for non-shared storage with 0 allocation
Specify that the <allocation> parameter for the newly-created qcow2
image is 0 so that only metadata gets preallocated. Otherwise the
storage driver code instructs qemu to use 'fallocate' preallocation mode
and considers the image fully allocated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 07:07:22 +0000 (09:07 +0200)]
qemu: blockjob: Clean out disk mirror data after concluding the job
The 'disk->mirrorJob' and 'disk->mirrorState' fields need to be cleared
after a blockjob, but should be kept around while 'disk->mirror' is
still in place. As 'disk->mirror' is cleared only after conclusion of
the job in 'qemuBlockJobEventProcessConcluded()' we should be resetting
them only afterwards.
Move the code later, but since the job is unregistered from the disk we
need to store the pointer to the disk before concluding the job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Fri, 27 Sep 2024 06:57:08 +0000 (08:57 +0200)]
qemu: blockjob: Update 'mirror' of a copy job before removing images
When concluding a job with a 'mirror' we first unplugged the appropriate
no-longer used images from qemu and then updated the definition.
Normally this wouldn't be a problem because for any other thread this is
done under the VM lock thus atomic. Unfortunately though, the AppArmor
security backend is using a VM XML to pass data to the helper process
and the state of the definition at that point was unsuitable to format a
valid XML thus making 'virt-aa-helper' report parsing failure.
Since we're removing the images the proper state of the VM definition
indeed should not include the mirror element any more at the point when
the images are removed.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/601 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Peter Krempa [Tue, 24 Sep 2024 13:27:23 +0000 (15:27 +0200)]
testutilsqemuschema: Support 'unstable' feature in QMP schema validator
The 'unstable' feature is present on any schema member which was not yet
finalized in qemu. Use it to refuse such fields/commands in qemu as they
are possibly subject to change.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Laine Stump [Mon, 30 Sep 2024 15:30:38 +0000 (11:30 -0400)]
qemu: fix regression in update-device for interfaces
Commit a37bd2a15b8f2e7aa09519c86fe1ba1e59ce113f eliminated a failure
to update *any* change in an interface that was connected via a
network that consisted of a pool of VFs using macvtap passthrough
mode. Unfortunately it caused a regression that results in failure to
update changes to bandwidth/vlan/trustGuestRxFilters in any interface
connected via a network that uses a bridge to connect tap devices.
This fixes that problem by narrowing the usage of the fix in the
earlier patch to only be done in the case that the the interface is
connected via a macvtap+passthrough network.
Signed-off-by: Laine Stump <laine@redhat.com> Fixes: a37bd2a15b8f2e7aa09519c86fe1ba1e59ce113f Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Andrea Bolognani [Fri, 27 Sep 2024 12:47:21 +0000 (14:47 +0200)]
qemu: Look for qemu-bridge-helper in more directories
Commit 0caacf47d7b423db9126660fb0382ed56cd077c1 recently
made it so the new path used for qemu-bridge-helper in Debian
would be allowed, but the logic used to actually figure out
the complete path for the helper was not updated accordingly.
Ján Tomko [Wed, 14 Aug 2024 15:55:17 +0000 (17:55 +0200)]
ci: adapt to 'dtrace' package split
Fedora has decided to separate dtrace out of the systemtap-sdt-devel
package: https://fedoraproject.org/wiki/Changes/Separate_dtrace_package
Similarly, these are split in OpenSUSE Tumbleweed, however in a
backward-compatbile way:
https://build.opensuse.org/package/show/openSUSE:Factory/systemtap
Require the new 'systemtap' package mapping, as well as the old
'dtrace'.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The model was defined with two CPU features that cannot be explicitly
configured in QEMU (it knows the MSR bits, but there's no name
associated with them). The features should have never existed in the CPU
map. While removing them from the list of features and existing CPU
models is not trivial (to avoid compatibility issues), we can at least
fix the SierraForest CPU model added in this release cycle.
The rest will be handled later in a separate series.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Wed, 18 Sep 2024 13:32:45 +0000 (15:32 +0200)]
qemu: Provide sane default for dump_guest_core
QEMU uses Linux extensions to madvise() to include/exclude guest
memory from core dump. These are obviously not available
everywhere. Currently, users have two options:
1) configure <memory dumpCore=''/> in domain XML, or
2) configure dump_guest_core in qemu.conf
While these work, they may harm user experience as "things just
don't work" out of the box. Provide sane default in
virQEMUDriverConfigNew() so neither of two options is required.
To have predictable results in tests, explicitly set
cfg->dumpGuestCore to false in qemuTestDriverInit() (which
creates cfg object for tests).
qemu: Use per-domain private memoryBackingDir for new memory backends
The function qemuGetMemoryBackingPath() does not need the @def any more
and priv->memoryBackingDir can be used instead of constructing the path
by calling qemuGetMemoryBackingDomainPath().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 24 Sep 2024 07:32:22 +0000 (09:32 +0200)]
meson: Sort values reported in summary()
So far the only sorted summary() is list of detected libraries.
Other sections like hypervisor, storage, security drivers and
misc are in random order. Sort them alphabetically.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Michal Privoznik [Tue, 24 Sep 2024 07:26:43 +0000 (09:26 +0200)]
meson: Restore alphabetical order of reported libraries
One of previous commits introduced json-c library and reports it
in the summary at the end. However, we like the list to be sorted
alphabetically which is not the case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Ján Tomko [Wed, 14 Aug 2024 15:50:38 +0000 (17:50 +0200)]
nss: convert findMACs to use json-c
While the parsing is still done by 1K buffers, the results
are no longer filtered during the parsing, but the whole JSON
has to live in memory at once, which was also the case before
the NSS plugin dropped its dependency on libvirt_util.
Also, the new parser might be more forgiving of missing elements.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Ján Tomko [Wed, 14 Aug 2024 13:38:23 +0000 (15:38 +0200)]
nss: convert findLeases to use json-c
While the parsing is still done by 1K buffers, the results
are no longer filtered during the parsing, but the whole JSON
has to live in memory at once, which was also the case before
the NSS plugin dropped its dependency on libvirt_util.
Also, the new parser might be more forgiving of missing elements.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Ján Tomko [Wed, 14 Feb 2024 11:51:36 +0000 (12:51 +0100)]
tests: switch to compact empty JSON object formatting
Some earlier versions of json-c format empty elements differently.
Run the tests who use the pretty formatting for readability and
diffability through a function that unifies the output.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Ján Tomko [Thu, 15 Feb 2024 15:21:23 +0000 (16:21 +0100)]
util: json: introduce virJSONStringPrettifyBlanks
A horribly named function for unifying formatting when pretty-printing
empty JSON arrays and objects. Useful for having stable test output
even if different JSON libraries format these differently.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Laine Stump [Fri, 9 Aug 2024 01:44:00 +0000 (21:44 -0400)]
util: use uint32 instead of char[4] for several virSocketAddrIPv4 operations
These 3 functions are easier to understand, and more efficient, when
the IPv4 address is viewed as a uint32 rather than an array of bytes.
virsocketAddrGetIPv4Addr() has bothered me for a long time - it was
doing ntohl of the address into a temporary uint32, and then a loop
one-by-one swapping the order of all the bytes back to network
order. Of course this only works as described on little-endian
architectures - on big-endian architectures the first assignment won't
swap the bytes' ordering, but the loop assumes the bytes are now in
little-endian order and "swaps them back", so the result will be
incorrect. (Do we not support any big-endian targets that would have
exposed this bug long before now??)
virSocketAddrCheckNetmask() was checking each byte of the two
addresses individually, when it could instead just do the operation
once on the full 32 bit values.
virSocketGetRange() was checking for "range > 65535" by seeing if the
first 2 bytes of the start and end were different, and then doing
arithmetic combining the lower two bytes (along with necessary bit
shifting to account for network byte order) to determine the exact
size of the range. Instead we can just get the ntohl of start & end,
and do the math directly.
Laine Stump [Fri, 9 Aug 2024 01:09:43 +0000 (21:09 -0400)]
util: make virSocketAddrIPv4 a union
virSocketAddrIPv4 is a type used only internally by
virsocketaddr.c. It is defined to be a character array, which leads to
multiple occurences of extra bit fiddling and byte swapping for no
good reason (except to confuse).
An IPv4 address is really just a uint32_t with the bytes in network
order, which is exactly the type of the s_addr member of the
sockaddr_in that is a part of the publicly consumed struct
virSocketAddr, and that we are copying in and out of a
virSocketAddrIPv4. Sometimes it's simpler to just treat it as a
network-order uint32_t, so let's make our virSocketAddrIPv4 a union
that has both an unsigned char bytes[4] (for the times when we need to
look one byte at a time) and a uint32_t val (for the times when it's
simpler to treat it as a single value).
For now we just change all the uses from, e.g. x[i] to x.bytes[y];
an upcoming patch will simplify some of the code to remove loops by
using x.val instead of x.bytes when appropriate.
Laine Stump [Sun, 4 Aug 2024 04:35:52 +0000 (00:35 -0400)]
util: fix virSocketAddrMask() when source and result are the same object
Many years ago (2011), virSocketAddrMask() had caused a bug by failing
to initialize an IPv6-specific field in the result virSocketAddr. This
was fixed by memset(0)ing the entire result (*network) at the
beginning of the function (thus making sure anything and everything
was initialized).
The problem is that virSocketAddrMask() has a comment above it that
says that the source (addr) and destination (network) arguments can
point to the same virSocketAddr. But in that case, the
memset(*network, 0) at the top of the function is actually doing a
memset(*addr, 0), and so there is nothing left for all the assignments
to copy except a giant field of 0's.
Fortunately in the 13 years since the memset was added, nobody has
ever called virSocketAddrMask() with addr and network being the same.
This patch makes the code agree with the comment by copying/masking
into a local virSocketAddr (which is initialized to all 0) and then
copying that to *network after it's finished assigning things from
addr.
Fixes: ba08c5932e556aa4f5101357127a6224c40e5ebe Signed-off-by: Laine Stump <laine@redhat.com>
Laine Stump [Fri, 13 Sep 2024 00:58:30 +0000 (20:58 -0400)]
qemu: rework needBridgeChange/needReconnect decisions in qemuDomainChangeNet()
This patch simplifies (?) the of qemuDomainChangeNet() code while
fixing some incorrect decisions about exactly when it's necessary to
re-attach an interface's bridge device, or to fail the device update
(needReconnect[*]) because the type of connection has changed (or
within bridge and direct (macvtap) type because some attribute of the
connection has changed that can't actually be modified after the
tap/macvtap device of the interface is created).
Example 1: it's pointless to require the bridge device to be
reattached just because the interface has been switched to a different
network (i.e. the name of the network is different), since the new
network could be using the same bridge as the old network (very
uncommon, but technically possible). Instead we should only care if
the name of the *bridge device* changes (or if something in
<virtualport> changes - see Example 3).
Example 2: wrt changing the "type" of the interface, a change should
be allowed if old and new type both used a bridge device (whether or
not the name of the bridge changes), or if old and new type are both
"direct" *and* the device being linked and macvtap mode remain the
same. Any other change in interface type cannot be accommodated and
should be a failure (i.e. needReconnect).
Example 3: there is no valid reason to fail just because the interface
has a <virtualport> element - the <virtualport> could just say
"type='openvswitch'" in both the before and after cases (in which case
it isn't a change by itself, and so is completely acceptable), and
even if the interfaceid changes, or the <virtualport> disappears
completely, that can still be reconciled by simply re-attaching the
bridge device. (If, on the other hand, the modified <virtualport> is
for a type='direct' interface, we can't domodify that, and so must
fail (needReconnect).)
(I tried splitting this into multiple patches, but they were so
intertwined that the intermediate patches made no sense.)
[*] "needReconnect" was a flag added to this function way back in
2012, when I still believed that QEMU might someday support connecting
a new & different device backend (the way the virtual device connects
to the host) to an already existing guest netdev (the virtual device
as it appears to the guest). Sadly that has never happened, so for the
purposes of qemuDOmainChangeNet() "needReconnect" is equivalent to
"fail".
Resolves: https://issues.redhat.com/browse/RHEL-7036 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Sun, 15 Sep 2024 21:56:02 +0000 (17:56 -0400)]
qemu: replace open-coded remove/attach bridge with virNetDevTapReattachBridge()
The new function does what the old qemuDomainChangeNetbridge() did
manually, except that:
1) the new function supports changing from a bridge of one type to
another, e.g. from a Linux host bridge to an OVS
bridge. (previously that wasn't handled)
2) the new function doesn't emit audit log messages. This is actually
a good thing, because the old code would just log a "detach"
followed immediately by "attach" for the same MAC address, so it's
essentially a NOP. (the audit logs don't have any more detailed
info about the connection - just the VM name and MAC address, so it
makes no sense to log the detach/attach pair as it's not providing
any information).
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Tue, 17 Sep 2024 17:28:04 +0000 (13:28 -0400)]
util: don't return early from virNetDevTapReattachBridge() if "force" is true
It can be useful to force an interface to be detached/reattached from
its bridge even if it's the same bridge - possibly something like the
virtualport profileID has changed, and a detach/attach cycle will get
it connected with the new profileID.
The one and only current use of virNetDevTapReattachBridge() sets
force to false, to preserve current behavior. An upcoming patch will
use it with force set to true.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Mon, 9 Sep 2024 19:35:27 +0000 (15:35 -0400)]
qemu: prevent unnecessarily failing live interface update
Attempts to use update-device to modify just the link state of a guest
interface were failing due to a supposed attempt to modify something
in the interface that can't be modified live (even though the only
thing that was changing was the link state, which *can* be modified
live).
It turned out that this failure happened because the guest interface
in question was type='network', and the network in question was a
'direct' network that provides each guest interface with one device
from a pool of network devices. As a part of qemuDomainChangeNet() we
would always allocate a new port from the network driver for the
updated interface definition (by way of calling
virDomainNetAllocateActualDevice(newdev)), and this new port (ie the
ActualNetDef in newdev) would of course be allocated a new host device
from the pool (which would of course be different from the one
currently in use by the guest interface (in olddev)). Because direct
interfaces don't support changing the host device in a live update,
this would cause the update to fail.
The solution to this is to realize that as long as the interface
doesn't get switched to a different network as a part of the update,
the network port information (ie the ActualNetDef) will not change as
a part of updating the guest interface itself. So for sake of
comparison we can just point the newdev at the ActualNetDef of olddev,
and then clear out one or the other when we're done (to avoid a double
free or, more likely, attempt to reference freed memory).
(If, on the other hand, the name of the network has changed, or if the
interface type has changed to type='network' from something else, then
we *do* need to allocate a new port (actual device) from the network
driver (as we used to do in all cases when the new type was
'network'), and also indicate that we'll need to replace olddev in the
domain with newdev (because either of these changes is major enough
that we shouldn't just try to fix up olddev)
Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036 Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is
configured via "tls='yes'" attribute added to the "<source>" element
instead, so this prevents potential misconfig as the value would be
silently accepted.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/412 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Fri, 13 Sep 2024 08:33:47 +0000 (10:33 +0200)]
qemu: Use the new chardev backend JSON props generator also in the monitor
Now that we have a unified generator of chardev backend which is also
validated against the QMP schema we can replace the old generator with
it.
This patch modifies the monitor code to take virJSONValue 'props'
instead of the chardev definition and adds the conversion from the
chardev definition to JSON on higher levels.
The monitor code now also attempts to extract the returned 'pty' if
returned from qemu, so higher level code needs to report the error if
the path is needed and missing.
The current monitor generator is for now abandoned in place and will be
removed later.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>