Erik Skultety [Mon, 28 Aug 2023 08:47:32 +0000 (10:47 +0200)]
ci: lcitool: Add libvirt-tck+runtime deps list
This change was supposed to be part of commit 120a674f , but was
proposed against the libvirt TCK project instead. Since we're running
the TCK test suite as part of this project, this is the right place for
the TCK runtime deps list config.
Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
K Shiva Kiran [Thu, 31 Aug 2023 18:16:57 +0000 (23:46 +0530)]
virsh: Fix net-desc --config output
Fixes the following bug:
Command: `net-desc --config [--title] my_network`
Expected Output: Title/Description of persistent config
Output: Title/Description of live config
This was caused due to the usage of a single `flags` variable in
`virshGetNetworkDescription()` which ended up in a wrong enum being
passed to `virNetworkGetMetadata()` (enum being that of LIVE instead of
CONFIG).
Although the domain object has the same code, this didn't cause a problem
there because the enum values of `VIR_DOMAIN_INACTIVE_XML` and
`VIR_DOMAIN_METADATA_CONFIG` turn out to be the same (1 << 1), whereas
they are not for network equivalent ones (1 << 0, 1 << 1).
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Andrea Bolognani [Wed, 30 Aug 2023 15:45:47 +0000 (17:45 +0200)]
rpm: Recommend libvirt-daemon for with_modular_daemons distros
A default deployment on modern distros uses modular daemons but
switching back to the monolithic daemon, while not recommended,
is still considered a perfectly valid option.
For a monolithic daemon deployment, the upgrade to libvirt 9.2.0
or newer works as expected; a subsequent call to dnf autoremove,
however, results in the libvirt-daemon package being removed and
the deployment no longer working.
In order to avoid that situation, mark the libvirt-daemon as
recommended.
This will unfortunately result in it being included in most
installations despite not being necessary, but considering that
the alternative is breaking existing setups on upgrade it feels
like a reasonable tradeoff.
Moreover, since the dependency on libvirt-daemon is just a weak
one, it's still possible for people looking to minimize the
footprint of their installation to manually remove the package
after installation, mitigating the drawbacks of this approach.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Andrea Bolognani [Wed, 30 Aug 2023 15:41:14 +0000 (17:41 +0200)]
rpm: Fix typo in daemon name
The name of the virtsecretd daemon was misspelled, resulting
in multiple errors during installation:
Running scriptlet: libvirt-daemon-driver-secret-9.5.0-6.el9.x86_64
Failed to preset unit: Unit file virsecretd.socket does not exist.
Failed to preset unit: Unit file virsecretd-ro.socket does not exist.
Failed to preset unit: Unit file virsecretd-admin.socket does not exist.
Failed to preset unit: Unit file virsecretd.service does not exist.
Laura Hild [Tue, 15 Aug 2023 14:54:20 +0000 (10:54 -0400)]
Don't set cur=inf RLIM_NOFILE on macOS
virProcessActivateMaxFiles sets rlim_cur to rlim_max.
If rlim_max is RLIM_INFINITY,
2023-08-15 15:17:51.944+0000: 4456752640: debug :
virProcessActivateMaxFiles:1067 : Initial max files was 2560
2023-08-15 15:17:51.944+0000: 4456752640: debug :
virProcessActivateMaxFiles:1077 : Raised max files to 9223372036854775807
then when virCommandMassClose does `int openmax = sysconf(
_SC_OPEN_MAX)`, `openmax < 0` is true and virCommandMassClose
reports an error and bails. Setting rlim_cur instead to at most
OPEN_MAX, as macOS' documentation suggests, both avoids this problem
2023-08-18 16:01:44.366+0000: 4359562752: debug :
virProcessActivateMaxFiles:1072 : Initial max files was 256
2023-08-18 16:01:44.366+0000: 4359562752: debug :
virProcessActivateMaxFiles:1086 : Raised max files to 10240
and eliminates a case of what the documentation declares
to be invalid input to setrlimit anyway.
tools: fix VMSA construction with explicit CPU family/model/stepping
If the CPU family/model/stepping are provided on the command line, but
the firmware is being automatically extracted from the libvirt guest,
we try to build the VMSA too early. This leads to an exception trying
to parse the firmware that has not been loaded yet. We must delay
building the VMSA in that scenario.
Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
tools: fix handling of CPU family/model/stepping in SEV validation
The SEV-ES boot measurement includes the initial CPU register state
(VMSA) and one of the fields includes the CPU identification. When
building a VMSA blob we get the CPU family/model/stepping from the
host capabilities, however, the VMSA must reflect the guest CPU not
host CPU. Thus using host capabilities is only when whe the guest
has the 'host-passthrough' CPU mode active. With 'host-model' it is
cannot be assumed host and guest match, because QEMU may not (yet)
have a named CPU model for a given host CPU.
Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Andrea Bolognani [Thu, 24 Aug 2023 15:41:39 +0000 (17:41 +0200)]
ci: Fix quoting and option name
Multiple values passed to --meson-args need to be quoted so that
the shell will interpret them correctly. The option's name was
also reported incorrectly, so fix that as well.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Andrea Bolognani [Thu, 24 Aug 2023 15:35:53 +0000 (17:35 +0200)]
ci: Fix precedence between arguments passed to meson
Commit 9c9848f955fd merged $MESON_OPTS into $MESON_ARGS, and
while doing so changed their behavior: while until then the
contents of $MESON_ARGS had precedence over those of $MESON_OPTS,
now the opposite is true. Restore the original behavior and
document it.
The argument for merging the two variables in the first place
was that having both present on the meson command line could be
confusing; however, that should no longer be the case now that
we have reasonably extensive comments explaining the role of
each of the variables and how they interact with each other, so
return the meson command line to its original form.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Laine Stump [Fri, 25 Aug 2023 04:09:54 +0000 (00:09 -0400)]
docs: update description of virsh nodedev-detach --driver option
--driver can now be used to specify a specific driver to bind to the
device being detached from the host driver (e.g. vfio-pci-igbvf), not
just the *type* of driver (e.g. "vfio" or "xen", which are unnecessary
anyway, since they are implicit in which hypervisor driver is in use)
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This introduces the ability to set the discard granularity option
for a disk. It defines the smallest amount of data that can be
discarded in a single operation (useful for managing and
optimizing storage).
However, most hypervisors automatically set the proper discard
granularity and users usually do not need to change the default
setting.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Peter Krempa [Fri, 25 Aug 2023 12:16:12 +0000 (14:16 +0200)]
docs: Improve documentation of <disk type='dir'>
Note the implications and caveats of <disk type='dir'>.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/519 Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
K Shiva Kiran [Wed, 16 Aug 2023 18:47:14 +0000 (00:17 +0530)]
Add Test driver and testcase for Network Metadata change APIs
This commit implements the newly defined Network Metadata Get and
Set APIs into the test driver.
It also adds a new testcase "networkmetadatatest" to test the APIs.
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
K Shiva Kiran [Wed, 16 Aug 2023 18:47:13 +0000 (00:17 +0530)]
Add virNetworkObj Get and Set Methods for Metadata
- Introduces virNetworkObjGetMetadata() and
virNetworkObjSetMetadata().
- These functions implement common behaviour that can be reused by
network drivers.
- Introduces virNetworkObjUpdateModificationImpact() among other
helper functions that resolve the live/persistent state of
the network before setting metadata.
- Eliminates redundant call of virNetworkObjSetDefTransient() in
virNetworkConfigChangeSetup() among others.
- Substituted redundant logic in networkUpdate() with a call to
virNetworkObjUpdateModificationImpact().
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
K Shiva Kiran [Wed, 16 Aug 2023 18:47:12 +0000 (00:17 +0530)]
virsh exposure of Network Metadata APIs
Adds two new commands and a new option:
- 'net-desc' to show/modify network title and description.
- 'net-metadata' to show/modify network metadata.
- Option '--title' for 'net-list' to print corresponding
network titles in an additional column.
- Documentation for all the above.
- XML Fallback function `virshNetworkGetXMLFromNet` for title and
description for compatibility with hosts running older versions
of libvirtd.
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
K Shiva Kiran [Wed, 16 Aug 2023 18:47:10 +0000 (00:17 +0530)]
Adding Public Get and Set APIs for Network Metadata
This patch introduces public Get and Set APIs for modifying <title>,
<description> and <metadata> elements of the Network object.
- Added enum virNetworkMetadataType to select one of the above
elements to operate on.
- Added error code and messages for missing metadata.
- Added public API implementation.
- Added driver support.
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
K Shiva Kiran [Wed, 16 Aug 2023 18:47:09 +0000 (00:17 +0530)]
Add <title> and <description> for Network Objects
This patch adds new elements <title> and <description> to the Network XML.
- The <title> attribute holds a short title defined by the user and
cannot contain newlines.
- The <description> attribute holds any documentation that the user
wants to store.
- Schema definitions of <title> and <description> have been moved from
domaincommon.rng to basictypes.rng for use by network and future objects.
- Added Network XML parser logic for the above.
Signed-off-by: K Shiva Kiran <shiva_kr@riseup.net> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Thu, 24 Aug 2023 13:54:26 +0000 (15:54 +0200)]
qemuxml2argvtest: Pass expected state via struct testQemuInfo's 'flags' member
Rather than having a separate argument to DO_TEST pass the state via
newly added flags 'FLAG_SKIP_CONFIG_ACTIVE'. The '_INACTIVE' equivalent
was not added as there's no test which'd use it.
Remove the old 'WHEN_' flags and move the decision logic out of the
DO_TEST macro as any addition to the logic makes the compiler take much
longer to compile qemuxml2xmltest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Laine Stump [Sun, 9 Jul 2023 04:37:45 +0000 (00:37 -0400)]
node_device: support binding other drivers with virNodeDeviceDetachFlags()
In the past, the only allowable values for the "driver" field of
virNodeDeviceDetachFlags() were "kvm" or "vfio" for the QEMU driver,
and "xen" for the libxl driver. Then "kvm" was deprecated and removed,
so the driver name became essentially irrelevant (because it is always
called via a particular hypervisor driver, and so the "xen" or "vfio"
can be (and almost always is) implied.
With the advent of VFIO variant drivers, the ability to explicitly
specify a driver name once again becomes useful - it can be used to
name the exact VFIO driver that we want bound to the device in place
of vfio-pci, so this patch allows those other names to be passed down
the call chain, where the code in virpci.c can make use of them.
The names "vfio", "kvm", and "xen" retain their special meaning, though:
1) because there may be some application or configuration that still
calls virNodeDeviceDetachFlags() with driverName="vfio", this
single value is substituted with the synonym of NULL, which means
"bind the default driver for this device and hypervisor". This
will currently result in the vfio-pci driver being bound to the
device.
2) in the case of the libxl driver, "xen" means to use the standard
driver used in the case of Xen ("pciback").
3) "kvm" as a driver name always results in an error, as legacy KVM
device assignment was removed from the kernel around 10 years ago.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Sun, 9 Jul 2023 19:00:26 +0000 (15:00 -0400)]
util: probe stub driver from within function that binds to stub driver
virPCIProbeStubDriver() and virPCIDeviceBindToStub() both have
very similar code that locally sets a driver name (based on
stubDriverType). These two functions are each also called in just one
place (virPCIDeviceDetach()), with just a small bit of validation code
in between.
To eliminate the "duplicated" code (which is going to be expanded
slightly in upcoming patches to support manually or automatically
picking a VFIO variant driver), this patch modifies
virPCIProbeStubDriver() to take the driver name as an argument
(rather than the virPCIDevice object), and calls it from within
virPCIDeviceBindToStub() (rather than from that function's caller),
using the driverName it has just figured out with the
now-not-duplicated code.
(NB: Since it could be used to probe *any* driver module, the name is
changed to virPCIProbeDriver()).
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Fri, 2 Jun 2023 18:34:51 +0000 (14:34 -0400)]
util: permit existing binding to VFIO variant driver
Before a PCI device can be assigned to a guest with VFIO, that device
must be bound to the vfio-pci driver rather than to the device's
normal host driver. The vfio-pci driver provides APIs that permit QEMU
to perform all the necessary operations to make the device accessible
to the guest.
In the past vfio-pci was the only driver that supplied these APIs, but
there are now vendor/device-specific "VFIO variant" drivers that
provide the basic vfio-pci driver functionality/API while adding
support for device-specific operations (for example these
device-specific drivers may support live migration of certain
devices). All that is needed to make this functionality available is
to bind the vendor-specific "VFIO variant" driver to the device
(rather than the generic vfio-pci driver, which will continue to work,
just without the extra functionality).
But until now libvirt has required that all PCI devices being assigned
to a guest with VFIO specifically have the "vfio-pci" driver bound to
the device. So even if the user manually binds a shiny new
vendor-specific VFIO variant driver to the device (and puts
"managed='no'" in the config to prevent libvirt from changing the
binding), libvirt will just fail during startup of the guest (or
during hotplug) because the driver bound to the device isn't exactly
"vfio-pci".
Beginning with kernel 6.1, it's possible to determine from the sysfs
directory for a device whether the currently-bound driver is the
vfio-pci driver or a VFIO variant - the device directory will have a
subdirectory called "vfio-dev". We can use that to appropriately widen
the list of drivers that libvirt will allow for VFIO device
assignment.
This patch doesn't remove the explicit check for the exact "vfio-pci"
driver (since that would cause systems with pre-6.1 kernels to behave
incorrectly), but adds an additional check for the vfio-dev directory,
so that any VFIO variant driver is acceptable for libvirt to continue
setting up for VFIO device assignment.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Sun, 9 Jul 2023 03:12:09 +0000 (23:12 -0400)]
util: rename virPCIDeviceGetDriverPathAndName
Instead, call it virPCIDeviceGetCurrentDriverPathAndName() to avoid
confusion with the device name that is stored in the virPCIDevice
object - that one is not necessarily the name of the current driver
for the device, but could instead be the driver that we want to be
bound to the device in the future.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Sun, 9 Jul 2023 02:11:06 +0000 (22:11 -0400)]
util: use "stubDriverType" instead of just "stubDriver"
In the past we just kept track of the type of the "stub driver" (the
driver that is bound to a device in order to assign it to a
guest). The next commit will add a stubDriverName to go along with
type, so lets use stubDriverType for the existing enum to make it
easier to keep track of whether we're talking about the name or the
type.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Pavel Hrdina [Thu, 24 Aug 2023 16:19:52 +0000 (18:19 +0200)]
capabilities: report full external snapshot support
Now that deleting and reverting external snapshots is implemented we can
report that in capabilities so management applications can use that
information and start using external snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Thu, 24 Aug 2023 16:17:11 +0000 (18:17 +0200)]
capabilities: reword disksnapshot feature to mention creating snapshots
Libvirt supports creating snapshots for a long time but the wording of
the feature may imply that libvirt supports external snapshots in
general but that is not true as users were not able to use APIs to
delete or revert external snapshots.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since commit baca59a5384 the NUMA definition is automatically fixed if
the vCPU count mismatches the NUMA cpu count so that this warning will
never be triggered.
Additionally VIR_WARN of a misconfiguration of a VM would not really
be seen in most cases as it's only simply logged.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Thu, 24 Aug 2023 08:08:24 +0000 (10:08 +0200)]
virMdevctlList: Don't check for !output
After 'mdevctl' was ran, its stdout is captured in @output which
is then compared against NULL and if it is NULL a negative value
is returned (to indicate error to the caller). But this is
effectively a dead code, because virCommand (specifically
virCommandProcessIO()) makes sure both stdout and stderr buffers
are properly '\0' terminated. Therefore, this can never evaluate
to true. Also, if there really is no output from 'mdevctl' (which
was handled in one of earlier commits, but let just assume it
wasn't), then we should not error out and treat such scenario as
'no mdevs defined/active'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
We have virMdevctlListDefined() to list defined mdevs, and
virMdevctlListActive() to list active mdevs. Both have the same
body except for one boolean argument passed to
nodeDeviceGetMdevctlListCommand(). Join the two functions under
virMdevctlList() name and introduce @defined argument that is
then just passed to the cmd line builder function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Thu, 24 Aug 2023 08:08:07 +0000 (10:08 +0200)]
nodeDeviceParseMdevctlJSON: Accept empty string
It is possible for 'mdevctl' to output nothing, an empty string
(e.g. when no mediated devices are defined on the host). What is
weird is that when passing '--defined' then 'mdevctl' outputs an
empty JSON array instead. Nevertheless, we should accept both and
treat them the same, i.e. as no mediated devices.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/523 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Thu, 24 Aug 2023 07:57:39 +0000 (09:57 +0200)]
nodedevmdevctltest: Rename mdevctl-list-empty test case
The mdevctl-list-empty test case is there to test whether an
empty JSON array "[]" is handled correctly by mdevctl handling
code. Well, mdevctl can output both, an empty JSON array or no
output at all.
Therefore, rename "mdevctl-list-empty" test case to
"mdevctl-list-empty-array" which is more descriptive and also
frees up slot for actual empty output (handled in next commits).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Tue, 22 Aug 2023 07:45:47 +0000 (09:45 +0200)]
src: Detect close_range syscall during virGlobalInit()
The whole purpose of virCloseRangeInit() is to be called
somewhere during initialization (ideally before first virExec()
or virCommandRun()), so that the rest of the code already knows
kernel capabilities. While I can put the call somewhere into
remote_daemon.c (when a daemon initializes), we might call
virCommand*() even from client library (i.e. no daemon).
Therefore, put it into virGlobalInit() with the rest of
initialization code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Tue, 22 Aug 2023 07:41:32 +0000 (09:41 +0200)]
vircommand: Introduce virCommandMassCloseRange()
This is brand new way of closing FDs before exec(). We need to
close all FDs except those we want to explicitly pass to avoid
leaking FDs into the child. Historically, we've done this by
either iterating over all opened FDs and closing them one by one
(or preserving them), or by iterating over an FD interval [2 ...
N] and closing them one by one followed by calling closefrom(N +
1). This is a lot of syscalls.
That's why Linux kernel developers introduced new close_from
syscall. It closes all FDs within given range, in a single
syscall. Since we keep list of FDs we want to preserve and pass
to the child process, we can use this syscall to close all FDs
in between. We don't even need to care about opened FDs.
Of course, we have to check whether the syscall is available and
fall back to the old implementation if it isn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Mon, 21 Aug 2023 13:10:39 +0000 (15:10 +0200)]
vircommand: Unify mass FD closing
We have two version of mass FD closing: one for FreeBSD (because
it has closefrom()) and the other for everything else. But now
that we have closefrom() wrapper even for Linux, we can unify
these two.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Mon, 21 Aug 2023 13:10:25 +0000 (15:10 +0200)]
virfile: Introduce virCloseFrom()
It is handy to close all FDs from given FD to infinity. On
FreeBSD the libc even has a function for that: closefrom(). It
was ported to glibc too, but not musl. At least glibc
implementation falls back to calling:
close_range(from, ~0U, 0);
Now that we have a wrapper for close_range() we implement
closefrom() trivially.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Tue, 22 Aug 2023 06:49:10 +0000 (08:49 +0200)]
virfile: Introduce virCloseRange()
Linux gained new close_range() syscall (in v5.9) that allows
closing a range of FDs in a single syscall. Ideally, we would use
it to close FDs when spawning a process (e.g. via virCommand
module).
Glibc has close_range() wrapper over the syscall, which falls
back to iterative closing of all FDs inside the range if running
under older kernel. We don't wane that as in that case we might
just close opened FDs (see Linux version of
virCommandMassClose()). And musl doesn't have close_range() at
all. Therefore, call syscall directly.
Now, mass close of FDs happens in a fork()-ed off child. While it
could detect whether the kernel does support close_range(), it
has no way of passing this info back to the parent and thus each
child would need to query it again and again.
Since this can't change while we are running we can cache the
information - hence virCloseRangeInit().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Michal Privoznik [Thu, 17 Aug 2023 12:52:12 +0000 (14:52 +0200)]
src: Rename some members of _virDomainMemoryDef struct
As advertised earlier, now that the _virDomainMemoryDef struct is
cleaned up, we can shorten some names as their placement within
unions define their use.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Fri, 28 Jul 2023 09:40:37 +0000 (11:40 +0200)]
src: Move _virDomainMemoryDef target nodes into an union
The _virDomainMemoryDef struct is getting a bit messy. It has
various members and only some of them are valid for given model.
Worse, some are re-used for different models. We tried to make
this more bearable by putting a comment next to each member
describing what models the member is valid for, but that gets
messy too.
Therefore, do what we do elsewhere: introduce an union of structs
and move individual members into their respective groups.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Tue, 25 Jul 2023 14:48:21 +0000 (16:48 +0200)]
src: Move _virDomainMemoryDef source nodes into an union
The _virDomainMemoryDef struct is getting a bit messy. It has
various members and only some of them are valid for given model.
Worse, some are re-used for different models. We tried to make
this more bearable by putting a comment next to each member
describing what models the member is valid for, but that gets
messy too.
Therefore, do what we do elsewhere: introduce an union of structs
and move individual members into their respective groups.
This allows us to shorten some names (e.g. nvdimmPath or
sourceNodes) as their purpose is obvious due to their placement.
But to make this commit as small as possible, that'll be
addressed later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu_driver: validate mem->model on MEMORY_DEVICE_SIZE_CHANGE event
When guest acknowledges change in size of virtio-mem (portion
that's exposed to the guest), QEMU emits
MEMORY_DEVICE_SIZE_CHANGE event. We process it in
processMemoryDeviceSizeChange(). So far, QEMU emits the even only
for virtio-mem (as that's the only memory device model that
allows live changes to its size). Nevertheless, if this ever
changes, validate the memory model upon processing the event as
the rest of the code blindly expects virtio-mem model.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Fri, 28 Jul 2023 12:39:31 +0000 (14:39 +0200)]
conf: Compare memory device address in virDomainMemoryFindByDefInternal()
This is similar to one of my previous commits. Simply speaking,
users can specify address where a memory device is mapped to. And
as such, we should include it when looking up corresponding
device in domain definition (e.g. on device hot unplug).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Thu, 27 Jul 2023 13:14:12 +0000 (15:14 +0200)]
qemu_hotplug: Don't validate inaccessible fields in qemuDomainChangeMemoryLiveValidateChange()
The qemuDomainChangeMemoryLiveValidateChange() function is called
when a live memory device change is requested (via
virDomainUpdateDeviceFlags()). Currently, the only model that is
allowed to change is VIRTIO_MEM (and the only value that's
allowed to change is requestedsize). The aim of the function is
to check whether the change user requested follows this rule. And
in accordance with defensive programming I made the function
check all virDomainMemoryDef struct members. Even those which are
unused for VIRTIO_MEM model.
Drop these checks as the respective members will be inaccessible
soon (as the struct is refined).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu_hotplug: validate address on memory device change
As of v7.9.0-rc1~296 users have ability to adjust what portion of
virtio-mem is exposed to the guest. Then, as of v9.4.0-rc2~5 they
have ability to set address where the memory is mapped. But due
to a missing check it was possible to feed
virDomainUpdateDeviceFlags() API with memory device XML that
changes the address. This is of course not possible and should be
forbidden. Add the missing check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Wed, 26 Jul 2023 10:36:08 +0000 (12:36 +0200)]
virt-aa-helper: Set label on VIRTIO_PMEM device too
Conceptually, from host POV there's no difference between NVDIMM
and VIRTIO_PMEM. Both expose a file to the guest (which is used
as a permanent storage). Other secdriver treat NVDIMM and
VIRTIO_PMEM the same. Thus, modify virt-aa-helper so that is
appends virtio-pmem backing path into the domain profile too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, inside of virt-aa-helper code the domain definition is
parsed and then all def->mems are iterated over and for NVDIMM
models corresponding nvdimmPath is set label on. Conceptually,
this code works (except the label should be set for VIRTIO_PMEM
model too, but that is addressed in the next commit), but it can
be written in more extensible way. Firstly, there's no need to
check whether def->mems[i] is not NULL because we're inside a
for() loop that's counting through def->nmems. Secondly, we can
have a helper variable ('mem') to make the code more readable
(just like we do in other loops). Then, we can use switch() to
allow compiler warn us on new memory model.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
When running libvirt from the build directory with the 'run' script, it
will run as unconfined_t. This can result in unexpected behavior when
selinux is enforcing due to the fact that the selinux policies are
written assuming that libvirt is running with the
system_u:system_r:virtd_t context. This patch adds a new --selinux
option to the run script. When this option is specified, it will launch
the specified binary using the 'runcon' utility to set its selinux
context to the one mentioned above. Since this may require root
privileges, setting the selinux context is not the default behavior and
must be enabled with the command line switch.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Pavel Hrdina [Mon, 6 Mar 2023 11:25:04 +0000 (12:25 +0100)]
qemu_snapshot: add checks for external snapshot deletion
With the introduction of external snapshot revert support we need to
error out in some cases when trying to delete some snapshots.
If users reverts to non-leaf snapshots and would try to delete it after
the revert is done it would not work currently as this operation would
require using block-stream which is not implemented for now as in this
case the snapshot has two children so the disk files have multiple
overlays.
Similarly if user reverts to non-leaf snapshot and would try to delete
snapshot that is non-leaf but not in currently active snapshot chain we
would still need to use block-commit operation. The issue here is that
in order to do that we would have to start new qemu process with
different domain definition than what is currently used by the domain.
If the current domain would be running it would complicate things even
more so this operation is not yet supported.
If user creates new snapshot after reverting to non-leaf snapshot it
creates a new branch. Deleting snapshot with multiple children will
require block-stream which is not implemented for now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Wed, 7 Jun 2023 11:02:03 +0000 (13:02 +0200)]
qemu_snapshot: update backing store after deleting external snapshot
With introduction of external snapshot revert we will have to update
backing store of qcow images not actively used be QEMU manually.
The need for this patch comes from the fact that we stop and start QEMU
process therefore after revert not all existing snapshots will be known
to that QEMU process due to reverting to non-leaf snapshot or having
multiple branches.
We need to loop over all existing snapshots and check all disks to see
if they happen to have the image we are deleting as backing store and
update them to point to the new image except for images currently used
by the running QEMU process doing the merge operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We only need the domain definition from domain object. This will allow
us to use it from snapshot code where we need to pass different domain
definition.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Mon, 22 May 2023 12:32:35 +0000 (14:32 +0200)]
qemu_snapshot: remove revertdisks when creating new snapshot
When user creates a new snapshot after reverting to non-leaf snapshot we
no longer need to store the temporary overlays as they will be part of
the VM XMLs stored in the newly created snapshot.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Wed, 22 Feb 2023 10:43:45 +0000 (11:43 +0100)]
qemu_snapshot: delete: properly update parent snapshot with revert data
When deleting external snapshot and parent snapshot is the currently
active snapshot as user reverted to it we need to properly update the
parent snapshot metadata.
After the delete is done the new overlay files will be the currently
used files created when snapshot revert was done, replacing the original
overlay files.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Tue, 4 Apr 2023 12:21:05 +0000 (14:21 +0200)]
qemu_snapshot: add merge to external snapshot delete prepare data
Before external snapshot revert every delete operation did block commit
in order to delete a snapshot. But now when user reverts to non-leaf
snapshot deleting leaf snapshot will not have any overlay files so we
can just simply delete the snapshot images.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Fri, 5 May 2023 15:55:05 +0000 (17:55 +0200)]
qemu_snapshot: introduce external snapshot revert support
When reverting to external snapshot we need to create new overlay qcow2
files from the disk files the VM had when the snapshot was taken.
There are some specifics and limitations when reverting to a snapshot:
1) When reverting to last snapshot we need to first create new overlay
files before we can safely delete the old overlay files in case the
creation fails so we have still recovery option when we error out.
These new files will not have the suffix as when the snapshot was
created as renaming the original files in order to use the same file
names as when the snapshot was created would add unnecessary
complexity to the code.
2) When reverting to any snapshot we will always create overlay files
for every disk the VM had when the snapshot was done. Otherwise we
would have to figure out if there is any other qcow2 image already
using any of the VM disks as backing store and that itself might be
extremely complex and in some cases impossible.
3) When reverting from any state the current overlay files will be
always removed as that VM state is not meant to be saved. It's the
same as with internal snapshots. If user want's to keep the current
state before reverting they need to create a new snapshot. For now
this will only work if the current snapshot is the last.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Pavel Hrdina [Fri, 5 May 2023 15:53:54 +0000 (17:53 +0200)]
qemu_snapshot: use VIR_ASYNC_JOB_SNAPSHOT when reverting snapshot
Both creating and deleting snapshot are using VIR_ASYNC_JOB_SNAPSHOT but
reverting is using VIR_ASYNC_JOB_START. Let's unify it to make it
consistent for all snapshot operations.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>