The qcow2 driver allows passing discards to the storage while keeping
the reference of the block, and just marking it as zeroed. This can
decrease the levels of fragmentation of the qcow2 metadata when
discards are enabled.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 21 Jun 2023 14:01:26 +0000 (16:01 +0200)]
conf: Allow omitting 'slots' attribute of <maxMemory>
Memory slots are required only for DIMM-like devices, but the maximum
memory address space is relevant also for other non-DIMM memory devices
such as virtio-mem. Allow configurations where no slots are added.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 21 Jun 2023 13:31:24 +0000 (15:31 +0200)]
qemu_domain: Properly validate count of memory slots
Memory slots are required only for DIMM-like devices, while other
devices defined via <memory> such as virtio-mem may use the PCI bus and
thus do not require/consume a memory slot.
Fix the validation code to calculate the required count of memory
devices only for DIMMs and NVDIMMs.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Wed, 21 Jun 2023 14:31:46 +0000 (16:31 +0200)]
qemu_command: Always use modern syntax of '-m'
Specify the memory size by using '-m size=2048k' instead of just '-m 2'.
The new syntax is used when memory hotplug is enabled. To preserve
memory sizing, if memory hotplug is disabled the size is rounded down to
the nearest mebibyte.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Wed, 21 Jun 2023 06:56:54 +0000 (08:56 +0200)]
virGlobalInit: Make glib init its own global state
This should not be needed, but here's what's happening:
virStrToLong_*() family of functions was switched from strtol*()
to g_ascii_strtol*() in order to handle corner cases on Windows
(most notably parsing hex numbers with base=0) - see v9.4.0-61-g2ed41d7cd9. But what we did not realize back then, is
the fact that g_ascii_strtol*() family has their own global lock
rendering virStrToLong_*() function unsafe between fork() +
exec(). Worse, if one of the threads has to wait for the lock (or
on its corresponding condition), then errno is mangled and
g_ascii_strtol*() signals an error, even though there's no error.
Read more here:
https://gitlab.gnome.org/GNOME/glib/-/issues/3034
Nevertheless, if we make glib init the g_ascii_strtol*() global
state (by calling one function from g_ascii_strtol*() family),
then there shouldn't be any congestion on the lock and thus no
errno mangling.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Jiri Denemark [Fri, 9 Jun 2023 16:12:53 +0000 (18:12 +0200)]
qemu: Include maximum physical address size in baseline CPU
The current implementation of virConnectBaselineHypervisorCPU in QEMU
driver can provide a CPU definition that will not work on all hosts in
case they have different maximum physical address size. So when we get
the info from domain capabilities, we need to choose the smallest
physical address size for the computed baseline CPU definition.
Newer GCC (13.1.1 in my case) wrongly reports "maybe uninitialized"
warning for this variable inside the next condition. Even though this
accusation is wrong (the condition is guarded by the same condition as
the for cycle initializing it), initialize it during the declaration so
compilation errors don't stop others and maybe also future proof the
code for changes.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Using os.system("cp {0} {1}".format(...)) has two issues, it does not
work on Windows, but more importantly it can cause issues in case one of
the directories has a space in it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Thu, 23 Mar 2023 08:15:35 +0000 (09:15 +0100)]
meson: Use dependency().found() instead of conf.has()
So far this change alone doesn't make much sense, but prepares
code for upcoming change. Unfortunately, some conf.has()
statements have to stay, because there's no corresponding
dependency(). But that's okay.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
Michal Privoznik [Thu, 23 Mar 2023 08:21:17 +0000 (09:21 +0100)]
meson: numactl_dep switch to dependency()
The pkg-config file to libnuma was introduced in 2.0.12 release
(though the comment mistakenly claims 2.0.14 version). Every
supported distro ships at least this version, and thus we can
switch meson detection to dependency().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
Michal Privoznik [Thu, 23 Mar 2023 08:13:52 +0000 (09:13 +0100)]
meson: attr_dep switch to dependency()
The pkg-config file to libattr was introduced in 2.4.48 release.
Now that every supported distro ships at least this version, we
can switch meson detection to dependency().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
Michal Privoznik [Thu, 23 Mar 2023 08:12:52 +0000 (09:12 +0100)]
meson: acl_dep switch to dependency()
The pkg-config file to libacl was introduced in 2.2.53 release.
Now that every supported distro ships at least this version, we
can switch meson detection to dependency().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
qemu: Allow more generous cpuset.mems for vCPUs and IOThreads
The unit that cpuset CGroups controller works with is a
thread/process, not individual memory allocations. Therefore,
after we've set cpuset.mems for emulator (after previous commit
it's set to union of all host NUMA nodes allowed for given
domain), and as we try to set up cpuset.mems for vCPUs/IOThreads,
memory is migrated to selected NUMA node(s). We are effectively
saying: "this thread (vCPU thread) can have memory only from
these NUMA node(s)".
That's not really what we want though. The cpuset controller
doesn't differentiate memory "belonging" to the emulator thread
and vCPU thread or IOThread even.
Therefore, set union of all allowed host NUMA nodes, just like
we're doing for the emulator thread.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2138150 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
qemu: Don't try to 'fix up' cpuset.mems after QEMU's memory allocation
In ideal world, my plan was perfect. We allow union of all host
nodes in cpuset.mems and once QEMU has allocated its memory, we
'fix up' restriction of its emulator thread by writing the
original value we wanted to set all along. But in fact, we can't
do it because that triggers memory movement. For instance,
consider the following <numatune/>:
This is meant to create 1:1 mapping between guest and host NUMA
nodes. So we start QEMU with cpuset.mems set to "0-1" (so that it
can allocate memory even for guest node #1 and have the memory
come fro host node #1) and then, set cpuset.mems to "0" (because
that's where we wanted emulator thread to live).
But this in turn triggers movement of all memory (even the
allocated one) to host NUMA node #0. Therefore, we have to just
keep cpuset.mems untouched and rely on .host-nodes passed on the
QEMU cmd line.
The placement still suffers because of cpuset.mems set for vcpus
or iothreads, but that's fixed in next commit.
Fixes: 3ec6d586bc3ec7a8cf406b1b6363e87d50aa159c Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Jim Fehlig [Tue, 6 Jun 2023 17:05:50 +0000 (11:05 -0600)]
apparmor: Add support for local profile customizations
Apparmor profiles in /etc/apparmor.d/ are config files that can and should
be replaced on package upgrade, which introduces the potential to overwrite
any local changes. Apparmor supports local profile customizations via
/etc/apparmor.d/local/<service> [1].
This change makes the support explicit by adding libvirtd, virtqemud, and
virtxend profile customization stubs to /etc/apparmor.d/local/. The stubs
are conditionally included by the corresponding main profiles.
[1] https://ubuntu.com/server/docs/security-apparmor
See "Profile customization" section
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Notable changes:
- 'SapphireRapids' cpu model added
- 'EPYC-Genoa(-v1)' cpu model added
- 'EPYC-Milan-v2' cpu model added
- 'EPYC-Rome-(v3|v4)' cpu models added
- new cpu features:
'fb-clear', 'cmpccxadd', 'vnmi', 'flush-l1d', 'avx-vnni-int8', 'avx-ifma',
'no-nested-data-bp', 'null-sel-clr-base', 'amd-psfd', 'auto-ibrs', 'amx-fp16',
'prefetchiti', 'lfence-always-serializing', 'avx-ne-convert'
- 8.1 machine types added
- QMP schema:
- 'block-latency-histogram-set' gained 'boundaries-zap' property
- 'qcow2' block driver gained 'discard-no-unref' flag
- 'input-send-event' now supports the 'mtt' type and corresponding properties
- 'memory-backend-file' object now has a 'offset' property
- 'query-blockstats' reports 'failed_zone_append_operations', 'avg_zone_append_latency_ns'
'avg_zone_append_queue_depth', 'zone_append_bytes', 'zone_append_latency_histogram',
'zone_append_operations', 'zone_append_merged', 'zone_append_total_time_ns'
- 'single-step' property of 'query-status' is deprecated
- 'vcpu' argument of 'trace-events-(set|get'-state' is deprecated
'cpu-host-model' qemuxml2argv test output changed as EPYC-Rome gained
few new cpu flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Tue, 6 Jun 2023 09:03:39 +0000 (11:03 +0200)]
qemumonitorjsontest: Work around deprecation of 'vcpu' argument of 'trace-event-get-state'
'trace-event-get-state' was used for testing schema validation as it had
simple arguments. Now 'vcpu' is optional and deprecated. Fix the test so
that it won't break with upcoming qemu-8.1.
Drop the 'all-attrs' case, as it's not not really testing anything
special and for the 'missing mandatory attr' case use an empty object.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In one of its commits [1] libssh2 changed the 'text' member of
LIBSSH2_USERAUTH_KBDINT_PROMPT struct from 'char' to 'unsigned
char'. But we g_strdup() the member in order to fill 'prompt'
member of virConnectCredential struct. Typecast the value to
avoid warnings. Also, drop @prompt variable, as it's needless.
1: https://github.com/libssh2/libssh2/commit/83853f8aea0e2f739cacd491632eb7fd3d03ad2d Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Format the rule attributes in two passes, first for positive 'match' and
second pass for negative. This removes the crazy logic for switching
between match modes inside the formatter.
The refactor makes it also more clear in which cases we actually do
format something.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Thu, 16 Feb 2023 12:56:53 +0000 (13:56 +0100)]
virNWFilterRuleParse: Refactor attribute parser
Use virXMLNodeGetSubelementList to get the elements to process.
The new approach documents the complexity of the parser, which is
designed to ignore unknown attributes and parse only a single kind of
them after finding the first valid one.
Note that the XML schema doesn't actually allow having multiple
sub-elements, but I'm not sure how that translates to actual configs
present.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Thu, 16 Feb 2023 09:46:41 +0000 (10:46 +0100)]
nwfilterxml2xmltest: Add test case for parser and formatter quirks
The parser and formatter for nwfilter rules is very strange and has
weird quirks. Add a test case trying to capture some of the quirks to
visualize how it will change when the code is refactored.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 13 Feb 2023 14:18:21 +0000 (15:18 +0100)]
virNetworkDHCPDefParseXML: Refactor cleanup
There's nothing to clean up in the 'host' local variable on error as
the function which fills it makes sure to fill it only on success. In
such case it's also directly assigned to the array thus the 'host'
variable is cleared.
Remove the 'cleanup' label and 'ret' variable as we can now directly
return -1 on error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 22 May 2023 10:49:17 +0000 (12:49 +0200)]
testQEMUSchemaValidateObjectMember: validate QMP object member deprecation
The QMP schema validator wasn't adapted to consider features of 'object'
members and thus we didn't catch the deprecation of 'device' in
'block_set_io_throttle'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Tue, 23 May 2023 12:20:29 +0000 (14:20 +0200)]
qemumonitorjsontest: Use 'id' instead of deprecated 'device' argument of 'block_set_io_throttle'
The 'device' argument is deprecated. All real usage in the qemu driver
already uses 'id' as we populate the 'qomName' for everything except for
SD cards where throttling didn't work with libvirt for a very long time.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Thu, 25 May 2023 14:48:24 +0000 (16:48 +0200)]
qemu: Refuse setting <iotune> for 'SD' disks
Historically this didn't work with any supported qemu version as we
don't set the alias of the device, and thus qemu uses a different alias
resulting in a failure to startup the VM:
internal error: unable to execute QEMU command 'block_set_io_throttle': Device 'drive-sd-disk0' not found
Refuse setting throttling as this is unlikely to be needed and proper
fix requires using -device instead of -drive if=sd.
Note that this was broken when I moved the setup of throttling as a
command at startup for blockdev integration quite a while ago. Until
then throttling was passed as arguments for -drive.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 22 May 2023 11:17:17 +0000 (13:17 +0200)]
qemumonitorjsontest: Drop 'schema-meta' case
The test case is validating the QMP schema against itself. This was
useful when I was developing the validator but at this point it's no
longer needed.
Additionally the QMP schema has few deprecated members now, which our
validator doesn't catch yet, so this test would start failing once I fix
the validator.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a user requests debug logging by setting the environment variable:
LIBVIRT_DEBUG=1
we should log any errors regardless of the setting of e.g.
'LIBVIRT_LOG_OUTPUTS' as the code will log every 'debug' and 'info'
level message to stderr but will skip 'error' level messages.
This obviously makes debugging things very complicated as you can get to
a situation when the error itself is missing.
This can happen e.g. in tests.
Fix the issue by probing the default log level and calling the logger if
it's set for VIR_LOG_DEBUG.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemu: Set proper PCI backend for <interface/>-s that are actually hostdevs
When starting a domain, it's done so in two steps (actually more,
but lets focus on just the following two):
1) qemuProcessPrepareDomain(), followed by
2) qemuProcessPrepareHost().
Now, in the first step (PrepareDomain()), PCI backends for all
hostdevs is set (qemuProcessPrepareDomain() ->
qemuProcessPrepareDomainHostdevs() -> qemuDomainPrepareHostdev()
-> qemuDomainPrepareHostdevPCI()). Perfect.
But then, additional hostdevs may appear, because in the host
prepare phase we may insert some hostdevs into domain definition
(qemuProcessPrepareHost() -> qemuProcessNetworkPrepareDevices()).
Now, these additional hostdevs don't undergo the same prepare as
hostdevs that were already present in the domain definition (i.e.
in qemuProcessPrepareDomain() phase). Therefore, we have to call
corresponding prepare function explicitly.
NB, the interface hotplug code (qemuDomainAttachNetDevice()) does
not suffer from this problem, because it calls top level
qemuDomainAttachHostDevice() which is used to hotplug regular
hostdevs too and as such calls qemuDomainPrepareHostdev().
Fixes: 3b87709c768480e085556e06bd8d08f62270d42d
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2209853 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Peter Krempa [Mon, 22 May 2023 11:43:53 +0000 (13:43 +0200)]
qemuMonitorJSONTestAttachOneChardev: Rewrite using qemuMonitorTestAddItemVerbatim
'qemuMonitorTestAddItemExpect' doesn't do QMP schema validation. Since
it's the only use we can reimplement it using 'qemuMonitorTestAddItemVerbatim'
which does schema validation and remove the old code instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Tue, 23 May 2023 14:52:24 +0000 (16:52 +0200)]
docs: go: Add 'go-import' metadata via rST
The '.. meta::' rST directive allows adding header metadata. Move the
specific metadata from page.xsl into the individual files and pass them
through into the header from page.xsl.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
qemu_passt: Format portForward device even without address
It's almost like we've anticipated this. Our XML parser and
formatter handles @address and @dev attributes of <portForward/>
element completely independent of each other. And as of commit
2023_03_29.b10b983~3 passt allows handling these two separately
too. All that's left is generate the cmd line according to this
new fact.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2210287 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Thu, 25 May 2023 13:43:56 +0000 (15:43 +0200)]
conf: Reject invalid device's <seclabel relabel='yes'/> with no <label/>
We allow (some) domain devices to have a different <seclabel/>
than the top level domain one (this is mostly to allow access to
a resource for multiple domains). Now, we do couple of sanity
checks for such <seclabel/>, e.g. when the <label/> is specified,
but '@relabel' is set to no. But what we are missing is the
opposite: when '@relabel' is set, but no <label/> was provided.
Our schema already denies such combination. Make our parser
behave the same.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2160356 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Wed, 31 May 2023 09:02:21 +0000 (11:02 +0200)]
include: Fix 'Since' for new VIR_MIGRATE_PARAM_COMPRESSION_* macros
In v9.3.0-98-g150ae3e62b two new macros were introduced:
VIR_MIGRATE_PARAM_COMPRESSION_ZLIB_LEVEL and
VIR_MIGRATE_PARAM_COMPRESSION_ZSTD_LEVEL. But both list 9.1.0 as
the version they were introduced in (this is because the patch
was sent in that release time frame). Change the version to the
current release.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Michal Privoznik [Tue, 28 Mar 2023 08:44:15 +0000 (10:44 +0200)]
qemu_command: Generate .memaddr for virtio-mem and virtio-pmem
This is fairly trivial. Just set .memaddr attribute if a value
was set in the XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2180679 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal Privoznik [Tue, 28 Mar 2023 08:41:39 +0000 (10:41 +0200)]
qemu: Fill virtio-mem/virtio-pmem .memaddr at runtime
After a QEMU domain is started, among other thing we query memory
device information. And while memory address is returned by QEMU
for all models, we store it only for DIMMs and NVDIMMs. Do store
it for VIRTIO_MEM and VIRTIO_PMEM too.
This effectively reports the address the virtio-mem/virtio-pmem
is mapped to in live XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal Privoznik [Tue, 28 Mar 2023 08:39:55 +0000 (10:39 +0200)]
conf: Introduce <address/> for virtio-mem and virtio-pmem
Both virtio-mem and virtio-pmem devices have '.memaddr' attribute
which controls the address where they are mapped in the guest
memory. Ideally, users do not need to specify this as QEMU does
the right thing and computes addresses automatically on startup.
But soon, we will need to record this address as it is part of
guest ABI. And also, there might be some users that want to
control this value. Now, we are in a bit of a pickle, because
both these device types already have a PCI address, therefore we
can't just use <address/> blindly. But what we can do, is
introduce <address/> under the <target/> element. This is also
more conceptual, as knobs under <target/> control guest visible
config of memory device (and .memaddr surely falls into that
category).
NB, SgxEPCDeviceInfo struct in QMP definition also has .memaddr
attribute, but because of the way we build cmd line there's no
(easy) way to set the attribute. So ignore that for now.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal Privoznik [Fri, 26 May 2023 14:15:56 +0000 (16:15 +0200)]
conf: Run virDomainInputDefPostParse() only for VIR_DOMAIN_DEVICE_INPUT
Due to missed break; statement the virDomainInputDefPostParse()
is called not only for VIR_DOMAIN_DEVICE_INPUT but also
VIR_DOMAIN_DEVICE_LEASE and VIR_DOMAIN_DEVICE_NET, which can lead
to all sort of unpredictable results.
Fixes: c4bc4d3b82fbe22e03c986ca896090f481df5c10 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>