A <cell> defines a NUMA node. <distances> describes the NUMA distance
from the <cell> to the other NUMA nodes (the <sibling>s). For example,
in above XML description, the distance between NUMA node0 <cell id='0'
...> and NUMA node2 <sibling id='2' ...> is 31.
Valid distance values are '10 <= value <= 255'. A distance value of 10
represents the distance to the node itself. A distance value of 20
represents the default value for remote nodes but other values are
possible depending on the physical topology of the system.
When distances are not fully described, any missing sibling distance
values will default to 10 for local nodes and 20 for remote nodes.
If distance is given for A -> B, then we default B -> A to the same
value instead of 20.
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When testing user aliases it was discovered that for 440fx
machine type which has default IDE bus builtin, domain cannot
start if IDE controller has the user provided alias. This is
because for 440fx we don't put the IDE controller onto the
command line (since it is builtin) and therefore any device that
is plugged onto the bus must use the default alias.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
numa: avoid failure in nodememstats on non-NUMA systems
libvirt reports a fake NUMA topology in virConnectGetCapabilities
even if built without numactl support. The fake NUMA topology consists
of a single cell representing the host's cpu and memory resources.
Currently this is the case for ARM and s390[x] RPM builds.
A client iterating over NUMA cells obtained via virConnectGetCapabilities
and invoking virNodeGetMemoryStats on them will see an internal failure
"NUMA isn't available on this host" from virNumaGetMaxNode. An example
for such a client is VDSM.
Since the intention seems to be that libvirt always reports at least
a single cell it is necessary to return "fake" node memory statistics
matching the previously reported fake cell in case NUMA isn't supported
on the system.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Dawid Zamirski [Tue, 7 Nov 2017 22:36:34 +0000 (17:36 -0500)]
vbox: Add support for 5.2.x
Simply add the 5.2 SDK header to the existing unified framework. No
other special handling is needed as there's no API break between
existing 5.1 and the just added 5.2.
Commit 3cc2a9e0 fixed a similar problem when parsing content of a
file but missed parsing in-memory content. But AFAICT, the better
fix is to properly set the end of the content when initializing the
virConfParserCtxt in virConfParse().
This commit reverts the part of 3cc2a9e0 that appends a newline to
files missing it, and fixes setting the end of content when
initializing virConfParserCtxt. A test is also added to check
parsing in-memory content missing an ending newline.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
qemu-ns: Detect /dev/* mount point duplicates even better
In 4f1570720218302 I've tried to make duplicates detection for
nested /dev mount better. However, I've missed the obvious case
when there are two same mount points. For instance if:
# mount --bind /dev/blah /dev/blah
# mount --bind /dev/blah /dev/blah
Yeah, very unlikely (in qemu driver world) but possible.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Tue, 7 Nov 2017 15:20:23 +0000 (16:20 +0100)]
util: storage: Fix parsing of IPv6 portal address for iSCSI
Split on the last colon and avoid parsing port if the split remainder
contains the closing square bracket, so that IPv6 addresses are
interpreted correctly.
qemu: Use predictable file names for memory-backend-file
In some cases management application needs to allocate memory for
qemu upfront and then just let qemu use that. Since we don't want
to expose path for memory-backend-file anywhere in the domain
XML, we can generate predictable paths. In this case:
$memoryBackingDir/libvirt/qemu/$shortName/$alias
where $shortName is result of virDomainDefGetShortName().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
When removing path where huge pages are call virFileDeleteTree
instead of plain rmdir(). The reason is that in the near future
there's going to be more in the path than just files - some
subdirs. Therefore plain rmdir() is not going to be enough.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
qemu: Set alias for memory cell in qemuBuildMemoryCellBackendStr
Very soon qemuBuildMemoryBackendStr() is going to use memory cell
aliases. Therefore set one. At the same time, move it a bit
further - if virAsprintf() fails, there's no point in setting
rest of the members.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Dawid Zamirski [Tue, 7 Nov 2017 21:36:47 +0000 (16:36 -0500)]
docs: Update vbox driver documentation.
* libvirt no longer supports vbox <= 3.x
* update XML definition sample to show how to attach disks to VBOX's SAS
controller and how to change IDE controller model.
* update XML to show how to create RDP display with autoport.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:29 +0000 (13:49 -0500)]
vbox: Add SAS controller support
In VirtualBox SAS and SCSI are separate controller types whereas libvirt
does not make such distinction. This patch adds support for attaching
the VBOX SAS controllers by mapping the 'lsisas1068' controller model in
libvirt XML to VBOX SAS controller type. If VBOX VM has disks attached
to both SCSI and SAS controller libvirt domain XML will have two
<controller type='scsci'> elements with index and model attributes set
accordingly. In this case, each respective <disk> element must have
<address> element specified to assign it to respective SCSI controller.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:28 +0000 (13:49 -0500)]
vbox: Generate disk address element in dumpxml
This patch adds <address> element to each <disk> device since device
names alone won't adequately reflect the storage device layout in the
VM. With this patch, the ouput produced by dumpxml will faithfully
reproduce the storage layout of the VM if used with define.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:27 +0000 (13:49 -0500)]
vbox: Process empty removable disks in dumpxml
Previously any removable storage device without media attached was
omitted from domain XML dump. They're still (rightfully) omitted in
snapshot XML dump but need to be accounted properly to for the device
names to stay in 'sync' between domain and snapshot XML dumps.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:25 +0000 (13:49 -0500)]
vbox: Correctly generate drive name in dumpxml
If a VBOX VM has e.g. a SATA and SCSI disk attached, the XML generated
by dumpxml used to produce "sda" for both of those disks. This is an
invalid domain XML as libvirt does not allow duplicate device names. To
address this, keep the running total of disks that will use "sd" prefix
for device name and pass it to the vboxGenerateMediumName which no
longer tries to "compute" the value based only on current and max
port and slot values. After this the vboxGetMaxPortSlotValues is not
needed and was deleted.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:23 +0000 (13:49 -0500)]
vbox: Do not free disk definitions on cleanup
Both vboxSnapshotGetReadWriteDisks and vboxSnapshotGetReadWriteDisks do
not need to free the def->disks on cleanup because it's being done by
the caller via virDomainSnaphotDefFree
Dawid Zamirski [Tue, 7 Nov 2017 18:49:22 +0000 (13:49 -0500)]
vbox: Cleanup/prepare snasphot dumpxml functions
This patch prepares the vboxSnapshotGetReadOnlyDisks and
vboxSnapshotGetReadWriteDisks functions for further changes so that
the code movement does not obstruct the gist of those future changes.
This is done primarily because we'll need to know the type of vbox
storage controller as early as possible and make decisions based on
that info.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:19 +0000 (13:49 -0500)]
vbox: Process <controller> element in domain XML
With this patch, the vbox driver will no longer attach all supported
storage controllers by default even if no disk devices are associated
with them. Instead, it will attach only those that are implicitly added
by virDomainDefAddImplicitController based on <disk> element or if
explicitly specified via the <controller> element.
Dawid Zamirski [Tue, 7 Nov 2017 18:49:18 +0000 (13:49 -0500)]
vbox: Cleanup partially-defined VM on failure
Since the VBOX API requires to register an initial VM before proceeding
to attach any remaining devices to it, any failure to attach such
devices should result in automatic cleanup of the initially registered
VM so that the state of VBOX registry remains clean without any leftover
"aborted" VMs in it. Failure to cleanup of such partial VMs results in a
warning log so that actual define error stays on the top of the error
stack.
Since qemu 2.9 via 9103f1ce "file-posix: Consider max_segments for
BlockLimits.max_transfer" this is a new access that is denied by the
qemu profile.
It is non fatal, but prevents the fix mentioned to actually work.
It should be safe to allow reading from that path.
Since qemu opens a symlink path we need to translate that for apparmor from
"/sys/dev/block/*/queue/max_segments" to
"/sys/devices/**/block/*/queue/max_segments"
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Peter Krempa [Fri, 3 Nov 2017 14:20:55 +0000 (15:20 +0100)]
tests: Add testing of storage backend JSON props formatter
Add a new test program called 'qemublocktest' to test the block layer
related stuff and test storage source to JSON generator by comparing it
to the JSON parser.
Peter Krempa [Mon, 23 Oct 2017 14:39:49 +0000 (16:39 +0200)]
storage: Don't store leading '/' in image name when splitting out volume
Libvirt historically stores storage source path including the volume as
one string in the XML, but that is not really flexible enough when
dealing with the fields in the code. Previously we'd store the slash
separating the two as part of the image name. This was fine for gluster
but it's not necessary and does not scale well when converting other
protocols.
Don't store the slash as part of the path. The resulting change from
absolute to relative path within the gluster driver should be okay,
as the root directory is the default when accessing gluster.
Peter Krempa [Thu, 13 Jul 2017 13:38:50 +0000 (15:38 +0200)]
qemu: process: Split out useful parts from qemuBuildNetworkDriveURI
Extract the part formatting the basic URI part so that it can be reused
to format JSON backing definitions. Parts specific to the command line
format will remain in qemuBuildNetworkDriveURI. The new function is
called qemuBlockStorageSourceGetURI.
Peter Krempa [Mon, 23 Oct 2017 16:02:28 +0000 (18:02 +0200)]
qemu: block: Use proper type for servers for VxHS disks
Original implementation used 'SocketAddress' equivalent from qemu for
the disk server field, while qemu documentation specifies
'InetSocketAddress'. The backing store parser uses the correct parsing
function but the formatter used the incorrect one (and also with the
legacy mode enabled which was wrong).
Peter Krempa [Wed, 1 Nov 2017 10:17:20 +0000 (11:17 +0100)]
qemu: command: Refactor qemuBuildDriveStrValidate to make qemuCaps optional
To allow merging this with other disk type checks we need to check
qemuCaps only when available, since some of the checks are executed on
disk cold-plug and thus capabilities should not be checked.
Make the checks optional by making them conditional on qemuCaps not
being NULL.
Peter Krempa [Wed, 1 Nov 2017 10:02:41 +0000 (11:02 +0100)]
qemu: command: Directly report bus type in qemuBuildDriveStrValidate
All of the error message are already in a conditional block with known
bus type. Inline the bus type rather than formatting it from a separate
variable.
Peter Krempa [Wed, 1 Nov 2017 09:41:55 +0000 (10:41 +0100)]
qemu: command: Move disk index validation closer to usage
The disk index validation is used only in very specific cases and does
not need to be performed otherwise. Move it out of the global check into
the usage place.
Peter Krempa [Wed, 1 Nov 2017 09:33:24 +0000 (10:33 +0100)]
qemu: command: Remove dead code when formatting -drive
busid and unitid are ever used only if the device is an SD card due to
the check in qemuDiskBusNeedsDeviceArg. Since the SD card does not have
an bus or unit number, most of the code and command line formatter can
be removed since it will never be used.
Michal Privoznik [Mon, 23 Oct 2017 09:33:06 +0000 (11:33 +0200)]
qemu: Move memPath generation from memoryBackingDir to a separate function
In near future we will need more than just a plain VIR_STRDUP().
Better implement that in a separate function and in
qemuBuildMemoryBackendStr() which is complicated enough already.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
rpc,lockd: Add missing netserver refcount increment on reload
After the virNetDaemonAddServerPostExec call in virtlogd we should have
netserver refcount set to 2. One goes to netdaemon servers hashtable
and one goes to virt{logd,lock} own reference to netserver. Let's add
the missing increment in virNetDaemonAddServerPostExec itself while
holding the daemon lock.
Since lockd defers management of the @srv object by the presence
in the hash table, virLockDaemonNewPostExecRestart must Unref the
alloc'd Ref on the @srv object done as part of virNetDaemonAddServerPostExec
and virNetServerNewPostExecRestart processing. The virNetDaemonGetServer
in lock_daemon main will also take a reference which is Unref'd during
main cleanup.
John Ferlan [Sat, 28 Oct 2017 21:03:20 +0000 (17:03 -0400)]
lockd: Need to Unref @srv when done with it.
Commit id '252610f7d' used a hash table to store the @srv, but
didn't handle the virObjectUnref if virNetDaemonNew failed nor
did it use virObjectUnref once successfully placed into the table
which will now be managing it's lifetime (and would cause the
virObjectRef if successfully inserted into the table).
Jiri Denemark [Thu, 2 Nov 2017 18:58:00 +0000 (19:58 +0100)]
conf: Don't inline virDomainNetTypeSharesHostView
When coverage build is enabled, gcc complains about it:
In file included from qemu/qemu_agent.h:29:0,
from qemu/qemu_driver.c:47:
qemu/qemu_driver.c: In function 'qemuDomainSetInterfaceParameters':
./conf/domain_conf.h:3397:1: error: inlining failed in call to
'virDomainNetTypeSharesHostView': call is unlikely and code size would
grow [-Werror=inline]
virDomainNetTypeSharesHostView(const virDomainNetDef *net)
^
Michal Privoznik [Tue, 31 Oct 2017 10:47:36 +0000 (11:47 +0100)]
virsh: Define multi line macros properly
In some cases there's dangling backward slash at the end of multi
line macros. While technically the code works, it will stop if
some empty lines are removed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Dawid Zamirski [Tue, 24 Oct 2017 19:35:31 +0000 (15:35 -0400)]
vbox: Add more IStorageController API mappings
This patch exposes additional methods of the native VBOX API to the
libvirt 'unified' vbox API to deal with IStorageController. The exposed
methods are:
Dawid Zamirski [Tue, 24 Oct 2017 19:35:30 +0000 (15:35 -0400)]
vbox: Support empty removable drives.
Original code was checking for non empty disk source before proceeding
to actually attach disk device to VM. This prevented from creating
empty removable devices like DVD or floppy. Therefore, this patch
re-organizes the loop work-flow to allow such configurations as well as
makes the code follow better libvirt practices. Additionally, adjusted
debug logs to be more helpful - removed old ones and added new which
give more valuable info for troubleshooting.
Dawid Zamirski [Tue, 24 Oct 2017 19:35:29 +0000 (15:35 -0400)]
vbox: Errors in vboxAttachDrives are now critical
Previously, if one tried to define a VBOX VM and the API failed to
perform the requested actions for some reason, it would just log the
error and move on to process remaining disk definitions. This is not
desired as it could result in incorrectly defined VM without the caller
even knowing about it. So now all the code paths that call
virReportError are now treated as hard failures as they should have
been.
Dawid Zamirski [Fri, 3 Nov 2017 11:49:32 +0000 (07:49 -0400)]
vbox: Remove unused mediumEmpty
Remove the setting since it's unused as of commit 34364df3 which should
have never copied it in from the old code which ended up getting removed
as part of commit c7c286c6.
Dawid Zamirski [Tue, 24 Oct 2017 19:35:28 +0000 (15:35 -0400)]
vbox: Cleanup vboxAttachDrives implementation
This commit primes vboxAttachDrives for further changes so when they
are made, the diff is less noisy:
* move variable declarations to the top of the function
* add disk variable to replace all the def->disks[i] instances
* add cleanup at the end of the loop body, so it's all in one place
rather than scattered through the loop body. It's purposefully
called 'cleanup' rather than 'skip' or 'continue' because future
commit will treat errors as hard-failures.
Dawid Zamirski [Tue, 24 Oct 2017 19:35:27 +0000 (15:35 -0400)]
vbox: vboxAttachDrives now relies on address info
Previously, the driver was computing VBOX's devicePort/deviceSlot values
based on device name and max port/slot values. While this worked, it
completely ignored <address> values. Additionally, libvirt's built-in
virDomainDiskDefAssignAddress already does a good job setting default
values on virDomainDeviceDriveAddress struct which we can use to set
devicePort and deviceSlot and accomplish the same result while allowing
the customizing those via XML. Also, this allows to remove some code
which will make further patches smaller.
Dawid Zamirski [Tue, 24 Oct 2017 19:35:25 +0000 (15:35 -0400)]
vbox: Close media when undefining domains
When registering a VM we call OpenMedium on each disk image which adds it
to vbox's global media registry. Therefore, we should make sure to call
Close when unregistering VM so we cleanup the media registry entries
after ourselves - this does not remove disk image files. This follows
the behaviour of the VBoxManage unregistervm command.
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Fri, 29 Sep 2017 10:02:29 +0000 (12:02 +0200)]
qemu: domain: skip chain detection to end of backing chain
When a user provides the backing chain, we will not need to re-detect
all the backing stores again, but should move to the end of the user
specified chain. Additionally if a user provides a full terminated chain
we should not attempt any further detection.
Peter Krempa [Fri, 20 Oct 2017 11:50:23 +0000 (13:50 +0200)]
qemu: domain: Extract setup for disk source secrets
Separate it so that it deals only with single virStorageSource, so that
it can later be reused for full backing chain support.
Two aliases are passed since authentication is more relevant to the
'storage backend' whereas encryption is more relevant to the protocol
layer. When using node names, the aliases will be different.
Peter Krempa [Mon, 16 Oct 2017 12:10:09 +0000 (14:10 +0200)]
qemu: domain: Simplify using DAC permissions of top of backing chain
qemuDomainGetImageIds and qemuDomainStorageFileInit are helpful when
trying to access a virStorageSource from the qemu driver since they
figure out the correct uid and gid for the image.
When accessing members of a backing chain the permissions for the top
level would be used. To allow using specific permissions per backing
chain level but still allow inheritance from the parent of the chain we
need to add a new parameter to the image ID APIs.
Peter Krempa [Tue, 17 Oct 2017 06:03:42 +0000 (08:03 +0200)]
security: selinux: Take parent security label into account
Until now we ignored user-provided backing chains and while detecting
the code inherited labels of the parent device. With user provided
chains we should keep this functionality, so label of the parent image
in the backing chain will be applied if an image-specific label is not
present.
Peter Krempa [Tue, 17 Oct 2017 06:03:42 +0000 (08:03 +0200)]
security: dac: Take parent security label into account
Until now we ignored user-provided backing chains and while detecting
the code inherited labels of the parent device. With user provided
chains we should keep this functionality, so label of the parent image
in the backing chain will be applied if an image-specific label is not
present.
Peter Krempa [Tue, 17 Oct 2017 05:25:51 +0000 (07:25 +0200)]
security: selinux: Pass parent storage source into image labeling helper
virSecuritySELinuxSetImageLabelInternal assigns different labels to
backing chain members than to the parent image. This was done via the
'first' flag. Convert it to passing in pointer to the parent
virStorageSource. This will allow us to use the parent virStorageSource
in further changes.
When the user provides backing chain, we don't need the full support for
traversing the backing chain. This patch adds a feature check for the
virStorageSourceAccess API.
Peter Krempa [Mon, 16 Oct 2017 11:23:51 +0000 (13:23 +0200)]
storage: Extract common code to retrieve driver backend for support check
The 'file access' module of the storage driver has few feature checks to
determine whether libvirt supports given storage driver method. The code
to retrieve the driver struct needed for the check is the same so it can
be extracted.
Jiri Denemark [Thu, 26 Oct 2017 19:51:37 +0000 (21:51 +0200)]
qemu: Add support for block-incremental migration parameter
We handle incremental storage migration in a different way. The support
for this new (as of QEMU 2.10) parameter is only needed for full
coverage of migration parameters used by QEMU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>