dann frazier [Thu, 20 Jul 2017 19:56:55 +0000 (13:56 -0600)]
qemu: Add AAVMF32 to the list of known UEFIs
Add a UEFI firmware path for AArch32 VMs, based on the path Debian is using.
libvirt is the de facto canonical location for defining where distros
should place these firmware images, so let's define this path here to try
and minimize distro fragmentation.
Andrea Bolognani [Tue, 27 Jun 2017 06:30:58 +0000 (08:30 +0200)]
conf: Move some virDomainDeviceInfo functions
The virDomainDeviceInfo struct is defined in device_conf,
so generic functions that operate on it should also be
defined there rather than in domain_conf.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This patch addresses the same aspects on PPC that bug 1103314 addressed
on x86.
The PCI expander bus creates multiple primary PCI busses, each of which
can be assigned a specific NUMA affinity, which on x86 is advertised
through ACPI on a per-bus basis.
For SPAPR, a PHB's NUMA affinities are assigned on a per-PHB basis, and
there is no mechanism for advertising NUMA affinities to a guest on a
per-bus basis. So, even if qemu-ppc manages to get some sort of multi-bus
topology working using PXB, there is no way to expose the affinities
of these busses to the guest. It can only be exposed on a per-PHB/per-domain
basis.
So this patch enables the NUMA node tag in the pci-root controller on PPC.
The way to set the NUMA node is through the numa_node option of the
spapr-pci-host-bridge device. However, for the implicit PHB, the only way
to set numa_node is through the -global option. The -global option applies
to all the PHBs unless the option is explicitly specified for the
respective PHB on the command line. The default PHB has the emulated devices only, so
the patch prevents setting the NUMA node for the default PHB.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Fri, 21 Jul 2017 10:09:58 +0000 (12:09 +0200)]
qemu: Clean up firmware list initialization
Instead of going through two completely different code paths,
one of which repeats the same hardcoded bit of information
three times in rapid succession, depending on whether or not
a firmware list has been provided at configure time, just
provide a reasonable default value and remove the extra code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Michal Privoznik [Fri, 21 Jul 2017 10:46:18 +0000 (12:46 +0200)]
libvirt-domain.h: Fix enum description placement
There are only two acceptable places for describing enum values.
It's either:
    typedef enum {
        /* Some long description. Therefore it's placed before
         * the value. */
        VIR_ENUM_A_VAL = 1,
    } virEnumA;
or:
    typedef enum {
        VIR_ENUM_B_VAL = 1, /* Some short description */
    } virEnumB;
However, during review of a patch sent upstream I realized that
is not always the case. I went through all the public header
files and identified all the offenders. Luckily there were just
two of them.
Yes, this breaks our generated HTML documentation, but
that's a bug in the generator. Our header files shouldn't be forced
to use a style we don't want.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Wed, 12 Jul 2017 11:59:35 +0000 (13:59 +0200)]
qemu: process: Don't put memoryless NUMA nodes into autoNodeset
'numad' may return a nodeset which contains NUMA nodes without memory
for certain configurations. Since the cgroups code will not be happy using
nodes without memory, we need to store only NUMA nodes with memory in
autoNodeset.
On the other hand, autoCpuset should also contain CPUs for nodes which
do not have any memory.
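
The split between the two sets can be sketched roughly as follows. This is only an illustration, not the actual libvirt code: it assumes at most 64 nodes and CPUs so that plain bitmasks can stand in for real bitmaps, and nodeset_t, filter_memory_nodes() and cpus_for_nodes() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a node bitmap: bit N set means node N is in the set. */
typedef uint64_t nodeset_t;

/* autoNodeset: keep only the advised nodes that actually have memory. */
nodeset_t
filter_memory_nodes(nodeset_t advised, nodeset_t nodes_with_memory)
{
    return advised & nodes_with_memory;
}

/* autoCpuset: collect the CPUs of every advised node, memoryless or not.
 * node_cpus[i] is the CPU mask of node i. */
uint64_t
cpus_for_nodes(nodeset_t advised, const uint64_t *node_cpus, size_t nnodes)
{
    uint64_t cpus = 0;

    assert(node_cpus != NULL);
    for (size_t i = 0; i < nnodes && i < 64; i++) {
        if (advised & (UINT64_C(1) << i))
            cpus |= node_cpus[i];
    }
    return cpus;
}
```

With nodes 0-2 advised and node 1 memoryless, the node set shrinks to nodes 0 and 2 while the CPU set still covers all three nodes.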
Antoine Millet [Mon, 17 Jul 2017 15:49:00 +0000 (17:49 +0200)]
Handle hotplug change on VLAN configuration using OVS
A new function virNetDevOpenvswitchUpdateVlan has been created to instruct
OVS of the changes. qemuDomainChangeNet has been modified to handle the
update of the VLAN configuration for a running guest and rely on
virNetDevOpenvswitchUpdateVlan to do the actual update if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemu: shared disks with cache=directsync should be safe for migration
At present, shared disks can be migrated only if they are readonly or use cache=none. But
cache=directsync should be safe for migration too, because neither cache=directsync nor cache=none
uses the host page cache, and cache=directsync writes through the qemu block layer cache.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
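
The resulting rule can be sketched as a small predicate. This is a simplified illustration under our own assumptions, not the actual check in the QEMU driver: the enum and the function name shared_disk_migration_safe() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical subset of the disk cache modes. */
enum cache_mode {
    CACHE_DEFAULT,
    CACHE_NONE,
    CACHE_WRITETHROUGH,
    CACHE_WRITEBACK,
    CACHE_DIRECTSYNC,
    CACHE_UNSAFE
};

/* A shared disk is considered safe to migrate when it is readonly or
 * when its cache mode bypasses the host page cache. */
bool
shared_disk_migration_safe(bool readonly, enum cache_mode mode)
{
    if (readonly)
        return true;
    return mode == CACHE_NONE || mode == CACHE_DIRECTSYNC;
}
```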
Peter Krempa [Tue, 11 Jul 2017 15:45:12 +0000 (17:45 +0200)]
qemu: blockcopy: Refactor logic checking the target storage file
Use virStorageSource accessors to check the file and call
virStorageFileAccess before even attempting to stat the target. This
will be helpful once we try to add network destinations for block copy,
since there will be no need to stat them.
Peter Krempa [Tue, 11 Jul 2017 06:23:38 +0000 (08:23 +0200)]
qemu: blockcopy: Explicitly assert 'reuse' for block devices
When copying to a block device, the block device will already exist. To
let users use a block device without any preparation, block copy has to
work without VIR_DOMAIN_BLOCK_COPY_REUSE_EXT.
This means that if the target is an existing block device we don't need
to prepare it, but we also can't reject it just because it already exists.
To avoid breaking this feature, explicitly assume that existing block
devices will be reused even without that flag explicitly specified,
while skipping attempts to create it.
qemuMonitorDriveMirror still needs to honor the flag as specified by the
user, since qemu overwrites the metadata otherwise.
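
The decision described above can be sketched as follows. This is a minimal illustration, not the driver code: struct copy_plan and plan_block_copy() are hypothetical, and the detection of an existing block device is assumed to have happened elsewhere.

```c
#include <stdbool.h>

struct copy_plan {
    bool create_target;  /* should the destination be created by us? */
    bool monitor_reuse;  /* reuse flag handed on to the monitor command */
};

struct copy_plan
plan_block_copy(bool target_is_existing_blockdev, bool user_reuse_flag)
{
    struct copy_plan plan;
    /* An existing block device is implicitly reused, flag or no flag. */
    bool reuse = user_reuse_flag || target_is_existing_blockdev;

    plan.create_target = !reuse;
    /* The monitor still sees exactly what the user asked for. */
    plan.monitor_reuse = user_reuse_flag;
    return plan;
}
```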
Peter Krempa [Mon, 3 Jul 2017 13:16:02 +0000 (15:16 +0200)]
tests: virjson: Remove spaces from 'very-hard' parsing example
The example is rather long and an upcoming patch will check whether the
string can be formatted back. As the formatted string lacks spaces and
adding the 'expect' string with spaces would be rather long, just drop
spaces from this test case.
There are other test cases which do contain spaces.
Peter Krempa [Tue, 18 Jul 2017 07:43:41 +0000 (09:43 +0200)]
security: apparmor: Properly link with storage driver in helper program
The refactor to split up storage driver into modules broke the apparmor
helper program, since that did not initialize the storage driver
properly and thus detection of the backing chain could not work.
Register the storage driver backends explicitly. Unfortunately it's now
necessary to link with the full storage driver to satisfy dependencies
of the loadable modules.
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Michal Privoznik [Tue, 11 Jul 2017 08:56:19 +0000 (10:56 +0200)]
virFileInData: Report an error if unable to reposition file
The purpose of this function is to tell whether the current position
in the given FD is in a data section or a hole, and how many bytes
remain until the end of that section. This is achieved by
a couple of lseek() calls. The most important part is that we reposition
the FD back, so that the position is unchanged from the caller's
POV. Until now, the final lseek() back to the original
position was done with no check for errors, and I was convinced
that was okay since nothing could go wrong. However, review
feedback from a related series persuaded me that it's better to
be safe than sorry. Therefore, let's check whether the final lseek()
succeeded and report an error if it didn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
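
The "seek around, then put the position back and check it" pattern can be sketched like this. It is not virFileInData itself: for portability the sketch measures the distance to the end of the file with SEEK_END instead of probing data sections and holes, and bytes_until_eof() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Store the number of bytes between the current position and EOF in
 * *remaining. Returns 0 on success, -1 on failure. The file position
 * is left where the caller had it. */
int
bytes_until_eof(int fd, off_t *remaining)
{
    off_t cur, end;

    assert(remaining != NULL);

    if ((cur = lseek(fd, 0, SEEK_CUR)) == (off_t) -1 ||
        (end = lseek(fd, 0, SEEK_END)) == (off_t) -1) {
        perror("unable to query file position");
        return -1;
    }

    /* Put the position back and check the result instead of assuming
     * that this last lseek() cannot fail. */
    if (lseek(fd, cur, SEEK_SET) == (off_t) -1) {
        perror("unable to restore file position");
        return -1;
    }

    *remaining = end - cur;
    return 0;
}
```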
Jim Fehlig [Tue, 18 Jul 2017 16:20:35 +0000 (10:20 -0600)]
docs: schema: make disk driver name attribute optional
/domain/devices/disk/driver/@name is not a required or mandatory
attribute according to formatdomain, and indeed it was agreed on
IRC that the attribute is "optional for input, recommended (but
not required) for output". Currently the schema requires the
attribute, causing virt-xml-validate to fail on disk config where
the driver name is not explicitly specified. E.g.
    # virt-xml-validate test.xml
    Relax-NG validity error : Extra element devices in interleave
    test.xml:21: element devices: Relax-NG validity error : Element domain failed to validate content
    test.xml fails to validate
Relaxing the name attribute to be optional fixes the validation
Michal Privoznik [Tue, 18 Jul 2017 08:27:41 +0000 (10:27 +0200)]
wireshark: Adapt to tvb_new_subset() rename
In Wireshark commit 7cd6906056922e4b8 (contained in v2.4.0)
the tvb_new_subset() function was renamed to
tvb_new_subset_length_caplen(). However, we can take the extra
step and switch to tvb_new_subset_remaining() directly (see
Wireshark commit 0ecfc7280cf3d7). The reasoning is that there is
no other protocol in the packet than libvirt. Therefore, from the
point where the libvirt dissector takes over until the end of the
packet, it's all libvirt data.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
All the pieces are now in place, so we can finally start
using isolation groups to achieve our initial goal, which is
separating hostdevs from emulated PCI devices while keeping
hostdevs that belong to the same host IOMMU group together.
Andrea Bolognani [Thu, 15 Jun 2017 08:40:42 +0000 (16:40 +0800)]
conf: Implement isolation rules
These rules will make it possible for libvirt to
automatically assign PCI addresses in a way that
respects any isolation constraints devices might
have.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
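
One way to picture such rules is the sketch below. It is an assumption-laden illustration rather than the actual implementation: struct iso_bus, bus_accepts() and place_device() are hypothetical, group 0 is assumed to mean "no isolation needed" (emulated devices), and slot capacity is ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>

struct iso_bus {
    unsigned int group;  /* isolation group of the devices on this bus */
    size_t ndevices;     /* number of devices already placed here */
};

/* An empty bus takes any device; a populated bus only takes devices
 * from the isolation group it already holds. */
bool
bus_accepts(const struct iso_bus *bus, unsigned int dev_group)
{
    return bus->ndevices == 0 || bus->group == dev_group;
}

/* Place a device on the first suitable bus. Returns the bus index,
 * or -1 if no bus can take it. */
int
place_device(struct iso_bus *buses, size_t nbuses, unsigned int dev_group)
{
    for (size_t i = 0; i < nbuses; i++) {
        if (!bus_accepts(&buses[i], dev_group))
            continue;
        buses[i].group = dev_group;
        buses[i].ndevices++;
        return (int) i;
    }
    return -1;
}
```

In this sketch a return value of -1 would be the point where a new controller has to be added to the topology.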
Andrea Bolognani [Thu, 15 Jun 2017 08:38:33 +0000 (16:38 +0800)]
conf: Introduce isolation groups
Isolation groups will eventually allow us to make sure certain
devices, e.g. PCI hostdevs, are assigned to guest PCI buses in
a way that guarantees improved isolation, error detection and
recovery for machine types and hypervisors that support it,
e.g. pSeries guests on QEMU.
This patch merely defines storage for the new information
we're going to need later on and makes sure it is passed from
the hypervisor driver (QEMU / bhyve) down to the generic PCI
address allocation code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
John Ferlan [Mon, 15 May 2017 15:00:59 +0000 (11:00 -0400)]
nodedev: Convert virNodeDeviceObj to use virObjectLockable
Now that we have a bit more control, let's convert our object into
a lockable object and let that magic handle the create and lock/unlock.
This also involves creating a virNodeDeviceEndAPI in order to handle
the object cleanup for APIs that use the Add or Find APIs in order
to get a locked/reffed object. The EndAPI will unlock and unref the
object, returning NULL to indicate to the caller not to use the obj.
In an overall effort to privatize access to virNodeDeviceObj and
virNodeDeviceObjList into the virnodedeviceobj module, move the
object list parsing from node_device_driver and replace with a
call to a virnodedeviceobj helper. This follows other similar
APIs/helpers which peruse the object list looking for some specific
data in order to get/return an @device (virNodeDevice) object to
the caller.
John Ferlan [Thu, 25 May 2017 14:20:58 +0000 (10:20 -0400)]
nodedev: Introduce virNodeDeviceGetSCSIHostCaps
We're about to move the call to nodeDeviceSysfsGetSCSIHostCaps from
node_device_driver into virnodedeviceobj, so move the guts of the code
from the driver specific node_device_linux_sysfs into its own API
since virnodedeviceobj cannot call back into the driver.
Nothing in the code deals with sysfs anyway, as that's hidden by the
various virSCSIHost* and virVHBA* utility function calls.
John Ferlan [Fri, 12 May 2017 15:08:57 +0000 (11:08 -0400)]
test: Adjust cleanup/error paths for nodedev test APIs
- In testDestroyVport rather than use a cleanup label, just return -1
  immediately since nothing else is needed.
- In testStoragePoolDestroy, if !privpool, then just return -1 since
  nothing else will happen anyway.
- Rather than "goto cleanup;" when virNodeDeviceObjFindByName fails to
  find an @obj, just return directly. This then allows the cleanup: label code
  to not have to check "if (obj)" before calling virNodeDeviceObjUnlock.
  This also simplifies some exit logic...
- In testNodeDeviceObjFindByName use an error: label to handle the failure
  and don't do the ncaps++ within the VIR_STRDUP() source target index.
  Only increment ncaps after success. Easier on the eyes at the error label too.
- In testNodeDeviceDestroy use "cleanup" rather than "out" for the goto.
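
The "only increment after success" point can be illustrated with a standalone sketch. It does not use VIR_STRDUP or any libvirt helper; copy_cap_names() is a hypothetical function that copies strings with plain malloc() and memcpy().

```c
#include <stdlib.h>
#include <string.h>

/* Copy nsrc strings into a new NULL-terminated array.
 * Returns NULL on allocation failure. */
char **
copy_cap_names(const char *const *src, size_t nsrc)
{
    size_t ncaps = 0;
    char **names = calloc(nsrc + 1, sizeof(*names));

    if (names == NULL)
        return NULL;

    for (size_t i = 0; i < nsrc; i++) {
        size_t len = strlen(src[i]) + 1;
        char *copy = malloc(len);

        if (copy == NULL)
            goto error;
        memcpy(copy, src[i], len);
        names[ncaps] = copy;
        ncaps++;    /* counted only once the copy has succeeded */
    }
    return names;

 error:
    /* Exactly ncaps entries are valid, so no NULL checks are needed. */
    while (ncaps > 0)
        free(names[--ncaps]);
    free(names);
    return NULL;
}
```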
John Ferlan [Fri, 2 Jun 2017 13:04:29 +0000 (09:04 -0400)]
nodedev: Alter virNodeDeviceObjRemove
Rather than passing the object to be removed by reference, pass by value
and then let the caller decide whether or not the object should be free'd
and how to handle the logic afterwards. This includes free'ing the object
and/or setting the local variable to NULL to prevent subsequent unexpected
usage (via something like virNodeDeviceObjRemove in testNodeDeviceDestroy).
For now this function will just handle removing the object from the
list in which it was placed during virNodeDeviceObjAssignDef.
This essentially reverts logic from commit id '61148074' that free'd the
device entry on list, set *dev = NULL and returned. That fixes a bug in
node_device_hal.c/dev_refresh() which would never call dev_create(udi)
since @dev would have been set to NULL.
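
The ownership split can be sketched with a toy list. This is not the virNodeDeviceObjList code: struct dev_list and dev_list_remove() are hypothetical, and the return value exists only to make the sketch easy to check.

```c
#include <stddef.h>

#define MAX_DEVS 8

struct dev_list {
    void *objs[MAX_DEVS];
    size_t count;
};

/* Unlink obj from the list without freeing it; the caller decides
 * what happens to the object afterwards.
 * Returns 0 if the object was removed, -1 if it was not on the list. */
int
dev_list_remove(struct dev_list *list, void *obj)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->objs[i] != obj)
            continue;
        /* Close the gap left by the removed entry. */
        for (size_t j = i + 1; j < list->count; j++)
            list->objs[j - 1] = list->objs[j];
        list->count--;
        return 0;
    }
    return -1;
}
```

Because the pointer is passed by value, the caller's variable still refers to the object after removal and can be used, freed or set to NULL as needed.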
Andrea Bolognani [Mon, 29 May 2017 15:18:35 +0000 (17:18 +0200)]
qemu: Use PHBs when extending the guest PCI topology
When looking for slots suitable for a PCI device, libvirt
might need to add an extra PCI controller: for pSeries guests,
we want that extra controller to be a PHB (pci-root) rather
than a PCI bridge.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Fri, 26 May 2017 17:34:21 +0000 (19:34 +0200)]
qemu: Use PHBs to fill holes in PCI bus numbering
PCI buses have to be numbered sequentially, and no index can be
missing, so libvirt will fill in the blanks automatically for
the user.
Up until now, it has done so using either pci-bridge, for machine
types based on legacy PCI, or pcie-root-port, for machine types
based on PCI Express. Neither choice is good for pSeries guests,
where PHBs (pci-root) should be used instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
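
The hole-filling policy can be sketched as follows. It is a simplified model, not the address allocation code: the enums, filler_model() and fill_bus_holes() are hypothetical, and controllers are represented only by their model.

```c
#include <stddef.h>

enum ctrl_model {
    MODEL_NONE = 0,        /* no controller at this index (a hole) */
    MODEL_PCI_ROOT,        /* PHB on pSeries */
    MODEL_PCI_BRIDGE,
    MODEL_PCIE_ROOT_PORT
};

enum machine_kind {
    MACHINE_LEGACY_PCI,
    MACHINE_PCIE,
    MACHINE_PSERIES
};

/* Pick the controller model used to fill a hole on this machine type. */
enum ctrl_model
filler_model(enum machine_kind kind)
{
    switch (kind) {
    case MACHINE_PSERIES:
        return MODEL_PCI_ROOT;
    case MACHINE_PCIE:
        return MODEL_PCIE_ROOT_PORT;
    case MACHINE_LEGACY_PCI:
    default:
        return MODEL_PCI_BRIDGE;
    }
}

/* models[i] is the controller at bus index i. Every hole is filled so
 * that the numbering is sequential. Returns the number of holes filled. */
size_t
fill_bus_holes(enum ctrl_model *models, size_t n, enum machine_kind kind)
{
    size_t filled = 0;

    for (size_t i = 0; i < n; i++) {
        if (models[i] != MODEL_NONE)
            continue;
        models[i] = filler_model(kind);
        filled++;
    }
    return filled;
}
```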
Andrea Bolognani [Fri, 26 May 2017 16:33:36 +0000 (18:33 +0200)]
tests: Add baseline tests for automatic PHB usage
These tests demonstrate that, while it's now possible for the
user to create PHBs explicitly and manually assign devices to
them, libvirt still defaults to extending the guest PCI
topology using PCI bridges and making suboptimal device
placement choices.
The next few commits will improve on these behaviors and the
test outputs will automatically be updated to reflect this.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Tue, 28 Feb 2017 09:50:01 +0000 (10:50 +0100)]
qemu: Format additional PHBs on the command line
Additional PHBs (pci-root controllers) will be created for
the guest using the spapr-pci-host-bridge QEMU device, if
available; the implicit default PHB, while present in the
guest configuration, will be skipped.
Usually, a controller with alias 'x' will create a bus with the
same name; however, the bus created by a PHB with alias 'x' will
be named 'x.0' instead, so we need to account for that.
As an exception to the exception, the implicit PHB that's added
automatically to every pSeries guest creates the 'pci.0' bus.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
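
The naming rules above can be sketched as a small formatter. This is an illustration only: format_bus_name() and its parameters are hypothetical and do not mirror the function used by the QEMU driver.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Write the name of the bus created by a controller into buf.
 * Returns 0 on success, -1 on a missing alias or a too small buffer. */
int
format_bus_name(char *buf, size_t buflen, const char *alias,
                bool is_phb, bool is_implicit)
{
    int n;

    if (is_phb && is_implicit)
        n = snprintf(buf, buflen, "pci.0");        /* implicit PHB */
    else if (alias == NULL || strlen(alias) == 0)
        return -1;
    else if (is_phb)
        n = snprintf(buf, buflen, "%s.0", alias);  /* additional PHB */
    else
        n = snprintf(buf, buflen, "%s", alias);    /* any other controller */

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```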
Andrea Bolognani [Tue, 28 Feb 2017 13:58:33 +0000 (14:58 +0100)]
qemu: Automatically pick target index and model for pci-root controllers
pSeries guests will soon need the new information; luckily,
we can figure it out automatically most of the time, so
users won't have to worry about it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Tue, 28 Feb 2017 13:58:08 +0000 (14:58 +0100)]
conf: Add 'spapr-pci-host-bridge' controller model
Adding it to the virDomainControllerPCIModelName enumeration
is enough for existing code to handle it, so parsing and
formatting will work without further tweaking.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Mon, 27 Feb 2017 19:18:32 +0000 (20:18 +0100)]
qemu: Relax pci-root index requirement for pSeries guests
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), meaning the current check
on the controller index no longer applies to them.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Fri, 24 Feb 2017 15:45:13 +0000 (16:45 +0100)]
conf: Move index number checking to drivers
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), which of course means that
all but one of them will have a non-zero index; hence,
we'll need to relax the current check.
However, right now the check is performed in the conf
module, which is generic rather than tied to the QEMU
driver, and where we don't have information such as the
guest machine type available.
To make this change of behavior possible down the line,
we need to move the check from the XML parser to the
drivers. Luckily, only QEMU and bhyve are using PCI
controllers, so this doesn't result in much duplication.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Andrea Bolognani [Tue, 28 Feb 2017 09:46:30 +0000 (10:46 +0100)]
qemu: Allow qemuBuildControllerDevStr() to return NULL
We will soon need to be able to return a NULL pointer
without the caller considering that an error: to make
it possible, change the return type to int and use
an out parameter for the string instead.
Add some documentation for the function as well.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
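
The calling convention can be sketched like this. It is not the real qemuBuildControllerDevStr(): build_controller_str() and its parameters are hypothetical, and the "string" is just a copy of the model name.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns 0 on success and -1 on error. On success *devstr is either
 * a newly allocated string or NULL, the latter meaning that nothing
 * needs to be added for this controller. */
int
build_controller_str(const char *model, bool implicit, char **devstr)
{
    size_t len;

    *devstr = NULL;

    if (model == NULL)
        return -1;      /* genuine error */
    if (implicit)
        return 0;       /* success, but nothing to format */

    len = strlen(model) + 1;
    if ((*devstr = malloc(len)) == NULL)
        return -1;
    memcpy(*devstr, model, len);
    return 0;
}
```

With a plain "char *" return value the caller could not tell the "nothing to format" case apart from a failure; the int return plus out parameter keeps the two separate.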
The current algorithm for slot allocation tries to be clever
and avoid looking at buses / slots more than once unless it's
necessary. Unfortunately that makes the code more complex,
and it will cause problems later on in some situations unless
even more complex code is added.
Since the performance gains are going to be pretty modest
anyway, we can just get rid of the extra complexity and use a
completely straightforward implementation instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
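
A straightforward allocator in this spirit might look like the sketch below. It is an illustration under our own assumptions, not the libvirt code: struct slot_bus, SLOTS_PER_BUS and find_free_slot() are hypothetical, and every slot is treated as usable.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOTS_PER_BUS 32

struct slot_bus {
    bool used[SLOTS_PER_BUS];
};

/* Scan every bus and every slot from the start, each time, and report
 * the first free slot. Returns false if all slots are taken. */
bool
find_free_slot(const struct slot_bus *buses, size_t nbuses,
               size_t *bus_out, size_t *slot_out)
{
    for (size_t b = 0; b < nbuses; b++) {
        for (size_t s = 0; s < SLOTS_PER_BUS; s++) {
            if (buses[b].used[s])
                continue;
            *bus_out = b;
            *slot_out = s;
            return true;
        }
    }
    return false;
}
```

The scan keeps no state between calls, which costs a few extra iterations but leaves no "last position" bookkeeping to get out of sync.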
Later on we're going to need access to information about IOMMU
groups for host devices. Implement the support in virpcimock,
and start using that mock library in a few QEMU test cases.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Use 0001:01:00.0 instead of 0000:04:02.0 as the source address
for the host device. This doesn't change anything at the moment,
but it will make a difference later on.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>