As it turns out some file headers (e.g. ext4) may be larger/longer than
the 512 bytes of zeros being written prior to a pvcreate, so let's write
out 2048 bytes similar to how the pvcreate sources would peek at the first
4 sectors of the device.
Make sure there is at enough bytes on the device to clear before doing
doing the clear - just to be sure.
virhostcpu: Expose virHostCPUGetOnline on non-Linux
Previously, this function must've been called only on Linux in order
to fail gracefully. That lead to #ifdef mess in callers, so the
function was redesigned so it failed gracefully on non-existing
files. However that commit forgot to define the function outside the
__linux__ ifdef, it broke non-Linux builds.
Jiri Denemark [Thu, 16 Mar 2017 11:26:30 +0000 (12:26 +0100)]
cputest: Add tests for virCPUUpdateLive API
The test takes
x86-cpuid-Something-guest.xml CPU (the CPU libvirt would use for
host-model on a CPU described by x86_64-cpuid-Something.xml without
talking to QEMU about what it supports on the host)
and updates it according to CPUID data from QEMU:
x86_64-cpuid-Something-enabled.xml (reported as "feature-words"
property of the CPU device)
and
x86_64-cpuid-Something-disabled.xml (reported as "filtered-features"
property of the CPU device).
The result is compared to
x86_64-cpuid-Something-json.xml (the CPU libvirt would use as
host-model based on the reply from query-cpu-model-expansion).
The comparison is a bit tricky because the *-json.xml CPU contains fewer
disabled features. Only the features which are included in the base CPU
model, but listed as disabled in *.json will be disabled in *-json.xml.
The CPU computed by virCPUUpdateLive from the test data will list all
features present in the host's CPUID data and not enabled in *.json as
disabled. The cpuTestUpdateLiveCompare function checks that the computed
and expected sets of enabled features match.
Jiri Denemark [Fri, 17 Mar 2017 10:46:19 +0000 (11:46 +0100)]
cputest: Disable "cmt" feature unknown to QEMU
All CPU features which QEMU does not know about but libvirt knows them
(currently "cmt" is the only one) are implicitly disabled by QEMU and
should be present in x86_64-cpuid-*-disabled.xml.
Jiri Denemark [Fri, 17 Mar 2017 10:29:02 +0000 (11:29 +0100)]
cputest: Disable TSX on broken models
Commit v3.1.0-26-gd60012b4e started filtering hle and rtm features from
broken Intel Haswell CPUs. QEMU implemented similar functionality and
thus it doesn't report rtm and hle features as enabled for Core i5-4670T
CPU anymore.
Jiri Denemark [Thu, 16 Mar 2017 11:25:30 +0000 (12:25 +0100)]
cputest: Add "diff" command to cpu-cpuid.py
The new command can be used to generate test data for virCPUUpdateLive.
When "cpu-cpuid.py diff x86-cpuid-Something.json" is run, it reads raw
CPUID data stored in x86-cpuid-Something.xml and CPUID data from QEMU
stored in x86-cpuid-Something.json to produce two more CPUID files:
x86-cpuid-Something-enabled.xml and x86-cpuid-Something-disabled.xml.
- x86-cpuid-Something-enabled.xml will contain CPUID bits present in
x86-cpuid-Something.json (i.e., enabled by QEMU for the "host" CPU)
- x86-cpuid-Something-disabled.xml will contain all CPUID bits from
x86-cpuid-Something.xml which are not present in
x86-cpuid-Something.json (i.e., CPUID bits which the host CPU
supports, but QEMU does not enable them for the "host" CPU)
Jiri Denemark [Fri, 17 Mar 2017 14:58:07 +0000 (15:58 +0100)]
cpu: Do not pass virConnectBaselineCPUFlags to cpuBaseline
The public API flags are handled by the cpuBaselineXML wrapper. The
internal cpuBaseline API only needs to know whether it is supposed to
drop non-migratable features.
Jiri Denemark [Fri, 17 Mar 2017 14:57:47 +0000 (15:57 +0100)]
cpu: Move feature expansion out of cpuBaseline
cpuBaseline is responsible for computing a baseline CPU while feature
expansion is done by virCPUExpandFeatures. The cpuBaselineXML wrapper
(used by hypervisor drivers to implement virConnectBaselineCPU API)
calls cpuBaseline followed by virCPUExpandFeatures if requested by
VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES flag.
The features in the three changed test files had to be sorted using
"sort -k 3" because virCPUExpandFeatures returns a sorted list of
features.
Jiri Denemark [Thu, 16 Mar 2017 11:23:50 +0000 (12:23 +0100)]
cpu: Introduce virCPUExpandFeatures
Having to use cpuBaseline with VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES
flag to expand CPU features is strange. Not to mention that cpuBaseline
can only expand host CPU definitions (i.e., it completely ignores
feature policies). The new virCPUExpandFeatures API is designed to work
with both host and guest CPU definitions.
Laine Stump [Mon, 27 Mar 2017 04:40:18 +0000 (00:40 -0400)]
docs: make interface start mode element optional
This brings the libvirt version of this RNG file in line with the same
file in netcf (as soon as the corresponding patch there is ACKed and
pushed).
There's no reason to require it when defining an interface (the config
option it corresponds to is optional), and it isn't even output in the
status of an interface.
Laine Stump [Tue, 7 Mar 2017 19:24:37 +0000 (14:24 -0500)]
util: try *really* hard to set the MAC address of an SRIOV VF
If an SRIOV VF has previously been used for VFIO device assignment,
the "admin MAC" that is stored in the PF driver's table of VF info
will have been set to the MAC address that the virtual machine wanted
the device to have. Setting the admin MAC for a VF also sets a flag in
the PF that is loosely called the "administratively set" flag. Once
that flag is set, it is no longer possible for the net driver of the
VF (either on the host or in a virtual machine) to directly set the
VF's MAC again; this flag isn't reset until the *PF* driver is
restarted, and that requires taking *all* VFs offline, so it's not
really feasible to do.
If the same SRIOV VF is later used for macvtap passthrough mode, the
VF's MAC address must be set, but normally we don't unbind the VF from
its host net driver (since we actually need the host net driver in
this case). Since setting the VF MAC directly will fail, in the past
"we" ("I") had tried to fix the problem by simply setting the admin MAC
(via the PF) instead. This *appeared* to work (and might have at one
time, due to promiscuous mode being turned on somewhere or something),
but it currently creates a non-working interface because only the
value for admin MAC is set to the desired value, *not* the actual MAC
that the VF is using.
Earlier patches in this series reverted that behavior, so that we once
again set the MAC of the VF itself for macvtap passthrough operation,
not the admin MAC. But that brings back the original bug - if the
interface has been used for VFIO device assignment, you can no longer
use it for macvtap passthrough.
This patch solves that problem by noticing when virNetDevSetMAC()
fails for a VF, and in that case it sets the desired MAC to the admin
MAC via the PF, then "bounces" the VF driver (by unbinding and the
immediately rebinding it to the VF). This causes the VF's MAC to be
reinitialized from the admin MAC, and everybody is happy (until the
*next* time someone wants to set the VF's MAC address, since the
"administratively set" bit is still turned on).
Laine Stump [Mon, 6 Mar 2017 18:59:29 +0000 (13:59 -0500)]
util: if setting admin MAC to 00:00:00:00:00:00 fails, try 02:00:00:00:00:00
Some PF drivers allow setting the admin MAC (that is the MAC address
that the VF will be initialized to the next time the VF's driver is
loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers
initialize the admin MACs to all 0, but don't allow setting it to that
very same value. It has been an uphill battle convincing the driver
people that it's reasonable to expect The argument that's used is
that an all 0 device MAC address on a device is invalid; however, from
an outsider's point of view, when the admin MAC is set to 0 at the
time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a
random non-0 value. But that's beside the point - even if I could
convince one or two SRIOV driver maintainers to permit setting the
admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a
failure to set the admin MAC due to an all 0 value, and retries it
with 02:00:00:00:00:00. That won't result in a random value being set
in the VF MAC at next VF driver init, but that's okay, because we
always want to set a specific value anyway. Rather, the "almost 0"
setting makes it easy to visually detect from the output of "ip link
show" which VFs are currently in use and which are free.
Laine Stump [Mon, 6 Mar 2017 15:01:02 +0000 (10:01 -0500)]
util: remove unused functions from virnetdev.c
The global functions virNetDevReplaceMacAddress(),
virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and
virNetDevRestoreNetConfig() are no longer used, as their functionality
has been replaced by virNetDev(Save|Read|Set)NetConfig().
The static functions virNetDevReplaceVfConfig() and
virNetDevRestoreVfConfig() were only used by the above-named global
functions that were removed.
Laine Stump [Mon, 6 Mar 2017 14:36:40 +0000 (09:36 -0500)]
util: after hostdev assignment, restore VF MAC address via setting admin MAC
It takes longer to explain this than to fix it...
In the past we weren't able to save the VF's own MAC address *at all*
when using it for hostdev assignment, because we had already unbound
the VF from the host net driver prior to saving its config. With the
previous patch, that problem has been solved, so we now have the VF's
MAC address saved and can move on to the *next* problem, which is twofold:
1) during teardown we restore the config before we've re-bound, so the
VF doesn't have a net driver, and thus we can't set its MAC address
directly.
2) even if we delay restoring the config until the VF is bound to a
net driver, the request to set its MAC address would fail, since
(during device setup) we had set the "admin MAC" for the VF via an
RTM_SETLINK to the PF - once you've set the admin MAC for a VF, the
VF driver (either on host or on guest) is not allowed to change the
VF's MAC address "forever" (well, until you reload the PF driver,
but that requires destroying and recreating every single VF, which
isn't something you can require).
The solution is to keep the restoration of config at the same place,
but to set the *admin MAC* to the address you want the VF to have -
when the VF net driver is later initialized (as a part of re-binding
to the VF net driver) its MAC will be initialized to the current value
of the admin MAC.
Laine Stump [Tue, 28 Feb 2017 21:44:26 +0000 (16:44 -0500)]
util: save hostdev network device config before unbinding from host driver
In order to properly restore the original state of an SRIOV VF when
we're finished with it, we need to save the MAC address of the VF
itself (not just the admin MAC address for the VF that is stored in
the PF). But that can only be done when the VF is still bound to the
host's netdev driver, and we have always done the saving of device
config after the VF is already bound to vfio-pci. This patch prepares
us for adding a save of the VF's MAC by calling the function that
saves netconfig earlier in the device preparation, before we've
unbound it from the host netdev driver.
Laine Stump [Tue, 28 Feb 2017 18:47:12 +0000 (13:47 -0500)]
util: replace virHostdevNetConfigReplace with ...(Save|Set)NetConfig()
These two operations will need to be separated so that saving of the
original config is done before detaching the host net driver, and
setting the new config is done after attaching vfio-pci. This patch
splits the single function into two, but for now calls them together
(to make bisecting easier if there is a regression).
Laine Stump [Mon, 6 Mar 2017 01:28:08 +0000 (20:28 -0500)]
util: use new virNetDev*NetConfig() functions for hostdev setup/teardown
virHostdevNetConfigReplace() and virHostdevNetConfigRestore() are
modified to use the new virNetDev*NetConfig() functions.
Note that due to the VF's original MAC addresses being saved after it
has already been un-bound from the host net driver, the actual current
VF MAC address won't be saved (because it no longer exists) - only the
"admin MAC" will be saved. This reflects existing behavior that will
be fixed in an upcoming patch.
Laine Stump [Sun, 5 Mar 2017 22:43:39 +0000 (17:43 -0500)]
util: use new virNetDev*NetConfig() functions for macvtap setup/teardown
This patch modifies the macvtap passthrough setup to use
virNetDevSaveNetConfig()+virNetDevSetConfig() instead of
virNetDevReplaceNetConfig() or virNetDevReplaceMacAddress(), and the
teardown to use virNetDevReadNetConfig()+virNetDevSetConfig() instead
of virNetDevRestoreNetConfig() or virNetDevRestoreMacAddress().
Since the older functions only saved/restored the admin MAC and vlan
tag (which is incorrect) and the new functions save/restore the VF's
own MAC address and vlan tag (correct), this actually fixes a bug
(which was introduced by commit cb3fe38c7, which was itself supposed
to be a fix for https://bugzilla.redhat.com/1113474 ).
The downside to this patch is that it causes an *apparent* regression
in that bug (because there will once again be an error reported if the
interface had previously been used for VFIO device assignment), but in
reality, the code hasn't been working for *any* case before this
current patch (at least not with any recent kernel). Anyway, that
"regression" will be fixed with an upcoming patch that fixes it the
*right* way.
Laine Stump [Mon, 20 Feb 2017 21:14:53 +0000 (16:14 -0500)]
util: new functions virNetDev(Save|Read|Set)NetConfig()
These three functions are destined to replace
virNetDev(Replace|Restore)NetConfig() and
virNetDev(Replace|Restore)MacAddress(), which both do the save and set
together as a single step. We need to separate the save, read, and set
steps because there will be situations where we need to do something
else in between (in particular, we will need to rebind a VF's driver
after save but before set).
This patch creates the new functions, but doesn't call them - that
will come in a subsequent patch. Note that the new functions to
read/write the file that stores the original network config now uses
JSON rather than plaintext (it still recognizes the old format as well
though, so it won't get confused during an upgrade).
Peter Krempa [Mon, 20 Mar 2017 16:55:56 +0000 (17:55 +0100)]
qemu: Log additional data from hyperv crash notifier
The hyperv panic notifier reports additional data in form of 5 registers
that are reported in the crash event from qemu. Log them into the VM log
file and report them as a warning so that admins can see the cause of
crash of their windows VMs.
Erik Skultety [Fri, 3 Feb 2017 13:04:59 +0000 (14:04 +0100)]
hostdev: Maintain a driver list of active mediated devices
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Erik Skultety [Fri, 3 Feb 2017 12:43:48 +0000 (13:43 +0100)]
qemu: Assign PCI addresses for mediated devices as well
So far, the official support is for x86_64 arch guests so unless a
different device API than vfio-pci is available let's only turn on
support for PCI address assignment. Once a different device API is
introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Erik Skultety [Fri, 3 Feb 2017 12:29:37 +0000 (13:29 +0100)]
conf: Enable cold-plug of a mediated device
This merely introduces virDomainHostdevMatchSubsysMediatedDev method that
is supposed to check whether device being cold-plugged does not already
exist in the domain configuration.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Erik Skultety [Tue, 31 Jan 2017 16:26:36 +0000 (17:26 +0100)]
conf: Introduce new hostdev device type mdev
A mediated device will be identified by a UUID (with 'model' now being
a mandatory <hostdev> attribute to represent the mediated device API) of
the user pre-created mediated device. We also need to make sure that if
user explicitly provides a guest address for a mdev device, the address
type will be matching the device API supported on that specific mediated
device and error out with an incorrect XML message.
Just a tiny wrapper over the SCSI def clearing logic to drop some
if-else branches from a switch, mainly because extending the switch in
the future would render the current code with branching less readable.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Enforce virDomainHostdevSubsysType checking during compilation. Again,
one of a few spots in our code where we should enforce the typecast to
the enum type, thus not forgetting to update *all* switch occurrences
dealing with the give enum.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Eric Blake [Mon, 27 Mar 2017 13:11:26 +0000 (08:11 -0500)]
util: fix build on RHEL 6
We keep forgetting that older setups don't like 'index':
CC util/libvirt_util_la-virsysinfo.lo
cc1: warnings being treated as errors
util/virstoragefile.c: In function 'virStorageSourceFindByNodeName':
util/virstoragefile.c:3804: error: declaration of 'index' shadows a global declaration [-Wshadow]
/usr/include/string.h:489: error: shadowed declaration is here [-Wshadow]
Instead of generating all of the capabilities, let's test more of our
code by probing sysfs data. This test needs quite some mocking for
now, but it paves the road for more future enhancements (hugepages
probing, for example).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
All mocked functions are related to numactl/virNuma and rely only on
virsysfs, so the paths they touch can be nicely controlled. And
because it is so nicely self-contained NUMA mock, it is named
numamock (instead of naming it after the test that will use it first).
We need top level API mock because some APIs might call libnuma
directly, e.g. virNumaIsAvailable(), virNumaGetMaxNode().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That file has only two exported files and each one of them has
different naming. virNode is what all the other files use, so let's
use it. It wasn't used before because the clash with public API
naming, so let's fix that by shortening the name (there is no other
private variant of it anyway).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no "node driver" as there was before, drivers have to do
their own ACL checking anyway, so they all specify their functions and
nodeinfo is basically just extending conf/capablities. Hence moving
the code to src/conf/ is the right way to go.
Also that way we can de-duplicate some code that is in virsysfs and/or
virhostcpu that got duplicated during the virhostcpu.c split. And
Some cleanup is done throughout the changes, like adding the vir*
prefix etc.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no reason for it not to be in the utils, all global symbols
under that file already have prefix vir* and there is no reason for it
to be part of DRIVER_SOURCES because that is just a leftover from
older days (pre-driver modules era, I believe).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
tests: Add cpu/{online,present} files for old tests
The functionality these tests partially relied on (scanning the cpu
directory for cpu[0-9]+ subdirectories) is going to be removed, so we
need additional files that are present on all non-medieval systems.
Removing all these tests would be an option but we would lose the
ability to test the topologies. Even though we just extract number of
sockets/cores/threads from all these directory trees.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers are doing just a read and covert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Guests are handled in callers, but if something goes wrong (when it
cannot be added to virCapabilities, for example), there's no way for
them to free it properly.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Both QEMU and bhyve are using the same function for setting up the CPU
in virCapabilities, so de-duplicate it, save code and time, and help
other drivers adopt it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
STREQ_NULLABLE returns true if both parameters are NULL. And that's
not what we want here. We just want to skop comparing source nodes
that don't have that info set. The function wouldn't make much sense
with nodeName == NULL, so we don't need to check that. Moreover, the
function's declaration uses ATTRIBUDE_NONNULL for nodeName, which not
only means that function expects the parameter not to be NULL, but
actually tells the compiler that it can optimize out the NULL checks.
That way it could end up calling strcmp on NULL (either nodeformat or
nodebacking). GCC figures this out if libvirt is compiled with
lv_cv_static_analysis=yes, unfortunately not everyone uses that.
Peter Krempa [Thu, 16 Mar 2017 13:37:56 +0000 (14:37 +0100)]
qemu: stats: Display the block threshold size in bulk stats
Management tools may want to check whether the threshold is still set if
they missed an event. Add the data to the bulk stats API where they can
also query the current backing size at the same time.
Peter Krempa [Thu, 16 Mar 2017 11:30:16 +0000 (12:30 +0100)]
qemu: block: Add code to fetch block node data by node name
To allow updating stats based on the node name, add a helper function
that will fetch the required data from 'query-named-block-nodes' and
return it in hash table for easy lookup.
Peter Krempa [Wed, 15 Mar 2017 12:03:21 +0000 (13:03 +0100)]
qemu: block: Add code to detect node names when necessary
Detect the node names when setting block threshold and when reconnecting
or when they are cleared when a block job finishes. This operation will
become a no-op once we fully support node names.
Peter Krempa [Thu, 23 Feb 2017 18:36:52 +0000 (19:36 +0100)]
qemu: monitor: Extract the top level format node when querying disks
To allow matching the node names gathered via 'query-named-block-nodes'
we need to query and then use the top level nodes from 'query-block'.
Add the data to the structure returned by qemuMonitorGetBlockInfo.
Peter Krempa [Tue, 14 Mar 2017 15:20:47 +0000 (16:20 +0100)]
tests: qemumonitorjson: Add relative image names for node name detection
oVirt uses relative names with directories in them. Test such
configuration. Also tests a snapshot done with _REUSE_EXTERNAL and a
relative backing file pre-specified in the qcow2 metadata.
Peter Krempa [Tue, 14 Mar 2017 14:02:11 +0000 (15:02 +0100)]
tests: qemumonitorjson: Add case for two disks sharing a backing image
Since we have to match the images by filename a common backing image
will break the detection process. Add a test case to see that the code
correctly did not continue the detection process.
Peter Krempa [Mon, 13 Mar 2017 11:46:18 +0000 (12:46 +0100)]
qemu: block: Add code to allow detection of auto-allocated node names
qemu for some time already sets node names automatically for the block
nodes. This patch adds code that attempts a best-effort detection of the
node names for the backing chain from the output of
'query-named-block-nodes'. The only drawback is that the data provided
by qemu needs to be matched by the filename as seen by qemu and thus
if two disks share a single backing store file the detection won't work.
This will allow us to use qemu commands such as
'block-set-write-threshold' which only accepts node names.
In this patch only the detection code is added, it will be used later.
Peter Krempa [Fri, 24 Feb 2017 13:59:40 +0000 (14:59 +0100)]
qemu: monitor: Add monitor infrastructure for query-named-block-nodes
Add monitor tooling for calling query-named-block-nodes. The monitor
returns the data as the raw JSON array that is returned from the
monitor.
Unfortunately the logic to extract the node names for a complete backing
chain will be so complex that I won't be able to extract any meaningful
subset of the data in the monitor code.