Peter Krempa [Mon, 9 Sep 2019 08:55:20 +0000 (10:55 +0200)]
virsh: Don't open-code virJSONStringReformat in cmdQemuMonitorCommand
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 9 Sep 2019 08:51:25 +0000 (10:51 +0200)]
virsh: Use VIR_AUTO machinery in cmdQemuMonitorCommand
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 9 Sep 2019 08:41:15 +0000 (10:41 +0200)]
virsh: Use virshDomain type in 'inject-nmi'
With a nice side-effect of fixing alignment.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 9 Sep 2019 08:39:04 +0000 (10:39 +0200)]
virsh: demonstrate use of VIR_AUTOPTR(virshDomain) on 'send-process-signal'
Refactor the command code to use the new type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 9 Sep 2019 08:36:39 +0000 (10:36 +0200)]
virsh: Allow using VIR_AUTOPTR for releasing virDomainPtr in virsh
I opted to alias the 'virDomainType' to 'virshDomain' so that it's
obvious in all cases that this is a virsh-only construct. This is also
somewhat consistent with virsh's use of 'virshDomainFree' wrapper for
the freeing function which actually accepts NULL.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove windows thread implementation in favour of pthreads
Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The fact that qemu is capable -netdev socket is not enough to
start a migratable domain. It also needs dbus-vmstate capability.
Since there are already some qemu releases which have
net-socket-dgram capability and don't have dbus-vmstate we need
to check for dbus-vmstate.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
remote: pass identity across to newly opened daemons
When opening a connection to a second driver inside the daemon, we must
ensure the identity of the current user is passed across. This allows
the second daemon to perform access control checks against the real end
users, instead of against the libvirt daemon that's proxying across the
API calls.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
util: store identity attrs as virTypedParameter internally
We'll shortly be exposing the identity as virTypedParameter in the
public header, so it simplifies life to use that as the internal
representation too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
util: sanitize return values for virIdentity getters
The virIdentity getters are unusual in that they return -1 to indicate
"not found" and don't report any error. Change them to return -1 for
real errors, 0 for not found, and 1 for success.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Only expose the type safe getters/setters to other code in preparation
for changing the internal storage of data.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
tests: fix debug messages wrt selinux context when test fails
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Remove the "UNIX" tag from the names for user name, group name,
process ID and process time, since these attributes are all usable
for non-UNIX platforms like Windows.
User ID and group ID are left with a "UNIX" tag, since there's no
equivalent on Windows. The closest equivalent concept on Windows,
SID, is a struct containing a number of integer fields, which is
commonly represented in string format instead. This would require
a separate attribute, and is left for a future exercise, since
the daemons are not currently built on Windows anyway.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
api: introduce virConnectSetIdentity for passing uid, gid, selinux info
When using the fine grained access control mechanism for APIs, when a
client connects to libvirtd, the latter will fetch the uid, gid, selinux
info of the remote client on the UNIX domain socket. This is then used
as the identity when checking ACLs.
With the new split daemons things are a bit more complicated. The user
can connect to virtproxyd, which in turn connects to virtqemud. When
virtqemud requests the identity over the UNIX domain socket, it will
get the identity that virtproxyd is running as, not the identity of
the real end user/application.
virproxyd knows what the real identity is, and needs to be able to
forward this information to virtqemud. The virConnectSetIdentity API
provides a mechanism for doing this. Obviously virtqemud should not
accept such identity overrides from any client, it must only honour it
from a trusted client, aka one running as the same uid/gid as itself.
The typed parameters exposed in the API are the same as those currently
supported by the internal virIdentity class, with a few small name
changes.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Sat, 14 Sep 2019 07:40:29 +0000 (09:40 +0200)]
qemu: blockjob: Refuse to register blockjob if disk already has one
Most code paths prevent starting a blockjob if we already have one but
the job registering function does not do this check. While this isn't a
problem for regular cases we had a bad test case where we registered two
jobs for a single disk which leaked one of the jobs. Prevent this in the
registering function until we allow having multiple jobs per disk.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> ACKed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Peter Krempa [Sat, 14 Sep 2019 07:54:07 +0000 (09:54 +0200)]
tests: qemustatusxml2xml: Fix disk target mess
There were accidentally two disks with 'vdc' target with corresponding
blockjobs which made libvirt leak some references as there are not
supposed to be two blockjobs for a single disk. Fix this mess by
renaming some of the disks.
In addition the block job names also didn't correspond to the naming
convetion which also includes the disk target. Fix it as well.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> ACKed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Laine Stump [Fri, 13 Sep 2019 01:22:30 +0000 (21:22 -0400)]
qemu: call common NetDef validation for hotplug and device update
qemuDomainAttachNetDevice() (hotplug) previously had some of the
validation that is in qemuDomainValidateActualNetDef(), but it was
incomplete. qemuDomainChangeNet() had none of that validation, but it
is all appropriate in both cases.
This is the final piece of a previously partial resolution to
https://bugzilla.redhat.com/1502754
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Thu, 12 Sep 2019 22:25:21 +0000 (18:25 -0400)]
qemu: move runtime netdev validation into a separate function
The same validation should be done for both static network devices and
hotplugged devices, but they are currently inconsistent. Move all the
relevant validation from qemuBuildInterfaceCommandLine() into the new
function qemuDomainValidateActualNetDef() and call the latter from
the former.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Mon, 9 Sep 2019 15:50:39 +0000 (09:50 -0600)]
apparmor: avoid copying empty profile name
AppArmorGetSecurityProcessLabel copies the VM's profile name to the
label member of virSecurityLabel struct. If the profile is not loaded,
the name is set empty before calling virStrcpy to copy it. However,
virStrcpy will fail if src is empty (0 length), causing
AppArmorGetSecurityProcessLabel to needlessly fail. Simple operations
that report security driver information will subsequently fail
virsh dominfo test
Id: 248
Name: test
...
Security model: apparmor
Security DOI: 0
error: internal error: error copying profile name
Avoid copying an empty profile name when the profile is not loaded.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
network: add missing bandwidth limits for bridge forward type
In the case of a network with forward=bridge, which has a bridge device
listed, we are capable of setting bandwidth limits but fail to call the
function to register them.
Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Unfortunately the wrong version of this patch was posted and
reviewed and thus it lacked the code to actually apply the
bandwidth settings to the bridge itself.
Reviewed-by: Laine Stump <laine@laine.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
network: fix connection usage counts after restart
Since the introduction of the virNetworkPort object, the network driver
has a persistent record of ports that have been created against the
networks. Thus the hypervisor drivers no longer communicate to the
network driver during libvirtd restart.
This change, however, meant that the connection usage counts were
no longer re-initialized during a libvirtd restart. To deal with this we
must iterate over all virNetworkPortDefPtr objects we have and invoke
the notify callback to record the connection usage count.
Reviewed-by: Laine Stump <laine@laine.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
conf: simplify link from hostdev back to network device
hostdevs have a link back to the original network device. This is fairly
generic accepting any type of device, however, we don't intend to make
use of this approach in future. It can thus be specialized to network
devices.
Reviewed-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
which mistakenly deleted the assignment to the 'net' variable,
which meant we never invoked the network driver release callback
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
util: remove several unused _QUIET allocation macro variants
Only a few of the _QUIET allocation macros are used. Since we're no
longer reporting OOM as errors, we want to eliminate all the _QUIET
variants. This starts with the easy, unused, cases.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The functions are left returning an "int" to avoid an immediate
big-bang cleanup. They'll simply never return anything other
than 0, except for virInsertN which can still return an error
if the requested insertion index is out of range. Interestingly
in that case, the _QUIET function would none the less report
an error.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The OOM handling requires special build time options which we never
enable in our CI. Even once enabled the tests are incredibly slow and
typically require manual inspection of the results to weed out false
positives.
Since there was previous agreement to switch to abort on OOM in libvirt
code, there's no point continuing to keep the unused OOM testing code.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
conf: avoid looking up network port that doesn't exist
If the hypervisor driver has not yet created the network port, the
portid field will be "00000000-0000-0000-0000-000000000000".
If a failure occurs during early VM startup, the hypervisor driver may
none the less try to release the network port, resulting in an
undesirable warning:
2019-09-12 13:17:42.349+0000: 16544: error :
virNetworkObjLookupPort:1679 : network port not found: Network port with
UUID 00000000-0000-0000-0000-000000000000 does not exist
By checking if the portid UUID is valid, we can avoid polluting the logs
in this way.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Jiang Kun [Thu, 12 Sep 2019 08:05:39 +0000 (16:05 +0800)]
node_device_conf: Don't leak @physical_function in virNodeDeviceGetPCISRIOVCaps
The pci_dev->physical_function is rewritten in
virPCIGetPhysicalFunction() to a newly allocated pointer.
Therefore, we must free the old one to avoid memleak.
Signed-off-by: Jiang kun <jiang.kun2@zte.com.cn> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The LIBVIRT_RESULT function takes two or three arguments. The
first one is the name of the result (aka CHECK_NAME). It is
printed before the colon character. The rest of the arguments is
printed after the character. To produce colourized output a
couple of changes needs to be made.
Firstly, we need to print the CHECK_NAME using "echo -n" so that
the new line is not appended at the end of the message. To
achieve this, AS_MESSAGE_N function is introduced. It's a
verbatim copy of AS_MESSAGE (which is just another alias to
AC_MSG_NOTICE) except it doesn't put '\n' at the EOL.
The alias is defined at /usr/share/autoconf-*/autoconf/general.m4
and the AS_MESSAGE is then defined at
/usr/share/autoconf-2.69/m4sugar/m4sh.m4.
Secondly, the rest of the arguments are printed colourized and to
achieve that and also keep printing them into the log file the
_AS_ECHO and COLORIZE_RESULT functions need to be called.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Now that we have qemuFirmwareGetSupported() so that it also
returns a list of FW image paths, we can use it to report them in
domain capabilities instead of the old time default list.
qemufirmwaretest: Test FW path getting through qemuFirmwareGetSupported()
There is one hack hidden here, but since this is in a test, it's
okay. In order to get a list of expected firmwares in
virFirmwarePtr form I'm using virFirmwareParseList(). But
usually, in real life scenario, this function is used only to
parse a list of UEFI images which have NVRAM split out. In other
words, this function expects ${FW}:${NVRAM} pairs. But in this
test, we also want to allow just a single path: ${FW} because
some reported firmwares are just a BIOS image really. To avoid
writing some parser function, let's just pass "NULL" as ${NVRAM}
and fix the result later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>
qemu_firmware: Extend qemuFirmwareGetSupported to return FW paths
The qemuFirmwareGetSupported() function is called from qemu
driver to generate domain capabilities XML based on FW descriptor
files. However, the function currently reports only some features
from domcapabilities XML and not actual FW image paths. The paths
reported in the domcapabilities XML are still from pre-FW
descriptor era and therefore the XML might be a bit confusing.
For instance, it may say that secure boot is supported but
secboot enabled FW is not in the listed FW image paths.
To resolve this problem, change qemuFirmwareGetSupported() so
that it also returns a list of FW images (we have the list
anyway). Luckily, we already have a structure to represent a FW
image - virFirmware.
virfirmware: Expose and define autoptr for virFirmwareFree
This function frees a _virFirmware struct. So far, it doesn't
need to be called from outside of the module, but this will
change shortly. In the light of recent VIR_DEFINE_AUTOPTR_FUNC()
additions, do the same to virFirmwareFree().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>
The times, when we had small CRTs are long gone. Now, in the era
of wide screens we can be more generous when it comes to aligning
the output of configure. The longest string before the colon is
'wireshark_dissector' which counts 19 characters. Therefore,
align the strings at 20.
At the same time, drop the useless result alignment. It behaves
oddly - it puts a space at the end of each "no" because of the
%-3s format we use.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>
docs: Expand the "BIOS bootloader" documentation for domainCaps
Rewrite some parts for clarity, elaborate the meaning of some of the XML
attributes. And where necessary, distinguish that we're dealing with
two different XML documents here:
- the domainCapabilities XML, to detect the host "hypervisor"
(QEMU/KVM) capabilities, and what libvirt knows about them.
- the guest XML definition, i.e. what features a guest can use, based
on the capabilities (of QEMU and libvirt and the host) reported in
the domainCapabilities XML.
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
libvirt.spec.in: Add the Secure Boot-variant OVMF binaries
Currently the RPM spec doesn't add the 'secboot'-variant OVMF binaries
(an unintentional omission, checking with Cole on #virt, OFTC) for
'x86_64' and 'ia32'. Add them.
This way, getDomainCapabilities() will report all the OVMF binaries that
are present on the system. E.g. on Fedora 29, if you only have the
edk2-ovmf-20190308stable-1.fc29.noarch package installed, then running
`virsh domcapabilities` will enumerate _both_ the OVMF binaries (instead
of just the OVMF_CODE.fd):
snapshot: Store both config and live XML in the snapshot domain
The snapshot-create operation of running guests saves the live
XML and uses it to replace the active and inactive domain in
case of revert. So, the config XML is ignored by the snapshot
process. This commit changes it and adds the config XML in the
snapshot XML as the <inactiveDomain> entry.
In case of offline guest, the behavior remains the same and the
config XML is saved in the snapshot XML as <domain> entry. The
behavior of older snapshots of running guests, that don't have
the new <inactiveDomain>, remains the same too. The revert, in
this case, overrides both active and inactive domain with the
<domain> entry. So, the <inactiveDomain> in the snapshot XML is
not required to snapshot work, but it's useful to preserve the
config XML of running guests.
Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
qemu: formatting XML from domain def choosing the root name
The function virDomainDefFormatInternal() has the predefined root name
"domain" to format the XML. But to save both active and inactive domain
in the snapshot XML, the new root name "inactiveDomain" was created.
So, the new function virDomainDefFormatInternalSetRootName() allows to
choose the root name of XML. The former function became a tiny wrapper
to call the new function setting the correct parameters.
Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Eric Blake [Mon, 9 Sep 2019 20:52:42 +0000 (15:52 -0500)]
qemu: Fix regression in snapshot-revert
Commit f10562799 introduced a regression: if reverting to a snapshot
fails early (such as when we refuse to revert to an external
snapshot), we lose track of the domain's current snapshot.
Before that patch, we were tracking the notion of the domain's current
snapshot via two means: vm->current_snapshot (which was left untouched
on early exit) and snap->def->current (which only controls what gets
written to XML to remember snapshots across libvirtd restarts). That
patch was fixing a real bug: if a revert operation failed early, later
questions from the same libvirtd did not see any change to the current
snapsthot, but restarting libvirtd would now claim there is no current
snapshot. But it fixed it in the wrong direction, in that the current
snapshot was forgotten unconditionally, rather than only when the
snapshot to revert to has a chance of being useful.
It didn't help that the code after that patch had two separate spots
clearing the old notion of the current snapshot - one after
determining the snapshot to revert to was viable, the other
unconditionally on all failure exit paths. At any rate, the fix is
simple: drop the unconditional cleanup on error paths, and rely only
on the normal cleanup after early checks.
Sadly, it is not possible to test this bug in the existing
tests/virsh-snapshot, as the test driver does not have the same
prohibition against reverting to an external snapshot as the qemu
driver.
See: https://bugzilla.redhat.com/1738747 Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190909205242.15406-1-eblake@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Michal Privoznik [Tue, 10 Sep 2019 09:24:19 +0000 (11:24 +0200)]
virnetdevmacvlan: Provide stubs for macvlan related functions
In recent commit of 3d21ff72e0e the virNetDevMacVLanTapOpen() and
virNetDevMacVLanTapSetup() functions were exported in our private
symbols. But these functions live in an #ifdef so they need a
stub implementation.
Then in 1b46566ee the virNetDevMacVLanIsMacvtap() function was
implemented but again, only for #idef and without stub.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
util: activate directory override when used from library
The Perl bindings for libvirt use the test driver for unit tests. This
tries to load the cpu_map/index.xml file, and when run from an
uninstalled build will fail.
The problem is that virFileActivateDirOverride is called by our various
binaries like libvirtd, virsh, but is not called when a 3rd party app
uses libvirt.so
To deal with this we allow the LIBVIRT_DIR_OVERRIDE=1 env variable to be
set and make virInitialize look for this. The 'run' script will set it,
so now build using this script to run against an uninstalled tree we
will correctly resolve files to the source tree.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This commit broke virpolkittest on Ubuntu 18 which has an old
dbus (v1.12.2). Any other distro with the recent one works
(v1.12.16) which hints its a bug in dbus somewhere. Revert the
commit to stop tickling it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
qemu_conf: Drop a pair of needless 'cleanup' labels
There are two 'cleanup' labels - one in
virQEMUDriverConfigHugeTLBFSInit() and the other in
virQEMUDriverConfigSetDefaults() that do nothing more than
return and integer value. No memory freeing or anything important
is done there. Drop them in favour of returning immediately.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Our naming rules prefer qemuObjectOperation() scheme rather than
qemuOperationObject() for function names. These were not honoured
in recent commits to qemu_conf.c.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Laine Stump [Mon, 26 Aug 2019 17:05:19 +0000 (13:05 -0400)]
qemu: support unmanaged macvtap devices with <interface type='ethernet'>
Traditionally, macvtap devices are supported using <interface
type='direct'>, but that type requires specifying a source device name
and macvtap mode which can't be altered after the initial device
creation (and may not even be available to the management software
that's creating the XML config to feed to libvirt).
But the attributes in the <source> are essentially describing how the
device will be connected to the network, and if libvirt is to be
supplied with the name of a macvtap device that has already been
created, that device will also already be connected to the network
(and the connection can't be changed). Thus it seems more appropriate
to use type='ethernet', which was created explicitly for this purpose
- for devices that have already been (or will be) connected to the
external network by someone/something outside of libvirt. The fact
that it is a *macv*tap rather than a contentional tap device is just a
detail.
This patch supports using an existing macvtap device with <interface
type='ethernet'> by checking the supplied target dev name to see if it
is a macvtap device and, when this is the case, calling
virNetDevMacVLanTapOpen() instead of virNetDevTapCreate(). For
consistency, this is only done when target managed='no'.
Laine Stump [Mon, 26 Aug 2019 04:24:34 +0000 (00:24 -0400)]
qemu: support unmanaged target tap dev for <interface type='ethernet'>
If managed='no', then the tap device must already exist, and setting
of MAC address and online status (IFF_UP) is skipped.
NB: we still set IFF_VNET_HDR and IFF_MULTI_QUEUE as appropriate,
because those bits must be properly set in the TUNSETIFF we use to set
the tap device name of the handle we've opened - if IFF_VNET_HDR has
not been set and we set it the request will be honored even when
running libvirtd unprivileged; if IFF_MULTI_QUEUE is requested to be
different than how it was created, that will result in an error from
the kernel. This means that you don't need to pay attention to
IFF_VNET_HDR when creating the tap devices, but you *do* need to set
IFF_MULTI_QUEUE if you're going to use multiple queues for your tap
device.
NB2: /dev/vhost-net normally has permissions 600, so it can't be
opened by an unprivileged process. This would normally cause a warning
message when using a virtio net device from an unprivileged
libvirtd. I've found that setting the permissions for /dev/vhost-net
permits unprivileged libvirtd to use vhost-net for virtio devices, but
have no idea what sort of security implications that has. I haven't
changed libvrit's code to avoid *attempting* to open /dev/vhost-net -
if you are concerned about the security of opening up permissions of
/dev/vhost-net (probably a good idea at least until we ask someone who
knows about the code) then add <driver name='qemu'/> to the interface
definition and you'll avoid the warning message.
Note that virNetDevTapCreate() is the correct function to call in the
case of an existing device, because the same ioctl() that creates a
new tap device will also open an existing tap device.
Laine Stump [Wed, 21 Aug 2019 20:42:41 +0000 (16:42 -0400)]
conf: new "managed" attribute for target dev of <interface type='ethernet'>
Although <interface type='ethernet'> has always been able to use an
existing tap device, this is just a coincidence due to the fact that
the same ioctl is used to create a new tap device or get a handle to
an existing device.
Even then, once we have the handle to the device, we still insist on
doing extra setup to it (setting the MAC address and IFF_UP). That
*might* be okay if libvirtd is running as a privileged process, but if
libvirtd is running as an unprivileged user, those attempted
modifications to the tap device will fail (yes, even if the tap is set
to be owned by the user running libvirtd). We could avoid this if we
knew that the device already existed, but as stated above, an existing
device and new device are both accessed in the same manner, and
anyway, we need to preserve existing behavior for those who are
already using pre-existing devices with privileged libvirtd (and
allowing/expecting libvirt to configure the pre-existing device).
In order to cleanly support the idea of using a pre-existing and
pre-configured tap device, this patch introduces a new optional
attribute "managed" for the interface <target> element. This
attribute is only valid for <interface type='ethernet'> (since all
other interface types have mandatory config that doesn't apply in the
case where we expect the tap device to be setup before we
get it). The syntax would look something like this:
This patch just adds managed to the grammar and parser for <target>,
but has no functionality behind it.
(NB: when managed='no' (the default when not specified is 'yes'), the
target dev is always a name explicitly provided, so we don't
auto-remove it from the config just because it starts with "vnet"
(VIR_NET_GENERATED_TAP_PREFIX); this makes it possible to use the
same pattern of names that libvirt itself uses when it automatically
creates the tap devices.)
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Laine Stump [Tue, 27 Aug 2019 16:18:35 +0000 (12:18 -0400)]
qemu: reorganize qemuInterfaceEthernetConnect()
This just moves around a few things in qemuInterfaceConnect() with no
functional difference (except that a few failures that would have
previously resulted in a "success" audit log will now properly produce
a "fail" audit). The change is so that adding support for unmanaged
tap/macvtap devices will be more easily reviewable.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Laine Stump [Mon, 26 Aug 2019 05:51:40 +0000 (01:51 -0400)]
util: make a couple virNetDevMacVlan*() functions public
In virNetDevMacVLanOpen(), The "retries" arg has been removed and the
value hardcoded as 10, since previously the function was only called
from one place, so it was always 10.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Laine Stump [Mon, 26 Aug 2019 05:24:08 +0000 (01:24 -0400)]
util: new function virNetDevMacVLanIsMacvtap()
This function returns T if the given name is a macvtap device. This is
determined by 1) getting the ifindex of the device with that name (if
there is one), and 2) checking for existence of /dev/tapXX, where "XX"
is the ifindex learned in (1).
It's also possible to learn this by getting a netlink dump of the
interface and parsing through it to look for some attributes, but that
is complicated to figure out, takes longer to execute, and I'm lazy.
Signed-off-by: Laine Stump <laine@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
tests: Add a baseline test for multifunction pci device use case
There are already good number of test cases with hostdevices,
few have multifunction devices but none having more than one
than one multifunction cards.
This patch adds a case where there are two multifunction cards
and two Virtual functions part of the same XML.
0001:01:00.X & 0005:09:00.X - are Multifunction PCI cards.
0000:06:12.[5|6] - are SRIOV Virtual functions
Future commits will improve on automatically detecting the
multifunction cards and auto-assinging the addresses
appropriately.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Previous patch had to add '/sys/kernel/' prefix in opendir() because
the path, which is being mocked, wasn't being considered due to
an 'if SYSFS_PCI_PREFIX' guarding the call to getrealpath().
In fact, all current getrealpath() callers are guarding it with a
conditional to ensure that the function will never be called with
a non-mocked path. In this case, an extra non-NULL verification is
needed for the 'newpath' string to use the variable - which is
counterintuitive, given that getrealpath() will always write the
'newpath' string in any non-error conditon.
However, simply removing the guard of all getrealpath() instances
causes an abort in init_env(). This happens because tests will
execute access() to non-mocked paths even before the
LIBVIRT_FAKE_ROOT_DIR variable is declared in the test files. We
don't need 'fakerootdir' to be created at this point though.
This patch does the following changes to simplify getrealpath()
usage:
- getrealpath() will now guard the init_env() call by checking if
both fakeroot isn't created and the required path is being mocked.
This ensures that we're not failing inside init_env() because
we're too early and LIBVIRT_FAKE_ROOT_DIR wasn't defined yet;
- remove all conditional guards to call getrealpath() from
access(), virMockStatRedirect(), open(), open_2(), opendir()
and virFileCanonicalizePath(). As a bonus, remove all ternary
conditionals with 'newpath';
- a new 'pathPrefixIsMocked()' helper to aggregate all the prefixes
we're mocking, making it easier to add/remove them. If a prefix
is added inside this function, we can be sure that all functions
are mocking them.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds hostdev test cases in qemuhotplugtest.c.
Note: the small tweak inside virpcimock.c was needed because
the new tests added a code path in which virHostHasIOMMU()
(virutil.c) started being called, and the mocked '/sys/kernel/'
prefix that is mocked in virpcimock.c wasn't being considered
in the opendir() mock. An alternative to avoid these situations
in virpcimock.c is implemented in the next patch.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The softlink to physfn is the way to know if the device is
VF or not. So, the patch softlinks 'physfn' to the parent function.
The multifunction PCI devices dont have 'physfn' softlinks.
The patch adds few Virtual functions to the mock environment and
changes the existing VFIO test xmls using the VFs to use the newly
added VFs for their use case.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds mock of the /dev/vfio path, needed for proper
implementation of the support for multifunction/multiple devices
per iommu groups.
To do that, the existing bind and unbind operations were adapted
to operate with the mocked filesystem as well.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Eric Farman [Tue, 3 Sep 2019 20:09:48 +0000 (22:09 +0200)]
qemu: Adjust max memlock on mdev hotplug
When starting a domain, we use the presence of a vfio-pci or
mdev hostdev to determine if the memlock maximum needs to be
increased. But if we hotplug either of these devices, only the
vfio-pci path gets that love. This means that attaching a, say,
vfio-ccw device will appear to succeed but the device may be
unusable as the guest may see I/O errors on long CCW chains.
The host, meanwhile, would be flooded with these messages:
Let's adjust the maximum memlock value in the mdev hotplug path,
so that the domain has the same value as if it were started with
one or more mdev devices in its configuration.
Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Eric Farman [Tue, 3 Sep 2019 20:09:47 +0000 (22:09 +0200)]
qemu: Reset the maximum locked memory on hotplug fail
If attaching a PCI hostdev fails, there are several things that
need to be un-done as part of the cleanup. One thing that is
not done is re-calculating/re-setting the maximum amount of locked
memory for the domain, since we may have changed that.
Let's fix that, just to ensure everything is back the way it was.
Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Eric Farman [Tue, 3 Sep 2019 20:09:46 +0000 (22:09 +0200)]
qemu: Refactor the max memlock routine
Let's pull this hunk out into a function, so it can be reused
in another codepath that needs to do the same thing.
Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>