Gianni Tedesco [Mon, 16 Aug 2010 16:15:04 +0000 (17:15 +0100)]
xl: make libxl_uuid2string internal to libxenlight
libxenlight exports a function libxl_uuid2string which is used
internally in several places but has one external caller in xl.
This means that libxl internal callers leak since they were not
expecting to have to free() the UUID since the per-api-call-gc-lifetime
patch.
Convert libxl_uuid2string to be an internal function which participates
in the caller's garbage collection. Eliminate the string_of_uuid() macro in
favour of "format" and "arguments" macros suitable for printf()-like
functions. These are made part of the libxl API, and the xl callers are
fixed up to use them, avoiding code duplication and improving readability.
Values of cpu_weight and cpu_cap are lost after xend restart
For managed domains in state 'halted' I always get default values
for cpu_cap / cpu_weight after xend restart.
This is because the parameter names differ between an SXP file used to
create a VM (where the parameters are cpu_cap / cpu_weight) and
the SXP file of a managed VM (where they are vcpus_params (cap 0) (weight 0)).
But XendConfig.py reads only cpu_cap / cpu_weight and, if they are not found,
uses default values.
The patch first reads vcpus_params (cap, weight); if these are not found it reads
cpu_cap / cpu_weight, and if both are missing it uses the default values.
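The lookup order described above can be sketched as follows. This is only an illustration, not the code from XendConfig.py: the function name, the dict-based config representation and the default values are assumptions made for the example.

```python
# Illustrative sketch only; not taken from XendConfig.py.
DEFAULT_CAP = 0       # assumed default, for illustration
DEFAULT_WEIGHT = 256  # assumed default, for illustration


def read_cap_and_weight(cfg):
    """Return (cap, weight) using the order: vcpus_params, then
    cpu_cap / cpu_weight, then the defaults."""
    params = cfg.get("vcpus_params", {})
    cap = params.get("cap", cfg.get("cpu_cap", DEFAULT_CAP))
    weight = params.get("weight", cfg.get("cpu_weight", DEFAULT_WEIGHT))
    return int(cap), int(weight)
```

With this ordering, a managed domain that only carries vcpus_params keeps its values across a restart, while a freshly created VM that only carries cpu_cap / cpu_weight still works.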
eXeC001er [Mon, 16 Aug 2010 16:11:30 +0000 (17:11 +0100)]
Fix "Error: Device 51952 not connected" error when using pygrub
The following is the process of booting a DomU with 'mounted-blktap2' (VHD
for example) and 'pygrub' as bootloader:
1. Connect boot-device to Dom0 as '/dev/xvdp'
2. Pygrub gets the info needed to load the DomU
3. Disconnect boot-device from Dom0
4. Boot DomU
During step 3 the created device is disconnected from Dom0, but
its xenstore entries are not cleaned up after the disconnect, so you
get the following error:
"Error: Device /dev/xvdp (51952, tap2) is already connected."
During step 3 xend calls destroyDevice always with 'tap' as argument.
Gianni Tedesco [Mon, 16 Aug 2010 12:39:19 +0000 (13:39 +0100)]
tools/libxl: remove libxl_free() since there are no more callers
libxl_free() allows allocated memory to be explicitly free'd from a
libxl_gc. Every previous use of this function has now been made
redundant and therefore has been removed. We can safely kill it and
amend the policy accordingly.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Gianni Tedesco [Mon, 16 Aug 2010 12:37:58 +0000 (13:37 +0100)]
tools/libxl: fix memory management bugs in libxl_device_disk_list()
fix invalid free segfault and use-after-free in libxl_device_disk_list()
Gah, libxl_device_disk_list() is returning a lot of pointers to free'd
data. Fix that by replacing libxl_xs_read() with xs_read() in line with
the policy.
Also fix a segfault caused by an erroneous free of the last disk-list
array element rather than the first one. This was causing xl create to
segfault when using the new qemu-dm code-base. Fix that and add a
comment about the fact that this libxl API requires a corresponding
libxl_device_disk_free() function.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
tools: xenconsole[d] and libxl: multiple console support
This patch implements the new protocol for handling pv consoles and
emulated serials as described in the document docs/misc/console.txt.
The changes are:
- xenconsoled: do not write the pty under serial in xenstore if
xenconsoled is handling a consolepath;
- xenconsole: implement support for an explicit console type parameter;
the parameter can be "pv", to specify that the user wants to
connect to a pv console, or "serial", to specify that the user wants to
connect to an emulated serial. If the type parameter hasn't been
specified by the user, xenconsole tries to guess which type of console
it has to connect to, defaulting to pv console for pv guests and
emulated serial for hvm guests.
- xenconsole: use the new xenstore paths;
- libxl: rename libxl_console_constype to libxl_console_consback:
constype is used to specify whether qemu or xenconsoled provides the
backend, so I renamed it to libxl_console_consback to make it more
obvious that we are talking about backends;
- libxl: add a new libxl_console_constype to specify if the console is
an emulated serial or a pv console;
- libxl: support the new xenconsole "type" command line parameter;
- libxl: use the "output" node under console in xenstore to tell qemu
where we want the output of this pv console to go;
- remove the legacy "serialpath" from xenconsoled altogether
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Also: update the QEMU_TAG to pull in the qemu part of these changes.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
After animated discussion with several libxl developers we seem to have
agreed on a policy for memory management within libxenlight. These
comments document the policy which is mostly implemented since
21977:51147d5b17c3 but some aspects (comments, function naming) are
guidelines to be followed in future functionality and perhaps to be
implemented by search/replace in future patches.
The document is mostly authored by Ian Jackson but with modifications to
reflect the slightly different functionality that has been implemented
since this was proposed.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 13 Aug 2010 13:59:03 +0000 (14:59 +0100)]
x86: eliminate bogus IRQ restrictions
As pointed out in
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00077.html
the limits introduced in c/s 20072 are at least
questionable. Eliminate them in favor of a more dynamic approach:
There's no real need for an upper limit on nr_irqs (as anything beyond
nr_irqs_gsi isn't visible to domains anyway), and the split point (and
hence ratio) between GSI and MSI/MSI-X IRQs doesn't need to be hard
coded, but can instead be controlled on the command line in case there
are *very* many GSIs.
The default used for nr_irqs will be rather large with this patch, so
it may not be acceptable without also switching to a sparse irq_desc[]
as was done not so long ago in Linux.
The added capping of any domain's nr_pirqs is based on the observation
that no domain can possibly have more than the system wide number of
IRQs. The opposite case may in fact also require some adjustment:
Defaulting the number of non-GSI IRQs available (namely to Dom0) to a
fixed value may not be the best choice going forward, since if there
indeed are very many non-GSI interrupt sources, it won't be possible
for the kernel to make use of them without giving
"extra_guest_irqs=" on the command line (but the goal should be to
allow things to work right by default even on large systems).
Keir Fraser [Fri, 13 Aug 2010 13:58:06 +0000 (14:58 +0100)]
x2APIC: Improve x2APIC suspend/resume
x2apic depends on interrupt remapping, so interrupt remapping should
be disabled only after x2apic has been disabled. This patch also wraps
__enable_x2apic to get rid of duplicated code.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Fri, 13 Aug 2010 13:57:35 +0000 (14:57 +0100)]
Fix IOAPIC S3 with interrupt remapping enabled
ioapic_suspend reads and saves the IOAPIC RTEs. But when interrupt
remapping is enabled, io_apic_read calls io_apic_read_remap_rte to
convert the remapped-format entry to compatibility format, and as a
result the 'dest' field may be changed in remap_entry_to_ioapic_rte.
ioapic_resume then writes the saved RTEs, with an incorrect 'dest', to
the interrupt remapping table.
There is actually no need to convert the RTEs, whether or not interrupt
remapping is enabled; the RTE values only need to be saved and restored
directly. This patch uses __io_apic_read and __io_apic_write,
which do not call the interrupt remapping conversion functions, to save
and restore RTEs in ioapic_suspend and ioapic_resume, which fixes the issue.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Fri, 13 Aug 2010 13:56:15 +0000 (14:56 +0100)]
sysctl: Return physinfo.max_{cpu,node}_id as maximum *possible* IDs.
In particular, this fixes setting vcpu affinities via
libxl. Previously, the affinity mask would be narrowed to the maximum
currently-online CPU. So future hotplugged CPUs could not be
expressed.
Removing the include of sys/ptrace.h and threaddb.h exposed a few
places which were using time(2) or gettimeofday(2) without including
time.h or sys/time.h respectively and were relying on an indirect include.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 13 Aug 2010 07:33:19 +0000 (08:33 +0100)]
credit1: Make weight per-vcpu
Change the meaning of credit1's "weight" parameter to be per-vcpu,
rather than per-VM.
At the moment, the "weight" parameter for a VM is set on a per-VM
basis. This means that when cpu time is scarce, two VMs with the same
weight will be given the same amount of total cpu time, no matter how
many vcpus each has. I.e., if a VM has 1 vcpu, that vcpu will get x% of
cpu time; if a VM has 2 vcpus, each vcpu will get (x/2)% of the cpu
time.
I believe this is a counter-intuitive interface. Users often choose
to add vcpus; when they do so, it's with the expectation that a VM
will need and use more cpu time. In my experience, however, users
rarely change the weight parameter. So the normal course of events is
for a user to decide a VM needs more processing power, add more cpus,
but not change the weight. The VM still gets the same amount of
cpu time, but less efficiently allocated (because it's divided).
The attached patch changes the meaning of the "weight" parameter, to
be per-vcpu. Each vcpu is given the weight. So if you add an extra
vcpu, your VM will get more cpu time as well.
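The difference between the two interpretations can be illustrated with a small calculation. This is an idealised proportional-share model, assuming fully contended CPUs and ignoring caps; the function name and the sample weights are made up for the example and do not come from the scheduler code.

```python
def vm_shares(vms, per_vcpu):
    """vms maps name -> (weight, nr_vcpus).
    Returns name -> fraction of the contended cpu time given to that VM."""
    effective = {
        name: weight * (nr_vcpus if per_vcpu else 1)
        for name, (weight, nr_vcpus) in vms.items()
    }
    total = sum(effective.values())
    return {name: eff / total for name, eff in effective.items()}


# Two VMs with the same weight; "b" has two vcpus.
vms = {"a": (256, 1), "b": (256, 2)}
old = vm_shares(vms, per_vcpu=False)  # per-VM weight: both VMs get one half
new = vm_shares(vms, per_vcpu=True)   # per-vcpu weight: "b" gets two thirds
```

Under the per-VM reading each vcpu of "b" ends up with half the share of the single vcpu of "a"; under the per-vcpu reading all three vcpus get an equal share.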
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Currently scratch variables allocated by libxl have the same lifetime as
the context. While this is suitable for one-off invocations of xl, it is
not so great for a daemon process linking to libxl. In that case there
will be prolific leakage of heap memory.
My proposed solution involves creating a new libxl_gc structure, which
contains a pointer to an owning context as well as the garbage
collection data. Top-level library functions which expect to do a lot of
scratch allocations put a gc struct on the stack and initialize it with a
macro. Before returning they then call libxl_free_all on this struct.
This means that static helper functions called by such functions will
usually take a gc instead of a ctx as a first parameter.
The patch touches almost every code-path so a close review and testing
would be much appreciated. I have tested with valgrind all of the parts
I could which looked non-straightforward. Suffice to say that it seems
crash-free even if we have exposed a few real memory leaks. These are
for cases where we return eg. block list to an xl caller but there is no
appropriate block_list_free() function to call. Ian Campbell's work in
this area should sew up all these loose ends.
Ian Jackson [Thu, 12 Aug 2010 16:06:21 +0000 (17:06 +0100)]
tools/python: Remove non-ASCII characters introduced by fffedd3d70e1.
fffedd3d70e1 introduced into tools/python/xen/util/vscsi_util.py
high-bit-set characters which appear to be UTF-8 for non-breaking
spaces. Replace them with spaces to avoid getting a Python syntax
error.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
split LDLIBS from LDFLAGS to fix link errors in recent toolchains
Linker command lines are order-sensitive.
Move linker options -Lfoo -lfoo from LDFLAGS to LDLIBS and place this new
variable after the objects to link. This resolves build errors in xenpaging
and blktap with recent toolchains.
- rename SHLIB_CFLAGS to SHLIB_LDFLAGS
- rename LDFLAGS_* to LDLIBS_*
- move LDFLAGS usage after CFLAGS in CC calls
- remove stale comments in xenpaging Makefile
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Wed, 11 Aug 2010 16:01:02 +0000 (17:01 +0100)]
msi: Avoid uninitialized msi descriptors
When __pci_enable_msix() returns early, the output parameter (struct
msi_desc **desc) is not initialized. On my machine, a Broadcom
BCM5709 NIC has both MSI and MSI-X capability blocks; when a guest
tries to enable MSI-X interrupts and __pci_enable_msix() returns early
because it encounters an MSI block, the whole system immediately crashes
with a fatal page fault.
George Dunlap [Wed, 11 Aug 2010 14:56:21 +0000 (15:56 +0100)]
[Xen-devel] [PATCH] PoD: Fix domain build populate-on-demand cache allocation
Rather than trying to count the number of PoD entries we're putting in, we
simply pass the target number of pages minus the VGA hole, and let the
hypervisor do the calculation.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
xl: don't use libxl allocator for vcpu_list
This also fixes a bug with an erroneous call to libxl_free().
A destructor for the vcpu list is also implemented which is called from
xl.
"Dube, Lutz" [Wed, 11 Aug 2010 12:18:05 +0000 (13:18 +0100)]
Exception in xen/util/vscsi_util.py while starting xend
We have a pscsi device with long SCSI IDs such as 15:0:11:101.
In this case lsscsi prints no blank between the ID and the type,
so the subsequent split of the string returns the wrong output:
the field physical_HCTL is set to 15:0:11:101]dis.
The patch replaces the character "]" with "] ", so split() returns the
right physical_HCTL.
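A minimal sketch of the failure and of the fix, assuming the H:C:T:L value is taken from the first whitespace-separated field with its first and last characters stripped. The function name and the sample lines are hypothetical, and replacing only the first "]" is a choice made for this example.

```python
def parse_hctl(lsscsi_line, fixed=True):
    """Extract the H:C:T:L id from one line of lsscsi-style output."""
    if fixed:
        # Guarantee a separator after the closing bracket of the id.
        lsscsi_line = lsscsi_line.replace("]", "] ", 1)
    first_field = lsscsi_line.split()[0]
    return first_field[1:-1]  # strip the surrounding "[" and "]"


# Hypothetical sample line: the long id leaves no blank before the type.
line = "[15:0:11:101]disk    VENDOR   MODEL   0000  /dev/sdx"
broken = parse_hctl(line, fixed=False)  # "15:0:11:101]dis"
good = parse_hctl(line)                 # "15:0:11:101"
```

Lines that already contain a blank after the bracket are unaffected, because split() collapses runs of whitespace.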
Signed-off-by: Lutz Dube <Lutz.Dube@ts.fujitsu.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
xl: Use gcc hidden visibility attribute
This kills off about 1,000 PLT entries speeding up link time and shaving about
1-2KB off of binary size. It also prevents users of the library erroneously
calling libxl internal functions.
xl: Fix invalid return of internal ptrs via libxl_poolid_to_name
libxl_poolid_to_name has in-and-out-of-library callers. In-library callers now use
_libxl_poolid_to_name() which participates in garbage collection and
out-of-library callers are fixed up to free() the pool name.
xl: Fix invalid return of libxl-internal pointers via libxl_domid_to_name()
libxl_domid_to_name has numerous in-and-out-of-library callers.
In-library callers now use _libxl_domid_to_name() which participates in
garbage collection and out-of-library callers are fixed up to free() the
domain name.
xl: don't use libxl allocator for nic_list
This also fixes a bug with an erroneous call to libxl_free().
A destructor for the nic list is also implemented which is called from
xl.
xl: allocate version info strings with strdup()
libxl_strdup() will soon have a per-call rather than per-context lifetime so
use strdup() to allocate version info and manually free in context destructor
xl: Return void from libxl_free() and libxl_free_all()
Also abort() if a pointer passed in to libxl_free() cannot be found in the
mini-gc structures. This exposes several real bugs (i.e. not just of the
"libxl_free() of something that was allocated via malloc" variety).
Andre Przywara [Tue, 10 Aug 2010 14:35:13 +0000 (15:35 +0100)]
xl: Fix xl vcpu-list output on machines with more than 16 cores
xc_vcpu_[sg]etaffinity require the number of _bytes_ needed for
holding a CPU map, not the number of bits. Fix this when
calling the function from libxl and rename the misleading variable
name to avoid future confusion.
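The bits-versus-bytes distinction amounts to a rounded-up division by eight. The helper below is only a sketch of that arithmetic; its name is hypothetical and it is not the libxl code.

```python
def cpumap_bytes(nr_cpus):
    """Number of bytes needed to hold a map with one bit per CPU."""
    return (nr_cpus + 7) // 8
```

For example, 16 CPUs need 2 bytes and 17 CPUs need 3; passing the bit count where a byte count is expected describes a map eight times larger than intended.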
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 9 Aug 2010 17:28:04 +0000 (18:28 +0100)]
x86: Reorder CPUs at boot time to reflect system topology.
This is an attempt to impose some sensible coherent ordering on the
cpu namespace, where previously there was none (we were at the mercy
of BIOS ordering, which varies wildly across systems).
Implement PCI pass-through for multi-function devices. The supported BDF
notation is: BB:DD.* - therefore passing-through a subset of functions or
remapping the function numbers is not supported except for when passing
through a single function which will be a virtual function 0.
Letting qemu automatically select the virtual slot is not supported in
multi-function passthrough since the qemu-dm protocol can not actually
handle that case.
Gianni Tedesco [Mon, 9 Aug 2010 16:43:18 +0000 (17:43 +0100)]
xc: fix segfault in pv domain create if kernel is an invalid image
If libelf calls elf_err() or elf_msg() before elf_set_log() has been
called then it could potentially read an uninitialised log handling
callback function pointer from struct elf_binary. Fix this in libxc by
zeroing the structure before calling elf_init().
The error message when one wants to list a non-existent domain is at
best misleading (libxl_domain_info failed (code -5)).
This patch catches this specific error and tells the user that the
requested domain does not exist:
Error: Domain '42' does not exist.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 9 Aug 2010 15:40:18 +0000 (16:40 +0100)]
Intel EPT: Fix out of range right shift on 32-bit host
Currently, there is logic to check whether an EPT GFN exceeds the
guest physical address width. It uses a right shift (>>) to implement the
check, but under PAE the shift count is greater than the width of the
type (unsigned long = 32 bits). This causes guest boot to
fail under PAE with EPT supported.
Signed-off-by: Li Xin <xin.li@intel.com> Signed-off-by: Zhang Yang <yang.z.zhang@intel.com>
Keir Fraser [Mon, 9 Aug 2010 15:37:33 +0000 (16:37 +0100)]
Credit1: Tweak reset condition
VMs that don't use their full timeslice are guaranteed to flip back
and forth between "active" and "inactive". If we set credit to 0
when setting "inactive", then when the VM comes back to "active"
again, it will effectively be behind most other vcpus in credit.
This causes the credit1 to effectively discriminate *against*
VMs which use less than their full timeslice.
Instead of setting credit to 0, divide it in half. This gets rid of
some of the system credit while allowing non-cpu-bound VMs to keep
some priority advantage.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Mon, 9 Aug 2010 15:36:07 +0000 (16:36 +0100)]
scheduler: Implement yield for credit1
This patch implements 'yield' for credit1. It does this by attempting
to put yielding vcpu behind a single lower-priority vcpu on the
runqueue. If no lower-priority vcpus are in the queue, it will go at
the back (which if the queue is empty, will also be the front).
Runqueues are sorted every 30ms, so that's the longest this priority
inversion can happen.
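The placement rule can be modelled on a plain list. This is a simplified sketch, not the scheduler code: the runqueue is assumed to be a list of (name, priority) tuples ordered highest priority first, with larger numbers meaning higher priority, and the function name is hypothetical.

```python
def insert_yielding(runq, vcpu):
    """Insert a yielding (name, priority) vcpu into runq.

    The vcpu goes directly behind the first lower-priority entry,
    or at the back of the queue if there is no such entry."""
    _, prio = vcpu
    for i, (_, other_prio) in enumerate(runq):
        if other_prio < prio:
            runq.insert(i + 1, vcpu)
            return runq
    runq.append(vcpu)
    return runq
```

In this model a yielding vcpu of priority 2 inserted into [a(2), b(1), c(1)] ends up behind b, so exactly one lower-priority vcpu runs ahead of it.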
For workloads with heavy concurrency hazard, and guests which implement
yield-on-spinlock, this patch significantly increases performance and
total system throughput.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Anthony Perard [Mon, 9 Aug 2010 15:46:02 +0000 (16:46 +0100)]
tools/hotplug, Use udev rules instead of qemu script to setup the bridge.
From: Anthony PERARD <anthony.perard@citrix.com>
This patch adds a second argument to the vif-bridge script. It can be "vif"
or "tap". "vif" gives the default behavior, and "tap" just adds the
interface to the found bridge when the action is "add".
Anthony Perard [Mon, 9 Aug 2010 15:46:01 +0000 (16:46 +0100)]
libxl, Introduce the command line handler for the new qemu.
From: Anthony PERARD <anthony.perard@citrix.com>
This patch adds a function to check the version of the device model.
Depending on the version of the DM, the command line arguments will be
built differently.
Keir Fraser [Mon, 9 Aug 2010 15:33:45 +0000 (16:33 +0100)]
vt-d: Fix ioapic_rte_to_remap_entry error path.
When ioapic_rte_to_remap_entry fails, the code currently just writes the
value to the IOAPIC. But the 'mask' bit may be changed if it writes to the
upper half of the RTE. This patch ensures that the original value of the
'mask' bit is restored in this case.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Mon, 9 Aug 2010 15:32:45 +0000 (16:32 +0100)]
vt-d: Fix ioapic write order in io_apic_write_remap_rte
At the end of io_apic_write_remap_rte, the new entry (the remapped
interrupt) is written to the IOAPIC. But the low 32 bits are written
before the high 32 bits, so if the 'mask' bit in the low 32 bits is
clear, the interrupt is unmasked before the high 32 bits have been
written, which may cause problems. This patch fixes the issue by
writing the high 32 bits before the low 32 bits.
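Why the order matters can be shown with a toy model of a redirection-table entry. The class, the function and the sample values are invented for illustration; the only hardware detail assumed is that the mask bit is bit 16 of the low word.

```python
MASK_BIT = 1 << 16  # mask bit in the low 32 bits of an IOAPIC RTE


class FakeRte:
    """Toy entry that records whether it was ever unmasked while the
    high half still held a stale value."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.expected_high = high
        self.glitch = False

    def write(self, half, value):
        setattr(self, half, value)
        unmasked = not (self.low & MASK_BIT)
        if unmasked and self.high != self.expected_high:
            self.glitch = True


def update(rte, new_low, new_high, high_first):
    """Write both halves in the requested order; return True if the
    entry was observable in an unmasked, half-updated state."""
    rte.expected_high = new_high
    order = [("high", new_high), ("low", new_low)]
    if not high_first:
        order.reverse()
    for half, value in order:
        rte.write(half, value)
    return rte.glitch


# Start from a masked entry and install an unmasked one.
low_first = update(FakeRte(MASK_BIT, 0), 0x31, 0x01000000, high_first=False)
high_first = update(FakeRte(MASK_BIT, 0), 0x31, 0x01000000, high_first=True)
```

In this model only the low-first order passes through a state where the entry is unmasked with a stale high word.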
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com>
Creates a shared memory ring buffer and event channel in HVM domains
for passing debug messages from PV drivers on HVM guests
The console is used by Windows PV drivers to send debug data to xenconsoled
Signed-off-by: Owen Smith <owen.smith@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell [Wed, 4 Aug 2010 15:00:13 +0000 (16:00 +0100)]
libxl: free values in XLU_ConfigSetting.
Fixes these valgrind reported leaks, found with "valgrind xl create -n ..."
==21170== 8 bytes in 3 blocks are definitely lost in loss record 1 of 3
==21170== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==21170== by 0x411B22F: strdup (in /lib/i686/cmov/libc-2.7.so)
==21170== by 0x4030085: xlu__cfgl_strdup (libxlu_cfg.c:290)
==21170== by 0x402F3C4: xlu__cfg_yylex (libxlu_cfg_l.l:37)
==21170== by 0x402DD86: xlu__cfg_yyparse (libxlu_cfg_y.c:1338)
==21170== by 0x40308AE: xlu_cfg_readdata (libxlu_cfg.c:85)
==21170== by 0x804DBE4: parse_config_data (xl_cmdimpl.c:591)
==21170== by 0x8056EE4: create_domain (xl_cmdimpl.c:1381)
==21170== by 0x80582AE: main_create (xl_cmdimpl.c:3178)
==21170== by 0x804B54B: main (xl.c:76)
==21170==
==21170== 57 bytes in 2 blocks are definitely lost in loss record 2 of 3
==21170== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==21170== by 0x402FE22: xlu__cfgl_dequote (libxlu_cfg.c:307)
==21170== by 0x402F4B4: xlu__cfg_yylex (libxlu_cfg_l.l:52)
==21170== by 0x402DD86: xlu__cfg_yyparse (libxlu_cfg_y.c:1338)
==21170== by 0x40308AE: xlu_cfg_readdata (libxlu_cfg.c:85)
==21170== by 0x804DBE4: parse_config_data (xl_cmdimpl.c:591)
==21170== by 0x8056EE4: create_domain (xl_cmdimpl.c:1381)
==21170== by 0x80582AE: main_create (xl_cmdimpl.c:3178)
==21170== by 0x804B54B: main (xl.c:76)
==21170==
==21170== 111 bytes in 6 blocks are definitely lost in loss record 3 of 3
==21170== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==21170== by 0x402FE22: xlu__cfgl_dequote (libxlu_cfg.c:307)
==21170== by 0x402F4ED: xlu__cfg_yylex (libxlu_cfg_l.l:56)
==21170== by 0x402DD86: xlu__cfg_yyparse (libxlu_cfg_y.c:1338)
==21170== by 0x40308AE: xlu_cfg_readdata (libxlu_cfg.c:85)
==21170== by 0x804DBE4: parse_config_data (xl_cmdimpl.c:591)
==21170== by 0x8056EE4: create_domain (xl_cmdimpl.c:1381)
==21170== by 0x80582AE: main_create (xl_cmdimpl.c:3178)
==21170== by 0x804B54B: main (xl.c:76)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 4 Aug 2010 14:35:28 +0000 (15:35 +0100)]
numa: Attempt more efficient NUMA allocation in hypervisor by default.
1. Try to allocate from nodes containing CPUs which a guest can be
scheduled on.
2. Remember which node we allocated from last, and round-robin
allocations among above-mentioned nodes.
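The two steps above can be sketched as follows. This is a simplified model, not the hypervisor allocator: the cpu-to-node table, the function names and the handling of an unknown last node are assumptions made for the example.

```python
def candidate_nodes(cpu_affinity, cpu_to_node):
    """Nodes that contain at least one CPU the guest may be scheduled on."""
    return sorted({cpu_to_node[cpu] for cpu in cpu_affinity})


def next_node(nodes, last_node):
    """Round-robin: pick the candidate node that follows last_node."""
    if last_node not in nodes:
        return nodes[0]
    return nodes[(nodes.index(last_node) + 1) % len(nodes)]


# Hypothetical topology: six CPUs spread over three nodes.
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
nodes = candidate_nodes({2, 3, 4}, cpu_to_node)  # [1, 2]
```

Successive calls to next_node() with the previously returned node then alternate between nodes 1 and 2 for this guest.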
Gianni Tedesco [Wed, 4 Aug 2010 13:43:46 +0000 (14:43 +0100)]
xl: detect pci-insert-failed dm status on pci-passthrough
NOTE: This functionality depends on a corresponding qemu-dm patch to work as
expected. Should be safe to use with an un-patched qemu-dm as before.
libxl_wait_for_device_model can only wait for one status value, re-work the
API so that a callback function can choose between several different possible
status values for qemu-dm and fix up all callers appropriately.
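The callback idea can be sketched like this. It is an illustration of the pattern only, not the libxl API: the function names, the three-way return convention and the attempt limit are assumptions, and a real implementation would wait between polls rather than loop immediately.

```python
def wait_for_status(read_status, check, attempts=10):
    """Poll read_status() and let check(status) decide the outcome.

    check returns True for success, False for failure and None to keep
    waiting. Returns that verdict, or None if attempts run out."""
    for _ in range(attempts):
        verdict = check(read_status())
        if verdict is not None:
            return verdict
    return None


def pci_insert_check(status):
    """Callback for PCI insert: succeed or fail on the two final states."""
    if status == "pci-device-inserted":
        return True
    if status == "pci-insert-failed":
        return False
    return None
```

Because the callback recognises the failure status as well as the success status, the caller can report an error instead of waiting indefinitely.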
In the case of PCI device insert we succeed if qemu-dm reports
"pci-device-inserted" and error out instead of hanging forever if it fails
since qemu-dm now reports a status of "pci-insert-failed".
Gianni Tedesco [Wed, 4 Aug 2010 13:43:00 +0000 (14:43 +0100)]
xl: implement pci attach to explicitly defined virtual PCI slot
Move to a state-machine parser for PCI BDFs now that the number of possible
sscanf expressions is sufficiently large to make it worth it. Also implement
parsing of virtual PCI slot numbers.
Gianni Tedesco [Wed, 4 Aug 2010 13:24:19 +0000 (14:24 +0100)]
xl: centralize BDF parsing in to libxl
Introduce a new libxl call libxl_device_pci_parse_bdf() and use it
consistently.
This patch also fixes an infinite loop bug in xl create. If there is a
parse-error on any pci config file entry then xl will attempt to skip it, but
since num_pcidevs is used to index the config list and is not incremented when
skipping, this caused an infinite loop. Solve that problem by introducing a new
loop counter variable.
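The shape of the fix can be illustrated with a small loop that keeps the position in the config list separate from the count of accepted devices. This is a sketch in Python rather than the xl code; both function names are hypothetical and the toy parser only accepts 'BB:DD.F' strings.

```python
def toy_parse_bdf(text):
    """Hypothetical stand-in parser: accepts 'BB:DD.F' hex strings only."""
    try:
        bus, rest = text.split(":")
        dev, func = rest.split(".")
        return int(bus, 16), int(dev, 16), int(func, 16)
    except ValueError:
        return None


def parse_pci_entries(entries, parse_bdf):
    """Parse each config entry, skipping the ones that fail to parse."""
    devices = []  # len(devices) plays the role of num_pcidevs
    i = 0         # separate loop counter over the config list
    while i < len(entries):
        dev = parse_bdf(entries[i])
        i += 1    # always advance, even when the entry is skipped
        if dev is None:
            continue
        devices.append(dev)
    return devices
```

If the list were indexed by the number of accepted devices instead of by i, a bad entry would be fetched again on every iteration, which is the infinite loop described above.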
Note that virtual PCI slots are not parsed by the new code. However this is
a step towards support for virtual slots and multi-function device assignment
since only one location in the code will need to be changed now.
Gianni Tedesco [Wed, 4 Aug 2010 13:23:29 +0000 (14:23 +0100)]
xl: PCI code cleanups
Get rid of scan_sys_pcidir() and open-code it inside
libxl_device_pci_list_assignable() since it's not a generically re-useable
function and we're not supporting pcistub driver now. Also use macros for sysfs
dirs in libxl_device_pci_reset
Ian Campbell [Tue, 3 Aug 2010 17:10:28 +0000 (18:10 +0100)]
xl: fix memory leaks in xl create
Found using "valgrind xl create -n ..." and "valgrind xl create -e ..."
freeing config_data solves:
==18276== 944 bytes in 1 blocks are definitely lost in loss record 12 of 12
==18276== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==18276== by 0x404AEC1: libxl_read_file_contents (libxl_utils.c:258)
==18276== by 0x8056865: create_domain (xl_cmdimpl.c:1314)
==18276== by 0x8057E2D: main_create (xl_cmdimpl.c:3135)
==18276== by 0x804B2FB: main (xl.c:76)
==18276==
Adding free_domain_config() solves the following (plus presumably others
which didn't trigger because I have no devices of that type).
d_config->disks:
==18276== 61 (32 direct, 29 indirect) bytes in 1 blocks are definitely lost in loss record 9 of 12
==18276== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==18276== by 0x4022F94: realloc (vg_replace_malloc.c:525)
==18276== by 0x804E2D3: parse_config_data (xl_cmdimpl.c:715)
==18276== by 0x8056A7C: create_domain (xl_cmdimpl.c:1347)
==18276== by 0x8057E2D: main_create (xl_cmdimpl.c:3135)
==18276== by 0x804B2FB: main (xl.c:76)
d_config->vifs:
==18276== 76 (48 direct, 28 indirect) bytes in 1 blocks are definitely lost in loss record 10 of 12
==18276== at 0x4022F0A: malloc (vg_replace_malloc.c:236)
==18276== by 0x4022F94: realloc (vg_replace_malloc.c:525)
==18276== by 0x804E665: parse_config_data (xl_cmdimpl.c:779)
==18276== by 0x8056A7C: create_domain (xl_cmdimpl.c:1347)
==18276== by 0x8057E2D: main_create (xl_cmdimpl.c:3135)
==18276== by 0x804B2FB: main (xl.c:76)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell [Tue, 3 Aug 2010 17:09:21 +0000 (18:09 +0100)]
libxl: fix memory leak in libxl_name_to_domid
Found with "valgrind xl destroy ...":
==16272== 53,248 bytes in 1 blocks are definitely lost in loss record 6 of 6
==16272== at 0x4022249: calloc (vg_replace_malloc.c:467)
==16272== by 0x403FD4A: libxl_list_domain (libxl.c:490)
==16272== by 0x404B901: libxl_name_to_domid (libxl_utils.c:65)
==16272== by 0x804B4D2: domain_qualifier_to_domid (xl_cmdimpl.c:181)
==16272== by 0x804B50F: find_domain (xl_cmdimpl.c:198)
==16272== by 0x804D70C: destroy_domain (xl_cmdimpl.c:2104)
==16272== by 0x8054E4C: main_destroy (xl_cmdimpl.c:2912)
==16272== by 0x804B2FB: main (xl.c:76)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Gianni Tedesco [Tue, 3 Aug 2010 16:34:08 +0000 (17:34 +0100)]
libblktapctl: fix use-after-free bug
This has not caused crashes because, in general, use-after-free happens
to work provided nothing else is going on. However, the patch makes things
correct. It also allows us to use the heap poisoning feature of valgrind on
tools linking to libblktapctl.