Tim Deegan [Wed, 16 Feb 2011 09:48:05 +0000 (09:48 +0000)]
x86/shadow: unconditionally set the p2m/log-dirty allocation functions.
Otherwise enabling log-dirty mode on a PV guest that already has
a shadow allocation can leave the alloc/free function pointers NULL,
and later try to dereference them.
p2m internals should always gate on whether HAP is enabled for the
domain, not whether a HAP paging mode is currently advertised.
This lets us revert the change to hap_enable() that advertises the
new mode before it's safe to use it.
docs: document disk configuration string syntax (particularly, xl's syntax)
Signed-off-by: Kamala Narasimhan <kamala.narasimhan@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Special case how we validate vhd image files. Without this patch when
tap:aio:vhd prefixed image files are specified in the config file,
disk validation and thus vm creation will fail.
Signed-off-by: Kamala Narasimhan <kamala.narasimhan@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Currently we pile all the backend and format information pertaining to
disk option in a single enum. This check-in separates the two and
uses two enums, one for disk format and another for disk backend.
This helps clearly differentiate between disk format and backend
within the implementation and also helps cleanup the code in this area
in preparation for the impending parser revamping to be done post 4.1.
Along with separating format and backend, this check-in also removes
unwanted types and renames variables in the disk interface and fixes
the code affected by the interface changes.
Specifically, the disk interface changes are as follows. In the
libxl_device_disk structure, physpath was renamed to pdev_path,
virtpath was renamed to vdev, and phystype was removed and replaced with
backend and format enums. Also previously a single enum
libxl_disk_phystype held the values for qcow, qcow2, vhd, aio, file,
phy, empty and that got refactored into two enums, libxl_disk_format
to hold unknown, qcow, qcow2, vhd, raw, empty and libxl_disk_backend
to hold unknown, phy, tap and qdisk.
Signed-off-by: Kamala Narasimhan <kamala.narasimhan@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
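The format/backend split described in the disk interface change above can be
sketched roughly as follows. This is an illustration in Python, not the libxl
C definitions: the class names DiskFormat, DiskBackend and DeviceDisk are
hypothetical, and only the enum members and the pdev_path/vdev field names
are taken from the commit message.

```python
from dataclasses import dataclass
from enum import Enum


class DiskFormat(Enum):
    """How the disk image is encoded (members as listed in the commit message)."""
    UNKNOWN = "unknown"
    QCOW = "qcow"
    QCOW2 = "qcow2"
    VHD = "vhd"
    RAW = "raw"
    EMPTY = "empty"


class DiskBackend(Enum):
    """Which backend serves the disk (members as listed in the commit message)."""
    UNKNOWN = "unknown"
    PHY = "phy"
    TAP = "tap"
    QDISK = "qdisk"


@dataclass
class DeviceDisk:
    """Hypothetical stand-in for the reworked libxl_device_disk fields."""
    pdev_path: str        # formerly physpath
    vdev: str             # formerly virtpath
    backend: DiskBackend  # replaces part of phystype
    format: DiskFormat    # replaces the rest of phystype


disk = DeviceDisk("/images/guest.vhd", "xvda", DiskBackend.TAP, DiskFormat.VHD)
```

Keeping the two concerns in separate fields means a check such as "is this a
vhd image?" no longer has to know which backend is in use.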
George Dunlap [Tue, 15 Feb 2011 19:39:05 +0000 (19:39 +0000)]
tools: Include cpupool example in /etc/xen
xl cpupool-create at the moment requires a config file. Make
sure to include the example config file in the install.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 14 Feb 2011 17:02:55 +0000 (17:02 +0000)]
libxl: fix migrate for HVM guests
Prior to 22909:6868f7f3ab3f libxl would loop waiting simultaneously
for the domain to acknowledge a PV suspend request (by clearing the
XenStore node) and for the domain to actually suspend. For HVM guests
without PV drivers this same loop was simply waiting for the domain to
suspend.
In 22909:6868f7f3ab3f the original loop was split into two loops
(first waiting for the acknowledgement and then for the actual
suspend). This caused libxl to incorrectly wait for an HVM guest
without PV drivers to acknowledge the XenStore request, which is not
something it would ever do.
Fix this by only waiting for an acknowledgement from a guest which
contains PV drivers.
Previously we were also making the request regardless of whether the
guest had PV drivers; change that to only make the request if the
guest has PV drivers.
Lastly there is no need to sample HVM_PARAM_ACPI_S_STATE twice and not
doing so simplifies the test for PVHVM vs. normal HVM guests.
Tested with:
Windows with GPL PV drivers (event channel suspend mode)
Windows without PV drivers (xc_domain_shutdown mode)
Linux PV (PV with XenBus control node mode)
Linux HVM (PVHVM with XenBus control node mode (*))
Linux HVM (xc_domain_shutdown mode)
(*) In this case the kernel didn't actually suspend, due to:
PM: Device input1 failed to suspend: error -22
xen suspend: dpm_suspend_start -22
which may be a misconfiguration in my setup or may be a kernel
bug, but the libxl side dealt with this as gracefully as it could.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Juergen Gross [Mon, 14 Feb 2011 16:56:20 +0000 (16:56 +0000)]
xl: Support more than 32 vcpus for xl vcpu-set
xl vcpu-set currently uses a 32 bit mask for specifying which cpus are to be
set online. This restricts the number of cpus supported by this command.
The patch switches to libxl_cpumap, the interface of libxl_set_vcpuonline()
is changed accordingly.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
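To illustrate why a 32 bit mask limits the vcpu-set command above, here is a
small Python sketch. The CpuMap class is a hypothetical stand-in for a
byte-array bitmap in the spirit of libxl_cpumap; it is not the libxl
implementation.

```python
def online_mask32(count):
    """Build an 'online' mask for vcpus 0..count-1, truncated to 32 bits
    the way a uint32_t would be."""
    return ((1 << count) - 1) & 0xFFFFFFFF


class CpuMap:
    """Byte-array bitmap with one bit per cpu (hypothetical helper)."""

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.map = bytearray((ncpus + 7) // 8)

    def set(self, cpu):
        self.map[cpu // 8] |= 1 << (cpu % 8)

    def test(self, cpu):
        return bool(self.map[cpu // 8] & (1 << (cpu % 8)))


# Asking for 40 online vcpus: the 32 bit mask silently loses vcpus 32..39,
# while the byte-array map has room for all of them.
mask = online_mask32(40)
cpumap = CpuMap(64)
for cpu in range(40):
    cpumap.set(cpu)
```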
Juergen Gross [Mon, 14 Feb 2011 16:55:00 +0000 (16:55 +0000)]
xl: correct xl cpupool-create with extra parameters
xl cpupool-create won't always take extra parameters specified on the command
line, as a 0-byte is missing at the end of the configuration file contents.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
This is the equivalent of xm trigger s3resume and it is implemented the
same way: using an ACPI state change.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Gang [Mon, 14 Feb 2011 10:41:12 +0000 (10:41 +0000)]
x86: Fix S3 resume for HPET MSI IRQ case
Jan Beulich found that for S3 resume on platforms without the ARAT feature
but with an MSI-capable HPET, request_irq() is called in
hpet_setup_msi_irq() for an irq that is already set up (no release_irq()
is called during S3 suspend), so the code always falls back to using
legacy_hpet_event.
Fix it by conditionally calling request_irq() for 4.1. The plan is to split
the S3 resume path from the boot path post 4.1, as Jan suggested.
Signed-off-by: Wei Gang <gang.wei@intel.com> Acked-by: Jan Beulich <jbeulich@novell.com>
Ian Jackson [Fri, 11 Feb 2011 18:21:35 +0000 (18:21 +0000)]
tools/hotplug/Linux: Use correct device name for vifs in setup scripts
In vif-common.sh, set the shell variable "dev" to the new interface
name when interfaces are renamed, and consistently use this variable
in all the vif scripts.
This fixes hotplug of renamed interfaces.
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
From: Patrick Scharrenberg <pittipatti@web.de> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Patrick Scharrenberg <pittipatti@web.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 11 Feb 2011 17:57:32 +0000 (17:57 +0000)]
libxl/xl: improve behaviour when guest fails to suspend itself.
The PV suspend protocol requires guest co-operation whereby the guest
must respond to a suspend request written to the xenstore control node
by clearing the node and then making a suspend hypercall.
Currently when a guest fails to do this libxl times out and returns
a generic failure code to the caller.
In response to this failure xl attempts to resume the guest. However
if the guest has not responded to the suspend request then there is no
guarantee that the guest has made the suspend hypercall (in fact it is
quite unlikely). Since the resume process attempts to modify the
return value of the hypercall (to indicate a cancelled suspend) this
results in the guest eax/rax register being corrupted!
To fix this change libxl to do the following:
* Wait for the guest to acknowledge the suspend request.
- on timeout cancel the suspend request.
- if cancellation is successful then return a new error code to
indicate that the guest is not responding.
- if the cancel does not succeed then we raced with the guest
which actually did acknowledge at the last minute, so
continue.
* Wait for the guest to suspend.
- on timeout return the standard error code as before
* Guest successfully suspended, return success.
Lastly in xl do not attempt to resume a guest if it has not responded
to the suspend request.
Tested by live migration of PVops kernels which ignore the
suspend request, which have already crashed, and which suspend/resume
correctly. In the first two cases the source domain is left alone (and
continues to function in the first case) and in the third the
migration is successful.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
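The control flow listed in the suspend commit above can be sketched as below.
This is a Python illustration under stated assumptions: the three callables
and the SuspendResult names are hypothetical, and the real logic lives in C
inside libxl.

```python
import enum


class SuspendResult(enum.Enum):
    OK = 0
    FAIL = 1                  # the standard error code, as before
    GUEST_NOT_RESPONDING = 2  # the new error code (name is hypothetical)


def suspend_guest(wait_for_ack, cancel_request, wait_for_suspend):
    """Each argument is a callable returning True on success, False on
    timeout or failure."""
    if not wait_for_ack():
        # Timed out waiting for the acknowledgement: try to cancel.
        if cancel_request():
            return SuspendResult.GUEST_NOT_RESPONDING
        # Cancel failed, so we raced with a guest that acknowledged at
        # the last minute; carry on as if it had acknowledged in time.
    if not wait_for_suspend():
        return SuspendResult.FAIL
    return SuspendResult.OK


def yes():
    return True


def no():
    return False
```

A caller such as xl would then skip the resume step when it sees
GUEST_NOT_RESPONDING, since the guest never made the suspend hypercall.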
Ian Campbell [Fri, 11 Feb 2011 17:56:24 +0000 (17:56 +0000)]
libxl: allow guest to write "control/shutdown" xenstore node.
The PV shutdown/reboot/suspend protocol requires that the guest
acknowledge a request by clearing the node therefore it is necessary
to allow the guest to write to the node.
Currently libxl is quite relaxed about this protocol and doesn't
really seem to mind that the guest is unable to write the node to
perform the acknowledgement. However in a followup patch libxl needs
to be able to detect that a guest has acknowledged a suspend request.
A side effect of this change is that an empty "control/shutdown" node
is created upon domain creation instead of only being created when a
shutdown/reboot/suspend is requested. This should not (and does not
in my tests) have any negative impact on the guest.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
libxl: do not call libxl__file_reference_unmap twice
Fix double free due to libxl__file_reference_unmap(&info->kernel) called
multiple times: first at the end of libxl__domain_build and then in
libxl_domain_build_info_destroy.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 11 Feb 2011 17:49:13 +0000 (17:49 +0000)]
libxc: increase lzma max memory constant to 128Mby
According to lzma's configure.ac (!) the minimum memory limit to cope
with arbitrary input is 128Mby (!)
This is obviously an unreasonable amount of memory for this kind of
task, but we need to increase the constant limit for it not to
randomly fail. So do so.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Thu, 10 Feb 2011 14:19:54 +0000 (14:19 +0000)]
x86: suppress HPET broadcast initialization in the presence of ARAT
This follows Linux commit 39fe05e58c5e448601ce46e6b03900d5bf31c4b0,
noticing that all this setup is pointless when ARAT support is there,
and knowing that on SLED11's native kernel it has actually caused S3
resume issues.
A question would be whether HPET legacy interrupts should be forced
off in this case (rather than leaving whatever came from firmware).
Keir Fraser [Thu, 10 Feb 2011 14:19:23 +0000 (14:19 +0000)]
x86: tighten conditions under which writing certain MSRs is permitted
MSRs that control physical CPU aspects generally are pointless (and
possibly dangerous) to be written when the writer isn't sufficiently
aware that it's running virtualized.
Juergen Gross [Thu, 10 Feb 2011 09:02:50 +0000 (09:02 +0000)]
Cpupools: vcpu affinity handling
If a vcpu is pinned to multiple physical cpus, the pinning is not
removed if all those physical cpus are removed from the cpupool. When
disabling the scheduler on a cpu, the affinity mask must be checked
against the cpumask of the cpupool.
Wei Wang [Wed, 9 Feb 2011 08:57:12 +0000 (08:57 +0000)]
amd iommu: dynamic page table depth adjustment.
IO Page table growth is triggered by amd_iommu_map_page and grows to
upper level. I have tested it well for different devices (nic and gfx)
and different guests (linux and Win7) with different guest memory
sizes (512M, 1G, 4G and above).
Keir Fraser [Wed, 9 Feb 2011 08:44:38 +0000 (08:44 +0000)]
cpupool: Strict parameter checking for cpupool operations
Some cpupool actions didn't check the cpupool_id exactly. For some
actions this doesn't make any sense, so refuse those actions if the
specified cpupool doesn't exist.
Keir Fraser [Wed, 9 Feb 2011 08:40:05 +0000 (08:40 +0000)]
[VTD][QUIRK] add spin lock across snb pre/postamble functions
Added a spinlock across snb_vtd_ops_preamble() and
snb_vtd_ops_postamble() to make modifications to IGD registers atomic.
Continue keeping snb_igd_quirk default off.
James Harper [Tue, 8 Feb 2011 16:35:35 +0000 (16:35 +0000)]
xend: canonicalise symlinks found in /dev for vbds (helps vscsi)
By default, vscsi expects to be passed the final device name (eg
/dev/st3) instead of one of the various udev symlinks (eg
/dev/tape/by-path/pci-0000:01:08.0-scsi-0:0:2:0-st). The following patch
resolves the path to the real path if the name starts with /dev/
Signed-off-by: James Harper <james.harper@bendigoit.com.au> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
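The path canonicalisation described in the xend commit above amounts to
something like the following minimal sketch. The function name
canonicalise_dev_path is hypothetical; os.path.realpath is the standard
library call that resolves symlinks.

```python
import os


def canonicalise_dev_path(name):
    """Resolve udev-style symlinks under /dev/ to the final device node.
    Names that do not start with /dev/ are returned unchanged."""
    if name.startswith("/dev/"):
        return os.path.realpath(name)
    return name
```

With this, a symlink such as one of the /dev/tape/by-path/ entries would be
replaced by whatever device node it points at before being handed to vscsi.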
libxl: initialise some variables in print_bitmap, to suppress bogus warning
GCC 4.2.4 cannot figure out that three variables aren't used before
initialisation:
xl_cmdimpl.c: In function `print_domain_vcpuinfo':
xl_cmdimpl.c:3351: warning: `firstset' may be used uninitialized in this function
[etc]
Signed-off-by: Kamala Narasimhan <kamala.narasimhan@citrix.com> Acked-by: Andre Przywara <andre.przywara@amd.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 7 Feb 2011 16:55:25 +0000 (16:55 +0000)]
tools/hotplug: set mtu from bridge also on vif interface
Apply mtu size from bridge interface also in vif interface.
This depends on a kernel change which allows arbitrary mtu sizes until
the frontend driver has connected to the backend driver. Without this
kernel change, the vif mtu size will be limited to 1500 even with this
change to the vif-bridge script.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Mon, 7 Feb 2011 15:04:32 +0000 (15:04 +0000)]
cpupool: Correct cpupool diag printing
Some of the cpupool_dprintk() calls are using undefined or
uninitialized variables. Correct the argument lists to be able to
define cpupool_printk as printk.
Ian Campbell [Mon, 7 Feb 2011 12:13:24 +0000 (12:13 +0000)]
minios: do not export {test,set,clear}_bit etc to applications
Fixes ioemu stubdom build:
CC i386-stubdom/piix4acpi.o
[...]/stubdom/ioemu/hw/piix4acpi.c:272: error: expected ')' before '?' token
[...]/stubdom/ioemu/hw/piix4acpi.c:277: error: conflicting types for 'set_bit'
[...]/stubdom/../extras/mini-os/include/x86/mini-os/os.h:396: error: previous definition of 'set_bit' was here
[...]/stubdom/ioemu/hw/piix4acpi.c:282: error: conflicting types for 'clear_bit'
[...]/stubdom/../extras/mini-os/include/x86/mini-os/os.h:414: error: previous definition of 'clear_bit' was here
[...]/stubdom/ioemu/hw/piix4acpi.c: In function 'gpe_sts_write':
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Mon, 7 Feb 2011 09:58:11 +0000 (09:58 +0000)]
Pack some hvmop memory structures better
Some of the hvmop memory structures have a shocking amount of
unnecessary padding in them. Elements which can have only 3 values
are given 64 bits of memory, and then aligned (so that there is
padding behind them).
This patch resizes and reorganizes in the following way, (hopefully)
without introducing any differences between the layout for 32- and
64-bit.
xen_hvm_set_mem_type:
hvmmem_type -> 16 bits
nr -> 32 bits (limiting us to setting 16TB at a time)
xen_hvm_set_mem_access:
hvmmem_access -> 16 bits
nr -> 32 bits
xen_hvm_get_mem_access:
hvmmem_access -> 16 bits
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
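The effect of the repacking above can be illustrated with Python's ctypes.
The field order and the "before" layout are assumptions made for the sake of
the example (the commit message only gives the new field widths), and the
16TB figure assumes 4KiB pages.

```python
import ctypes


class SetMemTypeBefore(ctypes.Structure):
    """Assumed old layout: every value after domid held in 64 bits."""
    _fields_ = [
        ("domid", ctypes.c_uint16),
        ("hvmmem_type", ctypes.c_uint64),
        ("first_pfn", ctypes.c_uint64),
        ("nr", ctypes.c_uint64),
    ]


class SetMemTypeAfter(ctypes.Structure):
    """Assumed new layout: 16-bit type, 32-bit count, no padding holes."""
    _fields_ = [
        ("domid", ctypes.c_uint16),
        ("hvmmem_type", ctypes.c_uint16),
        ("nr", ctypes.c_uint32),
        ("first_pfn", ctypes.c_uint64),
    ]


PAGE_SIZE = 4096                         # assumption: 4KiB pages
MAX_BYTES_PER_CALL = (2 ** 32) * PAGE_SIZE

size_before = ctypes.sizeof(SetMemTypeBefore)
size_after = ctypes.sizeof(SetMemTypeAfter)
```

Ordering the fields from small to large so that each one falls on its natural
alignment is what keeps the packed layout identical for 32- and 64-bit
callers.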
Keir Fraser [Sun, 6 Feb 2011 17:26:31 +0000 (17:26 +0000)]
hvm: fix XSAVE leaf 0 EBX size calculation
Fixes a size calculation bug when enabled bits in XFEATURE_MASK (xcr0)
aren't contiguous.
The current for loop stops when an xcr0 feature bit is 0. But in reality,
the bits can be non-contiguous. One example is that LWP is bit 62 on
AMD platform. This patch iterates through all bits to calculate the
size for enabled features.
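A rough Python sketch of the difference between the two loops described
above. The component table is illustrative only (offsets and sizes are
assumptions for the example, not values read from CPUID), and both function
names are hypothetical.

```python
LEGACY_AREA = 512 + 64  # FXSAVE region plus XSAVE header, in bytes

# bit number -> (offset, size) of the state component; illustrative values
COMPONENTS = {2: (576, 256), 62: (832, 128)}


def size_stop_at_first_clear(xcr0):
    """Old behaviour: walk upwards from bit 2 and stop at the first 0 bit."""
    size = LEGACY_AREA
    bit = 2
    while bit < 64 and (xcr0 >> bit) & 1:
        if bit in COMPONENTS:
            offset, length = COMPONENTS[bit]
            size = max(size, offset + length)
        bit += 1
    return size


def size_all_bits(xcr0):
    """Fixed behaviour: examine every bit, so gaps do not end the scan."""
    size = LEGACY_AREA
    for bit in range(2, 64):
        if (xcr0 >> bit) & 1 and bit in COMPONENTS:
            offset, length = COMPONENTS[bit]
            size = max(size, offset + length)
    return size


# x87 + SSE + a feature at bit 62, with bits 2..61 clear
sparse_xcr0 = 0b11 | (1 << 62)
```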
Keir Fraser [Sun, 6 Feb 2011 17:10:31 +0000 (17:10 +0000)]
xsm/flask: Fix permission tables
At some point, it seems that someone manually added Flask permission
definitions to one header file without updating the corresponding
policy configuration or the other related table. The end result is
that we can get uninterpretable AVC messages like this:
# xl dmesg | grep avc
(XEN) avc: denied { 0x4000000 } for domid=0
scontext=system_u:system_r:dom0_t tcontext=system_u:system_r:domU_t
tclass=domain
Fix this by updating the flask config and regenerating the headers
from it. In the future, this can be further improved by integrating
the automatic generation of the headers into the build process as is
presently done in SELinux.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Keir Fraser [Sun, 6 Feb 2011 17:03:09 +0000 (17:03 +0000)]
hvm amd: Fix 32bit guest VM save/restore issues associated with SYSENTER MSRs
This patch turns on SYSENTER MSRs interception for 32bit guest VMs on
AMD CPUs. With it, hvm_svm.guest_sysenter_xx fields always contain the
canonical version of SYSENTER MSRs and are used in guest save/restore.
The data fields in VMCB save area are updated as necessary.
Reported-by: James Harper <james.harper@bendigoit.com.au> Signed-off-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Sun, 6 Feb 2011 16:54:01 +0000 (16:54 +0000)]
amd iommu: Fix a xen crash after pci-attach
pci-detach triggers IO page table deallocation if the last passthru
device has been removed from pdev list, and this will result in a BUG on
AMD systems at the next pci-attach. This patch fixes this issue.
Keir Fraser [Sun, 6 Feb 2011 16:07:27 +0000 (16:07 +0000)]
cpupool: Check for memory allocation failure on switching schedulers
When switching schedulers on a physical cpu due to a cpupool operation
check for a potential memory allocation failure and stop the operation
gracefully.
Ian Jackson [Fri, 4 Feb 2011 18:47:39 +0000 (18:47 +0000)]
libxl: vncviewer: make autopass work properly
The file we write the vnc password to must be rewound back to the
beginning, or the vnc viewer will simply get EOF.
When the syscalls for communicating the password to the vnc client
fail, bomb out with an error message rather than blundering on (and
probably producing a spurious password prompt).
Following this patch, xl vncviewer --autopass works, provided the qemu
patch for writing the password to xenstore has also been applied.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
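The rewind issue in the autopass commit above is easy to reproduce in
miniature. A minimal Python sketch, using a temporary file as a stand-in for
the file handed to the vnc viewer; the function name is hypothetical.

```python
import tempfile


def make_password_file(password):
    """Write the password to a temporary file and rewind it, so that a
    reader starting from the current offset sees the password rather
    than an immediate EOF."""
    f = tempfile.TemporaryFile(mode="w+")
    f.write(password)
    f.flush()
    f.seek(0)  # without this the file offset stays at the end
    return f
```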
Ian Jackson [Fri, 4 Feb 2011 18:47:20 +0000 (18:47 +0000)]
libxl: vncviewer: unconditionally read listen port address and password
The /local/domain/DOMID/device/vfb/0/backend path is irrelevant.
libxl does not create it, so the branch would never be taken.
Instead, simply read the target paths of interest.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 4 Feb 2011 18:46:22 +0000 (18:46 +0000)]
libxl: vncviewer: fix use-after-free
This bug can prevent xl vncviewer from working at all.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 4 Feb 2011 18:46:00 +0000 (18:46 +0000)]
libxl: actually print an error when execve (in libxl__exec) fails
The header comment says libxl__exec logs errors. So it should do so.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 4 Feb 2011 18:45:26 +0000 (18:45 +0000)]
libxl: SECURITY: always honour request for vnc password
qemu only sets a password on its vnc display if the value for the -vnc
option has the ",password" modifier. The code for constructing
qemu-dm options was broken and only added this modifier for one of the
cases.
Unfortunately there does not appear to be any code for passing the vnc
password to upstream qemu (ie, in the case where
libxl_build_device_model_args_new is called). To avoid accidentally
running the domain without a password, check for this situation and
fail an assertion. This will have to be revisited after 4.1.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
libxl: fix console autoconnect with pygrub, by invoking xenconsole twice
When using pygrub we have to connect to the console twice: once at the
beginning to connect to pygrub and a second time after creating the pv
console to connect to the guest's console.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Fri, 4 Feb 2011 17:32:39 +0000 (17:32 +0000)]
xl: fix broken xl vcpu-list output (tool hangs on large machines)
The algorithm for printing the CPU affinity in a condensed way
looks for a set bit in a zero-byte:
for (i = 0; !(pcpumap & 1); ++i, pcpumap >>= 1)
Looking at the code I found that it is entirely broken if more than 8
CPUs are used. Besides that endless loop issue the output is totally
bogus except for the "any CPU" case, which is handled explicitly earlier.
I tried to fix it, but the whole approach does not work if the outer
loop actually iterates (executing more than once).
This fix reimplements the whole algorithm in a clean (though not much
optimized) way. It survived some unit-testing.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
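For illustration, one straightforward way to print a CPU affinity bitmap in
condensed form is sketched below in Python. This is not the reimplemented xl
code; the function name and the "0-3,8" output format are assumptions, while
"any cpu" is the string xl prints when every CPU is allowed.

```python
def format_cpumap(cpumap, ncpus):
    """cpumap is a bytes-like object; bit (i % 8) of byte (i // 8) is cpu i."""
    cpus = [c for c in range(ncpus) if cpumap[c // 8] & (1 << (c % 8))]
    if len(cpus) == ncpus:
        return "any cpu"

    # Collapse consecutive cpu numbers into (first, last) ranges.
    ranges = []
    start = prev = None
    for c in cpus:
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            ranges.append((start, prev))
            start = prev = c
    if start is not None:
        ranges.append((start, prev))

    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)
```

Because the scan is bounded by ncpus rather than by finding a set bit, it
terminates even when a whole byte of the map is zero.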
Allen Kay [Wed, 2 Feb 2011 17:06:36 +0000 (17:06 +0000)]
libxl: pass gfx_passthru parameter to QEMU
Pass gfx_passthru parameter to QEMU. Keep it boolean for now as QEMU
does not expect any other integer value.
Signed-off-by: Allen Kay <allen.m.kay@intel.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 2 Feb 2011 17:05:27 +0000 (17:05 +0000)]
libxl: change default HVM emulated network card to rtl8139
xend uses rtl8139, and we want xl to be compatible with xm. Some
older operating systems don't have e1000 drivers, and we want widest
compatibility rather than best performance (people who want good
performance are best advised to use PV-on-HVM drivers).
We'll probably switch to a new default when switching to upstream
qemu, in the Xen 4.2 release cycle.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 1 Feb 2011 19:26:36 +0000 (19:26 +0000)]
libxc: maintain a small, per-handle, cache of hypercall buffer memory
Constantly m(un)locking memory can have significant overhead on
systems with large numbers of CPUs. This was previously fixed by
20841:fbe8f32fa257 but this was dropped during the transition to
hypercall buffers.
Introduce a small cache of single page hypercall buffer allocations
which can be reused to avoid this overhead.
Add some statistics tracking to the hypercall buffer allocations.
The cache size of 4 was chosen based on these statistics since they
indicated that 2 pages was sufficient to satisfy all concurrent single
page hypercall buffer allocations seen during "xl create", "xl
shutdown" and "xl destroy" of both a PV and HVM guest therefore 4
pages should cover the majority of important cases.
This fixes http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1719.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reported-by: Zheng, Shaohui <shaohui.zheng@intel.com> Tested-by: Haitao Shan <maillists.shan@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
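The caching scheme described in the hypercall buffer commit above can be
sketched as follows. This is a Python illustration only: the class and
counter names are hypothetical, bytearray stands in for locked memory, and
the real code is C inside libxc.

```python
class HypercallBufferCache:
    """Per-handle cache of single page buffers (illustrative sketch)."""

    CACHE_SIZE = 4  # the size chosen in the commit message

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.free = []   # released single page buffers ready for reuse
        self.hits = 0    # allocations satisfied from the cache
        self.misses = 0  # single page allocations that needed fresh memory
        self.toobig = 0  # multi page allocations, which are never cached

    def alloc(self, nr_pages):
        if nr_pages > 1:
            self.toobig += 1
            return bytearray(nr_pages * self.page_size)
        if self.free:
            self.hits += 1
            return self.free.pop()
        self.misses += 1
        return bytearray(self.page_size)  # real code would allocate and mlock

    def release(self, buf):
        if len(buf) == self.page_size and len(self.free) < self.CACHE_SIZE:
            self.free.append(buf)
        # otherwise the buffer is dropped (real code would munlock and free)
```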
The current libxl_set_memory_target function subtracts a negative amount
from an uint32_t variable without checking if the operation wraps
around.
This patch fixes this bug (that I previously believed to be an
hypervisor issue):
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1729
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
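The class of bug fixed in the memory target commit above can be shown with
emulated 32 bit unsigned arithmetic. A minimal Python sketch; the function
names are hypothetical and this is not the libxl_set_memory_target code.

```python
UINT32_MAX = 0xFFFFFFFF


def sub_u32_unchecked(value, delta):
    """Subtract as a uint32_t would: the result silently wraps modulo 2**32."""
    return (value - delta) & UINT32_MAX


def sub_u32_checked(value, delta):
    """Subtract and refuse any result that does not fit in a uint32_t."""
    result = value - delta
    if result < 0 or result > UINT32_MAX:
        raise OverflowError("result %d does not fit in uint32_t" % result)
    return result
```

Subtracting a negative delta is an addition in disguise, so the wrap can
happen at the top of the range as well as at zero.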
Ian Campbell [Tue, 1 Feb 2011 19:23:31 +0000 (19:23 +0000)]
tools: disable linker --as-needed option.
The Xen build system is not currently compatible with the --as-needed
linker option. The proper fix for this is turning out to be rather
invasive to the build system so simply disable for now with the
intention of revisiting for the 4.2 release.
The --no-as-needed option is available at least since binutils 2.15
(released in May 2004) and hence I think can be unconditionally relied
on.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reported-by: Nathan March <nathan@gt.net> Tested-by: Nathan March <nathan@gt.net> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Michael Young [Tue, 1 Feb 2011 19:19:58 +0000 (19:19 +0000)]
tools/Makefiles: install libvhd and libblktap with INSTALL_PROG
Shared libraries should be executable.
(rpm (4.9.0) doesn't automatically supply a "provides" entry for a
library unless it is executable. Non-executable libraries can cause
other trouble too.)
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Michael Young [Tue, 1 Feb 2011 19:18:42 +0000 (19:18 +0000)]
docs: Bring comments about NetworkManager and bridging up to date
Update a comment about NetworkManager not supporting bridging in
Fedora 11 to refer instead to Fedora 14. Clarify the wording.
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Michael Young [Tue, 1 Feb 2011 19:16:28 +0000 (19:16 +0000)]
tools/hotplug: Fix proxy arp messing about to use correct device
Fix an anomaly in /etc/xen/scripts/network-route.
Currently this script contains
netdev=${netdev:-eth${vifnum}}
ie. netdev is set to eth${vifnum} by default. Unfortunately vifnum
is not set anywhere in the xen code so the default is actually the
broken "eth". And anyway the vif number (which is what vifnum ought
to be) is not relevant.
The patch changes the default to eth0 (which is what the comment at
the top of the file says it should be).
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Tue, 1 Feb 2011 19:05:51 +0000 (19:05 +0000)]
xl: output illegal option character
Though illegal characters on xl command lines are caught, the user
isn't currently informed which one was not right.
This patch fixes this by printing the faulting character.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Jan 2011 19:37:49 +0000 (19:37 +0000)]
libxc: Do not use dom0 physmem as parameter to lzma decoder
It's not clear why a userspace lzma decode would want to use that
particular value, what bearing it has on anything or why it would
assume it could use 1/3 of the total RAM in the system (potentially
quite a large amount of RAM) as opposed to any other limit number.
Instead, hardcode 32Mby.
This reverts 22830:c80960244942, removes the xc_get_physmem/physmem
function entirely, and replaces the expression at the call site with a
fixed constant.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Cc: Christoph Egger <Christoph.Egger@amd.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Jan 2011 18:39:09 +0000 (18:39 +0000)]
libxl: prevent creation of domains with duplicate names
libxl_domain_rename is where domain names are assigned. Therefore
this is where we check that no two domains have the same name. As a
special exception, domains whose names are "" are not considered to
clash.
We also take special care not to mind if we try to rename a domain to
the name it already has.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
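The naming rules in the duplicate-name commit above reduce to a small
predicate. A Python sketch, where the dictionary of domains and the function
name are hypothetical stand-ins for what libxl_domain_rename looks up:

```python
def rename_allowed(domains, domid, new_name):
    """domains maps domid -> current name. Returns True if domid may take
    new_name without clashing with another domain."""
    if new_name == "":
        return True  # empty names are never considered to clash
    for other_id, other_name in domains.items():
        if other_id != domid and other_name == new_name:
            return False
    # No other domain has the name; renaming to our own name is fine too.
    return True


domains = {1: "web", 2: "db", 3: "", 4: ""}
```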
Ian Jackson [Fri, 28 Jan 2011 18:38:26 +0000 (18:38 +0000)]
libxl: during domain destruction, do not complain if no devices dir to destroy
Previously calling libxl__devices_destroy on a half-constructed or
half-destroyed domain would sometimes complain along these lines:
libxl: error: libxl_device.c:327:libxl__devices_destroy /local/domain/29/device is empty
This is (a) not a reasonable thing to complain about and (b) not an
accurate description of all the things that that particular failure of
libxl__xs_directory might mean.
Change the code to check errno, so that if errno==ENOENT we silently
continue, not destroying any devices, and if errno!=ENOENT, properly
log the problem and fail.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
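The errno check described in the device-directory commit above follows a
common pattern, sketched here in Python with the filesystem standing in for
xenstore. The function name is hypothetical; os.listdir plays the role of
libxl__xs_directory.

```python
import errno
import os


def list_devices_to_destroy(path):
    """Return the entries under path. A missing directory means there is
    nothing to destroy; any other failure is propagated to the caller."""
    try:
        return os.listdir(path)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return []  # silently continue, not destroying any devices
        raise          # a real problem: let the caller log it and fail
```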
Ian Jackson [Fri, 28 Jan 2011 18:37:25 +0000 (18:37 +0000)]
libxl: internals: document the error behaviour of various libxl__xs_* functions
Many of the functions in libxl_xshelp.c simply return 0 on error, and
leave the errno value from xenstore in errno. Document this more
clearly.
Also fix a >75 column line.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Jan 2011 18:36:54 +0000 (18:36 +0000)]
libxl, xl: fixes to domain creation cleanup logic (domid values)
libxl__domain_make makes some assumptions about the way its caller
treats its uint32_t *domid parameter: specifically, if it fails it may
have partially created the domain and it does not ever destroy it.
But it does not initialise it. Document this assumption in a comment,
and assert on entry that domid is not a guest domain id, to ensure that
the caller has properly initialised it.
Introduce a function libxl_domid_valid_guest to help with this.
This is not intended to produce any practical functional change in
current code.
Secondly, libxl_create_stubdom calls libxl__domain_make and has no
code to tear down the domain again on error. This is complicated to
fix (since it may even be possible for the domain to be left in a
state where it's not possible to tell that it was going to be a
stubdom for some other domain). So for now simply leave a fixme
comment.
Finally, in 22739:d839631b6048 we introduced "-1" as a sentinel "no
such domain" value for domid. However, domid is a uint32 so testing
it with "if (domid > 0)" as we do in 22740:ce208811f540 is wrong
because it always triggers. Instead use libxl_domid_valid_guest.
This fix means that if "xl create" fails, it will not try to
destroy the domain "-1". Previously you'd see this message:
libxl: error: libxl.c:697:libxl_domain_destroy non-existant domain -1
whose "-1" many readers may have thought was an error code, but which
is actually supposedly a domain id.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
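Why "if (domid > 0)" always triggers for the "-1" sentinel can be seen by
emulating the uint32 conversion. A Python sketch: the DOMID_FIRST_RESERVED
value and the exact bounds used by the validity check are assumptions here,
not taken from the commit message.

```python
DOMID_FIRST_RESERVED = 0x7FF0  # assumption: first domid reserved by Xen


def as_uint32(value):
    """Emulate storing a C int into a uint32_t."""
    return value & 0xFFFFFFFF


def domid_valid_guest(domid):
    """Sketch of a libxl_domid_valid_guest-style check: the value could be
    a guest domid. It does not check that the domain actually exists."""
    return 0 < domid < DOMID_FIRST_RESERVED


sentinel = as_uint32(-1)  # the "no such domain" value
```

The sentinel compares greater than zero, so only a range check like the one
above tells it apart from a real guest domid.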
Ian Jackson [Fri, 28 Jan 2011 18:34:52 +0000 (18:34 +0000)]
libxl: fix error handling (xenstore transaction leak) in libxl__domain_make
libxl__domain_make could under some circumstances leak the xenstore
transaction (stored in the variable t). Also, failures to commit the
xenstore transaction for reasons other than EAGAIN would be ignored (!)
Fix this as follows:
* Initialise t to 0 (not a valid transaction id), and when the
transaction is successfully committed or rolled back, reset it.
* Change all the instances of: libxl__free_all(&gc); return error;
to instead do: rc=error; goto out;
* Use the out stanza for exiting, setting rc=0 on success first.
* Explicitly abort the transaction in the out stanza.
Also add a note by the calls manipulating the gc, to note that as this
is an internal function, the gc should really be set up and destroyed
by its caller. But let's not do that at this stage of the 4.1 release
cycle.
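
The steps listed above can be sketched as follows. The transaction
functions here are stand-ins invented for this sketch (they only count
open transactions); they are not the xenstore API, and sketch_make is
not the real libxl__domain_make.

```c
#include <assert.h>
#include <errno.h>

typedef unsigned int sketch_txn_t;   /* 0 means "no transaction" */

static int sketch_open_txns;         /* transactions currently open */
static int sketch_commit_errno;      /* errno for the next commit; 0 = ok */

static sketch_txn_t sketch_txn_start(void)
{
    sketch_open_txns++;
    return 1;
}

/* Ends the transaction either way. A commit (abort == 0) may fail and
 * set errno; an abort always succeeds in this sketch. */
static int sketch_txn_end(sketch_txn_t t, int abort)
{
    (void)t;
    assert(sketch_open_txns > 0);
    sketch_open_txns--;
    if (!abort && sketch_commit_errno) {
        errno = sketch_commit_errno;
        sketch_commit_errno = 0;     /* fail only once */
        return -1;
    }
    return 0;
}

static int sketch_make(int fail_midway)
{
    sketch_txn_t t = 0;              /* not a valid transaction id */
    int rc;

retry:
    t = sketch_txn_start();
    if (fail_midway) { rc = -1; goto out; }

    if (sketch_txn_end(t, 0) < 0) {
        t = 0;                       /* transaction is gone either way */
        if (errno == EAGAIN) goto retry;
        rc = -1;                     /* other commit errors are reported */
        goto out;
    }
    t = 0;
    rc = 0;
out:
    if (t) sketch_txn_end(t, 1);     /* explicitly abort if still open */
    return rc;
}
```

Because every exit goes through the out stanza, no path can return
while t still names an open transaction.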
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Fri, 28 Jan 2011 17:56:00 +0000 (17:56 +0000)]
xl: fix incorrect display of illegal option character
According to the getopt(3) manpage (and to my testing), getopt returns
'?' if an unknown option character is found and stores the offending
character in optopt.
This patch fixes the broken output in such a situation:
root@dosorca:/data/images# xl vcpu-list -j
option `?' not supported.
Name ID VCPU CPU State Time(s) CPU Affinity
Domain-0 0 0 0 -b- 193.1 any cpu
turns into:
root@dosorca:/data/images# xl vcpu-list -j
option `j' not supported.
Name ID VCPU CPU State Time(s) CPU Affinity
Domain-0 0 0 0 -b- 193.1 any cpu
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Fri, 28 Jan 2011 17:45:24 +0000 (17:45 +0000)]
xl: fix xl cpupool-list <poolid>
The help screen of xl cpupool-list promises to allow a CPU pool to
be named on the command line, so that only that pool is listed.
Probably because of a "De Morgan brain twist", this specific CPU pool
is _omitted_ instead. The patch fixes this, so single CPU pools
can be explicitly listed again.
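
A sketch of the kind of inverted condition described above. The
function names are hypothetical and the real xl code is not quoted in
this log; the point is only that the "skip" test must be the exact
negation of the "list" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* List a pool when no pool was requested, or when it is the one
 * requested. */
static bool sketch_should_list(const char *requested, const char *name)
{
    return requested == NULL || strcmp(requested, name) == 0;
}

/* De Morgan negation of the above: skip when a pool was requested AND
 * this is not it. */
static bool sketch_should_skip(const char *requested, const char *name)
{
    return requested != NULL && strcmp(requested, name) != 0;
}

/* A mis-negated variant: it skips exactly the requested pool, which is
 * the symptom described in the commit message. */
static bool sketch_buggy_skip(const char *requested, const char *name)
{
    return requested != NULL && strcmp(requested, name) == 0;
}
```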
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andre Przywara [Fri, 28 Jan 2011 17:43:50 +0000 (17:43 +0000)]
xl: remove unimplemented -l stub for cpupool-list
Although advertised via the usage output, xl cpupool-list -l just
returns ERROR_NI, which does not show up on the console. Instead the
output is empty, which is not exactly what --long hints at. To avoid
confusion remove the line from the help output and just ignore the -l
option properly until it is finally implemented.
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Juergen Gross [Fri, 28 Jan 2011 17:41:15 +0000 (17:41 +0000)]
xl: fix broken cpupool-numa-split
The implementation of xl cpupool-numa-split is broken. It adds nodes
to the wrong pool. This was probably a copy and paste error which
happened when libxl_cpupool_cpuadd_node() was introduced.
Reported-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
libxl: when using pygrub, do not segfault if no blktap
Running xl create configfile where configfile includes the lines
bootloader = "/usr/bin/pygrub"
disk = [ 'file:/dev/mapper/vg0-partname,xvda1,w' ]
causes xl to segfault at the line
ret = strdup(dev);
of libxl_device_disk_local_attach() in tools/libxl/libxl.c . The
problem is that dev is not set if libxl__blktap_enabled(&gc) is false
or if phystype isn't recognized. In the latter case we want to skip
that line and return NULL, but if libxl__blktap_enabled(&gc) is false
we should be returning something, at least in the cases where the
device has a name in the host which we can just refer to.
Also improve the error message when QCOW or QCOW2 are specified, and
avoid using an uninitialised value of "ret".
Signed-off-by: M A Young <m.a.young@durham.ac.uk> Signed-off-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Jan 2011 16:43:53 +0000 (16:43 +0000)]
libxl: correct error path in libxl_userdata_retrieve
Firstly, if libxl_read_file_contents fails, it doesn't really leave
*data and *datalen_r undefined - it leaves them unchanged. Tighten up
the spec for the benefit of libxl_userdata_retrieve.
Secondly, libxl_userdata_retrieve ignored errors, assuming they were
all ENOENT. Instead it should fail on unexpected errors.
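
The second point can be sketched as below. sketch_probe is a made-up
helper using plain stdio rather than the libxl functions named above;
it only illustrates treating ENOENT as "no data" while failing on any
other error, and leaving the output untouched on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Returns 0 on success and sets *found to 1 (file exists) or 0 (no
 * such file). On any other error, returns the errno value and leaves
 * *found unchanged. */
static int sketch_probe(const char *path, int *found)
{
    FILE *f;

    assert(path != NULL && found != NULL);

    f = fopen(path, "rb");
    if (!f) {
        if (errno == ENOENT) {
            *found = 0;      /* absence is an expected outcome */
            return 0;
        }
        return errno;        /* unexpected error: report it */
    }

    *found = 1;
    fclose(f);
    return 0;
}
```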
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Jim Fehlig <jfehlig@novell.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Currently, cur_pages (which is used as index into page_array for
fetching gfns) is used to judge whether it is proper here to allocate
1G pages. However, cur_pages == page_array[cur_pages] only holds true
when it is below 4G. When it is above 4G, page_array[cur_pages] -
cur_pages = 256M.
As a result, when a guest has 10G of memory, 8 1G-pages are allocated. But
only 2 of them have their corresponding gfns 1G aligned. The other 6
are forced to split to 2M pages, as their starting gfns are 4G+256M,
5G+256M, and so on.
Inside the patch, true gfns are used instead of cur_pages to fix this
issue.
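
A worked sketch of the alignment arithmetic. It assumes 4K pages and a
simplified layout in which the 256M shift starts at index 3.75G; both
the layout constant and the sketch_* names are assumptions for
illustration, not the libxc code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     12
#define SKETCH_PAGES_PER_1G   (1ULL << (30 - SKETCH_PAGE_SHIFT))
#define SKETCH_PAGES_PER_256M (1ULL << (28 - SKETCH_PAGE_SHIFT))
/* Assumed start of the shifted region, in pages (3.75G). */
#define SKETCH_SHIFT_START    (15 * SKETCH_PAGES_PER_256M)

/* Stand-in for page_array[index]: identity low down, +256M above. */
static uint64_t sketch_gfn_for_index(uint64_t index)
{
    return index < SKETCH_SHIFT_START ? index
                                      : index + SKETCH_PAGES_PER_256M;
}

static bool sketch_is_1g_aligned(uint64_t pfn)
{
    return (pfn & (SKETCH_PAGES_PER_1G - 1)) == 0;
}

/* Decide on a 1G mapping from the true gfn, not from the index. */
static bool sketch_can_use_1g(uint64_t index)
{
    return sketch_is_1g_aligned(sketch_gfn_for_index(index));
}
```

For index 4G the index itself is 1G aligned, but the gfn is 4G+256M,
so checking the index alone would pick a 1G page that then has to be
split.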
Signed-off-by: Shan Haitao <haitao.shan@intel.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Daniel Kiper [Thu, 27 Jan 2011 19:51:47 +0000 (19:51 +0000)]
tools/security: Adjust secpol_tool.c for change to xc_interface_open
xc_interface_open() was called with the wrong number of arguments.
This patch fixes that.
This appears to have been missed by 21483:779c0ef9682c. The interface
change also included the return type (int->xc_interface *) but that
was already covered in 21483.
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Daniel Kiper <dkiper@net-space.pl> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 27 Jan 2011 19:42:40 +0000 (19:42 +0000)]
Config.mk: commented-out CONFIG_QEMU example now uses `pwd`/$(XEN_ROOT)
If you actually set it to a relative path, the qemu build breaks.
So this commented-out rune (an example) should arrange to be absolute.
Unfortunately XEN_ROOT is itself relative so the previous attempt to
fix this (22772:654563af359f) didn't work. So use `pwd`.
Tested-by: M A Young <m.a.young@durham.ac.uk> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Christoph Egger [Thu, 27 Jan 2011 19:03:42 +0000 (19:03 +0000)]
libxc: break xc_get_physmem out into os-dependent files
NetBSD doesn't have sysconf(_SC_PHYS_PAGES).
Factor physmem() out into os-dependent files and rename it to
xc_get_physmem() so as not to pollute the namespace.
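
A sketch of what an os-dependent implementation might look like on a
system that does provide sysconf(_SC_PHYS_PAGES). The name
sketch_get_physmem and the return type are assumptions; the actual
xc_get_physmem signature is not shown in this log.

```c
#include <assert.h>
#include <unistd.h>

/* Returns physical memory in bytes, or 0 if it cannot be determined
 * through sysconf on this platform. */
static unsigned long long sketch_get_physmem(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long pagesize = sysconf(_SC_PAGESIZE);

    if (pages < 0 || pagesize < 0)
        return 0;
    return (unsigned long long)pages * (unsigned long long)pagesize;
#else
    /* Platforms without _SC_PHYS_PAGES need their own implementation
     * in a separate os-dependent file. */
    return 0;
#endif
}
```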
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 27 Jan 2011 18:59:07 +0000 (18:59 +0000)]
xl: Revert "xl: avoid creating domains with duplicate names"
This reverts commit 22820:310cc33bfc81. This functionality should not
be in the domain parsing logic. It needs to be in libxl_domain_make.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>