libxl: Accept disk name in libxl_devid_to_device_disk
Accept disk name in xl block-detach.
Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl>
xen-unstable changeset: 23604:5d7998be2252 Backport-requested-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
libxl: Remove frontend and backend devices from xenstore after destroy
Cleanup frontend and backend devices from xenstore for all dev types - not only
disks. Because backend cleanup moved to libxl__device_destroy,
libxl__devices_destroy is somewhat simpler.
Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl>
xen-unstable changeset: 23605:ff8d170852b3 Backport-requested-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
remus: handle exceptions while installing/uninstalling net buffer
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23600:15fc211a13bf Backport-requested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
At the end of a checkpoint, when a new flush (of buffered disk writes)
is merged with ongoing flush, we have to make sure that none of the new
disk I/O requests overlap with ones in progress. If one does, hold the
request and don't issue I/O until the overlapping one finishes. If we allow
the I/O to proceed, we might end up with two overlapping requests in the
disk's queue and the disk may not offer any guarantee on which one is
written first.
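The hold-on-overlap rule described above can be illustrated with a small
sketch. This is a toy model only, not the tapdisk code: the class name
FlushQueue, the (start_sector, sector_count) request tuples and the
method names are all hypothetical.

```python
def overlaps(a, b):
    """True if half-open sector ranges a=(start, count) and b intersect."""
    a_start, a_count = a
    b_start, b_count = b
    return a_start < b_start + b_count and b_start < a_start + a_count


class FlushQueue:
    """Toy model: a write is issued only if it overlaps nothing pending."""

    def __init__(self):
        self.in_flight = []  # requests already handed to the disk
        self.held = []       # requests waiting for an overlap to clear

    def submit(self, req):
        # Hold the request if it overlaps anything issued or already held,
        # so two overlapping writes never sit in the disk queue together.
        if any(overlaps(req, other) for other in self.in_flight + self.held):
            self.held.append(req)
            return False
        self.in_flight.append(req)
        return True

    def complete(self, req):
        self.in_flight.remove(req)
        # Re-examine held requests in arrival order; one that still
        # overlaps an issued or earlier held request keeps waiting.
        still_held = []
        for waiting in self.held:
            if any(overlaps(waiting, o) for o in self.in_flight + still_held):
                still_held.append(waiting)
            else:
                self.in_flight.append(waiting)
        self.held = still_held
```

Holding rather than reordering keeps the write order of overlapping
requests identical to their arrival order, which is the guarantee the
disk itself may not provide.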
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23414:ecff559bf474 Backport-requested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
DRBD disk backends can be used instead of tapdisk backends for Remus.
This requires a Remus style disk replication protocol (asynchronous
replication with output buffering at backup), that is not available in
standard DRBD code. A modified version that supports this new replication
protocol is available from git://aramis.nss.cs.ubc.ca/drbd-8.3-remus
Use of DRBD disk backends provides a means for efficient
resynchronization of data after the crashed machine comes back
online. Since DRBD allows for online resynchronization, a DRBD backed
Remus VM does not have to be stopped or shutdown while the disks are
resynchronizing. Once resynchronization is complete, Remus can be
started at will.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23413:62c0dfc9efbf Backport-requested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
This was introduced in 23195:13ec53a59a42
It is a problem for Python 2.4 and earlier, only.
So use try...(try...except)...finally as suggested by Ian Campbell.
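For reference, Python 2.4 and earlier reject a single try statement
that has both except and finally clauses; the combined form arrived in
Python 2.5. A minimal sketch of the nested form follows. The function
and argument names are hypothetical and are not taken from the remus
source.

```python
def install_buffer(install, uninstall, log):
    """Run install(), record a failure, and always run uninstall()."""
    try:
        # Inner try handles the exception ...
        try:
            install()
        except RuntimeError:
            log.append("install failed")
    finally:
        # ... outer try guarantees cleanup on every exit path.
        uninstall()
```

The nested form behaves the same as try/except/finally on newer
interpreters, so it is safe to use everywhere.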
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23379:b04e57ec4671 Backport-requested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
The new --null option allows one to test and play with just the
memory checkpointing and network buffering aspect of remus, without
the need for a second host. The disk is not replicated. All replication
data is sent to /dev/null. This option is pretty handy when a user
wants to see the page churn for his workload or observe the latency hit
though the latter will not be accurate.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23196:29d81623dc14 Backport-requested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
current check includes ingress and pfifo_fast.
Add mq to the list of allowed qdiscs already installed
on ifb. This patch fixes cases where remus fails to start,
due to an mq qdisc already present on the vif.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23109:c8ae80a11d47
Backport-requested: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 30 Aug 2011 15:57:05 +0000 (16:57 +0100)]
xl: print sxp on dry-run of create.
The help text for xm create's --dry-run says "Dry run - prints the
resulting configuration in SXP but does not create the domain." so
update xl implementation to match. At least the xendomains initscript
relies on this (for better or worse).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Carsten Schiers <carsten@schiers.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23467:2ae357405850
Backport-requested: Carsten Schiers <carsten@schiers.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Fabio Fantoni [Tue, 30 Aug 2011 15:56:22 +0000 (16:56 +0100)]
tools: Improved LSB headers in init.d scripts
xendomains service now working also without xend service
Signed-off-by: Fabio Fantoni <fabio.fantoni@heliman.it> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23673:0648846b4d17
Backport-requested: Carsten Schiers <carsten@schiers.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
/etc/init.d/xendomains relies on simple pattern matching against the
structures printed by the "xl list -l" command, so update the xl
implementation to match.
Signed-off-by: Carsten Schiers <carsten@schiers.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23567:c2995f0555af Backported-by: Carsten Schiers <carsten@schiers.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
David Vrabel [Mon, 22 Aug 2011 09:16:15 +0000 (10:16 +0100)]
x86: use 'dom0_mem' to limit the number of pages for dom0
Use the 'dom0_mem' command line option to set the maximum number of
pages for dom0. dom0 can then use the XENMEM_maximum_reservation
memory op to automatically find this limit and reduce the size of any
page tables etc.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
xen-unstable changeset: 23779:c56dd5eb0fa2
xen-unstable date: Mon Aug 22 10:05:27 2011 +0100
Kevin Tian [Mon, 22 Aug 2011 09:14:14 +0000 (10:14 +0100)]
cpuidle: initialize default Cstate information
C0/C1 should always be available when cpuidle is enabled in Xen.
When Dom0 doesn't register ACPI Cstate information, e.g. due to a
BIOS issue or because the acpi processor module is not installed,
this patch makes basic C0/C1 information available to the xenpm tool.
Andrew Cooper [Fri, 19 Aug 2011 09:00:25 +0000 (10:00 +0100)]
x86/KEXEC: disable hpet legacy broadcasts earlier
On x2apic machines which booted in xapic mode,
hpet_disable_legacy_broadcast() sends an event check IPI to all online
processors. This leads to a protection fault as the genapic blindly
pokes x2apic MSRs while the local apic is in xapic mode.
One option is to change genapic when we shut down the local apic, but
there are still problems with trying to IPI processors in the online
processor map which are actually sitting in NMI loops.
Another option is to have each CPU take itself out of the online CPU
map during the NMI shootdown.
Realistically however, disabling hpet legacy broadcasts earlier in the
kexec path is the easiest fix to the problem.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23776:0ddb4481f883
xen-unstable date: Fri Aug 19 09:58:22 2011 +0100
Jan Beulich [Tue, 16 Aug 2011 14:21:46 +0000 (15:21 +0100)]
x86/PCI-MSI: properly determine VF BAR values
As was discussed a couple of times on this list, SR-IOV virtual
functions have their BARs read as zero - the physical function's
SR-IOV capability structure must be consulted instead. The bogus
warnings people complained about are being eliminated with this
change.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23766:8d6edc3d26d2
xen-unstable date: Sat Aug 13 10:14:58 2011 +0100
PCI: consolidate interface for adding devices
The functionality of pci_add_device_ext() can be easily folded into
pci_add_device(), and eliminates the need to change two functions for
future adjustments.
Andrew Cooper [Tue, 16 Aug 2011 14:17:43 +0000 (15:17 +0100)]
x86: IRQ fix incorrect logic in __clear_irq_vector
In the old code, tmp_mask is the cpu_and of cfg->cpu_mask and
cpu_online_map. However, in the usual case of moving an IRQ from one
PCPU to another because the scheduler decides it's a good idea,
cfg->cpu_mask and cfg->old_cpu_mask do not intersect. This causes the
old cpu vector_irq table to keep the irq reference when it shouldn't.
This leads to a resource leak if a domain is shut down while an irq has
a move pending, which results in Xen's create_irq() eventually failing
with -ENOSPC when all vector_irq tables are full of stale references.
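The leak can be shown with a toy model. This is only a sketch under
stated assumptions: CPU masks are Python sets, each per-CPU vector_irq
table is a dict from vector to irq, and the "fixed" variant is one
plausible shape of the correction, not the actual patch.

```python
def release(vector_irq, irq, cpus):
    """Drop every vector on the given CPUs that still points at irq."""
    for cpu in cpus:
        table = vector_irq[cpu]
        for vec in [v for v, owner in table.items() if owner == irq]:
            del table[vec]


def clear_old(vector_irq, irq, cpu_mask, old_cpu_mask, online):
    # Behaviour described for the old code: only cpu_mask & online is
    # visited, so a pending move leaves the old CPU's entry behind.
    release(vector_irq, irq, cpu_mask & online)


def clear_with_pending_move(vector_irq, irq, cpu_mask, old_cpu_mask, online):
    # Assumed fix: also visit the CPUs the irq is moving away from.
    release(vector_irq, irq, (cpu_mask | old_cpu_mask) & online)
```

With cpu_mask = {1} and old_cpu_mask = {0}, clear_old() leaves CPU 0's
entry in place; repeated often enough, such stale entries fill the
tables.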
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23765:68b903bb1b01
xen-unstable date: Sat Aug 13 10:14:28 2011 +0100
Jan Beulich [Tue, 16 Aug 2011 14:17:06 +0000 (15:17 +0100)]
VT-d: don't reject valid DMAR/ATSR tables on systems with multiple PCI segments
On multi-PCI-segment systems, each segment has to be expected to have
an include-all DRHD and an all-ports ATSR, so the firmware consistency
check incorrectly rejects valid configurations there (which is
particularly problematic when the firmware also pre-enabled x2apic
mode, as the system will panic in that case due to being unable to
enable interrupt remapping). Thus constrain the check to just segment
0 for now; once full multi-segment support is there (which I'm working
on), it can be revisited whether we'd want to track this per segment,
or whether we trust the firmware of such large systems.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23763:8f647d409196
xen-unstable date: Sat Aug 13 10:12:49 2011 +0100
Tim Deegan [Mon, 25 Jul 2011 15:48:39 +0000 (16:48 +0100)]
VT-d: always clean up dpci timers.
If a VM has all its PCI devices deassigned, need_iommu(d) becomes
false but it might still have DPCI EOI timers that were init_timer()d
but not yet kill_timer()d. That causes xen to crash later because the
linked list of inactive timers gets corrupted, e.g.:
tools: xencommons NetBSD init script: Multiple bugfixes and improvements
Added a cleanup of the xenstore database, to purge old entries,
prevented the restart of xenstore and set Domain-0 name. Also
replaced the sleep 5 (wait for xenstore to come up) with the method
used in the linux init script.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23699:6fe9f26bb9ae
xen-unstable date: Fri Jul 15 18:22:03 2011 +0100
xend: NetBSD portability fix for LVM raw volume names
Xen 4.1.1 was incorrectly passing /dev/rmapper/vg-lvname to pygrub
(notice the r in front of mapper), when it should pass
/dev/mapper/rvg-lvname (add the r to the last file) when using NetBSD.
I've patched it to work correctly. I'm attaching a unified diff with
the patch made against Xen 4.1.1 (it's a really simple modification).
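The intended path transformation can be sketched as below. The function
name is hypothetical and this is not the xend code; it only shows the
"r" being added to the last path component rather than to the
directory.

```python
import posixpath


def netbsd_raw_device(path):
    """Return the raw-device name: prefix 'r' to the final component."""
    directory, name = posixpath.split(path)
    return posixpath.join(directory, "r" + name)
```

For example, netbsd_raw_device("/dev/mapper/vg-lvname") gives
"/dev/mapper/rvg-lvname", the form the message says pygrub should
receive.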
From: Roger Pau Monne <roger.pau@entel.upc.edu> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23629:89ce3439686b
xen-unstable date: Tue Jun 28 13:56:53 2011 +0100
Mike McClurg [Thu, 21 Jul 2011 13:40:39 +0000 (14:40 +0100)]
tools/ocaml: ask compiler for correct library
OCaml libraries will live in /usr/local/ if the user compiles OCaml
from source. This patch asks the OCaml compiler where we should look
for libraries.
NB: it may be that we should do the same thing for the NetBSD case,
but I don't have a BSD box to test this out.
Signed-off-by: Mike McClurg <mike.mcclurg@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23566:7e5b54d1643e
xen-unstable date: Tue Jun 21 18:01:51 2011 +0100
hvmloader: Switch to absolute addressing for calling hypercall stubs.
This is clearer and less fragile than trying to make relative calls
work. In particular, the old approach failed if _start was not
== HVMLOADER_PHYSICAL_ADDRESS. This was the case for some modern
toolchains which reorder functions.
Jan Beulich [Sat, 16 Jul 2011 08:33:46 +0000 (09:33 +0100)]
x86: fix guest migration after c/s 20892:d311d1efc25e
Guests would not manage to run successfully after being migrated to a
host having sufficiently much more memory than the host they were
originally started on.
Subsequently the plan is to re-enable the changed behavior under the
control of a guest kernel announced feature flag.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen-unstable changeset: 23706:3dd399873c9e
xen-unstable date: Sat Jul 16 09:18:21 2011 +0100
David Vrabel [Sat, 16 Jul 2011 08:33:07 +0000 (09:33 +0100)]
xen/libxc: set CPUID topology leaf as unsupported for PV guests
The result of a CPUID Extended Topology Enumeration leaf for PV guests
is invalid as the level in ECX is ignored. This can cause some guests
to loop endlessly when trying to enumerate the topology.
Since the physical topology isn't useful to PV guests set the topology
leaf as unsupported.
Guests affected include Linux kernels prior to 2.6.32 where a workaround
was applied ("xen: mask extended topology info in cpu", 82d6469916c6fcfa345636a49004c9d1753905d1).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
xen-unstable changeset: 23700:867bb675b57b
xen-unstable date: Sat Jul 16 09:05:45 2011 +0100
23408:1fc3347850c7 causes the following error:
machine_kexec.c:106: error: static declaration of
'machine_kexec_get_xen' follows non-static declaration
/xen-unstable.hg/xen/include/xen/kexec.h:39: error: previous
declaration of 'machine_kexec_get_xen' was here
Andrew Cooper [Fri, 8 Jul 2011 07:57:11 +0000 (08:57 +0100)]
KEXEC: disconnect all PCI devices from the PCI bus on crash
In the case of a crash, IOMMU DMA remapping gets turned off so that
the kdump kernel may boot. However, this is warned as being dangerous
in the VTD specification if a DMA transaction is in progress.
Also, in the case of a crash, DMA transactions and interrupts from
peripheral devices such as network cards are likely to keep coming in.
Without DMA remapping enabled, the transactions will be writing over
low memory, corrupting the crash state, and perhaps even the kdump
reserved memory.
Therefore, on the crash path, we can disconnect all PCI devices from
their respective buses so that they are no longer able to be DMA
busmasters. This reduces the risk of DMA transactions corrupting
state (and will also reduce spurious interrupts arriving to the kdump
kernel) until the kdump kernel can properly reset the PCI devices.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23666:b96f8bdcaa15
xen-unstable date: Fri Jul 08 08:38:35 2011 +0100
Paul Durrant [Fri, 8 Jul 2011 07:56:42 +0000 (08:56 +0100)]
x86/hvm: Don't expose CPUID time leaf when not using PVRDTSCP
Some versions of Oracle's Solaris PV drivers make a check that the
maximal Xen hypervisor CPUID leaf is <= base leaf + 2 and refuse to
work if this is not the case. The addition of the time leaf makes the
maximal leaf == base leaf + 3 so this patch introduces a workaround
that obscures the time leaf unless PVRDTSCP is in operation.
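The arithmetic behind the workaround, as a small sketch. The function
names are hypothetical; 0x40000000 is used as the hypervisor CPUID base
leaf purely for illustration.

```python
XEN_CPUID_BASE = 0x40000000  # illustrative base leaf


def max_xen_cpuid_leaf(base_leaf, pvrdtscp_active):
    """Report base + 3 (time leaf visible) only when PVRDTSCP is in use."""
    return base_leaf + (3 if pvrdtscp_active else 2)


def solaris_driver_accepts(base_leaf, max_leaf):
    """The check the message attributes to the affected PV drivers."""
    return max_leaf <= base_leaf + 2
```

With PVRDTSCP off the reported maximum stays at base + 2, so the
driver's check passes.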
x86 cpu: Fix bug: unify cpu_dev attr as __cpuinitdata
Currently different x86 CPU types define different attributes for cpu_dev.
Some define it as __initdata, which is risky under CPU hotplug.
This patch fixes the bug by unifying them as __cpuinitdata, as the AMD
CPU code already does.
Tim Deegan [Tue, 28 Jun 2011 08:32:00 +0000 (09:32 +0100)]
x86: fix boot-time watchdog test.
Since the perf counter that the LAPIC NMI watchdog uses only
runs while the core isn't halted, and all APs are idle at
this point in the boot process, it's possible that remote
CPUs won't see any NMIs during the 10-tick waiting period.
Force all CPUs to busy-wait so we know the timers are running.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23612:6c7a23e08a04
xen-unstable date: Tue Jun 28 09:16:13 2011 +0100
pv-on-hvm: hvm_domain_use_pirq return positive no matter if the evtchn is bound
This patch fixes PV on HVM interrupt remapping with recent Linux
kernels and upstream qemu. hvm_domain_use_pirq should return positive
even if the evtchn is not currently bound. If it doesn't assert_irq
ends up injecting legacy interrupts even after the guest disabled the
irq.
Keir Fraser [Thu, 23 Jun 2011 10:54:53 +0000 (11:54 +0100)]
x86/hvm: add SMEP support to HVM guest
Intel new CPU supports SMEP (Supervisor Mode Execution
Protection). SMEP prevents software operating with CPL < 3 (supervisor
mode) from fetching instructions from any linear address with a valid
translation for which the U/S flag (bit 2) is 1 in every
paging-structure entry controlling the translation for the linear
address.
This patch adds SMEP support to HVM guest.
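The SMEP rule quoted above can be written as a predicate. This is a
sketch of the rule as stated in the message, not hypervisor code; the
function name and the list-of-booleans representation of the U/S flags
are assumptions.

```python
def smep_blocks_fetch(cpl, smep_enabled, us_flags):
    """True if an instruction fetch would be refused under SMEP.

    us_flags holds the U/S bit (True = user) of every paging-structure
    entry that controls the translation of the fetched address.
    """
    flags = list(us_flags)
    # Blocked only for supervisor-mode fetches from a page that is
    # user-accessible at every level of the translation.
    return bool(smep_enabled and cpl < 3 and flags and all(flags))
```

A single entry with U/S = 0 makes the page a supervisor page, so the
fetch is allowed.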
Signed-off-by: Yang Wei <wei.y.yang@intel.com> Signed-off-by: Shan Haitao <haitao.shan@intel.com> Signed-off-by: Li Xin <xin.li@intel.com> Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23504:c34604d5a293
xen-unstable date: Mon Jun 06 13:46:48 2011 +0100
Intel new CPU supports SMEP (Supervisor Mode Execution
Protection). SMEP prevents software operating with CPL < 3 (supervisor
mode) from fetching instructions from any linear address with a valid
translation for which the U/S flag (bit 2) is 1 in every
paging-structure entry controlling the translation for the linear
address.
This patch enables SMEP in Xen to protect Xen hypervisor from
executing pv guest instructions, whose translation paging-structure
entries' U/S flags are all set.
Signed-off-by: Yang Wei <wei.y.yang@intel.com> Signed-off-by: Shan Haitao <haitao.shan@intel.com> Signed-off-by: Li Xin <xin.li@intel.com> Signed-off-by: Keir Fraser <keir@xen.org>
xen-unstable changeset: 23481:0c0884fd8b49
xen-unstable date: Fri Jun 03 21:39:00 2011 +0100
Keir Fraser [Thu, 23 Jun 2011 10:48:18 +0000 (11:48 +0100)]
kexec: Backport fixes from xen-unstable
KEXEC: prevent panic on the kexec path when talking to the DMAR
hardware
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23547:b5955b9fc26c
xen-unstable date: Thu Jun 16 16:11:13 2011 +0100
KEXEC: correctly revert x2apic state when kexecing
Introduce the boolean variable 'kexecing' which indicates to functions
whether we are on the kexec path or not. This is used by
disable_local_APIC() to try and revert the APIC mode back to how it
was found on boot.
We also need some fudging of the x2apic_enabled variable. It is used
in multiple places over the codebase to mean multiple things,
including:
What did the user specify on the command line?
Did the BIOS boot me in x2apic mode?
Is the BSP Local APIC in x2apic mode?
What mode is my Local APIC in?
Therefore, set it up to prevent a protection fault when disabling the
IOAPICs. (In this case, it is used in the "What mode is my Local APIC
in?" case, so the processor doesn't suffer a protection fault because
of trying to use x2apic MSRs when it should be using xapic MMIO)
Finally, make sure that interrupts are disabled when jumping into the
purgatory code. It would be bad to service interrupts in the Xen
context when the next kernel is booting.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23542:23c068b10923
xen-unstable date: Wed Jun 15 16:16:41 2011 +0100
IOMMU: add crash_shutdown iommu_op
The kdump kernel has problems booting with interrupt/dma
remapping enabled, so we need a new iommu_ops called
crash_shutdown which is basically suspend but doesn't
need to bother saving state.
Make sure that crash_shutdown is called on the kexec
path. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23541:c6307ddd3ab1
xen-unstable date: Wed Jun 15 16:10:11 2011 +0100
Experimental evidence shows that Extended Interrupt Mode remains in
effect even after Interrupt Remapping is disabled in each DMAR Global
Command Register. A consequence of this is that when we switch from
x2apic mode back to xapic mode, and disable interrupt remapping for
the kdump kernel, interrupts passing through the IO APICs are in
x2apic format as opposed to xapic. This causes a triple fault in the
kexec kernel.
As EIM is explicitly set up each time Interrupt Remapping is enabled,
it is safe for us to clobber this when tearing down.
Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
verbose and error-prone code, and was only used in 1 place before. We
now have IRTA_EIME which is the specific bit in the register.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23540:96f53d2b966e
xen-unstable date: Wed Jun 15 16:07:45 2011 +0100
Experimental evidence shows that Extended Interrupt Mode remains in
effect even after Interrupt Remapping is disabled in each DMAR Global
Command Register. A consequence of this is that when we switch from
x2apic mode back to xapic mode, and disable interrupt remapping for
the kdump kernel, interrupts passing through the IO APICs are in
x2apic format as opposed to xapic. This causes a triple fault in the
kexec kernel.
As EIM is explicitly set up each time Interrupt Remapping is enabled,
it is safe for us to clobber this when tearing down.
Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
verbose and error-prone code, and was only used in 1 place before. We
now have IRTA_EIME which is the specific bit in the register.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen-unstable changeset: 23515:337520d94cba
xen-unstable date: Tue Jun 14 13:04:09 2011 +0100
x86/apic: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems
when shutting down for a kexec jump. This patch records the boot
state so we can return to the boot state when kexecing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Keir Fraser <keir@xen.org> Acked-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23514:d04608ad70f8
xen-unstable date: Tue Jun 14 13:02:00 2011 +0100
x86/kexec: nmi_shootdown_cpus() should leave irqs disabled
Jan Beulich [Wed, 15 Jun 2011 19:45:54 +0000 (20:45 +0100)]
x86-64: fix incorrect assertion in __maddr_to_virt()
When memory map sparseness reduction is in use, machine address ranges
can't validly be compared directly against the total size of the
direct mapping range.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23543:a8edfacd4b5e
xen-unstable date: Wed Jun 15 20:24:09 2011 +0100
George Dunlap [Wed, 15 Jun 2011 19:45:20 +0000 (20:45 +0100)]
x86/hvm: Crash domain rather than guest on unexpected PIO IO state
Under certain conditions, if an IO gets into an unexpected state,
hvmemul_do_io can return X86EMUL_UNHANDLEABLE. Unfortunately,
handle_pio() does not expect this state, and calls BUG() if it sees
it, crashing the host.
Other HVM io-related code crashes the guest in this case. This patch
makes handle_pio() do the same.
The crash was seen when executing crash_guest in dom0 to forcibly
crash the guest.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
xen-unstable changeset: 23538:35b4220c98bc
xen-unstable date: Wed Jun 15 16:05:14 2011 +0100
Andrew Cooper [Wed, 15 Jun 2011 19:44:44 +0000 (20:44 +0100)]
x86/apic: fix potential Protection Fault during shutdown
This is a rare case, but if the BIOS is set to uniprocessor, and Xen
is booted with 'lapic x2apic', Xen will switch into x2apic mode, which
will cause a protection fault when disabling the local APIC. This
leads to a general protection fault as this code is also in the fault
handler.
When x2apic mode is enabled, the only transition which does
not result in a protection fault is to clear both the EN and EXTD
bits, which is safe to do in all cases, even if you are in xapic
mode rather than x2apic mode.
The linux code from which this is derived is protected by an
if ( ! x2apic_mode ...) clause which is how they get away with it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23512:0feb98534a87
xen-unstable date: Tue Jun 14 12:47:45 2011 +0100
mem_event: Revert pointless, unrelated, and broken (on i386) change in 23434:ef410f262299
vcpu_pause() is nestable in the hypervisor, hence checking for
already-paused is not required.
Signed-off-by: Keir Fraser <keir@xen.org>
xen-unstable changeset: 23435:c15f06b99bbe
xen-unstable date: Sat May 28 08:33:54 2011 +0100
mem_event: Allow memory access listener to perform single step execution.
Add a new memory event that handles single step. This allows the
memory access listener to handle instructions that modify data within
the execution page. This can be enabled in the listener by doing:
xc_set_hvm_param(xch, domain_id, HVM_PARAM_MEMORY_EVENT_SINGLE_STEP,
HVMPME_mode_sync)
Now the listener can start single stepping by:
xc_domain_debug_control(xch, domain_id,
XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON, vcpu_id)
And stop single stepping by:
xc_domain_debug_control(xch, domain_id,
XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF, vcpu_id)
Signed-off-by: Aravindh Puthiyaparambil <aravindh@virtuata.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23434:ef410f262299
xen-unstable date: Fri May 27 18:44:26 2011 +0100
Markus Gross [Sat, 28 May 2011 08:09:40 +0000 (09:09 +0100)]
libxc: obtain correct length of p2m during core dumping
While implementing core dumping functionality for the libxl driver
of libvirt, I discovered an issue with mapping pages of a pv guest.
After dumping the core of a pv guest the domain was not cleared up
properly and some pages were not unmapped. This issue is similar
to the one reported here:
http://lists.xensource.com/archives/html/xen-devel/2011-05/msg01314.html
In xc_domain_dumpcore_via_callback in the file xc_core.c the function
xc_core_arch_map_p2m is called to map P2M_FL_ENTRIES pages to the
variable p2m.
But to unmap the pages later, the dinfo->p2m_size has to be set
accordingly.
This was not done, instead a variable named p2m_size was set.
This way P2M_FL_ENTRIES was always zero and the pages were left
mapped.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23374:8bd7b5e98f2a
xen-unstable date: Tue May 24 15:00:16 2011 +0100
Jim Fehlig [Sat, 28 May 2011 08:08:21 +0000 (09:08 +0100)]
libxc: after saving, unmap correct amount for live_m2p
With some help from Olaf, I've finally got to the bottom of an issue I
came across while trying to implement save/restore in the libvirt
libxenlight driver. After issuing the save operation, the saved
domain was not being cleaned up properly and left in this state from
xl's perspective
xen33:# xl list
Name ID Mem VCPUs State Time(s)
Domain-0 0 6821 8 r----- 122.5
(null) 2 2 2 --pssd 10.8
Checking the libvirtd /proc/$pid/maps I found this
So not all pages belonging to the domain were unmapped from
libvirtd. In tools/libxc/xc_domain_save.c we found that
P2M_FL_ENTRIES were being mapped but only P2M_FLL_ENTRIES were being
unmapped. The attached patch changes the unmapping to use the same
P2M_FL_ENTRIES macro. I'm not too familiar with this code though so
posting here for review.
I suspect this was not noticed before since most (all?) processes
doing save terminate after the save and are not long-running like
libvirtd.
Ian Campbell writes:
> Looks like I introduced this in 18558:ccf0205255e1, sorry!
>
> I guess it is also wrong in the error path out of map_and_save_p2m_table
> and so we also need [another hunk].
This change should be backported to relevant earlier trees. -iwj
From: Jim Fehlig <jfehlig@novell.com>
From: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 23373:171007b4e2c4
xen-unstable date: Tue May 24 14:50:00 2011 +0100
Tim Deegan [Tue, 24 May 2011 07:19:39 +0000 (08:19 +0100)]
drivers/passthrough: fix error paths in pci_add_device*()
When a device can't be allocated to dom0 by the IOMMU, don't leave
dom0 in the "domain" field. It causes pci_remove_device()
to crash trying to remove the dev from the domain's list of devices
(and was probably the wrong thing to do anyway).
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23371:4326bcd88b33
xen-unstable date: Mon May 23 18:35:32 2011 +0100
Keir Fraser [Tue, 24 May 2011 07:18:42 +0000 (08:18 +0100)]
Fix Config.mk's cc-option for -Wno-* options.
These disable-warning options are handled specially by GCC:
(a) they are ignored unless the compiler emits a warning; and
(b) even then they produce a warning rather than an error
To handle this, modify the test invocation of GCC to compile a
fragment of code that will always provoke a warning (integer assigned
to pointer). This works around (a) above.
Then, we grep the compiler's stdout/stderr for the option-under-test,
the presence of which would indicate an "unrecognized command-line
option" warning/error. This works around (b) above, letting us
distinguish between the "integer assigned to pointer" and
"unrecognized command-line option" warnings.
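The detection logic can be sketched outside of make. Everything below
is an illustration under assumptions: the test fragment, the function
names and the compiler flags passed by probe() are not taken from
Config.mk.

```python
import subprocess

# Assumed fragment that always provokes an "integer assigned to
# pointer" style warning, forcing GCC to report unknown -Wno-* options.
TEST_FRAGMENT = "void *p = 1;\n"


def option_supported(compiler_output, option):
    """The option is supported if no diagnostic mentions it by name."""
    return option not in compiler_output


def probe(cc, option):
    """Compile the fragment with the option and inspect the diagnostics."""
    result = subprocess.run(
        [cc, option, "-S", "-o", "/dev/null", "-xc", "-"],
        input=TEST_FRAGMENT, capture_output=True, text=True)
    # Look at stdout and stderr together, as the message describes.
    return option_supported(result.stdout + result.stderr, option)
```

Searching for the option name, rather than checking the exit status,
is what separates the deliberately provoked warning from the
"unrecognized command-line option" one.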
Jan Beulich [Fri, 20 May 2011 12:49:36 +0000 (13:49 +0100)]
x86: clear CPUID output of leaf 0xd for Dom0 when xsave is disabled
Linux starting with 2.6.36 uses the XSAVEOPT instruction and has
certain code paths that look only at the feature bit reported through
CPUID leaf 0xd sub-leaf 1 (i.e. without qualifying the check with one
evaluating leaf 4 output). Consequently the hypervisor ought to mimic
actual hardware in clearing leaf 0xd output when not supporting xsave.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23353:a768a10d32b4
xen-unstable date: Fri May 20 08:54:45 2011 +0100
Olaf Hering [Fri, 20 May 2011 12:47:08 +0000 (13:47 +0100)]
x86/mm: add HVMOP_get_mem_type hvmop
The balloon driver in the guest frees guest pages and marks them as
mmio. When the kernel crashes and the crash kernel attempts to read
the
oldmem via /proc/vmcore a read from ballooned pages will generate 100%
load in dom0 because Xen asks qemu-dm for the page content. Since the
reads come in as 8byte requests each ballooned page is tried 512
times.
Add a new hvmop HVMOP_get_mem_type to return the hvmmem_type_t for the
given pfn. Pages which are neither ram or mmio will be HVMMEM_mmio_dm.
This interface enables the crash kernel to skip ballooned pages.
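The figure of 512 follows from the page size, and the skip logic is a
simple filter. A sketch, assuming 4 KiB pages; the helper names and the
dict of pfn to type name are hypothetical, and the strings only mirror
the hvmmem_type_t names.

```python
PAGE_SIZE = 4096  # bytes, assuming 4 KiB x86 pages
READ_SIZE = 8     # bytes per request, as stated in the message


def requests_per_page(page_size=PAGE_SIZE, read_size=READ_SIZE):
    """Number of reads needed to walk one page."""
    return page_size // read_size


def readable_pfns(mem_types):
    """Return the pfns worth reading, skipping HVMMEM_mmio_dm pages."""
    return [pfn for pfn, kind in sorted(mem_types.items())
            if kind != "HVMMEM_mmio_dm"]
```

Skipping a ballooned page therefore avoids 512 round trips to qemu-dm.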
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23298:26413986e6e0
xen-unstable date: Wed May 04 13:37:58 2011 +0100
Igor Mammedov [Mon, 16 May 2011 12:38:09 +0000 (13:38 +0100)]
VT-d: Fix resource leaks on error paths
On error exit from functions, mapped pages should be unmapped
and acquired locks released.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Allen Kay <allen.m.kay@intel.com>
xen-unstable changeset: 23343:edcf8fc77b64
xen-unstable date: Mon May 16 13:29:24 2011 +0100
Ian Campbell [Mon, 16 May 2011 12:36:45 +0000 (13:36 +0100)]
x86/ioapic: avoid gcc 4.6 warnings about uninitialised variables
gcc 4.6 complains:
io_apic.c: In function 'restore_IO_APIC_setup':
/build/user-xen_4.1.0-3-amd64-zSon7K/xen-4.1.0/debian/build/build-hypervisor_amd64_amd64/xen/include/asm/io_apic.h:150:26:
error: '*((void *)&entry+4)' may be used uninitialized in this
function [-Werror=uninitialized]
io_apic.c:221:32: note: '*((void *)&entry+4)' was declared
here
/build/user-xen_4.1.0-3-amd64-zSon7K/xen-4.1.0/debian/build/build-hypervisor_amd64_amd64/xen/include/asm/io_apic.h:150:26:
error: 'entry' may be used uninitialized in this function
[-Werror=uninitialized]
io_apic.c:221:32: note: 'entry' was declared here
cc1: all warnings being treated as errors
Add functions to read/write an entire IO APIC entry using an explicit
union to allow gcc to spot the initialisation.
Reported as Debian bug #625438, thanks to Matthias Klose.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@novell.com>
xen-unstable changeset: 23341:87084ca76c9c
xen-unstable date: Mon May 16 13:13:41 2011 +0100
Keir Fraser [Thu, 12 May 2011 17:03:47 +0000 (18:03 +0100)]
x86, vtd: [CVE-2011-1898] Protect against malicious MSIs from untrusted devices.
In the absence of VT-d interrupt remapping support, a device can send
arbitrary APIC messages to host CPUs. One class of attack that results
is to confuse the hypervisor by delivering asynchronous interrupts to
vectors that are expected to handle only synchronous
traps/exceptions.
We block this class of attack by:
(1) setting APIC.TPR=0x10, to block all interrupts below vector
0x20. This blocks delivery to all architectural exception vectors.
(2) checking APIC.ISR[vec] for vectors 0x80 (fast syscall) and 0x82
(hypercall). In these cases we BUG if we detect we are handling a
hardware interrupt -- turning a potentially more severe infiltration
into a straightforward system crash (i.e., DoS).
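Both checks can be modelled in a few lines. This is a sketch of the
APIC rules, not the Xen code: the function names are hypothetical and
the ISR is represented as a single 256-bit integer.

```python
def blocked_by_tpr(vector, tpr):
    """An interrupt is held back unless its priority class (vector
    bits 7:4) is strictly higher than the TPR's class."""
    return (vector >> 4) <= (tpr >> 4)


def in_service(isr, vector):
    """True if the ISR bit for this vector is set, i.e. the CPU is
    currently servicing a hardware interrupt on that vector."""
    return bool((isr >> vector) & 1)
```

With TPR = 0x10 every vector from 0x00 to 0x1f is blocked, while 0x20
and above are still delivered. For vectors 0x80 and 0x82 a set ISR bit
is the condition under which the patch is described as calling BUG.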
Thanks to Invisible Things Lab <http://www.invisiblethingslab.com>
for discovery and detailed investigation of this attack.
The checks in assert_irq and deassert_irq to distinguish interrupts
that have been remapped onto event channels from the others that have
to be injected using the emulated lapic are wrong.
Fix the condition checks using the convenient hvm_domain_use_pirq
function.
Tim Deegan [Thu, 12 May 2011 08:19:29 +0000 (09:19 +0100)]
x86: use compat hypercall handlers for calls from 32-bit HVM guests
On 64-bit Xen, hypercalls from 32-bit HVM guests are handled as
a special case, but not all the hypercalls are correctly redirected
to their compat-mode wrappers. Use compat_* for xen_version,
sched_op and set_timer_op for consistency.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
xen-unstable changeset: 23333:fabdd682420c
xen-unstable date: Thu May 12 09:13:18 2011 +0100
Ian Jackson [Mon, 9 May 2011 14:04:01 +0000 (15:04 +0100)]
libxc: [CVE-2011-1583] pv kernel image validation
The functions which interpret the kernel image supplied for a
paravirtualised guest, and decompress it into memory when booting the
domain, are incautious. Specifically:
(i) Integer overflow in the decompression loop memory allocator might
result in overrunning the buffer used for the decompressed image;
(ii) Integer overflows and lack of checking of certain length fields
can result in the loader reading its own address space beyond the
size of the supplied kernel image file.
(iii) Lack of error checking in the decompression loop can lead to an
infinite loop.
This patch fixes these problems.
CVE-2011-1583.
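Problem (i) is the classic wrap-around when a buffer size is grown in
fixed-width arithmetic. The sketch below emulates 32-bit unsigned
arithmetic in Python to show the effect; it is not the libxc code, and
the function names and the doubling strategy are assumptions.

```python
UINT32_MAX = 0xFFFFFFFF


def grow_unchecked(size):
    """Double a 32-bit size with silent wrap-around, as C would."""
    return (size * 2) & UINT32_MAX


def grow_checked(size, limit=UINT32_MAX):
    """Double a size, refusing to grow past the representable limit."""
    if size > limit // 2:
        raise OverflowError("decompressed image too large")
    return size * 2
```

grow_unchecked(0x80000000) returns 0, so a later copy into the "grown"
buffer would overrun it; the checked variant fails instead.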
Signed-off-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>