Keir Fraser [Mon, 30 Aug 2010 07:50:52 +0000 (08:50 +0100)]
x2APIC: Improve x2APIC suspend/resume
x2apic depends on interrupt remapping, so interrupt remapping should
be disabled only after x2apic has been disabled. This patch also wraps
__enable_x2apic to get rid of duplicated code.
Signed-off-by: Weidong Han <weidong.han@intel.com>
xen-unstable changeset: 3cee41690fa2
xen-unstable date: Fri Aug 13 14:58:06 2010 +0100
Keir Fraser [Sun, 15 Aug 2010 20:48:06 +0000 (21:48 +0100)]
blktap2: make protocol specific usage of shared sring explicit
I don't think protocol specific data really belongs in this header
but since it is already there and we seem to be stuck with it let's at
least make the users explicit lest people get caught out by future new
fields moving the pad field around.
This is the Xen portion of this change. The kernel portion will be
sent separately. There is no dependency between the two.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Dongxiao Xu <dongxiao.xu@intel.com>
xen-unstable changeset: feee0abed6aa
xen-unstable date: Fri Jul 02 18:58:02 2010 +0100
Keir Fraser [Fri, 13 Aug 2010 14:06:24 +0000 (15:06 +0100)]
Fix IOAPIC S3 with interrupt remapping enabled
ioapic_suspend reads and saves the ioapic RTEs. But when interrupt
remapping is enabled, io_apic_read calls io_apic_read_remap_rte to
convert the remapped-format interrupt to compatible format, and the
'dest' field may be changed in remap_entry_to_ioapic_rte. On
ioapic_resume, the saved RTEs are then written with an incorrect
'dest' to the interrupt remapping table.
There is actually no need to convert the RTEs, whether or not
interrupt remapping is enabled; the RTE values only need to be saved
and restored directly. This patch uses __io_apic_read and
__io_apic_write, which do not call the interrupt remapping conversion
functions, to save and restore RTEs in ioapic_suspend and
ioapic_resume, which fixes this issue.
Signed-off-by: Weidong Han <weidong.han@intel.com>
xen-unstable changeset: 01d185dab39e
xen-unstable date: Fri Aug 13 14:57:35 2010 +0100
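The difference between raw and translating accessors described in the IOAPIC S3 fix above can be illustrated with a small model. This is only a sketch under stated assumptions: fake_ioapic, raw_read, raw_write, translating_read and the position of the 'dest' field in the top byte are all hypothetical stand-ins, not the Xen implementation.

```c
#include <stdint.h>

#define NR_RTE 4                      /* hypothetical number of entries */

static uint64_t fake_ioapic[NR_RTE];  /* stands in for the RTE registers */
static uint64_t saved[NR_RTE];        /* suspend-time copy of the RTEs */

/* Raw accessors: no translation, similar in spirit to __io_apic_read/write. */
static uint64_t raw_read(unsigned int pin)
{
    return fake_ioapic[pin];
}

static void raw_write(unsigned int pin, uint64_t val)
{
    fake_ioapic[pin] = val;
}

/* Models a read that rewrites the 'dest' field (top byte in this sketch). */
static uint64_t translating_read(unsigned int pin)
{
    uint64_t val = fake_ioapic[pin];

    return (val & 0x00ffffffffffffffULL) | (0x01ULL << 56);
}

/* Save every entry exactly as stored. */
static void suspend_raw(void)
{
    for (unsigned int i = 0; i < NR_RTE; i++)
        saved[i] = raw_read(i);
}

/* Write every saved entry back unchanged. */
static void resume_raw(void)
{
    for (unsigned int i = 0; i < NR_RTE; i++)
        raw_write(i, saved[i]);
}
```

Saving through translating_read instead would store a value whose 'dest' differs from what the register held, which is the kind of mismatch the message describes.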
Keir Fraser [Fri, 13 Aug 2010 08:05:07 +0000 (09:05 +0100)]
PoD: Fix domain build populate-on-demand cache allocation
Rather than trying to count the number of PoD entries we're
putting in, we simply pass the target # of pages - the vga hole, and
let the hypervisor do the calculation.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
xen-unstable changeset: 6f059a340cdf
xen-unstable date: Wed Aug 11 15:56:21 2010 +0100
Keir Fraser [Fri, 13 Aug 2010 07:52:56 +0000 (08:52 +0100)]
msi: Avoid uninitialized msi descriptors
When __pci_enable_msix() returns early, the output parameter (struct
msi_desc **desc) is not initialized. On my machine, a Broadcom
BCM5709 NIC has both MSI and MSI-X capability blocks. When a guest
tries to enable MSI-X interrupts but __pci_enable_msix() returns early
because it encounters an MSI block, the whole system crashes
immediately with a fatal page fault.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
xen-unstable changeset: 786b163da49b
xen-unstable date: Wed Aug 11 17:01:02 2010 +0100
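The hazard described in the msi fix above is the general one of an output parameter left untouched on an early return. One common way to guard against it is to clear the parameter on entry. The sketch below assumes that approach; it is not taken from the changeset, and msi_desc_sketch, the_desc and enable_msix_sketch are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct msi_desc. */
struct msi_desc_sketch {
    int vector;
};

static struct msi_desc_sketch the_desc = { 42 };

/*
 * Returns 0 on success. The output parameter is cleared on entry, so every
 * early-return path hands the caller a defined value (NULL) rather than
 * whatever happened to be in the caller's variable.
 */
static int enable_msix_sketch(int msi_already_in_use,
                              struct msi_desc_sketch **desc)
{
    *desc = NULL;

    if (msi_already_in_use)
        return 0;               /* early return: *desc is NULL, not garbage */

    *desc = &the_desc;
    return 0;
}
```

A caller can then test the returned pointer for NULL before dereferencing it, whichever path was taken.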
Keir Fraser [Fri, 13 Aug 2010 07:52:08 +0000 (08:52 +0100)]
xc: fix segfault in pv domain create if kernel is an invalid image
If libelf calls elf_err() or elf_msg() before elf_set_log() has been
called then it could potentially read an uninitialised log handling
callback function pointer from struct elf_binary. Fix this in libxc by
zeroing the structure before calling elf_init().
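A minimal sketch of the pattern described above, assuming a simplified structure: image_sketch, report_error, image_init and load_image are hypothetical stand-ins for struct elf_binary and the libelf/libxc functions, whose real definitions are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

typedef void (*log_fn)(const char *msg);

/* Hypothetical stand-in for struct elf_binary. */
struct image_sketch {
    const char *data;
    size_t      size;
    log_fn      log;    /* optional callback; NULL means "no logging" */
};

static int log_calls;

static void count_log(const char *msg)
{
    (void)msg;
    log_calls++;
}

/* Reports an error only if a callback has been installed. */
static void report_error(struct image_sketch *img, const char *msg)
{
    if (img->log != NULL)
        img->log(msg);
}

/* Init leaves 'log' alone and may report an error before a logger is set. */
static int image_init(struct image_sketch *img, const char *data, size_t size)
{
    img->data = data;
    img->size = size;
    if (size == 0) {
        report_error(img, "invalid image");
        return -1;
    }
    return 0;
}

/* Caller-side fix: zero the structure first so 'log' starts out NULL
 * (all-bits-zero is a null pointer on the platforms this sketch targets). */
static int load_image(struct image_sketch *img, const char *data, size_t size)
{
    memset(img, 0, sizeof(*img));
    return image_init(img, data, size);
}
```

Without the memset, report_error would test and possibly call whatever value was left in the uninitialised 'log' member.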
Keir Fraser [Mon, 9 Aug 2010 15:51:30 +0000 (16:51 +0100)]
vt-d: Fix ioapic_rte_to_remap_entry error path.
When ioapic_rte_to_remap_entry fails, it currently just writes the
value to the ioapic. But the 'mask' bit may be changed if it writes to
the upper half of the RTE. This patch ensures that the original value
of the 'mask' bit is restored in this case.
Signed-off-by: Weidong Han <weidong.han@intel.com>
xen-unstable changeset: 21934:befd1814c0a2
xen-unstable date: Mon Aug 09 16:33:45 2010 +0100
Keir Fraser [Mon, 9 Aug 2010 15:51:03 +0000 (16:51 +0100)]
vt-d: Fix ioapic write order in io_apic_write_remap_rte
At the end of io_apic_write_remap_rte, the new entry (remapped
interrupt) is written to the ioapic. But the low 32 bits are written
before the high 32 bits, so if the 'mask' bit in the low 32 bits is
clear, the interrupt is unmasked before the high 32 bits have been
written, which may cause problems. This patch fixes this by writing
the high 32 bits before the low 32 bits.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
xen-unstable changeset: 21933:add40eb47868
xen-unstable date: Mon Aug 09 16:32:45 2010 +0100
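The ordering issue in the io_apic_write_remap_rte fix above can be modelled with a two-word entry. This is an illustrative sketch only: rte_sketch, write_lo, write_hi and the unmasked_with_stale_hi flag are hypothetical, and the model assumes the mask bit is bit 16 of the low word.

```c
#include <stdint.h>

#define RTE_MASK_BIT (1u << 16)   /* mask bit sits in the low 32 bits */

struct rte_sketch {
    uint32_t lo;
    uint32_t hi;
};

/* Set if the entry is ever unmasked while the high half is still stale. */
static int unmasked_with_stale_hi;

static void write_lo(struct rte_sketch *r, uint32_t lo, uint32_t intended_hi)
{
    r->lo = lo;
    if (!(lo & RTE_MASK_BIT) && r->hi != intended_hi)
        unmasked_with_stale_hi = 1;
}

static void write_hi(struct rte_sketch *r, uint32_t hi)
{
    r->hi = hi;
}

/* Order after the fix: high half first, then the half with the mask bit. */
static void write_rte_hi_first(struct rte_sketch *r, uint32_t lo, uint32_t hi)
{
    write_hi(r, hi);
    write_lo(r, lo, hi);
}

/* Order before the fix, kept only for comparison. */
static void write_rte_lo_first(struct rte_sketch *r, uint32_t lo, uint32_t hi)
{
    write_lo(r, lo, hi);
    write_hi(r, hi);
}
```

With the high half written first, the entry only becomes unmasked once both halves hold their new values.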
Keir Fraser [Mon, 2 Aug 2010 16:17:55 +0000 (17:17 +0100)]
xenpaging: Add a check to Xen for EPT.
There doesn't seem to be a way to directly check for EPT, so instead
check for HAP and an Intel processor. If EPT isn't enabled, then
return an error to the tool.
Keir Fraser [Mon, 2 Aug 2010 16:11:33 +0000 (17:11 +0100)]
Walking the page lists needs the page_alloc lock
There are a few places in Xen where we walk a domain's page lists
without holding the page_alloc lock. They race with updates to the
page lists, which are normally rare but can be quite common under PoD
when the domain is close to its memory limit and the PoD reclaimer is
busy. This patch protects those places by taking the page_alloc lock.
I think this is OK for the two debug-key printouts - they don't run
from irq context and look deadlock-free. The tboot change seems safe
too unless tboot shutdown functions are called from irq context or
with the page_alloc lock held. The p2m one is the scariest but there
are already code paths in PoD that take the page_alloc lock with the
p2m lock held so it's no worse than existing code.
iommu: New options iommu=dom-strict and iommu=dom0-passthrough
The former strips dom0 of its usual 1:1 mapping of all memory, and
only provides it with mappings of its own memory, like any other
domain. The latter is a new consistent name for iommu=passthrough.
xen: Send the debug VIRQ to guests after the rest of the domain dump is done.
Send the debug VIRQ to guests after the rest of the domain dump is
done. This stops all the 'q' debug-key output getting interleaved with
the debug-virq output from a pv-ops dom0 kernel.
xm: Do not check path of kernel if bootloader is specified
When creating a DomU, if a bootloader is specified, 'kernel/ramdisk'
will be used by the bootloader when it boots the DomU, so there is no
need to check whether the path exists.
Newer versions of gdb (version 7*) seem to have a bug where they do
not parse the thread list from gdbsx properly. Getting rid of the
space in the thread list works around it. This is also fine with
older gdb.
tools/hotplug: locking.sh script: fix lock directory remains on error bug
_release_lock should be used instead of release_lock.
sigerr is introduced so that it can be redefined by
xen-hotplug-common.sh to a version which writes error status to
xenstore.
This matches similar checks done in Linux, since no good can come from
a domain trying to enable both MSI and MSI-X on the same device at the
same time.
This patch masks the PIC and IOAPIC RTEs before enabling x2APIC, and
unmasks and restores them afterwards. It also actually enables
interrupt remapping before enabling x2APIC, instead of just checking
the interrupt remapping setting. This patch also handles all x2APIC
configurations, including BIOS settings and command line settings. In
particular, it handles the case where the BIOS hands over in x2APIC
mode (when there is an apic id > 255). It checks whether x2APIC has
already been enabled by the BIOS. If so, it disables interrupt
remapping and queued invalidation first, then enables them again.
x2APIC/VT-d: improve interrupt remapping and queued invalidation enabling and disabling
x2APIC depends on interrupt remapping, so interrupt remapping needs to
be enabled before x2APIC. Usually x2APIC is not yet enabled
(x2apic_enabled=0) when interrupt remapping is enabled, although
x2APIC will be enabled later. So the interrupt mode needs to be passed
to intremap_enable as a parameter, instead of being derived from
x2apic_enabled. This patch adds a parameter "eim" to intremap_enable
to achieve this. Interrupt remapping and queued invalidation are
already enabled by the time x2apic is enabled, so they need not be
enabled again when setting up the iommu. This patch checks whether
interrupt remapping and queued invalidation are already enabled, and
does not enable them again if so. It does the same for disabling, that
is, it does not disable them if they are already disabled.
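The two ideas above, taking the interrupt mode as a parameter and making enable/disable idempotent, can be sketched as follows. The names iommu_sketch, intremap_enable_sketch and intremap_disable_sketch are hypothetical, and the hw_writes counter merely stands in for programming the hardware.

```c
#include <stdbool.h>

/* Hypothetical per-IOMMU state; not the Xen structure. */
struct iommu_sketch {
    bool intremap_on;   /* interrupt remapping currently enabled */
    bool eim;           /* extended interrupt mode selected at enable time */
    int  hw_writes;     /* counts simulated hardware programming steps */
};

/* The mode comes from the caller, not from a global x2APIC flag that may
 * not have been set yet when this runs. */
static int intremap_enable_sketch(struct iommu_sketch *iommu, bool eim)
{
    if (iommu->intremap_on)
        return 0;               /* already enabled: nothing to do */

    iommu->eim = eim;
    iommu->intremap_on = true;
    iommu->hw_writes++;
    return 0;
}

static void intremap_disable_sketch(struct iommu_sketch *iommu)
{
    if (!iommu->intremap_on)
        return;                 /* already disabled: nothing to do */

    iommu->intremap_on = false;
    iommu->hw_writes++;
}
```

Calling the enable function a second time during iommu setup is then harmless, because the early check returns before touching the hardware again.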
A drhd is created when the ACPI DMAR table is parsed, but drhd->iommu
is not allocated until iommu setup. However, the iommu is needed by
x2APIC, which enables interrupt remapping before iommu setup. This
patch allocates the iommu when the drhd is created. drhd->ecap can
then be removed because it is the same as iommu->ecap.
Currently "make stubdom" on its own fails because it depends on files
being installed by the results of "make tools". This also means that
in some circumstances a parallel "make tools stubdom" (or "make all")
can fail due to races. So make "make stubdom" depend on "make tools"
having completed first.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen-unstable changeset: 21760:84719437205c
xen-unstable date: Fri Jul 09 12:22:52 2010 +0100
The hardware CPUID-levelling features level the feature flags but
don't change the CPU family/model/stepping. Relax the HVM restore
check on family/model/stepping to printk but not veto the load, so
that VMs can be migrated between machines that have been
CPUID-levelled.
xen: allow HVM save/restore from different changesets
Allow HVM save/restore from different changesets of Xen. The HVM save
records are supposed to be backwards compatible; XenServer
live-migrates between versions of Xen during upgrades.
xen: make the shadow allocation hypercalls include the p2m memory
in the total shadow allocation. This makes the effect of allocation
changes consistent regardless of p2m activity on boot.
Otherwise vcpu_periodic_timer_work() can think the next timer is in
the future (and re-issue it unchanged) while timer_softirq_action()
thinks it's in the past (and fires it immediately), leading to
livelock.
tools/libxl: allow setting of timer_mode, hpet and vpt_align parameters
Implement parsing for timer_mode, hpet and vpt_align parameters.
These are all HVM-only parameters, and hpet/vpt_align are boolean, so
change types and place them in the hvm union accordingly. Also, HPET
is x86-only in principle, so make this compile-time conditional on
arch, as is viridian.
This patch enables the AMD OSVW (OS Visible Workaround) feature for
Xen. New AMD errata will have an OSVW id assigned in the future. The
OS is supposed to check the OSVW status MSR to find out whether the
CPU has a specific erratum. Legacy errata are also supported in this
patch: the traditional family/model/stepping approach is used if the
OSVW feature isn't applicable. This patch is adapted from Hans
Rosenfeld's patch submitted to the Linux kernel.
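A simplified sketch of the decision just described: consult the OSVW status bit when the erratum id is covered, otherwise fall back to a family/model match. The structures, the single 64-bit status word and the omission of stepping are assumptions made for illustration; real code would read the MSRs and CPUID rather than take a snapshot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the OSVW state. */
struct osvw_state {
    bool     supported;   /* CPU reports the OSVW feature */
    uint64_t id_length;   /* number of valid status bits */
    uint64_t status;      /* bit N set => erratum with OSVW id N is present */
};

/* Hypothetical legacy description of affected parts (stepping omitted). */
struct legacy_range {
    unsigned int family;
    unsigned int model_min;
    unsigned int model_max;
};

static bool cpu_has_erratum(const struct osvw_state *osvw,
                            unsigned int osvw_id,
                            const struct legacy_range *legacy,
                            unsigned int family, unsigned int model)
{
    /* Preferred path: the status bit decides when the id is covered. */
    if (osvw->supported && osvw_id < 64 && osvw_id < osvw->id_length)
        return (osvw->status >> osvw_id) & 1;

    /* Fallback: traditional family/model match. */
    return family == legacy->family &&
           model >= legacy->model_min && model <= legacy->model_max;
}
```

The fallback branch is also what runs for an id beyond the reported length, so an erratum the status bits do not cover is still matched the traditional way.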
After getting a report of 3.2.3's xenmon crashing Xen (as it turned
out this was because c/s 17000 was backported to that tree without
also applying c/s 17515), I figured that the hypervisor shouldn't rely
on any specific state of the actual trace buffer (as it is shared
writable with Dom0)
[GWD: Volatile quantifiers have been taken out and moved to another
patch]
To make clear what purpose specific variables have and/or where they
got loaded from, the patch also changes the type of some of them to be
explicitly u32/s32, and removes pointless assertions (like checking an
unsigned variable to be >= 0).
I also took the prototype adjustment of __trace_var() as an
opportunity to simplify the TRACE_xD() macros. Similar simplification
could be done on the (quite numerous) direct callers of the function.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
xen-unstable changeset: 21706:ae68758f8862
xen-unstable date: Fri Jul 02 18:56:34 2010 +0100
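The principle in the trace buffer message above, not relying on the state of a buffer that another party can write, can be sketched as: snapshot each shared index once, validate it, and only then compute with it. The layout and the names shared_buf and bytes_used are hypothetical and much simpler than the real trace buffer.

```c
#include <stdint.h>

#define DATA_SIZE 64u   /* hypothetical size of the record area */

/* Hypothetical shared layout; the peer may write any value at any time. */
struct shared_buf {
    volatile uint32_t prod;   /* producer index */
    volatile uint32_t cons;   /* consumer index */
    uint8_t data[DATA_SIZE];
};

/* Returns the number of bytes in use, or -1 if an index is out of range. */
static int32_t bytes_used(const struct shared_buf *buf)
{
    /* Read each index exactly once so the check and the arithmetic below
     * operate on the same values. */
    uint32_t prod = buf->prod;
    uint32_t cons = buf->cons;

    if (prod >= DATA_SIZE || cons >= DATA_SIZE)
        return -1;

    return (int32_t)((prod + DATA_SIZE - cons) % DATA_SIZE);
}
```

Using explicit u32/s32-style types, as the message suggests, also makes it obvious which comparisons are meaningful: the indices are unsigned, so only the upper bound needs checking.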
This patch implements HVMOP_pagetable_dying: a hypercall for
guests to notify Xen that a pagetable is about to be destroyed so that
Xen can use it as a hint to unshadow the pagetable soon and unhook the
top-level user-mode shadow entries right away.
Gianluca Guida is the original author of this patch.
"I am removing the tsc_scaled variable that is never actually used
because when tsc needs to be scaled vtsc is 1. I am also making this
more explicit in tsc_set_info. I am also removing hvm_domain.gtsc_khz
that is a duplicate of d->arch.tsc_khz. I am using scale_delta(delta,
&d->arch.ns_to_vtsc) to scale the tsc value before returning it to the
guest like in the pv case. I added a feature flag to specify that the
pvclock algorithm is safe to be used in an HVM guest so that the guest
can now use it without hanging."
Version 2 fixes a bug which breaks PV domU time.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
xen-unstable changeset: 21445:c1ed00d49534
xen-unstable date: Sat May 22 06:31:47 2010 +0100
Keir Fraser [Wed, 30 Jun 2010 17:23:19 +0000 (18:23 +0100)]
Use fixed-width types in the memory event interface
Set the types in the public memory_event header file to use
fixed-sized and self-aligned fields rather than "unsigned long". AIUI
this feature only works with 64-bit hypervisors but I think this
change will be necessary to use 32-on-64 dom0 tools.
This breaks compatibility with older builds of the tools, but I can't
see any way to avoid it short of __attribute__((__packed__)).
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
xen-unstable changeset: 21694:2a3a5979e3f1
xen-unstable date: Tue Jun 29 18:17:44 2010 +0100
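To illustrate why "unsigned long" is a problem in an interface shared between 32-bit tools and a 64-bit hypervisor, the sketch below contrasts an ABI-dependent layout with a fixed-width, self-aligned one. Both structures and their field names are hypothetical; they are not the contents of the memory_event header.

```c
#include <stddef.h>
#include <stdint.h>

/* ABI-dependent: 'unsigned long' is 4 bytes on ILP32 and 8 bytes on LP64,
 * so a 32-bit tool and a 64-bit hypervisor disagree about this layout. */
struct event_abi_dependent {
    unsigned long gfn;
    unsigned long flags;
};

/* Fixed-width and self-aligned: the 64-bit field comes first and the
 * 32-bit fields come in pairs, so no implicit padding is needed. */
struct event_fixed {
    uint64_t gfn;
    uint32_t flags;
    uint32_t vcpu_id;
};

/* Compile-time checks that the fixed layout is what both sides expect. */
_Static_assert(sizeof(struct event_fixed) == 16,
               "fixed-width layout should be 16 bytes");
_Static_assert(offsetof(struct event_fixed, flags) == 8,
               "flags should follow the 64-bit field directly");
```

Ordering fields so that each sits at a naturally aligned offset gives the same layout on both sides without resorting to a packed attribute.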