Ian Campbell [Wed, 17 Jul 2013 11:19:28 +0000 (12:19 +0100)]
xen: arm: remove unnecessary cache flush in write_pte
On an ARMv7/v8 SMP system the MMU is coherent
Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
[ ijc -- dropped the associated dsb too ]
Unlike bx, eret will not update the instruction set (THUMB, ARM) according to
the return address. This results in unpredictable behaviour of the
processor if the address doesn't match the right instruction set.
When the kernel is compiled with THUMB2, the THUMB bit needs to be set in CPSR
for the secondary cpus.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: Don't emulate the MMIO access if the instruction syndrome is invalid
When the instruction syndrome is not valid, the transfer register is unknown.
If this register is used in the emulation code (it's the case for the VGIC),
Xen can retrieve wrong data.
For safety, consider invalid instruction syndrome as wrong memory access.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: Initialize PERCPU variables at the beginning of start_xen
PERCPU variables rely on HTPIDR (TPIDR_EL2) which is in an unknown state when
a processor boots.
For the boot CPU, the first use of PERCPU is in setup_pagetables. So
initialize PERCPU and set the processor ID before.
Bamvor Jian Zhang observed this failure on the sun6i processor which does not
initialise HTPIDR and contributed a very similar patch.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
[ ijc -- added last para of commit message ]
John Liu [Mon, 22 Jul 2013 21:23:10 +0000 (22:23 +0100)]
oxenstored: Protect oxenstored from malicious domains.
Add check logic when reading from the IO ring; if an error happens,
mark the reading connection as "bad". Unless the VM reboots,
oxenstored will not handle messages from this connection any more.
xs_ring_stubs.c: add a more strict check on ring reading
connection.ml, domain.ml: add getter and setter for bad flag
process.ml: if exception raised when reading from domain's ring,
mark this domain as "bad"
xenstored.ml: if a domain is marked as "bad", do not handle it.
Signed-off-by: John Liu <john.liuqiming@huawei.com> Acked-by: David Scott <dave.scott@eu.citrix.com>
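The kind of ring check described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names sketch_conn, sketch_conn_poll and SKETCH_RING_SIZE are hypothetical and are not taken from xs_ring_stubs.c or the OCaml sources. It shows the idea that producer/consumer indices supplied by the guest must be validated before use, and that a connection which fails validation stays "bad" afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring size for this sketch; not taken from the Xen headers. */
#define SKETCH_RING_SIZE 1024u

/* Hypothetical per-connection state: only the "bad" flag matters here. */
struct sketch_conn {
    bool bad;
};

/*
 * Validate guest-controlled ring indices and report how many bytes are
 * available. Returns -1 if the connection is (or becomes) bad.
 * Unsigned subtraction keeps the check correct across index wrap-around.
 */
static int sketch_conn_poll(struct sketch_conn *c, uint32_t cons, uint32_t prod)
{
    uint32_t avail = (uint32_t)(prod - cons);

    if (c->bad)
        return -1;              /* never handle a bad connection again */

    if (avail > SKETCH_RING_SIZE) {
        c->bad = true;          /* indices cannot describe a valid ring */
        return -1;
    }

    return (int)avail;
}
```

Once sketch_conn_poll() has returned -1 for inconsistent indices, every later call on the same connection also returns -1, mirroring the "do not handle it" behaviour described for xenstored.ml.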
During the Xen 4.3 release we discussed that this feature could be
turned on by default - as it benefits all of the guests - not just
tmem related.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Sun, 21 Jul 2013 05:24:30 +0000 (06:24 +0100)]
xen: x86: put back .gz suffix on installed hypervisor binary.
This reverts the effect of 524b93def23b "xen: x86: drop the ".gz" suffix when
installing" which broke things in osstest (Debian Squeeze update-grub
apparently can't cope). It is not a direct revert because of other changes made
since. We continue to omit the suffix on ARM.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 26 Apr 2013 10:58:47 +0000 (11:58 +0100)]
xen: arm: drop LDFLAGS_DIRECT emulation specification.
The current -maarch64elf fails when cross-building arm64 on Ubuntu Raring due
to a missing file "ldscripts/aarch64elf.xr". This is undoubtedly an Ubuntu gcc
bug, however when investigating I found that this option was not necessary at
all since we provide an explicit linker script when linking the hypervisor
(AFAICT all -m<foo> does is override the default linker script).
LDFLAGS_DIRECT is also used when linking the intermediate built-in.o files but
-m<emulation> is not needed for this since it isn't linking the final image and
we are calling the linker with the correct, cross if necessary, name.
However it does appear to be potentially useful to supply -EL in both cases to
ensure that we get little endian images. (I just happened to spot that Linux
does this, for both arm and arm64, although I expect we are unlikely to trip
over such toolchains these days).
Tested with cross-builds of arm32 and arm64 as well as a native arm32 build.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Thu, 25 Apr 2013 14:45:50 +0000 (15:45 +0100)]
xen: arm: enable aborts on all physical processors.
I'm not sure how this ended up in construct dom0 where it only affects the
boot cpu and doesn't logically fit.
Enable aborts at the same time as we enable interrupts.
I'm not sure what the behaviour of an "abort worthy" operation while aborts
are disabled is, but it must surely be worse than calling do_unexpected_trap,
which is what happens from now on.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 17 Jul 2013 11:18:51 +0000 (12:18 +0100)]
xen: arm: clear the exclusive monitor on exception return
Otherwise context switching between two vcpus which are contending the same
lock can result in a spurious success.
Our spinlock and atomics code (which we get from Linux) rely on this behaviour
because they use non-exclusive stores for single instruction operations (e.g.
spin_unlock or atomic_set).
This is not required on ARMv8 since eret implicitly clears the monitor.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Fri, 12 Jul 2013 11:54:42 +0000 (12:54 +0100)]
xen: arm: make zImage the default target which we install
The zImage compatible binary is the useful one on real hardware. The relocated
ELF thing is only really useful when booting directly on Fast Models. The
customary suffix for that case is .axf so provide that as a target.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Thu, 18 Jul 2013 08:41:43 +0000 (09:41 +0100)]
xen: allow architecture to choose how/whether to compress installed xen binary
This is a follow up to "xen: arm: make zImage the default target which we
install".
On ARM the xen.gz binary installed into /boot is not immediately useful because
bootloaders (e.g. u-boot) do not unconditionally support decompression (except
via the uImage wrapper, which we currently do not support via our build system)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Jan Beulich <jbeulich@suse.com>
Ian Campbell [Thu, 18 Jul 2013 08:41:41 +0000 (09:41 +0100)]
xen: x86: drop the ".gz" suffix when installing
As Jan says it is pretty meaningless under /boot anyway. However I am slightly
concerned about breaking bootloaders (or more specifically their help scripts
which automatically generate config files). By inspection at least grub 2's
update-grub script (as present in Debian Wheezy) seems to cope (it matches on
xen* not xen*.gz)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Jan Beulich <jbeulich@suse.com>
Eric Trudeau [Fri, 12 Jul 2013 17:30:48 +0000 (13:30 -0400)]
xen/arm: Clear the IRQ_GUEST bit in desc->status when releasing an IRQ
While adding support for guest domU IRQs, I noticed that release_irq did
not clear the IRQ_GUEST bit in the IRQ's desc->status field.
This is probably not a big deal since not many situations are likely to arise
where an IRQ is sometimes host and sometimes guest.
Signed-off-by: Eric Trudeau <etrudeau@broadcom.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Jan Beulich [Thu, 18 Jul 2013 11:32:12 +0000 (13:32 +0200)]
VT-d: enable for multi-vector MSI
The main change being to make alloc_remap_entry() capable of allocating
a block of entries.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Jan Beulich [Thu, 18 Jul 2013 08:05:14 +0000 (10:05 +0200)]
x86: fix cache flushing condition in map_pages_to_xen()
This fixes yet another shortcoming of the function (exposed by 8bfaa2c2
["x86: add locking to map_pages_to_xen()"]'s adjustment to
msix_put_fixmap()): It must not flush caches when transitioning to a
non-present mapping. Doing so causes the CLFLUSH to fault, if used in
favor of WBINVD.
To help code readability, factor out the whole flush flags updating
in map_pages_to_xen() into a helper macro.
Andrew Cooper [Thu, 18 Jul 2013 07:16:15 +0000 (09:16 +0200)]
x86/time: Update wallclock in shared info when altering domain time offset
domain_set_time_offset() updates d->time_offset_seconds, but does not correct
the wallclock in the shared info, meaning that it is incorrect until the next
XENPF_settime hypercall from dom0 which resynchronises the wallclock for all
domains.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
George Dunlap [Fri, 5 Jul 2013 11:13:54 +0000 (12:13 +0100)]
libxl: Allow network driver domains when run_hotplug_scripts is set
As of commit 05bfd984dfe7014f1f5ea1133608b9bab589c120, hotplug scripts
are not run if backend_domid != LIBXL_TOOLSTACK_DOMID; so there is no reason
to restrict this for network driver domains any more.
Ian Campbell [Mon, 15 Jul 2013 08:24:05 +0000 (09:24 +0100)]
xen: arm: correctly configure NSACR.
Previously we were setting it up twice, the second time neglecting to set the
NS_SMP bit.
NSACR.NS_SMP is a processor specific bit which on Cortex-A7 and -A15 regulates
access to the (also processor specific) ACTLR.SMP bit. Not setting NSACR.NS_SMP
meant that Xen's attempts to set ACTLR.SMP were silently ignored. Setting this
bit is required in order to cause the processor to take part in cache and TLB
coherency protocols. Failure to set this bit leads to random memory corruption
in guests (although nothing like as catastrophic as you might expect!).
An alternative fix would have been to set ACTLR.SMP when in Secure World,
however Linux expects to set ACTLR.SMP itself in NS mode, so it's a good bet
that bootloaders will set NSACR.NS_SMP instead.
While here switch to a read-modify-write of NSACR to preserve any existing
bits -- seems safer.
Ian Murray [Wed, 3 Jul 2013 23:58:27 +0000 (00:58 +0100)]
xl: support for leaving domain paused after save
New feature to allow xl save to leave a domain paused after its
memory has been saved. This is to allow disk snapshots of domU
to be taken that exactly correspond to the memory state at save time.
Once the snapshot(s) have been taken or whatever, the domain can be
unpaused in the usual manner.
Usage:
xl save -p <domid> <filespec>
Signed-off-by: Ian Murray <murrayie@yahoo.co.uk> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Introduce Cortex-A7 support with a scalable proc_info_list which includes the cpu id
and a cpu initialize function.
In head.S, search for the cpu specific MIDR in procinfo and call the matching initialize
function. Currently Cortex-A7 and Cortex-A15 are supported.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Wed, 17 Jul 2013 08:21:33 +0000 (10:21 +0200)]
x86: don't use destroy_xen_mappings() for vunmap()
Its attempt to tear down intermediate page table levels may race with
map_pages_to_xen() establishing them, and now that
map_domain_page_global() is backed by vmap() this teardown is also
wasteful (as it's very likely to need the same address space populated
again within foreseeable time).
As the race between vmap() and vunmap(), according to the latest stage
tester logs, doesn't appear to be the only one still left, the patch
also adds logging for vmap() and vunmap() uses (there shouldn't be too
many of them, so logs shouldn't get flooded). These are supposed to
get removed (and are made stand out clearly) as soon as we're certain
that there's no issue left.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 17 Jul 2013 06:48:24 +0000 (08:48 +0200)]
VMX: suppress pointless indirect calls
Get the other virtual interrupt delivery related actors in sync
with the newly added handle_eoi() one: Clear the respective pointers
(thus avoiding the call from generic code) when the feature is
unavailable instead of checking feature availability in the actors.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Jan Beulich [Wed, 17 Jul 2013 06:47:18 +0000 (08:47 +0200)]
VMX: fix interaction of APIC-V and Viridian emulation
Viridian using a synthetic MSR for issuing EOI notifications bypasses
the normal in-processor handling, which would clear
GUEST_INTR_STATUS.SVI. Hence we need to do this in software in order
for future interrupts to get delivered.
Based on analysis by Yang Z Zhang <yang.z.zhang@intel.com>.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Andrew Cooper [Wed, 17 Jul 2013 06:45:20 +0000 (08:45 +0200)]
x86/cpuidle: Change logging for unknown APIC IDs
Dom0 uses this hypercall to pass ACPI information to Xen. It is not very
uncommon for more cpus to be listed in the ACPI tables than are present on the
system, particularly on systems with a common BIOS for 2 and 4 socket server
variants.
As Dom0 does not control the number of entries in the ACPI tables, and is
required to pass everything it finds to Xen, change the logging.
There is now a single unconditional warning for the first unknown ID, and
further warnings if "cpuinfo" is requested by the user on the command line.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 16 Jul 2013 09:54:07 +0000 (11:54 +0200)]
AMD IOMMU: untie remap and vector maps
With the specific IRTEs used for an interrupt no longer depending on
the vector, there's no need to tie the remap sharing model to the
vector sharing one.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Jan Beulich [Tue, 16 Jul 2013 09:52:38 +0000 (11:52 +0200)]
AMD IOMMU: allocate IRTE entries instead of using a static mapping
For multi-vector MSI, where we surely don't want to allocate
contiguous vectors and be able to set affinities of the individual
vectors separately, we need to drop the use of the tuple of vector and
delivery mode to determine the IRTE to use, and instead allocate IRTEs
(which imo should have been done from the beginning).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Andrew Cooper [Tue, 16 Jul 2013 09:10:45 +0000 (11:10 +0200)]
x86: Special case __HYPERVISOR_iret rather more when writing hypercall pages
In all cases when a hypercall page is written, __HYPERVISOR_iret is first
written as a regular hypercall, then subsequently rewritten in its special
case.
For VMX and SVM, this means that following the ud2a instruction is 3 bytes of
an imm32 parameter. For a ring3 kernel, this means that following the syscall
instruction is the second half of 'pop %r11'.
For a ring1 kernel, the iret case ends up as the same number of bytes as the
rest of the hypercalls, but it is pointless writing it twice, and is changed
for consistency.
Therefore, skip the loop iteration which would write the incorrect
__HYPERVISOR_iret hypercall. This removes junk machine code from the tail and
makes disassemblers rather more happy when looking at the hypercall page.
Also, a miscellaneous whitespace fix in the comment for ring3 kernel.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 15 Jul 2013 12:21:45 +0000 (14:21 +0200)]
AMD IOMMU: use ioremap()
There's no point in using the fixmap here, and it gets
map_iommu_mmio_region() in line with unmap_iommu_mmio_region(), which
was already using iounmap() (thus crashing if actually used).
Jan Beulich [Mon, 15 Jul 2013 12:21:03 +0000 (14:21 +0200)]
VT-d: use ioremap()
There's no point in using the fixmap here, and it gets iommu_alloc()
in line with iommu_free(), which was already using iounmap() (thus
crashing if actually used).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 15 Jul 2013 12:17:56 +0000 (14:17 +0200)]
x86: add locking to map_pages_to_xen()
While boot time calls don't need this, run time uses of the function
which may result in L2 page tables getting populated need to be
serialized to avoid two CPUs populating the same L2 (or L3) entry,
overwriting each other's results.
This is expected to fix what would seem to be a regression from commit
b0581b92 ("x86: make map_domain_page_global() a simple wrapper around
vmap()"), albeit that change only made more readily visible the already
existing issue.
This patch intentionally does not
- add locking to the page table de-allocation logic in
destroy_xen_mappings() (the only user having potential races here,
msix_put_fixmap(), gets converted to use __set_fixmap() instead)
- avoid races between super page splitting and reconstruction in
map_pages_to_xen() (no such uses exist; races between multiple
splitting attempts or between multiple reconstruction attempts are
being taken care of)
If we wanted to take care of these, we'd need to alter the behavior
of virt_to_xen_l?e() - they would need to return with the lock held
then.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Wed, 10 Jul 2013 10:54:00 +0000 (12:54 +0200)]
arm: correct vfp save/restore asm constraints
Some versions of gcc complain:
> vfp.c: In function 'vfp_restore_state':
> vfp.c:45:27: error: memory input 0 is not directly addressable
> vfp.c:51:31: error: memory input 0 is not directly addressable
There is no way to express the constraint we want (which is the address of the
array, clobbering the whole array). Therefore we have to fake it up by using
two constraints.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Will.Deacon@arm.com Acked-by: Julien Grall <julien.grall@linaro.org>
Jan Beulich [Wed, 10 Jul 2013 08:03:40 +0000 (10:03 +0200)]
adjust x86 EFI build
While the rule to generate .init.o files from .o ones already correctly
included $(extra-y), the setting of the necessary compiler flag didn't
have the same. With some yet to be posted patch this resulted in build
breakage because of the compiler deciding not to inline a few functions
(which then results in .text not being empty as required for these
object files).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
libxl: do not call exit() in libxl_device_vtpm_list
Signal error with NULL return value, do not terminate the whole process.
Signed-off-by: Marek Marczykowski <marmarek@invisiblethingslab.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Thu, 4 Jul 2013 08:33:18 +0000 (10:33 +0200)]
x86/mm: Ensure useful progress in alloc_l2_table()
While debugging the issue which turned out to be XSA-58, a printk in this loop
showed that it was quite easy to never make useful progress, because of
consistently failing the preemption check.
One single l2 entry is a reasonable amount of work to do, even if an action is
pending, and also assures forwards progress across repeat continuations.
Tweak the continuation criteria to fail on the first iteration of the loop.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
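The continuation criterion described above can be sketched in C as follows. This is a minimal illustration, not the code from alloc_l2_table(): sketch_process_entries, preempt_check_fn and always_pending are hypothetical names, and the per-entry work is elided. The point is that the preemption check is skipped on the first iteration, so every (re)invocation handles at least one entry.

```c
#include <stdbool.h>

/* Hypothetical callback standing in for a "preemption pending?" check. */
typedef bool (*preempt_check_fn)(void);

/* Worst case for the sketch: an action is always pending. */
static bool always_pending(void)
{
    return true;
}

/*
 * Process entries [start, total). Returns the index to resume from;
 * a return value equal to total means the work is complete.
 */
static unsigned int sketch_process_entries(unsigned int start,
                                           unsigned int total,
                                           preempt_check_fn preempt_pending)
{
    unsigned int i;

    for (i = start; i < total; i++) {
        /* Never yield before doing one entry's worth of work. */
        if (i > start && preempt_pending())
            break;

        /* ... real code would validate entry i here ... */
    }

    return i;
}
```

Even with always_pending() as the check, repeated continuations advance by one entry each time, so the operation terminates after at most `total` invocations.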
Ian Campbell [Thu, 4 Jul 2013 08:32:44 +0000 (10:32 +0200)]
use SMP barrier in common code dealing with shared memory protocols
Xen currently makes no strong distinction between the SMP barriers (smp_mb
etc) and the regular barrier (mb etc). In Linux, where we inherited these
names from having imported Linux code which uses them, the SMP barriers are
intended to be sufficient for implementing shared-memory protocols between
processors in an SMP system while the standard barriers are useful for MMIO
etc.
On x86 with the stronger ordering model there is not much practical difference
here but ARM has weaker barriers available which are suitable for use as SMP
barriers.
Therefore ensure that common code uses the SMP barriers when that is all which
is required.
On both ARM and x86 both types of barrier are currently identical so there is
no actual change. A future patch will change smp_mb to a weaker barrier on
ARM.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
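To illustrate what "a shared-memory protocol between processors" needs from its barriers, here is a small single-producer/single-consumer ring sketch. It is an assumption-laden analogy, not Xen code: C11 fences stand in for smp_wmb()/smp_rmb(), and sketch_ring, sketch_ring_put and sketch_ring_get are hypothetical names. No device (MMIO) ordering is involved, which is why the weaker SMP-style barriers are all that is required.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u     /* must be a power of two */

struct sketch_ring {
    uint32_t slots[SKETCH_RING_SIZE];
    _Atomic uint32_t prod;      /* written only by the producer */
    _Atomic uint32_t cons;      /* written only by the consumer */
};

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int sketch_ring_put(struct sketch_ring *r, uint32_t v)
{
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);

    if (prod - cons == SKETCH_RING_SIZE)
        return 0;

    r->slots[prod % SKETCH_RING_SIZE] = v;
    /* Analogue of smp_wmb(): publish the payload before the index. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->prod, prod + 1, memory_order_relaxed);
    return 1;
}

/* Consumer side: returns 1 and fills *v on success, 0 if the ring is empty. */
static int sketch_ring_get(struct sketch_ring *r, uint32_t *v)
{
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);

    if (prod == cons)
        return 0;

    /* Analogue of smp_rmb(): read the index before the payload. */
    atomic_thread_fence(memory_order_acquire);
    *v = r->slots[cons % SKETCH_RING_SIZE];
    /* Release the slot only after the payload has been read. */
    atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
    return 1;
}
```

The sketch assumes exactly one producer and one consumer; multiple writers on either side would need additional synchronisation.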
Jan Beulich [Thu, 4 Jul 2013 08:27:39 +0000 (10:27 +0200)]
x86: make map_domain_page_global() a simple wrapper around vmap()
This is in order to reduce the number of fundamental mapping mechanisms
as well as to reduce the amount of code to be maintained. In the course
of this the virtual space available to vmap() is being grown from 16Gb
to 64Gb.
Note that this requires callers of unmap_domain_page_global() to no
longer pass misaligned pointers - map_domain_page_global() returns page
size aligned pointers, so unmapping should be done accordingly.
unmap_vcpu_info() violated this and is being adjusted here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Thu, 4 Jul 2013 08:26:24 +0000 (10:26 +0200)]
bitmap_*() should cope with zero size bitmaps
... to match expectations set by memset()/memcpy().
Similarly for find_{first,next}_{,zero_}_bit() on x86.
__bitmap_shift_{left,right}() would also need fixing (they more
generally can't cope with the shift count being larger than the bitmap
size, and they perform undefined operations by possibly shifting an
unsigned long value by BITS_PER_LONG bits), but since these functions
aren't really used anywhere I wonder if we wouldn't better simply get
rid of them.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
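As an illustration of what "coping with zero size bitmaps" can look like, here is a hedged sketch of an emptiness test. It is not the Xen implementation: sketch_bitmap_empty and SKETCH_BITS_PER_LONG are hypothetical names. With nbits == 0 the function touches no memory at all, and the partial-word mask never shifts by the full word width, which avoids the undefined shift mentioned above.

```c
#include <limits.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Return true if none of the first nbits bits of map are set.
 * A zero-sized bitmap is trivially empty and map is never dereferenced.
 */
static bool sketch_bitmap_empty(const unsigned long *map, unsigned int nbits)
{
    unsigned int full = nbits / SKETCH_BITS_PER_LONG;
    unsigned int rem = nbits % SKETCH_BITS_PER_LONG;
    unsigned int i;

    for (i = 0; i < full; i++)
        if (map[i] != 0)
            return false;

    /* rem < SKETCH_BITS_PER_LONG here, so the shift is well defined. */
    if (rem != 0)
        return (map[full] & ((1UL << rem) - 1UL)) == 0;

    return true;
}
```

Bits at or beyond nbits are ignored, so stale data in the tail of the last word does not affect the result.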
Ben Guthro [Thu, 4 Jul 2013 08:23:36 +0000 (10:23 +0200)]
x86: Restore reboot quirks by DMI, fix reboot on a number of systems
The following patch ports the functionality of the following changeset from
Linux (from 2008) to Xen:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=14d7ca5c
It implements an additional reboot quirk to do a PCI reset via port
CF9.
This also restores some code dropped in the x86_32 target removal
(changeset 5d1181a5ea5e0f11d481a94b16ed00d883f9726e) which sets some
quirks based on DMI matching.
This will add reboot quirks on the following systems that are known to
be necessary on Linux:
Dell E520
Dell PowerEdge 1300
Dell PowerEdge 300
Dell OptiPlex 745
Dell OptiPlex 745
Dell OptiPlex 745
Dell OptiPlex 330
Dell OptiPlex 360
Dell OptiPlex 760
Dell PowerEdge 2400
Dell Precision T5400
Dell Precision T7400
HP Compaq Laptop
Dell XPS710
Dell DXP061
Sony VGN-Z540N
ASUS P4S800
Acer Aspire One A110
Apple MacBook5
Apple MacBookPro5
Apple Macmini3,1
Apple iMac9,1
Dell Latitude E6320
Dell Latitude E5420
Dell Latitude E6220
Dell Latitude E6420
Dell OptiPlex 990
Dell OptiPlex 990
Dell Latitude E6520
Dell OptiPlex 790
Dell OptiPlex 990
Dell OptiPlex 390
Dell Latitude E6320
Dell Latitude E6420
Dell Latitude E6520
I clearly have not been able to test on all of these systems.
It does fix rebooting on the Dell 790, and should *not* change the
reboot paths of systems not on this DMI match list.
Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com>
Use driver_data, thus requiring only a single handler function.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ben Guthro <benjamin.guthro@citrix.com>
The IOMMU interrupt handling in bottom half must clear the PPR log interrupt
and event log interrupt bits to re-enable the interrupt. This is done by
writing 1 to the memory mapped register to clear the bit. Due to a hardware bug,
if the driver tries to clear this bit while the IOMMU hardware is also setting
this bit, the conflict will result in the bit remaining set. If the interrupt
handling code does not make sure to clear this bit, subsequent changes in the
event/PPR logs will no longer generate interrupts, which would result in
buffer overflow. After clearing the bits, the driver must read back
the register to verify.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Adjust to apply on top of heavily modified patch 1. Adjust flow to get away
with a single readl() in each instance of the status register checks.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
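The clear-then-verify step described above can be sketched as follows. This is a simulation under stated assumptions, not the driver code: sketch_iommu models a device whose write-1-to-clear can lose a race a given number of times, and the bounded retry loop is an illustrative choice rather than something the commit message specifies. All names are hypothetical.

```c
#include <stdint.h>

/* Simulated device: races_left models hardware re-setting the bit. */
struct sketch_iommu {
    uint32_t status;
    unsigned int races_left;
};

static uint32_t sketch_status_read(const struct sketch_iommu *d)
{
    return d->status;
}

/* Write-1-to-clear, except when the simulated race keeps the bit set. */
static void sketch_status_write(struct sketch_iommu *d, uint32_t val)
{
    d->status &= ~val;
    if (d->races_left > 0) {
        d->races_left--;
        d->status |= val;
    }
}

/*
 * Clear one status bit and read back to verify. Returns the number of
 * writes needed, or -1 if the bit is still set after max_tries writes.
 */
static int sketch_clear_status_bit(struct sketch_iommu *d, uint32_t bit,
                                   unsigned int max_tries)
{
    unsigned int tries;

    for (tries = 1; tries <= max_tries; tries++) {
        sketch_status_write(d, bit);
        if ((sketch_status_read(d) & bit) == 0)
            return (int)tries;
    }

    return -1;
}
```

Without the read-back, the first lost write would leave the bit set and no further interrupts would be raised for that log.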
iommu/amd: Fix logic for clearing the IOMMU interrupt bits
The IOMMU interrupt bits in the IOMMU status registers are
"read-only, and write-1-to-clear" (RW1C). Therefore, the existing
logic, which reads the register, sets the bit, and then writes back
the value, could accidentally clear other bits that happen to be set.
The correct logic is to write a value with only
the interrupt bits set, leaving the rest as zeros.
This patch also cleans up the #define masks as Jan has suggested.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
With iommu_interrupt_handler() properly having had its readl() switched
from status to control register, the subsequent writel() needed to be
switched too (and the RW1C comment there was bogus).
Some of the cleanup went too far - undone.
Further, with iommu_interrupt_handler() now actually disabling the
interrupt sources, they also need to get re-enabled by the tasklet once
it finished processing the respective log. This also implies re-running
the tasklet so that log entries added between reading the log and re-
enabling the interrupt will get handled in a timely manner.
Finally, guest write emulation to the status register needs to be done
with the RW1C (and RO for all other bits) semantics in mind too.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
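The RW1C (and read-only for all other bits) semantics mentioned for guest write emulation can be sketched as a pure function. This is a hedged illustration only: SKETCH_STATUS_RW1C_MASK and its bit positions are hypothetical and do not reflect the real AMD IOMMU register layout, and sketch_emulate_status_write is not a function from the Xen tree.

```c
#include <stdint.h>

/* Hypothetical mask: bits 1 and 2 are RW1C, every other bit is read-only. */
#define SKETCH_STATUS_RW1C_MASK UINT32_C(0x00000006)

/*
 * Compute the new emulated status value after a guest write.
 * A written 1 clears an RW1C bit; written 0s and writes to
 * read-only bits leave the current value untouched.
 */
static uint32_t sketch_emulate_status_write(uint32_t cur, uint32_t written)
{
    return cur & ~(written & SKETCH_STATUS_RW1C_MASK);
}
```

For example, with a current value of 0x17 a guest write of 0x2 yields 0x15, while a write of 0x10 (a read-only bit in this sketch) leaves the value at 0x17.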
Jan Beulich [Tue, 2 Jul 2013 06:48:03 +0000 (08:48 +0200)]
x86: don't pass negative time to gtime_to_gtsc() (try 2)
This mostly reverts commit eb60be3d ("x86: don't pass negative time to
gtime_to_gtsc()") and instead corrects __update_vcpu_system_time()'s
handling of this_cpu(cpu_time).stime_local_stamp dating back before the
start of a HVM guest (which would otherwise lead to a negative value
getting passed to gtime_to_gtsc(), causing scale_delta() to produce
meaningless output).
Flushing the value to zero was wrong, and printing a message for
something that can validly happen wasn't very useful either.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jacob Shin [Tue, 2 Jul 2013 06:47:00 +0000 (08:47 +0200)]
cpufreq, xenpm: fix cpufreq and xenpm mismatch
Currently cpufreq and xenpm are out of sync. Fix cpufreq's reporting of
whether turbo mode is enabled. Fix xenpm to decode the value as a boolean
rather than a tristate.
Jan Beulich [Tue, 2 Jul 2013 06:42:49 +0000 (08:42 +0200)]
x86/fxsave: bring in line with recent xsave adjustments
Defer the FIP/FDP pointer reset needed on AMD CPUs to the restore path,
and switch from using EMMS to FFREE here too (to be resistant against
eventual future CPUs without MMX support). Also switch from using an
almost typeless pointer in fpu_fxrstor() to a properly typed one, thus
telling the compiler the truth about which memory gets accessed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Tue, 2 Jul 2013 06:41:28 +0000 (08:41 +0200)]
x86/xsave: adjust state management
The initial state for a vCPU is using default values, so there's no
need to force the XRSTOR to read the state from memory. This saves a
couple of thousand restores from memory just during boot of Linux on
my Sandy Bridge system (I didn't try to make further measurements).
The above requires that arch_set_info_guest() updates the state flags
in the save area when valid floating point state got passed in, but
that would really have been needed even before in case XSAVE{,OPT}
decided to clear one or both of the FP and SSE bits.
Furthermore, hvm_vcpu_reset_state() shouldn't just clear out the FPU/
SSE area, but needs to re-initialize MXCSR and FCW.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Jackson [Mon, 1 Jul 2013 14:20:28 +0000 (15:20 +0100)]
libxl: suppress device assignment to HVM guest when there is no IOMMU
This in effect copies similar logic from xend: While there's no way to
check whether a device is assigned to a particular guest,
XEN_DOMCTL_test_assign_device at least allows checking whether an
IOMMU is there and whether a device has been assigned to _some_
guest.
For the time being, this should be enough to cover for the missing
error checking/recovery in other parts of libxl's device assignment
paths.
There remains a (functionality-, but not security-related) race in
that the iommu should be set up earlier, but this is too risky a
change for this stage of the 4.3 release.
This is a security issue, XSA-61.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Julien Grall [Thu, 27 Jun 2013 17:13:30 +0000 (18:13 +0100)]
xen/arm: Rework the way to compute dom0 DTB base address
If the DTB is loaded right after the kernel, on some setups Linux will
overwrite the DTB during the decompression step.
To be sure the DTB won't be overwritten by the decompression stage, load
the DTB near the end of the first memory bank and below 4GiB (if the memory range is
greater).
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 28 Jun 2013 11:25:57 +0000 (12:25 +0100)]
xen/arm: gic_shutdown_irq must only disable the right IRQ
When GICD_ICENABLERn is read, all the 1s bit represent enabled IRQs.
Currently gic_shutdown_irq:
- read GICD_ICENABLER
- set the corresponding bit to 1
- write back the new value
That means, Xen will disable more IRQs than necessary.
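The problem can be reproduced with a small simulation. This is a sketch under stated assumptions: sketch_gicd models only the behaviour described above (reads return the enabled bits, a written 1 disables that IRQ, a written 0 has no effect), and the function names are hypothetical rather than the ones used in Xen's GIC driver.

```c
#include <stdint.h>

/* Simulated distributor state for one 32-IRQ bank: bit n set = IRQ n enabled. */
struct sketch_gicd {
    uint32_t enabled;
};

/* Reading the clear-enable register returns the currently enabled IRQs. */
static uint32_t sketch_icenabler_read(const struct sketch_gicd *g)
{
    return g->enabled;
}

/* Writing 1 to a bit disables that IRQ; writing 0 leaves it alone. */
static void sketch_icenabler_write(struct sketch_gicd *g, uint32_t val)
{
    g->enabled &= ~val;
}

/* Read-modify-write as described above: disables every enabled IRQ. */
static void sketch_shutdown_irq_rmw(struct sketch_gicd *g, unsigned int irq)
{
    uint32_t val = sketch_icenabler_read(g);

    val |= UINT32_C(1) << (irq % 32);
    sketch_icenabler_write(g, val);
}

/* Writing only the bit for the IRQ in question disables just that IRQ. */
static void sketch_shutdown_irq_single(struct sketch_gicd *g, unsigned int irq)
{
    sketch_icenabler_write(g, UINT32_C(1) << (irq % 32));
}
```

Starting from four enabled IRQs (0xf), the read-modify-write variant leaves none enabled after shutting down IRQ 1, whereas the single-bit write leaves 0xd.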
Dongxiao Xu [Thu, 27 Jun 2013 15:01:26 +0000 (17:01 +0200)]
nested vmx: Fix the booting of L2 PAE guest
When doing virtual VM entry and virtual VM exit, we need to
synchronize the PAE PDPTR related VMCS registers. With this fix,
we can boot 32bit PAE L2 guest (Win7 & RHEL6.4) on "Xen on Xen"
environment.
Andrew Cooper [Thu, 27 Jun 2013 12:01:18 +0000 (14:01 +0200)]
AMD/intremap: Prevent use of per-device vector maps until irq logic is fixed
XSA-36 changed the default vector map mode from global to per-device. This is
because a global vector map does not prevent one PCI device from impersonating
another and launching a DoS on the system.
However, the per-device vector map logic is broken for devices with multiple
MSI-X vectors, which can either result in a failed ASSERT() or misprogramming
of a guest's interrupt remapping tables. The core problem is not trivial to
fix.
In an effort to get AMD systems back to a non-regressed state, introduce a new
type of vector map called per-device-global. This uses per-device vector maps
in the IOMMU, but uses a single used_vector map for the core IRQ logic.
This patch is intended to be removed as soon as the per-device logic is fixed
correctly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
This grub.cfg from a default Fedora 19 Beta install
caused pygrub failures. The previous pygrub commit
fixed that, so this example file is added for reference.
Signed-off-by: Marcel Mol <marcel@mesa.nl> Acked-by: Ian Campbell <ian.campbell@citrix.com>