Ian Campbell [Fri, 22 Feb 2013 08:57:48 +0000 (08:57 +0000)]
xen: arm: refactor co-pro and sysreg reg handling.
AArch64 has removed the concept of co-processors replacing them with a
combination of specific instructions (cache and tlb flushes etc) and
system registers (which are understood by name in the assembler).
However most system registers are equivalent to a particular AArch32
co-pro register and can be used by generic code in the same way. Note
that the names of the registers differ (often only slightly)
For consistency it would be better to use only set of names in the
common code. Therefore move the {READ,WRITE}_CP{32,64} accessors into
arm32/processor.h and provide {READ,WRITE}_SYSREG. Where the names
differ #defines will be provided on 32-bit.
HSR_CPREG and friends are required even on 64-bit in order to decode
traps from 32 bit guests.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Fri, 22 Feb 2013 08:57:45 +0000 (08:57 +0000)]
xen: arm64: basic config and types headers
The 64-bit bitops are taken from the Linux asm-generic implementations. They
should be replaced with optimised versions from the Linux arm64 port when they
become available.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 22 Feb 2013 10:56:54 +0000 (11:56 +0100)]
honor ACPI v4 FADT flags
- force use of physical APIC mode if indicated so (as we don't support
xAPIC cluster mode, the respective flag is taken to force physical
mode too)
- don't use MSI if indicated so (implies no IOMMU)
Both can be overridden on the command line, for the MSI case this at
once adds a new command line option allowing to turn off PCI MSI (IOMMU
and HPET are unaffected by this).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 22 Feb 2013 10:48:57 +0000 (11:48 +0100)]
ACPI: support v5 (reduced HW) sleep interface
Note that this also fixes a broken input check in acpi_enter_sleep()
(previously validating the sleep->pm1[ab]_cnt_val relationship based
on acpi_sinfo.pm1b_cnt_val, which however gets set only subsequently).
Also adjust a few minor issues with the pre-v5 handling in
acpi_fadt_parse_sleep_info().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Bob Moore [Fri, 22 Feb 2013 10:46:32 +0000 (11:46 +0100)]
ACPI 5.0: Implement hardware-reduced option
If HW-reduced flag is set in the FADT, do not attempt to access
or initialize any ACPI hardware, including SCI and global lock.
No FACS will be present.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Also adjust acpi_fadt_parse_sleep_info().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 22 Feb 2013 10:21:38 +0000 (11:21 +0100)]
x86/nhvm: properly clean up after failure to set up all vCPU-s
Otherwise we may leak memory when setting up nHVM fails half way.
This implies that the individual destroy functions will have to remain
capable (in the VMX case they first need to be made so, following
26486:7648ef657fe7 and 26489:83a3fa9c8434) of being called for a vCPU
that the corresponding init function was never run on.
Once at it, also remove a redundant check from the corresponding
parameter validation code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org> Tested-by: Olaf Hering <olaf@aepfle.de>
Frediano Ziglio [Thu, 14 Feb 2013 12:37:14 +0000 (12:37 +0000)]
gcov: Adding support for coverage information
This patch introduce coverage support to Xen.
Currently it allows to compile Xen with coverage support but there is no
way to extract them.
The declarations came from Linux source files (as you can see from file
headers).
The idea is to have some operations mainly
- get coverage information size
- read coverage information
- reset coverage counters
Linux use a file system to export these information. The information will
be a blob to handle with some tools (as usually tools require a bunch of
files but Xen does not handle files at all). I'll pack them to make things
simpler as possible.
These information cannot be put in a specific section (allowing a safe
mapping) as gcc use .rodata, .data, .text and .ctors sections.
I added code to handle constructors used in this case to initialize a
linked list of files.
I excluded %.init.o files as they are used before Xen start and should
not have section like .text or .data.
I used a "coverage" configuration option to mimic the "debug" one.
Tim Deegan [Thu, 21 Feb 2013 14:07:19 +0000 (14:07 +0000)]
x86/mm: Take the p2m lock even in shadow mode.
The reworking of p2m lookups to use get_gfn()/put_gfn() left the
shadow code not taking the p2m lock, even in cases where the p2m would
be updated (i.e. PoD).
In many cases, shadow code doesn't need the exclusion that
get_gfn()/put_gfn() provides, as it has its own interlocks against p2m
updates, but this is taking things too far, and can lead to crashes in
the PoD code.
Now that most shadow-code p2m lookups are done with explicitly
unlocked accessors, or with the get_page_from_gfn() accessor, which is
often lock-free, we can just turn this locking on.
The remaining locked lookups are in sh_page_fault() (in a path that's
almost always already serializing on the paging lock), and in
emulate_map_dest() (which can probably be updated to use
get_page_from_gfn()). They're not addressed here but may be in a
follow-up patch.
Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Ian Campbell [Thu, 21 Feb 2013 10:59:51 +0000 (10:59 +0000)]
libxl: fix build on 32-bit
aab4d1b266ce "libxl: Add qxl vga interface support for upstream qemu"
introduced:
libxl_dm.c: In function ‘libxl__build_device_model_args_new’:
libxl_dm.c:449: error: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
libxl_dm.c:451: error: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
on arm32 and x86_32.
Use the inttypes.h PRId64 macro.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Frediano Ziglio [Wed, 20 Feb 2013 16:59:43 +0000 (16:59 +0000)]
.gitignore: Do not ignore dsdl.asl file
dsdl.asl file is not autogenerated while all other dsdl_*.asl files are.
.hgignore is correct.
Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Fabio Fantoni [Wed, 20 Feb 2013 15:46:06 +0000 (15:46 +0000)]
libxl: Add qxl vga interface support for upstream qemu
Usage:
vga="qxl"
Signed-off-by: Fabio Fantoni <fabio.fantoni@heliman.it> Signed-off-by: Zhou Peng <zpengxen@gmail.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 19 Feb 2013 09:49:53 +0000 (10:49 +0100)]
xmalloc: make close-to-PAGE_SIZE allocations more efficient
Rather than bumping their sizes to slightly above (a multiple of)
PAGE_SIZE (in order to store tracking information), thus requiring
a non-order-0 allocation even when no more than a page is being
requested, return the result of alloc_xenheap_pages() directly, and use
the struct page_info field underlying PFN_ORDER() to store the actual
size (needed for freeing the memory).
This leverages the fact that sub-allocation of memory obtained from the
page allocator can only ever result in non-page-aligned memory chunks
(with the exception of zero size allocations with sufficiently high
alignment being requested, which is why zero-size allocations now get
special cased).
Use the new property to simplify allocation of the trap info array for
PV guests on x86.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jiongxi Li [Mon, 18 Feb 2013 08:34:18 +0000 (09:34 +0100)]
x86/VMX: fix VMCS setting for x2APIC mode guest while enabling APICV
The "APIC-register virtualization" and "virtual-interrupt deliver"
VM-execution control has no effect on the behavior of RDMSR/WRMSR if
the "virtualize x2APIC mode" VM-execution control is 0.
When guest uses x2APIC mode, we should enable "virtualize x2APIC mode"
for APICV first.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Jiongxi Li [Mon, 18 Feb 2013 08:27:58 +0000 (09:27 +0100)]
x86/VMX: fix live migration while enabling APICV
SVI should be restored in case guest is processing virtual interrupt
while saveing a domain state. Otherwise SVI would be missed when
virtual interrupt delivery is enabled.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Lalith Suresh [Fri, 15 Feb 2013 14:57:40 +0000 (14:57 +0000)]
xm: fix description of xm vcpu-set command
Minor language correction in the description of the xm vcpu-set command.
Signed-off-by: Lalith Suresh <suresh.lalith@gmail.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen/arm: introduce a driver for the ARM HDLCD controller
Read the screen resolution setting from device tree, find the
corresponding modeline in a small table of standard video modes, set the
hardware accordingly.
Use vexpress_syscfg to configure the pixel clock.
Use the generic framebuffer functions to print on the screen.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Abstract away from vesa.c the funcions to handle a linear framebuffer
and print characters to it.
Make use of the new functions in vesa.c.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
[ ijc -- s/fbp/lfbp/ in two places to fix the build ] Committed-by: Ian Campbell <ian.campbell@citrix.com>
Fabio Fantoni [Fri, 15 Feb 2013 13:32:27 +0000 (13:32 +0000)]
tools/libxl: Improve videoram setting
- If videoram setting is less than 8 mb shows error and exit.
- Added videoram setting for qemu upstream with cirrus (added in qemu 1.3).
- Updated xl.cfg man.
- Default and minimal videoram changed to 16 mb if stdvga is set and upstream
qemu is being used. This is required by qemu 1.4 to avoid a xen memory error
(qemu 1.3 doesn't complain about it, probably buggy).
Do not try to save and restore the vtimer for the idle domain.
Inject the vtimer interrupt from the Xen timer handler, taking care of
setting the timer as masked in the ctl field, so that at restore time it
is not going to fire the interrupt again.
No need to disable the vtimer before writing the new offset on restore:
the vtimer is already disabled.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Make xc_dom_feature_translated an arch-dependent function.
alloc_magic_pages: save console and xenstore pfn's in xc_dom_image.
alloc_magic_pages: set HVM_PARAM_CONSOLE_EVTCHN and
HVM_PARAM_STORE_EVTCHN hvm_params using the event channels allocated by
the toolstack.
Call xc_dom_gnttab_hvm_seed instead of xc_dom_gnttab_seed in
xc_dom_gnttab_init for autotranslated guests.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ross Philipson [Fri, 15 Feb 2013 13:32:17 +0000 (13:32 +0000)]
libxl: Cleanup, use LOG* and GCSPRINTF macro in libxl_dom.c
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
The changes are primarily in the domain building code where the firmware files
are read and passed to libxc for loading into the new guest. After the domain
building call to libxc, the addresses for the loaded blobs are returned and
written to xenstore.
LIBXL_HAVE_FIRMWARE_PASSTHROUGH is defined in libxl.h to allow users to
determine if the feature is present.
This patch also updates the xl.cfg man page with descriptions of the two new
parameters for firmware passthrough.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ross Philipson [Fri, 15 Feb 2013 13:32:16 +0000 (13:32 +0000)]
libxl: switch to using the new xc_hvm_build() libxc API.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Fri, 15 Feb 2013 13:32:15 +0000 (13:32 +0000)]
libxl: pass debug flag down to libxl_domain_suspend
libxl_domain_suspend is already prepared to handle LIBXL_SUSPEND_DEBUG,
and xl migrate handles the -d switch as well. Pass this flag down to
libxl_domain_suspend, so that finally xc_domain_save can dump huge
amount of debug data to stdout.
Update xl.1 and help text output.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Fri, 15 Feb 2013 13:32:13 +0000 (13:32 +0000)]
tools/xc: log pid in xc_save/xc_restore output
If several migrations log their output to xend.log its not clear which
line belongs to a which guest. Print entry/exit of xc_save and
xc_restore and also request to print pid with each log call.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Fri, 15 Feb 2013 13:32:13 +0000 (13:32 +0000)]
tools/xc: restore logging in xc_save
Prior to xen-4.1 the helper xc_save would print some progress during
migration. With the new xc_interface_open API no more messages were
printed because no logger was configured.
Restore previous behaviour by providing a logger. The progress in
xc_domain_save will be disabled because it generates alot of output and
fills up xend.log quickly.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Fri, 15 Feb 2013 13:32:11 +0000 (13:32 +0000)]
tools/xc: handle tty output differently in stdiostream_progress
If the output goes to a tty, rewind the cursor and print everything in a
single line as it was done up to now. If the output goes to a file or
pipe print a newline after each progress output. This will fix logging
of progress messages from xc_save to xend.log.
To support XTL_STDIOSTREAM_SHOW_PID or XTL_STDIOSTREAM_SHOW_DATE print
the output via vmessage if the output is not a tty.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 15 Feb 2013 13:32:10 +0000 (13:32 +0000)]
xen: arm32: Use system wide TLB flushes, not just inner-shareable
We currently setup page table walks etc as outer-shareable. Given we don't
really make the distinction between inner- and outer-shareable yet err on
theside of safety.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 15 Feb 2013 09:24:43 +0000 (09:24 +0000)]
tools/ocaml: oxenstored: correctly handle a full ring.
Change 26521:2c0fd406f02c (part of XSA-38 / CVE-2013-0215) incorrectly
caused us to ignore rather than process a completely full ring. Check if
producer and consumer are equal before masking to avoid this, since prod ==
cons + PAGE_SIZE after masking becomes prod == cons.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Fri, 15 Feb 2013 08:38:45 +0000 (09:38 +0100)]
hvm: Allow triple fault to imply crash rather than reboot
While the triple fault action on native hardware will result in a system
reset, any modern operating system can and will make use of less violent
reboot methods. As a result, the most likely cause of a triple fault is a
fatal software bug.
This patch allows the toolstack to indicate that a triple fault should mean a
crash rather than a reboot. The default of reboot still remains the same.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Jan Beulich <jbeulich@suse.com>
Tim Deegan [Thu, 14 Feb 2013 12:20:58 +0000 (12:20 +0000)]
x86/setup: don't relocate the VGA hole.
Copying the contents of the VGA hole is at best pointless and at worst
dangerous. Booting Xen on Xen, it causes a very long delay as each
byte is referred to qemu.
Since we were already discarding the first 1MB of the relocated area,
just avoid copying it in the first place.
Reported-by: Jon Ludlam <jonathan.ludlam@eu.citrix.com> Signed-off-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
Jan Beulich [Thu, 14 Feb 2013 08:42:57 +0000 (09:42 +0100)]
AMD IOMMU: handle MSI for phantom functions
With ordinary requests allowed to come from phantom functions, the
remapping tables ought to be set up to also allow for MSI triggers to
come from other than the "real" device too.
It is not clear to me whether the alias-ID handling also needs
adjustment for this to work properly, or whether firmware can be
expected to properly express this through a device alias range
descriptor (or multiple device alias ones).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Thu, 14 Feb 2013 08:40:52 +0000 (09:40 +0100)]
AMD IOMMU: also spot missing IO-APIC entries in IVRS table
Apart from dealing duplicate conflicting entries, we also have to
handle firmware omitting IO-APIC entries in IVRS altogether. Not doing
so has resulted in c/s 26517:601139e2b0db to crash such systems during
boot (whereas with the change here the IOMMU gets disabled just as is
being done in the other cases, i.e. unless global tables are being
used).
Debugging this issue has also pointed out that the debug log output is
pretty ugly to look at - consolidate the output, and add one extra
item for the IVHD special entries, so that future issues are easier
to analyze.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Michael Young [Wed, 13 Feb 2013 17:00:15 +0000 (17:00 +0000)]
tools: Fix memset(&p,0,sizeof(p)) idiom in several places.
gcc 4.8 identifies several places where code of the form memset(x, 0,
sizeof(x)); is used incorrectly, meaning that less memory is set to
zero than required.
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Committed-by: Keir Fraser <keir@xen.org>
Add Xenoprofile support for AMD Family16h. The corresponded OProfile
patch has already been submitted to OProfile mailing list.
(http://marc.info/?l=oprofile-list&m=136036136017302&w=2 ).
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Ian Jackson [Thu, 7 Feb 2013 14:21:47 +0000 (14:21 +0000)]
oxenstored: Enforce a maximum message size of 4096 bytes
The maximum size of a message is part of the protocol spec in
xen/include/public/io/xs_wire.h
Before this patch a client which sends an overly large message can
cause a buffer read overrun.
Note if a badly-behaved client sends a very large message
then it will be difficult for them to make their connection
work again-- they will probably need to reboot.
Signed-off-by: David Scott <dave.scott@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 7 Feb 2013 14:21:44 +0000 (14:21 +0000)]
tools/ocaml: oxenstored: Be more paranoid about ring reading
oxenstored makes use of the OCaml Xenbus bindings, in which the
function xs_ring_read in tools/ocaml/libs/xb/xs_ring_stubs.c is used
to read from the shared memory Xenstore ring.
This function does not correctly handle all possible (prod, cons)
states when MASK_XENSTORE_IDX(prod) > MASK_XENSTORE_IDX(cons).
The root cause is the use of the unmasked values of prod and cons to
calculate to_read. If prod is set to an out-of-range value, the ring
peer can cause to_read to be too large or even negative. This allows
the ring peer to force oxenstored to read and write out of range for
the buffers leading to a crash or possibly to privilege escalation.
Correct this by masking the values of cons and prod at the start, so
we only deal with masked values. This makes the logic simpler, as
semantically inappropriate values of the upper bits of the ring
pointers are simply ignored.
The same vulnerability does not exist in the ring writer because the
only use made of the unmasked value is the check which prevents the
prod pointer overtaking the cons pointer. A ring peer which defeats
this check will suffer only lost data.
However, additionally, precautions need to be taken to ensure that
req_cons and req_prod are only read once in each function. Without
the use of volatile or some asm construct, the compiler can "prove"
that req_cons and req_prod do not change unexpectedly and is permitted
to "amplify" the read of (say) req_cons into two reads at different
times, giving two different values for use as cons, and then use the
two sources of cons interchangeably. (The use of xen_mb() does not
forbid this.)
Therefore do the reads of req_cons and req_prod through a volatile
pointer in both xs_ring_read and xs_ring_write.
This is currently believed to be a theoretical vulnerability as we are
not aware of any compilers which amplify reads in this way.
This is a security issue, part of XSA-38 / CVE-2013-0215.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Matthew Daley <mattjd@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 5 Feb 2013 15:47:41 +0000 (15:47 +0000)]
xen: enable stubdom on a per arch basis
... and disable on ARM (for now).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>