xenbits.xensource.com Git - xen.git/log
xen.git
10 years ago libxl_json: introduce parser functions for builtin types
Wei Liu [Mon, 9 Jun 2014 12:43:20 +0000 (13:43 +0100)]
libxl_json: introduce parser functions for builtin types

This changeset introduces the following functions:
 * libxl_defbool_parse_json
 * libxl__bool_parse_json
 * libxl_uuid_parse_json
 * libxl_mac_parse_json
 * libxl_bitmap_parse_json
 * libxl_cpuid_policy_list_parse_json
 * libxl_string_list_parse_json
 * libxl_key_value_list_parse_json
 * libxl_hwcap_parse_json
 * libxl__int_parse_json
 * libxl__uint{8,16,32,64}_parse_json
 * libxl__string_parse_json

They will be used in a later patch to convert the libxl__json_object
tree of a builtin type to a libxl_FOO struct.

Also remove declaration of libxl_domid_gen_json as libxl_domid uses
yajl_gen_integer to generate JSON object.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
10 years ago libxl_json: introduce libxl__object_from_json
Wei Liu [Mon, 9 Jun 2014 12:43:19 +0000 (13:43 +0100)]
libxl_json: introduce libxl__object_from_json

Given a JSON string, we need to convert it to libxl_FOO struct.

The approach is:
JSON string -> libxl__json_object -> libxl_FOO struct

With this approach we can make use of libxl's infrastructure to do the
first half (JSON string -> libxl__json_object).

The second half is done by code auto-generated by libxl's IDL
infrastructure. IDL patch(es) will come later.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago libxl IDL: rename json_fn to json_gen_fn
Wei Liu [Mon, 9 Jun 2014 12:43:18 +0000 (13:43 +0100)]
libxl IDL: rename json_fn to json_gen_fn

This json_fn is in fact used to generate the string representation of a JSON
data structure. We will introduce another JSON function to parse a JSON
data structure in a later changeset, so rename json_fn to json_gen_fn to
clarify.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago libxl: fix JSON generator for uint64_t
Wei Liu [Mon, 9 Jun 2014 12:43:17 +0000 (13:43 +0100)]
libxl: fix JSON generator for uint64_t

yajl_gen_integer cannot cope with uint64_t, because it takes a signed
long long. If we pass it a uint64_t number between INT64_MAX and
UINT64_MAX, it generates a negative number. Later, when we feed this
generated number into the parser, the result gets sign-extended, which is
wrong.

A new function called libxl__uint64_gen_json is introduced to handle
uint64_t. It utilises yajl_gen_number to generate numbers.

Also removed a duplicated definition of MemKB while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
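
To illustrate the failure mode described in the commit above, here is a
minimal sketch in C. It does not call yajl at all: as_signed() and
format_u64() are hypothetical helpers, not libxl functions. The first shows
what a signed-only integer generator effectively sees, the second produces
the kind of decimal string that could be handed to a string-based number
generator such as yajl_gen_number.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the value as seen through a signed long long.
 * The conversion of an out-of-range value is implementation-defined;
 * on the usual two's complement targets it wraps to a negative number. */
long long as_signed(uint64_t v)
{
    return (long long)v;
}

/* Hypothetical helper: render v as an unsigned decimal string.
 * Returns the string length, or -1 if buf is too small. */
int format_u64(uint64_t v, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%" PRIu64, v);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Formatting the number as text sidesteps the signed integer type entirely,
which is presumably why a string-based generator call is the better fit here.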
10 years ago xl: remove parsing of "vncviewer" option in xl domain config file
Wei Liu [Mon, 9 Jun 2014 12:43:16 +0000 (13:43 +0100)]
xl: remove parsing of "vncviewer" option in xl domain config file

Print out a warning and suggest that the user pass the "-V" option when
invoking "xl create". Also remove that option from the manpage. This
introduces a minor functional regression, but it is very easy to work around.

The rationale behind this change is that this option is not actually
part of the domain configuration. It only affects whether a vncviewer
should be spawned automatically, and has nothing to do with how a domain
should be constructed. The option is also bogus: if you
migrate a domain to a remote host and the receiver spawns a vncviewer on
the receiving side, it either dies silently or occupies resources.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago libxl: make cpupool_qualifier_to_cpupoolid a library function
Wei Liu [Mon, 9 Jun 2014 12:43:12 +0000 (13:43 +0100)]
libxl: make cpupool_qualifier_to_cpupoolid a library function

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools/libxc: add DECLARE_HYPERCALL_BUFFER_SHADOW()
David Vrabel [Mon, 9 Jun 2014 15:41:10 +0000 (16:41 +0100)]
tools/libxc: add DECLARE_HYPERCALL_BUFFER_SHADOW()

DECLARE_HYPERCALL_BUFFER_SHADOW() is like DECLARE_HYPERCALL_BUFFER()
except it is backed by an already allocated hypercall buffer.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools/libxc: Use _Static_assert if available
Andrew Cooper [Mon, 9 Jun 2014 15:41:08 +0000 (16:41 +0100)]
tools/libxc: Use _Static_assert if available

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
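
The commit above carries no description, so here is a rough sketch of the
general technique, under stated assumptions. BUILD_ASSERT and struct
wire_header are hypothetical names, and the C11 version check is only one
plausible way to detect availability; libxc's actual macro may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro name; not taken from libxc. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11: the compiler reports the failed condition text directly. */
#define BUILD_ASSERT(cond) _Static_assert(cond, #cond)
#else
/* Fallback: a negative array size forces a compile error if cond is false. */
#define BUILD_ASSERT(cond) ((void)sizeof(char[1 - 2 * !(cond)]))
#endif

/* Hypothetical structure whose layout must not change. */
struct wire_header {
    uint32_t type;
    uint32_t length;
};

/* Returns the header size after checking the layout at compile time. */
unsigned int wire_header_size(void)
{
    BUILD_ASSERT(sizeof(struct wire_header) == 8);
    return (unsigned int)sizeof(struct wire_header);
}
```

The benefit of the C11 form is mostly diagnostic: the error message names the
condition instead of complaining about an array of negative size.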
10 years ago tools/libxc: Annotate xc_osdep_log with __attribute__((format))
Andrew Cooper [Mon, 9 Jun 2014 15:41:07 +0000 (16:41 +0100)]
tools/libxc: Annotate xc_osdep_log with __attribute__((format))

This helps the compiler spot printf formatting errors.

Fix up resulting errors in xenctrl_osdep_ENOSYS.c.  Substitute %p for the
slightly less bad %lx when trying to format an opaque structure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
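
As a sketch of what such an annotation looks like, assuming a GCC-compatible
compiler: log_to_buf() below is a hypothetical function, not xc_osdep_log,
and its parameter positions are chosen only for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logger. Parameter 3 is the printf-style format string and
 * the variadic arguments start at position 4. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int log_to_buf(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as log_to_buf(buf, len, "%d", "text")
should draw a -Wformat warning at compile time instead of misbehaving at
run time.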
10 years ago tools/libxc: Annotate xc_report_error with __attribute__((format))
Andrew Cooper [Mon, 9 Jun 2014 15:41:06 +0000 (16:41 +0100)]
tools/libxc: Annotate xc_report_error with __attribute__((format))

This helps the compiler spot printf formatting errors.

Fix up all errors discovered.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago xen: arm: include .text.cold and .text.unlikely in text area
Ian Campbell [Mon, 9 Jun 2014 14:28:12 +0000 (15:28 +0100)]
xen: arm: include .text.cold and .text.unlikely in text area

Otherwise functions in these sections can end up between .text and .rodata
which is after _etext and therefore gets made non-executable.

This matches x86 (although it was done there for different reasons).

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Cc: Jan Beulich <JBeulich@suse.com>
10 years ago cpufreq: extend documentation for cpufreq parameter
Aravind Gopalakrishnan [Tue, 10 Jun 2014 10:05:37 +0000 (12:05 +0200)]
cpufreq: extend documentation for cpufreq parameter

cpufreq parameter can take more options than currently
documented. Include these with some comments regarding
their intention.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
10 years ago common/grant: add a newline into error message
Andrew Cooper [Tue, 10 Jun 2014 10:04:59 +0000 (12:04 +0200)]
common/grant: add a newline into error message

Avoid corrupting the next line on the console.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86,amd: remove unused wrmsr_amd
Aravind Gopalakrishnan [Tue, 10 Jun 2014 10:04:35 +0000 (12:04 +0200)]
x86,amd: remove unused wrmsr_amd

After Andrew's commit 07884c9, all writes to password-protected
MSRs are performed using wrmsr_amd_safe.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
10 years ago avoid crash on HVM domain destroy with PCI passthrough
Juergen Gross [Tue, 10 Jun 2014 10:04:08 +0000 (12:04 +0200)]
avoid crash on HVM domain destroy with PCI passthrough

c/s bac6334b5 "move domain to cpupool0 before destroying it" introduced a
problem when destroying an HVM domain with PCI passthrough enabled. Moving
the domain to cpupool0 includes moving the pirqs to the cpupool0
cpus, but the event channel infrastructure is already unusable for the
domain. So just avoid moving pirqs for dying domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
10 years ago x86/domctl: further fix to XEN_DOMCTL_[gs]etvcpuextstate
Andrew Cooper [Tue, 10 Jun 2014 10:03:16 +0000 (12:03 +0200)]
x86/domctl: further fix to XEN_DOMCTL_[gs]etvcpuextstate

Do not clobber errors from certain codepaths.  Clobbering of -EINVAL from
failing "evc->size <= PV_XSAVE_SIZE(_xcr0_accum)" was a pre-existing bug.

However, clobbering -EINVAL/-EFAULT from the get codepath was a bug
unintentionally introduced by 090ca8c1 "x86/domctl: two functional fixes to
XEN_DOMCTL_[gs]etvcpuextstate".

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
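
The "clobbering" pattern this commit refers to can be sketched in general
terms as follows. finish_op() and its parameters are hypothetical and do not
mirror the actual domctl code; the sketch only shows how an earlier error
can be preserved when a later step may also fail.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: op_ret is the result of the main operation,
 * copy_failed says whether copying results back to the caller failed. */
int finish_op(int op_ret, int copy_failed)
{
    int ret = op_ret;

    /* Only report the copy-back failure if nothing failed earlier.
     * An unconditional assignment here would hide op_ret. */
    if (ret == 0 && copy_failed)
        ret = -EFAULT;

    return ret;
}
```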
10 years ago x86/amd: protect set_cpuidmask() against #GP faults
Andrew Cooper [Thu, 5 Jun 2014 15:57:07 +0000 (17:57 +0200)]
x86/amd: protect set_cpuidmask() against #GP faults

Virtual environments such as Xen HVM containers and VirtualBox do not
necessarily provide support for feature masking MSRs.

As their presence is detected by model numbers alone, and their use predicated
on command line parameters, use the safe() variants of {wr,rd}msr() to avoid
dying with an early #GP fault.

In fact, use the password variants in all cases because:
    a) they are safe to use even if not strictly required
    b) have a more useful function prototype for this purpose

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
10 years ago x86: fix reboot/shutdown with running HVM guests
Roger Pau Monné [Thu, 5 Jun 2014 15:53:35 +0000 (17:53 +0200)]
x86: fix reboot/shutdown with running HVM guests

If there's a guest using VMX/SVM when the hypervisor shuts down, it
can lead to the following crash due to VMX/SVM functions being called
after hvm_cpu_down has been called. In order to prevent that, check in
{svm/vmx}_ctxt_switch_from that the cpu virtualization extensions are
still enabled.

(XEN) Domain 0 shutdown: rebooting machine.
(XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
(XEN) ----[ Xen-4.5-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
...
(XEN) Xen call trace:
(XEN)    [<ffff82d0801d90ce>] vmx_ctxt_switch_from+0x1e/0x14c
(XEN)    [<ffff82d08015d129>] __context_switch+0x127/0x462
(XEN)    [<ffff82d080160acf>] __sync_local_execstate+0x6a/0x8b
(XEN)    [<ffff82d080160af9>] sync_local_execstate+0x9/0xb
(XEN)    [<ffff82d080161728>] map_domain_page+0x88/0x4de
(XEN)    [<ffff82d08014e721>] map_vtd_domain_page+0xd/0xf
(XEN)    [<ffff82d08014cda2>] io_apic_read_remap_rte+0x158/0x29f
(XEN)    [<ffff82d0801448a8>] iommu_read_apic_from_ire+0x27/0x29
(XEN)    [<ffff82d080165625>] io_apic_read+0x17/0x65
(XEN)    [<ffff82d080166143>] __ioapic_read_entry+0x38/0x61
(XEN)    [<ffff82d080166aa8>] clear_IO_APIC_pin+0x1a/0xf3
(XEN)    [<ffff82d080166bae>] clear_IO_APIC+0x2d/0x60
(XEN)    [<ffff82d080166f63>] disable_IO_APIC+0xd/0x81
(XEN)    [<ffff82d08018228b>] smp_send_stop+0x58/0x68
(XEN)    [<ffff82d080181aa7>] machine_restart+0x80/0x20a
(XEN)    [<ffff82d080181c3c>] __machine_restart+0xb/0xf
(XEN)    [<ffff82d080128fb9>] smp_call_function_interrupt+0x99/0xc0
(XEN)    [<ffff82d080182330>] call_function_interrupt+0x33/0x43
(XEN)    [<ffff82d08016bd89>] do_IRQ+0x9e/0x63a
(XEN)    [<ffff82d08016406f>] common_interrupt+0x5f/0x70
(XEN)    [<ffff82d0801a8600>] mwait_idle+0x29c/0x2f7
(XEN)    [<ffff82d08015cf67>] idle_loop+0x58/0x76
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'read_cr0() & X86_CR0_TS' failed at vmx.c:644
(XEN) ****************************************

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
10 years ago x86/domctl: two functional fixes to XEN_DOMCTL_[gs]etvcpuextstate
Andrew Cooper [Thu, 5 Jun 2014 15:52:57 +0000 (17:52 +0200)]
x86/domctl: two functional fixes to XEN_DOMCTL_[gs]etvcpuextstate

Interacting with the vcpu itself should be protected by vcpu_pause().
Buggy/naive toolstacks might encounter adverse interaction with a vcpu context
switch, or an increase of xcr0_accum.  There are no such problems with current
in-tree code.

Explicitly permit a NULL guest handle as being a request for size.  It is the
prevailing Xen style, and without it, valgrind's ioctl handler is unable to
determine whether evc->buffer actually got written to.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
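
The "NULL handle means size request" convention mentioned above can be
sketched like this. get_state() and state_blob are hypothetical stand-ins
written for this example; they are not the hypervisor's interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical blob standing in for a vcpu's extended state. */
static const unsigned char state_blob[] = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* On entry *size is the caller's buffer size; on success it holds the
 * required size. A NULL buf is a pure size query and writes no data. */
int get_state(void *buf, size_t *size)
{
    size_t need = sizeof(state_blob);

    if (buf == NULL) {
        *size = need;
        return 0;
    }
    if (*size < need)
        return -EINVAL;

    memcpy(buf, state_blob, need);
    *size = need;
    return 0;
}
```

A caller would typically query the size first, allocate, then call again with
a real buffer. Making the query explicit also lets tools reason about whether
the buffer was written.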
10 years ago x86/xsave: add fastpath for common xstate_ctxt_size() requests
Andrew Cooper [Thu, 5 Jun 2014 15:52:11 +0000 (17:52 +0200)]
x86/xsave: add fastpath for common xstate_ctxt_size() requests

xstate_ctxt_size(xfeature_mask) is a runtime constant after boot, and is used
for bounds checking when handling xsave state.  Avoid reloading xcr0 twice to
obtain a number which has already been calculated.

Also annotate xfeature_mask as __read_mostly as it is only ever written once.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
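
A fastpath of this kind can be sketched as a one-entry cache. Everything
below is hypothetical: slow_ctxt_size() merely simulates the expensive query
with a made-up formula, and the variable names are not Xen's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical boot-time values, written once by ctxt_size_init(). */
uint64_t boot_feature_mask;
unsigned int boot_ctxt_size;
unsigned int slow_path_calls;   /* instrumentation for the example only */

/* Stand-in for the expensive query; the formula is invented. */
unsigned int slow_ctxt_size(uint64_t mask)
{
    unsigned int bits = 0;

    slow_path_calls++;
    for (; mask; mask &= mask - 1)
        bits++;
    return 512 + 64 * bits;
}

void ctxt_size_init(uint64_t mask)
{
    boot_feature_mask = mask;
    boot_ctxt_size = slow_ctxt_size(mask);
}

/* Fastpath: the common request is for the boot-time mask. */
unsigned int ctxt_size(uint64_t mask)
{
    if (mask == boot_feature_mask)
        return boot_ctxt_size;
    return slow_ctxt_size(mask);
}
```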
10 years ago VT-d: honor APEI firmware-first mode in XSA-59 workaround code
Jan Beulich [Thu, 5 Jun 2014 15:49:14 +0000 (17:49 +0200)]
VT-d: honor APEI firmware-first mode in XSA-59 workaround code

When firmware-first mode is being indicated by firmware, we shouldn't
be modifying AER registers - these are considered to be owned by
firmware in that case. Violating this is being reported to result in
SMI storms. While circumventing the workaround means re-exposing
affected hosts to the XSA-59 issues, this in any event seems better
than not booting at all. Respective messages are being issued to the
log, so the situation can be diagnosed.

The basic building blocks were taken from Linux 3.15-rc. Note that
this includes a block of code enclosed in #ifdef CONFIG_X86_MCE - we
don't define that symbol, and that code also wouldn't build without
suitable machine check side code added; that should happen eventually,
but isn't subject of this change.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
10 years ago x86/HVM: make vmsi_deliver() return proper error values
Jan Beulich [Thu, 5 Jun 2014 15:46:13 +0000 (17:46 +0200)]
x86/HVM: make vmsi_deliver() return proper error values

... and propagate this from hvm_inject_msi(). In the course of this I
spotted further room for cleanup:
- vmsi_inj_irq()'s struct domain * parameter was unused
- vmsi_deliver() pointlessly passed on dest_ExtINT to vmsi_inj_irq()
  (which that one validly refused to handle)
- vmsi_inj_irq()'s sole caller guarantees a proper delivery mode (i.e.
  rather than printing an obscure message we can just BUG())
- some formatting and log message quirks

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86/HVM: properly propagate errors from HVMOP_inject_msi
Jan Beulich [Thu, 5 Jun 2014 15:45:27 +0000 (17:45 +0200)]
x86/HVM: properly propagate errors from HVMOP_inject_msi

There are a number of ways this operation can go wrong, all of which
got ignored so far.

In the context of this I wonder whether map_domain_emuirq_pirq()
returning 0 in the "already mapped" case is really intended to be that
way (this is why the subsequent NULL check here can't be an ASSERT()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86/hvm: correct hvm_ioreq_server_alloc_rangesets() failure path
Andrew Cooper [Thu, 5 Jun 2014 15:43:26 +0000 (17:43 +0200)]
x86/hvm: correct hvm_ioreq_server_alloc_rangesets() failure path

Coverity-ID: 1220092 "Unsigned compare against 0"
Coverity-ID: 1220093 "Out-of-bounds read"

Both of these are caused by the while() loop in the fail path, which
results in an infinite loop and memory corruption from rangeset_destroy().

Move hvm_ioreq_server_free_rangesets() up and use it for cleanup on the
failure path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
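
For readers unfamiliar with this class of bug, here is a small sketch. It
assumes the faulty loop was of the form "while (--i >= 0)" with an unsigned
counter, which matches the Coverity descriptions but is not confirmed by the
text above. All names are hypothetical; the fix shown is the one the commit
describes, namely reusing a single free routine on the failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NR_RANGES 3   /* hypothetical count */

struct server {
    int *range[NR_RANGES];
};

/* Safe on a partially initialised server: free(NULL) is a no-op. */
void server_free_ranges(struct server *s)
{
    unsigned int i;

    for (i = 0; i < NR_RANGES; i++) {
        free(s->range[i]);
        s->range[i] = NULL;
    }
}

/* fail_at simulates an allocation failure at that index
 * (pass NR_RANGES for "never fail"). */
int server_alloc_ranges(struct server *s, unsigned int fail_at)
{
    unsigned int i;

    for (i = 0; i < NR_RANGES; i++)
        s->range[i] = NULL;

    for (i = 0; i < NR_RANGES; i++) {
        s->range[i] = (i == fail_at) ? NULL : malloc(sizeof(int));
        if (s->range[i] == NULL)
            goto fail;
    }
    return 0;

 fail:
    /* Not "while (--i >= 0)": with an unsigned i that condition is
     * always true, so the loop never ends and indexes out of bounds. */
    server_free_ranges(s);
    return -ENOMEM;
}
```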
10 years ago iommu: set correct IOMMU entries when !iommu_hap_pt_share
Roger Pau Monné [Thu, 5 Jun 2014 15:42:49 +0000 (17:42 +0200)]
iommu: set correct IOMMU entries when !iommu_hap_pt_share

If the memory map is not shared between HAP and IOMMU we fail to set
correct IOMMU mappings for memory types other than p2m_ram_rw.

This patch adds IOMMU support for the following memory types:
p2m_grant_map_rw, p2m_map_foreign, p2m_ram_ro, p2m_grant_map_ro and
p2m_ram_logdirty.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Tested-by: David Zhuang <david.zhuang@oracle.com>
10 years ago make logdirty and iommu mutually exclusive
Roger Pau Monné [Thu, 5 Jun 2014 15:41:46 +0000 (17:41 +0200)]
make logdirty and iommu mutually exclusive

Prevent the usage of global logdirty if the domain is using the IOMMU,
and also prevent passthrough of devices if logdirty is enabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years ago docs: Support building pdfs from markdown using pandoc
Andrew Cooper [Tue, 3 Jun 2014 13:13:48 +0000 (14:13 +0100)]
docs: Support building pdfs from markdown using pandoc

The Xen command line parameters document is far more useful as an indexed PDF
than it is as an unindexed HTML webpage.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- reran autogen.sh ]

10 years ago libxc/trace: Fix style
Konrad Rzeszutek Wilk [Wed, 4 Jun 2014 13:44:29 +0000 (09:44 -0400)]
libxc/trace: Fix style

Most of the functions follow the proper style, but these
two are the odd ones out.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago docs: xentrace manpage
Konrad Rzeszutek Wilk [Wed, 4 Jun 2014 13:44:27 +0000 (09:44 -0400)]
docs: xentrace manpage

Update the -c and -e parameters wording.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago added xentop option -f , --full-name to xentop manpage
Christian Wolter [Thu, 5 Jun 2014 09:24:54 +0000 (11:24 +0200)]
added xentop option -f , --full-name to xentop manpage

Signed-off-by: Christian Wolter <wolter@b1-systems.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago xen: arm: ensure we hold a reference to guest pages while we copy to/from them
Ian Campbell [Wed, 4 Jun 2014 13:58:38 +0000 (14:58 +0100)]
xen: arm: ensure we hold a reference to guest pages while we copy to/from them

This at once:
 - prevents the page from being reassigned under our feet
 - ensures that the domain owns the page, which stops a domain from giving a
   grant mapping, MMIO region, other non-RAM as a hypercall input/output.

We need to hold the p2m lock while doing the lookup until we have the
reference.

This also requires that during domain 0 building current is set to an actual
dom0 vcpu, so take care of this at the same time as the p2m is temporarily
loaded.

Lastly when dumping the guest stack we need to make sure that the guest hasn't
pointed its sp off into the weeds and/or misaligned it, which could lead to
hypervisor traps. Solve this by using the new function and checking alignment
first.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
10 years ago xen: arm: check permissions when copying to/from guest virtual addresses
Ian Campbell [Wed, 4 Jun 2014 13:58:36 +0000 (14:58 +0100)]
xen: arm: check permissions when copying to/from guest virtual addresses

In particular we need to make sure the guest has write permissions to buffers
which it passes as output buffers for hypercalls, otherwise the guest can
overwrite memory which it shouldn't be able to write (like r/o grant table
mappings).

This is XSA-98.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
10 years ago x86/PVH: avoid call to handle_mmio
Mukesh Rathor [Wed, 4 Jun 2014 09:27:50 +0000 (11:27 +0200)]
x86/PVH: avoid call to handle_mmio

handle_mmio() is currently unsafe for pvh guests. A call to it would
result in a call to vioapic_range, which will crash xen since the vioapic
ptr in struct hvm_domain is not initialized for pvh guests.

However, one path exists for such a call. If a pvh guest, dom0 or domU,
unintentionally touches non-existing memory, an EPT violation would occur.
This would result in an unconditional call to hvm_hap_nested_page_fault. In
that function, because get_gfn_type_access returns p2m_mmio_dm for
non-existing mfns by default, handle_mmio() will get called. This would result
in a xen crash instead of a guest crash. This patch addresses that.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
10 years ago ACPI: Prevent acpi_table_entries from falling into a infinite loop
Malcolm Crossley [Wed, 4 Jun 2014 09:26:15 +0000 (11:26 +0200)]
ACPI: Prevent acpi_table_entries from falling into a infinite loop

If a buggy BIOS programs an ACPI table with too small an entry length,
then acpi_table_entries gets stuck in an infinite loop.

To aid debugging, report the error and exit the loop.

Based on Linux kernel commit 369d913b242cae2205471b11b6e33ac368ed33ec

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Use < instead of <= (which I wrongly suggested), return -ENODATA
instead of -EINVAL, and make description match code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
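
The shape of such a guard can be sketched as follows. struct entry_header
and count_entries() are hypothetical and far simpler than the real ACPI
parsing code; the sketch only shows a length check with "<" against the
header size and an -ENODATA return, as the note above describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subtable header: every entry starts with its type and
 * its total length in bytes (header included). */
struct entry_header {
    uint8_t type;
    uint8_t length;
};

/* Counts entries of the given type in table[0..size).
 * Returns -ENODATA if a malformed entry is found. */
int count_entries(const uint8_t *table, size_t size, uint8_t type)
{
    size_t off = 0;
    int count = 0;

    while (off + sizeof(struct entry_header) <= size) {
        uint8_t etype = table[off];
        uint8_t elen = table[off + 1];

        /* A zero length would never advance off; anything shorter
         * than the header itself is malformed as well. */
        if (elen < sizeof(struct entry_header))
            return -ENODATA;
        if (etype == type)
            count++;
        off += elen;
    }
    return count;
}
```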
10 years ago VT-d: replace another fixmap use with ioremap()
Jan Beulich [Wed, 4 Jun 2014 09:24:33 +0000 (11:24 +0200)]
VT-d: replace another fixmap use with ioremap()

... making the code more generic and limiting address space consumption
(however small it might be) to just those machines that need this
mapping (this is an erratum workaround after all).

At the same time properly map the full needed range from the base
address instead of just the third page and fix some formatting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years ago x86/HVM: eliminate vulnerabilities from hvm_inject_msi()
Jan Beulich [Tue, 3 Jun 2014 13:17:14 +0000 (15:17 +0200)]
x86/HVM: eliminate vulnerabilities from hvm_inject_msi()

- pirq_info() returns NULL for a non-allocated pIRQ, and hence we
  mustn't unconditionally de-reference it, and we need to invoke it
  another time after having called map_domain_emuirq_pirq()
- don't use printk(), namely without XENLOG_GUEST, for error reporting

This is XSA-96.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years ago Update mail address
Juergen Gross [Tue, 3 Jun 2014 12:03:03 +0000 (14:03 +0200)]
Update mail address

Signed-off-by: Juergen Gross <jgross@suse.com>
10 years ago x86, mce: remove amd_{k8,f10}_mcheck_init functions
Aravind Gopalakrishnan [Tue, 3 Jun 2014 10:02:11 +0000 (12:02 +0200)]
x86, mce: remove amd_{k8,f10}_mcheck_init functions

With all AMD mcheck initialization unified now after
commit 518576c, these two function definitions can be removed.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
10 years ago support 'tera' suffixes for size parameters
Andrew Cooper [Tue, 3 Jun 2014 10:01:56 +0000 (12:01 +0200)]
support 'tera' suffixes for size parameters

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
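
Suffix handling of this sort is often written as a fall-through switch. The
sketch below is a hypothetical standalone parser, not Xen's own routine; in
particular, treating a missing suffix as plain bytes is an assumption made
for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser: a number followed by an optional K/M/G/T suffix.
 * No suffix is taken to mean bytes (an assumption of this sketch). */
unsigned long long parse_size(const char *s)
{
    char *end;
    unsigned long long v = strtoull(s, &end, 0);

    switch (*end) {
    case 'T': case 't':
        v <<= 10;
        /* fall through */
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        break;
    default:
        break;
    }
    return v;
}
```

Adding a new, larger unit to such a switch is a one-case change, since each
case simply shifts by another factor of 1024 and falls into the next.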
10 years ago x86/xsave: remove xfeat_mask checking from validate_xstate()
Andrew Cooper [Tue, 3 Jun 2014 10:00:53 +0000 (12:00 +0200)]
x86/xsave: remove xfeat_mask checking from validate_xstate()

validate_xstate() is called on codepaths which load new vcpu xsave state from
XEN_DOMCTL_{setvcpuextstate,sethvmcontext}, usually as part of migration.  In
both cases, the xfeat_mask passed in is the xfeature_mask of the saving Xen
rather than the restoring Xen.

Given that the xsave state itself is checked for consistency and validity on
the current cpu, checking whether it was valid for the cpu before migration is
not interesting (or indeed relevant, as the error can't be distinguished from
the other validity checking).

This change removes the need to pass the saving Xen's xfeature_mask,
simplifying the toolstack code and migration stream format in this area.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86: use alternative mechanism to define CLAC/STAC
Feng Wu [Tue, 3 Jun 2014 09:56:24 +0000 (11:56 +0200)]
x86: use alternative mechanism to define CLAC/STAC

This patch uses the alternative mechanism to define CLAC/STAC.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years ago x86: port the basic alternative mechanism from Linux to Xen
Feng Wu [Tue, 3 Jun 2014 09:31:21 +0000 (11:31 +0200)]
x86: port the basic alternative mechanism from Linux to Xen

This patch ports the basic alternative mechanism from Linux to Xen.
With this mechanism, we can patch code based on the CPU features.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years ago x86: make set_nmi_callback return the old nmi callback
Feng Wu [Tue, 3 Jun 2014 09:29:38 +0000 (11:29 +0200)]
x86: make set_nmi_callback return the old nmi callback

This patch makes set_nmi_callback return the old nmi callback, so
we can set it back later.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
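
The save-and-restore idiom this enables can be sketched as below. The
callback type, the handlers and set_callback() are hypothetical
simplifications written for this example; the real set_nmi_callback has its
own signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback signature. */
typedef int (*nmi_callback_t)(int cpu);

int default_nmi(int cpu) { (void)cpu; return 0; }
int temp_nmi(int cpu)    { (void)cpu; return 1; }

static nmi_callback_t current_cb = default_nmi;

/* Installs cb and returns the previously installed callback,
 * so the caller can put it back later. */
nmi_callback_t set_callback(nmi_callback_t cb)
{
    nmi_callback_t old = current_cb;

    current_cb = cb;
    return old;
}

/* Dispatches to whichever callback is currently installed. */
int deliver(int cpu)
{
    return current_cb(cpu);
}
```

A caller that needs a temporary handler keeps the returned pointer and passes
it back to set_callback() once it is done.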
10 years ago x86: add definitions for NOP operation
Feng Wu [Tue, 3 Jun 2014 09:29:12 +0000 (11:29 +0200)]
x86: add definitions for NOP operation

This patch adds definitions for NOP operations of different lengths.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago xen/arm: grant: Add another entry to map MFN 1:1 in dom0 p2m
Julien Grall [Tue, 27 May 2014 11:11:41 +0000 (12:11 +0100)]
xen/arm: grant: Add another entry to map MFN 1:1 in dom0 p2m

Grant mappings can be used for DMA requests. Currently the dev_bus_addr returned
by the hypercall is the MFN (not the IPA). The guest expects to be able to use
the returned address for DMA. When the device is protected by an IOMMU the
request will fail. Therefore, we have to add a 1:1 mapping in the domain p2m to
allow DMA requests to work.

This is valid because DOM0 has its memory mapped 1:1 and therefore we know
that RAM and devices cannot clash.

If the guest only owns protected devices, the returned dev_bus_addr should be an
IPA. This will allow us to safely remove the 1:1 mapping and make grant mapping
work correctly in the guest. For now, this is not addressed by this patch.

The grant mapping code does the reference counting on every MFN and will
call iommu_{map,unmap}_page when necessary. This was already handled for x86
PV guests, so we can reuse the same code path for ARM guests.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
[ ijc s/ld/d/ in both arch's gnttab_need_iommu_mapping() ]

10 years ago drivers/passthrough: arm: Add support for SMMU drivers
Julien Grall [Tue, 27 May 2014 11:11:40 +0000 (12:11 +0100)]
drivers/passthrough: arm: Add support for SMMU drivers

This patch adds support for the ARM architected SMMU driver. It is based on the
Linux driver (drivers/iommu/arm-smmu) at commit 89ac23cd.

The major differences with the Linux driver are:
    - Fault by default if the SMMU is enabled to translate an
    address (Linux is bypassing the SMMU)
    - Using P2M page table instead of creating new one
    - Dropped stage-1 support
    - Dropped chained SMMUs support for now
    - Reworking device assignment and the different structures

Xen is programming each IOMMU by:
    - Using stage-2 mode translation
    - Sharing the page table with the processor
    - Injecting a fault if the device has made a wrong translation

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools: Use SeaBIOS's defconfig
Ian Campbell [Wed, 14 May 2014 09:10:04 +0000 (10:10 +0100)]
tools: Use SeaBIOS's defconfig

Compared with our local config this enables CONFIG_BOOTSPLASH and disables
CONFIG_ATA_DMA and CONFIG_ATA_PIO32.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years ago tools: update to seabios rel-1.7.4
Ian Campbell [Wed, 14 May 2014 09:10:03 +0000 (10:10 +0100)]
tools: update to seabios rel-1.7.4

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years ago tools: arm: increase size of region set aside for guest grant table
Ian Campbell [Thu, 22 May 2014 09:46:44 +0000 (10:46 +0100)]
tools: arm: increase size of region set aside for guest grant table

The current size is sufficient for the default maximum grant table size
(32-frames), but increase the reserved region to 16M/4096 pages to allow for
the use of the gnttab_max_nr_frames command line option.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years ago tools: arm: support up to (almost) 1TB of guest RAM
Ian Campbell [Thu, 22 May 2014 09:46:43 +0000 (10:46 +0100)]
tools: arm: support up to (almost) 1TB of guest RAM

This creates a second bank of RAM starting at 8GB and potentially
extending to the 1TB boundary, which is the limit imposed by our
current use of a 3 level p2m with 2 pages at level 0 (2^40 bits).

I've deliberately left a gap between the two banks just to
exercise those code paths.

The second bank is 1016GB in size which plus the 3GB below 4GB is
1019GB maximum guest RAM. Once the fact that this
is slightly less than a full TB starts to become an issue for
people, we can switch to a 4 level p2m, which would be needed
to support guests larger than 1TB anyhow.

Tested on 32-bit with 1, 4 and 6GB guests. Anything more than
~3GB requires an LPAE enabled kernel, or a 64-bit guest.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
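
The arithmetic in this commit message can be checked with a short sketch.
The bank sizes and the 8GB base of the second bank come from the text above;
the 1GB base of the first bank is an assumption consistent with "3GB below
4GB", and the array names and split_ram() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GB(x) ((uint64_t)(x) << 30)
#define NR_BANKS 2

/* Bank 0: 3GB ending at 4GB (base assumed). Bank 1: 1016GB from 8GB. */
static const uint64_t bank_base[NR_BANKS] = { GB(1), GB(8) };
static const uint64_t bank_size[NR_BANKS] = { GB(3), GB(1016) };

/* First address past the end of bank i. */
uint64_t bank_end(unsigned int i)
{
    return bank_base[i] + bank_size[i];
}

/* Splits ramsize across the banks in order.
 * Returns 0 on success, -1 if the banks cannot hold it. */
int split_ram(uint64_t ramsize, uint64_t out[NR_BANKS])
{
    unsigned int i;

    for (i = 0; i < NR_BANKS; i++) {
        out[i] = ramsize < bank_size[i] ? ramsize : bank_size[i];
        ramsize -= out[i];
    }
    return ramsize ? -1 : 0;
}
```

Under these assumptions the second bank ends exactly at the 1TB boundary, and
the largest guest that fits is 3GB + 1016GB = 1019GB.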
10 years ago tools: arm: prepare guest FDT building for multiple RAM banks
Ian Campbell [Thu, 22 May 2014 09:46:42 +0000 (10:46 +0100)]
tools: arm: prepare guest FDT building for multiple RAM banks

This required exposing the sizes of the banks determined by the domain builder
up to libxl via xc_dom_image.

Since the domain build needs to know the size of the DTB we create placeholder
nodes for each possible bank and when we finalise the DTB we fill in the ones
which are actually populated and NOP out the rest.

Note that the number of guest RAM banks is still 1 after this change.

Also fixes a coding style violation in
libxl__arch_domain_finalise_hw_description while there.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- minor coding style fix ]

10 years ago tools: arm: prepare domain builder for multiple banks of guest RAM
Ian Campbell [Thu, 22 May 2014 09:46:41 +0000 (10:46 +0100)]
tools: arm: prepare domain builder for multiple banks of guest RAM

Prepare for adding more banks of guest RAM by renaming a bunch of defines
as RAM0 and replacing variables with arrays and introducing loops.

Also in preparation switch to using GUEST_RAM0_BASE explicitly instead of
implicitly via dom->rambase_pfn (while asserting that they must be the same).
This makes the multiple bank case cleaner (although it looks a bit odd for
now).

GUEST_RAM_BASE is defined as the address of the lowest RAM bank, it is used in
tools/libxl/libxl_dom.c to call xc_dom_rambase_init().

Lastly for now ramsize (total size) and rambank_size[0] (size of first bank)
are the same, but use the appropriate one for each context.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agotools: arm: refactor code to setup guest p2m and fill it with RAM
Ian Campbell [Thu, 22 May 2014 09:46:40 +0000 (10:46 +0100)]
tools: arm: refactor code to setup guest p2m and fill it with RAM

This will help when we have more guest RAM banks.

Mostly code motion of the p2m_host initialisation and allocation loop into the
new function populate_guest_memory, but in addition in the caller we now
initialise the whole p2m to INVALID_MFN to handle any holes, although in this
patch we still fill in the entire allocated region.
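
A rough model of the resulting structure, assuming a flat list for the p2m
and an identity mapping in place of the real allocation loop (both are
simplifications, and this populate_guest_memory is a stand-in rather than
the libxc function):

```python
INVALID_MFN = -1   # stand-in for the real INVALID_MFN sentinel

def populate_guest_memory(p2m, first_pfn, nr_pages):
    # Stand-in for the allocation loop: identity-map the requested range.
    for pfn in range(first_pfn, first_pfn + nr_pages):
        p2m[pfn] = pfn

def build_p2m(total_pages, regions):
    """regions: list of (first_pfn, nr_pages) ranges that receive RAM."""
    # Start with every entry invalid so that holes need no special handling.
    p2m = [INVALID_MFN] * total_pages
    for first_pfn, nr_pages in regions:
        populate_guest_memory(p2m, first_pfn, nr_pages)
    return p2m

p2m = build_p2m(8, [(0, 3), (5, 2)])
```

Entries 3, 4 and 7 stay at INVALID_MFN, which is how a hole would appear once
more than one region is populated.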

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agotools: arm: rearrange guest physical address space to increase max RAM
Ian Campbell [Thu, 22 May 2014 09:46:39 +0000 (10:46 +0100)]
tools: arm: rearrange guest physical address space to increase max RAM

By switching things around we can manage to expose up to 3GB of RAM to guests.

I deliberately didn't place the RAM at address 0 to avoid coming to rely on
this, so the various peripherals, MMIO and magic pages etc all live in the
lower 1GB leaving the upper 3GB available for RAM.

It would likely have been possible to reduce the space used by the peripherals
etc and allow for 3.5 or 3.75GB but I decided to keep things simple and will
handle >3GB memory in a subsequent patch.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agotools: arm: move magic pfns out of guest RAM region
Ian Campbell [Thu, 22 May 2014 09:46:38 +0000 (10:46 +0100)]
tools: arm: move magic pfns out of guest RAM region

Because toolstacks (at least libxl) only allow RAM to be specified in 1M
increments these two pages were effectively costing 1M of guest RAM space.
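
The cost can be checked with a small calculation. The 4K page size is an
assumption of this sketch, and the 768M region size is the figure quoted
elsewhere in this series.

```python
MB = 1 << 20
PAGE_SIZE = 4096            # assumed 4K pages
RAM_REGION = 768 * MB       # address space reserved for guest RAM

def max_guest_ram_mb(region_bytes, reserved_pages):
    usable = region_bytes - reserved_pages * PAGE_SIZE
    return usable // MB     # RAM is specified in 1M increments: round down

with_magic_in_ram = max_guest_ram_mb(RAM_REGION, 2)
with_magic_moved = max_guest_ram_mb(RAM_REGION, 0)
print(with_magic_in_ram, with_magic_moved)
```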

Since these pages don't actually need to live in RAM just move them out.

With this a guest can now use the full 768M of the address space reserved
for RAM. (ok, not that impressive, but it simplifies things later)

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
--
v3: make the size of the region explicit.
v2: remove spurious w/s change

tools: arm: make the size of the magic page region explicit

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotools: arm: report an error if the guest RAM is too large
Ian Campbell [Thu, 22 May 2014 09:46:37 +0000 (10:46 +0100)]
tools: arm: report an error if the guest RAM is too large

Due to the layout of the guest physical address space we cannot support more
than 768M of RAM before overrunning the area set aside for the grant table. Due
to the presence of the magic pages at the end of the RAM region guests are
actually limited to 767M.

Catch this case during domain build and fail gracefully instead of obscurely
later on.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agotools: libxl: use uint64_t not unsigned long long for addresses
Ian Campbell [Thu, 22 May 2014 09:46:36 +0000 (10:46 +0100)]
tools: libxl: use uint64_t not unsigned long long for addresses

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agotools/xenstore: Fix memory leaks in the client
Andrew Cooper [Fri, 23 May 2014 10:32:01 +0000 (11:32 +0100)]
tools/xenstore: Fix memory leaks in the client

Free the expanding buffer and output buffer after use.  Close the xenstore
handle after use.

The command line client is now valgrind-clean.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotools: install qemu into xen private directory and add rpath for the libraries
Zhigang Wang [Tue, 20 May 2014 17:30:54 +0000 (13:30 -0400)]
tools: install qemu into xen private directory and add rpath for the libraries

This patch will prevent our qemu from conflicting with system qemu.

Signed-off-by: Zhigang Wang <zhigang.x.wang@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: vcpu: Correctly release resources when a VCPU fails to initialize
Julien Grall [Wed, 30 Apr 2014 19:15:55 +0000 (20:15 +0100)]
xen/arm: vcpu: Correctly release resources when a VCPU fails to initialize

While I was adding new failing code at the end of the function, I noticed
that the vtimers are not freed which messes up all the timers and will crash
Xen quickly when the page is reused.

Currently neither vcpu_vgic_init nor vcpu_vtimer_init fails, so we
are safe for now. With the new GICv3 code, the former function will be able
to fail. This will result in a memory leak.

Call vcpu_destroy if the initialization has failed. We also need to add a
boolean to know if the vtimers are correctly setup as the timer common code
doesn't have any safeguard against removing a non-initialized timer.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: handle renamed VIFs
Jacek Konieczny [Fri, 23 May 2014 12:47:21 +0000 (14:47 +0200)]
libxenstat: handle renamed VIFs

Before trying to parse network interface name as 'vif*.*'
try to get the domid and network number from sysfs.

Fixes xentop output for domains with VIF renamed through the
'vifname' xl option.

Signed-off-by: Jacek Konieczny <jajcus@jajcus.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibvchan: Make raw_get_{data_ready, buffer_space} match
Jason Andryuk [Fri, 16 May 2014 20:48:16 +0000 (16:48 -0400)]
libvchan: Make raw_get_{data_ready, buffer_space} match

For writing into a vchan, raw_get_buffer_space used >, allowing the full
ring size to be written.  On the read side, raw_get_data_ready compared
the ring size with >=.  This mismatch means a completely filled buffer
cannot be read.  Fix this by making the size checks identical.
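
The mismatch can be reproduced with a simplified model of the two checks.
The sketch below ignores the 32-bit index wraparound of the real ring, and
the function names with _old/_fixed suffixes are made up for illustration.

```python
RING_SIZE = 1024

def buffer_space(prod, cons, size=RING_SIZE):
    used = prod - cons
    if used > size:          # inconsistent indices: report no space
        return 0
    return size - used

def data_ready_old(prod, cons, size=RING_SIZE):
    used = prod - cons
    if used >= size:         # mismatched check: a full ring looks invalid
        return 0
    return used

def data_ready_fixed(prod, cons, size=RING_SIZE):
    used = prod - cons
    if used > size:          # same comparison as the write side
        return 0
    return used

# The writer is allowed to fill the ring completely...
prod, cons = buffer_space(0, 0), 0
# ...but only the fixed reader sees the data.
print(data_ready_old(prod, cons), data_ready_fixed(prod, cons))
```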

Signed-off-by: Jason Andryuk <andryuk@aero.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: introduce asynchronous execution API
Yang Hongyang [Mon, 5 May 2014 04:14:25 +0000 (12:14 +0800)]
libxl: introduce asynchronous execution API

1. Introduce an asynchronous execution API:
  libxl__async_exec_init
  libxl__async_exec_start
  libxl__async_exec_inuse
2. Use the async exec API to execute device hotplug scripts.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agodom0: add opt_dom0pvh to setup.c
Mukesh Rathor [Mon, 2 Jun 2014 08:32:22 +0000 (10:32 +0200)]
dom0: add opt_dom0pvh to setup.c

Finally, the last patch in the series to enable creation of a pvh dom0.
A pvh dom0 is created by adding dom0pvh to the xen command line in grub.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
10 years agopvh dom0: allow get_pg_owner for translated domains if pvh
Mukesh Rathor [Mon, 2 Jun 2014 08:31:49 +0000 (10:31 +0200)]
pvh dom0: allow get_pg_owner for translated domains if pvh

When creating a PV guest, the toolstack on a pvh dom0 will call do_mmuext_op
to pin guest tables. do_mmuext_op calls get_pg_owner, which must allow
foreign mappings for pvh.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
10 years agopvh dom0: add and remove foreign pages
Mukesh Rathor [Mon, 2 Jun 2014 08:30:47 +0000 (10:30 +0200)]
pvh dom0: add and remove foreign pages

In this patch, a new function, p2m_add_foreign(), is added
to map pages from a foreign guest into dom0 for various purposes
like domU creation, running xentrace, etc... Such pages are
typed p2m_map_foreign.  Note, it is the nature of such pages
that a refcnt is held during their stay in the p2m. The
refcnt is added and released in the low level ept function
atomic_write_ept_entry. That macro is converted to a function to allow
for such refcounting, which only applies to leaf entries in the ept.
Furthermore, please note that paging/sharing is disabled if the
controlling or hardware domain is pvh. Any enabling of those features
would need to ensure refcnt are properly maintained for foreign types,
or paging/sharing is skipped for foreign types.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years agox86: correctly report max number of hypervisor leaves
Boris Ostrovsky [Mon, 2 Jun 2014 08:20:23 +0000 (10:20 +0200)]
x86: correctly report max number of hypervisor leaves

Commit def0bbd31 provided support for changing the max number of
hypervisor cpuid leaves (in leaf 0x4000xx00). It also made the
hypervisor incorrectly report this number for guests that
use the default value (i.e. don't specify leaf 0x4000xx00 in the config
file).

Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
10 years agox86, amd_ucode: flip revision numbers in printk
Aravind Gopalakrishnan [Mon, 2 Jun 2014 08:19:27 +0000 (10:19 +0200)]
x86, amd_ucode: flip revision numbers in printk

A failure would result in log message like so-
(XEN) microcode: CPU0 update from revision 0x6000637 to 0x6000626 failed
                                           ^^^^^^^^^^^^^^^^^^^^^^
The above message has the revision numbers inverted. Fix this.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
10 years agox86,mce: consolidate AMD mcheck initialization
Aravind Gopalakrishnan [Mon, 2 Jun 2014 08:18:07 +0000 (10:18 +0200)]
x86,mce: consolidate AMD mcheck initialization

amd_k8.c did a lot of common work and very little K8
specific work. So merge init functions of amd_f10.c and
amd_k8.c and move it into the common amd_mcheck_init
handler. With that done, there is not much left in either
file, so fold all code into just one file - mce_amd.c

While at it, update the comments regarding documentation
with correct URLs and revision numbers.

Also, update copyright info.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Acked-by: Christoph Egger <chegger@amazon.de>
10 years agoioreq-server: make buffered ioreq handling optional
Paul Durrant [Mon, 2 Jun 2014 08:02:25 +0000 (10:02 +0200)]
ioreq-server: make buffered ioreq handling optional

Some emulators will only register regions that require non-buffered
access. (In practice the only region that a guest uses buffered access
for today is the VGA aperture from 0xa0000-0xbffff). This patch therefore
makes allocation of the buffered ioreq page and event channel optional for
secondary ioreq servers.

If a guest attempts buffered access to an ioreq server that does not
support it, the access will be handled via the normal synchronous path.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agoioreq-server: remove p2m entries when server is enabled
Paul Durrant [Mon, 2 Jun 2014 08:01:27 +0000 (10:01 +0200)]
ioreq-server: remove p2m entries when server is enabled

For secondary servers, add a hvm op to enable/disable the server. The
server will not accept IO until it is enabled and the act of enabling
the server removes its pages from the guest p2m, thus preventing the guest
from directly mapping the pages and synthesizing ioreqs.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agoioreq-server: add support for multiple servers
Paul Durrant [Mon, 2 Jun 2014 07:40:43 +0000 (09:40 +0200)]
ioreq-server: add support for multiple servers

The previous single ioreq server that was created on demand now
becomes the default server and an API is created to allow secondary
servers, which handle specific IO ranges or PCI devices, to be added.

When the guest issues an IO the list of secondary servers is checked
for a matching IO range or PCI device. If none is found then the IO
is passed to the default server.
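
The selection logic described above might be sketched as follows. The class
and function names are hypothetical and the data layout is simplified; this
is not the hypervisor's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class IoreqServer:
    name: str
    io_ranges: list = field(default_factory=list)   # inclusive (start, end)
    pci_devices: set = field(default_factory=set)   # e.g. "00:03.0"

def select_server(secondaries, default, addr=None, pci=None):
    # Check each secondary server for a matching PCI device or IO range.
    for server in secondaries:
        if pci is not None and pci in server.pci_devices:
            return server
        if addr is not None and any(lo <= addr <= hi
                                    for lo, hi in server.io_ranges):
            return server
    # Nothing claimed the access: fall back to the default server.
    return default

default = IoreqServer("default")
nic = IoreqServer("nic-emulator", pci_devices={"00:03.0"})
uart = IoreqServer("uart-emulator", io_ranges=[(0x3F8, 0x3FF)])
secondaries = [nic, uart]
```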

Secondary servers use guest pages to communicate with emulators, in
the same way as the default server. These pages need to be in the
guest physmap otherwise there is no suitable reference that can be
queried by an emulator in order to map them. Therefore a pool of
pages in the current E820 reserved region, just below the special
pages is used. Secondary servers allocate from and free to this pool
as they are created and destroyed.

The size of the pool is currently hardcoded in the domain build at a
value of 8. This should be sufficient for now and both the location and
size of the pool can be modified in future without any need to change the
API.
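
A fixed-size pool of this kind could be modelled as below. The class name
and the base frame number are invented for the example; only the pool size
of 8 comes from the text.

```python
class IoreqPagePool:
    def __init__(self, base_gfn, nr_pages=8):
        self.base_gfn = base_gfn
        self.free = [True] * nr_pages

    def alloc(self):
        # Hand out the first free page, or None if the pool is exhausted.
        for index, is_free in enumerate(self.free):
            if is_free:
                self.free[index] = False
                return self.base_gfn + index
        return None

    def release(self, gfn):
        self.free[gfn - self.base_gfn] = True

pool = IoreqPagePool(base_gfn=0x1000)   # hypothetical location
pages = [pool.alloc() for _ in range(8)]
exhausted = pool.alloc()                 # a ninth allocation fails
pool.release(pages[3])
reused = pool.alloc()                    # the freed page is handed out again
```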

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Fix build errors in xen/xsm/dummy.c and xen/xsm/flask/hooks.c with XSM
enabled.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agohvmloader: don't use AML operations on 64-bit fields
Jan Beulich [Wed, 28 May 2014 08:57:18 +0000 (10:57 +0200)]
hvmloader: don't use AML operations on 64-bit fields

WinXP and Win2K3, while having no problem with the QWordMemory resource
(there was another one there before), don't like operations on 64-bit
fields. Split the fields d0688669 ("hvmloader: also cover PCI MMIO
ranges above 4G with UC MTRR ranges") added to 32-bit ones, handling
carry over explicitly.
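
The carry handling amounts to the usual split of a 64-bit addition into two
32-bit halves. The Python sketch below only illustrates the arithmetic; it
is not a translation of the ASL in the patch.

```python
MASK32 = 0xFFFFFFFF

def split64(value):
    # Return the (low, high) 32-bit halves of a 64-bit value.
    return value & MASK32, (value >> 32) & MASK32

def add64_via_32(a, b):
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo = a_lo + b_lo
    carry = lo >> 32                 # 1 if the low halves overflowed
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo
```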

Sadly the constructs needed to create the sub-fields - nominally

    CreateDWordField(PRT0, \_SB.PCI0._CRS._Y02._MIN, MINL)
    CreateDWordField(PRT0, Add(\_SB.PCI0._CRS._Y02._MIN, 4), MINH)

- can't be used: The former gets warned upon by newer iasl, i.e. would
need to be replaced by the latter just with the addend changed to 0,
and the latter doesn't translate properly with recent iasl. Hence,
short of having an ASL/iasl expert at hand, we need to work around the
shortcomings of various iasl versions. See the code comment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotimers: set the deadline more accurately
Ross Lagerwall [Wed, 28 May 2014 08:07:50 +0000 (10:07 +0200)]
timers: set the deadline more accurately

Program the timer to the deadline of the closest timer if it is further
than 50us ahead, otherwise set it 50us ahead.  This way a single event
fires on time rather than 50us late (as it would have previously) while
still preventing too many timer wakeups in the case of having many
timers scheduled close together.

(where 50us is the timer_slop)
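
In other words, the programmed deadline is the later of the closest timer
and "now plus slop". A minimal sketch of that rule, with an invented
function name and times in nanoseconds:

```python
TIMER_SLOP_NS = 50_000   # 50us, the timer_slop quoted above

def program_deadline(now_ns, expiries_ns, slop_ns=TIMER_SLOP_NS):
    if not expiries_ns:
        return None                      # nothing to program
    closest = min(expiries_ns)
    # Never program the hardware closer than slop_ns ahead of now.
    return max(closest, now_ns + slop_ns)

single = program_deadline(0, [200_000])           # fires exactly on time
clustered = program_deadline(0, [10_000, 20_000])  # batched at now + slop
```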

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
10 years agocommon/domain: do not rely on the assumption that guest_type_pv has the value 0
Andrew Cooper [Wed, 28 May 2014 07:51:46 +0000 (09:51 +0200)]
common/domain: do not rely on the assumption that guest_type_pv has the value 0

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: don't use VA for cache flush when also flushing TLB
Jan Beulich [Wed, 28 May 2014 07:51:07 +0000 (09:51 +0200)]
x86: don't use VA for cache flush when also flushing TLB

Doing both flushes at once is a strong indication that the address
mapping has either been dropped (in which case the cache flush,
when done via INVLPG, would fault) or had its physical address
changed (in which case the cache flush would end up being done on the
wrong address range). There is no adverse effect (other than the
obvious performance one) using WBINVD in this case regardless of the
range's size; only map_pages_to_xen() uses combined flushes at present.

This problem was observed with the 2nd try backport of d6cb14b3 ("VT-d:
suppress UR signaling for desktop chipsets") to 4.2 (where ioremap()
needs to be replaced with set_fixmap_nocache(); the now commented out
__set_fixmap(, 0, 0) there to undo the mapping resulted in the first of
the above two scenarios).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agoAMD IOMMU: don't free page table prematurely
Jan Beulich [Wed, 28 May 2014 07:50:33 +0000 (09:50 +0200)]
AMD IOMMU: don't free page table prematurely

iommu_merge_pages() still wants to look at the next level page table,
the TLB flush necessary before freeing too happens in that function,
and if it fails no free should happen at all. Hence the freeing must
be done after that function returned successfully, not before it's
being called.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
10 years agox86: fix setup of PVH Dom0 memory map
Roger Pau Monné [Wed, 28 May 2014 07:48:56 +0000 (09:48 +0200)]
x86: fix setup of PVH Dom0 memory map

This patch adds the holes removed by MMIO regions to the end of the
memory map for PVH Dom0, so the guest OS doesn't have to manually
populate this memory.

Also, provide a suitable e820 memory map for PVH Dom0, that matches
the underlying p2m map. This means that PVH guests should always use
XENMEM_memory_map in order to obtain the e820, even when running as
Dom0.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agoVT-d: fix mask applied to DMIBAR in desktop chipset XSA-59 workaround
Jan Beulich [Mon, 26 May 2014 10:28:46 +0000 (12:28 +0200)]
VT-d: fix mask applied to DMIBAR in desktop chipset XSA-59 workaround

In commit  ("VT-d: suppress UR signaling for desktop chipsets")
the mask applied to the value read from DMIBAR is too narrow; only the
comment accompanying it was correct. Fix that and tag the literal
number as "long" at once to avoid eventual compiler warnings.

The widest possible value so far is 39 bits; all chipsets covered here
but having less than this number of bits have the remaining bits marked
reserved (zero), and hence there's no need for making the mask chipset
specific.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
10 years agoCoverity ID: 1215178
Paul Durrant [Mon, 26 May 2014 10:27:51 +0000 (12:27 +0200)]
Coverity ID: 1215178

There are two problems with initialization of the ioreq_t in hvmemul_do_io():

- vp_eport is uninitialized (because it doesn't need to be) but because the
  struct is the subject of a copy in hvm_send_assist_req(), this is flagged
  as a problem.
- dir, addr, data_is_ptr, and data may be uninitialized when the struct is
  passed to hvmtrace_io_assist(). This is clearly a bug, so the
  initialization of at least those fields needs to be moved earlier.

This patch fixes both these problems.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

10 years agoACPI/ERST: fix table mapping
Jan Beulich [Mon, 26 May 2014 10:25:01 +0000 (12:25 +0200)]
ACPI/ERST: fix table mapping

acpi_get_table(), when executed before reaching SYS_STATE_active, will
return a mapping valid only until the next invocation of that function.
Consequently storing the returned pointer for later use is incorrect.
Copy the logic used in VT-d's DMAR handling.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agolibxl: Reset toolstack_save file position in libxl
Jason Andryuk [Mon, 19 May 2014 18:36:37 +0000 (14:36 -0400)]
libxl: Reset toolstack_save file position in libxl

toolstack_save data is written to a temporary file in libxl and read
back in libxl-save-helper.  The file position must be reset prior to
reading the file, which is done in libxl-save-helper with lseek.

lseek is unsupported for pipes and sockets, so a wrapper passing such an
fd to libxl-save-helper fails the lseek.  Moving the lseek to libxl
avoids the error, allowing the save to continue.
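
The underlying behaviour is easy to reproduce on a POSIX system: seeking on
a regular file works, while seeking on a pipe fails with ESPIPE. The helper
name below is made up for this illustration.

```python
import errno
import os
import tempfile

def try_rewind(fd):
    """Return True if fd could be rewound, False if it is not seekable."""
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:    # pipes and sockets cannot seek
            return False
        raise

with tempfile.TemporaryFile() as tmp:
    tmp.write(b"toolstack data")
    tmp.flush()
    file_ok = try_rewind(tmp.fileno())

read_end, write_end = os.pipe()
pipe_ok = try_rewind(read_end)
os.close(read_end)
os.close(write_end)
```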

Signed-off-by: Jason Andryuk <andryuk@aero.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
10 years agohvmloader: fix build with certain iasl versions
Jan Beulich [Thu, 22 May 2014 12:20:19 +0000 (14:20 +0200)]
hvmloader: fix build with certain iasl versions

While most of them support what we have now, Wheezy's dislikes the
empty range. Put a fake one in place - it's getting overwritten upon
evaluation of _CRS anyway.

The range could be grown (downwards) if necessary; the way it is now
it is
- the highest possible one below the 36-bit boundary (with 36 bits
  being the lowest common denominator for all supported systems),
- the smallest possible one that said iasl accepts.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agohvmloader: PA range 0xfc000000-0xffffffff should be UC
Jan Beulich [Wed, 21 May 2014 16:14:04 +0000 (18:14 +0200)]
hvmloader: PA range 0xfc000000-0xffffffff should be UC

Rather than leaving the range from PCI_MEM_END (0xfc000000) to 4G
uncovered, we should include this in the UC range created for the (low)
PCI range. Besides being more correct, this also has the advantage that
with the way pci_setup() currently works the range will always be
mappable with a single variable range MTRR (rather than from 2 to 5
depending on how much the lower boundary gets shifted down to
accommodate all devices).
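
The MTRR counts can be checked with a small decomposition routine. A
variable range MTRR covers a power-of-two sized region aligned to its size;
the lower boundaries tried below are assumed values for illustration, not
taken from pci_setup().

```python
def mtrr_ranges(base, end):
    """Split [base, end) into power-of-two sized, size-aligned chunks."""
    ranges = []
    while base < end:
        # Largest size permitted by the alignment of base...
        size = (base & -base) if base else 1 << end.bit_length()
        # ...shrunk until it fits in what is left.
        while size > end - base:
            size >>= 1
        ranges.append((base, size))
        base += size
    return ranges

PCI_MEM_END = 0xFC000000
FOUR_GB = 1 << 32

# Assumed candidate lower boundaries, for illustration only.
for low in (0xF0000000, 0xE0000000, 0xC0000000, 0x80000000):
    print(hex(low),
          len(mtrr_ranges(low, PCI_MEM_END)),
          len(mtrr_ranges(low, FOUR_GB)))
```

For these boundaries the range ending at PCI_MEM_END needs 2, 3, 4 and 5
MTRRs respectively, while the range ending at 4G needs one in every case.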

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agohvmloader: also cover PCI MMIO ranges above 4G with UC MTRR ranges
Jan Beulich [Wed, 21 May 2014 16:13:36 +0000 (18:13 +0200)]
hvmloader: also cover PCI MMIO ranges above 4G with UC MTRR ranges

When adding support for BAR assignments to addresses above 4G, the MTRR
side of things was left out.

Additionally the MMIO ranges in the DSDT's \_SB.PCI0._CRS were having
memory types not matching the ones put into MTRRs: The legacy VGA range
is supposed to be WC, and the other ones should be UC.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agohotplug/linux: Fix the vif script to handle_iptable for tap interfaces
Sylvain Munaut [Tue, 20 May 2014 14:56:43 +0000 (16:56 +0200)]
hotplug/linux: Fix the vif script to handle_iptable for tap interfaces

The TAP interfaces need the same iptables rules as the VIF; without them,
traffic will not be forwarded to/from them if the default FORWARD policy
is DROP/REJECT.

Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen: iommu: Define PAGE_{SHIFT, SIZE, ALIGN, MASK}_64K
Julien Grall [Mon, 19 May 2014 16:23:58 +0000 (17:23 +0100)]
xen: iommu: Define PAGE_{SHIFT, SIZE, ALIGN, MASK}_64K

Also add IOMMU_PAGE_* helper macros to help creating PAGE_* defines.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agoxen/arm: p2m: Clean cache PT when the IOMMU doesn't support coherent walk
Julien Grall [Mon, 19 May 2014 16:23:57 +0000 (17:23 +0100)]
xen/arm: p2m: Clean cache PT when the IOMMU doesn't support coherent walk

Some IOMMUs don't support coherent PT walk. When the p2m is shared with
the CPU, Xen has to make sure the PT changes have reached the memory.

Introduce a new IOMMU function that checks whether an IOMMU feature is
enabled for a specified domain.

On ARM, the platform can contain multiple IOMMUs. They may not all
have the same set of features. The domain parameter will be used to get the
set of features for the IOMMUs used by this domain.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxc: check return values on mmap() and madvise() on xc_alloc_hypercall_buffer()
Luis R. Rodriguez [Tue, 20 May 2014 12:37:35 +0000 (05:37 -0700)]
libxc: check return values on mmap() and madvise() on xc_alloc_hypercall_buffer()

On a Thinkpad T4440p with OpenSUSE tumbleweed with v3.15-rc4
and today's latest xen tip from the git tree strace -f reveals
we end up on a never ending wait shortly after

write(20, "backend/console/5\0", 18 <unfinished ...>

This is right before we just wait on the qemu process which we
had mmap'd for. Without this you'll end up getting stuck in a
loop if mmap() worked but madvise() did not. While at it I noticed
even the mmap() failure was not being checked; fix that too.

Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agodocs/man/xl.cfg.pod.5: add a missing new line and remove some redundant ones
Zhigang Wang [Tue, 20 May 2014 17:44:25 +0000 (13:44 -0400)]
docs/man/xl.cfg.pod.5: add a missing new line and remove some redundant ones

Without a new line after the `pvh` item, the generated HTML is wrong.

Signed-off-by: Zhigang Wang <zhigang.x.wang@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxc: Protect xc_domain_resume from clobbering domain registers
Jason Andryuk [Tue, 20 May 2014 13:37:08 +0000 (09:37 -0400)]
libxc: Protect xc_domain_resume from clobbering domain registers

xc_domain_resume() expects the guest to be in state SHUTDOWN_suspend.
However, nothing verifies the state before modify_returncode() modifies
the domain's registers.  This will crash guest processes or the kernel
itself.

This can be demonstrated with `LIBXL_SAVE_HELPER=/bin/false xl migrate`.

Signed-off-by: Jason Andryuk <andryuk@aero.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: IRQ: Handle multiple action per IRQ
Julien Grall [Fri, 16 May 2014 14:40:32 +0000 (15:40 +0100)]
xen/arm: IRQ: Handle multiple action per IRQ

On ARM, multiple handlers may need to be set up for the same interrupt
(e.g. for the ARM SMMU).

To be able to use multiple actions, the driver has to explicitly call
{setup,request}_irq with IRQF_SHARED as 2nd parameter.

The behavior stays the same on x86, i.e. only one action is handled.
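
A simplified model of sharing might look like the sketch below. It is not
the Xen implementation: the flag value, the class and the argument order are
illustrative, and the rule that every user must pass IRQF_SHARED is an
assumption of the sketch.

```python
import errno

IRQF_SHARED = 1 << 0   # illustrative value, not necessarily Xen's definition

class IrqDesc:
    def __init__(self):
        self.actions = []            # list of (handler, irqflags, dev_id)

    def setup_irq(self, irqflags, handler, dev_id):
        if self.actions:
            # Sharing is only allowed if every user asked for it.
            all_shared = all(f & IRQF_SHARED for _, f, _ in self.actions)
            if not (all_shared and irqflags & IRQF_SHARED):
                return -errno.EBUSY
        self.actions.append((handler, irqflags, dev_id))
        return 0

    def release_irq(self, dev_id):
        # dev_id identifies which of the shared actions to remove.
        self.actions = [a for a in self.actions if a[2] != dev_id]

    def handle(self):
        # Run every registered action for this interrupt.
        return [handler(dev_id) for handler, _, dev_id in self.actions]

desc = IrqDesc()
first = desc.setup_irq(IRQF_SHARED, lambda dev: dev, "smmu0")
second = desc.setup_irq(IRQF_SHARED, lambda dev: dev, "smmu1")
refused = desc.setup_irq(0, lambda dev: dev, "uart")
```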

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: IRQ: extend {request, setup}_irq to take an irqflags in parameter
Julien Grall [Fri, 16 May 2014 14:40:31 +0000 (15:40 +0100)]
xen/arm: IRQ: extend {request, setup}_irq to take an irqflags in parameter

The irqflags will be used later on ARM to know if we can share the IRQ or not.

On x86, the irqflags should always be 0.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>
10 years agoxen: IRQ: Add dev_id parameter to release_irq
Julien Grall [Fri, 16 May 2014 14:40:30 +0000 (15:40 +0100)]
xen: IRQ: Add dev_id parameter to release_irq

The new parameter (dev_id) will be used on ARM to release the right
action when support for multiple actions is added.

Even if this function is declared in common code, no one is using it. So it's
safe to modify the prototype also for x86.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Replace route_guest_dt_irq by route_guest_irq
Julien Grall [Fri, 16 May 2014 14:40:29 +0000 (15:40 +0100)]
xen/arm: Replace route_guest_dt_irq by route_guest_irq

We can use platform_get_irq to get the IRQ which will be routed to the guest.

platform_get_irq will store the type of IRQ (e.g. level/edge...) directly in
the irq_desc.

This avoids having a device tree specific routing function.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: IRQ: Replace {request, setup}_dt_irq by {request, setup}_irq
Julien Grall [Fri, 16 May 2014 14:40:28 +0000 (15:40 +0100)]
xen/arm: IRQ: Replace {request, setup}_dt_irq by {request, setup}_irq

Now that irq_desc stores the type of the IRQ (e.g. level/edge,...), we don't
need to use specific IRQ functions for ARM.

Also replace every call to dt_device_get_irq by platform_get_irq, which is
a wrapper around this function and sets up the IRQ type correctly.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Keir Fraser <keir@xen.org>
10 years agoxen/arm: IRQ: Store IRQ type in arch_irq_desc
Julien Grall [Fri, 16 May 2014 14:40:27 +0000 (15:40 +0100)]
xen/arm: IRQ: Store IRQ type in arch_irq_desc

For now, ARM uses different IRQ functions to set up an interrupt handler. This
is a bit annoying for common drivers because we have to add ifdefery when
an IRQ is set up (see ns16550_init_postirq for an example).

To avoid completely forking the IRQ management code, we can introduce a field
to store the IRQ type (e.g. level/edge ...).

This patch also adds platform_get_irq which will retrieve the IRQ from the
device tree and set up the IRQ type correctly.

In order to use this solution, we have to move init_IRQ earlier for the boot
CPU. It's fine because the code only depends on percpu.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl_json: allow basic JSON type objects generation
Wei Liu [Tue, 13 May 2014 21:53:59 +0000 (22:53 +0100)]
libxl_json: allow basic JSON type objects generation

The original logic is that basic JSON types (number, string and null)
must be elements of a JSON map or array. This assumption doesn't hold
true anymore when we need to return basic JSON types.

Returning basic JSON types is required for parsing number, string and
null objects back into libxl__json_object.
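
As an analogy only (this uses Python's json module, not libxl or yajl), a
bare number, string or null is a complete JSON text on its own and can be
parsed and regenerated without an enclosing map or array:

```python
import json

# Each input is a top-level basic JSON value, not wrapped in a container.
texts = ("42", '"xen"', "null")
values = [json.loads(text) for text in texts]
round_trip = [json.dumps(value) for value in values]
```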

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl_internal.h: introduce libxl__json_object_get_number
Wei Liu [Tue, 13 May 2014 21:53:57 +0000 (22:53 +0100)]
libxl_internal.h: introduce libxl__json_object_get_number

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl_internal.h: introduce libxl__json_object_is_{null, number, double}
Wei Liu [Tue, 13 May 2014 21:53:56 +0000 (22:53 +0100)]
libxl_internal.h: introduce libxl__json_object_is_{null, number, double}

... which return true if json object is valid and of type
JSON_{NULL,NUMBER,DOUBLE}.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>