xenbits.xensource.com Git - xen.git/log
xen.git
9 years agoVT-d: add iommu=igfx option to workaround graphics issues
Ting-Wei Lan [Wed, 5 Aug 2015 17:10:06 +0000 (01:10 +0800)]
VT-d: add iommu=igfx option to workaround graphics issues

When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
devices, it is possible to encounter graphics issues that make the screen
unreadable or crash the system. It was reported in freedesktop bugzilla:

https://bugs.freedesktop.org/show_bug.cgi?id=90037

As we still cannot find a proper fix for this problem, this patch adds
an iommu=igfx option to control whether the IOMMU is enabled for Intel
graphics. Running Xen with iommu=no-igfx is similar to running Linux with
intel_iommu=igfx_off, which disables the IOMMU for the Intel GPU. Users
can use this to work around the problem manually until a fix is
available for the i915 driver.

Signed-off-by: Ting-Wei Lan <lantw44@gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years agotools/libxl: Drop all legacy "toolstack" record infrastructure
Andrew Cooper [Tue, 4 Aug 2015 17:16:36 +0000 (18:16 +0100)]
tools/libxl: Drop all legacy "toolstack" record infrastructure

No functional change.  It is not used any more.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxl/save&restore&convert: Switch to new EMULATOR_XENSTORE_DATA records
Andrew Cooper [Tue, 4 Aug 2015 17:16:35 +0000 (18:16 +0100)]
libxl/save&restore&convert: Switch to new EMULATOR_XENSTORE_DATA records

Read and write "toolstack" information using the new
EMULATOR_XENSTORE_DATA record, and have the conversion script take care
of the old format.

The entire libxc and libxl migration v2 streams are now bitness-neutral
in their records.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agotools/libxl: Prepare to write multiple records with EMULATOR headers
Andrew Cooper [Tue, 4 Aug 2015 17:16:34 +0000 (18:16 +0100)]
tools/libxl: Prepare to write multiple records with EMULATOR headers

With the newly specified EMULATOR_XENSTORE_DATA record, there are two
libxl records with an emulator subheader.  Refactor the existing code to
make future additions easier, and rename some functions for consistency
with the new scheme.

* Calculate the subheader at stream start time, rather than on the fly.
  Its contents are not going to change.
* Introduce a new setup_emulator_write() to insert a sub header in the
  appropriate place before a blob of data.
* Rename *toolstack_* to *emulator_xenstore_*
* Rename *emulator_* to *emulator_context_*

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
9 years agotools/libxl: Save and restore EMULATOR_XENSTORE_DATA content
Andrew Cooper [Tue, 4 Aug 2015 17:16:33 +0000 (18:16 +0100)]
tools/libxl: Save and restore EMULATOR_XENSTORE_DATA content

The new EMULATOR_XENSTORE_DATA content is a sequence of NUL terminated
key/value strings, with the key relative to the device model's xenstore
tree.

A sample might look like (as decoded by verify-stream-v2):

    Emulator Xenstore Data (Qemu Upstream, idx 0)
      'physmap/1f00000/start_addr' = 'f0000000'
      'physmap/1f00000/size' = '800000'
      'physmap/1f00000/name' = 'vga.vram'

This patch introduces libxl helpers to save and restore this new format,
which reimplement the existing libxl__toolstack_{save,restore}() logic.
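
As a rough illustration of the payload described above (not the libxl C
helpers themselves), the following Python sketch packs and unpacks a
sequence of NUL terminated key/value strings.  The function names are
hypothetical, and the emulator subheader that precedes the payload in the
real record is deliberately left out.

```python
def pack_xenstore_data(pairs):
    """Serialise (key, value) pairs as consecutive NUL terminated strings."""
    out = bytearray()
    for key, value in pairs:
        for s in (key, value):
            if "\0" in s:
                raise ValueError("embedded NUL in %r" % s)
            out += s.encode("ascii") + b"\0"
    return bytes(out)


def unpack_xenstore_data(blob):
    """Inverse of pack_xenstore_data(); rejects malformed payloads."""
    if not blob:
        return []
    if blob[-1:] != b"\0":
        raise ValueError("payload is not NUL terminated")
    fields = blob[:-1].split(b"\0")
    if len(fields) % 2:
        raise ValueError("key without a matching value")
    strings = iter(f.decode("ascii") for f in fields)
    # Consume the iterator two items at a time: (key, value), (key, value), ...
    return list(zip(strings, strings))
```

Because every string carries its own terminator and no lengths or pointers
are stored, a payload of this shape reads the same on 32bit and 64bit
toolstacks.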

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agodocs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA
Andrew Cooper [Tue, 4 Aug 2015 17:16:32 +0000 (18:16 +0100)]
docs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA

The legacy "toolstack" record as implemented in libxl turns out not to
be 32/64bit safe.  As migration v2 has not shipped yet, take this
opportunity to adjust the specification and fix the incompatibility.

Libxl shall lose all knowledge of the old "toolstack" blob and use this
EMULATOR_XENSTORE_DATA record instead.  Compatibility shall be handled
by the legacy conversion script.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agotools/libxl: Make libxl__conversion_helper_abort() safe to use
Andrew Cooper [Tue, 4 Aug 2015 17:16:31 +0000 (18:16 +0100)]
tools/libxl: Make libxl__conversion_helper_abort() safe to use

Previously, in the case of an error causing a call to
libxl__conversion_helper_abort() on a stream without legacy conversion,
libxl would fall over a NULL pointer because chs->ao was not set up.

Arrange for all ->ao's to be set up at _init() time, by having each
_init() function assert that their caller has done the right thing.
While doing so, introduce a previously-missing save_helper_init() in
stream_read_init().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
9 years agolibxl: increase hotplug timeout to 40s
Roger Pau Monne [Tue, 4 Aug 2015 10:02:55 +0000 (12:02 +0200)]
libxl: increase hotplug timeout to 40s

The default libxl timeout for hotplug script execution is too low: when
launching 40 HVM guests in parallel, all using the same file as disk,
execution times of ~20s are expected. Increase the timeout to 40s so that
hotplug scripts have enough time to execute.

This is a short term solution.

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/gdt: Drop write-only, xalloc()'d array from set_gdt()
Andrew Cooper [Mon, 3 Aug 2015 17:05:43 +0000 (18:05 +0100)]
x86/gdt: Drop write-only, xalloc()'d array from set_gdt()

It is not used, and can cause a spurious failure of the set_gdt() hypercall in
low memory situations.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years agolibxl: remove stray declaration of libxl__hotplug_settings
Wei Liu [Tue, 4 Aug 2015 10:16:32 +0000 (11:16 +0100)]
libxl: remove stray declaration of libxl__hotplug_settings

That function was removed in 2ba368d1 ("libxl: Remove linux udev rules")

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agox86/hvm: don't rely on shared ioreq state for completion handling
Paul Durrant [Fri, 31 Jul 2015 15:34:22 +0000 (16:34 +0100)]
x86/hvm: don't rely on shared ioreq state for completion handling

Both hvm_io_pending() and hvm_wait_for_io() use the shared (with emulator)
ioreq structure to determine whether there is a pending I/O. The latter will
misbehave if the shared state is driven to STATE_IOREQ_NONE by the emulator,
or when the shared ioreq page is cleared for re-insertion into the guest
P2M when the ioreq server is disabled (STATE_IOREQ_NONE == 0) because it
will terminate its wait without calling hvm_io_assist() to adjust Xen's
internal I/O emulation state. This may then lead to an io completion
handler finding incorrect internal emulation state and calling
domain_crash().

This patch fixes the problem by adding a pending flag to the ioreq server's
per-vcpu structure which cannot be directly manipulated by the emulator
and thus can be used to determine whether an I/O is actually pending for
that vcpu on that ioreq server. If an I/O is pending and the shared state
is seen to go to STATE_IOREQ_NONE then it can be treated as an abnormal
completion of emulation (hence the data placed in the shared structure
is not used) and the internal state is adjusted as for a normal completion.
Thus, when a completion handler subsequently runs, the internal state is as
expected and domain_crash() will not be called.
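
The idea can be sketched in Python as below.  This is a simplified model,
not the hypervisor code: the class and method names are hypothetical, and
only STATE_IOREQ_NONE == 0 is taken from the text above (the other state
values are included for illustration).

```python
STATE_IOREQ_NONE = 0        # value stated in the commit message
STATE_IOREQ_READY = 1       # remaining values are illustrative
STATE_IOREQ_INPROCESS = 2
STATE_IORESP_READY = 3


class SharedIoreq:
    """Stands in for the page shared with the emulator (untrusted)."""

    def __init__(self):
        self.state = STATE_IOREQ_NONE
        self.data = 0


class VcpuIoreqState:
    """Private per-vcpu state which the emulator cannot touch."""

    def __init__(self):
        self.pending = False
        self.completed = []     # (kind, data) for each completion seen

    def send(self, shared):
        shared.state = STATE_IOREQ_READY
        self.pending = True

    def poll(self, shared):
        """Return True once no I/O is pending for this vcpu."""
        if not self.pending:
            return True
        if shared.state == STATE_IORESP_READY:
            self._complete("normal", shared.data)
            shared.state = STATE_IOREQ_NONE
            return True
        if shared.state == STATE_IOREQ_NONE:
            # Abnormal completion: the shared data is not trusted or used,
            # but the internal state is still brought up to date.
            self._complete("abnormal", None)
            return True
        return False            # emulator is still working on it

    def _complete(self, kind, data):
        self.pending = False
        self.completed.append((kind, data))
```

The decision about whether an I/O is outstanding rests on the private
`pending` flag; the shared state is only consulted to learn how the I/O
ended.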

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoxl/libxl: disable PV vNUMA
Wei Liu [Thu, 30 Jul 2015 16:11:29 +0000 (17:11 +0100)]
xl/libxl: disable PV vNUMA

Update xl manual and disable PV vNUMA in libxl.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoMerge branch 'fsf-address-v1' of git://xenbits.xen.org/people/ianc/xen into staging
Ian Campbell [Thu, 30 Jul 2015 14:31:18 +0000 (15:31 +0100)]
Merge branch 'fsf-address-v1' of git://xenbits.xen.org/people/ianc/xen into staging

9 years agobuild: use correct qemu path in systemd service file and init script
Ting-Wei Lan [Thu, 30 Jul 2015 06:51:10 +0000 (14:51 +0800)]
build: use correct qemu path in systemd service file and init script

When --with-system-qemu is used, it is possible that we cannot find
qemu-system-i386 in LIBEXEC_BIN, which can cause errors in the xencommons
init script and the xen-qemu-dom0-disk-backend.service systemd service.

Signed-off-by: Ting-Wei Lan <lantw44@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agox86/hvm.c: Don't tear down altp2m state if it was never set up
Ravi Sahita [Wed, 29 Jul 2015 16:39:22 +0000 (09:39 -0700)]
x86/hvm.c: Don't tear down altp2m state if it was never set up

Reported-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Wei Liu <wei.liu2@citrix.com>
[ ijc -- replacement subject from Andy ]

9 years agox86/p2m.c: fix missed off-by-one in altp2m commit
Ravi Sahita [Wed, 29 Jul 2015 16:40:06 +0000 (09:40 -0700)]
x86/p2m.c: fix missed off-by-one in altp2m commit

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoQEMU_TAG update
Ian Jackson [Wed, 29 Jul 2015 15:32:48 +0000 (16:32 +0100)]
QEMU_TAG update

9 years agolibxlu: properly free buffer in PCI related functions
Wei Liu [Tue, 28 Jul 2015 16:23:56 +0000 (17:23 +0100)]
libxlu: properly free buffer in PCI related functions

Free buffer in both success and failure paths.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxl: Assert that libxl__ao_inprogress_gc() is not called with NULL
Andrew Cooper [Tue, 28 Jul 2015 21:44:37 +0000 (22:44 +0100)]
tools/libxl: Assert that libxl__ao_inprogress_gc() is not called with NULL

libxl__ao_inprogress_gc() is hidden behind various macros used to
construct local variables.  Assert() that NULL is not passed, to make
such an error very obvious, rather than a plain segfault at 0.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools/libxl: Only continue stream operations if the stream is still in progress
Andrew Cooper [Tue, 28 Jul 2015 21:44:36 +0000 (22:44 +0100)]
tools/libxl: Only continue stream operations if the stream is still in progress

Part of the callback contract with check_all_finished() is that each
running parallel task shall call it exactly once.

Previously, it was possible for stream_continue() or
write_toolstack_record() to fail and call into check_all_finished().  As
the save helpers callback has fired, it no longer counts as in use,
which causes check_all_finished() to fire the stream callback.  Then,
unwinding the stack back and calling check_all_finished() a second time
results in the same conditions being observed, and the stream callback
being fired a second time.

To avoid this, check_all_finished() is called before any other actions
which continue the stream functionality, and the stream is only
continued if it has not been torn down.  This guarantees not to continue
stream operations if the stream does not owe a callback to
check_all_finished().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
9 years agoReplace FSF street address with canonical URL
Ian Campbell [Wed, 29 Jul 2015 10:00:36 +0000 (11:00 +0100)]
Replace FSF street address with canonical URL

As recommended in http://www.gnu.org/licenses/gpl-howto.en.html.

This is the result of:
$ git grep -El Mass\|Temple\|Franklin | xargs ./fsf.pl

Where fsf.pl is:
    #!/usr/bin/perl -w -pi.bak -0777
    my $repl = 'If not, see <http://www.gnu.org/licenses/>.';
    my $br = qr/(?:\s*\n\s*(?:[\*\#]|\/\/|\.\\" )?\s*|\s+)/;

    my $inwt = qr/[Ii]f${br}not,${br}write${br}(?:to${br})?the${br}Free${br}Software${br}Foundation,(?:${br}Inc\.,)?/;

    my $mass = qr/675${br}Mass${br}Ave,?${br}Cambridge,?${br}MA${br}02139,?${br}USA,?\.?/;
    my $franklin = qr/51${br}Franklin${br}St(?:reet)?(?:,${br}| - )Fifth${br}Floor,?${br}Boston,?${br}MA,?${br}02110-1301,?${br}USA,?\.?/;
    my $temple = qr/59${br}Temple${br}Place(?:,${br}| - )Suite${br}330,?${br}Boston,?${br}MA,?${br}021110?-1307,?${br}USA,?\.?/;

    s|$inwt$br$mass|$repl|m;
    s|$inwt$br$franklin|$repl|m;
    s|$inwt$br$temple|$repl|m;
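
For readers less familiar with Perl, here is a hedged Python port of just
the Franklin Street substitution.  It is an illustration of how the
patterns compose, not the script that was actually run; the function name
strip_franklin() is hypothetical.

```python
import re

REPL = "If not, see <http://www.gnu.org/licenses/>."

# A "break": either a line break, optionally followed by a comment leader
# ('*', '#', '//' or the troff '.\" '), or a plain run of whitespace.
BR = r'(?:\s*\n\s*(?:[*#]|//|\.\\" )?\s*|\s+)'

# "if not, write to the Free Software Foundation, Inc.,"
INWT = (rf"[Ii]f{BR}not,{BR}write{BR}(?:to{BR})?the{BR}Free{BR}Software"
        rf"{BR}Foundation,(?:{BR}Inc\.,)?")

FRANKLIN = (rf"51{BR}Franklin{BR}St(?:reet)?(?:,{BR}| - )Fifth{BR}Floor,?"
            rf"{BR}Boston,?{BR}MA,?{BR}02110-1301,?{BR}USA,?\.?")

PATTERN = re.compile(INWT + BR + FRANKLIN)


def strip_franklin(text):
    """Replace the first Franklin Street notice, like s||| without /g."""
    return PATTERN.sub(REPL, text, count=1)
```

Note that the replacement starts with a capital "If" even when the matched
text followed a semicolon, which mirrors what the Perl script does.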

The only remaining mentions of these addresses are in COPYING files which I
haven't touched.

Some of the changed files are imports from elsewhere. However,
filtering them out is tricky, and I think it is tolerable to have these
files be modified here and then perhaps reverted on the next sync,
since it's only 1-2 lines and obvious what is going on.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years agostubdom: Replace FSF street address with canonical URL in patches
Ian Campbell [Wed, 29 Jul 2015 09:09:47 +0000 (10:09 +0100)]
stubdom: Replace FSF street address with canonical URL in patches

Do these ones manually since the diff header needs fixup too.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxl: Do not fire the stream callback multiple times
Andrew Cooper [Mon, 27 Jul 2015 16:47:26 +0000 (17:47 +0100)]
tools/libxl: Do not fire the stream callback multiple times

Avoid stacking of check_all_finished() via synchronous teardown of
tasks.  If the _abort() functions call back synchronously,
stream->completion_callback() ends up getting called twice, as first
and last check_all_finished() frames observe each task being finished.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools/libxl: Do not set stream->rc in stream_complete()
Andrew Cooper [Mon, 27 Jul 2015 16:47:25 +0000 (17:47 +0100)]
tools/libxl: Do not set stream->rc in stream_complete()

Only ever set stream->rc in check_all_finished().  The first version of
the migration v2 series had separate rc and joined_rc parameters, where
this logic worked.  However when combining the two, the teardown path
fails to trigger if stream_complete() records stream->rc itself.  A side
effect of this is that stream_done() needs to take an rc parameter.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools/hvmloader: sync memory map[]
Tiejun Chen [Tue, 28 Jul 2015 07:27:57 +0000 (15:27 +0800)]
tools/hvmloader: sync memory map[]

Currently we always use memory map[] to help hvmloader construct the e820
table, but hvmloader may have relocated RAM to support MMIO allocation, or
populated RAM to ensure there is enough room to load OVMF. In either case
we need to sync these changes into memory map[].

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoblkif.h: document physical-device node
Wei Liu [Tue, 7 Jul 2015 15:48:52 +0000 (16:48 +0100)]
blkif.h: document physical-device node

This node is used by toolstack (libxl, hotplug script) and blkback.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
9 years agolibxl: remove dead code libxl__domain_shutdown_reason
Wei Liu [Mon, 27 Jul 2015 17:45:09 +0000 (18:45 +0100)]
libxl: remove dead code libxl__domain_shutdown_reason

There is no user in tree.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agopython/xc: reinstate original implementation of next_bdf
Wei Liu [Mon, 27 Jul 2015 17:45:08 +0000 (18:45 +0100)]
python/xc: reinstate original implementation of next_bdf

I missed the fact that next_bdf is used to parse user-supplied
strings when reviewing. The user-supplied string is a NUL-terminated,
comma-separated string, and the user can supply several PCI devices in
it. There is, however, no delimiter between different devices, hence
we can't change the syntax of that string.

This patch reinstates the original implementation of next_bdf to
preserve the original syntax. The last argument for xc_assign_device
is always 0.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxl: call libxl_dominfo_{init, dispose} in main_cpupoolnumasplit
Wei Liu [Mon, 27 Jul 2015 17:45:06 +0000 (18:45 +0100)]
xl: call libxl_dominfo_{init, dispose} in main_cpupoolnumasplit

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxl: valid fd can be 0 in main_loadpolicy
Wei Liu [Mon, 27 Jul 2015 17:45:05 +0000 (18:45 +0100)]
xl: valid fd can be 0 in main_loadpolicy

Initialise polFd to -1 beforehand to avoid closing 0 by accident.
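
The same pattern, sketched in Python rather than the C used by xl (the
function name read_policy() and the 1 MiB read limit are made up for the
example):

```python
import os


def read_policy(path):
    """Read a policy file, closing the descriptor only if it was opened."""
    pol_fd = -1                  # -1 means "not open"; 0 is a valid fd (stdin)
    try:
        pol_fd = os.open(path, os.O_RDONLY)
        return os.read(pol_fd, 1 << 20)
    finally:
        # Compare against 0 explicitly.  With a 0 initialiser, a failed
        # open followed by an unconditional close would close stdin.
        if pol_fd >= 0:
            os.close(pol_fd)
```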

Also fixed some style problems while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxl: call libxl_dominfo_init in main_list
Wei Liu [Mon, 27 Jul 2015 17:45:04 +0000 (18:45 +0100)]
xl: call libxl_dominfo_init in main_list

Always call the init and dispose functions on info_buf, even though it
is not always used.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxl: lockdir should be lockfile in error message
Wei Liu [Mon, 27 Jul 2015 17:45:03 +0000 (18:45 +0100)]
xl: lockdir should be lockfile in error message

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxl: properly clean up array in libxl_list_cpupool failure path
Wei Liu [Mon, 27 Jul 2015 17:45:02 +0000 (18:45 +0100)]
libxl: properly clean up array in libxl_list_cpupool failure path

Document how cpupool_info works.  Distinguish success (ERROR_FAIL +
ENOENT) vs failure in libxl_list_cpupool and properly clean up the array
in failure path.

Also switch to libxl__realloc and call libxl_cpupool_{init,dispose}
where appropriate.

There is a change of behaviour. Previously, if memory allocation failed,
the said function returned NULL. Now memory allocation failure is fatal.
This is in line with how we deal with memory allocation failure in other
places in libxl though.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libx{l, c}: Fix trivial Coverity defects in migration v2 code
Andrew Cooper [Mon, 20 Jul 2015 10:37:59 +0000 (11:37 +0100)]
tools/libx{l, c}: Fix trivial Coverity defects in migration v2 code

All of these are UNUSED_VALUE defects where a default value is
unconditionally overwritten.  They are not particularly interesting
bug-wise, but keeping these defects at bay helps prevent real bugs
going unnoticed in the volume.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agodocs: Migration v2 is now no longer draft
Andrew Cooper [Mon, 20 Jul 2015 10:37:58 +0000 (11:37 +0100)]
docs: Migration v2 is now no longer draft

Add further instructions to the libxc "Future Extensions" section, and
provide such a section for libxl.

In addition, drop the "In experimental __func__" IPRINTF()s from the
libxc implementations.

Finally, a correction to libxl's "Not Yet Included" section which
should have been amended in c/s 7eaec00 when libxl Remus support was
introduced into the protocol.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libx{l, c}: Drop '2' suffixes from xc_domain_{save, restore}2() functions
Andrew Cooper [Mon, 20 Jul 2015 10:37:57 +0000 (11:37 +0100)]
tools/libx{l, c}: Drop '2' suffixes from xc_domain_{save, restore}2() functions

As there is now only the one implementation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libx{l, c}: Remove XC_DEVICE_MODEL_RESTORE_FILE
Andrew Cooper [Mon, 20 Jul 2015 10:37:56 +0000 (11:37 +0100)]
tools/libx{l, c}: Remove XC_DEVICE_MODEL_RESTORE_FILE

All handling of device model files is now at the libxl level.  Remove
XC_DEVICE_MODEL_RESTORE_FILE and introduce LIBXL_DEVICE_MODEL_RESTORE_FILE in
its place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libx{l, c}: Remove the toolstack_{save, restore} callbacks
Andrew Cooper [Mon, 20 Jul 2015 10:37:55 +0000 (11:37 +0100)]
tools/libx{l, c}: Remove the toolstack_{save, restore} callbacks

Update the libxc spec to indicate more sternly that TOOLSTACK records
should no longer be used.

Also, trim further toolstack infrastructure which should have gone in
c/s 39bf4e9 "tools/libxl: Drop all knowledge of toolstack callbacks"

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxc: Remove legacy migration implementation
Andrew Cooper [Mon, 20 Jul 2015 10:37:54 +0000 (11:37 +0100)]
tools/libxc: Remove legacy migration implementation

It is no longer used.

One complication is that xc_map_m2p() has users in xc_offline_page.c,
xen-mfndump and xen-mceinj.  Move its implementation into
xc_offline_page (for want of a better location) beside its current
user.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- drop mentions of removed files from MAINTAINERS ]

9 years agolibxl: check nesthvm and altp2m in libxl
Wei Liu [Mon, 27 Jul 2015 14:01:32 +0000 (15:01 +0100)]
libxl: check nesthvm and altp2m in libxl

In ea214001 ("x86/altp2m: add altp2mhvm HVM domain parameter"), a
check was added to ensure nestedhvm and altp2m cannot be enabled at
the same time. That check was added in xl, but in fact it should be in
libxl because it should be the entity that decides whether
the provided configuration is valid.

This patch moves the check to libxl. The code snippet is moved after
calling libxl__domain_build_info_setdefault so that we can:
1. remove libxl_defbool_is_default in `if()';
2. detect mistake in libxl__domain_build_info_setdefault.
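
The ordering can be illustrated with a small Python model.  This is only a
sketch: the dictionary keys, the helper names and the choice of False as
the default are assumptions made for the example, with None standing in
for a libxl_defbool that is still "default".

```python
def build_info_setdefault(cfg):
    """Fill in every option that is still unset (None)."""
    out = dict(cfg)
    for key in ("nested_hvm", "altp2m"):
        if out.get(key) is None:
            out[key] = False        # assumed default, for illustration
    return out


def validate(cfg):
    cfg = build_info_setdefault(cfg)
    # After setdefault nothing may still be "default"; if something is,
    # the mistake is in setdefault and this makes it visible.
    assert cfg["nested_hvm"] is not None
    assert cfg["altp2m"] is not None
    # No is-default test is needed in the condition any more.
    if cfg["nested_hvm"] and cfg["altp2m"]:
        raise ValueError("nestedhvm and altp2mhvm cannot be used together")
    return cfg
```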

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxenconsole: Ensure exclusive access to console using locks
Martin Lucina [Fri, 24 Jul 2015 15:29:41 +0000 (17:29 +0200)]
xenconsole: Ensure exclusive access to console using locks

If more than one instance of xenconsole is run against the same DOMID
then each instance will only get some data. This change ensures
exclusive access to the console by obtaining an exclusive lock on
<XEN_LOCK_DIR>/xenconsole.<DOMID>.

The locking strategy used is based on
tools/libxl/libxl_internal.c:libxl__lock_domain_userdata().
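
A minimal sketch of this kind of per-domain lock file, written in Python
with flock() rather than taken from the xenconsole source.  The function
name console_lock() and the choice of a non-blocking lock are assumptions
for the example; lock_dir plays the role of <XEN_LOCK_DIR>.

```python
import fcntl
import os


def console_lock(lock_dir, domid):
    """Take an exclusive lock on <lock_dir>/xenconsole.<domid>.

    Returns the open descriptor, which must stay open for as long as the
    lock is needed.  Raises OSError if another holder already has it.
    """
    path = os.path.join(lock_dir, "xenconsole.%d" % domid)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd
```

The lock is released when the descriptor is closed, including when the
process exits, so a crashed console client does not leave a stale lock
behind.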

Signed-off-by: Martin Lucina <martin@lucina.net>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agolibxc: fix memory leak in migration v2
Wei Liu [Sun, 26 Jul 2015 21:34:54 +0000 (22:34 +0100)]
libxc: fix memory leak in migration v2

Originally there was only one counter to keep track of pages. It was
used erroneously to keep track of how many pages were mapped and how
many pages needed to be sent. In the end munmap(2) always had 0 as the
length argument, which resulted in leaking the mapping.

This problem was discovered on a 32bit toolstack because 32bit applications
have a notably smaller address space. In fact, this bug affects 64bit
toolstacks too.

Use a separate counter to keep track of the number of mapped pages to
solve this problem.
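
A toy Python model of the two counters follows.  It does not reproduce the
libxc code: the page type strings and the function name write_batch() are
invented, and only the counting is modelled, not the mapping itself.

```python
PAGE_SIZE = 4096


def write_batch(page_types):
    """Return (pages_sent, buggy_unmap_len, fixed_unmap_len)."""
    mappable = [t for t in page_types if t != "unmapped"]
    nr_pages_mapped = len(mappable)   # fixed once the mapping exists
    nr_pages = nr_pages_mapped        # pages still to deal with; counts down
    sent = 0
    for t in mappable:
        if t == "normal":
            sent += 1                 # page contents go onto the stream
        nr_pages -= 1
    # By now nr_pages has reached 0, so using it to size the unmap
    # would release nothing and leak the whole mapping.
    buggy_unmap_len = nr_pages * PAGE_SIZE
    fixed_unmap_len = nr_pages_mapped * PAGE_SIZE
    return sent, buggy_unmap_len, fixed_unmap_len
```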

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoMerge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Fri, 24 Jul 2015 13:13:21 +0000 (14:13 +0100)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

9 years agoxen-access: altp2m testcases
Tamas K Lengyel [Fri, 24 Jul 2015 11:42:24 +0000 (13:42 +0200)]
xen-access: altp2m testcases

Working altp2m test-case. Extended the test tool to support singlestepping
to better highlight the core feature of altp2m view switching.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxc: add support to altp2m hvmops
Tamas K Lengyel [Fri, 24 Jul 2015 11:42:12 +0000 (13:42 +0200)]
libxc: add support to altp2m hvmops

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agoxenconsole: Allow non-interactive use
Martin Lucina [Fri, 24 Jul 2015 11:30:48 +0000 (13:30 +0200)]
xenconsole: Allow non-interactive use

If xenconsole is run with stdin closed or redirected to /dev/null,
console_loop() will return immediately due to failure to read from
STDIN_FILENO. This patch tests if stdin and stdout are both connected to
a TTY and, if not, xenconsole will not attempt to read from stdin or
modify stdout terminal attributes.
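
The test itself is small.  In Python it could look like the sketch below
(xenconsole is written in C; the helper name is_interactive() is
hypothetical):

```python
import os


def is_interactive(stdin_fd=0, stdout_fd=1):
    """True only if both descriptors are connected to a TTY."""
    return os.isatty(stdin_fd) and os.isatty(stdout_fd)
```

A caller would skip both reading from stdin and changing terminal
attributes whenever this returns False.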

Existing behaviour when xenconsole is run from a terminal does not
change.

This allows for non-interactive use, e.g. running "xl create -c" under
systemd or piping the output of "xl console" to another command.

Signed-off-by: Martin Lucina <martin@lucina.net>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agox86/altp2m: XSM hooks for altp2m HVM ops
Ravi Sahita [Fri, 24 Jul 2015 11:39:33 +0000 (13:39 +0200)]
x86/altp2m: XSM hooks for altp2m HVM ops

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Drop now bogus conditional expression from xsm_hvm_altp2mhvm_op()
invocation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/altp2m: add altp2mhvm HVM domain parameter
Ed White [Fri, 24 Jul 2015 11:38:28 +0000 (13:38 +0200)]
x86/altp2m: add altp2mhvm HVM domain parameter

The altp2mhvm and nestedhvm parameters are mutually
exclusive and cannot be set together.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/altp2m: define and implement alternate p2m HVMOP types
Ed White [Fri, 24 Jul 2015 11:36:50 +0000 (13:36 +0200)]
x86/altp2m: define and implement alternate p2m HVMOP types

Signed-off-by: Ed White <edmund.h.white@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/altp2m: add remaining support routines
Ed White [Fri, 24 Jul 2015 11:36:15 +0000 (13:36 +0200)]
x86/altp2m: add remaining support routines

Add the remaining routines required to support enabling the alternate
p2m functionality.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Fix off-by-one in various checks against MAX_ALTP2M. Adjust error code
in p2m_destroy_altp2m_by_id(). Cosmetic adjustments.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/altp2m: alternate p2m memory events
Ed White [Fri, 24 Jul 2015 11:34:46 +0000 (13:34 +0200)]
x86/altp2m: alternate p2m memory events

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tamas K Lengyel <tlengyel@novetta.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/altp2m: add control of suppress_ve
George Dunlap [Fri, 24 Jul 2015 11:30:44 +0000 (13:30 +0200)]
x86/altp2m: add control of suppress_ve

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 will set suppress_ve on
new p2m entries, or retain suppress_ve flag on existing entries.
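
The tri-state argument can be modelled as below.  This is a Python sketch
of the rule just stated, not the EPT code; the function name and the use
of None for "no existing entry" are assumptions for the example.

```python
def resolve_suppress_ve(existing, sve):
    """Compute the suppress_ve flag for an entry.

    existing: the entry's current flag, or None for a new entry.
    sve:      1 to set, 0 to clear, -1 for "default".
    """
    if sve == -1:
        # New entries get suppress_ve set; existing ones keep their flag.
        return True if existing is None else existing
    if sve in (0, 1):
        return bool(sve)
    raise ValueError("sve must be -1, 0 or 1")
```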

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Also adjust the caller in set_identity_p2m_entry().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
9 years agoVMX: add VMFUNC leaf 0 (EPTP switching) to emulator
Ravi Sahita [Fri, 24 Jul 2015 11:29:56 +0000 (13:29 +0200)]
VMX: add VMFUNC leaf 0 (EPTP switching) to emulator

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agoVMX/altp2m: add code to support EPTP switching and #VE
Ed White [Fri, 24 Jul 2015 11:29:18 +0000 (13:29 +0200)]
VMX/altp2m: add code to support EPTP switching and #VE

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago x86/altp2m: basic data structures and support routines
Ed White [Fri, 24 Jul 2015 11:28:00 +0000 (13:28 +0200)]
x86/altp2m: basic data structures and support routines

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Cosmetic adjustments.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago x86/HVM: hardware alternate p2m support detection
Ed White [Fri, 24 Jul 2015 11:26:02 +0000 (13:26 +0200)]
x86/HVM: hardware alternate p2m support detection

As implemented here, only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago VMX: implement suppress #VE
Ed White [Fri, 24 Jul 2015 11:25:29 +0000 (13:25 +0200)]
VMX: implement suppress #VE

In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then for most EPT violations only
the hypervisor is able to handle the violation.
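As a rough illustration of "set suppress #VE on all EPTE's": in the Intel EPT entry format, bit 63 is the suppress #VE bit. The sketch below treats a table as a plain array of 64-bit entries, which is a simplification; the macro and function names are hypothetical and not Xen's.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 63 of an EPT entry: suppress #VE. */
#define EPTE_SUPPRESS_VE (UINT64_C(1) << 63)

/* Set suppress #VE on every entry, leaving all other bits untouched. */
void ept_table_suppress_ve(uint64_t *table, size_t nr_entries)
{
    for (size_t i = 0; i < nr_entries; i++)
        table[i] |= EPTE_SUPPRESS_VE;
}
```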

Signed-off-by: Ed White <edmund.h.white@intel.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago VMX: VMFUNC and #VE definitions and detection
Ed White [Fri, 24 Jul 2015 11:24:51 +0000 (13:24 +0200)]
VMX: VMFUNC and #VE definitions and detection

Currently, neither is enabled globally but may be enabled on a per-VCPU
basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago common/domain: helpers to pause a domain while in context
Andrew Cooper [Fri, 24 Jul 2015 11:23:59 +0000 (13:23 +0200)]
common/domain: helpers to pause a domain while in context

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago tools: libxl: Use correct printf format for uint64_t
Ian Campbell [Fri, 24 Jul 2015 10:41:17 +0000 (11:41 +0100)]
tools: libxl: Use correct printf format for uint64_t

Since 25652f232cbe "tools/libxl: detect and avoid conflicts with RDM"
the build is broken for x86_32 and arm32 with:

libxl_dm.c: In function ‘libxl__domain_device_construct_rdm’:
libxl_dm.c:349:13: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 8 has type ‘uint64_t’ [-Werror=format=]
             LOG(ERROR, "RDM conflict at 0x%lx.\n", d_config->rdms[i].start);
             ^
libxl_dm.c:352:13: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 8 has type ‘uint64_t’ [-Werror=format=]
             LOG(WARN, "Ignoring RDM conflict at 0x%lx.\n",

Use PRIx64 for these.
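For illustration, a self-contained sketch of the portable form. The function name is hypothetical and snprintf stands in for libxl's LOG macro; PRIx64 from <inttypes.h> expands to the right length modifier on both 32-bit and 64-bit builds.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a 64-bit address portably. "%lx" is only correct where
 * long is 64 bits wide; PRIx64 is correct everywhere.
 */
int format_rdm_conflict(char *buf, size_t len, uint64_t start)
{
    return snprintf(buf, len, "RDM conflict at 0x%" PRIx64 ".", start);
}
```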

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Tiejun Chen <tiejun.chen@intel.com>
9 years ago xl: free event struct after use in main_shutdown_or_reboot
Wei Liu [Thu, 23 Jul 2015 07:59:12 +0000 (08:59 +0100)]
xl: free event struct after use in main_shutdown_or_reboot

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: call libxl_bitmap_{init, dispose} in main_cpupoolcreate
Wei Liu [Thu, 23 Jul 2015 07:59:10 +0000 (08:59 +0100)]
xl: call libxl_bitmap_{init, dispose} in main_cpupoolcreate

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: call libxl_dominfo_{init, dispose} in psr_cmt_show
Wei Liu [Thu, 23 Jul 2015 07:59:07 +0000 (08:59 +0100)]
xl: call libxl_dominfo_{init, dispose} in psr_cmt_show

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: check json string is not null before printing in create_domain
Wei Liu [Thu, 23 Jul 2015 07:59:06 +0000 (08:59 +0100)]
xl: check json string is not null before printing in create_domain

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: free pid string in do_daemonize
Wei Liu [Thu, 23 Jul 2015 07:59:05 +0000 (08:59 +0100)]
xl: free pid string in do_daemonize

Pid is a null terminated string allocated by asprintf. It should be
freed after use.

Also fixed a coding style problem while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl/libxl: remove a bunch of pointless assignments
Wei Liu [Thu, 23 Jul 2015 07:59:04 +0000 (08:59 +0100)]
xl/libxl: remove a bunch of pointless assignments

Those values are overwritten before they can be of any use.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago libxl: use thread-safe localtime_r and handle NULL
Wei Liu [Thu, 23 Jul 2015 07:59:02 +0000 (08:59 +0100)]
libxl: use thread-safe localtime_r and handle NULL

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: Command line: Support xl vcpu-set --help
Ian Jackson [Fri, 17 Jul 2015 17:00:51 +0000 (18:00 +0100)]
xl: Command line: Support xl vcpu-set --help

The vcpu-set long options array ended with a literal sentinel.  Use
COMMON_LONG_OPTS (which mentions --help) instead.
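A minimal sketch of the pattern, assuming a getopt_long()-style option table. The macro body, the "example" option and the function name are hypothetical stand-ins, not xl's actual definitions.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical stand-in for xl's macro: the common options followed
 * by the all-zero sentinel that terminates a getopt_long() table. */
#define COMMON_LONG_OPTS \
    {"help", no_argument, NULL, 'h'}, \
    {NULL, 0, NULL, 0}

/* Returns 1 if --help or -h was given, 0 if not, -1 on an unknown option. */
int wants_help(int argc, char *const argv[])
{
    static const struct option opts[] = {
        {"example", no_argument, NULL, 'e'},  /* command-specific (made up) */
        COMMON_LONG_OPTS                      /* must come last */
    };
    int opt;

    optind = 1;  /* restart scanning so repeated calls work */
    opterr = 0;  /* keep the sketch quiet on unknown options */
    while ((opt = getopt_long(argc, argv, "he", opts, NULL)) != -1) {
        if (opt == 'h')
            return 1;
        if (opt == '?')
            return -1;
    }
    return 0;
}
```

Ending the table with the macro instead of a hand-written sentinel means every command picks up --help automatically.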

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xl: Command line: Make COMMON_LONG_OPTS include sentinel
Ian Jackson [Fri, 17 Jul 2015 17:00:50 +0000 (18:00 +0100)]
xl: Command line: Make COMMON_LONG_OPTS include sentinel

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- added comment to COMMON_LONG_OPTS ]

9 years ago xen/tools: Widen the machine_irq in xc_domain_*bind_pt_irq_int
Julien Grall [Fri, 17 Jul 2015 14:06:21 +0000 (15:06 +0100)]
xen/tools: Widen the machine_irq in xc_domain_*bind_pt_irq_int

The DOMCTLs {,un}bind_pt_irq are using uint32_t for the machine_irq
while the helper is using uint8_t.

Currently on ARM, we support SPIs whose IRQ number can go up to
1019, which doesn't fit in a uint8_t. The helpers xc_domain_bind_pt_spi
and xc_domain_unbind_pt_spi correctly take a uint16_t, so libxc was
truncating the value without notifying the user, which may end up
routing the wrong IRQ.

Fix the problem by widening the machine_irq parameter in
xc_domain_*bind_pt_irq_int.
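The truncation can be reproduced in isolation. This sketch uses hypothetical function names and only models the parameter width, not the real libxc helpers.

```c
#include <stdint.h>

/* Old shape: the IRQ number is squeezed through an 8-bit parameter. */
uint32_t irq_via_narrow_helper(uint32_t machine_irq)
{
    uint8_t narrow = (uint8_t)machine_irq;  /* upper bits silently dropped */
    return narrow;
}

/* New shape: the parameter is as wide as the DOMCTL field. */
uint32_t irq_via_wide_helper(uint32_t machine_irq)
{
    return machine_irq;
}
```

With SPI 1019 the narrow variant yields 1019 mod 256 = 251, so a different interrupt would be routed.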

Note that XEN_DOMCTL_irq_permission has the same problem but it's not
used at the moment on ARM. So we can defer the changes after the release
of Xen 4.7.

Reported-by: Iurii Konovalenko <iurii.konovalenko@globallogic.com>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xen: arm: Document xenheap_megabytes limitation
Chris (Christopher) Brand [Thu, 23 Jul 2015 16:31:56 +0000 (16:31 +0000)]
xen: arm: Document xenheap_megabytes limitation

In setup_mm(), the value passed as xenheap_megabytes gets
converted to pages and passed to setup_xenheap_mappings(),
which in turn passes it to create_32mb_mappings(), which
contains an ASSERT that the value passed is a multiple of
32MB. So specifying any value that is not an integer multiple
of 32 will cause Xen to hit this assert and fail to boot.
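A small sketch of the constraint, useful for checking a value before putting it on the command line. The helper names are hypothetical; only the 32MB granularity comes from the text.

```c
#include <stdbool.h>

#define XENHEAP_GRANULARITY_MB 32UL

/* True if the value is a whole multiple of 32MB. */
bool xenheap_megabytes_ok(unsigned long mb)
{
    return (mb % XENHEAP_GRANULARITY_MB) == 0;
}

/* Round a requested size up to the next multiple of 32MB. */
unsigned long xenheap_round_up(unsigned long mb)
{
    return (mb + XENHEAP_GRANULARITY_MB - 1)
           / XENHEAP_GRANULARITY_MB * XENHEAP_GRANULARITY_MB;
}
```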

Signed-off-by: Chris Brand <chris.brand@broadcom.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago oxenstored: link in the systemd system library
Jonathan Creekmore [Thu, 23 Jul 2015 13:40:39 +0000 (08:40 -0500)]
oxenstored: link in the systemd system library

If systemd is configured for use AND you are building oxenstored, the C
systemd library must be linked in to the systemd.cmxa library.

Signed-off-by: Jonathan Creekmore <jonathan.creekmore@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago sched/cpupool: properly update affinity when removing a cpu from a cpupool
Dario Faggioli [Fri, 24 Jul 2015 09:29:35 +0000 (11:29 +0200)]
sched/cpupool: properly update affinity when removing a cpu from a cpupool

And this time, do it right. In fact, a similar change was
attempted in 93be8285a79c6 ("cpupools: update domU's node-affinity
on the cpupool_unassign_cpu() path"). But that was buggy, and got
reverted with 8395b67ab0b8a86.

However, even though reverting was the right thing to do, it
remains true that:
 - calling the function is better done in the cpupool cpu removal
   code, even if just for symmetry with the cpupool cpu adding path;
 - it is not necessary to call it in the cpu teardown (for suspend
   or shutdown) code, as we either are going down and will never
   come up (shutdown) or, when coming up, we want everything to be
   as before the tearing down process started, and so we would just
   undo any update made during the process;
 - calling it from the teardown path is not only unnecessary, but
   it can trigger an ASSERT() if, during the process, we remove
   the last online pcpu of a domain's node affinity:

  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  ... ... ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d0801055b9>] domain_update_node_affinity+0x113/0x240
  (XEN)    [<ffff82d08012e676>] cpu_disable_scheduler+0x334/0x3f2
  (XEN)    [<ffff82d08018bb8d>] __cpu_disable+0x313/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d080130ad9>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d08013274f>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132a85>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 12:
  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) ****************************************

Therefore, for all these reasons, move the call from
cpu_disable_scheduler() to cpupool_unassign_cpu_helper().

While there, add some sanity checking (in the latter function), and
make sure that scanning the domain list is done with domlist_read_lock
held, at least when the system is 'live'.

I re-tested the scenario described in here:
 http://permalink.gmane.org/gmane.comp.emulators.xen.devel/235310

which is what led to the revert of 93be8285a79c6, and that is
working ok after this commit.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago sched: reorganize cpu_disable_scheduler()
Dario Faggioli [Fri, 24 Jul 2015 09:26:34 +0000 (11:26 +0200)]
sched: reorganize cpu_disable_scheduler()

The function is called both when we want to remove a cpu
from a cpupool, and during cpu teardown, for suspend or
shutdown. If, however, the boot cpu (cpu 0, most of the
time) is not present in the default cpupool, during
suspend or shutdown, Xen crashes like this:

  root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
  root@Zhaman:~# shutdown -h now
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d0801238de>] _csched_cpu_pick+0x156/0x61f
  (XEN)    [<ffff82d080123db5>] csched_cpu_pick+0xe/0x10
  (XEN)    [<ffff82d08012de3c>] vcpu_migrate+0x18e/0x321
  (XEN)    [<ffff82d08012e4f8>] cpu_disable_scheduler+0x1cf/0x2ac
  (XEN)    [<ffff82d08018bb8d>] __cpu_disable+0x313/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 15:
  (XEN) Assertion 'cpu < nr_cpu_ids' failed at ...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
  (XEN) ****************************************

There also are problems when we try to suspend or shutdown
with a cpupool configured with just one cpu (no matter, in
this case, whether that is the boot cpu or not):

  root@Zhaman:~# xl create /etc/xen/test.cfg
  root@Zhaman:~# xl cpupool-migrate test Pool-1
  root@Zhaman:~# xl cpupool-list -c
  Name               CPU list
  Pool-0             0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
  Pool-1             12
  root@Zhaman:~# shutdown -h now
  (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
  (XEN) CPU:    12
  ...
  (XEN) Xen call trace:
  (XEN)    [<ffff82d08018bb91>] __cpu_disable+0x317/0x36e
  (XEN)    [<ffff82d080101424>] take_cpu_down+0x34/0x3b
  (XEN)    [<ffff82d08013097a>] stopmachine_action+0x70/0x99
  (XEN)    [<ffff82d0801325f0>] do_tasklet_work+0x78/0xab
  (XEN)    [<ffff82d080132926>] do_tasklet+0x5e/0x8a
  (XEN)    [<ffff82d08016478c>] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 12:
  (XEN) Xen BUG at smpboot.c:895
  (XEN) ****************************************

In both cases, the problem is the scheduler not being able
to:
 - move all the vcpus to the boot cpu (as the boot cpu is
   not in the cpupool), in the former;
 - move the vcpus away from a cpu at all (as that is the
   only one cpu in the cpupool), in the latter.

The solution is to distinguish, inside cpu_disable_scheduler(),
the two cases of cpupool manipulation and teardown. For
cpupool manipulation, it is correct to ask the scheduler to
take an action, as pathological situations (like there not
being any cpu in the pool to send vcpus to) are taken
care of (i.e., forbidden!) already. For suspend and shutdown,
we don't want the scheduler to be involved at all, as the
final goal is pretty simple: "send all the vcpus to the
boot cpu ASAP", so we just go for it.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: Expose xc_reserved_device_memory_map to ARM too
Julien Grall [Thu, 23 Jul 2015 16:47:09 +0000 (17:47 +0100)]
libxc: Expose xc_reserved_device_memory_map to ARM too

The commit 25652f2 "tools/libxl: detect and avoid conflicts with RDM"
introduced the usage of xc_reserved_device_memory_map in the libxl
generic code. But the function is only defined for x86 which breaks the
ARM build.

The hypercall called by this helper is implemented in the generic code
and doesn't contain any x86 specific code. Therefore, it's fine to
expose the helper to ARM.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago tools: parse to enable new rdm policy parameters
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
tools: parse to enable new rdm policy parameters

This patch parses the user configurable parameters, defined previously,
that specify the RDM resource and the corresponding policies:

Global RDM parameter:
    rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
    pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

The default per-device RDM policy is the same as the default global RDM
policy, 'relaxed'. The per-device policy overrides the global policy,
as with other such settings.
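To make the syntax of the global parameter concrete, here is a rough sketch of parsing a string such as "strategy=host,policy=strict". It is not the xl/libxl parser; the struct, field names and return convention are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, not libxl's. */
struct rdm_config {
    bool strategy_host;  /* "strategy=host" was given */
    bool strict;         /* false means 'relaxed', the default */
};

/* True if the len-byte token equals the literal. */
static bool token_is(const char *tok, size_t len, const char *lit)
{
    return strlen(lit) == len && strncmp(tok, lit, len) == 0;
}

/* Returns 0 on success, -1 on an unknown key or value. */
int parse_rdm(const char *s, struct rdm_config *cfg)
{
    cfg->strategy_host = false;
    cfg->strict = false;

    while (*s != '\0') {
        size_t len = strcspn(s, ",");  /* length of this key=value token */

        if (token_is(s, len, "strategy=host"))
            cfg->strategy_host = true;
        else if (token_is(s, len, "policy=strict"))
            cfg->strict = true;
        else if (token_is(s, len, "policy=relaxed"))
            cfg->strict = false;
        else
            return -1;

        s += len;
        if (*s == ',')
            s++;
    }
    return 0;
}
```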

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago xen/vtd: prevent from assign the device with shared rmrr
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
xen/vtd: prevent from assign the device with shared rmrr

For now we simply prevent assigning devices that share an
RMRR, since a shared RMRR is a rare case in our previous
experience. Later we can group the devices that share an
RMRR, and then allow all devices within a group to be
assigned to the same domain.

CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago xen/vtd: enable USB device assignment
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
xen/vtd: enable USB device assignment

A USB RMRR may conflict with the guest BIOS region. In that case, the
previous implementation simply skipped the identity mapping setup. Now we
can handle this scenario cleanly with the new policy mechanism, so the
previous hack code can be removed.

CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago libxl: construct e820 map with RDM information for HVM guest
Tiejun Chen [Wed, 22 Jul 2015 01:40:26 +0000 (01:40 +0000)]
libxl: construct e820 map with RDM information for HVM guest

Here we'll construct a basic guest e820 table via
XENMEM_set_memory_map. This table includes lowmem, highmem
and RDMs if they exist, and hvmloader would need this info
later.

Note this guest e820 table will be the same as before if the
platform has no RDM or if RDM is disabled (the default).

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago tools: introduce a new parameter to set a predefined rdm boundary
Tiejun Chen [Wed, 22 Jul 2015 01:40:10 +0000 (01:40 +0000)]
tools: introduce a new parameter to set a predefined rdm boundary

Previously we always fixed that predefined boundary at 2G to handle
conflicts between memory and rdm, but now this predefined boundary
can be changed with the parameter "rdm_mem_boundary" in the .cfg file.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago tools/libxl: detect and avoid conflicts with RDM
Tiejun Chen [Wed, 22 Jul 2015 01:40:07 +0000 (01:40 +0000)]
tools/libxl: detect and avoid conflicts with RDM

While building a VM, HVM domain builder provides struct hvm_info_table{}
to help hvmloader. Currently it includes two fields to construct guest
e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
check them to fix any conflict with RDM.

RMRRs can theoretically reside in address space beyond 4G, but we never
see this in the real world. So in order to avoid breaking the highmem layout
we don't resolve highmem conflicts. Note this means a highmem rmrr can still
be supported if there is no conflict.

But in the case of lowmem, RMRRs may be scattered across the whole RAM space.
Multiple RMRR entries in particular would worsen this and lead to a complicated
memory layout, and it is then hard to extend hvm_info_table{} so that
hvmloader can work it out. So here we're trying to figure out a simple solution
to avoid breaking the existing layout. When a conflict occurs:

    #1. Above a predefined boundary (2G)
        - move lowmem_end below reserved region to solve conflict;

    #2. Below a predefined boundary (2G)
        - Check strict/relaxed policy.
        "strict" policy makes libxl fail. Note when both policies
        are specified on a given region, 'strict' is always preferred.
        "relaxed" policy issues a warning message and also marks this entry INVALID
        to indicate we shouldn't expose this entry to hvmloader.

Note later we need to provide a parameter to set that predefined boundary
dynamically.
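The two cases above can be sketched as a small decision function. This is an illustration under stated assumptions, not the libxl code: the names are hypothetical, RAM is modelled as [0, lowmem_end), and a region starting exactly at the boundary is treated as "above" it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome codes for one reserved region. */
enum rdm_action {
    RDM_NO_CONFLICT,    /* region does not overlap lowmem */
    RDM_LOWMEM_MOVED,   /* case #1: lowmem_end moved below the region */
    RDM_ENTRY_INVALID,  /* case #2, relaxed: warn and hide the entry */
    RDM_FAIL            /* case #2, strict: domain creation fails */
};

enum rdm_action resolve_rdm_conflict(uint64_t *lowmem_end,
                                     uint64_t rdm_start, uint64_t rdm_size,
                                     uint64_t boundary, bool strict)
{
    /* The region [rdm_start, rdm_start + rdm_size) conflicts only
     * if it begins below the end of lowmem. */
    if (rdm_size == 0 || rdm_start >= *lowmem_end)
        return RDM_NO_CONFLICT;

    if (rdm_start >= boundary) {
        /* Case #1: shrink lowmem so that it ends where the region starts. */
        *lowmem_end = rdm_start;
        return RDM_LOWMEM_MOVED;
    }

    /* Case #2: the policy decides. */
    return strict ? RDM_FAIL : RDM_ENTRY_INVALID;
}
```

In the real toolstack the RAM cut off from lowmem would also have to be accounted for elsewhere (e.g. in highmem); the sketch leaves that out.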

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v13a: Change `flag' to `flags' in libxl__xc_device_get_rdm.
     No functional change.  [ Suggested by Tiejun Chen. ]
v13: Mechanical changes to deal with changes to patch 01/
     XENMEM_reserved_device_memory_map.

9 years ago tools: introduce some new parameters to set rdm policy
Tiejun Chen [Wed, 22 Jul 2015 01:40:50 +0000 (01:40 +0000)]
tools: introduce some new parameters to set rdm policy

This patch introduces user configurable parameters to specify the RDM
resource and the corresponding policies:

Global RDM parameter:
    rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
    pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

The global RDM parameter "strategy" allows the user to specify reserved regions
explicitly. Currently, 'host' includes all reserved regions reported
on this platform, which is good for handling the hotplug scenario. In the future
this parameter may be further extended to allow specifying arbitrary regions,
e.g. even those belonging to another platform as a preparation for live
migration with passthrough devices. By default this isn't set so we don't
check all rdms. Instead, we just check the rdm specific to a given device if
you're assigning this kind of device. Note this option is not recommended
unless you can make sure any conflict does exist.

The 'strict/relaxed' policy decides how to handle a conflict when reserving RDM
regions in pfn space. If a conflict exists, 'strict' means an immediate error
so the VM can't keep running, while 'relaxed' allows moving forward with a
warning message.

The default per-device RDM policy is the same as the default global RDM
policy, 'relaxed'. The per-device policy overrides the global policy,
as with other such settings.
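The precedence rule can be written down in a few lines. This is a sketch with hypothetical enum and function names, assuming an explicit "unset" state for policies the user did not specify.

```c
/* Hypothetical policy type; UNSET means "not specified by the user". */
enum rdm_policy {
    RDM_POLICY_UNSET,
    RDM_POLICY_STRICT,
    RDM_POLICY_RELAXED
};

/* Per-device setting wins over the global one; the default is relaxed. */
enum rdm_policy effective_rdm_policy(enum rdm_policy global,
                                     enum rdm_policy per_device)
{
    if (per_device != RDM_POLICY_UNSET)
        return per_device;
    if (global != RDM_POLICY_UNSET)
        return global;
    return RDM_POLICY_RELAXED;
}
```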

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Checked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago tools: extend xc_assign_device() to support rdm reservation policy
Tiejun Chen [Wed, 22 Jul 2015 01:40:08 +0000 (01:40 +0000)]
tools: extend xc_assign_device() to support rdm reservation policy

This patch passes rdm reservation policy to xc_assign_device() so the policy
is checked when assigning devices to a VM.

Note this also brings some fallout to python usage of xc_assign_device().

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: David Scott <dave.scott@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago tools/libxc: Expose new hypercall xc_reserved_device_memory_map
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
tools/libxc: Expose new hypercall xc_reserved_device_memory_map

This patch exposes the new hypercall to libxc as
xc_reserved_device_memory_map. This lets us get rdm entry info
according to different parameters. If flag == PCI_DEV_RDM_ALL, all
entries are exposed. Otherwise we expose only the rdm entries specific
to a given SBDF.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v13: Mechanical changes to deal with changes to patch 01/
     XENMEM_reserved_device_memory_map.

9 years ago hvmloader/e820: construct guest e820 table
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
hvmloader/e820: construct guest e820 table

Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.
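The final sorting step can be sketched as follows. This is hosted C using qsort() for brevity; hvmloader itself is a freestanding environment and would not use libc here. The struct layout and function names are illustrative assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

#define E820_RAM      1
#define E820_RESERVED 2

/* Simplified e820 entry for illustration. */
struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Order entries by ascending start address. */
static int e820_cmp(const void *a, const void *b)
{
    const struct e820entry *x = a, *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Sort the map so that ranges appear in memory order. */
void e820_sort(struct e820entry *map, size_t nr)
{
    qsort(map, nr, sizeof(*map), e820_cmp);
}
```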

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
9 years ago hvmloader/pci: try to avoid placing BARs in RMRRs
George Dunlap [Wed, 22 Jul 2015 14:24:49 +0000 (15:24 +0100)]
hvmloader/pci: try to avoid placing BARs in RMRRs

Try to avoid placing PCI BARs over RMRRs:

- If mmio_hole_size is not specified, and the existing MMIO range has
  RMRRs in it, and there is space to expand the hole in lowmem without
  moving more memory, then make the MMIO hole as large as possible.

- When placing BARs, find the next RMRR higher than the current base
  in the lowmem mmio hole.  If it overlaps, skip ahead of it and find
  the next one.
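A rough sketch of the skip-ahead step in the second bullet, not the hvmloader code. It assumes the RMRR list is sorted by address and non-overlapping, that BAR sizes are powers of two, and it ignores the upper limit of the MMIO hole; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
    uint64_t start, end;  /* half-open: [start, end) */
};

/*
 * Return the lowest size-aligned address >= base at which a BAR of
 * the given size does not overlap any RMRR.
 */
uint64_t bar_base_avoiding_rmrrs(uint64_t base, uint64_t size,
                                 const struct range *rmrr, size_t nr)
{
    assert(size != 0 && (size & (size - 1)) == 0);

    base = (base + size - 1) & ~(size - 1);      /* natural alignment */
    for (size_t i = 0; i < nr; i++) {
        if (rmrr[i].end <= base)
            continue;                            /* entirely below us */
        if (rmrr[i].start >= base + size)
            break;                               /* sorted: no more overlaps */
        /* Overlap: skip past this RMRR and realign. */
        base = (rmrr[i].end + size - 1) & ~(size - 1);
    }
    return base;
}
```

A caller would still have to check that the returned base plus the BAR size fits inside the hole.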

This certainly won't work in all cases, but it should work in a
significant number of cases.  Additionally, users should be able to
work around problems by setting mmio_hole_size larger in the guest
config.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago hvmloader: get guest memory map into memory_map[]
Tiejun Chen [Wed, 22 Jul 2015 01:40:19 +0000 (01:40 +0000)]
hvmloader: get guest memory map into memory_map[]

Now we get the guest memory map layout by calling XENMEM_memory_map and
save it into one global variable, memory_map[]. It should
include the lowmem range, rdm ranges and the highmem range. Note
the rdm and highmem ranges may not exist in some cases.

And here we need to check if any reserved memory conflicts with
[RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END).
This range is used to allocate memory at the hvmloader level, and
we make hvmloader fail in case of conflict since it is
another rare possibility in the real world.
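The conflict check amounts to an overlap test on half-open ranges. A minimal sketch, with a hypothetical function name and the range bounds passed in as parameters instead of using hvmloader's constants:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the half-open ranges [a_start, a_end) and [b_start, b_end)
 * share at least one address. Ranges that merely touch do not overlap.
 */
bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                    uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}
```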

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years ago xen: enable XENMEM_memory_map in hvm
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
xen: enable XENMEM_memory_map in hvm

This patch enables XENMEM_memory_map in hvm, so hvmloader can
use it to set up the e820 mappings.

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years ago xen/passthrough: extend hypercall to support rdm reservation policy
Tiejun Chen [Wed, 22 Jul 2015 01:40:11 +0000 (01:40 +0000)]
xen/passthrough: extend hypercall to support rdm reservation policy

This patch extends the existing hypercall to support rdm reservation policy.
We return an error or just print a warning message, depending on whether
the policy is "strict" or "relaxed", when reserving RDM regions in pfn space.
Note in some special cases, e.g. adding a device to hwdomain or removing a
device from a user domain, 'relaxed' is sufficient since this is always safe
for hwdomain.

CC: Tim Deegan <tim@xen.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
--
v13a: Fix build on ARM by passing 0 for flags to arm_smmu_assign_dev.

9 years ago xen/vtd: create RMRR mapping
Tiejun Chen [Wed, 22 Jul 2015 01:39:58 +0000 (01:39 +0000)]
xen/vtd: create RMRR mapping

RMRR reserved regions must be set up in the pfn space with an identity
mapping to the reported mfn. However, the existing code fails to set up
the correct mapping when VT-d shares the EPT page table, which leads to
problems when assigning devices (e.g. GPU) with RMRRs reported. So instead,
this patch sets up the identity mapping in the p2m layer, regardless of
whether EPT is shared or not. And we still keep creating the VT-d table.

We also need to introduce a pair of helpers to create/clear this
sort of identity mapping, as follows:

set_identity_p2m_entry():

If the gfn space is unoccupied, we just set the mapping. If space
is already occupied by desired identity mapping, do nothing.
Otherwise, failure is returned.

clear_identity_p2m_entry():

We just define a macro wrapping guest_physmap_remove_page() with
a return value as necessary.
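The three outcomes described for set_identity_p2m_entry() can be modelled on a toy table. This is only a sketch: the flat array, the invalid marker, the table size and the -1 error value are assumptions, and the real function operates on Xen's p2m structures.

```c
#include <stdint.h>

#define TOY_P2M_SIZE    16
#define TOY_INVALID_MFN UINT64_MAX

/*
 * Toy model: p2m[gfn] holds the mfn backing that gfn, or
 * TOY_INVALID_MFN if the gfn is unoccupied.
 * Returns 0 on success, -1 on failure.
 */
int toy_set_identity_entry(uint64_t *p2m, uint64_t gfn)
{
    if (gfn >= TOY_P2M_SIZE)
        return -1;

    if (p2m[gfn] == TOY_INVALID_MFN) {
        p2m[gfn] = gfn;   /* unoccupied: create the identity mapping */
        return 0;
    }

    if (p2m[gfn] == gfn)
        return 0;         /* already the desired mapping: do nothing */

    return -1;            /* occupied by something else: fail */
}
```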

CC: Tim Deegan <tim@xen.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Yang Zhang <yang.z.zhang@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
9 years agointroduce XENMEM_reserved_device_memory_map
Jan Beulich [Wed, 22 Jul 2015 15:06:01 +0000 (16:06 +0100)]
introduce XENMEM_reserved_device_memory_map

This is a prerequisite for punching holes into HVM and PVH guests' P2M
to allow passing through devices that are associated with (on VT-d)
RMRRs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v12a: Move interface structure union member to the end, while moving
     the whole public header block into a __XEN__ / __XEN_TOOLS__
     conditional block.
v12: Restore changes as much as possible to my original version, fixing
     a few issues that got introduced after handing it over. Unionize
     new public memop interface structure to allow for non-PCI to be
     supported later on. Check flags to have all currently undefined
     flags clear. Refine adjustments to xen/pci.h.

9 years agox86/MSI: drop bogus NULL check from pci_restore_msi_state()
Jan Beulich [Thu, 23 Jul 2015 12:03:41 +0000 (14:03 +0200)]
x86/MSI: drop bogus NULL check from pci_restore_msi_state()

Commit 372900faf8 ("x86/MSI-X: reduce fiddling with control register
during restore") introduced de-references of pdev before it gets
checked against NULL. Instead of deferring the de-references, drop
the pointless check - both call sites do that check already.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agohvmloader: don't build with __XEN_TOOLS__ defined
Jan Beulich [Thu, 23 Jul 2015 12:03:20 +0000 (14:03 +0200)]
hvmloader: don't build with __XEN_TOOLS__ defined

This being an in-guest component, it shouldn't get to see (and even
less so use) tools-only public interfaces.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/PCI: intercept all PV Dom0 MMCFG writes
Jan Beulich [Thu, 23 Jul 2015 08:17:08 +0000 (10:17 +0200)]
x86/PCI: intercept all PV Dom0 MMCFG writes

... to hook up pci_conf_write_intercept() even for Dom0 not using
method 1 accesses for the base part of PCI device config space.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/MSI: properly track guest masking requests
Jan Beulich [Thu, 23 Jul 2015 08:16:27 +0000 (10:16 +0200)]
x86/MSI: properly track guest masking requests

... by monitoring writes to the mask register.

This allows reverting the main effect of the XSA-129 patches in qemu.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/MSI-X: reduce fiddling with control register during restore
Jan Beulich [Thu, 23 Jul 2015 08:16:03 +0000 (10:16 +0200)]
x86/MSI-X: reduce fiddling with control register during restore

Rather than disabling and enabling MSI-X once per vector, do it just
once per device.
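
The saving can be illustrated with a toy model that only counts control
register writes. The `toy_` names and the structure are hypothetical, and
the real restore path programs actual hardware. The sketch only contrasts
toggling MSI-X around every vector with toggling it once around the whole
loop.

```c
#include <assert.h>

/* Hypothetical device model: counts writes instead of touching hardware. */
struct toy_msix_dev {
    unsigned int nr_vectors;
    unsigned int ctrl_writes;   /* writes to the MSI-X control register */
    unsigned int restored;      /* vectors whose state was restored */
    int enabled;
};

static void toy_write_ctrl(struct toy_msix_dev *dev, int enable)
{
    dev->enabled = enable;
    dev->ctrl_writes++;
}

/* Old shape: disable and re-enable MSI-X around every single vector. */
static void toy_restore_per_vector(struct toy_msix_dev *dev)
{
    for (unsigned int i = 0; i < dev->nr_vectors; i++) {
        toy_write_ctrl(dev, 0);
        dev->restored++;
        toy_write_ctrl(dev, 1);
    }
}

/* New shape: disable once, restore all vectors, re-enable once. */
static void toy_restore_per_device(struct toy_msix_dev *dev)
{
    assert(dev->nr_vectors > 0);

    toy_write_ctrl(dev, 0);
    for (unsigned int i = 0; i < dev->nr_vectors; i++)
        dev->restored++;
    toy_write_ctrl(dev, 1);
}
```

For a device with N vectors, the per-vector variant issues 2N control
register writes in this model, while the per-device variant always issues 2.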

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/MSI-X: access MSI-X table only after having enabled MSI-X
Jan Beulich [Thu, 23 Jul 2015 08:15:39 +0000 (10:15 +0200)]
x86/MSI-X: access MSI-X table only after having enabled MSI-X

As done in Linux by f598282f51 ("PCI: Fix the NIU MSI-X problem in a
better way") and its broken predecessor, make sure we don't access the
MSI-X table without having enabled MSI-X first, using the mask-all flag
instead to prevent interrupts from occurring.
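
A minimal sketch of that ordering, using a toy device rather than real
config space accessors. The two flag values are the standard MSI-X Message
Control bits. Everything prefixed `toy_` is hypothetical, and the model's
choice to count a table access made while MSI-X is disabled as "bad" is an
assumption standing in for whatever the hardware would actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_MSIX_FLAGS_ENABLE  0x8000  /* MSI-X Enable bit */
#define PCI_MSIX_FLAGS_MASKALL 0x4000  /* Function Mask (mask-all) bit */

/* Hypothetical device: one table entry plus a counter of unsafe accesses. */
struct toy_dev {
    uint16_t control;
    uint32_t table_entry;
    unsigned int bad_accesses;
};

/* Table write as seen by the toy device: only valid while MSI-X is enabled. */
static void toy_table_write(struct toy_dev *d, uint32_t val)
{
    if (!(d->control & PCI_MSIX_FLAGS_ENABLE)) {
        d->bad_accesses++;
        return;
    }
    d->table_entry = val;
}

/*
 * Enable MSI-X with mask-all set so that no interrupt can fire, touch the
 * table, then put the control register back to its previous value.
 */
static void toy_program_entry(struct toy_dev *d, uint32_t val)
{
    uint16_t saved;

    assert(d != NULL);
    saved = d->control;

    d->control = saved | PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL;
    toy_table_write(d, val);
    d->control = saved;
}
```

Calling toy_table_write() directly on a device whose control register is 0
is counted as a bad access in this model, whereas going through
toy_program_entry() is not.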

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/MSI-X: be more careful during teardown
Jan Beulich [Thu, 23 Jul 2015 08:14:59 +0000 (10:14 +0200)]
x86/MSI-X: be more careful during teardown

When a device gets detached from a guest, pciback will clear its
command register, thus disabling both memory and I/O decoding. The
disabled memory decoding, however, has an effect on the MSI-X table
accesses the hypervisor does: These won't have the intended effect
anymore. Even worse, for PCIe devices (but not SR-IOV virtual
functions) such accesses may (will?) be treated as Unsupported
Requests, causing respective errors to be surfaced, potentially in the
form of NMIs that may be fatal to the hypervisor or Dom0 in different
ways. Hence rather than carrying out these accesses, we should avoid
them where we can, and use alternative (e.g. PCI config space based)
mechanisms to achieve at least the same effect.

At this time it continues to be unclear whether this is fixing an
actual bug or is rather just working around bogus (but apparently
common) system behavior.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/MSI-X: track host and guest mask-all requests separately
Jan Beulich [Thu, 23 Jul 2015 08:14:13 +0000 (10:14 +0200)]
x86/MSI-X: track host and guest mask-all requests separately

Host uses of the bits will be added subsequently, and must not be
overridden by guests (including Dom0, namely when acting on behalf of
a guest).
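
One way to picture the separation is two independent flags whose
combination decides what the hardware bit should be. The following is a
sketch under that assumption, with hypothetical `toy_` names. It does not
reproduce the actual Xen data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state: who currently asks for mask-all. */
struct toy_msix_state {
    bool host_maskall;
    bool guest_maskall;
};

/* The device stays masked as long as either party wants it masked. */
static bool toy_effective_maskall(const struct toy_msix_state *s)
{
    return s->host_maskall || s->guest_maskall;
}

/* A guest request only updates the guest flag, never the host one. */
static bool toy_guest_set_maskall(struct toy_msix_state *s, bool mask)
{
    assert(s != NULL);
    s->guest_maskall = mask;
    return toy_effective_maskall(s);
}

/* A host request only updates the host flag. */
static bool toy_host_set_maskall(struct toy_msix_state *s, bool mask)
{
    assert(s != NULL);
    s->host_maskall = mask;
    return toy_effective_maskall(s);
}
```

With this split, a guest clearing its own mask-all request cannot unmask a
device that the host still wants masked.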

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/PCI: add config space abstract write intercept logic
Jan Beulich [Thu, 23 Jul 2015 08:13:12 +0000 (10:13 +0200)]
x86/PCI: add config space abstract write intercept logic

This is to be used by MSI code, and later to also be hooked up to
MMCFG accesses by Dom0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>