Ian Campbell [Wed, 22 Feb 2012 14:33:23 +0000 (14:33 +0000)]
arm: restore ELR_hyp and SPSR_hyp on return from hypervisor to hypervisor.
This is necessary to handle nested traps to the hypervisor more than one deep.
I've not seen an actual failure relating to this but I'm not quite sure how
we've managed to get away with not doing it (I suppose multiply nested traps
are uncommon).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Campbell <Ian.Campbell@citrix.com>
David Vrabel [Wed, 22 Feb 2012 14:33:23 +0000 (14:33 +0000)]
arm: move check for CONFIG_DTB_FILE to xen/arch/arm/Makefile
CONFIG_DTB_FILE only needs to be set when building Xen itself.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <Ian.Campbell@citrix.com>
Roger Pau Monne [Wed, 22 Feb 2012 13:06:42 +0000 (13:06 +0000)]
autoconf: clean brctl options
This bit was missing, sorry.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Roger Pau Monne [Wed, 22 Feb 2012 11:56:24 +0000 (11:56 +0000)]
autoconf: remove udev checks from build
There's no need to have udev when building Xen since it's only used at
run time.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Roger Pau Monne [Wed, 22 Feb 2012 11:54:16 +0000 (11:54 +0000)]
autoconf: remove brctl check
Remove brctl check since it's usually only available to users with
high privileges, but Xen should be buildable by regular users.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 22 Feb 2012 01:55:03 +0000 (01:55 +0000)]
build: add autoconf to replace custom checks in tools/check
Added autotools magic to replace custom check scripts. The previous
checks have been ported to autoconf, and some additional ones have
been added (plus the suggestions from running autoscan). Two files are
created as a result from executing configure script, config/Tools.mk
and config.h.
config/Tools.mk is included by tools/Rules.mk, and contains most of the
options previously defined in .config, that can now be set passing
parameters or defining environment variables when executing configure
script.
config.h is only used by libxl/xl to detect yajl_version.h.
[ tools/config.sub and config.guess copied from
autotools-dev 20100122.1 from Debian squeeze i386,
which is GPLv2.
tools/configure generated using the included ./autogen.sh
which ran autoconf 2.67-2 from Debian squeeze i386. autoconf
is GPLv3+ but has a special exception for the autoconf output;
this exception applies to us and exempts us from complying
with GPLv3+ for configure, which is good as Xen is GPL2 only.
- Ian Jackson ]
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Tested-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
George Dunlap [Tue, 21 Feb 2012 17:45:59 +0000 (17:45 +0000)]
libxl: cleanup: Remove pointless ERRNOVAL
Just call LIBXL__LOG rather than passing a meaningless ERRNOVAL.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: implement stack of free slots in pagefile
Scanning the slot_to_gfn[] array for a free slot is expensive because
evict_pages() always needs to scan the whole array. Remember the last
slots freed during page-in requests and reuse them in evict_pages().
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
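As a rough illustration of the free-slot stack described in the commit above, here is a minimal sketch. The names free_slot_stack, free_slot_push, free_slot_pop and MAX_SLOTS are hypothetical and are not taken from the xenpaging sources; the point is only that reusing a remembered slot is a constant-time pop rather than a scan of slot_to_gfn[].

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8 /* hypothetical pagefile size, in slots */

/* Hypothetical stack of pagefile slots released by page-in requests. */
struct free_slot_stack {
    int slots[MAX_SLOTS];
    size_t top; /* number of entries currently on the stack */
};

/* Remember a slot that a page-in just released.
 * Returns 0 on success, -1 if the stack is full. */
static int free_slot_push(struct free_slot_stack *s, int slot)
{
    assert(s != NULL);
    if (s->top == MAX_SLOTS)
        return -1;
    s->slots[s->top++] = slot;
    return 0;
}

/* Take the most recently freed slot, or -1 if none is remembered
 * (the caller would then fall back to scanning). */
static int free_slot_pop(struct free_slot_stack *s)
{
    assert(s != NULL);
    if (s->top == 0)
        return -1;
    return s->slots[--s->top];
}
```

In this sketch the eviction path would call free_slot_pop() first and only scan the array when it returns -1.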
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: unify error handling
Update functions to return -1 on error, 0 on success.
Simplify init_page() and make sure errno is assigned.
Adjust PERROR/ERROR usage, use PERROR early because it overwrites errno.
Adjust xenpaging_populate_page() to handle gfn as unsigned long.
Update xenpaging exit code handling. xenpaging_teardown can't possibly
fail. Adjust mainloop to indicate possible errors to final exit.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: move nominate+evict into single function
Move all code to evict a single gfn into one function. This simplifies
error handling in caller. The function returns -1 on fatal error, 0 on
success and 1 if the gfn can't be paged.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
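The three-way return convention from the commit above (-1 fatal, 0 success, 1 gfn can't be paged) might be consumed by a caller roughly as follows. This is a sketch only: demo_evict_gfn is a made-up stand-in for the real evict function, and its failure rules are invented purely so the example can run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single evict function:
 * -1 fatal error, 0 evicted, 1 this gfn can't be paged. */
static int demo_evict_gfn(unsigned long gfn)
{
    if (gfn == 0)
        return -1;            /* pretend gfn 0 triggers a fatal error */
    return (gfn % 2) ? 1 : 0; /* pretend odd gfns can't be paged */
}

/* Try to evict each gfn in turn. Returns the number of gfns evicted,
 * or -1 as soon as a fatal error is reported. */
static long evict_all(const unsigned long *gfns, size_t count)
{
    long evicted = 0;

    assert(gfns != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int rc = demo_evict_gfn(gfns[i]);
        if (rc < 0)
            return -1; /* fatal: stop and report to the caller */
        if (rc == 0)
            evicted++; /* success */
        /* rc == 1: not pageable, move on to the next candidate */
    }
    return evicted;
}
```

The caller needs only one error check per gfn, which is the simplification the commit message refers to.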
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: reduce number of qemu cache flushes
Currently the command to flush the qemu cache is called a lot if there
are no more pages to evict. This causes churn in the logfiles, and qemu
cannot release more pages anyway since the last command.
Fix this by remembering the current number of paged-out gfns. If this
number has not changed since the last flush command, sending another
flush command will not free any more gfns.
Remove return code from xenpaging_mem_paging_flush_ioemu_cache() since
errors do not matter, and will be handled elsewhere. Also failure to
send the flush command is not fatal.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: no poll timeout while page-out is in progress
The main loop calls xenpaging_wait_for_event_or_timeout() unconditionally
before doing any work. This function calls poll() with a timeout of 100ms. As
a result the page-out process is very slow due to the delay in poll().
Call poll() without timeout so that it returns immediately until the page-out
is done. Page-out is done when either the policy finds no more pages to
nominate or when the requested number of pages is reached.
The condition is cleared when a watch event arrives, so that processing the
new target is not delayed once again by poll().
v2:
- no poll timeout also when large number of evicts is pending
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 20:18:44 +0000 (21:18 +0100)]
xenpaging: use flat index for pagefile and page-in requests
This change is based on an idea by <hongkaixing@huawei.com> and
<bicky.shi@huawei.com>.
Scanning the victims[] array is time consuming with a large number of
target pages. Replace the loop to find the slot in the pagefile which
holds the requested gfn with an index.
Remove the victims array and replace it with a flat array. This array
holds the gfn for a given slot in the pagefile. Adjust all users of the
victims array.
Rename variable in main() from i to slot to clarify the meaning.
Update xenpaging_evict_page() to pass a pointer to xen_pfn_t to
xc_map_foreign_pages().
Update policy_choose_victim() to return either a gfn or INVALID_MFN.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
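To illustrate the flat index from the commit above, the following sketch keeps a slot_to_gfn[] array alongside a reverse lookup. The reverse array name gfn_to_slot, the sizes and the sentinel values are assumptions made for the example, not details taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4                   /* hypothetical pagefile size */
#define NUM_GFNS  16                  /* hypothetical guest size in pages */
#define NO_GFN    ((unsigned long)-1) /* marks an empty slot */
#define NO_SLOT   (-1)                /* marks a gfn not in the pagefile */

struct pagefile_index {
    unsigned long slot_to_gfn[NUM_SLOTS]; /* gfn held by each slot */
    int gfn_to_slot[NUM_GFNS];            /* hypothetical reverse index */
};

static void index_init(struct pagefile_index *ix)
{
    assert(ix != NULL);
    for (size_t s = 0; s < NUM_SLOTS; s++)
        ix->slot_to_gfn[s] = NO_GFN;
    for (size_t g = 0; g < NUM_GFNS; g++)
        ix->gfn_to_slot[g] = NO_SLOT;
}

/* Record that gfn was written to the given pagefile slot. */
static void index_store(struct pagefile_index *ix, int slot, unsigned long gfn)
{
    assert(slot >= 0 && slot < NUM_SLOTS && gfn < NUM_GFNS);
    ix->slot_to_gfn[slot] = gfn;
    ix->gfn_to_slot[gfn] = slot;
}

/* Find the slot for a page-in request without a loop; NO_SLOT if absent. */
static int index_find_slot(const struct pagefile_index *ix, unsigned long gfn)
{
    assert(gfn < NUM_GFNS);
    return ix->gfn_to_slot[gfn];
}

/* Forget a gfn after page-in and return the slot it occupied. */
static int index_release(struct pagefile_index *ix, unsigned long gfn)
{
    int slot = index_find_slot(ix, gfn);
    if (slot != NO_SLOT) {
        ix->slot_to_gfn[slot] = NO_GFN;
        ix->gfn_to_slot[gfn] = NO_SLOT;
    }
    return slot;
}
```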
Christoph Egger [Tue, 21 Feb 2012 16:44:15 +0000 (16:44 +0000)]
libxl: add missing includes
include <poll.h> for struct pollfd
include <sys/time.h> for struct timeval
Fixes gcc complaints about implicit declaration.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 18:58:07 +0000 (18:58 +0000)]
tools/hotplug: remove 4 from default runlevel in xencommons
LSB defines runlevel 4 as "reserved for local use, default is
normal/full multiuser"
The current behaviour of insserv in openSuSE 11.4 and SLES11SP2 is that
xencommons gets a symlink in /etc/init.d/rc4.d/ due to the 4 in the
Default-Start: line. As a result insserv will print a warning:
insserv: warning: current stop runlevel(s) (2 3 5) of script `xencommons' overwrites defaults (2 3 4 5).
Since the local admin is responsible for creating all symlinks manually in
/etc/init.d/rc4.d/, the xencommons script should not automatically enable
itself in runlevel 4.
So, remove the 4 from Default-Start: line.
Note: This change will not automatically remove old/stale xencommons
symlinks in /etc/init.d/rc4.d/ during a package upgrade.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 20 Feb 2012 18:48:32 +0000 (18:48 +0000)]
minios: Remove unused variables warnings
s/DEBUG/printk/ in test_xenbus and all associated do_*_test+xenbus_dbg_message
and always print the IRQ and MFN used by the xenbus on init.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: John McDermott <john.mcdermott@nrl.navy.mil> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Mon, 20 Feb 2012 18:45:29 +0000 (18:45 +0000)]
libxl: Set VNC password through QMP
This patch provides the code to set the VNC password in upstream QEMU through
QMP. The password is still stored in xenstore but will not be used by upstream
QEMU.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Campbell <ian.campbell.com> Committed-by: Ian Jackson <ian.jackson.citrix.com>
remus: libcheckpoint - initialize unused callback fields to NULL
Add a memset to the save_callbacks struct instance in libcheckpoint's
initialization code. New additions to the callback struct will not
need to add an explicit initialization (to NULL), to maintain
compatibility with older xend/remus based invocation of xc_domain_save.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
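The zero-initialisation pattern from the remus commit above might look roughly like this. The struct demo_callbacks and its fields are invented for the example and do not mirror the real save_callbacks layout; the sketch also assumes, as holds on the platforms Xen targets, that all-zero bytes represent a null pointer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback table; later_addition stands for a field that
 * was added after the caller was written. */
struct demo_callbacks {
    int (*suspend)(void *data);
    int (*checkpoint)(void *data);
    int (*later_addition)(void *data);
    void *data;
};

static int demo_suspend(void *data)
{
    (void)data;
    return 1;
}

/* Clear the whole struct first, then set only the fields in use, so
 * any field the caller does not know about stays NULL. */
static void demo_callbacks_init(struct demo_callbacks *cb)
{
    assert(cb != NULL);
    memset(cb, 0, sizeof(*cb));
    cb->suspend = demo_suspend;
}
```

With this shape, adding a new callback field does not require touching older callers: they keep passing NULL for it.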
Ian Jackson [Mon, 20 Feb 2012 18:29:31 +0000 (18:29 +0000)]
oxenstored: Fix spelling of "persistent" config variable
Change "persistant" to "persistent", in the code and the
example/default config.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Fabio Fantoni [Mon, 20 Feb 2012 18:05:12 +0000 (18:05 +0000)]
tools/examples: Add the xl configuration examples to the makefile
Signed-off-by: Fabio Fantoni <fabio.fantoni@heliman.it> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Mon, 20 Feb 2012 17:53:33 +0000 (17:53 +0000)]
libxl_qmp: Handle unexpected end-of-socket
When read() returns 0, the current code just tries again. But this
leads to an infinite loop if QEMU died too soon.
Also, retry select if a signal was caught.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
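The two behaviours named in the libxl_qmp commit above (treat a zero-length read as end of stream, retry after a signal) can be sketched as below. The commit applies the signal retry to select(); this example shows the same EINTR pattern on read() for brevity, and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read once, retrying only when a signal interrupted the call.
 * Returns the number of bytes read, 0 when the peer has closed the
 * stream (the caller must stop, not retry), or -1 on a real error. */
static ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t r;

    assert(buf != NULL);
    do {
        r = read(fd, buf, len);
    } while (r < 0 && errno == EINTR);
    return r;
}
```

A caller that loops on this function should leave the loop when it returns 0, which is what prevents the infinite loop when QEMU has already exited.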
Roger Pau Monne [Thu, 19 Jan 2012 10:17:35 +0000 (11:17 +0100)]
libxc/NetBSD: return ioctl return value on error
The NetBSD libxc hypercall implementation was returning -errno on error,
instead of the actual return value from ioctl. Returning the ioctl value is
easier to understand, and the caller can always check errno.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Campbell <ian.campbell.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Reported-by: Olaf Hering <olaf@aepfle.de>
Roger Pau Monne [Thu, 19 Jan 2012 10:21:10 +0000 (11:21 +0100)]
libxc: add comment to why NetBSD return hypercall->retval
Added a comment that explains why NetBSD returns hypercall->retval on
success.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Campbell <ian.campbell.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Reported-by: Olaf Hering <olaf@aepfle.de>
Allen Kay [Mon, 20 Feb 2012 16:46:27 +0000 (16:46 +0000)]
libxl: Fix yajl-related build error due to missing error value
Some versions of yajl lack yajl_gen_no_buf.
Signed-off-by: Allen Kay <allen.m.kay@intel.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 20 Feb 2012 16:11:38 +0000 (16:11 +0000)]
xenpaging: mmap guest pages read-only
xenpaging does not write to the gfn, so map the gfn to page-out in
read-only mode.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Kaixing Hong [Fri, 17 Feb 2012 07:14:25 +0000 (15:14 +0800)]
x86/mm: Make sure the event channel is released accurately
In the xenpaging source code, there is interdomain communication between dom0
and domU. In mem_event_enable(), the function alloc_unbound_xen_event_channel()
allocates a free port for domU, which is then bound to dom0.
When xenpaging tears down, it only frees dom0's event channel port with
xc_evtchn_unbind(), leaving domU's port still occupied.
So this patch frees domU's port when xenpaging exits.
The interdomain event channel needs to be freed in two steps: first free the
domU port, which leaves the dom0 port unbound, then free the dom0 port.
Signed-off-by: Kaixing Hong <hongkaixing@huawei.com> Signed-off-by: Zhen Shi <bicky.shi@huawei.com> Acked-by: Olaf Hering <olaf@aepfle.de> Committed-by: Tim Deegan <tim@xen.org>
x86/mm: Fix more ballooning+paging and ballooning+sharing bugs
If the guest balloons away a page that has been nominated for paging but
not yet paged out, we now:
- Send EVICT_FAIL flag in the event to the pager
- Do not leak the underlying page
If the page was shared, we were not:
- properly refreshing the mfn to balloon after the unshare.
- unlocking the p2m on the error exit case
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 16 Feb 2012 07:48:23 +0000 (08:48 +0100)]
replace bogus gdprintk() uses with {,d}printk()
When the subject domain is not the current one (e.g. during domctl or
HVM save/restore handling), use of gdprintk() is questionable at best,
as it won't give the intended information on what domain is affected.
Use plain printk() or dprintk() instead, but keep things (mostly) as
guest messages by using XENLOG_G_*.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 15 Feb 2012 11:04:44 +0000 (12:04 +0100)]
x86: don't allow Dom0 to map MSI-X table writably
With the traditional qemu tree fixed to not use PROT_WRITE anymore in
the mmap() call for this region, and with the upstream qemu tree not
being capable of handling passthrough, yet, there's no need to treat
Dom0 specially here anymore.
This continues to leave unaddressed the case where PV guests map the
MSI-X table page(s) before setting up the first MSI-X interrupt (see
the original c/s 22182:68cc3c514a0a description for options).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Mon, 13 Feb 2012 17:26:08 +0000 (17:26 +0000)]
arm: fixup hard tabs
Unfortunately the tool I was using to apply patches mangles hard tabs. This
patch corrects this in the affected files (which is fortunately only a subset
of .S or files imported from Linux).
This commit fixes this error such that the tree represents the state it would
have been in had I correctly committed what I was sent.
"git diff" and "git diff -b" vs. Stefano's v6 branch now contain the same
output -- i.e. only the intervening development
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Christoph Egger [Mon, 13 Feb 2012 18:17:28 +0000 (18:17 +0000)]
tools: make qemu build use correct PYTHON version
Pass --python=$(PYTHON) to qemu's configure.
Fixes error:
Python not found. Use --python=/path/to/python
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Stefan Bader [Mon, 13 Feb 2012 17:45:13 +0000 (17:45 +0000)]
xl: Add defaultbridge config option for xl.conf
Currently guests created with the xl stack will have "xenbr0"
written as their default into xenstore. It can be changed in
the individual guest config files, but there is no way to
have that default globally changed.
Add a config option to xl.conf that allows setting a different
default bridge name.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 13 Feb 2012 17:29:50 +0000 (17:29 +0000)]
blktap2/libvhd: Build shared objects using -fPIC.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 13 Feb 2012 16:57:53 +0000 (16:57 +0000)]
xl: Add -F to usage for xl shutdown/reboot
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Zhigang Wang [Mon, 13 Feb 2012 16:56:12 +0000 (16:56 +0000)]
xl: remove duplicate line
Signed-off-by: Zhigang Wang <zhigang.x.wang@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
David Vrabel [Mon, 13 Feb 2012 14:24:49 +0000 (14:24 +0000)]
arm: map device tree blob in initial page tables
Add a mapping for the device tree blob in the initial page tables.
This will allow the DTB to be parsed for memory information prior to
setting up the real page tables.
It is mapped into the first L2 slot after the fixmap. When this slot
is reused in setup_pagetables(), flush the TLB.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
David Vrabel [Mon, 13 Feb 2012 14:24:48 +0000 (14:24 +0000)]
arm: link a device tree blob into the xen image
Link a device tree blob (DTB) into the xen image. This is loaded
immediately after Xen and xen_start() is called with the correct
address in atag_paddr.
The DTB file must be supplied by setting the CONFIG_DTB_FILE variable
in .config or on the make command line.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
A few missing #defines cause a compile failure with
XEN_TARGET_ARM=arm and XEN_COMPILE_ARM=arm (for example in the case of a
native compilation). This patch fills the gaps.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Julian Pidancet [Mon, 13 Feb 2012 12:50:46 +0000 (12:50 +0000)]
firmware: Introduce CONFIG_ROMBIOS and CONFIG_SEABIOS options
This patch introduces configuration options that allow building either a
rombios-only or a seabios-only hvmloader.
Building option ROMs like vgabios or etherboot is only enabled for a
rombios hvmloader, since SeaBIOS takes care of extracting option ROMs
itself from the PCI devices (these option ROMs are provided by the
device model and do not need to be built in hvmloader).
The Makefile in tools/firmware/ now only checks for bcc if rombios is
enabled.
These two configuration options are left on by default to remain
compatible.
Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julian Pidancet [Mon, 13 Feb 2012 12:50:04 +0000 (12:50 +0000)]
hvmloader: Move option ROM loading into a separate optionnal file
Make the load_rom field in struct bios_config an optional callback rather
than a boolean value. This allows BIOS-specific code to implement its
own option ROM loading methods.
Facilities to scan PCI devices and to extract and deploy ROMs are moved into
a separate file that can be compiled optionally.
Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julian Pidancet [Mon, 13 Feb 2012 12:49:06 +0000 (12:49 +0000)]
firmware: Use mkhex from hvmloader directory for etherboot ROMs
To remain consistent with how other ROMs are built into hvmloader,
call mkhex on etherboot ROMs from the hvmloader directory, instead of
the etherboot directory. In other words, eb-roms.h is not used any
more.
Introduce ETHERBOOT_NICS config option to choose which ROMs should be
built (kept rtl8139 and 8086100e per default as before).
Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julian Pidancet [Mon, 13 Feb 2012 12:48:20 +0000 (12:48 +0000)]
hvmloader: Allow the mkhex command to take several file arguments
Signed-off-by: Julian Pidancet <julian.pidancet@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Mon, 13 Feb 2012 12:12:30 +0000 (13:12 +0100)]
x86/vMCE: MC{G,i}_CTL handling adjustments
- g_mcg_cap was read to determine whether MCG_CTL exists before it got
initialized
- h_mci_ctrl[] and dom_vmce()->mci_ctl[] both got initialized via
memset() with an inappropriate size (hence causing a [minor?]
information leak)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 13 Feb 2012 12:09:02 +0000 (13:09 +0100)]
x86/paging: use clear_guest() for zero-filling guest buffers
While static arrays of all zeros may be tolerable (but are simply
inefficient now that we have the necessary infrastructure), using on-
stack arrays for this purpose (particularly when their size doesn't
have an upper limit enforced) is calling for eventual problems (even
if the code can be reached via administrative interfaces only).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Remove costly mem_sharing audits from the inline path, and instead make them
callable as a memop.
Have the audit function return the number of errors detected.
Update memshrtool to be able to trigger audits.
Set sharing audits as enabled by default.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Tim Deegan <tim@xen.org>
Use memops for mem paging, sharing, and access, instead of domctls
Per page operations in the paging, sharing, and access tracking subsystems are
all implemented with domctls (e.g. a domctl to evict one page, or to share one
page).
Under heavy load, the domctl path reveals a lack of scalability. The domctl
lock serializes dom0's vcpus in the hypervisor. When performing thousands of
per-page operations on dozens of domains, these vcpus will spin in the
hypervisor. Beyond the aggressive locking, an added inefficiency of blocking
vcpus in the domctl lock is that dom0 is prevented from re-scheduling any of
its other work-starved processes.
We retain the domctl interface for setting up and tearing down
paging/sharing/mem access for a domain. But we migrate all the per page
operations to use the memory_op hypercalls (e.g. XENMEM_*).
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Tim Deegan <tim@xen.org>
When calling get_gfn multiple times on different gfn's in the same function, we
can easily deadlock if p2m lookups are locked. Thus, refactor these calls to
enforce simple deadlock-avoidance rules:
- Lowest-numbered domain first
- Lowest-numbered gfn first
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavila.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
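The two deadlock-avoidance rules in the commit above (lowest-numbered domain first, then lowest-numbered gfn) amount to a fixed lock order. A minimal sketch of that ordering follows; struct gfn_ref and both function names are hypothetical, and the actual get_gfn locking is not shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference to one gfn in one domain. */
struct gfn_ref {
    unsigned int domain_id;
    unsigned long gfn;
};

/* True if a must be locked before b: lowest-numbered domain first,
 * then lowest-numbered gfn within the same domain. */
static bool locks_before(const struct gfn_ref *a, const struct gfn_ref *b)
{
    if (a->domain_id != b->domain_id)
        return a->domain_id < b->domain_id;
    return a->gfn < b->gfn;
}

/* Swap the two references if needed so that *first is locked first. */
static void order_for_locking(const struct gfn_ref **first,
                              const struct gfn_ref **second)
{
    assert(first && second && *first && *second);
    if (locks_before(*second, *first)) {
        const struct gfn_ref *tmp = *first;
        *first = *second;
        *second = tmp;
    }
}
```

Because every path takes the two locks in the same order, two paths can never each hold the lock the other is waiting for.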
Re-order calls to put_gfn() around wait queue invocations
Since we use wait queues to handle potential ring congestion cases,
code paths that try to generate a mem event while holding a gfn lock
would go to sleep in non-preemptible mode.
Most such code paths can be fixed by simply postponing event generation until
locks are released.
Signed-off-by: Adin Scannell <adin@scannell.ca> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
The PoD layer has a complex locking discipline. It relies on the
p2m being globally locked, and it also relies on the page alloc
lock to protect some of its data structures. Replace this all by an
explicit pod lock: per p2m, order enforced.
Three consequences:
- Critical sections in the pod code protected by the page alloc
lock are now reduced to modifications of the domain page list.
- When the p2m lock becomes fine-grained, there are no
assumptions broken in the PoD layer.
- The locking is easier to understand.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Tim Deegan <tim@xen.org>
x86/mm: Clean up locking now that p2m lookups are fully synchronized
With p2m lookups fully synchronized, many routines need not
call p2m_lock any longer. Also, many routines can logically
assert holding the p2m for a specific gfn.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
x86/mm: Make p2m lookups fully synchronized wrt modifications
We achieve this by locking/unlocking the global p2m_lock in get/put_gfn.
The lock is always taken recursively, as there are many paths that
call get_gfn, and later, make another attempt at grabbing the p2m_lock.
The lock is not taken for shadow lookups. We believe there are no problems
remaining for synchronized p2m+shadow paging, but we are not enabling this
combination due to lack of testing. Unlocked shadow p2m accesses are tolerable as
long as shadows do not gain support for paging or sharing.
HAP (EPT) lookups and all modifications do take the lock.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>