Jan Beulich [Tue, 22 Jan 2013 08:33:10 +0000 (09:33 +0100)]
x86: restore (optional) forwarding of PCI SERR induced NMI to Dom0
c/s 22949:54fe1011f86b removed the forwarding of NMIs to Dom0 when they
were caused by PCI SERR. NMI buttons as well as BMCs (like HP's iLO)
may however want such events to be seen in Dom0 (e.g. to trigger a
dump).
Therefore restore most of the functionality which named c/s removed
(adjusted for subsequent changes, and adjusting the public interface to
use the modern term, retaining the old one for backwards
compatibility).
Ian Campbell [Mon, 21 Jan 2013 16:04:56 +0000 (16:04 +0000)]
vtpmmgr: fix build on 32-bit
Correct format string, fixing:
vtpm_storage.c: In function 'vtpm_storage_load_header': vtpm_storage.c:658: error: format '%ld' expects type 'long int', but argument 5 has type 'unsigned int'
vtpm_storage.c:658: error: format '%ld' expects type 'long int', but argument 5 has type 'unsigned int' make[2]: *** [vtpm_storage.o] Error 1
Add padlock.o to PSSL_OBJS, fixing:
/local/scratch/ianc/devel/xen-unstable.git/stubdom/mini-os-x86_32-vtpmmgr/mini-os.o: In function `aes_crypt_ecb': /local/scratch/ianc/devel/xen-unstable.git/stubdom/polarssl-x86_32/library/aes.c:659: undefined reference to `padlock_supports'
/local/scratch/ianc/devel/xen-unstable.git/stubdom/polarssl-x86_32/library/aes.c:661: undefined reference to `padlock_xcryptecb' /local/scratch/ianc/devel/xen-unstable.git/stubdom/mini-os-x86_32-vtpmmgr/mini-os.o: In function `aes_crypt_cbc': /local/scratch/ianc/devel/xen-unstable.git/stubdom/polarssl-x86_32/library/aes.c:771: undefined reference to `padlock_supports'
/local/scratch/ianc/devel/xen-unstable.git/stubdom/polarssl-x86_32/library/aes.c:773: undefined reference to `padlock_xcryptcbc'
make[1]: ***
[/local/scratch/ianc/devel/xen-unstable.git/stubdom/mini-os-x86_32-vtpmmgr/mini-os]
Error 1
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked by: Matthew Fioravante <matthew.fioravante@jhuapl.edu>
[ ijc -- applied same fix to stubdom/vtpm/Makefile ] Committed-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: flush dcache after memcpy'ing the kernel image
After memcpy'ing the kernel in guest memory we need to flush the dcache
to make sure that the data actually reaches the memory before we start
executing guest code with caches disabled.
copy_from_paddr is the function that does the copy, so add a
flush_xen_dcache_va_range there.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Mon, 21 Jan 2013 12:40:31 +0000 (12:40 +0000)]
arm: use module provided command line for domain 0 command line
Fallback to xen,dom0-bootargs if this isn't present.
Ideally this would use module1-args iff the kernel came from the
modules and the existing xen,dom0-bootargs if the kernel came from
flash, but this approach is simpler and has the same effect in
practice.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Mon, 21 Jan 2013 12:40:27 +0000 (12:40 +0000)]
arm: avoid placing Xen over any modules.
This will still fail if the modules are such that Xen is pushed out of
the top 32M of RAM since it will then overlap with the domheap (or
possibly xenheap). This will be dealt with later.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Mon, 21 Jan 2013 12:40:26 +0000 (12:40 +0000)]
xen: arm: introduce concept of modules which can be in RAM at start of day
The intention is that these will eventually be filled in with
information from the bootloader, perhaps via a DTB binding.
Allow for 2 modules (kernel and initrd), plus a third pseudo-module
which is the hypervisor itself. Currently we neither parse nor do
anything with them.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Tim Deegan [Fri, 18 Jan 2013 11:31:57 +0000 (12:31 +0100)]
x86/hvm: fix RTC setting.
When the guest writes one field of the RTC time, we must bring all the
other fields up to date for the current second before calculating the
new RTC time.
Signed-off-by: Tim Deegan <tim@xen.org> Tested-by: Phil Evans <Phil.Evans@m247.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Boris Ostrovsky [Fri, 18 Jan 2013 11:20:58 +0000 (12:20 +0100)]
x86/AMD: Enable WC+ memory type on family 10 processors
In some cases BIOS may not enable WC+ memory type on family 10 processors,
instead converting what would be WC+ memory to CD type. On guests using
nested pages this could result in performance degradation. This patch
enables WC+.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Add conditional build of subsystems to configure.ac
The toplevel Makefile still works without running configure
and will default build everything
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Stub domains now use autoconf to build.
This configure script can enable or disable specific domains
and also specify custom download locations for stubdom library
packages. See ./configure --help for details.
C and Caml are disabled by default. vtpm-stubdom is conditional
on the presense of cmake.
Rename vtpmmgrdom to vtpmmgr-stubdom
Also update .*ignore
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
vtpm/vtpmmgr and required libs to stubdom/Makefile
Add 3 new libraries to stubdom:
libgmp
polarssl
Berlios TPM Emulator 0.7.4
Add makefile structure for vtpm and vtpmmgrdom. Both
vtpm domains are optional builds as vtpm depends on
cmake. To build either of them, you must do so explicitly.
make vtpm-stubdom vtpmmgrdom
Finally, also update .*ignore
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ijc, folded in fix from Matthew to workaround cmake 2.8.2 build failure] Committed-by: Ian Campbell <ian.campbell@citrix.com>
Add the code base for vtpmmgrdom. Makefile changes
next patch.
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Add the code base for vtpm-stubdom to the stubdom
heirarchy. Makefile changes in later patch.
Signed-off-by: Matthew Fioravante <matthew.fioravante@jhuapl.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Paolo Bonzini [Fri, 18 Jan 2013 10:35:11 +0000 (11:35 +0100)]
x86: find a better location for the real-mode trampoline
On some machines, the location at 0x40e does not point to the beginning
of the EBDA. Rather, it points to the beginning of the BIOS-reserved
area of the EBDA, while the option ROMs place their data below that
segment.
For this reason, 0x413 is actually a better source than 0x40e to get
the location of the real-mode trampoline. Xen was already using it
as a second source, and this patch keeps that working. However, just
in case, let's also fetch the information from the multiboot structure,
where the boot loader should have placed it. This way we don't
necessarily trust one of the BIOS or the multiboot loader more than
the other.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Retain the previous code, thus using the multiboot value only if it's
sane but lower than the BDA computed one. Also use the full 32-bit
mem_lower value and prefer MBI_MEMLIMITS over open coding it (requiring
a slight adjustment to multiboot.h to make its constants actually
usable in assembly code, which previously they were only meant to be).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Jan Beulich <jbeulich@suse.com>
xen/arm: initialize the GIC irq properties of interrupts routed to guests
We are currently initializing GIC irq properties (ITARGETSR, IPRIORITYR,
and GICD_ICFGR) only in gic_route_irq, that is not called for guest
interrupts at all.
Move the initialization into a separate function
(gic_set_irq_properties) and call it from gic_route_irq_to_guest.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Save and restore the virtual timer registers during the context switch.
At save time initialize an internal Xen timer to make sure that Xen
schedules the guest vcpu at the time of the next virtual timer
interrupt.
Receive the virtual timer interrupt into the hypervisor and inject it
into the running guest.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 17 Jan 2013 16:48:23 +0000 (16:48 +0000)]
xen: return a per-mapping error from XENMEM_add_to_physmap_range.
Since ARM and PVH dom0 kernel use this to map foreign domain pages
they could in the future hit paged out or shared pages etc and
therefore need to propagate which frames are -ENOENT and which failed
for some other reason.
We have not yet released a version of Xen with this particular
hypercall subop so we can change the interface without worrying about
compatibility (I think/hope).
This would be used by the privcmd driver, in particular it relates to
Mats' patch "improve performance of MMAPBATCH_V2."
NB I have only implemented the ARM side since the PVH side isn't in
tree yet.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Mats Petersson <mats.petersson@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Lars Rasmusson [Thu, 17 Jan 2013 16:48:22 +0000 (16:48 +0000)]
xen: arm: Correct register values and comment in early init_uart.
Set register values and comment in early init_uart to match
documentation of PL011 UART
Reading the PL011 UART documentation on
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf
in sec 3.2 shows the early initialisation of the UART on the Versatile Express
is incorrect. This fixes it.
Signed-off-by: Lars Rasmusson <Lars.Rasmusson@sics.se> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Daniel De Graaf [Thu, 17 Jan 2013 16:48:21 +0000 (16:48 +0000)]
libxl: correct xenstore permissions on console device
When the console is connected to a domain other than dom0, the console
device's backend field should be set so the xenstore permissions for the
console device reflect the domain that will be accessing them.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Daniel De Graaf [Thu, 17 Jan 2013 16:48:21 +0000 (16:48 +0000)]
xenconsoled: use grant references instead of map_foreign_range
Grant references for the xenstore and xenconsole shared pages exist, but
currently only xenstore uses these references. Change the xenconsole
daemon to prefer using the grant reference over map_foreign_range when
mapping the shared console ring.
This allows xenconsoled to be run in a domain other than dom0 if set up
correctly - for libxl, the xenstore path /tool/xenconsoled/domid
specifies the domain containing xenconsoled.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 17 Jan 2013 13:53:14 +0000 (13:53 +0000)]
tools: Update to SeaBIOS 1.7.1
Only lightly tested with a Linux HVM guest PXE boot.
Accept the defaults for the config options. Many of them are not
relevant to Xen but this matches what others (at least the Debian
SeaBIOS packages and the binary shipped by Qemu) are doing. The
Debian Xen packages are built against Debian's SeaBIOS package so
there is value in being similar.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Thu, 17 Jan 2013 13:53:09 +0000 (13:53 +0000)]
libxl: don't continue to create the domain if the device model is not spawned
When the device model can't be spawned, rc variable is cleared in
device_model_spawn_outcome (libxl_dm.c).
In this case libxl will continue to create the domain and let it between life
and death.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
If we pass 0 as pygrub --entry argument (i.e. we want to boot first item), default value is used instead. This is dueto wrong check for range of allowed values of index - 0 is index of first item.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com> Acked-by: Matt Wilson <msw@amazon.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Thu, 17 Jan 2013 13:53:03 +0000 (13:53 +0000)]
Switch from select() to poll() in xenconsoled's IO loop
In Linux select() typically supports up to 1024 file descriptors. This can be
a problem when user tries to boot up many guests. Switching to poll() has
minimum impact on existing code and has better scalibility.
pollfd array is dynamically allocated / reallocated. If the array fails to
expand, we just ignore the incoming fd.
Updated: reset *_pollfd after use.
This fixes regression 14869.
Also remove unused slave_pollfd in strcut domain.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Razvan Cojocaru [Thu, 17 Jan 2013 12:27:00 +0000 (12:27 +0000)]
mem_event: Add support for MEM_EVENT_REASON_MSR
Add the new MEM_EVENT_REASON_MSR event type. Works similarly
to the other register events, except event.gla always contains
the MSR address (in addition to event.gfn, which holds the value).
MEM_EVENT_REASON_MSR does not honour the HVMPME_onchangeonly bit,
as doing so would complicate the hvm_msr_write_intercept()
switch-based handling of writes for different MSR addresses,
with little added benefit.
Signed-off-by: Razvan Cojocaru <rzvncj@gmail.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Robert Phillips [Thu, 17 Jan 2013 11:53:42 +0000 (11:53 +0000)]
x86/mm: Provide support for multiple frame buffers in HVM guests.
Support is provided for both shadow and hardware assisted paging (HAP)
modes. This code bookkeeps the set of video frame buffers (vram),
detects when the guest has modified any of those buffers and, upon request,
returns a bitmap of the modified pages.
This lets other software components re-paint the portions of the monitor
(or monitors) that have changed.
Each monitor has a frame buffer of some size at some position
in guest physical memory.
The set of frame buffers being tracked can change over time as monitors
are plugged and unplugged.
Signed-off-by: Robert Phillips <robert.phillips@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Removed a stray #include and a few hard tabs.
Signed-off-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 17 Jan 2013 09:56:34 +0000 (10:56 +0100)]
miscellaneous cleanup
... noticed while putting together the 16Tb support patches for x86.
Briefly, this (in order of the changes below)
- fixes an inefficiency in x86's context switch code (translations to/
from struct page are more involved than to/from MFNs)
- drop unnecessary MFM-to-page conversions
- drop a redundant call to destroy_xen_mappings() (an indentical call
is being made a few lines up)
- simplify a VA-to-MFN translation
- drop dead code (several occurrences)
- add a missing __init annotation
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Thu, 17 Jan 2013 09:55:00 +0000 (10:55 +0100)]
x86/EFI: retrieve PCI ROM contents not accessible through BARs
Linux 3.8-rc added code to do this, so we need to support this in the
hypervisor for Dom0 to be able to get at the same information as a
native kernel.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 16 Jan 2013 12:56:55 +0000 (13:56 +0100)]
x86: consistently mask floating point exceptions
c/s 23142:f5e8d152a565 resulted in v->arch.fpu_ctxt to point into the
save area allocated for xsave/xrstor (when they're available). The way
vcpu_restore_fpu_lazy() works (using fpu_init() for an uninitialized
vCPU only when there's no xsave support) causes this to load whatever
arch_set_info_guest() put there, irrespective of whether the i387 state
was specified to be valid in the respective input structure.
Consequently, with a cleared (al zeroes) incoming FPU context, and with
xsave available, one gets all exceptions unmasked (as opposed to to the
legacy case, where FINIT and LDMXCSR get used, masking all exceptions).
This causes e.g. para-virtualized NetWare to crash.
The behavior of arch_set_info_guest() is thus being made more hardware-
like for the FPU portion of it: Considering it to be similar to INIT,
it will leave untouched all floating point state now. An alternative
would be to make the behavior RESET-like, forcing all state to known
values, albeit - taking into account legacy behavior - not to precisely
the values RESET would enforce (which masks only SSE exceptions, but
not x87 ones); that would come closest to mimicing FINIT behavior in
the xsave case. Another option would be to continue copying whatever
was provided, but override (at least) FCW and MXCSR if VGCF_I387_VALID
isn't set.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Zhang Xiantao [Tue, 15 Jan 2013 10:33:41 +0000 (11:33 +0100)]
nEPT: Expose EPT & VPID capablities to L1 VMM
Expose EPT's and VPID 's basic features to L1 VMM.
For EPT, no EPT A/D bit feature supported.
For VPID, exposes all features to L1 VMM
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:30:50 +0000 (11:30 +0100)]
nVMX: virutalize VPID capability to nested VMM
Virtualize VPID for the nested vmm, use host's VPID
to emualte guest's VPID. For each virtual vmentry, if
guest'v vpid is changed, allocate a new host VPID for
L2 guest.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:29:41 +0000 (11:29 +0100)]
nEPT: handle invept instruction from L1 VMM
Add the INVEPT instruction emulation logic.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:28:23 +0000 (11:28 +0100)]
nEPT: Use minimal permission for nested p2m
Emulate permission check for the nested p2m. Current solution is to
use minimal permission, and once meet permission violation in L0, then
determin whether it is caused by guest EPT or host EPT
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:23:05 +0000 (11:23 +0100)]
nEPT: Sync PDPTR fields if L2 guest in PAE paging mode
For PAE L2 guest, GUEST_DPPTR registers needs to be synced for each virtual
vmentry.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:18:46 +0000 (11:18 +0100)]
nEPT: Try to enable EPT paging for L2 guest
Once found EPT is enabled by L1 VMM, enabled nested EPT support
for L2 guest.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:17:01 +0000 (11:17 +0100)]
EPT: Make ept data structure or operations neutral
Share the current EPT logic with nested EPT case, so
make the related data structure or operations netural
to comment EPT and nested EPT.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:15:29 +0000 (11:15 +0100)]
nested_ept: Implement guest ept's walker
Implment guest EPT PT walker, some logic is based on shadow's
ia32e PT walker. During the PT walking, if the target pages are
not in memory, use RETRY mechanism and get a chance to let the
target page back.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:11:37 +0000 (11:11 +0100)]
nestedhap: Change nested p2m's walker to vendor-specific
EPT and NPT adopts differnt formats for each-level entry,
so change the walker functions to vendor-specific.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Zhang Xiantao [Tue, 15 Jan 2013 10:09:33 +0000 (11:09 +0100)]
nestedhap: Change hostcr3 and p2m->cr3 to meaningful words
VMX doesn't have the concept about host cr3 for nested p2m,
and only SVM has, so change it to netural words.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Dongxiao Xu [Fri, 11 Jan 2013 16:30:56 +0000 (17:30 +0100)]
nested vmx: fix CR0/CR4 emulation
While emulate CR0 and CR4 for nested virtualization, set the CR0/CR4
guest host mask to 0xffffffff in shadow VMCS, then calculate the
corresponding read shadow separately for CR0 and CR4. While getting
the VM exit for CR0/CR4 access, check if L1 VMM owns the bit. If so,
inject the VM exit to L1 VMM. Otherwise, L0 will handle it and sync
the value to L1 virtual VMCS.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Fri, 11 Jan 2013 12:22:30 +0000 (12:22 +0000)]
Switch from select() to poll() in xenconsoled's IO loop
In Linux select() typically supports up to 1024 file descriptors. This can be
a problem when user tries to boot up many guests. Switching to poll() has
minimum impact on existing code and has better scalibility.
pollfd array is dynamically allocated / reallocated. If the array fails to
expand, we just ignore the incoming fd.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Some renaming to correct the PCI and SBDF terminology.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Fri, 11 Jan 2013 12:22:29 +0000 (12:22 +0000)]
tools/ocaml: libxc bindings: Fix SBDF encoding
Changeset 23861:ec7c81fbe0de alters the SBDF encoding expected by the
DOMCTL_{de,}assign_device hypercalls.
While it updates libxl, libxc and the python bindings, the ocaml
bindings got missed. As a result, any attempt to use PCI Passthrough
with Xen-4.2 and later will fail.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
fix wrong path while calling pygrub and libxl-save-helper
in current xen x86_64, the default libexec directory is /usr/lib/xen/bin,
while the private binder is /usr/lib64/xen/bin. but some commands(pygrub,
libxl-save-helper) located in private binder directory is called from
libexec directory which lead to the following error:
1, for pygrub bootloader:
libxl: debug: libxl_bootloader.c:429:bootloader_disk_attached_cb: /usr/lib/xen/bin/pygrub doesn't exist, falling back to config path
2, for libxl-save-helper:
libxl: cannot execute /usr/lib/xen/bin/libxl-save-helper: No such file or directory
libxl: error: libxl_utils.c:363:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 3 save/restore helper stdout pipe
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: domain 3 save/restore helper [10222] exited with error status 255
there are two ways to fix above error. the first one is make such command
store in the /usr/lib/xen/bin and /usr/lib64/xen/bin(symbol link to
previous), e.g. qemu-dm. The second way is using private binder dir
instead of libexec dir. e.g. xenconsole.
For these cases, the latter one is suitable.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 11 Jan 2013 12:22:27 +0000 (12:22 +0000)]
libxc: x86: ensure that the initial mapping fits into the guest's memory
In particular we need to check that adding 512KB of slack and
rounding up to a 4MB boundary do not overflow the guest's memory
allocation. Otherwise we run off the end of the p2m when building the
guest's initial page tables and populate them with garbage.
Wei noticed this when build tiny (2MB) mini-os domains.
Reported-by: Wei Liu <Wei.Liu2@citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Jim Fehlig [Fri, 11 Jan 2013 12:22:26 +0000 (12:22 +0000)]
libxl: Set vfb and vkb devid if not done so by the caller
Other devices set a sensible devid if the caller has not done so.
Do the same for vfb and vkb. While at it, factor out the common code
used to determine a sensible devid, so it can be used by other
libxl__device_*_add functions.
Signed-off-by: Jim Fehlig <jfehlig@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Daniel De Graaf [Fri, 11 Jan 2013 10:49:49 +0000 (10:49 +0000)]
xsm/flask: document the access vectors
This adds comments to the FLASK access_vectors file describing what
operations each access vector controls and the meanings of the source
and target fields in the permission check. This also makes the
indentation of the file consistent; no functionality changes are made.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:46:43 +0000 (10:46 +0000)]
tmem: add XSM hooks
This adds a pair of XSM hooks for tmem operations: xsm_tmem_op which
controls any use of tmem, and xsm_tmem_control which allows use of the
TMEM_CONTROL operations. By default, all domains can use tmem while
only IS_PRIV domains can use control operations.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:44:01 +0000 (10:44 +0000)]
xen/xsm: Add xsm_default parameter to XSM hooks
Include the default XSM hook action as the first argument of the hook
to facilitate quick understanding of how the call site is expected to
be used (dom0-only, arbitrary guest, or device model). This argument
does not solely define how a given hook is interpreted, since any
changes to the hook's default action need to be made identically to
all callers of a hook (if there are multiple callers; most hooks only
have one), and may also require changing the arguments of the hook.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:43:02 +0000 (10:43 +0000)]
xen: platform_hypercall XSM hook removal
A number of the platform_hypercall XSM hooks have no parameters or
only pass the operation ID, making them redundant with the
xsm_platform_op hook. Remove these redundant hooks.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:42:30 +0000 (10:42 +0000)]
xen: sysctl XSM hook removal
A number of the sysctl XSM hooks have no parameters or only pass the
operation ID, making them redundant with the xsm_sysctl hook. Remove
these redundant hooks.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:41:51 +0000 (10:41 +0000)]
xen: domctl XSM hook removal
A number of the domctl XSM hooks do nothing except pass the domain and
operation ID, making them redundant with the xsm_domctl hook. Remove
these redundant hooks.
The remaining domctls all use individual hooks because they pass extra
details of the call to the XSM module in order to allow a more
fine-grained access decision to be made - for example, considering the
exact device or memory range being set up for guest access.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:39:58 +0000 (10:39 +0000)]
arch/x86: use XSM hooks for get_pg_owner access checks
There are three callers of get_pg_owner:
* do_mmuext_op, which does not have XSM hooks on all subfunctions
* do_mmu_update, which has hooks that are inefficient
* do_update_va_mapping_otherdomain, which has a simple XSM hook
In order to preserve return values for the do_mmuext_op hypercall, an
additional XSM hook is required to check the operation even for those
subfunctions that do not use the pg_owner field. This also covers the
MMUEXT_UNPIN_TABLE operation which did previously have an XSM hook.
The XSM hooks in do_mmu_update were capable of replacing the checks in
get_pg_owner; however, the hooks are buried in the inner loop of the
function - not very good for performance when XSM is enabled and these
turn in to indirect function calls. This patch removes the PTE from
the hooks and replaces it with a bitfield describing what accesses are
being requested. The XSM hook can then be called only when additional
bits are set instead of once per iteration of the loop.
This patch results in a change in the FLASK permissions used for
mapping an MMIO page: the target for the permisison check on the
memory mapping is no longer resolved to the device-specific type, and
is instead either the domain's own type or domio_t (depending on if
the domain uses DOMID_SELF or DOMID_IO in the map
command). Device-specific access is still controlled via the "resource
use" permisison checked at domain creation (or device hotplug).
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:39:20 +0000 (10:39 +0000)]
arch/x86: Add missing mem_sharing XSM hooks
This patch adds splits up the mem_sharing and mem_event XSM hooks to
better cover what the code is doing. It also changes the utility
function get_mem_event_op_target to rcu_lock_live_remote_domain_by_id
because there is no mm-specific logic in there.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:38:39 +0000 (10:38 +0000)]
xsm/flask: add distinct SIDs for self/target access
Because the FLASK XSM module no longer checks IS_PRIV for remote
domain accesses covered by XSM permissions, domains now have the
ability to perform memory management and other functions on all
domains that have the same type. While it is possible to prevent this
by only creating one domain per type, this solution significantly
limits the flexibility of the type system.
This patch introduces a domain type transition to represent a domain
that is operating on itself. In the example policy, this is
demonstrated by creating a type with _self appended when declaring a
domain type which will be used for reflexive operations. AVCs for a
domain of type domU_t will look like the following:
This change also allows policy to distinguish between event channels a
domain creates to itself and event channels created between domains of
the same type.
The IS_PRIV_FOR check used for device model domains is also no longer
checked by FLASK; a similar transition is performed when the target is
set and used when the device model accesses its target domain.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:37:10 +0000 (10:37 +0000)]
xsm/flask: Add checks on the domain performing the set_target operation
The existing domain__set_target check only verifies that the source
and target domains can be associated. We also need to check that the
privileged domain making this association is allowed to do so.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:09:45 +0000 (10:09 +0000)]
xen: convert do_domctl to use XSM
The xsm_domctl hook now covers every domctl, in addition to the more
fine-grained XSM hooks in most sub-functions. This also removes the
need to special-case XEN_DOMCTL_getdomaininfo.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:07:19 +0000 (10:07 +0000)]
xen: avoid calling rcu_lock_*target_domain when an XSM hook exists
The rcu_lock_{,remote_}target_domain_by_id functions are wrappers
around an IS_PRIV_FOR check for the current domain. This is now
redundant with XSM hooks, so replace these calls with
rcu_lock_domain_by_any_id or rcu_lock_remote_domain_by_id to remove
the duplicate permission checks.
When XSM_ENABLE is not defined or when the dummy XSM module is used,
this patch should not change any functionality. Because the locations
of privilege checks have sometimes moved below argument validation,
error returns of some functions may change from EPERM to EINVAL when
called with invalid arguments and from a domain without permission to
perform the operation.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Fri, 11 Jan 2013 10:06:43 +0000 (10:06 +0000)]
xen: use XSM instead of IS_PRIV where duplicated
The Xen hypervisor has two basic access control function calls:
IS_PRIV and the xsm_* functions. Most privileged operations currently
require that both checks succeed, and many times the checks are at
different locations in the code. This patch eliminates the explicit
and implicit IS_PRIV checks that are duplicated in XSM hooks.
When XSM_ENABLE is not defined or when the dummy XSM module is used,
this patch should not change any functionality. Because the locations
of privilege checks have sometimes moved below argument validation,
error returns of some functions may change from EPERM to EINVAL or
ESRCH if called with invalid arguments and from a domain without
permission to perform the operation.
Some checks are removed due to non-obvious duplicates in their
callers:
* acpi_enter_sleep is checked in XENPF_enter_acpi_sleep
* map_domain_pirq has IS_PRIV_FOR checked in its callers:
* physdev_map_pirq checks when acquiring the RCU lock
* ioapic_guest_write is checked in PHYSDEVOP_apic_write
* PHYSDEVOP_{manage_pci_add,manage_pci_add_ext,pci_device_add} are
checked by xsm_resource_plug_pci in pci_add_device
* PHYSDEVOP_manage_pci_remove is checked by xsm_resource_unplug_pci
in pci_remove_device
* PHYSDEVOP_{restore_msi,restore_msi_ext} are checked by
xsm_resource_setup_pci in pci_restore_msi_state
* do_console_io has changed to IS_PRIV from an explicit domid==0
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Thu, 10 Jan 2013 17:32:10 +0000 (17:32 +0000)]
arch/x86: add distinct XSM hooks for map/unmap
The xsm_iomem_permission and xsm_ioport_permission hooks are intended
to be called by the domain builder, while the calls in
arch/x86/domctl.c which control mapping are also performed by the
device model. Because these operations require distinct access
control policies, they cannot use the same XSM hooks.
This also adds a missing XSM hook in the unbind IRQ domctl.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Thu, 10 Jan 2013 17:30:47 +0000 (17:30 +0000)]
flask: move policy headers into hypervisor
Rather than keeping around headers that are autogenerated in order to
avoid adding build dependencies from xen/ to files in tools/, move the
relevant parts of the FLASK policy into the hypervisor tree and
generate the headers as part of the hypervisor's build.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Thu, 10 Jan 2013 17:27:58 +0000 (17:27 +0000)]
xsm: Use the dummy XSM module if XSM is disabled
This patch moves the implementation of the dummy XSM module to a
header file that provides inline functions when XSM_ENABLE is not
defined. This reduces duplication between the dummy module and callers
when the implementation of the dummy return is not just "return 0",
and also provides better compile-time checking for completeness of the
XSM implementations in the dummy module.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Keir Fraser <keir@xen.org>
Ross Philipson [Thu, 10 Jan 2013 17:18:43 +0000 (17:18 +0000)]
HVM firmware passthrough ACPI processing
ACPI table passthrough support allowing additional static tables and
SSDTs (AML code) to be loaded. These additional tables are added at
the end of the secondary table list in the RSDT/XSDT tables.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
Ross Philipson [Thu, 10 Jan 2013 17:18:10 +0000 (17:18 +0000)]
HVM firmware passthrough SMBIOS processing
Passthrough support for the SMBIOS structures including three new DMTF
defined types and support for OEM defined tables. Passed in SMBIOS
types override the default internal values. Default values can be
enabled for the new type 22 portable battery using a xenstore
flag. All other new DMTF defined and OEM structures will only be added
to the SMBIOS table if passthrough values are present.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
Ross Philipson [Thu, 10 Jan 2013 17:17:21 +0000 (17:17 +0000)]
HVM firmware passthrough control tools support
Xen control tools support for loading the firmware passthrough blocks
during domain construction. SMBIOS and ACPI blocks are passed in using
the new xc_hvm_build_args structure. Each block is read and loaded
into the new domain address space behind the HVMLOADER image. The base
address for the two blocks is returned as an out parameter to the
caller via the args structure.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
Ross Philipson [Thu, 10 Jan 2013 17:16:28 +0000 (17:16 +0000)]
HVM xenstore strings and firmware passthrough header
Add public HVM definitions header for xenstore strings used in
HVMLOADER. In addition this header describes the use of the firmware
passthrough values set using xenstore.
Signed-off-by: Ross Philipson <ross.philipson@citrix.com> Committed-by: Keir Fraser <keir@xen.org>