Dario Faggioli [Mon, 9 Oct 2017 11:23:24 +0000 (13:23 +0200)]
RCU: make the period of the idle timer configurable
Make it possible for the user to specify, with the boot
time parameter rcu-idle-timer-period-ms, how frequently
a CPU that went idle with pending RCU callbacks should be
woken up to check if the grace period ended.
Typical values (i.e., some of the values used by Linux as
the tick frequency) are 10, 4 or 1 ms. The default value (used
when this parameter is not specified) is 10 ms. The maximum is
100 ms.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
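To illustrate the bounds described above, here is a minimal sketch of
how such a parameter might be sanitised. The function and macro names
are hypothetical, and falling back to the default for out-of-range
values is an assumption; the commit message only states the default
and the maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the commit message; the names are illustrative. */
#define IDLE_TIMER_PERIOD_DEFAULT_MS 10u
#define IDLE_TIMER_PERIOD_MAX_MS     100u

/*
 * Turn the requested period (0 meaning "not specified") into
 * nanoseconds. Assumption: out-of-range requests use the default.
 */
uint64_t idle_timer_period_ns(unsigned int requested_ms)
{
    unsigned int ms = requested_ms;

    if ( ms == 0 || ms > IDLE_TIMER_PERIOD_MAX_MS )
        ms = IDLE_TIMER_PERIOD_DEFAULT_MS;

    assert(ms >= 1 && ms <= IDLE_TIMER_PERIOD_MAX_MS);
    return (uint64_t)ms * 1000000u;
}
```

With this sketch, a request of 4 yields 4000000 ns, while 0 or 250
both yield the 10 ms default.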
Dario Faggioli [Mon, 9 Oct 2017 11:22:07 +0000 (13:22 +0200)]
RCU: let the RCU idle timer handler run
If stop_timer() is called between when the RCU
idle timer's interrupt arrives (and TIMER_SOFTIRQ is
raised) and when softirqs are checked and handled, the
timer is deactivated, and the handler never runs.
This happens to the RCU idle timer because stop_timer()
is called on it during the wakeup from idle (e.g., C-states,
on x86) path.
To fix that, we avoid calling stop_timer(), in case we see
that the timer itself is:
- still active,
- expired (i.e., its expiry time is in the past).
In fact, that indicates (for this particular timer) that
it has fired, and we are just about to handle the TIMER_SOFTIRQ
(which will perform the timer deactivation and run its handler).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
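The check described above can be sketched roughly as follows. The
struct and function names are hypothetical stand-ins rather than the
real timer API, and treating "expiry equal to now" as expired is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a timer. */
struct sketch_timer {
    bool active;      /* queued and not yet deactivated */
    int64_t expires;  /* absolute expiry time, in ns */
};

/*
 * Decide whether the wakeup-from-idle path should stop the timer.
 * If the timer is still active but its expiry is already in the
 * past, it has fired and TIMER_SOFTIRQ is about to run its handler,
 * so it must be left alone.
 */
bool should_stop_idle_timer(const struct sketch_timer *t, int64_t now)
{
    assert(t != NULL);
    return !(t->active && t->expires <= now);
}
```

A timer that is active and not yet expired is still stopped as
before; only the "fired but not yet handled" case is skipped.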
xen/arm: p2m: Read *_mapped_gfn with the p2m lock taken
*_mapped_gfn are currently read before acquiring the lock. However, they
may be modified by the p2m code before the lock was acquired. This means
we will use the wrong values.
Fix it by moving the read inside the section protected by the p2m lock.
Dario Faggioli [Fri, 6 Oct 2017 16:02:34 +0000 (18:02 +0200)]
MAINTAINERS: update entries to Dario's new email address
Replace, in the 'M:' fields of the components I co-maintain
('CPU POOLS', 'SCHEDULING' and 'RTDS SCHEDULER'), the Citrix
email, to which I don't have access any longer, with my
personal email.
Awais Masood [Fri, 6 Oct 2017 16:01:50 +0000 (18:01 +0200)]
ns16550: fix ISR lockup on Allwinner uart
This patch fixes an ISR lockup seen on Allwinner uart.
On Allwinner H5, the serial driver goes into an infinite loop
when interrupts are enabled. The reason is a residual
"busy detect" interrupt. Since the condition UART_IIR_NOINT
will not be true unless this interrupt is cleared, the
interrupt handler will remain locked up in this while loop.
The driver checks for a busy condition during setup and clears the
condition by reading the UART_USR register.
On Allwinner hardware, the "busy detect" condition occurs
later because an LCR write is performed during setup 'after'
this clear and if uart is busy, the "busy detect" condition
will trigger again and cause the ISR lockup.
To solve this problem, the same UART_USR read operation needs
to be performed within the interrupt handler to clear this
condition.
The Linux dw 8250 driver also handles this condition within
its interrupt handler:
http://elixir.free-electrons.com/linux/latest/source/drivers/tty/serial/8250/8250_dw.c#L233
Tested on Orange Pi PC2 (H5). This issue is seen on H3
as well and the same fix works.
Signed-off-by: Awais Masood <awais.masood@vadion.com> Acked-by: Jan Beulich <jbeulich@suse.com>
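The lockup and its fix can be illustrated with a toy device model.
Everything below is hypothetical: it deliberately avoids the real
register encodings and only models the fact that a pending "busy
detect" condition is cleared by a USR read and by nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device model; it does not use the real register encodings. */
struct toy_uart {
    bool busy_detect_pending;  /* cleared only by reading USR */
    unsigned int rx_pending;   /* cleared by normal RX handling */
};

bool toy_irq_pending(const struct toy_uart *u)
{
    return u->busy_detect_pending || u->rx_pending > 0;
}

void toy_read_usr(struct toy_uart *u)
{
    u->busy_detect_pending = false;
}

/*
 * Run the handler loop for at most max_iters rounds.
 * Returns true if the loop terminated on its own, false if an
 * interrupt was still pending when the iteration cap was reached.
 */
bool toy_isr(struct toy_uart *u, bool clear_busy, unsigned int max_iters)
{
    assert(u != NULL);

    for ( unsigned int i = 0; i < max_iters; i++ )
    {
        if ( !toy_irq_pending(u) )
            return true;

        if ( u->rx_pending > 0 )
            u->rx_pending--;      /* ordinary RX servicing */

        if ( clear_busy && u->busy_detect_pending )
            toy_read_usr(u);      /* the fix: clear "busy detect" */
    }

    return !toy_irq_pending(u);
}
```

Without the USR read inside the loop, the pending condition never
goes away and the loop only ends because of the artificial cap.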
Julien Grall [Thu, 5 Oct 2017 17:42:16 +0000 (18:42 +0100)]
xen/x86: mem_sharing: Use copy_domain_page in __mem_sharing_unshare_page
The function __mem_sharing_unshare_page contains an open-coded version of
copy_domain_page. Use the function to simplify the code a bit.
At the same time replace _mfn(__page_to_mfn(...)) by page_to_mfn(...)
given that the file already provides a typesafe version of page_to_mfn.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:21:04 +0000 (16:21 +0100)]
x86/np2m: add break to np2m_flush_eptp()
Now that np2m sharing is implemented, there can be only one np2m object
with the same np2m_base. Break from loop if the required np2m was found
during np2m_flush_eptp().
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:21:03 +0000 (16:21 +0100)]
x86/np2m: refactor p2m_get_nestedp2m_locked()
Remove some code duplication.
Suggested-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:21:02 +0000 (16:21 +0100)]
x86/np2m: implement sharing of np2m between vCPUs
At the moment, nested p2ms are not shared between vcpus even if they
share the same base pointer.
Modify p2m_get_nestedp2m() to allow sharing a np2m between multiple
vcpus with the same np2m_base (L1 np2m_base value in VMCx12).
If the current np2m doesn't match the current base pointer, first look
for another nested p2m in the same domain with the same base pointer,
before reclaiming one from the LRU.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
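The lookup order described above might be sketched like this. The
pool layout, field names and LRU bookkeeping are all illustrative
assumptions, not the actual p2m_get_nestedp2m() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_NP2M 4   /* illustrative pool size */

struct sketch_np2m {
    uint64_t base;           /* np2m_base this object shadows */
    int valid;               /* 0 if unused */
    unsigned int last_used;  /* larger means more recently used */
};

/*
 * Pick the np2m for a vCPU whose base pointer is 'base':
 *  1. keep the current one if it already matches;
 *  2. otherwise share any valid np2m with the same base;
 *  3. otherwise reclaim the least recently used one.
 */
struct sketch_np2m *get_np2m(struct sketch_np2m pool[NR_NP2M],
                             struct sketch_np2m *cur, uint64_t base)
{
    struct sketch_np2m *lru = &pool[0];

    assert(pool != NULL);

    if ( cur && cur->valid && cur->base == base )
        return cur;

    for ( size_t i = 0; i < NR_NP2M; i++ )
    {
        if ( pool[i].valid && pool[i].base == base )
            return &pool[i];
        if ( pool[i].last_used < lru->last_used )
            lru = &pool[i];
    }

    /* Nothing to share: recycle the LRU entry for the new base. */
    lru->base = base;
    lru->valid = 1;
    return lru;
}
```

Two vCPUs asking for the same base thus end up with the same object
instead of each reclaiming one from the LRU.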
Sergey Dyasli [Tue, 3 Oct 2017 15:21:01 +0000 (16:21 +0100)]
x86/np2m: send flush IPIs only when a vcpu is actively using an np2m
Flush IPIs are sent to all cpus in an np2m's dirty_cpumask when
updated. This mask however is far too broad. A pcpu's bit is set in
the cpumask when a vcpu runs on that pcpu, but is only cleared when a
flush happens. This means that the IPI includes the current pcpu of
vcpus that are not currently running, and also includes any pcpu that
has ever had a vcpu use this p2m since the last flush (which in turn
will cause spurious invalidations if a different vcpu is using an np2m).
Avoid these IPIs by keeping closer track of where an np2m is being used,
and when a vcpu needs to be flushed:
- On schedule-out, clear v->processor in p2m->dirty_cpumask
- Add a 'generation' counter to the p2m and nestedvcpu structs to
detect changes that would require re-loads on re-entry
- On schedule-in or p2m change:
- Set v->processor in p2m->dirty_cpumask
- flush the vcpu's nested p2m pointer (and update nv->generation) if
the generation changed
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
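The bookkeeping listed above can be sketched as follows. This is a
simplified model under stated assumptions: the dirty mask is a plain
64-bit word, and all struct, field and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_p2m {
    uint64_t generation;   /* bumped whenever the p2m is flushed */
    uint64_t dirty_mask;   /* one bit per pCPU actively using it */
};

struct sketch_vcpu {
    unsigned int processor;    /* pCPU the vCPU runs on (< 64 here) */
    uint64_t seen_generation;  /* generation seen at last (re)load */
};

/* On schedule-out, stop advertising this pCPU as a user. */
void sketch_sched_out(struct sketch_vcpu *v, struct sketch_p2m *p2m)
{
    assert(v->processor < 64);
    p2m->dirty_mask &= ~(UINT64_C(1) << v->processor);
}

/* Returns true if the vCPU must reload its nested p2m pointer. */
bool sketch_sched_in(struct sketch_vcpu *v, struct sketch_p2m *p2m)
{
    assert(v->processor < 64);
    p2m->dirty_mask |= UINT64_C(1) << v->processor;

    if ( v->seen_generation == p2m->generation )
        return false;

    v->seen_generation = p2m->generation;
    return true;
}

/* A flush only needs to reach pCPUs whose bit is currently set. */
uint64_t sketch_flush(struct sketch_p2m *p2m)
{
    p2m->generation++;
    return p2m->dirty_mask;   /* set of IPI targets */
}
```

A vCPU that was scheduled out during a flush receives no IPI; it
notices the changed generation the next time it is scheduled in.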
Sergey Dyasli [Tue, 3 Oct 2017 15:21:00 +0000 (16:21 +0100)]
x86/vvmx: make updating shadow EPTP value more efficient
At the moment, the shadow EPTP value is written unconditionally in
ept_handle_violation().
Instead, write the value on vmentry to the guest; but only write it if
the value needs updating.
To detect this, add a flag to the nestedvcpu struct, stale_np2m, to
indicate when such an action is necessary. Set it when the nested p2m
changes or when the np2m is flushed by an IPI, and clear it when we
write the new value.
Since an IPI invalidating the p2m may happen between
nvmx_switch_guest() and vmx_vmenter, but we can't perform the vmwrite
with interrupts disabled, check the flag just before entering the
guest and restart the vmentry if it's set.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
There is a possibility for nested_p2m to become stale between
nestedhvm_hap_nested_page_fault() and nestedhap_fix_p2m(). At the moment
this is handled by detecting such a race inside nestedhap_fix_p2m() and
special-casing it.
Instead, introduce p2m_get_nestedp2m_locked(), which will return a
still-locked p2m. This allows us to call nestedhap_fix_p2m() with the
lock held and remove the code detecting the special-case.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:20:58 +0000 (16:20 +0100)]
x86/np2m: remove np2m_base from p2m_get_nestedp2m()
Remove np2m_base parameter as it should always match the value of
np2m_base in VMCx12.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:20:57 +0000 (16:20 +0100)]
x86/np2m: flush all np2m objects on nested INVEPT
At the moment, nvmx_handle_invept() updates the current np2m just to
flush it. Instead introduce a function, np2m_flush_base(), which will
look up the np2m base pointer and call p2m_flush_table() instead.
Unfortunately, since we don't know which p2m a given vcpu is using, we
must flush all p2ms that share that base pointer.
Convert p2m_flush_table() into p2m_flush_table_locked() in order not
to release the p2m_lock after np2m_base check.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Sergey Dyasli [Tue, 3 Oct 2017 15:20:56 +0000 (16:20 +0100)]
x86/np2m: refactor p2m_get_nestedp2m()
1. Add a helper function assign_np2m()
2. Remove useless volatile
3. Update function's comment in the header
4. Minor style fixes ('\n' and d)
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Wei Liu [Thu, 5 Oct 2017 09:35:28 +0000 (10:35 +0100)]
libxl: use correct type modifier for vuart_gfn
Fixes compilation error like:
libxl_console.c: In function ‘libxl__device_vuart_add’:
libxl_console.c:379:5: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘xen_pfn_t’ [-Werror=format=]
flexarray_append(ro_front, GCSPRINTF("%lu", state->vuart_gfn));
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Tested-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
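The general pattern for avoiding this class of error is to pair a
fixed-width type with its matching format macro. The sketch below
uses a stand-in typedef and macro; both names are hypothetical, and
the 64-bit width is an assumption made for the example only.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for xen_pfn_t; assumed to be 64 bits wide here. */
typedef uint64_t sketch_pfn_t;
#define PRI_sketch_pfn PRIu64

/* Format a frame number without assuming it is 'unsigned long'. */
int format_gfn(char *buf, size_t len, sketch_pfn_t gfn)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRI_sketch_pfn, gfn);
}
```

Because the macro travels with the typedef, the format string stays
correct on ABIs where 'unsigned long' is only 32 bits wide.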
livepatch: Expand check for safe_for_reapply if livepatch has only .rodata.
If the livepatch has only .rodata sections then it is OK to also
apply/revert/apply the livepatch without having to worry about the
unforeseen consequences.
Ross Lagerwall [Wed, 28 Jun 2017 16:13:44 +0000 (17:13 +0100)]
livepatch: Declare live patching as a supported feature
See docs/features/livepatch.pandoc for the details.
Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the .bug.frames.X or .livepatch.funcs sizes are different
than what the hypervisor expects - we fail the payload. To help
in diagnosing this include the expected and the payload
sizes.
Also make it more natural by having "Multiples" in the warning.
Also fix one case where we would fail if the size of the .ex_table
was zero - but that is OK.
Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The ELF specification mentions nothing about the sh_size being
modulo the sh_addralign. Only that sh_addr MUST be aligned on
sh_addralign if sh_addralign is not zero or one.
On loading we did not take this into account, so this patch adds
a check on the ELF file as it is being parsed.
Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
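The alignment rule quoted above amounts to a small predicate. A
minimal sketch, with a hypothetical function name, operating on the
two section header fields as plain integers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Check a section header's address against its alignment.
 * An sh_addralign of 0 or 1 means "no alignment constraint".
 */
bool sh_addr_is_aligned(uint64_t sh_addr, uint64_t sh_addralign)
{
    if ( sh_addralign <= 1 )
        return true;

    return (sh_addr % sh_addralign) == 0;
}
```

Note that nothing here looks at sh_size, which is the point of the
commit message: only sh_addr is constrained by sh_addralign.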
Joao Martins [Tue, 3 Oct 2017 17:46:08 +0000 (18:46 +0100)]
public/io/netif.h: add gref mapping control messages
Adds 3 messages to allow guest to let backend keep grants mapped,
such that 1) guests allowing fast recycling of pages can avoid doing
grant ops for those cases, or otherwise 2) preferring copies over
grants and 3) always using a fixed set of pages for network I/O.
The three control ring messages added are:
- Add grefs to be mapped by backend
- Remove grefs mappings (If they are not in use)
- Get maximum amount of grefs kept mapped.
Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Juergen Gross [Wed, 4 Oct 2017 12:24:13 +0000 (14:24 +0200)]
gnttab: make resource limits per domain
Instead of using the same global resource limits of grant tables (max.
number of grant frames, max. number of maptrack frames) for all domains,
make these limits per domain. Set those per-domain limits in
grant_table_set_limits(). The global settings are serving as an upper
boundary now which must not be exceeded by a per-domain value. The
default of max_grant_frames is set to the maximum default xl will use.
While updating the semantics of the boot parameters remove the
documentation of the no longer existing gnttab_max_nr_frames and
correct the default gnttab_max_maptrack_frames uses.
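The relationship between the global and the per-domain limits could
be sketched as below. The struct, the variable names, the numeric
ceilings and the choice of -EINVAL are all assumptions made for
illustration; only the "global value is an upper bound" rule comes
from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Global upper bounds (boot parameters); values here are made up. */
static unsigned int sketch_max_grant_frames = 64;
static unsigned int sketch_max_maptrack_frames = 1024;

struct sketch_grant_table {
    unsigned int max_grant_frames;
    unsigned int max_maptrack_frames;
};

/* Set per-domain limits, refusing values above the global ceiling. */
int sketch_set_limits(struct sketch_grant_table *gt,
                      unsigned int grant_frames,
                      unsigned int maptrack_frames)
{
    assert(gt != NULL);

    if ( grant_frames > sketch_max_grant_frames ||
         maptrack_frames > sketch_max_maptrack_frames )
        return -EINVAL;

    gt->max_grant_frames = grant_frames;
    gt->max_maptrack_frames = maptrack_frames;
    return 0;
}
```

A rejected request leaves the domain's previous limits untouched.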
Segment bases (and limits) aren't being cleared by the loading of a nul
selector into a segment register on AMD CPUs. Therefore, if an
outgoing vCPU has a non-zero base in FS or GS and the subsequent
incoming vCPU has a non-zero but nul selector in the respective
register(s), the selector value(s) would be loaded without clearing the
segment base(s) in the hidden register portion.
Since the ABI states "zero" in its description of the fs and gs fields,
it is worth noting that the chosen approach to fix this alters the
written down ABI. I consider this preferable over enforcing the
previously written down behavior, as nul selectors are far more likely
to be what was meant from the beginning.
The adjustments also eliminate an inconsistency between FS and GS
handling: Old code had an extra pointless (gs_base_user was always zero
when DIRTY_GS was set) conditional for GS. The old bitkeeper changeset
has no explanation for this asymmetry.
Additionally for DS and ES a flat selector is being loaded prior to the
loading of a nul one on AMD CPUs, just as a precautionary measure
(we're not currently aware of ways for a guest to deduce the base of a
segment register which has a nul selector loaded).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen/arm: vpl011: Add a pl011 uart DT node in the guest device tree
The SBSA UART node format is as specified in
Documentation/devicetree/bindings/serial/arm_sbsa_uart.txt and given below:
ARM SBSA defined generic UART
------------------------------
This UART uses a subset of the PL011 registers and consequently lives
in the PL011 driver. Its baudrate and other communication parameters
cannot be adjusted at runtime, so it lacks a clock specifier here.
Required properties:
- compatible: must be "arm,sbsa-uart"
- reg: exactly one register range
- interrupts: exactly one interrupt specifier
- current-speed: the (fixed) baud rate set by the firmware
Currently the baud rate of 115200 has been selected as a default value,
which is one of the valid baud rate settings. Higher baud rate was
selected since an emulated pl011 can support any valid baud rate without
any limitation of the hardware.
xen/arm: vpl011: Add support for multiple consoles in xenconsole
This patch adds the support for multiple consoles and introduces the
iterator functions to operate on multiple consoles.
The functions called by the iterators check that they are operating
on valid I/O parameters. This ensures that if a particular console is
not initialized then the functions will not do anything for that
console type.
This patch is in preparation to support a new vuart console.
Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com>
xen/arm: vpl011: Add a new handle_console_ring function in xenconsole
This patch introduces a new handle_console_ring function. This function
reads the data from the ring buffer on receiving an event.
The initialization of event channel poll fd to -1 is moved inside the
handle_console_ring function as they are related. There should be no
change in the behavior as there is no functional change.
xen/arm: vpl011: Modify xenconsole functions to take console structure as input
Xenconsole functions take domain structure as input. These functions shall be
modified to take console structure as input since these functions typically perform
console specific operations.
Also the console specific functions starting with prefix "domain_" shall be modified
to "console_" to indicate that these are console specific functions.
This patch is in preparation to support multiple consoles to support vuart console.
xen/arm: vpl011: Modify xenconsole to define and use a new console structure
Xenconsole uses a domain structure which contains console specific fields. This
patch defines a new console structure, which would be used by the xenconsole
functions to perform console specific operations like reading/writing data from/to
the console ring buffer or reading/writing data from/to console tty.
This patch is in preparation to support multiple consoles to support vuart console.
xen/arm: vpl011: Add a new domctl API to initialize vpl011
Add a new domctl API to initialize vpl011. It takes the GFN and console
backend domid as input and returns an event channel to be used for
sending and receiving events from Xen.
Xen will communicate with xenconsole using GFN as the ring buffer and
the event channel to transmit and receive pl011 data on the guest domain's
behalf.
Add emulation code to emulate read/write access to pl011 registers
and pl011 interrupts:
- Emulate DR read/write by reading and writing from/to the IN
and OUT ring buffers and raising an event to the backend when
there is data in the OUT ring buffer and injecting an interrupt
to the guest when there is data in the IN ring buffer
- Other registers are related to interrupt management and
essentially control when interrupts are delivered to the guest
This patch implements the SBSA Generic UART which is a subset of ARM
PL011 UART.
The SBSA Generic UART is covered in Appendix B of
https://static.docs.arm.com/den0029/a/Server_Base_System_Architecture_v3_1_ARM_DEN_0029A.pdf
xen/arm: vpl011: Define common ring buffer helper functions in console.h
DEFINE_XEN_FLEX_RING(xencons) defines common helper functions such as
xencons_queued() to tell the current size of the ring buffer,
xencons_mask() to mask off the index, which are useful helper functions.
pl011 emulation code will use these helper functions.
io/console.h includes io/ring.h which defines DEFINE_XEN_FLEX_RING.
In console/daemon/io.c, string.h had to be included before io/console.h
because ring.h uses string functions.
Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
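For readers unfamiliar with this kind of ring helper, the two
operations mentioned above can be sketched for a power-of-two ring
with free-running indices. This is not the expansion of
DEFINE_XEN_FLEX_RING; the names, the fixed size and the index scheme
are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 1024u   /* must be a power of two */

/* Reduce a free-running index to an offset inside the ring. */
uint32_t sketch_mask(uint32_t idx)
{
    return idx & (SKETCH_RING_SIZE - 1);
}

/*
 * Number of bytes produced but not yet consumed. The indices are
 * free-running, so unsigned subtraction handles wrap-around.
 */
uint32_t sketch_queued(uint32_t prod, uint32_t cons)
{
    uint32_t used = prod - cons;

    assert(used <= SKETCH_RING_SIZE);
    return used;
}
```

An emulated UART would typically consult the queued count before
reading from the IN ring or writing to the OUT ring.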
Julien Grall [Mon, 2 Oct 2017 12:59:41 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Rework prototype of p2m_pod_demand_populate
- Switch the return type to bool
- Remove the parameter p2m_query_t q as it is not used
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:40 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Use typesafe gfn for the fields reclaim_single and max_guest
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:39 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Use typesafe gfn in p2m_pod_demand_populate
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:38 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Use typesafe gfn in p2m_pod_zero_check
At the same time make the array gfns const as it is not modified within
the function.
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:37 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Clean-up p2m_pod_zero_check
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:36 +0000 (13:59 +0100)]
xen/x86: p2m-pod: Use typesafe GFN in pod_eager_record
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:35 +0000 (13:59 +0100)]
xen/x86: p2m: Use typesafe GFN in p2m_set_entry
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 12:59:34 +0000 (13:59 +0100)]
xen/x86: p2m: Use typesafe gfn for the P2M callbacks get_entry and set_entry
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Julien Grall [Mon, 2 Oct 2017 15:40:04 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Use typesafe gfn in p2m_pod_decrease_reservation
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Mon, 2 Oct 2017 15:40:03 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Clean-up use of typesafe MFN
Some unboxing/boxing can be avoided by using mfn_add(...) instead.
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
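The boxing/unboxing this series keeps referring to can be shown with
a stand-alone sketch of a typesafe frame number. The sketch_ names
are hypothetical; they only mirror the general shape of such
wrappers, not the exact macros used in the tree.

```c
#include <assert.h>

/*
 * Boxed frame number: a distinct type the compiler will not
 * silently mix up with plain integers.
 */
typedef struct { unsigned long m; } sketch_mfn_t;

static inline sketch_mfn_t sketch_mfn(unsigned long m)
{
    sketch_mfn_t r = { m };
    return r;
}

static inline unsigned long sketch_mfn_x(sketch_mfn_t mfn)
{
    return mfn.m;
}

/* Add an offset without leaving the typesafe domain. */
static inline sketch_mfn_t sketch_mfn_add(sketch_mfn_t mfn,
                                          unsigned long i)
{
    return sketch_mfn(sketch_mfn_x(mfn) + i);
}

/* Before: unbox, add, and box again by hand at the call site. */
sketch_mfn_t next_page_verbose(sketch_mfn_t mfn)
{
    return sketch_mfn(sketch_mfn_x(mfn) + 1);
}

/* After: the helper hides the round trip. */
sketch_mfn_t next_page(sketch_mfn_t mfn)
{
    return sketch_mfn_add(mfn, 1);
}
```

Both variants compute the same value; the second simply keeps the
raw integer out of the caller's sight.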
Julien Grall [Mon, 2 Oct 2017 15:40:02 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Avoid redundant assignments in p2m_pod_demand_populate
gfn_aligned is assigned 3 times with the exact same formula. All the
variables used are not modified, so consolidate in a single assignment
at the beginning of the function.
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Mon, 2 Oct 2017 15:40:01 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Fix coding style
Also take the opportunity to:
- move from 1 << * to 1UL << *.
- use unsigned when possible
- move from unsigned int -> unsigned long for some induction
variables
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
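As background for the move from 1 << * to 1UL << *: the literal 1
has type int, so the shift is performed in int regardless of what
the result is assigned to. A minimal sketch, with a hypothetical
helper name:

```c
#include <assert.h>
#include <limits.h>

/* Number of pages covered by an allocation of the given order. */
unsigned long pages_in_order(unsigned int order)
{
    /*
     * Writing 1 << order would shift an int; where int is 32 bits
     * wide, order 31 already overflows it (undefined behaviour).
     * 1UL makes the shift happen in unsigned long instead.
     */
    assert(order < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << order;
}
```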
Julien Grall [Mon, 2 Oct 2017 15:40:01 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Fix coding style for comments
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Mon, 2 Oct 2017 15:40:00 +0000 (16:40 +0100)]
xen/x86: p2m-pod: Remove trailing whitespaces
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Mon, 2 Oct 2017 15:39:59 +0000 (16:39 +0100)]
xen/x86: p2m-pod: Clean-up includes
A lot of the headers are not necessary. At the same time, sort them
alphabetically.
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Petre Pircalabu [Mon, 2 Oct 2017 15:04:56 +0000 (16:04 +0100)]
x86/monitor: Notify monitor if an emulation fails.
In case of a vm_event with the emulate_flags set, if the instruction
is not implemented by the emulator, the monitor should be notified instead
of directly injecting a hw exception.
This behavior can be used to re-execute an instruction not supported by
the emulator using the real processor (e.g. altp2m) instead of just
crashing.
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Petre Pircalabu [Mon, 2 Oct 2017 15:04:55 +0000 (16:04 +0100)]
x86emul: Add return code information to error messages
- print the return code of the last failed emulator operation
in hvm_dump_emulation_state.
- print the return code in sh_page_fault (SHADOW_PRINTK) to make the
distinction between X86EMUL_UNHANDLEABLE and X86EMUL_UNIMPLEMENTED.
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Petre Pircalabu [Mon, 2 Oct 2017 15:04:54 +0000 (16:04 +0100)]
x86emul: New return code for unimplemented instruction
Enforce the distinction between an instruction not implemented by the
emulator and the failure to emulate that instruction by defining a new
return code, X86EMUL_UNIMPLEMENTED.
This value should only be returned by the core emulator when a valid
opcode is found but the execution logic for that instruction is missing.
It should NOT be returned by any of the x86_emulate_ops callbacks.
e.g. hvm_process_io_intercept should not return X86EMUL_UNIMPLEMENTED.
The return value of this function depends on either the return code of
one of the hvm_io_ops handlers (read/write) or the value returned by
hvm_copy_guest_from_phys / hvm_copy_to_guest_phys.
Similarly, none of these functions should return X86EMUL_UNIMPLEMENTED.
- hvm_io_intercept
- hvmemul_do_io
- hvm_send_buffered_ioreq
- hvm_send_ioreq
- hvm_broadcast_ioreq
- hvmemul_do_io_buffer
- hvmemul_validate
Also the behavior of hvm_emulate_one_insn and vmx_realmode_emulate_one
was modified to generate an Invalid Opcode trap when X86EMUL_UNRECOGNIZED
is returned by the emulator instead of just crashing the domain.
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
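The distinction drawn above can be sketched as a small dispatch on
the emulator's result. The enums and function names are hypothetical
and the numeric values arbitrary. The sketch also assumes that
"unrecognized" and "unimplemented" denote the same condition, and
that an unhandleable result is treated as a hard failure.

```c
#include <assert.h>

/* Illustrative subset of emulator results; values are arbitrary. */
enum sketch_rc {
    SKETCH_OKAY,
    SKETCH_UNHANDLEABLE,   /* emulation was attempted but failed */
    SKETCH_UNIMPLEMENTED   /* valid opcode, no execution logic */
};

enum sketch_action {
    ACTION_CONTINUE,       /* resume the guest */
    ACTION_INJECT_UD,      /* raise an Invalid Opcode trap */
    ACTION_FAIL            /* treat as a hard emulation failure */
};

/* What the caller of the emulator does with the result. */
enum sketch_action after_emulation(enum sketch_rc rc)
{
    switch ( rc )
    {
    case SKETCH_OKAY:
        return ACTION_CONTINUE;
    case SKETCH_UNIMPLEMENTED:
        return ACTION_INJECT_UD;
    case SKETCH_UNHANDLEABLE:
    default:
        return ACTION_FAIL;
    }
}

/* Callbacks must never report "unimplemented"; only the core may. */
int callback_rc_is_valid(enum sketch_rc rc)
{
    return rc != SKETCH_UNIMPLEMENTED;
}
```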
Andrew Cooper [Tue, 26 Sep 2017 16:08:33 +0000 (17:08 +0100)]
x86/svm: Fix a livelock when trying to run shadowed unpaged guests
On AMD processors which support SMEP (some Fam16h processors) and SMAP (Zen,
Fam17h), a guest which is running with shadow paging and clears CR0.PG while
keeping CR4.{SMEP,SMAP} set will livelock, as hardware raises #PF which the
shadow pagetable concludes shouldn't happen.
This occurs because hardware is running with host paging settings, which
causes the guest's choice of SMEP/SMAP to actually take effect, even though
they shouldn't from the guest's point of view.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
xen/arm: Correctly report the memory region in the dummy NUMA helpers
NUMA is currently not supported on Arm. Because common code is
NUMA-aware, dummy helpers are instead provided to expose a single node.
Those helpers are for instance used to know the region to scrub.
However the memory region is not reported correctly. Indeed, the
frametable may not be at the beginning of the memory and there might be
multiple memory banks. As a result, some parts of the memory would
not be scrubbed.
The memory information can be found using:
* first_valid_mfn as the start of the memory
* max_page - first_valid_mfn as the spanned pages
Note that first_valid_mfn is now exported. The prototype has been
added in asm-arm/numa.h and not in a common header because I would
expect the variable to become static once NUMA is fully supported on
Arm.
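The two quantities above map onto single-node helpers roughly as
follows. The struct and helper names are hypothetical; only the
arithmetic (start at first_valid_mfn, span max_page minus
first_valid_mfn) comes from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the two values the helpers rely on. */
struct sketch_mem {
    unsigned long first_valid_mfn;  /* lowest usable frame */
    unsigned long max_page;         /* one past the highest frame */
};

/* Start of the single dummy node. */
unsigned long sketch_node_start_pfn(const struct sketch_mem *m)
{
    assert(m != NULL);
    return m->first_valid_mfn;
}

/* Frames spanned by the node, holes between banks included. */
unsigned long sketch_node_spanned_pages(const struct sketch_mem *m)
{
    assert(m != NULL && m->max_page >= m->first_valid_mfn);
    return m->max_page - m->first_valid_mfn;
}
```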
xen/page_alloc: Cover memory unreserved after boot in first_valid_mfn
On Arm, some regions (e.g. Initramfs, Dom0 Kernel...) are marked as
reserved until the hardware domain is built and they are copied into its
memory. Therefore, they will not be added in the boot allocator via
init_boot_pages.
Instead, init_xenheap_pages will be called once the regions are no
longer used.
Update first_valid_mfn in both init_heap_pages and init_boot_pages
(where the update already exists) to cover all the cases.
This is XSA-245.
Signed-off-by: Julien Grall <julien.grall@arm.com>
[Adjust comment, added locking around first_valid_mfn update] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen/arm: Fix the issue in cmp_mmio_handler used in find_mmio_handler
This patch fixes the wrong range check done in cmp_mmio_handler().
This function returns -1, 0 or 1 based on whether the key value
is below the range, in the range or above the range where the range is
(start, start+size). However, it should check against (start, start+size-1)
because start+size falls outside the range.
This resulted in returning a wrong mmio_handler for a given mmio address which
happened to be start+size.
This bug was introduced when the mmio region search switched from
linear search to binary search in the following commit:
8047e09 "xen/arm: io: Use binary search for mmio handler lookup".
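The off-by-one is easiest to see in a stand-alone comparator used
with the C library's bsearch(). The struct and function names below
are hypothetical; the sketch only reproduces the range logic, with a
region covering addresses addr up to addr + size - 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_handler {
    uint64_t addr;   /* first address covered */
    uint64_t size;   /* number of bytes covered */
};

/*
 * bsearch() comparator: the key is an address, the element a
 * region. Returns -1 below the region, 1 above it, 0 inside it.
 */
int sketch_cmp_region(const void *key, const void *elem)
{
    uint64_t a = *(const uint64_t *)key;
    const struct sketch_handler *h = elem;

    if ( a < h->addr )
        return -1;

    /* Using '>' here would wrongly accept addr + size itself. */
    if ( a >= h->addr + h->size )
        return 1;

    return 0;
}

/* Look up the region containing 'a' in a table sorted by addr. */
const struct sketch_handler *sketch_find(const struct sketch_handler *tbl,
                                         size_t n, uint64_t a)
{
    return bsearch(&a, tbl, n, sizeof(*tbl), sketch_cmp_region);
}
```

With two adjacent regions, the first address of the second region is
exactly start+size of the first, which is where the wrong handler
could be returned.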
xen: fail gnttab_grow_table() in case of missing allocations
In case gnttab_grow_table() is being called without
grant_table_set_limits() having been called for the domain, e.g. in
case of a toolstack error, fail the function instead of crashing the
system.
While at it let gnttab_grow_table() return a proper error code instead
of 1 for success.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Fri, 29 Sep 2017 12:29:21 +0000 (13:29 +0100)]
xen/gnttab: Clean up goto tangle in grant_table_init()
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
It should be possible to re-introduce it in the future with a proper
implementation, in order to create a HVM guest without a device model,
which is slightly different from a PVHv2 guest.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
libxl: remove device model "none" support from disk related functions
CD-ROM backend selection was partially based on the device model, this
is no longer needed since the device model "none" is now removed, so
HVM guests always have a device model.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Remove the device model "none" support from domain creation and
introduce support for PVH.
This requires changing some of the HVM checks to be applied for both
HVM and PVH.
Setting device model to none was never supported since it was an
unstable interface used while transitioning from PVHv1 to PVHv2.
Now that PVHv1 has been finally removed and that a supported
interface for PVHv2 is being added this option is no longer necessary,
hence it's removed.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
The new guest type is introduced to the libxl IDL. libxl__domain_make
is also modified to save the guest type, and libxl__domain_type is
expanded to fetch that information when detecting guest type.
This is required because the hypervisor only differentiates between PV
and HVM guests, so libxl needs some extra information in order to
differentiate between a HVM and a PVH guest.
The new PVH guest type and its options are documented on the xl.cfg
man page.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
The new firmware option aims to provide a coherent way to set the
firmware for the different kind of guests Xen supports.
For PV guests the available firmwares are pvgrub{32|64}, and for HVM
the following are supported: bios, uefi, seabios, rombios and ovmf.
Note that uefi maps to ovmf, and bios maps to the default firmware for
each device model.
The xl.cfg man page is updated to document the new feature.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>