Roger Pau Monne [Fri, 30 Jun 2017 14:39:53 +0000 (15:39 +0100)]
vpci/msix: add MSI-X handlers
Add handlers for accesses to the MSI-X message control field on the
PCI configuration space, and traps for accesses to the memory region
that contains the MSI-X table and PBA. These traps detect attempts from
the guest to configure MSI-X interrupts and properly set them up.
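For illustration only, a minimal sketch of the fields a message control
handler has to decode, following the MSI-X capability layout in the PCI
specification. The macro, struct and function names are made up for this
example and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSI-X message control layout per the PCI spec; names are illustrative. */
#define MSIX_CTRL_ENABLE     0x8000u /* bit 15: MSI-X enable */
#define MSIX_CTRL_FUNC_MASK  0x4000u /* bit 14: function mask */
#define MSIX_CTRL_TABLE_SIZE 0x07ffu /* bits 10:0: table size, encoded as N - 1 */
#define MSIX_ENTRY_SIZE      16u     /* each MSI-X table entry is 16 bytes */

struct msix_ctrl {
    bool enabled;
    bool masked;
    unsigned int entries;
};

/* Split the 16-bit message control word into its fields. */
struct msix_ctrl msix_decode_control(uint16_t ctrl)
{
    struct msix_ctrl c = {
        .enabled = (ctrl & MSIX_CTRL_ENABLE) != 0,
        .masked  = (ctrl & MSIX_CTRL_FUNC_MASK) != 0,
        .entries = (ctrl & MSIX_CTRL_TABLE_SIZE) + 1u,
    };

    return c;
}

/* Map a byte offset inside the MSI-X table to the entry it belongs to. */
unsigned int msix_entry_index(uint32_t table_offset)
{
    return table_offset / MSIX_ENTRY_SIZE;
}
```

A trapped write to the table at offset 0x24 would, under these
assumptions, target entry 2.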
Note that accesses to the Table Offset, Table BIR, PBA Offset and PBA
BIR are not trapped by Xen at the moment.
Finally, turn the panic in the Dom0 PVH builder into a warning.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v3:
- Propagate changes from previous versions: remove xen_ prefix, use
the new fields in vpci_val and remove the return value from
handlers.
- Remove the usage of GENMASK.
- Move the arch-specific parts of the dump routine to the
x86/hvm/vmsi.c dump handler.
- Chain the MSI-X dump handler to the 'M' debug key.
- Fix the header BAR mappings so that the MSI-X regions inside of
BARs are unmapped from the domain p2m in order for the handlers to
work properly.
- Unconditionally trap and forward accesses to the PBA MSI-X area.
- Simplify the conditionals in vpci_msix_control_write.
- Fix vpci_msix_accept to use a bool type.
- Allow all supported accesses as described in the spec to the MSI-X
table.
- Truncate the returned address when the access is a 32b read.
- Always return X86EMUL_OKAY from the handlers, returning ~0 in the
read case if the access is not supported, or ignoring writes.
- Do not check that max_entries is != 0 in the init handler.
- Use trylock in the dump handler.
Changes since v2:
- Split out arch-specific code.
This patch has been tested with devices using both a single MSI-X
entry and multiple ones.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:53 +0000 (15:39 +0100)]
vpci: add a priority parameter to the vPCI register initializer
This is needed for MSI-X, since MSI-X will need to be initialized
before parsing the BARs, so that the header BAR handlers are aware of
the MSI-X related holes and make sure they are not mapped in order for
the trap handlers to work properly.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v3:
- Add a numerical suffix to the section used to store the pointer to
each initializer function, and sort them at link time.
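The patch orders the initializers at link time through numbered sections.
As a rough run-time analogue (not the mechanism used in the patch), the
sketch below sorts a table of initializers by priority before calling
them. All names here are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

typedef int (*init_fn)(void);

struct init_entry {
    unsigned int priority; /* lower values run first */
    init_fn fn;
};

static int cmp_priority(const void *a, const void *b)
{
    const struct init_entry *x = a, *y = b;

    return (x->priority > y->priority) - (x->priority < y->priority);
}

/* Sort the table by priority and call each entry, stopping on error. */
int run_initializers(struct init_entry *tbl, size_t n)
{
    qsort(tbl, n, sizeof(*tbl), cmp_priority);

    for (size_t i = 0; i < n; i++) {
        int rc = tbl[i].fn();

        if (rc)
            return rc;
    }

    return 0;
}

/* Example initializers: MSI-X wants to run before the BARs are parsed. */
int bars_parsed, msix_ran_first;

int init_msix(void) { msix_ran_first = !bars_parsed; return 0; }
int init_bars(void) { bars_parsed = 1; return 0; }

/* Deliberately listed in the "wrong" order; the priority fixes it up. */
struct init_entry example_table[] = {
    { 1, init_bars },
    { 0, init_msix },
};
```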
Roger Pau Monne [Fri, 30 Jun 2017 14:39:52 +0000 (15:39 +0100)]
vpci/msi: add MSI handlers
Add handlers for the MSI control, address, data and mask fields in
order to detect accesses to them and set up the interrupts as requested
by the guest.
Note that the pending register is not trapped, and the guest can
freely read/write to it.
Whether Xen is going to provide this functionality to Dom0 (MSI
emulation) is controlled by the "msi" option in the dom0 field. When
disabling this option Xen will hide the MSI capability structure from
Dom0.
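As a sketch of how the number of vectors can be read out of the MSI
control word: the field layout follows the PCI specification and the
extraction macro follows the same idea as Xen's MASK_EXTR, but the
constant and function names below are illustrative, not the ones used in
the patch.

```c
#include <stdint.h>

/* Extract the field selected by mask m from v (divide by the lowest set bit). */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

#define MSI_FLAGS_QMASK 0x000eu /* bits 3:1: Multiple Message Capable */
#define MSI_FLAGS_QSIZE 0x0070u /* bits 6:4: Multiple Message Enable */

/* Vectors the device can support (log2 encoded in the control word). */
unsigned int msi_max_vectors(uint16_t ctrl)
{
    return 1u << MASK_EXTR(ctrl, MSI_FLAGS_QMASK);
}

/* Vectors the guest has actually enabled. */
unsigned int msi_enabled_vectors(uint16_t ctrl)
{
    return 1u << MASK_EXTR(ctrl, MSI_FLAGS_QSIZE);
}
```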
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com>
---
Changes since v3:
- Propagate changes from previous versions: drop xen_ prefix, drop
return value from handlers, use the new vpci_val fields.
- Use MASK_EXTR.
- Remove the usage of GENMASK.
- Add GFLAGS_SHIFT_DEST_ID and use it in msi_flags.
- Add "arch" to the MSI arch specific functions.
- Move the dumping of vPCI MSI information to dump_msi (key 'M').
- Remove the guest_vectors field.
- Allow the guest to change the number of active vectors without
having to disable and enable MSI.
- Check the number of active vectors when parsing the disable
mask.
- Remove the debug messages from vpci_init_msi.
- Move the arch-specific part of the dump handler to x86/hvm/vmsi.c.
- Use trylock in the dump handler to get the vpci lock.
Changes since v2:
- Add an arch-specific abstraction layer. Note that this is only implemented
for x86 currently.
- Add a wrapper to detect MSI enabling for vPCI.
NB: I've only been able to test this with devices using a single MSI interrupt
and no mask register. I will try to find hardware that supports the mask
register and more than one vector, but I cannot make any promises.
If there are doubts about the untested parts we could always force Xen to
report no per-vector masking support and only 1 available vector, but I would
rather avoid doing it.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:52 +0000 (15:39 +0100)]
xen/vpci: add handlers to map the BARs
Introduce a set of handlers that trap accesses to the PCI BARs and the command
register, in order to emulate BAR sizing and BAR relocation.
The command handler is used to detect changes to bit 1 (memory space enable,
mask 0x2: response to memory space accesses), and maps/unmaps the BARs of the
device into the guest p2m.
The BAR register handlers are used to detect attempts by the guest to size or
relocate the BARs.
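For reference, the standard BAR sizing arithmetic can be sketched as
follows: after all 1s are written to a 32-bit memory BAR, the value read
back has the flag bits in the low nibble and zeros in the address bits the
device does not decode. The helper name is hypothetical and 64-bit and I/O
BARs are left out.

```c
#include <stdint.h>

#define BAR_MEM_FLAGS_MASK 0xfu /* low 4 bits of a memory BAR: type/prefetch */

/*
 * Given the value read back after writing all 1s to a 32-bit memory BAR,
 * return the size of the region (0 if the BAR is not implemented).
 */
uint32_t bar32_size(uint32_t readback)
{
    uint32_t mask = readback & ~BAR_MEM_FLAGS_MASK;

    return mask ? ~mask + 1u : 0;
}
```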
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: George Dunlap <George.Dunlap@eu.citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Tim Deegan <tim@xen.org> Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v3:
- Propagate previous changes: drop xen_ prefix and use u8/u16/u32
instead of the previous half_word/word/double_word.
- Constify some of the parameters.
- s/VPCI_BAR_MEM/VPCI_BAR_MEM32/.
- Simplify the number of fields stored for each BAR, a single address
field is stored and contains the address of the BAR both on Xen and
in the guest.
- Allow the guest to move the BARs around in the physical memory map.
- Add support for expansion ROM BARs.
- Do not cache the value of the command register.
- Remove a label used in vpci_cmd_write.
- Fix the calculation of the sizing mask in vpci_bar_write.
- Check the memory decode bit in order to decide if a BAR is
positioned or not.
- Disable memory decoding before sizing the BARs in Xen.
- When mapping/unmapping BARs check if there's overlap between BARs,
in order to avoid unmapping memory required by another BAR.
- Introduce a macro to check whether a BAR is mappable or not.
- Add a comment regarding the lack of support for SR-IOV.
- Remove the usage of the GENMASK macro.
Changes since v2:
- Detect unset BARs and allow the hardware domain to position them.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:51 +0000 (15:39 +0100)]
xen/pci: split code to size BARs from pci_add_device
So that it can be called from outside in order to get the size of regular PCI
BARs. This will be required in order to map the BARs from PCI devices into PVH
Dom0 p2m.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com>
---
Changes since v3:
- Rename function to size BARs to pci_size_mem_bar.
- Change the parameters passed to the function. Pass the position and
whether the BAR is the last one, instead of the (base, max_bars,
*index) tuple.
- Make the function return the number of BARs consumed (1 for 32b, 2
for 64b BARs).
- Change the dprintk back to printk.
- Do not log another error message in pci_add_device in case
pci_size_mem_bar fails.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:51 +0000 (15:39 +0100)]
xen/mm: move modify_identity_mmio to global file and drop __init
And also allow it to do non-identity mappings by adding a new
parameter.
This function will be needed in order to map the BARs from PCI devices
into the Dom0 p2m (and is also used by the x86 Dom0 builder). While
there fix the function to use gfn_t and mfn_t instead of unsigned long
for memory addresses.
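To show why distinct types help here, a small sketch of the wrapper-struct
idea: with separate gfn_t and mfn_t types the compiler rejects a call that
swaps guest and machine frame numbers. The helper names are invented for
this example and are not Xen's.

```c
#include <stdbool.h>

/* Wrapping the integer in a struct makes gfn and mfn incompatible types. */
typedef struct { unsigned long gfn; } gfn_t;
typedef struct { unsigned long mfn; } mfn_t;

static inline gfn_t make_gfn(unsigned long n) { return (gfn_t){ n }; }
static inline mfn_t make_mfn(unsigned long n) { return (mfn_t){ n }; }

/* An identity mapping is one where guest and machine frames coincide. */
bool is_identity_mapping(gfn_t gfn, mfn_t mfn)
{
    return gfn.gfn == mfn.mfn;
}
```

Passing the arguments in the wrong order, for example
`is_identity_mapping(make_mfn(5), make_gfn(5))`, fails to compile, which a
plain unsigned long interface cannot catch.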
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v3:
- Remove the dummy modify_identity_mmio helper in dom0_build.c
- Try to make the comment in modify MMIO less scary.
- Clarify commit message.
- Only build the function for x86 or if there's PCI support.
Changes since v2:
- Use mfn_t and gfn_t.
- Remove stray newline.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:50 +0000 (15:39 +0100)]
x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0
So that hotplug (or MMCFG regions not present in the MCFG ACPI table)
can be added at run time by the hardware domain.
When a new MMCFG area is added to a PVH Dom0, Xen will scan it and add
the devices to the hardware domain.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v3:
- New in this version.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:50 +0000 (15:39 +0100)]
x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas
Introduce a set of handlers for the accesses to the MMCFG areas. Those
areas are set up based on the contents of the hardware MMCFG tables,
and the list of handled MMCFG areas is stored inside of the hvm_domain
struct.
The read/writes are forwarded to the generic vpci handlers once the
address is decoded in order to obtain the device and register the
guest is trying to access.
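The address decoding mentioned above can be sketched with the standard
MMCFG (ECAM) layout, where each function gets 4KiB of config space. This
assumes the region starts at bus 0; the struct and function names are
hypothetical.

```c
#include <stdint.h>

struct pci_addr {
    unsigned int bus, slot, func, reg;
};

/* Decode an access at addr inside an MMCFG region mapped at base. */
struct pci_addr mmcfg_decode(uint64_t base, uint64_t addr)
{
    uint64_t off = addr - base;
    struct pci_addr a = {
        .bus  = (off >> 20) & 0xff, /* 1MiB per bus */
        .slot = (off >> 15) & 0x1f, /* 32KiB per device */
        .func = (off >> 12) & 0x7,  /* 4KiB per function */
        .reg  = off & 0xfff,        /* register within the function */
    };

    return a;
}
```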
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com>
---
Changes since v3:
- Propagate changes from previous patches: drop xen_ prefix for vpci
functions, pass slot and func instead of devfn and fix the error
paths of the MMCFG handlers.
- s/ecam/mmcfg/.
- Move the destroy code to a separate function, so the hvm_mmcfg
struct can be private to hvm/io.c.
- Constify the return of vpci_mmcfg_find.
- Use d instead of v->domain in vpci_mmcfg_accept.
- Allow 8byte accesses to the mmcfg.
Roger Pau Monne [Fri, 30 Jun 2017 14:39:50 +0000 (15:39 +0100)]
xen/vpci: introduce basic handlers to trap accesses to the PCI config space
This functionality is going to reside in vpci.c (and the corresponding
vpci.h header), and should be arch-agnostic. The handlers introduced
in this patch set up the basic functionality required in order to trap
accesses to the PCI config space, and allow decoding the address and
finding the corresponding handler that should handle the access
(although no handlers are implemented).
Note that the traps to the PCI IO ports registers (0xcf8/0xcfc) are
set up inside of an x86 HVM file, since that's not shared with other
arches.
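For context, the legacy IO port mechanism encodes the target in the value
latched at 0xcf8, and the data port at 0xcfc (plus a byte offset) selects
the register. A minimal decoding sketch with hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

#define CF8_ENABLE 0x80000000u /* bit 31: config cycle enable */

struct cf8_addr {
    bool valid;
    unsigned int bus, slot, func, reg;
};

/* Decode the latched 0xcf8 value for an access to data port `port`. */
struct cf8_addr cf8_decode(uint32_t cf8, unsigned int port)
{
    struct cf8_addr a = {
        .valid = (cf8 & CF8_ENABLE) != 0,
        .bus   = (cf8 >> 16) & 0xff,
        .slot  = (cf8 >> 11) & 0x1f,
        .func  = (cf8 >> 8) & 0x7,
        /* dword-aligned register plus the byte offset into 0xcfc-0xcff */
        .reg   = (cf8 & 0xfc) + (port & 3),
    };

    return a;
}
```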
A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
whether a domain should use the newly introduced vPCI handlers, this
is only enabled for PVH Dom0 at the moment.
A very simple user-space test is also provided, so that the basic
functionality of the vPCI traps can be asserted. This has been proven
quite helpful during development, since the logic to handle partial
accesses or accesses that span multiple registers is not
trivial.
The handlers for the registers are added to a linked list that's kept
sorted at all times. Both the read and write handlers support accesses
that span multiple emulated registers and contain gaps not
emulated.
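A simplified sketch of the read path described above, assuming a sorted
array stands in for the sorted list and that gaps read as all 1s (the
sketch has no hardware to forward them to). The struct and dispatcher are
hypothetical; only the merge_result name comes from the changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vreg {
    unsigned int offset, size; /* size in bytes: 1, 2 or 4 */
    uint32_t value;            /* emulated register contents */
};

/* Place `size` bytes of val into data at byte position pos. */
uint32_t merge_result(uint32_t data, uint32_t val, unsigned int size,
                      unsigned int pos)
{
    uint32_t mask = 0xffffffffu >> (32 - 8 * size);

    return (data & ~(mask << (pos * 8))) | ((val & mask) << (pos * 8));
}

/* Read len bytes at off, merging every emulated register that overlaps. */
uint32_t vcfg_read(const struct vreg *regs, size_t n, unsigned int off,
                   unsigned int len)
{
    uint32_t data;

    assert(len >= 1 && len <= 4);
    data = 0xffffffffu >> (32 - 8 * len); /* bytes in gaps stay all 1s */

    for (size_t i = 0; i < n; i++) {
        const struct vreg *r = &regs[i];
        unsigned int start, end;

        if (r->offset + r->size <= off)
            continue;           /* register ends before the access */
        if (r->offset >= off + len)
            break;              /* sorted: nothing further can overlap */

        start = r->offset > off ? r->offset : off;
        end = r->offset + r->size < off + len ? r->offset + r->size
                                              : off + len;
        data = merge_result(data, r->value >> ((start - r->offset) * 8),
                            end - start, start - off);
    }

    return data;
}

/* Example layout: a 16-bit register at 0 and a 32-bit register at 4. */
const struct vreg example_regs[] = {
    { 0, 2, 0x1234 },
    { 4, 4, 0xdeadbeef },
};
```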
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
--- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com>
---
Changes since v3:
* User-space test harness:
- Fix spaces in container_of macro.
- Implement dummy locking functions.
- Remove the 'current' macro and make current a pointer to the statically
allocated vcpu.
- Remove unneeded parentheses in the pci_conf_readX macros.
- Fix the name of the write test macro.
- Remove the dummy EXPORT_SYMBOL macro (this was needed by the RB
code only).
- Import the max macro.
- Test all possible read/write size combinations with all possible
emulated register sizes.
- Introduce a test for register removal.
* Hypervisor code:
- Use a sorted list in order to store the config space handlers.
- Remove some unneeded 'else' branches.
- Make the IO port handlers always return X86EMUL_OKAY, and set the
data to all 1's in case of read failure (writes are simply ignored).
- In hvm_select_ioreq_server reuse local variables when calling
XEN_DMOP_PCI_SBDF.
- Store the pointers to the initialization functions in the .rodata
section.
- Do not ignore the return value of xen_vpci_add_handlers in
setup_one_hwdom_device.
- Remove the vpci_init macro.
- Do not hide the pointers inside of the vpci_{read/write}_t
typedefs.
- Rename priv_data to private in vpci_register.
- Simplify checking for register overlap in vpci_register_cmp.
- Check that the offset and the length match before removing a
register in xen_vpci_remove_register.
- Make vpci_read_hw return a value rather than storing it in a
pointer passed by parameter.
- Handler dispatcher functions vpci_{read/write} no longer return an
error code, errors on reads/writes should be treated like hardware
(writes ignored, reads return all 1's or garbage).
- Make sure pcidevs is locked before calling pci_get_pdev_by_domain.
- Use a recursive spinlock for the vpci lock, so that spin_is_locked
checks that the current CPU is holding the lock.
- Make the code less error-chatty by removing some of the printk's.
- Pass the slot and the function as separate parameters to the
handler dispatchers (instead of passing devfn).
- Allow handlers to be registered with either a read or write
function only, the missing handler will be replaced by a dummy
handler (writes ignored, reads return 1's).
- Introduce PCI_CFG_SPACE_* defines from Linux.
- Simplify the handler dispatchers by removing the recursion, now the
dispatchers iterate over the list of sorted handlers and call them
in order.
- Remove the GENMASK_BYTES, SHIFT_RIGHT_BYTES and ADD_RESULT macros,
and instead provide a merge_result function in order to merge a
register output into a partial result.
- Rename the fields of the vpci_val union to u8/u16/u32.
- Remove the return values from the read/write handlers, errors
should be handled internally and signaled as would be done on
native hardware.
- Remove the usage of the GENMASK macro.
Changes since v2:
- Generalize the PCI address decoding and use it for IOREQ code also.
Changes since v1:
- Allow access to cross a word-boundary.
- Add locking.
- Add cleanup to xen_vpci_add_handlers in case of failure.
Sergey Dyasli [Wed, 28 Jun 2017 09:35:45 +0000 (10:35 +0100)]
vvmx: fix ept_sync() for nested p2m
If ept_sync_domain() is called for np2m, the following happens:
1. *np2m*::ept_data::invalidate cpumask is updated
2. IPIs are sent for CPUs in domain_dirty_cpumask forcing vmexits
3. vmx_vmenter_helper() checks *hostp2m*::ept_data::invalidate
and does nothing
Which is clearly a bug. Make ept_sync_domain() update hostp2m's
invalidate mask in the nested p2m case and make vmx_vmenter_helper()
invalidate EPT translations for all EPTPs if nested virt is enabled.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Wed, 28 Jun 2017 14:05:35 +0000 (15:05 +0100)]
x86/vvmx: Fix WRMSR interception of VMX MSRs
FEATURE_CONTROL is already read with LOCK bit set (so is unmodifiable), and
all VMX MSRs are read-only. Also, fix the MSR_IA32_VMX_TRUE_ENTRY_CTLS bound
to be MSR_IA32_VMX_VMFUNC, rather than having the intervening MSRs falling
into the default case.
Raise #GP faults if the guest tries to modify any of them.
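The resulting check can be pictured as a simple range test. The MSR
indices are the architectural ones from the Intel SDM; the function name
is hypothetical and the real code is structured as a switch statement.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_FEATURE_CONTROL 0x03au
#define MSR_IA32_VMX_BASIC       0x480u
#define MSR_IA32_VMX_VMFUNC      0x491u

/* True if a guest WRMSR to this index should be answered with #GP. */
bool vmx_msr_write_faults(uint32_t msr)
{
    return msr == MSR_IA32_FEATURE_CONTROL ||
           (msr >= MSR_IA32_VMX_BASIC && msr <= MSR_IA32_VMX_VMFUNC);
}
```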
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Zhongze Liu [Thu, 22 Jun 2017 16:35:28 +0000 (00:35 +0800)]
libxc: add xc_domain_add_to_physmap_batch to wrap XENMEM_add_to_physmap_batch
This is a preparation for the proposal "allow setting up shared memory areas
between VMs from xl config file". See:
V2: https://lists.xen.org/archives/html/xen-devel/2017-06/msg02256.html
V1: https://lists.xen.org/archives/html/xen-devel/2017-05/msg01288.html
The plan is to use XENMEM_add_to_physmap_batch in xl to map foreign pages from
one DomU to another so that the page could be shared. But currently there is no
wrapper for XENMEM_add_to_physmap_batch in libxc, so we just add a wrapper for
it.
Signed-off-by: Zhongze Liu <blackskygg@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Fri, 23 Jun 2017 10:48:21 +0000 (10:48 +0000)]
xen/tmem: Switch to using bool
* Drop redundant initialisers
* Style corrections while changing client_over_quota()
* Drop all write-only bools from do_tmem_op()
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 26 Jun 2017 14:20:35 +0000 (15:20 +0100)]
xen: move do_nmi_op and make it x86 only
Since ARM doesn't need {compat,do}_nmi_op, move the hypercall handlers
from common/kernel.c to pv/callback.c. Drop the stubs in ARM. Delete
the common and ARM nmi.h and adjust header inclusions in various
files.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Wei Liu [Mon, 5 Jun 2017 15:15:26 +0000 (16:15 +0100)]
x86/traps: factor out pv_trap_init
Factor out pv_trap_init and call it at the beginning of trap_init. We
then need to tune the code to generate stub handlers in entry.S. Take
the chance to tune init_irq_data so that 0x80 and 0x82 can be used for
regular interrupts in !CONFIG_PV case.
While at it, fix some coding style issues in init_irq_data and replace
0x80 with LEGACY_SYSCALL_VECTOR in pv_trap_init.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 27 Jun 2017 17:45:03 +0000 (18:45 +0100)]
xen/pt: Avoid NULL dereference in hvm_pirq_eoi()
Coverity warns that pirq_dpci unconditionally dereferences a NULL pointer.
This warning appears to be triggered by pirq_dpci() which is a hidden ternary
expression. In reality, it appears that both callers pass a non-NULL pirq
parameter, so the code is ok in practice.
Rearrange the logic to fail-safe, which should quiesce Coverity.
Clean up bool_t => bool and trailing whitespace for hvm_domain_use_pirq()
while auditing this area.
No (intended) functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Tue, 27 Jun 2017 17:29:55 +0000 (18:29 +0100)]
xen/pt: Unlock d->event_lock on error paths
Introduced by c/s fba00494268 "x86/pt: enable binding of GSIs to a PVH Dom0"
Spotted by Coverity.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Fri, 23 Jun 2017 09:59:51 +0000 (10:59 +0100)]
x86/vioapic: bind interrupts to PVH Dom0
Add the glue in order to bind the PVH Dom0 GSI from bare metal. This
is done when Dom0 unmasks the vIO APIC pins, by fetching the current
pin settings and setting up the PIRQ, which will then be bound to
Dom0.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Mon, 26 Jun 2017 09:47:13 +0000 (10:47 +0100)]
x86/pt: enable binding of GSIs to a PVH Dom0
Achieve this by expanding pt_irq_create_bind in order to support
mapping interrupts of type PT_IRQ_TYPE_PCI to a PVH Dom0. GSIs bound
to Dom0 are always identity bound, which means that all the fields
inside of the u.pci sub-struct are ignored, and only the machine_irq
is actually used in order to determine which GSI the caller wants to
bind.
Also, the hvm_irq_dpci struct is not used by a PVH Dom0, since that's
used to route interrupts and allow different host to guest GSI
mappings, which is not used by a PVH Dom0.
This requires adding some specific handlers for such directly mapped
GSIs, which bypass the PCI interrupt routing done by Xen for HVM
guests.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Olaf Hering [Mon, 26 Jun 2017 12:55:07 +0000 (14:55 +0200)]
rombios: prevent building with PIC/PIE
If the default compiler silently defaults to -fPIC/-fPIE, building
rombios fails:
ld -melf_i386 -s -r 32bitbios.o tcgbios/tcgbiosext.o util.o pmm.o -o 32bitbios_all.o
There are undefined symbols in the BIOS:
U _GLOBAL_OFFSET_TABLE_
make[10]: *** [Makefile:26: 32bitbios_all.o] Error 11
Prevent the failure by enforcing non-PIC/PIE mode.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Mon, 26 Jun 2017 11:58:25 +0000 (12:58 +0100)]
x86/mm: Fix infinite loop in get_spage_pages()
c/s 2b8eb37 switched int i to being unsigned, but the undo logic on failure
relied on i being signed. As i being unsigned is still preferable, adjust the
undo logic to work with an unsigned i.
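The class of bug is easy to reproduce in isolation. In the sketch below
(hypothetical names, not the code from get_spage_pages()), an undo loop
written as `for ( ; i >= 0; i-- )` would never terminate once i is
unsigned; counting down with a post-decrement test avoids that.

```c
#include <stddef.h>

/* Take n references; on failure at index fail_at, drop those already taken. */
int get_all(int *refs, size_t n, size_t fail_at)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (i == fail_at)
            goto undo;
        refs[i]++;
    }

    return 0;

 undo:
    /* `i >= 0` is always true for an unsigned i; test before decrementing. */
    while (i-- > 0)
        refs[i]--;

    return -1;
}

/* Scratch array used by the usage example. */
int example_refs[4];
```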
Coverity-ID: 1413017 Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Bhupinder Thakur [Thu, 22 Jun 2017 07:38:37 +0000 (13:08 +0530)]
xen/arm: Rename vgic_reg* functions definitions and calls to vreg_reg*
This patch renames the vgic_reg* access functions defined in vreg.h to vreg_reg*
and replaces all calls to vgic_reg* functions in vgic/its emulation code to vreg_reg*.
vreg_reg* are generic functions, which can be used to operate on 32/64-bit registers.
SBSA UART emulation code will also use vreg_reg* access functions for
accessing emulated pl011 registers.
Bhupinder Thakur [Thu, 22 Jun 2017 07:38:36 +0000 (13:08 +0530)]
xen/arm: vpl011: Move vgic register access functions to vreg.h
These functions are generic in nature and can be reused by other emulation
code in Xen. vGICv3 ITS and SBSA UART emulation code would use these
functions to operate on their registers.
This patch moves the register access function definitions from vgic.h to
vreg.h.
Wei Liu [Mon, 5 Jun 2017 14:16:17 +0000 (15:16 +0100)]
x86: clean up PV emulation code
Replace bool_t with bool. Fix coding style issues. Add spaces around
binary ops. Use 1U for shifting. Eliminate TOGGLE_MODE.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 13 Jun 2017 20:36:58 +0000 (21:36 +0100)]
xen/livepatch: Don't crash on encountering STN_UNDEF relocations
A symndx of STN_UNDEF is special, and means a symbol value of 0. While
legitimate in the ELF standard, its existence in a livepatch is questionable
at best. Until a plausible usecase presents itself, reject such a relocation
with -EOPNOTSUPP.
Additionally, fix an off-by-one error while range checking symndx, and perform
a safety check on elf->sym[symndx].sym before dereferencing it, to avoid
tripping over a NULL pointer when calculating val.
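The three checks can be sketched as below. The structure and function
names are simplified stand-ins for the livepatch ELF types, and the
-EINVAL values are an assumption of this example; only the -EOPNOTSUPP
case is stated above.

```c
#include <errno.h>
#include <stddef.h>

#ifndef STN_UNDEF
#define STN_UNDEF 0 /* reserved "undefined symbol" index in ELF */
#endif

/* Simplified stand-in for a parsed symbol table entry. */
struct sym_entry {
    const void *sym;
};

const int example_sym_target = 0;
const struct sym_entry example_syms[3] = {
    { NULL }, { &example_sym_target }, { NULL },
};

int check_reloc_symndx(const struct sym_entry *syms, unsigned int nsym,
                       unsigned int symndx)
{
    if (symndx == STN_UNDEF)
        return -EOPNOTSUPP;  /* legal ELF, but no known use in a livepatch */

    if (symndx >= nsym)      /* using `>` here would be off by one */
        return -EINVAL;

    if (!syms[symndx].sym)   /* entry was never populated */
        return -EINVAL;

    return 0;
}
```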
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [x86 and arm32] Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Andrew Cooper [Thu, 22 Jun 2017 17:55:31 +0000 (18:55 +0100)]
xen/livepatch: Use zeroed memory allocations for arrays
Each of these arrays is sparse. Use zeroed allocations to cause uninitialised
array elements to contain deterministic values, most importantly for the
embedded pointers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [x86 and arm32] Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Andrew Cooper [Tue, 13 Jun 2017 20:17:47 +0000 (21:17 +0100)]
xen/livepatch: Clean up arch relocation handling
* Reduce symbol scope and initialisation as much as possible
* Annotate a fallthrough case in arm64
* Fix switch statement style in arm32
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [x86 and arm32]
Andrew Cooper [Wed, 21 Jun 2017 11:36:18 +0000 (12:36 +0100)]
xen: Replace ASSERT(0) with ASSERT_UNREACHABLE()
No functional change, but the result is more informative both in the code and
error messages if the assertions do get hit.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Tue, 18 Apr 2017 14:41:16 +0000 (15:41 +0100)]
x86/mm: Misc nonfunctional cleanup
* Drop trailing whitespace
* Apply Xen comment and space style
* Switch bool_t to bool
* Drop TOGGLE_MODE() macro
* Replace erroneous mandatory barriers with smp barriers
* Switch boolean ints for real bools
No (intended) functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Fixed an issue where the maximum index allowed (31) goes beyond the
actual number of array elements (4) of ad->monitor.write_ctrlreg_mask.
Coverity-ID: 1412966
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 23 Jun 2017 13:49:41 +0000 (15:49 +0200)]
make steal_page() return a proper error value
... and use it where suitable (the tmem caller doesn't propagate an
error code). While it doesn't matter as much, also make donate_page()
follow suit on x86 (on ARM it already returns -ENOSYS).
Also move their declarations to common code and add __must_check.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Julien Grall <julien.grall@arm.com>
Jan Beulich [Fri, 23 Jun 2017 13:48:28 +0000 (15:48 +0200)]
x86emul: simplify SHLD/SHRD handling
First of all there's no point considering the "shift == width" case,
when immediately before that check we mask "shift" by "width - 1". And
then truncate_word() use can be reduced too: dst.val, as obtained by
generic operand fetching code, is already suitably truncated, and its
use can also be made symmetric in the main conditional expression (on
only left shift results). Finally masking the result of a right shift
is not necessary when the left hand operand doesn't have more than
"width" significant bits.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
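As a worked illustration of the SHLD arithmetic discussed in this commit
(one reading of the description, not the emulator code): once the count is
masked by "width - 1" it can never equal the width, and inputs that are
already truncated need no extra masking on the right shift. Flags are not
modelled and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* SHLD on a 16- or 32-bit destination; dst/src are assumed pre-truncated. */
uint32_t shld(uint32_t dst, uint32_t src, unsigned int shift,
              unsigned int width)
{
    uint32_t mask;

    assert(width == 16 || width == 32);
    mask = width == 32 ? 0xffffffffu : 0xffffu;

    shift &= width - 1;  /* after this, shift == width is impossible */
    if (shift == 0)
        return dst;      /* a zero count leaves the destination unchanged */

    /* Only the left shift can produce bits above `width`. */
    return ((dst << shift) & mask) | (src >> (width - shift));
}
```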
Jan Beulich [Thu, 22 Jun 2017 07:58:07 +0000 (09:58 +0200)]
x86/mm: drop redundant domain parameter from get_page_from_gfn_p2m()
It can always be read from the passed p2m. Take the opportunity and
also rename the function, making the "p2m" suffix a prefix, to match
other p2m functions, and convert the "gfn" parameter's type.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Thu, 22 Jun 2017 07:55:08 +0000 (09:55 +0200)]
gnttab: limit mapkind()'s iteration count
There's no need for the function to observe increases of the maptrack
table (which can occur as the maptrack lock isn't being held) - actual
population of maptrack entries is excluded while we're here (by way of
holding the respective grant table lock for writing, while code
populating entries acquires it for reading). Latch the limit ahead of
the loop, allowing for the barrier to move out, too.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
George Dunlap [Thu, 22 Jun 2017 07:53:18 +0000 (09:53 +0200)]
gnttab: remove host map in the event of a grant_map failure
The current code appropriately removes the reference and type counts
on failure, but leaves the mapping set up. As the only path which can
trigger this is failure from IOMMU manipulation, and as unprivileged
domains are being crashed in that case, this is not by itself a
security issue.
Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 22 Jun 2017 07:51:29 +0000 (09:51 +0200)]
ARM: simplify page type handling
There's no need to have anything here on ARM other than the distinction
between writable and non-writable pages (and even that could likely be
eliminated, but with a more intrusive change). Limit type to a single
bit and drop pinned and validated flags altogether.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Dario Faggioli [Thu, 22 Jun 2017 07:45:37 +0000 (09:45 +0200)]
idle_loop: either deal with tasklets or go idle
In fact, there are two kinds of tasklets: vCPU and
softirq context. When we want to do vCPU context tasklet
work, we force the idle vCPU (of a particular pCPU) into
execution, and run it from there.
This means there are two possible reasons for choosing
to run the idle vCPU:
1) we want a pCPU to go idle,
2) we want to run some vCPU context tasklet work.
If we're in case 2), it does not make sense to even
try to go idle (as the check will _always_ fail).
This patch rearranges the code of the body of idle
vCPUs, so that we actually check whether we are in
case 1) or 2), and act accordingly.
As a matter of fact, this also means that we do not
check if there's any tasklet work to do after waking
up from idle. This is not a problem, because:
a) for softirq context tasklets, if any is queued
"during" wakeup from idle, TASKLET_SOFTIRQ is
raised, and the call to do_softirq() (which is still
happening *after* the wakeup) will take care of it;
b) for vCPU context tasklets, if any is queued "during"
wakeup from idle, SCHEDULE_SOFTIRQ is raised and
do_softirq() (happening after the wakeup) calls
the scheduler. The scheduler sees that there is
tasklet work pending and confirms the idle vCPU
in execution, which then will get to execute
do_tasklet().
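The resulting shape of one idle loop iteration can be modelled as below.
This is a toy model with counters standing in for do_tasklet(), the idle
routine and do_softirq(); none of the names are the real ones.

```c
#include <stdbool.h>

struct cpu_state {
    bool tasklet_pending;      /* vCPU context tasklet work queued */
    unsigned int tasklets_run; /* stands in for do_tasklet() */
    unsigned int idles;        /* stands in for going idle */
    unsigned int softirqs;     /* stands in for do_softirq() */
};

/* One pass of the idle vCPU body; returns true if the pCPU went idle. */
bool idle_iteration(struct cpu_state *c)
{
    bool went_idle = false;

    if (c->tasklet_pending) {
        /* Case 2): we were scheduled to run tasklets, do not try to idle. */
        c->tasklets_run++;
        c->tasklet_pending = false;
    } else {
        /* Case 1): nothing to do, go idle. */
        c->idles++;
        went_idle = true;
    }

    /* Softirqs are processed after either branch, as argued in a) and b). */
    c->softirqs++;

    return went_idle;
}

struct cpu_state example_cpu = { .tasklet_pending = true };
```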
Add a pointer to the gic device tree bindings. Add an explanation on how
to calculate irq numbers from device tree.
Add a brief explanation of the reg property and a pointer to the xl docs
for a description of the iomem property. Add a note that in the example
we are using different memory addresses for guests and host.
Juergen Gross [Thu, 15 Jun 2017 09:58:27 +0000 (11:58 +0200)]
tools/xen-detect: try sysfs node for obtaining guest type
Instead of relying on cpuid instruction behaviour to tell which domain
type we are, just try asking the kernel via the appropriate sysfs node
(added in Linux kernel 4.13).
Keep the old detection logic as a fallback for older kernels.
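A minimal sketch of the sysfs lookup, assuming the node is
/sys/hypervisor/guest_type and that it reports "PV", "HVM" or "PVH"; both
the path and the strings are assumptions of this example, as are all the
names. A GUEST_UNKNOWN result is where the caller would fall back to the
cpuid based detection.

```c
#include <stdio.h>
#include <string.h>

enum guest_type { GUEST_UNKNOWN, GUEST_PV, GUEST_HVM, GUEST_PVH };

/* Map the contents of the sysfs node to a guest type. */
enum guest_type parse_guest_type(const char *s)
{
    if (!strncmp(s, "PVH", 3)) /* must be tested before "PV" */
        return GUEST_PVH;
    if (!strncmp(s, "PV", 2))
        return GUEST_PV;
    if (!strncmp(s, "HVM", 3))
        return GUEST_HVM;

    return GUEST_UNKNOWN;
}

/* Read the node at path; GUEST_UNKNOWN means "use the fallback". */
enum guest_type detect_from_sysfs(const char *path)
{
    char buf[16] = "";
    FILE *f = fopen(path, "r");

    if (!f)
        return GUEST_UNKNOWN; /* older kernel: node does not exist */

    if (!fgets(buf, sizeof(buf), f))
        buf[0] = '\0';
    fclose(f);

    return parse_guest_type(buf);
}
```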
Petre Pircalabu [Tue, 20 Jun 2017 15:13:20 +0000 (17:13 +0200)]
x86/monitor: add masking support for write_ctrlreg events
Add support for filtering out the write_ctrlreg monitor events if they
are generated only by changing certain bits.
A new parameter (bitmask) was added to the xc_monitor_write_ctrlreg
function in order to mask the event generation if the changed bits are
set.
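The filtering boils down to ignoring changes confined to the masked bits.
A sketch of that test, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the write changed at least one bit that is not masked out. */
bool ctrlreg_event_needed(uint64_t old_val, uint64_t new_val, uint64_t mask)
{
    return ((old_val ^ new_val) & ~mask) != 0;
}
```

With a mask of 0x1, a write that only toggles bit 0 generates no event,
while a write that also flips bit 1 still does.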
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Ross Lagerwall [Tue, 20 Jun 2017 15:13:02 +0000 (17:13 +0200)]
rombios/ata: wait for BSY to clear after write
After rombios transfers the data for a write, it checks the status and
fails if BSY is set. qemu-trad doesn't set BSY for PIO writes, but QEMU
upstream does, and this causes rombios to fail writes because they are
marked as BSY. Instead, wait for BSY to clear after a write.
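The fix amounts to polling the status register until BSY (bit 7) clears
instead of failing on the first read. A sketch with a bounded poll count
and a fake device; the callback interface, the bound and all names are
assumptions of this example, not the rombios code.

```c
#include <stdint.h>

#define ATA_STATUS_BSY 0x80u /* bit 7 of the ATA status register */

typedef uint8_t (*read_status_fn)(void *ctx);

/* Poll until BSY clears; 0 on success, -1 if still busy after max_polls. */
int ata_wait_not_busy(read_status_fn read_status, void *ctx,
                      unsigned int max_polls)
{
    while (max_polls--) {
        if (!(read_status(ctx) & ATA_STATUS_BSY))
            return 0;
    }

    return -1;
}

/* Fake device for the example: reports BSY for the first *ctx polls. */
uint8_t fake_status(void *ctx)
{
    unsigned int *busy_polls = ctx;

    if (*busy_polls) {
        (*busy_polls)--;
        return ATA_STATUS_BSY;
    }

    return 0x50; /* DRDY | DSC: ready, not busy */
}

unsigned int example_busy_polls = 3;
```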
INT 13 writes are probably rarely used these days, but they are used by
GRUB 2 to write to its environment file which happens by default on
Ubuntu.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Mon, 19 Jun 2017 14:04:08 +0000 (15:04 +0100)]
xen/test/Makefile: Fix clean target, broken by pattern rule
In "xen/test/livepatch: Regularise Makefiles" we reworked
xen/test/Makefile to use a pattern rule. However, there are two
problems with this. Both are related to the way that xen/Rules.mk is
implicitly part of this Makefile because of the way that Makefiles
under xen/ are invoked by their parent directory Makefiles.
Firstly, the Rules.mk `clean' target overrides the pattern rule in
xen/test/Makefile. The result is that `make -C xen clean' does not
actually run the livepatch clean target.
The Rules.mk clean target does have provision for recursing into
subdirectories, but that feature is tangled up with complex object
file iteration machinery which is not desirable here. However, we can
extend the Rules.mk clean target since it is a double-colon rule.
Sadly this involves duplicating the SUBDIR iteration boilerplate. (A
make function could be used but the cure would be worse than the
disease.)
Secondly, Rules.mk has a number of -include directives. make likes to
try to (re)build files mentioned in includes. With the % pattern
rule, this applies to those files too.
As a result, make -C xen clean would try to build `.*.d' (for example)
in xen/test. This would fail with an error message. The error would
be ignored because of the `-', but it's annoying and ugly.
Solve this by limiting the % pattern rule to the targets we expect it
to handle. These are those listed in the top-level Makefile help
message, apart from: those which are subdir- or component-qualified;
clean targets (which are handled specially, even distclean); and dist,
src-tarball-*, etc. (which are converted to install by an earlier
Makefile).
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Release-acked-by: Julien Grall <julien.grall@arm.com>
Jan Beulich [Tue, 20 Jun 2017 12:51:53 +0000 (14:51 +0200)]
memory: don't suppress P2M update in populate_physmap()
Commit d18627583d ("memory: don't hand MFN info to translated guests")
wrongly added a null-handle check there - just like stated in its
description for memory_exchange(), the array is also an input for
populate_physmap() (and hence can't reasonably be null). I have no idea
how I've managed to overlook this.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>