xenbits.xensource.com Git - xen.git/log
9 years ago  xl: fix a typo in error string in create_domain
Wei Liu [Tue, 14 Jul 2015 16:41:12 +0000 (17:41 +0100)]
xl: fix a typo in error string in create_domain

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  xl: correctly handle null extra config in main_config_update
Wei Liu [Tue, 14 Jul 2015 16:41:11 +0000 (17:41 +0100)]
xl: correctly handle null extra config in main_config_update

Don't dereference NULL.

Also fixed a typo in error string while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  xl: correct handling of extra_config in main_cpupoolcreate
Wei Liu [Tue, 14 Jul 2015 16:41:10 +0000 (17:41 +0100)]
xl: correct handling of extra_config in main_cpupoolcreate

Don't dereference extra_config if it's NULL. Don't leak extra_config at
the end.

Also fixed a typo in error string while I was there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  libxl: qmp_init_handler can return NULL
Wei Liu [Tue, 14 Jul 2015 16:41:09 +0000 (17:41 +0100)]
libxl: qmp_init_handler can return NULL

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  libxl: avoid leaking in libxl__initiate_device_remove
Wei Liu [Tue, 14 Jul 2015 16:41:07 +0000 (17:41 +0100)]
libxl: avoid leaking in libxl__initiate_device_remove

Change "return" to "goto out_success" to correctly dispose of the
structure.
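
The pattern behind this change can be sketched as follows. This is only an
illustration of routing every exit through one cleanup label, not the libxl
code: struct remove_state, dispose(), initiate_remove() and the disposed
counter are hypothetical names, and the label is called "out" here rather
than "out_success".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-operation state that must be disposed of. */
struct remove_state {
    int *scratch;
};

static int disposed;  /* counts disposals so the behaviour can be observed */

static void dispose(struct remove_state *st)
{
    assert(st != NULL);
    free(st->scratch);
    st->scratch = NULL;
    disposed++;
}

/* Returns 0 on success; every exit path goes through the cleanup label. */
int initiate_remove(int nothing_to_do)
{
    struct remove_state st = { .scratch = malloc(sizeof(int)) };
    int rc = 0;

    if (st.scratch == NULL) {
        rc = -1;
        goto out;
    }

    if (nothing_to_do)
        goto out;          /* a bare "return 0" here would leak st.scratch */

    *st.scratch = 42;      /* pretend to do the real work */

out:
    dispose(&st);
    return rc;
}
```

With an early "return" in place of the second "goto out", the disposal step
would be skipped on that path, which is the kind of leak described above.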

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  libxl: turn two malloc's to libxl__malloc
Wei Liu [Tue, 14 Jul 2015 16:41:05 +0000 (17:41 +0100)]
libxl: turn two malloc's to libxl__malloc

One is to combine malloc + libxl__alloc_failed. The other is to avoid
dereferencing NULL pointer in case of malloc failure. Also use gc for
memory allocation and remove free() in second case.
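
A minimal sketch of the "allocation that cannot return NULL" idea, assuming a
wrapper that aborts on failure. checked_malloc() and dup_string() are
hypothetical names; the gc-based ownership that libxl__malloc provides is not
modelled here, so the caller in this sketch still frees the result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: never returns NULL, aborts on allocation failure. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fprintf(stderr, "allocation of %zu bytes failed\n", size);
        abort();
    }
    return p;
}

/* The caller can use the result directly, with no NULL check of its own. */
char *dup_string(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s) + 1;
    copy = checked_malloc(len);
    memcpy(copy, s, len);
    return copy;
}
```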

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  libxl: don't check s!=NULL in libxl__abs_path
Wei Liu [Tue, 14 Jul 2015 16:41:04 +0000 (17:41 +0100)]
libxl: don't check s!=NULL in libxl__abs_path

That argument should not be NULL. Let the subsequent dereference crash
the process.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago  tools: libxl: Handle failure to create qemu dm logfile
Ian Campbell [Mon, 13 Jul 2015 12:31:23 +0000 (13:31 +0100)]
tools: libxl: Handle failure to create qemu dm logfile

If libxl_create_logfile fails for some reason then
libxl__create_qemu_logfile previously just carried on and dereferenced
the uninitialised logfile.

Check for the error from libxl_create_logfile, which has already
logged for us.

This was reported as Debian bug #784880.

Reported-by: Russell Coker <russell@coker.com.au>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: 784880@bugs.debian.org
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago  x86/HVM: drop now wrong ASSERT() from hvm_broadcast_ioreq()
Jan Beulich [Tue, 14 Jul 2015 13:20:15 +0000 (15:20 +0200)]
x86/HVM: drop now wrong ASSERT() from hvm_broadcast_ioreq()

The function is now also being used for IOREQ_TYPE_TIMEOFFSET.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
9 years ago  x86: avoid invalid phys_proc_id reference
Chao Peng [Mon, 13 Jul 2015 11:46:48 +0000 (13:46 +0200)]
x86: avoid invalid phys_proc_id reference

phys_proc_id is invalidated in remove_siblinginfo() which gets called
before cpu_smpboot_free(). This means calling cpu_to_socket(cpu) in
cpu_smpboot_free() cannot be correct.

This patch moves the invalidating of phys_proc_id from
remove_siblinginfo() to cpu_smpboot_free() so that cpu_to_socket(cpu)
can be used in cpu_smpboot_free().

The same is done for cpu_core_id/compute_unit_id. Because
cpu_sibling_setup_map is private to the file, it is moved as well.

Reported-by: Dario Faggioli <dario.faggioli@citrix.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
9 years ago  MAINTAINERS: support for xen-access and email change
Tamas K Lengyel [Mon, 13 Jul 2015 11:46:31 +0000 (13:46 +0200)]
MAINTAINERS: support for xen-access and email change

Add tools/tests/xen-access to the supported list under VM EVENT/MEM ACCESS.
Also change my e-mail to the preferred one, as it already appears in many of
the headers.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago  x86: remove sh_{un}map_domain_page() and hap_{un}map_domain_page()
Ben Catterall [Mon, 13 Jul 2015 10:30:06 +0000 (12:30 +0200)]
x86: remove sh_{un}map_domain_page() and hap_{un}map_domain_page()

Removed as they were wrappers around map_domain_page() to
make it appear to take an mfn_t type.

Signed-off-by: Ben Catterall <Ben.Catterall@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
9 years ago  convert map_domain_page() to use the new mfn_t type
Ben Catterall [Mon, 13 Jul 2015 10:27:29 +0000 (12:27 +0200)]
convert map_domain_page() to use the new mfn_t type

Reworked the internals and declaration, applying (un)boxing
where needed. Converted calls to map_domain_page() to
provide mfn_t types, boxing where needed.
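
For readers unfamiliar with the terminology, "boxing" wraps a raw frame number
in a distinct struct type and "unboxing" extracts it again. A minimal sketch,
assuming helper names _mfn() and mfn_x() in the style Xen uses; the definitions
below and frame_to_addr() are illustrative only, and 4 KiB pages are assumed.

```c
#include <assert.h>

/* Boxed machine frame number: a distinct type the compiler will not
 * silently mix with plain integers. */
typedef struct { unsigned long mfn; } mfn_t;

/* Box a raw frame number. */
static inline mfn_t _mfn(unsigned long n) { return (mfn_t){ n }; }

/* Unbox it again. */
static inline unsigned long mfn_x(mfn_t m) { return m.mfn; }

/* Hypothetical consumer that only accepts the boxed type. */
static inline unsigned long frame_to_addr(mfn_t mfn)
{
    assert(mfn_x(mfn) <= (~0ul >> 12));  /* shift must not overflow */
    return mfn_x(mfn) << 12;             /* assumes 4 KiB pages */
}
```

Calling frame_to_addr(2) with a bare integer is rejected at compile time,
whereas frame_to_addr(_mfn(2)) is accepted; that is the type safety the
conversion is after.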

Signed-off-by: Ben Catterall <Ben.Catterall@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago  x86/hvm: fix deadlock in emulation of rep mov to or from VRAM
Paul Durrant [Mon, 13 Jul 2015 09:56:36 +0000 (11:56 +0200)]
x86/hvm: fix deadlock in emulation of rep mov to or from VRAM

Razvan Cojocaru reported a hypervisor deadlock with the following stack:

(XEN)    [<ffff82d08012c3f1>] _spin_lock+0x31/0x54
(XEN)    [<ffff82d0801d09b6>] stdvga_mem_accept+0x3b/0x125
(XEN)    [<ffff82d0801cb23a>] hvm_find_io_handler+0x68/0x8a
(XEN)    [<ffff82d0801cb410>] hvm_mmio_internal+0x37/0x67
(XEN)    [<ffff82d0801c2403>] __hvm_copy+0xe9/0x37d
(XEN)    [<ffff82d0801c3e5d>] hvm_copy_from_guest_phys+0x14/0x16
(XEN)    [<ffff82d0801cb107>] hvm_process_io_intercept+0x10b/0x1d6
(XEN)    [<ffff82d0801cb291>] hvm_io_intercept+0x35/0x5b
(XEN)    [<ffff82d0801bb440>] hvmemul_do_io+0x1ff/0x2c1
(XEN)    [<ffff82d0801bc0b9>] hvmemul_do_io_addr+0x117/0x163
(XEN)    [<ffff82d0801bc129>] hvmemul_do_mmio_addr+0x24/0x26
(XEN)    [<ffff82d0801bcbb5>] hvmemul_rep_movs+0x1ef/0x335
(XEN)    [<ffff82d080198b49>] x86_emulate+0x56c9/0x13088
(XEN)    [<ffff82d0801bbd26>] _hvm_emulate_one+0x186/0x281
(XEN)    [<ffff82d0801bc1e8>] hvm_emulate_one+0x10/0x12
(XEN)    [<ffff82d0801cb63e>] handle_mmio+0x54/0xd2
(XEN)    [<ffff82d0801cb700>] handle_mmio_with_translation+0x44/0x46
(XEN)    [<ffff82d0801c27f6>] hvm_hap_nested_page_fault+0x15f/0x589
(XEN)    [<ffff82d0801e9741>] vmx_vmexit_handler+0x150e/0x188d
(XEN)    [<ffff82d0801ee7d1>] vmx_asm_vmexit_handler+0x41/0xc0

The problem here is the call to hvm_mmio_internal() being made by
__hvm_copy().

When the emulated VRAM access was originally started by
hvm_io_intercept() a few frames up the stack, it would have called
stdvga_mem_accept() which would then have acquired the per-domain
stdvga lock. Unfortunately the call to hvm_mmio_internal(), to avoid
a costly P2M walk, speculatively calls stdvga_mem_accept() again to
see if the page handed to __hvm_copy() is actually an internally
emulated page and hence the vcpu deadlocks.

The fix is to do the range-check in stdvga_mem_accept() without taking
the stdvga lock. This is safe because the range is constant and we know
the I/O will never actually be accepted by the stdvga device model
because hvmemul_do_io_addr() makes sure that the source of the I/O is
actually RAM.
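
The ordering described above (compare against constant bounds first, take the
lock only afterwards) can be sketched like this. It is a simplified model, not
the stdvga code: accept_sketch() and the lock_acquisitions counter are
hypothetical, the VGA window constants are assumed values, and whatever happens
once the real lock is held is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed legacy VGA window. The bounds are constant, so comparing
 * against them needs no lock. */
#define VGA_MEM_BASE 0xa0000u
#define VGA_MEM_SIZE 0x20000u

static unsigned int lock_acquisitions;  /* stand-in for spin_lock() calls */

/* first/last are the first and last byte addresses of the access. */
bool accept_sketch(uint64_t first, uint64_t last)
{
    assert(first <= last);

    /* Range check before any locking: a speculative probe for an
     * address outside the window returns without touching the lock. */
    if (first < VGA_MEM_BASE || last >= VGA_MEM_BASE + VGA_MEM_SIZE)
        return false;

    lock_acquisitions++;  /* only now would the real lock be taken */
    return true;
}
```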

Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
9 years ago  x86/hvm: add support for broadcast of buffered ioreqs...
Paul Durrant [Mon, 13 Jul 2015 09:53:18 +0000 (11:53 +0200)]
x86/hvm: add support for broadcast of buffered ioreqs...

...and make RTC timeoffset ioreqs use it.

Without this patch RTC timeoffset updates go nowhere and Xen complains
with a (non-rate-limited) printk.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
9 years ago  x86: reintroduce read_unlock() optimization
David Vrabel [Mon, 13 Jul 2015 09:50:51 +0000 (11:50 +0200)]
x86: reintroduce read_unlock() optimization

Commit 902d1b5c310fb63b511f0b967cf5f32d3f605f3d (x86,arm: remove
asm/spinlock.h from all architectures) inadvertently removed an
x86-specific optimization for read_unlock*().

Re-add asm/spinlock.h to allow architectures to provide an optimized
_raw_read_unlock() and make x86 provide the previous implementation.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years ago  x86/hvm: avoid potential NULL pointer dereferences
Paul Durrant [Fri, 10 Jul 2015 15:45:46 +0000 (17:45 +0200)]
x86/hvm: avoid potential NULL pointer dereferences

Coverity flagged that hvm_next_io_handler() will return NULL after
calling domain_crash() and this will then lead to NULL pointer
dereferences in calling functions.

This patch checks for NULL in the callers and bails in that case.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/vm_event: toggle singlestep from vm_event response
Tamas K Lengyel [Fri, 10 Jul 2015 12:04:21 +0000 (14:04 +0200)]
x86/vm_event: toggle singlestep from vm_event response

Add an option to the vm_event response to toggle singlestepping on the vCPU.
This is only supported on Intel CPUs which have Monitor Trap Flag capability.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Ian Campbell <Ian.campbell@citrix.com>
9 years ago  x86/arm/mm: use gfn instead of pfn in p2m_{get,set}_mem_access
Vitaly Kuznetsov [Fri, 10 Jul 2015 11:58:24 +0000 (13:58 +0200)]
x86/arm/mm: use gfn instead of pfn in p2m_{get,set}_mem_access

'pfn' and 'start_pfn' are ambiguous, both these functions expect GFNs as input.

On x86 the interface of p2m_set_mem_access() in p2m.c doesn't match the
declaration in p2m-common.h as 'pfn' is being used instead of 'start_pfn'.

On ARM both p2m_set_mem_access and p2m_get_mem_access interfaces don't match
declarations from p2m-common.h: p2m_set_mem_access uses 'pfn' instead of
'start_pfn' and p2m_get_mem_access uses 'gpfn' instead of 'pfn'.

Convert p2m_get_mem_access/p2m_set_mem_access (and __p2m_get_mem_access on ARM)
interfaces to using gfn_t instead of unsigned long and update all users of
these functions.

There is also an issue in p2m_get_mem_access on x86: 'gfn' parameter passed to
gfn_lock/gfn_unlock is not defined. This code compiles only because of a
coincidence: gfn_lock/gfn_unlock are currently macros which don't use their
second argument.
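
The "compiles only because of a coincidence" effect is easy to reproduce in
isolation. A minimal sketch with hypothetical names (struct p2m,
p2m_lock_sketch(), gfn_lock_sketch(), take_lock()): because the macro discards
its second argument during preprocessing, the compiler never sees the
undeclared identifier.

```c
#include <assert.h>
#include <stddef.h>

struct p2m { int id; };

static int lock_count;  /* counts lock calls so the effect is observable */

static void p2m_lock_sketch(struct p2m *p)
{
    assert(p != NULL);
    lock_count++;
}

/* The macro drops its second and third arguments, so they are removed
 * by the preprocessor before the compiler can object to them. */
#define gfn_lock_sketch(p2m, gfn, order) p2m_lock_sketch(p2m)

int take_lock(struct p2m *p)
{
    /* "undeclared_gfn" is declared nowhere, yet this compiles. */
    gfn_lock_sketch(p, undeclared_gfn, 0);
    return lock_count;
}
```

If gfn_lock_sketch() were a function or static inline instead of a macro, the
same call would fail to compile, exposing the undefined parameter.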

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago  x86/monitor: move comment after 35f4aec5a0
Jan Beulich [Fri, 10 Jul 2015 11:56:54 +0000 (13:56 +0200)]
x86/monitor: move comment after 35f4aec5a0

It belongs with the check, not at a random other place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
9 years ago  convert copy/clear_domain_page() to using mfn_t
Andrew Cooper [Fri, 10 Jul 2015 10:54:10 +0000 (12:54 +0200)]
convert copy/clear_domain_page() to using mfn_t

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Convert grant_table.c to pass mfn_t types and fix ARM compiling.

Signed-off-by: Ben Catterall <Ben.Catterall@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years ago  convert map_domain_page_global() to using mfn_t
Andrew Cooper [Fri, 10 Jul 2015 10:53:24 +0000 (12:53 +0200)]
convert map_domain_page_global() to using mfn_t

The sh_map/unmap wrappers can be dropped, and take the opportunity to turn
some #define's into static inlines, for added type safety.

As part of adding the type safety, GCC highlights a problematic include cycle
with arm/mm.h including domain_page.h which includes xen/mm.h and falls over
__page_to_mfn being used before being declared.  Simply dropping the inclusion
of domain_page.h fixes the compilation issue.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago  x86/monitor: don't use hvm_funcs directly
Tamas K Lengyel [Fri, 10 Jul 2015 10:39:02 +0000 (12:39 +0200)]
x86/monitor: don't use hvm_funcs directly

A couple spots in x86/monitor used hvm_funcs directly. This patch adds an extra
wrapper for enable_msr_exit_interception and changes monitor.c to use only the
wrappers.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
9 years ago  x86/monitor: add get_capabilities to monitor_op domctl
Tamas K Lengyel [Fri, 10 Jul 2015 10:38:09 +0000 (12:38 +0200)]
x86/monitor: add get_capabilities to monitor_op domctl

Add option to monitor_op domctl to determine the monitor capabilities of the
system.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
9 years ago  x86/MSI: fix guest unmasking when handling IRQ via event channel
Jan Beulich [Fri, 10 Jul 2015 10:36:24 +0000 (12:36 +0200)]
x86/MSI: fix guest unmasking when handling IRQ via event channel

Rather than assuming only PV guests need special treatment (and
dealing with that directly when an IRQ gets set up), keep all guest MSI
IRQs masked until either the (HVM) guest unmasks them via vMSI or the
(PV, PVHVM, or PVH) guest sets up an event channel for it.

To not further clutter the common evtchn_bind_pirq() with x86-specific
code, introduce an arch_evtchn_bind_pirq() hook instead.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago  x86/hvm: track large memory mapped accesses by buffer offset
Paul Durrant [Thu, 9 Jul 2015 17:15:00 +0000 (19:15 +0200)]
x86/hvm: track large memory mapped accesses by buffer offset

The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses across a page boundary
because it uses physical address. The code is also only relevant to memory
mapped I/O to or from a buffer.

This patch re-factors the code and moves it into hvmemul_phys_mmio_access()
where it is relevant and tracks using buffer offset rather than address.
Separate I/O emulations (of which there may be up to three per instruction)
are distinguished by linear address.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: always re-emulate I/O from a buffer
Paul Durrant [Thu, 9 Jul 2015 17:16:00 +0000 (19:16 +0200)]
x86/hvm: always re-emulate I/O from a buffer

If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
otherwise only the first chunk will be processed. This patch makes
sure all I/O from a buffer is re-emulated regardless of whether it
is a read or a write.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: use ioreq_t to track in-flight state
Paul Durrant [Thu, 9 Jul 2015 17:15:00 +0000 (19:15 +0200)]
x86/hvm: use ioreq_t to track in-flight state

Use an ioreq_t rather than open coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion by re-issuing the handle_pio() call.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: remove hvm_io_state enumeration
Paul Durrant [Thu, 9 Jul 2015 17:16:00 +0000 (19:16 +0200)]
x86/hvm: remove hvm_io_state enumeration

Emulation request status is already covered by STATE_IOREQ_XXX values so
just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: remove HVMIO_dispatched I/O state
Paul Durrant [Thu, 9 Jul 2015 17:03:00 +0000 (19:03 +0200)]
x86/hvm: remove HVMIO_dispatched I/O state

By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.

The completion case for HVMIO_dispatched can also be trivially removed
from hvmemul_do_io() as it was already unreachable. This is because that
state was only ever used for writes or I/O to/from a guest page and
hvmemul_do_io() is never called to complete such I/O.

NOTE: There is one subtlety in handle_pio()... The only case when
      handle_pio() got a return code of X86EMUL_RETRY back from
      hvmemul_do_pio_buffer() and found io_state was not
      HVMIO_awaiting_completion was in the case where the domain is
      shutting down. This is because all writes normally yield a return
      of HVMEMUL_OKAY and all reads put io_state into
      HVMIO_awaiting_completion. Hence the io_state check there is
      replaced with a check of the is_shutting_down flag on the domain.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: split I/O completion handling from state model
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: split I/O completion handling from state model

The state of in-flight I/O and how its completion will be handled are
logically separate and conflating the two makes the code unnecessarily
confusing.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()

By removing the calls in hvmemul_do_io() (which is replaced by a single
assignment) and hvm_complete_assist_request() (which is replaced by a
call to hvm_process_portio_intercept() with a suitable set of ops) then
hvm_io_assist() can be moved into hvm.c and made static (and hence be a
candidate for inlining).

The calls to msix_write_completion() and vcpu_end_shutdown_deferral()
are also made unconditionally because the ioreq state will always be
STATE_IOREQ_NONE at the end of hvm_io_assist() so the test was
pointless. These calls are also only relevant when the emulation has
been handled externally which is now always the case.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: limit reps to avoid the need to handle retry
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: limit reps to avoid the need to handle retry

By limiting hvmemul_do_io_addr() to reps falling within the page on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic (added
by c/s 82ed8716b "fix direct PCI port I/O emulation retry and error
handling") from the intercept code and simplify it significantly.

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.
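
The rep limiting can be pictured as simple arithmetic on the offset within the
page. The sketch below is an assumption-laden illustration, not the
hvmemul_do_io_addr() code: limit_reps() is a hypothetical helper, 4 KiB pages
and ascending addresses are assumed, and a single access that straddles the
boundary is counted as one rep (the case for which an extra page reference is
mentioned above).

```c
#include <assert.h>

#define PAGE_SIZE 4096ul

/* Number of reps of `size` bytes, starting at `addr`, that stay within
 * the page containing `addr`. Never returns more than `reps`. */
unsigned long limit_reps(unsigned long addr, unsigned int size,
                         unsigned long reps)
{
    unsigned long room, fit;

    assert(size != 0 && size <= PAGE_SIZE && reps != 0);

    room = PAGE_SIZE - (addr & (PAGE_SIZE - 1));  /* bytes left in page */
    fit = room / size;

    if (fit == 0)
        return 1;  /* one access spanning the page boundary */

    return reps < fit ? reps : fit;
}
```

A caller that gets back fewer reps than it asked for knows the count was
reduced, which corresponds to the situation the retained mmio_retry flag is
said to report.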

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced so the mmio_retry flag is retained for that
purpose.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: unify stdvga mmio intercept with standard mmio intercept
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: unify stdvga mmio intercept with standard mmio intercept

It's clear from the following check in hvmemul_rep_movs:

    if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
         (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
        return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_do_io_addr() to make sure
mmio <-> mmio is disallowed and then registers standard mmio intercept ops
in stdvga_init().

With this patch all mmio and portio handled within Xen now goes through
process_io_intercept().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: unify dpci portio intercept with standard portio intercept
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: unify dpci portio intercept with standard portio intercept

This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling thereby removing a substantial amount of
code duplication.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: add length to mmio check op
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: add length to mmio check op

When memory mapped I/O is range checked by internal handlers, the length
of the access should be taken into account.
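
The difference a length-aware check makes shows up at the end of a range. A
minimal sketch, assuming a hypothetical struct mmio_range and two hypothetical
check functions; it does not reproduce the actual check op signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of an internally handled MMIO window. */
struct mmio_range {
    uint64_t base;
    uint64_t size;
};

/* Start-only check: looks at the first byte of the access and nothing else. */
bool check_start_only(const struct mmio_range *r, uint64_t addr)
{
    assert(r != NULL);
    return addr >= r->base && addr - r->base < r->size;
}

/* Length-aware check: the whole access [addr, addr + len) must fit.
 * Written without addr + len so that it cannot overflow. */
bool check_with_length(const struct mmio_range *r, uint64_t addr, uint32_t len)
{
    assert(r != NULL && len != 0);
    return addr >= r->base && len <= r->size &&
           addr - r->base <= r->size - len;
}
```

For a window of 0x100 bytes at 0x1000, a 4-byte access at 0x10fe passes the
start-only check although its last two bytes fall outside the window; the
length-aware check rejects it.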

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: unify internal portio and mmio intercepts
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: unify internal portio and mmio intercepts

The implementation of mmio and portio intercepts is unnecessarily different.
This leads to much code duplication. This patch unifies much of the
intercept handling, leaving only distinct handlers for stdvga mmio and dpci
portio. Subsequent patches will unify those handlers.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: change portio port numbers and sizes to unsigned int
Paul Durrant [Thu, 9 Jul 2015 17:04:00 +0000 (19:04 +0200)]
x86/hvm: change portio port numbers and sizes to unsigned int

Building on the previous patch, this patch changes portio port numbers
and sizes to unsigned int which then allows the io_handler size field to
reduce to an unsigned int.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...
Paul Durrant [Thu, 9 Jul 2015 16:27:51 +0000 (18:27 +0200)]
x86/hvm: change hvm_mmio_read_t and hvm_mmio_write_t length argument...

...from unsigned long to unsigned int

A 64-bit length is not necessary, 32 bits is enough.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86/hvm: remove multiple open coded 'chunking' loops
Paul Durrant [Thu, 9 Jul 2015 16:27:16 +0000 (18:27 +0200)]
x86/hvm: remove multiple open coded 'chunking' loops

...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
      boundary. Previously (for example) an 8 byte access at the last
      byte of a page would get carried out as 8 single-byte accesses.
      It will now be carried out as a single-byte access, followed by
      a 4-byte access, a 2-byte access and then another single-byte
      access.
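
The 1, 4, 2, 1 sequence in the NOTE follows from two rules: split the access at
the page boundary, then cover each piece with the largest power-of-two chunks
that still fit. A sketch under stated assumptions: split_access() and its
helpers are hypothetical, pages are 4 KiB, and the largest chunk is taken to be
8 bytes. It is not the code of hvmemul_phys_mmio_access() or
hvmemul_linear_mmio_access().

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u
#define MAX_CHUNK 8u   /* assumed largest single access, in bytes */

/* Largest power of two that is <= n and <= MAX_CHUNK; n must be non-zero. */
static unsigned int largest_chunk(unsigned int n)
{
    unsigned int chunk = MAX_CHUNK;

    assert(n != 0);
    while (chunk > n)
        chunk >>= 1;
    return chunk;
}

/* Cover one piece that lies within a single page; returns the new count. */
static size_t chunk_within_page(unsigned int size, unsigned int *out,
                                size_t n, size_t max)
{
    unsigned int chunk = largest_chunk(size);

    while (size != 0 && n < max) {
        while (chunk > size)   /* shrink once the remainder gets smaller */
            chunk >>= 1;
        out[n++] = chunk;
        size -= chunk;
    }
    return n;
}

/* Writes the chunk sizes for an access of `size` bytes at `addr` into
 * out[] (at most `max` entries) and returns how many were written. */
size_t split_access(unsigned long addr, unsigned int size,
                    unsigned int *out, size_t max)
{
    size_t n = 0;

    while (size != 0 && n < max) {
        unsigned int offset = addr & (PAGE_SIZE - 1);
        unsigned int part = PAGE_SIZE - offset;   /* bytes left in this page */

        if (part > size)
            part = size;
        n = chunk_within_page(part, out, n, max);
        addr += part;
        size -= part;
    }
    return n;
}
```

For an 8-byte access at offset 4095, the first page contributes a single
1-byte chunk and the remaining 7 bytes are covered as 4 + 2 + 1.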

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago  x86: correct socket_cpumask allocation
Chao Peng [Thu, 9 Jul 2015 15:47:26 +0000 (17:47 +0200)]
x86: correct socket_cpumask allocation

For the booting cpu, the socket number need not be 0, so
it needs to be computed from the cpu number.

For a secondary cpu, phys_proc_id is not valid in the CPU_PREPARE
notifier (cpu_smpboot_alloc), so cpu_to_socket(cpu) can't be used.
Instead, pre-allocate secondary_cpu_mask in cpu_smpboot_alloc()
and later consume it in smp_store_cpu_info().

This patch also changes the socket_cpumask type from 'cpumask_var_t *'
to 'cpumask_t **' so that smaller NR_CPUS works.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
9 years ago  x86/VPMU: move VPMU files up from hvm/ directory
Boris Ostrovsky [Fri, 19 Jun 2015 18:45:00 +0000 (20:45 +0200)]
x86/VPMU: move VPMU files up from hvm/ directory

Since PMU is now not HVM specific we can move VPMU-related files up from
arch/x86/hvm/ directory.

Specifically:
    arch/x86/hvm/vpmu.c -> arch/x86/cpu/vpmu.c
    arch/x86/hvm/svm/vpmu.c -> arch/x86/cpu/vpmu_amd.c
    arch/x86/hvm/vmx/vpmu_core2.c -> arch/x86/cpu/vpmu_intel.c
    include/asm-x86/hvm/vpmu.h -> include/asm-x86/vpmu.h

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago  x86/VPMU: add privileged PMU mode
Boris Ostrovsky [Thu, 9 Jul 2015 14:52:31 +0000 (16:52 +0200)]
x86/VPMU: add privileged PMU mode

Add support for privileged PMU mode (XENPMU_MODE_ALL) which allows the privileged
domain (dom0) to profile both itself (and the hypervisor) and the guests. While
this mode is on, profiling in guests is disabled.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years ago  x86/VPMU: merge vpmu_rdmsr and vpmu_wrmsr
Boris Ostrovsky [Thu, 9 Jul 2015 14:51:51 +0000 (16:51 +0200)]
x86/VPMU: merge vpmu_rdmsr and vpmu_wrmsr

The two routines share most of their logic.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
9 years ago  x86/VPMU: handle PMU interrupts for PV(H) guests
Boris Ostrovsky [Thu, 9 Jul 2015 14:48:00 +0000 (16:48 +0200)]
x86/VPMU: handle PMU interrupts for PV(H) guests

Add support for handling PMU interrupts for PV(H) guests.

VPMU for the interrupted VCPU is unloaded until the guest issues the XENPMU_flush
hypercall. This allows the guest to access PMU MSR values that are stored in
the VPMU context, which is shared between hypervisor and domain, thus avoiding
traps to the hypervisor.

Since the interrupt handler may now force VPMU context save (i.e. set
VPMU_CONTEXT_SAVE flag) we need to make changes to amd_vpmu_save() which
until now expected this flag to be set only when the counters were stopped.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years ago  tools/xl: Fix build error following c/s f52fbcf7
Andrew Cooper [Thu, 9 Jul 2015 10:36:22 +0000 (11:36 +0100)]
tools/xl: Fix build error following c/s f52fbcf7

CentOS7 complains that 'ret' might be unused, and indeed this is the case for
`xl psr-hwinfo --cat`.

The logic for selecting which information to print was rather awkward.
Introduce a new 'all' which defaults to true, and is cleared if specific
options are selected.  This allows for far clearer logic when choosing
whether to print information or not.
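
The 'all' flag logic can be sketched as below. This is an illustration rather
than the xl code: struct hwinfo_sel, parse_sel(), show_cmt() and show_cat() are
hypothetical names, and "--cmt" is assumed here as a second selector alongside
the "--cat" option mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct hwinfo_sel {
    bool all;   /* true until a specific option is selected */
    bool cmt;
    bool cat;
};

/* Start with all = true and clear it as soon as any selector is seen. */
struct hwinfo_sel parse_sel(int argc, const char *const *argv)
{
    struct hwinfo_sel s = { .all = true };

    for (int i = 0; i < argc; i++) {
        assert(argv[i] != NULL);
        if (!strcmp(argv[i], "--cmt")) {
            s.cmt = true;
            s.all = false;
        } else if (!strcmp(argv[i], "--cat")) {
            s.cat = true;
            s.all = false;
        }
    }
    return s;
}

/* Each section is printed if it was asked for, or if nothing was. */
bool show_cmt(struct hwinfo_sel s) { return s.all || s.cmt; }
bool show_cat(struct hwinfo_sel s) { return s.all || s.cat; }
```

Every print decision then reduces to the same "all || specific" test, with no
separate case analysis per option combination.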

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Chao Peng <chao.p.peng@linux.intel.com>
9 years ago  VPMU/AMD: check MSR values before writing to hardware
Boris Ostrovsky [Thu, 9 Jul 2015 11:55:32 +0000 (13:55 +0200)]
VPMU/AMD: check MSR values before writing to hardware

A number of fields of PMU control MSRs are defined as Reserved. AMD
documentation requires that such fields are preserved when the register
is written by software.

Add checks to amd_vpmu_do_wrmsr() to make sure that guests don't attempt
to modify those bits.
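
A "reserved fields must be preserved" rule translates into a mask comparison
between the current and the proposed register value. A minimal sketch with
hypothetical helpers preserves_reserved() and wrmsr_checked(); the mask passed
in is caller-supplied, and no real AMD reserved-bit layout is implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if new_val leaves every bit selected by rsvd_mask as it is in old_val. */
bool preserves_reserved(uint64_t old_val, uint64_t new_val, uint64_t rsvd_mask)
{
    return ((old_val ^ new_val) & rsvd_mask) == 0;
}

/* Updates *reg only if the reserved bits are untouched.
 * Returns 0 on success and -1 if the write was refused. */
int wrmsr_checked(uint64_t *reg, uint64_t new_val, uint64_t rsvd_mask)
{
    assert(reg != NULL);

    if (!preserves_reserved(*reg, new_val, rsvd_mask))
        return -1;

    *reg = new_val;
    return 0;
}
```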

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
9 years ago  x86/VPMU: use pre-computed masks when checking validity of MSRs
Boris Ostrovsky [Thu, 9 Jul 2015 11:54:57 +0000 (13:54 +0200)]
x86/VPMU: use pre-computed masks when checking validity of MSRs

No need to compute those masks on every MSR access.

Also, when checking MSR_P6_EVNTSELx registers make sure that bit 21
(which is a reserved bit) is not set.
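
Pre-computing a mask once and reusing it on every access might look like the
sketch below. vpmu_masks_init() and evntsel_valid() are hypothetical names; bit
21 comes from the text above, while treating the upper 32 bits as reserved is
purely an assumption made to give the mask more than one bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVNTSEL_BIT21 (1ull << 21)

static uint64_t evntsel_rsvd_mask;  /* computed once, read on every access */

/* One-time setup, e.g. at initialisation. */
void vpmu_masks_init(void)
{
    /* Assumption: the upper 32 bits are reserved as well as bit 21. */
    evntsel_rsvd_mask = EVNTSEL_BIT21 | ~0xffffffffull;
}

/* Per-access check: a single AND against the stored mask. */
bool evntsel_valid(uint64_t val)
{
    assert(evntsel_rsvd_mask != 0);  /* init must have run */
    return (val & evntsel_rsvd_mask) == 0;
}
```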

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years ago  Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Thu, 9 Jul 2015 11:54:35 +0000 (13:54 +0200)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging

9 years ago  x86/VPMU: add support for PMU register handling on PV guests
Boris Ostrovsky [Thu, 9 Jul 2015 11:53:55 +0000 (13:53 +0200)]
x86/VPMU: add support for PMU register handling on PV guests

Intercept accesses to PMU MSRs and process them in VPMU module. If vpmu ops
for VCPU are not initialized (which is the case, for example, for PV guests that
are not "VPMU-enlightened") access to MSRs will return failure.

Dump VPMU state for all domains (HVM and PV) when requested.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years ago  x86/VPMU: when handling MSR accesses, leave fault injection to callers
Boris Ostrovsky [Thu, 9 Jul 2015 11:53:03 +0000 (13:53 +0200)]
x86/VPMU: when handling MSR accesses, leave fault injection to callers

The hypervisor cannot easily inject faults into PV guests from arch-specific VPMU
read/write MSR handlers (unlike for HVM guests).

With this patch vpmu_do_msr() will return an error code to indicate whether an
error was encountered during MSR processing (instead of stating that the access
was to a VPMU register). The caller will then decide how to deal with the error.

As part of this patch we also check for validity of certain MSR accesses right
when we determine which register is being written, as opposed to postponing this
until later.
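
The split between handler and caller might look roughly like this. It is a sketch under assumptions: the function names, the validity check and the outcome enum are all hypothetical stand-ins, not the interfaces touched by the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an arch-specific handler: returns 0 on success
 * or a negative errno value on failure. It never injects a fault itself. */
static int pmu_do_wrmsr(uint32_t msr, uint64_t val)
{
    (void)val;
    if (msr < 0x100)      /* illustrative validity check, done up front */
        return -EINVAL;
    return 0;
}

enum outcome { ACCESS_OK, FAULT_INJECTED };

/* Caller side: each guest type chooses how to report the error. */
static enum outcome guest_wrmsr(uint32_t msr, uint64_t val)
{
    if (pmu_do_wrmsr(msr, val) != 0)
        return FAULT_INJECTED;   /* the caller would raise its fault here */
    return ACCESS_OK;
}
```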

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years agox86/VPMU: save VPMU state for PV guests during context switch
Boris Ostrovsky [Thu, 9 Jul 2015 11:51:42 +0000 (13:51 +0200)]
x86/VPMU: save VPMU state for PV guests during context switch

Save VPMU state during context switch for both HVM and PV(H) guests.

A subsequent patch ("x86/VPMU: NMI-based VPMU support") will make it possible
for vpmu_switch_to() to call vmx_vmcs_try_enter()->vcpu_pause() which needs
is_running to be correctly set/cleared. To prepare for that, call context_saved()
before vpmu_switch_to() is executed. (Note that while this change could have
been delayed until that later patch, the changes are harmless to existing code
and so we do it here.)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years agox86/VPMU: initialize PMU for PV(H) guests
Boris Ostrovsky [Thu, 9 Jul 2015 11:50:22 +0000 (13:50 +0200)]
x86/VPMU: initialize PMU for PV(H) guests

Code for initializing/tearing down PMU for PV guests

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years agox86/VPMU: initialize VPMUs with __initcall
Boris Ostrovsky [Thu, 9 Jul 2015 11:48:39 +0000 (13:48 +0200)]
x86/VPMU: initialize VPMUs with __initcall

Move some VPMU initialization operations into __initcalls to avoid performing
the same tests and calculations for each vcpu.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years agofix XSM build after 11fe998e56
Jan Beulich [Thu, 9 Jul 2015 11:47:34 +0000 (13:47 +0200)]
fix XSM build after 11fe998e56

Signed-off-by: Jan Beulich <jbeulich@suse.com>
9 years agox86/VPMU: interface for setting PMU mode and flags
Boris Ostrovsky [Thu, 9 Jul 2015 11:39:53 +0000 (13:39 +0200)]
x86/VPMU: interface for setting PMU mode and flags

Add runtime interface for setting PMU mode and flags. Three main modes are
provided:
* XENPMU_MODE_OFF:  PMU is not virtualized
* XENPMU_MODE_SELF: Guests can access PMU MSRs and receive PMU interrupts.
* XENPMU_MODE_HV: Same as XENPMU_MODE_SELF for non-privileged guests, dom0
  can profile itself and the hypervisor.

Note that PMU modes are different from what can be provided at Xen's boot line
with 'vpmu' argument. An 'off' (or '0') value is equivalent to XENPMU_MODE_OFF.
Any other value, on the other hand, will cause VPMU mode to be set to
XENPMU_MODE_SELF during boot.
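
The boot-argument rule described above can be sketched as below. The enum and function names are hypothetical and the enum values are not the real XENPMU_MODE_* constants; only the mapping itself follows the text.

```c
#include <string.h>

/* Hypothetical values; the real constants live in the public header. */
enum pmu_mode { MODE_OFF, MODE_SELF, MODE_HV };

/* Map the boot-time 'vpmu' argument to the initial runtime mode. */
static enum pmu_mode mode_from_boot_arg(const char *arg)
{
    if (strcmp(arg, "off") == 0 || strcmp(arg, "0") == 0)
        return MODE_OFF;
    return MODE_SELF;   /* any other value selects SELF during boot */
}
```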

For feature flags only Intel's BTS is currently supported.

Mode and flags are set via HYPERVISOR_xenpmu_op hypercall.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years agox86/VPMU: make vpmu not HVM-specific
Boris Ostrovsky [Thu, 9 Jul 2015 11:36:15 +0000 (13:36 +0200)]
x86/VPMU: make vpmu not HVM-specific

vpmu structure will be used for both HVM and PV guests. Move it from
hvm_vcpu to arch_vcpu.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years agox86/VPMU: add public xenpmu.h
Boris Ostrovsky [Thu, 9 Jul 2015 11:34:29 +0000 (13:34 +0200)]
x86/VPMU: add public xenpmu.h

Add pmu.h header files, move various macros and structures that will be
shared between hypervisor and PV guests to it.

Move MSR banks out of architectural PMU structures to allow for larger sizes
in the future. The banks are allocated immediately after the context and
PMU structures store offsets to them.

While making these updates, also:
* Remove unused vpmu_domain() macro from vpmu.h
* Convert msraddr_to_bitpos() into an inline and make it a little faster by
  realizing that all Intel's PMU-related MSRs are in the lower MSR range.
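
The layout with banks placed after the context and reached through offsets might be sketched like this. The structure and helper names are hypothetical and much simpler than the real PMU structures; the point is only that the bank size is no longer fixed by the structure definition.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical context: the counter bank lives after the struct and is
 * located through an offset rather than a fixed-size array member. */
struct pmu_ctxt {
    uint32_t num_counters;
    uint32_t counters_off;   /* byte offset from the start of the context */
};

/* One allocation holds the context followed immediately by its bank. */
static struct pmu_ctxt *ctxt_alloc(uint32_t num_counters)
{
    struct pmu_ctxt *c =
        calloc(1, sizeof(*c) + (size_t)num_counters * sizeof(uint64_t));

    if (c == NULL)
        return NULL;
    c->num_counters = num_counters;
    c->counters_off = sizeof(*c);
    return c;
}

/* Resolve the bank address from the stored offset. */
static uint64_t *ctxt_counters(struct pmu_ctxt *c)
{
    return (uint64_t *)((char *)c + c->counters_off);
}
```

Because consumers only follow the offset, a later version could grow the bank without changing the structure itself.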

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years agocommon/symbols: export hypervisor symbols to privileged guest
Boris Ostrovsky [Thu, 9 Jul 2015 11:27:52 +0000 (13:27 +0200)]
common/symbols: export hypervisor symbols to privileged guest

Export Xen's symbols as an {<address><type><name>} triplet via the new
XENPF_get_symbol hypercall.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
9 years agoxen/arm: gic-v3: Add support of vGICv2 when available
Julien Grall [Tue, 7 Jul 2015 16:22:34 +0000 (17:22 +0100)]
xen/arm: gic-v3: Add support of vGICv2 when available

* Modify the GICv3 driver to recognize such a device. I wasn't able
  to find a register which tells whether GICv2 is supported on GICv3. The only
  way to find out seems to be to check whether the DT node provides GICC and GICV.

* Disable access to ICC_SRE_EL1 to guest using vGICv2

* The LR is slightly different for vGICv2. The interrupt is always
injected with group0.

* Add a comment explaining why Group1 is used for vGICv3.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoarm: Allow the user to specify the GIC version
Julien Grall [Tue, 7 Jul 2015 16:22:33 +0000 (17:22 +0100)]
arm: Allow the user to specify the GIC version

A platform may have a GIC compatible with a previous version of the
device.

This allows virtualizing an unmodified OS on new hardware if the GIC
is compatible with an older version.

When a guest is created, the vGIC will emulate the same version as the
hardware. However, the user can specify the preferred version in the
configuration file (currently only GICv2 and GICv3 are supported).

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools: ocaml: Handle anonymous struct members of structs in libxl IDL
Ian Campbell [Wed, 8 Jul 2015 14:29:10 +0000 (15:29 +0100)]
tools: ocaml: Handle anonymous struct members of structs in libxl IDL

Julien has a patch "arm: Allow the user to specify the GIC version"
which adds an anonymous struct to libxl_domain_build_info:

    @ -480,6 +486,11 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                           ])),
                      ("invalid", None),
                      ], keyvar_init_val = "LIBXL_DOMAIN_TYPE_INVALID")),
    +
    +
    +    ("arch_arm", Struct(None, [("gic_version", libxl_gic_version),
    +                              ])),
    +
         ], dir=DIR_IN
     )

(libxl_gic_version is an enum). This is not currently supported by the
ocaml genwrap.py. Add a simple pass which handles simple anonymous
unions as top level members of a struct type, but not more deeply
nested since that would be a much more complex change and is not
currently required.

With Julien's patch applied the relevant resulting change to the .mli
is:

    --- tools/ocaml/libs/xl/_libxl_BACKUP_types.mli.in 2015-07-08 11:22:35.000000000 +0100
    +++ tools/ocaml/libs/xl/_libxl_types.mli.in 2015-07-08 12:25:56.000000000 +0100
    @@ -469,6 +477,10 @@ module Domain_build_info : sig

      type type__union = Hvm of type_hvm | Pv of type_pv | Invalid

    + type arch_arm__anon = {
    + gic_version : gic_version;
    + }
    +
      type t =
      {
      max_vcpus : int;
    @@ -510,6 +522,7 @@ module Domain_build_info : sig
      ramdisk : string option;
      device_tree : string option;
      xl_type : type__union;
    + arch_arm : arch_arm__anon;
      }
      val default : ctx -> ?xl_type:domain_type -> unit -> t
     end

The .ml differs similarly. Without Julien's patch there is no change.

gen_struct is refactored slightly to take the indent level as an
argument, since it is now used at a different level.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Dave Scott <Dave.Scott@citrix.com>
Cc: Rob Hoes <Rob.Hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@citrix.com>
9 years agolibxl: Add AHCI support for upstream qemu
Fabio Fantoni [Wed, 8 Jul 2015 14:31:05 +0000 (16:31 +0200)]
libxl: Add AHCI support for upstream qemu

Usage:
hdtype=ide|ahci (default=ide)

If hdtype=ahci, add an ich9 disk controller in ahci mode and use it with
upstream qemu to emulate disks instead of ide.
It doesn't support cdroms, which still use ide (cdroms will use
"-device ide-cd" as the new qemu parameter).
Ahci requires the new qemu parameter, but for now the other emulated disk
cases remain with the old ones (I did that in another patch, not needed by
this one).
I made it a libxl parameter, disabled by default, to avoid possible
problems:
- with save/restore/migration (restoring with ahci a domU that was with
ide instead)
- windows < 8 without pv drivers (as far as I know, a registry key change is
needed for an AHCI<->IDE change to avoid a possible blue screen)
- windows XP or older that may not support ahci by default.
Setting AHCI with a libxl parameter that defaults to disabled seems the best
solution.
AHCI increases hvm domU boot performance. On a linux hvm domU I saw the total
boot time drop to as little as 20% of the previous value, and boot time
decreased a lot on W7 domUs for most of the boots I have done. There is only a
small difference in boot time compared to ide mode on W8 and newer (probably
other xen improvements or fixes, not ahci related, are needed).

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- adjust name of LIBXL_HAVE #define as discussed on list,
         fixup pod syntax in xl.cfg.pod.5 ]

9 years agodocs: add xl-psr.markdown
Chao Peng [Thu, 9 Jul 2015 08:54:15 +0000 (16:54 +0800)]
docs: add xl-psr.markdown

Add document to introduce basic concepts and terms in PSR family
technologies and the xl interfaces.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools: add tools support for Intel CAT
Chao Peng [Thu, 9 Jul 2015 08:54:14 +0000 (16:54 +0800)]
tools: add tools support for Intel CAT

These are the xc/xl changes to support Intel Cache Allocation
Technology (CAT).

'xl psr-hwinfo' is updated to show CAT info and two new commands
for CAT are introduced:
- xl psr-cat-cbm-set [-s socket] <domain> <cbm>
  Set the cache capacity bitmask (CBM) for a domain.
- xl psr-cat-show <domain>
  Show CAT domain information.

Examples:
[root@vmm-psr vmm]# xl psr-hwinfo --cat
Cache Allocation Technology (CAT):
Socket ID       : 0
L3 Cache        : 12288KB
Maximum COS     : 15
CBM length      : 12
Default CBM     : 0xfff

[root@vmm-psr vmm]# xl psr-cat-cbm-set 0 0xff

[root@vmm-psr vmm]# xl psr-cat-show
Socket ID       : 0
L3 Cache        : 12288KB
Default CBM     : 0xfff
   ID                     NAME             CBM
    0                 Domain-0            0xff
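
To make the CBM numbers above more concrete, here is a small sketch of how a mask could be checked against the reported CBM length. The function names are hypothetical and this is not the libxc/libxl code. The contiguity check reflects an Intel CAT hardware requirement rather than anything stated in this message, and the equal-slice size estimate is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Check a capacity bitmask against the reported CBM length. */
static bool cbm_valid(uint64_t cbm, unsigned int cbm_len)
{
    uint64_t all = (cbm_len >= 64) ? ~0ULL : (1ULL << cbm_len) - 1;

    if (cbm == 0 || (cbm & ~all))
        return false;
    /* Strip trailing zeros; a contiguous run then has the form 2^n - 1. */
    while (!(cbm & 1))
        cbm >>= 1;
    return (cbm & (cbm + 1)) == 0;
}

/* Cache share in KB, assuming each CBM bit covers an equal slice. */
static unsigned int cbm_share_kb(uint64_t cbm, unsigned int cbm_len,
                                 unsigned int l3_kb)
{
    unsigned int bits = 0;

    for (; cbm; cbm >>= 1)
        bits += (unsigned int)(cbm & 1);
    return l3_kb / cbm_len * bits;
}
```

With the values from the example (12-bit CBM, 12288KB L3), a mask of 0xff would correspond to 8192KB under that equal-slice assumption.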

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxl: introduce some socket helpers
Chao Peng [Thu, 9 Jul 2015 08:54:13 +0000 (16:54 +0800)]
tools/libxl: introduce some socket helpers

Add libxl_socket_bitmap_alloc() to allow allocating a socket specific
libxl_bitmap (as it is for cpu/node bitmap).

Internal function libxl__count_physical_sockets() is introduced together
to get the socket count when the size of bitmap is not specified.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxl: add command to show PSR hardware info
Chao Peng [Thu, 9 Jul 2015 08:54:12 +0000 (16:54 +0800)]
tools/libxl: add command to show PSR hardware info

Add a dedicated command to show hardware information.

[root@vmm-psr]# xl psr-hwinfo
Cache Monitoring Technology (CMT):
Enabled         : 1
Total RMID      : 63
Supported monitor types:
cache-occupancy
total-mem-bandwidth
local-mem-bandwidth

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxl: minor name changes for CMT commands
Chao Peng [Thu, 9 Jul 2015 08:54:11 +0000 (16:54 +0800)]
tools/libxl: minor name changes for CMT commands

Use "-" instead of "_" for monitor types.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agosched: factor code that moves a vcpu to a new pcpu in a function
Dario Faggioli [Wed, 8 Jul 2015 15:31:01 +0000 (17:31 +0200)]
sched: factor code that moves a vcpu to a new pcpu in a function

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years agosched: factor the code for taking two runq locks in a function
Dario Faggioli [Wed, 8 Jul 2015 15:30:25 +0000 (17:30 +0200)]
sched: factor the code for taking two runq locks in a function

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years agovm_event: rename MEM_ACCESS_EMULATE and MEM_ACCESS_EMULATE_NOWRITE
Razvan Cojocaru [Wed, 8 Jul 2015 15:28:42 +0000 (17:28 +0200)]
vm_event: rename MEM_ACCESS_EMULATE and MEM_ACCESS_EMULATE_NOWRITE

By naming, placing and bit shift convention, it could be taken as
implied that MEM_ACCESS_EMULATE and MEM_ACCESS_EMULATE_NOWRITE are
mem_access event specific flags (instead of being generally
applicable as vm_event flags). This patch renames them to
VM_EVENT_FLAG_EMULATE and VM_EVENT_FLAG_EMULATE_NOWRITE
respectively, and uses bit shifts following the rest of the
VM_EVENT_FLAG_ constants.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tlengyel@novetta.com>
9 years agodocs: get rid of the SEDF scheduler
Dario Faggioli [Tue, 7 Jul 2015 16:44:18 +0000 (18:44 +0200)]
docs: get rid of the SEDF scheduler

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxl: get rid of the SEDF scheduler
Dario Faggioli [Tue, 7 Jul 2015 16:44:11 +0000 (18:44 +0200)]
xl: get rid of the SEDF scheduler

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen: kill sched_sedf.c
Dario Faggioli [Tue, 7 Jul 2015 16:44:02 +0000 (18:44 +0200)]
xen: kill sched_sedf.c

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years agoxen: get rid of the SEDF scheduler
Dario Faggioli [Tue, 7 Jul 2015 16:43:55 +0000 (18:43 +0200)]
xen: get rid of the SEDF scheduler

more specifically, of all the symbols and references
to it.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agolibxc: get rid of the SEDF scheduler
Dario Faggioli [Tue, 7 Jul 2015 16:43:47 +0000 (18:43 +0200)]
libxc: get rid of the SEDF scheduler

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools: python: get rid of the SEDF scheduler bindings
Dario Faggioli [Tue, 7 Jul 2015 16:43:40 +0000 (18:43 +0200)]
tools: python: get rid of the SEDF scheduler bindings

as it is going away from libxc, so these won't build any
longer.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxl: get rid of the SEDF scheduler
Dario Faggioli [Tue, 7 Jul 2015 16:43:32 +0000 (18:43 +0200)]
libxl: get rid of the SEDF scheduler

only the interface is left in place, for backward
compile-time compatibility, but every attempt to
use it would throw an error.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years agotools: Do not add top-level tools dir to include path
Ian Campbell [Tue, 7 Jul 2015 15:40:32 +0000 (16:40 +0100)]
tools: Do not add top-level tools dir to include path

Instead switch to an explicit -include $(XEN_ROOT)/tools/config.h to
pickup config.h.

Most places already do this, fixup the rest.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools: Link in-tree libvchan and libblktapctl users against lib....so
Ian Campbell [Tue, 7 Jul 2015 15:40:31 +0000 (16:40 +0100)]
tools: Link in-tree libvchan and libblktapctl users against lib....so

As with other in-tree users avoid -L + -l.

This avoids any confusion with versions of these libraries already
installed on the system and the possibility of linking against them by
mistake.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools: libxl: Log on more error paths on domain create failure
Ian Campbell [Tue, 7 Jul 2015 15:40:30 +0000 (16:40 +0100)]
tools: libxl: Log on more error paths on domain create failure

The setdefault functions do not generally log why they didn't like
things.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agolibxc: Correct log message in xc_map_foreign_bulk
Ian Campbell [Tue, 7 Jul 2015 15:40:29 +0000 (16:40 +0100)]
libxc: Correct log message in xc_map_foreign_bulk

Things are confusing enough as it is without using another function's
name here.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools: libxc: fix "alocated" typo in comment
Ian Campbell [Tue, 7 Jul 2015 15:40:28 +0000 (16:40 +0100)]
tools: libxc: fix "alocated" typo in comment

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoxen/arm: Rename XEN_DOMCTL_CONFIG_GIC_DEFAULT to XEN_DOMCTL_CONFIG_GIC_NATIVE
Julien Grall [Tue, 7 Jul 2015 16:22:32 +0000 (17:22 +0100)]
xen/arm: Rename XEN_DOMCTL_CONFIG_GIC_DEFAULT to XEN_DOMCTL_CONFIG_GIC_NATIVE

This will reflect that we effectively emulate the same version as the
hardware GIC for the guest.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxc: Prevent NULL pointer dereference in stdiostream_vmessage()
Jennifer Herbert [Tue, 7 Jul 2015 16:38:59 +0000 (16:38 +0000)]
libxc: Prevent NULL pointer dereference in stdiostream_vmessage()

Unlikely though it may seem, localtime_r could fail, which would result in a
null pointer dereference.  In this case, it should log the errno (instead of
the date/time) and continue its logging, as this is still useful.
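
The fallback described above can be sketched as follows. This is an illustrative helper with a hypothetical name and output format, not the code in stdiostream_vmessage(); it only shows checking the localtime_r result before using it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Write a timestamp prefix into buf, falling back to the errno value if the
 * time cannot be broken down. Returns what snprintf returns. */
static int format_stamp(char *buf, size_t len, time_t now)
{
    struct tm tm_buf;
    struct tm *tm = localtime_r(&now, &tm_buf);

    if (tm == NULL)   /* keep logging, just without a date/time */
        return snprintf(buf, len, "[localtime_r failed: %d] ", errno);

    return snprintf(buf, len, "%04d-%02d-%02d %02d:%02d:%02d ",
                    tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
                    tm->tm_hour, tm->tm_min, tm->tm_sec);
}
```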

Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoxl: Sane handling of extra config file arguments
Ian Jackson [Mon, 15 Jun 2015 13:50:42 +0000 (14:50 +0100)]
xl: Sane handling of extra config file arguments

Various xl sub-commands take additional parameters containing = as
additional config fragments.

The handling of these config fragments has a number of bugs:

 1. Use of a static 1024-byte buffer.  (If truncation would occur,
    with semi-trusted input, a security risk arises due to quotes
    being lost.)

 2. Mishandling of the return value from snprintf, so that if
    truncation occurs, the to-write pointer is updated with the
    wanted-to-write length, resulting in stack corruption.  (This is
    XSA-137.)

 3. Clone-and-hack of the code for constructing the appended
    config file.

These are fixed here, by introducing a new function
`string_realloc_append' and using it everywhere.  The `extra_info'
buffers are replaced by pointers, which start off NULL and are
explicitly freed on all return paths.
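
For readers unfamiliar with the two failure modes, the sketch below contrasts a growing buffer with a correctly checked fixed buffer. Both helpers are hypothetical illustrations written for this note; append_fragment is only in the spirit of string_realloc_append and is not the function from the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* *acc is either NULL or a malloc'd NUL-terminated string; 'more' is
 * appended, growing the allocation so nothing is ever truncated. */
static void append_fragment(char **acc, const char *more)
{
    size_t oldlen = *acc ? strlen(*acc) : 0;
    size_t morelen = strlen(more) + 1;   /* include the NUL */
    char *grown = realloc(*acc, oldlen + morelen);

    if (grown == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    memcpy(grown + oldlen, more, morelen);
    *acc = grown;
}

/* For contrast, with a fixed buffer: snprintf returns the length it wanted
 * to write, so the result must be checked before the cursor is advanced.
 * Requires *used < size on entry. */
static int bounded_append(char *buf, size_t size, size_t *used,
                          const char *more)
{
    int n = snprintf(buf + *used, size - *used, "%s", more);

    if (n < 0 || (size_t)n >= size - *used)
        return -1;               /* truncated: do not advance the cursor */
    *used += (size_t)n;
    return 0;
}
```

Advancing the cursor by the unchecked return value is what lets a later write land outside the buffer, which is the pattern item 2 describes.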

The separate variable which will become dom_info.extra_config is
abolished (which involves moving the clearing of dom_info).

Additional bugs I observe, not fixed here:

 4. The functions which now call string_realloc_append use ad-hoc
    error returns, with multiple calls to `return'.  This currently
    necessitates multiple new calls to `free'.

 5. Many of the paths in xl call exit(-rc) where rc is a libxl status
    code.  This is a ridiculous exit status `convention'.

 6. The loops for handling extra config data are clone-and-hacks.

 7. Once the extra config buffer is accumulated, it must be combined
    with the appropriate main config file.  The code to do this
    combining is clone-and-hacked too.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxl: Increase device model startup timeout to 1min.
Anthony PERARD [Tue, 7 Jul 2015 15:09:13 +0000 (16:09 +0100)]
libxl: Increase device model startup timeout to 1min.

On a busy host, QEMU may take more than 10s to load and start.

This is likely due to a bug in Linux where the I/O subsystem sometimes
produces high latency under load, resulting in QEMU taking a long time to
load every single dynamic library.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotools: libxl: allow permissive qemu-upstream pci passthrough.
Ian Campbell [Mon, 1 Jun 2015 10:32:23 +0000 (11:32 +0100)]
tools: libxl: allow permissive qemu-upstream pci passthrough.

Since XSA-131 qemu-xen now restricts access to PCI cfg by default. In
order to allow local configuration, the existing libxl_device_pci
"permissive" flag needs to be plumbed through via the new QMP property
added by the XSA-131 patches.

Versions of QEMU prior to XSA-131 did not support this permissive
property, so we only pass it if it is true. Older versions only
supported permissive mode.

qemu-xen-traditional already supports the permissive mode setting via
xenstore.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxl: Provide doc comments for AO_GC and STATE_AO_GC
Ian Jackson [Mon, 6 Jul 2015 15:52:30 +0000 (16:52 +0100)]
libxl: Provide doc comments for AO_GC and STATE_AO_GC

CC: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools: Add a block-tap script for setting up tapdisks via tap-ctl
George Dunlap [Mon, 6 Jul 2015 10:51:40 +0000 (11:51 +0100)]
tools: Add a block-tap script for setting up tapdisks via tap-ctl

The blocktap library isn't really necessary; all the necessary functionality
is available via the tap-ctl binary.

To use:

script=block-tap,vdev=[whatever],target=vhd:/path/to/file.vhd

script=block-tap,vdev=[whatever],target=aio:/path/to/file.raw

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxl: Add more logging to hotplug script path
George Dunlap [Mon, 6 Jul 2015 10:51:43 +0000 (11:51 +0100)]
libxl: Add more logging to hotplug script path

This was useful in tracking down bugs.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxl: Remove linux udev rules
George Dunlap [Mon, 6 Jul 2015 10:51:39 +0000 (11:51 +0100)]
libxl: Remove linux udev rules

They are no longer needed, having been replaced by a daemon for
driverdomains which will run scripts as necessary.

Worse yet, they seem to be broken for script-based block devices, such
as block-iscsi.  This wouldn't matter so much if they were never run
by default; but if you run block-attach without having created a
domain, then the appropriate node to disable running udev scripts will
not have been written yet, and the attach will silently fail.

Rather than try to sort out that issue, just remove them entirely.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agolibxl: Make local_initiate_attach more rational
George Dunlap [Mon, 6 Jul 2015 10:51:38 +0000 (11:51 +0100)]
libxl: Make local_initiate_attach more rational

There are a lot of paths through
libxl__device_disk_local_initiate_attach(), but they all really boil
down to one thing: Can we just access the file directly, or do we need
to attach it?

The requirements for direct access are fairly simple:
* Is this local (as opposed to a driver domain)?
* Is this a raw format (as opposed to cooked)?
* Does this have no scripts associated with it?

If it meets all those requirements, we can access it directly;
otherwise we need to attach it.
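
The three requirements reduce to a single predicate, sketched here. The structure and field names are hypothetical simplifications, not the libxl disk type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a disk; field names are illustrative. */
struct disk {
    bool local_backend;   /* backend runs locally, not in a driver domain */
    bool raw_format;      /* image is raw rather than a cooked format */
    const char *script;   /* hotplug script, or NULL if none */
};

/* Direct access is possible only if all three requirements hold. */
static bool can_access_directly(const struct disk *d)
{
    return d->local_backend && d->raw_format && d->script == NULL;
}
```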

This fixes a bug where bootloader execution fails for disks with
hotplug scripts.

This should fix a theoretical bug when using a qdisk backend in a
driver domain. (Not tested.)

Based on a patch by Roger Pau Monne <roger.pau@citrix.com>.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxc: fix PV vNUMA guest memory allocation
Wei Liu [Mon, 6 Jul 2015 13:47:40 +0000 (14:47 +0100)]
libxc: fix PV vNUMA guest memory allocation

In 415b58c1 (tools/libxc: Batch memory allocations for PV guests) the
number of super pages is calculated with the number of total pages. That
is wrong. It breaks PV guest vNUMA. The correct number of super pages
should be derived from the number of pages within that virtual NUMA
node.
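
A small sketch of why the per-node count matters. The helper name is hypothetical, and the shift assumes a 2MB super page made of 512 4KB pages.

```c
#include <stdint.h>

#define SUPERPAGE_SHIFT 9   /* assumption: 2MB super page = 512 4KB pages */

/* Super pages that fit in one virtual node, derived from that node's page
 * count rather than from the guest's total. */
static uint64_t node_superpages(uint64_t pages_in_node)
{
    return pages_in_node >> SUPERPAGE_SHIFT;
}
```

For instance, two virtual nodes of 768 pages each hold one super page apiece, whereas computing from the 1536-page total would yield three.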

Also change the name and type of super page variable to match the naming
convention and type of normal page variable. Make the necessary
adjustment to make code compile.

Reported-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-and-Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxc: remove trailing newline in xc_dom_panic format string
Wei Liu [Mon, 6 Jul 2015 13:17:19 +0000 (14:17 +0100)]
libxc: remove trailing newline in xc_dom_panic format string

xc_dom_panic prints more information after user supplied strings, so
don't print a newline.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/xenconsoled: Use XC_PAGE_SIZE rather than getpagesize()
Julien Grall [Mon, 11 May 2015 11:55:36 +0000 (12:55 +0100)]
tools/xenconsoled: Use XC_PAGE_SIZE rather than getpagesize()

Linux may not use the same page granularity as Xen. This will result in
a domain crash because it will try to map more pages than required.

As the console page size will always be equal to a Xen page size, use
XC_PAGE_SIZE.
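
The mismatch can be illustrated with a short sketch. The macro below is a local stand-in for XC_PAGE_SIZE, assumed here to be 4KB, and the helper name is hypothetical.

```c
#include <unistd.h>

/* Local stand-in for XC_PAGE_SIZE; assumed to be 4KB for this sketch. */
#define XEN_PAGE_SIZE_ASSUMED 4096UL

/* How many Xen-sized pages one OS page spans. A result above 1 is the case
 * where sizing a mapping by the OS page size maps more than required. */
static unsigned long xen_pages_per_os_page(void)
{
    unsigned long os = (unsigned long)sysconf(_SC_PAGESIZE);

    return os / XEN_PAGE_SIZE_ASSUMED;
}
```

On a kernel built with 64KB pages this would return 16, so a ring that occupies one Xen page would be treated as sixteen.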

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/xenstored: Use XC_PAGE_SIZE rather than getpagesize()
Julien Grall [Mon, 11 May 2015 11:55:35 +0000 (12:55 +0100)]
tools/xenstored: Use XC_PAGE_SIZE rather than getpagesize()

Linux may not use the same page granularity as Xen. This will result in
a domain crash because it will try to map more pages than required.

As the xenstore page size will always be equal to a Xen page size, use
XC_PAGE_SIZE.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen: earlycpio: Pull in latest linux earlycpio.[ch]
Ian Campbell [Wed, 1 Jul 2015 14:43:07 +0000 (15:43 +0100)]
xen: earlycpio: Pull in latest linux earlycpio.[ch]

AFAICT our current version does not correspond to any version in the
Linux history. This commit resynchronises to the state in Linux
commit 598bae70c2a8e35c8d39b610cca2b32afcf047af.

Differences from upstream: find_cpio_data is __init, printk instead of
pr_*.

This appears to fix Debian bug #785187. "Appears" because my test box
happens to be AMD and the issue is that the (valid) cpio generated by
the Intel ucode is not liked by the old Xen code. I've tested by
hacking the hypervisor to look for the Intel path.

Reported-by: Stephan Seitz <stse+debianbugs@fsing.rootsland.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stephan Seitz <stse+debianbugs@fsing.rootsland.net>
Cc: 785187@bugs.debian.org
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years agoxen: arm: consolidate mmio and irq mapping to dom0
Ian Campbell [Tue, 7 Jul 2015 08:46:18 +0000 (09:46 +0100)]
xen: arm: consolidate mmio and irq mapping to dom0

The code in the callbacks for dt_for_each_irq_map and
dt_for_each_range is very similar to the code in handle_device for
each non-pci device.

In fact the only major difference is that the irq callback needs to
call irq_set_spi_type in the PCI case. Refactor into a
map_dt_irq_to_domain callback which does the irq_set_spi_type and then
calls map_irq_to_domain which is also used from handle_device.

For mmio map_range_to_domain can already be used directly from
handle_device too. Note that the uses of PAGE_MASK in the
handle_device code here were unnecessary (and already removed from the
map_range_to_domain variant).

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
9 years agoxen: arm: Import of_bus PCI entry from Linux (as a dt_bus entry)
Ian Campbell [Tue, 7 Jul 2015 08:46:17 +0000 (09:46 +0100)]
xen: arm: Import of_bus PCI entry from Linux (as a dt_bus entry)

This provides specific handlers for the PCI bus relating to matching
and translating. It's mostly similar to the defaults but includes some
additional error checks and other PCI specific bits.

There are some subtle differences in how the generic code vs. the pci
specific code here will handle buggy DTs (i.e. #*-cells which are not
as required by the pci bindings). This will mean we tolerate such
device trees better.

I say "buggy", but actually that's not clear to me from reading "PCI Bus
Binding to Open Firmware" for the case where the device_type is "pci":
e.g. the text says "The value of "#address-cells" for PCI Bus Nodes is
3." and not "A PCI Bus Node must contain a #address-cells property
containing 3", iow the #address-cells might validly be implicit rather
than an actual property. Maybe that interpretation is bogus, but with
this patch we are able to cope with DTs written by people who do
read it like that.

It also gets us the ability to parse the flags (cacheability),
although at the moment we only check them for validity rather than use
them.

Functions/types renamed and reindented (because apparently we do
that for these).

Needs a selection of IORESOURCE_* defines, which I've taken from Linux
and have included locally for now until we figure out where else they
might be needed.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>