xenbits.xensource.com Git - people/julieng/xen-unstable.git/log
people/julieng/xen-unstable.git
Jan Beulich [Fri, 16 Oct 2015 15:44:35 +0000 (17:44 +0200)]
x86: remove unused x87 remnants of 32-bit days

x86-64 requires FXSR, XMM, and XMM2, so there's no point in hiding
respective code behind conditionals.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
He Chen [Fri, 16 Oct 2015 15:43:19 +0000 (17:43 +0200)]
x86/PSR: fix compilation error after 44f126d

In non-debug builds ASSERT_UNREACHABLE is a no-op, and some compilers will
complain that cbm_code/cbm_data may be used uninitialized in function
psr_set_l3_cbm. Add a return after ASSERT_UNREACHABLE to fix it.

Signed-off-by: He Chen <he.chen@linux.intel.com>
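A minimal sketch of the pattern this fix relies on. The enum, the
function select_cbm() and the stand-in ASSERT_UNREACHABLE definition are
hypothetical and are not taken from psr.c; only the idea of returning
right after ASSERT_UNREACHABLE comes from the message above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for Xen's macro: checks in debug builds, expands to nothing otherwise. */
#ifdef NDEBUG
#define ASSERT_UNREACHABLE() ((void)0)
#else
#define ASSERT_UNREACHABLE() assert(!"unreachable")
#endif

/* Hypothetical CBM kinds, named for illustration only. */
enum cbm_kind { CBM_KIND_L3, CBM_KIND_CODE, CBM_KIND_DATA };

/* Updates the code and/or data mask; every path either assigns both locals or returns. */
int select_cbm(enum cbm_kind kind, uint64_t cbm, uint64_t *code, uint64_t *data)
{
    uint64_t cbm_code, cbm_data;

    switch (kind) {
    case CBM_KIND_L3:
        cbm_code = cbm_data = cbm;
        break;
    case CBM_KIND_CODE:
        cbm_code = cbm;
        cbm_data = *data;
        break;
    case CBM_KIND_DATA:
        cbm_code = *code;
        cbm_data = cbm;
        break;
    default:
        ASSERT_UNREACHABLE();
        return -EINVAL;   /* without this, non-debug builds would fall through */
    }

    *code = cbm_code;
    *data = cbm_data;
    return 0;
}
```

Without the return, a non-debug build would reach the two stores at the
end with both locals unset, which is what the compiler warns about.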
Roger Pau Monne [Thu, 15 Oct 2015 17:23:57 +0000 (19:23 +0200)]
libxc: fix the types used in xc_dom_image to build HVM guests

Fix the types used to store the memory parameters of an HVM guest.
Previously they defaulted to unsigned long, which is only 32 bits wide on
32bit toolstack builds and therefore cannot hold a 64bit memory address
that crosses the 4GB boundary.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
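The truncation described above can be reproduced with fixed-width
types. This is only an illustration: both helper names are hypothetical,
and uint32_t stands in for unsigned long as seen by a 32bit toolstack
build.

```c
#include <stdint.h>

/* Models a field typed unsigned long on a 32bit build: the upper half is lost. */
uint32_t store_in_32bit_field(uint64_t addr)
{
    return (uint32_t)addr;
}

/* Models the same field once it is given an explicit 64bit type. */
uint64_t store_in_64bit_field(uint64_t addr)
{
    return addr;
}
```

An address one page above 4GB (0x100001000) comes back as 0x1000 from
the first helper and unchanged from the second.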
Boris Ostrovsky [Thu, 15 Oct 2015 14:44:26 +0000 (10:44 -0400)]
libxc: Initialize vcpu context for 64-bit PVH VCPUs

Commit 5b921b49f08 ("libxc: rework BSP initialization") forgot to call
xc_vcpu_setcontext() for 64-bit PVH VCPUs.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 14 Oct 2015 10:48:36 +0000 (12:48 +0200)]
x86/traps: don't use 16bit reads of segment registers

When executing `mov %sreg, %r32`, older Intel processors would leave the
upper 16 bits of %r32 undefined.  P4 processors and newer, as well as
all AMD processors will zero extend the segment selector.

As Xen only supports 64bit these days, there is no need to use the
operand-size override prefix and suffer the resulting pipeline overhead.

Rename read_segment_register() to read_sreg() and drop the existing
read_sreg() wrapper which took a regs parameter and did nothing with it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 14 Oct 2015 10:48:22 +0000 (12:48 +0200)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging

Andrew Cooper [Wed, 14 Oct 2015 10:48:00 +0000 (12:48 +0200)]
x86/boot: use mnemonics rather than magic numbers

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 14 Oct 2015 10:47:08 +0000 (12:47 +0200)]
x86/NUMA: cleanup

- constification
- prefer container_of() over casts
- check original pointer against NULL instead of the container_of()
  result

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
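Why the NULL check belongs on the original pointer can be shown with a
small sketch. The two structs and entry_from_header() are hypothetical;
the container_of() definition is the usual offsetof-based one and is an
assumption about what the cleaned-up code uses.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct header { int length; };
struct entry  { int id; struct header hdr; };   /* hdr is not the first member */

/* Returns the enclosing entry, or NULL when no header was supplied. */
struct entry *entry_from_header(struct header *h)
{
    if (h == NULL)                               /* test the original pointer */
        return NULL;
    return container_of(h, struct entry, hdr);   /* never NULL from here on */
}
```

Because hdr sits at a non-zero offset, applying container_of() to a NULL
header would yield a non-NULL garbage pointer, so a check on the result
would never fire.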
Jan Beulich [Wed, 14 Oct 2015 10:46:27 +0000 (12:46 +0200)]
x86/NUMA: fix SRAT table processor entry parsing and consumption

- don't overrun apicid_to_node[] (possible in the x2APIC case)
- don't limit number of processor related SRAT entries we can consume
- make acpi_numa_{processor,x2apic}_affinity_init() as similar to one
  another as possible
- print APIC IDs in hex (to ease matching with other log messages), at
  once making legacy and x2APIC ones distinguishable (by width)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
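The first bullet above amounts to a bounds check before indexing the
table. A minimal sketch, assuming a hypothetical table size and
hypothetical accessor names (the real array and its limit live in Xen's
NUMA code and may differ):

```c
#include <stdint.h>

#define MAX_LOCAL_APIC 256u      /* hypothetical table size */
#define NUMA_NO_NODE   0xffu     /* hypothetical "no node" marker */

static uint8_t apicid_to_node[MAX_LOCAL_APIC];

/* Records the node for an APIC ID; returns -1 instead of writing past the table. */
int set_apicid_node(uint32_t apic_id, uint8_t node)
{
    if (apic_id >= MAX_LOCAL_APIC)   /* x2APIC IDs are 32 bits wide */
        return -1;
    apicid_to_node[apic_id] = node;
    return 0;
}

/* Looks up the node, reporting NUMA_NO_NODE for IDs outside the table. */
uint8_t get_apicid_node(uint32_t apic_id)
{
    return apic_id < MAX_LOCAL_APIC ? apicid_to_node[apic_id] : NUMA_NO_NODE;
}
```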
He Chen [Wed, 14 Oct 2015 10:45:34 +0000 (12:45 +0200)]
x86: add domctl cmd to set/get CDP code/data CBM

CDP extends CAT and provides the capability to control L3 code & data
cache. With CDP, one COS corresponds to two CBMs (code & data). cbm_type
is added to distinguish different CBM operations. Besides, new domctl
cmds are introduced to support set/get CDP CBM. Some CAT functions that
operate on CBMs are extended to support CDP.

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Chao Peng <chao.p.peng@linux.intel.com>
He Chen [Wed, 14 Oct 2015 10:44:40 +0000 (12:44 +0200)]
x86: support enable CDP by boot parameter and add get CDP status

Add boot parameter `psr=cdp` to enable CDP at boot time.
The Intel Code/Data Prioritization (CDP) feature is based on CAT. Note that
cos_max is halved when CDP is on. struct psr_cat_cbm is extended to
support CDP operation. Extend the psr_get_cat_l3_info sysctl to report CDP
status.

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Chao Peng <chao.p.peng@linux.intel.com>
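A rough sketch of why cos_max shrinks and how one COS maps to two CBMs.
Everything here is an assumption made for illustration: the function
names, the enum values, the halving written as a right shift, and the
even/odd mask layout (data at 2*cos, code at 2*cos+1) are not taken from
the patch.

```c
/* With CDP each COS consumes two mask registers, so half as many COS remain. */
unsigned int cdp_cos_max(unsigned int cat_cos_max)
{
    return cat_cos_max >> 1;
}

/* Hypothetical selector for which of the two masks is meant. */
enum cbm_type { CBM_DATA = 0, CBM_CODE = 1 };

/* Index of the mask register backing (cos, type) when CDP is on. */
unsigned int cdp_mask_index(unsigned int cos, enum cbm_type type)
{
    return cos * 2 + (unsigned int)type;
}
```

Under these assumptions, hardware exposing COS 0..15 without CDP is left
with COS 0..7 once CDP is enabled.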
Julien Grall [Thu, 8 Oct 2015 19:22:37 +0000 (20:22 +0100)]
xen/arm: ctxt_switch: Document the erratum #852523 related to Cortex A57

When restoring the system register state for an AArch32 guest at EL2,
writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
which can lead to the guest effectively running into unexpected domain
faults.

Thankfully, we don't hit this erratum in Xen. Nonetheless, document the
code to prevent any introduction of the erratum if the context switch
code is re-ordered.

Link: http://lists.xen.org/archives/html/xen-devel/2015-09/msg01746.html
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- comment nits ]

Wei Liu [Tue, 13 Oct 2015 13:40:28 +0000 (14:40 +0100)]
cxenstored: avoid using hardcoded paths

Use library functions which return socket paths instead.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 14 Oct 2015 09:34:32 +0000 (10:34 +0100)]
xen/arm: domain_build: Removed unused variable in write_properties

The variable new_data is initialized to NULL and freed, but never
allocated or used.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Tue, 13 Oct 2015 16:27:20 +0000 (18:27 +0200)]
libxc: create an initial FPU state for HVM guests

Xen always sets the FPU as initialized when loading a HVM context, so libxc
has to provide a valid FPU context when setting the CPU registers.

This is a stop-gap measure in order to unblock OSSTest Windows 7 failures
while a proper fix for the HVM CPU save/restore is being worked on.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 13 Oct 2015 15:19:07 +0000 (17:19 +0200)]
x86: drop unused declarations from processor.h

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 13 Oct 2015 15:18:34 +0000 (17:18 +0200)]
x86/time: slightly streamline __update_vcpu_system_time()

Fold two if()-s using the same condition, converting the memset() so
far separating them to a simple initializer. Move common assignments
out of the conditional. Drop an unnecessary initializer.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 13 Oct 2015 15:17:52 +0000 (17:17 +0200)]
x86: hide MWAITX from PV domains

Since MWAIT is hidden too. (Linux starting with 4.3 is making use of
that feature, and is checking for it without looking at the MWAIT one.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 13 Oct 2015 15:17:16 +0000 (17:17 +0200)]
VT-d: section placement and type adjustments

With x2APIC requiring iommu_supports_eim() to return true, we can
adjust a few conditionals such that both it and
platform_supports_x2apic() can be marked __init. For the latter as
well as for platform_supports_intremap() also change the return types
to bool_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Jan Beulich [Tue, 13 Oct 2015 15:16:22 +0000 (17:16 +0200)]
VT-d: use proper error codes in iommu_enable_x2apic_IR()

... allowing to suppress a confusing message combination: When
ACPI_DMAR_X2APIC_OPT_OUT is set, so far we first logged a message
that IR could not be enabled (hence not using x2APIC), followed by
one indicating successful initialization of IR (if no other problems
prevented that).

Also adjust the return type of iommu_supports_eim() and fix some
broken indentation in the function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Dario Faggioli [Mon, 12 Oct 2015 15:22:02 +0000 (17:22 +0200)]
cpufreq: fix notifier block double registration

As a consequence of commit 49388f11d512bb92706ce
("x86/cpufreq: relocate the driver register function")
the cpufreq CPU notifier was being registered twice.
That resulted in bugs when trying to offline a
CPU, as reported here:

 https://www.mail-archive.com/xen-devel@lists.xen.org/msg41618.html

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Wei Liu [Mon, 12 Oct 2015 14:02:53 +0000 (16:02 +0200)]
build: don't shadow debug with "@debug@" in tools build

In 16181cbb (tools: Honor Config.mk debug value, rather than setting our
own), configure doesn't set debug variable anymore. There is, however,
one place that was missed. The file config/Tools.mk.in was still
expecting a @debug@ value from configure. After 16181cbb that value
remained "debug := @debug@" all the time because configure didn't
substitute it.

The consequence was that we couldn't get a debug build even if debug was
set to "y" in Config.mk.

Fix this by removing the stray line "debug := @debug@" in Tools.mk.in.

Reported-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Mon, 12 Oct 2015 14:01:56 +0000 (16:01 +0200)]
x86/shadow: Fix missing newline in dprintk()

to avoid console corruption.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Dario Faggioli [Mon, 12 Oct 2015 14:01:22 +0000 (16:01 +0200)]
sched / cpupool: dump the actual value of NOW()

rather than its hexadecimal representation. This makes
it easier to compare the actual system time with other
times being printed out (e.g., deadlines in RTDS).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Dario Faggioli [Mon, 12 Oct 2015 14:00:52 +0000 (16:00 +0200)]
sched: fix an 'off by one \t' in credit2 debug dump

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Mon, 12 Oct 2015 13:59:28 +0000 (15:59 +0200)]
MAINTAINERS: Tamás Lengyel to maintain mem-sharing

The component being unmaintained right now and him being the apparently
only user at present, this certainly is an improvement over the current
situation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Jan Beulich [Mon, 12 Oct 2015 13:58:35 +0000 (15:58 +0200)]
VT-d: don't suppress invalidation address write when it is zero

GFN zero is a valid address, and hence may need invalidation done for
it just like for any other GFN.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Julien Grall [Wed, 7 Oct 2015 14:41:08 +0000 (15:41 +0100)]
xen/arm: vgic: Introduce a new field to store the rank index and use it

Having in hand the index for the rank is very handy to avoid computing
it every time.

For now, use it when enabling/disabling the vIRQs rather than a formula
which is not obvious to understand.

Also drop the comments which were wrong because a shift by DABT_WORD
will not give the IRQ number but the index of the register.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
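A small sketch of what the stored index buys. The struct layout, the
helper name and the figure of 32 vIRQs per rank are assumptions made for
illustration; the message above only says that the index is now kept in
the rank.

```c
#include <stdint.h>

#define IRQS_PER_RANK 32u   /* assumption: one enable bit per vIRQ in a 32-bit register */

struct irq_rank {
    uint8_t  index;     /* position of this rank, stored once at initialisation */
    uint32_t ienable;   /* one enable bit per vIRQ in the rank */
};

/* Translates a bit position inside a rank into a vIRQ number. */
unsigned int rank_bit_to_virq(const struct irq_rank *rank, unsigned int bit)
{
    return rank->index * IRQS_PER_RANK + bit;
}
```

With the index at hand, callers no longer have to derive the rank number
from a register offset each time they enable or disable a vIRQ.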
Julien Grall [Wed, 7 Oct 2015 14:41:07 +0000 (15:41 +0100)]
xen/arm: vgic: Optimize the way to store GICD_IPRIORITYR in the rank

Xen is currently directly storing the value of GICD_IPRIORITYR register
in the rank. This makes emulation of the register access very simple
but makes the code to get the priority for a given vIRQ more complex.

While the priority of a vIRQ is retrieved every time a vIRQ is injected
to the guest, accesses to the register occur less often.

Each GICD_IPRIORITYR register stores 4 priorities associated with 4 vIRQs
(see 4.3.11 in IHI 0048B). As Xen is using little endian, we can use
a union to access directly a register or a priority for a given IRQ.

Note that the field "ipriority" has been renamed to "ipriorityr" to
match the name of the register in the GIC spec.

Finally, the implementation of the callback get_irq_priority is exactly
the same for both vGIC drivers. Consolidate the implementation in the
common vGIC code and drop the callback.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
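The union trick described above can be sketched as follows. The type
and function names are hypothetical, and the figure of 32 vIRQs per rank
is an assumption; only the "register view versus per-IRQ byte view" idea
is taken from the message.

```c
#include <stdint.h>

#define IRQS_PER_RANK 32   /* assumption */

union rank_priorities {
    uint32_t ipriorityr[IRQS_PER_RANK / 4];  /* register view, 4 vIRQs each */
    uint8_t  priority[IRQS_PER_RANK];        /* per-vIRQ view */
};

/* Emulated register write: a plain 32-bit store. */
void write_ipriorityr(union rank_priorities *p, unsigned int reg, uint32_t val)
{
    p->ipriorityr[reg] = val;
}

/* Injection path: a plain byte load, correct on little-endian hosts only. */
uint8_t virq_priority(const union rank_priorities *p, unsigned int virq_in_rank)
{
    return p->priority[virq_in_rank];
}

/* Returns 1 when the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

On a little-endian host the lowest byte of register 1 is the priority of
vIRQ 4 within the rank, so both views agree without any shifting or
masking.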
Julien Grall [Wed, 7 Oct 2015 14:41:06 +0000 (15:41 +0100)]
xen/arm: vgic: ctlr stores a 32-bit hardware register so use uint32_t

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 7 Oct 2015 14:41:05 +0000 (15:41 +0100)]
xen/arm: io: Support sign-extension for every read access

The guest may try to load data from the emulated MMIO region using
instructions with Sign-Extension (i.e. ldrs*). Any use of one of those
will set the SSE bit (Syndrome Sign Extend) in the ISS (see B3-1433
in DDI 0406C.b).

Note that the bit can only be set for access size smaller than the
register size (i.e. byte/half-word for aarch32, byte/half-word/word for
aarch64). So we don't have to worry about undefined C behavior.

Until now, the support of sign-extension was limited to byte access in
vGIC emulation, although there is no reason not to have it generically.

So move the support just after we get the data from the MMIO emulation.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
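A generic sign-extension step of the kind described above might look
like this. It is a sketch, not the code from the patch: the function
name is hypothetical and the register is modelled as a 64-bit value.

```c
#include <stdint.h>

/*
 * Sign-extends the low size_bytes bytes of value to 64 bits.
 * size_bytes must be 1, 2 or 4, i.e. smaller than the register, which
 * keeps every shift below the width of the type.
 */
uint64_t sign_extend(uint64_t value, unsigned int size_bytes)
{
    unsigned int bits = size_bytes * 8;
    uint64_t sign = UINT64_C(1) << (bits - 1);

    value &= (sign << 1) - 1;       /* keep only the accessed bits */
    return (value ^ sign) - sign;   /* replicate the sign bit upwards */
}
```

Since the access is always narrower than the register, the shift count
never reaches 64, which is the undefined behaviour the message refers
to.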
Julien Grall [Wed, 7 Oct 2015 14:41:04 +0000 (15:41 +0100)]
xen/arm: io: Extend write/read handler to pass the register in parameter

Rather than letting each handler retrieve the register used by the
I/O access, add a new parameter to pass the register to the handler.

This will help to implement generic register manipulation on I/O access
such as sign-extension and endianness.

Read handlers need to modify the value of the register, so a pointer to
it is given in argument. Write handlers shouldn't modify the register,
therefore only a plain value is given.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
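The asymmetry between the two handler kinds can be sketched with
function pointer types. All names below (reg_t, struct mmio_info, the
typedefs and the demo handlers) are hypothetical and only mirror the
shape described above, not Xen's actual declarations.

```c
#include <stdint.h>

typedef uint64_t reg_t;                                 /* hypothetical register type */
struct mmio_info { uint64_t addr; unsigned int size; }; /* hypothetical access record */

/* Read handlers produce a value, so they receive a pointer to the register. */
typedef int (*mmio_read_fn)(const struct mmio_info *info, reg_t *r);

/* Write handlers only consume the value, so they receive it by value. */
typedef int (*mmio_write_fn)(const struct mmio_info *info, reg_t r);

static reg_t backing_store;   /* stands in for emulated device state */

int demo_read(const struct mmio_info *info, reg_t *r)
{
    (void)info;
    *r = backing_store;
    return 1;                 /* 1 = access handled */
}

int demo_write(const struct mmio_info *info, reg_t r)
{
    (void)info;
    backing_store = r;
    return 1;
}
```

Passing the write value by copy makes it impossible for a write handler
to change the guest register by accident.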
Julien Grall [Wed, 7 Oct 2015 14:41:03 +0000 (15:41 +0100)]
xen/arm: io: remove mmio_check_t typedef

This typedef is a left-over of the previous MMIO emulation
implementation.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:59 +0000 (17:48 +0200)]
xenconsole: try to attach to PV console if HVM fails

HVM guests have always used the emulated serial console by default, but if
the emulated serial pty cannot be fetched from xenstore try to use the PV
console instead.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:42 +0000 (17:48 +0200)]
libxc: remove dead HVM building code

Remove xc_hvm_build_x86.c and xc_hvm_build_arm.c since xc_hvm_build is no
longer used in order to create HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:41 +0000 (17:48 +0200)]
libxl: switch HVM domain building to use xc_dom_* helpers

Now that we have all the code in place, HVM domain building in libxl can be
switched to use the xc_dom_* family of functions, just as they are used in
order to build PV guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:40 +0000 (17:48 +0200)]
libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests

This xc_dom_arch will be used in order to build HVM domains. The code is
based on the existing xc_hvm_populate_memory and xc_hvm_populate_params
functions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:39 +0000 (17:48 +0200)]
libxc: rework BSP initialization

Place the calls to xc_vcpu_setcontext and the allocation of the hypercall
buffer into the arch-specific vcpu hooks. This is needed in order to
introduce a new builder, so x86 HVM guests can initialize the BSP using
XEN_DOMCTL_sethvmcontext instead of XEN_DOMCTL_setvcpucontext.

This patch should not introduce any functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:38 +0000 (17:48 +0200)]
libxc: make arch_setup_boot{init/late} xc_dom_arch hooks

This should not introduce any functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:37 +0000 (17:48 +0200)]
libxc: make arch_setup_meminit a xc_dom_arch hook

This allows having different arch_setup_meminit implementations based on the
guest type. It should not introduce any functional changes.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
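Turning a function into a per-architecture hook generally means moving
it behind a function pointer chosen by guest type. The sketch below uses
hypothetical struct and function names; it does not reproduce the real
xc_dom_arch layout.

```c
#include <stddef.h>
#include <string.h>

struct dom_image { int populated_by; };   /* hypothetical, heavily reduced */

struct dom_arch {
    const char *guest_type;
    int (*meminit)(struct dom_image *dom);   /* the hook */
};

static int meminit_pv(struct dom_image *dom)  { dom->populated_by = 1; return 0; }
static int meminit_hvm(struct dom_image *dom) { dom->populated_by = 2; return 0; }

static const struct dom_arch arches[] = {
    { "xen-3.0-x86_64", meminit_pv  },
    { "hvm-3.0-x86_32", meminit_hvm },
};

/* Runs the meminit hook registered for guest_type, or fails if none matches. */
int dom_meminit(const char *guest_type, struct dom_image *dom)
{
    for (size_t i = 0; i < sizeof(arches) / sizeof(arches[0]); i++)
        if (strcmp(arches[i].guest_type, guest_type) == 0)
            return arches[i].meminit(dom);
    return -1;
}
```

Callers stay identical for PV and HVM guests; only the table entry
decides which implementation runs.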
Roger Pau Monne [Wed, 7 Oct 2015 16:55:38 +0000 (18:55 +0200)]
libxc: introduce a domain loader for HVM guest firmware

Introduce a very simple (and dummy) domain loader to be used to load the
firmware (hvmloader) into HVM guests. Since hvmloader is just a 32bit ELF
executable, the loader is fairly simple.

Since the order in which loaders are tested cannot be arranged, prevent the
current elfloader from trying to boot a kernel that doesn't contain Xen
ELFNOTES.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:35 +0000 (17:48 +0200)]
libxc: introduce the notion of a container type

Introduce the notion of a container type into xc_dom_image. This will be
needed by later changes that will also use xc_dom_image in order to build
HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:34 +0000 (17:48 +0200)]
libxc: unify xc_dom_p2m_{host/guest}

Unify both functions into xc_dom_p2m. Should not introduce any functional
change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Roger Pau Monne [Fri, 2 Oct 2015 15:48:33 +0000 (17:48 +0200)]
libxc: split x86 HVM setup_guest into smaller logical functions

This is just a preparatory change to clean up the code in setup_guest.
Should not introduce any functional changes.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Daniel Kiper [Thu, 8 Oct 2015 09:26:37 +0000 (11:26 +0200)]
efi: split out efi_exit_boot()

..which gets memory map and calls ExitBootServices(). We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:25:09 +0000 (11:25 +0200)]
efi: split out efi_set_gop_mode()

..which sets chosen GOP mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:24:31 +0000 (11:24 +0200)]
efi: split out efi_variables()

..which collects variable store parameters. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:24:00 +0000 (11:24 +0200)]
efi: split out efi_tables()

..which collects system tables data. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:23:28 +0000 (11:23 +0200)]
efi: split out efi_find_gop_mode()

..which finds suitable GOP mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:22:52 +0000 (11:22 +0200)]
efi: split out efi_get_gop()

..which gets pointer to GOP device. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:21:45 +0000 (11:21 +0200)]
efi: split out efi_console_set_mode()

..which sets console mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Daniel Kiper [Thu, 8 Oct 2015 09:19:28 +0000 (11:19 +0200)]
efi: split out efi_init()

..which initializes basic EFI variables. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 8 Oct 2015 09:02:47 +0000 (11:02 +0200)]
x86/p2m: fix typo "populete"

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Wei Wang [Thu, 8 Oct 2015 09:01:58 +0000 (11:01 +0200)]
x86/cpufreq: relocate the driver register function

Move the driver register function to the cpufreq.c, and remove the
(unused) de-registration one.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Mark the function __init.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Fri, 2 Oct 2015 14:56:41 +0000 (15:56 +0100)]
libxl: fix places missed by spatch

The spatch provided in the previous patch didn't handle all sites that need
transformation.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Fri, 2 Oct 2015 14:56:40 +0000 (15:56 +0100)]
libxl: map LIBXL__LOG_VERBOSE to XTL_VERBOSE

There is code in libxl using XTL_VERBOSE. We should provide a libxl
mapping for it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Fri, 2 Oct 2015 14:56:39 +0000 (15:56 +0100)]
libxl: fix long lines and delete extraneous quotes

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Fri, 2 Oct 2015 14:56:38 +0000 (15:56 +0100)]
libxl: convert to use LOG() macro

This patch converts most LIBXL__LOG* macros to the LOG macro. It's done with
spatch plus some hand coding.
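
To make the effect of the conversion concrete, here is a self-contained
mock-up of one call site before and after. The macro and type
definitions are simplified stand-ins written for this example, not the
ones from libxl's headers; in particular, deriving the context from a
local gc inside LOG() is an assumption about how the real macro works.

```c
#include <stdarg.h>
#include <stdio.h>

enum level { XTL_DEBUG, XTL_ERROR };                 /* stand-in log levels */
struct ctx { char last[128]; enum level last_level; };
struct gc  { struct ctx *owner; };

/* Stand-in backend: formats the message into the context. */
static void log_impl(struct ctx *ctx, enum level lvl, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(ctx->last, sizeof(ctx->last), fmt, ap);
    va_end(ap);
    ctx->last_level = lvl;
}

#define LIBXL__LOG(ctx, lvl, ...) log_impl((ctx), (lvl), __VA_ARGS__)
#define CTX                       (gc->owner)   /* needs a gc in scope */
#define LOG(l, ...)               LIBXL__LOG(CTX, XTL_##l, __VA_ARGS__)

/* Before: explicit ctx argument and full level name. */
void report_old(struct ctx *ctx, int domid)
{
    LIBXL__LOG(ctx, XTL_ERROR, "domain %d failed", domid);
}

/* After: ctx comes from gc and the level prefix is dropped. */
void report_new(struct gc *gc, int domid)
{
    LOG(ERROR, "domain %d failed", domid);
}
```

This also shows why functions without a gc need one added before they
can use LOG(), which is the first category of manual fixes listed
further down.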

Using spatch rune:

    spatch --in-place --no-includes --include-headers \
        --sp-file libxl.spatch \
        tools/libxl/libxl*.c

with some exceptions.

libxl_json.c is untouched because the notion of ctx is in fact referring
to yajl context.

libxl_qmp.c is untouched because libxl ctx is buried in qmp context.

libxl_fork.c is untouched because it's clearer to just use original
code.

Some fallouts are dealt with manually. There are three categories.

Functions that don't have gc defined. Add gc definition with GC_INIT.
Also try my best to make them conform with libxl coding style.

 * libxl_list_domain
 * libxl_domain_info
 * libxl_domain_pause
 * libxl_get_physinfo
 * libxl_domain_set_nodeaffinity
 * libxl_domain_get_nodeaffinity
 * libxl_get_scheduler
 * libxl_sched_credit_params_get
 * libxl_sched_credit_params_set
 * libxl_send_debug_keys
 * libxl_xen_console_read_line
 * libxl_tmem_list
 * libxl_tmem_freeze
 * libxl_tmem_thaw
 * libxl_tmem_set
 * libxl_tmem_shared_auth
 * libxl_tmem_freeable
 * libxl_fd_set_cloexec
 * libxl_fd_set_nonblock
 * libxl__init_recursive_mutex
 * READ_WRITE_EXACTLY
 * libxl__ao_complete_check_progress_reports

Functions that don't need the ctx variable anymore after conversion. Delete
that variable.

 * libxl__device_from_disk
 * domcreate_rebuild_done
 * domcreate_devmodel_started
 * domcreate_attach_pci
 * libxl__domain_device_model
 * libxl__build_device_model_args_new
 * libxl__build_device_model_args
 * libxl__create_pci_backend
 * libxl__device_pci_add_xenstore
 * sysfs_write_bdf
 * sysfs_dev_unbind
 * pciback_dev_has_slot
 * pciback_dev_is_assigned
 * pciback_dev_assign
 * pciback_dev_unassign
 * pci_assignable_driver_path_write
 * libxl__device_pci_assignable_remove
 * libxl__xenstore_child_wait_deprecated
 * libxl__xs_libxl_path
 * libxl__device_model_version_running

Special handling for some functions.

 * ao__abort: easier to just use original code.
 * e820_sanitize: should have taken gc instead of ctx

=====
virtual patch
virtual context
virtual org
virtual report

@level1@
identifier FN =~ "LIBXL__LOG|LIBXL__LOG_ERRNO|LIBXL__LOG_ERRNOVAL";
constant l1 =~ "(LIBXL__LOG|XTL)_(DEBUG|INFO|WARNING|ERROR)";
expression ctx;
@@
FN(ctx, l1, ...);

@script:python level2@
l1 << level1.l1;
l2;
@@

import re
coccinelle.l2 = re.sub("LIBXL__LOG_|XTL_", "", l1);
if coccinelle.l2 == "WARNING": coccinelle.l2 = "WARN"

@log10@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
@@
-LIBXL__LOG(ctx, l1, fmt);
+LOG(l2, fmt);

@log11@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression arg1;
@@
-LIBXL__LOG(ctx, l1, fmt, arg1);
+LOG(l2, fmt, arg1);

@log12@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression arg1, arg2;
@@
-LIBXL__LOG(ctx, l1, fmt, arg1, arg2);
+LOG(l2, fmt, arg1, arg2);

@log13@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression arg1, arg2, arg3;
@@
-LIBXL__LOG(ctx, l1, fmt, arg1, arg2, arg3);
+LOG(l2, fmt, arg1, arg2, arg3);

@log20@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
@@
-LIBXL__LOG_ERRNO(ctx, l1, fmt);
+LOGE(l2, fmt);

@log21@
expression ctx;
expression fmt;
constant level1.l1;
identifier level2.l2;
expression arg1;
@@
-LIBXL__LOG_ERRNO(ctx, l1, fmt, arg1);
+LOGE(l2, fmt, arg1);

@log22@
expression ctx;
expression fmt;
constant level1.l1;
identifier level2.l2;
expression arg1, arg2;
@@
-LIBXL__LOG_ERRNO(ctx, l1, fmt, arg1, arg2);
+LOGE(l2, fmt, arg1, arg2);

@log23@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression arg1, arg2, arg3;
@@
-LIBXL__LOG_ERRNO(ctx, l1, fmt, arg1, arg2, arg3);
+LOGE(l2, fmt, arg1, arg2, arg3);

@log30@
expression fmt;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression errnoval;
@@
-LIBXL__LOG_ERRNOVAL(ctx, l1, errnoval, fmt);
+LOGEV(l2, errnoval, fmt);

@log31@
expression fmt;
expression arg1;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression errnoval;
@@
-LIBXL__LOG_ERRNOVAL(ctx, l1, errnoval, fmt, arg1);
+LOGEV(l2, errnoval, fmt, arg1);

@log32@
expression fmt;
expression arg1, arg2;
expression ctx;
constant level1.l1;
identifier level2.l2;
expression errnoval;
@@
-LIBXL__LOG_ERRNOVAL(ctx, l1, errnoval, fmt, arg1, arg2);
+LOGEV(l2, errnoval, fmt, arg1, arg2);
=====

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoAdd Libc multiarch package as build prerequisites on 64-bit platforms to the README
Sander Eikelenboom [Tue, 6 Oct 2015 16:58:25 +0000 (18:58 +0200)]
Add Libc multiarch package as build prerequisites on 64-bit platforms to the README

When building on 64-bit platforms this prevents build errors for
32-bit components which are enabled on a default build.

Signed-off-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/libxc: Improve efficiency of xc_cpuid_apply_policy()
Andrew Cooper [Mon, 5 Oct 2015 13:12:17 +0000 (14:12 +0100)]
tools/libxc: Improve efficiency of xc_cpuid_apply_policy()

Having the internals of xc_cpuid_policy() make hypercalls to collect domain
information causes xc_cpuid_apply_policy() to be very inefficient.

Re-order operations to collect all information at once at the outermost layer,
and pass a structure in to all cpuid policy generation functions.

This removes several hypercalls (4 from HVM, 3 from PV) for each of the
up-to 108 leaves processed.

No change in the eventual policy provided, although all the information
gathering now has (or has correct) error checking.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agodocs: xl.cfg: permissive option is not PV only.
Ian Campbell [Tue, 6 Oct 2015 08:42:35 +0000 (09:42 +0100)]
docs: xl.cfg: permissive option is not PV only.

Since XSA-131 qemu-xen has defaulted to non-permissive mode and the
option was extended to cover that case in 015a373351e5 "tools: libxl:
allow permissive qemu-upstream pci passthrough".

Since I was rewrapping to adjust the text anyway I've split the safety
warning into a separate paragraph to make it more obvious.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Eric <epretorious@yahoo.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agotools/hotplug: Scan xenstore once when attaching shared images files
Mike Latimer [Fri, 2 Oct 2015 14:09:32 +0000 (08:09 -0600)]
tools/hotplug: Scan xenstore once when attaching shared images files

During the attachment of a loopback mounted image file, the mode of all
current instances of this device already attached to other domains must be
checked. This requires finding all loopback devices pointing to the inode
of the shared image file, and then comparing the major and minor number of
these devices to the major and minor number of every vbd device found in the
xenstore database.

Prior to this patch, the entire xenstore database is walked for every instance
of every loopback device pointing to the same shared image file. This process
causes the block attachment process to become exponentially slower with every
additional attachment of a shared image.

Rather than scanning all of xenstore for every instance of a shared loopback
device, this patch creates a list of the major and minor numbers from all
matching loopback devices. After generating this list, Xenstore is walked
once, and major and minor numbers from every vbd are checked against the list.
If a match is found, the mode of that vbd is checked for compatibility with
the mode of the device being attached.
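
The single-pass approach can be sketched as follows. The actual change is
to a hotplug script; the C below is only an illustration of the idea, and
every name in it (ex_devnum, ex_in_list, ex_count_shared) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical major:minor pair, for illustration only. */
struct ex_devnum {
    unsigned int major, minor;
};

/* Return true if d appears in the list collected from the loopback devices. */
static bool ex_in_list(const struct ex_devnum *list, size_t n,
                       struct ex_devnum d)
{
    for (size_t i = 0; i < n; i++)
        if (list[i].major == d.major && list[i].minor == d.minor)
            return true;
    return false;
}

/*
 * Walk the vbd entries once and count those backed by one of the
 * loopback devices collected beforehand.
 */
static size_t ex_count_shared(const struct ex_devnum *loops, size_t n_loops,
                              const struct ex_devnum *vbds, size_t n_vbds)
{
    size_t matches = 0;

    for (size_t i = 0; i < n_vbds; i++)
        if (ex_in_list(loops, n_loops, vbds[i]))
            matches++; /* a real implementation would check the vbd mode here */
    return matches;
}
```

The vbd list is traversed exactly once regardless of how many loopback
devices point at the shared image.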

Signed-off-by: Mike Latimer <mlatimer@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoMINIOS_UPSTREAM_REVISION Update
Ian Campbell [Wed, 7 Oct 2015 11:18:10 +0000 (12:18 +0100)]
MINIOS_UPSTREAM_REVISION Update

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: psci: use SMC64 function ID when available on ARM64
Brijesh Singh [Mon, 5 Oct 2015 16:38:39 +0000 (11:38 -0500)]
xen/arm: psci: use SMC64 function ID when available on ARM64

As per PSCI 0.2 spec, if CPU_ON entry_point_address is 64-bit then SMC
call function ID parameter should be set to SMC64 version.

Signed-off-by: Brijesh Singh <brijeshkumar.singh@amd.com>
Reviewed-by: Julien grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agogitignore: ignore extras/mini-os*
Wei Liu [Tue, 6 Oct 2015 16:44:37 +0000 (17:44 +0100)]
gitignore: ignore extras/mini-os*

The original pattern doesn't handle mini-os-dir-remote.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoflask: Allow initial domain to use XENPF_get_symbol
Konrad Rzeszutek Wilk [Sat, 3 Oct 2015 19:22:29 +0000 (15:22 -0400)]
flask: Allow initial domain to use XENPF_get_symbol

It looks to be missing in the policy file for the initial
domain. Eventually we may want to extend this access to
non-dom0 domains but for now it is certainly dom0-only.

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years agobuild: drop unused config variable CONFIG_HVM
Doug Goldstein [Tue, 6 Oct 2015 15:39:33 +0000 (17:39 +0200)]
build: drop unused config variable CONFIG_HVM

CONFIG_HVM is not used anywhere in the build process so drop it.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agovcpu: add missing dummy_vcpu_info to compat VCPUOP_initialise
Roger Pau Monné [Tue, 6 Oct 2015 15:38:41 +0000 (17:38 +0200)]
vcpu: add missing dummy_vcpu_info to compat VCPUOP_initialise

This check is missing from the compat version when compared to the
non-compat version.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
9 years agox86/cpufreq: add a new driver interface, setpolicy()
Wei Wang [Tue, 6 Oct 2015 15:37:48 +0000 (17:37 +0200)]
x86/cpufreq: add a new driver interface, setpolicy()

In order to better support future Intel processors, intel_pstate
changes to use percentage values to tune P-states. The setpolicy
driver interface is used to configure the intel_pstate internal
policy. The __cpufreq_set_policy needs to be intercepted to use
the setpolicy driver if it exists.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: APERF/MPERF feature detect
Wei Wang [Mon, 5 Oct 2015 16:16:39 +0000 (18:16 +0200)]
x86: APERF/MPERF feature detect

Add support to detect the APERF/MPERF feature. Also, remove the identical
code in cpufreq.c and powernow.c.
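
For reference, the feature is reported in CPUID leaf 6, ECX bit 0. A
minimal sketch of such a check, with a hypothetical helper name that is
not taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 6, ECX bit 0: IA32_APERF/IA32_MPERF are available. */
#define EX_CPUID6_ECX_APERFMPERF (1u << 0)

/* Hypothetical helper taking the ECX value returned by CPUID leaf 6. */
static bool ex_has_aperfmperf(uint32_t cpuid6_ecx)
{
    return (cpuid6_ecx & EX_CPUID6_ECX_APERFMPERF) != 0;
}
```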

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cosmetics.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
9 years agopublic: fix typo in memory.h
Julien Grall [Mon, 5 Oct 2015 15:18:48 +0000 (17:18 +0200)]
public: fix typo in memory.h

Signed-off-by: Julien Grall <julien.grall@citrix.com>
9 years agoMAINTAINERS: adding myself as co-maintainer of scheduling
Dario Faggioli [Mon, 5 Oct 2015 15:18:33 +0000 (17:18 +0200)]
MAINTAINERS: adding myself as co-maintainer of scheduling

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
9 years agoMAINTAINERS: adding myself as co-maintainer of cpupools
Dario Faggioli [Mon, 5 Oct 2015 15:18:10 +0000 (17:18 +0200)]
MAINTAINERS: adding myself as co-maintainer of cpupools

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
9 years agouse masking operation instead of test_bit for CSFLAG bits
Juergen Gross [Mon, 5 Oct 2015 15:16:38 +0000 (17:16 +0200)]
use masking operation instead of test_bit for CSFLAG bits

Use a bit mask for testing of a set bit instead of test_bit in case no
atomic operation is needed, as this will lead to smaller and more
effective code.
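
As a rough illustration of the difference, assuming a hypothetical flags
word and bit names (not the actual CSFLAG definitions):

```c
#include <stdbool.h>

/* Hypothetical flag layout, for illustration only. */
#define EX_FLAG_BIT 1u                  /* bit number */
#define EX_FLAG     (1u << EX_FLAG_BIT) /* corresponding mask */

/* test_bit()-like helper: takes a bit number and a pointer to the word. */
static inline bool ex_test_bit(unsigned int nr,
                               const volatile unsigned int *addr)
{
    return ((*addr >> nr) & 1u) != 0;
}

/* Plain masking: sufficient when no atomic access is required. */
static inline bool ex_flag_set(unsigned int flags)
{
    return (flags & EX_FLAG) != 0;
}
```

Both forms give the same answer; the mask form works on a value the
compiler may already hold in a register.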

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
9 years agopaging: Fix compile error when DEBUG_TRACE_DUMP is enabled.
Konrad Rzeszutek Wilk [Sat, 3 Oct 2015 09:41:23 +0000 (05:41 -0400)]
paging: Fix compile error when DEBUG_TRACE_DUMP is enabled.

And also remove the extra space. 'gmfn' does not exist in this
function anymore.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years agouse masking operation instead of test_bit for MCSF bits
Juergen Gross [Fri, 2 Oct 2015 11:44:59 +0000 (13:44 +0200)]
use masking operation instead of test_bit for MCSF bits

Use a bit mask for testing of a set bit instead of test_bit in case no
atomic operation is needed, as this will lead to smaller and more
effective code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agouse masking operation instead of test_bit for VPF bits
Juergen Gross [Fri, 2 Oct 2015 11:44:31 +0000 (13:44 +0200)]
use masking operation instead of test_bit for VPF bits

Use a bit mask for testing of a set bit instead of test_bit in case no
atomic operation is needed, as this will lead to smaller and more
effective code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agouse masking operation instead of test_bit for VGCF bits
Juergen Gross [Fri, 2 Oct 2015 11:44:04 +0000 (13:44 +0200)]
use masking operation instead of test_bit for VGCF bits

Use a bit mask for testing of a set bit instead of test_bit in case no
atomic operation is needed, as this will lead to smaller and more
effective code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agouse masking operation instead of test_bit for RTDS bits
Juergen Gross [Fri, 2 Oct 2015 11:43:35 +0000 (13:43 +0200)]
use masking operation instead of test_bit for RTDS bits

Use a bit mask for testing of a set bit instead of test_bit in case no
atomic operation is needed, as this will lead to smaller and more
effective code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
9 years agox86/PoD: shorten certain operations on higher order ranges
Jan Beulich [Fri, 2 Oct 2015 11:42:01 +0000 (13:42 +0200)]
x86/PoD: shorten certain operations on higher order ranges

Now that p2m->get_entry() always returns a valid order, utilize this
to accelerate some of the operations in PoD code. (There are two uses
of p2m->get_entry() left which don't easily lend themselves to this
optimization.)

Also adjust a few types as needed and remove stale comments from
p2m_pod_cache_add() (to avoid duplicating them yet another time).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years agox86/p2m-pt: use pre-calculated IOMMU flags
Jan Beulich [Fri, 2 Oct 2015 11:41:24 +0000 (13:41 +0200)]
x86/p2m-pt: use pre-calculated IOMMU flags

... instead of recalculating them.

At once clean up formatting of the touched code and drop a stray loop
local variable shadowing a function scope one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/p2m-pt: tighten conditions of IOMMU mapping updates
Jan Beulich [Fri, 2 Oct 2015 11:40:36 +0000 (13:40 +0200)]
x86/p2m-pt: tighten conditions of IOMMU mapping updates

Whether the MFN changes does not depend on the new entry being valid
(but solely on the old one), and the need to update or TLB-flush also
depends on permission changes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years agox86/EPT: work around hardware erratum setting A bit
Ross Lagerwall [Fri, 2 Oct 2015 11:39:12 +0000 (13:39 +0200)]
x86/EPT: work around hardware erratum setting A bit

Since commit 191b3f3344ee ("p2m/ept: enable PML in p2m-ept for
log-dirty"), the A and D bits of EPT paging entries are set
unconditionally, regardless of whether PML is enabled or not. This
causes a regression in Xen 4.6 on some processors due to Intel Errata
AVR41 -- HVM guests get severe memory corruption when the A bit is set
due to incorrect TLB flushing on mov to cr3. The erratum affects the
Atom C2000 family (Avoton).

To fix, do not set the A bit on this processor family.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Move feature suppression to feature detection code. Add command line
override.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoxen: write a high level description of the sub-arch choices for heap layout
Ian Campbell [Wed, 30 Sep 2015 13:36:03 +0000 (14:36 +0100)]
xen: write a high level description of the sub-arch choices for heap layout

The 3 options which (sub)arches have for the layout of their heaps are
a little subtle (in particular the two CONFIG_SEPARATE_XENHEAP=n
submodes) and can be a bit tricky to derive from the code.

Therefore try and write down some guidance on what the various modes
are.

Note that this is intended more as a high level overview rather than a
detailed guide to the full page allocator interfaces.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years agoxen/arm: vgic-v2: Drop cbase from arch_domain
Julien Grall [Tue, 29 Sep 2015 16:21:38 +0000 (17:21 +0100)]
xen/arm: vgic-v2: Drop cbase from arch_domain

The field value is only used within a single function in the vgic-v2
emulation. So it's not necessary to store the value in the domain
structure.

This also saves 8 bytes in a structure which is beginning to be constrained
(the maximum size of struct domain is 4KB).

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: Warn when a device tree path will be re-used by Xen
Julien Grall [Tue, 29 Sep 2015 16:21:37 +0000 (17:21 +0100)]
xen/arm: Warn when a device tree path will be re-used by Xen

Xen is unconditionally using certain device tree paths to create DOM0
specific nodes (for instance /psci, /memory and /hypervisor).

Print a warning message on the console to let the user know if we
re-use one of these nodes.

Note that the content of most of those is very common and they
should have already been skipped via the compatible string or type
string. This warning is here to catch unusual device-tree and
compatible string that we may not yet support in Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: Retrieve the correct number of cells when building dom0 DT
Julien Grall [Tue, 29 Sep 2015 16:21:36 +0000 (17:21 +0100)]
xen/arm: Retrieve the correct number of cells when building dom0 DT

The functions dt_n_*_cells return the number of cells for a "reg"
property of a given node. So those numbers won't be correct if the
parent of a given node is passed.

This is fine today because the parent is always the root node which
means there is no upper parent.

Introduce new helpers dt_child_n_*_cells to retrieve the number of
cells for the address and size that can be used to create the "reg"
property of the immediate child of a given parent. Also introduce
dt_child_set_range to pair up with dt_child_n_*_cells.
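
The distinction can be sketched as below. This is a simplified model with
hypothetical types and names (ex_dt_node, ex_n_addr_cells,
ex_child_n_addr_cells), not the Xen device tree API; it only shows where
the lookup starts in each case.

```c
#include <stddef.h>

/* Hypothetical, simplified device tree node. */
struct ex_dt_node {
    const struct ex_dt_node *parent;
    int addr_cells; /* value of #address-cells, or -1 if absent */
};

/* Assumed fallback when no #address-cells property is found. */
#define EX_DT_DEFAULT_ADDR_CELLS 2

/* Cells for the "reg" of a child of parent: start at parent itself. */
static int ex_child_n_addr_cells(const struct ex_dt_node *parent)
{
    for (; parent; parent = parent->parent)
        if (parent->addr_cells >= 0)
            return parent->addr_cells;
    return EX_DT_DEFAULT_ADDR_CELLS;
}

/* Cells for the "reg" of node: start at the parent of node. */
static int ex_n_addr_cells(const struct ex_dt_node *node)
{
    /* The root has no parent, so its own property is used. */
    return ex_child_n_addr_cells(node->parent ? node->parent : node);
}
```

Passing a non-root parent to the first kind of helper would consult the
grandparent, which is the mismatch described above.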

Use the new helpers when creating the hypervisor and memory node where
we only have the parent in hand. This is because those nodes are created
from scratch by Xen and therefore we don't have a dt_device_node for
them. The only thing we have is a pointer to their future parent.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: gic: Make it clear the GIC node is passed to make_hwdom_dt_node
Julien Grall [Tue, 29 Sep 2015 16:21:35 +0000 (17:21 +0100)]
xen/arm: gic: Make it clear the GIC node is passed to make_hwdom_dt_node

The callback make_hwdom_dt_node already has the GIC node in parameter.

Rather than using a weird mix between "dt_interrupt_controller" (aliased
to "gic") and "node", rename the callback parameter "node" to "gic" and
remove local GIC definitions in terms of the global
dt_interrupt_controller.

Also, add an assert to gic_make_hwdom_dt_node to check that the GIC
really is the global dt_interrupt_controller.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: io: Shorten the name of the fields and clean up
Julien Grall [Tue, 29 Sep 2015 14:44:41 +0000 (15:44 +0100)]
xen/arm: io: Shorten the name of the fields and clean up

The field names in the IO emulation are really long and repeatedly use
the term "handler", which makes some lines cumbersome to read:

mmio_handler->mmio_handler_ops->write_handler

Also take the opportunity to do some clean up:
    - Avoid "handler" vs "handle" in register_mmio_handler
    - Use a local variable to initialize handler in
    register_mmio_handler
    - Add a comment explaining the dsb(ish) in register_mmio_handler
    - Rename the structure io_handler into vmmio because the io_handler
    is ultimately handling multiple handlers and the name of the field
    was io_handlers. Also rename the field io_handlers to vmmio
    - Rename the field mmio_handler_ops to ops because we are in the
    structure mmio_handler, so there is no need to repeat it
    - Rename the field mmio_handlers to handlers because we are in the
    vmmio structure
    - Make it clear that register_mmio_ops is taking an ops and not a
    handle
    - Clean up local variables to make the code easier to understand

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: vgic-v3: Correctly retrieve the vCPU associated to a re-distributor
Julien Grall [Tue, 29 Sep 2015 14:44:40 +0000 (15:44 +0100)]
xen/arm: vgic-v3: Correctly retrieve the vCPU associated to a re-distributor

When the guest is accessing the re-distributor, Xen retrieves the base
of the re-distributor using a mask based on the stride.

When the stride contains multiple bits set, the corresponding mask will be
computed incorrectly [1] and therefore giving invalid vCPU and offset:

(XEN) d0v0: vGICR: unknown gpa read address 000000008d130008
(XEN) traps.c:2447:d0v1 HSR=0x93c08006 pc=0xffffffc00032362c
gva=0xffffff80000b0008 gpa=0x0000008d130008

For instance if the region of re-distributor is starting at 0x8d100000
and the stride is 0x30000, an access to the address 0x8d130008 should
be valid and use the re-distributor of vCPU1 with an offset of 0x8.
However, Xen returns vCPU0 and an offset of 0x20008.

I didn't find a way to replace the current computation of the mask with
a valid one. The only solution I have found is to pass the region in
private data of the handler. So we can directly get the offset from the
beginning of the region and find the corresponding vCPU/offset in the
re-distributor.

This also makes the code simpler and avoids the fast/slow path.
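
The arithmetic in the example above can be reproduced with the sketch
below. The names are hypothetical and the mask-based function is only one
plausible reading of the old computation, not the code that was removed.

```c
#include <stdint.h>

struct ex_rdist_hit {
    unsigned int vcpu; /* index of the re-distributor / vCPU */
    uint64_t offset;   /* offset inside that re-distributor */
};

/* Mask-based lookup: only correct when stride is a power of two. */
static struct ex_rdist_hit ex_lookup_mask(uint64_t region_base,
                                          uint64_t stride, uint64_t gpa)
{
    uint64_t mask = stride - 1;
    uint64_t rdist_base = gpa & ~mask;
    struct ex_rdist_hit hit = {
        .vcpu = (unsigned int)((rdist_base - region_base) / stride),
        .offset = gpa & mask,
    };

    return hit;
}

/* Region-relative lookup: works for any non-zero stride. */
static struct ex_rdist_hit ex_lookup_region(uint64_t region_base,
                                            uint64_t stride, uint64_t gpa)
{
    uint64_t rel = gpa - region_base;
    struct ex_rdist_hit hit = {
        .vcpu = (unsigned int)(rel / stride),
        .offset = rel % stride,
    };

    return hit;
}
```

With a region base of 0x8d100000, a stride of 0x30000 and an access to
0x8d130008, the mask variant yields vCPU0 with offset 0x20008, while the
region-relative variant yields vCPU1 with offset 0x8.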

[1] http://lists.xen.org/archives/html/xen-devel/2015-09/msg03372.html

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: io: Extend write/read handler to pass private data
Julien Grall [Tue, 29 Sep 2015 14:44:39 +0000 (15:44 +0100)]
xen/arm: io: Extend write/read handler to pass private data

Some handlers may need private data in order to quickly get
information related to the emulated region.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/arm: support gzip compressed kernels
Stefano Stabellini [Tue, 29 Sep 2015 15:59:04 +0000 (16:59 +0100)]
xen/arm: support gzip compressed kernels

Free the memory used for the compressed kernel and update the relative
mod->start and mod->size parameters with the uncompressed ones.

To decompress the kernel, allocate memory from domheap, because freeing
the modules is done by calling init_heap_pages, which frees to domheap.
Map these pages using vmap, because they might not be in the linear 1:1
map.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: ian.campbell@citrix.com
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen: move perform_gunzip to common
Stefano Stabellini [Tue, 29 Sep 2015 15:59:03 +0000 (16:59 +0100)]
xen: move perform_gunzip to common

The current gunzip code to decompress the Dom0 kernel is implemented in
inflate.c which is included by bzimage.c.

I am looking at doing the same on ARM64 but there is quite a bit of
boilerplate definitions that I would need to import in order for
inflate.c to work correctly.

Instead of copying/pasting the code from x86/bzimage.c, move those
definitions to a new common file, gunzip.c. Export only perform_gunzip
and gzip_check. Leave output_length where it is.
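
For context, a check of this kind typically looks at the gzip magic bytes
(0x1f 0x8b, per RFC 1952). The sketch below uses a hypothetical name and
signature; the real gzip_check may differ in both.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: does the buffer start with the gzip magic? */
static bool ex_looks_like_gzip(const unsigned char *buf, size_t len)
{
    /* A gzip member begins with ID1 = 0x1f, ID2 = 0x8b (RFC 1952). */
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```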

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: andrew.cooper3@citrix.com
9 years agolibxl: don't shadow global "socket" in psr code
Wei Liu [Wed, 30 Sep 2015 14:54:11 +0000 (15:54 +0100)]
libxl: don't shadow global "socket" in psr code

SLES11 and OpenSUSE 11.4 complain:

[ 1227s] libxl_psr.c: In function 'libxl_psr_cat_get_l3_info':
[ 1227s] libxl_psr.c:342: error: declaration of 'socket' shadows a global declaration

Change "socket" to "socketid" to fix the problem.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agointroduce VM_EVENT_FLAG_SET_REGISTERS
Razvan Cojocaru [Wed, 30 Sep 2015 12:46:32 +0000 (14:46 +0200)]
introduce VM_EVENT_FLAG_SET_REGISTERS

A previous version of this patch dealing with support for skipping
the current instruction when a vm_event response requested it
computed the instruction length in the hypervisor, adding non-trivial
code dependencies. This patch allows a userspace vm_event client to
simply request that the guest's EIP is set to an arbitrary value,
computed by the introspection application. The registers that can
now be set are EAX-EDX, ESP, EBP, ESI, EDI, R8-R15, EFLAGS, and EIP.
CR0, CR3 and CR4 are not set, as at the time of vm_event_resume()
we can't call hvm_set_cr{0,3,4}() and simply setting
v->arch.hvm_vcpu.guest_cr[{0,3,4}] is unlikely to have the desired
effect. The rest of the vm_event registers are not set because
they're not being filled by hvm_event_fill_regs(), but only by
p2m_vm_event_fill_regs(). Currently x86-only.
The VCPU needs to be paused for this flag to take effect.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
9 years agosched: adjustments to some performance counters
Dario Faggioli [Wed, 30 Sep 2015 12:46:02 +0000 (14:46 +0200)]
sched: adjustments to some performance counters

More specifically:

1) rename vcpu_destroy to vcpu_remove

It seems this should have been done as part of 7e6b926a
("cpupools: Make interface more consistent"), which
renamed the function but not the counter.

In fact, because of cpupools, vcpus are not only removed
from a scheduler when they are destroyed, but also when
domains move between pools.

Make the related statistics counter reflect that more
properly.

2) rename vcpu_init to vcpu_alloc

As it lives in *_alloc_vdata.

3) add vcpu_insert

matching vcpu_remove, and useful to quickly check
whether the number of insertions and removals matches,
or in general investigate their relationship.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
9 years agosched: get rid of cpupool_scheduler_cpumask()
Dario Faggioli [Wed, 30 Sep 2015 12:45:23 +0000 (14:45 +0200)]
sched: get rid of cpupool_scheduler_cpumask()

and of (almost every) direct use of cpupool_online_cpumask().

In fact, what we really want most of the time
is the set of valid pCPUs of the cpupool a certain domain
is part of. Furthermore, in case it's called with a NULL
pool as argument, cpupool_scheduler_cpumask() does more
harm than good, by returning the bitmask of free pCPUs!

This commit, therefore:
 * gets rid of cpupool_scheduler_cpumask(), in favour of
   cpupool_domain_cpumask(), which makes it more evident
   what we are after, and accommodates some sanity checking;
 * replaces some of the calls to cpupool_online_cpumask()
   with calls to the new functions too.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Joshua Whitehead <josh.whitehead@dornerworks.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years agocredit1: fix tickling when it happens from a remote pCPU
Dario Faggioli [Wed, 30 Sep 2015 12:44:22 +0000 (14:44 +0200)]
credit1: fix tickling when it happens from a remote pCPU

especially if that is also from a different cpupool than the
processor of the vCPU that triggered the tickling.

In fact, it is possible that we get as far as calling vcpu_unblock()-->
vcpu_wake()-->csched_vcpu_wake()-->__runq_tickle() for the vCPU 'vc',
but all while running on a pCPU that is different from 'vc->processor'.

For instance, this can happen when an HVM domain runs in a cpupool,
with a different scheduler than the default one, and issues IOREQs
to Dom0, running in Pool-0 with the default scheduler.
In fact, right in this case, the following crash can be observed:

(XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    7
(XEN) RIP:    e008:[<ffff82d0801230de>] __runq_tickle+0x18f/0x430
(XEN) RFLAGS: 0000000000010086   CONTEXT: hypervisor (d1v0)
(XEN) rax: 0000000000000001   rbx: ffff8303184fee00   rcx: 0000000000000000
(XEN) ... ... ...
(XEN) Xen stack trace from rsp=ffff83031fa57a08:
(XEN)    ffff82d0801fe664 ffff82d08033c820 0000000100000002 0000000a00000001
(XEN)    0000000000006831 0000000000000000 0000000000000000 0000000000000000
(XEN) ... ... ...
(XEN) Xen call trace:
(XEN)    [<ffff82d0801230de>] __runq_tickle+0x18f/0x430
(XEN)    [<ffff82d08012348a>] csched_vcpu_wake+0x10b/0x110
(XEN)    [<ffff82d08012b421>] vcpu_wake+0x20a/0x3ce
(XEN)    [<ffff82d08012b91c>] vcpu_unblock+0x4b/0x4e
(XEN)    [<ffff82d080167bd0>] vcpu_kick+0x17/0x61
(XEN)    [<ffff82d080167c46>] vcpu_mark_events_pending+0x2c/0x2f
(XEN)    [<ffff82d08010ac35>] evtchn_fifo_set_pending+0x381/0x3f6
(XEN)    [<ffff82d08010a0f6>] notify_via_xen_event_channel+0xc9/0xd6
(XEN)    [<ffff82d0801c29ed>] hvm_send_ioreq+0x3e9/0x441
(XEN)    [<ffff82d0801bba7d>] hvmemul_do_io+0x23f/0x2d2
(XEN)    [<ffff82d0801bbb43>] hvmemul_do_io_buffer+0x33/0x64
(XEN)    [<ffff82d0801bc92b>] hvmemul_do_pio_buffer+0x35/0x37
(XEN)    [<ffff82d0801cc49f>] handle_pio+0x58/0x14c
(XEN)    [<ffff82d0801eabcb>] vmx_vmexit_handler+0x16b3/0x1bea
(XEN)    [<ffff82d0801efd21>] vmx_asm_vmexit_handler+0x41/0xc0

In this case, pCPU 7 is not in Pool-0, while the (Dom0's) vCPU being
woken is. pCPU 7's pool has a different scheduler than credit, but it
is, however, right from pCPU 7 that we are waking the Dom0's vCPUs.
Therefore, the current code tries to access csched_balance_mask for
pCPU 7, but that is not defined, and hence the Oops.

(Note that, in case the two pools run the same scheduler we see no
Oops, but things are still conceptually wrong.)

Cure things by making the csched_balance_mask macro accept a
parameter for fetching a specific pCPU's mask (instead of always
using smp_processor_id()).
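
The shape of the change can be sketched as follows. The array and the
ex_-prefixed names are stand-ins invented for this illustration; they are
not Xen's per-CPU machinery or the actual macro definition.

```c
#define EX_NR_CPUS 8

/* Stand-in for a per-pCPU scratch mask. */
static unsigned long ex_balance_masks[EX_NR_CPUS];

/* Hypothetical "current pCPU" accessor, fixed here for illustration. */
static unsigned int ex_current_cpu = 7;
#define ex_smp_processor_id() (ex_current_cpu)

/* Before: always the mask of the pCPU executing the code. */
#define ex_balance_mask_old (ex_balance_masks[ex_smp_processor_id()])

/* After: the caller names the pCPU whose mask it wants. */
#define ex_balance_mask(cpu) (ex_balance_masks[(cpu)])
```

With the parameterised form, code running on pCPU 7 can use the mask that
belongs to the pCPU of the vCPU being woken instead of its own.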

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years agoRevert "x86/PoD: shorten certain operations on higher order ranges"
Jan Beulich [Wed, 30 Sep 2015 12:43:21 +0000 (14:43 +0200)]
Revert "x86/PoD: shorten certain operations on higher order ranges"

This reverts commit dea4d7a9a847e8822f7fbfd7b143a5e203135179, which
has been found to be broken.

9 years agox86/PoD: shorten certain operations on higher order ranges
Jan Beulich [Tue, 29 Sep 2015 13:11:28 +0000 (15:11 +0200)]
x86/PoD: shorten certain operations on higher order ranges

Now that p2m->get_entry() always returns a valid order, utilize this
to accelerate some of the operations in PoD code. (There are two uses
of p2m->get_entry() left which don't easily lend themselves to this
optimization.)

Also adjust a few types as needed and remove stale comments from
p2m_pod_cache_add() (to avoid duplicating them yet another time).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>