fine grained control of REP emulation optimizations
Previously, if vm_event emulation support was enabled, then REP
optimizations were disabled when emulating REP-compatible
instructions. This patch allows fine-tuning of this behaviour by
providing a dedicated libxc helper function.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
cleanup domain builder declarations and related users
There are several unused function and structure declarations in the
hypervisor related to domain building. Remove them.
Use an enum for elf_dom_parms.pae instead of hard-coding the
values when setting the information, and adjust the code (hypervisor
and tools) to use the enum values instead of its own macros.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Quan Xu [Fri, 25 Sep 2015 16:03:04 +0000 (18:03 +0200)]
vt-d: fix IM bit unmask of Fault Event Control Register in init_vtd_hw()
Bits 0:29 in the Fault Event Control Register are 'Reserved and Preserved';
software cannot write 0 to them unconditionally. Software must preserve
the value read when writing.
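The read-modify-write pattern this calls for can be sketched as follows. This is only an illustration: the register is modelled by a plain variable, and fectl_read(), fectl_write() and the DMA_FECTL_IM name are assumptions made for the example, not the code touched by the patch. The IM bit is bit 31 of the register.

```c
#include <stdint.h>

/* Interrupt Mask (IM) is bit 31 of the Fault Event Control Register. */
#define DMA_FECTL_IM (UINT32_C(1) << 31)

/* Hypothetical stand-in for the memory-mapped register. */
static uint32_t fake_fectl;

static uint32_t fectl_read(void) { return fake_fectl; }
static void fectl_write(uint32_t val) { fake_fectl = val; }

/* Unmask fault events: clear IM only and write back every other bit as read. */
static void fectl_unmask(void)
{
    uint32_t sts = fectl_read();

    sts &= ~DMA_FECTL_IM;
    fectl_write(sts);
}

/* Mask fault events: set IM only, again preserving the remaining bits. */
static void fectl_mask(void)
{
    fectl_write(fectl_read() | DMA_FECTL_IM);
}
```

Writing a constant 0 instead of the modified read value would clear the preserved bits, which is the behaviour the fix avoids.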
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Quan Xu <quan.xu@intel.com>
Although we already have 'gfx_passthru' in b_info, it is not sufficient
now that we want to handle IGD specifically. Define a new field,
gfx_passthru_kind, to indicate that we're trying to pass through an IGD.
This also means other specific devices can be supported later just
by extending gfx_passthru_kind. It works together with
gfx_passthru to address IGD cases as follows:
gfx_passthru = 0 => sets build_info.u.gfx_passthru to false
gfx_passthru = 1 => sets build_info.u.gfx_passthru to true and
build_info.u.gfx_passthru_kind to DEFAULT
gfx_passthru = "igd" => sets build_info.u.gfx_passthru to true
and build_info.u.gfx_passthru_kind to IGD
If gfx_passthru_kind = DEFAULT, we call
libxl__is_igd_vga_passthru() to check whether the device matches the IGD
table, in which case we need to pass that option to qemu. If
gfx_passthru_kind = "igd", we always pass it.
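The mapping above can be sketched in C as follows. All type and function names here are hypothetical and chosen for the example; the real libxl types are generated elsewhere, and the table lookup is reduced to a boolean argument.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative names only, not the libxl definitions. */
enum gfx_passthru_kind {
    GFX_PASSTHRU_KIND_DEFAULT,
    GFX_PASSTHRU_KIND_IGD,
};

struct gfx_config {
    bool gfx_passthru;
    enum gfx_passthru_kind kind;
};

/* Map the textual gfx_passthru setting onto the two fields.
 * Returns 0 on success, -1 for an unrecognised value. */
static int parse_gfx_passthru(const char *val, struct gfx_config *cfg)
{
    cfg->gfx_passthru = false;
    cfg->kind = GFX_PASSTHRU_KIND_DEFAULT;

    if (strcmp(val, "0") == 0)
        return 0;
    if (strcmp(val, "1") == 0) {
        cfg->gfx_passthru = true;
        return 0;
    }
    if (strcmp(val, "igd") == 0) {
        cfg->gfx_passthru = true;
        cfg->kind = GFX_PASSTHRU_KIND_IGD;
        return 0;
    }
    return -1;
}

/* Decide whether the IGD option goes to qemu. device_in_igd_table stands
 * in for the result of the table lookup described above. */
static bool want_igd_option(const struct gfx_config *cfg, bool device_in_igd_table)
{
    if (!cfg->gfx_passthru)
        return false;
    if (cfg->kind == GFX_PASSTHRU_KIND_IGD)
        return true; /* explicitly requested: always pass the option */
    return device_in_igd_table;
}
```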
Also, "-gfx_passthru" was only introduced for qemu-xen-traditional,
so it should be removed from libxl__build_device_model_args_new() in
the case of qemu upstream.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
When working with qemu, IGD is a special device in the pass-through case,
so we need to identify it in order to handle it specially later. Define a table
recording all IGD types we currently support, and introduce two
helper functions to get the vendor and device IDs used to look up that table.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Quan Xu [Fri, 25 Sep 2015 07:08:22 +0000 (09:08 +0200)]
vt-d: fix IM bit mask and unmask of Fault Event Control Register
Bits 0:29 in the Fault Event Control Register are 'Reserved and Preserved';
software cannot write 0 to them unconditionally. Software must preserve
the value read when writing.
Signed-off-by: Quan Xu <quan.xu@intel.com> Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Andrew Cooper [Fri, 25 Sep 2015 07:06:34 +0000 (09:06 +0200)]
keyhandler: rework keyhandler infrastructure
struct keyhandler does not contain much information, and requires a lot
of boilerplate to use. It is far more convenient to have
register_keyhandler() take each piece of information as a parameter,
especially when introducing temporary debugging keyhandlers.
This in turn allows struct keyhandler itself to become private to
keyhandler.c and for the key_table to become more efficient.
key_table doesn't need to contain 256 entries; all keys are ASCII which
limits them to 7 bits of index, rather than 8. It can also become a
straight array, rather than an array of pointers. The overall effect of
this is the key_table grows in size by 50%, but there are no longer
24-byte keyhandler structures all over the data section.
All of the key_table entries in keyhandler.c can be initialised at
compile time rather than runtime.
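A minimal sketch of the resulting layout, assuming a 128-entry table indexed directly by the 7-bit ASCII key. The field names, the return values and the demo handler are assumptions for illustration and may differ from keyhandler.c.

```c
#include <stdbool.h>
#include <stddef.h>

#define KEY_MAX 128 /* all keys are 7-bit ASCII */

typedef void keyhandler_fn_t(unsigned char key);

/* Private to this file in the reworked design. */
struct keyhandler {
    keyhandler_fn_t *fn;
    const char *desc;
    bool diagnostic;
};

/* A straight array of entries rather than an array of pointers. */
static struct keyhandler key_table[KEY_MAX];

/* Each piece of information is passed as a parameter. */
static int register_keyhandler(unsigned char key, keyhandler_fn_t *fn,
                               const char *desc, bool diagnostic)
{
    if (key >= KEY_MAX || fn == NULL || key_table[key].fn != NULL)
        return -1;

    key_table[key] = (struct keyhandler){
        .fn = fn, .desc = desc, .diagnostic = diagnostic,
    };
    return 0;
}

/* Dispatch a key press; returns -1 if no handler is registered. */
static int handle_keypress(unsigned char key)
{
    if (key >= KEY_MAX || key_table[key].fn == NULL)
        return -1;

    key_table[key].fn(key);
    return 0;
}

/* Example of a temporary debugging handler. */
static unsigned int demo_hits;
static void demo_handler(unsigned char key) { (void)key; demo_hits++; }
```

With this shape a temporary debugging key needs one call, for example register_keyhandler('d', demo_handler, "demo", false), and no separately declared structure.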
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 25 Sep 2015 07:05:29 +0000 (09:05 +0200)]
x86/PV: properly populate descriptor tables
Extending the GDT limit past the Xen descriptors has so far meant that
guests (including user mode programs) accessing any descriptor table
slot above the original OS's limit but below the first Xen descriptor
caused a #PF, converted to a #GP in our #PF handler. This is quite
different from the native behavior, where some such accesses (LAR
and LSL) don't fault. Mimic that behavior by mapping a blank page into
unused slots.
While not strictly required, treat the LDT the same for consistency.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Mike Belopuhov [Fri, 25 Sep 2015 07:04:24 +0000 (09:04 +0200)]
add missing license and copyright statements to public interface headers
The copyright line indicates a person, a group of people and/or a company
granting rights stated in the license text and is a required part of the
license.
The year of the copyright is chosen to be the same as when the license has
been applied to the file or when the file has been created in case there
was no license. It is possible to update or add additional years if major
changes have been made to the file, but this is generally not a requirement.
Signed-off-by: Mike Belopuhov <mike.belopuhov@esdenera.com>
PDX-es are 64 bits wide in that case, and hence no limit needs to be
enforced.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen/arm: gic-v3: Clean-up the GIC*_PIDR2_* definitions
GICR_PIDR2 and GICD_PIDR2 use the same register layout. Rather than
defining them twice, with one an alias of the other, introduce GIC_PIDR2_*
defines.
Also:
* Use the same prefix for the mask and the value
* Integrate the shift in the value to avoid shifting in the code
* Use GICv* to match the value name in the spec
* Move them to a proper place
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Use existing create/restore path to perform 'soft reset' for HVM
domains. Tear everything down, e.g. destroy domain's device model,
remove the domain from xenstore, save toolstack record and start
over.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
libxl: fix the cleanup of the backend path when using driver domains
With the current libxl implementation the control domain will remove both
the frontend and the backend xenstore paths of a device that's handled by a
driver domain. This is incorrect, since the driver domain possibly needs to
access the backend path in order to perform the disconnection and cleanup of
the device.
Fix this by making sure the control domain only cleans the frontend path,
leaving the backend path to be cleaned by the driver domain. Note that if
the device is not handled by a driver domain the control domain will perform
the removal of both the frontend and the backend paths.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Reported-by: Alex Velazquez <alex.j.velazquez@gmail.com> Cc: Alex Velazquez <alex.j.velazquez@gmail.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
The current flow of the devd helper (in charge of launching hotplug scripts
inside of driver domains) is to wait for the device backend to switch to
state 6 (XenbusStateClosed) and then remove it. This is not correct, since
a domain can reconnect its PV devices as many times as it wants.
In order to fix this, introduce the following logic: the control domain will
set the "online" backend node to 0 when it wants the driver domain to
disconnect the device, so now the condition applied in devd is that "state"
must be 6 and "online" 0 in order to proceed with the disconnection.
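The new condition can be written as a small predicate. This is a sketch only: the function name is hypothetical, and the two arguments stand for the values read from the "state" and "online" backend nodes.

```c
#include <stdbool.h>

#define XENBUS_STATE_CLOSED 6 /* XenbusStateClosed */

/* devd should tear the backend down only when the device is closed AND the
 * control domain has set "online" to 0. A closed device that is still
 * online may simply be about to reconnect. */
static bool backend_should_disconnect(int state, int online)
{
    return state == XENBUS_STATE_CLOSED && online == 0;
}
```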
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Alex Velazquez <alex.j.velazquez@gmail.com> Cc: Alex Velazquez <alex.j.velazquez@gmail.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: vgic: Correctly emulate write when byte is used
When a guest is writing a byte, the value will be located in bits[7:0]
of the register.
However, the current implementation expects the byte at the Nth
byte of the register, where N = address & 3.
When the address is not 4-byte aligned, the corresponding byte in the
internal state will therefore always be set to zero.
Note that byte accesses are only used for GICD_IPRIORITYR and
GICD_ITARGETSR. So the worst that could happen is that the priority is
not set correctly and the target vCPU written is ignored.
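A sketch of the corrected emulation, assuming the internal state is kept as 32-bit words: the guest's byte is taken from bits[7:0] of the written value and stored into the byte lane selected by the low two address bits. The function name is made up for the example.

```c
#include <stdint.h>

/* Store a guest byte write into the 32-bit internal register state.
 * value: the register content written by the guest; only bits[7:0] count.
 * addr:  the guest address; its low two bits select the byte lane. */
static void vreg_write_byte(uint32_t *reg, uint32_t value, uint32_t addr)
{
    unsigned int shift = (addr & 3) * 8;

    *reg &= ~(UINT32_C(0xff) << shift);
    *reg |= (value & UINT32_C(0xff)) << shift;
}
```

Reading the byte from the Nth lane of the written value instead, as the old code did, yields zero for every unaligned address because the guest only ever supplies bits[7:0].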
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Wed, 23 Sep 2015 09:16:51 +0000 (11:16 +0200)]
x86/hvm: fold opt_hap_{2mb,1gb} into hap_capabilities
This allows all runtime users to simply check hap_has_{2mb,1gb} rather than
having to check opt_hap_{2mb,1gb} as well.
As a result, opt_hap_{2mb,1gb} can move into __initdata.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Wed, 23 Sep 2015 09:16:08 +0000 (11:16 +0200)]
x86/hvm: refine hap_has_{2mb,1gb} checks
HAP superpages are a host property and not dependent on domain configuration.
Drop the domain parameter (which was only used in one of the two callsites),
and drop the redundant hvm_ prefix to mirror the cpu_has_* style of feature
detection.
Finally, convert the checks to being proper booleans rather than just non-zero
integers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Wed, 23 Sep 2015 09:14:05 +0000 (11:14 +0200)]
x86/p2m: add PoD accounting to set_typed_p2m_entry()
While neither PoD together with pass-through nor PVH are currently
supported we still shouldn't leave in place such latent issues.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
xen/xsm: Make p->policyvers be a local variable (ver) to shut up GCC 5.1.1 warnings.
policydb.c: In function ‘user_read’:
policydb.c:1443:26: error: ‘buf[2]’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
usrdatum->bounds = le32_to_cpu(buf[2]);
^
cc1: all warnings being treated as errors
Which (as Andrew mentioned) is because GCC cannot assume
that 'p->policyvers' has the same value between checks.
We make it local, shorten the name to 'ver' and the warnings go away.
We also update another call site with this modification to
make it more in line with the rest of the functions.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Andrew Cooper [Tue, 22 Sep 2015 10:42:21 +0000 (12:42 +0200)]
improve x86's alloc_vcpu_guest_context()
This essentially reverts c/s 2037f2adb "x86: introduce
alloc_vcpu_guest_context()", including the newer arm bits, but achieves
the same end goal by using the newer vmalloc() infrastructure.
For both x86 and ARM, {alloc,free}_vcpu_guest_context() become arch-local
static inlines (which avoids a call into a separate translation unit).
This also removes an x86 scalability limit when compiling with a large NR_CPUS.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
tools/libxc: arm: Check the index before accessing the bank
When creating a guest with more than 3GB of memory, both banks will be
used and the loop will overrun. The code will fail later on because
Xen will refuse to populate the region:
This is because we are currently accessing the bank before checking the
validity of the index. AFAICT, on Debian Jessie, the compiler (gcc 4.9.2)
assumes that it's not necessary to verify the index because it has already
been used. This is a valid assumption because the operands of && are
evaluated from left to right.
Re-order the checks to verify the validity of the index before accessing
the bank.
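The re-ordering can be illustrated with a small loop. The bank count and the function name are assumptions for the example; the point is only that the index test sits on the left of &&, so the array is never read with an out-of-range index.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_BANKS 2 /* assumed number of guest RAM banks */

/* Count the leading banks that have a non-zero size. */
static size_t banks_in_use(const uint64_t size[NR_BANKS])
{
    size_t i;

    /* && evaluates left to right and short-circuits: size[i] is only
     * read once i < NR_BANKS has been established. */
    for (i = 0; i < NR_BANKS && size[i] != 0; i++)
        ;

    return i;
}
```

With the operands swapped, size[NR_BANKS] would be read when every bank is in use, and the compiler may then drop the index check altogether.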
The problem has been present since the introduction of the multi-bank
feature in commit 45d9867837f099e9eed4189dac5ed39d1fe2ed49 "tools: arm:
prepare domain builder for multiple banks of guest RAM".
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: vgic-v2: Map the GIC virtual CPU interface with the correct size
On GICv2, the GIC virtual CPU interface is at minimum 8KB. Due to
a necessary quirk for GICs using a 64KB stride, we map the
region in two steps.
The first mapping is 4KB and the second one is 8KB, i.e. 12KB in total,
although the minimum supported (and widely used) size is 8KB. This means
that we are mapping 4KB more to any guest using GICv2.
While this looks scary at first glance, the GIC virtual CPU interface is
most frequently at the end of the GIC I/O region. So we will most likely
map an unused I/O region or a mirrored version of GICV for platforms
using a 64KB stride.
Nonetheless, fix the second mapping to only map 4KB.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
The current libxl code doesn't deal with read-only drives at all.
Upstream QEMU and qemu-xen only support read-only cdrom drives: make
sure to specify "readonly=on" for cdrom drives and return error in case
the user requested a non-cdrom read-only drive.
This is XSA-142, discovered by Lin Liu
(https://bugzilla.redhat.com/show_bug.cgi?id=1257893).
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
tools/xen-mceinj: Pass in GPA when injecting through MSR_MCI_ADDR
This patch removes the address translation in xen-mceinj which
translates the guest physical address passed-in through the argument of
'-p' to the host machine address. Instead, xen-mceinj now passes a flag
MC_MSRINJ_F_GPADDR to ask do_mca() in the hypervisor to do this
translation.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
x86/mce: translate passed-in GPA to host machine address
This patch adds a new flag MC_MSRINJ_F_GPADDR to
xen_mc_msrinject.mcinj_flags, and makes do_mca() translate the
guest physical address passed-in through
xen_mc_msrinject.mcinj_msr[i].value to the host machine address if
this flag is present.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Acked-by: Christoph Egger <chegger@amazon.de>
Andrew Cooper [Wed, 16 Sep 2015 09:22:00 +0000 (11:22 +0200)]
x86/sysctl: don't clobber memory if NCAPINTS > ARRAY_SIZE(pi->hw_cap)
There is no current problem, as both NCAPINTS and pi->hw_cap are 8 entries,
but the limit should be calculated appropriately so as to avoid hypervisor
stack corruption if the two do get out of sync.
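A sketch of the bounded copy, with the structure and function names invented for the example. NCAPINTS is deliberately set larger than the destination array here to show the case the patch guards against.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define NCAPINTS 10 /* deliberately larger than hw_cap[] for the demo */

/* Hypothetical stand-in for the sysctl physinfo structure. */
struct physinfo {
    uint32_t hw_cap[8];
};

/* Copy at most as many words as the destination can hold. */
static void fill_hw_cap(struct physinfo *pi, const uint32_t caps[NCAPINTS])
{
    size_t n = (size_t)NCAPINTS < ARRAY_SIZE(pi->hw_cap)
               ? (size_t)NCAPINTS : ARRAY_SIZE(pi->hw_cap);

    memset(pi->hw_cap, 0, sizeof(pi->hw_cap));
    memcpy(pi->hw_cap, caps, n * sizeof(pi->hw_cap[0]));
}
```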
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Since commit 3848058e7dd6 (vtd/iommu: permit group devices to
passthrough in relaxed mode) was introduced, we always print this
message as XENLOG_G_WARNING, but that is not correct in
strict mode. Make the message depend on the
specific mode.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
David Vrabel [Mon, 3 Aug 2015 11:29:19 +0000 (12:29 +0100)]
arm: reduce power use by contended spin locks with WFE/SEV
Instead of cpu_relax() while spinning and observing the ticket head,
introduce arch_lock_relax() which executes a WFE instruction. After
the ticket head is changed call arch_lock_signal() to execute an SEV
instruction (with the required DSB first) to wake any spinners.
This should improve power consumption when locks are contended and
spinning.
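The idea can be sketched with a generic ticket lock built on C11 atomics. This is not the Xen spinlock code: the lock structure and function names are assumptions, the exact DSB variant is a guess, and on non-ARM builds the two hooks are empty so the waiter simply spins.

```c
#include <stdatomic.h>

#if defined(__arm__) || defined(__aarch64__)
/* Wait for an event instead of spinning hot. */
# define arch_lock_relax()  __asm__ volatile("wfe" ::: "memory")
/* Make the head update visible (DSB), then wake waiters (SEV). */
# define arch_lock_signal() __asm__ volatile("dsb ishst\n\tsev" ::: "memory")
#else
# define arch_lock_relax()  ((void)0)
# define arch_lock_signal() ((void)0)
#endif

struct ticket_lock {
    atomic_uint head; /* ticket currently being served */
    atomic_uint tail; /* next ticket to hand out */
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned int me = atomic_fetch_add(&l->tail, 1);

    /* Observe the ticket head, sleeping in WFE between checks on ARM. */
    while (atomic_load(&l->head) != me)
        arch_lock_relax();
}

static void ticket_lock_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->head, 1);
    arch_lock_signal(); /* wake any CPU waiting in arch_lock_relax() */
}
```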
For consistency also move arch_lock_(acquire|release)_barrier to
asm/spinlock.h.
Booted the result on arm32 (Midway) and arm64 (Mustang). Build test
only on amd64.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[ijc: add barrier, rename as arch_lock_*, move arch_lock_*_barrier, test] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
While it appears to be intentional for "xl pci-assignable-remove" to
not re-bind the original driver by default (requires the -r option),
permanently losing the information which driver was originally used
seems bad. Make "add; remove; add; remove -r" re-bind the original
driver by allowing "remove" to delete the information only upon
successful re-bind.
In the course of this I also noticed that binding information is lost
when upon first "add" pciback isn't loaded yet, due to its presence not
being checked for early enough. Adjust pciback_dev_is_assigned()
accordingly, and properly distinguish "yes" and "error" returns in the
"add" case (removing a redundant error message from the "remove" path
for consistency).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Mon, 14 Sep 2015 11:40:04 +0000 (13:40 +0200)]
x86/PoD: use clear_domain_page()
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Mon, 14 Sep 2015 11:39:19 +0000 (13:39 +0200)]
x86/p2m: fix mismatched unlock
Luckily, due to gfn_unlock() currently mapping to p2m_unlock(), this is
only a cosmetic issue right now.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
vtd/iommu: permit group devices to passthrough in relaxed mode
Currently we don't allow passing through any group devices which
share the same RMRR entry, since doing so would break security among VMs. We
do expect to find a better way to handle this kind
of case completely.
But until group assignment is implemented, make this
permission dependent on our RMRR policy: it is now allowed
in relaxed mode.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Wed, 9 Sep 2015 16:11:24 +0000 (17:11 +0100)]
xl/libxl: disallow saving a guest with vNUMA configured
This is because the migration stream does not preserve node information.
Note this is not a regression for migration v2 vs legacy migration
because neither of them preserves node information.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- some grammar fixes to the doc and fixed a comment ]
Ian Campbell [Fri, 11 Sep 2015 14:19:54 +0000 (15:19 +0100)]
libxl: format fd flags with 0x since they are hex.
Commit 93f5194e7270 "libxl: clear O_NONBLOCK|O_NDELAY on migration fd
and reinstate afterwards" added some logging of fcntl.F_GETFL values
as %x without a 0x prefix to make it clear the numbers are hex. Fix
this along with an inadvertent logging of the fd itself as hex.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
The ACPI PM timer is sometimes broken on live migration, because
vcpu->arch.hvm_vcpu.guest_time is always zero in modes other than
"delay for missed ticks". Even in "delay for missed ticks" mode, the
vcpu's guest_time field is not valid (i.e. zero) when
the vcpu is "blocked" (see the pt_save_timer function).
The original author (Tim Deegan) of pmtimer_save() must have intended
it to save the last scheduled time of the vcpu. Unfortunately, that
already implied this bug. FYI, there was no timer mode other than
"delay for missed ticks" at the time.
For consistency with HPET, pmtimer_save() should use hvm_get_guest_time()
to update the counter, as hpet_save() does.
Without this patch, the clock of Windows Server 2012 R2 without HPET
might leap forward several minutes on live migration.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Retain use of ->arch.hvm_vcpu.guest_time when non-zero. Do the inverse
adjustment for vHPET.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Kouya Shimura <kouya@jp.fujitsu.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ian Campbell [Fri, 11 Sep 2015 10:42:51 +0000 (11:42 +0100)]
libxl: clear O_NONBLOCK|O_NDELAY on migration fd and reinstate afterwards
The fd passed to us by libvirt for both save and restore has at least
O_NONBLOCK set, which libxl does not expect and therefore fails to
handle any EAGAIN which might arise.
This has been observed with migration v2, but if v1 used to work I
think that would just be by luck and/or coincidence.
Unix convention (and the principle of least surprise) is usually to
ensure that an fd has no "strange" properties, such as being
non-blocking, when handing it to another component.
However, for the convenience of the application, arrange instead for
libxl to clear any unexpected flags on the file descriptors it is
given for save or restore and restore them to their original state at
the end. O_NDELAY could be similarly problematic so clear that as
well as O_NONBLOCK.
To do this, introduce a pair of new helper functions, one to modify+save
the flags and another to restore them, and call them in the appropriate
places.
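A minimal sketch of such a helper pair using plain fcntl(). The function names and the return convention are assumptions for the example and are not the libxl helpers themselves.

```c
#include <fcntl.h>

/* Save the current file status flags in *saved and clear the bits in mask.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
static int fd_flags_clear_save(int fd, int mask, int *saved)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl < 0)
        return -1;

    *saved = fl;
    if ((fl & mask) == 0)
        return 0; /* nothing to clear */

    return fcntl(fd, F_SETFL, fl & ~mask) < 0 ? -1 : 0;
}

/* Put the flags back to what the caller originally handed us. */
static int fd_flags_restore(int fd, int saved)
{
    return fcntl(fd, F_SETFL, saved) < 0 ? -1 : 0;
}
```

A caller would wrap the save or restore operation between fd_flags_clear_save(fd, O_NONBLOCK | O_NDELAY, &saved) and fd_flags_restore(fd, saved), so the application gets its fd back in the state it supplied.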
The migration v1 code appeared to do some things with O_NONBLOCK in
the checkpoint case. Migration v2 doesn't seem to do so, and in any
case I wouldn't expect it to be relying on libvirt's setting of
O_NONBLOCK when xl doesn't use that flag.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca> Cc: Yang Hongyang <yanghy@cn.fujitsu.com>
Chris Brand [Fri, 21 Aug 2015 21:30:37 +0000 (14:30 -0700)]
xen: arm: Support <32MB frametables
setup_frametable_mappings() rounds frametable_size up to a multiple
of 32MB. This is wasteful on systems with less than 4GB of RAM,
although it does allow the "contig" bit to be set in the PTEs.
Where the frametable is less than 32MB in size, instead round up
to a multiple of 2MB, not setting the "contig" bit in the PTEs.
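The size selection can be sketched as below. The helper name and the bool out-parameter are illustrative assumptions; only the 32MB and 2MB granularities and the rule for the "contig" bit come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB(n)         ((uint64_t)(n) << 20)
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1)) /* a: power of two */

/* Return the rounded frametable size and report whether the "contig"
 * bit may be set in the PTEs that map it. */
static uint64_t frametable_mapping_size(uint64_t frametable_size, bool *contig)
{
    uint64_t granule;

    *contig = frametable_size >= MB(32);
    granule = *contig ? MB(32) : MB(2);

    return ROUNDUP(frametable_size, granule);
}
```

For example, a 3MB frametable is rounded to 4MB without "contig", where the previous behaviour would have used 32MB.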
Signed-off-by: Chris Brand <chris.brand@broadcom.com> Reviewed-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Chris Brand [Thu, 10 Sep 2015 18:56:28 +0000 (11:56 -0700)]
xen: arm re-order assignments in mfn_to_xen_entry()
Shuffle lines around so that the assignments in mfn_to_xen_entry()
occur in the same order as the bits are declared in lpae_pt_t.
This makes it easier to see which ones are never given a value.
No change in behaviour.
Also fix a minor comment typo.
Signed-off-by: Chris Brand <chris.brand@broadcom.com> Reviewed-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Introduce xc_domain_soft_reset() function supporting XEN_DOMCTL_soft_reset.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
The x86-specific hook cleans up the pirq-emuirq mappings, destroys all ioreq
servers and replaces the shared_info frame with an empty page to support
a subsequent XENMAPSPACE_shared_info call.
The ARM-specific hook is -ENOSYS for now.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
The new domctl resets state for a domain, allowing it to 'start over': register
vcpu_info, switch to FIFO ABI for event channels. Grants that are still active are
logged to help debug misbehaving backends.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Log the first 10 active grants for a domain. This function is going to be used
for soft reset; active grants on this path usually mean misbehaving backends
refusing to release their mappings on shutdown. We need this in addition to
the existing 'g' keyhandler, as such misbehaving backends can cause a
domain to crash right after the soft reset operation, and the 'g' option won't be
available in that case.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 10 Sep 2015 11:18:03 +0000 (12:18 +0100)]
configure: don't silently disable systemd support
Previously, when the user ran ./configure --enable-systemd and the systemd
development library was not available, the build system silently disabled
systemd support. This is not in line with normal expectations.
Instead, configure should error out when the user has asked for systemd
support but the development libraries can't be found.
Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
domctl: lower loglevel of XEN_DOMCTL_memory_mapping
We should lower the loglevel to XENLOG_G_DEBUG while mapping or
unmapping memory via XEN_DOMCTL_memory_mapping, since it's
enough to check this info just while debugging.
Add the appropriate #if checks around the kexec code in the x86 codebase
so that the feature can actually be turned off by the flag instead of
always required to be enabled on x86.
Signed-off-by: Jonathan Creekmore <jonathan.creekmore@gmail.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
x86: clean up vm_event-related code in asm-x86/domain.h
As suggested by Jan Beulich, moved struct monitor_write_data from
struct arch_domain to struct arch_vcpu, as well as moving all
vm_event-related data from asm-x86/domain.h to struct vm_event,
and allocating it dynamically only when needed.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
ACPI / table: Replace '1' with specific error return values
After commit 7f8f97c3cc (ACPI: acpi_table_parse() now returns
success/fail, not count), acpi_table_parse() returns '1' when it is
unable to find the table, but it should return a negative error code
in that case. Make it return -ENODEV instead.
Fix the same problem in acpi_table_init() analogously.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
[rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit 95df812dbdc350bfcf31e247e9100c378a472480] Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tomasz Nowicki [Wed, 9 Sep 2015 14:25:42 +0000 (16:25 +0200)]
ACPI/table: Always count matched and successfully parsed entries
acpi_parse_entries() allows traversing all available table entries (aka
subtables) by passing max_entries parameter equal to 0, but since its count
variable is only incremented if max_entries is not 0, the function always
returns 0 for max_entries equal to 0. It would be more useful if it returned
the number of entries matched instead, so make it increment count in that
case too.
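The counting rule can be sketched with a simplified walker. The structure, the handler type and the function names are assumptions for the example and do not reproduce the ACPI code; the relevant part is that count is incremented for every handled match, whether or not max_entries is 0.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for a subtable header. */
struct subtable {
    int type;
};

typedef int (*subtable_handler)(const struct subtable *entry);

/* Walk all entries and call handler on those of the wanted type.
 * max_entries == 0 means "no limit". Returns the number of entries
 * matched and successfully handled, or -EINVAL if a handler fails. */
static int parse_entries(const struct subtable *tbl, size_t nr, int type,
                         subtable_handler handler, unsigned int max_entries)
{
    unsigned int count = 0;

    for (size_t i = 0; i < nr; i++) {
        if (tbl[i].type != type)
            continue;
        if (max_entries && count >= max_entries)
            continue; /* limit reached: skip further matches */
        if (handler(&tbl[i]))
            return -EINVAL;
        count++; /* counted even when max_entries is 0 */
    }

    return (int)count;
}

/* Trivial handler used for demonstration. */
static int accept_all(const struct subtable *entry)
{
    (void)entry;
    return 0;
}
```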
Objects loaded by FileHandle->Read need to be flushed from dcache,
otherwise copy_from_paddr will read stale data when copying the kernel,
causing a failure to boot.
Introduce efi_arch_flush_dcache_area and call it from read_file.
This commit introduces no functional changes on x86.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Sun, 6 Sep 2015 20:05:38 +0000 (21:05 +0100)]
libxc: don't populate same pfn more than once in populate_pfns
The original implementation of populate_pfns didn't consider that the same
pfn can be present multiple times in the array. The mechanism to prevent
populating the same pfn multiple times only worked if the recurring pfn
appeared in different batches.
This bug was discovered by a Linux 4.1 32-bit kernel save / restore test,
which has several ptes pointing to the same pfn, resulting in an array
containing a recurring pfn. When libxc called x86_pv_localise_page, the
original implementation would populate the same pfn more than once.
The fix is to set the bit in the populated bitmap as we generate the list of pfns to
be populated.
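A sketch of the fix, assuming a fixed-size bitmap and hypothetical helper names: the bit is set while the list is being built, so a pfn that recurs within the same batch is emitted only once.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PFN       1024 /* arbitrary bound for the sketch */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static unsigned long populated[(MAX_PFN + BITS_PER_LONG - 1) / BITS_PER_LONG];

/* Set a bit and report whether it was already set. */
static bool test_and_set_bit(unsigned long *map, size_t bit)
{
    unsigned long mask = 1UL << (bit % BITS_PER_LONG);
    bool was_set = (map[bit / BITS_PER_LONG] & mask) != 0;

    map[bit / BITS_PER_LONG] |= mask;
    return was_set;
}

/* Build the list of pfns that still need populating. Marking each pfn as
 * it is added catches duplicates inside one batch as well as across
 * batches. Returns the number of pfns written to out. */
static size_t collect_unpopulated(const uint64_t *pfns, size_t count,
                                  uint64_t *out)
{
    size_t nr = 0;

    for (size_t i = 0; i < count; i++) {
        if (pfns[i] >= MAX_PFN)
            continue; /* out of range for this sketch */
        if (!test_and_set_bit(populated, (size_t)pfns[i]))
            out[nr++] = pfns[i];
    }

    return nr;
}
```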
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>