Ian Campbell [Mon, 7 Apr 2014 11:07:04 +0000 (12:07 +0100)]
xen: make sure that likely and unlikely convert the expression to a boolean
According to http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
__builtin_expect has the prototype:
long __builtin_expect (long exp, long c)
If sizeof(exp) > sizeof(long) then this will effectively mask off the top bits
of exp, meaning that the "if" in "if (unlikely(x))" will see the masked version,
which might be false when true was expected. likely() has the same issue.
This is most likely to affect x86_32 and arm32 builds. x86_32 is not
present from 4.3 onwards, and a quick grep of current staging shows that all the
existing arm32 uses of both likely and unlikely already pass a boolean. I
noticed this with an as yet unposted patch which did not have this property.
Also, the definition of likely() might not have had the expected effect in
cases where a true value > 1 might be passed.
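A minimal sketch of the standard fix (generic macro shapes; Xen's actual
definitions may differ in detail): the !! idiom collapses any scalar
expression to 0 or 1 before the implicit conversion to long.
    /* Before: an expression wider than 'long' is truncated by the cast. */
    #define unlikely(x) __builtin_expect((x), 0)

    /* After: !!(x) normalises the expression to 0 or 1 first. */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)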
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: Keir Fraser <keir@xen.org> Cc: Tim Deegan <tim@xen.org>
Ian Campbell [Tue, 8 Apr 2014 15:37:58 +0000 (16:37 +0100)]
build: remove Linux kernel build integration.
We haven't shipped a XenoLinux kernel for more releases than I can remember.
We held onto these because osstest was using them but this is no longer the
case.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
These are a xend-ism. Since Xen 4.1 the recommended way to configure networking
has been to use the distro facilities (e.g.
http://wiki.xen.org/wiki/HostConfiguration/Networking)
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 9 Apr 2014 08:26:23 +0000 (09:26 +0100)]
docs: remove stray CONFIG_XENDs and configure option from docs.
These were added by 7dbfc2f8b054 "docs: Honour --{en, dis}able-xend when
building docs" between v1 and the (eventually committed) v2 of 9e8672f1c36d
"tools: remove xend and associated python modules" and were missed when
rebasing for v2.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 9 Apr 2014 14:13:25 +0000 (16:13 +0200)]
x86/AMD: feature masking is unavailable on Fam11
Reported-by: Aravind Gopalakrishnan<aravind.gopalakrishnan@amd.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
The root cause is a wrong
'write_unlock(&pcd_tree_rwlocks[firstbyte])' in function
tmem_try_to_evict_pgp().
Nobody will lock &pcd_tree_rwlocks if dedup=0, but the write_unlock() will be
executed anyway. This was introduced by commit
38c433d0c711406778aba1ae183a195da98656f0 ("tmem: add page deduplication with
optional compression or trailing-zero-elimination").
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Tue, 28 Jan 2014 04:28:32 +0000 (12:28 +0800)]
tmem: reorg the shared pool allocate path
Reorg the code to make it more readable.
Check the return value of shared_pool_join() and drop an unneeded call to
it. Disable creating a shared & persistent pool at an earlier point in the code.
Note that one might be tempted to delay the creation of the pool even
further in the code. That, however, would break the behavior of the code:
if we ended up creating a shared pool and the
'uuid_lo == -1L && uuid_hi == -1L' logic stands, we still need to
create a pool - just not of the shared type.
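A sketch of the resulting control flow; the flag names are real tmem flags,
but the exact placement inside the pool-creation path is assumed:
    /* Assumed shape: reject shared+persistent up front, and downgrade the
     * magic "private" UUID to a non-shared pool instead of failing. */
    if ( flags & TMEM_POOL_SHARED )
    {
        if ( flags & TMEM_POOL_PERSIST )
            return -1;                     /* shared & persistent: disallowed early */
        if ( uuid_lo == -1L && uuid_hi == -1L )
            flags &= ~TMEM_POOL_SHARED;    /* still create a pool, just not shared */
    }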
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Tue, 28 Jan 2014 04:28:31 +0000 (12:28 +0800)]
tmem: cleanup: refactor function tmemc_shared_pool_auth()
Make function tmemc_shared_pool_auth() more readable.
Note that the previous check for free being set the first time
'(free == -1)' in the loop is now removed. That is OK because
when we set free the first time ('free = i;') we follow it
immediately with a break to get out of the loop.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Tue, 28 Jan 2014 04:28:28 +0000 (12:28 +0800)]
tmem: remove unneeded parameters from obj destroy path
Parameters "selective" and "no_rebalance" are meaningless in obj
destroy path, this patch remove them. No place uses
no_rebalance=1. In the obj_destroy path we always call it with
no_balance=0.
Note that this will now free it only if:
obj->last_client == cli_id
Which is OK: even when we allocate a non-shared pool we set
obj->last_client to TMEM_CLI_ID_NULL by default, so even if
the pool is never used, pool_flush will take care of removing
those objects.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Tue, 28 Jan 2014 04:28:30 +0000 (12:28 +0800)]
tmem: fix the return value of tmemc_set_var()
tmemc_set_var() calls tmemc_set_var_one() without taking its return value;
this patch fixes that.
Also rename tmemc_set_var_one() to __tmemc_set_var().
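A minimal sketch of the change (argument list assumed):
    /* Before: the helper's return value was silently dropped. */
    tmemc_set_var_one(client, subop, arg1);
    ret = 0;

    /* After: renamed, and its result propagated to the caller. */
    ret = __tmemc_set_var(client, subop, arg1);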
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Wed, 12 Feb 2014 14:43:24 +0000 (22:43 +0800)]
tmem: cleanup the pgp free path
There are several functions related to pgp free, but their relationships are
not clear enough for understanding. This patch cleans things up by removing
pgp_delist() and pgp_free_from_inv_list().
The call trace is simple now:
    pgp_delist_free()
      -> pgp_free()
           -> __pgp_free()
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bob Liu [Tue, 28 Jan 2014 04:28:24 +0000 (12:28 +0800)]
tmem: cleanup: remove unneeded parameter from pgp_delist()
The parameter "eph_lock" is only needed for function tmem_evict(). Embeded the
delist code into tmem_evict() directly so as to drop the eph_lock parameter. By
this change, the eph list lock can also be released a bit earier.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: A fix for an assertion of 'client->eph_count >= 0' was rolled in]
Bob Liu [Tue, 28 Jan 2014 04:28:23 +0000 (12:28 +0800)]
tmem: bugfix in obj allocate path
There is a potential bug in the obj allocate path. When parallel
callers allocate an obj and insert it into pool->obj_rb_root using the same
oid, an unexpected obj might be returned: one caller continues writing data
to objA, while all future obj_find() calls return objB.
The root cause is that the allocate path didn't check the return value of
obj_rb_insert(). This patch fixes it and replaces obj_new() with the
better-named obj_alloc().
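A sketch (fragment) of the fixed path, with helper names from the commit
text and the locking and hash details assumed:
    obj = obj_alloc(pool, oidp);

    write_lock(&pool->pool_rwlock);
    if ( !obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj) )
    {
        /* A parallel caller inserted an obj for this oid first:
         * discard ours and use the winner's, instead of leaving two
         * objs with the same oid in the tree. */
        obj_free(obj);
        obj = obj_find(pool, oidp);
    }
    write_unlock(&pool->pool_rwlock);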
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This is done so PVH guests can use PHYSDEVOP_pirq_eoi_gmfn_v{1/2}.
Update users of these fields to reflect that this has been moved and
is now also available to other kinds of guests.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Move auto_unmask ahead of the other two fields, to reduce padding.
Don Slutz [Wed, 9 Apr 2014 10:16:00 +0000 (12:16 +0200)]
xentrace: add TRC_HVM_EMUL
This adds a set of trace events that track the setup of various
emulated devices related to timers in domU.
This set is hpet, pit (i8253, i8254), rtc (MC146818), apic (lapic),
and pic (i8259). The pmtimer is not traced since it does not have a
changeable rate.
Signed-off-by: Don Slutz <dslutz@verizon.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
xen/arm32: __cmpxchg_mb should be marked always_inline
Currently __cmpxchg_mb is only marked inline, so the compiler can decide not
to inline it. In that case, the call to __cmpxchg will be inlined
but not optimised, resulting in a linking failure because of __bad_cmpxchg.
Caught by clang 3.5.
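The fix has this general shape (signature and barrier placement assumed;
Xen typically spells the attribute via an always_inline macro):
    static inline __attribute__((__always_inline__)) unsigned long
    __cmpxchg_mb(volatile void *ptr, unsigned long old, unsigned long new,
                 int size)
    {
        unsigned long ret;

        smp_mb();
        ret = __cmpxchg(ptr, old, new, size);
        smp_mb();

        return ret;
    }
Forcing the inlining lets the compiler resolve the size switch inside
__cmpxchg at compile time, so the __bad_cmpxchg reference is optimised away.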
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 4 Apr 2014 13:28:45 +0000 (14:28 +0100)]
tools: implement initial ramdisk support for ARM.
The ramdisk is passed to the kernel as a property in the chosen node of the
device tree. This is somewhat tricky since in order to place the ramdisk and
dtb in ram we first need to know the size of the dtb. So we initially create a
DTB with placeholders for the ramdisk and finalise the value (which doesn't
change the size) once we know where everything is.
Rename libxl__arch_domain_configure to libxl__arch_domain_init_hw_description to
better reflect its use and to be consistent with the new
libxl__arch_domain_finalise_hw_description.
The common xc_dom_build_image() function did not support explicit placement of
the ramdisk, instead passing 0 to xc_dom_alloc_segment, meaning "pick
somewhere". This change instead passes ramdisk_seg.vstart. If nothing has set
vstart then it will be zero because the entire dom struct is zeroed on
allocation in xc_dom_allocate(). Therefore there is no change to the behaviour
on x86. This is also consistent with how other segments (kernel, dtb) are
handled.
Furthermore if the ramdisk has been explicitly placed then xc_dom_build_image()
assumes that it is not to be decompressed (since that would muck up the sizings
used on placement).
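Roughly, the call site becomes (sketch; the surrounding xc_dom_build_image()
context and the size variable are assumed):
    /* 0 means "place anywhere"; a non-zero vstart set beforehand (e.g. by
     * the ARM hw-description code) requests explicit placement. vstart
     * defaults to 0 because xc_dom_allocate() zeroes the whole struct. */
    rc = xc_dom_alloc_segment(dom, &dom->ramdisk_seg, "ramdisk",
                              dom->ramdisk_seg.vstart, ramdisk_size);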
With all that I'm able to boot a domain using the current Debian Jessie armhf
installer initrd and have it complete successfully.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
[ ijc -- s/itherwise/otherwise and dropped bogus emacs magic change ]
Bob Liu [Wed, 12 Feb 2014 14:43:19 +0000 (22:43 +0800)]
tmem: refactor function do_tmem_op()
Refactor function do_tmem_op() to make it more readable.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Fixed up tab vs spaces, also removed dead code and added gulped code]
Andrew Cooper [Tue, 8 Apr 2014 10:39:23 +0000 (12:39 +0200)]
atomic: use static inlines instead of macros
This is some coverity-inspired tidying.
Coverity has some grief analysing the call sites of atomic_read(). This is
believed to be a bug in Coverity itself when expanding the nested macros, but
there is no legitimate reason for it to be a macro in the first place.
This patch changes {,_}atomic_{read,set}() from being macros to being static
inline functions, thus gaining some type safety.
One issue which is not immediately obvious is that the non-atomic variants take
their atomic_t at a different level of indirection to the atomic variants.
This is not suitable for _atomic_set() (when used to initialise an atomic_t)
which is converted to take its parameter as a pointer. One callsite of
_atomic_set() is updated, while the other two callsites are updated to
ATOMIC_INIT().
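A sketch of the shape of the change (the counter field and the volatile
access details are simplified relative to Xen's real headers):
    typedef struct { int counter; } atomic_t;
    #define ATOMIC_INIT(i) { (i) }

    /* Formerly macros; as static inlines the compiler now checks that
     * callers really pass an atomic_t pointer. */
    static inline int atomic_read(const atomic_t *v)
    {
        return v->counter;
    }

    static inline void _atomic_set(atomic_t *v, int i)
    {
        v->counter = i;   /* the non-atomic variant now also takes a pointer */
    }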
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan<tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
[For the arm bits:] Acked-by: Ian Campbell <ian.campbell@citrix.com>
do_tmem_destroy_pool is checking whether pools == NULL, but pools is a fixed
array.
Clang 3.5 will fail to compile xen/common/tmem.c with the following error:
tmem.c:1848:18: error: comparison of array 'client->pools' equal to a null
pointer is always false [-Werror,-Wtautological-pointer-compare]
if ( client->pools == NULL )
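The warning is sound: an array member decays to the address of its first
element, which can never be NULL. A minimal reproduction (array size
illustrative):
    struct tmem_pool;
    struct client { struct tmem_pool *pools[16]; };

    int has_no_pools(struct client *client)
    {
        return client->pools == NULL;   /* always false */
    }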
print_special() uses the width argument both to select the output format
and to size the array. So by passing 4 it expects an array of uint32_t.
But an array of uint64_t is passed.
So copy and mask the registers to 32 bits.
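i.e., something like (print_special's exact signature assumed):
    uint64_t regs64[4];   /* e.g. registers read from the vcpu context */
    uint32_t regs32[4];
    unsigned int i;

    /* width 4 => print_special() reads uint32_t elements */
    for ( i = 0; i < 4; i++ )
        regs32[i] = (uint32_t)regs64[i];   /* copy and mask to 32 bits */
    /* ...then hand regs32, not regs64, to print_special() */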
Signed-off-by: Don Slutz <dslutz@verizon.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Don Slutz [Thu, 3 Apr 2014 19:07:04 +0000 (15:07 -0400)]
xenctx: change is_kernel_text() into kernel_addr().
A new enum has been added to allow the caller to determine if this
kernel address is a text or data address. This is currenlty not
used, but will be in the next patch.
Add both _end and __bss_stop as kernel_end.
Signed-off-by: Don Slutz <dslutz@verizon.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Don Slutz [Thu, 3 Apr 2014 19:06:58 +0000 (15:06 -0400)]
xenctx: Change print_symbol to do the space before.
This stops the output of an extra space at the end of the line.
Signed-off-by: Don Slutz <Don@CloudSwitch.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Don Slutz <Don@CloudSwitch.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Don Slutz [Thu, 3 Apr 2014 19:06:55 +0000 (15:06 -0400)]
xenctx: Add command line options -b (--bytes-per-line) and -l (--lines)
-b <bytes>, --bytes-per-line <bytes>
    Change the number of bytes per line output for the stack
    (default 32). Note: rounded to the native size (4 or 8 bytes).
This option allows you to change the width of the output line. When
used with the -D and/or -t options, the output can be adjusted with
this to fewer than 80 columns.
-l <lines>, --lines <lines>
    Change the number of lines output for the stack (default 5).
    Can be specified as MAX. Note: fewer lines will be output
    if the stack limit is reached.
The default value shows a reasonable amount of the raw stack. The -S
option will output all of it one line at a time. -l can be used
to select something in the middle.
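For example, a hypothetical invocation such as 'xenctx -D -b 16 -l 8 <domid>'
would dump eight stack lines of 16 bytes each, fitting comfortably within 80
columns.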
Signed-off-by: Don Slutz <dslutz@verizon.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Don Slutz <Don@CloudSwitch.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Don Slutz [Thu, 3 Apr 2014 19:06:52 +0000 (15:06 -0400)]
xenctx: clean up usage output
Fix usage formatting to be all the same.
Fix usage display of default --kernel-start for 64 bit.
Signed-off-by: Don Slutz <dslutz@verizon.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 12 Sep 2013 09:21:25 +0000 (10:21 +0100)]
tools: remove xend and associated python modules
I've retained xen.lowlevel.{xc,xs} since they seem more widely useful. I also
kept xen.lowlevel.xl even though it is disabled by default and IMHO useless in
its current form.
I've tried to clean up the various associated bits like example configs, init
scripts, udev rules etc but no doubt I have missed something, those can easily
be cleaned up later.
I've also removed xm-test since although it could in theory be reworked to
test xl it hasn't been touched for years. If someone wants to resurrect it
then they could do so via the git history.
This has been built but not runtime tested.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Clean out some .*ignore cruft
Remove some xm/xend docs.
Ian Campbell [Wed, 26 Mar 2014 13:38:52 +0000 (13:38 +0000)]
xen: arm: document what low level primitives we have imported from Linux
As part of the recent update I had to reverse engineer what we had, which was
very tedious. Check in my notes so that I have a reference for next time.
Now the secret is to remember to update this file every time!
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:51 +0000 (13:38 +0000)]
xen: arm: refactor xchg and cmpxchg into their own headers
Since these functions are taken from Linux this makes it easier to compare
against the Linux cmpxchg.h headers (which were split out from Linux's
system.h a while back).
Since these functions are from Linux the intention is to use Linux coding
style, therefore include a suitable emacs magic block.
For this reason also fix up the indentation in the 32-bit version to use hard
tabs while moving it. The 64-bit version was already correct.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:48 +0000 (13:38 +0000)]
xen: arm64: asm: remove redundant "cc" clobbers
This resyncs atomics and cmpxchgs with Linux v3.14-rc7 by importing:
commit 95c4189689f92fba7ecf9097173404d4928c6e9b
Author: Will Deacon <will.deacon@arm.com>
Date: Tue Feb 4 12:29:13 2014 +0000
arm64: asm: remove redundant "cc" clobbers
cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers
from inline asm blocks that only use these instructions to implement
conditional branches.
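For instance, the imported arm64 atomic_add() now ends without a clobber
list (shown essentially as in Linux v3.14):
    static inline void atomic_add(int i, atomic_t *v)
    {
        unsigned long tmp;
        int result;

        asm volatile("// atomic_add\n"
        "1:     ldxr    %w0, %2\n"
        "       add     %w0, %w0, %w3\n"
        "       stxr    %w1, %w0, %2\n"
        "       cbnz    %w1, 1b"
        : "=&r" (result), "=&r" (tmp), "+Q" (v->counter)
        : "Ir" (i));   /* no "cc" clobber: cbnz reads no flags and sets none */
    }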
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:46 +0000 (13:38 +0000)]
xen: arm64: atomics: fix use of acquire + release for full barrier semantics
Xen, like Linux, expects full barrier semantics for bitops, atomics and
cmpxchgs. This issue was discovered on Linux and we get our implementation of
these from Linux so quoting Will Deacon in Linux commit 8e86f0b409a4 for the
gory details:
Linux requires a number of atomic operations to provide full barrier
semantics, that is no memory accesses after the operation can be
observed before any accesses up to and including the operation in
program order.
On arm64, these operations have been incorrectly implemented as follows:
// A, B, C are independent memory locations
<Access [A]>
// atomic_op (B)
1: ldaxr x0, [B] // Exclusive load with acquire
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
<Access [C]>
The assumption here being that two half barriers are equivalent to a
full barrier, so the only permitted ordering would be A -> B -> C
(where B is the atomic operation involving both a load and a store).
Unfortunately, this is not the case by the letter of the architecture
and, in fact, the accesses to A and C are permitted to pass their
nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
store-release on B). This is a clear violation of the full barrier
requirement.
The simple way to fix this is to implement the same algorithm as ARMv7
using explicit barriers:
<Access [A]>
// atomic_op (B)
dmb ish // Full barrier
1: ldxr x0, [B] // Exclusive load
<op(B)>
stxr w1, x0, [B] // Exclusive store
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
but this has the undesirable effect of introducing *two* full barrier
instructions. A better approach is actually the following, non-intuitive
sequence:
<Access [A]>
// atomic_op (B)
1: ldxr x0, [B] // Exclusive load
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
The simple observations here are:
- The dmb ensures that no subsequent accesses (e.g. the access to C)
can enter or pass the atomic sequence.
- The dmb also ensures that no prior accesses (e.g. the access to A)
can pass the atomic sequence.
- Therefore, no prior access can pass a subsequent access, or
vice-versa (i.e. A is strictly ordered before C).
- The stlxr ensures that no prior access can pass the store component
of the atomic operation.
The only tricky part remaining is the ordering between the ldxr and the
access to A, since the absence of the first dmb means that we're now
permitting re-ordering between the ldxr and any prior accesses.
From an (arbitrary) observer's point of view, there are two scenarios:
1. We have observed the ldxr. This means that if we perform a store to
[B], the ldxr will still return older data. If we can observe the
ldxr, then we can potentially observe the permitted re-ordering
with the access to A, which is clearly an issue when compared to
the dmb variant of the code. Thankfully, the exclusive monitor will
save us here since it will be cleared as a result of the store and
the ldxr will retry. Notice that any use of a later memory
observation to imply observation of the ldxr will also imply
observation of the access to A, since the stlxr/dmb ensure strict
ordering.
2. We have not observed the ldxr. This means we can perform a store
and influence the later ldxr. However, that doesn't actually tell
us anything about the access to [A], so we've not lost anything
here either when compared to the dmb variant.
This patch implements this solution for our barriered atomic operations,
ensuring that we satisfy the full barrier requirements where they are
needed.
Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:41 +0000 (13:38 +0000)]
xen: arm32: resync mem* with Linux v3.14-rc7
This pulls in the following Linux commits:
commit 455bd4c430b0c0a361f38e8658a0d6cb469942b5
Author: Ivan Djelic <ivan.djelic@parrot.com>
Date: Wed Mar 6 20:09:27 2013 +0100
ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations
Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on
assumptions about the implementation of memset and similar functions.
The current ARM optimized memset code does not return the value of
its first argument, as is usually expected from standard implementations.
GCC assumes memset returns the value of pointer 'waiter' in register r0,
causing register/memory corruptions.
This patch fixes the return value of the assembly version of memset.
It adds a 'mov' instruction and merges an additional load+store into
existing load/store instructions.
For ease of review, here is a breakdown of the patch into 4 simple steps:
Step 1
======
Perform the following substitutions:
ip -> r8, then
r0 -> ip,
and insert 'mov ip, r0' as the first statement of the function.
At this point, we have a memset() implementation returning the proper result,
but corrupting r8 on some paths (the ones that were using ip).
Step 2
======
Make sure r8 is saved and restored when (! CALGN(1)+0) == 1:
Step 4
======
Rewrite register list "r4-r7, r8" as "r4-r8".
Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit 418df63adac56841ef6b0f1fcf435bc64d4ed177
Author: Nicolas Pitre <nicolas.pitre@linaro.org>
Date: Tue Mar 12 13:00:42 2013 +0100
ARM: 7670/1: fix the memset fix
Commit 455bd4c430b0 ("ARM: 7668/1: fix memset-related crashes caused by
recent GCC (4.7.2) optimizations") attempted to fix a compliance issue
with the memset return value. However the memset itself became broken
by that patch for misaligned pointers.
This fixes the above by branching over the entry code from the
misaligned fixup code to avoid reloading the original pointer.
Also, because the function entry alignment is wrong in the Thumb mode
compilation, that fixup code is moved to the end.
While at it, the entry instructions are slightly reworked to help dual
issue pipelines.
Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:40 +0000 (13:38 +0000)]
xen: arm32: resync atomics with (almost) v3.14-rc7
Almost, because I am omitting aed3a4e "ARM: 7868/1: arm/arm64: remove
atomic_clear_mask() ...", which I will apply to both arm32 and arm64
simultaneously in a later patch.
ARM: atomics: prefetch the destination word for write prior to strex
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch prefixes our atomic access implementations with pldw
instructions (on CPUs which support them) to try and grab the line in
exclusive state from the start. Only the barrier-less functions are
updated, since memory barriers can limit the usefulness of prefetching
data.
Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
commit 4dcc1cf7316a26e112f5c9fcca531ff98ef44700
Author: Chen Gang <gang.chen@asianux.com>
Date: Sat Oct 26 15:07:25 2013 +0100
ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval'
For atomic_cmpxchg(), the type of 'oldval' needs to be 'int' to match the
type of "*ptr" (used by the 'ldrex' instruction) and 'old' (used by the 'teq'
instruction).
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:39 +0000 (13:38 +0000)]
xen: arm32: replace hard tabs in atomics.h
This file is from Linux and the intention was to keep the formatting the same
to make resyncing easier. Put the hard tabs back and adjust the emacs magic to
reflect the desired use of whitespace.
Adjust the 64-bit emacs magic too.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:38 +0000 (13:38 +0000)]
xen: arm32: ensure cmpxchg has full barrier semantics
Unrelated reads/writes should not pass the xchg.
Provide cmpxchg_local for parity with arm64, although it appears to be unused.
It also helps make the reason for the separation of __cmpxchg_mb more
apparent.
With this our cmpxchg is in sync with Linux v3.14-rc7.
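The resulting split, sketched with assumed macro shapes: the default
cmpxchg maps to the fully-barriered helper, while cmpxchg_local stays
barrier-free:
    /* Fully ordered: unrelated reads/writes cannot pass it. */
    #define cmpxchg(ptr, o, n)                                        \
        __cmpxchg_mb((ptr), (unsigned long)(o), (unsigned long)(n),   \
                     sizeof(*(ptr)))

    /* No implied ordering; provided for parity with arm64. */
    #define cmpxchg_local(ptr, o, n)                                  \
        __cmpxchg((ptr), (unsigned long)(o), (unsigned long)(n),      \
                  sizeof(*(ptr)))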
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
The bitops functions (e.g. _test_and_set_bit) on ARM do not have unwind
annotations and therefore the kernel cannot backtrace out of them on a
fatal error (for example, NULL pointer dereference).
This patch annotates the bitops assembly macros with UNWIND annotations
so that we can produce a meaningful backtrace on error. Callers of the
macros are modified to pass their function name as a macro parameter,
enforcing that the macros are used as standalone function implementations.
Acked-by: Dave Martin <dave.martin@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
commit d779c07dd72098a7416d907494f958213b7726f3
Author: Will Deacon <will.deacon@arm.com>
Date: Thu Jun 27 12:01:51 2013 +0100
ARM: bitops: prefetch the destination word for write prior to strex
The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch prefixes our atomic bitops implementation with prefetchw,
to try and grab the line in exclusive state from the start. The testop
macro is left alone, since the barrier semantics limit the usefulness
of prefetching data.
Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
commit b7ec699405f55667caeb46d96229d75bf33a83ad
Author: Will Deacon <will.deacon@arm.com>
Date: Tue Nov 19 15:46:11 2013 +0100
ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP
Uwe reported a build failure when targeting a NOMMU platform with my
recent prefetch changes:
arch/arm/lib/changebit.S: Assembler messages:
arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is
not allowed for the current base architecture
This is due to use of the .arch_extension mp directive immediately prior
to an ALT_SMP(...) instruction. Whilst the ALT_SMP macro will expand to
nothing if !CONFIG_SMP, gas will still choke on the directive.
This patch fixes the issue by only emitting the sequence (including the
directive) if CONFIG_SMP=y.
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Wed, 26 Mar 2014 13:38:36 +0000 (13:38 +0000)]
xen: x86 & generic: change to __builtin_prefetch()
Quoting Andi Kleen in Linux b483570a13be from 2007:
gcc 3.2+ supports __builtin_prefetch, so it's possible to use it on all
architectures. Change the generic fallback in linux/prefetch.h to use it
instead of noping it out. gcc should do the right thing when the
architecture doesn't support prefetching
Undefine the x86-64 inline assembler version and use the fallback.
ARM wants to use the builtins.
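The generic fallback then amounts to something like (sketch; the real
headers may wrap these in inline functions):
    /* The builtin emits a prefetch hint where the target has one and
     * compiles to nothing where it doesn't. */
    #define prefetch(x)  __builtin_prefetch(x)
    #define prefetchw(x) __builtin_prefetch((x), 1)   /* 1 = prefetch for write */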
Fix a pair of spelling errors, one of which was from Lucas De Marchi in the
Linux tree.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: Keir Fraser <keir@xen.org> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Thu, 3 Apr 2014 08:59:43 +0000 (09:59 +0100)]
xen: arm32: don't force the compiler to allocate a dummy register
TLBIALLH, ICIALLU and BPIALL make no use of their register argument. Instead
of making the compiler allocate a dummy register, just hardcode r0; there is no
need to represent this in the inline asm since the register is neither
clobbered nor used in any way.
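Illustrated for TLBIALLH (before/after shape assumed):
    /* Before: %0 makes the compiler materialise a dummy register. */
    asm volatile("mcr p15, 4, %0, c8, c7, 0" : : "r" (dummy));   /* TLBIALLH */

    /* After: hardcode r0; its value is ignored by the instruction and
     * the register is neither read for its contents nor clobbered. */
    asm volatile("mcr p15, 4, r0, c8, c7, 0");                   /* TLBIALLH */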
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Jan Beulich [Wed, 2 Apr 2014 08:09:33 +0000 (09:09 +0100)]
x86/HVM: fix setting mem access to default
commit 3b0bcb89 ("x86/mm/p2m: Move p2m code in HVMOP_[gs]et_mem_access
into p2m.c") introduced an off-by-one mistake, forcing an input of
HVMMEM_access_default to always fail. Since it is related, also eliminate the
inefficient setup of an on-stack array on each function invocation.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Wed, 2 Apr 2014 08:09:00 +0000 (09:09 +0100)]
x86/HVM: fix preemption handling in do_hvm_op() (try 2)
Just like previously done for some mem-op hypercalls, undo preemption
using the interface structures (altering it in ways the caller may not
expect) and replace it by storing the continuation point in the high
bits of sub-operation argument.
This also changes the "nr" fields of struct xen_hvm_track_dirty_vram
(operation already limited to 1Gb worth of pages) and struct
xen_hvm_modified_memory to be only 32 bits wide, consistent with those
of struct xen_set_mem{type,access}. If that's not acceptable for some
reason, we'd need to shrink the HVMOP_op_bits (while still enforcing
the [then higher] limit resulting from the need to be able to encode
the continuation).
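A sketch of the encoding idea (widths and error-handling details assumed;
HVMOP_op_bits is named in the text above):
    unsigned long start_iter = op >> HVMOP_op_bits;   /* 0 on first entry */

    switch ( op &= ((1UL << HVMOP_op_bits) - 1) )
    {
        /* ... per-op processing resumes at start_iter ... */
    }

    /* On preemption, stash the progress back into the op argument. */
    if ( preempted )
        rc = hypercall_create_continuation(__HYPERVISOR_hvm_op, "lh",
                                           op | (start_iter << HVMOP_op_bits),
                                           arg);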
Whether (and if so how) to adjust xc_hvm_track_dirty_vram(),
xc_hvm_modified_memory(), xc_hvm_set_mem_type(), and
xc_hvm_set_mem_access() to reflect the 32-bit restriction on "nr" is
unclear: If the APIs need to remain stable, all four functions should
probably check that there was no truncation. Preferably their
parameters would be changed to uint32_t or unsigned int, though.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
common/domctl: some functions are only used internally
The following functions are only used internally in common/domctl.c:
- bitmap_to_xenctl_bitmap
- xenctl_bitmap_to_bitmap
- nodemask_to_xenctl_bitmap
- xenctl_bitmap_to_nodemask
Daniel De Graaf [Tue, 1 Apr 2014 16:22:40 +0000 (18:22 +0200)]
evtchn: rearrange fields
Event channel arrays are allocated in blocks with EVTCHNS_PER_BUCKET
elements, which must be a power of 2. When XSM is disabled, struct
evtchn is 32 bytes including padding; however, when XSM is enabled, the
structure becomes larger and EVTCHNS_PER_BUCKET is halved. Rearranging
some of the fields in struct evtchn allows a 4-byte XSM field to fit
within the 32-byte structure.
This rearrangement turns the xen_consumer field of struct evtchn into a
bitfield and adjusts the xen_consumers array to fit the number of
addressable elements from this value. Since there are currently only
two users of this array, only 3 bits (7 values) are reserved. This
field is also used rarely enough that the slight overhead from applying
a bitmask should not cause problems.
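A sketch of the described layout (field set abridged; names from the
commit text):
    #define XEN_CONSUMER_BITS 3   /* 7 usable values; only 2 consumers today */

    struct evtchn
    {
        spinlock_t lock;
        u8  state;
        u8  xen_consumer:XEN_CONSUMER_BITS;   /* index into xen_consumers[], 0 = none */
        u16 notify_vcpu_id;
        u32 port;
        /* per-state union and, when XSM is enabled, a 4-byte security field */
    };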
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Tue, 1 Apr 2014 14:49:18 +0000 (16:49 +0200)]
VMX: fix PAT value seen by guest
The XSA-60 fixes introduced a window during which the guest PAT gets
forced to all zeros. This shouldn't be visible to the guest. Therefore
we need to intercept PAT MSR accesses during that time period.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Liu Jinsong <jinsong.liu@intel.com>
Matthew Daley [Tue, 1 Apr 2014 14:48:02 +0000 (16:48 +0200)]
kexec: propagate ENOMEM result in error handling
...otherwise if kimage_alloc_control_page fails (presumably due to
out-of-memory; see the invocation just before this one), the caller of
do_kimage_alloc will think the call was successful.
Signed-off-by: Matthew Daley <mattd@bugfuzz.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Matthew Daley [Sat, 29 Mar 2014 05:08:08 +0000 (18:08 +1300)]
pv-grub: correct sizeof usage
We were lucky that sizeof(frame) >= sizeof(*frame) anyway.
Signed-off-by: Matthew Daley <mattd@bugfuzz.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Thu, 13 Mar 2014 15:09:17 +0000 (15:09 +0000)]
xen/console: Add support for early printk
On ARM, a function (early_printk) was introduced to output messages when the
serial port is not initialized.
This solution is fragile because the developer needs to know when the serial
port is initialized, so as to use either early_printk or printk. Moreover some
functions (mainly in common code) only use printk, which can sometimes result
in lost messages.
Directly call early_printk in the console code when the serial port is not yet
initialized. For this purpose use serial_steal_fn.
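A sketch based on the commit text (console-internal shape assumed):
    /* Route output through early_puts() until the serial console is
     * set up; serial_steal_fn already exists for console stealing. */
    static void (*serial_steal_fn)(const char *) = early_puts;

    static void __putstr(const char *str)
    {
        if ( serial_steal_fn != NULL )
            (*serial_steal_fn)(str);
        else
            serial_puts(sercon_handle, str);
    }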