Jan Beulich [Mon, 17 Jun 2013 08:30:57 +0000 (10:30 +0200)]
AMD IOMMU: make interrupt work again
Commit 899110e3 ("AMD IOMMU: include IOMMU interrupt information in 'M'
debug key output") made the AMD IOMMU MSI setup code use more of the
generic MSI setup code (as other than for VT-d this is an ordinary MSI-
capable PCI device), but failed to notice that till now interrupt setup
there _required_ the subsequent affinity setup to be done, as that was
the only point where the MSI message would get written. The generic MSI
affinity setting routine, however, does only an incremental change,
i.e. relies on this setup to have been done before.
In order to not make the code even more clumsy, introduce a new low
level helper routine __setup_msi_irq(), thus eliminating the need for
the AMD IOMMU code to directly fiddle with the IRQ descriptor.
Ian Campbell [Sat, 15 Jun 2013 08:30:47 +0000 (09:30 +0100)]
xen: arm: fix build after libelf changes.
ed65808a8ed4 "libelf: check all pointer accesses" caused:
kernel.c: In function 'kernel_elf_load':
kernel.c:162:18: error: 'struct elf_binary' has no member named 'dest'
make[4]: *** [kernel.o] Error 1
The field is now called dest_base. We also need to populate dest_size.
This fixes the build for me although have not tested it. I have a feeling that
loading the kernel from an ELF file on ARM doesn't currently work anyway
(everyone uses the zImage loader as far as I am aware).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Matthew Daley [Fri, 14 Jun 2013 15:39:38 +0000 (16:39 +0100)]
libxc: check blob size before proceeding in xc_dom_check_gzip
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Matthew Daley <mattjd@gmail.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v8: Add a comment explaining where the number 6 comes from.
Ian Jackson [Fri, 14 Jun 2013 15:39:38 +0000 (16:39 +0100)]
libxc: range checks in xc_dom_p2m_host and _guest
These functions take guest pfns and look them up in the p2m. They did
no range checking.
However, some callers, notably xc_dom_boot.c:setup_hypercall_page want
to pass untrusted guest-supplied value(s). It is most convenient to
detect this here and return INVALID_MFN.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Tim Deegan <tim@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
v6: Check for underflow too (thanks to Andrew Cooper).
Ian Jackson [Fri, 14 Jun 2013 15:39:38 +0000 (16:39 +0100)]
libxc: check return values from malloc
A sufficiently malformed input to libxc (such as a malformed input ELF
or other guest-controlled data) might cause one of libxc's malloc() to
fail. In this case we need to make sure we don't dereference or do
pointer arithmetic on the result.
Search for all occurrences of \b(m|c|re)alloc in libxc, and all
functions which call them, and add appropriate error checking where
missing.
This includes the functions xc_dom_malloc*, which now print a message
when they fail so that callers don't have to do so.
The function xc_cpuid_to_str wasn't provided with a sane return value
and has a pretty strange API, which now becomes a little stranger.
There are no in-tree callers.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v8: Move a check in xc_exchange_page to the previous patch
(ie, remove it from this patch).
v7: Add a missing check for a call to alloc_str.
Add arithmetic overflow check in xc_dom_malloc.
Coding style fix.
v6: Fix a missed call `pfn_err = calloc...' in xc_domain_restore.c.
Fix a missed call `new_pfn = xc_map_foreign_range...' in
xc_offline_page.c
v5: This patch is new in this version of the series.
Ian Jackson [Fri, 14 Jun 2013 15:39:38 +0000 (16:39 +0100)]
libxc: check failure of xc_dom_*_to_ptr, xc_map_foreign_range
The return values from xc_dom_*_to_ptr and xc_map_foreign_range are
sometimes dereferenced, or subjected to pointer arithmetic, without
checking whether the relevant function failed and returned NULL.
Add an appropriate error check at every call site.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v8: Add a missing check in xc_offline_page.c:xc_exchange_page,
which was in the next patch in v7 of the series.
Also improve the message.
I think in this particular error case it may be that the results
are a broken guest, but turning this from a possible host tools
crash into a guest problem seems to solve the potential security
problem.
v7: Simplify an error DOMPRINTF to not use "load ? : ".
Make DOMPRINTF allocation error messages consistent.
Do not set elf->dest_pages in xc_dom_load_elf_kernel
if xc_dom_seg_to_ptr_pages fails.
v5: This patch is new in this version of the series.
Ian Jackson [Fri, 14 Jun 2013 15:39:37 +0000 (16:39 +0100)]
libxc: Add range checking to xc_dom_binloader
This is a simple binary image loader with its own metadata format.
However, it is too careless with image-supplied values.
Add the following checks:
* That the image is bigger than the metadata table; otherwise the
pointer arithmetic to calculate the metadata table location may
yield undefined and dangerous values.
* When clamping the end of the region to search, that we do not
calculate pointers beyond the end of the image. The C
specification does not permit this and compilers are becoming ever
more determined to miscompile code when they can "prove" various
falsehoods based on assertions from the C spec.
* That the supplied image is big enough for the text we are allegedly
copying from it. Otherwise we might have a read overrun and copy
the results (perhaps a lot of secret data) into the guest.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
v9: Use clearer code for calculating probe_end in find_table.
v6: Add a missing `return -EINVAL' (Matthew Daley).
Fix an error in the commit message (Matthew Daley).
v5: This patch is new in this version of the series.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: abolish obsolete macros
Abolish ELF_PTRVAL_[CONST_]{CHAR,VOID}; change uses to elf_ptrval.
Abolish ELF_HANDLE_DECL_NONCONST; change uses to ELF_HANDLE_DECL.
Abolish ELF_OBSOLETE_VOIDP_CAST; simply remove all uses.
No functional change. (Verified by diffing assembler output.)
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v2: New patch.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: check loops for running away
Ensure that libelf does not have any loops which can run away
indefinitely even if the input is bogus. (Grepped for \bfor, \bwhile
and \bgoto in libelf and xc_dom_*loader*.c.)
Changes needed:
* elf_note_next uses the note's unchecked alleged length, which might
wrap round. If it does, return ELF_MAX_PTRVAL (0xfff..fff) instead,
which will be beyond the end of the section and so terminate the
caller's loop. Also check that the returned psuedopointer is sane.
* In various loops over section and program headers, check that the
calculated header pointer is still within the image, and quit the
loop if it isn't.
* Some fixed limits to avoid potentially O(image_size^2) loops:
- maximum length of strings: 4K (longer ones ignored totally)
- maximum total number of ELF notes: 65536 (any more are ignored)
* Check that the total program contents (text, data) we copy or
initialise doesn't exceed twice the output image area size.
* Remove an entirely useless loop from elf_xen_parse (!)
* Replace a nested search loop in in xc_dom_load_elf_symtab in
xc_dom_elfloader.c by a precomputation of a bitmap of referenced
symtabs.
We have not changed loops which might, in principle, iterate over the
whole image - even if they might do so one byte at a time with a
nontrivial access check function in the middle.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v8: Fix the two loops in libelf-dominfo.c; the comment about
PT_NOTE and SHT_NOTE wasn't true because the checks did
"continue", not "break".
Add a comment about elf_note_next's expectations of the caller's
loop conditions (which most plausible callers will follow anyway).
v5: Fix regression due to wrong image size loop limit calculation.
Check return value from xc_dom_malloc.
v4: Fix regression due to misplacement of test in elf_shdr_by_name
(uninitialised variable).
Introduce fixed limits.
Avoid O(size^2) loops.
Check returned psuedopointer from elf_note_next is correct.
A few style fixes.
v3: Fix a whitespace error.
v2: BUGFIX: elf_shdr_by_name, elf_note_next: Reject new <= old, not just <.
elf_shdr_by_name: Change order of checks to be a bit clearer.
elf_load_bsdsyms: shdr loop check, improve chance of brokenness detection.
Style fixes.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: use only unsigned integers
Signed integers have undesirable undefined behaviours on overflow.
Malicious compilers can turn apparently-correct code into code with
security vulnerabilities etc.
So use only unsigned integers. Exceptions are booleans (which we have
already changed) and error codes.
We _do_ change all the chars which aren't fixed constants from our own
text segment, but not the char*s. This is because it is safe to
access an arbitrary byte through a char*, but not necessarily safe to
convert an arbitrary value to a char.
As a consequence we need to compile libelf with -Wno-pointer-sign.
It is OK to change all the signed integers to unsigned because all the
inequalities in libelf are in contexts where we don't "expect"
negative numbers.
In libelf-dominfo.c:elf_xen_parse we rename a variable "rc" to
"more_notes" as it actually contains a note count derived from the
input image. The "error" return value from elf_xen_parse_notes is
changed from -1 to ~0U.
grepping shows only one occurrence of "PRId" or "%d" or "%ld" in
libelf and xc_dom_elfloader.c (a "%d" which becomes "%u").
This is part of the fix to a security issue, XSA-55.
For those concerned about unintentional functional changes, the
following rune produces a version of the patch which is much smaller
and eliminates only non-functional changes:
set +e
diff -pu --label "$path~" <(seddery <"$in") --label "$path" <(seddery <"$out")
rc=$?
set -e
if [ $rc = 1 ]; then rc=0; fi
exit $rc
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v8: Use "?!?!" to express consternation instead of a ruder phrase.
v5: Introduce ELF_NOTE_INVALID, instead of using a literal ~0U.
v4: Fix regression in elf_round_up; use uint64_t here.
v3: Changes to booleans split off into separate patch.
v2: BUGFIX: Eliminate conversion to int of return from elf_xen_parse_notes.
BUGFIX: Fix the one printf format thing which needs changing.
Remove irrelevant change to constify note_desc.name in libelf-dominfo.c.
In xc_dom_load_elf_symtab change one sizeof(int) to sizeof(unsigned).
Do not change type of 2nd argument to memset.
Provide seddery for easier review.
Style fix.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: use C99 bool for booleans
We want to remove uses of "int" because signed integers have
undesirable undefined behaviours on overflow. Malicious compilers can
turn apparently-correct code into code with security vulnerabilities
etc.
In this patch we change all the booleans in libelf to C99 bool,
from <stdbool.h>.
For the one visible libelf boolean in libxc's public interface we
retain the use of int to avoid changing the ABI; libxc converts it to
a bool for consumption by libelf.
It is OK to change all values only ever used as booleans to _Bool
(bool) because conversion from any scalar type to a _Bool works the
same as the boolean test in if() or ?: and is always defined (C99
6.3.1.2). But we do need to check that all these variables really are
only ever used that way. (It is theoretically possible that the old
code truncated some 64-bit values to 32-bit ints which might become
zero depending on the value, which would mean a behavioural change in
this patch, but it seems implausible that treating 0x????????00000000
as false could have been intended.)
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v3: Use <stdbool.h>'s bool (or _Bool) instead of defining elf_bool.
Split this into a separate patch.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: Make all callers call elf_check_broken
This arranges that if the new pointer reference error checking
tripped, we actually get a message about it. In this patch these
messages do not change the actual return values from the various
functions: so pointer reference errors do not prevent loading. This
is for fear that some existing kernels might cause the code to make
these wild references, which would then break, which is not a good
thing in a security patch.
In xen/arch/x86/domain_build.c we have to introduce an "out" label and
change all of the "return rc" beyond the relevant point into "goto
out".
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v5: Fix two whitespace errors.
v3.1:
Add error check to xc_dom_parse_elf_kernel.
Move check in xc_hvm_build_x86.c:setup_guest to right place.
v2 was Acked-by: Ian Campbell <ian.campbell@citrix.com>
v2 was Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: Check pointer references in elf_is_elfbinary
elf_is_elfbinary didn't take a length parameter and could potentially
access out of range when provided with a very short image.
We only need to check the size is enough for the actual dereference in
elf_is_elfbinary; callers are just using it to check the magic number
and do their own checks (usually via the new elf_ptrval system) before
dereferencing other parts of the header.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v7: Add a comment about the limited function of elf_is_elfbinary.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: check all pointer accesses
We change the ELF_PTRVAL and ELF_HANDLE types and associated macros:
* PTRVAL becomes a uintptr_t, for which we provide a typedef
elf_ptrval. This means no arithmetic done on it can overflow so
the compiler cannot do any malicious invalid pointer arithmetic
"optimisations". It also means that any places where we
dereference one of these pointers without using the appropriate
macros or functions become a compilation error.
So we can be sure that we won't miss any memory accesses.
All the PTRVAL variables were previously void* or char*, so
the actual address calculations are unchanged.
* ELF_HANDLE becomes a union, one half of which keeps the pointer
value and the other half of which is just there to record the
type.
The new type is not a pointer type so there can be no address
calculations on it whose meaning would change. Every assignment or
access has to go through one of our macros.
* The distinction between const and non-const pointers and char*s
and void*s in libelf goes away. This was not important (and
anyway libelf tended to cast away const in various places).
* The fields elf->image and elf->dest are renamed. That proves
that we haven't missed any unchecked uses of these actual
pointer values.
* The caller may fill in elf->caller_xdest_base and _size to
specify another range of memory which is safe for libelf to
access, besides the input and output images.
* When accesses fail due to being out of range, we mark the elf
"broken". This will be checked and used for diagnostics in
a following patch.
We do not check for write accesses to the input image. This is
because libelf actually does this in a number of places. So we
simply permit that.
* Each caller of libelf which used to set dest now sets
dest_base and dest_size.
* In xc_dom_load_elf_symtab we provide a new actual-pointer
value hdr_ptr which we get from mapping the guest's kernel
area and use (checking carefully) as the caller_xdest area.
* The STAR(h) macro in libelf-dominfo.c now uses elf_access_unsigned.
* elf-init uses the new elf_uval_3264 accessor to access the 32-bit
fields, rather than an unchecked field access (ie, unchecked
pointer access).
* elf_uval has been reworked to use elf_uval_3264. Both of these
macros are essentially new in this patch (although they are derived
from the old elf_uval) and need careful review.
* ELF_ADVANCE_DEST is now safe in the sense that you can use it to
chop parts off the front of the dest area but if you chop more than
is available, the dest area is simply set to be empty, preventing
future accesses.
* We introduce some #defines for memcpy, memset, memmove and strcpy:
- We provide elf_memcpy_safe and elf_memset_safe which take
PTRVALs and do checking on the supplied pointers.
- Users inside libelf must all be changed to either
elf_mem*_unchecked (which are just like mem*), or
elf_mem*_safe (which take PTRVALs) and are checked. Any
unchanged call sites become compilation errors.
* We do _not_ at this time fix elf_access_unsigned so that it doesn't
make unaligned accesses. We hope that unaligned accesses are OK on
every supported architecture. But it does check the supplied
pointer for validity.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v7: Remove a spurious whitespace change.
v5: Use allow_size value from xc_dom_vaddr_to_ptr to set xdest_size
correctly.
If ELF_ADVANCE_DEST advances past the end, mark the elf broken.
Always regard NULL allowable region pointers (e.g. dest_base)
as invalid (since NULL pointers don't point anywhere).
v4: Fix ELF_UNSAFE_PTR to work on 32-bit even when provided 64-bit
values.
Fix xc_dom_load_elf_symtab not to call XC_DOM_PAGE_SIZE
unnecessarily if load is false. This was a regression.
v3.1:
Introduce a change to elf_store_field to undo the effects of
the v3.1 change to the previous patch (the definition there
is not compatible with the new types).
v3: Fix a whitespace error.
v2 was Acked-by: Ian Campbell <ian.campbell@citrix.com>
v2: BUGFIX: elf_strval: Fix loop termination condition to actually work.
BUGFIX: elf_strval: Fix return value to not always be totally wild.
BUGFIX: xc_dom_load_elf_symtab: do proper check for small header size.
xc_dom_load_elf_symtab: narrow scope of `hdr_ptr'.
xc_dom_load_elf_symtab: split out uninit'd symtab.class ref fix.
More comments on the lifetime/validity of elf-> dest ptrs etc.
libelf.h: write "obsolete" out in full
libelf.h: rename "dontuse" to "typeonly" and add doc comment
elf_ptrval_in_range: Document trustedness of arguments.
Style and commit message fixes.
Ian Jackson [Fri, 14 Jun 2013 15:39:36 +0000 (16:39 +0100)]
libelf: check nul-terminated strings properly
It is not safe to simply take pointers into the ELF and use them as C
pointers. They might not be properly nul-terminated (and the pointers
might be wild).
So we are going to introduce a new function elf_strval for safely
getting strings. This will check that the addresses are in range and
that there is a proper nul-terminated string. Of course it might
discover that there isn't. In that case, it will be made to fail.
This means that elf_note_name might fail, too.
For the benefit of call sites which are just going to pass the value
to a printf-like function, we provide elf_strfmt which returns
"(invalid)" on failure rather than NULL.
In this patch we introduce dummy definitions of these functions. We
introduce calls to elf_strval and elf_strfmt everywhere, and update
all the call sites with appropriate error checking.
There is not yet any semantic change, since before this patch all the
places where we introduce elf_strval dereferenced the value anyway, so
it mustn't have been NULL.
In future patches, when elf_strval is made able return NULL, when it
does so it will mark the elf "broken" so that an appropriate
diagnostic can be printed.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v7: Change readnotes.c check to use two if statements rather than ||.
Use the new PTRVAL macros and elf_access_unsigned in
print_l1_mfn_valid_note.
No functional change unless the input is wrong, or we are reading a
file for a different endianness.
Separated out from the previous patch because this change does produce
a difference in the generated code.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
v2: Split out into its own patch.
Ian Jackson [Fri, 14 Jun 2013 15:39:35 +0000 (16:39 +0100)]
libelf: introduce macros for memory access and pointer handling
We introduce a collection of macros which abstract away all the
pointer arithmetic and dereferences used for accessing the input ELF
and the output area(s). We use the new macros everywhere.
For now, these macros are semantically identical to the code they
replace, so this patch has no functional change.
elf_is_elfbinary is an exception: since it doesn't take an elf*, we
need to handle it differently. In a future patch we will change it to
take, and check, a length parameter. For now we just mark it with a
fixme.
That this patch has no functional change can be verified as follows:
0. Copy the scripts "comparison-generate" and "function-filter"
out of this commit message.
1. Check out the tree before this patch.
2. Run the script ../comparison-generate .... ../before
3. Check out the tree after this patch.
4. Run the script ../comparison-generate .... ../after
5. diff --exclude=\*.[soi] -ruN before/ after/ |less
Expect these differences:
* stubdom/zlib-x86_64/ztest*.s2
The filename of this test file apparently contains the pid.
* xen/common/version.s2
The xen build timestamp appears in two diff hunks.
Verification that this is all that's needed:
In a completely built xen.git,
find * -name .*.d -type f | xargs grep -l libelf\.h
Expect results in:
xen/arch/x86: Checked above.
tools/libxc: Checked above.
tools/xcutils/readnotes: Checked above.
tools/xenstore: Checked above.
xen/common/libelf:
This is the build for the hypervisor; checked in B above.
stubdom:
We have one stubdom which reads ELFs using our libelf,
pvgrub, which is checked above.
I have not done this verification for ARM.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v7: Add uintptr_t cast to ELF_UNSAFE_PTR. Still verifies.
Use git foo not git-foo in commit message verification script.
v4: Fix elf_load_binary's phdr message to be correct on 32-bit.
Fix ELF_OBSOLETE_VOIDP_CAST to work on 32-bit.
Indent scripts in commit message.
v3.1:
Change elf_store_field to verify correctly on 32-bit.
comparison-generate copes with Xen 4.1's lack of ./configure.
v2: Use Xen style for multi-line comments.
Postpone changes to readnotes.c:print_l1_mfn_valid_note.
Much improved verification instructions with new script.
Fixed commit message subject.
if [ -f ./configure ]; then
$build_rune_prefix ./configure
fi
$build_rune_prefix make -C xen
$build_rune_prefix make -C tools/include
$build_rune_prefix make -C stubdom grub
$build_rune_prefix make -C tools/libxc
$build_rune_prefix make -C tools/xenstore
$build_rune_prefix make -C tools/xcutils
rm -rf "$result_dir"
mkdir "$result_dir"
set +x
for f in `find xen tools stubdom -name \*.[soi]`; do
mkdir -p "$result_dir"/`dirname $f`
cp $f "$result_dir"/${f}
case $f in
*.s)
../function-filter <$f >"$result_dir"/${f}2
;;
esac
done
echo ok.
-8<-
-8<- function-filter -8<-
#!/usr/bin/perl -w
# function-filter
# script for massaging gcc-generated labels to be consistent
use strict;
our @lines;
my $sedderybody = "sub seddery () {\n";
while (<>) {
push @lines, $_;
if (m/^(__FUNCTION__|__func__)\.(\d+)\:/) {
$sedderybody .= " s/\\b$1\\.$2\\b/__XSA55MANGLED__$1.$./g;\n";
}
}
$sedderybody .= "}\n1;\n";
eval $sedderybody or die $@;
foreach (@lines) {
seddery();
print or die $!;
}
-8<-
Ian Jackson [Fri, 14 Jun 2013 15:39:35 +0000 (16:39 +0100)]
libelf/xc_dom_load_elf_symtab: Do not use "syms" uninitialised
xc_dom_load_elf_symtab (with load==0) calls elf_round_up, but it
mistakenly used the uninitialised variable "syms" when calculating
dom->bsd_symtab_start. This should be a reference to "elf".
This change might have the effect of rounding the value differently.
Previously if the uninitialised value (a single byte on the stack) was
ELFCLASS64 (ie, 2), the alignment would be to 8 bytes, otherwise to 4.
However, the value is calculated from dom->kernel_seg.vend so this
could only make a difference if that value wasn't already aligned to 8
bytes.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
v2: Split this change into its own patch for proper review.
Ian Jackson [Fri, 14 Jun 2013 15:39:35 +0000 (16:39 +0100)]
libelf: move include of <asm/guest_access.h> to top of file
libelf-loader.c #includes <asm/guest_access.h>, when being compiled
for Xen. Currently it does this in the middle of the file.
Move this #include to the top of the file, before libelf-private.h.
This is necessary because in forthcoming patches we will introduce
private #defines of memcpy etc. which would interfere with definitions
in headers #included from guest_access.h.
No semantic or functional change in this patch.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Ian Jackson [Fri, 14 Jun 2013 15:39:34 +0000 (16:39 +0100)]
libelf: abolish elf_sval and elf_access_signed
These are not used anywhere.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Ian Jackson [Fri, 14 Jun 2013 15:39:34 +0000 (16:39 +0100)]
libelf: add `struct elf_binary*' parameter to elf_load_image
The meat of this function is going to need a copy of the elf pointer,
in forthcoming patches.
No functional change in this patch.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Ian Jackson [Fri, 14 Jun 2013 15:39:34 +0000 (16:39 +0100)]
libxc: Fix range checking in xc_dom_pfn_to_ptr etc.
* Ensure that xc_dom_pfn_to_ptr (when called with count==0) does not
return a previously-allocated block which is entirely before the
requested pfn (!)
* Provide a version of xc_dom_pfn_to_ptr, xc_dom_pfn_to_ptr_retcount,
which provides the length of the mapped region via an out parameter.
* Change xc_dom_vaddr_to_ptr to always provide the length of the
mapped region and change the call site in xc_dom_binloader.c to
check it. The call site in xc_dom_load_elf_symtab will be corrected
in a forthcoming patch, and for now ignores the returned length.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
v5: This patch is new in v5 of the series.
Ian Jackson [Fri, 14 Jun 2013 15:39:34 +0000 (16:39 +0100)]
libxc: introduce xc_dom_seg_to_ptr_pages
Provide a version of xc_dom_seg_to_ptr which returns the number of
guest pages it has actually mapped. This is useful for callers who
want to do range checking; we will use this later in this series.
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
v7: xc_dom_seg_to_ptr_pages now always expects pages_out!=NULL.
(It seems silly to have it tolerate NULL when all the real callers
pass non-NULL and there's a version which doesn't need pages_out
anyway. Fix the call in xc_dom_seg_to_ptr to have a dummy pages
for pages_out.)
v5: xc_dom_seg_to_ptr_pages sets *pages_out=0 if it returns NULL.
v4 was:
Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Fri, 14 Jun 2013 15:39:34 +0000 (16:39 +0100)]
libelf: abolish libelf-relocate.c
This file is not actually used. It's not built in Xen's instance of
libelf; in libxc's it's built but nothing in it is called. Do not
compile it in libxc, and delete it.
This reduces the amount of work we need to do in forthcoming patches
to libelf (particularly since as libelf-relocate.c is not used it is
probably full of bugs).
This is part of the fix to a security issue, XSA-55.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
Ian Campbell [Tue, 4 Jun 2013 10:54:10 +0000 (11:54 +0100)]
xen/arm64: fix stack dump in show_trace
On aarch64 the frame pointer points to the next frame pointer and the return
address is the previous stack slot (so below on the downward growing stack,
therefore above in memory):
|<RETURN ADDR> ^addresses grow up
FP -> |<NEXT FP> |
| |
v |
stack grows down.
This is contrary to aarch32 where the frame pointer points to the return
address and the next frame pointer is the next stack slot (so above on the
downward growing stack, below in memory):
FP -> |<RETURN ADDR> ^addresses grow up
|<NEXT FP> |
| |
v |
stack grows down.
In addition print out LR as part of the trace, since it may contain the
penultimate return address e.g. if the ultimate function is a leaf function.
Lastly nuke some unnecessary braces.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
A bit of debugging revealed that the map_domain_page and unmap_domain_page
are meant for short life-time mappings. And that those mappings are finite.
In the 2 VCPU guest we only have 32 entries and once we have exhausted those
we trigger the BUG_ON condition.
The two functions - tmh_persistent_pool_page_[get,put] are used by the xmem_pool
when xmem_pool_[alloc,free] are called. These xmem_pool_* function are wrapped
in macro and functions - the entry points are via: tmem_malloc
and tmem_page_alloc. In both cases the users are in the hypervisor and they
do not seem to suffer from using the hypervisor virtual addresses.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 13 Jun 2013 08:49:01 +0000 (10:49 +0200)]
x86: fix map_domain_page() last resort fallback
Guests with vCPU count not divisible by 4 have unused bits in the last
word of their inuse bitmap, and the garbage collection code therefore
would get mislead believing that some entries were actually recoverable
for use.
Also use an earlier established local variable in mapcache_vcpu_init()
instead of re-calculating the value (noticed while investigating the
generally better option of setting those overhanging bits once during
setup - this didn't work out in a simple enough fashion because the
mapping getting established there isn't in the current address space,
and hence the bitmap isn't directly accessible there).
Reported-by: Konrad Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 12 Jun 2013 15:27:08 +0000 (17:27 +0200)]
x86: drop setup_idle_pagetable()
With vcpu->domain->arch.perdomain_l3_pg no longer getting set up for
the idle domain, this creates an invalid L4 entry (due to translating
a NULL struct page_info pointer to a physical address).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Dario Faggioli [Fri, 7 Jun 2013 16:14:35 +0000 (18:14 +0200)]
libxl: add LIBXL_HAVE_<foo> for outstanding_pages and outstanding_memkb
Commits d0782481 ("xl: export 'outstanding_pages' value from xcinfo")
and bec8f17e ("xen: Remove the XENMEM_get_oustanding_pages and provide
the data via xc_phys_info") added these two fields in libxl_physinfo
and in libxl_dominfo, respectively, but did not include the needed
LIBXL_HAVE_<foo> runes. Adding them.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Daniel De Graaf [Thu, 30 May 2013 12:57:22 +0000 (08:57 -0400)]
flask/policy: device model stubdom fixes
This fixes framebuffer support for device model stubdoms after 3f28d007
which added the target_hack permission but did not allow the permission
to the stubdom it was created for.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Wed, 12 Jun 2013 08:07:48 +0000 (10:07 +0200)]
io/ring.h: drop unused and broken *_RING_ATTACH() macros
Initializing r*_prod_pvt and r*_cons from independent shared ring
fields is broken, as other macros in this header rely on them being
coupled. Furthermore using the backend variant would also imply a
security vulnerability.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 12 Jun 2013 08:07:20 +0000 (10:07 +0200)]
io/ring.h: new macro to detect whether there are too many requests on the ring
Backends may need to protect themselves against an insane number of
produced requests stored by a frontend, in case they iterate over
requests until reaching the req_prod value. There can't be more
requests on the ring than the difference between produced requests
and produced (but possibly not yet published) responses.
This is a more strict alternative to a patch previously posted by
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
x86/HVM: fix initialization of wallclock time for PVHVM on migration
Call update_domain_wallclock_time on hvm_latch_shinfo_size even if
the bitness of the guest has already been set, this fixes the problem
with the wallclock not being set for PVHVM guests on resume from
migration.
Zhenguo Wang [Tue, 11 Jun 2013 07:45:55 +0000 (09:45 +0200)]
x86/HVM: fix x2APIC APIC_ID read emulation
APIC and x2APIC have different format for APIC_ID register. Need
translation.
Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com> Signed-off-by: Xiaowei Yang <xiaowei.yang@huawei.com>
Convert code to use switch(), fixing coding style issue at once, and
use GET_xAPIC_ID() on the value read instead of VLAPIC_ID() (reading
the field again).
In the course of this also properly reject both read and writes on the
non-existing MSR corresponding to APIC_ICR2.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
This was never reported to be hit, and we assume to have taken care of
the problem by excluding legacy IRQs from the IRQ move cleanup logic.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Wed, 5 Jun 2013 08:05:49 +0000 (10:05 +0200)]
AMD/IOMMU: revert "SR56x0 Erratum 64 - Reset all head & tail pointers"
The code this patch added is redundant with already present code in
set_iommu_{command_buffer,{event,ppr}_log}_control(). Just switch those
ones from using writel() to writeq() for consistency.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
When using a vtsc, hvm_set_guest_time changes hvm_vcpu.stime_offset,
which is used in the vcpu time structure to calculate the
tsc_timestamp, so after updating stime_offset we need to propagate the
change to vcpu_time in order for the guest to get the right time if
using the PV clock.
Jan Beulich [Tue, 4 Jun 2013 15:25:41 +0000 (17:25 +0200)]
x86: fix XCR0 handling
- both VMX and SVM ignored the ECX input to XSETBV
- both SVM and VMX used the full 64-bit RAX when calculating the input
mask to XSETBV
- faults on XSETBV did not get recovered from
Also consolidate the handling for PV and HVM into a single function,
and make the per-CPU variable "xcr0" static to xstate.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Tue, 4 Jun 2013 15:23:11 +0000 (17:23 +0200)]
x86: preserve FPU selectors for 32-bit guest code
Doing {,F}X{SAVE,RSTOR} unconditionally with 64-bit operand size leads
to the selector values associated with the last instruction/data
pointers getting lost. This, besides being inconsistent and not
compatible with native hardware behavior especially for 32-bit guests,
leads to bug checks in 32-bit Windows when running with "Driver
verifier" (see e.g. http://support.microsoft.com/kb/244617).
In a first try I made the code figure out the current guest mode, but
that has the disadvantage of only taking care of the issue when the
guest executes in the mode for which the state currently is (i.e.
namely not in the case of a 64-bit guest running a 32-bit application,
but being in kernle [64-bit] mode).
The solution presented here is to conditionally execute an auxiliary
FNSTENV and use the selectors from there.
In either case the determined word size gets stored in the last byte
of the FPU/SSE save area, which is available for software use (and I
verified is being cleared to zero by all versions of Xen, i.e. will not
present a problem when migrating guests from older to newer hosts), and
evaluated for determining the operand size {,F}XRSTOR is to be issued
with.
Note that I did check whether using a second FXSAVE or a partial second
XSAVE would be faster than FNSTENV - neither on my Westmere (FXSAVE)
nor on my Romley (XSAVE) they are.
I was really tempted to use branches into the middle of instructions
(i.e. past the REX64 prefixes) here, as that would have allowed to
collapse the otherwise identical fault recovery blocks. I stayed away
from doing so just because I expect others to dislike such slightly
subtle/tricky code.
Reported-by: Ben Guthro <Benjamin.Guthro@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Tue, 4 Jun 2013 07:29:07 +0000 (09:29 +0200)]
x86/xsave: properly check guest input to XSETBV
Other than the HVM emulation path, the PV case so far failed to check
that YMM state requires SSE state to be enabled, allowing for a #GP to
occur upon passing the inputs to XSETBV inside the hypervisor.
Jan Beulich [Tue, 4 Jun 2013 07:27:58 +0000 (09:27 +0200)]
x86/xsave: recover from faults on XRSTOR
Just like FXRSTOR, XRSTOR can raise #GP if bad content is being passed
to it in the memory block (i.e. aspects not under the control of the
hypervisor, other than e.g. proper alignment of the block).
Also correct the comment explaining why FXRSTOR needs exception
recovery code to not wrongly state that this can only be a result of
the control tools passing a bad image.
Jan Beulich [Tue, 4 Jun 2013 07:26:54 +0000 (09:26 +0200)]
x86/xsave: fix information leak on AMD CPUs
Just like for FXSAVE/FXRSTOR, XSAVE/XRSTOR also don't save/restore the
last instruction and operand pointers as well as the last opcode if
there's no pending unmasked exception (see CVE-2006-1056 and commit
9747:4d667a139318).
While the FXSR solution sits in the save path, I prefer to have this in
the restore path because there the handling is simpler (namely in the
context of the pending changes to properly save the selector values for
32-bit guest code).
Also this is using FFREE instead of EMMS, as it doesn't seem unlikely
that in the future we may see CPUs with x87 and SSE/AVX but no MMX
support. The goal here anyway is just to avoid an FPU stack overflow.
I would have preferred to use FFREEP instead of FFREE (freeing two
stack slots at once), but AMD doesn't document that instruction.
This is CVE-2013-2076 / XSA-52.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
When booting Xen on VMware ESX 5.1 and Workstation 9, you hit a GPF
during MCE initialization. The culprit is line 631 in
set_poll_bankmask():
bitmap_copy(mb->bank_map, mca_allbanks->bank_map, nr_mce_banks);
What is happening is that in mca_cap_init(), nr_mce_banks is being set
to 0. This causes the allocation of bank_map to be set to
ZERO_BLOCK_PTR which is the return value for zero-size allocation by
xzalloc_array()/_xmalloc(). This results in the bitmap_copy() to fail
disastrously. The following patch fixes this issue.
Signed-off-by: Aravindh Puthiyaparambil <aravindp@cisco.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de>
Boris Ostrovsky [Fri, 31 May 2013 08:06:11 +0000 (10:06 +0200)]
Currently only a few Intel models have VPMU workaround turned on. It
appears, however, that this issue exists on more models than what is
covered by check_pmc_quirk(). Since we don't know exactly which cpus
are affected we should turn this workaround on for all family 6
processors.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Sort-of-Acked-by: "Auld, Will" <will.auld@intel.com>
Jan Beulich [Fri, 31 May 2013 06:35:22 +0000 (08:35 +0200)]
x86: fix ordering of operations in destroy_irq()
The fix for XSA-36, switching the default of vector map management to
be per-device, exposed more readily a problem with the cleanup of these
vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
keeps the subsequently invoked clear_irq_vector() from clearing the
bits for both the in-use and a possibly still outstanding old vector.
Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
its only caller, deferring the clearing of the vector map pointer until
after clear_irq_vector().
Once at it, also defer resetting of desc->handler until after the loop
around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
(mostly theoretical) issue with the intercation with do_IRQ(): If we
don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
->ack() and ->end() with different ->handler pointers, potentially
leading to an IRQ remaining un-acked. The issue is mostly theoretical
because non-guest IRQs are subject to destroy_irq() only on (boot time)
error paths.
As to the changed locking: Invoking clear_irq_vector() with desc->lock
held is okay because vector_lock already nests inside desc->lock (proven
by set_desc_affinity(), which takes vector_lock and gets called from
various desc->handler->ack implementations, getting invoked with
desc->lock held).
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
George Dunlap [Fri, 24 May 2013 15:20:59 +0000 (16:20 +0100)]
libxl: Remove qxl support for the 4.3 release
The qxl drivers for Windows and Linux end up calling instructions
that cannot be used for MMIO at the moment. Just for the 4.3 release,
remove qxl support.
This patch should be reverted as soon as the 4.4 development window opens.
The instruction in question is "movdqu (%rcx),%xmm3". Xen knows how
to emulate it, but unfortunately %xmm3 is 16 bytes long, and the interface
between Xen and qemu at the moment would appear to only allow MMIO accesses
of 8 bytes.
It's too late in the release cycle to find a fix or a workaround.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Frediano Ziglio [Thu, 23 May 2013 13:55:18 +0000 (13:55 +0000)]
gcov: Add script to split coverage informations.
Split coverage informations extracted from xencov utility.
This script accept coverage blob either as file or from input and extract
into files compatible with gcc format (gcda).
Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
George Dunlap [Fri, 24 May 2013 14:33:27 +0000 (15:33 +0100)]
libxl: use Linux-compatible names for sse4 cpuid features
Linux uses sse4_1 and sse4_2, but at the moment libxl uses '.' instead
of '_'. This makes it confusing for people looking in Linux's /proc/cpuinfo
to disable features.
Add the Linux feature names, keeping the old ones for compatability.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.camppbell@citrix.com>
Andre Przywara [Fri, 24 May 2013 13:47:30 +0000 (15:47 +0200)]
arm/early-printk: add Calxeda Midway UART support
With the help of the previous patches add a stanza to
xen/arch/arm/Rules.mk to specify the UART configuration of the
Calxeda Midway machine.
The information has been taken from the Linux kernel's .dts file.
This can be enabled by adding "CONFIG_EARLY_PRINTK=midway" to
Config.mk.
Signed-off-by: Andre Przywara <andre.przywara@calxeda.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andre Przywara [Fri, 24 May 2013 13:47:29 +0000 (15:47 +0200)]
arm/early-printk: add support for ARM Fastmodel
Though the ARM Fastmodel software emulator mimics a Versatile Express
board, the boot process is different compared to the real hardware,
so the early printk differs slightly. Create a new early-printk
target to model this correctly.
Signed-off-by: Andre Przywara <andre.przywara@calxeda.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andre Przywara [Fri, 24 May 2013 13:47:28 +0000 (15:47 +0200)]
arm/early-printk: move UART base address to Rules.mk
The UART memory mapped base address is currently hardcoded in the
early-printk UART driver, which denies the driver to be used by
two machines with a different mapping.
Move this definition out to xen/arch/arm/Rules.mk, allowing easier
user access and later sharing of the driver.
Signed-off-by: Andre Przywara <andre.przywara@calxeda.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andre Przywara [Fri, 24 May 2013 13:47:27 +0000 (15:47 +0200)]
arm/early-printk: allow skipping of UART init
While it seems obvious to initialize the UART before using it, chances
are that some firmware code or the bootloader already did this.
So it may actually be a good idea to skip the initialization, in fact
this fixes early printk on my TC2 Versatile Express.
So provide an option in xen/arch/arm/Rules.mk to only initialize the
UART when needed.
Signed-off-by: Andre Przywara <andre.przywara@calxeda.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andre Przywara [Fri, 24 May 2013 13:47:26 +0000 (15:47 +0200)]
arm/early-printk: calculate baud rate divisor from user provided value
For early-printk the used baud rate was hardcoded in the code, using
precalculated divisor values.
Let the calculation of these values be done by the preprocessor and
use a human readable value in xen/arch/arm/Rules.mk to drive this.
Signed-off-by: Andre Przywara <andre.przywara@calxeda.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Thu, 23 May 2013 14:50:11 +0000 (15:50 +0100)]
xen/arm: avoid lost characters with early_printk
Introduce the function early_flush to wait until the UART has finished to
transfer the character.
When early printk is enabled, it's possible to loose the last characters if:
- the UART is initialized by the UART driver
- Xen hang
This is result to a truncated message. To be sure that the message is fully
transfered by early_printk, add a call to early_flush. This will avoid lost
characters.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
George Dunlap [Tue, 14 May 2013 10:07:14 +0000 (11:07 +0100)]
xl: Return an error if an empty file is passed to cd-insert
Two changes:
* Stat the file before calling libxl_cdrom_insert()
* Return an error if anything fails (including libxl_cdrom_insert)
This is in part to work around the fact that the RAW disk type
is used for things that aren't actually files; so we can't call
stat in libxl_device.c:libxl__device_disk_set_backend() because
it may be going over a remote protocol.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Christoph Egger [Mon, 13 May 2013 08:24:31 +0000 (10:24 +0200)]
stubdom: Make stubdom buildsystem consistent with tools buildsystem
Use FETCHER for stubdom, too. This makes stubdom buildsystem
more consistent with tools buildsystem.
Fixes toplevel configure failure if wget is not found
independent if we are going to build stubdom or not.
Signed-off-by: Christoph Egger <chegger@amazon.de> Reviewed-by: Matt Wilson <msw@amazon.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
The code had an obvious bug where it would assume that the balloon
amount would always be _something_ and add an E820_RAM entry at the
end of the E820 array. The added E820_RAM would contain the balloon amount
plus the delta of memory that had to be subtracted b/c of the various
E820 entries. That assumption is certainly true when maxmem != mem,
but if guest config has maxmem = memory that is incorrect (as balloon
value is zero). The end result is that the E820 that is constructed
is missing a swath of "delta" memory and in most cases ends up with
only one E820_RAM entry that is of 512MB size on many Intel systems.
Reported-by: Christian Holpert <christian@holpert.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Wed, 29 May 2013 14:48:11 +0000 (15:48 +0100)]
libxc: limit cpu values when setting vcpu affinity
When support for pinning more than 64 cpus was added, check for cpu
out-of-range values was removed. This can lead to subsequent
out-of-bounds cpumap array accesses in case the cpu number is higher
than the actual count.
This patch returns the check.
This is CVE-2013-2072 / XSA-56
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
By moving the call to update_vcpu_system_time() out of schedule() into
arch-specific context switch code, the original problem of the function
accessing the wrong domain's address space goes away (obvious even from
patch context, as update_runstate_area() does similar copying).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jaeyong Yoo [Wed, 22 May 2013 02:34:18 +0000 (02:34 +0000)]
xen/arm: Disable interrupts for the entire duration of the context switch
Not just while saving state. Otherwise there is a race between interrupts
arriving and updating the LR state and gic_restore_state overwriting them with
the saved state.
With this change we no longer need to disable interrupts in gic_restore_state.
Eric Shelton [Thu, 23 May 2013 11:08:51 +0000 (13:08 +0200)]
x86/EFI: fix boot for pre-UEFI systems
efi/boot.c makes a call to QueryVariableInfo on all systems. However,
as it is only promised for UEFI 2.0+ systems, pre-UEFI systems can
hang or crash on this call. The below patch adds a version check, a
technique used in other parts of the Xen EFI codebase.
Signed-off-by: Eric Shelton <eshelton@pobox.com>
Check runtime services version instead of EFI version (while generally
they would be in sync, nothing requires them to be).
Jan Beulich [Thu, 23 May 2013 11:08:32 +0000 (13:08 +0200)]
x86: fix boot time APIC mode detection
current_cpu_data becomes valid only relatively late in the boot
process, so looking there for a particular feature early in the game
would generally give the appearance of the feature being unavailable.
Getting this wrong means that at kexec time the system would get
returned to xAPIC mode, causing disconnect_bsp_APIC() to try to access
the APIC page, which on systems with x2APIC pre-enabled will never get
set up.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 22 May 2013 13:26:52 +0000 (15:26 +0200)]
AMD/iommu: SR56x0 Erratum 64 - Reset all head & tail pointers
Reference at time of patch:
http://support.amd.com/us/ChipsetMotherboard_TechDocs/46303.pdf
Erratum 64 states that the head and tail pointers for the Command buffer and
Event log are only reset on a cold boot, not a warm boot.
While the erratum is limited to systems using SR56xx chipsets (such as Family
10h CPUs), resetting the pointers is a sensible action in all cases, including
the PPR log for consistency.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Jan Beulich [Tue, 21 May 2013 09:32:34 +0000 (11:32 +0200)]
fix XSA-46 regression with xend/xm
The hypervisor side changes for XSA-46 require the tool stack to now
always map the guest pIRQ before granting access permission to the
underlying host IRQ (GSI). This in particular requires that pciif.py
no longer can skip this step (assuming qemu would do it) for HVM
guests.
This in turn exposes, however, an inconsistency between xend and qemu:
The former wants to always establish 1:1 mappings between pIRQ and host
IRQ (for non-MSI only of course), while the latter always wants to
allocate an arbitrary mapping. Since the whole tool stack obviously
should always agree on the mapping model, make libxc enforce the 1:1
mapping as the more natural one (as well as being the one that allows
for easier debugging, since there no need to find out the extra
mapping). Users of libxc that want to establish a particular (rather
than an allocated) mapping are still free to do so, as well as tool
stacks not based on libxc wanting to implement an allocation based
model (which is why it's not the hypervisor that's being changed to
enforce either model).
Since libxl, like xend, already uses a 1:1 model, it's unaffected by
the libxc change (and it being unaffected by the original hypervisor
side changes is - afaict - simply due to qemu getting spawned at a
later point in time compared to the xend event flow).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Andreas Falck <falck.andreas.lists@gmail.com> (on 4.1) Tested-by: Gordan Bobic <gordan@bobich.net> (on 4.2) Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 21 May 2013 08:16:30 +0000 (10:16 +0200)]
tools: fix dependency file generation
There is a small set of places where files in subdirectories get
compiled from the parent directory. Dependency file wise this is no
problem as long as the files use names distinct without regard to the
directories they sit in, and tools/console/ violates this (in having
two main.c files). Hence we need to avoid losing the directory name,
both to ensure the two compiler instances don't simultaneously write
to the same file (happening of which is what triggered me looking
into this) and to guarantee dependencies for all files will be seen
by make on an incremental rebuild.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Tue, 21 May 2013 08:15:13 +0000 (10:15 +0200)]
x86/HVM: RTC code must be in line with WAET flags passed by hvmloader
With hvmloader telling the guest that it may skip REG_C reads during
the processing of RTC interrupts, the emulation code must not depend
upon these reads to occur. Introduce two modes of operation for the
emulation code, and short of a HVM parameter (too late to be
introduced for 4.3) hard code the mode determination to always assume
that Windows-conforming one for the time being.