Jan Beulich [Thu, 5 Sep 2013 09:47:03 +0000 (11:47 +0200)]
hvmloader: fix SeaBIOS interface
The SeaBIOS ROM image may validly exceed 128k in size, it's only our
interface code that so far assumed that it wouldn't. Remove that
restriction by setting the base address depending on image size.
Add a check to HVM loader so that too big images won't result in silent
guest failure anymore.
Uncomment the intended build-time size check for rombios, moving it
into a function so that it would actually compile.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
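A minimal sketch of deriving the load address from the image size, as the SeaBIOS fix above describes. It assumes the image must end at the 1 MiB boundary; the names rom_base_for_size and ROM_MAX_SIZE, and the 256 KiB limit, are hypothetical and not taken from hvmloader.

```c
#include <stdint.h>

/* Assumption: the BIOS image has to end exactly at the 1 MiB boundary. */
#define ROM_END       0x100000u
/* Hypothetical upper bound for this sketch; the real limit is not stated in the log. */
#define ROM_MAX_SIZE  0x40000u

/* Store the load address in *base and return 0, or return -1 if the
 * image is empty or too large (instead of failing silently later). */
static int rom_base_for_size(uint32_t image_size, uint32_t *base)
{
    if (image_size == 0 || image_size > ROM_MAX_SIZE)
        return -1;
    *base = ROM_END - image_size;
    return 0;
}
```

With these assumptions a 128k image lands at 0xE0000 and a 192k image at 0xD0000.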
Wei Liu [Fri, 30 Aug 2013 11:33:33 +0000 (12:33 +0100)]
libvhd: use UTC for VHD timestamp
[ported from xapi-project/blktap a79ac2c05f9 ("XOP-289: use UTC for VHD
timestamps")]
Currently, the local timezone is factored into VHD timestamps due to the
use of mktime(). This breaks "vhd-util check" for VHDs created in one
timezone and then moved westward, which results in "primary footer
invalid: creation time in future" errors.
Signed-off-by: Andrei Lifchits <andrei.lifchits@citrix.com>
Andrei no longer works for Citrix but Germano Percossi (ex-colleague of
Andrei) contacted Andrei and confirmed that
1) this work was written on Citrix time,
2) it is OK to have Andrei's SoB.
Remove unused variable "tm".
Signed-off-by: Germano Percossi <germano.percossi@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
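A rough illustration of a timezone-independent VHD timestamp, in the spirit of the libvhd change above. VHD timestamps count seconds since 2000-01-01 00:00:00 UTC; the function name below is hypothetical and this is not necessarily how libvhd computes it.

```c
#include <stdint.h>
#include <time.h>

/* Unix time of 2000-01-01 00:00:00 UTC, the VHD epoch. */
#define VHD_EPOCH_UNIX 946684800LL

/* Convert seconds since the Unix epoch (already UTC) to a VHD timestamp.
 * No mktime() is involved, so the local timezone never enters the result. */
static uint32_t vhd_timestamp_from_unix(time_t unix_seconds)
{
    return (uint32_t)((long long)unix_seconds - VHD_EPOCH_UNIX);
}
```

Calling it with time(NULL) gives the same value regardless of the TZ setting of the machine that creates the VHD.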
Wei Liu [Tue, 27 Aug 2013 14:22:43 +0000 (15:22 +0100)]
libxl: prefer qdisk over blktap when choosing disk backend
There are some disk formats commonly supported by both qdisk and blktap.
As qdisk is better supported and blktap is unmaintained, we choose qdisk
over blktap whenever possible.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Xi Xiong [Fri, 30 Aug 2013 19:21:56 +0000 (12:21 -0700)]
xend: fix file descriptor leak in pci utilities
A file descriptor leak was detected after creating multiple domUs with
pass-through PCI devices. This patch fixes the issue.
Signed-off-by: Xi Xiong <xixiong@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message] Signed-off-by: Matt Wilson <msw@amazon.com>
Steven Noonan [Fri, 30 Aug 2013 23:40:42 +0000 (16:40 -0700)]
xend: handle extended PCI configuration space when saving state
Newer PCI standards (e.g., PCI-X 2.0 and PCIe) introduce extended
configuration space which is larger than 256 bytes. This patch uses
stat() to determine the amount of space used to correctly save all of
the PCI configuration space. Resets handled by the xen-pciback driver
don't have this problem, as that code correctly handles saving
extended configuration space.
Signed-off-by: Steven Noonan <snoonan@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message] Signed-off-by: Matt Wilson <msw@amazon.com>
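The stat()-based size lookup described above can be sketched in C as follows (xend itself is Python; the function name is hypothetical). On Linux the sysfs file for a device, for example /sys/bus/pci/devices/0000:03:00.0/config, is typically 256 bytes for conventional configuration space and 4096 bytes when extended configuration space is available.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size in bytes of a PCI config space file, or -1 on error.
 * The caller would then save exactly this many bytes rather than
 * assuming a fixed 256. */
static long pci_config_size(const char *config_path)
{
    struct stat st;

    if (stat(config_path, &st) != 0)
        return -1;
    return (long)st.st_size;
}
```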
Andre Przywara [Tue, 3 Sep 2013 14:00:52 +0000 (16:00 +0200)]
pl011: preserve RTS and DTR signal on UART init
Although we do not support hardware flow control in the Xen driver
for the PL011 UART, the other end may be configured to use it.
In this case it waits in vain for the RTS signal to be asserted by
the host and will never transmit any characters.
So we leave RTS and DTR as they had been setup before.
This fixes the UART input on Calxeda Midway, which uses hardware
flow control for the serial-over-LAN functionality.
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
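A small sketch of the "preserve RTS and DTR" idea from the pl011 change above: compute the new control register value from the old one, keeping only the two modem control bits and then enabling the UART. The bit positions follow the PL011 UARTCR layout; the function name is hypothetical.

```c
#include <stdint.h>

/* PL011 UARTCR bits used here. */
#define PL011_CR_UARTEN (1u << 0)   /* UART enable */
#define PL011_CR_TXE    (1u << 8)   /* transmit enable */
#define PL011_CR_RXE    (1u << 9)   /* receive enable */
#define PL011_CR_DTR    (1u << 10)  /* data terminal ready */
#define PL011_CR_RTS    (1u << 11)  /* request to send */

/* Keep whatever RTS/DTR state was set up before, drop everything else,
 * then enable transmit, receive and the UART itself. */
static uint32_t pl011_init_cr(uint32_t old_cr)
{
    uint32_t cr = old_cr & (PL011_CR_RTS | PL011_CR_DTR);

    return cr | PL011_CR_RXE | PL011_CR_TXE | PL011_CR_UARTEN;
}
```

If firmware left RTS asserted, it stays asserted, so a peer using hardware flow control keeps transmitting.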
Tim Deegan [Thu, 29 Aug 2013 15:47:12 +0000 (16:47 +0100)]
xen/x86: don't use '.ifnes' in bug frame construction.
Spotted because it breaks the clang build for LLVM <3.2. .ifnes is
not right here as it will choke on a string with embedded quotes.
.ifnb would be better except that LLVM <3.2 doesn't support that either
(and nor does binutils 2.16).
It should be possible to use something like !!msg or !!msg[0] instead
of a separate flag, but I gave up trying to find something that would
make it through CPP, asm() and gas as a usable constant. :|
Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com>
Len Brown [Fri, 30 Aug 2013 09:00:07 +0000 (11:00 +0200)]
x86/mwait_idle: initial C8, C9, C10 support
Allow mwait_idle to utilize C8, C9, C10 when they are present on...
"Fourth Generation Intel(R) Core(TM) Processors", which are based on
Intel(R) microarchitecture code name Haswell.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Len Brown [Fri, 30 Aug 2013 08:59:09 +0000 (10:59 +0200)]
x86/mwait_idle: export both C1 and C1E
Here we disable HW promotion of C1 to C1E and export both C1 and C1E
as distinct C-states.
This allows a cpuidle governor to choose a lower latency C-state than
C1E when necessary to satisfy performance and QOS constraints -- and
still save power versus polling.
This also corrects the erroneous latency previously reported for C1E
-- it is 10usec, not 1usec.
Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Len Brown [Fri, 30 Aug 2013 08:58:21 +0000 (10:58 +0200)]
x86/mwait_idle: remove assumption of one C-state per MWAIT flag
Remove the assumption that cstate_tables are indexed by MWAIT flag
values. Each entry identifies itself via its own flags value. This
change is needed to support multiple states that share the same MWAIT
flags.
Note that this can have an effect on what state is described by 'N' on
cmdline max_cstate=N on some systems.
Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".
Drop MWAIT_MAX_NUM_CSTATES (done differently in a prior patch on
Linux).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 30 Aug 2013 08:56:07 +0000 (10:56 +0200)]
x86/xsave: initialization improvements
- properly validate available feature set on APs
- also validate xsaveopt availability on APs
- properly indicate whether the initialization is on the BSP (we
shouldn't be using "cpu == 0" checks for this)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
I realise that this patch causes a change to the public headers. However I
feel it is justified as:
* All toolstacks used to have to embed the magic string (and almost certainly
still do)
* If by some miracle a new toolstack has started using the new define, it will
continue to work.
* The only intree consumer of the define is hvmloader itself.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Fri, 30 Aug 2013 08:40:29 +0000 (10:40 +0200)]
hvmloader/smbios: Correctly count the number of tables written
Fixes regression indirectly introduced by c/s 4d23036e709627
That changeset added some smbios tables which were optional, based on the
toolstack providing appropriate xenstore keys. The do_struct() macro would
unconditionally increment nr_structs, even if a table was not actually
written.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
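One way to count only the tables that were actually written, sketched under the assumption that each writer returns a pointer just past its output and returns its input pointer unchanged when it writes nothing. The writer functions here are hypothetical stand-ins, not the hvmloader ones.

```c
#include <stddef.h>

/* Hypothetical writer that always emits one byte. */
static char *write_always(char *p)
{
    *p = 1;
    return p + 1;
}

/* Hypothetical writer for an optional table: writes nothing when the
 * toolstack did not provide the data. */
static char *write_optional(char *p, int present)
{
    if (!present)
        return p;
    *p = 2;
    return p + 1;
}

/* Count a table only if the write pointer actually advanced. */
static unsigned int write_tables(char *buf, int optional_present)
{
    char *p = buf, *q;
    unsigned int nr_structs = 0;

    q = write_always(p);
    if (q != p)
        nr_structs++;
    p = q;

    q = write_optional(p, optional_present);
    if (q != p)
        nr_structs++;

    return nr_structs;
}
```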
Jan Beulich [Thu, 29 Aug 2013 07:53:07 +0000 (09:53 +0200)]
AMD IOMMU: allow command line overrides for broken IVRS tables
With there being so many systems with broken ACPI tables, and with it
generally being known what's wrong with those tables, give people a
handle to overcome the resulting disabling of their IOMMUs.
Inspired by Linux side patches providing similar functionality.
Jan Beulich [Thu, 29 Aug 2013 07:31:37 +0000 (09:31 +0200)]
AMD IOMMU: add missing checks
For one we shouldn't accept IVHD tables specifying IO-APIC IDs beyond
the limit we support (MAX_IO_APICS, currently 128).
And then we shouldn't memset() a pointer allocation of which failed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
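The two checks described above, sketched in isolation. MAX_IO_APICS and its value come from the commit message; the helper names are hypothetical, and malloc() stands in for the hypervisor's allocator.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_IO_APICS 128

/* Reject IO-APIC IDs that would index past the supported range. */
static int ioapic_id_valid(unsigned int id)
{
    return id < MAX_IO_APICS;
}

/* Only clear the buffer if the allocation actually succeeded. */
static void *alloc_zeroed(size_t n)
{
    void *p = malloc(n);

    if (p != NULL)
        memset(p, 0, n);
    return p;
}
```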
Jan Beulich [Wed, 28 Aug 2013 15:03:50 +0000 (17:03 +0200)]
x86: AVX instruction emulation fixes
- we used the C4/C5 (first prefix) byte instead of the apparent ModR/M
one as the second prefix byte
- early decoding normalized vex.reg, thus corrupting it for the main
consumer (copy_REX_VEX()), resulting in #UD on the two-operand
instructions we emulate
Also add respective test cases to the testing utility plus
- fix get_fpu() (the fall-through order was inverted)
- add cpu_has_avx2, even if it's currently unused (as in the new test
cases I decided to refrain from using AVX2 instructions in order to
be able to actually run all the tests on the hardware I have)
- slightly tweak cpu_has_avx to more consistently express the outputs
we don't care about (sinking them all into the same variable)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
xen: update tx_ready callback for ARM serial drivers
Type of tx_ready callback got changed to int to facilitate error condition,
but the ARM serial drivers were not modified thus breaking the compilation.
Reported-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
PCI UART: better cope with UART being temporarily unavailable
This happens for example when dom0 disables ioport responses during PCI
subsystem initialisation. If a __ns16550_poll() happens to be scheduled
during that time, Xen hangs. Detect and exit that condition.
Amended ns16550_ioport_invalid function to only check IER register,
which contains 3 reserved (always 0) bits, therefore it's sufficient for
that test.
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
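A sketch of the kind of test the PCI UART change above describes. It assumes that a read from a disabled I/O port returns all ones, so an IER value of 0xff (reserved bits set) cannot come from a working UART; the function name is hypothetical.

```c
#include <stdint.h>

/* Assumption: with I/O port decoding disabled, reads return 0xff.
 * IER has reserved bits that a working UART always reports as 0,
 * so 0xff can only mean the device is not responding. */
static int ier_read_looks_invalid(uint8_t ier)
{
    return ier == 0xff;
}
```

A poll routine would read IER first and return early when this predicate is true, instead of spinning on a device that cannot answer.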
Fix inactive timer list corruption on second S3 resume
init_timer cannot be safely called multiple times on the same timer since it does memset(0)
on the structure, erasing the auxiliary member used by the linked list code. This breaks
the inactive timer list in common/timer.c.
Moved resume_timer initialisation to ns16550_init_postirq, so it's only done once.
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 28 Aug 2013 08:12:36 +0000 (10:12 +0200)]
PCI: centralize parsing of device coordinates in command line options
With yet another case to come in a subsequent patch, it seems time to
do this in a single place rather than hand crafting it in various
scattered around locations.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 28 Aug 2013 08:11:19 +0000 (10:11 +0200)]
AMD IOMMU: also allocate IRTEs for HPET MSI
Omitting this was a blatant oversight of mine in commit 2ca9fbd7 ("AMD
IOMMU: allocate IRTE entries instead of using a static mapping").
This also changes a bogus inequality check into a sensible one, even
though it is already known that this will make HPET MSI unusable on
certain systems (having respective broken firmware). This, however,
seems better than failing on systems with consistent ACPI tables.
The commit 874f76a "PL011: fix reverse logic for interrupt mask register"
introduced a regression on the Versatile Express: the board no longer received
input correctly.
The timeout interrupt may be asserted when the FIFO is not empty and no further
data is received over a 32-bit period.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
That changeset moved the watchdog functions from nmi.h to their own
watchdog.h. I thought I had updated all relevant header files and the
compiler was happy as well. However, gdbstub is not even compiled by default,
and I accidentally missed it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 27 Aug 2013 09:28:26 +0000 (11:28 +0200)]
x86/boot: Explicitly clean pcpu stacks in debug builds
This reduces confusion when looking at a hexdump of the pcpu stacks and
wondering where on earth some of the junk was coming from. Also leave some
grep fodder for finding where the BSP switches stack (because it took me
far longer to find than I care to admit to).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Matt Wilson [Tue, 27 Aug 2013 09:23:09 +0000 (11:23 +0200)]
x86/time: remove Cyclone as a platform timer
The Cyclone time source was part of IBM's Summit chipset, which was
only used for 32-bit only ccNUMA and IA-64 machines. Neither of these
are supported by Xen anymore.
Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:07:00 +0000 (15:07 +0200)]
domctl: replace cpumask_weight() uses
In one case it could easily be replaced by range checking the result of
a subsequent operation, and in general cpumask_next(), not always
needing to scan the whole bitmap, is more efficient than the specific
uses of cpumask_weight() here. (When running on big systems, operations
on CPU masks aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
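The efficiency argument above can be shown with a toy bitmap: asking "is more than one CPU set?" needs at most two early-exit scans, whereas a weight (population count) always walks the whole mask. The types and names below are hypothetical, and the bit-by-bit loop is for clarity only; a real implementation would scan word-wise.

```c
#include <stdint.h>

#define SK_NR_CPUS       256
#define SK_BITS_PER_WORD 64
#define SK_MASK_WORDS    (SK_NR_CPUS / SK_BITS_PER_WORD)

typedef struct {
    uint64_t bits[SK_MASK_WORDS];
} sk_mask_t;

/* Index of the first set bit at or after 'start', or SK_NR_CPUS if none. */
static unsigned int sk_mask_next_from(const sk_mask_t *m, unsigned int start)
{
    for (unsigned int i = start; i < SK_NR_CPUS; i++)
        if (m->bits[i / SK_BITS_PER_WORD] &
            (UINT64_C(1) << (i % SK_BITS_PER_WORD)))
            return i;
    return SK_NR_CPUS;
}

/* "weight > 1" expressed as a range check on a second lookup:
 * stops as soon as a second set bit is found. */
static int sk_mask_has_multiple(const sk_mask_t *m)
{
    unsigned int first = sk_mask_next_from(m, 0);

    if (first >= SK_NR_CPUS)
        return 0;
    return sk_mask_next_from(m, first + 1) < SK_NR_CPUS;
}
```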
Jan Beulich [Fri, 23 Aug 2013 13:06:21 +0000 (15:06 +0200)]
credit1: replace cpumask_empty() uses
In one case it was redundant with the operation it got combined with,
and in the other it could easily be replaced by range checking the
result of a subsequent operation. (When running on big systems,
operations on CPU masks aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:05:39 +0000 (15:05 +0200)]
credit2: replace cpumask_first() uses
... with cpumask_any() or cpumask_cycle().
In one case this also allows elimination of a cpumask_empty() call,
and while doing this I also spotted a redundant use of
cpumask_weight(). (When running on big systems, operations on CPU masks
aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:01:53 +0000 (15:01 +0200)]
un-alias cpumask_any() from cpumask_first()
In order to achieve more symmetric distribution of certain things,
cpumask_any() shouldn't always pick the first CPU (which frequently
will end up being CPU0). To facilitate that, introduce a library-like
function to obtain random numbers.
The per-architecture function is supposed to return zero if no valid
random number can be obtained (implying that if occasionally zero got
produced as random number, it wouldn't be considered such).
As fallback this uses the trivial algorithm from the C standard,
extended to produce "unsigned int" results.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
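A sketch of the fallback scheme described above. std_rand15() is the example generator given in the C standard; sketch_rand32() is one possible way of widening it to 32-bit results and is an assumption, not necessarily what the patch does. arch_random_stub() is a hypothetical stand-in for the per-architecture hook and always reports "no valid random number" by returning zero.

```c
#include <stdint.h>

static uint64_t lcg_state = 1;

static void sketch_srand(uint64_t seed)
{
    lcg_state = seed;
}

/* The example generator from the C standard: 15-bit results. */
static unsigned int std_rand15(void)
{
    lcg_state = lcg_state * 1103515245u + 12345u;
    return (unsigned int)(lcg_state / 65536u) % 32768u;
}

/* Assumed widening: same recurrence, but return 32 bits of the state. */
static uint32_t sketch_rand32(void)
{
    lcg_state = lcg_state * 1103515245u + 12345u;
    return (uint32_t)(lcg_state >> 16);
}

/* Hypothetical arch hook: zero means "nothing available". */
static uint32_t arch_random_stub(void)
{
    return 0;
}

/* Prefer the architecture's source, fall back to the LCG on zero. */
static uint32_t sketch_get_random(void)
{
    uint32_t val = arch_random_stub();

    return val ? val : sketch_rand32();
}
```

A cpumask_any()-style helper could then pick the (value % weight)-th set CPU rather than always the first one.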
Jan Beulich [Fri, 23 Aug 2013 07:22:08 +0000 (09:22 +0200)]
PCI: break MSI-X data out of struct pci_dev_info
Considering that a significant share of PCI devices out there (not the
least the myriad of CPU-exposed ones) don't support MSI-X at all, and
that the amount of data is well beyond a handful of bytes, break this
out of the common structure, at once allowing the actual data to be
tracked to become architecture specific.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 23 Aug 2013 07:19:29 +0000 (09:19 +0200)]
x86: move struct bug_frame instances out of line
Just like Linux did many years ago, move them into a separate (data)
section, such that they no longer pollute instruction caches and TLBs.
Assertion frames, requiring two pointers to be stored, occupy two slots
in the array, with the second slot mimicking a frame the location
pointer of which doesn't match any address within .text or .init.text
(it effectively points back to the slot itself, which - being in a data
section - can't be reached by non-buggy execution).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Fri, 19 Jul 2013 15:20:10 +0000 (16:20 +0100)]
xen: arm: retry trylock if strex fails on free lock.
This comes from the Linux patches 15e7e5c1ebf5 for arm32 and 4ecf7ccb1973 for
arm64 by Will Deacon and Catalin Marinas respectively. The Linux commit message
says:
An exclusive store instruction may fail for reasons other than lock
contention (e.g. a cache eviction during the critical section) so, in
line with other architectures using similar exclusive instructions
(alpha, mips, powerpc), retry the trylock operation if the lock appears
to be free but the strex reported failure.
I have observed this due to register_cpu_notifier containing:
if ( !spin_trylock(&cpu_add_remove_lock) )
BUG(); /* Should never fail as we are called only during boot. */
which was spuriously failing.
The ARMv8 variant is taken directly from the Linux patch. For v7 I had to
reimplement since we don't currently use ticket locks.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
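The same retry logic can be sketched portably with C11 atomics, where a weak compare-exchange is allowed to fail spuriously in the way a strex can. This is an illustration with hypothetical names, not the ARM assembly from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} sketch_lock_t;

/* Try to take the lock once. A failed weak CAS only counts as
 * "lock is held" if the observed value was non-zero; if the lock
 * looked free, the failure was spurious and we retry. */
static bool sketch_trylock(sketch_lock_t *l)
{
    int expected = 0;

    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        if (expected != 0)
            return false;   /* genuinely contended */
        /* expected is still 0: spurious failure, try again */
    }
    return true;
}

static void sketch_unlock(sketch_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

With this shape, a boot-time caller that treats trylock failure as a bug no longer trips on a spurious store failure.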
Ian Campbell [Fri, 19 Jul 2013 15:20:08 +0000 (16:20 +0100)]
xen/arm64: Assembly optimized bitops from Linux
This patch replaces the previous hashed lock implementation of bitops with
assembly optimized ones taken from Linux v3.10-rc4.
The Linux derived ASM only supports 8 byte aligned bitmaps (which under Linux
are unsigned long * rather than our void *). We do have actually uses of 4
byte alignment (i.e. the bitmaps in struct xmem_pool) which trigger alignment
faults.
Therefore adjust the assembly to work in 4 byte increments, which involved:
- bit offset now bits 4:0 => mask #31 not #63
- use wN register not xN for load/modify/store loop.
There is no need to adjust the shift used to calculate the word offset, the
difference is already accounted for in the #63->#31 change.
NB: Xen's build system cannot cope with the change from .c to .S file,
remove xen/arch/arm/arm64/lib/.bitops.o.d or clean your build tree.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
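The addressing arithmetic behind the 4-byte adjustment can be written out in plain C (non-atomic, for illustration only; the function names are hypothetical). The bit offset is nr & 31, and the byte offset of the containing word is (nr & ~31) >> 3, which is why the shift by 3 stays the same once the mask changes from 63 to 31.

```c
#include <stdint.h>

/* Locate the 32-bit word containing bit 'nr' in a bitmap that is only
 * guaranteed to be 4-byte aligned. */
static uint32_t *sketch_word_for_bit(void *addr, unsigned int nr)
{
    /* Clear the low 5 bits, then convert bits to bytes. */
    unsigned int byte_offset = (nr & ~31u) >> 3;

    return (uint32_t *)((unsigned char *)addr + byte_offset);
}

/* Non-atomic set_bit, showing only the offset computation. */
static void sketch_set_bit32(unsigned int nr, void *addr)
{
    *sketch_word_for_bit(addr, nr) |= UINT32_C(1) << (nr & 31u);
}

static int sketch_test_bit32(unsigned int nr, void *addr)
{
    return (*sketch_word_for_bit(addr, nr) >> (nr & 31u)) & 1u;
}
```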
Chen Baozi [Tue, 13 Aug 2013 11:14:24 +0000 (19:14 +0800)]
xen/arm: Add the new OMAP UART driver.
TI OMAP UART introduces some features such as register access modes, which
make its configuration and interrupt handling differ from 8250 compatible
UART. Thus, we separate this driver from ns16550's implementation.
Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Chen Baozi [Tue, 13 Aug 2013 11:14:21 +0000 (19:14 +0800)]
xen: rename ns16550-uart.h to 8250-uart.h and fix some typos
Since UARTs on OMAP5 & Allwinner's SoC are not ns16550 but only 8250
compatible, rename ns16550-uart.h to 8250-uart.h, which is a more pervasive
name. At the same time, fix some typos, which have redundant UART_
prefixes in some macros.
Andre Przywara [Thu, 22 Aug 2013 07:40:54 +0000 (09:40 +0200)]
ARM: fix const declaration of platform struct
As Julien pointed out the other day, the data type for the platform
DT name match struct is wrong.
To be really immutable, we have to use "const char * const".
Fix it on the three currently existing platforms.
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
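For reference, the difference the second const makes, shown on a hypothetical match table (the compatible string and names are made up for the example). With "const char * const", both the strings and the array slots holding the pointers are read-only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device-tree match list. The first const protects the
 * characters, the second protects the pointers in the array, so
 * sketch_board_compat[0] = "other"; would not compile. */
static const char * const sketch_board_compat[] = {
    "vendor,example-board",
    NULL
};

/* Return 1 if 'compatible' is in the list, 0 otherwise. */
static int sketch_platform_matches(const char *compatible)
{
    for (size_t i = 0; sketch_board_compat[i] != NULL; i++)
        if (strcmp(sketch_board_compat[i], compatible) == 0)
            return 1;
    return 0;
}
```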
Juergen Gross [Thu, 22 Aug 2013 09:24:00 +0000 (11:24 +0200)]
Correct X2-APIC HVM emulation
commit 6859874b61d5ddaf5289e72ed2b2157739b72ca5 ("x86/HVM: fix x2APIC
APIC_ID read emulation") introduced an error for the hvm emulation of
x2apic. Any attempt to write to APIC_ICR MSR will result in a GP fault.
Yang Zhang [Thu, 22 Aug 2013 08:59:01 +0000 (10:59 +0200)]
Nested VMX: Update APIC-v(RVI/SVI) when vmexit to L1
If APIC-v is enabled, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, an external interrupt will cause an L1 vmexit with
reason "external interrupt". L1 will then pick up the interrupt through
vmcs12. When L1 acks the interrupt, APIC-v is enabled (since L1 is
running), so the APIC-v hardware will still do the vEOI update. The problem
is that the interrupt was not delivered through the APIC-v hardware, which means
SVI/RVI/vPPR are not set, but the hardware requires them when doing the vEOI
update. The solution is that when L1 tries to pick up the interrupt
from vmcs12, the hypervisor updates SVI/RVI/vPPR so that
the following vEOI and vPPR updates work correctly.
Also, since the interrupt is delivered through vmcs12, the APIC-v hardware will
not clear vIRR, so the hypervisor needs to clear it before L1 runs.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
Yang Zhang [Thu, 22 Aug 2013 08:50:13 +0000 (10:50 +0200)]
Nested VMX: Force check ISR when L2 is running
An external interrupt is allowed to notify the CPU only when it has higher
priority than the interrupt currently in service. With APIC-v, the priority
comparison is done by hardware, and hardware will inject the interrupt into the
VCPU when it recognizes an interrupt. Currently, there is no virtual
APIC-v feature available for L1 to use, so when L2 is running, we still need
to compare the interrupt priority with ISR in the hypervisor instead of via hardware.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
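A simplified sketch of the software priority comparison mentioned above. It treats the ISR as a 256-bit bitmap and compares priority classes (the upper four bits of the vector); the task priority register is ignored here, and all names are hypothetical.

```c
#include <stdint.h>

/* Highest vector set in a 256-bit bitmap (8 x 32-bit words), or -1. */
static int sketch_highest_vector(const uint32_t bitmap[8])
{
    for (int word = 7; word >= 0; word--)
        for (int bit = 31; bit >= 0; bit--)
            if (bitmap[word] & (UINT32_C(1) << bit))
                return word * 32 + bit;
    return -1;
}

/* A pending vector may be delivered only if its priority class is
 * strictly higher than that of the highest in-service vector. */
static int sketch_can_deliver(unsigned int vector, const uint32_t isr[8])
{
    int in_service = sketch_highest_vector(isr);

    if (in_service < 0)
        return 1;   /* nothing in service */
    return (vector >> 4) > ((unsigned int)in_service >> 4);
}
```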
Ian Campbell [Wed, 15 May 2013 13:47:32 +0000 (14:47 +0100)]
tools: allow user to specify a system qemu-xen binary
If this option is given don't bother building qemu-xen ourselves. Likely to be
handy for distros who have an existing qemu package which they want to reuse.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 6 Aug 2013 10:32:32 +0000 (11:32 +0100)]
tools: Make qemu-xen-traditional build optional.
Now that we have upstream qemu people may want to avoid building this extra
code.
There is a little bit of trickery in stubdom/configure.ac to ensure that the
ioemu stubdom is only built if qemu-traditional is enabled.
libxl will return an error if a caller tries to build a domain using
qemu-xen-traditional when this support was disabled at build time. Since
qemu-xen-traditional has been historically tightly bound to the Xen releases I
don't see any value in supporting "3rd party" provision of
qemu-xen-traditional.
We also do not want/need this on ARM therefore default is on for x86 and off
otherwise.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- trivial conflicts in Tools.mk.in and tools/configure.ac.
Reran autogen.sh ]
Andre Przywara [Tue, 13 Aug 2013 15:12:35 +0000 (17:12 +0200)]
PL011: fix reverse logic for interrupt mask register
The PL011 IMSC register description is somewhat fuzzy in the
documentation; by comparing it with the Linux implementation one can
see that the logic is actually reversed relative to Xen's implementation:
A "0" in a field means interrupt disabled, a "1" enables it.
Therefore we enabled all interrupts instead of disabling them in the
beginning and later on masked the wrong interrupts.
Unclear how this worked on the Versatile Express, but this fix is
needed to get Calxeda Midway running (and works on VExpress, too).
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
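The polarity described above, as a short sketch: writing 0 to IMSC disables every interrupt source, and setting a bit enables that source. The bit positions follow the PL011 UARTIMSC layout; the helper names are hypothetical.

```c
#include <stdint.h>

/* PL011 UARTIMSC bits used here. */
#define PL011_IMSC_RXIM (1u << 4)   /* receive interrupt */
#define PL011_IMSC_TXIM (1u << 5)   /* transmit interrupt */
#define PL011_IMSC_RTIM (1u << 6)   /* receive timeout interrupt */

/* All bits clear means all interrupts are disabled. */
static uint32_t imsc_all_disabled(void)
{
    return 0;
}

/* A set bit enables the corresponding interrupt. */
static uint32_t imsc_enable(uint32_t imsc, uint32_t sources)
{
    return imsc | sources;
}

/* Clearing a bit disables the corresponding interrupt. */
static uint32_t imsc_disable(uint32_t imsc, uint32_t sources)
{
    return imsc & ~sources;
}
```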
Jan Beulich [Wed, 21 Aug 2013 06:38:40 +0000 (08:38 +0200)]
ACPI: fix acpi_os_map_memory()
Its use of map_domain_page() was entirely wrong. Use __acpi_map_table()
instead for the time being, with locking added as the mappings it
produces get replaced with subsequent invocations. Using locking in
this way is acceptable here since the only two runtime callers are
acpi_os_{read,write}_memory(), which don't leave mappings pending upon
returning to their callers.
Also fix __acpi_map_table()'s first parameter's type - while benign for
unstable, backports to pre-4.3 trees will need this.
Joby Poriyath [Tue, 20 Aug 2013 15:04:21 +0000 (17:04 +0200)]
interrupts: allow guest to set/clear MSI-X mask bit
A guest needs the ability to enable and disable MSI-X interrupts
by setting the MSI-X control bit, for a passed-through device.
The guest is allowed to write the MSI-X mask bit only if Xen *thinks*
that the mask is clear (interrupts enabled). If the mask is set by
Xen (interrupts disabled), writes to the mask bit by the guest are
ignored.
Currently, a write to the MSI-X mask bit by the guest is silently
ignored.
A likely scenario is where we have an 82599 SR-IOV NIC passed
through to a guest. From the guest if you do
ifconfig <ETH_DEV> down
ifconfig <ETH_DEV> up
the interrupts remain masked. On VF reset, the mask bit is set
by the controller. At this point, Xen is not aware that mask is set.
However, interrupts are enabled by VF driver by clearing the mask
bit by writing directly to BAR3 region containing the MSI-X table.
From dom0, we can verify that
interrupts are being masked using 'xl debug-keys M'.
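The policy stated above (honour a guest write to the mask bit only while Xen's own view is "unmasked") can be sketched as follows. The struct and function names are hypothetical; only the position of the mask bit (bit 0 of the vector control word of an MSI-X table entry) is taken from the PCI specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the MSI-X vector control word is the per-vector mask bit. */
#define SK_MSIX_VECTOR_CTRL_MASK 0x1u

struct sketch_msix_entry {
    bool host_masked;       /* hypervisor's own view: has it masked the vector? */
    uint32_t vector_ctrl;   /* value that would be written to the device */
};

/* Apply a guest write to vector control. Returns true if the write took
 * effect, false if it was ignored because the hypervisor holds the mask. */
static bool sketch_guest_write_ctrl(struct sketch_msix_entry *e, uint32_t val)
{
    if (e->host_masked)
        return false;

    e->vector_ctrl = (e->vector_ctrl & ~SK_MSIX_VECTOR_CTRL_MASK) |
                     (val & SK_MSIX_VECTOR_CTRL_MASK);
    return true;
}
```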
Ian Campbell [Thu, 8 Aug 2013 12:15:17 +0000 (13:15 +0100)]
xen: arm: Use a direct mapping of RAM on arm64
We have plenty of virtual address space so we can avoid needing to map and
unmap pages all the time.
A totally arbitrarily chosen 32GB frame table leads to support for 5TB of RAM.
I haven't tested with anything near that amount of RAM though. There is plenty
of room to expand further when that becomes necessary.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell [Thu, 8 Aug 2013 12:15:15 +0000 (13:15 +0100)]
xen: arm: allow virt_to_maddr to take either a pointer or an integer
This seems to be expected by common code which passes both pointers and
unsigned long as virtual addresses. The latter case in particular is in
init_node_heap() under a DIRECTMAP_VIRT_END #ifdef, which is why it hasn't
affected us yet (but will in a subsequent patch).
The new prototypes match the x86 versions apart from using vaddr_t instead of
unsigned long.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
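One way to let a single virt_to_maddr-style name accept both pointers and integers is a macro that casts its argument before calling a typed helper. The sketch below uses hypothetical names, and the fixed-offset translation is purely a stand-in for the example, not how the ARM port translates addresses.

```c
#include <stdint.h>

typedef uint64_t sk_vaddr_t;
typedef uint64_t sk_paddr_t;

/* Hypothetical fixed offset, only to make the example computable. */
#define SK_DIRECTMAP_OFFSET UINT64_C(0x1000)

static inline sk_paddr_t sk__virt_to_maddr(sk_vaddr_t va)
{
    return va - SK_DIRECTMAP_OFFSET;
}

/* The cast through uintptr_t makes both "void *" and "unsigned long"
 * arguments acceptable to the same macro. */
#define sk_virt_to_maddr(va) sk__virt_to_maddr((sk_vaddr_t)(uintptr_t)(va))
```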
Ian Campbell [Tue, 23 Jul 2013 17:12:26 +0000 (18:12 +0100)]
xen: arm: reduce the size of the xen heap to max 1/8 RAM size
When building a 1GB dom0 on a system with 2GB RAM we are running out of domheap
pages, while there are still plenty of xenheap pages spare.
I would have sworn that when the domheap was exhausted we would fall back to
allocating xenheap pages but this doesn't appear to be the case. It's possible
that we have setup something incorrectly on ARM but alloc_domheap_pages pretty
clearly tries to allocate memory from MEMZONE_XEN+1..zone_hi.
Without the fallback from domheap to xenheap taking 1GB of any system with >1GB
of RAM for xenheap is excessive so instead set a limit of 1/8 of the total
amount of RAM. By way of comparison x86_32 used to have a static 12MB xenheap
(which also included .text etc) and in theory supported up to 16GB RAM, by that
measure 1/8 is plenty.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
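The sizing rule above, as arithmetic. This sketch assumes the previous 1GB figure remains as an upper bound and ignores any rounding or minimum size the real code may apply; the names are hypothetical.

```c
#include <stdint.h>

#define SK_GB(x) ((uint64_t)(x) << 30)

/* xenheap size: at most 1/8 of RAM, and (assumed) never more than 1GB. */
static uint64_t sketch_xenheap_bytes(uint64_t ram_bytes)
{
    uint64_t limit = ram_bytes / 8;

    return limit < SK_GB(1) ? limit : SK_GB(1);
}
```

For the 2GB system in the commit message this gives a 256MB xenheap, leaving 1.75GB for the domheap instead of 1GB.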
Ian Campbell [Fri, 19 Jul 2013 11:51:11 +0000 (12:51 +0100)]
xen: remove evtchn_upcall_mask from interface on ARM
On ARM event-channel upcalls are masked using the hardware's interrupt mask
bit and not by a software bit.
Leaving this field present in the interface has caused some confusion already
and is liable to mean it gets inadvertently used in the future. So arrange for
this field to be turned into a padding field on ARM by introducing a
XEN_HAVE_PV_UPCALL_MASK define.
This bit is also unused for x86 PV-on-HVM guests, but we can't realistically
distinguish those from x86 PV guests in the headers.
Add a per-arch vcpu_event_delivery_is_enabled function to replace an open
coded use of evtchn_upcall_mask in common code (in a debug keyhandler). The
existing local_event_delivery_is_enabled, which operates only on current, was
unimplemented on ARM and unused on x86, so remove it.
ifdef the use of evtchn_upcall_mask when setting up a new vcpu info page.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
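A cut-down illustration of turning the field into padding. XEN_HAVE_PV_UPCALL_MASK is the define named in the commit message; the struct here is simplified and renamed, so the other members should be read as placeholders. Either way the layout stays the same, so the ABI is unaffected.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_vcpu_info {
    uint8_t evtchn_upcall_pending;
#ifdef XEN_HAVE_PV_UPCALL_MASK
    uint8_t evtchn_upcall_mask;   /* only meaningful where a software mask exists */
#else
    uint8_t pad0;                 /* same offset and size, but no usable name */
#endif
    uint64_t evtchn_pending_sel;  /* placeholder for the remaining fields */
};
```

Code that still refers to evtchn_upcall_mask on an architecture without the define then fails to compile rather than silently using a meaningless field.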
Ian Campbell [Fri, 19 Jul 2013 11:51:09 +0000 (12:51 +0100)]
xen: arm: include public/xen.h in foreign interface checking
mkheader.py doesn't cope with
struct foo { };
so add a newline.
Define unsigned long and long to a non-existent type on ARM so as to catch
their use.
Teach mkheader.py to cope with structs which are ifdef'd. This cannot cope
with #defines between the #ifdef and the struct definitions, so move
MAX_GUEST_CMDLINE to be next to its only usage.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 19 Jul 2013 11:51:08 +0000 (12:51 +0100)]
xen: only expose start_info on architectures which have a PV boot path
Most of this struct is PV MMU specific and it is not used on ARM at all.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 19 Jul 2013 11:51:07 +0000 (12:51 +0100)]
xen/compat: support XEN_HAVE_FOO ifdefs in public interface
This allows us expose or hide interface features on different architectures
without requiring nasty arch-specific ifdeffery.
Preserves any #ifdef with a XEN_HAVE_* symbol name, as well as any #else or
The ifdef symbol becomes COMPAT_HAVE in the compat versions so that
architectures can enable or disable interfaces for compat mode too. (This
actually just fell out of the way the existing stuff works and it didn't seem
worth jumping through hoops to make the name remain XEN_HAVE).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 31 Jul 2013 15:15:57 +0000 (16:15 +0100)]
tools: drop 'sv'
I'm not even sure what this thing is. Looks like some sort of Twisted Python
based frontend to xend.
Whatever it is I am perfectly sure no one can be using it. Apart from drive by
build fixes caused by updates elsewhere it has seen no real development since
2005. I suspect it was never even finished/usable.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 31 Jul 2013 15:15:56 +0000 (16:15 +0100)]
tools: disable blktap1 build by default
I don't think there are any dom0's around whose kernels support only blktap1
and not something newer like blktap2 or qdisk. Certainly not that you would
want to run Xen 4.4 on.
libxl will never use blktap1.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 31 Jul 2013 15:15:54 +0000 (16:15 +0100)]
tools: remove lomount
Build was disabled by default in 2008 (9bb7f7e2aca49). As noted at the time
people should be using kpartx these days instead.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 31 Jul 2013 15:15:53 +0000 (16:15 +0100)]
tools: remove miniterm
It has been disabled by default since 2008 (9bb7f7e2aca4). Back then Ian J
asserted it was useful to keep them in the tree in source form. I don't think
this is true anymore.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 31 Jul 2013 15:15:52 +0000 (16:15 +0100)]
tools: delete xsview
This was apparently a Qt xenstore viewer. It hasn't been touched since it was
first committed in 2007 and I can't believe anyone is actually using it even if
it still happens to work.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>