xenbits.xensource.com Git - xen.git/log
xen.git
12 years ago x86: introduce MWAIT-based, ACPI-less CPU idle driver
Jan Beulich [Fri, 21 Sep 2012 11:47:18 +0000 (13:47 +0200)]
x86: introduce MWAIT-based, ACPI-less CPU idle driver

This is a port of Linux's intel-idle driver serving the same purpose.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago cpuidle: remove unused latency_ticks member
Jan Beulich [Fri, 21 Sep 2012 11:45:08 +0000 (13:45 +0200)]
cpuidle: remove unused latency_ticks member

... and code used only for initializing it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago introduce guest_handle_for_field()
Jan Beulich [Thu, 20 Sep 2012 11:31:19 +0000 (13:31 +0200)]
introduce guest_handle_for_field()

This helper turns a field of a GUEST_HANDLE into a GUEST_HANDLE.
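
The idea can be sketched with a simplified stand-in for a typed handle (a
wrapped pointer). The names DEFINE_HANDLE, struct demo_op and
handle_for_field are hypothetical and only illustrate the technique; this is
not the definition added by the patch.

```c
#include <stddef.h>

/* Simplified stand-in for a typed guest handle: just a wrapped pointer. */
#define DEFINE_HANDLE(name, type) typedef struct { type *p; } name##_handle_t

/* Hypothetical structure a guest might pass in. */
struct demo_op {
    int cmd;
    unsigned long value;
};

DEFINE_HANDLE(demo_op, struct demo_op);
DEFINE_HANDLE(ulong, unsigned long);

/*
 * Build a handle of type htype that refers to field fld of the object
 * the handle hnd refers to. Uses a C99 compound literal.
 */
#define handle_for_field(hnd, htype, fld) ((htype){ &(hnd).p->fld })
```

With a demo_op handle h, handle_for_field(h, ulong_handle_t, value) yields a
handle whose pointer is the address of h.p->value.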

Signed-off-by: Jan Beulich <jbeulich@suse.com>
12 years ago ACPI: move tables.c fully into .init.*
Jan Beulich [Thu, 20 Sep 2012 07:22:55 +0000 (09:22 +0200)]
ACPI: move tables.c fully into .init.*

The only non-init item was the space reserved for the initial tables,
but we can as well dynamically allocate that array.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86: tighten checks in XEN_DOMCTL_memory_mapping handler
Jan Beulich [Thu, 20 Sep 2012 07:21:53 +0000 (09:21 +0200)]
x86: tighten checks in XEN_DOMCTL_memory_mapping handler

Properly checking the MFN implies knowing the physical address width
supported by the platform, so to obtain this consistently the
respective code gets moved out of the MTRR subdir.
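
A minimal sketch of why the address width matters for such a check, assuming
4 KiB pages. The function name, the DEMO_PAGE_SHIFT constant and the choice to
reject an empty range are assumptions of this example, not the handler's
actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12u /* assumed 4 KiB pages */

/*
 * True if the nr_mfns frames starting at mfn all lie below the limit
 * implied by a physical address width of paddr_bits.
 */
static bool mfn_range_valid(uint64_t mfn, uint64_t nr_mfns,
                            unsigned int paddr_bits)
{
    uint64_t limit, last;

    if (nr_mfns == 0 || paddr_bits <= DEMO_PAGE_SHIFT || paddr_bits > 63)
        return false;

    /* Number of addressable frames for this address width. */
    limit = UINT64_C(1) << (paddr_bits - DEMO_PAGE_SHIFT);

    last = mfn + nr_mfns - 1;
    if (last < mfn) /* the range wrapped around */
        return false;

    return last < limit;
}
```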

Btw., the model specific workaround in that code is likely unnecessary
- I believe those CPU models don't support 64-bit mode. But I wasn't
able to formally verify this, so I preferred to retain that code for
now.

But the domctl code here was also lacking other error checks (as was,
looking at it again from that angle, the XEN_DOMCTL_ioport_mapping one).
Besides adding the missing checks, printing is also added for the case
where revoking access permissions didn't work (as that may have
implications for the host operator, e.g. wanting to not pass through
affected devices to another guest until the one previously using them
did actually die).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86/IO-APIC: streamline level ack/end handling
Jan Beulich [Thu, 20 Sep 2012 07:20:30 +0000 (09:20 +0200)]
x86/IO-APIC: streamline level ack/end handling

Rather than evaluating "ioapic_ack_new" on each invocation, and
considering that the two methods really have almost no code in common,
split the handlers.

While at it, also move ioapic_ack_{new,forced} into .init.data
(eliminating the single non-__init reference to the former).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago tmem: bump pool version to 1 to fix restore issue when tmem enabled
Zhenzhong Duan [Wed, 19 Sep 2012 15:38:47 +0000 (17:38 +0200)]
tmem: bump pool version to 1 to fix restore issue when tmem enabled

Restore fails when tmem is enabled both in hypervisor and guest. This
is due to spec version mismatch when restoring a pool.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years ago x86: remove open-coded IO-APIC RTE reads/writes
Jan Beulich [Wed, 19 Sep 2012 07:30:50 +0000 (09:30 +0200)]
x86: remove open-coded IO-APIC RTE reads/writes

This improves readability, not least by doing away with a couple of
ugly casts.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86: properly check XEN_DOMCTL_ioport_mapping arguments for invalid range
Jan Beulich [Wed, 19 Sep 2012 07:27:55 +0000 (09:27 +0200)]
x86: properly check XEN_DOMCTL_ioport_mapping arguments for invalid range

In particular, the case of "np" being a very large value wasn't handled
correctly. The range start checks also were off by one (except that in
practice, when "np" is properly range checked, this would still have
been caught by the range end checks).
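
The two pitfalls named above (a huge count and an off-by-one on the start)
can be illustrated with a wrap-safe check. This is only a sketch: the function
name, DEMO_MAX_IOPORTS and the rejection of a zero count are assumptions, not
the patched handler.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_IOPORTS 0x10000u /* x86 I/O port space: 0x0000..0xffff */

/*
 * True if [first_port, first_port + nr_ports) lies inside the port space.
 * The count is checked first, so the subtraction below cannot underflow
 * and no sum that could wrap is ever formed.
 */
static bool ioport_range_valid(uint32_t first_port, uint32_t nr_ports)
{
    if (nr_ports == 0 || nr_ports > DEMO_MAX_IOPORTS)
        return false;

    return first_port <= DEMO_MAX_IOPORTS - nr_ports;
}
```

A naive test such as first_port + nr_ports <= limit would accept
first_port = 1 with nr_ports = 0xffffffff, because the 32-bit sum wraps to 0.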

Also, is a GFN wrap in XEN_DOMCTL_memory_mapping really okay?

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86/ACPI: fix error indication from acpi_parse_madt_lapic_entries()
Jan Beulich [Wed, 19 Sep 2012 07:26:26 +0000 (09:26 +0200)]
x86/ACPI: fix error indication from acpi_parse_madt_lapic_entries()

If the legacy APIC invocation of acpi_table_parse_madt() succeeds but
the x2APIC counterpart fails, this is regarded as failure by the
function, yet its return value would indicate success.
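
The pattern at issue, sketched with hypothetical stand-ins for the two table
walks (walk_ok, walk_fail and parse_lapic_entries are names invented for this
example; the -ENODEV case is likewise an assumption):

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the two table walks: each returns the number
 * of entries found, or a negative errno value on failure.
 */
typedef int (*madt_walk_fn)(void);

static int walk_ok(void)   { return 2; }
static int walk_fail(void) { return -EINVAL; }

static int parse_lapic_entries(madt_walk_fn walk_x2apic,
                               madt_walk_fn walk_lapic)
{
    int x2count = walk_x2apic();
    int count = walk_lapic();

    /* Report the failing walk's code, not that of whichever ran last. */
    if (count < 0 || x2count < 0)
        return count < 0 ? count : x2count;

    /* In this sketch, finding no entries at all is an error as well. */
    if (count == 0 && x2count == 0)
        return -ENODEV;

    return 0;
}
```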

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago xsm/flask: add domain relabel support
Daniel De Graaf [Mon, 17 Sep 2012 20:12:21 +0000 (21:12 +0100)]
xsm/flask: add domain relabel support

This adds the ability to change a domain's XSM label after creation.
The new label will be used for all future access checks; however,
existing event channels and memory mappings will remain valid even if
their creation would be denied by the new label.

With appropriate security policy and hooks in the domain builder, this
can be used to create domains that the domain builder does not have
access to after building. It can also be used to allow a domain to
drop privileges - for example, prior to launching a user-supplied
kernel loaded by a pv-grub stubdom.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago xsm/flask: remove unneeded create_sid field
Daniel De Graaf [Mon, 17 Sep 2012 20:10:39 +0000 (21:10 +0100)]
xsm/flask: remove unneeded create_sid field

This field was only used to populate the ssid of dom0, which can be
handled explicitly in the domain creation hook. This also removes the
unnecessary permission check on the creation of dom0.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago xsm/flask: remove inherited class attributes
Daniel De Graaf [Mon, 17 Sep 2012 20:10:07 +0000 (21:10 +0100)]
xsm/flask: remove inherited class attributes

The ability to declare common permission blocks shared across multiple
classes is not currently used in Xen. Currently, support for this
feature is broken in the header generation scripts, and it is not
expected that this feature will be used in the future, so remove the
dead code.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago xen: add virtual x2apic support for apicv
Jiongxi Li [Mon, 17 Sep 2012 20:06:02 +0000 (21:06 +0100)]
xen: add virtual x2apic support for apicv

Basically, to benefit from apicv, we need to clear the MSR bitmap for
the corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt
    delivery
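
A toy model of the bitmap update, assuming one bit per MSR with a set bit
meaning "intercept". The flat layout, the struct and the helper names are
simplifications made for this sketch and do not reproduce the hardware
bitmap format or Xen's helpers; the MSR indices follow the x2APIC numbering
(0x800 plus the register offset divided by 16).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MSR_LIMIT            0x2000u /* MSRs covered by this toy bitmap */
#define X2APIC_MSR_FIRST     0x800u
#define X2APIC_MSR_LAST      0x8ffu
#define X2APIC_MSR_TPR       0x808u
#define X2APIC_MSR_EOI       0x80bu
#define X2APIC_MSR_SELF_IPI  0x83fu

struct msr_intercepts {
    uint8_t read[MSR_LIMIT / 8];  /* bit set: reads are intercepted */
    uint8_t write[MSR_LIMIT / 8]; /* bit set: writes are intercepted */
};

static void clear_bit8(uint8_t *map, unsigned int msr)
{
    map[msr / 8] &= (uint8_t)~(1u << (msr % 8));
}

static bool test_bit8(const uint8_t *map, unsigned int msr)
{
    return (map[msr / 8] >> (msr % 8)) & 1u;
}

/* Start from the conservative state: intercept every access. */
static void intercept_everything(struct msr_intercepts *m)
{
    memset(m, 0xff, sizeof(*m));
}

/* Drop the intercepts listed in the message above. */
static void allow_virtual_x2apic(struct msr_intercepts *m)
{
    unsigned int msr;

    for (msr = X2APIC_MSR_FIRST; msr <= X2APIC_MSR_LAST; msr++)
        clear_bit8(m->read, msr);

    clear_bit8(m->write, X2APIC_MSR_TPR);
    clear_bit8(m->write, X2APIC_MSR_EOI);
    clear_bit8(m->write, X2APIC_MSR_SELF_IPI);
}
```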

Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago xen: enable Virtual-interrupt delivery
Jiongxi Li [Mon, 17 Sep 2012 20:05:11 +0000 (21:05 +0100)]
xen: enable Virtual-interrupt delivery

Virtual interrupt delivery saves Xen from injecting vAPIC interrupts
manually; injection is fully taken care of by the hardware. This needs
some special awareness in the existing interrupt injection path:
 - For a pending interrupt from the vLAPIC, instead of direct injection,
   we may need to update architecture-specific indicators before
   resuming to the guest.
 - Before returning to the guest, RVI should be updated if there are any
   pending IRRs.
 - The EOI exit bitmap controls whether an EOI write should cause a
   VM-Exit. If set, a trap-like induced EOI VM-Exit is triggered. The
   approach here is to manipulate the EOI exit bitmap based on the value
   of TMR. A level-triggered irq requires a hook in the vLAPIC EOI
   write, so that the vIOAPIC EOI is triggered and emulated.

Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago xen: enable APIC-Register Virtualization
Jiongxi Li [Mon, 17 Sep 2012 20:04:08 +0000 (21:04 +0100)]
xen: enable APIC-Register Virtualization

Add APIC register virtualization support
 - APIC read doesn't cause VM-Exit
 - APIC write becomes trap-like

Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
12 years ago MCE: use new common mce handler on AMD CPUs
Christoph Egger [Mon, 17 Sep 2012 16:57:24 +0000 (17:57 +0100)]
MCE: use new common mce handler on AMD CPUs

Factor common machine check handler out of intel specific code
and move it into common files.
Replace old common mce handler with new one and use it on AMD CPUs.
No functional changes on Intel side.
While here fix some whitespace nits and comments.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago mem_event: fix regression affecting CR3, CR4 memory events
Steven Maresca [Mon, 17 Sep 2012 16:55:12 +0000 (17:55 +0100)]
mem_event: fix regression affecting CR3, CR4 memory events

This is a patch repairing a regression in code previously functional
in 4.1.x. It appears that, during some refactoring work, calls to
hvm_memory_event_cr3 and hvm_memory_event_cr4 were lost.

These functions were originally called in mov_to_cr() of vmx.c, but
the commit  http://xenbits.xen.org/hg/xen-unstable.hg/rev/1276926e3795
abstracted the original code into generic functions up a level in
hvm.c, dropping these calls in the process.

Signed-off-by: Steven Maresca <steve@zentific.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago Extra check in grant table code for mapping of shared frame
Andres Lagar-Cavilla [Mon, 17 Sep 2012 16:51:57 +0000 (17:51 +0100)]
Extra check in grant table code for mapping of shared frame

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago tools: drop ia64 only foreign structs from headers
Ian Campbell [Mon, 17 Sep 2012 10:17:05 +0000 (11:17 +0100)]
tools: drop ia64 only foreign structs from headers

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago .*ignore: drop ia64 entries
Ian Campbell [Mon, 17 Sep 2012 10:17:04 +0000 (11:17 +0100)]
.*ignore: drop ia64 entries

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago Fix libxenstore memory leak when USE_PTHREAD is not defined
Andres Lagar-Cavilla [Mon, 17 Sep 2012 10:17:03 +0000 (11:17 +0100)]
Fix libxenstore memory leak when USE_PTHREAD is not defined

Redefine usage of pthread_cleanup_push and _pop, to explicitly call free for
heap objects in error paths.

By the way, set a suitable errno value for an error path that had none.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xl: Remove global domid and enable -Wshadow
Ian Campbell [Mon, 17 Sep 2012 10:17:02 +0000 (11:17 +0100)]
xl: Remove global domid and enable -Wshadow

Lots of functions loop over a list of domains and others take a domid as
a parameter, shadowing the global one and leading to all sorts of
confusion.

Therefore remove the global domid and explicitly pass it around as
necessary.

Adds a domid to the parameters for many functions and switches many
others from taking a char * domain specifier to taking a domid, pushing
the domid lookup to the toplevel.

Replaces some open-coded domain_qualifier_to_domid error checking with
find_domain.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- annotate find_domain() with warn_unused_result and fix the
         handful of errors. ]
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xl: prepare to enable Wshadow
Ian Campbell [Mon, 17 Sep 2012 10:17:01 +0000 (11:17 +0100)]
xl: prepare to enable Wshadow

Takes care of everything other than the global domid clashes.

Avoid shadowing global functions:
  - stime(2)
  - time(2)

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: Enable -Wshadow.
Ian Campbell [Mon, 17 Sep 2012 10:17:00 +0000 (11:17 +0100)]
libxl: Enable -Wshadow.

It was convenient to invent $(CFLAGS_LIBXL) to do this.

Various renamings to avoid shadowing standard functions:
  - index(3)
  - listen(2)
  - link(2)
  - abort(3)
  - abs(3)

Reduced the scope of some variables to avoid conflicts.

Change to libxc is due to the nested hypercall buf macros in
set_xen_guest_handle (used in libxl) using the same local private vars.

Build tested only.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xl: free libxl context, logger and lockfile using atexit handler
Ian Campbell [Mon, 17 Sep 2012 10:16:59 +0000 (11:16 +0100)]
xl: free libxl context, logger and lockfile using atexit handler

xl frequently just calls exit(3), especially on error. Try to clean
up some of our global state to make tools like valgrind more useful.
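
The general technique, sketched with a single hypothetical piece of global
state (lockfile_path, init_globals and release_globals are illustrative names,
not xl's):

```c
#include <stdlib.h>
#include <string.h>

static char *lockfile_path; /* hypothetical piece of global state */

/* Safe to call more than once: free(NULL) is a no-op. */
static void release_globals(void)
{
    free(lockfile_path);
    lockfile_path = NULL;
}

/* Allocate the state and arrange for it to be released at exit(3). */
static int init_globals(const char *path)
{
    size_t len = strlen(path) + 1;

    lockfile_path = malloc(len);
    if (lockfile_path == NULL)
        return -1;
    memcpy(lockfile_path, path, len);

    if (atexit(release_globals) != 0) {
        release_globals();
        return -1;
    }
    return 0;
}
```

Because the handler runs on every exit(3), error paths that bail out early no
longer show up as leaks of this state under a leak checker.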

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xenpm: make argument parsing and error handling more consistent
Jan Beulich [Mon, 17 Sep 2012 08:09:59 +0000 (10:09 +0200)]
xenpm: make argument parsing and error handling more consistent

Specifically, what values are or aren't accepted as CPU identifier, and
how the values get interpreted should be consistent across sub-commands
(intended behavior now: non-negative values are okay, and along with
omitting the argument, specifying "all" will also be accepted).
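
One way such a shared parsing rule could look. This is a sketch only:
parse_cpuid and DEMO_CPU_ALL are hypothetical names, and the exact set of
spellings xenpm accepts is not taken from the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_CPU_ALL (-1) /* hypothetical marker for "all CPUs" */

/*
 * Parse a CPU identifier. A missing argument (NULL) or "all" selects all
 * CPUs; otherwise only a non-negative decimal number is accepted.
 * Returns 0 on success, -1 on invalid input.
 */
static int parse_cpuid(const char *arg, int *cpuid)
{
    char *end;
    long v;

    if (arg == NULL || strcmp(arg, "all") == 0) {
        *cpuid = DEMO_CPU_ALL;
        return 0;
    }

    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || v < 0 || v > INT_MAX)
        return -1;

    *cpuid = (int)v;
    return 0;
}
```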

For error handling, error messages should get consistently issued to
stderr, and the tool should now (hopefully) produce an exit code of
zero only in the (partial) success case (there may still be a small
number of questionable cases).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago hvmloader: Do not zero the wallclock fields in shared-info.
Keir Fraser [Fri, 14 Sep 2012 18:47:57 +0000 (19:47 +0100)]
hvmloader: Do not zero the wallclock fields in shared-info.

These fields need to be valid at all times. Hypervisor ensures this
even across 32/64-bit guest transitions.

This fixes a bug where wallclock time is incorrect for booting 32-bit
HVM guests.

This should be backported to Xen 4.1 and 4.2.

Signed-off-by: Keir Fraser <keir@xen.org>
Tested-and-Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years ago x86/hvm: mark save/restore registration code __init
Jan Beulich [Fri, 14 Sep 2012 12:30:23 +0000 (14:30 +0200)]
x86/hvm: mark save/restore registration code __init

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86/hvm: constify static data where possible
Jan Beulich [Fri, 14 Sep 2012 12:28:59 +0000 (14:28 +0200)]
x86/hvm: constify static data where possible

In a few cases this also extends to making them static in the first
place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86/hvm: don't use indirect calls without need
Jan Beulich [Fri, 14 Sep 2012 12:25:22 +0000 (14:25 +0200)]
x86/hvm: don't use indirect calls without need

Direct calls perform better, so we should prefer them and use indirect
ones only when there indeed is a need for indirection.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago VT-d: use msi_compose_msg()
Jan Beulich [Fri, 14 Sep 2012 12:20:08 +0000 (14:20 +0200)]
VT-d: use msi_compose_msg()

... instead of open coding it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago amd iommu: use base platform MSI implementation
Jan Beulich [Fri, 14 Sep 2012 12:17:26 +0000 (14:17 +0200)]
amd iommu: use base platform MSI implementation

Given that here, other than for VT-d, the MSI interface gets surfaced
through a normal PCI device, the code should use as much as possible of
the "normal" MSI support code.

Further, the code can (and should) follow the "normal" MSI code in
distinguishing the maskable and non-maskable cases at the IRQ
controller level rather than checking the respective flag in the
individual actors.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Wang <wei.wang2@amd.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago libxl: Tolerate xl config files missing trailing newline
Ian Jackson [Fri, 14 Sep 2012 09:25:15 +0000 (10:25 +0100)]
libxl: Tolerate xl config files missing trailing newline

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: Fix missing dependency in api check rule
Ian Jackson [Fri, 14 Sep 2012 09:02:52 +0000 (10:02 +0100)]
libxl: Fix missing dependency in api check rule

Without this, the api check cpp run might happen before the various
autogenerated files which are #included by libxl.h are ready.

We need to remove the api-ok file from AUTOINCS to avoid a circular
dependency.  Instead, we list it explicitly as a dependency of the
object files.  The result is that the api check is the last thing to
be done before make considers the preparation done and can start work
on compiling .c files into .o's.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago docs: flesh out xl.cfg documentation, correct typos, reorganize
Matt Wilson [Fri, 14 Sep 2012 09:02:51 +0000 (10:02 +0100)]
docs: flesh out xl.cfg documentation, correct typos, reorganize

Some highlights:
 * Correct some markup errors:
       Around line 663:
           '=item' outside of any '=over'
       Around line 671:
           You forgot a '=back' before '=head3'
 * Add documentation for msitranslate, power_mgmt, acpi_s3, acpi_s4,
   gfx_passthru, nomigrate, etc.
 * Reorganize items in "unclassified" sections like cpuid,
   gfx_passthru to where they belong
 * Correct link L<> references so they can be resolved within the
   document
 * Remove placeholders for deprecated options device_model and vif2
 * Remove placeholder for "sched" and "node", as these are options for
   cpupool configuration. Perhaps cpupool configuration deserves
   a section in this document.
 * Rename "global" options to "general"
 * Add section headers to group general VM options.

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xentop.c: Change curses painting behavior to avoid flicker
Jason McCarver [Fri, 14 Sep 2012 09:02:51 +0000 (10:02 +0100)]
xentop.c: Change curses painting behavior to avoid flicker

Currently, xentop calls clear() before drawing the screen and calling
refresh().  This causes the entire screen to be repainted from scratch
on each call to refresh().  It is inefficient and causes visible flicker
when using xentop.

This patch fixes this by calling erase() instead of clear(); erase() overwrites
the current screen with blanks.  The screen is then drawn as usual
in the top() function and refresh() is called.  This method allows curses
to only repaint the characters that have changed since the last call
to refresh(), thus avoiding the flicker and sending fewer characters to
the terminal.

In the event the screen becomes corrupted, this patch accepts a CTRL-L
keystroke from the user which will call clear() and force a repaint of
the entire screen.

Signed-off-by: Jason McCarver <slam@parasite.cc>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xl: do not leak cpupool names.
Ian Campbell [Fri, 14 Sep 2012 09:02:50 +0000 (10:02 +0100)]
xl: do not leak cpupool names.

Valgrind reports:
==3076== 7 bytes in 1 blocks are definitely lost in loss record 1 of 1
==3076==    at 0x402458C: malloc (vg_replace_malloc.c:270)
==3076==    by 0x406F86D: libxl_cpupoolid_to_name (libxl_utils.c:102)
==3076==    by 0x8058742: parse_config_data (xl_cmdimpl.c:639)
==3076==    by 0x805BD56: create_domain (xl_cmdimpl.c:1838)
==3076==    by 0x805DAED: main_create (xl_cmdimpl.c:3903)
==3076==    by 0x804D39D: main (xl.c:285)

And indeed there are several places where xl uses
libxl_cpupoolid_to_name as a boolean to test if the pool name is
valid and leaks the name if it is. Introduce an is_valid helper and
use that instead.
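
The leak pattern and the helper-based fix, sketched with a hypothetical lookup
that returns a malloc()ed string (pool_id_to_name and pool_id_is_valid are
stand-ins, not the libxl or xl functions):

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup: returns a malloc()ed copy of the name, or NULL. */
static char *pool_id_to_name(unsigned int poolid)
{
    static const char *const names[] = { "Pool-0", "Pool-1" };
    size_t len;
    char *copy;

    if (poolid >= sizeof(names) / sizeof(names[0]))
        return NULL;

    len = strlen(names[poolid]) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, names[poolid], len);
    return copy;
}

/*
 * Validity test that does not leak: the name is only needed to learn
 * whether the lookup succeeded, so it is freed before returning.
 */
static bool pool_id_is_valid(unsigned int poolid)
{
    char *name = pool_id_to_name(poolid);
    bool valid = (name != NULL);

    free(name);
    return valid;
}
```

Writing if (pool_id_to_name(id)) directly would discard the only pointer to
the allocation, which is the kind of loss a leak checker reports.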

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xl: error if vif backend!=0 is used with run_hotplug_scripts
Roger Pau Monne [Fri, 14 Sep 2012 09:02:49 +0000 (10:02 +0100)]
xl: error if vif backend!=0 is used with run_hotplug_scripts

Print an error and exit if backend!=0 is used in conjunction with
run_hotplug_scripts. Currently libxl can only execute hotplug scripts
from the toolstack domain (the same domain xl is running from).

Added a description and workaround of this issue on
xl-network-configuration.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: fix usage of backend parameter and run_hotplug_scripts
Roger Pau Monne [Fri, 14 Sep 2012 09:02:48 +0000 (10:02 +0100)]
libxl: fix usage of backend parameter and run_hotplug_scripts

vif interfaces allow the user to specify the domain that should run
the backend (also known as driver domain) using the 'backend'
parameter. This is not compatible with run_hotplug_scripts=1, since
libxl can only run the hotplug scripts from Domain 0.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libfsimage: add ext4 support for CentOS 5.x
Roger Pau Monne [Fri, 14 Sep 2012 09:02:47 +0000 (10:02 +0100)]
libfsimage: add ext4 support for CentOS 5.x

CentOS 5.x forked e2fs ext4 support into a different package called
e4fs, and so headers and library names changed from ext2fs to ext4fs.
Check if ext4fs/ext2fs.h and -lext4fs work, and use that instead of
ext2fs to build libfsimage. This patch assumes that if the ext4fs
library is present it should always be used instead of ext2fs.

This patch includes a rework of the ext2fs check, a new ext4fs check
and a minor modification in libfsimage to use the correct library.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: handle errors from xc_sharing_* info functions
Ian Campbell [Fri, 14 Sep 2012 09:02:46 +0000 (10:02 +0100)]
libxl: handle errors from xc_sharing_* info functions

On a 32 bit hypervisor xl info currently reports:
sharing_freed_memory   : 72057594037927935
sharing_used_memory    : 72057594037927935

Eat the ENOSYS and turn it into 0. Log and propagate other errors.
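
A sketch of that policy, assuming the underlying query reports failure as a
negative errno value (sharing_pages_or_zero is a hypothetical helper, not the
libxl code). As an aside, the figure quoted above equals 2^56 - 1, which is
what an all-ones 64-bit value gives when divided by 256; reading it as an
error return treated as a page count is an inference, not something the
message states.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Turn a raw query result into a page count.
 * Assumption: raw is either a non-negative count or a negative errno.
 * Returns 0 on success (including "not implemented", reported as zero
 * pages) and the negative errno for any other failure.
 */
static int sharing_pages_or_zero(long raw, uint64_t *pages_out)
{
    if (raw >= 0) {
        *pages_out = (uint64_t)raw;
        return 0;
    }

    if (raw == -ENOSYS) {
        *pages_out = 0; /* feature absent: report nothing shared */
        return 0;
    }

    return (int)raw; /* propagate every other error to the caller */
}
```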

I don't have a 32 bit system handy, so tested on x86_64 with a libxc
hacked to return -ENOSYS and -EINVAL.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago build: Require GCC 4.1 or later.
Keir Fraser [Thu, 13 Sep 2012 19:13:36 +0000 (20:13 +0100)]
build: Require GCC 4.1 or later.

Centralise the version check in Config.mk. Any more strict version
requirements can be added to specific subdirs/arches.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago x86: check for data and BSS in reloc code at compile time.
Tim Deegan [Thu, 13 Sep 2012 15:41:33 +0000 (16:41 +0100)]
x86: check for data and BSS in reloc code at compile time.

This is a more useful failure mode than hanging at boot time, and
incidentally fixes the clang/LLVM build by removing a .subsection rune.

Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Tim Deegan <tim@xen.org>
12 years ago x86/mm: Update comments now that Xen is always 64-bit.
Tim Deegan [Thu, 13 Sep 2012 15:41:33 +0000 (16:41 +0100)]
x86/mm: Update comments now that Xen is always 64-bit.

Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
12 years ago x86/mm: remove the linear mapping of the p2m tables.
Tim Deegan [Thu, 13 Sep 2012 15:41:33 +0000 (16:41 +0100)]
x86/mm: remove the linear mapping of the p2m tables.

Mapping the p2m into the monitor tables was an important optimization
on 32-bit builds, where it avoided mapping and unmapping p2m pages
during a walk.  On 64-bit it makes no difference -- see
http://old-list-archives.xen.org/archives/html/xen-devel/2010-04/msg00981.html
Get rid of it, and use the explicit walk for all lookups.

Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
12 years ago amd iommu: use PCI macros
Jan Beulich [Thu, 13 Sep 2012 08:23:17 +0000 (10:23 +0200)]
amd iommu: use PCI macros

... instead of open coding them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Wang <wei.wang2@amd.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago drivers: Remove some CONFIG_X86 ifdef'ery.
Keir Fraser [Wed, 12 Sep 2012 19:54:23 +0000 (20:54 +0100)]
drivers: Remove some CONFIG_X86 ifdef'ery.

Not quite all, but a great deal was to specifically allow ia64 support
to be retrofitted to x86 platform code. Since we no longer support
ia64 we can happily remove the ifdefs. Any new platform which wanted
to share this code would likely need a different set of ifdefs in any
case, making it a brand new porting effort.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago In most of the codebase, use CONFIG_X86 in place of __i386__||__x86_64__
Keir Fraser [Wed, 12 Sep 2012 19:41:01 +0000 (20:41 +0100)]
In most of the codebase, use CONFIG_X86 in place of __i386__||__x86_64__

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago CONFIG_X86_64 -> CONFIG_X86
Keir Fraser [Wed, 12 Sep 2012 19:32:52 +0000 (20:32 +0100)]
CONFIG_X86_64 -> CONFIG_X86

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago x86: HYPERVISOR_VIRT_END is always defined. Remove ifdef'ery.
Keir Fraser [Wed, 12 Sep 2012 19:23:10 +0000 (20:23 +0100)]
x86: HYPERVISOR_VIRT_END is always defined. Remove ifdef'ery.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago x86: Remove CONFIG_COMPAT ifdef'ery from arch/x86 -- it is always defined.
Keir Fraser [Wed, 12 Sep 2012 19:21:02 +0000 (20:21 +0100)]
x86: Remove CONFIG_COMPAT ifdef'ery from arch/x86 -- it is always defined.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years ago x86/passthrough: Fix corruption caused by race conditions between
Andrew Cooper [Wed, 12 Sep 2012 18:31:16 +0000 (19:31 +0100)]
x86/passthrough: Fix corruption caused by race conditions between
device allocation and deallocation to a domain.

A toolstack, when dealing with a domain using PCIPassthrough, could
reasonably be expected to issue DOMCTL_deassign_device hypercalls to
remove all passed through devices before issuing a
DOMCTL_destroydomain hypercall to kill the domain.  In the case where
a toolstack is perhaps less sensible in this regard, the hypervisor
should not fall over.

In domain_kill(), pci_release_devices() searches the alldevs_list list
looking for PCI devices still assigned to the domain.  If the
toolstack has correctly deassigned all devices before killing the
domain, this loop does nothing.

However, if there are still devices attached to the domain, the loop
will call pci_cleanup_msi() without unbinding the pirq from the
domain.  This eventually calls destroy_irq() which xfree()'s the
action.

However, as the irq_desc->action pointer is abused in an unsafe
manner, without unbinding first (which at least correctly cleans up),
the action is actually an irq_guest_action_t* rather than an
irqaction*, meaning that the cpu_eoi_map is leaked, and eoi_timer is
free()'d while still being on a pcpu's inactive_timer list.  As a
result, when this free()'d memory gets reused, the inactive_timer list
becomes corrupt, and list_*** operations will corrupt hypervisor
memory.

If the above were not bad enough, the loop in pci_release_devices()
still leaves references to the irq it destroyed in
domain->arch.pirq_irq and irq_pirq, meaning that a later loop,
free_domain_pirqs(), which happens as a result of
complete_domain_destroy() will unbind and destroy all irqs which were
still bound to the domain, resulting in a double destroy of any irq
which was still bound to the domain at the point at which the
DOMCTL_destroydomain hypercall happened.

Because of the allocation of irqs from find_unassigned_irq(), the
lowest free irq number is going to be handed back from create_irq().

There is a further race condition between the original (incorrect)
call to destroy_irq() from pci_release_devices(), and the later call
to free_domain_pirqs() (which happens in a softirq context at some
point after the domain has officially died) during which the same irq
number (which is still referenced in a stale way in
domain->arch.pirq_irq and irq_pirq) has been allocated to a new domain
via a PHYSDEVOP_map_pirq hypercall (say, perhaps, in the case of
rebooting a domain).

In this case, the cleanup for the dead domain will free the recently
bound irq under the feet of the new domain.  Furthermore, after the
irq has been incorrectly destroyed, the same domain with another
PHYSDEVOP_map_pirq hypercall can be allocated the same irq number as
before, leading to an error along the lines of:

../physdev.c:188: dom54: -1:-1 already mapped to 74

In this case, the pirq_irq and irq_pirq mappings get updated to the
new PCI device from the latter PHYSDEVOP_map_pirq hypercall, and the
IOMMU interrupt remapping registers get updated, leading to IOMMU
Primary Pending Fault due to source-id verification failure for
incoming interrupts from the passed through device.

The easy fix is to simply deassign the device in pci_release_devices()
and leave all the real cleanup to the free_domain_pirqs() which
correctly unbinds and destroys the irq without leaving stale
references around.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
12 years ago tools: drop ia64 support
Ian Campbell [Wed, 12 Sep 2012 16:55:27 +0000 (17:55 +0100)]
tools: drop ia64 support

Removed support from libxc and mini-os.

This also took me under xen/include/public via various symlinks.

Dropped tools/debugger/xenitp entirely, it was described upon commit
as:
"Xenitp is a low-level debugger for ia64" and doesn't appear to be
linked into the build anywhere.

 99 files changed, 14 insertions(+), 32361 deletions(-)

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago x86: Fix 32-bit stubdom build, libelf.h must support __i386__
Keir Fraser [Wed, 12 Sep 2012 15:12:42 +0000 (16:12 +0100)]
x86: Fix 32-bit stubdom build, libelf.h must support __i386__

Signed-off-by: Keir Fraser <keir@xen.org>
12 years agox86: Remove unused 'sis_apic_bug' variable. It was only used on x86_32.
Keir Fraser [Wed, 12 Sep 2012 14:52:33 +0000 (15:52 +0100)]
x86: Remove unused 'sis_apic_bug' variable. It was only used on x86_32.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years agox86: We can assume CONFIG_PAGING_LEVELS==4.
Keir Fraser [Wed, 12 Sep 2012 12:59:26 +0000 (13:59 +0100)]
x86: We can assume CONFIG_PAGING_LEVELS==4.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years agoxen: Remove x86_32 build target.
Keir Fraser [Wed, 12 Sep 2012 12:29:30 +0000 (13:29 +0100)]
xen: Remove x86_32 build target.

Signed-off-by: Keir Fraser <keir@xen.org>
12 years agoRevert 25843:51090fe1ab97 (x86/HVM: assorted RTC emulation adjustments)
Jan Beulich [Wed, 12 Sep 2012 11:24:28 +0000 (13:24 +0200)]
Revert 25843:51090fe1ab97 (x86/HVM: assorted RTC emulation adjustments)

This was found to cause RHEL6 HVM guests to hang during shutdown.

12 years agognttab: cleanup of number-of-active-frames calculations
Jan Beulich [Wed, 12 Sep 2012 08:21:21 +0000 (10:21 +0200)]
gnttab: cleanup of number-of-active-frames calculations

max_nr_active_grant_frames() is merely a special case of
num_act_frames_from_sha_frames(), so there's no need to have a special
case implementation for it.

Further, some of the related definitions (including the "struct
active_grant_entry" definition itself) can (and hence should) really be
private to grant_table.c.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: use only a single branch for upcall-pending exit path checks
Jan Beulich [Wed, 12 Sep 2012 08:20:18 +0000 (10:20 +0200)]
x86: use only a single branch for upcall-pending exit path checks

This utilizes the fact that the two bytes of interest are adjacent to
one another and that the resulting 16-bit values of interest are within
a contiguous range of numbers.
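
As an illustration only (the committed change is in assembly, not C), the
range trick can be sketched as below. The sketch assumes the pending byte
sits at the lower address with the mask byte right after it, read together
as one little-endian 16-bit value; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two-branch form: deliver an upcall only if one is pending and not masked. */
static bool upcall_two_branches(uint8_t pending, uint8_t mask)
{
    if (mask != 0)
        return false;
    return pending != 0;
}

/*
 * Single-branch form. With pending in the low byte and mask in the high
 * byte, "mask == 0 && pending != 0" holds exactly for the 16-bit values
 * 0x0001..0x00ff, so one unsigned range check is enough.
 */
static bool upcall_single_branch(uint8_t pending, uint8_t mask)
{
    uint16_t both = (uint16_t)(pending | (mask << 8));

    /* Subtracting 1 maps 0x0001..0x00ff to 0x0000..0x00fe; 0 wraps to 0xffff. */
    return (uint16_t)(both - 1) <= 0xfe;
}
```

Both forms agree for every combination of the two bytes; the second one
needs a single compare-and-branch.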

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86-64/EFI: allow chaining of config files
Jan Beulich [Wed, 12 Sep 2012 08:19:34 +0000 (10:19 +0200)]
x86-64/EFI: allow chaining of config files

Namely when making use of the CONFIG_XEN_COMPAT_* options in the legacy
Linux kernels, newer kernels may not be compatible with older
hypervisors, so trying to boot such a combination makes little sense.
Booting older kernels on newer hypervisors, however, has to always
work.

With the way xen.efi looks for its configuration file, allowing
individual configuration files to refer only to compatible kernels,
and referring from an older- to a newer-hypervisor one (the kernels
of which will, as said, necessarily be compatible with the older
hypervisor) greatly reduces redundancy, at least in development
environments where one frequently wants multiple hypervisors and
kernels to be installed in parallel.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: retrieve keyboard shift status flags from BIOS
Jan Beulich [Wed, 12 Sep 2012 08:17:34 +0000 (10:17 +0200)]
x86: retrieve keyboard shift status flags from BIOS

Recent Linux tries to make use of this, and has no way of getting at
these bits without Xen assisting it.

There doesn't appear to be a way to obtain the same information from
UEFI.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86-64: construct static, uniform parts of page tables at build time
Jan Beulich [Tue, 11 Sep 2012 14:04:49 +0000 (16:04 +0200)]
x86-64: construct static, uniform parts of page tables at build time

... rather than at boot time, removing unnecessary redundancy between
EFI and legacy boot code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: construct static part of 1:1 mapping at build time
Jan Beulich [Tue, 11 Sep 2012 14:03:38 +0000 (16:03 +0200)]
x86: construct static part of 1:1 mapping at build time

... rather than at boot time, removing unnecessary redundancy between
EFI and legacy boot code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoPCI: don't allow guest assignment of devices used by Xen
Jan Beulich [Tue, 11 Sep 2012 14:01:15 +0000 (16:01 +0200)]
PCI: don't allow guest assignment of devices used by Xen

This covers the devices used for the console and the AMD IOMMU ones (as
would be any others that might get passed to pci_ro_device()).

Boot video device determination cloned from similar Linux logic.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoPCI: bus scan adjustments
Jan Beulich [Tue, 11 Sep 2012 13:59:43 +0000 (15:59 +0200)]
PCI: bus scan adjustments

As done elsewhere, the ns16550 code shouldn't look at non-zero
functions of a device if that isn't multi-function.

Also both there and in pass-through's _scan_pci_devices() skip looking
at non-zero functions when the device at function zero doesn't exist.
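
A minimal sketch of the scan rule described above, not the actual ns16550
or _scan_pci_devices() code. The struct and function names are
hypothetical; the only facts relied upon are that a vendor ID of 0xffff
means "nothing there" and that bit 7 of the header type register on
function 0 marks a multi-function device.

```c
#include <stdint.h>

#define PCI_MULTI_FUNCTION 0x80   /* bit 7 of the header type register */

/* Hypothetical snapshot of the two config registers the scan looks at. */
struct pci_func_cfg {
    uint16_t vendor_id;    /* 0xffff means "no function here" */
    uint8_t  header_type;  /* only meaningful on function 0 for this check */
};

/* Count the functions of one device slot that are worth probing. */
static unsigned int count_functions(const struct pci_func_cfg slot[8])
{
    unsigned int func, found = 0;

    for (func = 0; func < 8; func++) {
        if (slot[func].vendor_id == 0xffff) {
            if (func == 0)
                break;          /* no function 0: skip the whole device */
            continue;           /* gaps are allowed on multi-function devices */
        }
        found++;
        if (func == 0 && !(slot[0].header_type & PCI_MULTI_FUNCTION))
            break;              /* single-function: ignore functions 1..7 */
    }
    return found;
}
```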

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agodrop tx_fifo_size
Jan Beulich [Tue, 11 Sep 2012 13:57:38 +0000 (15:57 +0200)]
drop tx_fifo_size

... in favor of having what so far was called tx_empty() return the
amount of space available.

Note that in the pl011.c case, original code and comment disagreed, and
I picked the conservative value for its ->tx_ready() handler's return
value.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agons16550: command line parsing adjustments
Jan Beulich [Tue, 11 Sep 2012 13:56:45 +0000 (15:56 +0200)]
ns16550: command line parsing adjustments

Allow intermediate parts of the command line options to be absent
(expressed by two immediately succeeding commas).
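
To illustrate the idea, here is a simplified sketch of comma-separated
option parsing in which an empty field leaves the previous (default) value
untouched. The field set and the names uart_opts and parse_opts are
hypothetical and much simpler than the real com1=/com2= syntax.

```c
#include <stdlib.h>

/* Hypothetical, reduced option set; the real driver has more fields. */
struct uart_opts {
    unsigned long baud, io_base, irq;
};

/* Parse "baud,io_base,irq"; any field may be empty, e.g. "115200,,3". */
static void parse_opts(const char *s, struct uart_opts *o)
{
    unsigned long *fields[] = { &o->baud, &o->io_base, &o->irq };
    size_t i;

    for (i = 0; i < sizeof(fields) / sizeof(fields[0]); i++) {
        if (*s != ',' && *s != '\0') {
            char *end;

            *fields[i] = strtoul(s, &end, 0);
            s = end;
        }
        /* An absent field falls through here with the default kept. */
        if (*s != ',')
            break;
        s++;
    }
}
```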

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agons16550: PCI initialization adjustments
Jan Beulich [Tue, 11 Sep 2012 13:55:33 +0000 (15:55 +0200)]
ns16550: PCI initialization adjustments

Besides single-port serial cards, also accept multi-port ones and ones
providing mixed functionality (e.g. also having a parallel port).

Reading PCI_INTERRUPT_PIN before ACPI gets enabled generally produces
an incorrect IRQ (below 16, whereas after enabling ACPI it frequently
would end up at a higher one), so this is useful (almost) only when a
system already boots in ACPI mode.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agons16550: MMIO adjustments
Jan Beulich [Tue, 11 Sep 2012 13:52:36 +0000 (15:52 +0200)]
ns16550: MMIO adjustments

On x86 ioremap() is not suitable here, set_fixmap() must be used
instead.

Also replace some literal numbers by their proper symbolic constants,
making the code easier to understand.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoserial: avoid fully initializing unused consoles
Jan Beulich [Tue, 11 Sep 2012 13:51:52 +0000 (15:51 +0200)]
serial: avoid fully initializing unused consoles

Defer calling the drivers' post-IRQ initialization functions (generally
doing allocation of transmit buffers) until it is known that the
respective console is actually going to be used.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoconsole: add EHCI debug port based serial console
Jan Beulich [Tue, 11 Sep 2012 13:49:52 +0000 (15:49 +0200)]
console: add EHCI debug port based serial console

Low level hardware interface pieces adapted from Linux.

For setup information, see Linux'es Documentation/x86/earlyprintk.txt
and/or http://www.coreboot.org/EHCI_Debug_Port.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoconsole: prepare for non-COMn port support
Jan Beulich [Tue, 11 Sep 2012 13:47:16 +0000 (15:47 +0200)]
console: prepare for non-COMn port support

Widen SERHND_IDX (and use it where needed), introduce a flush low level
driver method, and remove unnecessary peeking of the common code at the
(driver specific) serial port identification string in the "console="
command line option value.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: allow early use of fixmaps
Jan Beulich [Tue, 11 Sep 2012 13:45:20 +0000 (15:45 +0200)]
x86: allow early use of fixmaps

As a prerequisite for adding an EHCI debug port based console
implementation, set up the page tables needed for (a sub-portion of)
the fixmaps together with other boot time page table construction.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agotmem: cleanup
Jan Beulich [Tue, 11 Sep 2012 12:19:29 +0000 (14:19 +0200)]
tmem: cleanup

- one more case of checking for a specific rather than any error
- drop no longer needed first parameter from cli_put_page()
- drop a redundant cast

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
12 years agotmem: fixup 2010 cleanup patch that breaks tmem save/restore
Dan Magenheimer [Tue, 11 Sep 2012 12:19:03 +0000 (14:19 +0200)]
tmem: fixup 2010 cleanup patch that breaks tmem save/restore

20918:a3fa6d444b25 "Fix domain reference leaks" (in Feb 2010, by Jan)
does some cleanup in addition to the leak fixes.  Unfortunately, that
cleanup inadvertently resulted in an incorrect fallthrough in a switch
statement which breaks tmem save/restore.

That broken patch was apparently applied to 4.0-testing and 4.1-testing
so those are broken as well.

What is the process now for requesting back-patches to 4.0 and 4.1?

(Side note: This does not by itself entirely fix save/restore in 4.2.)

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agotmem: reduce severity of log messages
Jan Beulich [Tue, 11 Sep 2012 12:18:36 +0000 (14:18 +0200)]
tmem: reduce severity of log messages

Otherwise they can be used by a guest to spam the hypervisor log with
all settings at their defaults.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
12 years agotmem: properly drop lock on error path in do_tmem_op()
Jan Beulich [Tue, 11 Sep 2012 12:18:26 +0000 (14:18 +0200)]
tmem: properly drop lock on error path in do_tmem_op()

Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
12 years agotmem: properly drop lock on error path in do_tmem_get()
Jan Beulich [Tue, 11 Sep 2012 12:18:08 +0000 (14:18 +0200)]
tmem: properly drop lock on error path in do_tmem_get()

Also remove a bogus assertion.

Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
12 years agotmem: detect arithmetic overflow in tmh_copy_{from,to}_client()
Jan Beulich [Tue, 11 Sep 2012 12:17:59 +0000 (14:17 +0200)]
tmem: detect arithmetic overflow in tmh_copy_{from,to}_client()

This implies adjusting callers to deal with errors other than -EFAULT
and removing some comments which would otherwise become stale.
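
A sketch of the kind of check meant here, under stated assumptions: the
helper name check_copy_range, the 4096-byte PAGE_SIZE and the use of
-EINVAL are illustrative and not taken from the patch. The point is that
"offset + len > PAGE_SIZE" can wrap around, whereas comparing len against
the remaining space cannot.

```c
#include <errno.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size for the sketch */

/* Return 0 if both [offset, offset + len) ranges fit within one page. */
static int check_copy_range(uint32_t tmem_offset, uint32_t pfn_offset,
                            uint32_t len)
{
    /* Written as a subtraction so that a huge offset or len cannot wrap. */
    if (tmem_offset > PAGE_SIZE || len > PAGE_SIZE - tmem_offset)
        return -EINVAL;
    if (pfn_offset > PAGE_SIZE || len > PAGE_SIZE - pfn_offset)
        return -EINVAL;
    return 0;
}
```

With the naive addition, an offset of 0xfffffff0 and a length of 0x20 would
sum to 0x10 in 32-bit arithmetic and pass the check.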

Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
12 years agotmem: don't access guest memory without using the accessors intended for this
Jan Beulich [Tue, 11 Sep 2012 12:17:49 +0000 (14:17 +0200)]
tmem: don't access guest memory without using the accessors intended for this

This is not permitted, not even for buffers coming from Dom0 (and it
would also break the moment Dom0 runs in HVM mode). An implication from
the changes here is that tmh_copy_page() can't be used anymore for
control operations calling tmh_copy_{from,to}_client() (as those pass
the buffer by virtual address rather than MFN).

Note that tmemc_save_get_next_page() previously didn't set the returned
handle's pool_id field, while the new code does. It needs to be
confirmed that this is not a problem (otherwise the copy-out operation
will require further tmh_...() abstractions to be added).

Further note that the patch removes (rather than adjusts) an invalid
call to unmap_domain_page() (no matching map_domain_page()) from
tmh_compress_from_client() and adds a missing one to an error return
path in tmh_copy_from_client().

Finally note that the patch adds a previously missing return statement
to cli_get_page() (without which that function could de-reference a
NULL pointer, triggerable from guest mode).

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
12 years agotmem: check for a valid client ("domain") in the save subops
Ian Campbell [Tue, 11 Sep 2012 12:17:27 +0000 (14:17 +0200)]
tmem: check for a valid client ("domain") in the save subops

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agotmem: check the pool_id is valid when destroying a tmem pool
Ian Campbell [Tue, 11 Sep 2012 12:06:54 +0000 (14:06 +0200)]
tmem: check the pool_id is valid when destroying a tmem pool

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agotmem: consistently make pool_id a uint32_t
Ian Campbell [Tue, 11 Sep 2012 12:06:43 +0000 (14:06 +0200)]
tmem: consistently make pool_id a uint32_t

Treating it as an int could allow a malicious guest to provide a
negative pool_id, bypassing the MAX_POOLS_PER_DOMAIN limit check and
allowing access to the negative offsets of the pool array.
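
The difference can be shown with a small sketch. The two helper names are
hypothetical and the value 16 for MAX_POOLS_PER_DOMAIN is assumed here for
illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_POOLS_PER_DOMAIN 16   /* assumed value for the sketch */

/* Signed comparison: a negative pool_id is "less than" the limit. */
static bool pool_id_ok_signed(int pool_id)
{
    return pool_id < MAX_POOLS_PER_DOMAIN;
}

/* Unsigned comparison: the same bit pattern is a huge value and is rejected. */
static bool pool_id_ok_unsigned(uint32_t pool_id)
{
    return pool_id < MAX_POOLS_PER_DOMAIN;
}
```

A guest-supplied -1 passes the signed check and would index before the
start of the pool array; as a uint32_t it is 0xffffffff and fails the
limit check.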

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agotmem: only allow tmem control operations from privileged domains
Ian Campbell [Tue, 11 Sep 2012 12:06:30 +0000 (14:06 +0200)]
tmem: only allow tmem control operations from privileged domains

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agoamd iommu: remove unnecessary map/unmap for l1 page tables
Wei Wang [Tue, 11 Sep 2012 12:03:12 +0000 (14:03 +0200)]
amd iommu: remove unnecessary map/unmap for l1 page tables

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agoamd iommu: use next_level instead of recalculating it
Wei Wang [Tue, 11 Sep 2012 12:01:52 +0000 (14:01 +0200)]
amd iommu: use next_level instead of recalculating it

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agoamd iommu: add 2 helper functions: iommu_is_pte_present and iommu_next_level
Wei Wang [Tue, 11 Sep 2012 12:00:04 +0000 (14:00 +0200)]
amd iommu: add 2 helper functions: iommu_is_pte_present and iommu_next_level

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agox86: refactor mce code
Christoph Egger [Tue, 11 Sep 2012 10:28:32 +0000 (12:28 +0200)]
x86: refactor mce code

Factor common mc code out of intel specific code and move it into
common files. No functional changes.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agox86: make the dom0_max_vcpus option more flexible
David Vrabel [Tue, 11 Sep 2012 10:26:25 +0000 (12:26 +0200)]
x86: make the dom0_max_vcpus option more flexible

The dom0_max_vcpus command line option only allows the exact number of
VCPUs for dom0 to be set.  It is not possible to say "up to N VCPUs
but no more than the number physically present."

Allow a range for the option to set a minimum number of VCPUs, and a
maximum which does not exceed the number of PCPUs.

For example, with "dom0_max_vcpus=4-8":

    PCPUs  Dom0 VCPUs
     2      4
     4      4
     6      6
     8      8
    10      8

Existing command lines with "dom0_max_vcpus=N" still work as before
(and are equivalent to dom0_max_vcpus=N-N).
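
The table above corresponds to clamping the PCPU count into the given
range. A minimal sketch, with dom0_vcpus being a hypothetical helper name
rather than the function used in the patch:

```c
/* Clamp the number of physical CPUs into [min, max]. */
static unsigned int dom0_vcpus(unsigned int pcpus, unsigned int min,
                               unsigned int max)
{
    unsigned int n = pcpus;

    if (n > max)
        n = max;
    if (n < min)
        n = min;    /* the minimum wins even with fewer PCPUs */
    return n;
}
```

With min == max == N this always yields N, matching the behaviour of the
plain "dom0_max_vcpus=N" form.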

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agopowernow: Update P-state directly when _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL
Boris Ostrovsky [Tue, 11 Sep 2012 08:57:36 +0000 (10:57 +0200)]
powernow: Update P-state directly when _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL

When _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL (i.e. shared_type is
CPUFREQ_SHARED_TYPE_HW), which most often is the case on servers, there
is no reason to go into on_selected_cpus() code; we can call
transition_pstate() directly.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agox86/HVM: assorted RTC emulation adjustments
Jan Beulich [Tue, 11 Sep 2012 08:00:06 +0000 (10:00 +0200)]
x86/HVM: assorted RTC emulation adjustments

- don't look at RTC_PIE in rtc_timer_update(), and hence don't call the
  function on REG_B writes at all
- only call alarm_timer_update() on REG_B writes when relevant bits
  change
- only call check_update_timer() on REG_B writes when SET changes
- instead properly handle AF and PF when the guest is not also setting
  AIE/PIE respectively (for UF this was already the case, only a
  comment was slightly inaccurate)
- raise the RTC IRQ not only when UIE gets set while UF was already
  set, but generalize this to cover AIE and PIE as well
- properly mask off bit 7 when retrieving the hour values in
  alarm_timer_update(), and properly use RTC_HOURS_ALARM's bit 7 when
  converting from 12- to 24-hour value
- also handle the two other possible clock bases
- use RTC_* names in a couple of places where literal numbers were used
  so far
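
For the hour handling item, a sketch of the 12- to 24-hour conversion,
assuming the value has already been converted from BCD to binary if
needed. The function name is hypothetical; the rule that bit 7 of the
hours byte means PM in 12-hour mode is standard MC146818 behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a raw RTC hours byte (binary, not BCD) to a 0..23 hour value. */
static unsigned int alarm_hour_to_24h(uint8_t raw, bool mode_24h)
{
    unsigned int hour = raw & 0x7f;   /* mask off bit 7 (the PM flag) */

    if (!mode_24h) {
        hour %= 12;                   /* 12 AM -> 0, 12 PM -> 0 for now */
        if (raw & 0x80)
            hour += 12;               /* PM: 12 PM -> 12, 1 PM -> 13, ... */
    }
    return hour;
}
```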

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/hvm: don't give vector callback higher priority than NMI/MCE
Jan Beulich [Mon, 10 Sep 2012 14:47:31 +0000 (16:47 +0200)]
x86/hvm: don't give vector callback higher priority than NMI/MCE

Those two should always be delivered first imo.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
12 years agodocs: document "ucode=" hypervisor command line option
Jan Beulich [Mon, 10 Sep 2012 10:13:56 +0000 (11:13 +0100)]
docs: document "ucode=" hypervisor command line option

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years agodocs: correct formatting errors in xmdomain.cfg
Matt Wilson [Mon, 10 Sep 2012 10:13:55 +0000 (11:13 +0100)]
docs: correct formatting errors in xmdomain.cfg

This patch corrects the following errors produced by pod2man:

Hey! The above document had some coding errors, which are explained
below:

Around line 301:
    You can't have =items (as at line 305) unless the first thing after
    the =over is an =item

Around line 311:
    '=item' outside of any '=over'

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxl.cfg: gfx_passthru documentation improvements
Pasi Kärkkäinen [Mon, 10 Sep 2012 10:13:54 +0000 (11:13 +0100)]
xl.cfg: gfx_passthru documentation improvements

gfx_passthru: Document that gfx_passthru makes the GPU become primary in
the guest, and add other general information about gfx_passthru.

Signed-off-by: Pasi Kärkkäinen <pasik@iki.fi>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: fix error message in device_backend_callback
Roger Pau Monne [Mon, 10 Sep 2012 10:13:53 +0000 (11:13 +0100)]
libxl: fix error message in device_backend_callback

device_backend_callback error path always says "unable to disconnect",
but this can also happen during the connection of a device. Fix the
error message using the information in aodev->action.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
12 years agounmodified_drivers: handle IRQF_SAMPLE_RANDOM
Olaf Hering [Mon, 10 Sep 2012 08:54:13 +0000 (10:54 +0200)]
unmodified_drivers: handle IRQF_SAMPLE_RANDOM

The flag IRQF_SAMPLE_RANDOM was removed in 3.6-rc1. Add it only if it is
defined. An additional call to add_interrupt_randomness is apparently
not needed because it's now called unconditionally in
handle_irq_event_percpu().
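
One common way to cope with a flag that exists only on some kernel
versions is a fallback definition, sketched below. This is not necessarily
how the patch does it; platform_irq_flags is a hypothetical helper and the
IRQF_SHARED value is a stand-in so that the sketch builds outside a kernel
tree.

```c
/* Stand-in for the kernel's definition, only for this standalone sketch. */
#ifndef IRQF_SHARED
#define IRQF_SHARED 0x00000080
#endif

/* On kernels without the flag, make it contribute no bits. */
#ifndef IRQF_SAMPLE_RANDOM
#define IRQF_SAMPLE_RANDOM 0
#endif

/* Flags to pass when requesting the interrupt (hypothetical helper). */
static unsigned long platform_irq_flags(void)
{
    return IRQF_SHARED | IRQF_SAMPLE_RANDOM;
}
```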

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Jan Beulich <jbeulich@suse.com>
12 years agoVT-d: split .ack and .disable DMA-MSI actors
Jan Beulich [Mon, 10 Sep 2012 07:45:30 +0000 (09:45 +0200)]
VT-d: split .ack and .disable DMA-MSI actors

Calling irq_complete_move() from .disable is wrong, breaking S3 resume.

Comparing with all other .ack actors, it was also missing a call to
move_{native,masked}_irq(). As the actor is masking its interrupt
anyway (albeit it's not immediately obvious why), the latter is the
better choice.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>