During creation of the PV domain we allocate the E820 structure to
hold as many entries as the machine's E820 has, plus three.
This allows the toolstack to fill the E820 with more than three
entries. Specifically, for the use case where the toolstack retrieves
the E820, sanitizes it, and then sets it for the PV guest (for PCI
passthrough), this dynamic number of E820 entries is just right.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Keir Fraser <keir@xen.org>
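The sizing rule in the message above can be sketched as follows. This is only an illustration under stated assumptions, not the hypervisor code: struct e820_entry, PV_E820_EXTRA, pv_e820_slots() and pv_e820_alloc() are hypothetical names, and only the "machine entries plus three" rule comes from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical E820 entry layout, for illustration only. */
struct e820_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Spare slots added on top of the machine's entry count. */
#define PV_E820_EXTRA 3u

/* Number of slots to allocate for a PV guest's E820. */
unsigned int pv_e820_slots(unsigned int machine_entries)
{
    return machine_entries + PV_E820_EXTRA;
}

/* Allocate a zeroed array; the caller frees it. Returns NULL on failure. */
struct e820_entry *pv_e820_alloc(unsigned int machine_entries,
                                 unsigned int *nr_slots)
{
    unsigned int n = pv_e820_slots(machine_entries);
    struct e820_entry *map = calloc(n, sizeof(*map));

    if (map != NULL && nr_slots != NULL)
        *nr_slots = n;
    return map;
}
```

With a machine E820 of five entries, the guest array would have eight slots, leaving room for the toolstack's sanitized map.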
Stephen Smalley [Tue, 12 Apr 2011 13:55:25 +0000 (14:55 +0100)]
xsm: Fix xsm_mmu_* and xsm_update_va_mapping hooks
This is an attempt to properly fix the hypervisor crash previously
described in
http://marc.info/?l=xen-devel&m=128396289707362&w=2
In looking into this issue, I think the proper fix is to move the
xsm_mmu_* and xsm_update_va_mapping hook calls later in the callers,
after more validation has been performed and the page_info struct is
readily available, and pass the page_info to the hooks. This patch
moves the xsm_mmu_normal_update, xsm_mmu_machphys_update and
xsm_update_va_mapping hook calls accordingly, and updates their
interfaces and hook function implementations. This appears to resolve
the crashes for me.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Ian Campbell [Tue, 12 Apr 2011 12:39:22 +0000 (13:39 +0100)]
tools: hvmloader: split scratch and hypercall addressing from ROMBIOS low heap.
Although they happen to live at the same physical address, their lifespans
do not overlap. The scratch and hypercall spaces are used only within
hvmloader and the same area is reused as a heap within ROMBIOS. But
each is free to make its own decisions about where to place things.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Tue, 12 Apr 2011 12:36:17 +0000 (13:36 +0100)]
tools: hvmloader: split e820 support into its own code module.
Pass the table address as a parameter to the build function and cause
it to return the number of entries. Pass both base and offset as
parameters to the dump function.
This adds a duplicated e820.h header to ROMBIOS. Since the e820 data
structure is well defined by existing BIOS implementations I think
this is OK and simplifies the cross talk between hvmloader and
ROMBIOS.
Reduces the cross talk between ROMBIOS and hvmloader.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Tue, 12 Apr 2011 12:34:30 +0000 (13:34 +0100)]
tools: hvmloader: move ROMBIOS configuration into tools/firmware/rombios/
Currently rombios and hvmloader are rather intertwined. Separate the
ROMBIOS configuration options out into a ROMBIOS provided file so that
the dependency can become strictly from hvmloader to rombios.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
remus: fix incorrect error handling for switch_qemu_logdirty in checkpoint code
c/s 22275: "tools: cleanup domain save switch_qemu_logdirty callback"
introduced a whole bunch of error code fixups. In the process, it also
ended up treating the success return code (0) from
switch_qemu_logdirty as an error and vice versa.
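The convention at issue (0 means success, non-zero means failure) can be sketched as below. The callback type and the function names here are hypothetical stand-ins for illustration; they are not the actual libxc or remus interfaces.

```c
#include <stdio.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*logdirty_cb)(int enable);

/* Returns 0 if log-dirty mode was switched, -1 otherwise. */
int enable_logdirty(logdirty_cb cb)
{
    /* Only a non-zero return is the failure case; 0 means success. */
    if (cb(1) != 0) {
        fprintf(stderr, "switching log-dirty mode failed\n");
        return -1;
    }
    return 0;
}

/* Stand-in callbacks for demonstration. */
int cb_ok(int enable)   { (void)enable; return 0; }
int cb_fail(int enable) { (void)enable; return -1; }
```

Writing the test as an explicit comparison against 0 makes it harder to invert the sense of the check during later cleanups.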
Wei Wang [Tue, 12 Apr 2011 12:26:19 +0000 (13:26 +0100)]
AMD IOMMU: Fix an interrupt remapping issue
Some devices could generate bogus interrupts if an IO-APIC RTE and an
iommu interrupt remapping entry are not consistent during 2 adjacent
64-bit IO-APIC RTE updates. For example, if the 2nd operation updates
the destination bits in the RTE for a SATA device and unmasks it, in some
cases the SATA device will assert the ioapic pin to generate an interrupt
immediately using the new destination, but the iommu could still translate
it into the old destination, and dom0 would then be confused. To fix that,
we sync up the interrupt remapping entry with the IO-APIC IRE on every
32-bit operation and forward IOAPIC RTE updates after interrupt.
Signed-off-by: Wei Wang <wei.wang2@amd.com> Acked-by: Jan Beulich <jbeulich@novell.com>
They are a pointless level of abstraction beneath nestedhvm_* variants
of the same operations, which all callers should be using.
At the same time, nestedhvm_vcpu_initialise() does not need to call
destroy if initialisation fails. That is the vendor-specific init
function's job (clearing up its own state on failure).
The new --null option allows one to test and play with just the
memory checkpointing and network buffering aspect of remus, without
the need for a second host. The disk is not replicated. All replication
data is sent to /dev/null. This option is pretty handy when a user
wants to see the page churn for his workload or observe the latency hit
though the latter will not be accurate.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
While running remus, when an error occurs during checkpointing
(e.g., timeouts on primary, failing to checkpoint network buffer
or disk or even communication failure) the domU is sometimes
left in suspended state on primary. Instead of blindly closing
the checkpoint file handle, attempt to resume the domain before
the close.
Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:39:53 +0000 (16:39 +0100)]
libxl: refactor DISK_BACKEND_PHY handling in libxl_device_disk_add
A step on the path to sharing this code with the tail-end of the
DISK_BACKEND_TAP case.
I made the result of libxl__blktap_devpath non-const to achieve
this. The existing caller calls libxl__strdup on the result but since
the function is an internal one and the result is already garbage
collected I think this is unnecessary and we can just use the
non-const result directly.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:39:19 +0000 (16:39 +0100)]
libxl: only a CDROM type disk can be empty.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:38:59 +0000 (16:38 +0100)]
libxl: convert an empty tap disk into a qdisk
I'm not sure that empty disks which are is_cdrom are especially valid,
or that a cdrom can ever be handled by tapdisk anyway but try to do
something sane since it seems that xl's parse_disk_config() routine
could potentially generate such a configuration (although whether from
a valid input string or not I'm not sure).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:38:36 +0000 (16:38 +0100)]
libxl: make fallback from blktap2 to qdisk more explicit.
When blktap2 is not present we fall back to qdisk. Instead of falling
through a switch statement, make this explicit, with a comment,
prior to the switch statement.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
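A minimal sketch of an explicit fallback performed before the switch, as described above. The enum and the function effective_backend() are hypothetical and only borrow the backend names from the text; this is not the libxl code.

```c
#include <stdbool.h>

/* Hypothetical backend identifiers, named after those in the text. */
enum disk_backend {
    DISK_BACKEND_PHY,
    DISK_BACKEND_TAP,
    DISK_BACKEND_QDISK
};

/* Pick the backend actually used, given whether blktap2 is available. */
enum disk_backend effective_backend(enum disk_backend requested,
                                    bool blktap2_present)
{
    /* Explicit fallback, done before any switch on the backend. */
    if (requested == DISK_BACKEND_TAP && !blktap2_present)
        return DISK_BACKEND_QDISK;
    return requested;
}
```

A later switch on the returned value then needs no fall-through between the TAP and QDISK cases.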
Ian Campbell [Fri, 8 Apr 2011 15:38:06 +0000 (16:38 +0100)]
libxl: remove impossible check for backend != DISK_BACKEND_QDISK
In this case we are already in the DISK_BACKEND_QDISK case of a switch
statement on the same variable.
It is possible that we fell through from the DISK_BACKEND_TAP case
(although I'm about to remove that in a subsequent patch), however in
that case we are explicitly falling back from blktap2 to qdisk so
DEVICE_QDISK is still the right answer.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:36:20 +0000 (16:36 +0100)]
libxl: drop domid field from libxl_device_*
All functions which add a device to a domain already take a domid
argument and the callers typically write the same value to the
structure right before making the call.
Functions which delete a device typically do not, but adding this parameter
makes the interface more consistent anyway and all callers have the
domid to hand.
All functions which return a libxl device structure are given a domid
as a parameter and the caller therefore already knows which domain it
is dealing with.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:22:51 +0000 (16:22 +0100)]
xend: drop XenAPI error message translation
The only "translation" is to the C locale (e.g. the NUL
translation). I think it very unlikely we are going to see any new
translations of the XenAPI error messages at this point so the only
purpose of this code appears to be to periodically regenerate
xen-xm.pot with a new embedded timestamp, to the detriment of those of
us who use a version control system.
After much beating with sticks I managed to enable XenAPI support in
xend and configure xm such that it returns "Permission denied." (AKA
the SESSION_AUTHENTICATION_FAILED message) which I take to be a sign
I've not broken things too badly.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:21:12 +0000 (16:21 +0100)]
libxl/xl: drop support for netchannel2
netchannel2 was never widely deployed and no supported kernel includes
either the front- or back-ends. The last known kernel with this
support was the xen.git 2.6.31 branch which has been unsupported for
ages.
xl will warn the user if it spots a "vif2" configuration item but
otherwise support is completely removed.
Work is ongoing to add the interesting features of netchannel2 as
protocol extensions to netchannel1.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 8 Apr 2011 15:17:18 +0000 (16:17 +0100)]
libxl: bump SONAME after binary incompatible change.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tim Deegan [Thu, 7 Apr 2011 14:06:47 +0000 (15:06 +0100)]
xen/acpi: disentangle ACPI enumerations.
There are two sets of ACPI table enums and structs, and clang
complains about implicit casts between them. It would be much better
to remove one entire set of ACPI definitions but for now just use the
right enum for each interface.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
libxc: set all VCPUs online by default in HVM info table
This sets a saner default for the cpu-online-map by setting all bits
to 1. The default assumption ought to be that nr-vcpus ==
nr-vcpus-at-start. If that is not true, then the toolstack must modify
the bitmap, but if it is true, the toolstack oughtn't need to do
anything further.
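The default described above can be illustrated with a small bitmap sketch. The structure, the MAX_VCPUS limit and the helper names are assumptions made for this example and do not reproduce the real HVM info table layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical limit; the real table defines its own maximum. */
#define MAX_VCPUS 128

struct vcpu_info_table {
    uint32_t nr_vcpus;
    uint8_t  vcpu_online[MAX_VCPUS / 8];  /* one bit per VCPU */
};

/* Default: every VCPU is marked online (all bits set to 1). */
void info_table_init(struct vcpu_info_table *t, uint32_t nr_vcpus)
{
    t->nr_vcpus = nr_vcpus;
    memset(t->vcpu_online, 0xff, sizeof(t->vcpu_online));
}

int vcpu_is_online(const struct vcpu_info_table *t, unsigned int vcpu)
{
    return (t->vcpu_online[vcpu / 8] >> (vcpu % 8)) & 1;
}

/* A toolstack that starts with fewer VCPUs clears the relevant bits. */
void vcpu_set_offline(struct vcpu_info_table *t, unsigned int vcpu)
{
    t->vcpu_online[vcpu / 8] &= (uint8_t)~(1u << (vcpu % 8));
}
```

Only a toolstack for which nr-vcpus differs from nr-vcpus-at-start would need to call something like vcpu_set_offline().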
When a page is offlined, or when a broken page occurs, the page may be
populated, or may be in the PoD cache. This patch handles
offline/broken pages in the PoD cache. It scans the PoD cache and, on a
hit, removes and replaces the page, and then puts the offline/broken page on
page_offlined_list/page_broken_list.
c/s 19913 broke the MCE offline page logic:
For page_state_is(pg, free), it is impossible to trigger the case;
For page_state_is(pg, offlined), it did not in fact offline the related
page;
This patch fixes the bug, and removes an ambiguous comment.
Tim Deegan [Thu, 7 Apr 2011 10:39:35 +0000 (11:39 +0100)]
x86/hvm: do actually init nested HVM state for VCPUs
when nested HVM is enabled after VCPUs are allocated.
The previous patch would fail because the call to
nestedhvm_vcpu_initialise() in the HVM param set code
happens before nestedhvm_enabled(v->domain) is true.
Ian Campbell [Wed, 6 Apr 2011 15:50:16 +0000 (16:50 +0100)]
libxl: do not expose libxenctrl/libxenstore headers via libxl.h
This completely removes libxenstore from libxl users' view.
xl still needs libxenctrl directly due to the direct use of the
xentoollog functionality but it is not exposed to the indirect linkage
anymore.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
George Dunlap [Wed, 6 Apr 2011 10:40:54 +0000 (11:40 +0100)]
x86/hvm: load CPU structures from xen versions <=3.4
Xen 4.0 added "msr_tsc_aux" in the middle of the hvm_hw_cpu structure, making
it incompatible with savefiles from 3.4 and earlier. This patch uses the recently introduced
backwards-compatibility infrastructure to convert the old to the new.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
George Dunlap [Wed, 6 Apr 2011 10:40:51 +0000 (11:40 +0100)]
hvm: infrastructure for backwards-compatible loading
The hvm_save code is used to save and restore hypervisor-related
hvm state, either for classic save/restore, or for migration
(including remus). This is meant to be backwards-compatible across
some hypervisor versions; but if it does change, there is no way to
handle the old format as well as the new.
This patch introduces the infrastructure to allow a single older
version ("compat") of any given "save type" to be defined, along with
a function to turn the "old" version into the "new" version. If the
size check fails for the "normal" version, it will check the "compat"
version, and if it matches, will read the old entry and call the
conversion function.
This patch involves some preprocessor hackery, but I'm only extending the
hackery that's already there.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
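The size-check-then-convert flow described above can be sketched as follows. This is a simplified illustration, not the actual hvm_save macros: struct rec, struct rec_compat, rec_fix_compat() and rec_load() are hypothetical names, and the record layout (a field added in the middle) is chosen only to mirror the kind of change the infrastructure is meant to handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" and "new" layouts of one save type. */
struct rec_compat { uint64_t a; uint64_t b; };
struct rec        { uint64_t a; uint64_t extra; uint64_t b; };

/* Conversion function: turn an old-format record into the new format. */
void rec_fix_compat(const struct rec_compat *old, struct rec *out)
{
    out->a = old->a;
    out->extra = 0;      /* field absent from the old format */
    out->b = old->b;
}

/* Load a record of 'len' bytes; returns 0 on success, -1 on size mismatch. */
int rec_load(const void *buf, size_t len, struct rec *out)
{
    if (len == sizeof(struct rec)) {
        /* Size matches the normal version: copy as is. */
        memcpy(out, buf, sizeof(*out));
        return 0;
    }
    if (len == sizeof(struct rec_compat)) {
        /* Size matches the compat version: read it, then convert. */
        struct rec_compat old;
        memcpy(&old, buf, sizeof(old));
        rec_fix_compat(&old, out);
        return 0;
    }
    return -1;
}
```

Because the two versions are told apart by size alone, this approach only works when the old and new layouts differ in length.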
Tim Deegan [Wed, 6 Apr 2011 10:22:39 +0000 (11:22 +0100)]
Nested SVM: fix race in remote shootdown.
nestedhvm_flushtlb_ipi() can run between nsvm_vcpu_switch() and CLGI,
which would leave the VMCB pointing at the wrong p2m table.
Check for this after CLGI.
cegger [Mon, 28 Feb 2011 11:21:57 +0000 (12:21 +0100)]
Handle interrupts (generic part)
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:54 +0000 (12:21 +0100)]
Allow guest to enable SVM in EFER
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:52 +0000 (12:21 +0100)]
When injecting an exception into L2 guest,
inject a #VMEXIT if L1 guest intercepts the exception
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:49 +0000 (12:21 +0100)]
Allow paged real mode during vmrun emulation.
Emulate cr0 and cr4 when guest does not intercept them.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:46 +0000 (12:21 +0100)]
Nested Virtualization core implementation
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:44 +0000 (12:21 +0100)]
add nestedhvm function hooks for svm/vmx specific code
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:41 +0000 (12:21 +0100)]
Data structures for Nested Virtualization
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
cegger [Mon, 28 Feb 2011 11:21:38 +0000 (12:21 +0100)]
tools: Add nestedhvm guest config option
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Acked-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
Allen Kay [Wed, 6 Apr 2011 08:11:02 +0000 (09:11 +0100)]
[VTD] Fixes to ACPI DMAR flag checks.
* platform_supports_{intremap,x2apic} should not be marked __init as
they are used during S3 resume.
* DMAR flags should be taken from the table passed to
acpi_parse_dmar() -- this is the trusted copy of the DMAR, when
running in TXT mode.
AMD64 defines two special bits (bit 3 and 4), RdMem and WrMem, in the fixed
MTRR type. Their values are supposed to be 0 after the BIOS hands
control to the OS, according to the AMD BKDG. Unless the OS specifically
turns them on, they are kept 0 all the time. As a result, k8_enable_fixed_iorrs()
is unnecessary and was removed from the upstream kernel (see
https://patchwork.kernel.org/patch/11425/). This patch does the same
thing.
x86, amd, MTRR: correct DramModEn bit of SYS_CFG MSR
Some buggy BIOS might set SYS_CFG DramModEn bit to 1, which can cause
unexpected behavior on AMD platforms. This patch clears DramModEn bit
if it is 1.
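The bit-clearing step can be sketched as a pure function on the MSR value. The bit position used here (bit 19) is an assumption to be checked against the AMD BKDG, and syscfg_clear_dram_mod_en() is a hypothetical helper; reading and writing the MSR itself is left out.

```c
#include <stdint.h>

/* Assumed position of the DramModEn bit in SYS_CFG; verify against the BKDG. */
#define SYSCFG_DRAM_MOD_EN (1ull << 19)

/*
 * Return the SYS_CFG value with DramModEn cleared.
 * *changed is set to 1 if the bit was set (a write-back is needed), else 0.
 */
uint64_t syscfg_clear_dram_mod_en(uint64_t syscfg, int *changed)
{
    *changed = (syscfg & SYSCFG_DRAM_MOD_EN) != 0;
    return syscfg & ~SYSCFG_DRAM_MOD_EN;
}
```

A caller would write the value back only when *changed is non-zero, so a correctly behaving BIOS costs no extra MSR write.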
Ian Campbell [Tue, 5 Apr 2011 17:23:54 +0000 (18:23 +0100)]
libxl: Drop unnecessary \n from log message
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 5 Apr 2011 17:17:55 +0000 (18:17 +0100)]
libxl: specify explicit disk image format to new qemu
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 5 Apr 2011 16:34:48 +0000 (17:34 +0100)]
libxl: specify disks using supported command line syntax for new qemu
The -hdX syntax is only retained for compatibility reasons and the
-sdX syntax doesn't even exist.
Additionally convert the first four non-SCSI disks to hd[a-d] and
ignore any further non-SCSI disks (since qemu only supports 4 IDE
devices).
SCSI disks are passed through as is. qemu-xen was limited to 7 SCSI
devices but upstream qemu supports 256, therefore do not limit the
number of disks on the libxl side.
qemu-xen did all this itself internally.
Fixes "qemu: -xvda: invalid option" and allows PVHVM to work with
upstream qemu.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
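The hd[a-d] mapping described above can be sketched as below. ide_disk_name() and MAX_IDE_DISKS are hypothetical names for illustration; the real libxl code builds qemu command-line arguments and is not reproduced here.

```c
#include <stdio.h>

/* qemu supports four IDE devices, per the commit message. */
#define MAX_IDE_DISKS 4

/*
 * Write the IDE name ("hda" to "hdd") for the idx-th non-SCSI disk into buf.
 * Returns 0 on success, or -1 if the disk is beyond the fourth and is ignored.
 */
int ide_disk_name(unsigned int idx, char *buf, size_t len)
{
    if (idx >= MAX_IDE_DISKS)
        return -1;
    snprintf(buf, len, "hd%c", 'a' + (int)idx);
    return 0;
}
```

SCSI disks would bypass this helper entirely, since they are passed through as is and not limited on the libxl side.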
Ian Campbell [Tue, 5 Apr 2011 16:27:49 +0000 (17:27 +0100)]
libxl: pass list of disks to libxl__build_device_model_args
Given that we have the information available this is preferable to
picking it out of xenstore instead. We already do this for VIFs.
Only the qemu upstream version makes use of it since old qemu-xen
actually parses xenstore itself.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 5 Apr 2011 16:23:51 +0000 (17:23 +0100)]
libxl: return raw disk and partition number from libxl__device_disk_dev_number
Optional parameters, caller to follow.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 5 Apr 2011 16:21:36 +0000 (17:21 +0100)]
libxl: explicitly set disk format in libxl__append_disk_list_of_type
Ideally we should be able to infer the format from something stashed
in xenstore but this is better than letting users see garbage values.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 5 Apr 2011 12:05:05 +0000 (13:05 +0100)]
passthrough: use domain pirq as index of struct hvm_irq_dpci's hvm_timer array
Since d->nr_pirqs is guaranteed to be not larger than nr_irqs,
indexing arrays by the former ought to be preferred. In the case
given, the indices so far had to be computed specially in a number of
cases, whereas the indexes used now are all readily available.
This opens the possibility to fold the ->mirq[] and ->hvm_timer[]
members of struct hvm_irq_dpci into a single array, possibly with some
members overlaid in a union to reduce size (see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02006.html).
Such space saving wouldn't, however, suffice to generally get the
respective allocation sizes here to below PAGE_SIZE, not even when
converting the array of structures into an array of pointers to
structures. Whether a multi-level lookup mechanism would make sense
here is questionable, as it can be expected that for other than Dom0
(which isn't hvm, and hence shouldn't use these data structures - see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02004.html)
only very few entries would commonly be used here. An obvious
alternative would be to use rb or radix trees (both currently only
used in tmem).
Jan Beulich [Tue, 5 Apr 2011 12:03:29 +0000 (13:03 +0100)]
x86: introduce alloc_vcpu_guest_context()
This is necessary because on x86-64 struct vcpu_guest_context is
larger than PAGE_SIZE, and hence not suitable for a general purpose
runtime allocation. On x86-32, FIX_PAE_HIGHMEM_* fixmap entries are
being re-used, while on x86-64 new per-CPU fixmap entries get
introduced. The implication of using per-CPU fixmaps is that these
allocations have to happen from non-preemptable hypercall context
(which they all do).
Jan Beulich [Tue, 5 Apr 2011 12:02:57 +0000 (13:02 +0100)]
x86: split struct domain
This is accomplished by converting a couple of embedded arrays (in one
case a structure containing an array) into separately allocated
pointers, and (just as for struct arch_vcpu in a prior patch)
overlaying some PV-only fields with HVM-only ones.
One particularly noteworthy change in the opposite direction is that
of PITState - this field so far lived in the HVM-only portion, but is
being used by PV guests too, and hence needed to be moved out of
struct hvm_domain.
The change to XENMEM_set_memory_map (and hence libxl__build_pre() and
the movement of the E820 related pieces to struct pv_domain) are
subject to a positive response to a query sent to xen-devel regarding
the need for this to happen for HVM guests (see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01848.html).
The protection of arch.hvm_domain.irq.dpci accesses by is_hvm_domain()
is subject to confirmation that the field is used for HVM guests only
(see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02004.html).
In the absence of any reply to these queries, and given the early
state of 4.2 development, I think it should be acceptable to take the
risk of having to later undo/redo some of this.
Jan Beulich [Tue, 5 Apr 2011 12:02:00 +0000 (13:02 +0100)]
x86: move pv-only members of struct vcpu to struct pv_vcpu
... thus further shrinking overall size of struct arch_vcpu.
This has a minor effect on XEN_DOMCTL_{get,set}_ext_vcpucontext - for
HVM guests, some meaningless fields will no longer get stored or
retrieved: reads will now return zero, and writes are required to be
(mostly) zero (the same as was already done on x86-32).
Jan Beulich [Tue, 5 Apr 2011 12:01:25 +0000 (13:01 +0100)]
x86: split struct vcpu
This is accomplished by splitting the guest_context member, which by
itself is larger than a page on x86-64. Quite a number of fields of
this structure are completely meaningless for HVM guests, and thus a
new struct pv_vcpu gets introduced, which is being overlaid with
struct hvm_vcpu in struct arch_vcpu. The one member that is mostly
responsible for the large size is trap_ctxt, which now gets allocated
separately (unless fitting on the same page as struct arch_vcpu, as is
currently the case for x86-32), and only for non-hvm, non-idle
domains.
This change pointed out a latent problem in arch_set_info_guest(),
which is permitted to be called on already initialized vCPU-s, but
so far copied the new state into struct arch_vcpu without (in this
case) actually going through all the necessary accounting/validation
steps. The logic gets changed so that the pieces that bypass accounting
will at least be verified to be no different from the currently active
bits, and the whole change will fail in case they are. The logic does
*not* get adjusted here to do full error recovery, that is, partially
modified state continues to not get unrolled in case of failure.
Jan Beulich [Tue, 5 Apr 2011 12:00:54 +0000 (13:00 +0100)]
Remove direct cpumask_t members from struct vcpu and struct domain
The CPU masks embedded in these structures prevent NR_CPUS-independent
sizing of these structures.
Basic concept (in xen/include/cpumask.h) taken from recent Linux.
For scalability purposes, many other uses of cpumask_t should be
replaced by cpumask_var_t, particularly local variables of functions.
This implies that no functions should have by-value cpumask_t
parameters, and that the whole old cpumask interface (cpus_...())
should go away in favor of the new (cpumask_...()) one.
Ian Jackson [Mon, 4 Apr 2011 13:54:46 +0000 (14:54 +0100)]
libxl: add CODING_STYLE
libxenlight and xl grew enough to need a CODING_STYLE, which I blatantly
copied from qemu and linux, adding just a few specific modifications.
The result should be as uncontroversial as possible, mostly
documenting what we are already doing.
[ Message and document originally posted to xen-devel on 2010-09-01 ]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Sat, 2 Apr 2011 14:58:54 +0000 (15:58 +0100)]
x86: cleanup bogus CONFIG_ACPI_PCI uses
We're building for one case (CONFIG_ACPI_PCI defined) only, yet still
had the other case's code in there. Additionally there was quite a bit
of pseudo-duplication between disabled(!) DMI scan and ACPI boot code.
acpi_pci_disabled had only a single reader, which is off by default
(i.e. must be enabled on the command line), so it seems pointless to
keep it.
Jan Beulich [Sat, 2 Apr 2011 14:58:22 +0000 (15:58 +0100)]
x86/ACPI: __init-annotate
xen/arch/x86/acpi/boot.c consists of almost only code/data in .init.*,
so move the few bits that aren't into a new file and then use the
recently introduced .init.o mechanism to move all the literal strings
into .init.rodata.
Jan Beulich [Sat, 2 Apr 2011 14:57:35 +0000 (15:57 +0100)]
amd-iommu: __init-annotate
Besides marking a few more items __init/__initdata, use the recently
introduced .init.o mechanism to move all the literal strings into
.init.rodata in those files that consist of only contributions to
.init.*.
Olaf Hering [Sat, 2 Apr 2011 14:50:19 +0000 (15:50 +0100)]
xentrace: correct formula to calculate t_info_pages
The current formula to calculate t_info_pages, based on the initial
code, is slightly incorrect. It may allocate more than needed.
Each cpu has some pages/mfns stored as uint32_t.
That list is stored with an offset at tinfo.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>