This patch introduces libxl_primary_console_exec: a new libxl function
that finds the domid and console number corresponding to the primary
console of a given VM. The domid might be different from the domid of
the VM and the console number might not be 0 when using stubdoms.
The caller (xl_cmdimpl.c in this case) has to make sure that the stubdom
is already created before calling libxl_primary_console_exec in the HVM
case. In the PV case libxl_primary_console_exec has to be called before
libxl_run_bootloader.
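The selection described above can be sketched as follows. This is a minimal illustration, not the libxl code: the function name primary_console, the ConsoleTarget type and the constant STUBDOM_PRIMARY_CONSOLE are hypothetical, and the value 1 is an assumption (the message only says the number "might not be 0").

```python
from typing import NamedTuple, Optional


class ConsoleTarget(NamedTuple):
    domid: int
    console_num: int


# Assumed value for illustration; the commit message only says "might not be 0".
STUBDOM_PRIMARY_CONSOLE = 1


def primary_console(domid: int, stubdom_domid: Optional[int]) -> ConsoleTarget:
    """Pick the (domid, console number) pair that carries the VM's primary console."""
    if stubdom_domid is not None:
        # With a stubdom, the console lives in the stubdom, not in the VM itself.
        return ConsoleTarget(stubdom_domid, STUBDOM_PRIMARY_CONSOLE)
    return ConsoleTarget(domid, 0)
```

This also shows why the stubdom has to exist before the call in the HVM case: without its domid there is nothing to attach to.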
Ian Jackson [Thu, 15 Jul 2010 17:18:16 +0000 (18:18 +0100)]
pygrub: look in every partition for something to boot
pygrub: look in every partition for something to boot, in case
the OS installer (SLES 10 sp1 in particular) forgets to mark the
boot partition as active.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> Acked-by: David Markey <admin@dmarkey.com>
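As an illustration of "look in every partition", the sketch below lists all non-empty primary partitions of an MBR instead of only the one flagged active. The function name candidate_offsets, the 512-byte sector size and the "active first" ordering are assumptions for this example, not taken from pygrub.

```python
import struct
from typing import List

SECTOR_SIZE = 512    # assumed sector size
TABLE_OFFSET = 446   # the MBR partition table starts at byte 446
ENTRY_SIZE = 16      # four 16-byte entries
ACTIVE_FLAG = 0x80


def candidate_offsets(mbr: bytes) -> List[int]:
    """Return byte offsets of every non-empty primary partition, active ones first."""
    active, inactive = [], []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = mbr[start:start + ENTRY_SIZE]
        flag, ptype = entry[0], entry[4]
        (lba_start,) = struct.unpack_from("<I", entry, 8)
        if ptype == 0:
            continue  # unused table slot
        bucket = active if flag == ACTIVE_FLAG else inactive
        bucket.append(lba_start * SECTOR_SIZE)
    # Partitions without the active flag are still returned, just later.
    return active + inactive
```

A caller would then try each offset in turn until one yields a bootable configuration.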
Ian Jackson [Thu, 15 Jul 2010 15:32:50 +0000 (16:32 +0100)]
xm: Do not check path of kernel if bootloader is specified
When creating a DomU with a bootloader specified, 'kernel/ramdisk' are
used by the bootloader when it boots the DomU, so there is no need to
check whether the path exists.
Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
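The intended behaviour might look roughly like this. The function name check_kernel_path and the dict-based config are hypothetical stand-ins for the xm config handling; only the "bootloader" and "kernel" keys come from the message above.

```python
import os


def check_kernel_path(config: dict) -> None:
    """Validate the kernel path only when no bootloader will interpret it."""
    if config.get("bootloader"):
        # The bootloader resolves kernel/ramdisk itself, so the path need not
        # exist at this point.
        return
    kernel = config.get("kernel")
    if not kernel or not os.path.isfile(kernel):
        raise ValueError("kernel image not found: %r" % (kernel,))
```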
Ian Jackson [Thu, 15 Jul 2010 15:30:24 +0000 (16:30 +0100)]
gdbsx: update README and remove space in q packet
Newer versions of gdb, version 7*, seem to have a bug where they do not
parse the thread list from gdbsx properly. Getting rid of the space in
the thread list works around it. This is also OK with older gdb.
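For illustration, a thread-list reply without embedded spaces could be built as below. This is a sketch assuming the usual gdb remote protocol shape for thread-info replies ("m" followed by comma-separated hex thread ids, "l" for end of list); the function name thread_list_reply is hypothetical and this is not the gdbsx code.

```python
def thread_list_reply(thread_ids):
    """Build a thread-info reply payload with no spaces between fields."""
    if not thread_ids:
        return "l"  # end of list
    # Ids are hex, comma-separated, and follow the 'm' directly.
    return "m" + ",".join(format(tid, "x") for tid in thread_ids)
```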
Ian Jackson [Wed, 14 Jul 2010 15:45:38 +0000 (16:45 +0100)]
libxl, xl: support running bootloader (e.g. pygrub) in domain 0
Much of the bootloader interaction (including the Solaris and NetBSD
portability bits) is translated pretty much directly from the python
in tools/python/xen/xend/XendBootloader.py.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Wed, 14 Jul 2010 15:44:18 +0000 (16:44 +0100)]
libxl: add function to attach/detach a disk to/from the local VM
Useful if you need to read a guest filesystem (e.g. pygrub).
I'm not overly thrilled with the implementation WRT tap interfaces,
particularly WRT detach. I was unable to find a way to get at the
parameters necessary to call tap_ctl_destroy, so I assumed for now that
it is OK to assume that the tap device is going to be wanted for the
actual domain at some point in the immediate future and hence there is
no pressing need to destroy it.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Wed, 14 Jul 2010 15:43:49 +0000 (16:43 +0100)]
libxl: support mapping files rather than carrying paths around
This will allow us to map and then unlink the file and therefore
delete the file on process exit or explicit unmap.
Using the mmapped versions of these files required rewriting build_pv
to use the xc_dom builder functionality directly rather than through
the xc_linux_build "compatibility layer". (The status of the
xc_linux_build interface as a compatibility layer seems a bit dubious
since all existing callers use it but if anything is going to replace
it then libxl seems like the likely candidate).
I'm not thrilled with the definition of the maps lifecycle. This could
be solved by adding a helper function to explicitly free the toplevel
structure.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
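The map-then-unlink idea can be demonstrated on any POSIX system. The sketch below is illustrative only (the function name map_and_unlink is hypothetical and this is Python rather than the libxl C code): once mapped, the file's name can be removed, and the data stays readable until the mapping is closed or the process exits.

```python
import mmap
import os
import tempfile


def map_and_unlink(path: str) -> mmap.mmap:
    """Map a file read-only, then unlink it; the data lives as long as the mapping."""
    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Removing the name does not invalidate the mapping. The storage is
    # reclaimed once the mapping is closed or the process exits.
    os.unlink(path)
    return mapping
```

The caller owns the returned mapping and is responsible for closing it, which is the lifecycle question raised in the message above.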
Ian Jackson [Wed, 14 Jul 2010 15:40:33 +0000 (16:40 +0100)]
libxl, xl: exec xenconsole in current process, defer decision to fork to caller
Use this to run xenconsole as the foreground process and move the
connection to the console in the "create -c" case early enough to be
able to view output from the bootloader. This behaviour is consistent
with how both "xm console" and "xm create -c" operate.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
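The division of labour might be sketched like this: the library-side function only execs, and the caller decides whether to fork first. The names console_exec, run_in_child and DEMO_ARGV are hypothetical, and a trivial Python child stands in for xenconsole.

```python
import os
import sys

# Stand-in for the real console client, used only for demonstration.
DEMO_ARGV = [sys.executable, "-c", "raise SystemExit(7)"]


def console_exec(argv):
    """Library side: replace the current process. Returns only by raising."""
    os.execvp(argv[0], argv)


def run_in_child(argv):
    """Caller side: fork first so that the parent survives the exec."""
    pid = os.fork()
    if pid == 0:
        try:
            console_exec(argv)
        finally:
            os._exit(127)  # reached only if the exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

A caller that wants the console in the foreground would call console_exec directly instead of going through run_in_child.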
Ian Jackson [Wed, 14 Jul 2010 15:36:23 +0000 (16:36 +0100)]
pygrub: introduce easier to parse output format
libxl would rather like to parse the output of pygrub. Rather than
implement an SXP parser in libxl, add an --output-format option to
pygrub which can select an alternative, simpler to parse,
format. Available formats are:
sxp: current SXP output format;
simple: simple key+value output with \n separating items (for
debugging). key and value are separated by a single
space (and key therefore cannot contain a space);
simple0: as simple but with \0 as a separator.
Also add --output-directory to allow temporary files to be placed
somewhere other than /var/run/xend/boot.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
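A consumer of the simple0 format could be as small as the sketch below. It follows the description above (items separated by \0, key and value separated by the first space); the function name parse_simple0 and the key names in the usage example are illustrative assumptions.

```python
def parse_simple0(data: bytes) -> dict:
    """Parse NUL-separated 'key value' items into a dict."""
    result = {}
    for item in data.split(b"\0"):
        if not item:
            continue  # skip the empty piece after a trailing separator
        # Only the first space separates key from value, so values may
        # themselves contain spaces.
        key, _, value = item.partition(b" ")
        result[key.decode()] = value.decode()
    return result
```

For example, parse_simple0(b"kernel /tmp/vmlinuz\0args root=/dev/xvda1 ro\0") yields a dict with the keys "kernel" and "args".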
Ian Jackson [Wed, 14 Jul 2010 15:30:42 +0000 (16:30 +0100)]
tools/misc/xenpm: provide core/package cstate residencies
According to Intel 64 and IA32 Architectures SDM 3B Appendix B, Intel
Nehalem/Westmere processors provide h/w MSR to report the core/package
cstate residencies. Extend sysctl_get_pmstat interface to pass the
core/package cstate residencies, and modify xenpm to output that
information.
Besides the .text space savings of over 2.5k on x86-64 (1.5k for
x86-32) this removes a load (plus a lea on x86-64) from various
frequently executed code paths, and finally provides a reason (other
than legibility) to prefer this_cpu() over per_cpu() in all places
where smp_processor_id() isn't being called anyway.
Print out the event log entry content for debug purposes.
Additionally, when the IOMMU resets the event log (due to event log
overflow), we should print out the event log content for debugging.
Signed-off-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Wei Wang <wei.wang2@amd.com>
According to Intel 64 and IA32 Architectures SDM 3B Appendix B, Intel
Nehalem/Westmere processors provide h/w MSR to report the core/package
cstate residencies. Extend sysctl_get_pmstat interface to pass the
core/package cstate residencies.
Eliminate redundant ones, fix names (where so far inappropriately
referring to capability structure fields they don't really relate to),
use symbolic names instead of raw numbers, and remove an unusable one.
This matches similar checks done in Linux, since no good can come from
a domain trying to enable both MSI and MSI-X on the same device at the
same time.
x86/cpufreq: pass pointers to cpu masks where possible
This includes replacing the bogus definition of cpumask_test_cpu()
(introduced by c/s 20073) with a Linux compatible one and replacing
the bad uses with cpu_isset().
x86 hvm: Add a new HVMOP to get the current Xen system time
Xen absolute system time, so that it can use SCHEDOP_poll in a
sensible fashion. HVM PV drivers can't use the normal PV clock
because they might have TSC offsets that they don't know about.
iommu: New options iommu=dom-strict and iommu=dom0-passthrough
The former strips dom0 of its usual 1:1 mapping of all memory, and
only provides it with mappings of its own memory, like any other
domain. The latter is a new consistent name for iommu=passthrough.
Currently "make stubdom" on its own fails because it depends on files
being installed by the results of "make tools". This also means that
in some circumstances a parallel "make tools stubdom" (or "make all")
can fail due to races. So make "make stubdom" depend on "make tools"
having completed first.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
The hardware CPUID-levelling features level the feature flags but
don't change the CPU family/model/stepping. Relax the HVM restore
check on family/model/stepping to printk but not veto the load, so
that VMs can be migrated between machines that have been
CPUID-levelled.
xen: allow HVM save/restore from different changesets
Allow HVM save/restore from different changesets of Xen. The HVM save
records are supposed to be backwards compatible; XenServer
live-migrates between versions of Xen during upgrades.
xen: make the shadow allocation hypercalls include the p2m memory
in the total shadow allocation. This makes the effect of allocation
changes consistent regardless of p2m activity on boot.
Otherwise vcpu_periodic_timer_work() can think the next timer is in
the future (and re-issue it unchanged) while timer_softirq_action()
thinks it's in the past (and fires it immediately), leading to
livelock.
rombios: move the stack to 0x9e000 and protect it with an e820 entry
so that we don't corrupt E820_RAM memory with stack ops in S3 wakeup.
It has to move up so the lowest contiguous RAM area is >= 512MiB.
This relies on the previous fix to let DS != SS.
Signed-off-by: Paul Durrant <Paul.Durrant@citrix.com> Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Ian Jackson [Tue, 6 Jul 2010 15:55:49 +0000 (16:55 +0100)]
tools/libxl: allow setting of timer_mode, hpet and vpt_align parameters
Implement parsing for timer_mode, hpet and vpt_align parameters.
These are all HVM only parameters and hpet/vpt_align are boolean, so
change types and place in hvm union accordingly. Also, HPET is x86 only
in principle, so make this compile-time conditional on arch, as
viridian already is.
Ian Jackson [Tue, 6 Jul 2010 12:10:14 +0000 (13:10 +0100)]
tools/hotplug: locking.sh script: fix lock directory remains on error bug
_release_lock should be used instead of release_lock.
sigerr is introduced so that it can be redefined by
xen-hotplug-common.sh to a version which writes error status to xenstore.
Ian Jackson [Tue, 6 Jul 2010 10:57:20 +0000 (11:57 +0100)]
tools/xenstore: add XS_RESTRICT operation to C xenstore client libs.
The OCaml xenstored supports the XS_RESTRICT operation, which
deprivileges a dom0 xenstore connection so it can only affect one
domain's entries. Add the relevant definitions to the C libraries
so that callers can use it.
This patch masks the PIC and IOAPIC RTEs before enabling x2APIC, then
unmasks and restores them afterwards. It also actually enables
interrupt remapping before enabling x2APIC instead of just checking
the interrupt remapping setting. This patch also handles all x2APIC
configurations, including BIOS settings and command line
settings. In particular, it handles the case where the BIOS hands over
in x2APIC mode (when there is an APIC ID > 255). It checks whether x2APIC
is already enabled by the BIOS. If so, it disables interrupt remapping
and queued invalidation first, then enables them again.
Signed-off-by: Weidong Han <weidong.han@intel.com>
x2APIC/VT-d: improve interrupt remapping and queued invalidation enabling and disabling
x2APIC depends on interrupt remapping, so interrupt remapping needs to
be enabled before x2APIC. Usually x2APIC is not yet enabled
(x2apic_enabled=0) when interrupt remapping is enabled, although x2APIC
will be enabled later. So the interrupt mode has to be passed as a
parameter to intremap_enable, instead of being derived from
x2apic_enabled. This patch adds a parameter "eim" to intremap_enable to
achieve this. Interrupt remapping and queued invalidation are already
enabled when x2APIC is enabled, so they need not be enabled again during
iommu setup. This patch checks whether interrupt remapping and queued
invalidation are already enabled, and does not enable them again if
so. Disabling is handled similarly: they are not disabled again if
already disabled.
Signed-off-by: Weidong Han <weidong.han@intel.com>
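The "enable only if not already enabled" behaviour, with the interrupt mode passed in explicitly, can be modelled with a toy state machine. Everything here is hypothetical (class RemapModel, the hw_writes counter, and enabling queued invalidation from within enable_intremap); it only illustrates the idempotence and the explicit eim argument.

```python
class RemapModel:
    """Toy model of one remapping unit; hw_writes counts simulated register writes."""

    def __init__(self):
        self.qinval = False
        self.intremap = False
        self.eim = False
        self.hw_writes = 0

    def enable_qinval(self):
        if self.qinval:
            return  # already on: do not touch the hardware again
        self.hw_writes += 1
        self.qinval = True

    def enable_intremap(self, eim):
        if self.intremap:
            return  # already on, e.g. enabled earlier for x2APIC
        self.enable_qinval()
        self.hw_writes += 1
        self.eim = eim  # mode chosen by the caller, not read from global state
        self.intremap = True

    def disable_intremap(self):
        if not self.intremap:
            return  # already off
        self.hw_writes += 1
        self.intremap = False
```

Calling enable_intremap(eim=True) early and again during the later setup pass leaves the unit untouched the second time.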
A drhd is created when the ACPI DMAR table is parsed, but drhd->iommu is
not allocated until iommu setup. However, the iommu is needed by x2APIC,
which enables interrupt remapping before iommu setup. This patch
allocates the iommu when the drhd is created. drhd->ecap can then be
removed because it is the same as iommu->ecap.
Signed-off-by: Weidong Han <weidong.han@intel.com>
VMX: fix ept pages free up when ept superpage split fails:
1) implement ept super page split in a recursive way to
form an ept sub tree before real installation;
2) free an ept sub tree also in a recursive way;
3) change ept_next_level last input parameter from shift
bits # to next walk level.
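The recursive build-then-install approach in points 1) and 2) can be sketched with a toy page-table model. This is not the Xen code: Allocator, split_superpage and free_subtree are hypothetical names, tables are Python lists, and leaf entries are plain frame numbers. The point is that a failed allocation frees the partial subtree, so nothing is installed half-built.

```python
ENTRIES = 512  # entries per table (9 address bits per level)


class OutOfMemory(Exception):
    pass


class Allocator:
    def __init__(self, budget):
        self.budget = budget  # tables that may still be allocated
        self.live = 0         # tables currently allocated

    def alloc_table(self):
        if self.budget == 0:
            raise OutOfMemory()
        self.budget -= 1
        self.live += 1
        return [None] * ENTRIES

    def free_table(self, table):
        self.live -= 1


def free_subtree(alloc, node):
    """Free a (possibly partial) subtree; leaf frame numbers and None are skipped."""
    if isinstance(node, list):
        for child in node:
            free_subtree(alloc, child)
        alloc.free_table(node)


def split_superpage(alloc, base_mfn, level, target):
    """Build the subtree replacing a level-`level` superpage, split down to `target`."""
    if level == target:
        return base_mfn  # leaf entry at the requested level
    table = alloc.alloc_table()
    step = ENTRIES ** (level - 1)  # frames covered by each child entry
    try:
        for i in range(ENTRIES):
            table[i] = split_superpage(alloc, base_mfn + i * step, level - 1, target)
    except OutOfMemory:
        free_subtree(alloc, table)  # undo everything built under this table
        raise
    return table
```

Only after split_superpage returns successfully would the caller swap the new subtree in for the superpage entry.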
This patch enables the AMD OSVW (OS Visible Workaround) feature for
Xen. New AMD errata will have an OSVW id assigned in the future. The OS
is supposed to check the OSVW status MSR to find out whether the CPU has
a specific erratum. Legacy errata are also supported in this patch:
the traditional family/model/stepping approach will be used if the OSVW
feature isn't applicable. This patch is adapted from Hans Rosenfeld's
patch submitted to the Linux kernel.
Signed-off-by: Wei Huang <wei.huang2@amd.com> Signed-off-by: Hans Rosenfeld <hands.rosenfeld@amd.com> Acked-by: Jan Beulich <jbeulich@novell.com>
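The decision described above might be sketched as follows. The function name erratum_present and its parameters are hypothetical, the MSR values are passed in rather than read from hardware, and the status is simplified to a single integer bitmask; treat the bit semantics as an assumption to be checked against the AMD documentation.

```python
def erratum_present(osvw_id, osvw_length, osvw_status, legacy_check):
    """Decide whether an erratum applies, preferring OSVW over family/model/stepping.

    osvw_length is None when the CPU does not advertise OSVW at all.
    legacy_check is a callable implementing the family/model/stepping test.
    """
    if osvw_length is not None and osvw_id < osvw_length:
        # The id is covered by OSVW: its status bit is authoritative.
        return bool((osvw_status >> osvw_id) & 1)
    # OSVW not applicable for this id: fall back to the traditional check.
    return legacy_check()
```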
blktap2: make protocol specific usage of shared sring explicit
I don't think protocol specific data really belongs in this header
but since it is already there and we seem to be stuck with it let's at
least make the users explicit lest people get caught out by future new
fields moving the pad field around.
This is the Xen portion of this change. The kernel portion will be
sent separately. There is no dependency between the two.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Daniel Stodden <daniel.stodden@citrix.com> Cc: Dongxiao Xu <dongxiao.xu@intel.com>
After getting a report of 3.2.3's xenmon crashing Xen (as it turned
out this was because c/s 17000 was backported to that tree without
also applying c/s 17515), I figured that the hypervisor shouldn't rely
on any specific state of the actual trace buffer (as it is shared
writable with Dom0).
[GWD: Volatile qualifiers have been taken out and moved to another
patch]
To make clear what purpose specific variables have and/or where they
got loaded from, the patch also changes the type of some of them to be
explicitly u32/s32, and removes pointless assertions (like checking an
unsigned variable to be >= 0).
I also took the prototype adjustment of __trace_var() as an
opportunity to simplify the TRACE_xD() macros. Similar simplification
could be done on the (quite numerous) direct callers of the function.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>