CPUIDLE: Initialize timer broadcast mechanism for C2
Without this patch, on platforms where the deepest C-state is C2, the
acpi_processor_idle functions will call through a NULL function
pointer. This has been the case since 18518:e61c7833dc9d8.
hvm: Default timer_mode=1 (do not delay virtual time for missed
ticks). Most guests prefer this mode compared with screwing with
progress of virtual time.
X86 and IA64: Rebase cpufreq logic for supporting both x86 and ia64
arch
Rebase cpufreq logic for supporting both x86 and ia64 arch:
1. move cpufreq arch-independent logic into common dir
(xen/drivers/acpi and xen/drivers/cpufreq dir);
2. leave cpufreq x86-dependent logic at xen/arch/x86/acpi/cpufreq dir;
Signed-off-by: Yu, Ke <ke.yu@intel.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
vtd: Fix check for interrupt remapping of ioapic RTE
For IOAPIC interrupt remapping, only the IOAPIC RTEs need to be
remapped; the other IOAPIC registers (IOAPIC ID, VERSION and
Arbitration ID) should not be. This patch adds a check for this and
remaps only the IOAPIC RTEs.
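As a rough illustration of the check described above, the following
Python sketch classifies an IOAPIC register index. The register numbers
follow the standard IOAPIC layout (ID at 0x00, VERSION at 0x01,
Arbitration ID at 0x02, redirection table from 0x10 with two 32-bit
registers per RTE). The helper names are hypothetical and are not taken
from the Xen source.

```python
# Standard IOAPIC indirect register indexes.
IOAPIC_REG_ID = 0x00
IOAPIC_REG_VERSION = 0x01
IOAPIC_REG_ARB_ID = 0x02
IOAPIC_REG_RTE_BASE = 0x10  # each RTE occupies two consecutive 32-bit registers


def is_rte_register(reg):
    """Return True if `reg` addresses a redirection table entry (hypothetical helper)."""
    return reg >= IOAPIC_REG_RTE_BASE


def rte_index(reg):
    """Return the RTE number addressed by `reg`; only valid for RTE registers."""
    if not is_rte_register(reg):
        raise ValueError("register 0x%02x is not an RTE register" % reg)
    return (reg - IOAPIC_REG_RTE_BASE) // 2
```

Only accesses for which is_rte_register() is true would be candidates
for remapping; accesses to the ID, VERSION and Arbitration ID registers
would be passed through untouched.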
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
In xm start the --vncviewer option has no effect; instead, -c tries to
both connect to the console and start vncviewer. Additionally, to
start vncviewer it uses the domid variable, which is only defined a few
lines later. Thus xm start -c doesn't work at all.
guest_physmap_add_entry() checks to see if the given mfn and gpfn
range in the p2m and m2p tables is already mapped before overwriting
the maps, and attempts to do something reasonable so that we don't
have any "dangling" pointers.
Unfortunately, these checks got broken when the page_order argument
was added. Each individual p2m and m2p entry needs to be checked, not
just the first page in a page order.
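A minimal sketch of the per-entry check, assuming plain Python dicts
stand in for the p2m and m2p tables. The function name and return
convention are hypothetical; this only illustrates iterating over every
page in the order rather than checking the first page alone.

```python
def stale_entries(p2m, m2p, gpfn, mfn, page_order):
    """Return (gpfns, mfns) in the range that are already mapped elsewhere.

    p2m and m2p are plain dicts standing in for the real tables.
    """
    stale_gpfns, stale_mfns = [], []
    for i in range(1 << page_order):  # every page, not just i == 0
        # An unmapped entry (missing key) is not stale.
        if p2m.get(gpfn + i, mfn + i) != mfn + i:
            stale_gpfns.append(gpfn + i)
        if m2p.get(mfn + i, gpfn + i) != gpfn + i:
            stale_mfns.append(mfn + i)
    return stale_gpfns, stale_mfns
```

With page_order = 2, a conflicting entry at gpfn + 2 is reported, whereas
a check of the first page alone would miss it.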
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Since Linux started to use one of the 3 low available bits, _PAGE_IO
needed to be moved to a different one. Not remembering about
_PAGE_GNTTAB in debug hypervisors, I ended up assigning it to the same
bit, which made the kernel fail on the debug hypervisor. However,
rather than fixing the kernel it seems more appropriate for the
hypervisor to stay away from these bits, not the least because its
definition was anyway accompanied by a warning that this may be
incompatible with certain OSes.
While obviously the hypervisor has to use some bit (and it's therefore
unavoidable that there's some risk of collision), using one of the
high available bits seems to be the better choice over using one of
the three low ones. Since in 32-bit mode these bits are reserved, the
patch disables the functionality here. The only reasonable alternative
I would see is to disable the functionality by default, but add a
command line option to specify which bit to use.
XENLOG_G_* should not be used in invocations of gdprintk().
Also change the wording in a few places and consistently print the
target domain. It remains questionable whether the code should be this
verbose in the first place, especially now that MSI is on by default.
ia64: fix make install under tools/debugger/xenitp
This patch fixes an error with make install under
tools/debugger/xenitp by checking whether
the variable is a zero-length string.
XEN_DOMCTL_setvcpucontext, XEN_DOMCTL_max_vcpus, and
XEN_DOMCTL_setdebugging don't seem to allow Dom0 as the subject domain
(based on the criteria that they pause that domain in order to do
their job).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
The major issue with supporting a significantly larger number of
physical CPUs appears to be the use of per-CPU GDT entries - at
present, x86-64 could support only up to 126 CPUs (with code changes
to also use the top-most GDT page, that would be 254). Instead of
trying to go with incremental steps here, by converting the GDT itself
to be per-CPU, limitations in that respect go away entirely.
ACPI C2 may well be mapped to CPU C3 or a deeper state, so, assuming
the worst case, enable C3-like entry/exit handling for C2 by default.
The option 'lapic_timer_c2_ok' can be used to select simple C2
entry/exit, but only if the user is sure that the LAPIC timer and TSC
will not stop during C2.
There may be multiple ACPI C3 states reported by the BIOS. Those C3
states may differ in latency and power, so this patch modifies the
code to support this case.
xm list: Return unique exit code for non-existent domain
This patch makes xm return an exit code of 3, rather than the generic
code of 1, if `xm list <non_existant_domain>` is run. I used 3 because
XendClient already had a macro mapping ERROR_INVALID_DOMAIN to 3.
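The behaviour described above can be sketched as follows. Only the
value 3 for ERROR_INVALID_DOMAIN and the generic code 1 come from the
text; the exception class, the helper functions and the dict of known
domains are hypothetical stand-ins, not the xm implementation.

```python
import sys

ERROR_INVALID_DOMAIN = 3  # value quoted in the text above
ERROR_GENERIC = 1


class InvalidDomainError(Exception):
    """Hypothetical exception for a lookup of a domain that does not exist."""


def list_domain(name, known_domains):
    if name not in known_domains:
        raise InvalidDomainError(name)
    return known_domains[name]


def main(argv, known_domains):
    """Return the process exit code for a hypothetical `list <domain>` call."""
    try:
        info = list_domain(argv[1], known_domains)
    except InvalidDomainError as err:
        sys.stderr.write("Error: domain '%s' does not exist\n" % err)
        return ERROR_INVALID_DOMAIN
    except Exception:
        return ERROR_GENERIC
    print(info)
    return 0
```

A caller script can then distinguish "no such domain" (3) from any other
failure (1) by inspecting the exit status.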
x86, amd, hvm: pass through one more cpuid cache description leaf
Add a missing CPUID leaf that contains AMD-specific cache info.
Without this, Windows can spin trying to prefetch memory buffers using
a stride length of zero.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
x86: Allow continue_hypercall_on_cpu() to be called from within an
existing continuation handler. This fix is needed for the new method
of microcode re-programming.
Since 21dd1fdb73d8 there have been (as far as I can see) three
separate mechanisms for achieving a VNC display:
1. xm spawns vncviewer after getting vnc display info
from qemu-dm via xenstore (introduced in 21dd1fdb73d8)
2. xm spawns vncviewer -listen and qemu-dm connects to it
3. qemu-dm spawns vncviewer (!)
The latter two are rather strange - No.3 is very strange indeed.
So I decided that rather than try to get No.2 or No.3 on track for
going into qemu upstream, No.2 and No.3 would be dropped.
After discussion on xen-devel the mechanism No.1 was introduced,
above.
No.1 is controlled by the --spawn-vncviewer (and --vncviewer-autopass)
command line options to xm, by analogy with the -c option.
Nos.2 and 3 are controlled by elements of the domain configuration
file - and their code still remains. So if you turn all of the vnc
options on you can get several vncviewers (although only one of them
will work).
This patch removes the support for the passive connection mode No.2.
After all, ioemu-remote will never connect to such a vncviewer.
The options to engage this functionality were already removed from
the example config files by Keir in 18241:bf4ef45e6a38.
x86-64: enforce memory limits imposed by virtual memory layout
... which currently means:
- The 1:1 map cannot deal with more than 1Tb.
- The m2p table can handle at most 8Tb.
- The page_info array can cover up to e.g. 1.6Gb (<= 64 CPUs) or
1Tb (193-256 CPUs).
x86: Simplify RDMSR pass-through emulation for certain
explicitly-named MSRs (but keep the names in the source code in case
we tighten up RDMSR emulation later).
Also add MSR_AMD_PATCHLEVEL MSR as explicitly required (for Solaris).
Replace the stubdom/ioemu link farm creation in stubdom/Makefile,
with code which arranges that:
* No symlinks are made for output files - in particular, any
symlinks for .d files would be written through by the compiler
and cause damage to the original tree and other strange
behaviours
* All subdirectories are made as local subdirectories rather than
links
* Any interrupted or half-completed creation of the link farm
leaves the directory in a state where the link farming will be
restarted
* We use make's inherent ability to test for the existence of files
rather than using [ -f ... ] at the start of the rule's commands
* The list of files to be excluded from the link farm can be
easily updated
etc.
This should fix some problems particularly with parallel builds,
or by-hand builds where directories are entered in other than the
usual order.
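The properties listed above can be illustrated with a short Python
sketch. This is not the Makefile code from the patch; the function
name, the default exclusion list and the assumption that the
destination lies outside the source tree are all hypothetical.

```python
import os


def make_link_farm(src, dst, exclude_suffixes=(".o", ".d", ".a")):
    """Mirror `src` into `dst`: real directories, symlinks for source files only.

    Assumes `dst` is not located inside `src`.
    """
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)  # local directory, never a link
        for name in filenames:
            if name.endswith(exclude_suffixes):
                continue  # no symlinks for output files
            link = os.path.join(target_dir, name)
            if not os.path.lexists(link):  # safe to re-run after an interruption
                os.symlink(os.path.abspath(os.path.join(dirpath, name)), link)
```

Because existing links are skipped and directories are created with
exist_ok, re-running the function after an interrupted run simply fills
in whatever is missing.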
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Avoid parallel invocation of git for ioemu-remote.
The stubdom and tools directories both run `make ioemu-dir-find' in
tools. In a parallel build, both these invocations can run
concurrently because we're doing recursive make.
This change fixes this problem by adding suitable dependencies in
the top-level Makefile for the recursion into tools/ and stubdom/,
ensuring that the git fetch happens once, first.
The bug was introduced in 18472/18474.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
- This minor patch implements the missing stub function
security_label_to_details in the dummy module. This stub function is
necessary to create domains with network interfaces for modules that
do not implement the security_label_to_details function.
Signed-off-by: George Coker <gscoker@alpha.ncsc.mil>
x86, shadow: Allow removing writable mappings from splintered page tables.
The moving of the pagetable mapping in the Linux kernel exposed the
fact that under the Linux kernel sh_rm_write_access_from_sl1p was
always failing.
Linux seems to use big pages to access page tables, so we should
instruct the shadow code to be able to remove writable mappings from
splintered pagetables as well, avoiding the use of OS heuristics (which
were failing in 2.6.27 before George's patch, leading to a brute-force
search at each resync).
$(XEN_ROOT) absolutification fixes for ioemu-remote (incl stubdom)
* Move code for generating an absolute version of XEN_ROOT
into a common make variable set in Config.mk
* Use this common code when invoking make -C ioemu-dir clean
from tools/, which avoids a problem where `make clean' fails
because qemu's (ioemu-remote's) build system wants to run
`make clean' in `tests' but XEN_ROOT is a confection involving
../'s.
* Use this common code in stubdom/Makefile, instead of $(abspath...)
as the latter is a relatively new feature in GNU make and is not
available in all the places that we want to be able to build
(cf c/s 17997:3f23e01d31985899dbd1660b166f229f1ee74292)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
- This patch corrects an unsafe/incorrect usage of FOREIGNDOM. The
value of FOREIGNDOM is now passed through the XSM interface.
Corresponding updates to the Flask module are included in this patch.
- This patch also includes a minor header update to allow the Flask
module to compile after recent updates to Xen.
Signed-off-by: George Coker <gscoker@alpha.ncsc.mil>
CPUIDLE: Port Linux menu governor to replace the initial ladder governor
The ladder governor suffers from long promotion/demotion delays when
applied to tickless mode, because it needs to count usage. The menu
governor chooses the next state simply via break-event prediction,
taking into account factors such as the next timer event and the last
residency time, so it should respond faster.
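A much simplified sketch of menu-style selection, for illustration
only. The prediction rule used here (the minimum of the time to the
next timer event and the last residency) and the state table layout are
assumptions made for the example; the real governor's prediction is
more involved.

```python
def menu_select(states, next_timer_us, last_residency_us):
    """Pick the deepest state whose costs fit the predicted idle period.

    states: list of (name, exit_latency_us, target_residency_us),
    ordered from shallowest to deepest.
    """
    # Assumed prediction: idle no longer than the next timer event, and
    # no longer than the last observed residency.
    predicted_us = min(next_timer_us, last_residency_us)
    chosen = states[0][0]  # the shallowest state is always allowed
    for name, exit_latency_us, target_residency_us in states:
        if target_residency_us > predicted_us:
            break
        if exit_latency_us > predicted_us:
            break
        chosen = name
    return chosen
```

Unlike a ladder, which moves one step at a time based on usage counts,
this kind of selection can jump straight to the deepest suitable state
on each idle entry.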
CPUIDLE: Avoid remnant LAPIC timer intr while force hpetbroadcast
The LAPIC timer stops during C3 and resumes after exit from C3.
Consider the case below:
The LAPIC timer was programmed to expire after 1000us, but the CPU
enters C3 after 100us and exits C3 at 9xxus.
0us: reprogram_timer(1000us)
100us: enter C3, LAPIC timer stops
9xxus: exit C3 due to unexpected event, LAPIC timer continues running
10xxus: reprogram_timer(1000us), fails because the expiry time is in the past.
......: no timer softirq raised, no change to LAPIC timer.
......: if C3 is entered again, HPET will be forcibly reprogrammed to
now+small_slop.
......: if C2 is entered, no change to LAPIC.
18xxus: LAPIC timer expires unexpectedly if there are no C3 entries after
10xxus.
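The 18xxus figure in the timeline follows from simple arithmetic, which
the sketch below works through. It assumes the countdown is frozen
while the CPU is in C3 and picks 950us as a concrete stand-in for
"9xxus"; the function name is hypothetical.

```python
def lapic_expiry_after_c3(programmed_at_us, timeout_us, c3_entry_us, c3_exit_us):
    """Time at which a one-shot countdown fires if it is frozen while in C3."""
    elapsed_before_c3 = c3_entry_us - programmed_at_us
    remaining_us = timeout_us - elapsed_before_c3
    return c3_exit_us + remaining_us


# Programmed at 0us for 1000us, C3 entered at 100us, C3 exited at 950us:
# 900us of countdown remain, so the timer fires at 950 + 900 = 1850us.
stale_expiry_us = lapic_expiry_after_c3(0, 1000, 100, 950)
```

The stale expiry lands well after the originally intended 1000us
deadline, which is the remnant interrupt the patch sets out to avoid.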
CPUIDLE: Avoid remnant HPET intr while force hpetbroadcast
Exit from C3 is mainly caused by the HPET intr if hpetbroadcast is
force-enabled, but it may still be caused by other unexpected
events. In this case the HPET timer may still be alive while there is
no CPU in C3. Avoiding those remnant HPET intrs saves CPU handling
time and increases idle time.
vtd: Add a command line param to enable/disable pass-through feature
Taking security into account, it's not suitable to bypass VT-d
translation for Dom0 by default when the pass-through field in the
extended capability register is set. This feature is for people/usages
who are not overly worried about security/isolation, but want better
performance.
This patch adds a command line param that controls whether it's
enabled or disabled.
Signed-off-by: Weidong Han <weidong.han@intel.com>
x86, xend: Fix processing of cpuid config parameters
There is a Python indentation issue keeping the full range of syntax
for the cpuid config file parameter from working correctly. This
patch fixes that. It also fixes some misspellings and a missing 'x' in
two of the example config files (must have 32 bits represented for
cpuid registers).
This small patch fixes an issue leading to a crash (segfault, although
with earlier changesets I was seeing sigbus - not sure what changed)
in qemu-dm when the following conditions occur:
1. A valid mapping for a bucket on a low address exists
2. Immediately after accessing memory mapped in this bucket, an access
occurs to a high (beyond assigned ram) address beyond the 1GB limit
for 32bit map cache wrapping around to the previous bucket's entry
number.
3. The next call to map cache again accesses the low address.
In this scenario, the guest mem for the low bucket has been unmapped
by the remap_bucket caused by 2., but because the valid_mapping
bit-test fails, map_cache returns before last_address_index has been
updated. The subsequent call to map_cache therefore never remaps the
low, valid bucket and instead returns a vaddr pointing to memory that
has failed to get mapped.
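The failure sequence can be reproduced with a toy model of a
direct-mapped bucket cache with a one-entry fast path. Everything below
is a hypothetical simplification: bucket indexes stand in for
addresses, and the reset_on_failure switch shows one plausible way to
avoid the stale fast path (forgetting the cached index when a mapping
fails), which is an assumption and not necessarily what the patch does.

```python
class MapCache:
    """Toy direct-mapped bucket cache with a one-entry fast path."""

    def __init__(self, nr_buckets, ram_buckets, reset_on_failure):
        self.nr_buckets = nr_buckets
        self.ram_buckets = ram_buckets            # bucket indexes >= this cannot be mapped
        self.reset_on_failure = reset_on_failure  # hypothetical switch for the fix
        self.slots = [None] * nr_buckets          # bucket index mapped in each slot
        self.last_index = None                    # stands in for last_address_index

    def is_live(self, index):
        return self.slots[index % self.nr_buckets] == index

    def _remap(self, slot, index):
        # Remapping drops the old mapping first; mapping beyond RAM fails.
        self.slots[slot] = index if index < self.ram_buckets else None

    def map(self, index):
        """Return the bucket index whose mapping the caller will use, or None."""
        if index == self.last_index:
            return index                          # fast path: no validity check
        slot = index % self.nr_buckets
        if not self.is_live(index):
            self._remap(slot, index)
        if not self.is_live(index):
            if self.reset_on_failure:
                self.last_index = None            # forget the stale fast-path entry
            return None
        self.last_index = index
        return index
```

With four buckets and two buckets of RAM, the sequence map(0), map(4),
map(0) leaves the unfixed model handing back bucket 0 through the fast
path even though its slot was unmapped by the failed map(4); with the
reset enabled, the third call takes the slow path and remaps bucket 0.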