Fix injection of guest faults resulting from failed injection of a
previous event. We enter an infinite loop if the original failed
injection cannot be fixed up by Xen (e.g., because it's not a shadow
pagetable issue).
The RHEL4 HVM guest hang issue was actually a side effect of
changeset 9699. In the RHEL4 guest hang, the rc.sysinit init script
calls the kmodule program to probe the hardware. kmodule uses the kudzu
library call probeDevices(). To probe the graphics hardware, the
vbe_get_mode_info() function sets up the environment and enters
vm86 mode to make the int 0x10 call. To return to protected mode
it sets up an int 0xff call. At the time int 0xff was called, the
guest process pages had not yet been populated, which caused an infinite
loop of vmexits with IDT_VECTORING_INFO set on the int 0xff instruction.
The reason for the infinite loop is changeset 9699. With that change,
the guest page fault was always overridden by the int 0xff GP
fault coming from IDT_VECTORING_INFO. With the attached patch, if the VMM
is injecting an exception such as a page fault or GP fault, the
IDT_VECTORING_INFO field does not override it, which breaks the
infinite vmexit loop for RHEL4.
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Edwin Zhai <edwin.zhai@intel.com>
xen-unstable changeset: 9945:0c586a81d941ab0a18aecca87cffe1500a9185c5
xen-unstable date: Fri May 5 14:05:31 2006 +0100
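The priority rule described in this entry can be sketched as follows. This is a simplified Python model, not the Xen code: the function choose_event and its parameters vmm_pending and idt_vectoring are hypothetical names introduced only for illustration.

```python
from typing import Optional


def choose_event(vmm_pending: Optional[str],
                 idt_vectoring: Optional[str]) -> Optional[str]:
    """Pick the event to inject on the next VM entry (illustrative model).

    vmm_pending   -- exception the VMM itself wants to inject, or None
    idt_vectoring -- event whose delivery was cut short by the last vmexit,
                     or None
    """
    # An exception raised by the VMM (page fault, GP fault) takes priority.
    # Replaying the interrupted event first would fault again and loop.
    if vmm_pending is not None:
        return vmm_pending
    # Otherwise replay the event that could not be delivered.
    return idt_vectoring
```

In this model the guest page fault is delivered first, so the guest can populate the missing pages before int 0xff is retried.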
Fix python pciif script to reference correct 2.0 compatibility variable.
In the Xen 2.0.x compatibility section of xend (where we try to parse
the s-expressions if they came from an SXP configuration file for Xen
2.0.x), the wrong variable is referenced. This fix corrects the python
script to use the correct variable.
Thanks to Mike Wright for reporting this.
Signed-off-by: Ryan Wilson <hap9@epoch.ncsc.mil>
xen-unstable changeset: 9944:7801e09f518cfdf566a405bce2c3f41553e35218
xen-unstable date: Fri May 5 14:01:43 2006 +0100
SVM patch for 64bit hv, to reset the ss, es, ds host selectors to NULL
during a context switch to the SVM domain's vcpu. This patch also
initializes the tlb_control to 1 for the initial do_launch().
Signed-off-by: Tom Woller <thomas.woller@amd.com>
xen-unstable changeset: 9935:8761333499ae2874647eb5d67d8cb091fbc5b14b
xen-unstable date: Thu May 4 21:24:39 2006 +0100
SVM patch to add BP exception intercept support.
Signed-off-by: Tom Woller <thomas.woller@amd.com>
xen-unstable changeset: 9635:b77ebfaa72b200af0cdfc38dd8f7dbe274e5e386
xen-unstable date: Thu Apr 13 11:08:20 2006 +0100
SVM patch to cleanup the host save area allocation and deallocation,
including removing memory leaks concerning these areas. Also fixes
problem where the HSA MSR was not initialized properly for cores>0.
Signed-off-by: Tom Woller <thomas.woller@amd.com>
xen-unstable changeset: 9922:e1a47a2696004087852cb9f2e09fe4eb8ad1b928
xen-unstable date: Thu May 4 11:14:45 2006 +0100
Fix xenbus userspace device transaction tracking.
If a transaction end command fails, the semaphore which keeps track
of whether we're in a transaction or not was not getting updated.
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
xen-unstable changeset: 9921:bbce4d11518910328380f6a3325268acfa5b3aff
xen-unstable date: Thu May 4 10:25:27 2006 +0100
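A minimal sketch of the idea behind this fix, assuming a simplified model of the userspace device: the class TransactionTracker and the callable send_end_command are hypothetical and do not correspond to names in the xenbus driver. The point shown is that the tracking state is updated even when the end command fails.

```python
import threading


class TransactionTracker:
    """Tracks whether a transaction is in progress (illustrative model)."""

    def __init__(self):
        self._in_transaction = threading.Semaphore(1)
        self.active = False

    def start(self):
        # Blocks while another transaction is still being tracked.
        self._in_transaction.acquire()
        self.active = True

    def end(self, send_end_command):
        try:
            send_end_command()
        finally:
            # Update the tracking state whether or not the command failed,
            # so a failed end does not leave us stuck "in a transaction".
            self.active = False
            self._in_transaction.release()
```

Without the finally clause, a failing end command in this model would leave the semaphore held and the next start() would block.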
Balloon driver should hijack the ->lru list field rather than
adding another list field to every page structure.
Signed-off-by: Keir Fraser <keir@xensource.com>
xen-unstable changeset: 9913:decf309fb47b3f4246540a5e1327663651d266fe
xen-unstable date: Tue May 2 17:23:21 2006 +0100
Fix Xen's interrupt acknowledgement routines on certain
(apparently broken) IO-APIC hardware:
1. Do not mask/unmask the IO-APIC pin during normal ISR
processing. This seems to have really bizarre side effects
on some chipsets.
2. Since we instead tickle the local APIC in the ->end
irq hook function, it *must* run on the CPU that
received the interrupt. Therefore we track which CPUs
need to do final acknowledgement and IPI them if
necessary to do so.
New IO-APIC ACK method seems to cause problems on some systems
(e.g., Dell 1850). Disable it by default for now, but allow the
new method to be tested by passing boot parameter 'new_ack'
to Xen.
You can tell which ACK method you are using because Xen prints
out "Using old ACK method" or "Using new ACK method" during boot.
This workaround can be removed if/when the problems with the new
ACK method are flushed out.
Big fixes for the new IO-APIC acknowledging method. The problems
were:
1. Some critical Xen interrupts could get blocked behind
unacknowledged guest interrupts. This is avoided by making
all Xen-bound interrupts strictly higher priority.
2. Interrupts must not only be EOIed on the CPU that received
them, but also in reverse order when interrupts are nested.
A whole load of logic has been added to ensure this.
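The reverse-order rule in point 2 can be illustrated with a per-CPU stack. This is a hedged Python sketch, not the logic added to Xen: PendingEoiStack and its methods are hypothetical names, and the model assumes an interrupt may only be EOIed once everything nested above it has been EOIed.

```python
class PendingEoiStack:
    """Per-CPU stack of interrupts awaiting EOI (illustrative model)."""

    def __init__(self):
        self._stack = []  # entries are [vector, ready_for_eoi]

    def push(self, vector):
        # Called when an interrupt is received; nested ones land on top.
        self._stack.append([vector, False])

    def mark_ready(self, vector):
        # Called when the handler for this vector has finished.
        for entry in self._stack:
            if entry[0] == vector:
                entry[1] = True

    def flush(self):
        """EOI from the top of the stack while the top entry is ready.

        Returns the vectors EOIed, in the order they were EOIed.
        """
        done = []
        while self._stack and self._stack[-1][1]:
            done.append(self._stack.pop()[0])
        return done
```

If the outer interrupt becomes ready first, flush() does nothing until the nested one is also ready, after which both are EOIed innermost first.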
There are two boot parameters relating to all this:
'ioapic_ack=old' -- use the old IO-APIC ACK method
'ioapic_ack=new' -- use the new IO-APIC ACK method (default)
'force_intack' -- periodically force acknowledgement of
interrupts (default is no; useful for debugging)
This patch defines a test_and_clear bitop for cpumask_t pointers.
Also fixes "wrong pointer type" for type specific bitops by using
&foo[0] instead of &foo.
Occasionally large smp machines fail to reboot properly and die under
an IPI storm of smp_call_function() to machine_reboot. Only the boot
processor needs to run machine_restart, so send an IPI to CPU0.
This patch addresses CVE-2006-1056 (information leak from
fxsave/fxrstor on AMD CPUs) and also adjusts 64-bit handling so that
full 64-bit RIP/RDP values get saved/restored. More fine-grained
handling may be needed if 32-bit processes are expected to properly
see their selectors (native Linux doesn't currently do that either,
but there is a patch to adjust it there).
Original patch: Jan Beulich (based on Linux original by Andi Kleen)
While other aspects of the system configuration may still be
controlled by the outcome of the table scan, if apic= was given on the
command line its effect should not be overridden here.
Do not create blkback vbd kernel thread until fully connected
to frontend driver. Otherwise the kernel thread may crash trying
to access the non-existent shared ring.
The Xen checksum offload feature attempts to insert TCP/UDP
checksums into already encrypted packets (esp4) in dom0. Obviously,
it is not possible to insert a checksum into an already encrypted
packet, so this patch inserts the checksum prior to encrypting
packets in net/ipv4/xfrm4_output.c.
To do this cleanly, the TCP/UDP header pointers need to be pointed to
the correct spot, so this functionality has been abstracted into a new
function.
This patch fixes bug 143 (verified by Jim Dykman). Earlier version
verified by Jon McCune.
Signed-off-by: James Dykman <dykman@us.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Remove update_vcpu_system_time() call from the per-VCPU timer
callback function. It's unnecessary and in fact may occasionally
even run on the wrong CPU.
Fix command-line parsing in a few respects -- be more
generous about what we accept, avoid stack overflow, and
print the command line during boot (rather useful!).
This should fix the 'lapic' and 'nolapic' boot options.
Propagate information about bad (or good) REGSEL register
of chipset IO-APICs to Xen. If REGSEL is bad (some old SiS
chipsets) then we fall back to a slower read-modify-write routine.
Loosely based on an original patch from Jan Beulich.
Fix the "hda lost interrupt" issue when creating a VMX guest on a PAE
host.
Occasionally when injecting an IDE DMA interrupt into the guest, a
page fault occurs (e.g., because the IDT mapping is not present in
shadow pagetables). This causes an immediate vmexit and, because it
occurred during event delivery, the original VM_ENTRY_INTR_INFO_FIELD
is kept in IDT_VECTORING_INFO_FIELD.
The current code copies IDT_VECTORING_INFO_FIELD back to
VM_ENTRY_INTR_INFO_FIELD, intending that the interrupt will be
injected again on next vmresume.
However, there is a corner case: if, before the next vmresume, a timer
interrupt happened then vmx_intr_assist may overwrite the information
on VM_ENTRY_INTR_INFO_FIELD, and the IDE DMA interrupt is effectively
lost.
This patch checks the IDT_VECTORING_INFO_FIELD in vmx_intr_assist and,
if it is set, copies it to VM_ENTRY_INTR_INFO_FIELD and returns.
Signed-off-by: Yunhong Jiang <Yunhong.jiang@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
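The check added to vmx_intr_assist can be modelled roughly as below. This is a simplified Python sketch under stated assumptions, not the actual C code: the function intr_assist and its parameters are hypothetical, and the only architectural detail relied on is that bit 31 of the interruption-information fields is the valid bit and bits 7:0 hold the vector.

```python
INTR_INFO_VALID = 1 << 31  # bit 31: the field describes a valid event


def intr_assist(idt_vectoring_info, pending_vector):
    """Return the value to write to VM_ENTRY_INTR_INFO_FIELD (model only).

    idt_vectoring_info -- contents of IDT_VECTORING_INFO_FIELD after vmexit
    pending_vector     -- vector of a newly pending interrupt, or None
    """
    # An event whose delivery was interrupted must be re-injected first,
    # otherwise a newer interrupt would overwrite it and it would be lost.
    if idt_vectoring_info & INTR_INFO_VALID:
        return idt_vectoring_info
    if pending_vector is not None:
        return INTR_INFO_VALID | (pending_vector & 0xFF)
    return 0  # nothing to inject
```

In this model a timer interrupt that arrives before the next vmresume simply stays pending until the interrupted event has been delivered.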
There are instances where we DO NOT want an hvm guest to run an
MP enabled kernel. In such situations we should have a workaround to
guarantee hvm guests will not detect MP.
For example, in the absence of ACPI and MPS the installation code in some
Linux distributions keys off the presence of the cpuid edx/HTT bit (indicating
the presence of Hyper-Threading Technology) to determine if another
logical processor is present and, if so, loads an MP enabled kernel instead
of a uniprocessor kernel. SMBIOS is also consulted for the same purpose
and presents a potential problem as well. While both approaches for
selecting an MP kernel are debatable (since using MPS or ACPI has long
been the standard for MP detection), they are something we
have to live with and work around, because making a change in the fully
virtualized guest is not an option.
To solve the problem we need to hide all secondary processors from the hvm
guest. Since the hvm does not surface MPS tables, we only need to deal
with ACPI, cpuid HTT, and possibly SMBIOS. (I did not have time right
now to look closely at the hvm BIOS to know if SMBIOS is also going to be
a problem.)
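For the cpuid part, hiding HTT amounts to clearing one bit in the value returned to the guest. A minimal Python sketch follows; the helper name hide_htt is hypothetical, and the only fact relied on is that HTT is reported in bit 28 of EDX for CPUID leaf 1.

```python
CPUID_EDX_HTT = 1 << 28  # CPUID leaf 1, EDX bit 28: Hyper-Threading Technology


def hide_htt(edx: int) -> int:
    """Return the leaf-1 EDX value with the HTT bit cleared (model only)."""
    return edx & ~CPUID_EDX_HTT & 0xFFFFFFFF
```

A guest installer that tests only this bit would then see a single logical processor; ACPI and SMBIOS need to be handled separately.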
Also fixes a logic problem in the code path where apic=0 was not
being handled correctly (vmx path only).
Increase size of level-2 initial PDE identity map from first 64MB of
physical RAM to first 1GB of physical RAM. This allows x86_64 xen to boot
larger dom0 images. Without this change, large dom0 images fail to
boot with "Unknown interrupt" on xen console and wedge.
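As a quick check of the sizes involved, the arithmetic can be written out. This sketch assumes each level-2 entry maps a 2MB superpage and that entries are 8 bytes, as on x86_64; the helper level2_entries is a hypothetical name.

```python
MIB = 1 << 20
SUPERPAGE_SIZE = 2 * MIB  # assumed: one level-2 entry maps a 2MB superpage
PDE_SIZE = 8              # assumed: bytes per page-directory entry on x86_64


def level2_entries(mapped_bytes: int) -> int:
    """Number of level-2 entries needed to identity-map mapped_bytes."""
    return mapped_bytes // SUPERPAGE_SIZE
```

Under these assumptions the old 64MB map uses 32 entries, while the 1GB map uses 512 entries, which fills exactly one 4KB page-directory page.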
SVM patch to ensure that PAE bit is set for 32bit guests on 32bit PAE,
by using paging levels>=3 rather than ifdef i386. This patch fixes
the "black screen" hang issue when building w/XEN_TARGET_X86_PAE=y on
32bit.
Tested linux debian and win2003EE guests with pae=1. The linux
guest boots without error, while the windows guest sometimes hits a
bug() in shadow.c. Both VT and SVM encounter the same bug.
Read the message type out of the message before sending it to xenstored, and
use that saved value when handling the reply. Xenstored will leave the
message type intact, _except_ when returning an error, in which case it will
change the type to XS_ERROR. This meant that we failed to remove a
transaction from our internal list if xenstored returned EAGAIN, as we did not
realise that the message was XS_TRANSACTION_END. This manifested itself as
the intended behaviour until the connection was closed, at which point all of
those failed transactions would erroneously be aborted.
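The fix can be sketched as follows. This is an illustrative Python model rather than the driver code: the Connection class, the dictionary message layout and the string constants standing in for the XS_TRANSACTION_END and XS_ERROR message types are all assumptions made for the example.

```python
XS_TRANSACTION_END = "XS_TRANSACTION_END"  # placeholder for the real type
XS_ERROR = "XS_ERROR"                      # placeholder for the real type


class Connection:
    """Tracks open transactions for one client (illustrative model)."""

    def __init__(self, send):
        self._send = send  # callable: request dict -> reply dict
        self.open_transactions = set()

    def request(self, msg):
        # Save the type before sending: on error the reply carries
        # XS_ERROR instead of the original message type.
        sent_type = msg["type"]
        reply = self._send(msg)
        if sent_type == XS_TRANSACTION_END:
            self.open_transactions.discard(msg["tx_id"])
        return reply
```

Keying the bookkeeping on the reply type instead would miss the case where the transaction end fails with EAGAIN.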
Currently, it is possible to set the mem-max value to a value lower than
what has been currently allocated to the domain causing the kernel to
crash. This patch validates the value passed in and prevents setting the
value below the current allocation level.
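A minimal sketch of the validation described here, assuming both values are available in the same unit; the function validate_mem_max and its parameters are hypothetical names and not the code in the patch.

```python
def validate_mem_max(requested_kb: int, current_allocation_kb: int) -> int:
    """Reject a mem-max value below the current allocation (model only)."""
    if requested_kb < current_allocation_kb:
        raise ValueError(
            "mem-max %d kB is below the current allocation of %d kB"
            % (requested_kb, current_allocation_kb)
        )
    return requested_kb
```

Rejecting the request up front means the guest kernel is never asked to run with a maximum smaller than what it already holds.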
Since we don't reset the proto_csum_blank flag in the skb, the
checksum calculation gets done twice, which is not twice as good as
once.
With this patch, TCP/UDP checksum errors from dom0 are fixed, and
domUs can use TCP/UDP without turning off TX checksum offload. Normal
non-VLAN bridged configs still work fine, tested with xm-test.
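The effect of resetting the flag can be shown with a small model. This is a hedged Python sketch, not the kernel code: the Packet class and fill_checksum are hypothetical, only the flag name proto_csum_blank comes from the text, and the checksum is the standard 16-bit ones-complement Internet checksum.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum over data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


class Packet:
    def __init__(self, payload: bytes):
        self.payload = payload
        self.csum = 0
        self.proto_csum_blank = True  # checksum field not yet filled in
        self.computations = 0         # counts how often we computed it


def fill_checksum(pkt: Packet) -> None:
    """Compute the checksum once and clear the flag (model only)."""
    if not pkt.proto_csum_blank:
        return
    pkt.csum = internet_checksum(pkt.payload)
    pkt.computations += 1
    pkt.proto_csum_blank = False  # without this, a later stage recomputes
```

Calling fill_checksum twice on the same packet performs the calculation only once in this model, because the flag is cleared after the first pass.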
Trivial patch to fix x86_64 builds in which XEN_TARGET_ARCH
is specified on the make command line, e.g.:
make XEN_TARGET_ARCH=x86_64
This busted the vmxassist and hvmloader builds, which must
be done -m32. Using "override" in the vmxassist/hvmloader
Makefiles fixes the problem by not allowing this to be
overridden from the command line.
Signed-off-by: Dave Lively <dlively@virtualiron.com>
The maximum instruction length for both x86-32 and
x86-64 is 15 bytes (including all prefixes, opcode,
ModRM, SIB, displacement, and immediate bytes).
This patch adjusts the MAX_INST_LEN to the correct
value. This should reduce the size of some variables
in the hypervisor code. This patch also does some
minor code clean-up in the vm exit handler for VMX.
When running test 5 in Memtest86+ v1.65, I got a "this opcode is not
supported", so I decided to add it. It's a compare operation, and it's
just the opposite of the already supported one (opcode 0x39), so it's
nothing spectacular. Why there's a page-fault when this instruction gets
executed, I haven't got a clue, but I have a feeling that Memtest86 is
doing something wrong :-( However, this fix may help some other code to
run too...
With this, Test 5 passes all the way through without crashing. I did see
some occasional memory errors in some other tests, and I'm not 100%
sure whether those are caused by the system or they are "real" memory
errors. At some time in the future I may get round to memory testing my
target system...
Signed-off-by: Mats Petersson <mats.petersson@amd.com>
Fix the test inside all_devices_ready, and move it from xenbus_probe (a
postcore_initcall) to a new late_initcall, so that it happens after the
drivers have initialised.
Fixes the reopened bug #549 (I hope).
Signed-off-by: Ewan Mellor <ewan@xensource.com>
Netfront must switch state using xenbus_switch_state() or this
is not picked up by the waiting code in xenbus_probe.c.
Add a new config option for all backend drivers. This has two benefits:
1. All backend drivers can be disabled or modularised via
one config option.
2. Backend helper routines that are not specific to any particular driver
can be disabled or modularised based on this config option. In
particular this may allow backend drivers plus the service module
to be upgraded separate from the kernel core as and when the backend
interfaces change (and they will).
If the 'cdrom=' option is specified in the definition file but media is
not found in the CD drive then main() in vl.c exits and the guest appears
to hang. This patch modifies vl.c slightly to check for the presence of
media. If the cdrom cannot be opened then the cd entry is removed from
hd_filename[] and bs_table[] allowing the guest to continue initializing.
If the guest requires the CD media then the guest should report, gracefully
or otherwise, that it's missing.
Further workarounds for the broken string marshalling in xmlrpclib. Regardless
of the encoding used, one still may not include non-printable characters in an
XML document. When a dmesg contains a ^D character, something seen on one of
our test machines, an invalid XML document is generated.
Use a trick by David Mertz to work around this -- escape the string using
Python's repr function.
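The repr trick can be sketched as below. Note the assumptions: this uses Python 3's xmlrpc.client rather than the Python 2 xmlrpclib the entry refers to, and the helper names escape_for_xmlrpc and unescape_from_xmlrpc are hypothetical, not the names used in xend.

```python
import ast
import xmlrpc.client


def escape_for_xmlrpc(text: str) -> str:
    """Escape non-printable characters so the string is safe inside XML."""
    # repr() renders control characters such as ^D as backslash escapes.
    return repr(text)


def unescape_from_xmlrpc(escaped: str) -> str:
    """Reverse escape_for_xmlrpc on the receiving side."""
    # literal_eval only parses literals, so it is safe on untrusted input.
    return ast.literal_eval(escaped)


# Round trip of a dmesg-like string containing ^D (0x04).
raw = "boot log\x04 line"
wire = xmlrpc.client.dumps((escape_for_xmlrpc(raw),))
(received,), _method = xmlrpc.client.loads(wire)
restored = unescape_from_xmlrpc(received)
```

The receiver has to know the string was escaped, so both ends of the connection need to agree on the convention.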
Fix another blkback kernel thread bug I introduced. :-( The kernel thread
is created before we are fully connected to the front end, so before
entering the main loop we must make sure that the shared ring is
mapped, otherwise we can fault.
This patch is an essential companion to the other two blkback
patches I committed earlier today. Hopefully this ends the saga.
Allow CONFIG_DEBUG_INFO to be specified when building
x86/64 XenLinux. Builds and boots fine. Leave the option
disabled by default, as with all other defconfigs.
Update the user manual appendix to describe how to get a mouse working
properly in a VNC window. Also add 'pae' configuration introduction for
HVM guests.
Signed-off-by: You, Yongkang <yongkang.you@intel.com>
Signed-off-by: Dugger, Donald D <donald.d.dugger@intel.com>
There are a couple of bugs with the current handling of reads and writes
in the configuration space overlay functions. The wrong offset is passed
to the virtual field handlers. This patch uses the variable which
contains the correct offset. This patch also fixes the logic which
generates the actual value to write to a given virtual configuration
space field.
Work around a bug in xmlrpclib's string escaping. That library outputs invalid
UTF-8 if given a string containing high-bit characters, so instead pass in
Unicode strings, for which the escaping is correct.
This fixes a bug seen when the dmesg of a machine contains high bit characters,
but I'm sure there are other ways in which it might be triggered.
Handle failure to register the xen store event channel instead of
just not initialising xenbus/store when the supervisor_mode_kernel
feature flag is enabled.
When initialising grant tables only -ENOSYS is a valid reason
to fail so BUG_ON anything else like we did prior to changeset
9498.
Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
Make checksum handling in the virtual network drivers more robust.
Largely this involves making the logic symmetrical: for example,
not only should netfront be able to tell netback that a packet has
an empty protocol checksum field, but the reverse must also be true.
Another change is that the drivers only advertise IP checksum
offload functionality. There is currently no information
propagated across the device channel about the offset of the
protocol-specific checksum field. Therefore it is not safe to
defer checksum calculation for protocols the remote end may not
understand -- it will end up having to drop the packet.
Yet another change is to allow netback to disable tx checksum
offload, just as we already could for netfront. Currently there is
no support for disabling rx checksum offload -- that would seem
to require some way of propagating the checksum-offload advertisement
(or lack of it) across the device channel, as it really ought to be
the transmitter that acts on it.
Thanks to Ian Jackson for pointing out some of the problems with
our checksum-offload handling. Several of the changes here are
due to his comments.