x86: avoid EPT scanning errors when splitting superpages during live migration
Since Xen does not lock the p2m table for reads, when splitting a
superpage during live migration we must make sure that the path of EPT
entries being modified always remains in place, because other CPUs may
access the superpage entries at the same time.
xend: clean up qemu-dm related items on domain destroy
Some qemu-dm related items might be left behind after the domain is
destroyed:
- xenstore entry, /local/domain/0/device-model/<domid>
- named pipes, /var/run/tap/qemu-{read,write}-<domid>
Extend pt_bind_irq to handle updates of the MSI guest vector and
flag.
Unbinding and rebinding with separate hypercalls may not always be
viable.
For example, the guest may update the MSI address/data on the fly
without disabling MSI first (e.g. to change delivery/destination);
implementing these updates as an unbind followed by a rebind may
result in interrupt loss.
I observed from xend.log that several domain restart threads run
simultaneously. This patch makes the restart thread a singleton.
Without this, several core dumps of a domain might be created.
If qemu-dm dies immediately (probably because of a wrong setting),
xend keeps restarting the domain over and over.
That causes system overload.
There is already logic to avoid restarting too early; however, it
might not work, since the xenstore entry 'xend/previous_restart_time'
is volatile: XendDomainInfo.destroy(), which removes the entry from
xenstore, is called in several places.
This patch also prevents restarting too early even at the first
domain creation.
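The restart guard described above can be sketched roughly as follows. This is an illustration only, not xend's code: the class name RestartGuard, the 60-second minimum interval and the injectable clock are all assumptions. The idea is that the last start time lives in process memory rather than in a volatile xenstore entry, is recorded at first creation too, and that a lock keeps the restart path a singleton.

```python
import threading
import time


class RestartGuard:
    """Hypothetical guard: one restart at a time, and none too early."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self._min_interval = min_interval  # assumed threshold, in seconds
        self._clock = clock
        self._lock = threading.Lock()      # only one restart thread may run
        self._last_start = None            # kept in memory, not in xenstore

    def note_start(self):
        """Record a (re)start; call this at first domain creation as well."""
        self._last_start = self._clock()

    def try_restart(self, restart):
        """Run restart() unless one is already running or it is too early.

        Returns True if restart() was called, False if it was refused.
        """
        if not self._lock.acquire(blocking=False):
            return False  # another restart thread is already active
        try:
            now = self._clock()
            if (self._last_start is not None
                    and now - self._last_start < self._min_interval):
                return False  # the domain died too soon after starting
            self._last_start = now
            restart()
            return True
        finally:
            self._lock.release()
```

A caller would invoke note_start() when the domain is first created and try_restart() from the restart path; a False return means the restart was suppressed.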
The corruption happens every time we pass a sector aligned buffer
(instead of a page aligned buffer) to blkfront_aio. To trigger the COW
we have to write at least a byte to each page of the buffer, but we
must be careful not to overwrite useful content.
Currently cpufreq HW-ALL coordination is handled the same way as
SW-ALL. However, SW-ALL causes more IPIs, which is bad for cpuidle.
This patch handles HW-ALL coordination differently from SW-ALL, to
improve performance and reduce IPIs. We also suspend/resume the HW-ALL
dbs timer for idle.
Signed-off-by: Yu, Ke <ke.yu@intel.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Tian, Kevin <kevin.tian@intel.com>
This patch makes two small changes to dom0 iptables rules that permit
(and revoke) domU network access.
First:
Currently, a rule intended to allow domU network access is appended to
the end of the FORWARD chain, where it can be preempted by other
rules. This patch causes the rule to be inserted at the top, where
it's more likely to have the intended effect.
Second:
In some cases (e.g. Fedora 9's default iptables configuration), the
first rule alone is insufficient to permit two-way packet flow. This
patch adds a second rule to the FORWARD chain that permits replies to
domU network requests to reach the domU vif.
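The two rules might look roughly like the argument lists built below. This is a sketch under stated assumptions: the function name and the vif name are hypothetical, and the commit message does not spell out the match options, so the physdev and state matches shown here are guesses. What does come from the description is that the rule is inserted at the head of FORWARD (iptables -I with no rule number) instead of appended (-A), and that a second rule lets replies reach the domU vif.

```python
def forward_rules(vif, insert=True):
    """Build iptables argument lists that allow traffic for one domU vif.

    The match options are assumptions for illustration; only the
    insertion at the head of FORWARD and the extra reply rule come from
    the description above.
    """
    # -I with no rule number inserts at the top of the chain; -D deletes
    # the same rule again when access is revoked.
    action = "-I" if insert else "-D"
    outbound = ["iptables", action, "FORWARD",
                "-m", "physdev", "--physdev-in", vif,
                "-j", "ACCEPT"]
    # Second rule: let replies to the domU's own requests reach its vif.
    replies = ["iptables", action, "FORWARD",
               "-m", "state", "--state", "RELATED,ESTABLISHED",
               "-m", "physdev", "--physdev-out", vif,
               "-j", "ACCEPT"]
    return [outbound, replies]
```

Each returned list could be handed to subprocess.run() on a host where iptables is available; building the lists separately keeps the sketch testable without root.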
Signed-off-by: Chris Bookholt <hap10@tycho.ncsc.mil>
Do not share synchronization variables between the trap handler and
the softirq handler, as this will cause problems. Abstract the
synchronization bits into functions. Make the synchronization code
aware of a panic, so that spinning with interrupts disabled is
avoided.
To avoid problems with MCEs happening while we're doing recovery
actions in the softirq handler, implement a deferred list of telemetry
structures, using the mctelem interfaces. This list will get updated
atomically, so any additional MCEs will not cause error telemetry to
be missed or not handled.
Signed-off-by: Frank van der Linden <frank.vanderlinden@sun.com>
Signed-off-by: Liping Ke <liping.ke@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
tools: Always use sane upstream (`native') python paths
Previously, by default we would install our python modules into
/usr/lib/python/xen, for example /usr/lib/python/xen/__init__.py.
Upstream python's standard install location (a) includes the Python
version number and (b) puts things in site-packages by default.
Our best conjecture for the reason for this was an attempt to make the
installs portable between different python versions. However, that
doesn't work because compiled python modules (.pyc), and C python
extensions corresponding to one version of python, are not compatible
across different versions of python.
This is why upstream include the version number.
site-packages is the standard location for locally-installed packages
and is automatically included on the python search path.
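To see what the standard location is for a given interpreter, something like the following can be used. It is only an illustration using the standard library's sysconfig module (a modern Python is assumed); it is not part of the build system changes described here.

```python
import sys
import sysconfig

# Directory where pure-Python packages are installed for this interpreter.
purelib = sysconfig.get_paths()["purelib"]
# Version string that upstream embeds in install paths on POSIX layouts.
version = "%d.%d" % sys.version_info[:2]

print("pure-Python install dir:", purelib)
print("interpreter version    :", version)
print("on sys.path            :", purelib in sys.path)
```

On a typical POSIX install the printed directory ends in site-packages and contains the version number, which is the layout the paragraphs above refer to.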
In this change, we abandon our own unusual python path setup:
* Invoke setup.py in an entirely standard manner. We pass
PREFIX and DESTDIR using the appropriate options provided by
setup.py for those purposes (adding them to setup.py calls
which were previously lacking them).
* Since the installation locations are now on the standard
python path, we no longer need to add anything to the path
in any of our python utilities. Therefore remove all that
code from every python script. (Many of these scripts
unconditionally added /usr/lib/python and /usr/lib64/python which
is wrong even in the old world.)
* There is no longer any special `Xen python path'. xen-python-path
is no longer needed. It is no longer called by anything in our
tree. However since out-of-tree callers may still invoke it, we
retain it. It now prints a fixed string referring to a directory
which does not exist; callers (who use it to augment their
python path) will thus add a nonexistent directory to their python
path which is harmless.
* Remove various workarounds including use of setup.py --home
(which is intended for something completely different).
* Remove tests for the XEN_PYTHON_NATIVE_INSTALL build-time
environment variable. The new behaviour is the behaviour which we
should have had if this variable had been set. That is, it is now
as if this variable was always set but also bugs in the resulting
install have been fixed.
This should be a proper fix for the bug addressed by c/s 19515.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
x86 mce: Small cleanups to machine-check hypercall handling.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
When xend calls xc.test_assign_device, it does not need to pass the
domain ID of a guest domain.
This patch passes domain ID 0 to xc.test_assign_device.
The following methods currently pass domain ID 0 to
xc.test_assign_device:
- setupDevice@xend/server/pciif.py
- pciinfo@xend/XendNode.py
This patch fixes the unwind info of fast_hypercall.
fast_hypercall uses r32->r35 without an alloc instruction.
In that case, the unwind info is shifted slightly.
With this patch, I confirmed that the stack trace works correctly.
x86 mce: Small fix for polling/CMCI race conditions.
When a CMCI happens very quickly, the polling and CMCI processing paths
might cross. For Intel CPUs which support CMCI, if the error bank has
CMCI capability, we disable polling on this bank.
Signed-off-by: Liping Ke <liping.ke@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
While in the comments to an earlier submitted (and already applied)
patch I claimed to have fixed the need to specify both "nolapic" and
"noapic" when "nolapic" alone should already have the intended effect,
this doesn't appear to be the case. Here are the missing bits.
This patch adds the ACPI fixed hardware power button for HVM.
It enables a graceful shutdown of a guest OS at the direction of Dom0
(if a proper action for the power button is set inside the guest).
network-bridge: Fix do_ifup in the case of ${bridge} != ${netdev}
On RHEL5.2, ifup ${bridge} fails if ${bridge} != ${netdev},
because RHEL5.2's ifup ${bridge} runs the following sequence:
1. Search for the CONFIG that has the same MAC address as ${bridge}.
ifcfg-${netdev} is found.
2. Run "ip link set dev ${netdev} up".
# ${bridge} is expected.
3. Output "Failed to bring up ${netdev}."
Because ${netdev} does not exist.
Thus, do_ifup() should not use ifup if ${bridge} != ${netdev}.
vector_channel[], as its name already says, is vector-, not
irq-indexed.
hpet_assign_irq() sits not only in the boot path, but also in the
resume one. Short of knowing why this is, simply checking whether a
vector was already assigned prevents leaking previously assigned ones.
xend: allow hvm domain to have multiple serial consoles
This patch allows an HVM domain to have multiple serial ports
with serial = [ '...', '...'].
The old style, serial='option string', is also accepted for
compatibility.
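Accepting both forms can be sketched as below. The helper name normalize_serial is hypothetical and this is not xend's actual code; the option strings in the comments are example values only.

```python
def normalize_serial(value):
    """Return the serial setting as a list of option strings.

    Accepts the old single-string form and the new list form.
    Hypothetical helper for illustration.
    """
    if value is None:
        return []               # no serial port configured
    if isinstance(value, str):
        return [value]          # old style: serial='pty'
    return list(value)          # new style: serial=['pty', 'null']
```

Downstream code can then iterate over the result without caring which style the configuration file used.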
Keir Fraser [Tue, 31 Mar 2009 12:28:45 +0000 (13:28 +0100)]
x86: Enable S3 for 32bit dom0 on 64bit Xen
Three SYSENTER MSRs must be saved and restored along with the BSP
context, or else 32bit dom0 refuses to work after S3 resume. Thanks
to Jan for helping find this missing part.
Signed-off-by: Guanqun Lu <guanqun.lu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Keir Fraser [Tue, 31 Mar 2009 12:27:03 +0000 (13:27 +0100)]
hvmloader acpi: Reserve ioport ranges for expanded PHP
Now there are two control registers plus one register for
each of the 32 PHP slots, a total of 34 registers. Accordingly the
ioport space required has expanded from 3 to 34 bytes.
Signed-off-by: Simon Horman <horms@verge.net.au>
hvmloader acpi: Use If and Else instead of Switch
Keir Fraser [Tue, 31 Mar 2009 12:23:11 +0000 (13:23 +0100)]
x86: unify BUG() & Co, reduce overhead on x86-64
Since it's only the string pointer representations that differ between
i386 and x86-64, abstract out those and make everything else shared.
While touching this code, also use
- proper instructions rather than a mixture of such and raw .byte/
.long/.quad data emissions,
- PC-relative pointers on x86-64 to cut the amount of storage (and
in particular cache space) needed for string references by half.
Keir Fraser [Tue, 31 Mar 2009 12:22:12 +0000 (13:22 +0100)]
Use unlikely() in BUG_ON()/WARN_ON()
-fno-reorder-blocks was added in c/s 1712, when x86-64 just started to
become enabled. The reason it got added is entirely unclear to me, and
it prevents the intended effect of unlikely() constructs (in particular
the ones added here) of moving out of line code which is expected to
never get executed, as well as using forward branches (which are
statically predicted taken by various processors' branch prediction
units) preferably to reach infrequently executed code.
Keir Fraser [Tue, 31 Mar 2009 12:20:04 +0000 (13:20 +0100)]
xend: less noise in xend-debug.log on HVM shutdown
When shutting down an HVM guest, xend-debug.log always shows:
Unhandled exception in thread started by
Traceback (most recent call last):
  File "//usr/lib64/python/xen/xend/image.py", line 549, in _sentinel_watch
    self._dmfailed(message)
  File "//usr/lib64/python/xen/xend/image.py", line 491, in _dmfailed
    xc.domain_shutdown(self.vm.getDomid(), DOMAIN_CRASH)
xen.lowlevel.xc.Error: (3, 'No such process')
Keir Fraser [Tue, 31 Mar 2009 12:11:56 +0000 (13:11 +0100)]
x86 mce: fix c/s 17968 for 32-on-64
32-on-64 aspects were not properly considered. Add respective
checking, and adjust structure layouts for the cases where the
checking pointed out issues.
Also,
- fix a potential memory corruption issue (do_mca() could write beyond
log_cpus' end if the guest specified fewer than the number of online
CPUs)
- there is no reason to make the (not even properly prefixed)
definitions in xen/public/arch-x86/xen-mca.h globally visible by
including the file from xen/public/arch-x86/xen.h.
Keir Fraser [Tue, 31 Mar 2009 10:54:12 +0000 (11:54 +0100)]
vtd: fix multiple Dom0 S3 on hosts that support Queued Invalidation.
On such hosts we can't do multiple Dom0 S3 when VT-d is enabled.
The cause is: during the first S3 resume, init_vtd_hw() initializes
the invalidation function pointers to the register-based ones and later
enable_qinval() forgets to overwrite the flush function pointers with
the queue-based ones, so Queued Invalidation is enabled, but we
actually use the register-based invalidation functions! Later during
the second Dom0 S3, in iommu_suspend() -> iommu_flush_all(), we try to
use the register-based invalidation functions to perform global flush
while Queued Invalidation is enabled, and this can cause a host reset
because VT-d spec says: when the queued invalidation is enabled,
software must submit invalidation commands only through the IQ (and
not through any invalidation command registers).
The attached patch fixes the buggy enable_qinval(). And in
iommu_resume(), we invoke iommu_flush_all() for safety.
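The pointer mix-up can be modelled with a toy sketch. Everything here is hypothetical (class and function names, and the assumption that hardware init clears the queued-invalidation state); it only mirrors the sequence described above and is not Xen's code.

```python
class ToyIommu:
    """Toy model of one IOMMU; not Xen's data structure."""

    def __init__(self):
        self.qinval_enabled = False
        self.flush = None  # stands in for the flush function pointers


def flush_register_based(iommu):
    # Register-based commands are not allowed once the queue is enabled.
    if iommu.qinval_enabled:
        raise RuntimeError("register-based flush while QI is enabled")
    return "register"


def flush_queued(iommu):
    return "queued"


def init_hw(iommu):
    """Runs on every resume and resets the pointers to register-based."""
    iommu.qinval_enabled = False   # assumption made for this toy model
    iommu.flush = flush_register_based


def enable_qinval(iommu):
    iommu.qinval_enabled = True
    # The step the buggy version skipped: switch to the queued functions.
    iommu.flush = flush_queued


def resume(iommu):
    init_hw(iommu)
    enable_qinval(iommu)
    return iommu.flush(iommu)      # flush on resume, for safety
```

If the assignment in enable_qinval() is removed, the flush in resume() raises in this model, which corresponds to the forbidden register-based invalidation while the queue is enabled.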
Keir Fraser [Tue, 31 Mar 2009 10:51:56 +0000 (11:51 +0100)]
cpuidle: suspend/resume scheduler tick timer during cpu idle state entry/exit
cpuidle can collaborate with the scheduler to reduce unnecessary timer
interrupts. For example, the credit scheduler's accounting timer
doesn't need to be active at idle time, so it can be stopped at
cpuidle entry and resumed at cpuidle exit. This patch implements this
by adding two ops to the scheduler, tick_suspend/tick_resume, and
implementing them for the credit scheduler.
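The shape of the two ops can be sketched as follows. Only the op names tick_suspend and tick_resume come from the description; the class, the idle() helper and the per-CPU flag list are assumptions made for illustration.

```python
class ToyScheduler:
    """Toy scheduler with a per-CPU accounting tick (hypothetical)."""

    def __init__(self, ncpus):
        self.tick_active = [True] * ncpus

    def tick_suspend(self, cpu):
        self.tick_active[cpu] = False   # stop the tick for this CPU

    def tick_resume(self, cpu):
        self.tick_active[cpu] = True    # restart the tick for this CPU


def idle(sched, cpu, enter_low_power):
    """Suspend the tick around the idle state and always resume it."""
    sched.tick_suspend(cpu)
    try:
        enter_low_power()
    finally:
        sched.tick_resume(cpu)
```

A scheduler with no periodic tick could leave both ops empty, so the idle path would not need to know which scheduler is in use.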
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
Keir Fraser [Tue, 31 Mar 2009 10:41:13 +0000 (11:41 +0100)]
vtd: fix iommu vector leak
When Dom0 S3 is performed many times, iommu_set_interrupt() fails
during S3 resume because it can't obtain a vector. We should not request
a new vector on every Dom0 S3 resume; we should reuse the same vector.
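The reuse pattern amounts to allocating only when nothing has been assigned yet. A minimal sketch, with hypothetical names (ToyIommuIrq, set_interrupt, and the allocate_vector callback are not Xen's API):

```python
class ToyIommuIrq:
    """Holds the interrupt vector assigned to one IOMMU (hypothetical)."""

    def __init__(self):
        self.vector = None


def set_interrupt(iommu, allocate_vector):
    """Return the IOMMU's vector, allocating one only on the first call."""
    if iommu.vector is None:
        iommu.vector = allocate_vector()  # happens once, at first setup
    return iommu.vector                   # later resumes reuse it
```

Calling set_interrupt() on every resume is then harmless, since repeated calls consume no further vectors.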