Keir Fraser [Tue, 19 May 2009 00:37:19 +0000 (01:37 +0100)]
xend: solve issues with xm block-configure command.
In the case of inactive managed domains:
The following error occurs currently. We cannot change the
configuration of the VBD by using xm block-configure. Of course,
using xm block-detach and xm block-attach instead of xm
block-configure, we can change it. However, I'd like to change it by
using xm block-configure.
In the case of active domains:
Another problem occurs after a domain is rebooted. Even if we
change the configuration of a VBD in the domain by using xm
block-configure, the VBD reverts to its previous configuration
after the reboot.
Keir Fraser [Tue, 19 May 2009 00:31:26 +0000 (01:31 +0100)]
x86, cpufreq: fix ondemand governor to take aperf/mperf feedback
The APERF/MPERF MSRs provide feedback about the actual frequency over
the elapsed time, which can differ from the frequency requested by the
governor. However, the ondemand governor currently uses that feedback
only on the frequency scale-down path. It should be used when scaling
up too.
Keir Fraser [Fri, 15 May 2009 07:12:39 +0000 (08:12 +0100)]
vt-d: Fix interrupt remapping for multiple IOAPICs
The current IOAPIC interrupt remapping code assumes there is only one
IOAPIC in the system, which causes problems when there is more than
one. This patch extends the ioapic_pin_to_intremap_index[] array to
handle multiple IOAPICs.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Thu, 14 May 2009 14:46:04 +0000 (15:46 +0100)]
xen public: make mmuext_op's vcpumask field const
Linux started to pass around pointers to 'const cpumask_t' a while ago,
and passing such a pointer to set_xen_guest_handle() requires that the
field be a handle for a constant type in order to avoid compiler
warnings.
Keir Fraser [Wed, 13 May 2009 09:39:44 +0000 (10:39 +0100)]
x86 vmx: Ensure debug-mode intercepts for int3 and debug exceptions are
reinstated when resetting the EXCEPTION_BITMAP entry in the VMCS after
exiting real mode.
Keir Fraser [Wed, 13 May 2009 09:28:35 +0000 (10:28 +0100)]
passthrough: Fix PCI hot-plug option parsing
When a PCI function is passed through, extra options may be supplied
with it.
In the case of boot-time PCI pass-through the documented format is:
[dom:]bus:dev.slot[@vslot][[,opt]...]
e.g.
00:01.00.1@7,msitranslate=1
In the case of PCI hot-plug the xm pci-attach command takes the
following arguments:
[-o opt[,opt]...] [dom:]bus:dev.slot [vslot]
e.g.
-o msitranslate=1 00:01.00.1 7
xm ends up passing these to qemu-xen as:
[dom:]bus:dev.slot[[,opt]...][@vslot]
e.g.
00:01.00.1,msitranslate=1@7
Note that the options and the vslot are transposed when
compared to the format used by boot-time PCI pass-through.
The parser inside qemu-xen can only handle the format used by
boot-time PCI pass-through and because of this ignores
any options passed by hot-plug.
This patch alters the format used by hot-plug to match the parser.
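The reordering described above can be illustrated with a small Python
sketch. The function name hotplug_to_boot_format is hypothetical and is
not taken from xm or qemu-xen; it only shows how a string in the old
hot-plug layout maps onto the boot-time layout, assuming neither the
address nor the options contain '@'.

```python
def hotplug_to_boot_format(spec):
    """Rewrite 'addr[,opt...][@vslot]' as 'addr[@vslot][,opt...]'.

    Hypothetical helper for illustration only; not the xm implementation.
    """
    # Split off the optional '@vslot' suffix.
    head, sep, vslot = spec.partition('@')
    # Split the remainder into the device address and its option list.
    addr, _, opts = head.partition(',')
    out = addr
    if sep:
        out += '@' + vslot
    if opts:
        out += ',' + opts
    return out


if __name__ == '__main__':
    print(hotplug_to_boot_format('00:01.00.1,msitranslate=1@7'))
```

With the example from this message, the old hot-plug string
00:01.00.1,msitranslate=1@7 becomes 00:01.00.1@7,msitranslate=1, which
is the form the boot-time parser accepts.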
Keir Fraser [Fri, 8 May 2009 10:50:12 +0000 (11:50 +0100)]
x86 hvm: hvm_set_callback_irq_level() must not be called in IRQ
context or with IRQs disabled. Ensure this by deferring to tasklet
(softirq) context if required.
Keir Fraser [Thu, 7 May 2009 18:32:10 +0000 (19:32 +0100)]
Permit user to suppress passing --prefix to setup.py
We change all invocations of setup.py as follows:
* use $(PYTHON) instead of `python' so that the user can specify
an alternative python version if they need to. If not set it
defaults to `python' in Config.mk.
* pass --prefix=$(PREFIX) via a new make variable
$(PYTHON_PREFIX_ARG). This allows a user to suppress the
--prefix=... argument entirely by setting PYTHON_PREFIX_ARG=''.
This will work around the bug described here
https://bugs.launchpad.net/ubuntu/+bug/362570
where passing --prefix=/usr/local (which ought to have no effect as
/usr/local is the default prefix) changes which subdirectory
distutils chooses, and results in the files being installed in
site-packages which is not on the default search path.
Users not affected by this python packaging bug should not set
PYTHON_PREFIX_ARG and their builds will not be affected. (Provided
PREFIX did not contain spaces. People who put spaces in PREFIX are
being quite optimistic.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Now that the scheduler timer is suspended before entering the cpu idle
state, the per-cpu timer_deadline can be 0, i.e. no soft timer in the
queue. This case causes an unexpectedly large residency percentage in
C1 for a purely idle cpu.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Update XEN_LINUX_GIT_REMOTEBRANCH to match changes made in the upstream
repo. This is needed for setting KERNELS=linux-2.6-pvops in
config/Linux.mk to work.
Signed-off-by: Alex Zeffert <alex.zeffert@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
The op_pincpu method in SrvDomain.py does not currently work, because
it passes a string as the cpumap argument of the domain_pincpu method
in XendDomain.py, although that argument is expected to be a list.
This patch solves the problem as follows.
op_pincpu keeps passing a string as the cpumap argument, because it
has no way to pass a list.
Instead, domain_pincpu now expects the cpumap argument to be a string
and converts it into a list itself.
The patch also modifies the two other methods that call domain_pincpu
so that they pass a string as the cpumap argument instead of a list.
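As a rough illustration of the string-to-list conversion described
above, here is a minimal Python sketch. The helper name cpumap_to_list
and the accepted syntax (comma-separated CPU numbers with optional
lo-hi ranges) are assumptions made for this example; the message does
not spell out the exact format that domain_pincpu accepts.

```python
def cpumap_to_list(cpumap):
    """Convert a string such as '0,2-3' into a sorted list of CPU numbers.

    Hypothetical helper; the syntax handled here is an assumption.
    """
    cpus = set()
    for part in cpumap.split(','):
        part = part.strip()
        if not part:
            continue  # tolerate empty fields such as a trailing comma
        if '-' in part:
            # Inclusive range, e.g. '2-3' covers CPUs 2 and 3.
            lo, hi = part.split('-', 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == '__main__':
    print(cpumap_to_list('0,2-3'))
```

Doing the conversion in one place means every caller can hand over the
same string form, which is the design the patch describes.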
Network support is still provided the same way: using the tap
interface, created in qemu using netfront.
The lwip stack is still available to avoid additional compilation
issues.
However, the stubdom is not going to have its own vif anymore, which
means that the only vnc server supported is the one in dom0.
You can still enable the vnc server in a stubdom at compile time if
you wish.
Probably the most important change for xen users caused by this patch
is that you no longer have to specify two vifs in the stubdom config
file, but just one:
Prior to changeset 19510:5c69f98c348e - 'xm, xend: Replace "vslt" with
"vslot"', both vslt and vslot were used in the xm code, often fairly
arbitrarily.
However, in the dictionary that describes a pci function both vslt and
vslot were present. vslt stored the slot assigned to the function. And
vslot stored the slot the user requested for the function, or
AUTO_PHP_SLOT if no slot was requested.
With the renaming these two values got merged into a single entry.
This patch un-merges them by renaming what was vslot to
requested_vslot.
So an out of chronological order list of name changes is:
xend: Do not overwrite xauthority and display with empty values
Display and xauthority vars are read from vmConfig['platform'] first,
then they are read again from dev_info.
However, if the user does not set those variables in the config file,
dev_info won't contain them, so the current meaningful values get
overwritten with null.
This patch fixes the problem by setting display and xauthority to the
current values if dev_info does not contain them.
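The fallback described above amounts to "only overwrite when dev_info
actually supplies a value". A minimal Python sketch follows; the
function name merge_display_settings and the use of plain dictionaries
are assumptions for illustration and do not reproduce the xend code.

```python
def merge_display_settings(current, dev_info):
    """Return current settings, overridden only by non-empty dev_info values.

    Hypothetical helper; 'current' stands for the values already read
    from vmConfig['platform'].
    """
    merged = dict(current)
    for key in ('display', 'xauthority'):
        value = dev_info.get(key)
        if value:
            # dev_info supplies a real value, so it takes precedence.
            merged[key] = value
        # Otherwise keep the value that was already known.
    return merged


if __name__ == '__main__':
    platform = {'display': ':0', 'xauthority': '/root/.Xauthority'}
    print(merge_display_settings(platform, {}))
```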
This patch removes the need for a second configuration file for
stubdoms: it is going to be automatically generated by the script
stubdom-dm using command line options and xenstore to find any needed
information.
The configuration script will be placed under /etc/xen/stubdoms and
automatically removed when the domain is destroyed.
The only change needed in xend is to stop writing the sdl, opengl and
serial command line options for qemu to xenstore, because stubdoms do
not support them.
It is safe to remove those options from xenstore because qemu does
not use xenstore to read command line options.
Finally this patch fixes blkfront disconnections from backends and
display and xauthority variables for pv guests.
If ${netdev} is bonding, brctl addif ${bridge} ${pdev} fails:
can't add ${pdev} to bridge ${bridge}: Invalid argument
This is because ${pdev} has no slaves at this point.
# Notice that ifdown ${netdev} clears slaves of ${netdev}.
This patch restores slaves before add_to_bridge2 ${bridge} ${pdev}.
The following changeset broke booting xen-ia64 on some kinds of ia64 boxes.
http://xenbits.xensource.com/ext/ia64/xen-unstable.hg/rev/3fd8f9b34941
tasklet_schedule() calls raise_softirq().
Because raise_softirq() uses per_cpu data, accessing per_cpu before
cpu_init() leads to unexpected behavior.
Event-channel setup: Re-bind if the connection becomes unbound (e.g.,
due to 'slow' domain suspend cancellation), even if the remote port
identifier has not changed.
Domain logging: Only open log file once (don't leak fds) and fix a
small memory leak.
Evtchn changes based on a patch by Jiri Denemark <jdenemar@redhat.com>
x86: fix next->vcpu_dirty_cpumask checking in context_switch()
There was a timing window where flush_tlb_mask() could be called with
an empty mask (triggering a WARN_ON() in send_IPI_mask_flat() along
with APIC errors) because, rather than using the already-taken snapshot
of next's vcpu_dirty_cpumask, the struct vcpu field was used directly,
and that field can have its only bit cleared by remote CPUs.
Replacing the structure field's use by the local variable then made
the inner cpus_empty() check completely redundant with the one in the
surrounding if()'s condition.
This patch updates the Makefile to download the latest version of
tboot, which supports the interface changes made recently. This
should go into 3.4, since 3.4 supports the new tboot interface.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
A few ioapic redirection entries are initialized by the hypervisor
before the iommu hardware is enabled. This patch copies those entries
from the ioapic redirection table into the interrupt remapping table
after the interrupt remapping table has been allocated.
cpuidle: Add support for Always Running APIC timer, CPUID_0x6_EAX_Bit2.
This bit means the APIC timer continues to run even when the CPU is
in deep C-states.
The advantage is that we can always use the LAPIC timer on these CPUs,
so there is no need for the "slow to read and program" external timers
(HPET/PIT), nor for the timer broadcast logic and related code in
C-state entry and exit.
Refer to the latest Intel SDM Vol 2A
(http://www.intel.com/products/processor/manuals/index.htm)
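For illustration, testing the feature bit named in the subject line is
a single mask operation on the EAX value returned by CPUID leaf 0x6.
The Python sketch below assumes the caller has already obtained that
value; the function and constant names are hypothetical.

```python
# CPUID leaf 0x6, EAX bit 2: Always Running APIC Timer (ARAT).
ARAT_BIT = 1 << 2


def has_always_running_apic_timer(cpuid_6_eax):
    """Return True if the ARAT bit is set in the given CPUID.6:EAX value.

    Hypothetical helper; obtaining the register value is left to the caller.
    """
    return bool(cpuid_6_eax & ARAT_BIT)


if __name__ == '__main__':
    print(has_always_running_apic_timer(0x4))
```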
x86: avoid EPT scanning errors when splitting superpages during live migration
Xen does not lock the p2m table for reads, so when splitting a large
page during live migration we must make sure that the path of EPT
entries being modified is always present, because other CPUs may
access the superpage entries at the same time.
xend: clean up qemu-dm related items on domain destroy
Some qemu-dm related items might be left behind after the domain is
destroyed:
- xenstore entry, /local/domain/0/device-model/<domid>
- named pipes, /var/run/tap/qemu-{read,write}-<domid>
Extend pt_bind_irq to handle the update of msi guest
vector and flag.
Unbinding and rebinding using separate hypercalls is sometimes not
viable.
For example, the guest may update the MSI address/data on the fly
without disabling it first (e.g. to change delivery/destination), and
implementing these updates that way may result in interrupt loss.
I observed from xend.log that several domain restart threads run
simultaneously. This patch makes the restart thread a singleton.
Without this, several core dumps of a domain might be created.
If a qemu-dm dies immediately (probably because of a wrong setting),
xend restarts the domain over and over.
That overloads the system.
There is already logic to avoid restarting too early, but it might
not work, because the xenstore entry 'xend/previous_restart_time'
is volatile: XendDomainInfo.destroy(), which removes the entry from
xenstore, is called in several places.
This patch also prevents restarting too early even at the first
domain creation.
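The two behaviours described above (only one restart in flight, and no
restart sooner than a minimum interval counted from the first creation
onward) can be sketched in Python as follows. The class RestartGuard,
its method names and the 60-second default are assumptions made for
this example and do not reproduce the xend code; the sketch is also not
thread-safe as written.

```python
class RestartGuard:
    """Allow one restart at a time, and none within a minimum interval.

    Hypothetical illustration; not thread-safe and not the xend code.
    """

    def __init__(self, created_at, minimum_interval=60.0):
        # Counting from creation time also throttles the very first restart.
        self.last_start = created_at
        self.minimum_interval = minimum_interval
        self.restarting = False

    def try_begin_restart(self, now):
        """Return True if a restart may begin at time 'now'."""
        if self.restarting:
            return False  # another restart is already in progress
        if now - self.last_start < self.minimum_interval:
            return False  # the domain was (re)started too recently
        self.restarting = True
        return True

    def finish_restart(self, now):
        """Record that the restart completed at time 'now'."""
        self.last_start = now
        self.restarting = False


if __name__ == '__main__':
    guard = RestartGuard(created_at=100.0)
    print(guard.try_begin_restart(110.0))
    print(guard.try_begin_restart(200.0))
```

Keeping the timestamp in an object that outlives the destroy path
sidesteps the volatility of the xenstore entry mentioned above; where
the real patch stores it is not stated here.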
The corruption happens every time we pass a sector aligned buffer
(instead of a page aligned buffer) to blkfront_aio. To trigger the COW
we have to write at least a byte to each page of the buffer, but we
must be careful not to overwrite useful content.
Currently cpufreq HW-ALL coordination is handled the same way as
SW-ALL. However, SW-ALL causes more IPIs, which is bad for cpuidle.
This patch handles HW-ALL coordination differently from SW-ALL, to
improve performance and reduce IPIs. We also suspend/resume the HW-ALL
dbs timer for idle.
Signed-off-by: Yu, Ke <ke.yu@intel.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Tian, Kevin <kevin.tian@intel.com>
This patch makes two small changes to dom0 iptables rules that permit
(and revoke) domU network access.
First:
Currently, a rule intended to allow domU network access is appended to
the end of the FORWARD chain, where it can be preempted by other
rules. This patch causes the rule to be inserted at the top, where
it's more likely to have the intended effect.
Second:
In some cases (e.g. Fedora 9's default iptables configuration), the
first rule alone is insufficient to permit two-way packet flow. This
patch adds a second rule to the FORWARD chain that permits replies to
domU network requests to reach the domU vif.
Signed-off-by: Chris Bookholt <hap10@tycho.ncsc.mil>
Do not share synchronization variables between the trap handler and
the softirq handler, as this will cause problems. Abstract the
synchronization bits into functions. Make the synchronization code
aware of a panic, so that spinning with interrupts disabled is
avoided.
To avoid problems with MCEs happening while we're doing recovery
actions in the softirq handler, implement a deferred list of telemetry
structures, using the mctelem interfaces. This list will get updated
atomically, so any additional MCEs will not cause error telemetry to
be missed or not handled.
Signed-off-by: Frank van der Linden <frank.vanderlinden@sun.com>
Signed-off-by: Liping Ke <liping.ke@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
tools: Always use sane upstream (`native') python paths
Previously, by default we would install our python modules into
/usr/lib/python/xen, for example /usr/lib/python/xen/__init__.py.
Upstream python's standard install location (a) includes the Python
version number and (b) puts things in site-packages by default.
Our best conjecture for the reason for this was an attempt to make the
installs portable between different python versions. However, that
doesn't work because compiled python modules (.pyc), and C python
extensions corresponding to one version of python, are not compatible
across different versions of python.
This is why upstream includes the version number.
site-packages is the standard location for locally-installed packages
and is automatically included on the python search path.
In this change, we abandon our own unusual python path setup:
* Invoke setup.py in an entirely standard manner. We pass
PREFIX and DESTDIR using the appropriate options provided by
setup.py for those purposes (adding them to setup.py calls
which were previously lacking them).
* Since the installation locations are now on the standard
python path, we no longer need to add anything to the path
in any of our python utilities. Therefore remove all that
code from every python script. (Many of these scripts
unconditionally added /usr/lib/python and /usr/lib64/python which
is wrong even in the old world.)
* There is no longer any special `Xen python path'. xen-python-path
is no longer needed. It is no longer called by anything in our
tree. However since out-of-tree callers may still invoke it, we
retain it. It now prints a fixed string referring to a directory
which does not exist; callers (who use it to augment their
python path) will thus add a nonexistent directory to their python
path which is harmless.
* Remove various workarounds including use of setup.py --home
(which is intended for something completely different).
* Remove tests for the XEN_PYTHON_NATIVE_INSTALL build-time
environment variable. The new behaviour is the behaviour which we
should have had if this variable had been set. That is, it is now
as if this variable was always set but also bugs in the resulting
install have been fixed.
This should be a proper fix for the bug addressed by c/s 19515.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
x86 mce: Small cleanups to machine-check hypercall handling.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
When xend calls xc.test_assign_device, xend does not have to give
the domain-ID of a guest domain.
This patch gives domain-ID 0 to xc.test_assign_device.
The following methods give domain-ID 0 to xc.test_assign_device
currently.
- setupDevice@xend/server/pciif.py
- pciinfo@xend/XendNode.py