hw/piix4acpi: Make writes to ACPI_DBG_IO_ADDR actually work.
The ACPI AML code has little snippets where it uses two
memory locations to stash debug information when doing PCI
hotplug, such as:
    Device (S20)
    {
        Name (_ADR, 0x00040000)
        Name (_SUN, 0x04)
        Method (_EJ0, 1, NotSerialized)
        {
            Store (0x20, \_GPE.DPT1)
            Store (0x88, \_GPE.DPT2)
            Store (One, \_GPE.PH20)
        }
        Method (_STA, 0, NotSerialized)
        {
            Store (0x20, \_GPE.DPT1)
            Store (0x89, \_GPE.DPT2)
        }
    }
piix4acpi, xen, hotplug: Fix race with ACPI AML code and hotplug.
This is a race so the amount varies but on a 4PCPU box
I seem to get only ~14 out of 16 vCPUs I want to online.
The issue at hand is that QEMU xenstore.c hotplug code changes
the vCPU array and triggers an ACPI SCI for each vCPU
online/offline change. That means we modify the array of vCPUs
as the guest's ACPI AML code is reading it, resulting in
the guest reading the data only once and not changing the
CPU states appropriately.
The fix is to separate the vCPU array changes from the ACPI SCI
notification. The code now will enumerate all of the vCPUs
and change the vCPU array if there is a need for a change.
If a change did occur then only _one_ ACPI SCI pulse is sent
to the guest. The vCPU array at that point has the online/offline
modified to what the user wanted to have.
Specifically, if a user provided this command:
xl vcpu-set latest 16
(guest config has vcpus=1, maxvcpus=32) QEMU and the guest
(in this case Linux) would do:
QEMU:                                          Guest OS:
-xenstore_process_vcpu_set_event
 -> Gets an XenBus notification for CPU1
 -> Updates the gpe_state.cpus_state bitfield.
 -> Pulses the ACPI SCI
                                               - ACPI SCI kicks in
 -> Gets an XenBus notification for CPU2
 -> Updates the gpe_state.cpus_state bitfield.
 -> Pulses the ACPI SCI
 -> Gets an XenBus notification for CPU3
 -> Updates the gpe_state.cpus_state bitfield.
 -> Pulses the ACPI SCI
 ...
                                               - Method(PRST) invoked
 -> Gets an XenBus notification for CPU12
 -> Updates the gpe_state.cpus_state bitfield.
 -> Pulses the ACPI SCI
                                               - reads AF00 for CPU state
                                                 [gets 0xff]
                                               - reads AF02 [gets 0x7f]
 -> Gets an XenBus notification for CPU13
 -> Updates the gpe_state.cpus_state bitfield.
 -> Pulses the ACPI SCI
 .. until VCPU 16
                                               - Method PRST updates
                                                 PR01 through 13 FLG
                                                 entry.
                                               - PR01->PR13 _MAD
                                                 invoked.
                                               - Brings up 13 CPUs.
While QEMU updates the rest of the cpus_state bitfields the ACPI AML
only does the CPU hotplug on those it had read.
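The fix described above (apply every bitmap change first, then notify once) could be sketched as follows. This is only an illustration, not the code from the patch: `cpus_state` here is a plain byte array, and `apply_vcpu_target`, `pulse_sci` and `sci_pulses` are hypothetical names, with the SCI modelled as a counter.

```c
#include <stdint.h>

#define MAX_VCPUS 32                      /* assumed; matches maxvcpus=32 above */

static uint8_t cpus_state[MAX_VCPUS / 8]; /* one bit per vCPU, 1 = online */
static int sci_pulses;                    /* stand-in for raising the ACPI SCI */

static void pulse_sci(void)
{
    sci_pulses++;
}

/* Bring vCPUs 0..target-1 online and all others offline.
 * Every bitmap change is applied before the guest is notified,
 * and the guest is notified at most once. Returns 1 if anything changed. */
static int apply_vcpu_target(unsigned target)
{
    int changed = 0;

    for (unsigned i = 0; i < MAX_VCPUS; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        int want = i < target;
        int have = (cpus_state[i / 8] & mask) != 0;

        if (want == have)
            continue;
        if (want)
            cpus_state[i / 8] |= mask;
        else
            cpus_state[i / 8] &= (uint8_t)~mask;
        changed = 1;
    }
    if (changed)
        pulse_sci();   /* single pulse, after the array is final */
    return changed;
}
```

With this shape, a request for 16 vCPUs produces one pulse and the guest's PRST method reads a bitmap that no longer changes underneath it.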
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (for 4.3 release)
(cherry picked from commit f62079cd7de6ec37f48dfc80fb5906f49fecd6f6)
Ian Jackson [Thu, 17 Jan 2013 15:52:16 +0000 (15:52 +0000)]
e1000: fix compile warning introduced by security fix, and debugging
e33f918c19e393900b95a2bb6b10668dfe96a8f2, the fix for XSA-41,
and its cherry picks in 4.2 and 4.1 introduced this compiler warning:
hw/e1000.c:641: warning: 'return' with a value, in function returning void
In upstream qemu (where this change came from), e1000_receive returns
a value used by queueing machinery to decide whether to try
resubmitting the packet later. Returning "size" means that the packet
has been dealt with and should not be retried.
In this old branch (aka qemu-xen-traditional), this machinery is
absent and e1000_receive returns void. Fix the return statement.
Also add a debugging statement along the lines of the others in this
function.
e1000: Discard packets that are too long if !SBP and !LPE
The e1000_receive function for the e1000 needs to discard packets longer than
1522 bytes if the SBP and LPE flags are disabled. The Linux driver assumes
this behavior and allocates memory based on this assumption.
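A minimal sketch of the length check described above. The function name `e1000_should_discard` is hypothetical, and the RCTL bit values are quoted from memory of the Intel e1000 register definitions, so treat them as assumptions to verify against the header actually in use.

```c
#include <stddef.h>
#include <stdint.h>

#define RCTL_SBP 0x00000004u            /* store bad packets (assumed value) */
#define RCTL_LPE 0x00000020u            /* long packet enable (assumed value) */
#define MAX_ETHERNET_VLAN_SIZE 1522u    /* limit quoted in the text above */

/* Returns 1 if a received frame of `size` bytes should be dropped:
 * it is longer than 1522 bytes and neither SBP nor LPE is set. */
static int e1000_should_discard(uint32_t rctl, size_t size)
{
    return size > MAX_ETHERNET_VLAN_SIZE &&
           !(rctl & RCTL_SBP) &&
           !(rctl & RCTL_LPE);
}
```

In a receive path that returns void, the caller would simply return early when this predicate is true.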
Signed-off-by: Michael Contreras <michael@inetric.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
[ This is a security vulnerability, CVE-2012-6075 / XSA-41. ]
(cherry picked from commit 4c2cae2a882db4d2a231b27b3b31a5bbec6dacbf)
Ian Jackson [Tue, 13 Nov 2012 18:25:17 +0000 (18:25 +0000)]
mapcache: Fix invalidate if memory requested was not bucket aligned
When memory is mapped in qemu_map_cache with lock != 0, a reverse mapping
is created pointing to the virtual address of the location requested.
The cached mapped entry is saved in last_address_vaddr with the memory
location of the base virtual address (without bucket offset).
However, when this entry is invalidated, the virtual address saved in the
reverse mapping is used. As a result the mapping is freed but
last_address_vaddr is not reset.
Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit d94efd9aa814f17f3243dae91476dc42b5ad052e)
Conflicts:
hw/xen_machine_fv.c
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 6 Sep 2012 16:05:30 +0000 (17:05 +0100)]
Disable qemu monitor by default. The qemu monitor is an overly
powerful feature which must be protected from untrusted (guest)
administrators.
Neither xl nor xend expect qemu to produce this monitor unless it is
explicitly requested.
This is a security problem, XSA-19. Previously it was CVE-2007-0998
in Red Hat but we haven't dealt with it in upstream. We hope to have
a new CVE for it here but we don't have one yet.
- if ioreq->postsync call bdrv_flush when the operation is actually
completed;
- do not increment aio_inflight when not submitting any operations.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit f0fad5549f8103fc0130d3121eb5f7913c5bc2a9)
qemu-xen-traditional: use O_DIRECT to open disk images with QDISK
Also enable batch_maps, use_aio and disable syncwrite.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 0efae2f1fe8f72628c58b3683f62725a613fcec3)
qemu-xen-traditional: use O_DIRECT to open disk images for IDE
[ Major performance fix. -iwj ]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 1307e42a4b3c1102d75401bc0cffb4eb6c9b7a38)
qemu-xen: ignore console disconnect events for console/0
The first console has a different location compared to other PV devices
(console, rather than device/console/0) and doesn't obey the xenstore
state protocol. We already special case the first console in con_init
and con_initialise, we should also do it in con_disconnect.
Ian Jackson [Tue, 1 Feb 2011 17:32:38 +0000 (17:32 +0000)]
vnc, xen: write vnc address and password to xenstore
The xend protocol as actually implemented is:
* xend writes:
/vm/UUID/vncpasswd = "PASS" (n0,rDOMID)
/local/domain/0/backend/vfb/DOMID/0/vncunused = "0" (n0,rDOMID)
/local/domain/0/backend/vfb/DOMID/0/vnc = "1" (n0,rDOMID)
/local/domain/0/backend/vfb/DOMID/0/vnclisten = "ADDR" (n0,rDOMID)
/local/domain/0/backend/vfb/DOMID/0/vncdisplay = "PORT" (n0,rDOMID)
/local/domain/0/backend/vfb/DOMID/0/vncpasswd = "PASS" (n0,rDOMID)
* qemu reads /vm/UUID/vncpasswd and overwrites it with "\0"
* qemu writes
/local/domain/DOMID/console/vnc-port = "PORT" (n0,rDOMID)
* xm vncviewer reads entries from backend/vfb,
as well as console/vnc-port.
Much of this is insane.
xl quite properly does not create anything in backend/vfb for an HVM
domain with no vfb. But xl vncviewer needs to know the port number
and the address and the password.
So, for now, have qemu write these nodes too:
/local/domain/DOMID/console/vnc-listen = "ADDR" (n0,rDOMID)
/local/domain/DOMID/console/vnc-pass = "PASS" (n0,rDOMID)
This corresponds to the protocol actually currently implemented in
libxl.
We will revisit this after the 4.1 release and invent a non-insane
protocol.
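For illustration, the per-domain console node paths listed above could be built like this. The helper name `vnc_console_path` is hypothetical; the sketch only formats the path string and does not talk to xenstore.

```c
#include <stdio.h>

/* Format "/local/domain/DOMID/console/LEAF" into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int vnc_console_path(char *buf, size_t len,
                            unsigned domid, const char *leaf)
{
    int n = snprintf(buf, len, "/local/domain/%u/console/%s", domid, leaf);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it with the leaves "vnc-port", "vnc-listen" and "vnc-pass" yields the three nodes that xl vncviewer is described as needing.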
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
If there are no devices assigned to the domain at boot, we don't read
the default pci passthrough parameters. This patch fixes it. Reading
num_devs is completely useless hence I am removing it.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
qemu-xen: disable buffering on the save file for stubdoms
We need to issue reads with the exact number of bytes to read the
qemu-xen save file, and to do that this patch disables buffering on all
the savevm reads/writes for stubdoms.
[This is pretty horrid; hopefully there will be better solution for 4.2 -iwj]
qemu-xen: use dynticks instead of a static 10ms timeout
Use dynticks instead of polling the timers every 10ms.
This allows a qemu running in dom0 to wake up only when the next timer
goes off (that is every 100ms, because of the buffer_io_timer) instead
of every 10ms.
For the moment stubdoms still run with the old 10ms timeout because
minios doesn't support the posix timer_create interface yet.
Also disable the nographic_timer when CONFIG_DM because it is only
useful with tcg.
Chun Yan Liu [Wed, 5 Jan 2011 23:48:36 +0000 (23:48 +0000)]
fix '|' key display problem in en-us with altgr processing
Commit f95d202ed644 handles altgr-insert problem. Unfortunately, with
that patch, there is a problem in En-us keyboard: '|' (bar) cannot be
displayed. After checking keymap files, we found there are two
definitions to "bar" in en-us:
    bar 0x56 altgr (in "common")
    bar 0x2b shift (in "en-us")
The first line is actually invalid in the en-us
language. The 2nd definition will cover the 1st one.
The previous change didn't consider the multi-definition case. It scans
the keymap files and, if a keysym needs altgr, records that; afterwards,
if the keysym is pressed but altgr is not, it adds an altgr press
operation. That is correct if all keysyms are unique and valid. But in
the above multi-definition case there is a problem: when reading bar
0x56 altgr (in "common") it records altgr as needed, but in fact
that definition won't be used; the 2nd definition is always used and
doesn't need altgr. Then if the keysym is pressed, the code will still
add an altgr press operation, which causes the problem.
So, if we cannot avoid multi-definition in keymap files, the altgr
flag (whether altgr is needed or not) should also be refreshed according
to the 2nd definition. In the above case, when reading the 1st line, it
records altgr as needed; then, when reading the 2nd line, the 2nd definition
covers the 1st and the altgr flag should be reset (the 2nd definition
doesn't need altgr, so the altgr flag should be removed).
The following patch supplements f95d202ed644 and solves the
problem.
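The rule stated above (a later definition replaces both the keycode and the altgr flag) can be sketched like this. The table layout and the names `keydef`, `keymap` and `add_definition` are hypothetical and much simpler than the real keymap parser.

```c
#include <stdint.h>

#define MAX_KEYSYMS 256        /* assumed table size for the sketch */

struct keydef {
    uint8_t keycode;
    int     needs_altgr;
    int     defined;
};

static struct keydef keymap[MAX_KEYSYMS];

/* Record one keymap line. A later definition of the same keysym
 * overrides the earlier one, including its altgr flag. */
static void add_definition(unsigned keysym, uint8_t keycode, int needs_altgr)
{
    if (keysym >= MAX_KEYSYMS)
        return;
    keymap[keysym].keycode = keycode;
    keymap[keysym].needs_altgr = needs_altgr;  /* refreshed, not only ever set */
    keymap[keysym].defined = 1;
}
```

Feeding it the two "bar" lines in file order leaves keycode 0x2b with the altgr flag cleared, which is the behaviour the paragraph above asks for.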
Signed-off-by: Chun Yan Liu <cyliu@novell.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
The technique is the same used with MSI: if the guest enables an MSIX
passing 0 as vector number, then read the address and use it as pirq
number for the following mapping request to Xen.
Ian Jackson [Tue, 14 Dec 2010 18:39:14 +0000 (18:39 +0000)]
xenfb: let xenfb_guest_copy() handle depth=32 case
In hw/xenfb.c, xenfb_guest_copy only handles the xenfb->depth=8 and 24
cases; I guess it assumes that in the xenfb->depth=16 or 32 cases the buffer is
shared. But that's not always the case: the code path that allows us
to have a shared buffer when xenfb->depth=16 or 32 requires xenfb->do_resize
to be set. On a guest vnc console, when pressing CTRL+ALT+2 to switch to the
qemu monitor console and then CTRL+ALT+1 to go back to the guest window,
xenfb->do_resize is not set, that is, the buffer is not shared, and
xenfb_guest_copy does not handle the xenfb->depth=32 case. The result is that
the guest screen cannot be restored.
To fix the above problem, this patch does two things:
1. Set xenfb->do_resize in xenfb_invalidate so that in the console switch
case the buffer is shared when xenfb->depth=16 or 32. This solves the
screen restore bug described above.
2. To avoid other special cases having the same problem, it's
better to let xenfb_guest_copy handle all cases, so add processing for
xenfb->depth=16 and 32 in xenfb_guest_copy.
Signed-off-by: Chun Yan Liu <cyliu@novell.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Jackson [Tue, 23 Nov 2010 17:52:44 +0000 (17:52 +0000)]
xen_disk: backport from upstream qemu
xen_disk is a pure userspace blkback implementation that can be used to
provide a disk backend called qdisk.
It is particularly useful with a dom0 kernel that doesn't have blktap2
(Linux 2.6.37).
[ This is a cherry pick of git commit 62d23efac8905a46277f666c909e826f91c12aa1
aka git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7223 c046a42c-6fe2-441c-8c8c-71466251a162
originally from Gerd Hoffmann and Anthony Liguori, edited
by Stefano to fit into current qemu-xen-unstable. -iwj ]
Ian Jackson [Tue, 23 Nov 2010 16:40:08 +0000 (16:40 +0000)]
qemu-xen: build adjustments to support out-of-tree builds
QEMU by itself can be built outside of its source directory. With the
qemu repository being separate from the hypervisor/tools one it seems
to make sense to make use of this feature, but doing so requires a
couple of adjustments to the Xen changes to it. Basically, if
CONFIG_QEMU is found to indicate an existing directory, this directory
will be used rather than cloning the git repo into the build tree.
Ian Jackson [Tue, 9 Nov 2010 18:01:13 +0000 (18:01 +0000)]
piix4acpi, xen: change in ACPI to match the change in the BIOS.
Some changes have been introduced in the Xen firmware to match QEMU's
BIOS. So this patch adds the new sleep state values and handles the old
and new ACPI IOPort mappings.
QEMU-Xen uses new ioport by default, but if it's a saved state with old
firmware, it unmaps the new ioport and maps the old one.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Ian Jackson [Mon, 8 Nov 2010 17:09:54 +0000 (17:09 +0000)]
stubdom: fix handling of dependency files
The previous change to switch qemu's make include directives to use
.*.d rather than *.d didn't consider the stubdom side, where no -MF
was passed to the compiler so far.
Ian Jackson [Wed, 3 Nov 2010 12:46:45 +0000 (12:46 +0000)]
block-vvfat.c: fix warnings with _FORTIFY_SOURCE
In function 'snprintf',
inlined from 'init_directories' at block-vvfat.c:868:10,
inlined from 'vvfat_open' at block-vvfat.c:1065:24:
/usr/include/bits/stdio2.h:65:3: warning: call to __builtin___snprintf_chk will always overflow destination buffer
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Olaf Hering <olaf@aepfle.de>
Ian Jackson [Wed, 3 Nov 2010 12:45:24 +0000 (12:45 +0000)]
pc.c: Fix compiler warning in load_linux
fix compile warning.
/usr/src/packages/BUILD/xen-unstable.hg-4.1.21864/tools/ioemu-dir/hw/pc.c: In function 'load_linux':
/usr/src/packages/BUILD/xen-unstable.hg-4.1.21864/tools/ioemu-dir/hw/pc.c:713:39: warning: operation on 'seg[4]' may be undefined
Ian Jackson [Thu, 28 Oct 2010 11:26:02 +0000 (12:26 +0100)]
qemu: fix incremental rebuild
While the .*.d dependency files get built nicely during the initial
build, they never got actually used: make's $(wildcard ) function acts
like the shell's, i.e. *.d doesn't match any file name starting with
'.' and hence none of the files would ever be used.
For the clean: rules the issue is the same, except here it should have
been very obvious that removing *.d won't do what was intended.
Ian Jackson [Thu, 21 Oct 2010 16:59:20 +0000 (17:59 +0100)]
e1000: Handle IO Port.
This patch introduces the two IOPorts on e1000, IOADDR and IODATA. The
IOADDR is used to specify which register we want to access when we read
or write on IODATA.
It also checks the RDLEN register when a packet is received: if the value
is 0, the receive descriptor buffer is not set, so we don't accept any
network packets.
This patch fixes some weird behavior that I see when I use e1000 with
QEMU/Xen: the guest memory can be corrupted by this NIC because it will
write to memory that it doesn't own anymore after a reset. This is because
the Linux kernel uses the IOPort to reset the network card instead of the
MMIO.
This patch also introduces the e1000_reset function.
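The IOADDR/IODATA pair described above is an index/data register window. A minimal sketch follows, assuming a small register file and that RDLEN lives at byte offset 0x2808 (quoted from memory of the e1000 register map, so verify before relying on it). The names `nic`, `io_write_addr`, `io_write_data`, `io_read_data` and `can_receive` are hypothetical.

```c
#include <stdint.h>

#define NUM_REGS    0x1000u           /* assumed register file size, in 32-bit words */
#define RDLEN_INDEX (0x2808u >> 2)    /* assumed RDLEN byte offset / 4 */

struct nic {
    uint32_t ioaddr;                  /* register offset latched via IOADDR */
    uint32_t regs[NUM_REGS];
};

/* A write to IOADDR selects which register IODATA will access. */
static void io_write_addr(struct nic *n, uint32_t val)
{
    n->ioaddr = val;
}

/* A write to IODATA stores into the selected register. */
static void io_write_data(struct nic *n, uint32_t val)
{
    uint32_t idx = n->ioaddr >> 2;

    if (idx < NUM_REGS)
        n->regs[idx] = val;
}

/* A read from IODATA returns the selected register. */
static uint32_t io_read_data(const struct nic *n)
{
    uint32_t idx = n->ioaddr >> 2;

    return idx < NUM_REGS ? n->regs[idx] : 0;
}

/* Refuse packets while no receive descriptor ring is configured. */
static int can_receive(const struct nic *n)
{
    return n->regs[RDLEN_INDEX] != 0;
}
```

After a reset clears the register file, `can_receive` returns 0 until the driver programs RDLEN again, which is the guard against writing into memory the NIC no longer owns.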
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 14 Sep 2010 16:31:43 +0000 (17:31 +0100)]
ioemu: fix VNC altgr-insert behavior
When accessing a Xen DomU (Linux) from a VNC client in Windows, the alt-gr
key does not work properly with a Spanish keyboard. When Alt + another
key is pressed, vncserver receives Altgr down, Altgr up and key down
messages in that order, which causes incorrect output.
With the following patch, when vncserver receives a key down message, it
first checks whether the keysym needs the altgr modifier; if it does
but altgr is not 'down', it sends the altgr keycode before sending
the key keycode.
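The key-down rule above can be sketched as follows. `KEYCODE_ALTGR`, `send_keycode`, `key_down` and the small `sent` log are hypothetical names for illustration; the real VNC server code tracks modifier state differently.

```c
#include <stddef.h>

#define KEYCODE_ALTGR 0xb8   /* hypothetical keycode value for AltGr */

static int    sent[8];       /* log of keycodes forwarded to the guest */
static size_t nsent;

static void send_keycode(int keycode)
{
    if (nsent < sizeof(sent) / sizeof(sent[0]))
        sent[nsent++] = keycode;
}

/* On key down: if the keysym needs AltGr and AltGr is not currently
 * held, send the AltGr keycode first, then the key itself. */
static void key_down(int keycode, int needs_altgr, int altgr_is_down)
{
    if (needs_altgr && !altgr_is_down)
        send_keycode(KEYCODE_ALTGR);
    send_keycode(keycode);
}
```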
Ian Jackson [Thu, 2 Sep 2010 18:08:58 +0000 (19:08 +0100)]
passthrough: enabling IGD passthrough for Calpella and Sandybridge
This patch enables IGD passthrough for Calpella and Sandybridge
platforms. To minimize impact of these changes, it checks for
vendor_id of 0x8086 before creating another PCH device in the virtual
platform. For opregion, it checks both vendor_ID of 0x8086 and a
non-zero PCI opregion value on device 0:2.0 before mapping the
opregion.
Ian Jackson [Thu, 19 Aug 2010 16:45:33 +0000 (17:45 +0100)]
passthrough: graphics passthrough cleanup
This patch was originally titled Calpella/Sandybridge IGD passthrough.
In this revision, I have separated the cleanup portion of the patch from
the Calpella/Sandybridge portion. This cleanup portion incorporates
input from Stephano and a proposed patch from Isaku. The main changes
are consolidating graphics passthrough specific code into
pt-graphics.c and removal of hardcoded intercepts for host bridge
(device 00:00.0) accesses in pci.c.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Ian Jackson [Thu, 8 Jul 2010 16:33:29 +0000 (17:33 +0100)]
Move the xenfb pointer handler to the connected method
Ensure that we read "request-abs-pointer" after the frontend has written
it. This means that we will correctly set up an absolute or relative
pointer handler.
Signed-off-by: John Haxby <john.haxby@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Jackson [Thu, 8 Jul 2010 16:31:57 +0000 (17:31 +0100)]
Introduce a new 'connected' xendev op called when Connected.
Rename the existing xendev 'connect' op to 'initialised' and introduce
a new 'connected' op. This new op, if defined, is called when the
backend is connected. Note that since there is no state transition this
may be called more than once.
Signed-off-by: John Haxby <john.haxby@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Jackson [Tue, 29 Jun 2010 13:42:48 +0000 (14:42 +0100)]
stubdom: fix creation hang by not initialising xenfb_pv if nographic
Contributed-by: Eric Chanudet <eric.chanudet@citrix.com>
Modified-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Wed, 9 Jun 2010 16:10:59 +0000 (17:10 +0100)]
Fix read-only image file handling
Hi,
this is the patch for qemu-xen-3.4-testing to fix the read-only
image file handling. The image file was always treated as
read-write, which means that all the HVM guests were able to
write to all the disk images available in the domain configuration
file no matter what mode was defined for the image. This
patch fixes this functionality to honor O_RDONLY in the
BDRV_O_ACCESS flag in block.c and also fixes the IDE and SCSI
interfaces that use it.
It's been tested on RHEL-5 with xen-3.4-testing version of
upstream xen with xen-3.4-testing qemu implementation. The
patch is applicable to qemu-xen-unstable.git as well with no
modifications.
If an image is set as read-only in the domain configuration file
but you mount it in the guest without the read-only option,
processing the requests results in I/O errors.
Remounting as read-only, or unmounting and remounting using
`mount /dev/* /path/to/mount -o ro`, mounts the disk the
correct way, i.e. with no I/O errors, so make sure you mount
those disks as read-only, otherwise you may get errors like:
end_request: I/O error, dev hdb, sector 52
Buffer I/O error on device hdb1, logical block 1
lost page write due to I/O error on hdb1
and for IDE devices you'll be getting several additional DeviceFault
errors since mounting the device read-write (default setting) writes
some data onto a disk at the mount-time.
For SCSI devices the DATA PROTECT request sense has been added
as found at: http://en.wikipedia.org/wiki/SCSI_Request_Sense_Command
Michal
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Ian Jackson [Fri, 21 May 2010 14:46:55 +0000 (15:46 +0100)]
Wait for frontend state Connected before connecting the backend
The frontend of the framebuffer sets a value (request-abs-pointer) and goes
to the state Connected. The backend must read this value only when the
frontend is in the state Connected.
From: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Ian Jackson [Fri, 30 Apr 2010 16:41:45 +0000 (17:41 +0100)]
Implement 'xm vcpu-set' command for HVM guest
Currently Xen has the 'xm vcpu-set' command for PV domains, but it is not
available for HVM domains. This patch enables the 'xm vcpu-set'
command for HVM domains. It sets up a vcpu watch in xenstore and, on the qemu
side, handles vcpu online/offline accordingly. With this patch, the 'xm
vcpu-set' command works for both PV and HVM guests with the same format.
Ian Jackson [Tue, 13 Apr 2010 11:07:33 +0000 (12:07 +0100)]
passthrough: fix segmentation fault after hotplug pass-through device
This patch fixes the QEMU segmentation fault that occurs after hotplugging
pass-through devices with MSI-X many times.
There is a wrong boundary check in cpu_register_io_memory that uses
io_index rather than io_mem_nb. After many hotplugs of an MSI-X
pass-through device, io_mem_read[] gets extended and overwrites mmio_cnt,
which then causes a QEMU segmentation fault.
This fix syncs with the upstream QEMU code in exec.c, and frees the unused
io_mem_XXX element after hot removal.
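One way to picture a bounded slot table with release on hot removal is the sketch below. It is an assumption-laden illustration rather than the exec.c code: the table size is deliberately tiny and the names `io_mem_used`, `get_free_io_mem_idx` and `release_io_mem_idx` are used here only as illustrative labels.

```c
#define IO_MEM_NB_ENTRIES 8            /* table size, kept small for the sketch */

static char io_mem_used[IO_MEM_NB_ENTRIES];

/* Return a free slot index, or -1 when the table is full.
 * The bound is checked against the table size, so repeated
 * registrations can never index past the end of the arrays. */
static int get_free_io_mem_idx(void)
{
    for (int i = 0; i < IO_MEM_NB_ENTRIES; i++) {
        if (!io_mem_used[i]) {
            io_mem_used[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Release a slot on hot removal; out-of-range indexes are ignored. */
static void release_io_mem_idx(int idx)
{
    if (idx >= 0 && idx < IO_MEM_NB_ENTRIES)
        io_mem_used[idx] = 0;
}
```

Because removed devices give their slot back, repeated hotplug cycles reuse the same indexes instead of growing toward the end of the table.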
Ian Jackson [Thu, 8 Apr 2010 15:56:24 +0000 (16:56 +0100)]
passthrough: fix header type register emulation
This patch fixes the emulation of latency timer and header type.
The change set of cc1a204423475ff7a918b11d78b9ae637f320e23
deleted the header type register emulation.
On the other hand, the change set of ec5e52d5cb2e6f8851c345b7c3095fe2030fff9c
tries to update header type emulation, however it wrongly
touches latency timer emulation part.
I think this was caused by mis-merging. This patch sorts it out.
Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Ian Jackson [Wed, 24 Mar 2010 17:15:12 +0000 (17:15 +0000)]
Add sanity check for vcpu config
Currently Xen/Qemu supports a maximum of 128 vcpus. To avoid mis-setting
in the config file, this patch adds a sanity check for the vcpu config:
1. maxvcpus and vcpus should be no more than HVM_MAX_VCPUS (128)
2. vcpus should be no more than maxvcpus.
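The two rules above amount to a small predicate. The function name `vcpu_config_valid` is hypothetical; only HVM_MAX_VCPUS and its value of 128 come from the text.

```c
#define HVM_MAX_VCPUS 128

/* Returns 1 if the (vcpus, maxvcpus) pair passes both checks, 0 otherwise. */
static int vcpu_config_valid(unsigned vcpus, unsigned maxvcpus)
{
    /* Rule 1: neither value may exceed HVM_MAX_VCPUS. */
    if (vcpus > HVM_MAX_VCPUS || maxvcpus > HVM_MAX_VCPUS)
        return 0;
    /* Rule 2: vcpus may not exceed maxvcpus. */
    if (vcpus > maxvcpus)
        return 0;
    return 1;
}
```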
Ian Jackson [Thu, 18 Mar 2010 16:50:44 +0000 (16:50 +0000)]
Allow changing CD for /dev/xvdX devices.
We found that it is not possible to change the CD on an HVM domain
when it is a /dev/xvdX device.
It is possible on the /dev/hdc device,
but it is impossible on the /dev/xvdc device.
We want this to work for all /dev/xvdX devices on an HVM domain
as well as on a PV domain.
Signed-off-by: Takanori Kasai <kasai.takanori@jp.fujitsu.com>
The execution method is as follows.
----------------------------------------------------------------------
Domain configuration file:
disk = ["tap:aio:/<guest image file>,xvda,w", ",xvdc:cdrom,r"]
Operation that assign CD:
# xm block-configure <domain> file:<iso image> xvdc:cdrom r
Operation that releases CD
# xm block-configure <domain> '' xvdc:cdrom r
----------------------------------------------------------------------
Ian Jackson [Thu, 18 Mar 2010 16:45:51 +0000 (16:45 +0000)]
Fix vcpu hotplug bug: get correct vcpu_avail bitmap
Currently qemu has a bug: when maxvcpus > 64, qemu gets the wrong
vcpu bitmap (s->cpus_sts[i]) since it only reads the bitmap from a long variable.
This patch, together with another xend python patch, fixes this bug.
It gets a hex string from xend and converts it to the correct vcpu_avail bitmap,
which is stored in a uint32_t array.
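A sketch of converting a hex string into a uint32_t bitmap wider than one long. The exact string format passed by xend is not given here, so this assumes an optional "0x" prefix followed by hex digits, most significant first; `parse_vcpu_avail` and `VCPU_WORDS` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

#define HVM_MAX_VCPUS 128
#define VCPU_WORDS    (HVM_MAX_VCPUS / 32)

/* Parse a hex string into a little-endian array of 32-bit words,
 * so that bit n of the bitmap is bit (n % 32) of map[n / 32].
 * Returns 0 on success, -1 on a bad digit or if the value is too wide. */
static int parse_vcpu_avail(const char *hex, uint32_t map[VCPU_WORDS])
{
    size_t len;

    memset(map, 0, VCPU_WORDS * sizeof(map[0]));
    if (hex[0] == '0' && (hex[1] == 'x' || hex[1] == 'X'))
        hex += 2;
    len = strlen(hex);
    if (len == 0 || len > VCPU_WORDS * 8)
        return -1;

    for (size_t i = 0; i < len; i++) {
        char c = hex[len - 1 - i];          /* i-th digit from the right */
        uint32_t v;

        if (c >= '0' && c <= '9')
            v = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f')
            v = (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            v = (uint32_t)(c - 'A' + 10);
        else
            return -1;
        map[i / 8] |= v << ((i % 8) * 4);   /* 8 hex digits per 32-bit word */
    }
    return 0;
}
```

Parsing digit by digit avoids going through a single long, which is what limited the old code to 64 vCPUs.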
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
(This is [PATCH 2/2], the other half is in xen-unstable.hg)
Ian Jackson [Tue, 9 Mar 2010 17:55:41 +0000 (17:55 +0000)]
Enable sound
This enables sound emulation by fixing the missing feature in configure.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Contributed-by: Christian Motschke <christian@motschke.de>
(Trivial 3-line patch supplied without S-o-b from contributor.)
Ian Jackson [Mon, 1 Mar 2010 16:14:50 +0000 (16:14 +0000)]
passthrough: gfx_passthru: warning when vgabios rom has invalid checksum
In the native environment, the VGABIOS, the expansion ROM on the
graphics card, is placed into the 0C0000h address space, and then
executed. Of course, the checksum of the ROM must be valid.
After this initialization, the system BIOS, the actual BIOS of the M/B,
can resize the expansion ROM code to reduce the amount of occupied
space. If the system BIOS resizes it, a new checksum must be calculated
and stored in the ROM image that is on the RAM.
So, normally, shadowed VGABIOS, that is placed in 0C0000h, is already
modified and its checksum must be recalculated.
Qemu-dm copies 0C0000h's contents of the dom0 to guest's 0C0000h.
Guest re-uses dom0's used-up VGABIOS.
The problem that I mentioned is about this recalculated checksum.
System BIOS must guarantee the checksum after the resizing, but
some M/Bs do not.
However, after adjusting the checksum, the guest seems to work, and
current qemu-dm does so. The buggy system BIOS might just forget
to recalculate.
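For reference, an expansion ROM image is considered valid when all of its bytes sum to zero modulo 256. The sketch below checks that and, if needed, adjusts one byte so the sum becomes zero. Which byte qemu-dm actually adjusts is not stated above; using the last byte is an assumption, and `rom_checksum` and `rom_fix_checksum` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes modulo 256; a valid image sums to 0. */
static uint8_t rom_checksum(const uint8_t *rom, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    return sum;
}

/* Adjust the last byte so the image sums to 0 (assumed policy).
 * Returns 1 if the checksum was invalid and has been fixed, 0 otherwise. */
static int rom_fix_checksum(uint8_t *rom, size_t len)
{
    uint8_t sum;

    if (len == 0)
        return 0;
    sum = rom_checksum(rom, len);
    if (sum == 0)
        return 0;
    rom[len - 1] = (uint8_t)(rom[len - 1] - sum);
    return 1;
}
```

A caller can print a warning when `rom_fix_checksum` returns 1, which matches the intent in the commit title of warning about an invalid checksum.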
Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
Acked-by: Weidong Han <weidong.han@intel.com>
Ian Jackson [Tue, 16 Feb 2010 16:09:06 +0000 (16:09 +0000)]
passthrough: magic protocol passthrough fix fix
The previous changeset 60b80e3ee319e908069d1603e5b73f815acdffac had a
bug in qemu-xen-unstable, in that test_pci_slot is only in 3.4-testing.
This patch makes it use the new devfn-based interface.
Contributed-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 15 Feb 2010 14:08:53 +0000 (14:08 +0000)]
passthrough: magic ioport protocol no longer unplugs passthrough NICs
On Fri, 12 Feb 2010, Zhai, Edwin wrote:
> [bugs:]
>
> 1. Pass-through NICs are also unplugged, although them have different
> path with vnif and emulated NIC.
You are right, that is a bug and this patch should fix it.
Ian Jackson [Thu, 4 Feb 2010 17:04:48 +0000 (17:04 +0000)]
passthrough: support Intel IGD passthrough with VT-D
Some registers of Intel IGD are mapped in host bridge, so it needs to
passthrough these registers of physical host bridge to guest because
emulated host bridge in guest doesn't have these mappings.
Some VBIOSs and drivers assume the IGD BDF (bus:device:function) is
always 00:02.0, so this patch reserves 00:02.0 for assigned IGD in
guest.
(Patch modified slightly by Ian Jackson.)
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 1 Feb 2010 16:33:52 +0000 (16:33 +0000)]
Fix lost serial TX interrupts. Report receive overruns.
This patch corrects emulation errors in QEMU's 16550 uart emulation,
which cause compatibility issues with FreeBSD's uart(9) driver.
o Implement receive overrun status. The FreeBSD uart(9) driver
relies on this status in it's probe routine to determine the size
of the FIFO supported.
o As per the 16550 spec, do not overwrite the RX FIFO on an RX overrun.
o Do not allow TX or RX FIFO overruns to increment the data valid count
beyond the size of the FIFO.
o For reads of the IIR register, only clear the "TX holding register
empty" (THRE) interrupt if the read reports this interrupt. This
is required by the specification and avoids losing TX interrupts
when other, higher priority interrupts (usually RX) are reported first.
This patch also includes a fix for a second cause of lost TX interrupts,
which was submitted by Jergen Lock, and is already in the latest QEMU.
o If a receive interrupt is suppressed due to the FIFO not yet filling
to its interrupt threshold, do not also suppress any pending THRE
interrupt.
A version of this patch, against the latest QEMU, has also been submitted
to the qemu-devel mailing list.
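The IIR read rule described above (only clear the THRE interrupt when the read actually reports it) could be sketched as follows. The IIR values are the usual 16550 encodings; the `uart` struct and the function names are hypothetical and ignore the other interrupt sources.

```c
#include <stdint.h>

#define IIR_NO_INT 0x01   /* no interrupt pending */
#define IIR_THRI   0x02   /* transmitter holding register empty */
#define IIR_RDI    0x04   /* received data available */

struct uart {
    int rx_pending;       /* higher-priority receive interrupt pending */
    int thr_ipending;     /* THRE interrupt pending */
};

/* Highest-priority pending interrupt, without side effects. */
static uint8_t current_iir(const struct uart *u)
{
    if (u->rx_pending)
        return IIR_RDI;
    if (u->thr_ipending)
        return IIR_THRI;
    return IIR_NO_INT;
}

/* Guest read of IIR: clear THRE only if THRE is what was reported. */
static uint8_t read_iir(struct uart *u)
{
    uint8_t iir = current_iir(u);

    if (iir == IIR_THRI)
        u->thr_ipending = 0;
    return iir;
}
```

With both interrupts pending, the first read reports the receive interrupt and leaves THRE pending, so the TX interrupt is delivered on a later read instead of being lost.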
Ian Jackson [Mon, 4 Jan 2010 17:49:06 +0000 (17:49 +0000)]
passthrough: Fix MSI-x devices assignment.
Currently, assigned MSI-X devices fail to
work due to an incorrect table_offset_adjust setting.
The last field, msix_entry, of struct pt_msix_info is
a variable-size array, so there shouldn't be any field
after it; otherwise such fields may be overwritten
when msix_entry is accessed.
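The layout constraint can be illustrated with a C99 flexible array member. The structs below are simplified stand-ins, not the real pt_msix_info definition: `msix_entry_info`, `msix_info`, `total_entries` and `msix_info_alloc` are hypothetical, and only the names msix_entry and table_offset_adjust come from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct msix_entry_info {                 /* hypothetical per-entry data */
    uint32_t io_mem[4];
};

struct msix_info {
    int      total_entries;
    uint32_t table_offset_adjust;        /* fixed fields come first */
    struct msix_entry_info msix_entry[]; /* variable-size array must be last */
};

/* Allocate the header plus room for n entries in one block. */
static struct msix_info *msix_info_alloc(int n)
{
    struct msix_info *m;

    if (n < 0)
        return NULL;
    m = calloc(1, sizeof(*m) + (size_t)n * sizeof(m->msix_entry[0]));
    if (m)
        m->total_entries = n;
    return m;
}
```

Since every fixed field sits before the array, writing to any msix_entry element cannot overlap table_offset_adjust.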
Ian Jackson [Mon, 4 Jan 2010 17:48:14 +0000 (17:48 +0000)]
passthrough: always use hw intx and always get it from the same place
The assumption that function zero always uses INTA turns out not
to be true in the wild. This leaves us with three options.
1) Always use INTA
This was the case before multi-function pass-through was possible.
But with the advent of multi-function pass-through this may lead
to excessive virtual GSI sharing.
2) Fix emulation to use INTA for function zero
3) Always use the hardware value for INTx
There doesn't seem to be much between 2) and 3) but the latter seems
slightly cleaner so I advocate that approach.
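Option 3 means taking the INTx from the device's own Interrupt Pin register rather than assuming INTA. A small sketch follows, assuming a 256-byte copy of the device's PCI configuration space; `intx_from_pin` and `hw_intx` are hypothetical helpers, not the pci_read_intx()/pci_intx() functions from the patch.

```c
#include <stdint.h>

#define PCI_INTERRUPT_PIN 0x3d   /* Interrupt Pin register in PCI config space */

/* Map the Interrupt Pin value (1..4 = INTA..INTD) to a 0-based
 * INTx index; -1 means the function uses no pin or the value is invalid. */
static int intx_from_pin(uint8_t pin)
{
    return (pin >= 1 && pin <= 4) ? pin - 1 : -1;
}

/* Use whatever the hardware reports, for every function. */
static int hw_intx(const uint8_t config[256])
{
    return intx_from_pin(config[PCI_INTERRUPT_PIN]);
}
```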
Cc: Tom Rotenberg <tom.rotenberg@gmail.com>
Cc: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Ian Jackson [Mon, 4 Jan 2010 17:47:03 +0000 (17:47 +0000)]
passthrough: move pci_read_intx() and pci_intx()
Move pci_read_intx() and pci_intx() to above pt_irqpin_reg_init().
This is required for a subsequent patch where pt_irqpin_reg_init()
calls pci_read_intx().
Cc: Tom Rotenberg <tom.rotenberg@gmail.com>
Cc: Edwin Zhai <edwin.zhai@intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[patch 1/2] qemu-xen: pass-through: move pci_read_intx() and pci_intx()
From:
[patch 0/2] qemu-xen: pass-through: always use hw intx
pass-through: always use hw intx and always get it from the same place
The assumption that function zero always uses INTA turns out not
to be true in the wild. This leaves us with three options.
1) Always use INTA
This was the case before multi-function pass-through was possible.
But with the advent of multi-function pass-through this may lead
to excessive virtual GSI sharing.
2) Fix emulation to use INTA for function zero
3) Always use the hardware value for INTx
There doesn't seem to be much between 2) and 3) but the latter seems
slightly cleaner so I advocate that approach.
Ian Jackson [Mon, 4 Jan 2010 17:12:44 +0000 (17:12 +0000)]
HVM vcpu add/remove: qemu logic for vcpu add/remove
-- on the qemu side, get vcpu_avail, which is used for the original cpu avail map;
-- set up gpe ioread/iowrite in qemu;
-- set up the vcpu add/remove user interface through the monitor;
-- set up SCI logic;
Ian Jackson [Mon, 4 Jan 2010 16:21:55 +0000 (16:21 +0000)]
implement cdrom eject from the guest
Hi all,
this patch allows a guest to eject the cdrom: when qemu detects that a
cdrom eject request has been issued by the guest, it writes eject to the
corresponding xenstore frontend, so that the toolstack can take care of
removing the current cdrom frontend\backend couple and create an empty one
instead.
Ian Jackson [Mon, 4 Jan 2010 16:21:02 +0000 (16:21 +0000)]
stubdom: fix cdrom changing
Hi all,
the current code to change a cdrom doesn't work with stubdoms:
- media_filename set at boot time doesn't have the proper
value (that in the stubdom case is the frontend path and not the
filename);
- when a cdrom watch event is triggered, the code to decide whether the
new cdrom is valid and different from the current cdrom doesn't work for
stubdoms;
both issues are fixed by this patch, in particular now media_filename
consistently holds the frontend path for stubdoms while bs->filename
holds the filename (like in the normal qemu case) to allow comparisons
with the old cdrom filename.
If pt_find_reg_grp() fails and returns NULL, it will jump to out:,
but at this time reg is still NULL (pt_find_reg() is not reached)
which leads to a NULL dereference.
This patch fixes it.
Submitted-By: Qing He <qing.he@intel.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 6 Nov 2009 18:11:50 +0000 (18:11 +0000)]
passthrough: add no_wb option for pci conf write
Current pt_pci_write_config always writes back to real pci conf
space. However, in the case of MSI address and data registers,
if guest changes the affinity of the interrupt, stale data will
be written to these registers. This is particularly a problem
if Xen uses per-CPU vector, where the interrupt in question fails
to work. This patch fixes this by adding an option to disable the
write back of certain controls.
Ian Jackson [Fri, 6 Nov 2009 18:10:44 +0000 (18:10 +0000)]
Enlarge the size of the global mmio_space mmio[].
With the Multi-Function passthrough, we're actually able to assign more than
32 functions to guest, so we should enlarge the MAX_MMIO. 1024 should be big
enough.
Ian Jackson [Thu, 29 Oct 2009 13:00:31 +0000 (13:00 +0000)]
Extend max vcpu number for HVM guest
Reduce size of Xen-qemu shared ioreq structure to 32 bytes. This has two
advantages:
1. We can support up to 128 VCPUs with a single shared page
2. If/when we want to go beyond 128 VCPUs, a whole number of ioreq_t
structures will pack into a single shared page, so a multi-page array will
have no ioreq_t straddling a page boundary
Also, while modifying qemu, replace a 32-entry vcpu-indexed array with a
dynamically-allocated array.
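The arithmetic behind the two advantages can be checked directly, assuming a 4096-byte shared page. The macro and function names below are hypothetical; only the 32-byte ioreq size and the 128-VCPU figure come from the text.

```c
#include <stdint.h>

#define SHARED_PAGE_BYTES 4096u   /* assumed page size */
#define IOREQ_BYTES       32u     /* ioreq_t size after this change */

/* A whole number of ioreq_t structures must fit in one page. */
_Static_assert(SHARED_PAGE_BYTES % IOREQ_BYTES == 0,
               "ioreq_t must divide the page size evenly");

/* How many VCPUs a single shared page can serve. */
static unsigned ioreqs_per_page(void)
{
    return SHARED_PAGE_BYTES / IOREQ_BYTES;
}

/* Returns 1 if the ioreq_t for `vcpu` would cross a page boundary
 * in a multi-page array, 0 otherwise. */
static int ioreq_straddles_page(uint32_t vcpu)
{
    uint32_t first = vcpu * IOREQ_BYTES;
    uint32_t last  = first + IOREQ_BYTES - 1;

    return first / SHARED_PAGE_BYTES != last / SHARED_PAGE_BYTES;
}
```

4096 / 32 gives 128 slots per page, and because 32 divides 4096 exactly, no slot in a multi-page array straddles a boundary.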
Ian Jackson [Wed, 21 Oct 2009 15:42:15 +0000 (16:42 +0100)]
passthrough: fix security issue with stubdoms
this patch series fixes the outstanding security problem with stubdoms
and pci passthrough.
The idea is to allow mmio, irq and ioport remapping not only if the
current domain IS_PRIV_FOR but also if the current domain has
permissions over those mmio areas, irqs and ioports.
This way a stubdom can only remap resources that it currently "owns".
This patch series also moves the de\assign_device hypercalls from the
list of hypercalls made by qemu\stubdom to xend.
The two patches must be applied at the same time otherwise pci
passthrough won't work for HVM guests.
[PATCH 2 of 2] qemu: do not call xc_assign_device
This patch removes the call to xc_assign_device from qemu.