Ian Jackson [Tue, 7 Jan 2014 18:40:05 +0000 (18:40 +0000)]
xl: Pass -v options on to migration receiver
Compute a -v option to pass to the migration receiver.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
v2: Use minmsglevel_default to initialise minmsglevel.
Ian Jackson [Tue, 7 Jan 2014 18:23:04 +0000 (18:23 +0000)]
xl: migration: pass -t to xl migrate-receive
If we ourselves are using cr-based overwriting for logging to stderr,
pass -t to the migration receiver so that it knows to do the same
(since its stderr is normally the pipe from sshd).
This requires, of course, that the receiver support that option. This
is OK from a compatibility point of view because we support migration
to newer, but not necessarily to older, versions. (If unsupported
backwards migration is still desired the use of -s "" allows the
remote invocation rune to be overridden by a command of one's choice.)
This fixes a regression introduced in 2f80ac9c0e8f, where migration
messages from the receiver would not make use of the overwriting protocol.
CC: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 7 Jan 2014 18:07:30 +0000 (18:07 +0000)]
xentoollog: provide XTL_STDIOSTREAM_PROGRESS_USE_CR
Provide flags
XTL_STDIOSTREAM_PROGRESS_USE_CR
XTL_STDIOSTREAM_PROGRESS_NO_CR
to allow the caller to force, or disable, the use of \r-based
overwriting of progress messages.
In the implementation, rename the variable "tty" to "progress_use_cr".
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com>
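A minimal sketch of the decision these flags control. The names below (PROGRESS_USE_CR, PROGRESS_NO_CR, progress_use_cr, format_progress) are hypothetical stand-ins, not the xentoollog API, and the fallback to isatty() is an assumption about the default behaviour:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag values, for illustration only. */
#define PROGRESS_USE_CR (1u << 0)   /* force \r-based overwriting */
#define PROGRESS_NO_CR  (1u << 1)   /* never use \r */

/* Decide whether progress messages should overwrite each other.
 * A forced flag wins; otherwise (assumed default) use \r only when
 * fd refers to a terminal. */
static int progress_use_cr(unsigned flags, int fd)
{
    if (flags & PROGRESS_USE_CR) return 1;
    if (flags & PROGRESS_NO_CR)  return 0;
    return isatty(fd) == 1;
}

/* Format one progress message into buf.  With use_cr the message starts
 * with \r so the next one overwrites it; without, each message gets its
 * own line. */
static int format_progress(char *buf, size_t len, int use_cr,
                           const char *what, unsigned long done,
                           unsigned long total)
{
    if (use_cr)
        return snprintf(buf, len, "\r%s: %lu/%lu", what, done, total);
    return snprintf(buf, len, "%s: %lu/%lu\n", what, done, total);
}
```

A receiver whose stderr is a pipe would get 0 from the isatty() fallback, which is why a caller may want to force the behaviour explicitly.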
Ian Jackson [Tue, 18 Mar 2014 16:37:18 +0000 (16:37 +0000)]
libxl: hotplug scripts: stdin < /dev/null
Give hotplug scripts /dev/null for stdin. That way if they try to read
anything (which really they shouldn't), nothing odd will
happen.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Roger Pau Monne <roger.pau@citrix.com> CC: Vasiliy Tolstov <v.tolstov@selfip.ru> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Ian Jackson [Tue, 18 Mar 2014 16:33:06 +0000 (16:33 +0000)]
libxl: hotplug scripts: stdout >& stderr
Plumb hotplug scripts' stdout to stderr. That way if they print
anything (which really they shouldn't), it won't get mixed up with
the application's stdout. (Eg, perhaps with an xl migration
stream...)
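The stdin and stdout plumbing described in the two hotplug entries above can be sketched as follows. This is not the libxl code: run_script_plumbed is a hypothetical helper that assumes a plain fork/exec model and a caller that simply waits for the child.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdin reading from /dev/null and
 * stdout redirected to the caller's stderr.  Returns the child's exit
 * status, or -1 on error or abnormal termination. */
static int run_script_plumbed(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) return -1;

    if (pid == 0) {
        int nullfd = open("/dev/null", O_RDONLY);
        if (nullfd < 0) _exit(127);
        if (dup2(nullfd, STDIN_FILENO) < 0) _exit(127);
        /* Only close the extra fd if it is not already fd 0. */
        if (nullfd != STDIN_FILENO) close(nullfd);
        /* Anything the script prints now goes to stderr. */
        if (dup2(STDERR_FILENO, STDOUT_FILENO) < 0) _exit(127);
        execv(argv[0], argv);
        _exit(127);                 /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this arrangement a script that tries to read its stdin sees end-of-file immediately, and stray output cannot end up in the parent's stdout.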
Ian Jackson [Tue, 18 Mar 2014 17:04:36 +0000 (17:04 +0000)]
libxl: Make libxl_exec tolerate foofd<=2
Make passing 0, 1, or 2 as stdinfd, stdoutfd or stderrfd work
properly.
Also, document the meaning of the fd arguments.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Roger Pau Monne <roger.pau@citrix.com> CC: Vasiliy Tolstov <v.tolstov@selfip.ru> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Ian Jackson [Thu, 27 Feb 2014 17:46:49 +0000 (17:46 +0000)]
tools/console: xenconsole tolerate tty errors
Since 28d386fc4341 (XSA-57), libxl writes an empty value for the
console tty node, with read-only permission for the guest, when
setting up pv console "frontends". (The actual tty value is later set
by xenconsoled.) Writing an empty node is not strictly necessary to
stop the frontend from writing dangerous values here, but it is a good
belt-and-braces approach.
Unfortunately this confuses xenconsole. It reads the empty value, and
tries to open it as the tty. xenconsole then exits.
Fix this by having xenconsole treat an empty value the same way as no
value at all.
Also, make the error opening the tty be nonfatal: we just print a
warning, but do not exit. I think this is helpful in theoretical
situations where xenconsole is racing with libxl and/or xenconsoled.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: George Dunlap <george.dunlap@eu.citrix.com>
---
v2: Combine two conditions and move the free
Ian Jackson [Mon, 24 Feb 2014 15:16:19 +0000 (15:16 +0000)]
tools/console: reset tty when xenconsole fails
If xenconsole (the client program) fails, it calls err. This would
previously neglect to reset the user's terminal to sanity. Use atexit
to do so.
This routinely happens in Xen 4.4 RC5 with pygrub because libxl
writes the value "" to the tty xenstore key when using xenconsole.
After this patch this just results in a harmless error message.
Reported-by: M A Young <m.a.young@durham.ac.uk> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: M A Young <m.a.young@durham.ac.uk> CC: Ian Campbell <Ian.Campbell@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix whitespace error (reintroduce hard tab)
Fix commit message not to claim ignorance about root cause
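A minimal sketch of the atexit approach, assuming a POSIX termios terminal. The names enter_raw_mode and restore_tty are hypothetical and the raw-mode settings are simplified; this is not the xenconsole code.

```c
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

static struct termios saved_attr;   /* settings found at startup */
static int saved_fd = -1;           /* fd they belong to, -1 if none */

/* atexit handler: put the terminal back the way we found it. */
static void restore_tty(void)
{
    if (saved_fd >= 0)
        tcsetattr(saved_fd, TCSAFLUSH, &saved_attr);
}

/* Save the current settings of fd, register the restore handler once,
 * then disable canonical mode and echo.  Returns 0 on success, -1 if fd
 * is not a terminal or a call fails. */
static int enter_raw_mode(int fd)
{
    struct termios raw;

    if (tcgetattr(fd, &saved_attr) < 0) return -1;
    if (saved_fd < 0 && atexit(restore_tty) != 0) return -1;
    saved_fd = fd;

    raw = saved_attr;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    return tcsetattr(fd, TCSAFLUSH, &raw);
}
```

Because err() terminates via exit(), handlers registered with atexit still run on that path, so the terminal is restored even when the program bails out early.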
Ian Campbell [Mon, 17 Mar 2014 17:27:40 +0000 (17:27 +0000)]
xen: arm: make stage 2 page tables walks inner-shareable
The comment was previously incorrect and indicated that these mappings were
unshared (00) when in reality the register was set for outer-shareable (01).
Clarify ORGN0/IRGN0 in the comments while at it.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Mon, 17 Mar 2014 14:53:29 +0000 (14:53 +0000)]
xen: arm: weaken SMP barriers to inner shareable.
Since all processors are in the inner-shareable domain and we map everything
that way this is sufficient.
The non-SMP barriers remain full system. Although in principle they could
become outer shareable barriers for some hardware this would require us to
know which class a given device is. Given the small number of device drivers
in Xen itself it's probably not worth worrying over, although maybe someone
will benchmark at some point.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Mon, 17 Mar 2014 14:53:23 +0000 (14:53 +0000)]
xen: arm: map memory as inner shareable.
The inner shareable domain contains all SMP processors, including different
clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
memory mappings. The outer shareable domain is for devices on busses which are
coherent and barrier-aware (e.g. AMBA4 AXI with ACE). While the system domain
is for things behind bridges which are not.
One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on ARM
v7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
index so we can DTRT. On ARMv8 the shareability is ignored and considered to
always be Outer Shareable.
Don't adjust the barriers, flushes etc, those remain as they were (which is
more than is now required). I'll change those in a later patch.
Many thanks to Leif for explaining the difference between Inner- and
Outer-Shareable in words of two or less syllables, I hope I've replicated that
explanation properly above!
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Jackson [Tue, 18 Mar 2014 13:45:25 +0000 (13:45 +0000)]
libxc: Fix buffer length for get_suspend_file
Declaring a formal parameter to have an array type doesn't result in
the parameter actually having an array type. The type is "adjusted"
to a pointer. (C99 6.9.1(7), 6.7.5.3.)
So the use of sizeof in xc_suspend.c:get_suspend_file was wrong.
Instead, use the #define. Also get rid of the array size, as it is
misleading.
Newer versions of gcc warn about the erroneous code:
xc_suspend.c:39:25: error: argument to 'sizeof' in 'snprintf' call
is the same expression as the destination; did you mean to provide
an explicit length? [-Werror=sizeof-pointer-memaccess]
Reported-By: Julien Grall <julien.grall@linaro.org> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Julien Grall <julien.grall@linaro.org>
---
v2: Actually change the declaration of buf.
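A minimal sketch of the corrected pattern described above. The buffer size, the file name format and the function name get_suspend_file_sketch are hypothetical; only the idea (pass a pointer, take the length from the #define) comes from the commit message.

```c
#include <stdio.h>

#define SUSPEND_FILE_BUFLEN 64      /* hypothetical size */

/* buf must point to at least SUSPEND_FILE_BUFLEN bytes.
 *
 * Declaring the parameter as "char buf[SUSPEND_FILE_BUFLEN]" would not
 * change its type: it is adjusted to "char *", so sizeof(buf) inside the
 * function would be the size of a pointer, not of the caller's array.
 * Hence the length comes from the #define. */
static int get_suspend_file_sketch(char *buf, int domid)
{
    return snprintf(buf, SUSPEND_FILE_BUFLEN, "suspend-%d.lock", domid);
}
```

In the caller, where buf really is an array, sizeof(buf) still yields SUSPEND_FILE_BUFLEN; the pitfall only exists on the callee's side.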
Jan Beulich [Tue, 18 Mar 2014 10:52:34 +0000 (11:52 +0100)]
x86/idle: update to include further package/core residency MSRs
With the number of these growing it becomes increasingly desirable to
not repeatedly alter the sysctl interface to accommodate them. Replace
the explicit listing of numbered states by arrays, unused fields of
which will remain untouched by the hypercall.
The adjusted sysctl interface at once fixes an unrelated shortcoming
of the original one: The "nr" field, specifying the size of the
"triggers" and "residencies" arrays, has to be an input (along with
being an output), which the previous implementation didn't obey.
Note that the bouncing direction in the libxc interface at once gets
corrected to OUT (was BOTH).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Ian Jackson [Thu, 12 Dec 2013 19:17:03 +0000 (19:17 +0000)]
libxl: suspend: Apply guest timeout in evtchn case
When negotiating guest suspend via the evtchn ("fast") protocol,
the guest may still fail to respond.
So set the timeout. The existing error path will already properly
tear down our (event channel) wait.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 10 Dec 2013 17:40:49 +0000 (17:40 +0000)]
libxl: suspend: Async evtchn wait
When negotiating guest suspend via the evtchn ("fast") protocol,
abolish synchronous wait for domain suspend.
If the guest supports the event channel suspend protocol, we used to
sit in a loop in xc_await_suspend waiting (perhaps indefinitely) for
it to suspend.
Instead, use the new libxl event channel event facility. When we see
that the event is signaled, we look at the domain to see if it has
suspended. (In this patch we do not yet set a timeout; that will come
next.)
So the suspend operation no longer blocks with the libxl ctx lock
held, and instead returns to the event loop. Additionally, domains
which signal the event channel themselves, or undergo other state
changes, will be handled more correctly.
We end up making a few more hypercalls.
Also, if we encounter errors setting up the suspend event channel
(which should not happen), abort the operation rather than falling
back to the xenstore protocol.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Improve commit message.
Ian Jackson [Fri, 6 Dec 2013 16:40:08 +0000 (16:40 +0000)]
libxl: suspend: Fix suspend wait corner cases
When we are waiting for a guest to suspend, this suspend operation
would continue to wait (until the timeout) if the guest was destroyed
or shut down for another reason, or if xc_domain_getinfolist failed.
Handle these cases correctly, as errors.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove unwanted error call after a new "goto err".
Ian Jackson [Fri, 6 Dec 2013 16:12:44 +0000 (16:12 +0000)]
libxl: suspend: Abolish usleeps in domain suspend wait
Replace the use of a loop with usleep().
Instead, use a xenstore watch and an event system timeout. (xenstore
fires watches on @releaseDomain when a domain shuts down.)
The logic which checks for the state of the domain is unchanged, and
not ideal, but we will leave that for the next patch.
There is not intended to be any semantic change, other than to make
the algorithm properly asynchronous and the consequential waiting be
on xenstore, rather than polling.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove some trailing whitespace
Improve commit message.
v3X: Do NOT use an xswait instead of separate watch and timeout.
Ian Jackson [Thu, 5 Dec 2013 18:50:55 +0000 (18:50 +0000)]
libxl: suspend: Async xenstore pvcontrol wait
When negotiating guest suspend via the xenstore pvcontrol protocol
(ie when the guest does NOT support the evtchn fast suspend protocol):
Replace the use of loops and usleep with a call to libxl__xswait.
Also, replace the xenstore transaction loop with one using
libxl__xs_transaction_start et al.
There is not intended to be any semantic change, other than to make
the algorithm properly asynchronous.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Add a comment to clarify last minute ack.
v3X: Do NOT rename "pvcontrol" xswait state struct to "guest_wait"
(because we're NOT going to use it for the event channel based wait
too).
In domain_suspend_callback_common, use libxl__xs_transaction_start in
a loop, rather than xs_transaction_start and a goto label.
This will improve the error handling, but have no other semantic
effect.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 5 Dec 2013 18:48:21 +0000 (18:48 +0000)]
libxl: suspend: New domain_suspend_pvcontrol_acked
Factor out domain_suspend_pvcontrol_acked.
This replaces a bunch of open-coded strcmp()s and makes the code
clearer. It also eliminates the need to check for state==NULL each
time it's read, because we can check for NULL once before the strcmp.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Improve comment re xswatch state ENOENT
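The shape of such a helper might look like the sketch below. The semantics are an assumption made for illustration (any value other than the request string, including a missing node, counts as an acknowledgement), and pvcontrol_acked_sketch is a hypothetical name, not the libxl function.

```c
#include <stdbool.h>
#include <string.h>

/* ASSUMPTION: the guest acknowledges a request by changing or removing
 * the control node, so anything other than the string we wrote counts as
 * acked.  state == NULL stands for "node does not exist". */
static bool pvcontrol_acked_sketch(const char *state)
{
    if (!state) return true;              /* single NULL check ... */
    return strcmp(state, "suspend") != 0; /* ... before the one strcmp */
}
```

Callers can then test the result directly instead of repeating the NULL check and strcmp at every read of the node.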
Ian Jackson [Thu, 5 Dec 2013 18:27:30 +0000 (18:27 +0000)]
libxl: suspend: New libxl__domain_pvcontrol_xspath
Factor out the pv control node xenstore path calculation into
libxl__domain_pvcontrol_xspath.
This xs path calculation was open coded in
libxl__domain_pvcontrol_read and _write. This is undesirable because
it duplicates the code and because it makes the path inaccessible to
other parts of libxl (which are soon going to want it).
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Make domain_suspend_callback_common more callback-oriented:
* Turn the functionality behind the goto labels "err" and
"guest_suspended" into functions which can be called just before
"return".
* Deindent the "issuing %s suspend request via XenBus control node"
branch; it is going to be split up into various functions as the
xenstore work becomes callback-based.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Make domain_suspend_callback_common do its work and then call
dss->callback_common_done, rather than simply returning its answer.
This is preparatory to abolishing the usleeps in this function and
replacing them with use of the event machinery.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove some trailing whitespace
Mark the suspend callback libxl__domain_suspend_callback as
asynchronous in the helper stub generator (libxl_save_msgs_gen.pl).
We are going to want to provide an asynchronous version of this
function to get rid of the usleeps and waiting loops in the suspend
code.
libxl__domain_suspend_common_callback, the common synchronous core,
which used to be provided directly as the callback function for the
helper machinery, becomes libxl__domain_suspend_callback_common. It
can now take a typesafe parameter.
For now, provide two very similar asynchronous wrappers for it
(normal, and remus). Each is a simple function which contains only
boilerplate, calls the common synchronous core, and returns the
asynchronous response.
Essentially, we have just moved (in the case of suspend callbacks) the
call site of libxl__srm_callout_sendreply. It was in the switch
statement in the autogenerated _libxl_save_msgs_callout.c, and is now
in the handwritten libxl_dom.c.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Clarify commit message.
Fix a misformatted CC in the commit message
Do not introduce a whitespace error in libxl_save_msgs_gen.pl
v2: Commit message mentions usleeps, not Remus, as motivation.
Ian Jackson [Wed, 11 Dec 2013 16:29:38 +0000 (16:29 +0000)]
libxc: suspend: Fix suspend event channel locking
Use fcntl F_SETLK, rather than writing our pid into a "lock" file.
That way if we crash we don't leave the lockfile lying about. Callers
now need to keep the fd for our lockfile. (We don't use flock because
we don't want anyone who inherits this fd across fork to end up with a
handle onto the lock.)
While we are here:
* Move the lockfile to /var/run/xen
* De-duplicate the calculation of the pathname
* Compute the buffer size for the pathname so that it will definitely
not overrun (and use the computed value everywhere)
* Fix various error handling bugs
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
xc_suspend_evtchn_init expects to eat the first event on the xce. If
the xce is used for any other purpose then this can break. Document
this fact and rename the function to xc_suspend_evtchn_init_exclusive.
(I haven't checked the call sites for improper shared use of the xce.)
Provide a corresponding xc_suspend_evtchn_init_sane which doesn't try
to eat an event, and instead leaves the caller the ability to
demultiplex.
Also document that xc_await_suspend needs exclusive use of the xce.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: Shriram Rajagopalan <rshriram@cs.ubc.ca> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Drop spurious addition of #include <assert.h>
Ian Jackson [Wed, 11 Dec 2013 14:06:02 +0000 (14:06 +0000)]
libxl: events: Provide libxl__ev_evtchn*
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix commit message not to refer to libxl_ctx_alloc's gc, done earlier
Change type of port in evtchn_fd_callback to evtchn_port_or_error_t
Clarify comment about use of ctx->xce.
Fix typo in comment.
Ian Jackson [Fri, 6 Dec 2013 15:31:02 +0000 (15:31 +0000)]
libxl: events: Use libxl__xswait_* in spawn code
Replace open-coded use of ev_time and ev_xswatch with xswait.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 14 Mar 2014 17:38:38 +0000 (17:38 +0000)]
libxl: events: libxl__xswait* support @paths
Special-case paths starting with '@' in libxl__xswait. Attempting to
read these from xenstore gives EINVAL. Callers waiting for (say)
@releaseDomain will be checking for some condition which can be
observed other than by looking at xenstore.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: New patch in this version of the series.
Ian Jackson [Thu, 5 Dec 2013 18:49:12 +0000 (18:49 +0000)]
libxl: events: Provide libxl__xswait_*
This is an ao utility for conveniently doing a timed wait on
xenstore. It handles setting up and cancelling the timeout, and also
conveniently reads the key for you.
No callers yet in this patch.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix doc comments to refer to correct rc values
The comments for libxl__ev_time_isregistered and the corresponding
watch function even say that these should be const. Make it so.
Also fix libxl__ev_child_inuse and libxl__ev_spawn_inuse.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 17 Dec 2013 15:20:25 +0000 (15:20 +0000)]
libxl: init: libxl__poller_init and _get take gc
Change libxl__poller_init and libxl__poller_get to take a libxl__gc*
rather than a libxl_ctx*. The gc is not used for memory allocation
but simply to provide the standard local variable "gc" expected by the
convenience macros. Doing this makes the error logging more
convenient.
Hence, convert the logging calls to use the LOG* convenience macros.
And consequently, change the call sites, and the function bodies to
use CTX rather than ctx.
Also convert a call to malloc() (with error check) in
libxl__poller_get, to libxl__zalloc (no error check needed).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 17 Dec 2013 15:22:40 +0000 (15:22 +0000)]
libxl: init: Provide a gc later in libxl_ctx_alloc
Provide libxl__gc *gc for the second half of libxl_ctx_alloc.
(For the first half of the function, gc is in scope but set to NULL.)
This makes it possible to make gc-requiring calls. For example, it
makes error logging more convenient.
Make use of this by changing the logging calls to use the LOG*
convenience macros.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Mon, 17 Mar 2014 15:47:22 +0000 (16:47 +0100)]
x86/Intel: work around Xeon 7400 series erratum AAI65
Linux commit 40e2d7f9b5dae048789c64672bf3027fbb663ffa ("x86 idle:
Repair large-server 50-watt idle-power regression") tells us that this
applies not just to the named Xeon 7400 series, but also NHM-EX and
WSM-EX; sadly Intel's documentation is so badly searchable that I
wasn't able to locate the respective errata (and hence can't quote
their numbers here).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Mon, 17 Mar 2014 15:45:04 +0000 (16:45 +0100)]
VT-d: fix RMRR handling
Removing mapped RMRR tracking structures in dma_pte_clear_one() is
wrong for two reasons: First, these regions may cover more than a
single page. And second, multiple devices (and hence multiple devices
assigned to any particular guest) may share a single RMRR (whether
assigning such devices to distinct guests is a safe thing to do is
another question).
Therefore move the removal of the tracking structures into the
counterpart function to the one doing the insertion -
intel_iommu_remove_device(), and add a reference count to the tracking
structure.
Further, for the handling of the mappings of the respective memory
regions to be correct, RMRRs must not overlap. Add a respective check
to acpi_parse_one_rmrr().
And finally, with all of this being VT-d specific, move the cleanup
of the list as well as the structure type definition where it belongs -
in VT-d specific rather than IOMMU generic code.
Note that this doesn't address yet another issue associated with RMRR
handling: The purpose of the RMRRs as well as the way the respective
IOMMU page table mappings get inserted both suggest that these regions
would need to be marked E820_RESERVED in all (HVM?) guests' memory
maps, yet nothing like this is being done in hvmloader. (For PV guests
this would also seem to be necessary, but may conflict with PV guests
possibly assuming there to be just a single E820 entry representing all
of its RAM.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Ian Campbell [Mon, 17 Mar 2014 11:31:02 +0000 (11:31 +0000)]
xen: arm: increase priority of SGIs used as IPIs
Code such as on_selected_cpus expects/requires that an IPI can preempt a
processor which is just handling a normal interrupt. Lacking this property can
result in a deadlock between two CPUs trying to IPI each other from interrupt
context.
For the time being there are only two priorities, IRQ and IPI, although it is
also conceivable that in the future some IPIs might be higher priority than
others. This could be used to implement a better BUG() than we have now, but I
haven't tackled that yet.
Tested with a debug patch which sends a local IPI from a keyhandler, which is
run in serial interrupt context.
Julien Grall [Wed, 5 Mar 2014 04:46:25 +0000 (12:46 +0800)]
xen/arm: Remove processor specific setup in vcpu_initialise
This patch introduces the possibility to have specific processor callbacks
that can be called in various places.
Currently VCPU initialisation code contains processor specific setup (for
Cortex A7 and Cortex A15) for the ACTRL registers. It's possible to have
processors with a different layout for this register.
Move this setup into a specific callback for ARM v7 processors.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: marc.ceeeee@gmail.com
Julien Grall [Wed, 5 Mar 2014 04:46:23 +0000 (12:46 +0800)]
xen/arm32: Introduce lookup_processor_type
Looking for a specific proc_info structure is already implemented in assembly.
Implement lookup_processor_type to avoid duplicate code between C and
assembly.
This function searches the proc_info_list structure following the processor
ID. If the search fails, it will return NULL, otherwise a pointer to this
structure for the specific processor.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: Clean and invalidate dcache for boot pagetables
We need to invalidate dcache too after zeroing boot pagetables
to avoid unpredictable behavior which may take place after
non-boot CPUs enable their caches.
So, replace clean_xen_dcache() macro by a clean_and_invalidate_xen_dcache()
for boot pagetables.
Andrew Cooper [Fri, 14 Mar 2014 08:43:37 +0000 (09:43 +0100)]
common: shuffle use of __attribute__((packed))
This introduces a formal define in compiler.h, and is otherwise manual
shuffling of __attribute__((packed)) statements to __packed at the head of the
structure.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: Jan Beulich <jbeulich@suse.com>
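A minimal sketch of the convention, assuming GCC or Clang. The struct names are hypothetical, and the #ifndef guard is only there because some system headers already define __packed.

```c
#include <stdint.h>

#ifndef __packed
#define __packed __attribute__((packed))   /* GCC/Clang extension */
#endif

/* Attribute at the head of the structure: no padding is inserted. */
struct __packed wire_header {
    uint8_t  type;
    uint32_t length;
};

/* Same members without the attribute: the compiler may pad after
 * 'type' so that 'length' is naturally aligned. */
struct plain_header {
    uint8_t  type;
    uint32_t length;
};
```

Here sizeof(struct wire_header) is 5, while sizeof(struct plain_header) is larger on ABIs that align uint32_t to 4 bytes.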
Andrew Cooper [Fri, 14 Mar 2014 08:42:28 +0000 (09:42 +0100)]
functional cleanup for __attribute__((packed)) changes
This is to separate the functional changes from the noop consistency changes.
* Pack struct cper_mce_record rather than creating a structure named __packed
* Remove unreferenced struct xgt_desc
* Use two u16's rather than two u32 16-bit bitfields
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Also drop now pointless (and always having been bogus) pack pragmas.
If we failed to open an xc interface, using xch to log an error will end in
tears. Print to stderr instead, as we are bailing immediately later.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Coverity-id: 1191885 Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
Andrew Cooper [Thu, 13 Mar 2014 13:38:37 +0000 (14:38 +0100)]
console: Traditional console timestamps including milliseconds
Suggested-by: Don Slutz <dslutz@verizon.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Thu, 13 Mar 2014 13:37:58 +0000 (14:37 +0100)]
console: provide timestamps as an offset since boot
This adds a new "Linux style" console timestamp method, which is shorter and
more useful than the current date/time timestamps with single-second
granularity.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
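A minimal sketch of what an offset-since-boot prefix can look like. The exact layout is an assumption modelled on the Linux printk prefix, and format_boot_timestamp is a hypothetical name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a time since boot, given in nanoseconds, as "[sssss.uuuuuu] "
 * (seconds right-aligned in five columns, microseconds zero-padded).
 * ASSUMPTION: layout copied from the Linux convention. */
static int format_boot_timestamp(char *buf, size_t len,
                                 uint64_t ns_since_boot)
{
    uint64_t sec  = ns_since_boot / 1000000000ULL;
    uint64_t usec = (ns_since_boot % 1000000000ULL) / 1000ULL;

    return snprintf(buf, len, "[%5" PRIu64 ".%06" PRIu64 "] ", sec, usec);
}
```

Such a prefix stays the same width for the first 99999 seconds of uptime and carries sub-second resolution, which a date/time stamp with single-second granularity cannot.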
Jan Beulich [Thu, 13 Mar 2014 13:27:51 +0000 (14:27 +0100)]
x86: make hypercall preemption checks consistent
- never preempt on the first iteration (ensure forward progress)
- never preempt on the last iteration (pointless/wasteful)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Thu, 13 Mar 2014 13:26:35 +0000 (14:26 +0100)]
common: make hypercall preemption checks consistent
- never preempt on the first iteration (ensure forward progress)
- do cheap checks first
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
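The rules listed in this entry and in the x86 entry above can be combined in one loop, sketched below. process_batch, always_pending and count_item are hypothetical; in particular always_pending stands in for the hypervisor's real preemption test, which is not shown here.

```c
#include <stddef.h>

/* Demo helpers (hypothetical): a preemption check that always fires and
 * a work function that just counts how many items were handled. */
static size_t items_done;
static int always_pending(void) { return 1; }
static void count_item(size_t i) { (void)i; items_done++; }

/* Handle items [start, count).  Returns the index to resume from, which
 * equals count once everything has been handled. */
static size_t process_batch(size_t start, size_t count,
                            int (*preempt_pending)(void),
                            void (*work)(size_t))
{
    size_t i;

    for (i = start; i < count; i++) {
        /* Cheap comparisons first, the real check last:
         *  - i != start     : never preempt on the first iteration, so
         *                     every invocation makes forward progress;
         *  - i + 1 != count : never preempt on the last iteration, since
         *                     finishing is cheaper than a continuation. */
        if (i != start && i + 1 != count && preempt_pending())
            return i;
        work(i);
    }
    return count;
}
```

Even with a check that always reports pending work, each call handles at least one item, and the final two items are handled in a single call.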
Ian Jackson [Mon, 24 Feb 2014 14:19:15 +0000 (14:19 +0000)]
libxl: Fix carefd lock leak in save callout
If libxl_pipe fails we leave the carefd locked, which translates to
the atfork lock remaining held. This would probably cause the process
to deadlock shortly afterwards.
Of course libxl_pipe is very unlikely to fail unless things are
already going very badly. This bug has not been observed anywhere as
far as we are aware.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: George Dunlap <george.dunlap@eu.citrix.com>
Ian Jackson [Mon, 24 Feb 2014 14:19:14 +0000 (14:19 +0000)]
libxl: Hold the atfork lock while closing carefd
This avoids the process being forked while a carefd is recorded in the
list but the actual fd has been closed. If that happened, a
subsequent libxl_postfork_child_noexec would attempt to close the fd
again. If we are lucky that results in a harmless warning; but if we
are unlucky the fd number has been reused and we close an unrelated
fd.
This race has not been observed anywhere as far as we are aware.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: George Dunlap <george.dunlap@eu.citrix.com>
Ian Campbell [Tue, 14 Jan 2014 16:55:04 +0000 (16:55 +0000)]
xen: arm: correctly write release target in smp_spin_table_cpu_up
flush_xen_data_tlb_range_va() is clearly bogus since it flushes the tlb, not
the data cache. Perhaps what was meant was flush_xen_dcache(), but the address
was mapped with ioremap_nocache and hence isn't cached in the first place.
Accesses should be via writeq though, so do that.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Andrew Cooper [Tue, 25 Feb 2014 10:54:14 +0000 (10:54 +0000)]
tools/xen-mceinj: Fix dependency for the install rule
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Liu Jinsong <jinsong.liu@intel.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Instead of having hard-coded values. We only do PCI vendors
as Jan requested and put all PCI device vendors in one
new file.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Sorted them based on their numerical values per Jan's review] Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
serial: Separate the PCI device ids and parameters (v1)
This will allow us to re-use the parameters for multiple PCI
devices.
No functional change.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: s/nr/idx/ of the enum, use __initconst and const by Jan's review] Reviewed-by: Jan Beulich <jbeulich@suse.com>
but since I don't have any of those cards this patch does not
enable it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Init for ARM and add offset to virt addr]
[v2: Remove the offset usage] Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
serial: Fix COM1 assumption if pci_uart_config did not find the AMT card.
The io_base by default is set to be 0x3f8 for COM1 and 0x2f8 for COM2
in __setup_xen. Then we call 'ns16550_init' which copies those in
the appropriate uart, which then calls 'ns16550_parse_port_config'
to deal with parameter parsing. If the 'amt' parameter has been
specified we further call the 'pci_uart_config' code which scans the PCI bus.
If it does not find the AMT device it would overwrite the io_base with
0x3f8 regardless of whether this is COM1 or COM2 - but only if 'amt'
parameter had been specified.
The overwrite is a way to set it back to the failsafe defaults -
except for COM2 it is bogus.
Note again - if an AMT card is found, this over-write will not happen.
This in theory (as I don't have a machine with two COM ports
readily available) means that if the user specified 'com2=9600,8n1,amt'
and the device did not have an AMT serial device, instead of using
0x2f8 for the io_base it ends up using 0x3f8 - and we don't get the
output on COM2. If the user had done 'com2=9600,8n1' we would never
get in this path so this bug would never manifest itself
(because we don't end up scanning for the AMT device).
We also unconditionally reset the IRQ value - so we would never get the
proper interrupt when falling back to the legacy 0x3f8 and 0x2f8 COM ports.
That is OK - as we would end up using the polling mode - while
not the best - it still would work.
Lastly the clock_hz is also set to the default one (UART_CLOCK_HZ,
which is the same for legacy COM1 and COM2 ports)- that is strictly
not a bug, but it is redundant and not needed.
This bug was introduced with the original AMT support and I cannot
recall why it was done that way - it is a bug.
Fix it by saving the original io_base before starting the
scan of the PCI bus. If we don't find a serial PCI device (because
we did not exit out of the loop using return) then
assign the original io_base value back.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Also remove the irq override spotted by Jan]
[v2: Add more details to the commit description] Reviewed-by: Jan Beulich <jbeulich@suse.com>
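The save-and-restore fix described above can be sketched roughly as follows. This is an illustration under assumptions, not the ns16550 driver code: `probe_amt`, `uart_config_amt`, `struct uart` and `AMT_EXAMPLE_BASE` are hypothetical, and the PCI scan is reduced to a flag saying whether a device is present.

```c
#include <stdbool.h>

#define COM1_BASE 0x3f8UL
#define COM2_BASE 0x2f8UL
/* Illustrative I/O base a found PCI serial device might report. */
#define AMT_EXAMPLE_BASE 0xf0e0UL

/* Hypothetical, heavily reduced UART state. */
struct uart {
    unsigned long io_base;
};

/* Hypothetical stand-in for the PCI bus scan. It uses io_base as
 * scratch space and only leaves a valid value behind on success. */
static bool probe_amt(struct uart *u, bool device_present)
{
    u->io_base = 0;
    if (device_present) {
        u->io_base = AMT_EXAMPLE_BASE;
        return true;
    }
    return false;
}

/* Save the caller's io_base (the COM1 or COM2 default) before scanning
 * and put it back if nothing was found, instead of forcing 0x3f8. */
bool uart_config_amt(struct uart *u, bool device_present)
{
    unsigned long orig_base = u->io_base;

    if (probe_amt(u, device_present))
        return true;

    u->io_base = orig_base;
    return false;
}
```

With this shape, a COM2 port configured with 'amt' on a machine without the device keeps 0x2f8 rather than ending up on the COM1 address.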
serial: Skip over PCIe device which have no quirks (fix AMT regression).
The "ns16550: Add support for UART present in Broadcom TruManage
capable NetXtreme chips" implies that only devices that have
an MMIO BAR and are in the quirks table should be processed.
Even the comment at the end says so:
If we have an io_base, then we succeeded in the lookup
But the code was checking for the !io_base - which is to say if
the io_base was 0 then we would skip scanning. But io_base
always has a value - it is set by 'ns16550_init' to a default
value - so it would never hit the 'continue' path.
This means that if we have a communication device followed by
a serial AMT device we would pick the communication device instead
of the AMT device.
See:
00:16.0 Communication controller: Intel Corporation Cougar Point HECI Controller #1 (rev 04)
Subsystem: Intel Corporation Device 2008
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at fb12a000 (64-bit, non-prefetchable) [size=16]
00:16.3 Serial controller: Intel Corporation Cougar Point KT Controller (rev 04) (prog-if 02 [16550])
Subsystem: Intel Corporation Device 2008
Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 17
I/O ports at f0e0 [size=8]
Memory at fb129000 (32-bit, non-prefetchable) [size=4K]
pci 0000:00:16.0: [8086:1c3a] type 00 class 0x078000
pci 0000:00:16.3: [8086:1c3d] type 00 class 0x070002
And Xen picks 00:16.0 as its console when using 'com1=115200,8n1,amt'.
This patch fixes it and allows us to use AMT again by zeroing
out io_base. If the scan did not work, the io_base is
set back to a default value (the 'pci_uart_config' does that
already at its end).
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> CC: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> CC: Thomas Lendacky <Thomas.Lendacky@amd.com> CC: Keir Fraser <keir@xen.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
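A rough sketch of why zeroing io_base makes the '!io_base' check meaningful. It is a simplified model rather than the real 'pci_uart_config': `struct pci_fn`, `pick_uart` and the `mmio_quirk` flag are hypothetical, and on failure the sketch restores the caller's value where the real code sets a default.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of one PCI function. */
struct pci_fn {
    uint16_t io_bar;     /* I/O port BAR, 0 if the function has none */
    bool     mmio_quirk; /* true if the MMIO BAR is in the quirks table */
    uint64_t mmio_bar;   /* MMIO BAR address */
};

/* Returns the index of the first usable function, or -1 if none.
 * On success *io_base holds that function's base address. */
int pick_uart(uint64_t *io_base, const struct pci_fn *fns, size_t n)
{
    uint64_t fallback = *io_base;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Forget any earlier value, so that a non-zero io_base below
         * can only come from this function's BARs. */
        *io_base = 0;

        if (fns[i].io_bar)
            *io_base = fns[i].io_bar;
        else if (fns[i].mmio_quirk)
            *io_base = fns[i].mmio_bar;

        if (!*io_base)
            continue; /* no I/O BAR and no quirk entry: skip it */

        return (int)i;
    }

    *io_base = fallback; /* scan failed: put the old value back */
    return -1;
}
```

Modelling the two devices from the lspci excerpt (00:16.0 with only an unlisted MMIO BAR, 00:16.3 with an I/O BAR at f0e0), the function skips the first and picks the second. If io_base entered the loop with its default value and was never cleared, the first device would be accepted.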
Andrew Cooper [Tue, 18 Feb 2014 15:59:05 +0000 (15:59 +0000)]
tools/libxl: Don't read off the end of tinfo[]
It is very common for BIOSes to advertise more cpus than are actually present
on the system, and mark some of them as offline. This is what Xen does to
allow for later CPU hotplug, and what BIOSes common to multiple different
systems do to save fully rewriting the MADT in memory.
An excerpt from `xl info` might look like:
...
nr_cpus : 2
max_cpu_id : 3
...
This shows 4 CPUs in the MADT, but only 2 online (as this particular box is
the dual-core rather than the quad-core SKU of its particular brand).
Because of the way Xen exposes this information, a libxl_cputopology array is
bounded by 'nr_cpus', while cpu bitmaps are bounded by 'max_cpu_id + 1'.
The current libxl code has two places which erroneously assume that a
libxl_cputopology array is as long as the number of bits found in a cpu
bitmap, and valgrind complains:
==14961== Invalid read of size 4
==14961== at 0x407AB7F: libxl__get_numa_candidate (libxl_numa.c:230)
==14961== by 0x407030B: libxl__build_pre (libxl_dom.c:167)
==14961== by 0x406246F: libxl__domain_build (libxl_create.c:371)
...
==14961== Address 0x4324788 is 8 bytes after a block of size 24 alloc'd
==14961== at 0x402669D: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==14961== by 0x4075BB9: libxl__zalloc (libxl_internal.c:83)
==14961== by 0x4052F87: libxl_get_cpu_topology (libxl.c:4408)
==14961== by 0x407A899: libxl__get_numa_candidate (libxl_numa.c:342)
...
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
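The mismatch between the two bounds can be illustrated with a small sketch. It assumes a simplified topology record and a plain byte-array bitmap; `struct cputopology` and `cpus_on_node` are hypothetical names, and the bound check shows the general idea rather than the exact change made in libxl.

```c
#include <stdint.h>

/* Hypothetical record, loosely modelled on libxl_cputopology. */
struct cputopology {
    uint32_t core;
    uint32_t socket;
    uint32_t node;
};

/* Count the CPUs set in 'map' that sit on 'node'.
 * 'map' holds 'map_bits' bits (max_cpu_id + 1), while 'tinfo' holds
 * only 'nr_cpus' entries, so the loop must stop at the smaller bound. */
int cpus_on_node(const uint8_t *map, int map_bits,
                 const struct cputopology *tinfo, int nr_cpus,
                 uint32_t node)
{
    int count = 0;
    int i;

    for (i = 0; i < map_bits; i++) {
        if (i >= nr_cpus)
            break; /* tinfo[] has no entry for this bit */
        if (((map[i / 8] >> (i % 8)) & 1) && tinfo[i].node == node)
            count++;
    }
    return count;
}
```

With nr_cpus = 2 and max_cpu_id = 3, as in the `xl info` excerpt, a loop bounded only by the bitmap size would read tinfo[2] and tinfo[3], which is the kind of out-of-bounds read valgrind reports above.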
Ian Jackson [Wed, 19 Feb 2014 14:03:30 +0000 (14:03 +0000)]
xl: Comment error handling in dolog
Coverity-ID: 1087116 Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: coverity@xenproject.org
Ian Jackson [Wed, 19 Feb 2014 14:03:29 +0000 (14:03 +0000)]
libxl: Fix error path in libxl_device_events_handler
libxl_device_events_handler would fail to call AO_ABORT if it failed;
instead it would simply return rc. (This leaves the egc etc. from the
now-abolished stack frame potentially live, and leaves the ctx
locked.)
In xl, this is of no consequence, because xl will immediately exit in
this situation. This is very likely to be true in any other callers
(of which we don't know any, anyway).
Coverity-ID: 1181840 Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: coverity@xenproject.org