xenbits.xensource.com Git - xen.git/log
Keir Fraser [Mon, 19 Apr 2010 07:54:53 +0000 (08:54 +0100)]
Clean up MCA MSR virtualization and vMCE injection

Move all virtual MCE related work into a separate file.
It also does some clean-up of the vMCE code, including:
a) rename functions such as mce_init_msr/mce_rdmsr to
   vmce_init_msr/vmce_rdmsr to make their purpose clearer,
b) make vmca_msrs a pointer in arch_domain,
   to decrease arch_domain's size,
c) extract per-bank MCA MSR access into separate functions
   (bank_mce_wrmsr/bank_mce_rdmsr) to make it a bit cleaner,
d) add a new file, xen/include/asm-x86/mce.h, for vMCE related
   headers.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Thu, 15 Apr 2010 18:11:16 +0000 (19:11 +0100)]
x86: Revert how we calculate 'total system RAM' after c/s 20236.

This approach is more straightforward, in that it simply works the
original e820 map. It's what the user expects, and reporting a smaller
value is never appreciated. ;-)

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 17:47:58 +0000 (18:47 +0100)]
x86_emulate: Emulate CLFLUSH instruction

We recently found that a FreeBSD 8.0 guest failed to install and boot on
Xen. The reason was that FreeBSD detected the clflush feature and invoked
this instruction to flush MMIO space. This caused a page fault, but
x86_emulate.c failed to emulate the instruction (not supported). As a
result, a page fault was detected inside FreeBSD. A similar issue was
reported earlier.

http://lists.xensource.com/archives/html/xen-devel/2010-03/msg00362.html

From: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 16:36:55 +0000 (17:36 +0100)]
pygrub: Fix Grub2 support for Ubuntu 10.04

Due to changes in grub2, menu entry titles now have single quotes
around them rather than double quotes, but the memtest entries still
use double quotes, so we need to catch both.
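
One way to accept both quoting styles can be sketched in Python, the language pygrub is written in. The pattern and the function name `parse_menuentry_title` are illustrative assumptions, not pygrub's actual code.

```python
import re

# Hypothetical pattern: a menuentry title wrapped in either single or
# double quotes. The backreference \1 requires the closing quote to be
# the same character as the opening one.
MENUENTRY_RE = re.compile(r"""^\s*menuentry\s+(['"])(?P<title>.*?)\1""")


def parse_menuentry_title(line):
    """Return the quoted title of a grub2 menuentry line, or None."""
    match = MENUENTRY_RE.match(line)
    return match.group("title") if match else None
```

Using one pattern with a backreference avoids maintaining two nearly identical expressions, one per quote style.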

Signed-off-by: David Markey <david.markey@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 16:36:16 +0000 (17:36 +0100)]
xend: fix best NUMA node allocation

Since we moved several NUMA info fields from physinfo into separate
functions/structures, we must adapt the node picking algorithm, too.
Currently xm create complains about undefined hash values.
The patch uses the new Python xc binding to get the information and
creates a reverse mapping for node_to_cpu, since we now only have a
cpu_to_node field.
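
The reverse mapping can be sketched as follows. This assumes cpu_to_node arrives as a list indexed by CPU number; the actual shape returned by the xc binding is not shown here, and `build_node_to_cpu` is a hypothetical name.

```python
def build_node_to_cpu(cpu_to_node):
    """Invert a cpu -> node list into a node -> [cpus] dictionary.

    Assumption: cpu_to_node[i] is the NUMA node of CPU i.
    """
    node_to_cpu = {}
    for cpu, node in enumerate(cpu_to_node):
        # Create the per-node list on first use, then append the CPU.
        node_to_cpu.setdefault(node, []).append(cpu)
    return node_to_cpu
```

For example, `build_node_to_cpu([0, 0, 1, 1])` groups CPUs 0 and 1 under node 0 and CPUs 2 and 3 under node 1.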

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Thu, 15 Apr 2010 12:16:17 +0000 (13:16 +0100)]
xend: make NUMA in xm info optional (dependent on new -n switch)

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Thu, 15 Apr 2010 12:06:48 +0000 (13:06 +0100)]
acpi sleep: Must acquire hypercall_deadlock_mutex when a domain
freezes its own vcpus.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 11:29:48 +0000 (12:29 +0100)]
xend: Fix numainfo/topologyinfo loop iterators in Xc extension.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Thu, 15 Apr 2010 11:28:33 +0000 (12:28 +0100)]
Fix changeset 21153:d2d8805868f1 (xend can't start)

21153 forgets to update the format string so xend can't start.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Thu, 15 Apr 2010 11:24:16 +0000 (12:24 +0100)]
libfsimage: zfs build fix for NetBSD

uchar_t is not defined because both FSYS_ZFS and FSIMAGE
are defined at build time.
Also fix warnings with ctype.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Thu, 15 Apr 2010 11:21:00 +0000 (12:21 +0100)]
acpi sleep: domain_freeze() pauses all vcpus, but does not sync the
current vcpu (since that would obviously deadlock).

This simplifies thaw_domains(), which is required now that thawing can
happen in a different context to freeze_domains().

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 10:36:20 +0000 (11:36 +0100)]
acpi sleep: Rearrange code for entering system sleep states.

We cannot freeze_domains in hypercall-continuation context any more,
since that is a softirq context which can interrupt an arbitrary
vcpu. Hence sleeping all vcpus in that context can easily deadlock
(against the vcpu we interrupted). So rearrange the code to
freeze_domains before calling continue_hypercall_on_cpu().

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 10:33:39 +0000 (11:33 +0100)]
Update comments around spin_trylock() usage for sysctl and xenpf locks.

Since the execution of stop_machine_run() via cpu_down() is now always
deferred to a hypercall continuation context, the above locks are not
held at that time. Hence the trylock is not specifically there to avoid
deadlock with stop_machine_run(), but reflects a more general paranoia
about deadlocks.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 10:31:58 +0000 (11:31 +0100)]
continue_hypercall_on_cpu() always defers execution of the continuation

...even when scheduled to run on the current physical cpu. This
ensures that locks get dropped correctly before executing the
continuation code, and also allows the original caller to determine
whether the continuation has executed or will execute, based on
c_h_o_c()'s immediate return code.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 08:04:45 +0000 (09:04 +0100)]
Make cpu param to continue_hypercall_on_cpu() an unsigned integer.

Negative input makes no sense, and this makes the input range check
correct.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 08:03:43 +0000 (09:03 +0100)]
Fix tasklet_action() to notify correct cpu when running tasklet is rescheduled.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 15 Apr 2010 07:42:40 +0000 (08:42 +0100)]
Remus: fix alignment bug in python rtnl library

Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Thu, 15 Apr 2010 07:42:08 +0000 (08:42 +0100)]
Remus: make ebt_imq and sch_queue compatible with pvops

Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Wed, 14 Apr 2010 12:35:05 +0000 (13:35 +0100)]
Improvements and bug fixes to continue_hypercall_on_cpu().

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 14 Apr 2010 11:10:19 +0000 (12:10 +0100)]
credit2: Add toolstack options to control credit2 scheduler parameters

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 14 Apr 2010 11:07:21 +0000 (12:07 +0100)]
credit2: Add credit2 scheduler to hypervisor

This is the core credit2 patch.  It adds the new credit2 scheduler to
the hypervisor, as the non-default scheduler.  It should be emphasized
that this is still in the development phase, and is probably still
unstable.  It is known to be suboptimal for multi-socket systems.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 14 Apr 2010 11:06:05 +0000 (12:06 +0100)]
credit2: Add a scheduler-specific schedule trace class

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 14 Apr 2010 11:05:31 +0000 (12:05 +0100)]
credit2: Flexible cpu-to-schedule-spinlock mappings

Credit2 shares a runqueue between several cpus.  Rather than have
double locking and dealing with the cpu-to-runqueue races, allow
the scheduler to redefine the sched_lock-to-cpu mapping.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 14 Apr 2010 11:03:27 +0000 (12:03 +0100)]
credit2: Add context_saved scheduler callback

Because credit2 shares a runqueue between several cpus, it needs
to know when a scheduled-out process has finally been context-switched
away so that it can be added to the runqueue again.  (Otherwise it may
be grabbed by another processor before the context has been properly
saved.)

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 14 Apr 2010 10:56:54 +0000 (11:56 +0100)]
Port latest grub zfs boot code to pygrub

Signed-off-by: Mark Johnson <mark.r.johnson@oracle.com>
Add -Werror to CFLAGS and fix numerous warnings/errors.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 14 Apr 2010 10:29:05 +0000 (11:29 +0100)]
Architecture-independent, and tasklet-based, continue_hypercall_on_cpu().

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 14 Apr 2010 09:44:29 +0000 (10:44 +0100)]
Per-cpu tasklet lists.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 13 Apr 2010 17:19:33 +0000 (18:19 +0100)]
xentrace: Add missing help option

Describe the --reserve-disk-space option in the xentrace help.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Tue, 13 Apr 2010 17:19:10 +0000 (18:19 +0100)]
xentrace: Add an option not to enable tracing

Add an option that will set up the buffers and listen for updates,
but will not enable tracing.  This is useful if you have hacks
in Xen to enable tracing at key points (for example, debugging a
shadow bug).

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Tue, 13 Apr 2010 17:18:36 +0000 (18:18 +0100)]
xentrace: Skip to low cpu when throwing away portions of the circular buffer

Skip to the next "low" cpu when throwing away portions of the circular
memory buffer.  This makes subsequent analysis easier.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Tue, 13 Apr 2010 14:38:27 +0000 (15:38 +0100)]
Make c/s 21089 work again with c/s 21092

Unfortunately the latter c/s' change to mpparse.c rendered the former
patch non-functional - Xen's serial port IRQ is not in IRQ_DISABLED
state, yet must be allowed to get its trigger mode and polarity set
up in order for it to be usable.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 13 Apr 2010 12:40:58 +0000 (13:40 +0100)]
sysctl: Fix XEN_SYSCTL_debug_keys error path.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 13 Apr 2010 12:27:46 +0000 (13:27 +0100)]
Clean up numa-info sysctl.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 13 Apr 2010 11:48:17 +0000 (12:48 +0100)]
libxl: <sys/signal.h> -> <signal.h>

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 13 Apr 2010 11:45:51 +0000 (12:45 +0100)]
Update QEMU_TAG to b5160622517fb2d16d0836172a2e34633c9d94bf

Keir Fraser [Tue, 13 Apr 2010 11:21:28 +0000 (12:21 +0100)]
libxl: build fix for netbsd

<sys/signal.h> is needed to get definition for SIGPIPE and SIG_IGN.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Tue, 13 Apr 2010 11:20:48 +0000 (12:20 +0100)]
p2m: merge ptp allocation

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Tue, 13 Apr 2010 08:38:54 +0000 (09:38 +0100)]
Topology-info sysctl cleanups.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 13 Apr 2010 07:37:16 +0000 (08:37 +0100)]
sysctl: Remove sockets_per_node field from physinfo command.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Keir Fraser [Mon, 12 Apr 2010 17:28:33 +0000 (18:28 +0100)]
xentrace: fix lost records resume

Reorder the SCHED_SWITCH trace before the runstate change trace to fix
a problem with the lost records "resume" code.

Namely: The "lost records" trace includes the currently running
process.  But during SCHED_SWITCH, it reads the wrong value, confusing
xenalyze.  Making sure there are no trace records between runstate
change trace and the actual context switch fixes it.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:54:48 +0000 (17:54 +0100)]
xentrace: Bounds checking and error handling

Check tbuf_size to make sure that it will fit in the t_info struct
allocated at boot.  Also deal with allocation failures more
gracefully.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:51:56 +0000 (17:51 +0100)]
x86, shadow: Fix read-to-use race condition

If OOS mode is enabled, after last possible resync, read the guest l1e
one last time.  If it's different than the original read, start over
again.

This fixes a race which can result in inconsistent in-sync shadow
tables, leading to corruption:

v1: take page fault, read gl1e from an out-of-sync PT.
v2: modify gl1e, lowering permissions
[v1,v3]: resync l1 which was just read.
v1: propagate change to l1 shadow using stale gl1e

Now we have an in-sync shadow with more permissions than the guest.

The resync can happen either as a result of a 3rd vcpu doing a cr3
update, or under certain conditions by v1 itself.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:47:16 +0000 (17:47 +0100)]
xl: Migration support

Implement "xl migrate".

ssh is used as the transport by default, although this can be
overridden by specifying a different sshcommand.  This is a very
standard approach nowadays and avoids the need for daemons at the
target host in the default configuration, while providing flexibility
to admins.  (In the future it might be nice to support plain
unencrypted migration over TCP, which we do not rule out now, although
it is not currently implemented.)

Properties of the migration protocol:
  * The domain on the target machine is named "<domname>--incoming"
    while it is being transferred.
  * The domain on the source machine is renamed
    "<domain>--migratedaway" before we give the target permission
    to rename and unpause.
  * The locking in libxl_domain_rename ensures that of two
    simultaneous migration attempts no more than one will succeed.
  * We go to some considerable effort to avoid leaving the domain in
    a bad state if something goes wrong with one of the ends or the
    network, although there is still (inevitably) a possibility of an
    unresolvable state (in case of very badly timed network failure)
    which is probably best resolved by destroying the domain at both
    ends.

Incidental changes:
  create_domain now returns a libxl error code rather than exiting on
  error.
  New ERROR_BADFAIL error code for reporting unpleasant failures.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:46:39 +0000 (17:46 +0100)]
xl: Domain creation logging fixes

 * Make create_domain always return to caller
 * Have create_domain set its log callback sooner
 * Actually write things to logfile, and some error checking

With some combinations of options, create_domain would never return to
the caller, since it would have called daemon and will later exit.  So
we fork an additional time, so that we can call daemon in the child
and also return to the caller in the parent.  It's a shame that
there's no version of daemon(3) that allows us to do this without the
extra code and pointless extra fork.

daemon(0,0) closes all the fds.  So we need to call daemon(0,1) and
organise detaching our stdin/out/err ourselves.  Doing this makes
messages actually appear in the xl logfile in /var/log/xen.

Finally, make create_domain call libxl_ctx_set_log sooner.  This makes
some lost messages appear.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:46:10 +0000 (17:46 +0100)]
xl: New savefile format. Save domain config when saving a domain.

We introduce a new format for saved domains.  The new format, in
contrast to the old:
  * Has a magic number which can distinguish it from other kinds of
    file
  * Is extensible
  * Can contain the domain configuration file

On domain creation we remember the actual config file used (using the
toolstack data feature of libxl, just introduced), and by default save
it to the save file.
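
The three properties listed above can be illustrated with a small Python sketch. The layout, the magic value and the function names are invented for illustration only; they are not xl's actual on-disk format.

```python
import struct

# Hypothetical layout (NOT the real xl format):
#   8-byte magic | 4-byte header length | 4-byte config length
#   | config bytes | domain payload
MAGIC = b"EXAMPLE1"
HEADER = struct.Struct("<8sII")


def pack_savefile(config, payload):
    """Prefix the payload with a header and the domain config."""
    header = HEADER.pack(MAGIC, HEADER.size, len(config))
    return header + config + payload


def unpack_savefile(blob):
    """Return (config, payload), rejecting files with the wrong magic."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated savefile header")
    magic, header_len, config_len = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a savefile: bad magic number")
    # Skip by the recorded header length, so a newer writer could append
    # header fields without breaking this reader (the extensibility point).
    config_end = header_len + config_len
    if header_len < HEADER.size or config_end > len(blob):
        raise ValueError("truncated or corrupt savefile")
    return blob[header_len:config_end], blob[config_end:]
```

Recording the header length in the file itself is one common way to keep a format extensible; whether xl uses that approach is not stated in this log.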

However, options are provided for the following:
  * When saving a domain, supplying an alternative config file to
    store in the savefile.
  * When restoring a domain, supplying an alternative config file.

If a domain is restored with a different config file, it is the
responsibility of the xl user to ensure that the two configs are
"compatible".  Changing the targets of virtual devices is supported;
changing other features of the domain is not recommended.  Bad changes
may lead to undefined behaviour in the domain, and are in practice
likely to cause resume failures or crashes.

Old format save files generated by old versions of xl are not
supported.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:45:26 +0000 (17:45 +0100)]
libxl,xl: Fix two minor bugs in domain destruction

* If /local/domain/<domid>/device does not exist, this is not
  necessarily an error.  It probably means the domain has been
  partially destroyed already.

* Have xl report errors from libxl_domain_destroy.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:44:47 +0000 (17:44 +0100)]
xl: Remove some duplicated boilerplate. (Improves logging slightly.)

We remove six lines of boilerplate from the top of each function, and
instead have a single struct libxl_ctx which is initialised once at
the top of main.

Likewise we wrap domain_qualifier_to_domid in a new function
find_domain, which does the error handling, and stores the domid and
the specified name (if applicable).

This reduces the size of xl.c by 7% (!)

As a beneficial side effect, the earlier call to libxl_ctx_set_log in
main makes some lost messages appear.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:44:01 +0000 (17:44 +0100)]
libxl: Per-domain data storage for the convenience of the library user

We provide a mechanism whereby a user of the libxl library is able to
store some information alongside the domain.  The information stored
is a block of bytes.  Its lifetime is that of the domain - ie the
userdata is garbage collected alongside the domain if the domain is
destroyed.  (This is why the feature needs to be in libxl and cannot
be implemented in the user itself or in libxlutil.)

If a libxl caller does not need to use this feature it can ignore it.

The data is tagged with the (self-declared) name of the libxl user, so
that different users cannot accidentally trip over each others'
userdata.  The data is not interpreted at all by libxl.

To assist developers and people debugging, there is a registry of the
known userdata userids, and the corresponding data format as declared
by that libxl user, in libxl.h next to these declarations:

 int libxl_userdata_store(struct libxl_ctx *ctx, uint32_t domid,
                               const char *userdata_userid,
                               const uint8_t *data, int datalen);
 int libxl_userdata_retrieve(struct libxl_ctx *ctx, uint32_t domid,
                                  const char *userdata_userid,
                                  uint8_t **data_r, int *datalen_r);

The next patch will introduce the data for the userid "xl".
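
The semantics described above (opaque bytes, tagged by userid, discarded with the domain) can be modelled in a few lines of Python. This is an in-memory toy under those assumptions, with hypothetical names; it says nothing about where libxl actually keeps the data.

```python
class UserdataStore:
    """Toy model: opaque per-domain blobs keyed by (domid, userid)."""

    def __init__(self):
        self._data = {}

    def store(self, domid, userid, data):
        # The bytes are copied and never interpreted.
        self._data[(domid, userid)] = bytes(data)

    def retrieve(self, domid, userid):
        # Returns None when this user stored nothing for the domain.
        return self._data.get((domid, userid))

    def destroy_domain(self, domid):
        # Userdata shares the domain's lifetime: drop every user's entry.
        for key in [k for k in self._data if k[0] == domid]:
            del self._data[key]
```

Keying on the userid as well as the domid is what keeps different library users from tripping over each other's data.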

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:43:25 +0000 (17:43 +0100)]
libxl: New function libxl_domain_info

libxl_domain_info provides a way to get the struct libxl_dominfo
for a single domain given its domid.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:42:57 +0000 (17:42 +0100)]
xl, libxl: xl list -v shows the uuid too

Break uuid to string conversion (including logging) out into a new
function libxl_uuid2string for reuse, and expose it for the
convenience of callers.

Provide a new -v option to xl list which shows the domain's uuid.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:42:29 +0000 (17:42 +0100)]
libxl: Expose libxl_report_exitstatus

xl would like to use libxl_report_exitstatus, so expose it in
libxl_utils.h to avoid having to write it twice.  Also, give it a
"level" argument to set the loglevel of the resulting message.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:41:58 +0000 (17:41 +0100)]
xenstore,libxl: cleanup of xenstore connections across fork()

Provide a new function xs_daemon_destroy_postfork which can be called
by a libxenstore user who has called fork, to close the fd for the
connection to xenstored and free the memory, without trying to do
anything to any threads which libxenstore may have created.

Use this new function in libxl_fork, to avoid accidental use of a
xenstore connection in both parent and child.

Also, fix the doc comment for libxl_spawn_spawn to have the success
return codes the right way round.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:41:05 +0000 (17:41 +0100)]
libxl: Expose functions for helping with subprocesses.

 * Expose libxl_fork in libxl_utils.h
 * Expose libxl_pipe in libxl_utils.h
 * Make libxl_exec put SIGPIPE back (so that libxl callers may
    have SIGPIPE ignored)

xl would like to use libxl_fork (which is like fork(2) except that it
logs errors) and also a similar function libxl_pipe.  So put these in
libxl_utils.[ch] and use them in libxl.c as appropriate, to avoid
having to duplicate code between xl and libxl.

Also, make sure that subprocesses spawned by libxl have SIGPIPE set
back to SIG_DFL as they are entitled to expect.  This means that a
libxl caller which sets SIGPIPE to SIG_IGN is no longer buggy.  (This
is relevant for xl migration, because xl would like to be such a
caller.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:40:34 +0000 (17:40 +0100)]
libxl: Provide libxl_domain_rename

Provide a new function libxl_domain_rename.  It can check that the
domain being renamed has the expected name, to avoid races.
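
The expected-name check works like a compare-and-set. A minimal Python sketch, assuming a plain dictionary of domid to name and hypothetical names throughout; a real implementation would also need the check and the update to happen atomically, which this toy does not attempt:

```python
class RenameConflict(Exception):
    """The domain no longer has the name the caller expected."""


def rename_domain(names, domid, expected_old, new_name):
    """Rename domid only if its current name is still expected_old.

    Passing expected_old=None skips the check.
    """
    current = names.get(domid)
    if expected_old is not None and current != expected_old:
        raise RenameConflict(
            "domain %d is named %r, expected %r" % (domid, current, expected_old)
        )
    names[domid] = new_name
```

With two callers racing to rename the same domain from the same old name, the second one sees an unexpected current name and fails instead of silently overwriting the first.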

Use the new function to set the name during domain creation.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:40:06 +0000 (17:40 +0100)]
libxl: libxl_domain_restore: Put fd back to blocking mode

libxl_domain_restore calls, indirectly, xc_domain_restore.  The
latter, when doing a live migration, sets the fd from blocking mode
(which it must be on entry, or things go wrong) to nonblocking mode
and leaves it this way.  Arguably this is a bug in libxc, but to avoid
disrupting any callers we fix it in libxl.

So libxl_domain_restore now puts the fd back into blocking mode
before returning.
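
Putting a descriptor back into blocking mode amounts to clearing O_NONBLOCK in its status flags. A Python sketch of that step (Unix only; `restore_blocking` is a hypothetical helper, not the libxl code):

```python
import fcntl
import os


def restore_blocking(fd):
    """Clear O_NONBLOCK on fd, leaving the other status flags untouched."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
```

Reading the flags first and masking out a single bit matters: writing a fixed value with F_SETFL would also reset flags such as O_APPEND that the caller may rely on.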

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:39:29 +0000 (17:39 +0100)]
libxl: New utility functions for reading and writing files.

We introduce these functions in libxl_utils.h:

  int libxl_read_file_contents(struct libxl_ctx *ctx, const char *filename,
                               void **data_r, int *datalen_r);
  int libxl_read_exactly(struct libxl_ctx *ctx, int fd, void *data, ssize_t sz,
                         const char *filename, const char *what);
  int libxl_write_exactly(struct libxl_ctx *ctx, int fd, const void *data,
                          ssize_t sz, const char *filename, const char *what);

They will be needed by the following patches.  They have to be in
libxl.a rather than libxlutil.a because they will be used, amongst
other places, in libxl itself.
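
The "exactly" variants presumably loop until the full byte count has been transferred, since a single read or write may be short. A Python sketch of that idea, with semantics inferred from the function names rather than taken from the libxl source:

```python
import os


def read_exactly(fd, size):
    """Read exactly size bytes from fd, or raise EOFError."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # End of file before the requested amount arrived.
            raise EOFError(
                "wanted %d bytes, got only %d" % (size, size - remaining)
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def write_exactly(fd, data):
    """Write all of data to fd, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]
```

Treating a premature end of file as an error is what distinguishes these helpers from a bare read loop.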

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:38:42 +0000 (17:38 +0100)]
libxl: Report error if logfile rotation fails

Check the return values from renames and errors from stat in
libxl_create_logfile (which, misleadingly, does not actually create
the logfile).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:38:17 +0000 (17:38 +0100)]
libxl: Make logging functions preserve errno

This is needed by the following patches.  It makes it much more
convenient for libxl functions to return the errno value from the
failure, when they fail.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Keir Fraser [Mon, 12 Apr 2010 16:36:54 +0000 (17:36 +0100)]
Fix bug in 21089:4f796e29987c

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 12 Apr 2010 16:36:10 +0000 (17:36 +0100)]
Better gcc error message if GUEST_PAGING_LEVELS is undefined.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Mon, 12 Apr 2010 16:33:47 +0000 (17:33 +0100)]
cpufreq: fix racing issue for cpu hotplug

To eliminate the race between the dbs timer handler and cpufreq_del_cpu,
use kill_timer instead of stop_timer to make sure the timer handler has
finished executing before the rest of cpufreq_del_cpu runs.

Also fix a missed spot in the cpufreq_statistic_lock locking sequence.

Signed-off-by: Wei Gang <gang.wei@intel.com>
Keir Fraser [Mon, 12 Apr 2010 16:30:08 +0000 (17:30 +0100)]
xen: 'make clean' really cleans unconfigured subdirs.

Previously we skipped those listed in variable $(subdir-), only
including those in the more explicit $(subdir-n).

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 12 Apr 2010 06:23:21 +0000 (07:23 +0100)]
blktap2: a little fix to xen-hotplug-cleanup

Signed-off-by: James (Song Wei) <jsong@novell.com>
Keir Fraser [Mon, 12 Apr 2010 06:22:16 +0000 (07:22 +0100)]
libxc: Flush I/O before xc_domain_save completion

The final, flushing call to discard_file_cache also discards any
errors from fsync. Call fsync explicitly before leaving, to check if
all VM memory actually made it to the disk.
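
The point of the explicit call is that the flush result gets checked rather than thrown away. A Python sketch of the same pattern, where `write_and_sync` is a hypothetical helper and not the libxc code:

```python
import os


def write_and_sync(path, data):
    """Write data to path and fail loudly if it cannot be flushed to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            view = view[os.write(fd, view):]
        # os.fsync raises OSError on failure, so an I/O error surfaces
        # here instead of being silently discarded.
        os.fsync(fd)
    finally:
        os.close(fd)
```

In C the equivalent is to test the return value of fsync() before reporting success to the caller.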

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Keir Fraser [Mon, 12 Apr 2010 06:21:44 +0000 (07:21 +0100)]
pygrub: fix 64b Solaris PV guest boot on 32b Linux dom0 & 64b Xen

Signed-off-by: Mark Johnson <mark.r.johnson@oracle.com>
Keir Fraser [Fri, 9 Apr 2010 07:54:25 +0000 (08:54 +0100)]
xm: Fix string index out of range in 'vcpu-pin' command

If <PCPUs> contains an empty entry (e.g. '1,,2'), the error 'string
index out of range' occurs. Fix this trivial bug.
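
A defensive parser checks each field before indexing into it. The sketch below handles only plain CPU numbers and rejects empty fields; range or exclusion syntax is left out, and whether the real fix rejects or skips empty entries is not stated here. `parse_cpu_list` is a hypothetical name.

```python
def parse_cpu_list(spec):
    """Parse a comma-separated CPU list such as '1,2' into a list of ints."""
    cpus = []
    for field in spec.split(","):
        field = field.strip()
        if not field:
            # '1,,2' produces an empty field; report it clearly instead
            # of failing later on field[0].
            raise ValueError("empty CPU entry in %r" % spec)
        cpus.append(int(field))
    return cpus
```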

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Keir Fraser [Fri, 9 Apr 2010 07:53:53 +0000 (08:53 +0100)]
Add support for AMD MPERF/APERF

Starting with Family 0x10, model 10 processors, some AMD processors
will have support for the APERF/MPERF MSRs.  This patch adds the
checks necessary to support those MSRs.

It also makes the get_measured_perf function defined inside cpufreq.c
driver independent.  max_freq is taken from the policy definition
instead of being a private argument in struct acpi_cpufreq_data.
The struct member is entirely removed from the function since it
is no longer used.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Fri, 9 Apr 2010 07:53:19 +0000 (08:53 +0100)]
Add Xen support for AMD Turbo/Boost

Add support for disabling AMD's Boost feature.  Boost is similar to
Intel's Turbo and uses the same high level interface.  The low
level implementation is different and encapsulated in the powernow
driver for cpufreq.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Fri, 9 Apr 2010 07:52:43 +0000 (08:52 +0100)]
Refactor Xen Support for Intel Turbo Boost

Refactor the existing code that supports the Intel Turbo feature to
move all the driver specific bits into the cpufreq driver.  Create
a tri-state interface for the Turbo feature that can distinguish
amongst enabled Turbo, disabled Turbo, and processors that don't
support Turbo at all.
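
A tri-state value avoids overloading a boolean, where "off" and "not available" would look the same. A Python sketch with invented names and values, unrelated to the constants Xen actually uses:

```python
from enum import Enum


class TurboState(Enum):
    UNSUPPORTED = "unsupported"
    DISABLED = "disabled"
    ENABLED = "enabled"


def set_turbo(current, enable):
    """Return the new state; hardware without Turbo cannot be toggled."""
    if current is TurboState.UNSUPPORTED:
        raise ValueError("processor does not support Turbo")
    return TurboState.ENABLED if enable else TurboState.DISABLED
```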

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Thu, 8 Apr 2010 15:11:17 +0000 (16:11 +0100)]
libxl: Fix the build by reinstating some sysctl.physinfo fields.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoFix two issues for CPU online/offline.
Keir Fraser [Thu, 8 Apr 2010 14:31:52 +0000 (15:31 +0100)]
Fix two issues for CPU online/offline.

Firstly, we should return if we fail to get the spin lock in cpu_down.
Secondly, in the credit scheduler, the idlers need to be limited to
the online map.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
15 years agotmem: fix ia64 build
Keir Fraser [Thu, 8 Apr 2010 14:31:21 +0000 (15:31 +0100)]
tmem: fix ia64 build
  /xen/common/built_in.o: In function `tmh_get_first_byte':
  /xen/include/xen/tmem_xen.h:350: undefined reference to
  `__map_domain_page'

Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
15 years agoxen: allow guests to set caching attributes for MMIOs
Keir Fraser [Thu, 8 Apr 2010 14:30:52 +0000 (15:30 +0100)]
xen: allow guests to set caching attributes for MMIOs

This patch allows guests that have directly mapped MMIO regions to set
the caching attributes for them, and only for them.
Currently we have just an on/off check for a directly assigned device
instead of looking for directly mapped MMIO regions.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoHost Numa information in dom0
Keir Fraser [Wed, 7 Apr 2010 15:22:05 +0000 (16:22 +0100)]
Host Numa information in dom0

'xm info' command now also gives the cpu topology & host numa
information. This will be later used to build guest numa support.  The
patch basically changes physinfo sysctl, and adds topology_info &
numa_info sysctls, and also changes the python & libxc code
accordingly.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
15 years agoRevert 21110:d791173ca65b and 21111:986d3b1d30fb
Keir Fraser [Wed, 7 Apr 2010 14:44:29 +0000 (15:44 +0100)]
Revert 21110:d791173ca65b and 21111:986d3b1d30fb

These changesets break automated tests.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
15 years agox86: Fix caller of p2m_init(): cannot use paging_mode_hap() yet.
Keir Fraser [Wed, 7 Apr 2010 09:17:27 +0000 (10:17 +0100)]
x86: Fix caller of p2m_init(): cannot use paging_mode_hap() yet.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
15 years agomini-os: Fix xenbus_switch_state's transaction retry
Keir Fraser [Wed, 7 Apr 2010 07:17:21 +0000 (08:17 +0100)]
mini-os: Fix xenbus_switch_state's transaction retry

When xenbus_switch_state has to retry the transaction which it just
created, it needs to create a new one.  Clearing xbt triggers this.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agomini-os: Fix crash on frontend shutdown failures
Keir Fraser [Wed, 7 Apr 2010 07:16:15 +0000 (08:16 +0100)]
mini-os: Fix crash on frontend shutdown failures

Do not free frontend resources if some error happened, since the
backend may not have finished properly restarting in such a case.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agomini-os: Fix frontend shutdown wait loop
Keir Fraser [Wed, 7 Apr 2010 07:15:55 +0000 (08:15 +0100)]
mini-os: Fix frontend shutdown wait loop

mini-os frontends must wait for backends to be shut down and
reinitialized before freeing resources.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agoFix 32bit PAE compilation error introduced by 1GB patches
Keir Fraser [Wed, 7 Apr 2010 07:15:33 +0000 (08:15 +0100)]
Fix 32bit PAE compilation error introduced by 1GB patches

Signed-off-by: Wei Huang2 <wei.huang2@amd.com>
15 years agoUpdate git clone command
Keir Fraser [Wed, 7 Apr 2010 07:14:34 +0000 (08:14 +0100)]
Update git clone command

When cloning the kernel repo:
   1. make remote called "xen" rather than the default "origin"
   2. directly checkout the desired branch in one step

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
15 years agoSwitch default kernel branch to xen/stable-2.6.31.x
Keir Fraser [Wed, 7 Apr 2010 07:13:20 +0000 (08:13 +0100)]
Switch default kernel branch to xen/stable-2.6.31.x

This is functionally identical to xen/master at the moment, but more
meaningful.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
15 years agox86, cpu hotplug: Synchronise vcpu state earlier during cpu offline.
Keir Fraser [Wed, 7 Apr 2010 07:09:00 +0000 (08:09 +0100)]
x86, cpu hotplug: Synchronise vcpu state earlier during cpu offline.

Needs to happen before non-idle VCPU is fully descheduled after CPU is
removed from cpu_online_map. Else sync_vcpu_execstate() doesn't work
properly.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoEnable debug=y by default in the build.
Keir Fraser [Tue, 6 Apr 2010 06:16:47 +0000 (07:16 +0100)]
Enable debug=y by default in the build.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoEPT: 1GB large page support.
Keir Fraser [Tue, 6 Apr 2010 06:14:56 +0000 (07:14 +0100)]
EPT: 1GB large page support.

Allocate 1GB large pages for EPT if possible. The patch also contains
the logic to split a large page into smaller ones (2MB or 4KB).

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Xiaohui Xin <xiaohui.xin@intel.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
15 years agomini-os: Fix xenbus initialisation
Keir Fraser [Tue, 6 Apr 2010 06:13:19 +0000 (07:13 +0100)]
mini-os: Fix xenbus initialisation

This fixes xenbus initialisation of blkfront, netfront and pcifront
by making them consistent with fbfront: after writing parameters, set
the state to initialised, then wait for the backend to switch to the
connected state, and only then read its parameters and switch to the
connected state.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agopv-grub: fix boot crash when no fb is available
Keir Fraser [Tue, 6 Apr 2010 06:13:01 +0000 (07:13 +0100)]
pv-grub: fix boot crash when no fb is available

When no fb is available, init_fbfront will return, so the local
semaphore for synchronization with the kbd thread would get dropped.
Using a global static semaphore instead fixes this.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agomini-os: Do not use the same wait element twice
Keir Fraser [Tue, 6 Apr 2010 06:12:39 +0000 (07:12 +0100)]
mini-os: Do not use the same wait element twice

To enqueue the kbdfront thread on two separate wait queues, we need
two different wait elements.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
15 years agotmem: add page deduplication with optional compression or trailing-zero-elimination
Keir Fraser [Tue, 6 Apr 2010 06:11:48 +0000 (07:11 +0100)]
tmem: add page deduplication with optional compression or trailing-zero-elimination

Add "page deduplication" capability (with optional compression
and trailing-zero elimination) to Xen's tmem.

(Transparent to tmem-enabled guests.)  Ephemeral pages
that have the exact same content are "combined" so that only
one page frame is needed.  Since ephemeral pages are essentially
read-only, no C-O-W (and thus no equivalent of swapping) is
necessary.  Deduplication can be combined with compression
or "trailing zero elimination" for even more space savings.
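
The idea can be sketched in Python as below. This is a simplified
model under stated assumptions, not tmem's implementation: the class
DedupStore, its method names and the 4096-byte PAGE_SIZE are all
hypothetical. Pages with identical content share one stored entry with
a reference count, and trailing zero bytes are dropped before storing.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class DedupStore:
    """Toy model of deduplicated, read-only (ephemeral) pages."""

    def __init__(self):
        self._frames = {}   # stored content -> reference count
        self._handles = {}  # page handle -> stored content

    def put(self, handle, page):
        if len(page) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        self.flush(handle)  # replace any page previously stored here
        # Trailing-zero elimination: keep only the non-zero prefix.
        data = bytes(page).rstrip(b"\x00")
        # Deduplication: identical content shares one entry.
        self._frames[data] = self._frames.get(data, 0) + 1
        self._handles[handle] = data

    def get(self, handle):
        # Restore the dropped zeros; pages are never modified in place,
        # so no copy-on-write is needed.
        return self._handles[handle].ljust(PAGE_SIZE, b"\x00")

    def flush(self, handle):
        data = self._handles.pop(handle, None)
        if data is None:
            return
        self._frames[data] -= 1
        if self._frames[data] == 0:
            del self._frames[data]

    def frames_in_use(self):
        return len(self._frames)
```

Because the page size is fixed, the stripped prefix determines the
page uniquely, so keying on it is safe in this sketch.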

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
15 years ago1GB Page Table Support for HVM Guest 3/3
Keir Fraser [Tue, 6 Apr 2010 06:09:35 +0000 (07:09 +0100)]
1GB Page Table Support for HVM Guest 3/3

This patch adds a new field in hvm to indicate whether 1GB pages are
supported by the CPU.  In addition, users can turn the 1GB feature
on or off using a Xen option ("hap_1gb", default is off). Per Tim's
suggestion, I also added an assertion check in shadow/common.c to
prevent this from affecting the shadow code.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Acked-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Tim Deegan <tim.deegan@citrix.com>
15 years ago1GB Page Table Support for HVM Guest 2/3
Keir Fraser [Tue, 6 Apr 2010 06:07:37 +0000 (07:07 +0100)]
1GB Page Table Support for HVM Guest 2/3

This patch changes the P2M code to work with 1GB pages.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Acked-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Tim Deegan <tim.deegan@citrix.com>
15 years ago1GB Page Table Support for HVM Guest 1/3
Keir Fraser [Tue, 6 Apr 2010 06:02:17 +0000 (07:02 +0100)]
1GB Page Table Support for HVM Guest 1/3

This patch changes the Xen tools to try allocating 1GB pages first. If
such requests fail, they fall back to 2MB and then 4KB pages. We skip
1GB allocation for the MMIO space between 3GB and 4GB.
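
The fallback order can be sketched in Python as follows. This is an
illustrative model, not the tools' code: plan_allocation, try_alloc
and the alignment rule are assumptions, and addresses are assumed to
be 4KB aligned.

```python
GB = 1 << 30
MB2 = 2 << 20
KB4 = 4 << 10


def plan_allocation(start, end, try_alloc, mmio_hole=(3 * GB, 4 * GB)):
    """Cover [start, end) with (addr, size) chunks, largest size first.

    try_alloc(addr, size) is a hypothetical callback returning True
    when an allocation of that size succeeds at that address.
    """
    chunks = []
    addr = start
    while addr < end:
        for size in (GB, MB2, KB4):
            fits = addr % size == 0 and addr + size <= end
            # Never attempt a 1GB page that overlaps the MMIO hole.
            in_hole = (size == GB and addr < mmio_hole[1]
                       and addr + size > mmio_hole[0])
            if fits and not in_hole and try_alloc(addr, size):
                chunks.append((addr, size))
                addr += size
                break
        else:
            # No page size could be allocated at this address.
            raise MemoryError("allocation failed at %#x" % addr)
    return chunks
```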

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Tim Deegan <tim.deegan@citrix.com>
15 years agoCSCHED: Optimize __runq_tickle to reduce IPIs
Keir Fraser [Tue, 6 Apr 2010 05:59:32 +0000 (06:59 +0100)]
CSCHED: Optimize __runq_tickle to reduce IPIs

Limit the number of idle cpus tickled for vcpu migration purposes to
only one, to get rid of a lot of IPI events which may impact the
average cpu idle residency time.

The optimization is on by default; the option 'tickle_one_idle_cpu=0'
can be used to disable it if needed.
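
A minimal Python sketch of the selection step is shown below. It only
models the decision; cpus_to_tickle is a hypothetical name, and
picking the lowest-numbered idle cpu is an assumption of this example,
not necessarily what __runq_tickle does.

```python
def cpus_to_tickle(idle_cpus, tickle_one_idle_cpu=True):
    """Return the idle cpus that would be sent a wakeup IPI."""
    idle = sorted(idle_cpus)
    if tickle_one_idle_cpu:
        # One IPI at most, instead of one per idle cpu.
        return idle[:1]
    return idle
```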

Signed-off-by: Wei Gang <gang.wei@intel.com>
15 years agoxl: tsc_mode parameter in guest configuration file
Keir Fraser [Tue, 6 Apr 2010 05:56:20 +0000 (06:56 +0100)]
xl: tsc_mode parameter in guest configuration file

Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Acked-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoxl: vcpu-set command
Keir Fraser [Tue, 6 Apr 2010 05:55:37 +0000 (06:55 +0100)]
xl: vcpu-set command

Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Acked-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoxl: vcpu-pin command
Keir Fraser [Tue, 6 Apr 2010 05:54:51 +0000 (06:54 +0100)]
xl: vcpu-pin command

Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Acked-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoxl: vcpu-list command
Keir Fraser [Tue, 6 Apr 2010 05:54:08 +0000 (06:54 +0100)]
xl: vcpu-list command

Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Acked-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agocpuidle: mwait on softirq_pending & remove wakeup ipis
Keir Fraser [Tue, 6 Apr 2010 05:52:11 +0000 (06:52 +0100)]
cpuidle: mwait on softirq_pending & remove wakeup ipis

For cpus which enter a deep C state via monitor/mwait, wakeup can be
done by writing to the monitored memory. So once softirq_pending is
monitored, we can remove the redundant IPIs.

Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Wei Gang <gang.wei@intel.com>
15 years agox86: use paging_mode_hap() consistently
Keir Fraser [Tue, 6 Apr 2010 05:51:04 +0000 (06:51 +0100)]
x86: use paging_mode_hap() consistently

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
15 years agoAllow all unused GSI to be configured via IO-APIC by new pv_ops dom0
Keir Fraser [Thu, 1 Apr 2010 08:55:27 +0000 (09:55 +0100)]
Allow all unused GSI to be configured via IO-APIC by new pv_ops dom0

Currently Xen disallows setting up any GSI < 16. This makes it
impossible for the kernel to use any PCI devices that have no ACPI
override but are mapped to these interrupts via the IO-APIC.

The patch allows all unused interrupts to be set up via the IO-APIC.

Signed-off-by: Bastian Blank <waldi@debian.org>