xenbits.xensource.com Git - xen.git/log
8 years agoxen/tools: tracing: Report next slice time when continuing as well as switching
Dario Faggioli [Wed, 1 Mar 2017 16:56:35 +0000 (16:56 +0000)]
xen/tools: tracing: Report next slice time when continuing as well as switching

We record trace information about the next timeslice when
switching to a different vcpu, but not when continuing to
run the same vcpu:

 csched2:schedule cpu 9, rq# 1, idle, SMT idle, tickled
 csched2:runq_candidate d0v3, 0 vcpus skipped, cpu 9 was tickled
 sched_switch prev d32767v9, run for 991.186us
 sched_switch next d0v3, was runnable for 2.515us, next slice 10000.0us
 sched_switch prev d32767v9 next d0v3              ^^^^^^^^^^^^^^^^^^^^
 runstate_change d32767v9 running->runnable
 ...
 csched2:schedule cpu 2, rq# 0, busy, not tickled
 csched2:burn_credits d1v5, credit = 9996950, delta = 502913
 csched2:runq_candidate d1v5, 0 vcpus skipped, no cpu was tickled
 runstate_continue d1v5 running->running
                                         ?????????????

This information is quite useful; so add a trace including
that information on the 'continue_running' path as well,
like this:

 csched2:schedule cpu 1, rq# 0, busy, not tickled
 csched2:burn_credits d0v8, credit = 9998645, delta = 12104
 csched2:runq_candidate d0v8, credit = 9998645, 0 vcpus skipped, no cpu was tickled
 sched_switch continue d0v8, run for 1125.820us, next slice 9998.645us
 runstate_continue d0v8 running->running         ^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen/tools: tracing: trace (Credit2) runq traversal.
Dario Faggioli [Wed, 1 Mar 2017 16:56:35 +0000 (16:56 +0000)]
xen/tools: tracing: trace (Credit2) runq traversal.

When traversing a Credit2 runqueue to select the
best candidate vCPU to be run next, show in the
trace which vCPUs we consider.

A bit verbose, but quite useful, considering that
we may end up looking at, but then discarding, one
or more vCPUs. This will help understand which ones
are skipped and why.

Also, add how many credits the chosen vCPU has
(in the TRC_CSCHED2_RUNQ_CANDIDATE record). And,
while there, fix a bug in tools/xentrace/formats
(still in the output of TRC_CSCHED2_RUNQ_CANDIDATE).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: group the runq manipulating functions.
Dario Faggioli [Wed, 1 Mar 2017 16:56:35 +0000 (16:56 +0000)]
xen: credit2: group the runq manipulating functions.

So that they're all close among each other, and
also near to the comment describing the runqueue
organization (which is also moved).

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: tidy up functions names by removing leading '__'.
Dario Faggioli [Wed, 1 Mar 2017 16:56:35 +0000 (16:56 +0000)]
xen: credit2: tidy up functions names by removing leading '__'.

There is no reason for having pretty much all of the
functions whose names begin with double underscores
('__') to actually look like that.

In fact, that is misleading and makes the code hard
to read and understand. So, remove the '__'-s.

The only two that we keep are __runq_assign() and
__runq_deassign() (although they're converted to
single underscore). In fact, in those cases, it is
indeed useful to have that sort of "raw" variant.

In case of __runq_insert(), which is only called
once, by runq_insert(), merge the two functions.

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: make accessor helpers inline functions instead of macros
Dario Faggioli [Wed, 1 Mar 2017 16:56:34 +0000 (16:56 +0000)]
xen: credit2: make accessor helpers inline functions instead of macros

There isn't any particular reason for the accessor helpers
to be macros, so turn them into 'static inline' functions,
which are type-checked by the compiler.

Note that it is necessary to move the function definitions
below the structure declarations.
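
As an illustration only, the difference between the two forms can be
sketched as follows. The structure and function names here (demo_vcpu,
demo_svc, demo_svc_of) are hypothetical and are not the ones used in
the scheduler:

```c
/* Hypothetical types, for illustration only; not the actual Xen structures. */
struct demo_svc {
    int credit;
};

struct demo_vcpu {
    void *sched_priv;   /* scheduler-private data, kept as an opaque pointer */
};

/*
 * Macro form: the argument is not type-checked, so anything with a
 * 'sched_priv' member is silently accepted.
 *
 * #define DEMO_SVC(v) ((struct demo_svc *)(v)->sched_priv)
 */

/*
 * Inline-function form: the compiler checks the argument type. Because the
 * body dereferences 'v', the definition has to come after the complete
 * declaration of struct demo_vcpu.
 */
static inline struct demo_svc *demo_svc_of(const struct demo_vcpu *v)
{
    return v->sched_priv;
}
```

The ordering constraint in the note above follows from the function
body needing the complete structure type, which a macro only needs at
its point of use.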

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: don't miss accounting while doing a credit reset.
Dario Faggioli [Wed, 1 Mar 2017 16:56:34 +0000 (16:56 +0000)]
xen: credit2: don't miss accounting while doing a credit reset.

A credit reset basically means going through all the
vCPUs of a runqueue and altering their credits, as a
consequence of a 'scheduling epoch' having come to an
end.

Blocked or runnable vCPUs are fine, all the credits
they've spent running so far have been accounted to
them when they were scheduled out.

But if a vCPU is running on a pCPU, when a reset event
occurs (on another pCPU), that does not get properly
accounted. Let's therefore begin to do so, for better
accuracy and fairness.

In fact, after this patch, we see this in a trace:

 csched2:schedule cpu 10, rq# 1, busy, not tickled
 csched2:burn_credits d1v5, credit = 9998353, delta = 202996
 runstate_continue d1v5 running->running
 ...
 csched2:schedule cpu 12, rq# 1, busy, not tickled
 csched2:burn_credits d1v6, credit = -1327, delta = 9999544
 csched2:reset_credits d0v13, credit_start = 10500000, credit_end = 10500000, mult = 1
 csched2:reset_credits d0v14, credit_start = 10500000, credit_end = 10500000, mult = 1
 csched2:reset_credits d0v7, credit_start = 10500000, credit_end = 10500000, mult = 1
 csched2:burn_credits d1v5, credit = 201805, delta = 9796548
 csched2:reset_credits d1v5, credit_start = 201805, credit_end = 10201805, mult = 1
 csched2:burn_credits d1v6, credit = -1327, delta = 0
 csched2:reset_credits d1v6, credit_start = -1327, credit_end = 9998673, mult = 1

Which shows how d1v5 actually executed for ~9.796 ms,
on pCPU 10, when reset_credit() is executed, on pCPU
12, because of d1v6's credits going below 0.

Without this patch, these 9.796 ms are not accounted
to anyone. With this patch, d1v5 is charged for them,
and its credits drop by 9796548, from 9998353 to 201805.

And this is important, as it means that it will
begin the new epoch with 10201805 credits, instead
of 10500000 (which it would have had before this patch).

Basically, we were forgetting one round of accounting
in epoch x, for the vCPUs that are running at the time
the epoch ends. And this meant favouring a little bit
these same vCPUs, in epoch x+1, providing them with
the chance to execute for longer than their fair share.
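
The arithmetic in the trace can be reproduced with a small sketch.
The constants below are assumptions inferred from the trace values
(10000000 credits added per reset with mult = 1, and 10500000 as the
highest credit_end seen); the function names are hypothetical and
this is not the scheduler's code:

```c
#include <stdint.h>

/* Assumed values, inferred from the trace above rather than from the source. */
#define DEMO_CREDIT_INIT 10000000   /* credits added per reset when mult = 1 */
#define DEMO_CREDIT_CAP  10500000   /* highest credit_end seen in the trace  */

/* Charge a vCPU for the time it has run since it was last accounted. */
int64_t demo_burn_credits(int64_t credit, int64_t delta)
{
    return credit - delta;
}

/* Top up the credits at the end of an epoch, clamped to the assumed cap. */
int64_t demo_reset_credits(int64_t credit, int mult)
{
    int64_t end = credit + (int64_t)mult * DEMO_CREDIT_INIT;

    return end > DEMO_CREDIT_CAP ? DEMO_CREDIT_CAP : end;
}
```

Under these assumptions, burning 9796548 from 9998353 leaves 201805,
and the reset then yields 10201805. Skipping the burn step would give
10500000, which is the unfair head start described above.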

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: always mark a tickled pCPU as... tickled!
Dario Faggioli [Wed, 1 Mar 2017 16:56:34 +0000 (16:56 +0000)]
xen: credit2: always mark a tickled pCPU as... tickled!

In fact, whether or not a pCPU has been tickled, and is
therefore about to re-schedule, is something we look at
and base decisions on in various places.

So, let's make sure that we base them on accurate
information.

While there, also tweak smt_idle_mask_clear() a little
(used for implementing SMT support), so that it only alters
the relevant cpumask when there is an actual need to.
(This is only to reduce overhead; behavior remains the
same.)

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
8 years agox86/vpmu: disable VPMU if guest's CPUID indicates no PMU support
Boris Ostrovsky [Wed, 1 Mar 2017 16:51:16 +0000 (17:51 +0100)]
x86/vpmu: disable VPMU if guest's CPUID indicates no PMU support

When the toolstack overrides Intel CPUID leaf 0xa's PMU version with an
invalid value, VPMU should not be available to the guest.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/vpmu: add get/put_vpmu() and VPMU_AVAILABLE
Boris Ostrovsky [Wed, 1 Mar 2017 16:50:48 +0000 (17:50 +0100)]
x86/vpmu: add get/put_vpmu() and VPMU_AVAILABLE

vpmu_enabled() (used by hvm/pv_cpuid() to properly report 0xa leaf
for Intel processors) is based on the value of VPMU_CONTEXT_ALLOCATED
bit. This is problematic:
* For HVM guests VPMU context is allocated lazily, during the first
  access to VPMU MSRs. Since the leaf is typically queried before the guest
  attempts to read or write the MSRs, it is likely that CPUID will report
  no PMU support.
* For PV guests the context is allocated eagerly but only in response to
  the guest's XENPMU_init hypercall. There is a chance that the guest will
  try to read CPUID before making this hypercall.

This patch introduces VPMU_AVAILABLE flag which is set (subject to vpmu_mode
constraints) during VCPU initialization for both PV and HVM guests. Since
this flag is expected to be managed together with vpmu_count, get/put_vpmu()
are added to simplify code.

vpmu_enabled() (renamed to vpmu_available()) can now use this new flag.
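
A minimal sketch of the get/put pairing, assuming a per-vCPU flag word
and a global counter. All names prefixed with demo_ are hypothetical,
the mode check is reduced to a single boolean, and the locking that
real code would need is left out:

```c
#include <stdbool.h>

#define DEMO_VPMU_AVAILABLE 0x1u       /* hypothetical flag value */

struct demo_vpmu {
    unsigned int flags;
};

static unsigned int demo_vpmu_count;   /* vCPUs currently holding a reference */
static bool demo_vpmu_mode_off;        /* stands in for the vpmu_mode check    */

/* Take a reference and mark the VPMU available, unless the mode forbids it. */
void demo_get_vpmu(struct demo_vpmu *vpmu)
{
    if (demo_vpmu_mode_off || (vpmu->flags & DEMO_VPMU_AVAILABLE))
        return;

    demo_vpmu_count++;
    vpmu->flags |= DEMO_VPMU_AVAILABLE;
}

/* Drop the reference and clear the flag; a no-op if none was taken. */
void demo_put_vpmu(struct demo_vpmu *vpmu)
{
    if (!(vpmu->flags & DEMO_VPMU_AVAILABLE))
        return;

    demo_vpmu_count--;
    vpmu->flags &= ~DEMO_VPMU_AVAILABLE;
}

/* What a CPUID handler would consult instead of the lazily set context bit. */
bool demo_vpmu_available(const struct demo_vpmu *vpmu)
{
    return vpmu->flags & DEMO_VPMU_AVAILABLE;
}
```

Keeping the flag update and the counter update in the same helper is
what makes it hard for the two to drift apart.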

(As a side effect, this patch also fixes a race in pvpmu_init() where we
increment vpmu_count in vpmu_initialise() after checking vpmu_mode.)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/mm: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 16:49:57 +0000 (17:49 +0100)]
x86/mm: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoefi/boot: Don't free ebmalloc area at all
Andrew Cooper [Tue, 28 Feb 2017 14:07:09 +0000 (14:07 +0000)]
efi/boot: Don't free ebmalloc area at all

Freeing part of the BSS back for general use proves to be problematic.  It is
not accounted for in xen_in_range(), causing errors when constructing the
IOMMU tables, resulting in a failure to boot.

Other smaller issues are that tboot treats the entire BSS as hypervisor data,
creating and checking a MAC of it on S3, and that, by being 1MB in size,
freeing it is guaranteed to shatter the hypervisor superpage mappings.

This is a stopgap fix to unblock master, while alternatives are discussed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/Viridian: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:40:48 +0000 (10:40 +0100)]
x86/Viridian: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/SVM: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:40:22 +0000 (10:40 +0100)]
x86/SVM: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/HVMemul: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:39:44 +0000 (10:39 +0100)]
x86/HVMemul: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/HVM: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:39:06 +0000 (10:39 +0100)]
x86/HVM: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: switch away from temporary 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:38:30 +0000 (10:38 +0100)]
x86: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: re-introduce non-underscore prefixed 32-bit register names
Jan Beulich [Wed, 1 Mar 2017 09:37:28 +0000 (10:37 +0100)]
x86: re-introduce non-underscore prefixed 32-bit register names

For a transitional period (until we've managed to replace all
underscore prefixed instances), allow both names to co-exist.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/hvm: check HAP before enabling nested VMX
Haozhong Zhang [Wed, 1 Mar 2017 09:30:32 +0000 (10:30 +0100)]
x86/hvm: check HAP before enabling nested VMX

The current implementation of nested VMX cannot work without HAP.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: ensure copying runstate/time to L1 rather than L2
Haozhong Zhang [Wed, 1 Mar 2017 09:29:57 +0000 (10:29 +0100)]
x86: ensure copying runstate/time to L1 rather than L2

For a HVM domain, if a vcpu is in the nested guest mode,
__raw_copy_to_guest(), __copy_to_guest() and __copy_field_to_guest()
used by update_runstate_area() and update_secondary_system_time() will
copy data to L2 guest rather than the L1 guest.

This commit temporarily clears the nested guest flag before all guest
copies in update_runstate_area() and update_secondary_system_time(),
and restores the flag after those guest copy operations.

The flag clear/restore is combined with the existing
smap_policy_change() which is renamed to update_guest_memory_policy().
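
The clear-then-restore pattern can be sketched as below. The
structure, the flag and both helper names are hypothetical stand-ins,
chosen only to show the shape of the save/restore pairing:

```c
#include <stdbool.h>

/* Hypothetical per-vCPU state; the real flag lives elsewhere. */
struct demo_vcpu {
    bool in_nested_guest_mode;
};

/* Clear the flag so copies target L1, and hand back the previous value. */
bool demo_begin_l1_copy(struct demo_vcpu *v)
{
    bool old = v->in_nested_guest_mode;

    v->in_nested_guest_mode = false;
    return old;
}

/* Put the flag back exactly as it was before the copies. */
void demo_end_l1_copy(struct demo_vcpu *v, bool old)
{
    v->in_nested_guest_mode = old;
}
```

A caller would wrap every guest copy between the two calls, passing
the value returned by the first into the second.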

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoiommu: elaborate the usage of RMRR specification on the command line
Venu Busireddy [Wed, 1 Mar 2017 09:29:23 +0000 (10:29 +0100)]
iommu: elaborate the usage of RMRR specification on the command line

As some users have suggested, elaborate the usage of RMRR specification
on the command line, and provide a usage example.

Also, always treat the specified page numbers as hexadecimal values.
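
To illustrate what "always hexadecimal" means for a parser, here is a
sketch using the C library's strtoul() with an explicit base of 16.
The hypervisor has its own parsing helpers, so this is a stand-in, and
parse_hex_pfn is a hypothetical name:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a page number that is always interpreted as hexadecimal, with or
 * without a leading "0x". Returns 0 on success and -1 on malformed input.
 */
int parse_hex_pfn(const char *s, unsigned long *pfn)
{
    char *end;
    unsigned long val;

    errno = 0;
    val = strtoul(s, &end, 16);   /* base 16: "100" means 0x100, not 100 */
    if (errno != 0 || end == s || *end != '\0')
        return -1;

    *pfn = val;
    return 0;
}
```

With a fixed base, "100" and "0x100" both parse to 256, so the meaning
of a value no longer depends on whether the user typed the prefix.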

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agopassthrough: reject self-(de)assignment of devices
Chao Gao [Wed, 1 Mar 2017 09:28:35 +0000 (10:28 +0100)]
passthrough: reject self-(de)assignment of devices

That is to say, do not allow a domain to assign a device to itself or to
detach a device from itself.
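
The check amounts to comparing the calling domain with the target
domain before doing anything else. A sketch, with a hypothetical
structure and function name, and with -EINVAL as an assumed error
code:

```c
#include <errno.h>

/* Hypothetical, minimal domain descriptor. */
struct demo_domain {
    unsigned int domain_id;
};

/* Refuse (de)assignment when the caller is also the target. */
int demo_check_assignment(const struct demo_domain *caller,
                          const struct demo_domain *target)
{
    if (caller->domain_id == target->domain_id)
        return -EINVAL;   /* assumed error code, for illustration */

    return 0;
}
```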

Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agoxen/arm: warn if dom0_mem is not specified
Stefano Stabellini [Tue, 28 Feb 2017 18:56:14 +0000 (10:56 -0800)]
xen/arm: warn if dom0_mem is not specified

The default dom0_mem is 128M, which is not sufficient to boot an
Ubuntu-based Dom0. It is not clear what a better default value could be.

Instead, loudly warn the user when dom0_mem is unspecified and wait 3
secs. Then use 512M.

Update the docs to specify that dom0_mem is required on ARM. (The
current xen-command-line document does not actually reflect the current
behavior of dom0_mem on ARM correctly.)

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxl: fix compilation of xl_migrate.c
Roger Pau Monne [Tue, 28 Feb 2017 17:31:04 +0000 (17:31 +0000)]
xl: fix compilation of xl_migrate.c

The usage of signal(3) requires the inclusion of the signal.h header:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/signal.html

This fixes the build on FreeBSD.
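
For reference, a self-contained use of signal(3) with the header it
needs. The handler and function names are hypothetical and unrelated
to xl_migrate.c:

```c
#include <signal.h>   /* declares signal(), SIGINT, SIG_ERR and sig_atomic_t */

static volatile sig_atomic_t got_sigint;

/* Only set a flag; little else is safe inside a signal handler. */
static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;
}

/* Returns 0 if the handler was installed, -1 otherwise. */
int install_sigint_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}
```

Some platforms pull this header in indirectly through others, which
is why the omission can go unnoticed until a stricter libc is used.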

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/layout: Correct Xen's idea of its own memory layout
Andrew Cooper [Tue, 28 Feb 2017 15:17:17 +0000 (15:17 +0000)]
x86/layout: Correct Xen's idea of its own memory layout

c/s b4cd59fe "x86: reorder .data and .init when linking" had an unintended
side effect, where xen_in_range() and the tboot S3 MAC were no longer correct.

In practice, it means that Xen's .data section is excluded from consideration,
which means:
 1) Default IOMMU construction for the hardware domain could create mappings.
 2) .data isn't included in the tboot MAC checked on resume from S3.

Adjust the comments and virtual address anchors used to define the regions.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoxenstore: remove memory report command line support
Juergen Gross [Fri, 24 Feb 2017 06:21:45 +0000 (07:21 +0100)]
xenstore: remove memory report command line support

As a memory report can now be triggered via XS_CONTROL, support via a
command line option and signal handler is no longer needed. Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: make memory report available via XS_CONTROL
Juergen Gross [Fri, 24 Feb 2017 06:21:44 +0000 (07:21 +0100)]
xenstore: make memory report available via XS_CONTROL

Add a XS_CONTROL command to xenstored for doing a talloc report to a
file. Right now this is supported by specifying a command line option
when starting xenstored and sending a signal to the daemon to trigger
the report.

To dump the report to the standard log file call:

xenstore-control memreport

To dump the report to a new file call:

xenstore-control memreport <file>

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: add support for changing log functionality dynamically
Juergen Gross [Fri, 24 Feb 2017 06:21:43 +0000 (07:21 +0100)]
xenstore: add support for changing log functionality dynamically

Today Xenstore supports logging only if specified at start of the
Xenstore daemon. As it can't be disabled during runtime it is not
recommended to start xenstored with logging enabled.

Add support for switching logging on and off at runtime and to
specify a (new) logfile. This is done via the XS_CONTROL wire command
which can be sent with xenstore-control.

To switch logging on just use:

xenstore-control log on

To switch it off again:

xenstore-control log off

To specify a (new) logfile:

xenstore-control logfile <file>

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: enhance control command support
Juergen Gross [Fri, 24 Feb 2017 06:21:42 +0000 (07:21 +0100)]
xenstore: enhance control command support

The Xenstore protocol supports the XS_CONTROL command for triggering
various actions in the Xenstore daemon. Enhance that support by using
a command table and adding a help function.

Support multiple control commands in the associated xenstore-control
program used to issue XS_CONTROL commands.
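
A command table with a generated help listing might look like the
sketch below. The entry layout, the two commands and all demo_ names
are hypothetical; xenstored's actual table may differ:

```c
#include <stdio.h>
#include <string.h>

/* One entry per control command: name, handler and a usage string. */
struct demo_cmd {
    const char *name;
    int (*func)(int argc, const char **argv);
    const char *usage;
};

/* Accept exactly one argument, either "on" or "off". */
static int demo_do_log(int argc, const char **argv)
{
    if (argc != 1)
        return -1;
    if (strcmp(argv[0], "on") != 0 && strcmp(argv[0], "off") != 0)
        return -1;
    return 0;
}

static int demo_do_help(int argc, const char **argv);

static const struct demo_cmd demo_cmds[] = {
    { "log",  demo_do_log,  "on|off" },
    { "help", demo_do_help, "" },
};

#define DEMO_NCMDS (sizeof(demo_cmds) / sizeof(demo_cmds[0]))

/* The help text is produced from the table, so it cannot go stale. */
static int demo_do_help(int argc, const char **argv)
{
    (void)argc;
    (void)argv;
    for (size_t i = 0; i < DEMO_NCMDS; i++)
        printf("%-8s %s\n", demo_cmds[i].name, demo_cmds[i].usage);
    return 0;
}

/* Look the command up by name and run its handler; -1 if it is unknown. */
int demo_dispatch(const char *name, int argc, const char **argv)
{
    for (size_t i = 0; i < DEMO_NCMDS; i++)
        if (strcmp(demo_cmds[i].name, name) == 0)
            return demo_cmds[i].func(argc, argv);
    return -1;
}
```

Adding a command then means adding one table row, with no change to
the dispatcher or to the help function.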

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: Split out XS_CONTROL action to dedicated source file
Juergen Gross [Fri, 24 Feb 2017 06:21:41 +0000 (07:21 +0100)]
xenstore: Split out XS_CONTROL action to dedicated source file

Move the XS_CONTROL handling of xenstored to a new source file
xenstored_control.c.

In order to avoid making get_string() in xenstored_core.c globally
visible use strlen() instead, which is safe in this context because
xs_count_strings() has already returned a value > 1.
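
The reasoning can be sketched with a hypothetical counting helper: if
at least one NUL terminator has been found inside the buffer, strlen()
on the start of the buffer cannot run past its end. demo_count_strings
is an illustrative stand-in, not the library function:

```c
#include <stddef.h>
#include <string.h>

/* Count the NUL-terminated strings that lie completely within buf[0..len). */
unsigned int demo_count_strings(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;
    unsigned int n = 0;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        if (nul == NULL)
            break;        /* trailing bytes without a terminator */
        n++;
        p = nul + 1;
    }
    return n;
}
```

For a payload such as "log\0on\0" the count is 2, so calling strlen()
on the first string is bounded by the terminator already seen.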

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: rename XS_DEBUG wire command
Juergen Gross [Fri, 24 Feb 2017 06:21:40 +0000 (07:21 +0100)]
xenstore: rename XS_DEBUG wire command

In preparation for supporting more than pure debug functionality via the
Xenstore XS_DEBUG wire command, rename it to XS_CONTROL and make
XS_DEBUG an alias of it.

Add an alias xs_control_command for the associated xs_debug_command,
too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxl: merge xl_cmdimpl.c into xl.c
Wei Liu [Fri, 24 Feb 2017 16:01:45 +0000 (16:01 +0000)]
xl: merge xl_cmdimpl.c into xl.c

After splitting out all the meaty bits, xl_cmdimpl.c doesn't contain
much. Merge the rest into xl.c and delete the file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out migration related code
Wei Liu [Fri, 24 Feb 2017 15:59:32 +0000 (15:59 +0000)]
xl: split out migration related code

Include COLO / Remus code because they are built on top of the existing
migration protocol.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out save/restore related code
Wei Liu [Fri, 24 Feb 2017 15:54:52 +0000 (15:54 +0000)]
xl: split out save/restore related code

Add some function declarations to xl.h because they are now needed in
multiple files.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out vm lifecycle control functions
Wei Liu [Fri, 24 Feb 2017 15:21:05 +0000 (15:21 +0000)]
xl: split out vm lifecycle control functions

Including create, reboot, shutdown, pause, unpause and destroy.

Lift a bunch of core data structures and function declarations to xl.h
because they are needed in both xl_cmdimpl.c and xl_vmcontrol.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out miscellaneous functions
Wei Liu [Fri, 24 Feb 2017 14:58:29 +0000 (14:58 +0000)]
xl: split out miscellaneous functions

A collection of functions that don't warrant their own files.

Moving main_devd there requires lifting do_daemonize to xl_utils.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out vnc and console related code
Wei Liu [Fri, 24 Feb 2017 14:45:36 +0000 (14:45 +0000)]
xl: split out vnc and console related code

The new file also contains code for channel, which is just a console
in disguise.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: call libxl_vncviewer_exec in main_vncviewer
Wei Liu [Mon, 27 Feb 2017 17:35:32 +0000 (17:35 +0000)]
xl: call libxl_vncviewer_exec in main_vncviewer

We will need to move main_vncviewer to a different file where it has no
access to the helper vncviewer.

Call libxl_vncviewer_exec directly.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out functions to print out information
Wei Liu [Fri, 24 Feb 2017 14:25:30 +0000 (14:25 +0000)]
xl: split out functions to print out information

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out psr related code
Wei Liu [Fri, 24 Feb 2017 14:15:46 +0000 (14:15 +0000)]
xl: split out psr related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out memory related code
Wei Liu [Fri, 24 Feb 2017 14:13:05 +0000 (14:13 +0000)]
xl: split out memory related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out cdrom related code
Wei Liu [Fri, 24 Feb 2017 14:08:53 +0000 (14:08 +0000)]
xl: split out cdrom related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out vcpu related code
Wei Liu [Fri, 24 Feb 2017 14:02:19 +0000 (14:02 +0000)]
xl: split out vcpu related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out pci related code
Wei Liu [Fri, 24 Feb 2017 13:57:08 +0000 (13:57 +0000)]
xl: split out pci related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out scheduler related code
Wei Liu [Fri, 24 Feb 2017 13:54:43 +0000 (13:54 +0000)]
xl: split out scheduler related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out usb related code
Wei Liu [Fri, 24 Feb 2017 13:50:42 +0000 (13:50 +0000)]
xl: split out usb related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out network related code
Wei Liu [Fri, 24 Feb 2017 13:46:11 +0000 (13:46 +0000)]
xl: split out network related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out block related code
Wei Liu [Fri, 24 Feb 2017 13:43:37 +0000 (13:43 +0000)]
xl: split out block related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out vtpm related code
Wei Liu [Fri, 24 Feb 2017 13:39:42 +0000 (13:39 +0000)]
xl: split out vtpm related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out flask related code
Wei Liu [Fri, 24 Feb 2017 13:34:54 +0000 (13:34 +0000)]
xl: split out flask related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out cpupool related code
Wei Liu [Fri, 24 Feb 2017 13:19:48 +0000 (13:19 +0000)]
xl: split out cpupool related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out xl_parse.[ch]
Wei Liu [Fri, 24 Feb 2017 13:10:14 +0000 (13:10 +0000)]
xl: split out xl_parse.[ch]

Move all parsing code into xl_parse.c. Export the ones needed in
xl_parse.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: introduce a function to get shutdown action name
Wei Liu [Mon, 27 Feb 2017 17:28:17 +0000 (17:28 +0000)]
xl: introduce a function to get shutdown action name

The array to map libxl_shutdown_action_to_shutdown to string is going to
be moved to a dedicated file.

Provide a function to do the translation so that we don't need to make
the array globally visible.
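
The general shape of such an accessor is sketched below: the table
stays static to its file and only the lookup function is exported.
The enum values, strings and names are hypothetical:

```c
#include <stddef.h>

enum demo_action {
    DEMO_ACTION_DESTROY,
    DEMO_ACTION_RESTART,
    DEMO_ACTION_PRESERVE,
    DEMO_ACTION_COUNT
};

/* File-local table: other files cannot index it directly. */
static const char *const demo_action_names[DEMO_ACTION_COUNT] = {
    [DEMO_ACTION_DESTROY]  = "destroy",
    [DEMO_ACTION_RESTART]  = "restart",
    [DEMO_ACTION_PRESERVE] = "preserve",
};

/* Exported accessor; returns NULL for values outside the table. */
const char *demo_action_name(enum demo_action a)
{
    if ((unsigned int)a >= DEMO_ACTION_COUNT)
        return NULL;

    return demo_action_names[a];
}
```

Besides hiding the array, the function gives one place to do the
bounds check that every direct user of the array would otherwise
have to repeat.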

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: rename cpurange_parse to parse_cpurange
Wei Liu [Mon, 27 Feb 2017 17:22:06 +0000 (17:22 +0000)]
xl: rename cpurange_parse to parse_cpurange

We want to consistently prefix functions to parse input with "parse_".

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: split out tmem related code to xl_tmem.c
Wei Liu [Fri, 24 Feb 2017 12:14:58 +0000 (12:14 +0000)]
xl: split out tmem related code to xl_tmem.c

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: move some helper functions to xl_utils.c
Wei Liu [Fri, 24 Feb 2017 11:40:41 +0000 (11:40 +0000)]
xl: move some helper functions to xl_utils.c

Move some commonly used functions to a new file.

find_domain requires access to global variable common_domname. Make that
non-static.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agox86/PVHv2: fix dereference of native RSDP table mapping
Roger Pau Monne [Mon, 27 Feb 2017 12:14:38 +0000 (12:14 +0000)]
x86/PVHv2: fix dereference of native RSDP table mapping

Check that the RSDP is mapped before trying to access it.

Spotted-by: Coverity
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agolibs/devicemodel: free xencall handle in error path in _open()
Wei Liu [Mon, 27 Feb 2017 12:20:26 +0000 (12:20 +0000)]
libs/devicemodel: free xencall handle in error path in _open()

Change the allocation to use calloc to get a zeroed structure. Free the
xencall handle in the error path.

Spotted by Coverity.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: lift a bunch of macros to xl_utils.h
Wei Liu [Thu, 23 Feb 2017 18:34:15 +0000 (18:34 +0000)]
xl: lift a bunch of macros to xl_utils.h

We're going to split xl_cmdimpl.c into multiple files. Lift the commonly
used macros to xl_utils.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: remove trailing spaces in xl_cmdimpl.c
Wei Liu [Thu, 23 Feb 2017 18:41:13 +0000 (18:41 +0000)]
xl: remove trailing spaces in xl_cmdimpl.c

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: generate _paths.h
Wei Liu [Thu, 23 Feb 2017 18:28:02 +0000 (18:28 +0000)]
xl: generate _paths.h

It is included by xl.h. Previously it was using _paths.h from some other
place. We'd better generate one for xl as well.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: use <> variant to include Xen tools library headers
Wei Liu [Thu, 23 Feb 2017 18:06:14 +0000 (18:06 +0000)]
xl: use <> variant to include Xen tools library headers

They should be treated like any other libraries installed on the build
host. Compiler options are set correctly to point to their locations.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: remove inclusion of libxl_osdeps.h
Wei Liu [Thu, 23 Feb 2017 18:03:11 +0000 (18:03 +0000)]
xl: remove inclusion of libxl_osdeps.h

There is no reason for a client to include a private header from libxl.
Remove the inclusion and define _GNU_SOURCE for {v,}asprintf in
xl_cmdimpl.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: update copyright information
Wei Liu [Thu, 23 Feb 2017 18:38:13 +0000 (18:38 +0000)]
xl: update copyright information

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxl: remove accidentally committed hunk from Makefile
Wei Liu [Thu, 23 Feb 2017 18:21:52 +0000 (18:21 +0000)]
xl: remove accidentally committed hunk from Makefile

It was never intended to be committed. Luckily the high-level Makefile was
correct, so it didn't cause us problems when building xl.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agox86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 45ac805
Andrew Cooper [Mon, 27 Feb 2017 11:47:10 +0000 (11:47 +0000)]
x86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 45ac805

c/s 45ac805 "x86/paging: Package up the log dirty function pointers" neglected
the case when CONFIG_SHADOW_PAGING is disabled.  Make a similar adjustment to
the none stubs.

Spotted by a Travis RANDCONFIG run.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86: fix memory leak in pvh_setup_acpi_xsdt
Wei Liu [Sun, 26 Feb 2017 15:49:32 +0000 (15:49 +0000)]
x86: fix memory leak in pvh_setup_acpi_xsdt

Switch to use goto style error handling to avoid leaking xsdt.
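
The goto style referred to here funnels every exit through one
cleanup label. A generic sketch follows; the function, its parameters
and the simulated failure are hypothetical and only show the pattern:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy 'len' bytes of 'src' into a new buffer and hand it out via '*out'.
 * 'fail_late' simulates an error in a later step. Returns 0 or -1; on
 * failure nothing is leaked and '*out' is left untouched.
 */
int demo_setup_table(const void *src, size_t len, int fail_late, void **out)
{
    int rc = -1;
    void *table = malloc(len);

    if (table == NULL)
        return -1;            /* nothing allocated yet, nothing to free */

    memcpy(table, src, len);

    if (fail_late)
        goto out;             /* an early 'return' here would leak 'table' */

    *out = table;
    table = NULL;             /* ownership moved to the caller */
    rc = 0;

 out:
    free(table);              /* free(NULL) is a no-op on the success path */
    return rc;
}
```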

Coverity-ID: 1401535

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: fix memory leak in pvh_setup_acpi_madt
Wei Liu [Sun, 26 Feb 2017 15:49:31 +0000 (15:49 +0000)]
x86: fix memory leak in pvh_setup_acpi_madt

Switch to use goto style error handling to avoid leaking madt.

Coverity-ID: 1401534

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agobuild: add --with-rundir option to configure
Juergen Gross [Thu, 16 Feb 2017 07:47:07 +0000 (08:47 +0100)]
build: add --with-rundir option to configure

There have been reports that Fedora 25 uses /run instead of /var/run.

Add a --with-rundir option to configure to be able to specify that
directory. The default is still /var/run.

A re-run of autogen.sh is required.

Signed-off-by: Juergen Gross <jgross@suse.com>
[ wei: run autogen.sh ]
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/xen-mceinj: fix the type of cpu number
Haozhong Zhang [Fri, 24 Feb 2017 10:52:56 +0000 (18:52 +0800)]
tools/xen-mceinj: fix the type of cpu number

Use "unsigned int" rather than "int" to align to the type "uint32_t"
of xen_mc_physcpuinfo.ncpus.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/vmx: fix vmentry failure with TSX bits in LBR
Sergey Dyasli [Thu, 23 Feb 2017 09:33:27 +0000 (09:33 +0000)]
x86/vmx: fix vmentry failure with TSX bits in LBR

During VM entry, H/W will automatically load guest's MSRs from MSR-load
area in the same way as they would be written by WRMSR.

However, under the following conditions:

    1. LBR (Last Branch Record) MSRs were placed in the MSR-load area
    2. Address format of LBR includes TSX bits 61:62
    3. CPU has TSX support disabled

VM entry will fail with a message in the log similar to:

    (XEN) [   97.239514] d1v0 vmentry failure (reason 0x80000022): MSR loading (entry 3)
    (XEN) [   97.239516]   msr 00000680 val 1fff800000102e60 (mbz 0)

This happens because of the following behaviour:

    - When capturing branches, LBR H/W will always clear bits 61:62
      regardless of the sign extension
    - For WRMSR, bits 61:62 are considered part of the sign extension

This bug affects only certain pCPUs (e.g. Haswell) with vCPUs that
use LBR.  Fix it by sign-extending TSX bits in all LBR entries during
VM entry in affected cases.
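
The sign extension described above can be sketched in Python. This is only an
illustration, not the Xen code: the function and constant names are made up,
and it assumes that bits 59:60 of the saved value still hold valid copies of
the sign bit, so they can simply be duplicated into bits 61:62.

```python
# Illustrative sketch only; all names here are hypothetical, not Xen identifiers.
MASK64 = (1 << 64) - 1
TSX_BITS = 0b11 << 61    # bits 61:62, always cleared by the LBR hardware
SIGN_COPY = 0b11 << 59   # assumption: bits 59:60 still carry the sign bit


def sign_extend_tsx_bits(value):
    """Return value with bits 61:62 replaced by copies of bits 59:60."""
    cleared = value & ~TSX_BITS & MASK64
    return cleared | ((value & SIGN_COPY) << 2)


# Value taken from the vmentry failure message quoted above.
fixed = sign_extend_tsx_bits(0x1fff800000102e60)
```

Applied to the value from the log above, 0x1fff800000102e60 becomes
0x7fff800000102e60; a value whose bits 59:60 are clear is left unchanged.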

LBR MSRs are currently not Live Migrated. In order to implement such
functionality, the MSR levelling work has to be done first because
hosts can have different LBR formats.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/vmx: optimize vmx_read/write_guest_msr()
Sergey Dyasli [Thu, 23 Feb 2017 09:33:26 +0000 (09:33 +0000)]
x86/vmx: optimize vmx_read/write_guest_msr()

Replace the linear scan with vmx_find_msr().  This way the time complexity
of searching for the required MSR is reduced from linear to logarithmic.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/vmx: introduce vmx_find_msr()
Sergey Dyasli [Thu, 23 Feb 2017 09:33:25 +0000 (09:33 +0000)]
x86/vmx: introduce vmx_find_msr()

Modify vmx_add_msr() to use a variation of the insertion sort algorithm:
find a place for the new entry and shift all subsequent elements before
insertion.

The new vmx_find_msr() exploits the fact that the MSR list is now sorted
and reuses the existing code for binary search.
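
The idea of keeping the list sorted on insertion and then searching it with a
binary search can be sketched in Python. This is not the Xen implementation:
MsrList and its methods are hypothetical names, the standard bisect module
stands in for the binary search helper, and silently ignoring duplicates is an
assumption of this sketch.

```python
import bisect


class MsrList:
    """Hypothetical sorted list of (MSR index, value) pairs."""

    def __init__(self):
        self._indices = []  # MSR indices, always kept in ascending order
        self._values = []   # values, parallel to self._indices

    def add(self, index, value=0):
        # Find the insertion point; list.insert() shifts the later entries.
        pos = bisect.bisect_left(self._indices, index)
        if pos < len(self._indices) and self._indices[pos] == index:
            return  # assumption: an already present MSR is left untouched
        self._indices.insert(pos, index)
        self._values.insert(pos, value)

    def find(self, index):
        # Binary search: O(log n) instead of a linear scan.
        pos = bisect.bisect_left(self._indices, index)
        if pos < len(self._indices) and self._indices[pos] == index:
            return pos
        return None


msrs = MsrList()
for msr_index in (0x680, 0x1d9, 0x6c0, 0x680):
    msrs.add(msr_index)
```

Insertion stays linear in the worst case because of the shifting, but lookups,
which are the frequent operation, become logarithmic.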

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/emul: Fix sarx emulation test
Andrew Cooper [Fri, 24 Feb 2017 18:12:19 +0000 (18:12 +0000)]
x86/emul: Fix sarx emulation test

The emulation tests run `sarx %edx,(%ecx),%ebx` with 0xfedcba98 pointed at by
%ecx, and 0xff13 in %rdx.

As the instruction uses a 32bit operand size, the expected result is
0x00000000ffffffdb in %rbx (rather than 0xffffffffffffffdb), due to the usual
behaviour of 32bit operations on 64bit registers.

The test harness was incorrectly sign extending from 32 bits to 64 bits rather
than zero extending when checking the result of emulation, causing a false
negative failure.
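
The arithmetic above can be reproduced with a small Python sketch. The helper
names are made up for illustration; the sketch models a 32bit arithmetic right
shift whose count is masked to 5 bits, and then compares zero extension with
sign extension of the 32bit result.

```python
MASK32 = 0xffffffff


def sarx32(src, count):
    """Arithmetic right shift of a 32bit value; the count is taken modulo 32."""
    count &= 31
    signed = src - (1 << 32) if src & 0x80000000 else src
    return (signed >> count) & MASK32


def zero_extend_32_to_64(value):
    return value & MASK32


def sign_extend_32_to_64(value):
    value &= MASK32
    return (value | 0xffffffff00000000) if value & 0x80000000 else value


result32 = sarx32(0xfedcba98, 0xff13)          # shift count is 0x13 = 19
expected = zero_extend_32_to_64(result32)      # what the hardware produces
wrong = sign_extend_32_to_64(result32)         # what the harness compared against
```

Here result32 is 0xffffffdb, expected is 0x00000000ffffffdb and wrong is
0xffffffffffffffdb, matching the two values named in the message.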

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/paging: Package up the log dirty function pointers
Andrew Cooper [Thu, 16 Feb 2017 16:42:16 +0000 (16:42 +0000)]
x86/paging: Package up the log dirty function pointers

They depend solely on paging mode, so don't need to be repeated per domain, and
can live in .rodata.  While making this change, drop the redundant log_dirty
from the function pointer names.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agox86/cpuid: Handle leaf 0x1 in guest_cpuid()
Andrew Cooper [Fri, 17 Feb 2017 17:10:50 +0000 (17:10 +0000)]
x86/cpuid: Handle leaf 0x1 in guest_cpuid()

The feature words, ecx and edx, are already audited as part of the featureset
logic.  The existing leaf 0x80000001 dynamic logic has its SYSCALL adjustment
split out, as the rest of the adjustments are common with leaf 0x1.  The
existing leaf 0x1 feature adjustments from {pv,hvm}_cpuid() are moved
wholesale into guest_cpuid(), although deduped against the common adjustments.

The eax word is family/model/stepping information, and is fine to use as
provided by the toolstack, although with reserved bits cleared.

The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
is forwarded straight from hardware, as nothing good can possibly come of
providing an alternative value to the guest.

The next 8 bits are slightly different between Intel and AMD, but are both
some property of the number of logical cores in the current physical package.
For now, the toolstack value is used unchanged until better topology support
is available.

The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
guests don't observe a constant number on all vcpus via their emulated or
faulted view.
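
The treatment of the four ebx byte fields can be summarised with a Python
sketch. build_leaf1_ebx and its parameters are hypothetical names, not Xen
functions, and truncating the APIC ID to 8 bits is an assumption made here so
the value fits its field.

```python
def build_leaf1_ebx(toolstack_ebx, hardware_ebx, vcpu_id):
    """Compose a guest view of CPUID leaf 0x1 ebx from its four byte fields."""
    brand_id = toolstack_ebx & 0xff              # bits 7:0, passed straight through
    clflush_size = (hardware_ebx >> 8) & 0xff    # bits 15:8, taken from hardware
    logical_count = (toolstack_ebx >> 16) & 0xff # bits 23:16, toolstack value unchanged
    apic_id = (vcpu_id * 2) & 0xff               # bits 31:24, vcpu_id * 2 (truncation assumed)
    return (apic_id << 24) | (logical_count << 16) | (clflush_size << 8) | brand_id


# Toolstack asks for a CLFLUSH field of 0x10 and APIC ID 0xab; both are overridden.
guest_ebx = build_leaf1_ebx(toolstack_ebx=0xab101005,
                            hardware_ebx=0x00000800,
                            vcpu_id=3)
```

With these inputs guest_ebx is 0x06100805: the brand ID and logical count come
from the toolstack, the CLFLUSH field from hardware, and the APIC ID from the
vcpu number.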

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
Andrew Cooper [Fri, 13 Jan 2017 17:54:24 +0000 (17:54 +0000)]
x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies

Also update the AVX512 text similarly for EVEX, even if there are no
EVEX-encoded GPR instructions currently.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Disallow policy updates once the domain is running
Andrew Cooper [Fri, 17 Feb 2017 15:47:31 +0000 (15:47 +0000)]
x86/cpuid: Disallow policy updates once the domain is running

On real hardware, the bulk of CPUID data is system-specific and constant.
Hold the toolstack to the same behaviour when constructing domains.

Values which are expected to change dynamically (e.g. OSXSAVE) are unaffected
and continue to function as before.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul/test: split generic and testcase specific parts
Jan Beulich [Fri, 24 Feb 2017 16:22:13 +0000 (17:22 +0100)]
x86emul/test: split generic and testcase specific parts

Both the build logic and the invocation have their blowfish specific
aspects abstracted out here. Additionally
- run native execution (if suitable) first (as that one failing
  suggests a problem with the to be tested code itself, in which case
  having the emulator have a go over it is kind of pointless)
- move the 64-bit tests up in blobs[] so 64-bit native execution will
  also precede 32-bit emulation (on 64-bit systems only of course)
- instead of -msoft-float (we'd rather not have the compiler generate
  such code), pass -fno-asynchronous-unwind-tables and -g0 (reducing
  binary size of the helper images as well as [slightly] compilation
  time)
- skip tests with zero length blobs (these can result from failed
  compilation, but not failing the build in this case seems desirable:
  it may allow partial testing - e.g. with older compilers - and
  permits manually removing certain tests from the generated headers
  without having to touch actual source code)
- constrain rIP to the actual blob range rather than looking for the
  specific (fake) return address put on the stack
- also print the opcode when x86_emulate() fails
- print at least three progress dots (for relatively short tests)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: setup PVHv2 Dom0 ACPI tables
Roger Pau Monné [Fri, 24 Feb 2017 14:49:19 +0000 (15:49 +0100)]
x86: setup PVHv2 Dom0 ACPI tables

Create a new MADT table that contains the topology exposed to the guest. A
new XSDT table is also created, in order to filter the tables that we want
to expose to the guest, plus the Xen crafted MADT. This in turn requires Xen
to also create a new RSDP in order to make it point to the custom XSDT.

Also, regions marked as E820_ACPI or E820_NVS are identity mapped into Dom0
p2m, plus any top-level ACPI tables that should be accessible to Dom0 and
reside in reserved regions. This is needed because some memory maps don't
properly account for all the memory used by ACPI, so it's common to find ACPI
tables in reserved regions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: setup PVHv2 Dom0 CPUs
Roger Pau Monné [Fri, 24 Feb 2017 14:48:59 +0000 (15:48 +0100)]
x86: setup PVHv2 Dom0 CPUs

Initialize Dom0 BSP/APs and setup the memory and IO permissions. This also sets
the initial BSP state in order to match the protocol specified in
docs/misc/hvmlite.markdown.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: parse Dom0 kernel for PVHv2
Roger Pau Monné [Fri, 24 Feb 2017 14:48:43 +0000 (15:48 +0100)]
x86: parse Dom0 kernel for PVHv2

Introduce a helper to parse the Dom0 kernel.

A new helper is also introduced to libelf, that's used to store the destination
vcpu of the domain. This parameter is needed when loading the kernel on a HVM
domain (PVHv2), since hvm_copy_to_guest_phys requires passing the destination
vcpu.

While there, also fix image_base and image_start to be of type "void *", and do
the necessary fixup of related functions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/libelf: pass the destination vCPU to libelf for Dom0 build
Roger Pau Monné [Fri, 24 Feb 2017 14:47:55 +0000 (15:47 +0100)]
x86/libelf: pass the destination vCPU to libelf for Dom0 build

Allow setting the destination vCPU for libelf, so that elf_load_image can take
it into account when loading the kernel for Dom0. This is needed for PVHv2 Dom0
build, so that hvm_copy_to_guest_phys can be called with a Dom0 vCPU instead of
current (that contains the idle vCPU at this point).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/bzimage: change the types from char * to void *
Roger Pau Monné [Fri, 24 Feb 2017 14:47:36 +0000 (15:47 +0100)]
x86/bzimage: change the types from char * to void *

This also allows changing the types of image_base and image_start in the Dom0
builder from char * to void *.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: populate PVHv2 Dom0 physical memory map
Roger Pau Monné [Fri, 24 Feb 2017 14:47:03 +0000 (15:47 +0100)]
x86: populate PVHv2 Dom0 physical memory map

Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
memory pages that are shared between Xen and a domain, and use it in order to
remove low 1MB RAM regions from dom_io in order to assign them to a PVHv2 Dom0.

On hardware lacking support for unrestricted mode also craft the identity page
tables and the TSS used for virtual 8086 mode.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: remove XENFEAT_hvm_pirqs for PVHv2 guests
Roger Pau Monné [Fri, 24 Feb 2017 14:46:10 +0000 (15:46 +0100)]
x86: remove XENFEAT_hvm_pirqs for PVHv2 guests

PVHv2 guests, unlike HVM guests, won't have the option to route interrupts
from physical or emulated devices over event channels using PIRQs. This
applies to both DomU and Dom0 PVHv2 guests.

Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
route physical interrupts (even from emulated devices) over event channels,
and is thus allowed to use some of the PHYSDEV ops.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/hvm: Don't let hvm_set_efer() raise #GP itself
Andrew Cooper [Fri, 24 Feb 2017 09:22:09 +0000 (09:22 +0000)]
x86/hvm: Don't let hvm_set_efer() raise #GP itself

c/s 49de10f3c "x86/hvm: Don't raise #GP behind the emulators back for MSR
accesses" missed an edge case.

hvm_set_efer() raises #GP itself, so deliberately avoided the goto gp_fault
path in hvm_msr_write_intercept().

With the above change, guest updates to MSR_EFER which end up faulting raise
#GP twice: once from hvm_set_efer() itself, and again as a result of
hvm_msr_write_intercept() returning X86EMUL_EXCEPTION.  The second #GP gets
combined to #DF and handed back to the guest.

Update hvm_set_efer() to avoid raising #GP, requiring its callers to do so.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agolibxl/libxl_pci.c: Fix reverse logic when detaching device
Chao Gao [Thu, 23 Feb 2017 23:12:10 +0000 (07:12 +0800)]
libxl/libxl_pci.c: Fix reverse logic when detaching device

Commit 20b75251d97 ("libxl/libxl_pci.c: used LOG*D functions") reverses the
logic to call xc_deassign_device(). It makes the device unusable.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoarm/p2m: remove the page from p2m->pages list before freeing it
Julien Grall [Fri, 24 Feb 2017 08:58:50 +0000 (09:58 +0100)]
arm/p2m: remove the page from p2m->pages list before freeing it

The p2m code is using the page list field to link all the pages used
for the stage-2 page tables. The page is added into the p2m->pages
list just after the allocation but never removed from the list.

The page list field is also used by the allocator, so not removing the page
may result in a later Xen crash due to inconsistency (see [1]).

This bug was introduced by the reworking of p2m code in commit 2ef3e36ec7
"xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry".

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00524.html

Reported-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agotools: move xl to a dedicated directory
Wei Liu [Tue, 21 Feb 2017 14:52:46 +0000 (14:52 +0000)]
tools: move xl to a dedicated directory

It makes a clear distinction between the client (xl) and library (libxl),
which should help design better APIs.  This will also help reduce the
code size in the libxl directory.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools: provide libxlutil compiling and linking options
Wei Liu [Tue, 21 Feb 2017 14:40:48 +0000 (14:40 +0000)]
tools: provide libxlutil compiling and linking options

We are about to split out xl (which depends on libxlutil) to a different
directory. Provide the proper options for compiling and linking in
Rules.mk, and replace the hardcoded string in libxl/Makefile.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxen-access: request compat devicemodel API
Wei Liu [Thu, 23 Feb 2017 16:46:45 +0000 (16:46 +0000)]
xen-access: request compat devicemodel API

xc_hvm_inject_trap is moved to the new libdevicemodel. Request the
compat layer from libxenctrl for now to make xen-access build again.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
8 years agolibs/devicemodel: initialise op_bufs in xendevicemodel_xcall
Wei Liu [Thu, 23 Feb 2017 15:18:20 +0000 (15:18 +0000)]
libs/devicemodel: initialise op_bufs in xendevicemodel_xcall

To avoid freeing an uninitialised buffer when taking the first error exit
path.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agopython: handle long type in scripts
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:28 +0000 (11:48 +0100)]
python: handle long type in scripts

In Python3 the 'long' type has been merged into 'int', and the '1L' syntax
is no longer valid. Assign the 'int' type to a 'long' variable in Python3, so
'long(1)' will give the correct result in both Python2 and Python3.
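
A minimal sketch of such a compatibility assignment is shown below. The
version check is one possible way to write it and is an assumption here; the
message above does not say how the scripts detect Python3.

```python
import sys

if sys.version_info[0] >= 3:
    # Python3 has no separate 'long' type, so alias it to 'int'.
    long = int

# 'long(1)' now works on both Python2 and Python3, replacing the '1L' literal.
big = long(1) << 40
```

After this, code that previously relied on '1L' can call 'long(1)' instead and
run unchanged on either interpreter.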

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: adjust module initalization for Python3
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:27 +0000 (11:48 +0100)]
python: adjust module initalization for Python3

In Python3, PyTypeObject looks slightly different, and module
initialization is also different. Instead of Py_InitModule, PyModule_Create
should be called on an already defined PyModuleDef structure, and the
initialization function should then return that module.

Additionally, the initialization function should be named PyInit_<name>
instead of init<name>.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use PyLong_* for constructing 'int' type in Python3
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:26 +0000 (11:48 +0100)]
python: use PyLong_* for constructing 'int' type in Python3

In Python3 the 'int' and 'long' types are the same, so there are no longer
separate PyInt_* functions.  Provide convenient #defines to limit #if in
the code.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use PyBytes/PyUnicode instead of PyString
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:25 +0000 (11:48 +0100)]
python: use PyBytes/PyUnicode instead of PyString

In Python2 PyBytes is the same as PyString, but in Python3 PyString is
gone and 'str' is really PyUnicode in the C-API.
When handling arbitrary data, use PyBytes - which is the right thing to
do in Python3, and poses no API change in Python2. When handling
xenstore paths and transaction ids, which have a well-defined format, use
PyUnicode - to ease API usage - no need to prefix all xenstore paths
with 'b' when migrating scripts to Python3.
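
From a script's point of view, the split between text paths and binary data
could look like the sketch below. FakeXenstore is a hypothetical in-memory
stand-in written for this illustration, not the real Xen binding, and its
strict type checks are this sketch's own choice.

```python
class FakeXenstore:
    """Hypothetical in-memory stand-in for a xenstore handle."""

    def __init__(self):
        self._store = {}

    def write(self, path, data):
        # Paths have a well-defined textual format, so plain 'str' is used.
        if not isinstance(path, str):
            raise TypeError("path must be str")
        # Values are arbitrary data, so they are handled as bytes.
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes")
        self._store[path] = data

    def read(self, path):
        return self._store.get(path)


xs = FakeXenstore()
xs.write("/local/domain/0/name", b"Domain-0")  # no b'' prefix on the path
name = xs.read("/local/domain/0/name")
```

The path literal needs no 'b' prefix, while the value stays a bytes object
that the caller decodes only if it knows the content is text.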

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: initialize specific fields of PyTypeObject
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:24 +0000 (11:48 +0100)]
python: initialize specific fields of PyTypeObject

Fields not named here will be zero-initialized anyway, but initializing this
way makes it much easier to support both Python2 and Python3.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use Py_TYPE instead of looking directly into PyObject_HEAD
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:23 +0000 (11:48 +0100)]
python: use Py_TYPE instead of looking directly into PyObject_HEAD

Py_TYPE works on both Python2 and Python3, while internals of
PyObject_HEAD have changed.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: drop tp_getattr implementation
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:22 +0000 (11:48 +0100)]
python: drop tp_getattr implementation

The tp_getattr method of the type object is already deprecated in Python2 and
gone in Python3. The default implementation does the same as this custom one.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: check return value of PyErr_NewException
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:21 +0000 (11:48 +0100)]
python: check return value of PyErr_NewException

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>