xenbits.xensource.com Git - xen.git/log
5 years ago x86/IRQ: move and rename __do_IRQ_guest()
Jan Beulich [Fri, 27 Dec 2019 08:52:41 +0000 (09:52 +0100)]
x86/IRQ: move and rename __do_IRQ_guest()

This is for it to be next to do_IRQ(). Beyond the actual code movement
this
- drops the leading underscores,
- passes in desc and vector, rather than irq,
- flips the order of two ASSERT()s,
- changes i and sp to unsigned int,
- restricts the scope of d and sp,
- corrects style.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/IRQ: move do_IRQ()
Jan Beulich [Fri, 27 Dec 2019 08:51:52 +0000 (09:51 +0100)]
x86/IRQ: move do_IRQ()

This is to avoid forward declarations of static functions. Beyond the
actual code movement this does
- u8 -> uint8_t,
- convert to Xen style,
- drop unnecessary parentheses and the like,
- strip trailing white space.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/hvm/rtc: preserved guest RTC offset during suspend/resume/migrate
Paul Durrant [Fri, 27 Dec 2019 08:50:31 +0000 (09:50 +0100)]
x86/hvm/rtc: preserved guest RTC offset during suspend/resume/migrate

The emulated RTC is synchronized with the PV wallclock; any write to the
RTC will update struct domain's 'time_offset_seconds' field and call
update_domain_wallclock().

However, the value of 'time_offset_seconds' is not preserved in any save
record and indeed, when the RTC save record is loaded, the CMOS values
will be updated based on an offset value which may or may not have been
set by the toolstack [1]. This may result in making bogus values available
to the guest and messing up any calculations done in the call to
alarm_timer_update() at the end of rtc_load().

This patch extends the RTC save record to contain an offset value, which
will be zero-filled on load of an older record. The 'time_offset_seconds'
field in struct domain is also modified into a 'time_offset' struct,
containing a 'seconds' field and a boolean 'set' field.

The code in rtc_load() then uses the new value in the save record to
update the value of struct domain's 'time_offset.seconds' unless
'time_offset.set' is true, which will only be the case if the toolstack has
already performed a XEN_DOMCTL_settimeoffset.

[1] There is currently no way for a toolstack to read the value of
    'time_offset_seconds' from struct domain. In the past, any hope of
    preservation of the value across a guest life-cycle operation was based
    on relying on qemu-dm to write a value into xenstore whenever the RTC
    was updated, in response to an IOREQ with type IOREQ_TYPE_TIMEOFFSET
    being sent by Xen; see:

    https://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=blob;f=i386-dm/helper2.c#l457

    but this behaviour was never forward-ported into upstream QEMU, which
    completely ignores that IOREQ type.
    In either case, nothing in xl or libxl ever samples the value of
    RTC offset from xenstore so any offset adjustment to a non-zero value
    performed by the guest (which in the case of Windows is highly likely
    as it normally writes RTC in local time, whereas Xen maintains time in
    UTC) is completely lost with the de-facto toolstack, and always has
    been. Instead, PV drivers are relied upon to paper over this gaping
    hole.
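
The rtc_load() rule described above can be sketched as follows. This is an
illustrative Python model, not the hypervisor's C code: the TimeOffset class
and the apply_saved_offset() helper are hypothetical names, and only the
'seconds' and 'set' fields are taken from the description.

```python
from dataclasses import dataclass


@dataclass
class TimeOffset:
    """Stand-in for the 'time_offset' struct described in the message."""
    seconds: int = 0
    # True only once the toolstack has issued XEN_DOMCTL_settimeoffset.
    set: bool = False


def apply_saved_offset(time_offset: TimeOffset, record_offset: int) -> int:
    """Hypothetical helper modelling the decision taken on record load."""
    if not time_offset.set:
        # No explicit toolstack value: use the offset carried in the save
        # record (zero when the record predates the new field).
        time_offset.seconds = record_offset
    return time_offset.seconds
```

With this model, an offset set by the toolstack always wins, while a guest
restored without toolstack involvement keeps the offset from its save record.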

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien@xen.org>
5 years ago x86/vvmx: virtualize x2APIC mode and APIC accesses can't both be enabled
Roger Pau Monne [Tue, 24 Dec 2019 15:32:47 +0000 (16:32 +0100)]
x86/vvmx: virtualize x2APIC mode and APIC accesses can't both be enabled

According to the Intel SDM, "virtualize x2APIC mode" and "virtualize
APIC accesses" can't be enabled at the same time, or else a
vm{launch/entry} failure will happen. This was seen when running Xen
nested and with x2APIC enabled:

  (XEN) d3v0 VMLAUNCH error: 0x7
  [...]
  (XEN) *** Control State ***
  (XEN) PinBased=0000003f CPUBased=b6a075fe SecondaryExec=000014fb
  [...]

Fix this by making sure nvmx_update_secondary_exec_control clears the
incompatible bits from the host vmcs before merging it with the nested
vmcs.

This fixes a regression reported by osstest in the
test-amd64-amd64-qemuu-nested-intel job.
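
As an illustration of the conflict, the SecondaryExec value in the dump above
(0x14fb) has both bit 0 ("virtualize APIC accesses") and bit 4 ("virtualize
x2APIC mode") set. The sketch below is Python rather than Xen's C and assumes
those two bit positions from the Intel SDM. merge_secondary_exec() is a
hypothetical stand-in for the merge step; it assumes both bits are dropped
from the host value, which may differ from the exact set of bits the real
function clears.

```python
# Bit positions in the secondary processor-based VM-execution controls.
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES = 1 << 0
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE = 1 << 4

INCOMPATIBLE = (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
                SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)


def both_apic_modes_enabled(ctrl: int) -> bool:
    """True if the control word requests both mutually exclusive features."""
    return (ctrl & INCOMPATIBLE) == INCOMPATIBLE


def merge_secondary_exec(host_ctrl: int, nested_ctrl: int) -> int:
    """Hypothetical merge: strip the conflicting bits from the host value so
    that only the nested hypervisor's choice survives in the result."""
    return (host_ctrl & ~INCOMPATIBLE) | nested_ctrl
```

For example, merging the host value 0x14fb with a nested value that only
requests x2APIC virtualization no longer yields a word with both bits set.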

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago libxc/migration: Drop unimplemented domain types
Andrew Cooper [Tue, 17 Dec 2019 17:49:47 +0000 (17:49 +0000)]
libxc/migration: Drop unimplemented domain types

x86 PVH is completely obsolete - it was intended for legacy PVH before that
idea was abandoned.  There was an RFC series for ARM in 2015, but there is
plenty of outstanding work which hasn't been done yet.

No functional change.  New types can be (re)introduced with the code which
actually implements them.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien@xen.org>
Acked-by: Wei Liu <wl@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago libxc/migration: Rename TSC_INFO to X86_TSC_INFO
Andrew Cooper [Tue, 17 Dec 2019 13:38:14 +0000 (13:38 +0000)]
libxc/migration: Rename TSC_INFO to X86_TSC_INFO

This record is specific to x86, and should have had a prefix to begin with.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago docs/migration: Remove numbering for typical records
Andrew Cooper [Mon, 16 Dec 2019 17:15:23 +0000 (17:15 +0000)]
docs/migration: Remove numbering for typical records

The numbers aren't referenced directly, and explicit numbering makes an
unnecessarily large diff when inserting something new in the middle.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago libxc/restore: Don't duplicate state in process_vcpu_basic()
Andrew Cooper [Wed, 18 Dec 2019 19:43:18 +0000 (19:43 +0000)]
libxc/restore: Don't duplicate state in process_vcpu_basic()

vcpu_guest_context_any_t is currently allocated on the stack, and copied from
a mutable buffer which is freed immediately after its use here.  Mutate the
buffer in place instead of duplicating it.

The code is as it is due to how it was developed.  Originally,
process_vcpu_basic() operated on a const pointer from the X86_VCPU_BASIC
record, but during upstreaming, the addition of Remus support required
buffering of X86_VCPU_BASIC records each checkpoint.

By the time process_vcpu_basic() runs, we are committed to completing state
restoration and unpausing the guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago golang/xenlight: implement array C to Go marshaling
Nick Rosbrook [Mon, 23 Dec 2019 15:17:02 +0000 (10:17 -0500)]
golang/xenlight: implement array C to Go marshaling

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago golang/xenlight: add error return type to Context.Cpupoolinfo
Nick Rosbrook [Mon, 23 Dec 2019 15:17:07 +0000 (10:17 -0500)]
golang/xenlight: add error return type to Context.Cpupoolinfo

A previous commit that removed Context.CheckOpen revealed
an ineffectual assignment to err in Context.Cpupoolinfo, as
there is no error return type.

Since it appears that the intent is to return an error here,
add an error return value to the function signature.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago golang/xenlight: revise use of Context type
Nick Rosbrook [Mon, 23 Dec 2019 15:17:06 +0000 (10:17 -0500)]
golang/xenlight: revise use of Context type

Remove the exported global context variable, 'Ctx'. Generally, it is
better to not export global variables for use through a Go package.
However, there are some exceptions that can be found in the standard
library.

Add a NewContext function instead, and remove the Open, IsOpen, and
CheckOpen functions as a result.

Also, comment-out an ineffectual assignment to 'err' inside the function
Context.CpupoolInfo so that compilation does not fail.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago MAINTAINERS: put hyperv-tlfs.h under viridian maintainership
Wei Liu [Mon, 23 Dec 2019 12:51:43 +0000 (12:51 +0000)]
MAINTAINERS: put hyperv-tlfs.h under viridian maintainership

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wl@xen.org>
Acked-by: Paul Durrant <paul@xen.org>
5 years ago x86emul: introduce CASE_SIMD_..._FP_VEX()
Jan Beulich [Mon, 23 Dec 2019 13:16:11 +0000 (14:16 +0100)]
x86emul: introduce CASE_SIMD_..._FP_VEX()

Since there are many AVX{,2} insns having legacy SIMD counterparts, have
macros covering both in one go. This (imo) improves readability and helps
prepare for optionally disabling SIMD support in the emulator.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86emul: drop CASE_SIMD_DOUBLE_FP()
Jan Beulich [Mon, 23 Dec 2019 13:15:17 +0000 (14:15 +0100)]
x86emul: drop CASE_SIMD_DOUBLE_FP()

It's used only by CASE_SIMD_ALL_FP(), which can equally well be
implemented in terms of CASE_SIMD_{PACKED,SCALAR}_FP().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86emul: introduce CASE_SIMD_PACKED_INT_VEX()
Jan Beulich [Mon, 23 Dec 2019 13:13:37 +0000 (14:13 +0100)]
x86emul: introduce CASE_SIMD_PACKED_INT_VEX()

Since there are many AVX{,2} insns having legacy MMX and SIMD
counterparts, have a macro covering all three in one go. This (imo)
improves readability (simply by the shrunk number of lines) and helps
prepare for optionally disabling MMX and SIMD support in the emulator.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/hyperv: change hv_tlb_flush_ex to fix clang build
Wei Liu [Mon, 23 Dec 2019 11:03:30 +0000 (11:03 +0000)]
x86/hyperv: change hv_tlb_flush_ex to fix clang build

Clang complains:

In file included from synic.c:15:
/builds/xen-project/xen/xen/include/asm/guest/hyperv-tlfs.h:900:18: error: field 'hv_vp_set' with variable sized type 'struct hv_vpset' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
        struct hv_vpset hv_vp_set;
                        ^
1 error generated.
/builds/xen-project/xen/xen/Rules.mk:198: recipe for target 'synic.o' failed
make[6]: *** [synic.o] Error 1

Comment out the last variable size array from hv_tlb_flush_ex to fix
clang builds.

Fixes: bbba482664 ("x86: import hyperv-tlfs.h from Linux")
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/viridian: drop viridian_stimer_config_msr
Wei Liu [Sun, 22 Dec 2019 23:12:15 +0000 (23:12 +0000)]
x86/viridian: drop viridian_stimer_config_msr

Use hv_stimer_config instead. No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86/viridian: drop virdian_sint_msr
Wei Liu [Sun, 22 Dec 2019 23:06:00 +0000 (23:06 +0000)]
x86/viridian: drop virdian_sint_msr

Use hv_synic_sint in hyperv-tlfs.h instead.

This requires adding the missing "polling" member to hv_synic_sint.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86/viridian: drop a wrong invalid value from reference TSC implementation
Wei Liu [Fri, 20 Dec 2019 21:08:28 +0000 (21:08 +0000)]
x86/viridian: drop a wrong invalid value from reference TSC implementation

The only invalid value mentioned in Hyper-V TLFS 5.0c is 0. Michael
Kelley confirmed that 0xFFFFFFFF was never used [0].

[0] https://lists.xen.org/archives/html/xen-devel/2019-12/msg01564.html

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86: move viridian_guest_os_id_msr to hyperv-tlfs.h
Wei Liu [Fri, 20 Dec 2019 19:43:59 +0000 (19:43 +0000)]
x86: move viridian_guest_os_id_msr to hyperv-tlfs.h

Suggested-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86: provide and use hv_tsc_scale
Wei Liu [Fri, 20 Dec 2019 19:18:16 +0000 (19:18 +0000)]
x86: provide and use hv_tsc_scale

The Hyper-V clock source and Xen's own viridian code need the same
functionality.

Move the function in viridian/time.c to hyperv.h and use it in both
places.

No functional change.
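
For reference, the Hyper-V TLFS defines the reference time as
((tsc * scale) >> 64) + offset. A minimal Python sketch of that computation
follows; scale_tsc() is a hypothetical name rather than the shared helper's
real signature, and the offset is treated as a signed value.

```python
MASK64 = (1 << 64) - 1


def scale_tsc(tsc: int, scale: int, offset: int) -> int:
    """Reference time from a raw TSC value.

    tsc and scale are unsigned 64-bit quantities; their product is taken at
    full (128-bit) width and only the upper 64 bits are kept. The signed
    offset is then added and the result truncated to 64 bits.
    """
    high = (tsc * scale) >> 64
    return (high + offset) & MASK64
```

A scale of 2^63 halves the TSC value, so scale_tsc(1000, 1 << 63, 0) gives
500, and a negative offset simply moves the result backwards.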

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86/viridian: drop private copy of HV_REFERENCE_TSC_PAGE in time.c
Wei Liu [Tue, 17 Dec 2019 18:28:39 +0000 (18:28 +0000)]
x86/viridian: drop private copy of HV_REFERENCE_TSC_PAGE in time.c

Use the one defined in hyperv-tlfs.h instead. No functional change
intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86/viridian: drop duplicate defines from private.h and viridian.c
Wei Liu [Tue, 17 Dec 2019 17:20:01 +0000 (17:20 +0000)]
x86/viridian: drop duplicate defines from private.h and viridian.c

Also add HVCALL_EXT_CALL_QUERY_CAPABILITIES to hyperv-tlfs.h.
HvGetPartitionID was never used in code, so it is simply dropped.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <paul@xen.org>
5 years ago x86: Hyper-V clock source's offset should be signed
Wei Liu [Fri, 20 Dec 2019 19:47:49 +0000 (19:47 +0000)]
x86: Hyper-V clock source's offset should be signed

Also drop the useless inline keyword.

Fixes: 685d16bd5 (x86: implement Hyper-V clock source)
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago livepatch: Fix typos and other errors in tests Makefile
Pawel Wieczorkiewicz [Fri, 20 Dec 2019 18:23:39 +0000 (18:23 +0000)]
livepatch: Fix typos and other errors in tests Makefile

There were a number of typos (s/actions/action/) as well as one missing
config.h target dependency. Also, the xen_expectation target had an
unnecessary cyclic dependency.

Fixes: 25164571fc ('Merge branch 'livepatch.aws.v6' into staging')
Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Tested-by: Julien Grall <julien@xen.org>
5 years ago x86/viridian: drop private copy of definitions from synic.c
Wei Liu [Wed, 18 Dec 2019 14:42:30 +0000 (14:42 +0000)]
x86/viridian: drop private copy of definitions from synic.c

Use hyperv-tlfs.h instead. No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Paul Durrant <pdurrant@amazon.com>
5 years ago x86: implement Hyper-V clock source
Wei Liu [Thu, 24 Oct 2019 14:54:15 +0000 (15:54 +0100)]
x86: implement Hyper-V clock source

Implement a clock source using Hyper-V's reference TSC page.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/hyperv: extract more information from Hyper-V
Wei Liu [Thu, 24 Oct 2019 13:22:53 +0000 (14:22 +0100)]
x86/hyperv: extract more information from Hyper-V

Provide a structure to store that information. The structure will be
accessed from other places later so make it public.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86: import hyperv-tlfs.h from Linux
Wei Liu [Thu, 24 Oct 2019 11:17:03 +0000 (12:17 +0100)]
x86: import hyperv-tlfs.h from Linux

Take a pristine copy from Linux commit b2d8b167e15bb5ec2691d1119c025630a247f649.

Do the following to fix it up for Xen:

1. include xen/types.h and xen/bitops.h
2. fix up invocations of BIT macro

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago tools/libxc: Drop unused xc_compression_*()
Andrew Cooper [Thu, 19 Dec 2019 14:51:31 +0000 (14:51 +0000)]
tools/libxc: Drop unused xc_compression_*()

There have been no users of the xc_compression_*() interface since Migration
v2 replaced legacy migration (2015, c/s b15bc4345).

It would need adjusting to fit into migration v2, and can be pulled out of git
history if someone wants to resurrect it in the future.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago tools/libxc: Drop other examples of the 'goto x; } else if' antipattern
Andrew Cooper [Wed, 18 Dec 2019 22:08:02 +0000 (22:08 +0000)]
tools/libxc: Drop other examples of the 'goto x; } else if' antipattern

None of these are buggy, but the resulting code is more robust.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago x86emul: use CASE_SIMD_PACKED_INT() where possible
Jan Beulich [Fri, 20 Dec 2019 15:46:20 +0000 (16:46 +0100)]
x86emul: use CASE_SIMD_PACKED_INT() where possible

This (imo) improves readability (simply by the shrunk number of lines)
and helps prepare for optionally disabling MMX and SIMD support in the
emulator.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/vm_event: add short-circuit for breakpoints (aka "fast single step")
Sergey Kovalev [Fri, 20 Dec 2019 15:45:32 +0000 (16:45 +0100)]
x86/vm_event: add short-circuit for breakpoints (aka "fast single step")

When using DRAKVUF (or another system using altp2m with shadow pages similar
to what is described in
https://xenproject.org/2016/04/13/stealthy-monitoring-with-xen-altp2m),
after a breakpoint is hit the system switches to the default
unrestricted altp2m view with singlestep enabled. When the singlestep
traps to Xen another vm_event is sent to the monitor agent, which then
normally disables singlestepping and switches the altp2m view back to
the restricted view.

This patch short-circuits that last part, so that Xen doesn't need to send
the vm_event out for the singlestep event and instead switches back to the
restricted view automatically.

This optimization yields a speed-up of about 35%.

Tested on the Debian branch of Xen 4.12; see:
https://github.com/skvl/xen/tree/debian/knorrie/4.12/fast-singlestep

Rebased on master:
https://github.com/skvl/xen/tree/fast-singlestep

Signed-off-by: Sergey Kovalev <valor@list.ru>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
5 years ago x86/time: update vtsc_last with cmpxchg and drop vtsc_lock
Igor Druzhinin [Fri, 20 Dec 2019 15:44:38 +0000 (16:44 +0100)]
x86/time: update vtsc_last with cmpxchg and drop vtsc_lock

Now that vtsc_last is the only entity protected by vtsc_lock we can
simply update it using a single atomic operation and drop the spinlock
entirely. This is extremely important for the case of running nested
(e.g. shim instance with lots of vCPUs assigned) since if preemption
happens somewhere inside the critical section, other vCPUs immediately
stop progressing (and are probably preempted as well) while waiting for
the spinlock to be freed.

This fixes constant shim guest boot lockups with ~32 vCPUs if there is
vCPU overcommit present (which increases the likelihood of preemption).
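
The lock-free update can be modelled as a compare-and-exchange retry loop.
The sketch below is Python and purely illustrative: Python has no native
cmpxchg, so the hypothetical AtomicCell class emulates one (its internal
guard only models the atomicity of the single instruction and is never held
across the computation). The rule of advancing to "now" or to the previous
value plus one is an assumption made here to keep the result monotonic, not
a quote of the patch.

```python
import threading


class AtomicCell:
    """Emulates a machine word updated with compare-and-exchange."""

    def __init__(self, value: int):
        self.value = value
        self._guard = threading.Lock()  # models hardware atomicity only

    def cmpxchg(self, expected: int, new: int) -> int:
        """Store new if the cell holds expected; return the value seen."""
        with self._guard:
            seen = self.value
            if seen == expected:
                self.value = new
            return seen


def next_vtsc(vtsc_last: AtomicCell, now: int) -> int:
    """Advance vtsc_last without holding a lock around the computation."""
    while True:
        old = vtsc_last.value
        # Never go backwards: take "now" if it is ahead, else step by one.
        new = now if now > old else old + 1
        if vtsc_last.cmpxchg(old, new) == old:
            return new  # our update won; otherwise retry with a fresh value
```

A caller that loses the race simply retries, so a preempted vCPU can no
longer stall every other vCPU the way a held spinlock would.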

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86: explicitly disallow guest access to PPIN
Jan Beulich [Fri, 20 Dec 2019 15:30:13 +0000 (16:30 +0100)]
x86: explicitly disallow guest access to PPIN

To fulfill the "protected" in its name, don't let the real hardware
values leak. While we could report a control register value expressing
this (which I would have preferred), unconditionally raise #GP for all
accesses (in the interest of getting this done).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/apic: allow enabling x2APIC mode regardless of interrupt remapping
Roger Pau Monné [Fri, 20 Dec 2019 15:29:22 +0000 (16:29 +0100)]
x86/apic: allow enabling x2APIC mode regardless of interrupt remapping

x2APIC mode doesn't mandate interrupt remapping, and hence can be
enabled independently. This patch enables x2APIC when available,
regardless of whether there's interrupt remapping support.

This is especially beneficial when running in virtualized environments,
since it reduces the number of vmexits. For example when sending an
IPI in xAPIC mode Xen performs at least 3 different accesses to the
APIC MMIO region, while when using x2APIC mode a single wrmsr is used.

The following numbers are from a lock profiling of a Xen PV shim
running a Linux PV kernel with 32 vCPUs and xAPIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804af1c0, lockval=03190319, not locked
(XEN)   lock:656153(892606463454), block:602183(9495067321843)

Average lock time:   1360363ns
Average block time: 15767743ns

While the following are from the same configuration but with the shim
using x2APIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804b01c0, lockval=1adb1adb, not locked
(XEN)   lock:1841883(1375128998543), block:1658716(10193054890781)

Average lock time:   746588ns
Average block time: 6145147ns

Enabling x2APIC has roughly halved the average lock time and more than
halved the average block time, thus reducing contention.
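
The averages quoted above follow directly from the profile counters. The
short Python check below assumes each counter reads count(total time in ns),
which matches the figures in the message; the variable names are made up for
the illustration.

```python
def average_ns(total_ns: int, count: int) -> int:
    """Average time per event, rounded down to whole nanoseconds."""
    return total_ns // count


# (count, total_ns) pairs copied from the two flush_lock profile dumps.
xapic = {"lock": (656153, 892606463454),
         "block": (602183, 9495067321843)}
x2apic = {"lock": (1841883, 1375128998543),
          "block": (1658716, 10193054890781)}

xapic_avg = {k: average_ns(t, c) for k, (c, t) in xapic.items()}
x2apic_avg = {k: average_ns(t, c) for k, (c, t) in x2apic.items()}

# Ratio of average lock hold time, x2APIC relative to xAPIC.
lock_ratio = x2apic_avg["lock"] / xapic_avg["lock"]
```

The lock ratio comes out at roughly 0.55, while the lock was taken almost
three times as often over the profiled run.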

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/smp: check APIC ID on AP bringup
Roger Pau Monné [Fri, 20 Dec 2019 15:28:27 +0000 (16:28 +0100)]
x86/smp: check APIC ID on AP bringup

Check that the APIC ID of the processor to be woken up is addressable in
the current APIC mode.

Note that in practice systems with APIC IDs > 255 should already have
x2APIC enabled by the firmware, and hence this is mostly a safety
belt.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/apic: force phys mode if interrupt remapping is disabled
Roger Pau Monné [Fri, 20 Dec 2019 15:27:48 +0000 (16:27 +0100)]
x86/apic: force phys mode if interrupt remapping is disabled

Cluster mode can only be used with interrupt remapping support, since
the top 16 bits of the APIC ID are filled with the cluster ID, and
hence even on systems where the physical ID is still smaller than 255
the cluster mode ID is not. Force x2APIC to use physical mode if there's
no interrupt remapping support.

Note that this requires a further patch in order to enable x2APIC
without interrupt remapping support.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/ioapic: only use dest32 with x2apic and interrupt remapping enabled
Roger Pau Monné [Fri, 20 Dec 2019 15:26:09 +0000 (16:26 +0100)]
x86/ioapic: only use dest32 with x2apic and interrupt remapping enabled

The IO-APIC code assumes that x2apic being enabled also implies
interrupt remapping being enabled, and hence will use the 32bit
destination field in the IO-APIC entry.

This is safe now, but there's no reason to not enable x2APIC even
without interrupt remapping, and hence the IO-APIC code needs to use
the 32 bit destination field only when both interrupt remapping and
x2APIC are enabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago tools: bump library version numbers
Wei Liu [Tue, 17 Dec 2019 14:49:28 +0000 (14:49 +0000)]
tools: bump library version numbers

Signed-off-by: Wei Liu <wl@xen.org>
5 years ago libxc/restore: Fix data auditing in handle_x86_pv_vcpu_blob()
Andrew Cooper [Thu, 19 Dec 2019 20:32:20 +0000 (20:32 +0000)]
libxc/restore: Fix data auditing in handle_x86_pv_vcpu_blob()

The current logic only works by chance, in that XSAVE records also tend to be
a multiple of 128.  Implement the missing logic for XSAVE.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago libxc/restore: Fix data auditing in handle_x86_pv_info()
Andrew Cooper [Wed, 18 Dec 2019 20:17:42 +0000 (20:17 +0000)]
libxc/restore: Fix data auditing in handle_x86_pv_info()

handle_x86_pv_info() has a subtle bug.  It uses an 'else if' chain with a
clause in the middle which doesn't exit unconditionally.  In practice, this
means that when restoring a 32bit PV guest, later sanity checks are skipped.

Rework the logic a little to be simpler.  There are exactly two valid
combinations of fields in X86_PV_INFO, so factor this out and check them all
in one go, before making adjustments to the current domain.

Once adjustments have been completed successfully, sanity check the result
against the X86_PV_INFO settings in one go, rather than piece-wise.
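
The shape of the bug, and of the rework, can be shown with a small Python
model. Everything here is illustrative: the function names are hypothetical,
and the two valid (guest width, pagetable levels) pairs are assumed to be
(4, 3) and (8, 4).

```python
# Assumed valid (guest_width, pt_levels) combinations in X86_PV_INFO.
VALID_COMBINATIONS = {(4, 3), (8, 4)}


def audit_else_if_chain(guest_width, pt_levels, current_width):
    """Buggy shape: a middle clause that acts but does not exit."""
    if guest_width not in (4, 8):
        raise ValueError("bad guest width")
    elif guest_width != current_width:
        # Adjusts the domain and falls out of the chain, so every later
        # clause is skipped whenever a width switch is needed.
        current_width = guest_width
    elif pt_levels not in (3, 4):
        raise ValueError("bad pagetable levels")
    return current_width


def audit_up_front(guest_width, pt_levels, current_width):
    """Reworked shape: validate the combination before touching anything."""
    if (guest_width, pt_levels) not in VALID_COMBINATIONS:
        raise ValueError("invalid X86_PV_INFO combination")
    # Only now would the domain be adjusted to the requested width.
    return guest_width
```

When a 32bit guest is restored into a domain that starts out 64bit, the first
version accepts a bogus level count, while the second rejects it before any
adjustment is made.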

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years ago tools/python: Python 3 compatibility
Andrew Cooper [Wed, 18 Dec 2019 14:00:16 +0000 (14:00 +0000)]
tools/python: Python 3 compatibility

convert-legacy-stream is only used for incoming migration from pre Xen 4.7,
and verify-stream-v2 appears to only be used by me during migration
development - it is little surprise that they missed the main conversion
effort in Xen 4.13.

Fix it all up.

Move open_file_or_fd() into a new util.py to avoid duplication, making it a
more generic wrapper around open() or fdopen().
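
A wrapper of this kind might look roughly like the sketch below. It is a
guess at the general shape rather than the code in util.py: the signature,
the default buffering value and the rule that a decimal string names a file
descriptor are all assumptions.

```python
import os


def open_file_or_fd(val, mode, buffering=-1):
    """Open 'val' as a file descriptor if it is a decimal string,
    otherwise as a path. Returns a regular file object either way."""
    try:
        fd = int(val, 10)
    except ValueError:
        # Not a number: treat it as a filesystem path.
        return open(val, mode, buffering)
    # A number: wrap the already-open descriptor.
    return os.fdopen(fd, mode, buffering)
```

Callers can then accept either "/path/to/stream" or an inherited descriptor
such as "3" on the command line without caring which one they were given.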

In libxc.py, drop all long() conversion.  Python 2 will DTRT with int => long
promotion, even on 32bit builds.

In convert-legacy-stream, don't pass empty strings to write_record().  Join on
the empty argl will do the right thing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
5 years ago Merge branch 'livepatch.aws.v6' into staging
Konrad Rzeszutek Wilk [Fri, 20 Dec 2019 01:16:43 +0000 (20:16 -0500)]
Merge branch 'livepatch.aws.v6' into staging

* livepatch.aws.v6:
  livepatch: Add metadata runtime retrieval mechanism
  livepatch: Handle arbitrary size names with the list operation
  livepatch: Add support for modules .modinfo section metadata
  livepatch: Add support for inline asm livepatching expectations
  livepatch: Add per-function applied/reverted state tracking marker
  livepatch: Do not enforce ELF_LIVEPATCH_FUNC section presence
  livepatch: Add support for apply|revert action replacement hooks
  livepatch: Implement pre-|post- apply|revert hooks
  livepatch: Export payload structure via livepatch_payload.h
  livepatch: Allow to override inter-modules buildid dependency
  livepatch: Always check hypervisor build ID upon livepatch upload

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
5 years ago tools/python: Drop test.py
Andrew Cooper [Wed, 18 Dec 2019 12:43:48 +0000 (12:43 +0000)]
tools/python: Drop test.py

This file hasn't been touched since it was introduced in 2005 (c/s 0c6f36628)
and has a wildly obsolete shebang for Python 2.3.  Most important for us is
that it isn't Python 3 compatible.

Drop the file entirely.  Since the 2.3 days, automatic discovery of tests has
been included in standard functionality.  Rewrite the test rule to use
"$(PYTHON) -m unittest discover" which is equivalent.

Dropping test.py drops the only piece of ZPL-2.0 code in the tree.  Drop the
ancillary files, and adjust COPYING to match.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Lars Kurth <lars.kurth@citrix.com>
5 years ago x86/mem_sharing: cleanup code and comments in various locations
Tamas K Lengyel [Wed, 18 Dec 2019 19:40:41 +0000 (11:40 -0800)]
x86/mem_sharing: cleanup code and comments in various locations

No functional changes.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Further cleanup]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago tools/libxc: clean up memory sharing files
Tamas K Lengyel [Wed, 18 Dec 2019 19:40:40 +0000 (11:40 -0800)]
tools/libxc: clean up memory sharing files

No functional changes.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Wei Liu <wl@xen.org>
[Further cleanup]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86: provide Dom0 access to PPIN via XENPF_resource_op
Jan Beulich [Wed, 18 Dec 2019 13:49:59 +0000 (14:49 +0100)]
x86: provide Dom0 access to PPIN via XENPF_resource_op

It was requested that we provide a way independent of the MCE reporting
interface that Dom0 software could use to get hold of the values for
particular CPUs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86: include the PPIN in MCE records when available
Jan Beulich [Wed, 18 Dec 2019 13:49:10 +0000 (14:49 +0100)]
x86: include the PPIN in MCE records when available

Quoting the respective Linux commit:

    Intel Xeons from Ivy Bridge onwards support a processor identification
    number set in the factory. To the user this is a handy unique number to
    identify a particular CPU. Intel can decode this to the fab/production
    run to track errors. On systems that have it, include it in the machine
    check record. I'm told that this would be helpful for users that run
    large data centers with multi-socket servers to keep track of which CPUs
    are seeing errors.

Newer AMD CPUs support this too, at different MSR numbers.

Take the opportunity and hide __MC_NMSRS from the public interface going
forward.

[Linux commit 3f5a7896a5096fd50030a04d4c3f28a7441e30a5]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago tools/hotplug: Use ip on systems where brctl is not available
Steven Haigh [Wed, 18 Dec 2019 01:15:23 +0000 (12:15 +1100)]
tools/hotplug: Use ip on systems where brctl is not available

Newer distros like CentOS 8 do not have brctl available. As such, we
can't use it to configure networking anymore.

This patch will fall back to 'ip' or 'bridge' commands if brctl is not
available in the working PATH.
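
The selection logic can be sketched as below. The hotplug scripts themselves
are shell, so this Python version is only a model of the fallback: the
add_to_bridge_cmd() helper and its injectable 'which' parameter are
hypothetical, and only one operation (attaching a device to a bridge) is
shown.

```python
import shutil


def add_to_bridge_cmd(bridge, dev, which=shutil.which):
    """Return the argv for attaching 'dev' to 'bridge'.

    Prefers brctl when it is found in PATH and falls back to the
    equivalent iproute2 invocation otherwise. Nothing is executed here.
    """
    if which("brctl"):
        return ["brctl", "addif", bridge, dev]
    return ["ip", "link", "set", "dev", dev, "master", bridge]
```

Returning the command instead of running it keeps the choice easy to test on
machines that have neither tool installed.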

This would be a likely backport candidate to any version expected to be
built on CentOS 8 etc.

Signed-off-by: Steven Haigh <netwiz@crc.id.au>
Acked-by: Wei Liu <wl@xen.org>
5 years ago x86/S3: Expand macros in wakeup_prot.S
Andrew Cooper [Fri, 13 Dec 2019 17:56:02 +0000 (17:56 +0000)]
x86/S3: Expand macros in wakeup_prot.S

Most users have been dropped, and they do nothing but obfuscate the assembly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/S3: Restore cr4 later during resume
Andrew Cooper [Fri, 13 Dec 2019 17:56:40 +0000 (17:56 +0000)]
x86/S3: Restore cr4 later during resume

Just like the BSP/AP paths, %cr4 is loaded with only PAE.  Defer restoring all
of %cr4 (MCE in particular) until all the system structures (IDT/TSS in
particular) have been loaded.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/S3: Don't save unnecessary GPRs
Andrew Cooper [Fri, 13 Dec 2019 17:52:21 +0000 (17:52 +0000)]
x86/S3: Don't save unnecessary GPRs

Only the callee-preserved registers need saving/restoring.  Spill them to the
stack like regular functions do.  %rsp is now the only GPR which gets stashed
in .data

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago x86/S3: Don't bother saving %cr3, %ss or flags
Andrew Cooper [Fri, 13 Dec 2019 17:45:57 +0000 (17:45 +0000)]
x86/S3: Don't bother saving %cr3, %ss or flags

The trampoline has already set up the idle pagetables (which are the correct
ones to use), and sanitised the flags state.

For %ss, __HYPERVISOR_DS64 is the correct descriptor to restore.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/S3: Clarify and improve the behaviour of do_suspend_lowlevel()
Andrew Cooper [Fri, 13 Dec 2019 17:36:09 +0000 (17:36 +0000)]
x86/S3: Clarify and improve the behaviour of do_suspend_lowlevel()

do_suspend_lowlevel() behaves as a function call, even when the trampoline
jumps back into the middle of it.  Discuss this property, while renaming the
far-too-generic __ret_point to s3_resume.

Optimise the calling logic for acpi_enter_sleep_state().  $3 doesn't require a
64bit write, and the function isn't variadic so doesn't need to specify zero
FPU registers in use.

In the case of an acpi_enter_sleep_state() error, we didn't actually lose
state so don't need to restore it.  Jump straight to the end.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: Simplify promote_l4_table()'s exit semantics
Andrew Cooper [Tue, 9 Oct 2018 12:48:57 +0000 (13:48 +0100)]
x86/mm: Simplify promote_l4_table()'s exit semantics

promote_l4_table() is different from its lower level helpers, by having an
extra return path out of the middle of the loop in the case of a failure.

Break from the loop, which is consistent with the other helpers, and
functionally equivalent.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agoxen/page_alloc: statically allocate bootmem_region_list
Hongyan Xia [Tue, 17 Dec 2019 14:33:19 +0000 (14:33 +0000)]
xen/page_alloc: statically allocate bootmem_region_list

The existing code assumes that the first mfn passed to the boot
allocator is mapped, which creates problems when, e.g., we do not have
a direct map, and may create other bootstrapping problems in the
future. Make it static. The size is kept the same as before (1 page).
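
As a rough illustration of the static allocation described above, the sketch below sizes an array so that it occupies at most one page. The names (demo_bootmem_region, demo_region_list, DEMO_PAGE_SIZE) and the 4096-byte page size are assumptions made for the example, not the identifiers used in the tree.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* One free range of machine frames: [s, e). */
struct demo_bootmem_region {
    unsigned long s, e;
};

/* Statically sized to occupy at most one page, so no mapping is needed. */
static struct demo_bootmem_region
    demo_region_list[DEMO_PAGE_SIZE / sizeof(struct demo_bootmem_region)];
static unsigned int demo_nr_regions;

/* Record a free range; returns 0 on success, -1 if the list is full. */
static int demo_region_add(unsigned long s, unsigned long e)
{
    size_t max = sizeof(demo_region_list) / sizeof(demo_region_list[0]);

    assert(s < e);
    if (demo_nr_regions >= max)
        return -1;
    demo_region_list[demo_nr_regions].s = s;
    demo_region_list[demo_nr_regions].e = e;
    demo_nr_regions++;
    return 0;
}
```

Because the storage lives in the image itself, nothing has to be mapped before the first range is recorded.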

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Reviewed-by: Julien Grall <julien@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/arm: Basic support for sunxi/sun50i h6 platform.
Yangtao Li [Mon, 2 Dec 2019 08:49:24 +0000 (08:49 +0000)]
xen/arm: Basic support for sunxi/sun50i h6 platform.

Add compatible strings for H6 SoCs, specifically orangepi3.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Julien Grall <julien@xen.org>
5 years agolibxc/restore: Fix error message for unrecognised stream version
Andrew Cooper [Tue, 17 Dec 2019 13:49:56 +0000 (13:49 +0000)]
libxc/restore: Fix error message for unrecognised stream version

The Expected and Got values are rendered in the wrong order.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
5 years agogolang/xenlight: implement keyed union C to Go marshaling
Nick Rosbrook [Mon, 16 Dec 2019 18:08:10 +0000 (18:08 +0000)]
golang/xenlight: implement keyed union C to Go marshaling

Switch over union key to determine how to populate 'union' in Go struct.

Since the unions of C types cannot be directly accessed in cgo, use a
typeof trick to typedef a struct in the cgo preamble that is analogous
to each inner struct of a keyed union. For example, to define a struct
for the hvm inner struct of libxl_domain_build_info, do:

  typedef typeof(((struct libxl_domain_build_info *)NULL)->u.hvm) libxl_domain_build_info_type_union_hvm;
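
The same trick can be tried in plain C on a stand-in type. The sketch below is only illustrative: struct demo_build_info and its fields are hypothetical, and it uses the GNU `__typeof__` spelling, which GCC and Clang accept as an extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct carrying a keyed union. */
struct demo_build_info {
    int type;                       /* the union key */
    union {
        struct { int firmware_kind; unsigned long mmio_hole_kb; } hvm;
        struct { int slack_kb; } pv;
    } u;
};

/* Name the otherwise anonymous inner struct without evaluating anything. */
typedef __typeof__(((struct demo_build_info *)NULL)->u.hvm)
    demo_build_info_type_union_hvm;

/* Copy out the hvm member using the newly named type. */
static demo_build_info_type_union_hvm
demo_copy_hvm(const struct demo_build_info *info)
{
    assert(info != NULL);
    return info->u.hvm;
}
```

The operand of `__typeof__` is never evaluated, so dereferencing the NULL pointer in the typedef is only a way of spelling the member's type.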

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: begin C to Go type marshaling
Nick Rosbrook [Mon, 16 Dec 2019 18:08:09 +0000 (18:08 +0000)]
golang/xenlight: begin C to Go type marshaling

Begin implementation of fromC marshaling functions for generated struct
types. This includes support for converting fields that are basic
primitive types such as string and integer types, nested anonymous
structs, nested libxl structs, and libxl built-in types.

This patch does not implement conversion of arrays or keyed unions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: remove no-longer used type MemKB
Nick Rosbrook [Mon, 16 Dec 2019 18:08:08 +0000 (18:08 +0000)]
golang/xenlight: remove no-longer used type MemKB

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: generate structs from the IDL
Nick Rosbrook [Mon, 16 Dec 2019 18:08:08 +0000 (18:08 +0000)]
golang/xenlight: generate structs from the IDL

Add struct and keyed union generation to gengotypes.py. For keyed unions,
use a method similar to gRPC's oneof to interpret C unions as Go types.
Meaning, for a given struct with a union field, generate a struct for
each sub-struct defined in the union. Then, define an interface of one
method which is implemented by each of the defined sub-structs. For
example:

  type domainBuildInfoTypeUnion interface {
          isdomainBuildInfoTypeUnion()
  }

  type DomainBuildInfoTypeUnionHvm struct {
      // HVM-specific fields...
  }

  func (x DomainBuildInfoTypeUnionHvm) isdomainBuildInfoTypeUnion() {}

  type DomainBuildInfoTypeUnionPv struct {
      // PV-specific fields...
  }

  func (x DomainBuildInfoTypeUnionPv) isdomainBuildInfoTypeUnion() {}

  type DomainBuildInfoTypeUnionPvh struct {
      // PVH-specific fields...
  }

  func (x DomainBuildInfoTypeUnionPvh) isdomainBuildInfoTypeUnion() {}

Then, remove existing struct definitions in xenlight.go that conflict
with the generated types, and modify existing marshaling functions to
align with the new type definitions. Notably, drop "time" package since
fields of type time.Duration are now of type uint64.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-factor Hwcap type implementation
Nick Rosbrook [Mon, 16 Dec 2019 18:08:07 +0000 (18:08 +0000)]
golang/xenlight: re-factor Hwcap type implementation

Re-define Hwcap as [8]uint32, and implement toC function. Also, re-name and
modify signature of toGo function to fromC.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-factor Uuid type implementation
Nick Rosbrook [Mon, 16 Dec 2019 18:08:06 +0000 (18:08 +0000)]
golang/xenlight: re-factor Uuid type implementation

Re-define Uuid as [16]byte and implement fromC, toC, and String functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define CpuidPolicyList builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:05 +0000 (18:08 +0000)]
golang/xenlight: define CpuidPolicyList builtin type

Define CpuidPolicyList as a string so that libxl_cpuid_parse_config can
be used in the toC function.

For now, fromC is a no-op since libxl does not support a way to read a
policy, modify it, and then give it back to libxl.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define EvLink builtin as empty struct
Nick Rosbrook [Mon, 16 Dec 2019 18:08:05 +0000 (18:08 +0000)]
golang/xenlight: define EvLink builtin as empty struct

Define EvLink as empty struct as there is currently no reason the internal of
this type should be used in Go.

Implement fromC and toC functions as no-ops.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define MsVmGenid builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:04 +0000 (18:08 +0000)]
golang/xenlight: define MsVmGenid builtin type

Define MsVmGenid as [int(C.LIBXL_MS_VM_GENID_LEN)]byte and implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Mac builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:03 +0000 (18:08 +0000)]
golang/xenlight: define Mac builtin type

Define Mac as [6]byte and implement fromC, toC, and String functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define StringList builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:02 +0000 (18:08 +0000)]
golang/xenlight: define StringList builtin type

Define StringList as []string and implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-name Bitmap marshaling functions
Nick Rosbrook [Mon, 16 Dec 2019 18:08:01 +0000 (18:08 +0000)]
golang/xenlight: re-name Bitmap marshaling functions

Re-name and modify signature of toGo function to fromC. The reason for
using 'fromC' rather than 'toGo' is that it is not a good idea to define
methods on the C types. Also, add error return type to Bitmap's toC function.

Finally, as code-cleanup, re-organize the Bitmap type's comments as per
Go conventions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
--
Changes in v2:
- Use consistent variable naming for slice created from
  libxl_bitmap.

5 years agogolang/xenlight: define KeyValueList as empty struct
Nick Rosbrook [Mon, 16 Dec 2019 18:08:01 +0000 (18:08 +0000)]
golang/xenlight: define KeyValueList as empty struct

Define KeyValueList as empty struct as there is currently no reason for
this type to be available in the Go package.

Implement fromC and toC functions as no-ops.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Devid type as int
Nick Rosbrook [Mon, 16 Dec 2019 18:08:00 +0000 (18:08 +0000)]
golang/xenlight: define Devid type as int

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Defbool builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: define Defbool builtin type

Define Defbool as a struct analogous to the C type, and define the type
'defboolVal' that represents true, false, and default defbool values.

Implement Set, Unset, SetIfDefault, IsDefault, Val, and String functions
on Defbool so that the type can be used in Go analogously to how it's
used in C.
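
For readers unfamiliar with the tri-state idea, here is a minimal C sketch of such a type. All names and the numeric values chosen for the three states are assumptions for illustration; this is neither the libxl definition nor the Go code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state values; names and numbers are assumptions. */
enum demo_defbool_val {
    DEMO_DEFBOOL_FALSE   = -1,
    DEMO_DEFBOOL_DEFAULT = 0,
    DEMO_DEFBOOL_TRUE    = 1,
};

struct demo_defbool {
    enum demo_defbool_val val;
};

static void demo_defbool_set(struct demo_defbool *db, bool b)
{
    db->val = b ? DEMO_DEFBOOL_TRUE : DEMO_DEFBOOL_FALSE;
}

static void demo_defbool_unset(struct demo_defbool *db)
{
    db->val = DEMO_DEFBOOL_DEFAULT;
}

static bool demo_defbool_is_default(const struct demo_defbool *db)
{
    return db->val == DEMO_DEFBOOL_DEFAULT;
}

/* Only assign when no explicit value has been chosen yet. */
static void demo_defbool_setdefault(struct demo_defbool *db, bool b)
{
    if (demo_defbool_is_default(db))
        demo_defbool_set(db, b);
}

/* Reading a value is only meaningful once it is no longer "default". */
static bool demo_defbool_val(const struct demo_defbool *db)
{
    assert(!demo_defbool_is_default(db));
    return db->val == DEMO_DEFBOOL_TRUE;
}
```

The point of the third state is that a later "set if default" cannot override a value the user chose explicitly.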

Finally, implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: generate enum types from IDL
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: generate enum types from IDL

Introduce gengotypes.py to generate Go code from the IDL. As a first step,
implement 'enum' type generation.

As a result of the newly-generated code, remove the existing, and now
conflicting definitions in xenlight.go. In the case of the Error type,
rename the slice 'errors' to 'libxlErrors' so that it does not conflict
with the standard library package 'errors.' And, negate the values used
in 'libxlErrors' since the generated error values are negative.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86emul: correct far branch handling for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:37:09 +0000 (17:37 +0100)]
x86emul: correct far branch handling for 64-bit mode

AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.
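
One way to read the rule above is as a small decision function. The sketch below is an interpretation of this description only; the function name and its boolean parameters are hypothetical and do not correspond to the emulator's internals.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Operand size in bytes for a far branch in 64-bit mode, per the text above.
 * All names here are hypothetical; this is not the emulator's code.
 */
static unsigned int demo_far_op_bytes(bool rex_w, bool opsize_66, bool intel)
{
    if (rex_w)
        /* REX.W always cancels 0x66; only Intel widens to 64 bits. */
        return intel ? 8 : 4;
    return opsize_66 ? 2 : 4;
}
```

Note that with both REX.W and 0x66 present, the AMD-style result is 4 bytes rather than 2, which is the case the message singles out.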

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agopublic/io/ring.h: add FRONT/BACK_RING_ATTACH macros
Paul Durrant [Mon, 16 Dec 2019 16:36:37 +0000 (17:36 +0100)]
public/io/ring.h: add FRONT/BACK_RING_ATTACH macros

The version of this header present in the Linux source tree has contained
such macros for some time. These macros, as the names imply, allow front
or back rings to be set up for existent (rather than freshly created and
zeroed) shared rings.

This patch is to update this, the canonical version of the header, to
match the latest definition of these macros in the Linux source.

NOTE: The way the new macros are defined allows the FRONT/BACK_RING_INIT
      macros to be re-defined in terms of them, thereby reducing
      duplication.
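
The relationship between "attach" and "init" can be sketched on a much-simplified ring. The structures and macro names below are hypothetical stand-ins, and the argument lists of the real macros may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much-simplified shared ring and front-end view. */
struct demo_sring {
    uint32_t req_prod, rsp_prod;
};

struct demo_front_ring {
    uint32_t req_prod_pvt;   /* private request producer index */
    uint32_t rsp_cons;       /* response consumer index */
    unsigned int nr_ents;
    struct demo_sring *sring;
};

/* Attach to an existing shared ring, resuming at index i. */
#define DEMO_FRONT_RING_ATTACH(r, s, i, n) do { \
    (r)->req_prod_pvt = (i);                    \
    (r)->rsp_cons = (i);                        \
    (r)->nr_ents = (n);                         \
    (r)->sring = (s);                           \
} while (0)

/* Initialising a fresh (zeroed) ring is just attaching at index 0. */
#define DEMO_FRONT_RING_INIT(r, s, n) DEMO_FRONT_RING_ATTACH(r, s, 0, n)
```

Expressing INIT through ATTACH keeps a single place where the private indexes are set up.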

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
5 years agox86emul: correct LFS et al handling for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:35:50 +0000 (17:35 +0100)]
x86emul: correct LFS et al handling for 64-bit mode

AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: correct segment override decode for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:34:46 +0000 (17:34 +0100)]
x86emul: correct segment override decode for 64-bit mode

The legacy / compatibility mode ES, CS, SS, and DS overrides are fully
ignored prefixes in 64-bit mode, i.e. they in particular don't cancel an
earlier FS or GS one. (They don't violate the REX-prefix-must-be-last
rule though.)
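
The decode rule can be sketched as a scan over the legacy prefix bytes. The function and enum below are hypothetical; only the prefix byte values (0x26 ES, 0x2e CS, 0x36 SS, 0x3e DS, 0x64 FS, 0x65 GS) are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_seg { DEMO_SEG_NONE, DEMO_SEG_ES, DEMO_SEG_CS, DEMO_SEG_SS,
                DEMO_SEG_DS, DEMO_SEG_FS, DEMO_SEG_GS };

/* Return the effective segment override after scanning legacy prefixes. */
static enum demo_seg demo_seg_override(const uint8_t *pfx, size_t n,
                                       bool mode_64bit)
{
    enum demo_seg seg = DEMO_SEG_NONE;
    size_t i;

    assert(pfx != NULL || n == 0);
    for (i = 0; i < n; i++) {
        switch (pfx[i]) {
        /* Ignored in 64-bit mode: they must not cancel FS or GS. */
        case 0x26: if (!mode_64bit) seg = DEMO_SEG_ES; break;
        case 0x2e: if (!mode_64bit) seg = DEMO_SEG_CS; break;
        case 0x36: if (!mode_64bit) seg = DEMO_SEG_SS; break;
        case 0x3e: if (!mode_64bit) seg = DEMO_SEG_DS; break;
        case 0x64: seg = DEMO_SEG_FS; break;
        case 0x65: seg = DEMO_SEG_GS; break;
        default: break; /* other prefixes do not affect the segment */
        }
    }
    return seg;
}
```

For the prefix sequence FS then CS, the sketch yields FS in 64-bit mode and CS otherwise.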

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/time: drop vtsc_{kern, user}count debug counters
Igor Druzhinin [Fri, 13 Dec 2019 22:48:01 +0000 (22:48 +0000)]
x86/time: drop vtsc_{kern, user}count debug counters

They either need to be transformed to atomics to work correctly
(currently they are left unprotected for HVM domains) or dropped entirely
as taking a per-domain spinlock is too expensive for high-vCPU count
domains even for debug build given this lock is taken too often.

Choose the latter as they are not extremely important anyway.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/pv: Fix `global-pages` to match the documentation
Andrew Cooper [Mon, 16 Dec 2019 13:58:45 +0000 (13:58 +0000)]
x86/pv: Fix `global-pages` to match the documentation

c/s 5de961d9c09 "x86: do not enable global pages when virtualized on AMD or
Hygon hardware" in fact does.  Fix the calculation in pge_init().

While fixing this, adjust the command line documentation, first to use the
newer style, and to expand the description to discuss cases where the option
might be useful to use, but Xen can't account for by default.

Fixes: 5de961d9c09 ('x86: do not enable global pages when virtualized on AMD or Hygon hardware')
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: More descriptive names for page de/validation functions
George Dunlap [Thu, 12 Dec 2019 15:57:51 +0000 (15:57 +0000)]
x86/mm: More descriptive names for page de/validation functions

The functions alloc_page_type(), alloc_lN_table(), free_page_type()
and free_lN_table() are confusingly named: nothing is being allocated
or freed.  Rather, the page being passed in is being either validated
or devalidated for use as the specific type; in the specific case of
pagetables, these may be promoted or demoted (i.e., grab appropriate
references for PTEs).

Rename alloc_page_type() and free_page_type() to validate_page() and
devalidate_page().  Also rename alloc_segdesc_page() to
validate_segdesc_page(), since this is what it's doing.

Rename alloc_lN_table() and free_lN_table() to promote_lN_table() and
demote_lN_table(), respectively.

After this change:
- get / put type consistently refer to increasing or decreasing the count
- validate / devalidate consistently refers to actions done when a
type count goes 0 -> 1 or 1 -> 0
- promote / demote consistently refers to acquiring or freeing
resources (in the form of type refs and general references) in order
to allow a page to be used as a pagetable.
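
The first two terms in the list above can be illustrated with a toy reference count. The struct and function names are hypothetical, and the counters merely stand in for the work validate_page() and devalidate_page() would do.

```c
#include <assert.h>

/* Hypothetical page with a type reference count and event counters. */
struct demo_page {
    unsigned int type_count;
    unsigned int validations, devalidations;
};

/* "get": take a type reference; validate only on the 0 -> 1 transition. */
static void demo_get_type(struct demo_page *pg)
{
    if (pg->type_count++ == 0)
        pg->validations++;      /* stands in for validate_page() */
}

/* "put": drop a type reference; devalidate only on the 1 -> 0 transition. */
static void demo_put_type(struct demo_page *pg)
{
    assert(pg->type_count > 0);
    if (--pg->type_count == 0)
        pg->devalidations++;    /* stands in for devalidate_page() */
}
```

Two gets followed by two puts validate once and devalidate once, however many references were held in between.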

No functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Use mfn_t in type get / put call tree
George Dunlap [Fri, 13 Dec 2019 14:09:46 +0000 (14:09 +0000)]
x86/mm: Use mfn_t in type get / put call tree

Replace `unsigned long` with `mfn_t` as appropriate throughout
alloc/free_lN_table, get/put_page_from_lNe, and
get_lN_linear_pagetable.  This obviates the need for a load of
`mfn_x()` and `_mfn()` casts.
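
The benefit of a dedicated frame-number type can be sketched with a struct wrapper. The demo_ names below are hypothetical and only mimic the shape of `_mfn()` and `mfn_x()`; the tree's definition may differ.

```c
#include <assert.h>

/* Hypothetical type-safe frame number: a struct wrapper, not a bare integer. */
typedef struct { unsigned long m; } demo_mfn_t;

/* Wrap a raw value. */
static inline demo_mfn_t demo_mfn(unsigned long m)
{
    return (demo_mfn_t){ m };
}

/* Unwrap to the raw value. */
static inline unsigned long demo_mfn_x(demo_mfn_t mfn)
{
    return mfn.m;
}

/* Arithmetic helper so callers need not unwrap and rewrap by hand. */
static inline demo_mfn_t demo_mfn_add(demo_mfn_t mfn, unsigned long i)
{
    return demo_mfn(demo_mfn_x(mfn) + i);
}
```

Once a whole call tree takes the wrapped type, the value is wrapped once at the top and the compiler rejects accidental mixing with other integer quantities.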

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Use a more descriptive name for pagetable mfns
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Use a more descriptive name for pagetable mfns

In many places, a PTE being modified is accompanied by the pagetable
mfn which contains the PTE (primarily in order to be able to maintain
linear mapping counts).  In many cases, this mfn is stored in the
nondescript variable (or argument) "pfn".

Replace these names with lNmfn, to indicate that 1) this is a
pagetable mfn, and 2) that it is the same level as the PTE in
question.  This should be enough to remind readers that it's the mfn
containing the PTE.

No functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Implement common put_data_pages for put_page_from_l[23]e
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Implement common put_data_pages for put_page_from_l[23]e

Both put_page_from_l2e and put_page_from_l3e handle having superpage
entries by looping over each page and "put"-ing each one individually.
As with putting page table entries, this code is functionally
identical, but for some reason different.  Moreover, there is already
a common function, put_data_page(), to handle automatically swapping
between put_page() (for read-only pages) or put_page_and_type() (for
read-write pages).

Replace this with put_data_pages() (plural), which does the entire
loop, as well as the put_page / put_page_and_type switch.
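
A minimal sketch of the combined loop and switch follows. The helpers are counters standing in for put_page() and put_page_and_type(), and the function signature is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters standing in for the real reference-dropping helpers. */
static unsigned int demo_plain_puts, demo_typed_puts;

static void demo_put_page(unsigned long frame)
{
    (void)frame;
    demo_plain_puts++;
}

static void demo_put_page_and_type(unsigned long frame)
{
    (void)frame;
    demo_typed_puts++;
}

/* Drop references on nr consecutive frames of a superpage mapping. */
static void demo_put_data_pages(unsigned long first, unsigned long nr,
                                bool writeable)
{
    unsigned long i;

    assert(nr > 0);
    for (i = 0; i < nr; i++) {
        if (writeable)
            demo_put_page_and_type(first + i); /* read-write mapping */
        else
            demo_put_page(first + i);          /* read-only mapping */
    }
}
```

Both superpage levels can then call the same helper, differing only in the number of frames they pass.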

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: Refactor put_page_from_l*e to reduce code duplication
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Refactor put_page_from_l*e to reduce code duplication

put_page_from_l[234]e have identical functionality for devalidating an
entry pointing to a pagetable.  But mystifyingly, they duplicate the
code in slightly different arrangements that make it hard to tell that
it's the same.

Create a new function, put_pt_page(), which handles the common
functionality; and refactor all the functions to be symmetric,
differing only in the level of pagetable expected (and in whether they
handle superpages).

Other than put_page_from_l2e() gaining an ASSERT it probably should
have had already, no functional changes.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agopublic/io/netif.h: document a mechanism to advertise carrier state
Paul Durrant [Fri, 13 Dec 2019 16:39:44 +0000 (16:39 +0000)]
public/io/netif.h: document a mechanism to advertise carrier state

This patch adds a specification for a 'carrier' node in xenstore to allow
a backend to notify a frontend of its virtual carrier/link state. E.g.
a backend that is unable to forward packets from the guest because it is
not attached to a bridge may wish to advertise 'no carrier'.

While in the area also fix an erroneous backend path description.

NOTE: This is purely a documentation patch. No functional change.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
5 years agoConfig.mk: Remove stray comment
Anthony PERARD [Thu, 12 Dec 2019 18:27:34 +0000 (18:27 +0000)]
Config.mk: Remove stray comment

This comment isn't about CONFIG_TESTS, but about SEABIOS_DIR that has
been removed.

Originally, the comment was added by 5f82d0858de1 ("tools: support
SeaBIOS. Use by default when upstream qemu is configured."), then
later the SEABIOS_DIR was removed by 14ee3c05f3ef ("Clone and build
Seabios by default") but that comment about the pain was left behind.
The commit that made CONFIG_TESTS painful was 85896a7c4dc7 ("build:
add autoconf to replace custom checks in tools/check").

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoConfig.mk: Remove unused setvar_dir macro
Anthony PERARD [Thu, 12 Dec 2019 18:27:33 +0000 (18:27 +0000)]
Config.mk: Remove unused setvar_dir macro

And remove all mention of it in docs. It hasn't been used since
9ead9afcb935 ("Add configure --with-sysconfig-leaf-dir=SUBDIR to set
CONFIG_LEAF_DIR").

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agolivepatch: Add metadata runtime retrieval mechanism
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:08:00 +0000 (10:08 +0000)]
livepatch: Add metadata runtime retrieval mechanism

Extend the livepatch list operation to fetch also payloads' metadata.
This is achieved by extending the sysctl list interface with 2 extra
guest handles:
* metadata     - an array of arbitrary size strings
* metadata_len - an array of metadata strings' lengths (uint32_t each)

Payloads' metadata is a string of arbitrary size and does not have an
upper bound limit. It may also vary in size between payloads.

In order to let the userland allocate enough space for the incoming
data add a metadata total size field to the list sysctl operation and
fill it with total size of all payloads' metadata.

Extend the libxc to handle the metadata back-to-back data transfers
as well as metadata length array data transfers.

The xen-livepatch userland tool is extended to always display the
metadata for each received module. The metadata is received with the
following format: key=value\0key=value\0...key=value\0. The format is
modified to the following one: key=value;key=value;...key=value.
The new format allows to easily parse the metadata for a given module
by a machine.
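
The format change described above amounts to replacing the inner NUL separators. The helper below is a hypothetical sketch of that conversion, done in place; it is not the tool's code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rewrite "k=v\0k=v\0...k=v\0" in place as "k=v;k=v;...k=v" followed by a
 * single terminating NUL. len counts every byte, including the final NUL.
 */
static void demo_metadata_to_line(char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    if (len == 0)
        return;
    assert(buf[len - 1] == '\0');
    /* Every NUL except the last one separates two entries. */
    for (i = 0; i + 1 < len; i++)
        if (buf[i] == '\0')
            buf[i] = ';';
}
```

After the call the buffer is an ordinary C string that can be printed or split on ';'.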

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Handle arbitrary size names with the list operation
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:59 +0000 (10:07 +0000)]
livepatch: Handle arbitrary size names with the list operation

The payloads' name strings can be of arbitrary size (typically small
with an upper bound of XEN_LIVEPATCH_NAME_SIZE).
Current implementation of the list operation interface allows to copy
names in the XEN_LIVEPATCH_NAME_SIZE chunks regardless of its actual
size and enforces space allocation requirements on userland tools.

To unify and simplify the interface, handle the name strings of
arbitrary size by copying them in adhering chunks to the userland.
In order to let the userland allocate enough space for the incoming
data add an auxiliary interface xc_livepatch_list_get_sizes() that
provides the current number of payload entries and the total size of
all name strings. This is achieved by extending the sysctl list
interface with an extra field: name_total_size.

The xc_livepatch_list_get_sizes() issues the livepatch sysctl list
operation with the nr field set to 0. In this mode the operation
returns the number of payload entries and calculates the total sizes
for all payloads' names.
When the sysctl operation is issued with a non-zero nr field (for
instance with a value obtained earlier with the prior call to the
xc_livepatch_list_get_sizes()) the new field name_total_size provides
the total size of actually copied data.
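
The two-step protocol (query sizes with nr set to 0, then fetch) can be sketched as below. demo_list() and the payload names are hypothetical stand-ins for the sysctl and its data; the real interface carries more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload names held by the "hypervisor" side of the sketch. */
static const char *const demo_payloads[] = { "xsa-1", "hotfix-long-name" };
#define DEMO_NR_PAYLOADS (sizeof(demo_payloads) / sizeof(demo_payloads[0]))

/*
 * With nr == 0: report the entry count and total name bytes (NULs included).
 * With nr != 0: copy up to nr names back-to-back and report bytes copied.
 */
static void demo_list(uint32_t nr, char *names, uint32_t *nr_out,
                      uint32_t *name_total_size)
{
    uint32_t i, limit = nr ? nr : (uint32_t)DEMO_NR_PAYLOADS;
    uint32_t total = 0;

    assert(nr == 0 || names != NULL);
    if (limit > DEMO_NR_PAYLOADS)
        limit = (uint32_t)DEMO_NR_PAYLOADS;
    for (i = 0; i < limit; i++) {
        size_t sz = strlen(demo_payloads[i]) + 1;

        if (nr)
            memcpy(names + total, demo_payloads[i], sz);
        total += (uint32_t)sz;
    }
    *nr_out = limit;
    *name_total_size = total;
}
```

A caller would first invoke it with nr set to 0, allocate name_total_size bytes, and then call again with the reported count.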

Extend the libxc to handle the name back-to-back data transfers.

The xen-livepatch tool is modified to start the list operation with a
call to the xc_livepatch_list_get_sizes() to obtain the actual number
of payloads as well as the necessary space for names.
The tool now always requests the actual number of entries and leaves
the preemption handling to the libxc routine. The libxc still returns
'done' and 'left' parameters with the same semantic allowing the tool
to detect anomalies and react to them. At the moment it is expected
that the tool receives the exact number of entries as requested.
The xen-livepatch tool has been also modified to handle the name
back-to-back transfers correctly.

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Add support for modules .modinfo section metadata
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:58 +0000 (10:07 +0000)]
livepatch: Add support for modules .modinfo section metadata

Having detailed livepatch metadata helps to properly identify module's
origin and version. It also allows to keep track of the history of
livepatch loads in the system (at least within dmesg buffer size
limits).

The livepatch metadata are embedded in a form of .modinfo section.
Each such section contains data of the following format:
key=value\0key=value\0...key=value\0

The .modinfo section may be generated and appended to the resulting
livepatch ELF file optionally as an extra step of a higher level
livepatch build system.

The metadata section pointer and the section length is stored in the
livepatch payload structure and is used to display the content upon
livepatch apply operation.

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Leonard Foerster <foersleo@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Add support for inline asm livepatching expectations
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:57 +0000 (10:07 +0000)]
livepatch: Add support for inline asm livepatching expectations

This is the initial implementation of the expectations enhancement
to improve inline asm livepatching.

Expectations are designed as optional feature, since the main use of
them is planned for inline asm livepatching. The flag enabled allows
to control the expectation state.
Each expectation has data and len fields that describe the data
that is expected to be found at a given patching (old_addr) location.
The len must not exceed the data array size. The data array size
follows the size of the opaque array, since the opaque array holds
the original data and therefore must match what is specified in the
expectation (if enabled).
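
An expectation check along these lines might look like the sketch below. The struct layout, the 31-byte array size and the function name are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_OPAQUE_SIZE 31 /* assumed size of the saved-original-bytes array */

/* Hypothetical expectation record attached to one patched function. */
struct demo_expect {
    bool enabled;
    uint8_t len;                       /* must not exceed sizeof(data) */
    uint8_t data[DEMO_OPAQUE_SIZE];    /* bytes expected at old_addr */
};

/* Returns 0 if the expectation holds (or is disabled), -1 otherwise. */
static int demo_check_expectation(const struct demo_expect *exp,
                                  const void *old_addr)
{
    assert(exp != NULL && old_addr != NULL);
    if (!exp->enabled)
        return 0;
    if (exp->len > sizeof(exp->data))
        return -1;
    return memcmp(old_addr, exp->data, exp->len) == 0 ? 0 : -1;
}
```

A disabled expectation always passes, which keeps the feature optional for payloads that do not need it.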

The payload structure is modified as each expectation structure is
part of the livepatch_func structure and hence extends the payload.

Each expectation is checked prior to the apply action (i.e. as late
as possible to check against the most current state of the code).

For the replace action a new payload's expectations are checked AFTER
all applied payloads are successfully reverted, but BEFORE new payload
is applied. That breaks the replace action's atomicity and in case of
an expectation check failure would leave a system with all payloads
reverted. That is obviously insecure. Use it with caution and act
upon replace errors!

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Add per-function applied/reverted state tracking marker
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:56 +0000 (10:07 +0000)]
livepatch: Add per-function applied/reverted state tracking marker

Livepatch only tracks an entire payload applied/reverted state. But,
with an option to supply the apply_payload() and/or revert_payload()
functions as optional hooks, it becomes possible to intermix the
execution of the original apply_payload()/revert_payload() functions
with their dynamically supplied counterparts.
It is important then to track the current state of every function
being patched and prevent situations of unintentional double-apply
or unapplied revert.

To support that, it is necessary to extend public interface of the
livepatch. The struct livepatch_func gets additional field holding
the applied/reverted state marker.
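
A per-function marker of this kind can be sketched as a two-state flag that guards both operations. The enum, struct and return values below are hypothetical and are not the ABI introduced here.

```c
#include <assert.h>
#include <stddef.h>

enum demo_func_state { DEMO_FUNC_NOT_APPLIED, DEMO_FUNC_APPLIED };

struct demo_func {
    enum demo_func_state applied;
    unsigned int patch_ops;  /* counts simulated patching operations */
};

/* Apply once; a second apply is reported (-1) rather than repeated. */
static int demo_apply_func(struct demo_func *f)
{
    assert(f != NULL);
    if (f->applied == DEMO_FUNC_APPLIED)
        return -1;
    f->patch_ops++;
    f->applied = DEMO_FUNC_APPLIED;
    return 0;
}

/* Revert only what is currently applied. */
static int demo_revert_func(struct demo_func *f)
{
    assert(f != NULL);
    if (f->applied != DEMO_FUNC_APPLIED)
        return -1;
    f->patch_ops++;
    f->applied = DEMO_FUNC_NOT_APPLIED;
    return 0;
}
```

With the marker kept per function, a custom hook and the default path can be mixed without patching the same function twice.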

To reflect the livepatch payload ABI change, bump the version flag
LIVEPATCH_PAYLOAD_VERSION up to 2.

[And also update the top of the design document]

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Do not enforce ELF_LIVEPATCH_FUNC section presence
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:55 +0000 (10:07 +0000)]
livepatch: Do not enforce ELF_LIVEPATCH_FUNC section presence

With default implementation the ELF_LIVEPATCH_FUNC section containing
all functions to be replaced or added must be part of the livepatch
payload, otherwise the payload is rejected (with -EINVAL).

However, with the extended hooks implementation, a livepatch may be
constructed of only hooks to perform certain actions without any code
to be added or replaced.
Therefore, do not always expect the functions section and allow it to
be missing, provided there is at least one section containing hooks
present. The functions section, when present in a payload, must be a
single, non-empty section.

Also check that each extended hooks section, when present, is a single,
non-empty section.

At least one of the functions or hooks section must be present in a
valid payload.
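
The acceptance rule can be condensed into a small predicate. The sketch below is a simplification under stated assumptions: it models only one hooks section, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_sec_summary {
    unsigned int count;  /* sections found with this name */
    size_t size;         /* size of the section, if count == 1 */
};

/* A section may be absent; if present it must be single and non-empty. */
static int demo_sec_ok(const struct demo_sec_summary *s)
{
    return s->count == 0 || (s->count == 1 && s->size > 0);
}

/* Returns 0 for an acceptable payload, -EINVAL otherwise. */
static int demo_check_payload(const struct demo_sec_summary *funcs,
                              const struct demo_sec_summary *hooks)
{
    assert(funcs != NULL && hooks != NULL);
    if (!demo_sec_ok(funcs) || !demo_sec_ok(hooks))
        return -EINVAL;
    if (funcs->count == 0 && hooks->count == 0)
        return -EINVAL;  /* at least one of the two must be present */
    return 0;
}
```

A hooks-only payload passes, while a payload with neither section, or with an empty or duplicated one, is rejected.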

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agolivepatch: Add support for apply|revert action replacement hooks
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:54 +0000 (10:07 +0000)]
livepatch: Add support for apply|revert action replacement hooks

By default, in the quiescing zone, a livepatch payload is applied with
the apply_payload() function and reverted with the revert_payload()
function. Both functions receive the payload struct pointer as a
parameter. They are also where the standard 'load' and 'unload' module
hooks are executed.

To increase the livepatching system's agility and provide a more
flexible long-term livepatch solution, allow the default apply and
revert action functions to be overridden with hook-like supplied
alternatives. The alternative functions are optional; when none is
supplied, the default functions are used.

Since the alternative functions have direct access to the livepatch
payload structure, they can better control the context in which the
'load' and 'unload' hooks execute, as well as the exact instruction
replacement workflow. They can also be easily extended to support
extra features in the future.

To simplify writing alternative functions, move the code responsible
for payload and livepatch region registration outside of the functions.
That way it is guaranteed that the registration step occurs even for
newly supplied functions.
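
A minimal sketch of the fallback logic described above. The struct
layout, the action_fn type and all function names here are assumptions
made for illustration and are not taken from the patch.

```c
#include <stddef.h>

struct payload;

/* Hypothetical signature shared by apply and revert actions. */
typedef int action_fn(struct payload *p);

struct payload {
    action_fn *apply_override;  /* optional; NULL selects the default */
    action_fn *revert_override; /* optional; NULL selects the default */
    int registered;             /* set by the common registration step */
    int applied;
};

static int default_apply(struct payload *p)  { p->applied = 1; return 0; }
static int default_revert(struct payload *p) { p->applied = 0; return 0; }

/* Example of a supplied alternative. */
static int custom_apply(struct payload *p)   { p->applied = 2; return 0; }

/* Registration lives in the caller, so it also runs for overrides. */
static int do_apply(struct payload *p)
{
    action_fn *fn = p->apply_override ? p->apply_override : default_apply;
    int rc = fn(p);

    if ( rc == 0 )
        p->registered = 1;
    return rc;
}

static int do_revert(struct payload *p)
{
    action_fn *fn = p->revert_override ? p->revert_override
                                       : default_revert;
    int rc = fn(p);

    if ( rc == 0 )
        p->registered = 0;
    return rc;
}
```

Keeping the registration step in do_apply() and do_revert() means a
supplied alternative such as custom_apply() cannot accidentally skip it.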

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Petre Eftime <epetre@amazon.com>
Reviewed-by: Martin Pohlack <mpohlack@amazon.com>
Reviewed-by: Norbert Manthey <nmanthey@amazon.com>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years ago livepatch: Implement pre-|post- apply|revert hooks
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:53 +0000 (10:07 +0000)]
livepatch: Implement pre-|post- apply|revert hooks

This implements 4 new livepatch module vetoing hooks that can
optionally be supplied along with modules.
The hooks that currently exist in the livepatch mechanism aren't agile
enough and have various limitations:
* run only from within a quiescing zone
* cannot conditionally prevent applying or reverting
* do not have access to the module context
To address these limitations the following has been implemented:
1) pre-apply hook
  runs before the apply action is scheduled for execution. Its main
  purpose is to prevent a livepatch from being applied when certain
  expected conditions aren't met or when mutating actions implemented
  in the hook fail or cannot be executed.

2) post-apply hook
  runs after the apply action has been executed and the quiescing zone
  exited. Its main purpose is to follow up on actions performed by the
  pre- hook when module application was successful, or to undo certain
  preparation steps of the pre- hook in case of a failure. The
  success/failure error code is provided to the post- hooks via the rc
  field of the payload structure.

3) pre-revert hook
  runs before the revert action is scheduled for execution. Its main
  purpose is to prevent a livepatch from being reverted when certain
  expected conditions aren't met or when mutating actions implemented
  in the hook fail or cannot be executed.

4) post-revert hook
  runs after the revert action has been executed and the quiescing zone
  exited. Its main purpose is to perform cleanup of all previously
  executed mutating actions in order to restore the original system
  state from before the current module application.
  The success/failure error code is provided to the post- hooks via
  the rc field of the payload structure.
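
The ordering of the apply-side hooks can be sketched as below. This is
a simplified model under stated assumptions: the struct fields other
than rc, the hook typedefs, and the force_rc knob used to simulate a
failing action are all hypothetical, and no real quiescing takes place.

```c
#include <stddef.h>

struct payload;
typedef int  pre_hook_fn(struct payload *p);  /* non-zero return vetoes */
typedef void post_hook_fn(struct payload *p); /* may inspect p->rc */

struct payload {
    pre_hook_fn  *pre_apply;  /* optional */
    post_hook_fn *post_apply; /* optional */
    int rc;                   /* result of the apply action */
    int applied;
    int prepared;             /* state a pre- hook might set up */
    int force_rc;             /* test knob: simulated action result */
};

/* Stand-in for the action that runs inside the quiescing zone. */
static int apply_action(struct payload *p)
{
    if ( p->force_rc == 0 )
        p->applied = 1;
    return p->force_rc;
}

static int livepatch_apply(struct payload *p)
{
    /* 1) The pre-apply hook may veto before anything is scheduled. */
    if ( p->pre_apply )
    {
        int rc = p->pre_apply(p);

        if ( rc )
            return rc;
    }

    /* 2) Run the action and record its result in the payload. */
    p->rc = apply_action(p);

    /* 3) The post-apply hook runs afterwards and can check p->rc. */
    if ( p->post_apply )
        p->post_apply(p);

    return p->rc;
}

/* Example hooks. */
static int pre_ok(struct payload *p)   { p->prepared = 1; return 0; }
static int pre_veto(struct payload *p) { (void)p; return -1; }
static void post_cleanup(struct payload *p)
{
    if ( p->rc )
        p->prepared = 0; /* undo the pre- hook's preparation on failure */
}
```

The revert path would follow the same shape with the pre-revert and
post-revert hooks in place of the apply ones.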

The replace action atomically performs the following actions:
- revert all applied modules
- apply a single replacement module.
With the vetoing hooks in place various inter-hook dependencies may
arise. Also, during the revert part of the operation certain vetoing
hooks may detect failing conditions that previously were satisfied.
That could in turn lead to a situation where the revert part must be
rolled back with all the pre- and post- hooks re-applied, which again
can't be guaranteed to always succeed.
The simplest response to this complication is to disallow the replace
action completely on modules with vetoing hooks.
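
One way to picture that restriction, as a sketch only: the field
names, the helper names and the -EINVAL error code are assumptions
chosen for illustration and may differ from what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <errno.h>

struct payload;
typedef int hook_fn(struct payload *p);

struct payload {
    hook_fn *pre_apply;   /* all four hooks are optional */
    hook_fn *post_apply;
    hook_fn *pre_revert;
    hook_fn *post_revert;
};

/* Example hook used only to populate a pointer. */
static int noop_hook(struct payload *p) { (void)p; return 0; }

static bool has_vetoing_hooks(const struct payload *p)
{
    return p->pre_apply || p->post_apply ||
           p->pre_revert || p->post_revert;
}

/* Refuse the replace action for any module carrying vetoing hooks. */
static int check_replace(const struct payload *p)
{
    /* Error code chosen for illustration only. */
    return has_vetoing_hooks(p) ? -EINVAL : 0;
}
```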

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Petre Eftime <epetre@amazon.com>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years ago livepatch: Export payload structure via livepatch_payload.h
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:52 +0000 (10:07 +0000)]
livepatch: Export payload structure via livepatch_payload.h

The payload structure will be used by the new hooks implementation and
therefore its definition has to be exported via the livepatch_payload
header.
The new hooks will make use of the payload structure fields and the
hooks' pointers will also be defined in the payload structure, so
the structure along with all field definitions needs to be available
to the code being patched in.

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Eslam Elnikety <elnikety@amazon.de>
Reviewed-by: Leonard Foerster <foersleo@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
5 years ago livepatch: Allow to override inter-modules buildid dependency
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:51 +0000 (10:07 +0000)]
livepatch: Allow to override inter-modules buildid dependency

By default Livepatch enforces the following buildid-based dependency
chain between livepatch modules:
  1) first module depends on given hypervisor buildid
  2) every consecutive module depends on previous module's buildid
This way proper livepatch stack order is maintained and enforced.
While this is important for production livepatches, it limits agility
and blocks the use of testing or debug livepatches. These kinds of
livepatch modules are typically expected to be loadable at any time,
irrespective of the current state of the module stack.

To enable testing and debug livepatches, allow the user to dynamically
ignore the inter-module dependency. In this case only the hypervisor
buildid match is verified and enforced.

To allow userland to pass additional parameters for livepatch actions,
add support for action flags.
Each of the apply, revert, unload and replace actions gets an
additional 32-bit parameter 'flags' where extra flags can be applied in
mask form.
Initially only one flag, '--nodeps', is added for the apply action.
This flag modifies the default buildid dependency check as described
above.
The global sysctl interface input flag parameter is defined with a
single corresponding flag macro:
  LIVEPATCH_ACTION_APPLY_NODEPS (1 << 0)
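
A sketch of how such a flag could gate the dependency check. Only the
macro name and value come from the text above; BUILD_ID_LEN, the struct
fields and check_deps() are hypothetical, and real build IDs are not
necessarily fixed-length.

```c
#include <stdint.h>
#include <string.h>
#include <errno.h>

#define LIVEPATCH_ACTION_APPLY_NODEPS (1 << 0)

#define BUILD_ID_LEN 20 /* assumed fixed length, for illustration */

struct payload {
    uint8_t xen_dep[BUILD_ID_LEN]; /* hypervisor build ID it was built for */
    uint8_t dep[BUILD_ID_LEN];     /* build ID of the module it stacks on */
};

/*
 * 'xen_id' is the running hypervisor's build ID; 'top_id' is the build
 * ID of the module currently on top of the stack.
 */
static int check_deps(const struct payload *p, const uint8_t *xen_id,
                      const uint8_t *top_id, uint32_t flags)
{
    /* The hypervisor build ID match is always enforced. */
    if ( memcmp(p->xen_dep, xen_id, BUILD_ID_LEN) )
        return -EINVAL;

    /* With the nodeps flag the inter-module check is skipped. */
    if ( flags & LIVEPATCH_ACTION_APPLY_NODEPS )
        return 0;

    return memcmp(p->dep, top_id, BUILD_ID_LEN) ? -EINVAL : 0;
}
```

Passing the flags as a mask leaves the remaining 31 bits free for
future per-action flags without changing the interface again.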

The userland xen-livepatch tool is modified to support the '--nodeps'
flag for the apply and load commands. A general mechanism is also added
so that more flags can be specified in the future for apply and other
actions.

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Eslam Elnikety <elnikety@amazon.de>
Reviewed-by: Petre Eftime <epetre@amazon.com>
Reviewed-by: Leonard Foerster <foersleo@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years ago livepatch: Always check hypervisor build ID upon livepatch upload
Pawel Wieczorkiewicz [Tue, 26 Nov 2019 10:07:50 +0000 (10:07 +0000)]
livepatch: Always check hypervisor build ID upon livepatch upload

This change is part of an independent stacked livepatch modules
feature. This feature allows bypassing dependencies between modules
upon loading, but still verifies that the Xen build ID matches.

In order to prevent (up)loading any livepatches built for a different
hypervisor version, as indicated by the Xen build ID, add a check that
the payload's build ID matches Xen's.

To achieve that, embed into every livepatch another section with a
dedicated hypervisor build id in it. After the payload is loaded and
the .livepatch.xen_depends section becomes available, perform the
check and reject the payload if there is no match.
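
The upload-time check might look roughly like this. It is a sketch:
struct build_id and check_xen_build_id() are hypothetical names, and
locating and parsing the .livepatch.xen_depends section is assumed to
have happened already.

```c
#include <stddef.h>
#include <string.h>
#include <errno.h>

/* Hypothetical view of a build ID as a byte range. */
struct build_id {
    const void *data;
    size_t len;
};

/*
 * 'payload_dep' is the build ID read from the payload's
 * .livepatch.xen_depends section; 'xen' is the running hypervisor's.
 */
static int check_xen_build_id(const struct build_id *payload_dep,
                              const struct build_id *xen)
{
    /* A missing or empty section cannot match. */
    if ( !payload_dep->data || payload_dep->len == 0 )
        return -EINVAL;

    /* Both the length and the bytes must be identical. */
    if ( payload_dep->len != xen->len ||
         memcmp(payload_dep->data, xen->data, xen->len) )
        return -EINVAL;

    return 0;
}
```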

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Eslam Elnikety <elnikety@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>