Andrew Cooper [Thu, 13 Dec 2018 17:01:24 +0000 (17:01 +0000)]
x86/svm: Drop enum instruction_index and simplify svm_get_insn_len()
Passing a 32-bit integer index into an array with entries containing less than
32 bits of data is wasteful, and creates an unnecessary error condition of
passing an out-of-range index.
The width of the X86EMUL_OPC() encoding is currently 20 bits for the
instructions used, which leaves room for a modrm byte. Drop opc_tab[]
entirely, and encode the expected opcode/modrm information directly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Brian Woods <brian.woods@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Thu, 13 Dec 2018 17:01:24 +0000 (09:01 -0800)]
x86/svm: Remove list functionality from __get_instruction_length_* infrastructure
The existing __get_instruction_length_from_list() has a single user
which uses the list functionality. That user however should be looking
specifically for INVD or WBINVD, as reported by the vmexit exit reason.
Modify svm_vmexit_do_invalidate_cache() to ask for the correct
instruction, and drop all list functionality from the helper.
Take the opportunity to rename it to svm_get_insn_len(), and drop the
IOIO length handling which has never been used.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Brian Woods <brian.woods@amd.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Thu, 31 Jan 2019 10:38:24 +0000 (11:38 +0100)]
x86emul: correct AVX512BW write masking checks
For VPSADBW this likely was a result of bad copy-and-paste.
For VPS{L,R}LDQ comment and code were not in line, but then again the
comment also wasn't fully updated from the AVX2 original it got cloned
from.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Thu, 31 Jan 2019 10:37:56 +0000 (11:37 +0100)]
tools: fix build dependency upon generated header(s)
Commit fd35f32b4b ("tools/x86emul: Use struct cpuid_policy in the
userspace test harnesses") didn't account for the dependencies of
cpuid-autogen.h to potentially change between incremental builds.
Putting the make invocation to produce the header together with the
directory tree creation therefore does not work. Introduce a separate
goal.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Wei Liu [Wed, 30 Jan 2019 13:55:55 +0000 (13:55 +0000)]
x86/pvh-boot: don't mandate validity of RSDP pointer
RSDP is not mandatory according to PVH spec. Remove the BUG_ON. The
guest (xen) will fall back to scanning if necessary.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooepr3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrii Anisov [Fri, 25 Jan 2019 17:06:02 +0000 (19:06 +0200)]
xen/arm: gic-vgic: Fix the assert condition in vgic_connect_hw_irq
Currently, the assert condition in vgic_connect_hw_irq does not
correspond to the comment above it, which results in hitting the assertion
on HW IRQ disconnection.
Fix the condition so it corresponds to the comment and allows IRQ
disconnection on debug builds.
Fixes: ec2a2f1 ("ARM: VGIC: factor out vgic_connect_hw_irq()") Signed-off-by: Andrii Anisov <andrii_anisov@epam.com> Suggested-by: Stefan Nuernberger <snu@amazon.de> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
[julieng: Reword the commit message] Acked-by: Julien Grall <julien.grall@arm.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Wei Liu [Tue, 29 Jan 2019 11:37:59 +0000 (11:37 +0000)]
libxl: correctly dispose of dominfo list in libxl_name_to_domid
Tamas reported ssid_label was leaked. Use the designated function to
free dominfo list to fix the leakage.
Reported-by: Tamas K Lengyel <tamas@tklengyel.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Tested-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Fri, 25 Jan 2019 16:23:46 +0000 (16:23 +0000)]
x86/hvm: Fix bit checking for CR4 and MSR_EFER
Before the cpuid_policy logic came along, %cr4/EFER auditing on migrate-in was
complicated, because at that point no CPUID information had been set for the
guest. Auditing against the host CPUID was better than nothing, but not
ideal.
Similarly at the time, PVHv1 lacked the "CPUID passed through from hardware"
behaviour which PV guests had, and PVH dom0 had to be special-cased to be able
to boot.
Order of information in the migration stream is still an issue (hence we still
need to keep the restore parameter to cope with a nested virt corner case for
%cr4), but since Xen 4.9, all domains start with a suitable CPUID policy,
which is a more appropriate upper bound than host_cpuid_policy.
Finally, reposition the UMIP logic as it is the only row out of order.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Tue, 22 Jan 2019 18:58:56 +0000 (18:58 +0000)]
x86/p2m: Drop erroneous #VE-enabled check in ept_set_entry()
Code clearing the "Suppress VE" bit in an EPT entry isn't necessarily running
in current context. In ALTP2M_external mode, it definitely is not, and in PV
context, vcpu_altp2m(current) acts upon the HVM union.
Even if we could sensibly resolve the target vCPU, it may legitimately not be
fully set up at this point, so rejecting the EPT modification would be buggy.
There is a path in hvm_hap_nested_page_fault() which explicitly emulates #VE
in the cpu_has_vmx_virt_exceptions case, so the -EOPNOTSUPP part of this
condition is also wrong.
Drop the !sve check entirely.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-acked-by: Juergen Gross <jgross@suse.com>
In order to solve it move the vioapic_hwdom_map_gsi outside of the
locked region in vioapic_write_redirent. vioapic_hwdom_map_gsi will
not access any of the vioapic fields, so there's no need to call the
function holding the hvm.irq_lock.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Julien Grall [Mon, 28 Jan 2019 11:50:25 +0000 (11:50 +0000)]
xen/arm: Implement workaround for Cortex-A76 erratum 1165522
Early versions of Cortex-A76 can end up with corrupt TLBs if they
speculate an AT instruction while the S1/S2 system registers are in an
inconsistent state.
This can happen during guest context switch and when invalidating the
TLBs for a VMID other than the current one.
The workaround implemented in Xen will:
- Use an empty stage-2 with a reserved VMID while context switching
between 2 guests
- Use an empty stage-2 with the VMID where TLBs need to be flushed
Julien Grall [Mon, 28 Jan 2019 11:50:24 +0000 (11:50 +0000)]
xen/arm: p2m: Only use isb() when it is necessary
The EL1 translation regime is out-of-context when running at EL2. This
means the processor cannot speculate memory accesses using the registers
associated to that regime.
An isb() is only needed if Xen is going to use the translation regime
before returning to the guest (exception returns will synchronize the
context).
Remove unnecessary isb() and document the ones left.
Julien Grall [Mon, 28 Jan 2019 11:50:23 +0000 (11:50 +0000)]
xen/arm: domain_build: Don't switch to the guest P2M when copying data
Until recently, kernel/initrd/dtb were loaded using guest VA, which
required temporarily restoring the P2M. This was reworked
in a series of commits (up to 9292086 "xen/arm: domain_build: Use
copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA.
This will also help a follow-up patch which will require
p2m_{save,restore}_state to work in pairs to work around an erratum.
Jan Beulich [Mon, 28 Jan 2019 16:40:39 +0000 (17:40 +0100)]
x86/AMD: flush TLB after ucode update
The increased number of messages (spec_ctrl.c:print_details()) within a
certain time window made me notice some slowness of boot time screen
output. Experimentally I've narrowed the time window to be from
immediately after the early ucode update on the BSP to the PAT write in
cpu_init(), which upon further investigation has an effect because of
the full TLB flush that's implied by that write.
For that reason, as a workaround, flush the TLB of the mapping of the
page that holds the blob. Note that flushing just a single page is
sufficient: As per verify_patch_size() patch size can't exceed 4k, and
the way xmalloc() works the blob can't be crossing a page boundary.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Brian Woods <brian.woods@amd.com> Release-acked-by: Juergen Gross <jgross@suse.com>
During instruction emulation, the cpuid instruction is emulated with
data that is controlled by the guest. As speculation might pass bound
checks, we have to ensure that no out-of-bound loads are possible.
To not rely on the compiler to perform value propagation, instead of
using the array_index_nospec macro, we replace the variable with the
constant to be propagated instead.
This commit is part of the SpectreV1+L1TF mitigation patch series.
When interacting with hpet, read and write operations can be executed
during instruction emulation, where the guest controls the data that
is used. As it is hard to predict the number of instructions that are
executed speculatively, we prevent out-of-bound accesses by using the
array_index_nospec function for guest specified addresses that should
be used for hpet operations.
We introduce another macro that uses the ARRAY_SIZE macro to block
speculative accesses. For arrays that are statically accessed, this macro
can be used instead of the usual macro. Using this macro results in more
readable code, and allows to modify the way this case is handled in a
single place.
This commit is part of the SpectreV1+L1TF mitigation patch series.
George Dunlap [Thu, 24 Jan 2019 17:48:27 +0000 (17:48 +0000)]
docs: Fix dm_restrict documentation
Remove "chatty" and redundant information from the xl man page;
restrict it to functional descriptions only, and point instead to
qemu-depriv.pandoc and SUPPORT.md as locations for "canonical"
information.
Add a man page entry for device_model_user.
Update qemu-deprivilege.pandoc:
Changes in missing feature list:
- Migration is functional
- But qdisk backends are not
Add a missing restriction list.
The following statements from the man page are dropped:
- Mentioning PV; PV guests never have a device model.
- Drop the confusing statement about stdvga and cirrus vga options.
- Re-used domain IDs are now handled.
- Device models should no longer be able to create world-readable
files on dom0's filesystem.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Thu, 24 Jul 2014 10:06:39 +0000 (11:06 +0100)]
xen/sched: Introduce domain_vcpu() helper
The progression of multi-vcpu support in Xen (originally a single pointer,
then an embedded d->vcpu[] array, then a dynamically allocated array) has
resulted in a large quantity of ad-hoc code for looking a vcpu up by id, and a
large number of ways that the toolstack can cause Xen to trip over a NULL
pointer. Some of this has been addressed in Xen 4.12, and work is ongoing.
Another property of looking a vcpu up by id is that it is frequently done in
unprivileged hypercall context, making it an attractive target for speculative
sidechannel attacks.
Introduce a helper to do the lookup correctly, and without speculative
interference. For performance reasons, it is useful not to have an smp_rmb()
in this helper on ARM, and luckily this is safe to do, because of the
serialisation offered by the global domlist lock.
As a minor change noticed when checking the safety of this construct, sanity
check during boot that idle->max_vcpus is a suitable upper bound for
idle->vcpu[].
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
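The lookup pattern described in the domain_vcpu() commit above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the struct layouts, index_mask_nospec() and domain_vcpu_sketch() are hypothetical names, and the mask computation assumes an arithmetic right shift of signed values and a size that fits in a long.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for Xen's structures. */
struct vcpu { unsigned int vcpu_id; };
struct domain {
    unsigned int max_vcpus;   /* upper bound for vcpu[] */
    struct vcpu **vcpu;       /* dynamically allocated array */
};

/*
 * All ones when idx < size, zero otherwise, computed without a branch.
 * Assumption: signed right shift is arithmetic (true for GCC and Clang).
 */
static inline unsigned long index_mask_nospec(unsigned long idx,
                                              unsigned long size)
{
    return ~(long)(idx | (size - 1 - idx)) >> (sizeof(long) * 8 - 1);
}

/* Look a vcpu up by id; out-of-range ids yield NULL. */
static struct vcpu *domain_vcpu_sketch(const struct domain *d,
                                       unsigned int vcpu_id)
{
    /* Clamp first, so a mispredicted bounds check still reads slot 0. */
    unsigned long idx = vcpu_id & index_mask_nospec(vcpu_id, d->max_vcpus);

    return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
}
```

The point of the clamp is that the array load uses an index which is in range even on a speculative path where the bounds check has been mispredicted.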
Andrew Cooper [Fri, 21 Dec 2018 17:23:32 +0000 (17:23 +0000)]
x86/pvh-dom0: Remove unnecessary function pointer call from modify_identity_mmio()
Function pointer calls are far more expensive in a post-Spectre world, and
this one doesn't need to be.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Fri, 7 Dec 2018 13:43:27 +0000 (13:43 +0000)]
xen/dom0: Add a dom0-iommu=none option
For development purposes, it is very convenient to boot Xen as a PVH guest,
with an XTF PV or PVH "dom0". The edit-compile-go cycle is a matter of
seconds, and you can reasonably insert printk() debugging in places which
would be completely infeasible when booting fully-fledged guests.
However, the PVH dom0 path insists on having a working IOMMU, which doesn't
exist when virtualised as a PVH guest, and isn't necessary for XTF anyway.
Introduce a developer mode to skip the IOMMU requirement.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 31 Dec 2018 14:06:52 +0000 (14:06 +0000)]
xen/dom0: Deprecate iommu_hwdom_inclusive and leave it disabled by default
This option is unique to x86 PV dom0's, but it is not sensible to have a
catch-all which blindly maps all non-RAM regions into the IOMMU.
The map-reserved option remains, and covers all the buggy firmware issues that
I am aware of. The two common cases are legacy USB keyboard emulation, and
the BMC mailbox used by vendor firmware in NICs/HBAs to report information
back to the iLO/iDRAC/etc for remote management purposes.
A specific advantage of this change is that x86 dom0's IOMMU setup is now
consistent between PV and PVH.
This change is not expected to have any impact, due to map-reserved remaining.
In the unlikely case that it does cause an issue, we should introduce other
map-$SPECIFIC options rather than re-introducing this catch-all.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 10 Dec 2018 21:29:10 +0000 (21:29 +0000)]
docs: Improve documentation and parsing for efi=
Update parse_efi_param() to use parse_boolean() for "rs", so it behaves
like other Xen booleans.
However, change "attr=uc" to not be a boolean. "no-attr=uc" is ambiguous and
shouldn't be accepted, but accept "attr=no" as an acceptable alternative.
Update the command line documentation for consistency.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Julien Grall [Fri, 30 Nov 2018 17:15:33 +0000 (17:15 +0000)]
xen/arm: gic: Make sure the number of interrupt lines is valid before using it
GICv2 and GICv3 support up to 1020 interrupts. However, the value computed
from GICD_TYPER.ITLinesNumber can be up to 1024. On GICv3, we will end up
writing to reserved registers that are right after the IROUTERs as the
value is not capped early enough.
Cap the number of interrupts as soon as we compute it so we know we can
safely use it afterwards.
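As an illustration of the capping described in the GIC commit above, a minimal sketch follows. The helper name gic_nr_lines() and the macro names are hypothetical; the only facts relied upon are that ITLinesNumber occupies the low 5 bits of GICD_TYPER and encodes 32 * (N + 1) lines.

```c
#include <stdint.h>

#define TYPER_ITLINES_MASK 0x1fu   /* GICD_TYPER.ITLinesNumber, bits [4:0] */
#define GIC_MAX_LINES      1020u   /* architectural limit for GICv2/GICv3 */

/* Compute the number of interrupt lines and cap it immediately. */
static unsigned int gic_nr_lines(uint32_t typer)
{
    unsigned int lines = 32u * ((typer & TYPER_ITLINES_MASK) + 1u);

    /* ITLinesNumber == 31 would give 1024, which is out of range. */
    return lines > GIC_MAX_LINES ? GIC_MAX_LINES : lines;
}
```

Capping at the point of computation means later loops over per-interrupt registers never need their own range check.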
Andrii Anisov [Wed, 23 Jan 2019 12:50:07 +0000 (14:50 +0200)]
arm/p2m: call iommu iotlb flush if iommu exists and enabled
Basing the decision on `need_iommu_pt_sync()` means we never call
`iommu_iotlb_flush()` for IOMMUs which share the P2M with the CPU.
So check `has_iommu_pt()` instead.
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com> Reviewed-by: Paul Durant <paul.durrant@citrix.com> Release-Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Julien Grall <julien.grall@arm.com>
Anthony PERARD [Wed, 16 Jan 2019 16:16:56 +0000 (16:16 +0000)]
docs: Fix all links to Xen man pages in html
Second try, this time also works for all links to xen-vbd-interface(7).
We no longer try to have pod2html generate relative links; instead
we do it ourselves.
First, we modify all links to man pages to have what looks like an
absolute URL, which pod2html will just write to the html output.
Absolute URLs in POD are in the form L<text|scheme:...> so let's just use
a scheme that isn't real, but easy to find in the resulting html output:
"relative:".
Then we fix the output by removing all the bogus "relative:" schemes, and
end up with nice relative links.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Anthony PERARD [Wed, 16 Jan 2019 16:16:57 +0000 (16:16 +0000)]
man: Highlight reference in xl-disk-configuration(5)
Provide a better way to see the link to a different manpage, with simple
words.
Suggested-by: Ian Jackson <ian.jackson@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Fri, 7 Dec 2018 13:43:27 +0000 (13:43 +0000)]
x86/dom0: Improve dom0= useability
Having a pvh boolean isn't ideal. If we gain a 3rd virtualisation mode,
what does `dom0=no-pvh` mean?
Change the syntax to be "dom0 = pv | pvh" which offers an option to more
obviously select PV mode. Hide both options behind the relevant
CONFIG_* settings, and default to PVH mode when CONFIG_PV is compiled
out.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Thu, 27 Dec 2018 18:40:19 +0000 (18:40 +0000)]
docs: Improve documentation and parsing for pci=
Alter parse_pci_param() to use parse_boolean(), so the sub options
behave like other Xen booleans.
Update the command line documentation for consistency.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Thu, 27 Dec 2018 18:40:19 +0000 (18:40 +0000)]
docs: Improve documentation and parsing for iommu=
Update parse_iommu_param() to uniformly use parse_boolean(), so the sub
booleans behave like other Xen boolean options. Reposition the
custom_param() to avoid a forward declaration of parse_iommu_param().
Rewrite the command line documentation almost from scratch, including
far more detail.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Fri, 7 Dec 2018 13:43:27 +0000 (13:43 +0000)]
docs: Improve documentation for dom0= and dom0-iommu=
Update to the latest metadata style, and discuss the options more
completely where appropriate.
Drop the redundant comment beside parse_dom0_param() - it is already out
of sync with the main documentation. Also drop the individual
documentation for deprecated options which refer to their newer
versions, for the same reason.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Razvan Cojocaru [Mon, 21 Jan 2019 11:13:22 +0000 (12:13 +0100)]
x86/vm_event: block interrupt injection for sync vm_events
Block interrupts (in vmx_intr_assist()) for the duration of
processing a sync vm_event (similarly to the strategy
currently used for single-stepping). Otherwise, attempting
to emulate an instruction when requested by a vm_event
reply may legitimately need to call e.g.
hvm_inject_page_fault(), which then overwrites the active
interrupt in the VMCS.
The sync vm_event handling path on x86/VMX is (roughly):
monitor_traps() -> process vm_event -> vmx_intr_assist()
(possibly writing VM_ENTRY_INTR_INFO) ->
hvm_vm_event_do_resume() -> hvm_emulate_one_vm_event()
(possibly overwriting the VM_ENTRY_INTR_INFO value).
This patch may also be helpful for the future removal
of may_defer in hvm_set_cr{0,3,4} and hvm_set_msr().
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Ian Jackson [Mon, 14 Jan 2019 14:59:37 +0000 (14:59 +0000)]
libxl: fix build (missing CLONE_NEWIPC) on astonishingly old systems
CLONE_NEWIPC was introduced in Linux 2.6.19, on the 29th of November
2006, which was 12 years, 1 month, and 14 days ago.
Nevertheless apparently some people are trying to build Xen on systems
whose kernel headers are that old. Placate these people by providing
a fallback #define for CLONE_NEWIPC.
The actual binary value will of course remain constant, because of the
kernel API promise, so this is and will be correct on all platforms
where the CLONE_NEWIPC is supported. (Even if for some reason we miss
the right #includes.)
Of course at runtime this value will not work on older kernels. It
will be rejected as unknown by anything except some pre-2.6.18
kernels. On those kernels we do not want to support dm_restrict, and
an attempt to use it will fail. It is OK for the failure to be a
messy EINVAL syscall failure. (The IPC namespace unshare is necessary
to avoid a suborned deprivileged qemu from causing trouble with shm,
sem, etc.)
On the very old kernels, the feature is totally out of scope.
(We are only interested, here, in making the build work, to avoid
blocking people who aren't using this feature.)
CC: Wei Liu <wei.liu2@citrix.com> CC: Juergen Gross <jgross@suse.com> CC: Jan Beulich <JBeulich@suse.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
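The fallback described in the CLONE_NEWIPC commit above could look roughly like the sketch below. This is illustrative only and not the hunk from the patch; the numeric value is the one used by the Linux kernel ABI, and the helper name ipc_unshare_flags() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc only exposes CLONE_* with this set */
#endif
#include <sched.h>

/*
 * Very old kernel headers lack CLONE_NEWIPC. The value is fixed by the
 * kernel ABI, so providing it by hand is safe for the build; at runtime
 * an old kernel simply rejects the flag.
 */
#ifndef CLONE_NEWIPC
#define CLONE_NEWIPC 0x08000000
#endif

/* Flags a deprivileged device model helper would pass to unshare(). */
static int ipc_unshare_flags(void)
{
    return CLONE_NEWIPC;
}
```

With this in place the build succeeds everywhere, and the only visible difference on an old kernel is a syscall failure if dm_restrict is actually requested.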
Ian Jackson [Mon, 14 Jan 2019 14:59:35 +0000 (14:59 +0000)]
docs/features/qemu-deprivilege.pandoc: No support with Linux <2.6.18
Some early kernels are known not to reject unknown flags to
unshare(). There may be other problems.
CC: Jan Beulich <JBeulich@suse.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Anthony PERARD [Tue, 15 Jan 2019 15:48:37 +0000 (15:48 +0000)]
docs: Fix links in html generation of man pages
Currently, all links to other man pages are sent to
http://man.he.net/man$mansection/$manpage, but that site doesn't have
Xen man pages, so all links to other Xen man pages are broken.
In order to fix that, this is going to be a bit complex.
First, we need to teach pod2html where other .pod files can be found,
otherwise it isn't going to make any links to our pages. This is done with
--podpath.
Second, pod2html doesn't actually understand our format
"$manpage.$mansection.pod". But instead of teaching it (which is
probably impossible) we are going to modify our .pod files in order to
tell pod2html which file to look for. This is done with the sed command
by transforming for example: "L<xl.conf(5)>" to "L<xl.conf(5)|xl.conf.5>".
Last but not least, in order to have relative links to the other
generated man pages, we are going against the rules: we are going to use
"--htmlroot=." so that pod2html doesn't prepend "/" to all "relative"
links. We are also going to `cd` into the "man" dir and set podpath to
"." so that pod2html is going to generate relative links to other pod
files in the form "./$man" instead of "man/$man" or "../$man" with other
combinations of options. The result of --podpath + --podroot can be checked
in pod2html's cache file "pod2html.tmp".
All of this is going to generate links in the form "./$html_manpage".
But all of this doesn't work for xen-vbd-interface(7), because it's not
a pod file... maybe we could generate pod2html's cache (pod2html.tmp)
file to add an entry.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Anthony PERARD [Tue, 15 Jan 2019 15:48:36 +0000 (15:48 +0000)]
man: Fix links in xl(1)
All links to other manpages should contain the man section number.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Andrew Cooper [Fri, 7 Dec 2018 13:43:27 +0000 (13:43 +0000)]
xen/cmdline: Fix buggy strncmp(s, LITERAL, ss - s) construct
When the command line parsing was updated to use const strings and no longer
tokenise with NUL characters, string matches could no longer be made with
strcmp().
Unfortunately, the replacement was buggy. strncmp(s, "opt", ss - s) matches
"o", "op" and "opt" on the command line, as ss - s may be shorter than the
passed literal. Furthermore, parse_bool() is affected by this, so substrings
such as "d", "e" and "o" are considered valid, with the latter being ambiguous
between "on" and "off".
Introduce a new strcmp-like function for the task, which looks for exact
string matches, but declares success when the NUL of the literal matches a
comma, colon or semicolon in the command line fragment.
No change to the intended parsing functionality, but fixes cases where a
partial string on the command line will inadvertently trigger options.
A few areas were more than just a trivial change:
* parse_irq_vector_map_param() gained some style corrections.
* parse_vpmu_params() was rewritten to use the normal list-of-options form,
rather than just fixing up parse_vpmu_param() and leaving the parsing
hard to follow.
* Instead of making the trivial fix of adding an explicit length check in
parse_bool(), use the length to select which token we search for, which
is more efficient than the previous linear search over all possible tokens.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <julien.grall@arm.com> Release-acked-by: Juergen Gross <jgross@suse.com>
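To make the strncmp() problem in the commit above concrete, here is a small sketch of the buggy construct next to a delimiter-aware comparison of the kind the message describes. Both function names are hypothetical and this is not the helper added by the patch; it only follows the stated rule that the literal's NUL may match a comma, colon or semicolon in the command line fragment.

```c
#include <string.h>

/* The buggy construct: only the first (ss - s) bytes are compared. */
static int buggy_match(const char *s, const char *ss, const char *lit)
{
    return strncmp(s, lit, ss - s) == 0;
}

/*
 * strcmp()-like comparison of a command line fragment against a literal.
 * Returns 0 on an exact match, where the end of the literal may line up
 * with NUL, ',', ':' or ';' in the fragment.
 */
static int frag_strcmp(const char *frag, const char *name)
{
    for ( ; ; frag++, name++ )
    {
        unsigned char f = *frag, n = *name;
        int res = f - n;

        /* Literal exhausted and the fragment sits on a delimiter. */
        if ( res && n == '\0' && (f == ',' || f == ':' || f == ';') )
            return 0;

        /* Mismatch, or both strings ended together. */
        if ( res || n == '\0' )
            return res;
    }
}
```

With the fragment "o,foo", buggy_match() accepts "opt" because only one byte is compared, whereas frag_strcmp() rejects it and still accepts "opt,foo".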
Coverity understandably complains that get_reaper_lock_and_uid leaks
the fd and hence open-file. But this is intentional: the lock becomes
owned by the child process as a whole, which is entirely the property
of libxl.
(The coding style here in this subprocess is a bit anomalous but it's
probably not worth it to convert get_reaper_lock_and_uid to `goto out'
style and have it explicitly return the fd number.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Fri, 11 Jan 2019 10:09:35 +0000 (03:09 -0700)]
libxl: fix build on rather old systems
CLONE_NEWIPC has been introduced in Linux 2.6.19 only (and into glibc
at around that time as well). Cope with it being undefined as well as
with the underlying kernel not knowing of it.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Anthony PERARD [Fri, 4 Jan 2019 13:53:21 +0000 (13:53 +0000)]
libxl_json: Remove libxl__json_object_append_to from header
It isn't possible to use libxl__json_object_append_to() outside of
libxl_json.c as there is no way to allocate a struct libxl__yajl_ctx.
So also remove libxl__yajl_ctx typedef from the internal header.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 25 Jul 2018 15:16:32 +0000 (16:16 +0100)]
libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp
The re-implementation is done because we want to be able to send the
file descriptor that QEMU can use to save its state. When QEMU is
restricted, it would not be able to write to a path.
This replaces both libxl__qmp_stop() and libxl__qmp_save().
qmp_qemu_check_version() was only used by libxl__qmp_save(), so it is
replaced by a version using libxl__ev_qmp instead.
Coding style fixed in libxl__domain_suspend_device_model() for the
return value.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 25 Jul 2018 15:03:09 +0000 (16:03 +0100)]
libxl: Change libxl__domain_suspend_device_model() to be async
This creates an extra step for the two call sites of the function.
libxl__domain_suspend_device_model() in this patch gets an extra error
variable (there is ret and rc), but ret goes away in the next patch.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
libxl_domain_soft_reset() hasn't been tested, as it doesn't appear to be
possible to call the function from xl.
Anthony PERARD [Thu, 31 May 2018 13:45:12 +0000 (14:45 +0100)]
libxl: QEMU startup sync based on QMP
This is only activated when dm_restrict=1, as explained in a previous
patch "libxl_dm: Pre-open QMP socket for QEMU"
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 22 Nov 2018 12:09:37 +0000 (12:09 +0000)]
libxl: Add dmss_init/dispose for libxl__dm_spawn_state
These two functions, dmss_init and dmss_dispose, need to be called to
initialise the private parts of a libxl__dm_spawn_state (dmss) as well
as dispose of them before giving back control to a caller.
There are 3 functions that can start using a dmss: the classic
libxl__spawn_local_dm, the one for stubdom libxl__spawn_stub_dm, and
libxl__spawn_qdisk_backend. But there are only 2 exit paths as
libxl__spawn_qdisk_backend is using libxl__spawn_local_dm functions.
These two new functions are empty but will be used shortly.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 31 May 2018 13:43:20 +0000 (14:43 +0100)]
libxl_dm: Pre-open QMP socket for QEMU
This patch moves the creation of the QMP unix socket from QEMU to libxl.
But libxl doesn't rely on this yet.
When starting QEMU with dm_restrict=1, pre-open the QMP socket before
exec'ing QEMU. That socket will be useful to find out if QEMU is ready, and
pre-opening it means that libxl can connect to it without waiting for
QEMU to create it.
The pre-opening is conditional, based on the use of dm_restrict
because it is using a new command line option of QEMU, and dm_restrict
support in QEMU is newer.
-chardev socket,fd=X is available with QEMU 2.12, since commit:
> char: allow passing pre-opened socket file descriptor at startup
> 0935700f8544033ebbd41e1f13cd528f8a58d24d
dm_restrict is available in QEMU 3.0.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
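The pre-opening step described in the commit above can be sketched with plain POSIX calls as below. This is an assumption-laden illustration, not libxl code: preopen_unix_listener() is a hypothetical name, and error handling is reduced to returning -1 with errno set.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Create a listening unix socket at 'path' and return its fd, or -1.
 * The fd is deliberately left without close-on-exec so that it can be
 * inherited by the exec'ed process and named in "-chardev socket,fd=X".
 */
static int preopen_unix_listener(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length checked above */

    unlink(path);                  /* drop a stale socket, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return fd;
}
```

Because the socket is already listening before the child is exec'ed, the parent can connect to it straight away; the connection is simply queued until the child starts accepting.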
Anthony PERARD [Thu, 26 Jul 2018 16:12:52 +0000 (17:12 +0100)]
libxl_exec: Add libxl__spawn_initiate_failure
This function can be used by users of libxl__spawn_* when they set up a
notification other than xenstore. The parent can already report success
via libxl__spawn_initiate_detach(), this new function can be used for
failure instead of waiting for the timeout.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 8 Nov 2018 17:38:19 +0000 (17:38 +0000)]
libxl_qmp: Implementation of libxl__ev_qmp_*
This patch implements the libxl__ev_qmp API documented in the previous
patch, "libxl: Design of an async API to issue QMP commands to QEMU".
Since this API is to interact with QEMU via the QMP protocol, it also
implements a QMP client. The specification for the QEMU Machine Protocol
(QMP) can be found in the QEMU repository at:
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/qmp-spec.txt
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ wei: fix build ] Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Anthony PERARD [Tue, 3 Jul 2018 09:29:17 +0000 (10:29 +0100)]
libxl: Design of an async API to issue QMP commands to QEMU
All the functions will be implemented in later patches.
This patch includes the API that libxl can use to send QMP commands to
QEMU.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ wei: fix build ] Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Anthony PERARD [Thu, 22 Nov 2018 18:38:39 +0000 (18:38 +0000)]
libxl: Add wrapper around libxl__json_object_to_json JSON
That wrapper is going to be used to safely log a json_object, as
libxl__json_object_to_json returns NULL on error. In the error case,
JSON() will return an invalid JSON string.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 31 Oct 2018 16:31:49 +0000 (16:31 +0000)]
libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK
This patch changes the behavior of libxl__sendmsg_fds to retry sendmsg on
an EINTR error and return an error on short writes.
This patch allows a caller of libxl__sendmsg_fds to deal with EWOULDBLOCK
and short writes. The function now requires that only 1 byte of data is
sent, so that when dealing with non-blocking fds an EWOULDBLOCK error
means that the fds haven't been sent yet. The current callers already send
only 1 byte.
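The retry and short-write rules described above can be sketched as
follows. This is a minimal illustration, not the libxl code: the helper
name send_one_fd() and its return convention (0 on success, -errno on
failure) are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not the libxl function): pass one file descriptor
 * over a unix socket together with exactly 1 byte of payload.
 * Returns 0 on success or a negative errno value.
 */
int send_one_fd(int sock, int fd_to_send)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr;                 /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    ssize_t r;

    assert(sock >= 0 && fd_to_send >= 0);

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    for (;;) {
        r = sendmsg(sock, &msg, 0);
        if (r >= 0)
            break;
        if (errno == EINTR)
            continue;       /* interrupted by a signal: simply retry */
        return -errno;      /* includes EAGAIN/EWOULDBLOCK: nothing sent */
    }

    /* Only 1 byte was requested, so anything else is a short write. */
    return r == 1 ? 0 : -EIO;
}
```

Because the payload is a single byte, a non-blocking caller that sees
-EWOULDBLOCK (or -EAGAIN) from this sketch can retry later.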
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ wei: fix build ] Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Fri, 11 Jan 2019 11:30:29 +0000 (12:30 +0100)]
tmem: default to off
As a short term alternative to deleting the code, default its building
to off (overridable in EXPERT mode only). Additionally make sure other
related baggage (LZO code) won't be carried when the option is off (with
TMEM scheduled to be deleted anyway, I didn't want to introduce a
separate Kconfig option to control the LZO compression code, and hence
CONFIG_TMEM is used directly there). Similarly I couldn't be bothered to
add actual content to the command line option doc for the two affected
options.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Razvan Cojocaru [Fri, 11 Jan 2019 11:28:49 +0000 (12:28 +0100)]
x86/p2m: fix p2m_finish_type_change()
finish_type_change() returns a negative int on error, but the
current code checks if ( !rc ). We also need to treat
finish_type_change()'s return codes cumulatively in the
success case (don't overwrite a 1 returned while processing
the hostp2m if processing an altp2m returns 0).
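The intended return code handling can be sketched as below. This is an
illustration only: combine_results() is a hypothetical helper that takes
the per-p2m return values as an array (hostp2m first), rather than
calling finish_type_change() itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: results[0] stands for the hostp2m return value,
 * the remaining entries for the altp2ms. Negative values are errors.
 */
int combine_results(const int *results, size_t n)
{
    int rc = 0;
    size_t i;

    assert(results != NULL && n > 0);

    for (i = 0; i < n; i++) {
        int ret = results[i];

        if (ret < 0)        /* errors are negative, so "!rc" misses them */
            return ret;
        if (ret > rc)       /* keep an earlier 1 when a later call returns 0 */
            rc = ret;
    }

    return rc;
}
```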
The breakage was introduced by commit 0fb4b58c8b
("x86/altp2m: fix display frozen when switching to a new view
early").
Properly indent the out: label while at it.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Sergey Dyasli [Wed, 9 Jan 2019 14:45:14 +0000 (15:45 +0100)]
mm/page_alloc: fix MEMF_no_dma allocations for single NUMA
Currently dma_bitsize is zero by default on single NUMA node machines.
This makes all alloc_domheap_pages() calls with MEMF_no_dma return NULL.
There is only 1 user of MEMF_no_dma: dom0_memflags, which are used
during memory allocation for Dom0. Failing allocations with the default
dom0_memflags are especially severe for the PV Dom0 case: they make
alloc_chunk() use a suboptimal 2MB allocation algorithm with a search
for higher memory addresses.
This can lead to an NMI watchdog timeout during PV Dom0 construction
on some machines, which can be worked around by specifying "dma_bits"
in Xen's cmdline manually.
Fix the issue by ignoring MEMF_no_dma in cases when dma_bitsize is zero,
which means there is no DMA zone. This shouldn't cause any issues for
Dom0 because alloc_heap_pages() will first use higher memory addresses
for satisfying memory allocation requests.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Wed, 9 Jan 2019 11:07:30 +0000 (11:07 +0000)]
docs: Fix output of man/xen-vbd-interface
In pandoc's markdown, a code block needs at least 4 spaces to be
recognized as such. This patch fixes the rendering of the description of
the encoding in the VBD interface so that [1] can be readable.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: rebase on top of staging ] Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monné [Tue, 8 Jan 2019 09:03:45 +0000 (10:03 +0100)]
x86/shim: only mark special pages as RAM in pvshim mode
When running Xen as a guest it's not necessary to mark such pages as
RAM because they won't be assigned to the initial domain memory map.
While there move the functions to the PV shim specific file and rename
them accordingly.
No functional change expected.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Mon, 17 Dec 2018 09:22:59 +0000 (09:22 +0000)]
x86/mm/p2m: stop checking for IOMMU shared page tables in mmio_order()
Now that the iommu_map() and iommu_unmap() operations take an order
parameter and elide flushing there's no strong reason why modifying MMIO
ranges in the p2m should be restricted to a 4k granularity simply because
the IOMMU is enabled but shared page tables are not in operation.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Mon, 17 Dec 2018 09:22:58 +0000 (09:22 +0000)]
iommu: elide flushing for higher order map/unmap operations
This patch removes any implicit flushing that occurs in the implementation
of map and unmap operations and adds new iommu_map/unmap() wrapper
functions. To maintain semantics of the iommu_legacy_map/unmap() wrapper
functions, these are modified to call the new wrapper functions and then
perform an explicit flush operation.
Because VT-d currently performs two different types of flush dependent upon
whether a PTE is being modified versus merely added (i.e. replacing a non-
present PTE) 'iommu flush flags' are defined by this patch and the
iommu_ops map_page() and unmap_page() methods are modified to OR the type
of flush necessary for the PTE that has been populated or depopulated into
an accumulated flags value. The accumulated value can then be passed into
the explicit flush operation.
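The accumulate-then-flush pattern described above can be sketched with a
toy page table. All names here (toy_pt, toy_map_page, FLUSHF_ADDED,
FLUSHF_MODIFIED) are invented for the illustration and are not the Xen
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag names, assumed for this sketch. */
enum {
    FLUSHF_ADDED    = 1 << 0,   /* a non-present PTE was populated */
    FLUSHF_MODIFIED = 1 << 1,   /* a present PTE was changed */
};

#define TOY_NR_ENTRIES 16

struct toy_pt {
    bool present[TOY_NR_ENTRIES];
    unsigned long mfn[TOY_NR_ENTRIES];
    unsigned int flushes;       /* counts explicit flush operations */
};

/* Map one page; OR the kind of flush it needs into *flush_flags. */
int toy_map_page(struct toy_pt *pt, unsigned int dfn, unsigned long mfn,
                 unsigned int *flush_flags)
{
    assert(pt != NULL && flush_flags != NULL);

    if (dfn >= TOY_NR_ENTRIES)
        return -1;

    *flush_flags |= pt->present[dfn] ? FLUSHF_MODIFIED : FLUSHF_ADDED;
    pt->present[dfn] = true;
    pt->mfn[dfn] = mfn;
    return 0;
}

/* Explicit flush: performed once, using the accumulated flags. */
void toy_flush(struct toy_pt *pt, unsigned int flush_flags)
{
    if (flush_flags)
        pt->flushes++;
}

/* Map a range with no implicit per-page flush, then flush once. */
int toy_map_range(struct toy_pt *pt, unsigned int dfn, unsigned long mfn,
                  unsigned int count)
{
    unsigned int flags = 0, i;
    int rc = 0;

    for (i = 0; i < count && !rc; i++)
        rc = toy_map_page(pt, dfn + i, mfn + i, &flags);

    toy_flush(pt, flags);
    return rc;
}
```

In this sketch a caller mapping 8 pages pays for one flush rather than 8.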
The ARM SMMU implementations of map_page() and unmap_page() currently
perform no implicit flushing and therefore the modified methods do not
adjust the flush flags.
NOTE: The per-cpu 'iommu_dont_flush_iotlb' is respected by the
iommu_legacy_map/unmap() wrapper functions and therefore this now
applies to all IOMMU implementations rather than just VT-d.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Julien Grall <julien.grall@arm.com> Acked-by: Brian Woods <brian.woods@amd.com>
Paul Durrant [Mon, 17 Dec 2018 09:22:57 +0000 (09:22 +0000)]
iommu: rename wrapper functions
A subsequent patch will add semantically different versions of
iommu_map/unmap() so, in advance of that change, this patch renames the
existing functions to iommu_legacy_map/unmap() and modifies all call-sites.
It also adjusts a comment that refers to iommu_map_page(), which was
renamed by a previous patch.
This patch is purely cosmetic. No functional change.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Paul Durrant [Mon, 17 Dec 2018 09:22:56 +0000 (09:22 +0000)]
amd-iommu: add flush iommu_ops
The iommu_ops structure contains two methods for flushing: 'iotlb_flush' and
'iotlb_flush_all'. This patch adds implementations of these for AMD IOMMUs.
The iotlb_flush method takes a base DFN and a (4k) page count, but the
flush needs to be done by page order (i.e. 0, 9 or 18). Because a flush
operation is fairly expensive to perform, the code calculates the minimum
order single flush that will cover the specified page range rather than
performing multiple flushes.
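The order calculation can be sketched as below. The function names are
hypothetical, and returning -1 to mean "no single flush of order 0, 9 or
18 covers the range" is an assumption of this sketch; what the caller
does in that case is outside what the text above describes.

```c
#include <assert.h>

/* Number of 2^order-page blocks touched by [dfn, dfn + page_count). */
static unsigned long flush_count(unsigned long dfn, unsigned long page_count,
                                 unsigned int order)
{
    unsigned long start = dfn >> order;
    unsigned long end = ((dfn + page_count - 1) >> order) + 1;

    assert(end > start);
    return end - start;
}

/*
 * Hypothetical helper: smallest order out of 0, 9 and 18 for which one
 * aligned block covers the whole range, or -1 if there is none.
 */
int min_flush_order(unsigned long dfn, unsigned long page_count)
{
    /* Empty or wrapping ranges cannot be covered by a single block. */
    if (page_count == 0 || dfn + page_count < dfn)
        return -1;

    if (page_count == 1)
        return 0;
    if (flush_count(dfn, page_count, 9) == 1)
        return 9;
    if (flush_count(dfn, page_count, 18) == 1)
        return 18;

    return -1;
}
```

Note that a 2-page range straddling a 2MB boundary (for example DFNs 511
and 512) needs order 18 here, since no single order 9 block contains both.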
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Brian Woods <brian.woods@amd.com>
Andrew Cooper [Wed, 2 Jan 2019 10:26:49 +0000 (10:26 +0000)]
docs/man: Fix/simplify generation of manpages
The original intent of this patch was to rename xen-vbd-interface.markdown.7
to xen-vbd-interface.pandoc.7 to remove the final markdown file from the docs/
tree.
The DOC_MANx lists are broken. They contain MANxSRC-y twice, the first half
with a partial %.pod substitution, and the second half with a partial
%.markdown substitution. This is also the root cause behind the filtering
activity in the uninstall-man$(i)-pages rule.
Furthermore, the logic for generating the manpage targets is unnecessarily
repetitive, owing to the layout of source files in the man/ directory.
Therefore, tackle the problem by renaming all of our manpage source files from
"$FORMAT.$SECTION" to "$SECTION.$FORMAT". For the two files xl.cfg.5 and xl.1,
which are preprocessed by autoconf to contain path information, this requires
updating configure.ac and .gitignore. The markdown to pandoc conversion is
performed as well, as it is also a straight rename.
An ancillary benefit of this renaming is that text editors stand a chance of
being able to work out the correct mode to use.
As for the makefile:
1) Break the MAN_SECTIONS list out of the GENERATE_MANPAGE_RULES loop, as we
are going to use it a second time.
2) Do away with the individual MANxSRC-y variables. Use a single list,
derived from all *.pod and *.pandoc files, with their format suffixes
removed.
3) Use a $(foreach ...) to generate the DOC_MANx lists, filling them with the
correct content.
4) The DOC_HTML and DOC_TXT can now include all manpages with a single
substitution, as they don't need to separate the manpages by
section-numbered-directory.
5) Fix up the filenames in the manpage metarule to match the renaming.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Wed, 2 Jan 2019 10:26:47 +0000 (10:26 +0000)]
docs/markdown: Switch to using pandoc, and fix underscore escaping
c/s a3a99df44 "docs/cmdline: Rewrite the cpuid_mask_* section" completely
forgot about how markdown gets rendered to HTML (as opposed to PDF), because
we use different translators depending on the destination format.
markdown and pandoc are very similar markup languages, but a couple of details
about pandoc cause it to have far more user-friendly inline markup.
Switch all markdown documents to be pandoc (so we are using a single
translator, and therefore a single flavour of markdown), which fixes the
rendered docs on xenbits.xen.org/docs.
While changing the format, fix the remainder of the escaped underscores in the
same manner as the previous patch. The two problem cases here are __LINE__
and __FILE__ where the first underscore still needs escaping.
In addition, dmop.markdown and dom0less.markdown were not previously processed,
as only .markdown files in the misc/ directory got considered.
dom0less.pandoc gets picked up automatically now, due to being in the
features/ directory, but designs/ needs adding to the pandoc directory list
for dmop.pandoc to get processed.
While editing in appropriate areas, take the opportunity to fix some markup to
match the surrounding style, and drop trailing whitespace.
No change in content - only formatting. This results in the text being easier
to read and grep.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Wed, 2 Jan 2019 10:26:45 +0000 (10:26 +0000)]
docs/pandoc: Don't escape underscores in the middle of text
Pandoc deliberately (and contrary to markdown) doesn't treat underscores in
the middle of normal text as emphasis markers, as this is almost always the
unhelpful interpretation.
For text which is emphasised using _, an underscore in the middle is
interpreted, but the emphasis marker can be switched to * instead.
One problem case is where we use {} globbing with identifier names, as it
counts as a word break. Therefore, we do need to retain the escaped
underscore immediately following a closing brace.
No change in content - only formatting. This results in the text being easier
to read and grep.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Roger Pau Monne [Fri, 21 Dec 2018 09:41:05 +0000 (10:41 +0100)]
x86/mm-locks: apply a bias to lock levels for control domain
paging_log_dirty_op function takes mm locks from a subject domain and
then attempts to perform copy to operations against the caller domain
in order to copy the result of the hypercall into the caller provided
buffer.
This works fine when the caller is a non-paging domain, but triggers a
lock order panic when the caller is a paging domain due to the fact
that at the point where the copy to operation is performed the subject
domain paging lock is locked, and the copy operation requires
locking the caller p2m lock which has a lower level.
Fix this limitation by adding a bias to the level of control domain mm
locks, so that the lower control domain mm lock always has a level
greater than the higher unprivileged domain lock level. This allows
locking the subject domain mm locks and then locking the control
domain mm locks, while keeping the same lock ordering and the changes
mostly confined to mm-locks.h.
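The effect of the bias can be sketched as below. The level values, the
LEVEL_MAX bound and the may_lock() rule (a lock may only be taken if its
level is not lower than the level currently held) are assumptions made
for the illustration, not the definitions in mm-locks.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical unbiased lock levels; LEVEL_MAX bounds all of them. */
enum {
    LEVEL_P2M    = 16,
    LEVEL_PAGING = 64,
    LEVEL_MAX    = 64,
};

/* Control domain locks are shifted above every unprivileged level. */
int biased_level(int level, bool is_control_domain)
{
    assert(level > 0 && level <= LEVEL_MAX);
    return level + (is_control_domain ? LEVEL_MAX : 0);
}

/* Assumed ordering rule: never take a lock of a lower level. */
bool may_lock(int held_level, int new_level)
{
    return new_level >= held_level;
}
```

With these numbers, taking the caller's p2m lock (16) while holding the
subject's paging lock (64) violates the ordering, whereas the biased
control domain p2m level (80) does not.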
Note that so far only this flow (locking a subject domain's locks and
then the control domain ones) has been identified, but not all
possible code paths have been inspected. Hence this solution attempts
to be a non-intrusive fix for the problem at hand, without discarding
further changes in the future if other valid code paths are found that
require more complex lock level ordering.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Lars Kurth [Mon, 10 Dec 2018 19:33:09 +0000 (11:33 -0800)]
CONTRIBUTING: Clarifications on how to handle license deviations
This patch makes a few clarifications which were discussed on
IRC recently.
Specifically:
- Highlight the principle that license deviations
should be brought to the attention of maintainers
- Add a requirement for GPLv2 compatibility
- Restructure the document to highlight use-cases for
"New components" and "Importing code" more clearly
- Add conventions and instructions for "New files"
Signed-off-by: Lars Kurth <lars.kurth@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Julien Grall <julien.grall@arm.com>
Anthony PERARD [Wed, 12 Dec 2018 14:53:46 +0000 (14:53 +0000)]
libxl_create: Re-order callbacks of initiate_domain_create
Callbacks should be in the order that they are going to be executed.
This patch fixes the initiate_domain_create callbacks, and also
reorders the callback prototypes. That way, it's easier to follow the
flow.
This patch:
- moves libxl__colo_restore_setup_done after domcreate_bootloader_done.
- moves domcreate_attach_devices after domcreate_devmodel_started.
No functional change.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Razvan Cojocaru [Tue, 18 Dec 2018 15:11:44 +0000 (17:11 +0200)]
x86/altp2m: add altp2m_vcpu_disable_notify
Allow altp2m users to disable #VE/VMFUNC alone. Currently it is
only possible to disable this functionality when we disable altp2m
completely; #VE/VMFUNC can only be enabled once per altp2m session.
In addition to making things complete, disabling #VE is also a
workaround for CFW116 ("When Virtualization Exceptions are Enabled,
EPT Violations May Generate Erroneous Virtualization Exceptions")
on Xeon E-2100 CPUs.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 28 Dec 2018 11:18:56 +0000 (12:18 +0100)]
x86/dom0: take alignment into account when populating p2m in PVH mode
The current code that allocates memory and populates the p2m for PVH Dom0
doesn't take the address alignment into account. This can lead to high
order allocations that start on a non-aligned address being broken
down into lower order entries on the p2m page tables.
Fix this by taking into account the p2m page sizes and alignment
requirements when allocating the memory and populating the p2m.
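One way to account for alignment is sketched below. It assumes the p2m
page sizes are 4k, 2M and 1G (orders 0, 9 and 18); pick_order() is a
hypothetical helper and not the function used by the Dom0 builder.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: largest order out of 18, 9 and 0 such that gfn
 * is aligned to it and at least that many pages remain to be populated.
 */
unsigned int pick_order(unsigned long gfn, unsigned long nr_pages)
{
    static const unsigned int orders[] = { 18, 9, 0 };
    size_t i;

    assert(nr_pages > 0);

    for (i = 0; i < sizeof(orders) / sizeof(orders[0]); i++) {
        unsigned long size = 1UL << orders[i];

        if (nr_pages >= size && !(gfn & (size - 1)))
            return orders[i];
    }

    return 0;   /* not reached for nr_pages > 0; order 0 always matches */
}
```

A caller would loop, allocating and mapping 1UL << pick_order(gfn, nr)
pages at a time, so that a region starting at an unaligned gfn uses small
entries only until the next aligned boundary.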
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Thu, 27 Dec 2018 15:26:35 +0000 (16:26 +0100)]
x86/dom0: allow stealing RAM from a region that starts in the low 1MB
As long as the memory stolen is always above 1MB. This allows the PVH
Dom0 builder to be used on a memory map that only has a single RAM
region starting at 0.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 20 Dec 2018 15:08:50 +0000 (15:08 +0000)]
x86/vtx: Improvements to ept= command line handling
Switch parse_ept_param() to use the parse_boolean() infrastructure for more
consistency with related command line parameters. Rename opt_pml_enabled to
opt_ept_pml for consistency with opt_ept_ad, and switch it to being bool.
Drop the leading comment for parse_ept_param(). It is stale, and just repeats
the command line documentation.
For the command line documentation, rewrite it largely from scratch, updating
to the latest metadata style. Document A/D first, including a note about
AVR41, and modify PML to note its dependency on A/D.
No practical changes to behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Razvan Cojocaru [Sat, 22 Dec 2018 09:43:52 +0000 (09:43 +0000)]
p2m: change_type_range: Only invalidate mapped gfns
change_type_range() invalidates gfn ranges to lazily change the type
of a range of gfns, and also modifies the logdirty rangesets of that
p2m. At the moment, it clips both down by the hostp2m.
While this will result in correct behavior, it's not entirely efficient,
since invalidated entries outside that range will, on fault, simply be
modified back to "empty" before faulting normally again.
Separate out the calculation of the two ranges. Keep using the
hostp2m's max_mapped_pfn to clip the logdirty ranges, but use the
current p2m's max_mapped_pfn to further clip the invalidation range
for alternate p2ms.
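The two clipping steps can be sketched as below. The struct and function
names are invented for the illustration; the sketch takes a non-inclusive
end and converts it to an inclusive value once.

```c
#include <assert.h>
#include <stdbool.h>

struct gfn_range {
    bool valid;
    unsigned long start;
    unsigned long last;         /* inclusive */
};

/* Clip [start, end) to max_mapped_pfn; end is non-inclusive. */
struct gfn_range clip_range(unsigned long start, unsigned long end,
                            unsigned long max_mapped_pfn)
{
    struct gfn_range r = { false, 0, 0 };
    unsigned long last;

    if (end <= start || start > max_mapped_pfn)
        return r;               /* nothing in scope */

    last = end - 1;             /* inclusive value, computed once */
    if (last > max_mapped_pfn)
        last = max_mapped_pfn;

    r.valid = true;
    r.start = start;
    r.last = last;
    assert(r.start <= r.last);
    return r;
}

/*
 * Further clip the (host-clipped) logdirty range by the current p2m's
 * own max_mapped_pfn to get the range worth invalidating.
 */
struct gfn_range invalidation_range(struct gfn_range logdirty,
                                    unsigned long p2m_max_mapped_pfn)
{
    if (!logdirty.valid)
        return logdirty;

    return clip_range(logdirty.start, logdirty.last + 1, p2m_max_mapped_pfn);
}
```

For an altp2m that has only mapped the first few gfns, the logdirty range
stays in sync with the hostp2m while the invalidation range shrinks.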
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Tested-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Razvan Cojocaru [Sat, 22 Dec 2018 09:43:52 +0000 (09:43 +0000)]
p2m: Always use hostp2m when clipping rangesets
The logdirty rangesets of the altp2ms need to be kept in sync with the
hostp2m. This means when iterating through the altp2ms, we need to
use the host p2m to clip the rangeset, not the individual altp2m's
value.
This change also:
- Documents that the end is non-inclusive
- Calculates an "inclusive" value for the end once, rather than
open-coding the modification, and (worse) back-modifying updates so
that the calculation ends up correct
- Clarifies the logic deciding whether to call
change_entry_type_global() or change_entry_type_range()
- Handles the case where start >= hostp2m->max_mapped_pfn
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Tested-by: Tamas K Lengyel <tamas@tklengyel.com>