Robert Richter [Wed, 12 Aug 2015 11:31:38 +0000 (13:31 +0200)]
pci, thunder, acpi: Fix ITS initialization for ACPI
Since PCI bridges are now enabled by the generic ACPI PCI code, the ITS
was no longer initialized, which led to bad requester IDs for the
ITS. Fix that by moving the ITS initialization to the ThunderX bridge
fixup. The setup is now done for both ACPI and devicetree.
This should also fix broken PCI root complexes from other vendors,
since the previous ACPI code enabled ThunderX requester IDs on all
systems without any device or vendor check.
It was root caused by Tomasz Nowicki <tn@semihalf.com>.
Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
IORT represents the IO topology used by
ARM-based systems. It describes how various components are connected
together, e.g. which devices are connected to a given ITS instance.
This patch implements calls that make it possible to:
- register/remove ITS as MSI chip
- parse all IORT nodes and form node tree (for easy lookup)
- find ITS (MSI chip) that device is assigned to
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Tomasz Nowicki [Tue, 28 Jul 2015 15:51:27 +0000 (17:51 +0200)]
pci, acpi, dma: Unify coherency checking logic for PCI devices.
ACPI spec 5.1 states that the value of _CCA is inherited by all
descendants of bus master devices, the root PCI bridge in this case.
So this patch checks whether the PCI device's root bridge has the coherency
flag set and then sets up DMA ops (in a similar way as DT does).
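As a rough illustration only (not the kernel code), the inheritance rule can be sketched in user-space C. The struct dev_node type, its cca field and dev_is_coherent() are made-up names, and treating a missing _CCA as non-coherent is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; cca is -1 when the node has no _CCA of its own. */
struct dev_node {
	struct dev_node *parent;
	int cca;	/* -1 = not specified, 0 = non-coherent, 1 = coherent */
};

/* Walk towards the root until a node specifies _CCA; descendants inherit it. */
static bool dev_is_coherent(const struct dev_node *dev)
{
	for (; dev != NULL; dev = dev->parent) {
		if (dev->cca >= 0)
			return dev->cca == 1;
	}
	return false;	/* assumption: never specified means non-coherent */
}
```

An endpoint below a bridge whose root specifies cca = 1 is reported as coherent without carrying any attribute itself.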
Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Tomasz Nowicki [Tue, 24 Mar 2015 03:31:58 +0000 (20:31 -0700)]
ARM64 / ACPI: Point KVM to the virtual timer interrupt when booting with ACPI
With ACPI enabled, kvm_timer_hyp_init can't access any device tree
information. Although registration of the virtual timer interrupt
already happened when architected timers were initialized, we need to
point KVM to the interrupt line used.
Signed-off-by: Alexander Spyridakis <a.spyridakis@virtualopensystems.com> Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Tomasz Nowicki [Fri, 12 Dec 2014 08:41:23 +0000 (09:41 +0100)]
arm64/acpi/pci: provide hook for MCFG fixups
Some MCFG tables may be broken or the underlying hardware may not
be fully compliant with the PCIe ECAM mechanism. This patch provides
a mechanism to override the default mmconfig read/write routines
and/or do other MCFG related fixups.
Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Tomasz Nowicki [Fri, 14 Nov 2014 10:01:16 +0000 (11:01 +0100)]
pci, acpi: Share ACPI PCI config space accessors.
MMCFG can be used perfectly for all architectures which support ACPI.
ACPI mandates MMCFG to describe PCI config space ranges which means
we should use MMCONFIG accessors by default.
The mmconfig_64.c version is going to be the default implementation for
arch-agnostic low-level direct PCI config space accessors via MMCONFIG.
However, it currently initializes the raw_pci_ext_ops pointer, which is used in
x86-specific code only. Moreover, mmconfig_32.c does the same thing
as well.
Move it to mmconfig_shared.c so it becomes common for both and
mmconfig_64.c becomes purely arch agnostic.
Tomasz Nowicki [Thu, 13 Nov 2014 14:48:24 +0000 (15:48 +0100)]
x86, acpi, pci: Move PCI config space accessors.
We are going to use the mmio_config_{} naming convention across all architectures.
Currently these accessors belong to the asm/pci_x86.h header, which should be included
only by x86-specific files. From now on, the accessors are in the asm/pci.h
header, which can be included in non-architecture code much more easily.
Tomasz Nowicki [Thu, 13 Nov 2014 10:59:18 +0000 (11:59 +0100)]
x86, acpi, pci: Move arch-agnostic MMCFG code out of arch/x86/ directory
The MMCFG table appears to be architecture independent, so it makes sense
to share common code across all architectures. Functions that may need
architecture-specific actions have a default (__weak) implementation.
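The weak-default pattern can be sketched as below. The function names are hypothetical and only illustrate the mechanism; the kernel's __weak macro expands to the GCC/Clang attribute used here.

```c
/*
 * Default implementation. If another object file provides a strong
 * definition of arch_mmcfg_setup(), the linker picks that one instead.
 * The name is made up for this sketch.
 */
__attribute__((weak)) int arch_mmcfg_setup(void)
{
	return 0;	/* nothing architecture-specific to do */
}

/* Common code calls the hook without knowing which version is linked. */
int mmcfg_init(void)
{
	return arch_mmcfg_setup();
}
```

With only this file linked, mmcfg_init() uses the default and returns 0.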
Tomasz Nowicki [Thu, 13 Nov 2014 10:54:53 +0000 (11:54 +0100)]
x86, acpi, pci: Reorder logic of pci_mmconfig_insert() function
This patch is the first step of the MMCONFIG refactoring process.
Code that uses pci_mmcfg_lock will be moved to a common file and become
accessible to all architectures. pci_mmconfig_insert() cannot be moved
so easily since it mixes generic mmcfg code with x86-specific logic
inside a mutually exclusive block guarded by pci_mmcfg_lock.
To get rid of that constraint we reorder the actions as follows:
1. mmconfig entry allocation can be done first, it does not need the lock
2. insertion into iomem_resource has its own lock, no need to wrap it in the mutex
3. insertion into the mmconfig list can be done as the final stage in a separate
function (candidate for further factoring)
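The three steps above can be sketched in user-space C as follows. This is not the x86 code: struct region, region_insert() and the pthread mutex standing in for pci_mmcfg_lock are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

struct region {			/* hypothetical stand-in for an mmconfig entry */
	unsigned long start, end;
	struct region *next;
};

static struct region *region_list;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Step 1: allocation touches no shared state, so it needs no lock. */
static struct region *region_alloc(unsigned long start, unsigned long end)
{
	struct region *r = malloc(sizeof(*r));

	if (r) {
		r->start = start;
		r->end = end;
		r->next = NULL;
	}
	return r;
}

/* Step 3: only the list update itself runs under the lock. */
static void region_list_add(struct region *r)
{
	pthread_mutex_lock(&list_lock);
	r->next = region_list;
	region_list = r;
	pthread_mutex_unlock(&list_lock);
}

static int region_insert(unsigned long start, unsigned long end)
{
	struct region *r = region_alloc(start, end);

	if (!r)
		return -1;
	/* Step 2 (resource registration, which has its own lock) would go here. */
	region_list_add(r);
	return 0;
}
```

Keeping the locked section down to the list update is what allows the list helper to move into common code on its own.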
Tomasz Nowicki [Wed, 21 May 2014 13:53:23 +0000 (15:53 +0200)]
GICv3: Refactor gic_of_init() of GICv3 driver to allow for FDT and ACPI initialization.
Isolate hardware abstraction (FDT) code to gic_of_init().
Rest of the logic goes to gic_init_bases() and expects well defined
data to initialize GIC properly. The same solution is used for GICv2 driver.
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Tomasz Nowicki [Thu, 8 Jan 2015 11:36:33 +0000 (12:36 +0100)]
arm64, acpi: Implement new "GIC version" field of MADT GIC entry.
There is no need to probe GICv2 and GICv3 sequentially. From now on,
we know GIC version in advance. Note this patch does not break backward
compatibility for machines which are compliant with ACPI spec. 5.1.
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Al Stone [Tue, 11 Nov 2014 00:11:20 +0000 (17:11 -0700)]
clocksource: arm_arch_timer: fix system hang
Arm allows for two possible architectural clock sources. One memory mapped
and the other coprocessor based. If both timers exist, then the driver waits
for both to be probed before registering a clocksource.
Commit c387f07e6205 ("clocksource: arm_arch_timer: Discard unavailable timers
correctly") attempted to fix a hang occurring when one of the two possible
timers had a device node, but was disabled. In that case, the second probe
would never occur and the system would hang without a clocksource being
registered.
Unfortunately, incorrect logic in that commit made things worse such that
a hang would occur unless both timers had a device node and were enabled.
This patch fixes the logic so that we don't wait to probe a second timer
unless it exists and is enabled.
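The intended condition can be written down as a small predicate. This is a sketch of the rule stated above, not the driver code; struct timer_node and must_wait_for_other() are hypothetical names.

```c
#include <stdbool.h>

struct timer_node {	/* hypothetical summary of one timer's device node */
	bool present;	/* a device node exists */
	bool enabled;	/* the node is not marked disabled */
};

/*
 * After probing one timer, wait for the other one only if it can
 * actually be probed: it must exist and be enabled.
 */
static bool must_wait_for_other(const struct timer_node *other,
				bool other_probed)
{
	if (other_probed)
		return false;
	return other->present && other->enabled;
}
```

A missing or disabled second timer therefore never blocks clocksource registration.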
Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Naresh Bhat [Wed, 23 Oct 2013 11:23:31 +0000 (16:53 +0530)]
mfd: vexpress-sysreg Add ACPI support for probing to driver
Add match table and pointers for ACPI probing into the vexpress-sysreg driver.
vexpress-sysreg is self-contained, so as a platform driver it gets its
resources automatically. However, while it is not probed, it should still
provide the base address so that other related drivers can take advantage
of it. Make that possible by finding resources based on the device HID.
Signed-off-by: Naresh Bhat <naresh.bhat@linaro.org> Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Graeme Gregory [Mon, 30 Jun 2014 18:52:02 +0000 (19:52 +0100)]
Juno / net: smsc911x add support for probing from ACPI
This is a standard platform device, so resources are converted in the
ACPI core in the same fashion as DT resources. For the other
DT-provided information there is _DSD for ACPI.
Andrew Pinski [Sat, 21 Mar 2015 20:08:01 +0000 (13:08 -0700)]
ARM64: Improve copy_page for 128 cache line sizes.
Adding a check for the cache line size is not much overhead.
Special case 128 byte cache line size.
This improves copy_page by 85% on ThunderX compared to the
original implementation.
LMBench results improve by 4-10%.
Signed-off-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andrew Pinski [Sat, 21 Mar 2015 01:55:27 +0000 (18:55 -0700)]
ARM64:spinlocks: Fix up for WFE and improve performance slightly.
In the previous patch, I made the mistake of putting WFE after the delay, which
meant that with WFE enabled we would get the same bad performance as before.
Also use the flags register some more to allow the instructions to be fused together.
Signed-off-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andrew Pinski [Thu, 5 Mar 2015 09:40:11 +0000 (01:40 -0800)]
ARM64:VDSO: Improve __do_get_tspec, don't use udiv
On most other targets (x86 and tile, for example),
the division in __do_get_tspec is converted into
a simple loop. The main reason for this is
that the result of this division is going
to be either 0 or 1.
This patch changes the division into a simple loop,
thus speeding up gettimeofday.
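In C terms the replacement looks roughly like this. The actual change is in the arm64 VDSO assembly; struct ts and ts_normalize() are hypothetical names used only for this sketch.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct ts {		/* hypothetical seconds/nanoseconds pair */
	uint64_t sec;
	uint64_t nsec;
};

/*
 * Carry whole seconds out of nsec without a division. Since the
 * quotient is expected to be 0 or 1, the loop body runs at most
 * once in the common case.
 */
static void ts_normalize(struct ts *t)
{
	while (t->nsec >= NSEC_PER_SEC) {
		t->nsec -= NSEC_PER_SEC;
		t->sec++;
	}
}
```

The loop yields the same result as sec += nsec / NSEC_PER_SEC followed by nsec %= NSEC_PER_SEC.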
Signed-off-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144
This implements a workaround for gicv3-its erratum 23144, applicable
to Cavium's ThunderX multinode systems.
The workaround avoids the hang of the ITS SYNC command by avoiding inter-node
I/O and collection/CPU mapping. This fix is only applicable to
Cavium's ThunderX dual-socket platforms.
arm64, numa: adding numa support for arm64 platforms.
Add NUMA support for arm64-based platforms.
By default, this patch adds a dummy NUMA node and
maps all memory and CPUs to node 0.
Using this patch, NUMA can be simulated on single-node arm64 platforms.
Ard Biesheuvel [Mon, 2 Mar 2015 18:10:07 +0000 (18:10 +0000)]
arm64/efi: adapt to relaxed FDT placement requirements
With the relaxed FDT placement requirements in place, we can change
the allocation strategy used by the stub to put the FDT image higher
up in memory. At the same time, reduce the minimal alignment to 8 bytes,
and impose a 2 MB size limit, as per the new requirements.
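The two requirements named above can be expressed as a simple check. This is only a sketch; fdt_placement_ok() and the macro names are made up and do not come from the stub.

```c
#include <stdbool.h>
#include <stdint.h>

#define FDT_ALIGN	8u			/* minimal alignment in bytes */
#define FDT_MAX_SIZE	(2u * 1024 * 1024)	/* 2 MB size limit */

/* True if an FDT image at addr with the given size meets both rules. */
static bool fdt_placement_ok(uint64_t addr, uint32_t size)
{
	return (addr % FDT_ALIGN) == 0 && size <= FDT_MAX_SIZE;
}
```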
Ard Biesheuvel [Sun, 10 May 2015 06:41:31 +0000 (08:41 +0200)]
arm64/efi: ignore DT memreserve entries instead of removing them
Now that the reservation of the FDT image itself is split off, we
can make the DT scanning of memreserves conditional on whether we
booted via UEFI and have its memory map available. This allows us
to drop deletion of these memreserves in the stub. It also fixes
the issue where the /reserved-memory/ node (which offers another
way of reserving memory ranges) was not being ignored under UEFI.
Ard Biesheuvel [Sun, 10 May 2015 08:26:44 +0000 (10:26 +0200)]
arm64/efi: ignore DT memory nodes instead of removing them
There are two problems with the UEFI stub DT memory node removal
routine:
- it deletes nodes as it traverses the tree, which happens to work
but is not supported, as deletion invalidates the node iterator;
- deleting memory nodes entirely may discard annotations in the form
of additional properties on the nodes.
Now that the UEFI initialization has moved to an earlier stage, we can
actually just ignore any memblocks that are installed after we have
processed the UEFI memory map. This way, it is no longer necessary to
remove the nodes, so we can remove that logic from the stub as well.
Ard Biesheuvel [Sun, 10 May 2015 12:03:31 +0000 (14:03 +0200)]
arm64/efi: move EFI init before early FDT processing
The early FDT processing is responsible for enumerating the
DT memory nodes and installing them as memblocks. This should
only be done if we are not booting via EFI, but at this point,
we don't know yet if that is the case or not.
So move the EFI init to before the early FDT processing. This involves
making some changes to the way EFI discovers the locations of the
EFI system table and the memory map, since those values are retrieved
from the FDT as well. Instead of the of_scan infrastructure, it now uses
libfdt directly to access the /chosen node.
Ard Biesheuvel [Sun, 10 May 2015 10:09:14 +0000 (12:09 +0200)]
efi: move FDT handling to separate object file
The EFI specific FDT handling is compiled conditionally, and is
logically independent of the rest of efi.o. So move it to a separate
file before making changes to it in subsequent patches.
Acked-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Override the __weak early_init_dt_add_memory_arch() with our own
version. This allows us to relax the imposed restrictions at memory
discovery time, which is needed if we want to defer the assignment
of PHYS_OFFSET and make it independent of where the kernel Image
is placed in physical memory.
So copy the generic original, but only retain the check against
regions whose sizes become zero when clipped to page alignment.
For now, we will remove the range below PHYS_OFFSET explicitly
until we rework that logic in a subsequent patch. Any memory that
we will not be able to map due to insufficient size of the linear
region is also removed.
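The retained check, dropping regions whose size becomes zero when clipped to page alignment, can be sketched like this. A 4 KB page size and the helper name clip_to_pages() are assumptions, and address overflow is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL		/* assumption: 4 KB pages */
#define PAGE_MASK (~(PAGE_SIZE - 1))

/*
 * Round the base up and the end down to page boundaries.
 * Returns false, leaving the arguments untouched, if no whole
 * page remains.
 */
static bool clip_to_pages(uint64_t *base, uint64_t *size)
{
	uint64_t start = (*base + PAGE_SIZE - 1) & PAGE_MASK;
	uint64_t end = (*base + *size) & PAGE_MASK;

	if (end <= start)
		return false;
	*base = start;
	*size = end - start;
	return true;
}
```

A region that lies entirely inside one page without covering it is rejected, while larger unaligned regions shrink to the pages they fully cover.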
Eric Auger [Thu, 18 Jun 2015 08:46:07 +0000 (10:46 +0200)]
KVM: arm/arm64: enable MSI routing
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries, with or without
device id. Although standard MSI entries can be set, their
injection still is not supported. For ARM64, let's also increase
KVM_MAX_IRQ_ROUTES to 4096: include SPI irqchip flat routes plus
MSI routes. In the future this might be extended.
The new MSI routing entry type also must be managed similarly to
legacy KVM_IRQ_ROUTING_MSI in eventfd irqfd_wakeup and irqfd_update.
Eric Auger [Tue, 23 Jun 2015 14:55:02 +0000 (16:55 +0200)]
KVM: arm/arm64: build a default routing table
Implement a default routing table made of flat irqchip routing
entries (gsi = irqchip.pin) covering the VGIC SPI indexes.
This routing table is overwritten by the first user-space call
to KVM_SET_GSI_ROUTING ioctl.
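A flat table of this kind could be built as in the sketch below. struct route and build_default_routes() are hypothetical and much simpler than the KVM routing structures; the single irqchip index 0 is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

struct route {		/* hypothetical flat routing entry */
	uint32_t gsi;
	uint32_t irqchip;
	uint32_t pin;
};

/* Fill tbl with identity routes (gsi == pin) for nr_spis entries. */
static size_t build_default_routes(struct route *tbl, size_t nr_spis)
{
	for (size_t i = 0; i < nr_spis; i++) {
		tbl[i].gsi = (uint32_t)i;
		tbl[i].irqchip = 0;	/* assumption: one irqchip, index 0 */
		tbl[i].pin = (uint32_t)i;
	}
	return nr_spis;
}
```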
Eric Auger [Tue, 7 Apr 2015 09:43:29 +0000 (11:43 +0200)]
KVM: arm/arm64: enable irqchip routing
This patch adds compilation and link against irqchip.
On ARM, irqchip routing is not really useful since there is
a single irqchip. However main motivation behind using irqchip
code is to enable MSI routing code. With the support of in-kernel
GICv3 ITS emulation, it now seems to be a MUST HAVE requirement.
Functions previously implemented in vgic.c that are superseded
by the more complex irqchip implementation are removed:
Eric Auger [Thu, 18 Jun 2015 08:30:19 +0000 (10:30 +0200)]
KVM: irqchip: convey devid to kvm_set_msi
On ARM, a devid field is populated in kvm_msi struct in case the
flag is set to KVM_MSI_VALID_DEVID. Let's populate the corresponding
kvm_kernel_irq_routing_entry devid field and set the msi type to
KVM_IRQ_ROUTING_EXTENDED_MSI.
Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Eric Auger [Thu, 18 Jun 2015 13:26:59 +0000 (15:26 +0200)]
KVM: api: introduce KVM_IRQ_ROUTING_EXTENDED_MSI
On ARM, the MSI msg (address and data) comes along with
out-of-band device ID information. The device ID encodes the
device that writes the MSI msg. Let's convey the device id in
kvm_irq_routing_msi and use a new routing entry type to
indicate the devid is populated.
Andre Przywara [Fri, 10 Jul 2015 14:21:51 +0000 (15:21 +0100)]
KVM: arm64: enable ITS emulation as a virtual MSI controller
If userspace has provided a base address for the ITS register frame,
we enable the bits that advertise LPIs in the GICv3.
When the guest has enabled LPIs and the ITS, we enable the emulation
part by initializing the ITS data structures and trapping on ITS
register frame accesses by the guest.
Also we enable the KVM_SIGNAL_MSI feature to allow userland to inject
MSIs into the guest. Not having enabled the ITS emulation will lead
to a -ENODEV when trying to inject a MSI.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:50 +0000 (15:21 +0100)]
KVM: arm64: implement MSI injection in ITS emulation
When userland wants to inject a MSI into the guest, we have to use
our data structures to find the LPI number and the VCPU to receive
the interrupt.
Use the wrapper functions to iterate the linked lists and find the
proper Interrupt Translation Table Entry. Then set the pending bit
in this ITTE to be later picked up by the LR handling code. Kick
the VCPU which is meant to handle this interrupt.
We provide a VGIC emulation model specific routine for the actual
MSI injection. The wrapper functions return an error for models not
(yet) implementing MSIs (like the GICv2 emulation).
We also provide the handler for the ITS "INT" command, which allows a
guest to trigger an MSI via the ITS command queue.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:49 +0000 (15:21 +0100)]
KVM: arm64: implement ITS command queue command handlers
The connection between a device, an event ID, the LPI number and the
allocated CPU is stored in in-memory tables in a GICv3, but their
format is not specified by the spec. Instead software uses a command
queue in a ring buffer to let the ITS implementation use their own
format.
Implement handlers for the various ITS commands and let them store
the requested relation into our own data structures.
To avoid kmallocs inside the ITS spinlock, we preallocate possibly
needed memory outside of the lock and free that if it turns out to
be not needed (mostly error handling).
Error handling is very basic at this point, as we don't have a good
way of communicating errors to the guest (usually a SError).
The INT command handler is missing at this point, as we gain the
capability of actually injecting MSIs into the guest only later on.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:48 +0000 (15:21 +0100)]
KVM: arm64: sync LPI configuration and pending tables
The LPI configuration and pending tables of the GICv3 LPIs are held
in tables in (guest) memory. To achieve reasonable performance, we
cache this data in our own data structures, so we need to sync those
two views from time to time. This behaviour is well described in the
GICv3 spec and is also exercised by hardware, so the sync points are
well known.
Provide functions that read the guest memory and store the
information from the configuration and pending tables in the kernel.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:47 +0000 (15:21 +0100)]
KVM: arm64: handle pending bit for LPIs in ITS emulation
As the actual LPI number in a guest can be quite high, but is mostly
assigned using a very sparse allocation scheme, bitmaps and arrays
for storing the virtual interrupt status are a waste of memory.
We use our equivalent of the "Interrupt Translation Table Entry"
(ITTE) to hold this extra status information for a virtual LPI.
As the normal VGIC code cannot use its fancy bitmaps to manage
pending interrupts, we provide a hook in the VGIC code to let the
ITS emulation handle the list register queueing itself.
LPIs are located in a separate number range (>=8192), so
distinguishing them is easy. With LPIs being only edge-triggered, we
get away with a less complex IRQ handling.
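Telling LPIs apart by number range amounts to a single comparison. The names below are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define LPI_BASE 8192u	/* first interrupt ID in the LPI range */

/* LPIs live in their own number range starting at 8192. */
static bool irq_is_lpi(uint32_t intid)
{
	return intid >= LPI_BASE;
}
```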
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:46 +0000 (15:21 +0100)]
KVM: arm64: add data structures to model ITS interrupt translation
The GICv3 Interrupt Translation Service (ITS) uses tables in memory
to allow a sophisticated interrupt routing. It features device tables,
an interrupt table per device and a table connecting "collections" to
actual CPUs (aka. redistributors in the GICv3 lingo).
Since the interrupt numbers for the LPIs are allocated quite sparsely
and the range can be quite huge (8192 LPIs being the minimum), using
bitmaps or arrays for storing information is a waste of memory.
We use linked lists instead, which we iterate linearly. This works
very well with the actual number of LPIs/MSIs in the guest being
quite low. Should the number of LPIs exceed the number where iterating
through lists seems acceptable, we can later revisit this and use more
efficient data structures.
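A linear lookup over such lists might look like the following sketch. The structure layouts and find_itte() are simplified assumptions and are not the data structures added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

struct itte {			/* hypothetical translation table entry */
	uint32_t event_id;
	uint32_t lpi;
	struct itte *next;
};

struct its_device {		/* hypothetical per-device list head */
	uint32_t device_id;
	struct itte *ittes;
	struct its_device *next;
};

/* Walk the device list, then that device's entries; NULL if not mapped. */
static const struct itte *find_itte(const struct its_device *devs,
				    uint32_t device_id, uint32_t event_id)
{
	for (const struct its_device *d = devs; d; d = d->next) {
		if (d->device_id != device_id)
			continue;
		for (const struct itte *e = d->ittes; e; e = e->next) {
			if (e->event_id == event_id)
				return e;
		}
	}
	return NULL;
}
```

The cost is linear in the number of mapped devices and events, which stays acceptable as long as the guest uses few LPIs.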
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:45 +0000 (15:21 +0100)]
KVM: arm64: implement basic ITS register handlers
Add emulation for some basic MMIO registers used in the ITS emulation.
This includes:
- GITS_{CTLR,TYPER,IIDR}
- ID registers
- GITS_{CBASER,CREADR,CWRITER}
those implement the ITS command buffer handling
Most of the handlers are pretty straightforward, but CWRITER goes
the extra mile to allow fine-grained locking. The idea here
is to let only the first instance iterate through the command ring
buffer; CWRITER accesses on other VCPUs meanwhile will be picked up
by that first instance and handled as well. The ITS lock is thus only
held for very short periods of time and is dropped before the actual
command handler is called.
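The ring-buffer walk alone, without the locking scheme described above, can be sketched as follows. ITS commands are 32 bytes each; drain_commands() and its parameters are hypothetical, and both offsets are assumed to be multiples of 32 below ring_bytes.

```c
#include <stdint.h>

#define ITS_CMD_SIZE 32u	/* each ITS command occupies 32 bytes */

/*
 * Advance the read offset until it catches up with the write offset,
 * wrapping at the end of the ring. Returns the number of commands seen.
 * Assumes both offsets are multiples of ITS_CMD_SIZE and < ring_bytes.
 */
static unsigned int drain_commands(uint32_t *creadr, uint32_t cwriter,
				   uint32_t ring_bytes)
{
	unsigned int handled = 0;

	while (*creadr != cwriter) {
		/* a real handler would decode the command at *creadr here */
		*creadr = (*creadr + ITS_CMD_SIZE) % ring_bytes;
		handled++;
	}
	return handled;
}
```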
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:44 +0000 (15:21 +0100)]
KVM: arm64: introduce ITS emulation file with stub functions
The ARM GICv3 ITS emulation code goes into a separate file, but
needs to be connected to the GICv3 emulation, of which it is an
option.
Introduce the skeleton with function stubs to be filled later.
Introduce the basic ITS data structure and initialize it, but don't
return any success yet, as we are not yet ready for the show.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:43 +0000 (15:21 +0100)]
KVM: arm64: handle ITS related GICv3 redistributor registers
In the GICv3 redistributor there are the PENDBASER and PROPBASER
registers which we did not emulate so far, as they only make sense
when having an ITS. In preparation for that emulate those MMIO
accesses by storing the 64-bit data written into it into a variable
which we later read in the ITS emulation.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:42 +0000 (15:21 +0100)]
KVM: arm64: Introduce new MMIO region for the ITS base address
The ARM GICv3 ITS controller requires a separate register frame to
cover ITS specific registers. Add a new VGIC address type and store
the address in a field in the vgic_dist structure.
Provide a function to check whether userland has provided the address,
so ITS functionality can be guarded by that check.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:41 +0000 (15:21 +0100)]
KVM: arm/arm64: make GIC frame address initialization model specific
Currently we initialize all the possible GIC frame addresses in one
function, without looking at the specific GIC model we instantiate
for the guest.
As this gets confusing when adding another VGIC model later, let's
move these initializations into the respective model's init functions.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:40 +0000 (15:21 +0100)]
KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities
KVM capabilities can be a per-VM property, though ARM/ARM64 currently
does not pass on the VM pointer to the architecture specific
capability handlers.
Add a "struct kvm*" parameter to those functions to later allow proper
per-VM capability reporting.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:39 +0000 (15:21 +0100)]
KVM: arm/arm64: add emulation model specific destroy function
Currently we destroy the VGIC emulation in one function that cares for
all emulated models. To be on par with init_model (which is model
specific), let's introduce a per-emulation-model destroy method, too.
Use it for a tiny GICv3 specific code already, later it will be handy
for the ITS emulation.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
KVM: extend struct kvm_msi to hold a 32-bit device ID
The ARM GICv3 ITS MSI controller requires a device ID to be able to
assign the proper interrupt vector. On real hardware, this ID is
sampled from the bus. To be able to emulate an ITS controller, extend
the KVM MSI interface to let userspace provide such a device ID. For
PCI devices, the device ID is simply the 16-bit bus-device-function
triplet, which should be easily available to the userland tool.
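For reference, the usual packing of a PCI bus-device-function triplet into 16 bits is shown below. The helper name pci_bdf() is made up for this sketch.

```c
#include <stdint.h>

/*
 * Pack bus (8 bits), device (5 bits) and function (3 bits) into the
 * 16-bit identifier a PCI device presents as its requester ID.
 */
static uint16_t pci_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}
```

For example, bus 0x01, device 0x02, function 3 gives 0x0113.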
Also there is a new KVM capability which advertises whether the
current VM requires a device ID to be set along with the MSI data.
This flag is still reported as not available everywhere, later we will
enable it when ITS emulation is used.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:37 +0000 (15:21 +0100)]
KVM: arm/arm64: VGIC: don't track used LRs in the distributor
Currently we track which IRQ has been mapped to which VGIC list
register and also have to synchronize both. We used to do this
to hold some extra state (for instance the active bit).
It turns out that this extra state in the LRs is no longer needed and
this extra tracking causes some pain later.
Remove the tracking feature (lr_map and lr_used) and get rid of
quite some code on the way.
On a guest exit we pick up all still pending IRQs from the LRs and put
them back in the distributor. We don't care about active-only IRQs,
so we keep them in the LRs. They will be retired either by our
vgic_process_maintenance() routine or by the GIC hardware in case of
edge triggered interrupts.
In places where we scan LRs we now use our shadow copy of the ELRSR
register directly.
This code change means we lose the "piggy-back" optimization, which
would re-use an active-only LR to inject the pending state on top of
it. Tracing with various workloads shows that this actually occurred
very rarely, the ballpark figure is about once every 10,000 exits
in a disk I/O heavy workload. Also the list registers don't seem to
be as scarce as assumed, with all 4 LRs on the popular implementations
used less than once every 100,000 exits.
This has been briefly tested on Midway, Juno and the model (the latter
both with GICv2 and GICv3 guests).
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
David Daney [Mon, 6 Apr 2015 23:00:29 +0000 (16:00 -0700)]
net/mlx4: Remove improper usage of dma_alloc_coherent().
The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory. On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent. Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent(). Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).
The MLX4 driver contains code that does the equivalent of:
vmap(virt_to_page(dma_alloc_coherent))
This results in an oops when the device is opened.
To fix this:
- Always use the result of dma_alloc_coherent() directly.
- Remove the 'max_direct' parameter to mlx4_buf_alloc(), as it is unused,
and adjust all callers.
- Remove mlx4_en_map_buffer() and mlx4_en_unmap_buffer() as they now do
nothing, and adjust all callers.
- Remove the 'page_list' element from struct mlx4_buf as it is unused.
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:16 +0000 (12:29 +0300)]
net: thunderx: Support for internal loopback mode
Support for setting a VF's corresponding BGX LMAC to internal
loopback mode. This mode can be used for verifying basic HW
functionality such as packet I/O, RX checksum validation,
CQ/RBDR interrupts, stats, etc. Useful when the DUT has no external
network connectivity.
'loopback' mode can be enabled or disabled via ethtool.
Note: This feature is not supported when the number of enabled VFs is
greater than the number of physical interfaces, i.e. active BGX LMACs.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d77a2384988fd397cf4f71417b9d971aa435758d) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:15 +0000 (12:29 +0300)]
net: thunderx: Support for upto 96 queues for a VF
This patch adds support for handling multiple Qsets assigned to a
single VF, thereby increasing the number of queues from the earlier 8 to the
maximum number of CPUs in the system, i.e. 48 queues on a single-node and 96 on
a dual-node system. The user has no option to choose which Qsets/VFs
are merged. Upon request from a VF, the PF assigns the next free Qsets as
secondary Qsets. To maintain current behavior the number of queues is kept
at 8 by default, which can be increased via ethtool.
If the user wants to unbind the NICVF driver from a secondary Qset, it
should be done after tearing down the primary VF's interface.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92dc87697e6a71675a9e9eec04ebecd8cf4837a3) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:14 +0000 (12:29 +0300)]
net: thunderx: Rework interrupt handling
Rework interrupt handler to avoid checking IRQ affinity of
CQ interrupts. Now separate handlers are registered for each IRQ
including RBDR. Register interrupt handlers for only those
which are being used. Add nicvf_dump_intr_status() and use it
in irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 39ad6eea6c1a01b69abb1102a767697fb9349830) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:13 +0000 (12:29 +0300)]
net: thunderx: Support for HW VLAN stripping
This patch configures HW to strip 802.1Q header if found in a
receiving packet. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. Also sets netdev's 'vlan_features'
so that other HW offload features can be used for tagged packets.
This offload feature can be enabled or disabled via ethtool.
The network stack normally ignores RPS for 802.1Q packets, which results
in low throughput. With this offload enabled, throughput for tagged packets
is almost the same as for normal packets.
Note: This patch doesn't enable HW VLAN insertion for transmit packets.
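For context, the 16-bit TCI carried in an 802.1Q header splits into priority, drop-eligible and VLAN ID fields as sketched below. struct vlan_tag and vlan_parse_tci() are hypothetical names; how the driver reads the TCI from CQE_RX is not shown.

```c
#include <stdint.h>

struct vlan_tag {	/* hypothetical decoded form of the TCI */
	uint8_t pcp;	/* priority code point, 3 bits */
	uint8_t dei;	/* drop eligible indicator, 1 bit */
	uint16_t vid;	/* VLAN identifier, 12 bits */
};

/* Split a host-order TCI value into its three fields. */
static struct vlan_tag vlan_parse_tci(uint16_t tci)
{
	struct vlan_tag t = {
		.pcp = (uint8_t)(tci >> 13),
		.dei = (uint8_t)((tci >> 12) & 0x1),
		.vid = (uint16_t)(tci & 0x0fff),
	};

	return t;
}
```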
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit aa2e259b474a4f52ecc9f6e0d444547de0aac4b2) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:12 +0000 (12:29 +0300)]
net: thunderx: Receive hashing HW offload support
Adding support for receive hashing HW offload by using RSS_ALG
and RSS_TAG fields of CQE_RX descriptor. Also removed dependency
on minimum receive queue count to configure RSS so that hash is
always generated.
This hash is used by RPS logic to distribute flows across multiple
CPUs. Offload can be disabled via ethtool.
Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 38bb5d4f4f988c98035fca003138dd84471432f2) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:11 +0000 (12:29 +0300)]
net: thunderx: mailboxes: remove code duplication
Use the nicvf_send_msg_to_pf() function in the mailbox code.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6051cba77c1c768d954cf9e423c44bcb85b9adb8) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Aleksey Makarov [Sun, 30 Aug 2015 09:29:09 +0000 (12:29 +0300)]
net: thunderx: fix MAINTAINERS
The liquidio and thunder drivers have different maintainers.
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 322e5cc5c6c03584ff9362357fc1448b5e442e9e) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Based on code from: Narinder Dhillon <ndhillon@cavium.com>
Tomasz Nowicki <tomasz.nowicki@linaro.org>
Robert Richter <rrichter@cavium.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 46b903a01c053d0c94975ea7a6819618f121d3d6) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>