KVM: extend struct kvm_msi to hold a 32-bit device ID
The ARM GICv3 ITS MSI controller requires a device ID to be able to
assign the proper interrupt vector. On real hardware, this ID is
sampled from the bus. To be able to emulate an ITS controller, extend
the KVM MSI interface to let userspace provide such a device ID. For
PCI devices, the device ID is simply the 16-bit bus-device-function
triplet, which should be easily available to the userland tool.
Also there is a new KVM capability which advertises whether the
current VM requires a device ID to be set along with the MSI data.
This flag is still reported as not available everywhere, later we will
enable it when ITS emulation is used.
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Andre Przywara [Fri, 10 Jul 2015 14:21:37 +0000 (15:21 +0100)]
KVM: arm/arm64: VGIC: don't track used LRs in the distributor
Currently we track which IRQ has been mapped to which VGIC list
register and also have to synchronize both. We used to do this
to hold some extra state (for instance the active bit).
It turns out that this extra state in the LRs is no longer needed and
this extra tracking causes some pain later.
Remove the tracking feature (lr_map and lr_used) and get rid of
quite some code on the way.
On a guest exit we pick up all still pending IRQs from the LRs and put
them back in the distributor. We don't care about active-only IRQs,
so we keep them in the LRs. They will be retired either by our
vgic_process_maintenance() routine or by the GIC hardware in case of
edge triggered interrupts.
In places where we scan LRs we now use our shadow copy of the ELRSR
register directly.
This code change means we lose the "piggy-back" optimization, which
would re-use an active-only LR to inject the pending state on top of
it. Tracing with various workloads shows that this actually occurred
very rarely, the ballpark figure is about once every 10,000 exits
in a disk I/O heavy workload. Also the list registers don't seem to
as scarce as assumed, with all 4 LRs on the popular implementations
used less than once every 100,000 exits.
This has been briefly tested on Midway, Juno and the model (the latter
both with GICv2 and GICv3 guests).
Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
David Daney [Mon, 6 Apr 2015 23:00:29 +0000 (16:00 -0700)]
net/mlx4: Remove improper usage of dma_alloc_coherent().
The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory. On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent. Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent(). Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).
The MLX4 driver contains code that does the equivalent of:
vmap(virt_to_page(dma_alloc_coherent))
This results in an OOPs when the device is opened.
To fix this...
Always use result of dma_alloc_coherent() directly.
Remove 'max_direct' parameter to mlx4_buf_alloc(), as it is unused,
and adjust all callers.
Remove mlx4_en_map_buffer() and mlx4_en_unmap_buffer() as they now do
nothing, and adjust all callers.
Remove 'page_list' element from struct mlx4_buf as it is unused.
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:16 +0000 (12:29 +0300)]
net: thunderx: Support for internal loopback mode
Support for setting VF's corresponding BGX LMAC in internal
loopback mode. This mode can be used for verifying basic HW
functionality such as packet I/O, RX checksum validation,
CQ/RBDR interrupts, stats e.t.c. Useful when DUT has no external
network connectivity.
'loopback' mode can be enabled or disabled via ethtool.
Note: This feature is not supported when no of VFs enabled are
morethan no of physical interfaces i.e active BGX LMACs
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d77a2384988fd397cf4f71417b9d971aa435758d) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:15 +0000 (12:29 +0300)]
net: thunderx: Support for upto 96 queues for a VF
This patch adds support for handling multiple qsets assigned to a
single VF. There by increasing no of queues from earlier 8 to max
no of CPUs in the system i.e 48 queues on a single node and 96 on
dual node system. User doesn't have option to assign which Qsets/VFs
to be merged. Upon request from VF, PF assigns next free Qsets as
secondary qsets. To maintain current behavior no of queues is kept
to 8 by default which can be increased via ethtool.
If user wants to unbind NICVF driver from a secondary Qset then it
should be done after tearing down primary VF's interface.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92dc87697e6a71675a9e9eec04ebecd8cf4837a3) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:14 +0000 (12:29 +0300)]
net: thunderx: Rework interrupt handling
Rework interrupt handler to avoid checking IRQ affinity of
CQ interrupts. Now separate handlers are registered for each IRQ
including RBDR. Register interrupt handlers for only those
which are being used. Add nicvf_dump_intr_status() and use it
in irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 39ad6eea6c1a01b69abb1102a767697fb9349830) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:13 +0000 (12:29 +0300)]
net: thunderx: Support for HW VLAN stripping
This patch configures HW to strip 802.1Q header if found in a
receiving packet. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. Also sets netdev's 'vlan_features'
so that other HW offload features can be used for tagged packets.
This offload feature can be enabled or disabled via ethtool.
Network stack normally ignores RPS for 802.1Q packets and hence low
throughput. With this offload enabled throughput for tagged packets
will be almost same as normal packets.
Note: This patch doesn't enable HW VLAN insertion for transmit packets.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit aa2e259b474a4f52ecc9f6e0d444547de0aac4b2) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:12 +0000 (12:29 +0300)]
net: thunderx: Receive hashing HW offload support
Adding support for receive hashing HW offload by using RSS_ALG
and RSS_TAG fields of CQE_RX descriptor. Also removed dependency
on minimum receive queue count to configure RSS so that hash is
always generated.
This hash is used by RPS logic to distribute flows across multiple
CPUs. Offload can be disabled via ethtool.
Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 38bb5d4f4f988c98035fca003138dd84471432f2) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Sunil Goutham [Sun, 30 Aug 2015 09:29:11 +0000 (12:29 +0300)]
net: thunderx: mailboxes: remove code duplication
Use the nicvf_send_msg_to_pf() function in the mailbox code.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6051cba77c1c768d954cf9e423c44bcb85b9adb8) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Aleksey Makarov [Sun, 30 Aug 2015 09:29:09 +0000 (12:29 +0300)]
net: thunderx: fix MAINTAINERS
The liquidio and thunder drivers have different maintainers.
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 322e5cc5c6c03584ff9362357fc1448b5e442e9e) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Based on code from: Narinder Dhillon <ndhillon@cavium.com>
Tomasz Nowicki <tomasz.nowicki@linaro.org>
Robert Richter <rrichter@cavium.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 46b903a01c053d0c94975ea7a6819618f121d3d6) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Robert Richter [Tue, 11 Aug 2015 00:58:36 +0000 (17:58 -0700)]
net: thunder: Factor out DT specific code in BGX
Separate DT code in preparation for follow-on ACPI integration.
Based on code from: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit de387e1156c76b273529f1803c6bd87b61eac2c5) Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
arm64: gicv3: its: Increase FORCE_MAX_ZONEORDER for Cavium ThunderX
In case of ARCH_THUNDER, there is a need to allocate the GICv3 ITS table
which is bigger than the allowed max order. So we are forcing it only in
case of 4KB page size.
Robert Richter [Mon, 29 Jun 2015 16:25:45 +0000 (18:25 +0200)]
irqchip, gicv3-its: Add HW revision detection and configuration
Some GIC revisions require an individual configuration to esp. add
workarounds for HW bugs. This patch implements generic code to parse
the hw revision provided by an IIDR register value and runs specific
code if hw matches. There are functions that read the IIDR registers
for GICV3 and ITS (GICD_IIDR/GITS_IIDR) and then go through a list of
init functions to be called for specific versions.
A MIDR register value may also be used, this is especially useful for
hw detection from a guest.
The patch is needed to implement workarounds for HW errata in Cavium's
ThunderX GICV3.
v4:
* only enable hw detection for its in its_enable_quirks()
* removed gicv3_check_capabilities()
v3:
* use arm64 errata framework for midr check
v2:
* adding MIDR check
Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Robert Richter [Wed, 27 Aug 2014 14:24:24 +0000 (16:24 +0200)]
irqchip, gicv3-its: Read typer register outside the loop
No need to read the typer register in the loop. Values do not change.
This patch is basically a prerequisite for a follow-on patch that adds
errata code for Cavium ThunderX. It moves the calculation of the
number of id entries to the beginning of the function close to other
setup values that are needed to allocate the its table. Now we have a
central location to modify the setup parameters and the errata code
can be implemented in a single block.
Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Robert Richter [Mon, 29 Jun 2015 16:46:22 +0000 (18:46 +0200)]
irqchip, gicv3: Workaround for Cavium ThunderX erratum 23154
This patch implements Cavium ThunderX erratum 23154.
The gicv3 of ThunderX requires a modified version for reading the IAR
status to ensure data synchronization. Since this is in the fast-path
and called with each interrupt, runtime patching is used using jump
label patching for smallest overhead (no-op). This is the same
technique as used for tracepoints.
v4:
* simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
v3:
* fix erratum to be dependend from midr
* use arm64 errata framework
v2:
* implement code in a single asm() to keep instruction sequence
* added comment to the code that explains the erratum
* apply workaround also if running as guest, thus check MIDR
Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg
While making the root blkg unconditional, ec13b1d6f0a0 ("blkcg: always
create the blkcg_gq for the root blkcg") removed the part which clears
q->root_blkg and ->root_rl.blkg during q exit. This leaves the two
pointers dangling after blkg_destroy_all(). blk-throttle exit path
performs blkg traversals and dereferences ->root_blkg and can lead to
the following oops.
While the bug is a straigh-forward use-after-free bug, it is tricky to
reproduce because blkg release is RCU protected and the rest of exit
path usually finishes before RCU grace period.
This patch fixes the bug by updating blkg_destro_all() to clear
q->root_blkg and ->root_rl.blkg.
Tim Gardner [Tue, 4 Aug 2015 17:26:04 +0000 (11:26 -0600)]
workqueue: Make flush_workqueue() available again to non GPL modules
Commit 37b1ef31a568fc02e53587620226e5f3c66454c8 ("workqueue: move
flush_scheduled_work() to workqueue.h") moved the exported non GPL
flush_scheduled_work() from a function to an inline wrapper.
Unfortunately, it directly calls flush_workqueue() which is a GPL function.
This has the effect of changing the licensing requirement for this function
and makes it unavailable to non GPL modules.
Thomas Hellstrom [Wed, 26 Aug 2015 12:49:21 +0000 (05:49 -0700)]
drm/vmwgfx: Allow dropped masters render-node like access on legacy nodes v2
Applications like gnome-shell may try to render after dropping master
privileges. Since the driver should now be safe against this scenario,
allow those applications to use their legacy node like a render node.
v2: Add missing return statement.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Thomas Hellstrom [Thu, 25 Jun 2015 17:47:43 +0000 (10:47 -0700)]
vmwgfx: Rework device initialization
This commit reworks device initialization so that we always enable the
FIFO at driver load, deferring SVGA enable until either first modeset
or fbdev enable.
This should always leave the fifo properly enabled for render- and
control nodes.
In addition,
*) We disable the use of VRAM when SVGA is not enabled.
*) We simplify PM support so that we only throw out resources on hibernate,
not on suspend, since the device keeps its state on suspend.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Jonathon Jongsma [Thu, 20 Aug 2015 19:04:32 +0000 (12:04 -0700)]
drm/qxl: validate monitors config modes
Due to some recent changes in
drm_helper_probe_single_connector_modes_merge_bits(), old custom modes
were not being pruned properly. In current kernels,
drm_mode_validate_basic() is called to sanity-check each mode in the
list. If the sanity-check passes, the mode's status gets set to to
MODE_OK. In older kernels this check was not done, so old custom modes
would still have a status of MODE_UNVERIFIED at this point, and would
therefore be pruned later in the function.
As a result of this new behavior, the list of modes for a device always
includes every custom mode ever configured for the device, with the
largest one listed first. Since desktop environments usually choose the
first preferred mode when a hotplug event is emitted, this had the
result of making it very difficult for the user to reduce the size of
the display.
The qxl driver did implement the mode_valid connector function, but it
was empty. In order to restore the old behavior where old custom modes
are pruned, we implement a proper mode_valid function for the qxl
driver. This function now checks each mode against the last configured
custom mode and the list of standard modes. If the mode doesn't match
any of these, its status is set to MODE_BAD so that it will be pruned as
expected.
Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com> Cc: stable at vger.kernel.org
Hans de Goede [Thu, 23 Jul 2015 15:20:12 +0000 (17:20 +0200)]
nv46: Change mc subdev oclass from nv44 to nv4c
MSI interrupts appear to not work for nv46 based cards. Change the mc
subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
identical to the nv44 mc code except that it does not use msi
(it does not define a msi_rearm callback).
Eric Sandeen [Wed, 5 Aug 2015 22:13:58 +0000 (15:13 -0700)]
ext4: don't manipulate recovery flag when freezing no-journal fs
At some point along this sequence of changes:
f6e63f9 ext4: fold ext4_nojournal_sops into ext4_sops bb04457 ext4: support freezing ext2 (nojournal) file systems 9ca9238 ext4: Use separate super_operations structure for no_journal filesystems
ext4 started setting needs_recovery on filesystems without journals
when they are unfrozen. This makes no sense, and in fact confuses
blkid to the point where it doesn't recognize the filesystem at all.
(freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs,
see needs_recovery set on fs w/ no journal).
To fix this, don't manipulate the INCOMPAT_RECOVER feature on
filesystems without journals.
Reported-by: Stu Mark <smark@xxxxxxxxx> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
Chris Leech [Tue, 16 Jun 2015 23:07:13 +0000 (16:07 -0700)]
iSCSI: let session recovery_tmo sysfs writes persist across recovery
The iSCSI session recovery_tmo setting is writeable in sysfs, but it's
also set every time a connection is established when parameters are set
from iscsid over netlink. That results in the timeout being reset to
the default value after every recovery.
The DM multipath tools want to use the sysfs interface to lower the
default timeout when there are multiple paths to fail over. It has
caused confusion that we have a writeable sysfs value that seem to keep
resetting itself.
This patch adds an in-kernel flag that gets set once a sysfs write
occurs, and then ignores netlink parameter setting once it's been
modified via the sysfs interface. My thinking here is that the sysfs
interface is much simpler for external tools to influence the session
timeout, but if we're going to allow it to be modified directly we
should ensure that setting is maintained.
Signed-off-by: Chris Leech <cleech@redhat.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Odin.com>
Hans de Goede [Fri, 24 Jul 2015 09:45:28 +0000 (11:45 +0200)]
ideapad-laptop: Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list
Like some of the other Yoga models the Lenovo Yoga 3 14 does not have a
hw rfkill switch, and trying to read the hw rfkill switch through the
ideapad module causes it to always reported blocking breaking wifi.
This commit adds the Lenovo Yoga 3 14 to the no_hw_rfkill dmi list, fixing
the wifi breakage.
Dave Young [Fri, 18 Sep 2015 15:27:32 +0000 (15:27 +0000)]
kexec/uefi: copy secure_boot flag in boot params across kexec reboot
Kexec reboot in case secure boot being enabled does not keep the secure boot
mode in new kernel, so later one can load unsigned kernel via legacy kexec_load.
In this state, the system is missing the protections provided by secure boot.
Adding a patch to fix this by retain the secure_boot flag in original kernel.
secure_boot flag in boot_params is set in EFI stub, but kexec bypasses the stub.
Fixing this issue by copying secure_boot flag across kexec reboot.
HID: chicony: Add support for Acer Aspire Switch 12
Acer Aspire Switch 12 keyboard Chicony's controller reports too big usage
index on the 1st interface. The patch fixes the report. The work based on
solution from drivers/hid/hid-holtek-mouse.c
When resuming, the bluetooth driver needs to re-request the
firmware. This re-request is happening before usermodehelper
is fully enabled. If the firmware load succeeded previously, the
caching behavior of the firmware code typically negates the
need to call the usermodehelper code again and the request
succeeds. If the firmware was never loaded because it isn't
actually present in the file system, this results in a call
to usermodehelper and a failure warning every resume.
The proper fix is to add a reset_resume functionality to the
btusb driver to be able to handle the resume case. The
work for this is ongoing so in the mean time just silence
the warning since we know it's a problem.
Bugzilla: 1133378
Upstream-status: Working on it. It's a difficult problem :( Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Input - synaptics: pin 3 touches when the firmware reports 3 fingers
Synaptics PS/2 touchpad can send only 2 touches in a report. They can
detect 4 or 5 and this information is valuable.
In commit 63c4fda (Input: synaptics - allocate 3 slots to keep stability
in image sensors), we allocate 3 slots, but we still continue to report
the 2 available fingers. That means that the client sees 2 used slots while
there is a total of 3 fingers advertised by BTN_TOOL_TRIPLETAP.
For old kernels this is not a problem because max_slots was 2 and libinput/
xorg-synaptics knew how to deal with that. Now that max_slot is 3, the
clients ignore BTN_TOOL_TRIPLETAP and count the actual used slots (so 2).
It then gets confused when receiving the BTN_TOOL_TRIPLETAP and DOUBLETAP
information, and goes wild.
We can pin the 3 slots until we get a total number of fingers below 2.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1212230 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
There is no need for this at all. Worst it means that if
the guest tries to write to BARs it could lead (on certain
platforms) to PCI SERR errors.
Please note that with af6fc858a35b90e89ea7a7ee58e66628c55c776b
"xen-pciback: limit guest control of command register"
a guest is still allowed to enable those control bits (safely), but
is not allowed to disable them and that therefore a well behaved
frontend which enables things before using them will still
function correctly.
This is done via an write to the configuration register 0x4 which
triggers on the backend side:
command_write
\- pci_enable_device
\- pci_enable_device_flags
\- do_pci_enable_device
\- pcibios_enable_device
\-pci_enable_resourcess
[which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO]
However guests (and drivers) which don't do this could cause
problems, including the security issues which XSA-120 sought
to address.
Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Dave Jones [Tue, 24 Jun 2014 12:43:34 +0000 (08:43 -0400)]
watchdog: Disable watchdog on virtual machines.
For various reasons, VMs seem to trigger the soft lockup detector a lot,
in cases where it's just not possible for a lockup to occur.
(Example: https://bugzilla.redhat.com/show_bug.cgi?id=971139)
In some cases it seems that the host just never scheduled the app running
the VM for a very long time (Could be the host was under heavy load).
Just disable the detector on VMs.
Bugzilla: 971139
Upstream-status: Fedora mustard for now
Éric Piel [Thu, 3 Nov 2011 15:22:40 +0000 (16:22 +0100)]
lis3: improve handling of null rate
When obtaining a rate of 0, we would disable the device supposely
because it seems to behave incorectly. It actually only comes from the
fact that the device is off and on lis3dc it's reflected in the rate.
So handle this nicely by just waiting a safe time, and then using the
device as normally.
Bastien Nocera [Thu, 20 May 2010 14:30:31 +0000 (10:30 -0400)]
disable i8042 check on apple mac
As those computers never had any i8042 controllers, and the
current lookup code could potentially lock up/hang/wait for
timeout for long periods of time.
Fixes intermittent hangs on boot on a MacbookAir1,1
Bugzilla: N/A
Upstream-status: http://lkml.indiana.edu/hypermail/linux/kernel/1005.0/00938.html (and pinged on Dec 17, 2013)
Adam Jackson [Wed, 13 Nov 2013 15:17:24 +0000 (10:17 -0500)]
drm/i915: hush check crtc state
This is _by far_ the most common backtrace for i915 on retrace.fp.o, and
it's mostly useless noise. There's not enough context when it's generated
to know if something actually went wrong. Downgrade the message to
KMS debugging so we can still get it if we want it.
Josh Boyer [Thu, 3 Oct 2013 14:14:23 +0000 (10:14 -0400)]
MODSIGN: Support not importing certs from db
If a user tells shim to not use the certs/hashes in the UEFI db variable
for verification purposes, shim will set a UEFI variable called MokIgnoreDB.
Have the uefi import code look for this and not import things from the db
variable.
Josh Boyer [Fri, 26 Oct 2012 16:42:16 +0000 (12:42 -0400)]
MODSIGN: Import certificates from UEFI Secure Boot
Secure Boot stores a list of allowed certificates in the 'db' variable.
This imports those certificates into the system trusted keyring. This
allows for a third party signing certificate to be used in conjunction
with signed modules. By importing the public certificate into the 'db'
variable, a user can allow a module signed with that certificate to
load. The shim UEFI bootloader has a similar certificate list stored
in the 'MokListRT' variable. We import those as well.
In the opposite case, Secure Boot maintains a list of disallowed
certificates in the 'dbx' variable. We load those certificates into
the newly introduced system blacklist keyring and forbid any module
signed with those from loading.
Josh Boyer [Fri, 26 Oct 2012 16:36:24 +0000 (12:36 -0400)]
KEYS: Add a system blacklist keyring
This adds an additional keyring that is used to store certificates that
are blacklisted. This keyring is searched first when loading signed modules
and if the module's certificate is found, it will refuse to load. This is
useful in cases where third party certificates are used for module signing.
Josh Boyer [Fri, 20 Jun 2014 12:53:24 +0000 (08:53 -0400)]
hibernate: Disable in a signed modules environment
There is currently no way to verify the resume image when returning
from hibernate. This might compromise the signed modules trust model,
so until we can work with signed hibernate images we disable it in
a secure modules environment.
Josh Boyer [Wed, 6 Feb 2013 00:25:05 +0000 (19:25 -0500)]
efi: Disable secure boot if shim is in insecure mode
A user can manually tell the shim boot loader to disable validation of
images it loads. When a user does this, it creates a UEFI variable called
MokSBState that does not have the runtime attribute set. Given that the
user explicitly disabled validation, we can honor that and not enable
secure boot mode if that variable is set.
Matthew Garrett [Fri, 9 Aug 2013 22:36:30 +0000 (18:36 -0400)]
Add option to automatically enforce module signatures when in Secure Boot mode
UEFI Secure Boot provides a mechanism for ensuring that the firmware will
only load signed bootloaders and kernels. Certain use cases may also
require that all kernel modules also be signed. Add a configuration option
that enforces this automatically when enabled.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Fri, 8 Feb 2013 19:12:13 +0000 (11:12 -0800)]
x86: Restrict MSR access when module loading is restricted
Writing to MSRs should not be allowed if module loading is restricted,
since it could lead to execution of arbitrary code in kernel mode. Based
on a patch by Kees Cook.
Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Fri, 9 Aug 2013 07:33:56 +0000 (03:33 -0400)]
kexec: Disable at runtime if the kernel enforces module loading restrictions
kexec permits the loading and execution of arbitrary code in ring 0, which
is something that module signing enforcement is meant to prevent. It makes
sense to disable kexec in this situation.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Josh Boyer [Mon, 25 Jun 2012 23:57:30 +0000 (19:57 -0400)]
acpi: Ignore acpi_rsdp kernel parameter when module loading is restricted
This option allows userspace to pass the RSDP address to the kernel, which
makes it possible for a user to circumvent any restrictions imposed on
loading modules. Disable it in that case.
Matthew Garrett [Fri, 9 Mar 2012 14:28:15 +0000 (09:28 -0500)]
Restrict /dev/mem and /dev/kmem when module loading is restricted
Allowing users to write to address space makes it possible for the kernel
to be subverted, avoiding module loading restrictions. Prevent this when
any restrictions have been imposed on loading modules.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Fri, 9 Mar 2012 13:46:50 +0000 (08:46 -0500)]
asus-wmi: Restrict debugfs interface when module loading is restricted
We have no way of validating what all of the Asus WMI methods do on a
given machine, and there's a risk that some will allow hardware state to
be manipulated in such a way that arbitrary code can be executed in the
kernel, circumventing module loading restrictions. Prevent that if any of
these features are enabled.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Fri, 9 Mar 2012 13:39:37 +0000 (08:39 -0500)]
ACPI: Limit access to custom_method
custom_method effectively allows arbitrary access to system memory, making
it possible for an attacker to circumvent restrictions on module loading.
Disable it if any such restrictions have been enabled.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Thu, 8 Mar 2012 15:35:59 +0000 (10:35 -0500)]
x86: Lock down IO port access when module security is enabled
IO port access would permit users to gain access to PCI configuration
registers, which in turn (on a lot of hardware) give access to MMIO register
space. This would potentially permit root to trigger arbitrary DMA, so lock
it down by default.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Thu, 8 Mar 2012 15:10:38 +0000 (10:10 -0500)]
PCI: Lock down BAR access when module security is enabled
Any hardware that can potentially generate DMA has to be locked down from
userspace in order to avoid it being possible for an attacker to modify
kernel code, allowing them to circumvent disabled module loading or module
signing. Default to paranoid - in future we can potentially relax this for
sufficiently IOMMU-isolated devices.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Matthew Garrett [Fri, 9 Aug 2013 21:58:15 +0000 (17:58 -0400)]
Add secure_modules() call
Provide a single call to allow kernel code to determine whether the system
has been configured to either disable module loading entirely or to load
only modules signed with a trusted key.
Bugzilla: N/A
Upstream-status: Fedora mustard. Replaced by securelevels, but that was nak'd
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Josh Stone [Fri, 21 Nov 2014 18:40:00 +0000 (10:40 -0800)]
Kbuild: Add an option to enable GCC VTA
Due to recent codegen issues, gcc -fvar-tracking-assignments was
unconditionally disabled in commit 2062afb4f804a ("Fix gcc-4.9.0
miscompilation of load_balance() in scheduler"). However, this reduces
the debuginfo coverage for variable locations, especially in inline
functions. VTA is certainly not perfect either in those cases, but it
is much better than without. With compiler versions that have fixed the
codegen bugs, we would prefer to have the better details for SystemTap,
and surely other debuginfo consumers like perf will benefit as well.
This patch simply makes CONFIG_DEBUG_INFO_VTA an option. I considered
Frank and Linus's discussion of a cc-option-like -fcompare-debug test,
but I'm convinced that a narrow test of an arch-specific codegen issue
is not really useful. GCC has their own regression tests for this, so
I'd suggest GCC_COMPARE_DEBUG=-fvar-tracking-assignments-toggle is more
useful for kernel developers to test confidence.
In fact, I ran into a couple more issues when testing for this patch[1],
although neither of those had any codegen impact.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1140872
With gcc-4.9.2-1.fc22, I can now build v3.18-rc5 with Fedora's i686 and
x86_64 configs, and this is completely clean with GCC_COMPARE_DEBUG.
Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Josh Stone <jistone@redhat.com>
Peter Jones [Thu, 25 Sep 2008 20:23:33 +0000 (16:23 -0400)]
input: silence i8042 noise
Don't print an error message just because there's no i8042 chip.
Some systems, such as EFI-based Apple systems, won't necessarily have an
i8042 to initialize. We shouldn't be printing an error message in this
case, since not detecting the chip is the correct behavior.
Mark Langsdorf [Wed, 25 Mar 2015 18:12:51 +0000 (14:12 -0400)]
usb: make xhci platform driver use 64 bit or 32 bit DMA
The xhci platform driver needs to work on systems that either only
support 64-bit DMA or only support 32-bit DMA. Attempt to set a
coherent dma mask for 64-bit DMA, and attempt again with 32-bit
DMA if that fails.
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
Mark Salter [Wed, 25 Mar 2015 18:17:50 +0000 (14:17 -0400)]
arm64: avoid needing console= to enable serial console
Tell kernel to prefer one of the serial ports for console on
platforms currently supported (pl011 or 8250). console= on
command line will override these assumed preferences. This is
just a hack to get the behavior we want from DT provided by
firmware.
Josh Boyer [Mon, 11 Nov 2013 13:39:16 +0000 (08:39 -0500)]
lib/cpumask: Make CPUMASK_OFFSTACK usable without debug dependency
When CPUMASK_OFFSTACK was added in 2008, it was dependent upon
DEBUG_PER_CPU_MAPS being enabled, or an architecture could select it.
The debug dependency adds additional overhead that isn't required for
operation of the feature, and we need CPUMASK_OFFSTACK to increase the
NR_CPUS value beyond 512 on x86. We drop the current dependency and make
sure SMP is set.
Bugzilla: N/A
Upstream-status: Nak'd, supposedly replacement coming to auto-select
Javi Merino [Tue, 25 Aug 2015 18:22:35 +0000 (19:22 +0100)]
thermal: power_allocator: allocate with kcalloc what you free with kfree
Commit cf736ea6f902 ("thermal: power_allocator: do not use devm*
interfaces") forgot to change a devm_kcalloc() to just kcalloc(), but
it's corresponding devm_kfree() was changed to kfree(). Allocate with
kcalloc() to match the kfree().
Fixes: cf736ea6f902 ("thermal: power_allocator: do not use devm* interfaces") Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 28 Aug 2015 18:42:00 +0000 (11:42 -0700)]
Merge tag 'sound-fix-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are stable fixes that have been gathered since rc8: fixes for
HD-audio widget power control regressions since 4.1, a NULL fix for
HD-audio HDMI, a noise fix for Conexant codecs and a quirk addition
for USB-Audio DSD"
* tag 'sound-fix-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix path power activation
ALSA: hda - Check all inputs for is_active_nid_for_any()
ALSA: hda: fix possible NULL dereference
ALSA: hda - Shutdown CX20722 on reboot/free to avoid spurious noises
ALSA: usb: Add native DSD support for Gustard DAC-X20U
Linus Torvalds [Fri, 28 Aug 2015 00:59:17 +0000 (17:59 -0700)]
Merge tag 'powerpc-4.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix MSI/MSI-X on pseries from Guilherme"
* tag 'powerpc-4.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case
PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
Pull networking fixes from David Miller:
"Some straggler bug fixes here:
1) Netlink_sendmsg() doesn't check iterator type properly in mmap
case, from Ken-ichirou MATSUZAWA.
2) Don't sleep in atomic context in bcmgenet driver, from Florian
Fainelli.
3) The pfkey_broadcast() code patch can't actually ever use anything
other than GFP_ATOMIC. And the cases that right now pass
GFP_KERNEL or similar will currently trigger an RCU splat. Just
use GFP_ATOMIC unconditionally. From David Ahern.
4) Fix FD bit timings handling in pcan_usb driver, from Marc
Kleine-Budde.
5) Cache dst leaked in ip6_gre tunnel removal, fix from Huaibin Wang.
6) Traversal into drivers/net/ethernet/renesas should be triggered by
CONFIG_NET_VENDOR_RENESAS, not a particular driver's config
option. From Kazuya Mizuguchi.
7) Fix regression in handling of igmp_join errors in vxlan, from
Marcelo Ricardo Leitner.
8) Make phy_{read,write}_mmd_indirect() properly take the mdio_lock
mutex when programming the registers. From Russell King.
9) Fix non-forced handling in u32_destroy(), from WANG Cong.
10) Test the EVENT_NO_RUNTIME_PM flag before it is cleared in
usbnet_stop(), from Eugene Shatokhin.
11) In sfc driver, don't fetch statistics firmware isn't capable of,
from Bert Kenward.
12) Verify ASCONF address parameter location in SCTP, from Xin Long"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state
sctp: asconf's process should verify address parameter is in the beginning
sfc: only use vadaptor stats if firmware is capable
net: phy: fixed: propagate fixed link values to struct
usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared
drivers: net: xgene: fix: Oops in linkwatch_fire_event
cls_u32: complete the check for non-forced case in u32_destroy()
net: fec: use reinit_completion() in mdio accessor functions
net: phy: add locking to phy_read_mmd_indirect()/phy_write_mmd_indirect()
vxlan: re-ignore EADDRINUSE from igmp_join
net: compile renesas directory if NET_VENDOR_RENESAS is configured
ip6_gre: release cached dst on tunnel removal
phylib: Make PHYs children of their MDIO bus, not the bus' parent.
can: pcan_usb: don't provide CAN FD bittimings by non-FD adapters
net: Fix RCU splat in af_key
net: bcmgenet: fix uncleaned dma flags
net: bcmgenet: Avoid sleeping in bcmgenet_timeout
netlink: mmap: fix tx type check
Linus Torvalds [Fri, 28 Aug 2015 00:46:06 +0000 (17:46 -0700)]
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm fixlet from Dan Williams:
"This is a libnvdimm ABI fixup.
I pushed back on this change quite hard given the late date, that it
appears to be purely cosmetic, sysfs is not necessarily meant to be a
user friendly UI, and the kernel interprets the reversed polarity of
the ACPI_NFIT_MEM_ARMED flag correctly. When this flag is set, the
energy source of an NVDIMM is not armed and any new writes to the DIMM
may not be preserved.
However, Bob Moore warned me that it is important to get these things
named correctly wherever they appear otherwise we run the risk of a
less than cautious firmware engineer implementing the polarity the
wrong way. Once a mistake like that escapes into production platforms
the flag becomes useless and we need to move to a new bit position.
Bob has agreed to take a change through ACPICA to rename
ACPI_NFIT_MEM_ARMED to ACPI_NFIT_MEM_NOT_ARMED, and the patch below
from Toshi brings the sysfs representation of these flags in line with
their respective polarities.
Please pull for 4.2 as this is the first kernel to expose the ACPI
NFIT sysfs representation, and this is likely a kernel that firmware
developers will be using for checking out their NVDIMM enabling"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: Clarify memory device state flags strings
lucien [Wed, 26 Aug 2015 20:52:20 +0000 (04:52 +0800)]
sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state
Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not
resetting the association overall_error_count. This allowed the association
to better enforce assoc.max_retrans limit.
However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
state. In this state, HB-ACKs will continue to reset the overall_error_count
for the association would extend the lifetime of association unnecessarily.
This patch solves this by resetting the overall_error_count whenever the current
state is small then SCTP_STATE_SHUTDOWN_PENDING. As a small side-effect, we
end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT
states, but they are not really impacted because we disable Heartbeats in those
states.
Fixes: Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
lucien [Thu, 27 Aug 2015 08:26:34 +0000 (16:26 +0800)]
sctp: asconf's process should verify address parameter is in the beginning
in sctp_process_asconf(), we get address parameter from the beginning of
the addip params. but we never check if it's really there. if the addr
param is not there, it still can pass sctp_verify_asconf(), then to be
handled by sctp_process_asconf(), it will not be safe.
so add a code in sctp_verify_asconf() to check the address parameter is in
the beginning, or return false to send abort.
note that this can also detect multiple address parameters, and reject it.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>