Steven Smith [Tue, 7 Jul 2009 16:50:23 +0000 (17:50 +0100)]
Temporary workaround to enable LRO packets to be forwarded
by the bridge. This may already have been fixed differently
in more recent kernel versions, as there was some discussion
on lkml. A proper fix is still needed. For now, this serves
the purpose of enabling experiments with LRO in netchannel2.
Signed-off-by: Jose Renato Santos <jsantos@hpl.hp.com>
Steven Smith [Tue, 7 Jul 2009 15:59:24 +0000 (16:59 +0100)]
Fix a bug in netchannel2 receiver copy mode which computes
the wrong packet size when the number of fragments
exceeds MAX_SKB_FRAGS.
The previous code seems more complex than it actually needs
to be. Using a simpler version fixes the problem.
It may be possible to fix the original code if we
understand exactly where the bug is, but I am
not convinced we want to preserve the original complexity.
Another problem with previous code is that it chains
multiple skb's using frag_list when we should use skb->next
for all skb's after the first in the chain.
Using frag_list to store fragments works
for packets received from the external network, but it
breaks guest to guest communication. Not sure why the
stack treats both types of packets differently.
The current fix is to use skb_shinfo(skb)->frags when
the number of fragments does not exceed MAX_SKB_FRAGS,
and skb_shinfo(skb)->frag_list otherwise.
This is ugly but it works for now. We probably need
a better fix.
Signed-off-by: Jose Renato Santos <jsantos@hpl.hp.com>
Minor fixup: remove assumption that kmalloc() of less than a page size
returns something which doesn't cross a page boundary.
Signed-off-by: Steven Smith <steven.smith@citrix.com>
Steven Smith [Tue, 7 Jul 2009 14:20:04 +0000 (15:20 +0100)]
The RSCB fragment counter occasionally overestimates slightly if you have
a particularly complicated packet. Make the rest of the pipeline
tolerant of this.
Linux never actually generates such a packet, but it's easy enough
to handle it anyway.
Steven Smith [Tue, 7 Jul 2009 13:23:57 +0000 (14:23 +0100)]
Fix netchannel2 TX path to correctly handle LRO packets received
from a physical interface.
LRO packets have skb's chained using skb->next, instead
of skb_shinfo(skb)->frag_list. frag_list is used only on the
main (first) skb, while the others use skb->next.
Signed-off-by: Jose Renato Santos <jsantos@hpl.hp.com>
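The chain layout described above can be sketched in user space as follows.
This is only an illustration: struct mock_skb and mock_chain_len() are
hypothetical stand-ins, not the kernel's struct sk_buff or any netchannel2
function, and len here means the bytes held by one buffer alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the fields needed
 * for the illustration are modelled. */
struct mock_skb {
    struct mock_skb *next;       /* next buffer in an skb->next chain */
    struct mock_skb *frag_list;  /* models skb_shinfo(skb)->frag_list */
    size_t len;                  /* bytes held by this buffer alone */
};

/* Sum the bytes of a packet whose head buffer reaches the first extra
 * buffer through frag_list, while later buffers are linked by next. */
size_t mock_chain_len(const struct mock_skb *head)
{
    const struct mock_skb *skb;
    size_t total;

    assert(head != NULL);
    total = head->len;
    for (skb = head->frag_list; skb != NULL; skb = skb->next)
        total += skb->len;
    return total;
}
```

Walking frag_list on every buffer instead of only on the head would skip
the buffers linked through next, which is the kind of mistake the fix
above guards against.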
Steven Smith [Wed, 24 Jun 2009 15:24:47 +0000 (16:24 +0100)]
Use a minimally-sized inline prefix for RSCB packets. There's no point
in pushing more than this through the ring if the receiver's going to
have to do a copy anyway, and this reduces ring pressure fairly
significantly.
Steven Smith [Tue, 23 Jun 2009 09:48:05 +0000 (10:48 +0100)]
Fix performance regression introduced by the patch that prevents
messages from wrapping around the ring.
When we fail to transmit a packet because the ring is full,
we should not put the packet on the pending_skbs list, since
this may cause out-of-order packets. Instead, just keep the
packet on the same pending_tx_queue list and stop processing
the list.
Signed-off-by: Jose Renato Santos <jsantos@hpl.hp.com>
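A minimal sketch of the ordering rule from the ring-full fix, assuming a
simple FIFO of packet sizes. struct mock_tx_queue and mock_drain_queue()
are hypothetical names invented for this illustration; the real
pending_tx_queue holds skb's, not sizes.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_QUEUE_MAX 16

/* Hypothetical FIFO of packet sizes standing in for pending_tx_queue. */
struct mock_tx_queue {
    size_t sizes[MOCK_QUEUE_MAX];
    size_t head;   /* index of the oldest queued packet */
    size_t count;  /* number of queued packets */
};

/* Move packets from the queue into a ring with *ring_free bytes left.
 * Stop at the first packet that does not fit and leave it at the head,
 * so that later packets can never overtake it.
 * Returns the number of packets sent. */
size_t mock_drain_queue(struct mock_tx_queue *q, size_t *ring_free)
{
    size_t sent = 0;

    assert(q != NULL && ring_free != NULL);
    while (q->count > 0) {
        size_t need = q->sizes[q->head];

        if (need > *ring_free)
            break;  /* ring full: keep the packet queued and stop */
        *ring_free -= need;
        q->head = (q->head + 1) % MOCK_QUEUE_MAX;
        q->count--;
        sent++;
    }
    return sent;
}
```

Note that the loop stops even if a smaller packet further down the queue
would still fit; sending it would reorder the stream.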
Set skb protocol field only after grant operations are completed.
Otherwise it will be corrupted when the header is not inline and
thus it has not been copied yet.
Signed-off-by: Jose Renato Santos <jsantos@hpl.hp.com>
Steven Smith [Fri, 19 Jun 2009 15:17:12 +0000 (16:17 +0100)]
Make the NC2 rate limiter optional. Turning it off is unsafe, because
it opens up denial-of-service attacks, but can help performance when
everyone is trusted.
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
Add a new ioctl to /proc/xen/privcmd which allows domctls to be performed
without using the generic hypercall interface, so that they are available
on restricted fds.
This requires an unfortunate amount of fiddling with headers so that
XEN_GUEST_HANDLE_64 and uint64_aligned_t are available in kernel
space.
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
Watch the online node in the backend area, as well as the state node
in the frontend area, and fire the frontend state changed watch
whenever it changes. This allows us to catch the case where a device
shuts down in a domU and is then xm detach'd in dom0.
Otherwise, the backend doesn't shut down correctly, since online was
set when the frontend shut down and we don't get another kick when it
becomes unset.
Steven Smith [Tue, 30 Jun 2009 11:55:47 +0000 (12:55 +0100)]
__gnttab_dma_map_page can be called from a softirq (via the network
transmit softirq, for example); therefore gnttab_copy_grant_page needs to
take gnttab_dma_lock in an interrupt-safe manner.
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
There's no point in sending lots of little packets to a copying
receiver if we can instead arrange to copy them all into a single RX
buffer. We need to copy anyway, so there's no overhead here, and this
is a little bit easier on the receiving domain's network stack.
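The coalescing idea can be sketched roughly like this, assuming packets
are plain byte ranges. struct mock_frag and mock_coalesce() are
hypothetical and do not correspond to netchannel2 code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one small packet waiting to be sent. */
struct mock_frag {
    const unsigned char *data;
    size_t len;
};

/* Copy whole packets into buf until the next one would not fit.
 * Returns the number of packets consumed; *used gets the bytes written. */
size_t mock_coalesce(const struct mock_frag *frags, size_t nr,
                     unsigned char *buf, size_t buf_len, size_t *used)
{
    size_t i;

    assert(frags != NULL && buf != NULL && used != NULL);
    *used = 0;
    for (i = 0; i < nr; i++) {
        if (frags[i].len > buf_len - *used)
            break;  /* next packet would overflow the RX buffer */
        memcpy(buf + *used, frags[i].data, frags[i].len);
        *used += frags[i].len;
    }
    return i;
}
```

The copy is the same one a copying receiver would perform per packet;
the sketch only changes where the bytes land.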
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
Ensure that packet csums are computed correctly when sending a GSO
packet to an interface which supports scatter-gather but not transmit
checksum offloads.
Signed-off-by: Steven Smith <ssmith@xensource.com>
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
[NETBACK] Try to pull a minimum of 72 bytes into the skb data area
when receiving a packet into netback. The previous number, 64, tended
to place a fragment boundary in the middle of the TCP header options
and led to unnecessary fragmentation in Windows <-> Windows
networking.
Signed-off-by: Steven Smith <ssmith@xensource.com>
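A rough illustration of why 64 bytes can split the TCP options while 72
does not. The header sizes are the standard minimum Ethernet, IPv4 and
TCP header lengths; the 12-byte option length used in the note below
(a padded timestamp option) is an assumed example, not something the
commit states. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    MOCK_ETH_HLEN = 14,  /* Ethernet header without VLAN tag */
    MOCK_IP_HLEN  = 20,  /* IPv4 header without options */
    MOCK_TCP_HLEN = 20   /* TCP header without options */
};

/* Number of bytes to pull into the linear data area: the requested
 * minimum, or the whole packet if it is shorter than that. */
size_t mock_pull_len(size_t packet_len, size_t min_pull)
{
    assert(min_pull > 0);
    return packet_len < min_pull ? packet_len : min_pull;
}

/* Nonzero if all headers, including tcp_opt_len bytes of TCP options,
 * fall inside the first 'pulled' bytes of the packet. */
int mock_headers_fit(size_t pulled, size_t tcp_opt_len)
{
    size_t hdr = MOCK_ETH_HLEN + MOCK_IP_HLEN + MOCK_TCP_HLEN + tcp_opt_len;

    return hdr <= pulled;
}
```

With 12 bytes of options the headers take 66 bytes, so
mock_headers_fit(64, 12) is false while mock_headers_fit(72, 12) is true.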
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
It is possible for a frontend to generate a TSO request which doesn't
actually need segmentation (i.e. with size < MTU). Make sure this
doesn't crash the backend.
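One way to express the check, as a sketch only: treat a request whose
payload already fits in one segment as an ordinary packet.
mock_segment_count() is a hypothetical helper, and comparing against a
per-segment size (mss) rather than the MTU is an assumption made for the
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of segments a payload needs for a given segment size.
 * A payload that already fits, or a nonsensical segment size of 0,
 * yields a single segment instead of tripping up the caller. */
size_t mock_segment_count(size_t payload_len, size_t mss)
{
    if (mss == 0 || payload_len <= mss)
        return 1;
    return (payload_len + mss - 1) / mss;  /* round up */
}
```

A backend using such a helper never divides by zero and never tries to
split a packet that needs no splitting.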
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
The Windows drivers push the network frontend to state Closed, then
Initialised, then Closed again as part of device disable. Make sure
the backend doesn't get stuck at closed.
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
Arrange that netback waits for the hotplug scripts to complete before
going to state Connected. WHQL gets quite upset if it sends packets
which don't arrive, and that can happen if our hotplug scripts are
slow and don't hook the network interface up to the bridge in time.
Steven Smith [Tue, 30 Jun 2009 11:55:48 +0000 (12:55 +0100)]
It turns out that Windows occasionally generates packets in which the
IP and TCP headers are in different fragments. Make sure that the
backends can handle this.
Steven Smith [Tue, 30 Jun 2009 11:55:47 +0000 (12:55 +0100)]
CA-27974: Fix blktap shutdown race due to improper event ordering.
Writing shutdown-done before switching device state to closed (6)
opens a remarkably small race window to fall through: The agent
removes the device directory just before the write to the 'state'
field will recreate it again. This in turn leads to xenbus failing to
remove the device, since removal is guided by directory existence.
With shutdown-done and connection state being rather independent,
trivially fixing event ordering to write shutdown-done last appears
safe but mandatory. Comment this tiny detail.
Steven Smith [Tue, 30 Jun 2009 11:55:47 +0000 (12:55 +0100)]
Close block devices when the pv drivers take over and flush the buffer cache.
- close and free the block devices in qemu when we switch to pv drivers in
the guest
- use BLKFLSBUF to flush the buffer cache, both in qemu and in blkback