linux-2.6
15 years ago netfilter: conntrack: don't deliver events for racy packets
Pablo Neira Ayuso [Mon, 16 Mar 2009 14:06:42 +0000 (15:06 +0100)] 
netfilter: conntrack: don't deliver events for racy packets

This patch skips the delivery of conntrack events if the packet
was dropped due to a race condition in the conntrack insertion.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
15 years ago tcp: make sure xmit goal size never becomes zero
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:07 +0000 (14:23 +0000)] 
tcp: make sure xmit goal size never becomes zero

It's not too likely to happen; it would basically require crafted
packets (they must hit the max guard in tcp_bound_to_half_wnd()).
It seems that nothing that bad would happen, as tcp_mems and the
congestion window prevent a runaway at some point from hurting
all too much (though I'm not sure what all those zero-sized
segments we would generate would do in the write queue).
Preventing it regardless is certainly the best way to go.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: cache result of earlier divides when mss-aligning things
Ilpo Järvinen [Sat, 14 Mar 2009 22:45:16 +0000 (22:45 +0000)] 
tcp: cache result of earlier divides when mss-aligning things

The result is very unlikely to change often, so we hardly
need to divide again after doing that once for a
connection. Yet, if a divide still becomes necessary, we
detect that, do the right thing, and again settle into the
non-divide state. This takes the u16 space which was previously
taken by the plain xmit_size_goal.
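
The caching idea can be sketched roughly as follows. This is an illustration only, not the code from this patch: struct goal_cache, aligned_goal() and the divides counter are hypothetical names, and mss_now is assumed to be nonzero.

```c
#include <stdint.h>

/* Hypothetical per-connection cache (names are illustrative only). */
struct goal_cache {
    uint16_t segs;      /* cached result of size_goal / mss_now */
    unsigned divides;   /* counts how many divides were really performed */
};

/*
 * Round size_goal down to a multiple of mss_now, dividing only when the
 * cached segment count no longer fits.
 * Assumes mss_now != 0 and that size_goal / mss_now fits in 16 bits.
 */
static uint32_t aligned_goal(struct goal_cache *c, uint32_t mss_now,
                             uint32_t size_goal)
{
    uint32_t cached = (uint32_t)c->segs * mss_now;

    /* Cache hit: cached <= size_goal < cached + mss_now, no divide needed. */
    if (cached <= size_goal && size_goal - cached < mss_now)
        return cached;

    /* Cache miss: divide once and settle back into the non-divide state. */
    c->segs = (uint16_t)(size_goal / mss_now);
    c->divides++;
    return (uint32_t)c->segs * mss_now;
}
```

Calling aligned_goal() repeatedly with the same mss_now and size_goal divides only on the first call; a changed mss_now triggers one more divide.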

This should take care of part of the tso vs non-tso difference
we found earlier.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: simplify tcp_current_mss
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:05 +0000 (14:23 +0000)] 
tcp: simplify tcp_current_mss

There's very little need for most of the callsites to get
tp->xmit_goal_size updated. That costs us a divide as is,
so slice the function in two. Also, the only users of
tp->xmit_goal_size are directly behind tcp_current_mss(),
so there's no need to store that variable in tcp_sock
at all! Dropping xmit_goal_size currently leaves a 16-bit
hole, and some reorganization would again be necessary to
change that (but I'm aiming to fill that hole with a u16
xmit_goal_size_segs to cache the result of the remaining
divide and address that tso-on regression).

Bring xmit_goal_size parts into tcp.c

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: don't check mtu probe completion in the loop
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:04 +0000 (14:23 +0000)] 
tcp: don't check mtu probe completion in the loop

It seems that no variables clash such that we couldn't do
the check just once later on. Therefore move it.

Also kill a dead-obvious comment and a dead argument, and add
unlikely() since this mtu probe does not happen too often.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: consolidate paws check
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:03 +0000 (14:23 +0000)] 
tcp: consolidate paws check

Wow, it was quite tricky to merge that stream of negations
but I think I finally got it right:

check & replace_ts_recent:
(s32)(rcv_tsval - ts_recent) >= 0                  => 0
(s32)(ts_recent - rcv_tsval) <= 0                  => 0

discard:
(s32)(ts_recent - rcv_tsval)  > TCP_PAWS_WINDOW    => 1
(s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW    => 0

I toggled the return values of tcp_paws_check around since
the old encoding added yet-another negation making tracking
of truth-values really complicated.
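
As a rough illustration of the table above, both cases can be expressed through one signed-difference helper. This is a sketch under assumptions: paws_ok() is a hypothetical name, PAWS_WINDOW stands in for TCP_PAWS_WINDOW with an assumed value of 1, and a true result means "timestamp acceptable".

```c
#include <stdbool.h>
#include <stdint.h>

#define PAWS_WINDOW 1  /* assumed stand-in for TCP_PAWS_WINDOW */

/*
 * True when rcv_tsval is not older than ts_recent by more than paws_win.
 * The unsigned subtraction wraps modulo 2^32; the cast to int32_t then
 * reads the difference as signed (two's complement on common compilers),
 * which keeps the comparison valid across timestamp wraparound.
 */
static bool paws_ok(uint32_t ts_recent, uint32_t rcv_tsval, int32_t paws_win)
{
    return (int32_t)(ts_recent - rcv_tsval) <= paws_win;
}
```

With this helper, "check & replace ts_recent" corresponds to paws_ok(ts_recent, rcv_tsval, 0), and "discard" to !paws_ok(ts_recent, rcv_tsval, PAWS_WINDOW).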

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: kill dead end_seq variable in clean_rtx_queue
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:02 +0000 (14:23 +0000)] 
tcp: kill dead end_seq variable in clean_rtx_queue

I've already forgotten what this was necessary for; anyway,
it's no longer used (if it ever was).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: remove pointless .dsack/.num_sacks code
Ilpo Järvinen [Sat, 14 Mar 2009 14:23:01 +0000 (14:23 +0000)] 
tcp: remove pointless .dsack/.num_sacks code

In the pure assignment case, the earlier zeroing is
still in effect.

David S. Miller raised the concern that the ifs might be there to
avoid dirtying cachelines. I came to these conclusions:

> We'll dirty it anyway (now that I check), the first "real" statement
> in tcp_rcv_established is:
>
>       tp->rx_opt.saw_tstamp = 0;
>
> ...that'll land on the same dword. :-/
>
> I suppose the blocks are there just because they had more complexity
> inside when they had to calculate the eff_sacks too (maybe it would
> have been better to just remove them in that drop-patch so you would
> have had less head-ache :-)).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)"
françois romieu [Sun, 15 Mar 2009 01:10:50 +0000 (01:10 +0000)] 
r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)"

It fails on the following systems:
- RTL8169sc/8110sc (XID 18000000)
  reported by Tim Durack <tdurack@gmail.com> (x86)
- RTL8169sb/8110sb (XID 10000000)
  reported by Mikael Pettersson <mikpe@it.uu.se> (ARM)

The patch appeared to work on x86 for the following systems:
RTL8169sb/8110sb 10000000 PCI   (EXT)
RTL8110s         04000000 PCI   (EXT)
RTL8102e         24a00000 PCI-E (LOM)
RTL8168c/8111c   3c2000c0 PCI-E (LOM)
RTL8168b/8111b   38000000 PCI-E (LOM)
RTL8168b/8111b   38000000 PCI-E (EXT)

The patch exposes two problems:
1) while not completely wrong, mac addresses are not read correctly
   from the EEPROM
2) the MAC address registers are not correctly set

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago r8169: use hardware auto-padding.
françois romieu [Sun, 15 Mar 2009 01:09:54 +0000 (01:09 +0000)] 
r8169: use hardware auto-padding.

It shortens the code and fixes the current pci_unmap leak with
padded skb reported by Dave Jones.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago pkt_sched: Change misleading code in class delete.
Jarek Poplawski [Mon, 16 Mar 2009 03:00:19 +0000 (20:00 -0700)] 
pkt_sched: Change misleading code in class delete.

While looking for a possible cause of the bugzilla report on an HTB oops:
http://bugzilla.kernel.org/show_bug.cgi?id=12858
I found the code in htb_delete calling htb_destroy_class on zero
refcount is very misleading: it can suggest this is a common path, and
destroy is called under sch_tree_lock. Actually, this can never happen
like this because before deletion cops->get() is done, and after
delete a class is still used by tclass_notify. The class destroy is
always called from cops->put(), so without sch_tree_lock.

This doesn't mean much now (since 2.6.27) because all vulnerable calls
were moved from htb_destroy_class to htb_delete, but there was a bug
in older kernels. The same change is done for other classful scheds,
which, it seems, didn't have similar locking problems here.

Reported-by: m0sia <m0sia@m0sia.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago net: reorder fields of struct socket
Eric Dumazet [Mon, 16 Mar 2009 02:59:13 +0000 (19:59 -0700)] 
net: reorder fields of struct socket

On x86_64, it's rather unfortunate that the "wait_queue_head_t wait"
field of "struct socket" spans two cache lines (assuming a 64-byte
cache line in current cpus)

offsetof(struct socket, wait)=0x30
sizeof(wait_queue_head_t)=0x18

This might explain why Kenny Chang noticed that his multicast workload
was performing badly with 64-bit kernels, since more cache line
ping-pongs were involved.

This little patch moves the "wait" field next to "fasync_list" so that
both fields share a single cache line, to speed up sock_def_readable()
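
The arithmetic behind the "spans two cache lines" observation can be checked with a small helper. This is only a sketch; spans_cache_lines() is a hypothetical name, and the numbers passed in are the offsetof/sizeof values quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * True when a field of `size` bytes starting at `offset` touches more than
 * one cache line of `line_size` bytes. Assumes size > 0 and line_size > 0.
 */
static bool spans_cache_lines(size_t offset, size_t size, size_t line_size)
{
    /* Compare the cache line index of the first and the last byte. */
    return offset / line_size != (offset + size - 1) / line_size;
}
```

For offset 0x30 and size 0x18 the last byte sits at 0x47, past the 0x40 boundary, so spans_cache_lines(0x30, 0x18, 64) is true.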

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: remove ASPM L0s workaround
Alexander Duyck [Sun, 15 Mar 2009 05:26:40 +0000 (22:26 -0700)] 
igb: remove ASPM L0s workaround

The L0s workaround should be moved into a pci quirk and so it is not
necessary in the driver.  This update removes the L0s workaround from the
igb driver.

This was the second half of the PCI quirk patch that Matthew Wilcox did
not pick up when he picked up the quirk patch.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: update version to 4.0.30
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:06 +0000 (14:52 +0000)] 
netxen: update version to 4.0.30

To mark all features and bugfixes submitted since 4.0.11.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: add receive side scaling (rss) support
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:05 +0000 (14:52 +0000)] 
netxen: add receive side scaling (rss) support

This patch enables the load balancing capability of firmware
and hardware to spray traffic into different cpus through
separate rx msix interrupts.

The feature is being enabled for NX3031; NX2031 (old) will be
enabled later. This depends on msi-x; compatibility with msi and
legacy interrupts is maintained by enabling a single rx ring.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: remove old lro code
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:04 +0000 (14:52 +0000)] 
netxen: remove old lro code

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: sanitize variable names
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:03 +0000 (14:52 +0000)] 
netxen: sanitize variable names

o remove max_ prefix from ring sizes, since they don't really
  represent max possible sizes.
o cleanup naming of rx ring types (normal, jumbo, lro).
o simplify logic to choose rx ring size, gig ports get half
  rx ring of 10 gig ports.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: add suspend resume support
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:02 +0000 (14:52 +0000)] 
netxen: add suspend resume support

Detach network interface on PCI suspend and recreate hardware
context after resume.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: fix endianness in serial number
Dhananjay Phadke [Fri, 13 Mar 2009 14:52:01 +0000 (14:52 +0000)] 
netxen: fix endianness in serial number

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Add documentation for the driver
PJ Waskiewicz [Fri, 13 Mar 2009 22:15:54 +0000 (22:15 +0000)] 
ixgbe: Add documentation for the driver

Documentation for the ixgbe driver in the kernel docs area is missing.
This adds that documentation.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Cleanup some whitespace issues, fixup and add some comments
Jesse Brandeburg [Fri, 13 Mar 2009 22:15:31 +0000 (22:15 +0000)] 
ixgbe: Cleanup some whitespace issues, fixup and add some comments

Cleanup a bit of whitespace, add some function header comments, and fix a
few comments around the driver.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Two small fixes for 82599 when bringing the device down and for WoL
PJ Waskiewicz [Fri, 13 Mar 2009 22:15:10 +0000 (22:15 +0000)] 
ixgbe: Two small fixes for 82599 when bringing the device down and for WoL

The Tx DMA unit should be disabled when bringing the device down.  Also,
the KX4 device with 82599 supports WoL, so we should clear the Wake Up
Status (WUS) after a PCIe slot reset.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Add a few safety nets for register writes and descriptor cleanups
Jesse Brandeburg [Fri, 13 Mar 2009 22:14:50 +0000 (22:14 +0000)] 
ixgbe: Add a few safety nets for register writes and descriptor cleanups

There are possible times that a driver may fail to completely initialize,
due to a buggy platform or a buggy kernel.  In those cases, we'd rather
fail gracefully instead of a panic.  Add a few safety checks to some
critical paths to try and prevent a panic in these corner-case situations.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Cleanup on the Rx init path
Jesse Brandeburg [Fri, 13 Mar 2009 22:14:30 +0000 (22:14 +0000)] 
ixgbe: Cleanup on the Rx init path

This cleans up the following pieces of the Rx initialization path:

- Enable the ECC memory fault interrupt in OTHER causes.

- Fix an 82598 initialization of RDRXCTL when depending on RSS and VMDq to
be enabled.  We don't need these features enabled to safely set the MVMEN
bit to allow multiple SRRCTL register mappings into the RXDCTL registers.

- Fix the RSS initialization path to not stomp on DCB accidentally.  When
configuring the MRQC (multiple Rx queue control) register, we want to make
sure we only OR in features as necessary, instead of full assignment.
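
The difference between full assignment and OR-ing in a feature can be illustrated as below. This is a sketch only: the HYP_ bit values and function names are hypothetical and do not reflect the real MRQC register layout.

```c
#include <stdint.h>

/* Hypothetical feature bits, not the real MRQC layout. */
#define HYP_MRQC_RSS_EN 0x1u
#define HYP_MRQC_DCB_EN 0x2u

/* Full assignment: whatever was configured before (e.g. DCB) is lost. */
static uint32_t mrqc_assign_rss(uint32_t mrqc)
{
    (void)mrqc;  /* previous contents are deliberately ignored */
    return HYP_MRQC_RSS_EN;
}

/* OR-in: RSS is added while previously set features stay intact. */
static uint32_t mrqc_or_rss(uint32_t mrqc)
{
    return mrqc | HYP_MRQC_RSS_EN;
}
```

Starting from a value with only the DCB bit set, mrqc_assign_rss() returns just the RSS bit, while mrqc_or_rss() returns both bits.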

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Fix the Tx clean logic to return proper status
Jesse Brandeburg [Fri, 13 Mar 2009 22:14:10 +0000 (22:14 +0000)] 
ixgbe: Fix the Tx clean logic to return proper status

The Tx accounting when cleaning during NAPI was not completely correct.
We should use the work_limit to determine when to finish cleaning, and
use the same to return the cleaned status.  Running without this can
cause the NAPI clean for this Tx to get stuck in a scheduling loop, and
can result in Tx not getting cleaned, ending with a Tx hang and device
reset.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: fix bug with napi add before request_irq
Jesse Brandeburg [Fri, 13 Mar 2009 22:13:49 +0000 (22:13 +0000)] 
ixgbe: fix bug with napi add before request_irq

Occasionally if the driver was loaded in a system that
didn't support MSI-X or MSI and was on a shared interrupt,
the driver would then panic in NAPI on the first shared
interrupt because we hadn't called napi_add yet.

Solution: call napi_add before calling request_irq

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Fix interrupt configuration for 82599
Jesse Brandeburg [Fri, 13 Mar 2009 22:13:28 +0000 (22:13 +0000)] 
ixgbe: Fix interrupt configuration for 82599

The interrupt models using EITR have changed in 82599.  The way the register
is laid out, the change is transparent to some of the existing code.
However, some of it isn't.  This patch fixes all the cases where EITR
handling is different than 82598.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Disable DROP_EN for Rx queues
PJ Waskiewicz [Fri, 13 Mar 2009 22:13:08 +0000 (22:13 +0000)] 
ixgbe: Disable DROP_EN for Rx queues

82599 mistakenly enabled drop on Rx queues in the packet buffer.  The
default mode should be store-and-forward from the FIFO.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Fix an accounting problem when the Rx FIFO is full
PJ Waskiewicz [Fri, 13 Mar 2009 22:12:48 +0000 (22:12 +0000)] 
ixgbe: Fix an accounting problem when the Rx FIFO is full

The rx_no_dma_resources counter reported by ethtool -S ethX is not
counting correctly.  In 82599, the queue mappings for the counters need
to be mapped properly, and accounted for properly.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ixgbe: Fix get_supported_physical_layer() due to new 82599 PHY types
PJ Waskiewicz [Fri, 13 Mar 2009 22:12:29 +0000 (22:12 +0000)] 
ixgbe: Fix get_supported_physical_layer() due to new 82599 PHY types

A purely cosmetic change.  Report which physical layer is present, instead
of PHY unknown.  82599 added new PHY types for the SFP+ devices, and this
function was not updated to match.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: add support for 82576 quad copper adapter
Alexander Duyck [Fri, 13 Mar 2009 20:42:35 +0000 (20:42 +0000)] 
igb: add support for 82576 quad copper adapter

Add support for 82576 copper adapter and necessary code to restrict wol for
quad port adapter to first port.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: add support for another dual port 82576 non-security nic
Alexander Duyck [Fri, 13 Mar 2009 20:42:15 +0000 (20:42 +0000)] 
igb: add support for another dual port 82576 non-security nic

Adding device id to support 82576NS dual port copper
NIC.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: correct typo that was setting vfta mask to 1
Alexander Duyck [Fri, 13 Mar 2009 20:41:55 +0000 (20:41 +0000)] 
igb: correct typo that was setting vfta mask to 1

This patch corrects a typo that was doing a less than comparison instead of
a left shift due to the fact that I didn't get enough <'s in there.

This resolves an issue in which vlans were not functioning correctly.
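
The effect of the missing '<' can be shown with a small sketch. The function names are hypothetical, and using the low five bits of the VLAN id as the bit index is an assumption made for illustration.

```c
#include <stdint.h>

/* Typo: '<' is a comparison, so the "mask" can only ever be 0 or 1. */
static uint32_t vfta_mask_typo(uint32_t vid)
{
    return 1u < (vid & 0x1Fu);
}

/* Intended: '<<' shifts a single bit into the position selected by vid. */
static uint32_t vfta_mask_fixed(uint32_t vid)
{
    return 1u << (vid & 0x1Fu);
}
```

For VLAN id 100 the bit index is 4: the typo variant yields 1, while the fixed variant yields 16 (bit 4 set).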

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: add PF to pool
Alexander Duyck [Fri, 13 Mar 2009 20:41:37 +0000 (20:41 +0000)] 
igb: add PF to pool

Add PF to pool if adding a VLVF register value and the VFTA bit is
already set.

This patch addresses the unlikely situation that the PF adds a vlan
entry when the vlvf is full, and a vf later adds the vlan to the vlvf.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: support wol on second port
Alexander Duyck [Fri, 13 Mar 2009 20:41:17 +0000 (20:41 +0000)] 
igb: support wol on second port

We need to support wol on the second port for situations such as when the
lan ports are on the motherboard itself.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: resolve warning of unused adapter struct
Alexander Duyck [Fri, 13 Mar 2009 20:40:58 +0000 (20:40 +0000)] 
igb: resolve warning of unused adapter struct

If DCA is undefined then the adapter struct becomes unnecessary.  To
resolve this issue the DCA calls can simply make a call to the adapter
struct through the rx_ring adapter struct member.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: remove netif running call from igb_poll
Alexander Duyck [Fri, 13 Mar 2009 20:40:38 +0000 (20:40 +0000)] 
igb: remove netif running call from igb_poll

The netif_running check in igb poll is a holdover from the use of fake
netdevs to use multiple queues with NAPI prior to 2.6.24.  It is no longer
necessary to have the call there and it currently can cause errors if
work_done == budget.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago igb: switch to new dca API
Maciej Sosnowski [Fri, 13 Mar 2009 20:40:21 +0000 (20:40 +0000)] 
igb: switch to new dca API

With the new DCA API, the driver should use dca3_get_tag() instead of
the obsolete dca_get_tag().

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago netxen: remove old flash check.
Dhananjay Phadke [Fri, 6 Mar 2009 14:52:12 +0000 (14:52 +0000)] 
netxen: remove old flash check.

Remove flash size check which made sense only for ancient
boards with 1MB flash. The check is based on values read
from specific locations and fails with firmware size changes.

This prevents the driver from getting the right mac addresses.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ppp: ppp_mp_explode() redesign
Gabriele Paoloni [Fri, 13 Mar 2009 23:09:12 +0000 (16:09 -0700)] 
ppp: ppp_mp_explode() redesign

I found the PPP subsystem to not work properly when connecting channels
with different speeds to the same bundle.

Problem Description:

As the "ppp_mp_explode" function fragments the sk_buff buffer evenly
among the PPP channels that are connected to a certain PPP unit to
make up a bundle, if we are transmitting using an upper layer protocol
that requires an Ack before sending the next packet (like TCP/IP for
example), we will have a bandwidth bottleneck on the slowest channel
of the bundle.

Let's clarify by an example. Let's consider a scenario where we have
two PPP links making up a bundle: a slow link (10KB/sec) and a fast
link (1000KB/sec) working at the best (full bandwidth). On the top we
have a TCP/IP stack sending a 1000 Bytes sk_buff buffer down to the
PPP subsystem. The "ppp_mp_explode" function will divide the buffer in
two fragments of 500B each (we are neglecting all the headers, crc,
flags, etc.). Before the TCP/IP stack sends out the next buffer, it
will have to wait for the ACK response from the remote peer, so it
will have to wait for both fragments to have been sent over the two
PPP links, received by the remote peer and reconstructed. The
resulting behaviour is that, rather than having a bundle working
@1010KB/sec (the sum of the channels' bandwidths), we'll have a bundle
working @20KB/sec (double the slowest channel's bandwidth).

Problem Solution:

The problem has been solved by redesigning the "ppp_mp_explode"
function in such a way to make it split the sk_buff buffer according
to the speeds of the underlying PPP channels (the speeds of the serial
interfaces respectively attached to the PPP channels). Referring to
the above example, the redesigned "ppp_mp_explode" function will now
divide the 1000 Bytes buffer into two fragments whose sizes are set
according to the speeds of the channels where they are going to be
sent on (e.g. 10 Bytes on the 10KB/sec channel and 990 Bytes on the
1000KB/sec channel).  The reworked function gives the same
performance as the original one in optimal working conditions (i.e. a
bundle made up of PPP links all working at the same speed), while
greatly improving performance on bundles made up of channels
working at different speeds.
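
A speed-proportional split can be sketched as below. This is not the code from the patch: split_by_speed() is a hypothetical helper, it ignores headers and per-channel MTU limits, and it hands any rounding remainder to the fastest channel, which is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split `total` bytes across `n` channels in proportion to their speeds.
 * frag[i] receives the fragment length for channel i. Integer division
 * rounds down, so leftover bytes are given to the fastest channel.
 * If every speed is 0, the first channel receives everything.
 */
static void split_by_speed(size_t total, const unsigned int *speed,
                           size_t n, size_t *frag)
{
    uint64_t sum = 0;
    size_t assigned = 0;
    size_t fastest = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += speed[i];
        if (speed[i] > speed[fastest])
            fastest = i;
    }

    for (i = 0; i < n; i++) {
        frag[i] = sum ? (size_t)((uint64_t)total * speed[i] / sum) : 0;
        assigned += frag[i];
    }

    if (n > 0)
        frag[fastest] += total - assigned;
}
```

With speeds of 10 and 1000 and a 1000-byte buffer, this sketch produces fragments of 9 and 991 bytes, close to the rounded 10/990 figures in the example; two equal-speed channels get 500 bytes each, matching the even split of the original function.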

Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tcp: '< 0' test on unsigned
Roel Kluin [Fri, 13 Mar 2009 23:05:14 +0000 (16:05 -0700)] 
tcp: '< 0' test on unsigned

promote 'cnt' to size_t, to match 'len'.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago x25: '< 0' and '>= 0' test on unsigned
Roel Kluin [Fri, 13 Mar 2009 23:04:12 +0000 (16:04 -0700)] 
x25: '< 0' and '>= 0' test on unsigned

skb->len is an unsigned int, so the test in x25_rx_call_request() always
evaluates to true.

len in x25_sendmsg() is unsigned as well, so -ERRORS returned by
x25_output() are not noticed.
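
The class of bug can be reproduced in isolation. In this sketch, hypothetical_output() is a made-up stand-in for a callee that returns a negative errno-style value on failure; none of these names come from the x25 code.

```c
#include <stddef.h>

/* Made-up callee: negative value on failure, a length otherwise. */
static int hypothetical_output(int fail)
{
    return fail ? -5 : 42;
}

/* Buggy pattern: the result is stored in an unsigned variable. */
static int error_noticed_unsigned(int fail)
{
    size_t len = (size_t)hypothetical_output(fail);

    /* An unsigned value is never below zero, so this is always 0. */
    return len < 0;
}

/* Fixed pattern: keep the result signed until it has been checked. */
static int error_noticed_signed(int fail)
{
    int rc = hypothetical_output(fail);

    return rc < 0;
}
```

Compilers may warn that the comparison in error_noticed_unsigned() is always false, which is exactly the symptom this patch addresses.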

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ipv4: arp announce, arp_proxy and windows ip conflict verification
Denys Fedoryshchenko [Fri, 13 Mar 2009 23:02:07 +0000 (16:02 -0700)] 
ipv4: arp announce, arp_proxy and windows ip conflict verification

Windows (XP at least) hosts with a configured static ip perform
address conflict detection on boot, as defined in RFC3927.
Here is a quote of the important information:

"
An ARP announcement is identical to the ARP Probe described above,
except that now the sender and target IP addresses are both set
to the host's newly selected IPv4 address.
"

But at the same time this conflicts with RFC5227:
"
The 'sender IP address' field MUST be set to all zeroes; this is to avoid
polluting ARP caches in other hosts on the same link in the case
where the address turns out to be already in use by another host.
"

When ARP proxy is configured, it must not answer in either case, because
both are address conflict verification. For Windows, an answer just
causes a false "ip conflict" to be detected. There is already code for
RFC5227, so we trivially also check whether source ip == target ip.
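
The combined condition can be sketched as a small predicate. is_conflict_probe() is a hypothetical name and the addresses are plain 32-bit values; this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when an ARP request looks like address conflict verification:
 * either the sender IP is all zeroes (the RFC5227 probe form) or the
 * sender IP equals the target IP (the announcement form quoted above).
 * A proxy would skip answering such requests.
 */
static bool is_conflict_probe(uint32_t sip, uint32_t tip)
{
    return sip == 0 || sip == tip;
}
```

An ordinary request, where the sender asks about a different address, fails both tests and is still answered.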

Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago mv643xx_eth: fix unicast address filter corruption on mtu change
Lennert Buytenhek [Fri, 13 Mar 2009 22:48:02 +0000 (15:48 -0700)] 
mv643xx_eth: fix unicast address filter corruption on mtu change

When mv643xx_eth_open() is called to up an interface, port_start()
will first re-program the unicast address filter, and then
re-initialise the PORT_CONFIG register, but that will disable unicast
promiscuous mode if it was enabled by the unicast address filter setup.

This isn't a problem on ifconfig up, as ->set_rx_mode() will be called
shortly afterwards which will program the filters again, but it does
trigger when changing the MTU, which calls mv643xx_eth_stop() and then
mv643xx_eth_open() by hand to repopulate the receive rings with skbuffs
of the new size.

Swap the initialisation of the PORT_START register and the call to
the unicast filter setup function to fix this.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago tulip: Fix for MTU problems with 802.1q tagged frames
Tomasz Lemiech [Fri, 13 Mar 2009 22:43:38 +0000 (15:43 -0700)] 
tulip: Fix for MTU problems with 802.1q tagged frames

The original patch was submitted last year but wasn't discussed or applied
because of missing maintainer's CCs. I only fixed some formatting errors,
but as I saw tulip is very badly formatted and needs further work.

Original description:
This patch fixes MTU problem, which occurs when using 802.1q VLANs. We
should allow receiving frames of up to 1518 bytes in length, instead of
1514.

Based on patch written by Ben McKeegan for 2.4.x kernels. It is archived
at http://www.candelatech.com/~greear/vlan/howto.html#tulip
I've adjusted a few things to make it apply on 2.6.x kernels.

Tested on D-Link DFE-570TX quad-fastethernet card.

Signed-off-by: Tomasz Lemiech <szpajder@staszic.waw.pl>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago phylib: convert state_queue work to delayed_work
Marcin Slusarz [Fri, 13 Mar 2009 22:41:19 +0000 (15:41 -0700)] 
phylib: convert state_queue work to delayed_work

It closes a race in phy_stop_machine when reprogramming of phy_timer
(from phy_state_machine) happens between del_timer_sync and cancel_work_sync.

Without this change it could lead to a crash if phy_device were freed after
phy_stop_machine (the timer would fire and schedule freed work).

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago xfrm: Fix xfrm_state_find() wrt. wildcard source address.
David S. Miller [Fri, 13 Mar 2009 21:22:40 +0000 (14:22 -0700)] 
xfrm: Fix xfrm_state_find() wrt. wildcard source address.

The change to make xfrm_state objects hash on source address
broke the case where such source addresses are wildcarded.

Fix this by doing a two phase lookup, first with fully specified
source address, next using saddr wildcarded.

Reported-by: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago bmac: remove unused variable bp in bmac_misc_intr()
Pavel Roskin [Fri, 13 Mar 2009 21:17:16 +0000 (14:17 -0700)] 
bmac: remove unused variable bp in bmac_misc_intr()

From: Pavel Roskin <proski@gnu.org>

Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago ehea: fix circular locking problem
Jan-Bernd Themann [Fri, 13 Mar 2009 20:50:40 +0000 (13:50 -0700)] 
ehea: fix circular locking problem

This patch fixes the circular locking problem by changing the locking strategy
concerning the logging of firmware handles.

Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago emac: Fix clock control for 405EX and 405EXr chips
Benjamin Herrenschmidt [Fri, 13 Mar 2009 20:48:46 +0000 (13:48 -0700)] 
emac: Fix clock control for 405EX and 405EXr chips

The EMAC variant in the 405EX and 405EXr chips needs the "440EP" type clock
control workaround to avoid lockups of the Rx side during reset.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Felix Radensky <felix@embedded-sol.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoixgbe: fix multiple unicast address support
Chris Leech [Tue, 10 Mar 2009 16:00:24 +0000 (16:00 +0000)] 
ixgbe: fix multiple unicast address support

Multiple unicast address support appears to have been broken with the
change to support net_device_ops.  This is a regression from 2.6.28 to 2.6.29.

I'm not 100% sure whether ndo_set_multicast_list can be NULL after this
or not.  If ndo_set_rx_mode is set everything _should_ be using it.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agovia-velocity: Fix DMA mapping length errors on transmit.
Dave Jones [Fri, 13 Mar 2009 20:37:46 +0000 (13:37 -0700)] 
via-velocity: Fix DMA mapping length errors on transmit.

From: Dave Jones <davej@redhat.com>

The dma-debug changes caught that this driver uses the
wrong DMA mapping length when skb_padto() does something.
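A minimal sketch of the idea, assuming the fix amounts to using one
length, computed once, for the padded frame. tx_map_len() and
MIN_FRAME_LEN are hypothetical names; the kernel constant for the
60 byte minimum is ETH_ZLEN.

```c
#define MIN_FRAME_LEN 60u	/* ETH_ZLEN in the kernel */

/*
 * skb_padto() zero-fills a short buffer up to the minimum frame length
 * but does not change skb->len, so the length handed to the DMA API has
 * to be clamped separately. Computing it in one place keeps the map and
 * unmap calls in agreement.
 */
static unsigned int tx_map_len(unsigned int skb_len)
{
	return skb_len < MIN_FRAME_LEN ? MIN_FRAME_LEN : skb_len;
}
```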

With suggestions from Eric Dumazet.

Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agomacvlan: Deterministic ingress packet delivery
Eric Biederman [Fri, 13 Mar 2009 20:16:13 +0000 (13:16 -0700)] 
macvlan: Deterministic ingress packet delivery

Changing the mac address when a macvlan device is up will leave the
device on the wrong hash chain making it impossible to receive
packets.

There is no checking of the mac address set on the macvlan, which allows
a misconfiguration to grab packets from the underlying device or
another macvlan.

To resolve these problems I update the hash table of macvlans when the
mac address of a macvlan changes, and when updating the hash table
I verify that the new mac address is usable.

The result is well defined and predictable, if not perfect, handling of
macvlan mac addresses.

To keep the code clear I have created a set of hash table maintenance
helpers in macvlan so I am not open coding the hash function and the logic
needed to update the hash table all over the place.
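The rehash-on-address-change idea looks roughly like the userspace
sketch below. All names (struct vlan_dev, hash_add(), hash_del(),
change_addr()) are hypothetical, and the "usable" check is reduced to
"not already owned by another device"; the driver's own helpers and
checks may differ.

```c
#include <stddef.h>
#include <string.h>

#define MAC_LEN   6
#define HASH_SIZE 256

struct vlan_dev {
	unsigned char addr[MAC_LEN];
	struct vlan_dev *next;		/* hash chain link */
};

static struct vlan_dev *hash_tbl[HASH_SIZE];

/* Assumed hash: the last byte of the address. */
static unsigned int hash_fn(const unsigned char *addr)
{
	return addr[MAC_LEN - 1];
}

static struct vlan_dev *hash_lookup(const unsigned char *addr)
{
	struct vlan_dev *d;

	for (d = hash_tbl[hash_fn(addr)]; d; d = d->next)
		if (memcmp(d->addr, addr, MAC_LEN) == 0)
			return d;
	return NULL;
}

static void hash_add(struct vlan_dev *dev)
{
	unsigned int h = hash_fn(dev->addr);

	dev->next = hash_tbl[h];
	hash_tbl[h] = dev;
}

static void hash_del(struct vlan_dev *dev)
{
	struct vlan_dev **pp = &hash_tbl[hash_fn(dev->addr)];

	while (*pp && *pp != dev)
		pp = &(*pp)->next;
	if (*pp)
		*pp = dev->next;
}

/* Returns 0 on success, -1 if another device already owns the address. */
static int change_addr(struct vlan_dev *dev, const unsigned char *addr)
{
	struct vlan_dev *other = hash_lookup(addr);

	if (other && other != dev)
		return -1;

	hash_del(dev);			/* unlink using the old address */
	memcpy(dev->addr, addr, MAC_LEN);
	hash_add(dev);			/* relink on the new chain */
	return 0;
}
```

Unlinking before the address is overwritten matters: hash_del() finds
the chain from the old address, so copying first would search the wrong
bucket and leave the device stranded on its old chain.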

Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agomacvlan: Support creating macvlans from macvlans
Eric Biederman [Fri, 13 Mar 2009 20:15:37 +0000 (13:15 -0700)] 
macvlan: Support creating macvlans from macvlans

When running in a network namespace whose only link to
the outside world is a macvlan device, not being
able to create another macvlan is a real pain.

So modify macvlan creation to automatically forward the creation
of a macvlan on a macvlan so that it becomes the creation
of a macvlan on the underlying network device.

Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosmsc911x: improve EEPROM loading timeout logic in open
Steve Glendinning [Wed, 4 Mar 2009 07:33:25 +0000 (07:33 +0000)] 
smsc911x: improve EEPROM loading timeout logic in open

This patch from Juha Leppanen suppresses a false warning if the eeprom
load succeeds on the very last attempt.

Juha> In function smsc911x_open smsc911x_reg_read+udelay can be run 50
Juha> times with timeout reaching -1, and the following if statement
Juha> does not catch the timeout and no warning is issued. Also if the
Juha> 50th smsc911x_reg_read is GOOD, loop is exited with timeout as 0
Juha> and bogus warning issued.  Replace testing order and --timeout
Juha> instead of timeout-- and now max 50 smsc911x_reg_read's are done,
Juha> with max 49 udelays.
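The two loop shapes can be compared in a small userspace simulation.
This is a sketch under assumptions: eeprom_busy() is a hypothetical
stand-in for the busy-bit test on smsc911x_reg_read(), and the udelay()
calls are left out.

```c
#include <stdbool.h>

/* Hypothetical register model: "busy" for the first busy_reads reads. */
static int busy_reads;
static int reads_done;

static bool eeprom_busy(void)
{
	return ++reads_done <= busy_reads;
}

/* Old order: timeout-- is tested before the register is read. */
static bool old_wait_warns(int busy)
{
	int timeout = 50;

	busy_reads = busy;
	reads_done = 0;
	while (timeout-- && eeprom_busy())
		;	/* udelay(10) in the driver */
	return timeout == 0;	/* true means the warning is printed */
}

/* New order: read the register first, then pre-decrement. */
static bool new_wait_warns(int busy)
{
	int timeout = 50;

	busy_reads = busy;
	reads_done = 0;
	while (eeprom_busy() && --timeout)
		;	/* udelay(10) in the driver */
	return timeout == 0;
}
```

With 49 busy reads followed by a good one, old_wait_warns() reports a
warning and new_wait_warns() does not. With a register that never
becomes ready, the old loop leaves timeout at -1 and stays silent,
while the new loop stops after 50 reads with timeout at 0 and warns.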

Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosmsc911x: check for FFWD success before checking for timeout
Steve Glendinning [Wed, 4 Mar 2009 07:33:24 +0000 (07:33 +0000)] 
smsc911x: check for FFWD success before checking for timeout

This patch from Juha Leppanen suppresses a false warning if a fast
forward operation succeeds on the very last attempt.

Juha> If smsc911x_reg_read loop is executed 500 times, timeout reaches 0
Juha> and the 500th smsc911x_reg_read result in val is ignored. If
Juha> testing order is changed, then val is checked first. The 500th
Juha> reg_read might be GOOD, why ignore it!
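The effect of the test order in a do/while loop can be checked with the
sketch below. ffwd_pending() is a hypothetical stand-in for testing the
fast forward bit in the value returned by smsc911x_reg_read(); the loop
shape is an assumption based on the description above.

```c
#include <stdbool.h>

/* Hypothetical register model: bit stays set for ffwd_busy_reads reads. */
static int ffwd_busy_reads;
static int ffwd_reads;

static bool ffwd_pending(void)
{
	return ++ffwd_reads <= ffwd_busy_reads;
}

/* Old order: the timeout is tested first, so the last read is ignored. */
static bool old_ffwd_warns(int busy)
{
	int timeout = 500;
	bool val;

	ffwd_busy_reads = busy;
	ffwd_reads = 0;
	do {
		val = ffwd_pending();
	} while (--timeout && val);
	return timeout == 0;	/* true means the warning is printed */
}

/* New order: val is tested first, so a good last read ends the loop. */
static bool new_ffwd_warns(int busy)
{
	int timeout = 500;
	bool val;

	ffwd_busy_reads = busy;
	ffwd_reads = 0;
	do {
		val = ffwd_pending();
	} while (val && --timeout);
	return timeout == 0;
}
```

When the 500th read is the first good one, only the old order ends with
timeout at 0 and prints the false warning.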

Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoNetwork Drop Monitor: Adding Build changes to enable drop monitor
Neil Horman [Wed, 11 Mar 2009 09:53:16 +0000 (09:53 +0000)] 
Network Drop Monitor: Adding Build changes to enable drop monitor

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
 include/linux/Kbuild |    1 +
 net/Kconfig          |   11 +++++++++++
 net/core/Makefile    |    1 +
 3 files changed, 13 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoNetwork Drop Monitor: Adding drop monitor implementation & Netlink protocol
Neil Horman [Wed, 11 Mar 2009 09:51:26 +0000 (09:51 +0000)] 
Network Drop Monitor: Adding drop monitor implementation & Netlink protocol

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
 include/linux/net_dropmon.h |   56 +++++++++
 net/core/drop_monitor.c     |  263 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 319 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoNetwork Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line...
Neil Horman [Wed, 11 Mar 2009 09:49:55 +0000 (09:49 +0000)] 
Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line points for skbs

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
 include/linux/skbuff.h |    4 +++-
 net/core/datagram.c    |    2 +-
 net/core/skbuff.c      |   22 ++++++++++++++++++++++
 net/ipv4/arp.c         |    2 +-
 net/ipv4/udp.c         |    2 +-
 net/packet/af_packet.c |    2 +-
 6 files changed, 29 insertions(+), 5 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoNetwork Drop Monitor: Add trace declaration for skb frees
Neil Horman [Wed, 11 Mar 2009 09:48:26 +0000 (09:48 +0000)] 
Network Drop Monitor: Add trace declaration for skb frees

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
 include/trace/skb.h   |    8 ++++++++
 net/core/Makefile     |    2 ++
 net/core/net-traces.c |   29 +++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agogianfar: Convert to use netdev_ops
Andy Fleming [Tue, 10 Mar 2009 12:58:28 +0000 (12:58 +0000)] 
gianfar: Convert to use netdev_ops

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agogianfar: remove gianfar_mii.c
Andy Fleming [Tue, 10 Mar 2009 12:58:27 +0000 (12:58 +0000)] 
gianfar: remove gianfar_mii.c

commit 1577ecef766650a57fceb171acee2b13cbfaf1d3
Author: Andy Fleming <afleming@freescale.com>
Date:   Wed Feb 4 16:42:12 2009 -0800

    netdev: Merge UCC and gianfar MDIO bus drivers

left out the deletion of gianfar_mii.c.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago8139too: allow to set mac address on running device
Jiri Pirko [Fri, 13 Mar 2009 18:48:18 +0000 (11:48 -0700)] 
8139too: allow to set mac address on running device

Similar patch as for 8139cp posted yesterday, so the same comment:

Until now there was no way to set a mac address on a running 8139too device.
This is needed, for example, when you want to use this NIC as a bonding slave in
a bonding device in balance-alb mode. This simple patch allows it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago8139cp: allow to set mac address on running device
Jiri Pirko [Fri, 13 Mar 2009 18:47:48 +0000 (11:47 -0700)] 
8139cp: allow to set mac address on running device

Until now there was no way to set a mac address on a running 8139cp device.
This is needed, for example, when you want to use this NIC as a bonding slave in
a bonding device in balance-alb mode. This simple patch allows it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: add Adaptation Layer Indication parameter only when it's set
malc [Thu, 12 Mar 2009 09:49:20 +0000 (09:49 +0000)] 
sctp: add Adaptation Layer Indication parameter only when it's set

RFC5061 states:

        Each adaptation layer that is defined that wishes
        to use this parameter MUST specify an adaptation code point in an
        appropriate RFC defining its use and meaning.

If the user has not set one - assume they don't want to send the param
with a zero Adaptation Code Point.

Rationale - Currently the IANA defines zero as reserved - and
1 as the only valid value - so we consider zero to be unset - to save
adding a boolean to the socket structure.

Including this parameter unconditionally causes endpoints that do not
understand it to report errors unnecessarily.

Signed-off-by: Malcolm Lashley <mlashley@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: fix to send FORWARD-TSN chunk only if peer has such capable
Wei Yongjun [Thu, 12 Mar 2009 09:49:19 +0000 (09:49 +0000)] 
sctp: fix to send FORWARD-TSN chunk only if peer has such capable

RFC3758 Section 3.3.1.  Sending Forward-TSN-Supported param in INIT

   Note that if the endpoint chooses NOT to include the parameter, then
   at no time during the life of the association can it send or process
   a FORWARD TSN.

If the peer is not PR-SCTP capable, don't send a FORWARD-TSN chunk
to it.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: fix to indicate ASCONF support in INIT-ACK only if peer has such capable
Wei Yongjun [Thu, 12 Mar 2009 09:49:18 +0000 (09:49 +0000)] 
sctp: fix to indicate ASCONF support in INIT-ACK only if peer has such capable

This patch indicates ASCONF support in INIT-ACK only if the peer has
that capability.

This patch also fixes the chunk size calculation when the peer has no
FWD-TSN capability.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: simplify sctp listening code
Vlad Yasevich [Thu, 12 Mar 2009 09:49:17 +0000 (09:49 +0000)] 
sctp: simplify sctp listening code

sctp_inet_listen() call is split between UDP and TCP style.  Looking
at the code, the two functions are almost the same and can be
merged into a single helper.  This also fixes a bug that was
fixed in the UDP function, but missed in the TCP function.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: update driver version
Divy Le Ray [Thu, 12 Mar 2009 21:14:29 +0000 (21:14 +0000)] 
cxgb3: update driver version

update driver version to 1.1.1-ko

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: update FW
Divy Le Ray [Thu, 12 Mar 2009 21:14:24 +0000 (21:14 +0000)] 
cxgb3: update FW

Update FW to 7.1

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: detect mac link faults.
Divy Le Ray [Thu, 12 Mar 2009 21:14:19 +0000 (21:14 +0000)] 
cxgb3: detect mac link faults.

The driver currently ignores the local or remote link faults
raised at the mac layer. This patch fixes it.
Our mac however only advertises link events, so wait for the
phy to stabilize the link, then enable mac link events interrupts.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: Update Rev3 mac workaround
Divy Le Ray [Thu, 12 Mar 2009 21:14:14 +0000 (21:14 +0000)] 
cxgb3: Update Rev3 mac workaround

Update the heuristics workaround unlocking a hung mac:
- reduce Tx mac toggling by enabling Tx drain before resetting the mac
- Take only Tx (lack of) activity into account
- Update the monitoring counter range to 64 bits

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: disable high freq non-data interrupts
Divy Le Ray [Thu, 12 Mar 2009 21:14:09 +0000 (21:14 +0000)] 
cxgb3: disable high freq non-data interrupts

Under RX pressure, the HW might generate a high load of interrupts
to signal mac fifo or free lists overflow.
Disable the interrupts, and poll the relevant status bits
to maintain stats.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: separate TX and RX reclaim handlers
Divy Le Ray [Thu, 12 Mar 2009 21:14:04 +0000 (21:14 +0000)] 
cxgb3: separate TX and RX reclaim handlers

Separate TX and RX reclaim handlers
Don't disable interrupts in RX reclaim handler.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: prefetch buffer access in GRO mode
Divy Le Ray [Thu, 12 Mar 2009 21:13:59 +0000 (21:13 +0000)] 
cxgb3: prefetch buffer access in GRO mode

Eliminate a cache miss when accessing the CPL header within
the first aggregated buffer.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: fix skb truesize in jumbo mode
Divy Le Ray [Thu, 12 Mar 2009 21:13:54 +0000 (21:13 +0000)] 
cxgb3: fix skb truesize in jumbo mode

Update skb truesize correctly for the 2nd buffer from a Jumbo frame

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: release page ref on mapping error
Divy Le Ray [Thu, 12 Mar 2009 21:13:49 +0000 (21:13 +0000)] 
cxgb3: release page ref on mapping error

Release page chunk reference in case we fail to map it.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocxgb3: ring rx door bell less frequently
Divy Le Ray [Thu, 12 Mar 2009 21:13:43 +0000 (21:13 +0000)] 
cxgb3: ring rx door bell less frequently

Ring free lists door bell less frequently,
specifically every quarter of the active FL
size.
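The batching can be sketched as a pending-credit counter. This is a
userspace illustration with hypothetical names (struct free_list,
fl_refill()); the doorbells counter stands in for the register write
that rings the hardware.

```c
/* Hypothetical model of a free list with deferred doorbell writes. */
struct free_list {
	unsigned int size;	/* active free list size, in buffers */
	unsigned int pend_cred;	/* buffers added since the last doorbell */
	unsigned int doorbells;	/* stand-in for the doorbell register write */
};

/* Account for n newly posted buffers; ring once a quarter is pending. */
static void fl_refill(struct free_list *fl, unsigned int n)
{
	fl->pend_cred += n;
	if (fl->pend_cred >= fl->size / 4) {
		fl->pend_cred = 0;
		fl->doorbells++;
	}
}
```

For a 1024 entry list, three refills of 100 buffers produce a single
doorbell instead of three.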

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: bugfix: Pad outbound frames smaller than 60 bytes.
Ron Mercer [Wed, 11 Mar 2009 11:55:43 +0000 (11:55 +0000)] 
qlge: bugfix: Pad outbound frames smaller than 60 bytes.

With some asic configurations xmit of frames smaller than 60 bytes may
fail.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: bugfix: Move netif_napi_del() to common call point.
Ron Mercer [Wed, 11 Mar 2009 11:55:42 +0000 (11:55 +0000)] 
qlge: bugfix: Move netif_napi_del() to common call point.

Moving netif_napi_del() up the call chain so it will get called from all
exit points.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: bugfix: Tell hw to strip vlan header.
Ron Mercer [Wed, 11 Mar 2009 11:55:41 +0000 (11:55 +0000)] 
qlge: bugfix: Tell hw to strip vlan header.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: bugfix: Increase filter on inbound csum.
Ron Mercer [Wed, 11 Mar 2009 11:55:40 +0000 (11:55 +0000)] 
qlge: bugfix: Increase filter on inbound csum.

Chip does not do UDP checksum when fragmentation occurs.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodnet: replace obsolete *netif_rx_* functions with *napi_*
Ilya Yanok [Fri, 13 Mar 2009 16:51:46 +0000 (09:51 -0700)] 
dnet: replace obsolete *netif_rx_* functions with *napi_*

*netif_rx_* functions are obsolete and have been removed in newer kernels,
so we need to use the corresponding *napi_* functions instead.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet: Add be2net driver.
Sathya Perla [Thu, 12 Mar 2009 06:32:03 +0000 (23:32 -0700)] 
net: Add be2net driver.

Signed-off-by: Sathya Perla <sathyap@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodnet: Fix warnings on 64-bit.
David S. Miller [Thu, 12 Mar 2009 06:28:57 +0000 (23:28 -0700)] 
dnet: Fix warnings on 64-bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodnet: Dave DNET ethernet controller driver (updated)
Ilya Yanok [Thu, 12 Mar 2009 06:26:02 +0000 (23:26 -0700)] 
dnet: Dave DNET ethernet controller driver (updated)

Driver for Dave DNET ethernet controller found on Dave/DENX QongEVB-LITE
FPGA. Heavily based on Dave sources, I've just adapted it to the current
kernel version and done some code cleanup.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agowimax: fix i2400m printk formats
Randy Dunlap [Thu, 12 Mar 2009 06:24:03 +0000 (23:24 -0700)] 
wimax: fix i2400m printk formats

Fix printk format warnings:

drivers/net/wimax/i2400m/netdev.c:523: warning: format '%zu' expects type 'size_t', but argument 7 has type 'unsigned int'
drivers/net/wimax/i2400m/netdev.c:548: warning: format '%zu' expects type 'size_t', but argument 7 has type 'unsigned int'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotcp: allow timestamps even if SYN packet has tsval=0
Eric Dumazet [Wed, 11 Mar 2009 16:23:57 +0000 (09:23 -0700)] 
tcp: allow timestamps even if SYN packet has tsval=0

Some systems send SYN packets with apparently wrong RFC1323 timestamp
option values [timestamp tsval=0 tsecr=0].
It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml )

Linux TCP stack ignores this option and sends back a SYN+ACK packet
without timestamp option, thus many TCP flows cannot use timestamps
and lose some benefit of RFC1323.

Other operating systems do not seem to care about the initial tsval value,
and let tcp flows negotiate the timestamp option.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv6: Fix BUG when disabled ipv6 module is unloaded
John Dykstra [Wed, 11 Mar 2009 16:22:51 +0000 (09:22 -0700)] 
ipv6:  Fix BUG when disabled ipv6 module is unloaded

Do not try to "uninitialize" ipv6 if its initialization had been skipped
because module parameter disable=1 had been specified.

Reported-by: Thomas Backlund <tmb@mandriva.org>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Acked-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: version bump to 64
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:38 +0000 (08:02 +0000)] 
forcedeth: version bump to 64

This patch bumps up the version to 0.64

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: fix irq clearing and napi spin lock changes
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:34 +0000 (08:02 +0000)] 
forcedeth: fix irq clearing and napi spin lock changes

This patch clears the irqstatus register using exactly the events that
were read from it. Because the read and the write are not atomic, a new
irqstatus bit set between the two operations would otherwise be cleared
accidentally.
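The lost-event window can be reproduced with a simulated status
register. This sketch assumes write-1-to-clear semantics; the IRQ_*
bits, reg_read(), reg_write() and the late_event parameter are
hypothetical and only model hardware raising a bit between the read and
the write.

```c
#include <stdint.h>

#define IRQ_RX  0x1u
#define IRQ_TX  0x2u
#define IRQ_ALL 0x3u

static uint32_t irqstatus;	/* simulated write-1-to-clear register */

static uint32_t reg_read(void)
{
	return irqstatus;
}

static void reg_write(uint32_t v)
{
	irqstatus &= ~v;	/* writing a 1 clears that bit */
}

/* Acknowledge with a fixed mask: a late event is wiped unseen. */
static uint32_t ack_all(uint32_t late_event)
{
	uint32_t events = reg_read();

	irqstatus |= late_event;	/* hardware raises a bit here */
	reg_write(IRQ_ALL);
	return events;
}

/* Acknowledge only what was read: a late event stays pending. */
static uint32_t ack_exact(uint32_t late_event)
{
	uint32_t events = reg_read();

	irqstatus |= late_event;	/* hardware raises a bit here */
	reg_write(events);
	return events;
}
```

Starting from a pending IRQ_RX with IRQ_TX arriving late, ack_all()
leaves the register empty although IRQ_TX was never returned, while
ack_exact() leaves IRQ_TX set for the next pass.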

Secondly, we now don't need any spin lock protection when
scheduling/completing napi poll as the isr will not execute anymore (as
we turn off all interrupts now).

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: performance changes
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:30 +0000 (08:02 +0000)] 
forcedeth: performance changes

This patch modifies the throughput mode poll settings to reduce the
number of interrupts. This is only used by older hardware that needs a
timer irq in throughput mode.

Secondly, this patch increases the default rx ring from 128 to 512. This
drastically improves bandwidth utilization for small packet sizes, i.e.
512 bytes.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: add interrupt moderation logic
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:26 +0000 (08:02 +0000)] 
forcedeth: add interrupt moderation logic

This patch adds the logic to moderate the interrupts by changing the
mode between throughput and poll. If there has been a large amount of
time without any burst of network load, the code will transition to pure
throughput mode (where each tx/rx/other will cause an interrupt). If
bursts of network load occur, it will transition to poll based mode to
help reduce cpu utilization (it will not interrupt on each packet) while
maintaining the optimum network bandwidth utilization.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: remove isr processing loop
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:22 +0000 (08:02 +0000)] 
forcedeth: remove isr processing loop

This patch is only a subset of changes so that it is easier to see the
modifications. This patch removes the isr 'for' loop and shifts all the
logic to account for new tab spacing.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: add new optimization mode
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:18 +0000 (08:02 +0000)] 
forcedeth: add new optimization mode

A new optimization mode called Dynamic has been added. In this mode the
interrupt moderation logic dynamically switches between pure
throughput mode and poll based (called 'cpu') mode.

Also, for newer chipsets, the timer irq is not needed for throughput
mode. Secondly, since we are modifying the irqmask to change between
modes, msix is not supported.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: napi - handle all processing
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:14 +0000 (08:02 +0000)] 
forcedeth: napi - handle all processing

The napi poll routine has been modified to handle all interrupt events
and process them accordingly. Therefore, the ISR will now only schedule
the napi poll and disable all interrupts instead of just disabling rx
interrupt.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: add/modify tx done with limit
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:10 +0000 (08:02 +0000)] 
forcedeth: add/modify tx done with limit

There are two tx_done routines to handle tx completion processing. Both
of these functions now take a limit value and return the number of tx
completions. This will be used by a future patch to determine the total
amount of work done.
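A bounded completion routine of this kind might look like the sketch
below. struct tx_ring and its fields are hypothetical, not the driver's
own data structures.

```c
/* Hypothetical tx ring: get chases put as descriptors complete. */
struct tx_ring {
	unsigned int get;	/* next descriptor to reclaim */
	unsigned int put;	/* next descriptor to be filled */
	unsigned int size;	/* number of descriptors in the ring */
};

/* Reclaim at most limit completed descriptors; return how many. */
static int tx_done(struct tx_ring *r, int limit)
{
	int done = 0;

	while (r->get != r->put && done < limit) {
		r->get = (r->get + 1) % r->size;
		done++;
	}
	return done;
}
```

The return value lets a caller such as a napi poll routine add tx work
to its total and compare that against its budget.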

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: remove overhead
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:06 +0000 (08:02 +0000)] 
forcedeth: remove overhead

This patch removes unnecessary overhead code. Firstly, there is no need
to mask off unwanted interrupts as we will be checking against the
irqmask field anyway. Secondly, there has been no value in the last few
years from detecting error or unknown interrupts.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: save irq events for napi processing
Ayaz Abdulla [Thu, 5 Mar 2009 08:02:03 +0000 (08:02 +0000)] 
forcedeth: save irq events for napi processing

This patch will save the irq events in the driver's context so that the
napi routine knows which interrupts have occurred. Subsequent changes
will be moving all interrupt processing into the napi poll routine.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoforcedeth: remove msix + napi
Ayaz Abdulla [Thu, 5 Mar 2009 08:01:59 +0000 (08:01 +0000)] 
forcedeth: remove msix + napi

This patch removes support for msix running in conjunction with napi.
There have been reported issues regarding the behaviour of irqmask and
generation of interrupts by the HW when in MSIX mode. When running napi,
the driver is constantly turning off/on the irqmask. For the time being,
I am going to disable it until I can root cause the issue.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>