linux-2.6
16 years agobnx2x: Update version
Eilon Greenstein [Tue, 24 Jun 2008 03:36:51 +0000 (20:36 -0700)] 
bnx2x: Update version

Updating to version 1.45.6

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Add PCIE EEH support
Wendy Xiong [Tue, 24 Jun 2008 03:36:22 +0000 (20:36 -0700)] 
bnx2x: Add PCIE EEH support

Add PCI recovery functions to the driver.  The initial PCI state is
also saved so the MSI state can be restored during PCI recovery.

Signed-off-by: Wendy Xiong <wendyx@us.ibm.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Enhanced self test
Yitchak Gertner [Tue, 24 Jun 2008 03:35:51 +0000 (20:35 -0700)] 
bnx2x: Enhanced self test

Added registers, memories, loopback, nvram, interrupt and link tests to
the self-test

Signed-off-by: Yitchak Gertner <gertner@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Re-factor Tx code
Eilon Greenstein [Tue, 24 Jun 2008 03:35:13 +0000 (20:35 -0700)] 
bnx2x: Re-factor Tx code

Add support for IPv6 TSO
Re-factor the Tx code with smaller functions to increase readability.
Add linearization code in case packet is too fragmented for the
microcode to handle.

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Add TPA, Broadcoms HW LRO
Vladislav Zolotarov [Tue, 24 Jun 2008 03:34:36 +0000 (20:34 -0700)] 
bnx2x: Add TPA, Broadcoms HW LRO

The TPA stands for Transparent Packet Aggregation. When enabled, the FW
aggregate in-order TCP packets according to the 4-tuple match and sends
1 big packet to the driver. This packet is stored on an SGL in which
each SGE is 1 page. The FW also implements a timeout algorithm and it
honors all TCP flag, including the push flag as a trigger to halt
aggregation.

After receiving Ben Hutchings comments, we also added ethtool support,
so now, thanks to Ben's patch, when forwarding is enabled, our
aggregation is turned off using the LRO flags.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New statistics code
Yitchak Gertner [Tue, 24 Jun 2008 03:33:36 +0000 (20:33 -0700)] 
bnx2x: New statistics code

To avoid race conditions with link up/down and driver up/down - the
statistics handling was re-written in a form of state machine.
Also supporting statistics for 57711

Signed-off-by: Yitchak Gertner <gertner@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Add support for BCM57711 HW
Eilon Greenstein [Tue, 24 Jun 2008 03:33:01 +0000 (20:33 -0700)] 
bnx2x: Add support for BCM57711 HW

Supporting the 57711 and 57711E - refers to in the code as E1H. The
57710 is referred to as E1.

To support the new members in the family, the bnx2x structure was
divided to 3 parts: common, port and function. These changes caused some
rearrangement in the bnx2x.h file.

A set of accessories macros were added to make access to the bnx2x
structure more readable

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New microcode part 3/3
Eilon Greenstein [Tue, 24 Jun 2008 03:32:28 +0000 (20:32 -0700)] 
bnx2x: New microcode part 3/3

The new Microcode BLOB - broken into a separate patch to make it small
enough for the mailing list

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New microcode part 2/3
Eilon Greenstein [Tue, 24 Jun 2008 03:32:04 +0000 (20:32 -0700)] 
bnx2x: New microcode part 2/3

The new Microcode BLOB - broken into a separate patch to make it small
enough for the mailing list

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New microcode part 1/3
Eilon Greenstein [Tue, 24 Jun 2008 03:31:40 +0000 (20:31 -0700)] 
bnx2x: New microcode part 1/3

The new Microcode BLOB - broken into a separate patch to make it small
enough for the mailing list

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Remove old microcode
Eilon Greenstein [Tue, 24 Jun 2008 03:30:11 +0000 (20:30 -0700)] 
bnx2x: Remove old microcode

Removing the old Microcode from the BLOB - broken into a separate
patch to make it small enough for the mailing list

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New init infrastructure
Eilon Greenstein [Tue, 24 Jun 2008 03:29:02 +0000 (20:29 -0700)] 
bnx2x: New init infrastructure

This new initialization code supports the 57711 HW. It also supports
the emulation and FPGA for the 57711 and 57710 initializations values
(very small amount of code which is very helpful in the lab - less
than 30 lines).

The initialization is done via DMAE after the DMAE block is ready -
before it is ready, some of the initialization is done via PCI
configuration transactions (referred to as indirect write).  A mutex
to protect the DMAE from being overlapped was added.  There are few
new registers which needs to be initialized by SW - the full comment
for those registers is added to the register file.  A place holder for
the 57711 (referred to as E1H) microcode was added- the microcode
itself is too big and it is split over the following 4 patches

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: New link code
Yaniv Rosner [Tue, 24 Jun 2008 03:27:52 +0000 (20:27 -0700)] 
bnx2x: New link code

New Link code:
Moving all the link related code (including the calculations, the
initialization of the MAC and PHY and the external PHY's code) into
a separated file. The changes from the code that used to be part of
bnx2x.c (now called bnx2x_main.c) are:
- Using separate structures for link inputs and link outputs to clearly
  identify what was configured and what is the outcome
- Adding code to read external PHY FW version and print it as part of
  ethtool -i
- Adding code to upgrade external PHY FW from ethtool -E with special
  magic number - Changing the link down indication to ERR level
- Adding a lock on all PHY access to prevent an interrupt and
  setting changes to overlap
- Adding support for emulation and FPGA (small chunk of code that really
  helps in the lab) - Adding support for 1G on BCM8706 PHY
- Adding clear debug print incase of fan failure (the PHY type is now
  "failure")

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Adding bnx2x_link
Yaniv Rosner [Tue, 24 Jun 2008 03:27:26 +0000 (20:27 -0700)] 
bnx2x: Adding bnx2x_link

This patch is int the new bnx2x_link files (C and H). The files are
still not used in this patch, only in the next one so the patch will
be small enough for the mailing list.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilong Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2x: Rename bnx2x.c to bnx2x_main.c
Eilon Greenstein [Tue, 24 Jun 2008 03:24:56 +0000 (20:24 -0700)] 
bnx2x: Rename bnx2x.c to bnx2x_main.c

This patch is the rename of bnx2x.c to bnx2x_main.c.

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agosctp: Kill unused variable in sctp_assoc_bh_rcv()
Vlad Yasevich [Fri, 20 Jun 2008 17:34:47 +0000 (10:34 -0700)] 
sctp: Kill unused variable in sctp_assoc_bh_rcv()

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Update driver version to 1.7.7.
Michael Chan [Thu, 19 Jun 2008 23:44:44 +0000 (16:44 -0700)] 
bnx2: Update driver version to 1.7.7.

And update module description.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Cleanup error handling in bnx2_open().
Michael Chan [Thu, 19 Jun 2008 23:44:10 +0000 (16:44 -0700)] 
bnx2: Cleanup error handling in bnx2_open().

All error handling in bnx2_open() can be consolidated.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Turn on multi rx rings.
Michael Chan [Thu, 19 Jun 2008 23:43:17 +0000 (16:43 -0700)] 
bnx2: Turn on multi rx rings.

Enable multiple rx rings if MSI-X vectors are available.  We enable
up to 7 rx rings.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Update firmware to support multi rx rings.
Michael Chan [Thu, 19 Jun 2008 23:42:39 +0000 (16:42 -0700)] 
bnx2: Update firmware to support multi rx rings.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Use one handler for all MSI-X vectors.
Michael Chan [Thu, 19 Jun 2008 23:41:57 +0000 (16:41 -0700)] 
bnx2: Use one handler for all MSI-X vectors.

Use the same MSI-X handler to schedule NAPI.  Change the dev_instance
void pointer to the bnx2_napi struct instead so we can have the proper
context for each MSI-X vector.

Add a new bnx2_poll_msix() that is optimized for handling MSI-X
NAPI polling of rx/tx work only.  Remove the old bnx2_tx_poll() that
is no longer needed.  Each MSI-X vector handles 1 tx and 1 rx ring.
The first vector handles link events as well.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Optimize fast-path tx and rx work.
Michael Chan [Thu, 19 Jun 2008 23:41:08 +0000 (16:41 -0700)] 
bnx2: Optimize fast-path tx and rx work.

Add hw_tx_cons_ptr and hw_rx_cons_ptr to speed up the retreival of
the tx and rx consumer index, since the MSI-X and default status
blocks have different structures.

Combine status_blk and status_blk_msix into a union.  We'll only use
one type of status block for each vector.

Separate the code to detect more rx and tx work from the code to
detect link related work.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Put rx ring variables in a separate struct.
Michael Chan [Thu, 19 Jun 2008 23:38:19 +0000 (16:38 -0700)] 
bnx2: Put rx ring variables in a separate struct.

In preparation for multi-ring support, rx ring variables are now put
in a separate bnx2_rx_ring_info struct.  With MSI-X, we can support
multiple rx rings.

The functions to allocate/free rx memory and to initialize rx rings
are now modified to handle multiple rings.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobnx2: Put tx ring variables in a separate struct.
Michael Chan [Thu, 19 Jun 2008 23:37:42 +0000 (16:37 -0700)] 
bnx2: Put tx ring variables in a separate struct.

In preparation for multi-ring support, tx ring variables are now put
in a separate bnx2_tx_ring_info struct.  Multi tx ring will not be
enabled until it is fully supported by the stack.  Only 1 tx ring
will be used at the moment.

The functions to allocate/free tx memory and to initialize tx rings
are now modified to handle multiple rings.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonet: Discard and warn about LRO'd skbs received for forwarding
Ben Hutchings [Thu, 19 Jun 2008 23:22:28 +0000 (16:22 -0700)] 
net: Discard and warn about LRO'd skbs received for forwarding

Add skb_warn_if_lro() to test whether an skb was received with LRO and
warn if so.

Change br_forward(), ip_forward() and ip6_forward() to call it) and
discard the skb if it returns true.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonet: Disable LRO on devices that are forwarding
Ben Hutchings [Thu, 19 Jun 2008 23:15:47 +0000 (16:15 -0700)] 
net: Disable LRO on devices that are forwarding

Large Receive Offload (LRO) is only appropriate for packets that are
destined for the host, and should be disabled if received packets may be
forwarded.  It can also confuse the GSO on output.

Add dev_disable_lro() function which uses the appropriate ethtool ops to
disable LRO if enabled.

Add calls to dev_disable_lro() in br_add_if() and functions that enable
IPv4 and IPv6 forwarding.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agosctp: Follow security requirement of responding with 1 packet
Vlad Yasevich [Thu, 19 Jun 2008 23:08:18 +0000 (16:08 -0700)] 
sctp: Follow security requirement of responding with 1 packet

RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

When an SCTP stack receives a packet containing multiple control or
DATA chunks and the processing of the packet requires the sending of
multiple chunks in response, the sender of the response chunk(s) MUST
NOT send more than one packet.  If bundling is supported, multiple
response chunks that fit into a single packet MAY be bundled together
into one single response packet.  If bundling is not supported, then
the sender MUST NOT send more than one response chunk and MUST
discard all other responses.  Note that this rule does NOT apply to a
SACK chunk, since a SACK chunk is, in itself, a response to DATA and
a SACK does not require a response of more DATA.

We implement this by not servicing our outqueue until we reach the end
of the packet.  This enables maximum bundling.  We also identify
'response' chunks and make sure that we only send 1 packet when sending
such chunks.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agosctp: Validate Initiate Tag when handling ICMP message
Wei Yongjun [Thu, 19 Jun 2008 23:07:48 +0000 (16:07 -0700)] 
sctp: Validate Initiate Tag when handling ICMP message

This patch add to validate initiate tag and chunk type if verification
tag is 0 when handling ICMP message.

RFC 4960, Appendix C. ICMP Handling

ICMP6) An implementation MUST validate that the Verification Tag
contained in the ICMP message matches the Verification Tag of the peer.
If the Verification Tag is not 0 and does NOT match, discard the ICMP
message.  If it is 0 and the ICMP message contains enough bytes to
verify that the chunk type is an INIT chunk and that the Initiate Tag
matches the tag of the peer, continue with ICMP7.  If the ICMP message
is too short or the chunk type or the Initiate Tag does not match,
silently discard the packet.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Thu, 19 Jun 2008 23:00:04 +0000 (16:00 -0700)] 
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:

net/mac80211/tx.c

16 years agomac80211: detect driver tx bugs
Johannes Berg [Wed, 18 Jun 2008 22:39:48 +0000 (15:39 -0700)] 
mac80211: detect driver tx bugs

When a driver rejects a frame in it's ->tx() callback, it must also
stop queues, otherwise mac80211 can go into a loop here. Detect this
situation and abort the loop after five retries, warning about the
driver bug.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetlink: genl: fix circular locking
Patrick McHardy [Wed, 18 Jun 2008 09:07:07 +0000 (02:07 -0700)] 
netlink: genl: fix circular locking

genetlink has a circular locking dependency when dumping the registered
families:

- dump start:
genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

- dump continuance:
netlink_rcv()         : call netlink_dump
netlink_dump          : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

Register genl_lock as callback mutex with netlink to fix this. This slightly
widens an already existing module unload race, the genl ops used during the
dump might go away when the module is unloaded. Thomas Graf is working on a
seperate fix for this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetdevice: Fix promiscuity and allmulti overflow
Wang Chen [Wed, 18 Jun 2008 08:48:28 +0000 (01:48 -0700)] 
netdevice: Fix promiscuity and allmulti overflow

Max of promiscuity and allmulti plus positive @inc can cause overflow.
Fox example: when allmulti=0xFFFFFFFF, any caller give dev_set_allmulti() a
positive @inc will cause allmulti be off.
This is not what we want, though it's rare case.
The fix is that only negative @inc will cause allmulti or promiscuity be off
and when any caller makes the counters touch the roof, we return error.

Change of v2:
Change void function dev_set_promiscuity/allmulti to return int.
So callers can get the overflow error.
Caller's fix will be done later.

Change of v3:
1. Since we return error to caller, we don't need to print KERN_ERROR,
KERN_WARNING is enough.
2. In dev_set_promiscuity(), if __dev_set_promiscuity() failed, we
return at once.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoRevert "mac80211: Use skb_header_cloned() on TX path."
David S. Miller [Wed, 18 Jun 2008 08:19:51 +0000 (01:19 -0700)] 
Revert "mac80211: Use skb_header_cloned() on TX path."

This reverts commit 608961a5eca8d3c6bd07172febc27b5559408c5d.

The problem is that the mac80211 stack not only needs to be able to
muck with the link-level headers, it also might need to mangle all of
the packet data if doing sw wireless encryption.

This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
bringing this bisection analysis to my attention), and Ilpo (for
trying to analyze this purely from the TCP side).

In 2.6.27 we can take another stab at this, by using something like
skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
tx->key.  The ESP protocol code in the IPSEC stack can be used as a
model for implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoipv6: minor cleanup in net/ipv6/tcp_ipv6.c [RESEND ].
Rami Rosen [Wed, 18 Jun 2008 07:51:09 +0000 (00:51 -0700)] 
ipv6: minor cleanup in net/ipv6/tcp_ipv6.c [RESEND ].

In net/ipv6/tcp_ipv6.c:

  - Remove unneeded tcp_v6_send_check() declaration.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonet: Add sk_set_socket() helper.
David S. Miller [Wed, 18 Jun 2008 05:41:38 +0000 (22:41 -0700)] 
net: Add sk_set_socket() helper.

In order to more easily grep for all things that set
sk->sk_socket, add sk_set_socket() helper inline function.

Suggested (although only half-seriously) by Evgeniy Polyakov.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoaf_unix: fix 'poll for write'/ connected DGRAM sockets
Rainer Weikusat [Wed, 18 Jun 2008 05:28:05 +0000 (22:28 -0700)] 
af_unix: fix 'poll for write'/ connected DGRAM sockets

The unix_dgram_sendmsg routine implements a (somewhat crude)
form of receiver-imposed flow control by comparing the length of the
receive queue of the 'peer socket' with the max_ack_backlog value
stored in the corresponding sock structure, either blocking
the thread which caused the send-routine to be called or returning
EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
sockets. The poll-implementation for these socket types is
datagram_poll from core/datagram.c. A socket is deemed to be writeable
by this routine when the memory presently consumed by datagrams
owned by it is less than the configured socket send buffer size. This
is always wrong for connected PF_UNIX non-stream sockets when the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an
(usual) application to 'poll for writeability by repeated send request
with O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_sendmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Wed, 18 Jun 2008 04:37:14 +0000 (21:37 -0700)] 
Merge branch 'davem-next' of /linux/kernel/git/jgarzik/netdev-2.6

16 years agoMerge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Wed, 18 Jun 2008 04:32:08 +0000 (21:32 -0700)] 
Merge branch 'davem-fixes' of /linux/kernel/git/jgarzik/netdev-2.6

16 years agoax25: Fix std timer socket destroy handling.
David S. Miller [Wed, 18 Jun 2008 04:26:37 +0000 (21:26 -0700)] 
ax25: Fix std timer socket destroy handling.

Tihomir Heidelberg - 9a4gl, reports:

--------------------
I would like to direct you attention to one problem existing in ax.25
kernel since 2.4. If listening socket is closed and its SKB queue is
released but those sockets get weird. Those "unAccepted()" sockets
should be destroyed in ax25_std_heartbeat_expiry, but it will not
happen. And there is also a note about that in ax25_std_timer.c:
/* Magic here: If we listen() and a new link dies before it
is accepted() it isn't 'dead' so doesn't get removed. */

This issue cause ax25d to stop accepting new connections and I had to
restarted ax25d approximately each day and my services were unavailable.
Also netstat -n -l shows invalid source and device for those listening
sockets. It is strange why ax25d's listening socket get weird because of
this issue, but definitely when I solved this bug I do not have problems
with ax25d anymore and my ax25d can run for months without problems.
--------------------

Actually as far as I can see, this problem is even in releases
as far back as 2.2.x as well.

It seems senseless to special case this test on TCP_LISTEN state.
Anything still stuck in state 0 has no external references and
we can just simply kill it off directly.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetdevice: change net_device->promiscuity/allmulti to unsigned int
Wang Chen [Wed, 18 Jun 2008 04:12:48 +0000 (21:12 -0700)] 
netdevice: change net_device->promiscuity/allmulti to unsigned int

The comments of dev_set_allmulti/promiscuity() is that "While the count in
the device remains above zero...". So negative count is useless.
Fix the type of the counter from "int" to "unsigned int".

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agotun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set
Ang Way Chuang [Wed, 18 Jun 2008 04:10:33 +0000 (21:10 -0700)] 
tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set

By default, tun.c running in TUN_TUN_DEV mode will set the protocol of
packet to IPv4 if TUN_NO_PI is set. My program failed to work when I
assumed that the driver will check the first nibble of packet,
determine IP version and set the appropriate protocol.

Signed-off-by: Ang Way Chuang <wcang@nav6.org>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoudp: sk_drops handling
Eric Dumazet [Wed, 18 Jun 2008 04:04:56 +0000 (21:04 -0700)] 
udp: sk_drops handling

In commits 33c732c36169d7022ad7d6eb474b0c9be43a2dc1 ([IPV4]: Add raw
drops counter) and a92aa318b4b369091fd80433c80e62838db8bc1c ([IPV6]:
Add raw drops counter), Wang Chen added raw drops counter for
/proc/net/raw & /proc/net/raw6

This patch adds this capability to UDP sockets too (/proc/net/udp &
/proc/net/udp6).

This means that 'RcvbufErrors' errors found in /proc/net/snmp can be also
be examined for each udp socket.

# grep Udp: /proc/net/snmp
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 23971006 75 899420 16390693 146348 0

# cat /proc/net/udp
 sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt  ---
uid  timeout inode ref pointer drops
 75: 00000000:02CB 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2358 2 ffff81082a538c80 0
111: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2286 2 ffff81042dd35c80 146348

In this example, only port 111 (0x006F) was flooded by messages that
user program could not read fast enough. 146348 messages were lost.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMAINTAINERS
Jeff Kirsher [Wed, 11 Jun 2008 22:15:53 +0000 (15:15 -0700)] 
MAINTAINERS

Add PJ Waskiewicz to the list of maintainers for Intel 10/100/1000/10GbE
adapters.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agobonding: Allow setting max_bonds to zero
Jay Vosburgh [Sat, 14 Jun 2008 01:12:04 +0000 (18:12 -0700)] 
bonding: Allow setting max_bonds to zero

Permit bonding to function rationally if max_bonds is set to
zero.  This will load the module, but create no master devices (which can
be created via sysfs).

Requires some change to bond_create_sysfs; currently, the
netdev sysfs directory is determined from the first bonding device created,
but this is no longer possible.  Instead, an interface from net/core is
created to create and destroy files in net_class.

Based on a patch submitted by Phil Oester <kernel@linuxaces.com>.
Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to
update the documentation.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agobonding: Rework / fix multiple gratuitous ARP support
Jay Vosburgh [Sat, 14 Jun 2008 01:12:03 +0000 (18:12 -0700)] 
bonding: Rework / fix multiple gratuitous ARP support

Support for sending multiple gratuitous ARPs during failovers
was added by commit:

commit 7893b2491a2d5f716540ac5643d78d37a7f6628b
Author: Moni Shoua <monis@voltaire.com>
Date:   Sat May 17 21:10:12 2008 -0700

    bonding: Send more than one gratuitous ARP when slave takes over

This change modifies that support to remove duplicated code,
add support for ARP monitor (the original only supported miimon), clear
the grat ARP counter in bond_close (lest a later "ifconfig up" immediately
start spewing ARPs), and add documentation for the module parameter.

Also updated driver version to 3.3.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agobonding: deliver netdev event for fail-over under the active-backup mode
Or Gerlitz [Sat, 14 Jun 2008 01:12:02 +0000 (18:12 -0700)] 
bonding: deliver netdev event for fail-over under the active-backup mode

under active-backup mode and when there's actual new_active slave,
have bond_change_active_slave() call the networking core to deliver
NETDEV_BONDING_FAILOVER event such that the fail-over can be notable
by code outside of the bonding driver such as the RDMA stack and
monitoring tools.

As the correct context of locking appropriate for notifier calls is RTNL
and nothing else, bond->curr_slave_lock and bond->lock are unlocked and
later locked again. This is ensured by the rest of the code to be safe
under backup-mode AND when new_active is not NULL.

Jay Vosburgh modified the original patch for formatting and fixed a
compiler error.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agobonding: bond_change_active_slave() cleanup under active-backup
Or Gerlitz [Sat, 14 Jun 2008 01:12:01 +0000 (18:12 -0700)] 
bonding: bond_change_active_slave() cleanup under active-backup

simplified the code of bond_change_active_slave() such that under
active-backup mode there's one "if (new_active)" test and the rest
of the code only does extra checks on top of it. This removed an
unneeded "if (bond->send_grat_arp > 0)" check and avoid calling
bond_send_gratuitous_arp when there's no active slave.

Jay Vosburgh made minor coding style changes to the orignal patch.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonet/core: add NETDEV_BONDING_FAILOVER event
Or Gerlitz [Sat, 14 Jun 2008 01:12:00 +0000 (18:12 -0700)] 
net/core: add NETDEV_BONDING_FAILOVER event

Add NETDEV_BONDING_FAILOVER event to be used in a successive patch
by bonding to announce fail-over for the active-backup mode through the
netdev events notifier chain mechanism. Such an event can be of use for the
RDMA CM (communication manager) to let native RDMA ULPs (eg NFS-RDMA, iSER)
always be aligned with the IP stack, in the sense that they use the same
ports/links as the stack does. More usages can be done to allow monitoring
tools based on netlink events being aware to bonding fail-over.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: version 1.22
Stephen Hemminger [Tue, 17 Jun 2008 16:04:28 +0000 (09:04 -0700)] 
sky2: version 1.22

New version to reflect new hardware support

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: 88E8057 chip support
Stephen Hemminger [Tue, 17 Jun 2008 16:04:27 +0000 (09:04 -0700)] 
sky2: 88E8057 chip support

Add support for Yukon 2 Ultra 2 chip set (88E8057) based on code in latest
version of vendor driver (sk98lin 10.60.2.3).  Untested on real hardware.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: use DEFINE_PCI_DEVICE_TABLE
Stephen Hemminger [Tue, 17 Jun 2008 16:04:26 +0000 (09:04 -0700)] 
sky2: use DEFINE_PCI_DEVICE_TABLE

PCI device table can be marked as devinitconst by using macro.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: chip version printout
Stephen Hemminger [Tue, 17 Jun 2008 16:04:25 +0000 (09:04 -0700)] 
sky2: chip version printout

Change how chip version is printed so that if an unknown version is detected
nothing breaks.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: phy setup changes
Stephen Hemminger [Tue, 17 Jun 2008 16:04:24 +0000 (09:04 -0700)] 
sky2: phy setup changes

Change the setup of the PHY registers on some chip ids. These changes
make the latest sky2 driver follow the vendor driver.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoibm_emac: Remove the ibm_emac driver
Josh Boyer [Tue, 17 Jun 2008 23:35:23 +0000 (19:35 -0400)] 
ibm_emac: Remove the ibm_emac driver

The arch/ppc sub-tree has been removed in the powerpc git tree.  The old
ibm_emac driver is no longer used by anything as a result of this.  This
removes it, leaving the ibm_newemac driver as the proper driver to use for
PowerPC boards with the EMAC hardware.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoMerge branch 'for-2.6.27' of git://git.marvell.com/mv643xx_eth into upstream-next
Jeff Garzik [Wed, 18 Jun 2008 03:24:19 +0000 (23:24 -0400)] 
Merge branch 'for-2.6.27' of git://git.marvell.com/mv643xx_eth into upstream-next

16 years agoatl1: relax eeprom mac address error check
Radu Cristescu [Thu, 12 Jun 2008 22:04:54 +0000 (17:04 -0500)] 
atl1: relax eeprom mac address error check

The atl1 driver tries to determine the MAC address thusly:

- If an EEPROM exists, read the MAC address from EEPROM and
  validate it.
- If an EEPROM doesn't exist, try to read a MAC address from
  SPI flash.
- If that fails, try to read a MAC address directly from the
  MAC Station Address register.
- If that fails, assign a random MAC address provided by the
  kernel.

We now have a report of a system fitted with an EEPROM containing all
zeros where we expect the MAC address to be, and we currently handle
this as an error condition.  Turns out, on this system the BIOS writes
a valid MAC address to the NIC's MAC Station Address register, but we
never try to read it because we return an error when we find the all-
zeros address in EEPROM.

This patch relaxes the error check and continues looking for a MAC
address even if it finds an illegal one in EEPROM.

Signed-off-by: Radu Cristescu <advantis@gmx.net>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonet/enc28j60: low power mode
David Brownell [Fri, 13 Jun 2008 04:38:06 +0000 (21:38 -0700)] 
net/enc28j60: low power mode

Keep enc28j60 chips in low-power mode when they're not in use.
At typically 120 mA, these chips run hot even when idle; this
low power mode cuts that power usage by a factor of around 100.

This version provides a generic routine to poll a register until
its masked value equals some value ... e.g. bit set or cleared.
It's basically what the previous wait_phy_ready() did, but this
version is generalized to support the handshaking needed to
enter and exit low power mode.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonet/enc28j60: section fix
David Brownell [Fri, 13 Jun 2008 04:36:24 +0000 (21:36 -0700)] 
net/enc28j60: section fix

Minor bugfixes to the enc28j60 driver ... wrong section marking,
indentation, and bogus use of spi_bus_type.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Claudio Lanconelli <lanconelli.claudio@eptar.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosky2: 88E8040T pci device id
Stephen Hemminger [Sat, 14 Jun 2008 17:32:15 +0000 (10:32 -0700)] 
sky2: 88E8040T pci device id

Missed one pci id for 88E8040T.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonetxen: download firmware in pci probe
Dhananjay Phadke [Mon, 16 Jun 2008 05:59:46 +0000 (22:59 -0700)] 
netxen: download firmware in pci probe

Downloading firmware in pci probe allows recovery in case of
firmware failure by reloading the driver.

Also reduced delays in firmware load.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonetxen: cleanup debug messages
Dhananjay Phadke [Mon, 16 Jun 2008 05:59:45 +0000 (22:59 -0700)] 
netxen: cleanup debug messages

o Remove unnecessary debug prints and functions.
o Explicitly specify pci class (0x020000) to avoid enabling
  management function.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonetxen: remove global physical_port array
Dhananjay Phadke [Mon, 16 Jun 2008 05:59:44 +0000 (22:59 -0700)] 
netxen: remove global physical_port array

Store physical port number in netxen_adapter structure.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agonetxen: fix portnum for hp mezz cards
Dhananjay Phadke [Mon, 16 Jun 2008 05:59:43 +0000 (22:59 -0700)] 
netxen: fix portnum for hp mezz cards

This fixes a the issue where logical port number is set incorrectly
for HP blade mezz cards.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoibm_newemac: select CRC32 in Kconfig
Josh Boyer [Tue, 17 Jun 2008 23:27:55 +0000 (19:27 -0400)] 
ibm_newemac: select CRC32 in Kconfig

The ibm_newemac driver requires ether_crc to be defined.  Apparently it is
possible to generate a .config without CONFIG_CRC32 set which causes the
following link errors if IBM_NEW_EMAC is selected:

  LD      .tmp_vmlinux1
drivers/built-in.o: In function `emac_hash_mc':
core.c:(.text+0x2f524): undefined reference to `crc32_le'
core.c:(.text+0x2f528): undefined reference to `bitrev32'
make: *** [.tmp_vmlinux1] Error 1

This patch has IBM_NEW_EMAC select CRC32 so we don't hit this error.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agorose: improving AX25 routing frames via ROSE network
Bernard Pidoux [Wed, 18 Jun 2008 00:08:32 +0000 (17:08 -0700)] 
rose: improving AX25 routing frames via ROSE network

ROSE network is organized through nodes connected via hamradio or Internet.
AX25 packet radio frames sent to a remote ROSE address destination are routed
through these nodes.

Without the present patch, automatic routing mechanism did not work optimally
due to an improper parameter checking.

rose_get_neigh() function is called either by rose_connect() or by
rose_route_frame().

In the case of a call from rose_connect(), f0 timer is checked to find if a connection
is already pending. In that case it returns the address of the neighbour, or returns a NULL otherwise.

When called by rose_route_frame() the purpose was to route a packet AX25 frame
through an adjacent node given a destination rose address.
However, in that case, t0 timer checked does not indicate if the adjacent node
is actually connected even if the timer is not null. Thus, for each frame sent, the
function often tried to start a new connexion even if the adjacent node was already connected.

The patch adds a "new" parameter that is true when the function is called by
rose route_frame().
This instructs rose_get_neigh() to check node parameter "restarted".
If restarted is true it means that the route to the destination address is opened via a neighbour
node already connected.
If "restarted" is false the function returns a NULL.
In that case the calling function will initiate a new connection as before.

This results in a fast routing of frames, from nodes to nodes, until
destination is reached, as originaly specified by ROSE protocole.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoxfrm: fix fragmentation for ipv4 xfrm tunnel
Steffen Klassert [Tue, 17 Jun 2008 23:37:13 +0000 (16:37 -0700)] 
xfrm: fix fragmentation for ipv4 xfrm tunnel

When generating the ip header for the transformed packet we just copy
the frag_off field of the ip header from the original packet to the ip
header of the new generated packet. If we receive a packet as a chain
of fragments, all but the last of the new generated packets have the
IP_MF flag set. We have to mask the frag_off field to only keep the
IP_DF flag from the original packet. This got lost with git commit
36cf9acf93e8561d9faec24849e57688a81eb9c5 ("[IPSEC]: Separate
inner/outer mode processing on output")

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [fore200e] convert to use request_firmware()
Chas Williams [Tue, 17 Jun 2008 23:23:11 +0000 (16:23 -0700)] 
atm: [fore200e] convert to use request_firmware()

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [he] remove #ifdef clutter
Chas Williams [Tue, 17 Jun 2008 23:21:44 +0000 (16:21 -0700)] 
atm: [he] remove #ifdef clutter

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [iphase] 64-bit cleanup
Alan Cox [Tue, 17 Jun 2008 23:21:18 +0000 (16:21 -0700)] 
atm: [iphase] 64-bit cleanup

This fixes the most obvious 64-bit problems, but it is still very very
broken in other aspects.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: use const where reasonable
Mitchell Blank Jr [Tue, 17 Jun 2008 23:20:06 +0000 (16:20 -0700)] 
atm: use const where reasonable

From: Mitchell Blank Jr <mitch@sfgoth.com>

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [suni] add support for setting loopback and framing modes
Chas Williams [Tue, 17 Jun 2008 23:19:24 +0000 (16:19 -0700)] 
atm: [suni] add support for setting loopback and framing modes

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [iphase] move struct suni_priv to suni.h
Jorge Boncompte [DTI2] [Tue, 17 Jun 2008 23:18:49 +0000 (16:18 -0700)] 
atm: [iphase] move struct suni_priv to suni.h

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobridge: fix IPV6=n build
Randy Dunlap [Tue, 17 Jun 2008 23:16:13 +0000 (16:16 -0700)] 
bridge: fix IPV6=n build

Fix bridge netfilter code so that it uses CONFIG_IPV6 as needed:

net/built-in.o: In function `ebt_filter_ip6':
ebt_ip6.c:(.text+0x87c37): undefined reference to `ipv6_skip_exthdr'
net/built-in.o: In function `ebt_log_packet':
ebt_log.c:(.text+0x88dee): undefined reference to `ipv6_skip_exthdr'
make[1]: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobridge: make bridge address settings sticky
Stephen Hemminger [Tue, 17 Jun 2008 23:10:06 +0000 (16:10 -0700)] 
bridge: make bridge address settings sticky

Normally, the bridge just chooses the smallest mac address as the
bridge id and mac address of bridge device. But if the administrator
has explictly set the interface address then don't change it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agobridge: handle process all link-local frames
Stephen Hemminger [Tue, 17 Jun 2008 23:09:45 +0000 (16:09 -0700)] 
bridge: handle process all link-local frames

Any frame addressed to link-local addresses should be processed by local
receive path. The earlier code would process them only if STP was enabled.
Since there are other frames like LACP for bonding, we should always
process them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agosctp: fix error path in sctp_proc_init
Pavel Emelyanov [Tue, 17 Jun 2008 22:54:14 +0000 (15:54 -0700)] 
sctp: fix error path in sctp_proc_init

After the sctp_remaddr_proc_init failed, the proper rollback is
not the sctp_remaddr_proc_exit, but the sctp_assocs_proc_exit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetfilter: nf_conntrack_h323: fix module unload crash
Patrick McHardy [Tue, 17 Jun 2008 22:52:32 +0000 (15:52 -0700)] 
netfilter: nf_conntrack_h323: fix module unload crash

The H.245 helper is not registered/unregistered, but assigned to
connections manually from the Q.931 helper. This means on unload
existing expectations and connections using the helper are not
cleaned up, leading to the following oops on module unload:

CPU 0 Unable to handle kernel paging request at virtual address c00a6828, epc == 802224dc, ra == 801d4e7c
Oops[#1]:
Cpu 0
$ 0   : 00000000 00000000 00000004 c00a67f0
$ 4   : 802a5ad0 81657e00 00000000 00000000
$ 8   : 00000008 801461c8 00000000 80570050
$12   : 819b0280 819b04b0 00000006 00000000
$16   : 802a5a60 80000000 80b46000 80321010
$20   : 00000000 00000004 802a5ad0 00000001
$24   : 00000000 802257a8
$28   : 802a4000 802a59e8 00000004 801d4e7c
Hi    : 0000000b
Lo    : 00506320
epc   : 802224dc ip_conntrack_help+0x38/0x74     Tainted: P
ra    : 801d4e7c nf_iterate+0xbc/0x130
Status: 1000f403    KERNEL EXL IE
Cause : 00800008
BadVA : c00a6828
PrId  : 00019374
Modules linked in: ip_nat_pptp ip_conntrack_pptp ath_pktlog wlan_acl wlan_wep wlan_tkip wlan_ccmp wlan_xauth ath_pci ath_dev ath_dfs ath_rate_atheros wlan ath_hal ip_nat_tftp ip_conntrack_tftp ip_nat_ftp ip_conntrack_ftp pppoe ppp_async ppp_deflate ppp_mppe pppox ppp_generic slhc
Process swapper (pid: 0, threadinfo=802a4000, task=802a6000)
Stack : 801e7d98 00000004 802a5a60 80000000 801d4e7c 801d4e7c 802a5ad0 00000004
        00000000 00000000 801e7d98 00000000 00000004 802a5ad0 00000000 00000010
        801e7d98 80b46000 802a5a60 80320000 80000000 801d4f8c 802a5b00 00000002
        80063834 00000000 80b46000 802a5a60 801e7d98 80000000 802ba854 00000000
        81a02180 80b7e260 81a021b0 819b0000 819b0000 80570056 00000000 00000001
        ...
Call Trace:
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801d4e7c>] nf_iterate+0xbc/0x130
 [<801d4e7c>] nf_iterate+0xbc/0x130
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801d4f8c>] nf_hook_slow+0x9c/0x1a4

One way to fix this would be to split helper cleanup from the unregistration
function and invoke it for the H.245 helper, but since ctnetlink needs to be
able to find the helper for synchonization purposes, a better fix is to
register it normally and make sure its not assigned to connections during
helper lookup. The missing l3num initialization is enough for this, this
patch changes it to use AF_UNSPEC to make it more explicit though.

Reported-by: liannan <liannan@twsz.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetfilter: nf_conntrack_h323: fix memory leak in module initialization error path
Patrick McHardy [Tue, 17 Jun 2008 22:52:07 +0000 (15:52 -0700)] 
netfilter: nf_conntrack_h323: fix memory leak in module initialization error path

Properly free h323_buffer when helper registration fails.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetfilter: nf_nat: fix RCU races
Patrick McHardy [Tue, 17 Jun 2008 22:51:47 +0000 (15:51 -0700)] 
netfilter: nf_nat: fix RCU races

Fix three ct_extend/NAT extension related races:

- When cleaning up the extension area and removing it from the bysource hash,
  the nat->ct pointer must not be set to NULL since it may still be used in
  a RCU read side

- When replacing a NAT extension area in the bysource hash, the nat->ct
  pointer must be assigned before performing the replacement

- When reallocating extension storage in ct_extend, the old memory must
  not be freed immediately since it may still be used by a RCU read side

Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetrom: Kill spurious NULL'ing of sk->sk_socket.
David S. Miller [Tue, 17 Jun 2008 10:19:58 +0000 (03:19 -0700)] 
netrom: Kill spurious NULL'ing of sk->sk_socket.

In nr_release(), one code path calls sock_orphan() which
will NULL out sk->sk_socket already.

In the other case, handling states other than NR_STATE_{0,1,2,3},
seems to not be possible other than due to bugs.  Even for an
uninitialized nr->state value, that would be zero or NR_STATE_0.
It might be wise to stick a WARN_ON() here.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agox25: Use sock_orphan() instead of open-coded (and buggy) variant.
David S. Miller [Tue, 17 Jun 2008 10:05:13 +0000 (03:05 -0700)] 
x25: Use sock_orphan() instead of open-coded (and buggy) variant.

It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoeconet: Use sock_orphan() instead of open-coded (and buggy) variant.
David S. Miller [Tue, 17 Jun 2008 10:01:47 +0000 (03:01 -0700)] 
econet: Use sock_orphan() instead of open-coded (and buggy) variant.

It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agox25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
David S. Miller [Tue, 17 Jun 2008 09:44:35 +0000 (02:44 -0700)] 
x25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.

This is the x25 variant of changeset
9375cb8a1232d2a15fe34bec4d3474872e02faec
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agorose: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
David S. Miller [Tue, 17 Jun 2008 09:39:21 +0000 (02:39 -0700)] 
rose: Use sock_graft() and remove bogus sk_socket and sk_sleep init.

This is the rose variant of changeset
9375cb8a1232d2a15fe34bec4d3474872e02faec
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetrom: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
David S. Miller [Tue, 17 Jun 2008 09:36:44 +0000 (02:36 -0700)] 
netrom: Use sock_graft() and remove bogus sk_socket and sk_sleep init.

This is the netrom variant of changeset
9375cb8a1232d2a15fe34bec4d3474872e02faec
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
David S. Miller [Tue, 17 Jun 2008 09:20:54 +0000 (02:20 -0700)] 
ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.

The way that listening sockets work in ax25 is that the packet input
code path creates new socks via ax25_make_new() and attaches them
to the incoming SKB.  This SKB gets queued up into the listening
socket's receive queue.

When accept()'d the sock gets hooked up to the real parent socket.
Alternatively, if the listening socket is closed and released, any
unborn socks stuff up in the receive queue get released.

So during this time period these sockets are unreachable in any
other way, so no wakeup events nor references to their ->sk_socket
and ->sk_sleep members can occur.  And even if they do, all such
paths have to make NULL checks.

So do not deceptively initialize them in ax25_make_new() to the
values in the listening socket.  Leave them at NULL.

Finally, use sock_graft() in ax25_accept().

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agollc: Use sock_graft() instead of by-hand version.
David S. Miller [Tue, 17 Jun 2008 08:21:03 +0000 (01:21 -0700)] 
llc: Use sock_graft() instead of by-hand version.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonet: Kill SOCK_SLEEP_PRE and SOCK_SLEEP_POST, no users.
David S. Miller [Tue, 17 Jun 2008 08:09:00 +0000 (01:09 -0700)] 
net: Kill SOCK_SLEEP_PRE and SOCK_SLEEP_POST, no users.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agodecnet: Remove SOCK_SLEEP_{PRE,POST} usage.
David S. Miller [Tue, 17 Jun 2008 08:06:01 +0000 (01:06 -0700)] 
decnet: Remove SOCK_SLEEP_{PRE,POST} usage.

Just expand the wait sequence.  And as a nice side-effect
the timeout is respected now.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agosctp: Kill SCTP_SOCK_SLEEP_{PRE,POST}, unused.
David S. Miller [Tue, 17 Jun 2008 07:40:36 +0000 (00:40 -0700)] 
sctp: Kill SCTP_SOCK_SLEEP_{PRE,POST}, unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Tue, 17 Jun 2008 01:25:48 +0000 (18:25 -0700)] 
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:

drivers/net/wireless/rt2x00/Kconfig
drivers/net/wireless/rt2x00/rt2x00usb.c
net/sctp/protocol.c

16 years agoatm: [he] send idle cells instead of unassigned when in SDH mode
Chas Williams [Tue, 17 Jun 2008 00:21:27 +0000 (17:21 -0700)] 
atm: [he] send idle cells instead of unassigned when in SDH mode

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [he] limit queries to the device's register space
Robert T. Johnson [Tue, 17 Jun 2008 00:20:52 +0000 (17:20 -0700)] 
atm: [he] limit queries to the device's register space

From: "Robert T. Johnson" <rtjohnso@eecs.berkeley.edu>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
16 years agoatm: [br2864] fix routed vcmux support
Eric Kinzie [Tue, 17 Jun 2008 00:18:18 +0000 (17:18 -0700)] 
atm: [br2864] fix routed vcmux support

From: Eric Kinzie <ekinzie@cmf.nrl.navy.mil>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [he] only support suni driver on multimode interfaces
Chas Williams [Tue, 17 Jun 2008 00:17:31 +0000 (17:17 -0700)] 
atm: [he] only support suni driver on multimode interfaces

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [iphase] doesn't call phy->start due to a bogus #ifndef
Jorge Boncompte [DTI2] [Tue, 17 Jun 2008 00:16:35 +0000 (17:16 -0700)] 
atm: [iphase] doesn't call phy->start due to a bogus #ifndef

This causes the suni driver to oops if you try to use sonetdiag to get
the statistics. Also add the corresponding phy->stop call to fix another
oops if you try to remove the module.

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [iphase] set drvdata before enabling interrupts
Jorge Boncompte [DTI2] [Tue, 17 Jun 2008 00:16:04 +0000 (17:16 -0700)] 
atm: [iphase] set drvdata before enabling interrupts

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatm: [br2684] Fix oops due to skb->dev being NULL
Jorge Boncompte [DTI2] [Tue, 17 Jun 2008 00:15:33 +0000 (17:15 -0700)] 
atm: [br2684] Fix oops due to skb->dev being NULL

It happens that if a packet arrives in a VC between the call to open it on
the hardware and the call to change the backend to br2684, br2684_regvcc
processes the packet and oopses dereferencing skb->dev because it is
NULL before the call to br2684_push().

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
16 years agonetns: introduce the net_hash_mix "salt" for hashes
Pavel Emelyanov [Tue, 17 Jun 2008 00:14:11 +0000 (17:14 -0700)] 
netns: introduce the net_hash_mix "salt" for hashes

There are many possible ways to add this "salt", thus I made this
patch to be the last in the series to change it if required.

Currently I propose to use the struct net pointer itself as this
salt, but since this pointer is most often cache-line aligned, shift
this right to eliminate the bits, that are most often zeroed.

After this, simply add this mix to prepared hashfn-s.

For CONFIG_NET_NS=n case this salt is 0 and no changes in hashfn
appear.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoinet6: add struct net argument to inet6_ehashfn
Pavel Emelyanov [Tue, 17 Jun 2008 00:13:48 +0000 (17:13 -0700)] 
inet6: add struct net argument to inet6_ehashfn

Same as for inet_hashfn, prepare its ipv6 incarnation.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>