linux-2.6
Denis V. Lunev [Sat, 19 Jul 2008 07:28:58 +0000 (00:28 -0700)] 
ipv6: remove unused parameter from ip6_ra_control

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Lezcano [Sat, 19 Jul 2008 07:15:13 +0000 (00:15 -0700)] 
tcp: fix kernel panic with listening_get_next

# BUG: unable to handle kernel NULL pointer dereference at
0000000000000038
IP: [<ffffffff821ed01e>] listening_get_next+0x50/0x1b3
PGD 11e4b9067 PUD 11d16c067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 3
Modules linked in: bridge ipv6 button battery ac loop dm_mod tg3 ext3
jbd edd fan thermal processor thermal_sys hwmon sg sata_svw libata dock
serverworks sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
Pid: 3368, comm: slpd Not tainted 2.6.26-rc2-mm1-lxc4 #1
RIP: 0010:[<ffffffff821ed01e>] [<ffffffff821ed01e>]
listening_get_next+0x50/0x1b3
RSP: 0018:ffff81011e1fbe18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8100be0ad3c0 RCX: ffff8100619f50c0
RDX: ffffffff82475be0 RSI: ffff81011d9ae6c0 RDI: ffff8100be0ad508
RBP: ffff81011f4f1240 R08: 00000000ffffffff R09: ffff8101185b6780
R10: 000000000000002d R11: ffffffff820fdbfa R12: ffff8100be0ad3c8
R13: ffff8100be0ad6a0 R14: ffff8100be0ad3c0 R15: ffffffff825b8ce0
FS: 00007f6a0ebd16d0(0000) GS:ffff81011f424540(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 000000011dc20000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process slpd (pid: 3368, threadinfo ffff81011e1fa000, task
ffff81011f4b8660)
Stack: 00000000000002ee ffff81011f5a57c0 ffff81011f4f1240
ffff81011e1fbe90
0000000000001000 0000000000000000 00007fff16bf2590 ffffffff821ed9c8
ffff81011f5a57c0 ffff81011d9ae6c0 000000000000041a ffffffff820b0abd
Call Trace:
[<ffffffff821ed9c8>] ? tcp_seq_next+0x34/0x7e
[<ffffffff820b0abd>] ? seq_read+0x1aa/0x29d
[<ffffffff820d21b4>] ? proc_reg_read+0x73/0x8e
[<ffffffff8209769c>] ? vfs_read+0xaa/0x152
[<ffffffff82097a7d>] ? sys_read+0x45/0x6e
[<ffffffff8200bd2b>] ? system_call_after_swapgs+0x7b/0x80

Code: 31 a9 25 00 e9 b5 00 00 00 ff 45 20 83 7d 0c 01 75 79 4c 8b 75 10
48 8b 0e eb 1d 48 8b 51 20 0f b7 45 08 39 02 75 0e 48 8b 41 28 <4c> 39
78 38 0f 84 93 00 00 00 48 8b 09 48 85 c9 75 de 8b 55 1c
RIP [<ffffffff821ed01e>] listening_get_next+0x50/0x1b3
RSP <ffff81011e1fbe18>
CR2: 0000000000000038

This kernel panic appears with CONFIG_NET_NS=y.

How to reproduce?

    On the buggy host (host A)
       * ip addr add 1.2.3.4/24 dev eth0

    On a remote host (host B)
       * ip addr add 1.2.3.5/24 dev eth0
       * iptables -A INPUT -p tcp -s 1.2.3.4 -j DROP
       * ssh 1.2.3.4

    On host A:
       * netstat -ta or cat /proc/net/tcp

This bug happens when reading /proc/net/tcp[6] when there is a req_sock
at the SYN_RECV state.

When a SYN is received the minisock is created and the sk field is set to
NULL. In the listening_get_next function, we try to look at the field
req->sk->sk_net.

When looking at how to fix this bug, I noticed that it is useless to do
the check for the minisock belonging to the namespace. A minisock belongs
to a listen point and this one is per namespace, so when browsing the
minisock they are always per namespace.
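
The walk described above can be sketched in user-space C. This is only an
illustration under stated assumptions: the structures are hypothetical,
trimmed-down stand-ins for the kernel ones, and count_pending() is an
invented helper, not the patched listening_get_next().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct net { int id; };
struct sock { struct net *sk_net; };

struct request_sock {
    struct request_sock *dl_next; /* next minisock on the same listener */
    struct sock *sk;              /* NULL while the request is in SYN_RECV */
    unsigned short family;
};

struct listener {
    struct net *net;              /* namespace the listening socket lives in */
    struct request_sock *syn_queue;
};

/* Count pending requests of one address family without touching req->sk. */
static int count_pending(const struct listener *l, unsigned short family)
{
    int n = 0;

    assert(l != NULL);
    for (const struct request_sock *req = l->syn_queue; req; req = req->dl_next) {
        /* No req->sk->sk_net test here: req->sk may be NULL, and the
         * listener already determines the namespace. */
        if (req->family == family)
            n++;
    }
    return n;
}
```

Because the listener is looked up per namespace first, every minisock on
its queue is already in the right namespace, so dropping the check loses
nothing.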

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adam Langley [Sat, 19 Jul 2008 07:07:02 +0000 (00:07 -0700)] 
tcp: Remove redundant checks when setting eff_sacks

Remove redundant checks when setting eff_sacks and make the number of SACKs a
compile time constant. Now that the options code knows how many SACK blocks can
fit in the header, we don't need to have the SACK code guessing at it.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adam Langley [Sat, 19 Jul 2008 07:04:31 +0000 (00:04 -0700)] 
tcp: options clean up

This should fix the following bugs:
  * Connections with MD5 signatures produce invalid packets whenever SACK
    options are included
  * MD5 signatures are counted twice in the MSS calculations

Behaviour changes:
  * A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK

    This is because we can't fit any SACK blocks in a packet with MD5 + TS
    options. There was discussion about disabling SACK rather than TS in
    order to fit in better with old, buggy kernels, but that was deemed to
    be unnecessary.

  * SYNs with MD5 don't include a TS option

    See above.

Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.
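
The reason MD5 + TS leaves no room for SACK blocks is plain option-space
arithmetic. The sketch below is an illustration, not the patch itself: the
macro and function names are made up, and the sizes are the usual 4-byte
aligned TCP option lengths (40 bytes of option space, 12 for timestamps,
20 for an MD5 signature, 4 for the SACK header, 8 per SACK block).

```c
#include <assert.h>

#define OPTION_SPACE        40  /* 60-byte maximum header minus 20-byte base header */
#define TS_ALIGNED          12  /* 10-byte timestamp option plus 2 bytes of padding */
#define MD5SIG_ALIGNED      20  /* 18-byte MD5 option plus 2 bytes of padding */
#define SACK_BASE_ALIGNED    4  /* SACK kind and length plus 2 bytes of padding */
#define SACK_PERBLOCK        8  /* two 32-bit sequence numbers per block */

/* How many SACK blocks fit next to the other options on one segment. */
static int max_sack_blocks(int use_md5, int use_ts)
{
    int room = OPTION_SPACE;

    if (use_md5)
        room -= MD5SIG_ALIGNED;
    if (use_ts)
        room -= TS_ALIGNED;
    room -= SACK_BASE_ALIGNED;
    assert(room >= 0);

    return room / SACK_PERBLOCK;
}
```

With both MD5 and timestamps only 4 bytes remain after the SACK header,
which is less than one block, so one of the two options has to go.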

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adam Langley [Sat, 19 Jul 2008 07:01:42 +0000 (00:01 -0700)] 
tcp: Fix MD5 signatures for non-linear skbs

Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.
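
As a rough illustration of what handling a non-linear buffer means, the
sketch below feeds the linear head and then every fragment into a running
digest. Everything here is hypothetical: the structures are simplified
stand-ins for an SKB, and the toy multiplicative digest stands in for MD5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: a linear head plus a list of page fragments. */
struct frag { const uint8_t *data; size_t len; };
struct buffer {
    const uint8_t *head;
    size_t head_len;
    const struct frag *frags;
    size_t nr_frags;
};

/* Toy running digest (not MD5): fold len bytes into acc. */
static uint32_t digest_update(uint32_t acc, const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        acc = acc * 31u + p[i];
    return acc;
}

/* Digest the head, then each fragment, never reading past either. */
static uint32_t digest_buffer(const struct buffer *b)
{
    uint32_t acc;

    assert(b != NULL);
    acc = digest_update(0, b->head, b->head_len);
    for (size_t i = 0; i < b->nr_frags; i++)
        acc = digest_update(acc, b->frags[i].data, b->frags[i].len);
    return acc;
}
```

The same payload gives the same digest whether it is stored linearly or
split across fragments, which is the property the fix needs.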

Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].

[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 19 Jul 2008 06:08:21 +0000 (23:08 -0700)] 
sctp: Update sctp global memory limit allocations.

Update sctp global memory limit allocations to be the same as TCP.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Sat, 19 Jul 2008 06:07:09 +0000 (23:07 -0700)] 
sctp: remove unnecessary byteshifting, calculate directly in big-endian

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 19 Jul 2008 06:06:32 +0000 (23:06 -0700)] 
sctp: Allow only 1 listening socket with SO_REUSEADDR

When multiple sockets bind to the same port with SO_REUSEADDR,
only 1 can be listening.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 19 Jul 2008 06:06:07 +0000 (23:06 -0700)] 
sctp: Do not leak memory on multiple listen() calls

SCTP permits multiple listen calls, and on subsequent calls
we leak the memory allocated for the crypto transforms.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 19 Jul 2008 06:05:40 +0000 (23:05 -0700)] 
sctp: Support ipv6only AF_INET6 sockets.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 19 Jul 2008 06:04:39 +0000 (23:04 -0700)] 
sctp: Prevent uninitialized memory access

valgrind reports uninitialized memory accesses when running
sctp inside the network simulation cradle simulator:

 Conditional jump or move depends on uninitialised value(s)
    at 0x570E34A: sctp_assoc_sync_pmtu (associola.c:1324)
    by 0x57427DA: sctp_packet_transmit (output.c:403)
    by 0x5710EFF: sctp_outq_flush (outqueue.c:824)
    by 0x5710B88: sctp_outq_uncork (outqueue.c:701)
    by 0x5745262: sctp_cmd_interpreter (sm_sideeffect.c:1548)
    by 0x57444B7: sctp_side_effects (sm_sideeffect.c:976)
    by 0x5744460: sctp_do_sm (sm_sideeffect.c:945)
    by 0x572157D: sctp_primitive_ASSOCIATE (primitive.c:94)
    by 0x5725C04: __sctp_connect (socket.c:1094)
    by 0x57297DC: sctp_connect (socket.c:3297)

 Conditional jump or move depends on uninitialised value(s)
    at 0x575D3A5: mod_timer (timer.c:630)
    by 0x5752B78: sctp_cmd_hb_timers_start (sm_sideeffect.c:555)
    by 0x5754133: sctp_cmd_interpreter (sm_sideeffect.c:1448)
    by 0x5753607: sctp_side_effects (sm_sideeffect.c:976)
    by 0x57535B0: sctp_do_sm (sm_sideeffect.c:945)
    by 0x571E9AE: sctp_endpoint_bh_rcv (endpointola.c:474)
    by 0x573347F: sctp_inq_push (inqueue.c:104)
    by 0x572EF93: sctp_rcv (input.c:256)
    by 0x5689623: ip_local_deliver_finish (ip_input.c:230)
    by 0x5689759: ip_local_deliver (ip_input.c:268)
    by 0x5689CAC: ip_rcv_finish (dst.h:246)

#1 is due to "if (t->pmtu_pending)".
8a4794914f9cf2681235ec2311e189fe307c28c7 "[SCTP] Flag a pmtu change request"
suggests it should be initialized to 0.

#2 is the heartbeat timer 'expires' value, which is uninitialised, but
tested by mod_timer().
T3_rtx_timer seems to be affected by the same problem, so initialize it, too.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sat, 19 Jul 2008 06:03:44 +0000 (23:03 -0700)] 
sctp: Don't abort initialization when CONFIG_PROC_FS=n

This puts CONFIG_PROC_FS defines around the proc init/exit functions
and also avoids compiling proc.c if procfs is not supported.
Also make SCTP_DBG_OBJCNT depend on procfs.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Sat, 19 Jul 2008 06:02:15 +0000 (23:02 -0700)] 
tcp: RTT metrics scaling

Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.

This patch changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters.  Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.
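
To see why raw jiffies are ambiguous to user space, here is a minimal
sketch of the conversion. The helper names and the explicit hz parameter
are assumptions for illustration; the kernel uses its own helpers and a
compile-time HZ.

```c
#include <assert.h>

/* Convert a jiffies count to milliseconds for a given tick rate. */
static unsigned long jiffies_to_ms(unsigned long j, unsigned int hz)
{
    assert(hz > 0);
    return (unsigned long)(((unsigned long long)j * 1000ULL) / hz);
}

/* Convert milliseconds back to jiffies, rounding up to a whole tick. */
static unsigned long ms_to_jiffies(unsigned long ms, unsigned int hz)
{
    assert(hz > 0);
    return (unsigned long)(((unsigned long long)ms * hz + 999ULL) / 1000ULL);
}
```

A 100 ms RTT is 25 jiffies on an HZ=250 kernel but 100 jiffies on an
HZ=1000 kernel, so a bare number exported over netlink cannot be
interpreted without knowing HZ.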

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 19 Jul 2008 06:00:11 +0000 (23:00 -0700)] 
pkt_sched: Fix noqueue_qdisc initialization.

Like noop_qdisc, it needs a dummy backpointer and
explicit qdisc->q.lock initialization.

Based upon a report by Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 19 Jul 2008 05:50:15 +0000 (22:50 -0700)] 
pkt_sched: Manage qdisc list inside of root qdisc.

Idea is from Patrick McHardy.

Instead of managing the list of qdiscs on the device level, manage it
in the root qdisc of a netdev_queue.  This solves all kinds of
visibility issues during qdisc destruction.

The way to iterate over all qdiscs of a netdev_queue is to visit
the netdev_queue->qdisc, and then traverse its list.

The only special case is to ignore builtin qdiscs at the root when
dumping or doing a qdisc_lookup().  That was not needed previously
because builtin qdiscs were not added to the device's qdisc_list.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 19 Jul 2008 03:54:17 +0000 (20:54 -0700)] 
pkt_sched: Get rid of u32_list.

The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.

Just add an explicit node pointer for u32 to struct Qdisc and do
away with this global list.

Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 19 Jul 2008 01:05:19 +0000 (18:05 -0700)] 
packet: add PACKET_RESERVE sockopt

Add new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used, for instance, when the VLAN header
needs to be (re)constructed to avoid moving the entire payload.
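
From user space the option would presumably be set like any other packet
socket option. A minimal sketch, assuming a Linux system whose headers
define PACKET_RESERVE, an unsigned int option value, and that the call is
made before the mmaped ring is configured; reserve_headroom() is a
hypothetical wrapper name.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/*
 * Ask the kernel to leave `bytes` of headroom in front of each packet in
 * the ring frames of packet socket `fd`. Returns 0 on success, -1 with
 * errno set on failure.
 */
static int reserve_headroom(int fd, unsigned int bytes)
{
    return setsockopt(fd, SOL_PACKET, PACKET_RESERVE, &bytes, sizeof(bytes));
}
```

For the VLAN case, reserving 4 bytes would leave room to put the tag back
in front of the payload without moving it.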

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Li [Sat, 19 Jul 2008 00:58:57 +0000 (17:58 -0700)] 
bnx2: Update version to 1.7.9.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Li [Sat, 19 Jul 2008 00:57:26 +0000 (17:57 -0700)] 
bnx2: Fix Sparse warnings

This patch will fix the following sparse warnings:

/home/benli/sparse/bnx2.c:297:8: warning: symbol 'val' shadows an earlier one
/home/benli/sparse/bnx2.c:286:60: originally declared here
/home/benli/sparse/bnx2.c:7461:7: warning: symbol 'i' shadows an earlier one
/home/benli/sparse/bnx2.c:7265:10: originally declared here

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Li [Sat, 19 Jul 2008 00:55:11 +0000 (17:55 -0700)] 
bnx2: Add TX multiqueue support.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Li [Sat, 19 Jul 2008 00:54:17 +0000 (17:54 -0700)] 
bnx2: Update TPAT firmware

This change allows the first TX ring (CID 16) and the first TSS TX ring
(CID 32) to be used concurrently.  Before this change, we could get TSO
errors when both TX rings were used concurrently.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sat, 19 Jul 2008 00:50:57 +0000 (17:50 -0700)] 
e1000: resolve tx multiqueue bug

With the recent changes to tx multiqueue, e1000 was not calling
netif_start_queue() before calling netif_wake_queue().
This causes an oops during loading of the driver.

(Based on commit d55b53fff0c2ddb639dca04c3f5a0854f292d982
("igb/ixgbe/e1000e: resolve tx multiqueue bug").)

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 18 Jul 2008 11:33:03 +0000 (04:33 -0700)] 
igb/ixgbe/e1000e: resolve tx multiqueue bug

With the recent changes to tx multiqueue, igb/ixgbe/e1000e was not calling
netif_tx_start_all_queues() before calling netif_tx_wake_all_queues().
This causes an issue during loading of the driver.

In addition, updated e1000e to use the updated tx multiqueue API.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:07:44 +0000 (04:07 -0700)] 
proc: consolidate per-net single-release callers

They are symmetrical to single_open ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:07:21 +0000 (04:07 -0700)] 
proc: consolidate per-net single_open callers

There are already 7 of them - time to kill some duplicate code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:06:50 +0000 (04:06 -0700)] 
proc: clean the ip_misc_proc_init and ip_proc_init_net error paths

After all this stuff is moved outside, this function can look better.

Besides, I tuned the error path in ip_proc_init_net to make it have
only 2 exit points, not 3.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:06:26 +0000 (04:06 -0700)] 
proc: show per-net ip_devconf.forwarding in /proc/net/snmp

This one has become per-net long ago, but the appropriate file
is per-net only now.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:06:04 +0000 (04:06 -0700)] 
proc: create /proc/net/snmp file in each net

All the statistics shown in this file have been made per-net already.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:05:17 +0000 (04:05 -0700)] 
proc: create /proc/net/netstat file in each net

Now that all the statistics shown in it are per-netns, it is time to
show it in the appropriate net.

The appropriate net init/exit ops already exist - they make
the sockstat file per net - so just extend them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:04:51 +0000 (04:04 -0700)] 
ipv4: clean the init_ipv4_mibs error paths

After moving all the stuff outside this function it looks
a bit ugly - make it look better.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:04:22 +0000 (04:04 -0700)] 
mib: put icmpmsg statistics on struct net

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:04:02 +0000 (04:04 -0700)] 
mib: put icmp statistics on struct net

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:03:45 +0000 (04:03 -0700)] 
mib: put udplite statistics on struct net

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:03:27 +0000 (04:03 -0700)] 
mib: put udp statistics on struct net

Similar to... ouch, I repeat myself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:03:08 +0000 (04:03 -0700)] 
mib: put net statistics on struct net

Similar to ip and tcp ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:02:42 +0000 (04:02 -0700)] 
mib: put ip statistics on struct net

Similar to tcp one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:02:08 +0000 (04:02 -0700)] 
mib: put tcp statistics on struct net

Proc temporarily uses stats from init_net.

BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:01:44 +0000 (04:01 -0700)] 
ipv4: add pernet mib operations

These ones are currently empty, but stuff from init_ipv4_mibs will
sequentially migrate there.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 18 Jul 2008 11:01:24 +0000 (04:01 -0700)] 
mib: add netns/mib.h file

The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 18 Jul 2008 10:58:52 +0000 (03:58 -0700)] 
Revert "remove the strip driver"

This reverts commit 94d9842403f770239a656586442454b7a8f2df29.

Alan says it's not appropriate to remove this driver,
Adrian Bunk also agrees with this revert.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 18 Jul 2008 09:39:39 +0000 (02:39 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

Documentation/powerpc/booting-without-of.txt
drivers/atm/Makefile
drivers/net/fs_enet/fs_enet-main.c
drivers/pci/pci-acpi.c
net/8021q/vlan.c
net/iucv/iucv.c

David S. Miller [Thu, 17 Jul 2008 08:46:06 +0000 (01:46 -0700)] 
pkt_sched: Make default qdisc nonshared-multiqueue safe.

Instead of 'pfifo_fast' we have just plain 'fifo_fast'.
No priority queues, just a straight FIFO.

This is necessary in order to legally have a separate
qdisc per queue in multi-TX-queue setups, and thus get
full parallelization.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Jul 2008 11:03:43 +0000 (04:03 -0700)] 
pkt_sched: Don't used locked skb_queue_purge() in __qdisc_reset_queue()

We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Jul 2008 11:54:10 +0000 (04:54 -0700)] 
pkt_sched: Add multiqueue handling to qdisc_graft().

Move the destruction of the old queue into qdisc_graft().

When operating on a root qdisc (ie. "parent == NULL"), apply
the operation to all queues.  The caller has grabbed a single
implicit reference for this graft, therefore when we apply the
change to more than one queue we must grab additional qdisc
references.

Otherwise, we are operating on a class of a specific parent qdisc, and
therefore no multiqueue handling is necessary.
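
The reference counting rule for the root case can be sketched as below.
This is a simplified user-space model with hypothetical types and a
made-up graft_root() helper; destruction of the old qdiscs and all locking
are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel types. */
struct Qdisc { int refcnt; };
struct netdev_queue { struct Qdisc *qdisc; };

/*
 * Attach new_q as the root qdisc of all n TX queues. The caller arrives
 * holding one reference, which covers the first queue; every additional
 * queue that points at new_q needs a reference of its own.
 */
static void graft_root(struct netdev_queue *queues, unsigned int n,
                       struct Qdisc *new_q)
{
    assert(queues != NULL && n > 0);
    for (unsigned int i = 0; i < n; i++) {
        if (i > 0 && new_q)
            new_q->refcnt++;
        queues[i].qdisc = new_q;
    }
}
```

With four queues the qdisc ends up with a reference count of four, so it
is only freed once every queue has dropped it.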

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Jul 2008 07:53:03 +0000 (00:53 -0700)] 
pkt_sched: Kill netdev_queue lock.

We can simply use the qdisc->q.lock for all of the
qdisc tree synchronization.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 10:22:39 +0000 (03:22 -0700)] 
pkt_sched: Kill qdisc_lock_tree and qdisc_unlock_tree.

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 10:12:24 +0000 (03:12 -0700)] 
pkt_sched: Rework {sch,tbf}_tree_lock().

Make sch_tree_lock() lock the qdisc's root.  All of the
users hold the RTNL semaphore and the root qdisc is not
changing.

Implement tcf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 10:00:19 +0000 (03:00 -0700)] 
pkt_sched: Make qdisc grafting locking more specific.

Lock the root of the qdisc being operated upon.

All explicit references to qdisc_tree_lock() are now gone.
The only remaining uses are via the sch_tree_{lock,unlock}()
and tcf_tree_{lock,unlock}() macros.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Jul 2008 07:50:32 +0000 (00:50 -0700)] 
netdevice: Move qdisc_list back into net_device proper.

And give it its own lock.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 09:42:51 +0000 (02:42 -0700)] 
pkt_sched: Kill qdisc_lock_tree usage in cls_route.c

It just wants the qdisc tree to be synchronized, so grabbing
qdisc_root_lock() is sufficient.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 09:40:45 +0000 (02:40 -0700)] 
pkt_sched: Remove qdisc_lock_tree usage in cls_api.c

It just wants the qdisc tree for the filter to be synchronized.
So just BH lock qdisc_root_lock(q) instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 09:36:04 +0000 (02:36 -0700)] 
pkt_sched: Use per-queue locking in shutdown_scheduler_queue.

This eliminates another qdisc_lock_tree user.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Jul 2008 07:47:45 +0000 (00:47 -0700)] 
pkt_sched: Perform bulk of qdisc destruction in RCU.

This allows less strict control of access to the qdisc attached to a
netdev_queue.  It is even allowed to enqueue into a qdisc which is
in the process of being destroyed.  The RCU handler will toss out
those packets.

We will need this to handle sharing of a qdisc amongst multiple
TX queues.  In such a setup the lock has to be shared, so will
be inside of the qdisc itself.  At which point the netdev_queue
lock cannot be used to hard synchronize access to the ->qdisc
pointer.

One operation we have to keep inside of qdisc_destroy() is the list
deletion.  It is the only piece of state visible after the RCU quiesce
period, so we have to undo it early and under the appropriate locking.

The operations in the RCU handler do not need any locking because the
qdisc tree is no longer visible to anything at that point.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 09:23:17 +0000 (02:23 -0700)] 
pkt_sched: dev_init_scheduler() does not need to lock qdisc tree.

We are registering the device, there is no way anyone can get
at this object's qdiscs yet in any meaningful way.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 09:15:04 +0000 (02:15 -0700)] 
pkt_sched: Schedule qdiscs instead of netdev_queue.

When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.

Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.

Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 08:42:40 +0000 (01:42 -0700)] 
pkt_sched: Add and use qdisc_root() and qdisc_root_lock().

When code wants to lock the qdisc tree state, the logical
operation it's doing is locking the top-level qdisc that
sits at the root of the netdev_queue.

Add qdisc_root_lock() to represent this and convert the
easiest cases.

In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 07:56:32 +0000 (00:56 -0700)] 
pkt_sched: Make QDISC_RUNNING a qdisc state.

Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 16 Jul 2008 03:14:35 +0000 (20:14 -0700)] 
pkt_sched: Move gso_skb into Qdisc.

We liberate any dangling gso_skb during qdisc destruction.

It really only matters for the root qdisc.  But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:48:19 +0000 (03:48 -0700)] 
niu: Add TX multiqueue support.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:48:01 +0000 (03:48 -0700)] 
netdev: Kill plain netif_schedule()

No more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:47:41 +0000 (03:47 -0700)] 
netdev: Convert all drivers away from netif_schedule().

They logically all want to trigger a schedule for all device
TX queues.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:47:03 +0000 (03:47 -0700)] 
net: Implement simple sw TX hashing.

It just xor hashes over IPv4/IPv6 addresses and ports of transport.

The only assumption it makes is that skb_network_header() is set
correctly.
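
An xor hash of this kind might look like the following for IPv4. This is a
sketch under assumptions, not the committed code: the function name and
signature are hypothetical, the addresses and ports are passed in already
extracted from the headers, and the 16-bit fold plus modulo used to pick a
queue are illustrative choices.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a flow onto one of num_tx_queues queues by xor-ing the addresses
 * with the packed source and destination ports.
 */
static uint16_t simple_tx_hash(uint32_t saddr, uint32_t daddr,
                               uint16_t sport, uint16_t dport,
                               uint16_t num_tx_queues)
{
    uint32_t ports = ((uint32_t)sport << 16) | dport;
    uint32_t hash = saddr ^ daddr ^ ports;

    assert(num_tx_queues > 0);
    hash ^= hash >> 16;     /* fold the high half into the low half */
    return (uint16_t)(hash % num_tx_queues);
}
```

Every packet of a given flow hashes to the same queue, which keeps a flow
in order while spreading different flows across the TX queues.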

With bug fixes from Eric Dumazet.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:34:57 +0000 (03:34 -0700)] 
mac80211: Reimplement WME using ->select_queue().

The only behavior change is that we do not drop packets under any
circumstances.  If that is absolutely needed, we could easily add it
back.

With cleanups and help from Johannes Berg.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 10:03:33 +0000 (03:03 -0700)] 
netdev: Add netdev->select_queue() method.

Devices or device layers can set this to control the queue selection
performed by dev_pick_tx().

This function runs under RCU protection, which allows overriding
functions to have some way of synchronizing with things like dynamic
->real_num_tx_queues adjustments.

This makes the spinlock prefetch in dev_queue_xmit() a little bit
less effective, but that's the price right now for correctness.
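
The shape of the hook can be modelled in plain C as an optional function
pointer consulted by the queue picker. The types below are hypothetical,
cut-down stand-ins for the kernel ones, the flow_hash field is an assumed
precomputed value, and RCU protection is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the kernel types. */
struct sk_buff {
    uint32_t priority;
    uint32_t flow_hash;      /* assumed precomputed hash of the flow */
    uint16_t queue_mapping;
};

struct net_device {
    uint16_t real_num_tx_queues;
    /* Optional hook; NULL means the core chooses the queue. */
    uint16_t (*select_queue)(struct net_device *dev, struct sk_buff *skb);
};

/* Pick a TX queue, preferring the device's own hook when it is set. */
static uint16_t dev_pick_tx(struct net_device *dev, struct sk_buff *skb)
{
    uint16_t index = 0;

    assert(dev->real_num_tx_queues > 0);
    if (dev->select_queue)
        index = dev->select_queue(dev, skb);
    else if (dev->real_num_tx_queues > 1)
        index = (uint16_t)(skb->flow_hash % dev->real_num_tx_queues);

    skb->queue_mapping = index;
    return index;
}

/* Example hook: map packet priority straight onto a queue, clamped. */
static uint16_t prio_select_queue(struct net_device *dev, struct sk_buff *skb)
{
    uint32_t last = dev->real_num_tx_queues - 1u;

    return (uint16_t)(skb->priority < last ? skb->priority : last);
}
```

A layer such as a wireless stack could install a hook like
prio_select_queue() to steer traffic classes to specific hardware queues,
while devices that leave the pointer NULL keep the default behaviour.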

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 Jul 2008 09:58:39 +0000 (02:58 -0700)] 
netdev: netdev_priv() can now be sane again.

The private area of a netdev is now at a fixed offset once more.

Unfortunately, some assumptions that netdev_priv() == netdev->priv
crept back into the tree.  In particular this happened in the
loopback driver.  Make it use netdev->ml_priv.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetdev: Kill struct net_device_subqueue and netdev->egress_subqueue*
David S. Miller [Tue, 15 Jul 2008 09:58:10 +0000 (02:58 -0700)] 
netdev: Kill struct net_device_subqueue and netdev->egress_subqueue*

No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonet: Use queue aware tests throughout.
David S. Miller [Thu, 17 Jul 2008 08:56:23 +0000 (01:56 -0700)] 
net: Use queue aware tests throughout.

This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.

Non-multiqueue drivers need no changes.  The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.

Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.
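
The relationship between the legacy helpers and the all-queue helpers
can be sketched in userspace as follows. The sketch_ types and
functions are hypothetical stand-ins, not the kernel definitions; they
only show that the single-queue call touches queue zero while the
"all" variants loop over every TX queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TXQ 8

struct sketch_txq {
	bool stopped;
};

struct sketch_netdev {
	unsigned int num_tx_queues;
	struct sketch_txq tx[SKETCH_MAX_TXQ];
};

static void sketch_tx_stop_queue(struct sketch_txq *q) { q->stopped = true; }
static void sketch_tx_wake_queue(struct sketch_txq *q) { q->stopped = false; }

/* Legacy single-queue helper: degenerates into an operation on queue 0. */
static void sketch_stop_queue(struct sketch_netdev *dev)
{
	sketch_tx_stop_queue(&dev->tx[0]);
}

/* Multiqueue-aware helpers: apply the operation to every TX queue. */
static void sketch_tx_stop_all_queues(struct sketch_netdev *dev)
{
	assert(dev->num_tx_queues <= SKETCH_MAX_TXQ);
	for (unsigned int i = 0; i < dev->num_tx_queues; i++)
		sketch_tx_stop_queue(&dev->tx[i]);
}

static void sketch_tx_wake_all_queues(struct sketch_netdev *dev)
{
	assert(dev->num_tx_queues <= SKETCH_MAX_TXQ);
	for (unsigned int i = 0; i < dev->num_tx_queues; i++)
		sketch_tx_wake_queue(&dev->tx[i]);
}
```

For a device with a single TX queue the two flavours are equivalent,
which is why non-multiqueue drivers need no changes.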

pktgen and netpoll required a little bit more surgery than the others.

In particular, the pktgen changes, whilst functional, could be improved
considerably.  The initial check in pktgen_xmit() will sometimes check
the wrong queue, which is mostly harmless.  The thing to do is probably
to invoke fill_packet() earlier.

The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by the SKB queue mapping.

Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.

Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.

With IGB changes from Jeff Kirsher.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agomac80211: Temporarily mark QoS support BROKEN.
David S. Miller [Tue, 15 Jul 2008 09:53:04 +0000 (02:53 -0700)] 
mac80211: Temporarily mark QoS support BROKEN.

We will undo this after a few changesets.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agopkt_sched: Remove RR scheduler.
David S. Miller [Tue, 15 Jul 2008 09:52:19 +0000 (02:52 -0700)] 
pkt_sched: Remove RR scheduler.

This actually fixes a bug added by the RR scheduler changes.  The
->bands and ->prio2band parameters were being set outside of the
sch_tree_lock() and thus could result in strange behavior and
inconsistencies.

It might be possible, in the new design (where there will be one qdisc
per device TX queue) to allow similar functionality via a TX hash
algorithm for RR but I really see no reason to export this aspect of
how these multiqueue cards actually implement the scheduling of the
individual DMA TX rings and the single physical MAC/PHY port.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetdev: Kill NETIF_F_MULTI_QUEUE.
David S. Miller [Thu, 17 Jul 2008 08:52:12 +0000 (01:52 -0700)] 
netdev: Kill NETIF_F_MULTI_QUEUE.

There is no need for a feature bit for something that
can be tested by simply checking the TX queue count.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agonetdev: Allocate multiple queues for TX.
David S. Miller [Thu, 17 Jul 2008 07:34:19 +0000 (00:34 -0700)] 
netdev: Allocate multiple queues for TX.

alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces.  This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.
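
A minimal userspace sketch of the access pattern described above: the
queues live in one array sized at allocation time, and callers reach
them only through an indexed getter or an iterator that takes a
callback. Every sketch_ name is hypothetical and the structures are
reduced to a single counter for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_queue {
	unsigned long tx_packets;
};

struct sketch_mqdev {
	unsigned int num_tx_queues;
	struct sketch_queue *tx;	/* array of num_tx_queues entries */
};

/* Allocate a device together with queue_count zeroed TX queues. */
static struct sketch_mqdev *sketch_alloc_mq(unsigned int queue_count)
{
	struct sketch_mqdev *dev;

	assert(queue_count > 0);
	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;
	dev->tx = calloc(queue_count, sizeof(*dev->tx));
	if (!dev->tx) {
		free(dev);
		return NULL;
	}
	dev->num_tx_queues = queue_count;
	return dev;
}

/* The only way to reach a single queue: an explicit index. */
static struct sketch_queue *sketch_get_tx_queue(struct sketch_mqdev *dev,
						unsigned int index)
{
	assert(index < dev->num_tx_queues);
	return &dev->tx[index];
}

/* The only way to reach all queues: call fn once per queue. */
static void sketch_for_each_tx_queue(struct sketch_mqdev *dev,
				     void (*fn)(struct sketch_queue *, void *),
				     void *arg)
{
	for (unsigned int i = 0; i < dev->num_tx_queues; i++)
		fn(sketch_get_tx_queue(dev, i), arg);
}

/* Example callback: accumulate the per-queue counters into *arg. */
static void sketch_sum_packets(struct sketch_queue *q, void *arg)
{
	*(unsigned long *)arg += q->tx_packets;
}

static void sketch_free_mq(struct sketch_mqdev *dev)
{
	free(dev->tx);
	free(dev);
}
```

With this shape, a caller that hard-codes index 0 in the getter stands
out as code that still assumes a single queue.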

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoigb: Kill CONFIG_NETDEVICES_MULTIQUEUE references, no longer exists.
David S. Miller [Thu, 17 Jul 2008 08:50:11 +0000 (01:50 -0700)] 
igb: Kill CONFIG_NETDEVICES_MULTIQUEUE references, no longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Thu, 17 Jul 2008 17:55:51 +0000 (10:55 -0700)] 
Merge branch 'upstream-linus' of git://git./linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  [PATCH] ocfs2: fix oops in mmap_truncate testing
  configfs: call drop_link() to cleanup after create_link() failure
  configfs: Allow ->make_item() and ->make_group() to return detailed errors.
  configfs: Fix failing mkdir() making racing rmdir() fail
  configfs: Fix deadlock with racing rmdir() and rename()
  configfs: Make configfs_new_dirent() return error code instead of NULL
  configfs: Protect configfs_dirent s_links list mutations
  configfs: Introduce configfs_dirent_lock
  ocfs2: Don't snprintf() without a format.
  ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
  ocfs2/net: Silence build warnings on sparc64
  ocfs2: Handle error during journal load
  ocfs2: Silence an error message in ocfs2_file_aio_read()
  ocfs2: use simple_read_from_buffer()
  ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
  [PATCH 2/2] ocfs2: Instrument fs cluster locks
  [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-fixes-2.6
Linus Torvalds [Thu, 17 Jul 2008 17:55:07 +0000 (10:55 -0700)] 
Merge git://git./linux/kernel/git/brodo/pcmcia-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-fixes-2.6:
  pcmcia: ide-cs: Remove outdated comment
  pcmcia: fix cisinfo_t removal
  pcmcia: fix return value in cm4000_cs.c

16 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 17 Jul 2008 17:38:59 +0000 (10:38 -0700)] 
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix asm/e820.h for userspace inclusion
  x86: fix numaq_tsc_disable
  x86: fix kernel_physical_mapping_init() for large x86 systems

16 years agoMerge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 17 Jul 2008 17:37:10 +0000 (10:37 -0700)] 
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: do not trace library functions
  ftrace: do not trace scheduler functions
  ftrace: fix lockup with MAXSMP
  ftrace: fix merge buglet

16 years agox86: fix asm/e820.h for userspace inclusion
Rusty Russell [Tue, 15 Jul 2008 05:02:27 +0000 (15:02 +1000)] 
x86: fix asm/e820.h for userspace inclusion

asm-x86/e820.h is included from userspace.  'x86: make e820.c to have
common functions' (b79cd8f1268bab57ff85b19d131f7f23deab2dee) broke it:

make -C Documentation/lguest
cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c  -lz -o lguest
In file included from ../../include/asm-x86/bootparam.h:8,
                 from lguest.c:45:
../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix numaq_tsc_disable
Yinghai Lu [Tue, 15 Jul 2008 06:29:01 +0000 (23:29 -0700)] 
x86: fix numaq_tsc_disable

fix:

 arch/x86/kernel/numaq_32.c: In function ‘numaq_tsc_disable’:
 arch/x86/kernel/numaq_32.c:99: warning: ‘return’ with a value, in function returning void

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into x86/urgent
Ingo Molnar [Thu, 17 Jul 2008 17:24:56 +0000 (19:24 +0200)] 
Merge branch 'linus' into x86/urgent

16 years agofix build error of arch/ia64/kvm/*
Takashi Iwai [Thu, 17 Jul 2008 16:09:12 +0000 (18:09 +0200)] 
fix build error of arch/ia64/kvm/*

Fix calls of smp_call_function*() in arch/ia64/kvm for recent API
changes.

    CC [M]  arch/ia64/kvm/kvm-ia64.o
  arch/ia64/kvm/kvm-ia64.c: In function 'handle_global_purge':
  arch/ia64/kvm/kvm-ia64.c:398: error: too many arguments to function 'smp_call_function_single'
  arch/ia64/kvm/kvm-ia64.c: In function 'kvm_vcpu_kick':
  arch/ia64/kvm/kvm-ia64.c:1696: error: too many arguments to function 'smp_call_function_single'

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob...
Linus Torvalds [Thu, 17 Jul 2008 16:15:23 +0000 (09:15 -0700)] 
Merge branch 'ptrace-cleanup' of git://git./linux/kernel/git/frob/linux-2.6-utrace

* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  fix dangling zombie when new parent ignores children
  do_wait: return security_task_wait() error code in place of -ECHILD
  ptrace children revamp
  do_wait reorganization

16 years agoUpdate scripts/Makefile.fwinst to cope with older make
David Woodhouse [Thu, 17 Jul 2008 06:44:32 +0000 (23:44 -0700)] 
Update scripts/Makefile.fwinst to cope with older make

Also fix unwanted rebuilds of the firmware/ihex2fw tool by including
the .ihex2fw.cmd file when present.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Reported-and-tested-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Thu, 17 Jul 2008 16:05:38 +0000 (09:05 -0700)] 
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP
  [S390] qdio: new qdio driver.
  [S390] cio: Export chsc_error_from_response().
  [S390] vmur: Fix return code handling.
  [S390] Fix stacktrace compile bug.
  [S390] Increase default warning stacksize.
  [S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().
  [S390] chsc headers userspace cleanup
  [S390] dasd: fix unsolicited SIM handling.
  [S390] zfcpdump: Make SCSI disk dump tool recognize storage holes

16 years agoFix collateral damage to top level Makefile
Grant Likely [Thu, 17 Jul 2008 07:06:55 +0000 (01:06 -0600)] 
Fix collateral damage to top level Makefile

The patch named "powerpc/mpc5121: Add clock driver", also contained
an unrelated and bogus change to the top-level makefile.  This patch
backs out the bad bit.

(SHA1 of offending patch: 137e95906e294913fab02162e8a1948ade49acb5)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Repented-by: John Rigby <jrigby@freescale.com>
[ Heh. Normally I pick these out from the diffstats, but I guess
  I've grown to trust the ppc tree too much ;)   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoftrace: do not trace library functions
Ingo Molnar [Thu, 17 Jul 2008 15:40:48 +0000 (17:40 +0200)] 
ftrace: do not trace library functions

make function tracing more robust: do not trace library functions.

We've already got a sizable list of exceptions:

 ifdef CONFIG_FTRACE
 # Do not profile string.o, since it may be used in early boot or vdso
 CFLAGS_REMOVE_string.o = -pg
 # Also do not profile any debug utilities
 CFLAGS_REMOVE_spinlock_debug.o = -pg
 CFLAGS_REMOVE_list_debug.o = -pg
 CFLAGS_REMOVE_debugobjects.o = -pg
 CFLAGS_REMOVE_find_next_bit.o = -pg
 CFLAGS_REMOVE_cpumask.o = -pg
 CFLAGS_REMOVE_bitmap.o = -pg
 endif

... and the pattern has been that random library functionality showed
up in ftrace's critical path (outside of its recursion check), causing
hard to debug lockups.

So be a bit defensive about it and exclude all lib/*.o functions by
default. It's not that they are overly interesting for tracing purposes
anyway. Specific ones can still be traced, in an opt-in manner.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: do not trace scheduler functions
Ingo Molnar [Tue, 15 Apr 2008 20:39:31 +0000 (22:39 +0200)] 
ftrace: do not trace scheduler functions

do not trace scheduler functions - it's still a bit fragile
and can lock up with:

  http://redhat.com/~mingo/misc/config-Thu_Jul_17_13_34_52_CEST_2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: fix lockup with MAXSMP
Ingo Molnar [Thu, 17 Jul 2008 15:38:17 +0000 (17:38 +0200)] 
ftrace: fix lockup with MAXSMP

MAXSMP brings in lots of use of various bitops in smp_processor_id()
and friends - causing ftrace to lock up during bootup:

  calling  anon_inode_init+0x0/0x130
  initcall anon_inode_init+0x0/0x130 returned 0 after 0 msecs
  calling  acpi_event_init+0x0/0x57
  [ hard hang ]

So exclude the bitops facilities from tracing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago[S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP
Stefan Haberland [Thu, 17 Jul 2008 15:16:49 +0000 (17:16 +0200)] 
[S390] dasd: use -EOPNOTSUPP instead of -ENOTSUPP

The return value -ENOTSUPP is not valid in userspace context; use
-EOPNOTSUPP instead.
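
To see why this matters from the userspace side, the sketch below asks
the C library whether it has a message for a given errno value.
ENOTSUPP is a kernel-internal code (524) that userspace <errno.h> does
not define, so it is spelled out here as a local macro. The helper name
is hypothetical, and the "Unknown error" prefix check assumes glibc's
strerror() wording; other C libraries may phrase it differently.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Kernel-internal value; deliberately absent from userspace <errno.h>. */
#define SKETCH_KERNEL_ENOTSUPP 524

/*
 * Return 1 if the C library has a real message for this errno value,
 * 0 if it falls back to a generic "Unknown error N" string (glibc).
 */
static int sketch_errno_is_known(int err)
{
	const char *msg = strerror(err);

	assert(msg != NULL);
	return strncmp(msg, "Unknown error", 13) != 0;
}
```

On a glibc system sketch_errno_is_known(EOPNOTSUPP) is expected to be
true, whereas SKETCH_KERNEL_ENOTSUPP has no message, which is what an
application would show to the user if 524 leaked out of a system call.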

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] qdio: new qdio driver.
Jan Glauber [Thu, 17 Jul 2008 15:16:48 +0000 (17:16 +0200)] 
[S390] qdio: new qdio driver.

List of major changes:
- split qdio driver into several files
- separation of thin interrupt code
- improved handling for multiple thin interrupt devices
- inbound and outbound processing now always runs in tasklet context
- significantly fewer tasklet schedules per interrupt needed
- merged qebsm with non-qebsm handling
- cleaned up qdio interface and added kerneldoc
- coding style cleanups

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com>
Reviewed-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] cio: Export chsc_error_from_response().
Cornelia Huck [Thu, 17 Jul 2008 15:16:47 +0000 (17:16 +0200)] 
[S390] cio: Export chsc_error_from_response().

Make chsc_error_from_response() available to chsc callers outside
of chsc.c (namely qdio) to avoid duplicating error checking code.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] vmur: Fix return code handling.
Frank Munzert [Thu, 17 Jul 2008 15:16:46 +0000 (17:16 +0200)] 
[S390] vmur: Fix return code handling.

Use -EOPNOTSUPP instead of -ENOTSUPP.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] Fix stacktrace compile bug.
Heiko Carstens [Thu, 17 Jul 2008 15:16:45 +0000 (17:16 +0200)] 
[S390] Fix stacktrace compile bug.

Add missing module.h include to fix this:

  CC      arch/s390/kernel/stacktrace.o
arch/s390/kernel/stacktrace.c:84: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:84: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:84: warning: parameter names (without types) in function declaration
arch/s390/kernel/stacktrace.c:97: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:97: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:97: warning: parameter names (without types) in function declaration

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] Increase default warning stacksize.
Heiko Carstens [Thu, 17 Jul 2008 15:16:44 +0000 (17:16 +0200)] 
[S390] Increase default warning stacksize.

Compiling a kernel with allmodconfig or allyesconfig results in tons
of gcc warnings, because the default stack frame size above which gcc
emits a warning is just 256 bytes.
Increase this to 2048, so that these warnings don't distract from the
real warnings that we need to watch for.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().
Cornelia Huck [Thu, 17 Jul 2008 15:16:43 +0000 (17:16 +0200)] 
[S390] dasd: Fix cleanup in dasd_{fba,diag}_check_characteristics().

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] chsc headers userspace cleanup
Adrian Bunk [Thu, 17 Jul 2008 15:16:42 +0000 (17:16 +0200)] 
[S390] chsc headers userspace cleanup

Kernel headers shouldn't expose functions to userspace.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] dasd: fix unsolicited SIM handling.
Stefan Haberland [Thu, 17 Jul 2008 15:16:41 +0000 (17:16 +0200)] 
[S390] dasd: fix unsolicited SIM handling.

Add missing schedule_bh and check that there is 32 bit sense data.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years ago[S390] zfcpdump: Make SCSI disk dump tool recognize storage holes
Frank Munzert [Thu, 17 Jul 2008 15:16:40 +0000 (17:16 +0200)] 
[S390] zfcpdump: Make SCSI disk dump tool recognize storage holes

The kernel part of zfcpdump establishes a new debugfs file zcore/memmap
which exports information on memory layout (start address and length of each
memory chunk) to its userspace counterpart.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
16 years agoftrace: fix merge buglet
Ingo Molnar [Thu, 17 Jul 2008 11:26:50 +0000 (13:26 +0200)] 
ftrace: fix merge buglet

-tip testing found a bootup hang here:

  initcall anon_inode_init+0x0/0x130 returned 0 after 0 msecs
  calling  acpi_event_init+0x0/0x57

the bootup should have continued with:

  initcall acpi_event_init+0x0/0x57 returned 0 after 45 msecs

but it hung hard there instead.

bisection led to this commit:

| commit 5806b81ac1c0c52665b91723fd4146a4f86e386b
| Merge: d14c8a6... 6712e29...
| Author: Ingo Molnar <mingo@elte.hu>
| Date:   Mon Jul 14 16:11:52 2008 +0200
|     Merge branch 'auto-ftrace-next' into tracing/for-linus

turns out that i made this mistake in the merge:

  ifdef CONFIG_FTRACE
  # Do not profile debug utilities
  CFLAGS_REMOVE_tsc_64.o = -pg
  CFLAGS_REMOVE_tsc_32.o = -pg

those two files got unified meanwhile - so the dont-profile annotation
got lost. The proper rule is:

  CFLAGS_REMOVE_tsc.o = -pg

i guess this could have been caught sooner if the CFLAGS_REMOVE* kbuild
rule aborted the build if it met a target that does not exist anymore?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agogarp: retry sending JoinIn messages after allocation failures
Patrick McHardy [Thu, 17 Jul 2008 03:51:47 +0000 (20:51 -0700)] 
garp: retry sending JoinIn messages after allocation failures

Increase reliability by retrying to send JoinIn messages after memory
allocation failures on each TRANSMIT_PDU event until it succeeds.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agocore: add stat to track unresolved discards in neighbor cache
Neil Horman [Thu, 17 Jul 2008 03:50:49 +0000 (20:50 -0700)] 
core: add stat to track unresolved discards in neighbor cache

In __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
on the neighbour's arp_queue, which is capped by default to a length
of 3 skbs.  If that queue would exceed its set length, an skb already
on the queue is dropped to make room for the newly arrived skb.  No
statistic is incremented for this drop.  This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.
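
The behaviour being counted can be modelled with the small userspace
sketch below: a queue capped at three entries that evicts an entry
when a fourth arrives and bumps a counter each time it does so. The
names are hypothetical, integers stand in for skbs, and evicting the
oldest entry is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_QUEUE_LEN 3	/* default cap named in the commit message */

struct sketch_neigh {
	int queue[SKETCH_QUEUE_LEN];	/* packet ids stand in for skbs */
	size_t len;
	unsigned long unres_discards;	/* drops while unresolved */
};

/*
 * Enqueue pkt.  If the queue is full, evict the oldest entry first and
 * count the drop.  Returns the evicted packet id, or -1 if none.
 */
static int sketch_neigh_enqueue(struct sketch_neigh *n, int pkt)
{
	int dropped = -1;

	assert(n->len <= SKETCH_QUEUE_LEN);
	if (n->len == SKETCH_QUEUE_LEN) {
		dropped = n->queue[0];
		for (size_t i = 1; i < n->len; i++)
			n->queue[i - 1] = n->queue[i];
		n->len--;
		n->unres_discards++;
	}
	n->queue[n->len++] = pkt;
	return dropped;
}
```

Without the counter, the eviction is invisible to anyone looking at
the statistics, which is the gap the new stat closes.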

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>