David S. Miller [Sat, 22 Jul 2006 08:12:09 +0000 (01:12 -0700)]
[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.
That way we'll have at least some debugging info even if
the stack dump explodes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Mon, 24 Jul 2006 22:31:14 +0000 (15:31 -0700)]
[NET]: Correct dev_alloc_skb kerneldoc
dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers
use it for the latter anyway, but that's a different story)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Mon, 24 Jul 2006 22:30:28 +0000 (15:30 -0700)]
[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKB
skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow
architectures to reimplement __dev_alloc_skb. It's not set on any
architecture and now that we have an architecture-overrideable
NET_SKB_PAD there is not point at all to have one either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Rompf [Mon, 24 Jul 2006 20:52:13 +0000 (13:52 -0700)]
[VLAN]: Fix link state propagation
When the queue of the underlying device is stopped at initialization time
or the device is marked "not present", the state will be propagated to the
vlan device and never change. Based on an analysis by Patrick McHardy.
Signed-off-by: Stefan Rompf <stefan@loplof.de>
ACKed-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jul 2006 20:49:06 +0000 (13:49 -0700)]
[IPV6] xfrm6_tunnel: Delete debugging code.
It doesn't compile, and it's dubious in several regards:
1) is enabled by non-Kconfig controlled CONFIG_* value
(noted by Randy Dunlap)
2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use
3) the debugging messages print object pointer addresses
which have no meaning without context
So let's just get rid of it.
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcel Holtmann [Tue, 18 Jul 2006 16:32:33 +0000 (18:32 +0200)]
[Bluetooth] Enable SCO support for Broadcom HID proxy dongle
The Broadcom dongles with HID proxy support actually support SCO over
HCI if the SCO buffer size values are corrected. So instead of disabling
the SCO support, mark this dongle with the quirk for the Bluetooth core
to correct the wrong buffer size values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Tue, 18 Jul 2006 16:04:59 +0000 (18:04 +0200)]
[Bluetooth] Add quirk for another broken RTX Telecom based dongle
This patch disables the ISOC transfers for another broken RTX Telecom
based USB dongle. Starting the USB ISOC transfers only ends in a burst
of error messages for invalid SCO packets on connection handle 0.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Tue, 18 Jul 2006 15:47:40 +0000 (17:47 +0200)]
[Bluetooth] Correct SCO buffer size for Belkin devices
The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip
from Broadcom and their SCO buffer size values are wrong. The Bluetooth
core should correct these values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Fri, 14 Jul 2006 14:01:52 +0000 (16:01 +0200)]
[Bluetooth] Correct SCO buffer size for another Broadcom chip
The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a
Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver
has to set a quirk to correct the SCO buffer size values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Marcel Holtmann [Fri, 14 Jul 2006 09:42:12 +0000 (11:42 +0200)]
[Bluetooth] Correct RFCOMM channel MTU for broken implementations
Some Bluetooth RFCOMM implementations try to negotiate a bigger channel
MTU than we can support for a particular session. The maximum MTU for
a RFCOMM session is limited through the L2CAP layer. So if the other
side proposes a channel MTU that is bigger than the underlying L2CAP
MTU, we should reduce it to the L2CAP MTU of the session minus five
bytes for the RFCOMM headers.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Guillaume Chazarain [Mon, 24 Jul 2006 06:37:24 +0000 (23:37 -0700)]
[PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.
In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so,
less than USEC_PER_SEC too) then tv_res will be smaller than tv. The
affectation "(tv_res).tv_usec = __delta;" is wrong. The fix is to
revert to the original code before
4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' in
'while'.
[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian McDonald [Mon, 24 Jul 2006 06:33:28 +0000 (23:33 -0700)]
[DCCP]: Fix default sequence window size
When using the default sequence window size (100) I got the following in
my logs:
Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for
DATA packet, (LSWL(
6279674225) <= P.seqno(
6279674749) <=
S.SWH(
6279674324)) and (P.ackno doesn't exist or LAWL(
18798206530) <=
P.ackno(
1125899906842620) <= S.AWH(
18798206548), sending SYNC...
Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for
DATA packet, (LSWL(
6279674225) <= P.seqno(
6279674750) <=
S.SWH(
6279674324)) and (P.ackno doesn't exist or LAWL(
18798206530) <=
P.ackno(
1125899906842620) <= S.AWH(
18798206549), sending SYNC...
I went to alter the default sysctl and it didn't take for new sockets.
Below patch fixes this.
I think the default is too low but it is what the DCCP spec specifies.
As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec
to 3.5. This is still far too slow but it is a step in the right direction.
Compile tested only for IPv6 but not particularly complex change.
Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roland Dreier [Mon, 24 Jul 2006 16:36:50 +0000 (09:36 -0700)]
IB/mthca: Initialize max_cmds before debug code prints it
Read the max_cmds value from the response to the QUERY_FW command
before printing out the value, so that the real value goes into the
debug output.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael S. Tsirkin [Wed, 19 Jul 2006 14:44:37 +0000 (17:44 +0300)]
IB/ipoib: Fix packet loss after hardware address update
The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Mon, 24 Jul 2006 07:42:00 +0000 (10:42 +0300)]
IB/ipoib: Fix oops with ipoib_debug_mcast set
Need to set mcast->ah before debug code dereferences it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Thu, 20 Jul 2006 08:25:50 +0000 (11:25 +0300)]
IB/mad: Validate MADs for spec compliance
Validate MADs sent by userspace clients for spec compliance with
C13-18.1.1 (prevent duplicate requests and responses sent on the
same port). Without this, RMPP transactions get aborted because
of duplicate packets.
This patch is similar to that provided by Jack Morgenstein.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Tue, 18 Jul 2006 01:21:24 +0000 (18:21 -0700)]
IB/ipath: ipath_skip_sge() can break if num_sge > 1
ipath_skip_sge() doesn't exactly duplicate the side effects of
ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge.
This could result in the sg_list being accessed out of bounds.
Since ipath_skip_sge() is almost always called with num_sge == 1,
the original "optimization" is almost never used.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Tue, 18 Jul 2006 01:19:54 +0000 (18:19 -0700)]
IB/ipath: Fix ib_ipath driver to work with SRP
I am still working on a proposal to remove the phys_to_virt() calls
in the ib_ipath driver. In the mean time, this patch allows SRP
to work by fixing the R_Key check and conversion from IB address
to kernel virtual address. It also returns the correct page size
for FMRs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ralph Campbell [Tue, 18 Jul 2006 01:18:36 +0000 (18:18 -0700)]
IB/ipath: Fix a data corruption
This patch fixes a problem where certain error packets are passed
to the InfiniBand layer for processing even though the packet
actually was received with an error.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Dotan Barak [Thu, 13 Jul 2006 08:05:49 +0000 (11:05 +0300)]
IB/mthca: Fix SRQ limit event range check
Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot
be set beyond srq->max - 1.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Sun, 23 Jul 2006 22:16:04 +0000 (15:16 -0700)]
IB/uverbs: Fix lockdep warnings
Lockdep warns because uverbs is trying to take uobj->mutex when it
already holds that lock. This is because there are really multiple
types of uobjs even though all of their locks are initialized in
common code.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael S. Tsirkin [Mon, 17 Jul 2006 15:20:51 +0000 (18:20 +0300)]
IB/uverbs: Fix unlocking in error paths
ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the
PD's read lock in their error paths, which lead to deadlock when
destroying the PD.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Paul Jackson [Sun, 23 Jul 2006 18:36:08 +0000 (11:36 -0700)]
[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock
Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.
It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.
This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.
This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it. We don't have any other reproducible test case.
The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.
This is in the cpuset_rmdir() code, where we:
* take the callback_mutex lock
* mark the cpuset CS_REMOVED
* call update_cpu_domains for cpu_exclusive cpusets
* in that call, take the cpu_hotplug lock if the
cpuset is marked for removal.
Thanks to Jack Steiner for identifying this deadlock.
The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock. This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.
I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two. The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 23 Jul 2006 19:12:16 +0000 (12:12 -0700)]
cpu hotplug: simplify and hopefully fix locking
The CPU hotplug locking was quite messy, with a recursive lock to
handle the fact that both the actual up/down sequence wanted to
protect itself from being re-entered, but the callbacks that it
called also tended to want to protect themselves from CPU events.
This splits the lock into two (one to serialize the whole hotplug
sequence, the other to protect against the CPU present bitmaps
changing). The latter still allows recursive usage because some
subsystems (ondemand policy for cpufreq at least) had already gotten
too used to the lax locking, but the locking mistakes are hopefully
now less fundamental, and we now warn about recursive lock usage
when we see it, in the hope that it can be fixed.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 23 Jul 2006 19:05:00 +0000 (12:05 -0700)]
[cpufreq] ondemand: make shutdown sequence more robust
Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.
This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Fri, 21 Jul 2006 23:44:45 +0000 (16:44 -0700)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
[TIPC]: Removing useless casts
[IPV4]: Fix nexthop realm dumping for multipath routes
[DUMMY]: Avoid an oops when dummy_init_one() failed
[IFB] After ifb_init_one() failed, i is increased. Decrease
[NET]: Fix reversed error test in netif_tx_trylock
[MAINTAINERS]: Mark LAPB as Oprhan.
[NET]: Conversions from kmalloc+memset to k(z|c)alloc.
[NET]: sun happymeal, little pci cleanup
[IrDA]: Use alloc_skb() in IrDA TX path
[I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
[SCTP]: Set chunk->data_accepted only if we are going to accept it.
[SCTP]: Verify all the paths to a peer via heartbeat before using them.
[SCTP]: Unhash the endpoint in sctp_endpoint_free().
[SCTP]: Check for NULL arg to sctp_bucket_destroy().
[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
[WAN]: Converted synclink drivers to use netif_carrier_*()
[WAN]: Cosmetic changes to N2 and C101 drivers
[WAN]: Added missing netif_dormant_off() to generic HDLC
...
Panagiotis Issaris [Fri, 21 Jul 2006 22:52:20 +0000 (15:52 -0700)]
[TIPC]: Removing useless casts
Removing useless casts
Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Fri, 21 Jul 2006 22:09:55 +0000 (15:09 -0700)]
[IPV4]: Fix nexthop realm dumping for multipath routes
Routing realms exist per nexthop, but are only returned to userspace
for the first nexthop. This is due to the fact that iproute2 only
allows to set the realm for the first nexthop and the kernel refuses
multipath routes where only a single realm is present.
Dump all realms for multipath routes to enable iproute to correctly
display them.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 21 Jul 2006 22:09:07 +0000 (15:09 -0700)]
[DUMMY]: Avoid an oops when dummy_init_one() failed
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 21 Jul 2006 21:56:02 +0000 (14:56 -0700)]
[IFB] After ifb_init_one() failed, i is increased. Decrease
It before entering in the loop for freeing the other ifb devices.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 21 Jul 2006 21:55:38 +0000 (14:55 -0700)]
[NET]: Fix reversed error test in netif_tx_trylock
A non-zero return value indicates success from spin_trylock,
not error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Jul 2006 21:55:17 +0000 (14:55 -0700)]
[MAINTAINERS]: Mark LAPB as Oprhan.
Maintainer email not longer exists.
Signed-off-by: David S. Miller <davem@davemloft.net>
Panagiotis Issaris [Fri, 21 Jul 2006 21:51:30 +0000 (14:51 -0700)]
[NET]: Conversions from kmalloc+memset to k(z|c)alloc.
Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Slaby [Fri, 21 Jul 2006 21:51:02 +0000 (14:51 -0700)]
[NET]: sun happymeal, little pci cleanup
Use pci_register_driver instead of pci_module_init. Use PCI_DEVICE macro.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Fri, 21 Jul 2006 21:50:41 +0000 (14:50 -0700)]
[IrDA]: Use alloc_skb() in IrDA TX path
As pointed out by Christoph Hellwig, dev_alloc_skb() is not intended to be
used for allocating TX sk_buff. The IrDA stack was exclusively calling
dev_alloc_skb() on the TX path, and this patch fixes that.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Henrik Kretzschmar [Fri, 21 Jul 2006 21:50:13 +0000 (14:50 -0700)]
[I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
Changes pci_module_init() to pci_register_driver().
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Fri, 21 Jul 2006 21:49:49 +0000 (14:49 -0700)]
[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
Every file should #include the headers containing the prototypes for
its global functions.
Especially in cases like this one where gcc can tell us through a
compile error that the prototype was wrong...
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sridhar Samudrala [Fri, 21 Jul 2006 21:49:25 +0000 (14:49 -0700)]
[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sridhar Samudrala [Fri, 21 Jul 2006 21:49:07 +0000 (14:49 -0700)]
[SCTP]: Set chunk->data_accepted only if we are going to accept it.
Currently there is a code path in sctp_eat_data() where it is possible
to set this flag even when we are dropping this chunk.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sridhar Samudrala [Fri, 21 Jul 2006 21:48:50 +0000 (14:48 -0700)]
[SCTP]: Verify all the paths to a peer via heartbeat before using them.
This patch implements Path Initialization procedure as described in
Sec 2.36 of RFC4460.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 21 Jul 2006 21:48:26 +0000 (14:48 -0700)]
[SCTP]: Unhash the endpoint in sctp_endpoint_free().
This prevents a race between the close of a socket and receive of an
incoming packet.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sridhar Samudrala [Fri, 21 Jul 2006 21:45:47 +0000 (14:45 -0700)]
[SCTP]: Check for NULL arg to sctp_bucket_destroy().
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Chazarain [Fri, 21 Jul 2006 21:45:25 +0000 (14:45 -0700)]
[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
CONFIG_DEBUG_SLAB found the following bug:
netem_enqueue() in sch_netem.c gets a pointer inside a slab object:
struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb;
But then, the slab object may be freed:
skb = skb_unshare(skb, GFP_ATOMIC)
cb is still pointing inside the freed skb, so here is a patch to
initialize cb later, and make it clear that initializing it sooner
is a bad idea.
[From Stephen Hemminger: leave cb unitialized in order to let gcc
complain in case of use before initialization]
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Halasa [Fri, 21 Jul 2006 21:44:55 +0000 (14:44 -0700)]
[WAN]: Converted synclink drivers to use netif_carrier_*()
WAN: Converted synclink drivers to use netif_carrier_*() instead
of hdlc_set_carrier().
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Halasa [Fri, 21 Jul 2006 21:41:36 +0000 (14:41 -0700)]
[WAN]: Cosmetic changes to N2 and C101 drivers
WAN: Cosmetic changes to N2 and C101 drivers
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Halasa [Fri, 21 Jul 2006 21:41:01 +0000 (14:41 -0700)]
[WAN]: Added missing netif_dormant_off() to generic HDLC
WAN: Fixed a problem with PPP/raw HDLC/X.25 protocols not doing
netif_dormant_off() at startup.
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 21 Jul 2006 21:29:53 +0000 (14:29 -0700)]
[IPV4]: Get rid of redundant IPCB->opts initialisation
Now that we always zero the IPCB->opts in ip_rcv, it is no longer
necessary to do so before calling netif_rx for tunneled packets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Jul 2006 21:19:45 +0000 (14:19 -0700)]
[SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Jul 2006 21:12:39 +0000 (14:12 -0700)]
[SPARC]: Fix length parameter verification in sys_getdomainname().
Found by scrashme.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Jul 2006 05:55:08 +0000 (22:55 -0700)]
[SERIAL] sunzilog: Fix instance enumeration.
Just do a linear enumeration so that we handle sun4d systems
correctly. As a consequence, eliminate the hard coded keyboard and
mouse channel line values, use the CONS_{KEYB,MS} flags instead.
Also, report the keyboard/mouse Zilog channels just like the uart ones
do.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Jul 2006 04:04:04 +0000 (21:04 -0700)]
[SERIAL] sunzilog: Remove duplicate IRQ registry in zs_probe().
We do it now in sunzilog_init() after all devices have been
probed.
Signed-off-by: David S. Miller <davem@davemloft.net>
Raymond Burns [Tue, 18 Jul 2006 04:57:09 +0000 (21:57 -0700)]
[SPARC]: Get sun4d SMP building again.
Signed-off-by: David S. Miller <davem@davemloft.net>
Raymond Burns [Tue, 18 Jul 2006 04:50:55 +0000 (21:50 -0700)]
[SPARC]: Do not call sun4m_irq_rotate on sun4d.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2006 04:49:58 +0000 (21:49 -0700)]
[SPARC]: Simplify and correct __cpu_find_by()
By using for_each_node_by_type().
Also, correct a spurioud test in check_cpu_node() on sparc64.
It is only called with nodes that have device_type "cpu".
Signed-off-by: David S. Miller <davem@davemloft.net>
Raymond Burns [Tue, 18 Jul 2006 04:40:27 +0000 (21:40 -0700)]
[SPARC]: Initialize iounit spinlock in iounit_init().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2006 04:39:09 +0000 (21:39 -0700)]
[SPARC]: Fix initialization of sun4d SBUS interrupts.
1) Explicitly traverse to the root looking for the "sbi".
2) Grab the "board#" property from the sbi's parent and
verify that this parent is an "io-unit" node.
3) Skip IRQ initialization when device lacks "reg" property.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2006 04:07:17 +0000 (21:07 -0700)]
[SERIAL] sunzilog: Register IRQ after all devices have been probed.
Otherwise we will deref half-initialized channel pointers
and crash in the interrupt handler.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2006 04:06:15 +0000 (21:06 -0700)]
[SPARC] sbus: Make sure sbus nodes are named uniquely.
Just name them "sbus%d" otherwise on sun4d we try to register
multiple entries named "sbi@0,0" which does not work.
Based upon a report from Raymond Burns.
Signed-off-by: David S. Miller <davem@davemloft.net>
Bob Breuer [Tue, 18 Jul 2006 00:05:56 +0000 (17:05 -0700)]
[SPARC]: Fix property name acquisition in prom.c
On sparc32 the prom_{first,next}prop() interfaces work
a little differently. The buffer argument is ignored on
sparc32 and the firmware just returns a raw pointer to
the property name.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 17 Jul 2006 23:40:26 +0000 (16:40 -0700)]
[SERIAL] sunsab: Get line numbers and table sizing correct.
Table sizing code should look for "se" not "su" nodes.
The chip at the lower address should get the first index.
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Zyngier [Mon, 17 Jul 2006 22:53:32 +0000 (15:53 -0700)]
[SPARC64] Fix sunsab ports ordering
Register second SAB port before the first one, as serial A is wired to
it, and expected to appear as ttyS0.
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 17 Jul 2006 05:19:40 +0000 (22:19 -0700)]
[SPARC]: Kill prom_getname, unused and not implemented properly.
The m68k port's sun3 asm/oplib.h had a stray reference too, so I
killed that off as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 17 Jul 2006 05:10:44 +0000 (22:10 -0700)]
[SPARC64]: Fix more of_device layer IRQ bugs, and correct PROMREG_MAX.
Sabre and Psycho PCI controllers can have partial interrupt-map
properties, meaning that on-board devices don't match up to any
entries. Instead, they are fully specified from the beginning and
we should pass them directly to the IRQ translator as-is.
Also, fill in the necessary translator slots for the "graphics"
and "expansion UPA" interrupts on Sabre, Psycho, and SYSIO SBUS.
Increase PROMREG_MAX to 24, as seen on SUNW,ffb devices.
Finally, prevent accidentally writing past the end of the of_device
struct resource[] and irqs[] arrays. Spit out a log message when
we ignore some entries because there are too many of them.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 21 Jul 2006 19:04:53 +0000 (12:04 -0700)]
Merge /linux/kernel/git/jejb/scsi-rc-fixes-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (38 commits)
[SCSI] More buffer->request_buffer changes
[SCSI] mptfusion: bump version to 3.04.01
[SCSI] mptfusion: misc fix's
[SCSI] mptfusion: firmware download boot fix's
[SCSI] mptfusion: task abort fix's
[SCSI] mptfusion: sas nexus loss support
[SCSI] mptfusion: sas loginfo update
[SCSI] mptfusion: mptctl panic when loading
[SCSI] mptfusion: sas enclosures with smart drive
[SCSI] NCR_D700: misc fixes (section and argument ordering)
[SCSI] scsi_debug: must_check fixes
[SCSI] scsi_transport_sas: kill the use of channel
[SCSI] scsi_transport_sas: add expander backlink
[SCSI] hide EH backup data outside the scsi_cmnd
[SCSI] ibmvscsi: handle inactive SCSI target during probe
[SCSI] ibmvscsi: allocate lpevents for ibmvscsi on iseries
[SCSI] aic7[9x]xx: Remove last vestiges of reverse_scan
[SCSI] aha152x: stop poking at saved scsi_cmnd members
[SCSI] st.c: Improve sense output
[SCSI] lpfc 8.1.7: Change version number to 8.1.7
...
Linus Torvalds [Fri, 21 Jul 2006 19:03:57 +0000 (12:03 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] sysfs_create_xxx return values.
[S390] .align 4096 statements in head.S
[S390] get_clock inline assembly.
[S390] channel measurement interval display.
[S390] xpram module parameter parsing - take 2.
[S390] Fix gcc warning about unused return values.
Linus Torvalds [Fri, 21 Jul 2006 19:03:32 +0000 (12:03 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
[PATCH] spidernet: rework tx queue handling
[PATCH] spidernet: bug fix for init code
[PATCH] sky2: NAPI poll fix
[NET] ethtool: fix oops by testing correct struct member
e1000: bump version to 7.1.9-k4
e1000: fix panic on large frame receive when mtu=default
e1000: remove CRC bytes from measured packet length
e1000: Redo netpoll fix to address community concerns
Heiko Carstens [Tue, 18 Jul 2006 11:46:58 +0000 (13:46 +0200)]
[S390] sysfs_create_xxx return values.
Take return values of sysfs_create_group & friends into account.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 18 Jul 2006 11:44:57 +0000 (13:44 +0200)]
[S390] .align 4096 statements in head.S
SLES9 binutils don't like .align 4096 statements in head.S. Work around this
by using .org statements.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Jens Osterkamp [Thu, 13 Jul 2006 09:54:08 +0000 (11:54 +0200)]
[PATCH] spidernet: rework tx queue handling
With this patch TX queue descriptors are not chained per default any more.
The pointer to next descriptor is set only when next descriptor is prepaired
for transfer. Also the mechanism of checking wether Spider is ready has been
changed: it checks not for CARDOWNED flag in status of previous descriptor
but for a TXDMAENABLED flag in Spider's register.
Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jens Osterkamp [Thu, 13 Jul 2006 09:54:23 +0000 (11:54 +0200)]
[PATCH] spidernet: bug fix for init code
We want to intitialize addr instead of data register first.
Signed-off-by: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stephen Hemminger [Mon, 17 Jul 2006 13:54:34 +0000 (09:54 -0400)]
[PATCH] sky2: NAPI poll fix
When sky2 driver gets lots of received packets at once, it can get stuck.
The NAPI poll routine gets called back to keep going, but since no IRQ bits
are set it doesn't make progress.
Increase version, since this is serious enough problem that I want to be
able to tell new from old problems.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Mon, 17 Jul 2006 17:26:52 +0000 (13:26 -0400)]
Merge branch 'upstream-fixes-jgarzik' of git://lost.foo-projects.org/~ahkok/git/netdev-2.6 into upstream-fixes
Jeff Garzik [Mon, 17 Jul 2006 16:54:40 +0000 (12:54 -0400)]
[NET] ethtool: fix oops by testing correct struct member
Noticed by Willy Tarreau.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Andreas Krebbel [Mon, 17 Jul 2006 14:09:42 +0000 (16:09 +0200)]
[S390] get_clock inline assembly.
Add missing volatile to the get_clock / get_cycles inline assemblies
to avoid that consecutive calls get optimized away.
Signed-off-by: Andreas Krebbel <krebbel1@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cornelia Huck [Mon, 17 Jul 2006 14:09:28 +0000 (16:09 +0200)]
[S390] channel measurement interval display.
Display avg_sample_interval in nanoseconds, like it is documented.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 17 Jul 2006 14:09:23 +0000 (16:09 +0200)]
[S390] xpram module parameter parsing - take 2.
Don't use memparse since the default size modifier is 'k'.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 17 Jul 2006 14:09:18 +0000 (16:09 +0200)]
[S390] Fix gcc warning about unused return values.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Jeff Garzik [Tue, 11 Jul 2006 19:28:12 +0000 (15:28 -0400)]
[libata] ata_piix: correct 'invalid MAP value' typo-caused error
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Tue, 11 Jul 2006 17:11:17 +0000 (13:11 -0400)]
[libata] ata_piix: minor cleanups noticed in prior patch run
* delete unused PIIX_FLAG_COMBINED*
* port_enable should be u16 rather than u32
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Tue, 11 Jul 2006 15:57:44 +0000 (11:57 -0400)]
[libata] ata_piix: attempt to fix ICH8 support
Take into account the fact that ICH8 changed the register layout of
the MAP and PCS register bits.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jeff Garzik [Tue, 11 Jul 2006 15:48:50 +0000 (11:48 -0400)]
[libata] ata_piix: Consolidate PCS register writing
Prior to this patch, the driver would do this for each port:
read 8-bit PCS
write 8-bit PCS
read 8-bit PCS
write 8-bit PCS
In the field, flaky behavior has been observed related to this register.
In particular, these overzealous register writes can cause misdetection
problems.
Update to do the following once (not once per port) at boot:
read 16-bit PCS
if needs changing,
write 16-bit PCS
And thereafter, we only perform a 'read 16-bit PCS' per port.
This should eliminate all PCS writes in many cases, and be more friendly
in the cases where we do need to enable ports.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Wed, 28 Jun 2006 16:58:28 +0000 (01:58 +0900)]
[PATCH] ata_piix: add host_set private structure
Add host_set private structure piix_host_priv. Currently the only
field is ->map which used to be stored directly at
host_set->private_data. This change allows more host_set private
fields to be added.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Linus Torvalds [Sat, 15 Jul 2006 21:53:08 +0000 (14:53 -0700)]
Linux 2.6.18-rc2
Finishing up for the kernel summit. Ottawa, here I come.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 15 Jul 2006 21:43:30 +0000 (14:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/shaggy/jfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
JFS: commit_mutex cleanups
Jeff Dike [Fri, 14 Jul 2006 19:52:23 +0000 (15:52 -0400)]
[PATCH] UML - fix utsname build breakage
Some -mm-only material leaked into a patch destined for mainline, and I didn't
notice.
This was the replacement of system_utsname with utsname() that's required by
the uts namespace patch. This patch reverts those changes (which are correct
in -mm) so that mainline UML builds again.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 15 Jul 2006 19:26:45 +0000 (12:26 -0700)]
Don't allow chmod() on the /proc/<pid>/ files
This just turns off chmod() on the /proc/<pid>/ files, since there is no
good reason to allow it, and had we disallowed it originally, the nasty
/proc race exploit wouldn't have been possible.
The other patches already fixed the problem chmod() could cause, so this
is really just some final mop-up..
This particular version is based off a patch by Eugene and Marcel which
had much better naming than my original equivalent one.
Signed-off-by: Eugene Teo <eteo@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 15 Jul 2006 19:20:05 +0000 (12:20 -0700)]
Mark /proc MS_NOSUID and MS_NOEXEC
Not that we really need this any more, but at the same time there's no
reason not to do this.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dave Jones [Sat, 15 Jul 2006 07:41:12 +0000 (03:41 -0400)]
[PATCH] sch_htb compile fix.
net/sched/sch_htb.c: In function 'htb_change_class':
net/sched/sch_htb.c:1605: error: expected ';' before 'do_gettimeofday'
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 15 Jul 2006 04:57:45 +0000 (21:57 -0700)]
Merge /pub/scm/linux/kernel/git/herbert/crypto-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] padlock: Fix alignment after aes_ctx rearrange
Linus Torvalds [Sat, 15 Jul 2006 04:57:23 +0000 (21:57 -0700)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64] Fix PSYCHO PCI controler init.
[SPARC64] psycho: Fix pbm->name handling in pbm_register_toplevel_resources()
[SERIAL] sunsab: Fix significant typo in sab_probe()
[SERIAL] sunsu: Report keyboard and mouse ports in kernel log.
[SPARC64]: Make sure IRQs are disabled properly during early boot.
Linus Torvalds [Sat, 15 Jul 2006 04:57:06 +0000 (21:57 -0700)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[VLAN]: __vlan_hwaccel_rx can use the faster ether_compare_addr
[PKT_SCHED] HTB: initialize upper bound properly
[IPV4]: Clear skb cb on IP input
[NET]: Update frag_list in pskb_trim
Steven Rostedt [Fri, 14 Jul 2006 20:05:03 +0000 (16:05 -0400)]
[PATCH] remove set_wmb - arch removal
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.
All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.
Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Steven Rostedt [Fri, 14 Jul 2006 20:05:01 +0000 (16:05 -0400)]
[PATCH] remove set_wmb - doc update
This patch removes the reference to set_wmb from memory-barriers.txt
since it shouldn't be used.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:47 +0000 (00:24 -0700)]
[PATCH] Remove down_write() from taskstats code invoked on the exit() path
In send_cpu_listeners(), which is called on the exit path, a down_write()
was protecting operations like skb_clone() and genlmsg_unicast() that do
GFP_KERNEL allocations. If the oom-killer decides to kill tasks to satisfy
the allocations,the exit of those tasks could block on the same semphore.
The down_write() was only needed to allow removal of invalid listeners from
the listener list. The patch converts the down_write to a down_read and
defers the removal to a separate critical region. This ensures that even
if the oom-killer is called, no other task's exit is blocked as it can
still acquire another down_read.
Thanks to Andrew Morton & Herbert Xu for pointing out the oom related
pitfalls, and to Chandra Seetharaman for suggesting this fix instead of
using something more complex like RCU.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:47 +0000 (00:24 -0700)]
[PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks
On systems with a large number of cpus, with even a modest rate of tasks
exiting per cpu, the volume of taskstats data sent on thread exit can
overflow a userspace listener's buffers.
One approach to avoiding overflow is to allow listeners to get data for a
limited and specific set of cpus. By scaling the number of listeners
and/or the cpus they monitor, userspace can handle the statistical data
overload more gracefully.
In this patch, each listener registers to listen to a specific set of cpus
by specifying a cpumask. The interest is recorded per-cpu. When a task
exits on a cpu, its taskstats data is unicast to each listener interested
in that cpu.
Thanks to Andrew Morton for pointing out the various scalability and
general concerns of previous attempts and for suggesting this design.
[akpm@osdl.org: build fix]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:46 +0000 (00:24 -0700)]
[PATCH] per-task delay accounting: avoid send without listeners
Don't send taskstats (per-pid or per-tgid) on thread exit when no one is
listening for such data.
Currently the taskstats interface allocates a structure, fills it in and
calls netlink to send out per-pid and per-tgid stats regardless of whether
a userspace listener for the data exists (netlink layer would check for
that and avoid the multicast).
As a result of this patch, the check for the no-listener case is performed
early, avoiding the redundant allocation and filling up of the taskstats
structures.
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:45 +0000 (00:24 -0700)]
[PATCH] per task delay accounting taskstats interface: documentation fix
Change documentation and example program to reflect the flow control issues
being addressed by the cpumask changes.
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:44 +0000 (00:24 -0700)]
[PATCH] delay accounting taskstats interface send tgid once
Send per-tgid data only once during exit of a thread group instead of once
with each member thread exit.
Currently, when a thread exits, besides its per-tid data, the per-tgid data
of its thread group is also sent out, if its thread group is non-empty.
The per-tgid data sent consists of the sum of per-tid stats for all
*remaining* threads of the thread group.
This patch modifies this sending in two ways:
- the per-tgid data is sent only when the last thread of a thread group
exits. This cuts down heavily on the overhead of sending/receiving
per-tgid data, especially when other exploiters of the taskstats
interface aren't interested in per-tgid stats
- the semantics of the per-tgid data sent are changed. Instead of being
the sum of per-tid data for remaining threads, the value now sent is the
true total accumalated statistics for all threads that are/were part of
the thread group.
The patch also addresses a minor issue where failure of one accounting
subsystem to fill in the taskstats structure was causing the send of
taskstats to not be sent at all.
The patch has been tested for stability and run cerberus for over 4 hours
on an SMP.
[akpm@osdl.org: bugfixes]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:43 +0000 (00:24 -0700)]
[PATCH] per-task-delay-accounting: /proc export of aggregated block I/O delays
Export I/O delays seen by a task through /proc/<tgid>/stats for use in top
etc.
Note that delays for I/O done for swapping in pages (swapin I/O) is clubbed
together with all other I/O here (this is not the case in the netlink
interface where the swapin I/O is kept distinct)
[akpm@osdl.org: printk warning fix]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shailabh Nagar [Fri, 14 Jul 2006 07:24:42 +0000 (00:24 -0700)]
[PATCH] per-task-delay-accounting: documentation
Some documentation for delay accounting.
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>