linux-2.6
17 years ago[SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option
Vlad Yasevich [Fri, 20 Apr 2007 19:23:15 +0000 (12:23 -0700)] 
[SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option

This option was introduced in draft-ietf-tsvwg-sctpsocket-13.  It
prevents head-of-line blocking in the case of a one-to-many endpoint.
Applications enabling this option really must enable the SCTP_SNDRCV event
so that they know where the data belongs.  Based on an
earlier patch by Ivan Skytte Jørgensen.

Additionally, this functionality now permits multiple associations
on the same endpoint to enter Partial Delivery.  Applications should
be extra careful, when using this functionality, to track EOR indicators.
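
As an illustration only (not part of the patch), a user-space application might turn the option on roughly as follows. The sketch assumes the Linux UAPI header <linux/sctp.h>, uses IPPROTO_SCTP as the option level, and assumes that subscribing to sctp_data_io_event is how the per-message sndrcv information is requested; the helper name enable_fragment_interleave is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sctp.h>

/* Hypothetical helper: returns 0 on success, -1 on failure (errno is set). */
static int enable_fragment_interleave(int sd)
{
    struct sctp_event_subscribe events;
    int on = 1;

    /* Ask for per-message sndrcv info so the receiver can tell which
     * association (and stream) each chunk of data belongs to. */
    memset(&events, 0, sizeof(events));
    events.sctp_data_io_event = 1;
    if (setsockopt(sd, IPPROTO_SCTP, SCTP_EVENTS,
                   &events, sizeof(events)) < 0)
        return -1;

    /* Allow partially delivered messages of different associations
     * to be interleaved on this socket. */
    return setsockopt(sd, IPPROTO_SCTP, SCTP_FRAGMENT_INTERLEAVE,
                      &on, sizeof(on));
}
```

A receiver using this would still have to check MSG_EOR on every recvmsg() call to find message boundaries, as the commit text warns.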

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: qdisc: remove unnecessary memory barriers
Patrick McHardy [Fri, 23 Mar 2007 18:30:04 +0000 (11:30 -0700)] 
[NET_SCHED]: qdisc: remove unnecessary memory barriers

We're holding dev->queue_lock in qdisc_watchdog_schedule and
qdisc_watchdog_cancel, so there is no need for the barriers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: Uninline tcf_destroy
Patrick McHardy [Fri, 23 Mar 2007 18:29:43 +0000 (11:29 -0700)] 
[NET_SCHED]: Uninline tcf_destroy

Uninline tcf_destroy and add a helper function to destroy an entire filter
chain.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: turn PSCHED_GET_TIME into inline function
Patrick McHardy [Fri, 23 Mar 2007 18:29:25 +0000 (11:29 -0700)] 
[NET_SCHED]: turn PSCHED_GET_TIME into inline function

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function
Patrick McHardy [Fri, 23 Mar 2007 18:29:11 +0000 (11:29 -0700)] 
[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function

Also rename to psched_tdiff_bounded.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill PSCHED_TDIFF
Patrick McHardy [Fri, 23 Mar 2007 18:28:55 +0000 (11:28 -0700)] 
[NET_SCHED]: kill PSCHED_TDIFF

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill PSCHED_SET_PASTPERFECT/PSCHED_IS_PASTPERFECT
Patrick McHardy [Fri, 23 Mar 2007 18:28:30 +0000 (11:28 -0700)] 
[NET_SCHED]: kill PSCHED_SET_PASTPERFECT/PSCHED_IS_PASTPERFECT

Use direct assignment and comparison instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill PSCHED_TLESS
Patrick McHardy [Fri, 23 Mar 2007 18:28:07 +0000 (11:28 -0700)] 
[NET_SCHED]: kill PSCHED_TLESS

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill PSCHED_TADD/PSCHED_TADD2
Patrick McHardy [Fri, 23 Mar 2007 18:27:45 +0000 (11:27 -0700)] 
[NET_SCHED]: kill PSCHED_TADD/PSCHED_TADD2

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill PSCHED_AUDIT_TDIFF
Patrick McHardy [Fri, 23 Mar 2007 18:27:29 +0000 (11:27 -0700)] 
[NET_SCHED]: kill PSCHED_AUDIT_TDIFF

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: sch_netem: fix off-by-one in send time comparison
Patrick McHardy [Fri, 23 Mar 2007 18:27:04 +0000 (11:27 -0700)] 
[NET_SCHED]: sch_netem: fix off-by-one in send time comparison

netem checks PSCHED_TLESS(cb->time_to_send, now) to find out whether it is
allowed to send a packet, which is equivalent to cb->time_to_send < now.
Use !PSCHED_TLESS(now, cb->time_to_send) instead to properly handle
cb->time_to_send == now.
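
The difference between the two checks can be sketched in plain C. This is only an illustration: psched_time_t is modelled here as a 64-bit integer and PSCHED_TLESS as the plain "<" comparison described above, and the helper names may_send_old and may_send_new are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel type; assumed to be a plain integer here. */
typedef uint64_t psched_time_t;

/* Same meaning as described in the commit text: a < b. */
#define PSCHED_TLESS(a, b) ((a) < (b))

/* Old check: true only if time_to_send is strictly in the past. */
static bool may_send_old(psched_time_t time_to_send, psched_time_t now)
{
    return PSCHED_TLESS(time_to_send, now);
}

/* New check: true if time_to_send <= now, so the packet may also go
 * out at exactly its scheduled time. */
static bool may_send_new(psched_time_t time_to_send, psched_time_t now)
{
    return !PSCHED_TLESS(now, time_to_send);
}
```

The two functions only disagree when time_to_send == now, which is the case the patch fixes.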

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER] nfnetlink: netlink_run_queue() already checks for NLM_F_REQUEST
Thomas Graf [Fri, 23 Mar 2007 18:17:57 +0000 (11:17 -0700)] 
[NETFILTER] nfnetlink: netlink_run_queue() already checks for NLM_F_REQUEST

Patrick has made use of netlink_run_queue() in nfnetlink while my patches
have been waiting for net-2.6.22 to open. So this check for NLM_F_REQUEST
can go as well.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nf_conntrack: kill destroy() in struct nf_conntrack for diet
Yasuyuki Kozakai [Fri, 23 Mar 2007 18:17:27 +0000 (11:17 -0700)] 
[NETFILTER]: nf_conntrack: kill destroy() in struct nf_conntrack for diet

A per-conntrack destructor is unnecessary, so this replaces it with a
system-wide destructor.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nf_conntrack: don't use nfct in skb if conntrack is disabled
Yasuyuki Kozakai [Fri, 23 Mar 2007 18:17:07 +0000 (11:17 -0700)] 
[NETFILTER]: nf_conntrack: don't use nfct in skb if conntrack is disabled

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: Use setup_timer
Patrick McHardy [Fri, 23 Mar 2007 18:16:30 +0000 (11:16 -0700)] 
[NETFILTER]: Use setup_timer

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: remove conditional locking
Patrick McHardy [Fri, 23 Mar 2007 18:12:50 +0000 (11:12 -0700)] 
[NETFILTER]: nfnetlink_log: remove conditional locking

This is gross, have the wrapper function take the lock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: micro-optimization: inst->skb != NULL in __nfulnl_send()
Michal Miroslaw [Fri, 23 Mar 2007 18:12:21 +0000 (11:12 -0700)] 
[NETFILTER]: nfnetlink_log: micro-optimization: inst->skb != NULL in __nfulnl_send()

Only nfulnl_timer() ever calls __nfulnl_send() with inst->skb == NULL;
no other function does.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: iterator functions need iter_state * only
Michal Miroslaw [Fri, 23 Mar 2007 18:12:03 +0000 (11:12 -0700)] 
[NETFILTER]: nfnetlink_log: iterator functions need iter_state * only

get_*() don't need access to seq_file - iter_state is enough for them.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: micro-optimization: don't modify destroyed instance
Michal Miroslaw [Fri, 23 Mar 2007 18:11:48 +0000 (11:11 -0700)] 
[NETFILTER]: nfnetlink_log: micro-optimization: don't modify destroyed instance

Simple micro-optimization: Don't change any options if the instance is
being destroyed.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: micro-optimization for inst==NULL in nfulnl_recv_config()
Michal Miroslaw [Fri, 23 Mar 2007 18:11:31 +0000 (11:11 -0700)] 
[NETFILTER]: nfnetlink_log: micro-optimization for inst==NULL in nfulnl_recv_config()

Simple micro-optimization: don't call instance_put() on known NULL pointers.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: kill duplicate code
Michal Miroslaw [Fri, 23 Mar 2007 18:11:05 +0000 (11:11 -0700)] 
[NETFILTER]: nfnetlink_log: kill duplicate code

Kill some duplicate code in nfulnl_log_packet().

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: nfnetlink_log: don't count max(a,b) twice
Michal Miroslaw [Fri, 23 Mar 2007 18:10:47 +0000 (11:10 -0700)] 
[NETFILTER]: nfnetlink_log: don't count max(a,b) twice

We don't need local nlbufsiz (skb size) as nfulnl_alloc_skb() takes
the maximum anyway.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: Remove changelogs and CVS IDs
Patrick McHardy [Fri, 23 Mar 2007 18:10:13 +0000 (11:10 -0700)] 
[NETFILTER]: Remove changelogs and CVS IDs

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETEM]: spelling errors
Stephen Hemminger [Fri, 23 Mar 2007 07:12:09 +0000 (00:12 -0700)] 
[NETEM]: spelling errors

Get rid of some of my creative spelling.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Directly return -EINTR from netlink_dump_start()
Thomas Graf [Fri, 23 Mar 2007 06:30:55 +0000 (23:30 -0700)] 
[NETLINK]: Directly return -EINTR from netlink_dump_start()

Now that all users of netlink_dump_start() use netlink_run_queue()
to process the receive queue, it is possible to return -EINTR from
netlink_dump_start() directly, therefore simplifying the callers.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPv4] diag: Use netlink_run_queue() to process the receive queue
Thomas Graf [Fri, 23 Mar 2007 06:30:35 +0000 (23:30 -0700)] 
[IPv4] diag: Use netlink_run_queue() to process the receive queue

Makes use of netlink_run_queue() to process the receive queue and
converts inet_diag_rcv_msg() to use the type safe netlink interface.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Remove error pointer from netlink message handler
Thomas Graf [Fri, 23 Mar 2007 06:30:12 +0000 (23:30 -0700)] 
[NETLINK]: Remove error pointer from netlink message handler

The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and clearer to return -EINTR and have netlink_run_queue()
deal with getting the queue right.

nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.

This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Ignore control messages directly in netlink_run_queue()
Thomas Graf [Fri, 23 Mar 2007 06:29:10 +0000 (23:29 -0700)] 
[NETLINK]: Ignore control messages directly in netlink_run_queue()

Changes netlink_rcv_skb() to skip netlink control messages and not
pass them on to the message handler.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue()
Thomas Graf [Fri, 23 Mar 2007 06:28:46 +0000 (23:28 -0700)] 
[NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue()

netlink_rcv_skb() is changed to skip messages which don't have the
NLM_F_REQUEST bit to avoid every netlink family having to perform this
check on their own.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Remove unused groups variable
Thomas Graf [Fri, 23 Mar 2007 06:27:39 +0000 (23:27 -0700)] 
[NETLINK]: Remove unused groups variable

Leftover from dynamic multicast groups allocation work.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP] westwood: Use type safe netlink interface
Thomas Graf [Fri, 23 Mar 2007 06:27:19 +0000 (23:27 -0700)] 
[TCP] westwood: Use type safe netlink interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP] vegas: Use type safe netlink interface
Thomas Graf [Fri, 23 Mar 2007 06:27:01 +0000 (23:27 -0700)] 
[TCP] vegas: Use type safe netlink interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNL]: Properly return rtnl message handler
Thomas Graf [Fri, 23 Mar 2007 04:41:06 +0000 (21:41 -0700)] 
[RTNL]: Properly return rtnl message handler

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED] qdisc: avoid transmit softirq on watchdog wakeup
Stephen Hemminger [Thu, 22 Mar 2007 19:18:35 +0000 (12:18 -0700)] 
[NET_SCHED] qdisc: avoid transmit softirq on watchdog wakeup

If possible, avoid having to do a transmit softirq when a qdisc
watchdog decides to re-enable.  The watchdog routine runs off
a timer, so it is already in the same effective context as
the softirq.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETEM]: avoid excessive requeues
Stephen Hemminger [Thu, 22 Mar 2007 19:17:42 +0000 (12:17 -0700)] 
[NETEM]: avoid excessive requeues

The netem code would call getnstimeofday() and dequeue/requeue after
every packet, even if it was waiting. Avoid this overhead by using
the throttled flag.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETEM]: Optimize tfifo
Stephen Hemminger [Thu, 22 Mar 2007 19:17:05 +0000 (12:17 -0700)] 
[NETEM]: Optimize tfifo

In most cases, the next packet will be sent after the
last one. So optimize that case.
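
A minimal sketch of the idea, assuming a doubly linked queue kept sorted by send time (the struct and function names below are hypothetical, not the netem code): insertion starts at the tail, so a packet that is due after everything already queued is placed after a single comparison.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {
    uint64_t time_to_send;
    struct pkt *prev, *next;
};

struct tfifo {
    struct pkt *head, *tail;   /* head holds the earliest send time */
};

/* Insert p so that the queue stays sorted by time_to_send. */
static void tfifo_enqueue(struct tfifo *q, struct pkt *p)
{
    struct pkt *pos = q->tail;

    /* Common case: p is due no earlier than the tail, so the loop body
     * never runs. Otherwise walk backwards to the insertion point. */
    while (pos != NULL && p->time_to_send < pos->time_to_send)
        pos = pos->prev;

    /* Link p directly after pos; pos == NULL means p is the new head. */
    p->prev = pos;
    p->next = pos ? pos->next : q->head;
    if (p->next)
        p->next->prev = p;
    else
        q->tail = p;
    if (pos)
        pos->next = p;
    else
        q->head = p;
}
```

Walking from the tail rather than the head keeps the cost constant for in-order arrivals and proportional to the amount of reordering otherwise.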

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETEM]: use better types for time values
Stephen Hemminger [Thu, 22 Mar 2007 19:16:21 +0000 (12:16 -0700)] 
[NETEM]: use better types for time values

The random number generator always generates 32 bit values.
The time values are limited by psched_tdiff_t.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETEM]: report reorder percent correctly.
Stephen Hemminger [Thu, 22 Mar 2007 19:15:45 +0000 (12:15 -0700)] 
[NETEM]: report reorder percent correctly.

If you set up netem to just delay packets, "tc qdisc ls" will report
the reordering as 100%. Well it's a lie, reorder isn't used unless
gap is set, so just set the value to 0 so the output of the utility
is correct.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[TCP]: cubic optimization
Stephen Hemminger [Thu, 22 Mar 2007 19:10:58 +0000 (12:10 -0700)] 
[TCP]: cubic optimization

Use willy's work in optimizing cube root by having table for small values.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[LIB]: div64_64 optimization
Stephen Hemminger [Thu, 22 Mar 2007 19:10:18 +0000 (12:10 -0700)] 
[LIB]: div64_64 optimization

Minor optimization of div64_64.  do_div() already does optimization
for the case of 32 by 32 divide, so no need to do it here.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET] rules: Unified rules dumping
Thomas Graf [Mon, 26 Mar 2007 06:24:24 +0000 (23:24 -0700)] 
[NET] rules: Unified rules dumping

Implements a unified, protocol independent rules dumping function
which is capable of dumping either a specific protocol family or
all of them. This speeds up dumping as fewer lookups are required.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNL]: Use rtnl registration interface for dump-all aliases
Thomas Graf [Thu, 22 Mar 2007 18:59:42 +0000 (11:59 -0700)] 
[RTNL]: Use rtnl registration interface for dump-all aliases

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[BRIDGE]: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:59:03 +0000 (11:59 -0700)] 
[BRIDGE]: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPv6]: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:58:32 +0000 (11:58 -0700)] 
[IPv6]: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DECNet]: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:57:46 +0000 (11:57 -0700)] 
[DECNet]: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PKT_SCHED] act: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:56:59 +0000 (11:56 -0700)] 
[PKT_SCHED] act: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PKT_SCHED] cls: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:56:22 +0000 (11:56 -0700)] 
[PKT_SCHED] cls: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PKT_SCHED] qdisc: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:55:50 +0000 (11:55 -0700)] 
[PKT_SCHED] qdisc: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPv4]: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:55:17 +0000 (11:55 -0700)] 
[IPv4]: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET] rules: Use rtnl registration interface
Thomas Graf [Mon, 26 Mar 2007 06:20:05 +0000 (23:20 -0700)] 
[NET] rules: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NEIGH]: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:50:06 +0000 (11:50 -0700)] 
[NEIGH]: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET] link: Use rtnl registration interface
Thomas Graf [Thu, 22 Mar 2007 18:49:22 +0000 (11:49 -0700)] 
[NET] link: Use rtnl registration interface

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[RTNL]: Message handler registration interface
Thomas Graf [Thu, 22 Mar 2007 18:48:11 +0000 (11:48 -0700)] 
[RTNL]: Message handler registration interface

This patch adds a new interface to register rtnetlink message
handlers replacing the exported rtnl_links[] array which
required many message handlers to be exported unnecessarily.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Use initial RTT sample from SYN exchange
Gerrit Renker [Tue, 20 Mar 2007 18:31:56 +0000 (15:31 -0300)] 
[CCID3]: Use initial RTT sample from SYN exchange

The patch follows the following recommendation made in an erratum to RFC 4342:

  "Senders MAY additionally make use of other available RTT measurements,
   including those from the initial Request-Response packet exchange."

It implements larger initial windows with regard to this initial RTT measurement,
using the mechanism suggested in draft-ietf-dccp-rfc3448bis, section 4.2.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Sample RTT from SYN exchange
Gerrit Renker [Tue, 20 Mar 2007 18:27:17 +0000 (15:27 -0300)] 
[DCCP]: Sample RTT from SYN exchange

Function:

17 years ago[CCID3]: Use function for RTT sampling
Gerrit Renker [Tue, 20 Mar 2007 18:24:37 +0000 (15:24 -0300)] 
[CCID3]: Use function for RTT sampling

This replaces the existing occurrences of RTT sampling with
the use of the new function dccp_sample_rtt.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Provide function for RTT sampling
Gerrit Renker [Tue, 20 Mar 2007 18:23:18 +0000 (15:23 -0300)] 
[DCCP]: Provide function for RTT sampling

A recurring problem, in particular in the CCID code, is that RTT samples
from packets with timestamp echo and elapsed time options need to be taken.

This service is provided via a new function dccp_sample_rtt in this patch.
Furthermore, to protect against `insane' RTT samples, the sampled value
is bounded between 100 microseconds and 4 seconds - for which u32 is sufficient.
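
The bounding step can be sketched as below. This shows only the clamping described above, not the real dccp_sample_rtt; the constant and function names are made up for the example, and the input is assumed to be a signed difference in microseconds.

```c
#include <stdint.h>

#define RTT_SAMPLE_MIN_US 100U       /* 100 microseconds */
#define RTT_SAMPLE_MAX_US 4000000U   /* 4 seconds */

/* Clamp a raw (possibly negative or huge) RTT sample into the sane range.
 * The upper bound of 4,000,000 us fits comfortably in 32 bits. */
static uint32_t bound_rtt_sample(int64_t delta_us)
{
    if (delta_us < (int64_t)RTT_SAMPLE_MIN_US)
        return RTT_SAMPLE_MIN_US;
    if (delta_us > (int64_t)RTT_SAMPLE_MAX_US)
        return RTT_SAMPLE_MAX_US;
    return (uint32_t)delta_us;
}
```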

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Handle Idle and Application-Limited periods
Gerrit Renker [Tue, 20 Mar 2007 18:19:07 +0000 (15:19 -0300)] 
[CCID3]: Handle Idle and Application-Limited periods

This updates the code with regard to handling idle and application-limited
periods as specified in [RFC 4342, 5.1].

Background:

17 years ago[CCID3]: Wrap computation of RFC3390-initial rate into separate function
Gerrit Renker [Tue, 20 Mar 2007 18:12:10 +0000 (15:12 -0300)] 
[CCID3]: Wrap computation of RFC3390-initial rate into separate function

The CCID 3 and TFRC specs (RFC 4342, RFC 3448, draft-3448bis) make frequent
reference to the computation of the RFC-3390 initial sending rate:

  1. Initial sending rate when RTT is known (RFC 4342, p. 6)
  2. Response to Idle/Application-Limited periods (RFC 4342, 5.1)

This warrants putting the code into its own function, for later code reuse.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Remove build warnings for 64bit
Gerrit Renker [Tue, 20 Mar 2007 18:04:30 +0000 (15:04 -0300)] 
[CCID3]: Remove build warnings for 64bit

This clears the following sparc64 build warnings:
 1) warning: format "%ld" expects type "long int", but argument 3 has type "suseconds_t"
 2) warning: format "%llu" expects type "long long unsigned int", but argument 3 has type "__u64"
Fixed by using typecast to unsigned. This is argued to be safe, since the quantities, after
de-scaling (factor 2^6) fit all in u32.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: More to see in dccp_probe
Gerrit Renker [Tue, 20 Mar 2007 18:02:10 +0000 (15:02 -0300)] 
[CCID3]: More to see in dccp_probe

This adds a few more fields of interest to /proc/net/dccpprobe, the following output ensues:

1           2          3           4     5  6     7   8        9        10   11
sec.usec   src:sport   dst:dport   size  s  rtt   p   X_calc   X_recv   X    t_ipi

Also made the formatting consistent.

Scripts that go with this can be downloaded from http://139.133.210.30/users/gerrit/dccp/dccp_probe/

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Add documentation for socket options
Gerrit Renker [Tue, 20 Mar 2007 18:01:14 +0000 (15:01 -0300)] 
[CCID3]: Add documentation for socket options

This updates the documentation on CCID3-specific options.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: More debug information for dccp_wait_for_ccid
Gerrit Renker [Tue, 20 Mar 2007 18:00:28 +0000 (15:00 -0300)] 
[DCCP]: More debug information for dccp_wait_for_ccid

This adds more detail in the wait_for_ccid packet scheduling loop.
In particular, it informs about (i) when delay is used and (ii) why
a packet is discarded.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Always use debug-toggle parameters
Gerrit Renker [Tue, 20 Mar 2007 17:59:23 +0000 (14:59 -0300)] 
[DCCP]: Always use debug-toggle parameters

Currently debugging output (when configured) is automatically enabled when
DCCP modules are compiled into the kernel rather than built as loadable modules.
This is not necessary, since the module parameters in this case become kernel
commandline parameters, e.g. DCCP or CCID3 debug output can be enabled for a
static build by appending the following at the boot prompt:

dccp.dccp_debug=1  dccp_ccid3.ccid3_debug=1

This patch therefore does away with the more complicated way of always enabling
debug output for static builds.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Remove race condition and update t_ipi when `s' changes
Gerrit Renker [Tue, 20 Mar 2007 17:56:11 +0000 (14:56 -0300)] 
[CCID3]: Remove race condition and update t_ipi when `s' changes

This:

 1. removes a race condition in the access to the scheduled send time t_nom which
    results from allowing asynchronous r/w access to t_nom without locks;

 2. updates the inter-packet interval t_ipi = s/X when `s' changes, following a
    suggestion by Ian McDonald.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: More verbose debugging
Ian McDonald [Tue, 20 Mar 2007 17:49:20 +0000 (14:49 -0300)] 
[CCID3]: More verbose debugging

This adds a few debugging statements to ccid3.c

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Fix use of invalid loss intervals
Ian McDonald [Tue, 20 Mar 2007 17:46:52 +0000 (14:46 -0300)] 
[CCID3]: Fix use of invalid loss intervals

This fixes a bug which uses an invalid comparison.
The bug resulted in the use of invalid loss intervals.

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Use MSS for larger initial windows
Gerrit Renker [Tue, 20 Mar 2007 17:28:44 +0000 (14:28 -0300)] 
[CCID3]: Use MSS for larger initial windows

This improves the slow-start phase by using the MSS
(as suggested in RFC 4342, sec. 5) instead of the packet size s.
Also figured out that __u32 is ample for these quantities.

After applying, I got the following in the logs:

  ccid3_hc_tx_packet_recv: client(f7421700), s=6, MSS=1424, w_init=4380, R_sample=176us, X=24886363

Had the previous variant been used, w_init would have been as low as 24.
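
The two w_init values quoted above can be reproduced with the RFC 3390 initial window formula, min(4*MSS, max(2*MSS, 4380)). The function below is a stand-alone sketch of that formula, not the code from the patch, and its name is made up.

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* RFC 3390 initial window in bytes for a given segment size. */
static uint32_t rfc3390_initial_window(uint32_t seg_size)
{
    return min_u32(4 * seg_size, max_u32(2 * seg_size, 4380));
}
```

With MSS = 1424 this gives min(5696, max(2848, 4380)) = 4380, and with the packet size s = 6 it gives min(24, max(12, 4380)) = 24, matching the log line and the remark above.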

Committer note: removed unneeded cast to unsigned long long that was
                causing a compiler warning on 64bit architectures.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Re-order CCID 3 source file
Gerrit Renker [Tue, 20 Mar 2007 16:11:24 +0000 (13:11 -0300)] 
[CCID3]: Re-order CCID 3 source file

No code change at all.
This splits ccid3.c into a RX and a TX section, so that the file has an
organisation similar to the other ones (e.g. packet_history.{h,c}).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[CCID3]: Remove redundant `len' test
Gerrit Renker [Tue, 20 Mar 2007 16:10:15 +0000 (13:10 -0300)] 
[CCID3]: Remove redundant `len' test

Since CCID3 avoids sending 0-byte data packets (cf. ccid3_hc_tx_send_packet),
testing for zero-payload length, as performed by ccid3_hc_tx_update_s, is
redundant - hence removed by this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Remove ambiguity in the way before48 is used
Gerrit Renker [Tue, 20 Mar 2007 16:08:19 +0000 (13:08 -0300)] 
[DCCP]: Remove ambiguity in the way before48 is used

This removes two ambiguities in employing the new definition of before48,
following the analysis on http://www.mail-archive.com/dccp@vger.kernel.org/msg01295.html

 (1) Updating GSR when P.seqno >= S.SWL
     With the old definition we did not update when P.seqno and S.SWL are 2^47 apart. To
     ensure the same behaviour as with the old definition, this is replaced with the
     equivalent condition dccp_delta_seqno(S.SWL, P.seqno) >= 0

 (2) Sending SYNC when P.seqno >= S.OSR
     Here it is debatable whether the new definition causes an ambiguity: the case is
     similar to (1); and to have consistency with the case (1), we use the equivalent
     condition dccp_delta_seqno(S.OSR, P.seqno) >= 0

Detailed Justification

17 years ago[DCCP]: Fix for follows48
Gerrit Renker [Tue, 20 Mar 2007 16:03:47 +0000 (13:03 -0300)] 
[DCCP]: Fix for follows48

The follows48 relation identifies whether 48-bit sequence number
x is the direct successor of y. Currently, it does not handle cases
of the following type correctly:

follows48(0x(prefix)10000LL, 0x(prefix)0FFFFLL)

where prefix is an arbitrary hex sequence of up to 7 digits.

This is fixed by reusing the new dccp_delta_seqno function.
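
One way to express the relation through a signed 48-bit distance is sketched below. It is an assumption about the general approach, not a copy of the kernel code, and the helper delta_seqno48 is a stand-in for dccp_delta_seqno.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK   ((UINT64_C(1) << 48) - 1)
#define SEQ48_TOPBIT (UINT64_C(1) << 47)

/* Signed distance from `from` to `to`, computed modulo 2^48. */
static int64_t delta_seqno48(uint64_t from, uint64_t to)
{
    uint64_t delta = (to - from) & SEQ48_MASK;

    /* Values in the upper half of the 48-bit circle count as negative. */
    if (delta & SEQ48_TOPBIT)
        return (int64_t)delta - (INT64_C(1) << 48);
    return (int64_t)delta;
}

/* True if x is the direct successor of y in 48-bit sequence space. */
static bool follows48(uint64_t x, uint64_t y)
{
    return delta_seqno48(y, x) == 1;
}
```

Because the whole 48-bit value enters the subtraction, a carry across the lower digits (as in the 0x...10000 / 0x...0FFFF case above) and the wrap from 2^48 - 1 to 0 are both handled.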

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[DCCP]: Make `before' relation unambiguous
Gerrit Renker [Tue, 20 Mar 2007 16:00:26 +0000 (13:00 -0300)] 
[DCCP]: Make `before' relation unambiguous

Problem:

17 years ago[DCCP]: Make dccp_delta_seqno return signed numbers
Gerrit Renker [Tue, 20 Mar 2007 15:45:59 +0000 (12:45 -0300)] 
[DCCP]: Make dccp_delta_seqno return signed numbers

Problem:

17 years ago[DCCP]: 48-bit sequence number arithmetic
Gerrit Renker [Tue, 20 Mar 2007 15:26:51 +0000 (12:26 -0300)] 
[DCCP]: 48-bit sequence number arithmetic

This patch
 * organizes the sequence arithmetic functions into one corner of dccp.h
 * performs a small modification of dccp_set_seqno to make it more widely reusable
   (now it is safe to use any number, since it performs modulo-2^48 assignment)
 * adds functions and generic macros for 48-bit sequence arithmetic:
   -- 48-bit complement
   -- modulo-2^48 addition and modulo-2^48 subtraction
   -- dccp_inc_seqno is now a special case of add48

Constants renamed following a suggestion by Arnaldo.
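
The arithmetic listed above can be sketched in userspace C like this. The function names are hypothetical and the kernel's own macros may be written differently; the sketch only shows the modulo-2^48 operations themselves.

```c
#include <stdint.h>

#define MASK48 ((UINT64_C(1) << 48) - 1)

/* 48-bit complement: the value c with (x + c) mod 2^48 == 0. */
static uint64_t complement48(uint64_t x)
{
    return (UINT64_C(0) - x) & MASK48;
}

/* Addition modulo 2^48. */
static uint64_t add48(uint64_t a, uint64_t b)
{
    return (a + b) & MASK48;
}

/* Subtraction modulo 2^48, expressed as addition of the complement. */
static uint64_t sub48(uint64_t a, uint64_t b)
{
    return add48(a, complement48(b));
}

/* Incrementing a sequence number is add48 with 1. */
static uint64_t inc48(uint64_t seqno)
{
    return add48(seqno, 1);
}
```

Masking on every operation is what makes it safe to pass in any 64-bit value, in the same spirit as the modulo-2^48 assignment mentioned for dccp_set_seqno.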

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[FORCEDETH]: Use skb_tailroom where appropriate
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 15:08:20 +0000 (12:08 -0300)] 
[FORCEDETH]: Use skb_tailroom where appropriate

Reducing the number of skb->data direct accesses.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[LMC]: lmc_main wants to use skb_tailroom
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 15:00:44 +0000 (12:00 -0300)] 
[LMC]: lmc_main wants to use skb_tailroom

At that point it is equivalent to what was being used, skb->end - skb->data,
and the need is clearly the one skb_tailroom satisfies.
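
To illustrate the equivalence, here is a toy model of the buffer pointers. The struct and function names are hypothetical, and the reading that tail equals data at that point (no payload added yet) is an assumption. The real skb_tailroom also accounts for non-linear buffers, which this sketch ignores.

```c
#include <stddef.h>

/* Toy stand-in for the buffer pointers of struct sk_buff. */
struct toy_skb {
    unsigned char *head; /* start of the allocated buffer */
    unsigned char *data; /* start of the payload */
    unsigned char *tail; /* end of the payload */
    unsigned char *end;  /* end of the allocated buffer */
};

/* Free bytes after the payload. */
static size_t toy_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}
```

While tail == data, toy_tailroom() returns the same value as end - data. Once payload has been appended the two differ, and the tailroom is the one that says how much can still be written.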

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[ATM] idt77252: Fix double kfree_skb on failure in push_rx_skb
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 14:52:34 +0000 (11:52 -0300)] 
[ATM] idt77252: Fix double kfree_skb on failure in push_rx_skb

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
17 years ago[SK_BUFF] ipv6: Use skb_network_offset in some more places
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 01:29:03 +0000 (22:29 -0300)] 
[SK_BUFF] ipv6: Use skb_network_offset in some more places

So that we reduce the number of direct accesses to skb->data.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
17 years ago[NETLINK]: Use nlmsg_trim() where appropriate
Arnaldo Carvalho de Melo [Mon, 26 Mar 2007 06:06:12 +0000 (23:06 -0700)] 
[NETLINK]: Use nlmsg_trim() where appropriate

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETLINK]: Remove NLMSG_{NEW_ANSWER,CANCEL,END}
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 01:28:08 +0000 (22:28 -0300)] 
[NETLINK]: Remove NLMSG_{NEW_ANSWER,CANCEL,END}

Not used anywhere and defined inside __KERNEL__, Thomas acked this on irc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
17 years ago[SK_BUFF]: Remove skb_add_mtu() leftovers
Arnaldo Carvalho de Melo [Tue, 20 Mar 2007 01:27:36 +0000 (22:27 -0300)] 
[SK_BUFF]: Remove skb_add_mtu() leftovers

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
17 years ago[NETLINK]: Introduce nlmsg_hdr() helper
Arnaldo Carvalho de Melo [Thu, 26 Apr 2007 02:08:35 +0000 (19:08 -0700)] 
[NETLINK]: Introduce nlmsg_hdr() helper

For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.
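
The shape of such a helper, shown on toy types so that it compiles outside the kernel. The toy_ names and the reduced structs are assumptions; only the cast pattern comes from the text above.

```c
#include <stdint.h>

/* Toy stand-in for struct nlmsghdr: only two fields are modelled. */
struct toy_nlmsghdr {
    uint32_t nlmsg_len;
    uint16_t nlmsg_type;
};

/* Toy stand-in for struct sk_buff: only the data pointer is modelled. */
struct toy_nl_skb {
    unsigned char *data;
};

/* One helper instead of an open-coded cast at every call site. */
static struct toy_nlmsghdr *toy_nlmsg_hdr(const struct toy_nl_skb *skb)
{
    return (struct toy_nlmsghdr *)skb->data;
}
```

Call sites then read toy_nlmsg_hdr(skb)->nlmsg_len rather than repeating the cast, which keeps direct skb->data accesses in one place.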

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV4]: fib_trie root node settings
Robert Olsson [Mon, 19 Mar 2007 23:29:58 +0000 (16:29 -0700)] 
[IPV4]: fib_trie root node settings

The threshold for the root node can be set more aggressively to get
better tree compression. The new setting makes the root grow
from 16 to 19 bits and gives a substantial improvement in average depth;
this is with the current table of 214393 prefixes.

But really, the dynamic resize needs more investigation,
both in terms of convergence and performance, and maybe it should
be possible to change...

Maybe just for the brave to start with, or we may have to back
this out.

17 years ago[IPV4]: fib_trie resize break
Robert Olsson [Mon, 19 Mar 2007 23:27:37 +0000 (16:27 -0700)] 
[IPV4]: fib_trie resize break

The patch below adds a break condition for the resize operations. If
we don't achieve the desired fill factor, a warning is printed. The trie
should still be operational, but new thresholds should be considered.
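
The general shape of a bounded resize loop can be sketched as below. Everything in it is an assumption made for illustration and is not taken from the patch: the names, the cap of 10 iterations, and the simplified fill model (entries per slot, in percent).

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_RESIZE 10 /* assumed iteration cap */

/* Doubles *size until the fill (in percent) drops to `target` or the
 * iteration cap is hit. *size must be non-zero. Returns true when the
 * target was reached; otherwise prints a warning and returns false. */
static bool resize_bounded(unsigned entries, unsigned *size, unsigned target)
{
    int budget = MAX_RESIZE;

    while (entries * 100 / *size > target && budget-- > 0)
        *size *= 2;

    if (entries * 100 / *size > target) {
        fprintf(stderr, "warning: fill target %u%% not reached\n", target);
        return false;
    }
    return true;
}
```

The structure stays usable in both outcomes; the warning only signals that the thresholds deserve another look.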

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Adjust the zeroing up to tail in __alloc_skb too
Arnaldo Carvalho de Melo [Mon, 19 Mar 2007 13:48:59 +0000 (10:48 -0300)] 
[SK_BUFF]: Adjust the zeroing up to tail in __alloc_skb too

I did it just in alloc_skb_from_cache, forgot __alloc_skb, fixed now.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Convert skb->end to sk_buff_data_t
Arnaldo Carvalho de Melo [Fri, 20 Apr 2007 03:43:29 +0000 (20:43 -0700)] 
[SK_BUFF]: Convert skb->end to sk_buff_data_t

Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Convert skb->tail to sk_buff_data_t
Arnaldo Carvalho de Melo [Fri, 20 Apr 2007 03:29:13 +0000 (20:29 -0700)] 
[SK_BUFF]: Convert skb->tail to sk_buff_data_t

So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[VLAN] vlan_dev: Use skb_reset_network_header().
David S. Miller [Fri, 20 Apr 2007 03:34:51 +0000 (20:34 -0700)] 
[VLAN] vlan_dev: Use skb_reset_network_header().

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: SMC SuperIO Chip LPC47N227 not identified properly
Peter Kovar [Sat, 17 Mar 2007 03:39:25 +0000 (20:39 -0700)] 
[IrDA]: SMC SuperIO Chip LPC47N227 not identified properly

SMC SuperIO Chip LPC47N227 used for IrDA is not detected because its device
identification byte can be 0x7A instead of 0x5A.

Patch from Peter Kovar <peter.kovar@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: irda lockdep annotation
Samuel Ortiz [Sat, 17 Mar 2007 03:38:23 +0000 (20:38 -0700)] 
[IrDA]: irda lockdep annotation

Rmmoding irda triggers a lockdep false positive.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: removing stir4200 useless include
Samuel Ortiz [Sat, 17 Mar 2007 03:35:25 +0000 (20:35 -0700)] 
[IrDA]: removing stir4200 useless include

stir4200 doesn't need to include irlap.h

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Use offsets for skb->{mac,network,transport}_header on 64bit architectures
Arnaldo Carvalho de Melo [Wed, 11 Apr 2007 04:22:35 +0000 (21:22 -0700)] 
[SK_BUFF]: Use offsets for skb->{mac,network,transport}_header on 64bit architectures

With this we save 8 bytes per network packet, leaving a 4-byte hole to be used
in further shrinking work, likely with the offsetization of other pointers
such as ->{data,tail,end}. The cost is extra adds, which are minimized by the
usual practice of setting skb->{mac,nh,h}.raw to a local variable that is then
accessed multiple times in each function. It is also no more expensive than
before for most of the handling of such headers, such as setting one
of these headers to another (transport to network, etc.), subtracting from or
adding to it, or comparing them.
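
The trade-off can be sketched on a toy struct. The toy_ names are hypothetical, and toy_data_t stands in for sk_buff_data_t under the assumption that it is a 32-bit offset from head on 64-bit builds.

```c
#include <stdint.h>

typedef uint32_t toy_data_t; /* 4-byte offset instead of an 8-byte pointer */

struct toy_off_skb {
    unsigned char *head;         /* start of the buffer */
    toy_data_t transport_header; /* offset from head */
    toy_data_t network_header;   /* offset from head */
};

/* Reading a header costs one add: head + offset. */
static unsigned char *toy_network_header(const struct toy_off_skb *skb)
{
    return skb->head + skb->network_header;
}

/* Setting one header to another is a plain copy, as it was with pointers. */
static void toy_set_transport_to_network(struct toy_off_skb *skb)
{
    skb->transport_header = skb->network_header;
}
```

Only the accessor that turns an offset back into a pointer pays for the add; copies and comparisons between headers work on the offsets directly.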

Now we have this layout for sk_buff on a x86_64 machine:

[acme@mica net-2.6.22]$ pahole vmlinux sk_buff
struct sk_buff {
        struct sk_buff *       next;             /*   0   8 */
        struct sk_buff *       prev;             /*   8   8 */
        struct rb_node         rb;               /*  16  24 */
        struct sock *          sk;               /*  40   8 */
        ktime_t                tstamp;           /*  48   8 */
        struct net_device *    dev;              /*  56   8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        struct net_device *    input_dev;        /*  64   8 */
        sk_buff_data_t         transport_header; /*  72   4 */
        sk_buff_data_t         network_header;   /*  76   4 */
        sk_buff_data_t         mac_header;       /*  80   4 */

        /* XXX 4 bytes hole, try to pack */

        struct dst_entry *     dst;              /*  88   8 */
        struct sec_path *      sp;               /*  96   8 */
        char                   cb[48];           /* 104  48 */
        /* --- cacheline 2 boundary (128 bytes) was 24 bytes ago --- */
        unsigned int           len;              /* 152   4 */
        unsigned int           data_len;         /* 156   4 */
        unsigned int           mac_len;          /* 160   4 */
        union {
                __wsum         csum;             /*       4 */
                __u32          csum_offset;      /*       4 */
        };                                       /* 164   4 */
        __u32                  priority;         /* 168   4 */
        __u8                   local_df:1;       /* 172   1 */
        __u8                   cloned:1;         /* 172   1 */
        __u8                   ip_summed:2;      /* 172   1 */
        __u8                   nohdr:1;          /* 172   1 */
        __u8                   nfctinfo:3;       /* 172   1 */
        __u8                   pkt_type:3;       /* 173   1 */
        __u8                   fclone:2;         /* 173   1 */
        __u8                   ipvs_property:1;  /* 173   1 */

        /* XXX 2 bits hole, try to pack */

        __be16                 protocol;         /* 174   2 */
        void    (*destructor)(struct sk_buff *); /* 176   8 */
        struct nf_conntrack *  nfct;             /* 184   8 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        struct sk_buff *       nfct_reasm;       /* 192   8 */
        struct nf_bridge_info *nf_bridge;        /* 200   8 */
        __u16                  tc_index;         /* 208   2 */
        __u16                  tc_verd;          /* 210   2 */
        dma_cookie_t           dma_cookie;       /* 212   4 */
        __u32                  secmark;          /* 216   4 */
        __u32                  mark;             /* 220   4 */
        unsigned int           truesize;         /* 224   4 */
        atomic_t               users;            /* 228   4 */
        unsigned char *        head;             /* 232   8 */
        unsigned char *        data;             /* 240   8 */
        unsigned char *        tail;             /* 248   8 */
        /* --- cacheline 4 boundary (256 bytes) --- */
        unsigned char *        end;              /* 256   8 */
}; /* size: 264, cachelines: 5 */
   /* sum members: 260, holes: 1, sum holes: 4 */
   /* bit holes: 1, sum bit holes: 2 bits */
   /* last cacheline: 8 bytes */

On 32 bits nothing changes, and pointers continue to be used with the compiler
turning all this abstraction layer into dust. But there are some sk_buff
validation tricks that are now possible, humm... :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: unions of just one member don't get anything done, kill them
Arnaldo Carvalho de Melo [Wed, 11 Apr 2007 04:21:55 +0000 (21:21 -0700)] 
[SK_BUFF]: unions of just one member don't get anything done, kill them

Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Introduce skb_network_header_len
Arnaldo Carvalho de Melo [Fri, 16 Mar 2007 20:26:39 +0000 (17:26 -0300)] 
[SK_BUFF]: Introduce skb_network_header_len

For the common sequence "skb->h.raw - skb->nh.raw". This is similar to
skb->mac_len, which is precalculated, but I don't think we need to bloat skb
with one more member, so just use this new helper, reducing the number of
non-skbuff.h references to the layer headers even more.
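
A sketch of what such a helper computes, on a toy struct with hypothetical names. It uses plain pointers, as in the "skb->h.raw - skb->nh.raw" sequence quoted above, and is not the kernel's definition.

```c
#include <stddef.h>

/* Toy stand-in holding only the two header pointers involved. */
struct toy_hdr_skb {
    unsigned char *transport; /* start of the transport header */
    unsigned char *network;   /* start of the network header */
};

/* Length of the network header: the distance between the two header
 * starts, computed on demand instead of being stored in the struct. */
static size_t toy_network_header_len(const struct toy_hdr_skb *skb)
{
    return (size_t)(skb->transport - skb->network);
}
```

Computing the length on demand trades one subtraction per call for not growing the struct by another member.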

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SK_BUFF]: Use the helpers to get the layer header pointer
Arnaldo Carvalho de Melo [Fri, 16 Mar 2007 20:19:57 +0000 (17:19 -0300)] 
[SK_BUFF]: Use the helpers to get the layer header pointer

Some more cases...

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: Fix warning
Patrick McHardy [Fri, 16 Mar 2007 19:34:52 +0000 (12:34 -0700)] 
[NET_SCHED]: Fix warning

net/sched/sch_api.c: In function 'psched_show':
net/sched/sch_api.c:1219: warning: format '%08x' expects type 'unsigned int', but argument 6 has type 's64'
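
One common way to silence this class of warning is to make the argument type match the format explicitly. The sketch below shows that pattern in userspace C; whether the actual fix used a cast is not stated here, so treat the cast as an assumption. format_res is a hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats a 64-bit value as 8 hex digits by truncating it explicitly to
 * unsigned int, so the argument type matches what %08x expects. */
static int format_res(char *buf, size_t len, int64_t res)
{
    return snprintf(buf, len, "%08x", (unsigned int)res);
}
```

The cast discards the upper 32 bits, so it is only appropriate when the value is known to fit.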

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: sch_cbq: fix watchdog scheduled too late
Patrick McHardy [Fri, 16 Mar 2007 19:31:28 +0000 (12:31 -0700)] 
[NET_SCHED]: sch_cbq: fix watchdog scheduled too late

q->now is increased during dequeue and doesn't contain the current time
afterwards, resulting in a too large timeout value for the qdisc watchdog.
Use "now" instead, which still contains the current time.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: Export real timer resolution in /proc/net/psched
Patrick McHardy [Fri, 16 Mar 2007 08:23:28 +0000 (01:23 -0700)] 
[NET_SCHED]: Export real timer resolution in /proc/net/psched

The timer resolution exported in /proc/net/psched is used by userspace to
calculate HTB's burst values. Currently it is set to HZ. Since we're now
using hrtimers, use KTIME_MONOTONIC_RES, which makes HTB use smaller burst
values.

This patch also affects libnl, which incorrectly uses this value for
the SFQ perturbation parameter, which is always in seconds, and some
routing cache values, which are in USER_HZ, so both cases are broken
anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: kill jiffie conversion macros
Patrick McHardy [Fri, 16 Mar 2007 08:23:02 +0000 (01:23 -0700)] 
[NET_SCHED]: kill jiffie conversion macros

Now that all packet schedulers have been converted to hrtimers, most users
of PSCHED_JIFFIE2US and PSCHED_US2JIFFIE are gone. The remaining users use
them to convert external time units to packet scheduler clock ticks, so use
PSCHED_TICKS_PER_SEC instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>