linux-2.6
17 years ago[GFS2] make lock_dlm drop_count tunable in sysfs
David Teigland [Thu, 25 Jan 2007 20:24:04 +0000 (14:24 -0600)] 
[GFS2] make lock_dlm drop_count tunable in sysfs

We want to be able to change or disable the default drop_count (number at
which the dlm asks gfs to limit the number of locks it's holding).
Add it to the collection of sysfs tunables for an fs.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] increase default lock limit
David Teigland [Thu, 25 Jan 2007 19:50:52 +0000 (13:50 -0600)] 
[GFS2] increase default lock limit

Increase the number of locks at which point the dlm begins asking gfs to
reduce its lock usage.  The default value is largely arbitrary, but the
current value of 50,000 ends up limiting performance unnecessarily for too
many users.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix list corruption in lops.c
Steven Whitehouse [Thu, 25 Jan 2007 10:04:20 +0000 (10:04 +0000)] 
[GFS2] Fix list corruption in lops.c

The patch below appears to fix the list corruption that we are seeing on
occasion. Although the transaction structure is private to a single
thread, when the queued structures are dismantled during an in-core
commit, it's possible for a different thread to be trying to add the same
structure to another, new, transaction at the same time.

To avoid this, this patch takes the log spinlock during this operation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix recursive locking attempt with NFS
Steven Whitehouse [Thu, 25 Jan 2007 17:14:59 +0000 (17:14 +0000)] 
[GFS2] Fix recursive locking attempt with NFS

In certain cases, it's possible for NFS to call the lookup code while
holding the glock (when doing a readdirplus operation) so we need to
check for that and not try to lock the glock twice. This also fixes a
typo in a previous NFS related GFS2 patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] can miss clearing resend flag
David Teigland [Wed, 24 Jan 2007 16:21:33 +0000 (10:21 -0600)] 
[DLM] can miss clearing resend flag

A long, complicated sequence of events, beginning with the RESEND flag not
being cleared on an lkb, can result in an unlock never completing.

- lkb on waiters list for remote lookup
- the remote node is both the dir node and the master node, so
  it optimizes the lookup into a request and sends a request
  reply back
- the request reply is saved on the requestqueue to be processed
  after recovery
- recovery runs dlm_recover_waiters_pre() which sets RESEND flag
  so the lookup will be resent after recovery
- end of recovery: process_requestqueue takes saved request reply
  which removes the lkb off the waiters list, _without_ clearing
  the RESEND flag
- end of recovery: dlm_recover_waiters_post() doesn't do anything
  with the now completed lookup lkb (would usually clear RESEND)
- later, the node unmounts, unlocks this lkb that still has RESEND
  flag set
- the lkb is on the waiters list again, now for unlock, when recovery
  occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND
  set, doesn't do anything since the master still exists
- end of recovery: dlm_recover_waiters_post() takes this lkb off
  the waiters list because it has the RESEND flag set, then reports
  an error because unlocks are never supposed to be handled in
  recover_waiters_post().
- later, the unlock reply is received, doesn't find the lkb on
  the waiters list because recover_waiters_post() has wrongly
  removed it.
- the unlock operation has been lost, and we're left with a
  stray granted lock
- unmount spins waiting for the unlock to complete

The visible evidence of this problem will be a node where gfs umount is
spinning, the dlm waiters list will be empty, and the dlm locks list will
show a granted lock.

The fix is simply to clear the RESEND flag when taking an lkb off the
waiters list.
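
The first half of the sequence above can be modelled in a few lines. This is only an illustrative Python sketch, not the kernel code: the `Lkb` and `Lockspace` classes, the `RESEND` bit value and the `clear_resend` switch are all hypothetical names chosen for the model.

```python
RESEND = 0x1  # hypothetical flag bit, for illustration only


class Lkb:
    """Toy stand-in for a dlm lock block."""

    def __init__(self, lkb_id):
        self.id = lkb_id
        self.flags = 0


class Lockspace:
    """Toy lockspace that only tracks the waiters list."""

    def __init__(self):
        self.waiters = []

    def add_to_waiters(self, lkb):
        self.waiters.append(lkb)

    def recover_waiters_pre(self):
        # Recovery marks every waiting lkb so it is resent afterwards.
        for lkb in self.waiters:
            lkb.flags |= RESEND

    def remove_from_waiters(self, lkb, clear_resend=True):
        self.waiters.remove(lkb)
        if clear_resend:
            # The fix described above: drop RESEND on removal.
            lkb.flags &= ~RESEND


def stale_resend_after_lookup(clear_resend):
    """Return True if the lkb leaves the waiters list with RESEND still set."""
    ls = Lockspace()
    lkb = Lkb(1)
    ls.add_to_waiters(lkb)            # waiting for the remote lookup
    ls.recover_waiters_pre()          # recovery sets RESEND
    ls.remove_from_waiters(lkb, clear_resend)  # saved reply is processed
    return bool(lkb.flags & RESEND)
```

Calling `stale_resend_after_lookup(False)` reproduces the stale flag that later confuses the unlock path; with `True` the flag is gone once the lkb is off the list.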

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] saved dlm message can be dropped
David Teigland [Wed, 24 Jan 2007 16:11:45 +0000 (10:11 -0600)] 
[DLM] saved dlm message can be dropped

dlm_receive_message() returns 0 instead of returning 'error'.  What would
happen is that process_requestqueue would take a saved message off the
requestqueue and call receive_message on it.  receive_message would then
see that recovery had been aborted, set error to EINTR, and 'goto out',
expecting that the error would be returned.  Instead, 0 was always
returned, so process_requestqueue would think that the message had been
processed and delete it instead of saving it to process next time.  This
means the message (usually an unlock in my tests) would be lost.
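
A minimal Python model of the control flow described above, assuming simplified function signatures; the `buggy` switch and the dict-based messages are inventions of this sketch, used only to contrast the old and new return values.

```python
import errno


def receive_message(msg, recovery_aborted, buggy=False):
    """Process one saved message; return 0 or a negative errno."""
    error = 0
    if recovery_aborted:
        error = -errno.EINTR      # message must be retried later
    else:
        msg["processed"] = True
    # The bug: 0 was returned unconditionally instead of 'error'.
    return 0 if buggy else error


def process_requestqueue(queue, recovery_aborted, buggy=False):
    """Drain saved messages, keeping any that could not be processed."""
    while queue:
        msg = queue[0]
        if receive_message(msg, recovery_aborted, buggy) == -errno.EINTR:
            break                 # leave it queued for the next pass
        queue.pop(0)              # caller believes it was handled
    return queue
```

With `buggy=True` and an aborted recovery, the queue comes back empty although the message was never processed, which is the lost unlock described in the commit message.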

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] Make sock_sem into a mutex
Patrick Caulfield [Wed, 24 Jan 2007 11:17:59 +0000 (11:17 +0000)] 
[DLM] Make sock_sem into a mutex

Now that there can be multiple dlm_recv threads running we need to prevent two
recvs running for the same connection - it's unlikely but it can happen and it
causes message corruption.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix typo in glock.c
Steven Whitehouse [Tue, 23 Jan 2007 21:56:36 +0000 (16:56 -0500)] 
[GFS2] Fix typo in glock.c

This is a one letter typo fix in glock.c, spotted by Rob Kenna.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2
Eric Sandeen [Thu, 18 Jan 2007 22:41:23 +0000 (16:41 -0600)] 
[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2

I was looking something else up and came across this...

I don't honestly have a good reason to change it other than to make it
like every other Linux filesystem in this regard.  ;-)  It doesn't
functionally change anything, but makes some lines shorter. :)

I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but
not in timespec_t format, and only stores second resolutions?  Seems like
you're halfway to sub-second resolutions already.

I suppose if that gets implemented then all of the below should
instead be CURRENT_TIME not CURRENT_TIME_SEC.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Compile fix for glock.c
Steven Whitehouse [Tue, 23 Jan 2007 18:20:41 +0000 (13:20 -0500)] 
[GFS2] Compile fix for glock.c

This one liner got missed from the previous patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Remove queue_empty() function
Steven Whitehouse [Mon, 22 Jan 2007 18:09:04 +0000 (13:09 -0500)] 
[GFS2] Remove queue_empty() function

This function is no longer required since we do not do recursive
locking in the glock layer. As a result all its callers can be
replaced with list_empty() calls.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix lowcomms receiving
Patrick Caulfield [Mon, 22 Jan 2007 14:51:33 +0000 (14:51 +0000)] 
[DLM] fix lowcomms receiving

This patch fixes a bug whereby data on a newly accepted connection would be
ignored if it arrived soon after the accept.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Tidy up glops calls
Steven Whitehouse [Mon, 22 Jan 2007 17:15:34 +0000 (12:15 -0500)] 
[GFS2] Tidy up glops calls

This patch doesn't make any changes to the ordering of the various
operations related to glocking, but it does tidy up the calls to the
glops.c functions to make the structure more obvious.

The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be
made static within glock.c since they are called by every set of glock
operations. The xmote_th and drop_th glock operations are then made
conditional upon those two routines existing and called from the
previously mentioned functions in glock.c respectively.

Also it can be seen that the go_sync operation isn't needed since it can
easily be replaced by calls to xmote_bh and drop_bh respectively. This
results in no longer (confusingly) calling back into routines in glock.c
from glops.c and also reducing the glock operations by one member.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] lowcomms tidy
Patrick Caulfield [Mon, 22 Jan 2007 14:50:10 +0000 (14:50 +0000)] 
[DLM] lowcomms tidy

This patch removes some redundant fields from the connection structure and adds
some lockdep annotation to remove spurious warnings.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Remove local exclusive glock mode
Steven Whitehouse [Mon, 22 Jan 2007 17:10:39 +0000 (12:10 -0500)] 
[GFS2] Remove local exclusive glock mode

Here is a patch for GFS2 to remove the local exclusive flag. In
the places it was used, mutexes are always held earlier in the
call path, so it appears redundant in the LM_ST_SHARED case.

Also, the GFS2 holders were setting local exclusive in any case where
the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock
code where the flag was tested have been replaced with tests for the
lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the
same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well
as globally exclusive).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Remove unused go_callback operation
Steven Whitehouse [Fri, 19 Jan 2007 18:57:36 +0000 (13:57 -0500)] 
[GFS2] Remove unused go_callback operation

This is never used, so we might as well remove it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Remove the "greedy" function from glock.[ch]
Steven Whitehouse [Thu, 18 Jan 2007 17:44:20 +0000 (17:44 +0000)] 
[GFS2] Remove the "greedy" function from glock.[ch]

The "greedy" code was an attempt to retain glocks for a minimum length
of time when they relate to mmap()ed files. The current implementation
of this feature is not, however, ideal in that it required allocating
memory in order to do this and it's overly complicated.

It also misses the mark by ignoring the other I/O operations which are
just as likely to suffer from the same problem. So the plan is to remove
this now and then add the functionality back as part of the glock state
machine at a later date (and thus take into account all the possible
users of this feature)

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Shrink gfs2_inode memory by half
Steven Whitehouse [Wed, 17 Jan 2007 15:33:23 +0000 (15:33 +0000)] 
[GFS2] Shrink gfs2_inode memory by half

Here is something I spotted (while looking for something entirely
different) the other day.

Rather than using a completion in each and every struct gfs2_holder,
this removes it in favour of hashed wait queues, thus saving a
considerable amount of memory both on the stack (where a number of
gfs2_holder structures are allocated) and in particular in the
gfs2_inode which has 8 gfs2_holder structures embedded within it.

As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to
1912 bytes, a saving of 576 bytes per inode (no, that's not a typo!).
In actual practice we get a much better result than that since
now that a gfs2_inode is under the 2048 byte barrier, we get two
per 4k slab page effectively halving the amount of memory required
to store gfs2_inodes.
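
The arithmetic behind that claim can be checked with a short sketch. It uses only the figures quoted above and assumes a 4096-byte page with no per-slab overhead, which is a simplification; `objects_per_page` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # 4k slab page, as in the commit message


def objects_per_page(object_size, page_size=PAGE_SIZE):
    """How many whole objects fit in one page (slab overhead ignored)."""
    return page_size // object_size


OLD_SIZE = 2488   # gfs2_inode size before the patch (x86_64)
NEW_SIZE = 1912   # gfs2_inode size after the patch

saved_per_inode = OLD_SIZE - NEW_SIZE
old_per_page = objects_per_page(OLD_SIZE)
new_per_page = objects_per_page(NEW_SIZE)

# Effective memory consumed per inode once page granularity is counted.
old_effective = PAGE_SIZE // old_per_page
new_effective = PAGE_SIZE // new_per_page
```

Under these assumptions one inode per page becomes two per page, so the effective cost per inode drops from 4096 to 2048 bytes even though the structure itself only shrank by 576 bytes.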

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Remove max_atomic_write tunable
Steven Whitehouse [Mon, 15 Jan 2007 21:36:26 +0000 (16:36 -0500)] 
[GFS2] Remove max_atomic_write tunable

This removes an unused sysfs tunable parameter.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Clean up/speed up readdir
Steven Whitehouse [Wed, 17 Jan 2007 15:09:20 +0000 (15:09 +0000)] 
[GFS2] Clean up/speed up readdir

This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurts performance badly in the
case that the getdents64/readdir call isn't being followed by
stat() and it wasn't even getting it right all the time when it
was.

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 to 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and thus the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Add writepages for "data=writeback" mounts
Steven Whitehouse [Mon, 15 Jan 2007 13:52:17 +0000 (13:52 +0000)] 
[GFS2] Add writepages for "data=writeback" mounts

It occurred to me that although a gfs2 specific writepages for ordered
writes and journaled data would be tricky, by hooking writepages only
for "data=writeback" mounts we could take advantage of not needing
buffer heads (we don't use them on the read side, nor have we for some
time) and create much larger I/Os for the block layer.

Using blktrace both before and after, it's possible to see that for large
I/Os, most of the requests generated through writepages are now 1024
sectors after this patch is applied as opposed to 8 sectors before.
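
To put those request sizes in bytes, here is a small conversion sketch. It assumes the usual 512-byte sector unit reported by the block layer; `request_bytes` is a hypothetical helper.

```python
SECTOR_BYTES = 512  # block-layer sector unit assumed here


def request_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a request length in sectors to bytes."""
    return sectors * sector_bytes


before = request_bytes(8)       # typical request before the patch
after = request_bytes(1024)     # typical request after the patch
growth = after // before
```

So each request grows from 4 KiB (a single page) to 512 KiB, a factor of 128.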

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix master recovery
David Teigland [Mon, 15 Jan 2007 16:28:22 +0000 (10:28 -0600)] 
[DLM] fix master recovery

If master recovery happens on an rsb in one recovery sequence, then that
sequence is aborted before lock recovery happens, then in the next
sequence, we rely on the previous master recovery (which may now be
invalid due to another node ignoring a lookup result) and go on to do the
lock recovery where we get stuck due to an invalid master value.

 recovery cycle begins: master of rsb X has left
 nodes A and B send node C an rcom lookup for X to find the new master
 C gets lookup from B first, sets B as new master, and sends reply back to B
 C gets lookup from A next, and sends reply back to A saying B is master
 A gets lookup reply from C and sets B as the new master in the rsb
 recovery cycle on A, B and C is aborted to start a new recovery
 B gets lookup reply from C and ignores it since there's a new recovery
 recovery cycle begins: some other node has joined
 B doesn't think it's the master of X so it doesn't rebuild it in the directory
 C looks up the master of X, no one is master, so it becomes new master
 B looks up the master of X, finds it's C
 A believes that B is the master of X, so it sends its lock to B
 B sends an error back to A
 A resends
 this repeats forever, the incorrect master value on A is never corrected

The fix is to do master recovery on an rsb that still has the NEW_MASTER
flag set from an earlier recovery sequence, and therefore didn't complete
lock recovery.
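
One way to picture the fix is as a widened condition for redoing master recovery. The sketch below is a Python model under that assumption; the `Rsb` class, the `NEW_MASTER` bit value and `needs_master_recovery` are hypothetical and do not reproduce the kernel's actual check.

```python
NEW_MASTER = 0x1  # hypothetical flag bit, for illustration only


class Rsb:
    """Toy resource block with a master node id and flags."""

    def __init__(self, name, master_nodeid):
        self.name = name
        self.master_nodeid = master_nodeid
        self.flags = 0


def needs_master_recovery(rsb, departed_nodeids):
    """Decide whether the master of this rsb must be looked up again."""
    if rsb.master_nodeid in departed_nodeids:
        return True               # the master left the lockspace
    # Still flagged from an aborted sequence: the stored master value
    # cannot be trusted because lock recovery never completed.
    return bool(rsb.flags & NEW_MASTER)
```

In the scenario above, node A would still carry the flag for X in the second recovery cycle and would look the master up again instead of trusting the stale value B.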

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix user unlocking
David Teigland [Mon, 15 Jan 2007 16:34:52 +0000 (10:34 -0600)] 
[DLM] fix user unlocking

When a user process exits, we clear all the locks it holds.  There is a
problem, though, with locks that the process had begun unlocking before it
exited.  We couldn't find the lkb's that were in the process of being
unlocked remotely, to flag that they are DEAD.  To solve this, we move
lkb's being unlocked onto a new list in the per-process structure that
tracks what locks the process is holding.  We can then go through this
list to flag the necessary lkb's when clearing locks for a process when it
exits.
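
A rough Python model of the bookkeeping described above. The class and method names, the `DEAD` bit value and the choice to return the still-held locks are assumptions made for this sketch.

```python
DEAD = 0x1  # hypothetical flag bit, for illustration only


class Lkb:
    def __init__(self, lkb_id):
        self.id = lkb_id
        self.flags = 0


class UserProc:
    """Per-process tracking of held locks and in-flight unlocks."""

    def __init__(self):
        self.locks = []       # locks the process holds
        self.unlocking = []   # locks with a remote unlock in progress

    def acquire(self, lkb):
        self.locks.append(lkb)

    def begin_unlock(self, lkb):
        # Keep the lkb findable while the remote unlock is pending.
        self.locks.remove(lkb)
        self.unlocking.append(lkb)

    def finish_unlock(self, lkb):
        self.unlocking.remove(lkb)

    def clear_on_exit(self):
        """Flag in-flight unlocks as DEAD; return locks still to be released."""
        for lkb in self.unlocking:
            lkb.flags |= DEAD
        released = list(self.locks)
        self.locks.clear()
        return released
```

Without the second list, an lkb that had started unlocking would no longer be reachable from the process structure at exit time, which is the gap the patch closes.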

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] Use workqueues for dlm lowcomms
Patrick Caulfield [Mon, 15 Jan 2007 14:33:34 +0000 (14:33 +0000)] 
[DLM] Use workqueues for dlm lowcomms

This patch converts the DLM TCP lowcomms to use workqueues rather than using its
own daemon functions, simultaneously removing a lot of code and making it more
scalable on multi-processor machines.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] make gfs2_change_nlink_i() static
Adrian Bunk [Sat, 13 Jan 2007 09:56:41 +0000 (10:56 +0100)] 
[GFS2] make gfs2_change_nlink_i() static

On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc3-mm1:
>...
>  git-gfs2-nmw.patch
>...
>  git trees
>...

This patch makes the needlessly global gfs2_change_nlink_i() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] gfs2 knows of directories which it chooses not to display
Robert Peterson [Thu, 11 Jan 2007 19:25:00 +0000 (13:25 -0600)] 
[GFS2] gfs2 knows of directories which it chooses not to display

This is for Red Hat bugzilla bug bz #222302:

Moving a virtual IP from node to node between two NFS-over-GFS2
servers was causing one of the GFS2 servers to become confused and
reference a deleted inode.  The problem was due to vfs dentries that did
not reference the gfs2_dops and therefore didn't call the gfs2 revalidate
code to revalidate a dentry after a directory had been deleted & recreated.
This patch is a crosswrite from a RHEL4 bug found in GFS1 as
bz #190756 and it is against the latest -nmw git tree.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] expose dlm_config_info fields in configfs
David Teigland [Tue, 9 Jan 2007 15:46:02 +0000 (09:46 -0600)] 
[DLM] expose dlm_config_info fields in configfs

Make the dlm_config_info values readable and writeable via configfs
entries.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] add config entry to enable log_debug
David Teigland [Tue, 9 Jan 2007 15:44:01 +0000 (09:44 -0600)] 
[DLM] add config entry to enable log_debug

Add a new dlm_config_info field to enable log_debug output and change
log_debug() to use it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] rename dlm_config_info fields
David Teigland [Tue, 9 Jan 2007 15:41:48 +0000 (09:41 -0600)] 
[DLM] rename dlm_config_info fields

Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch).  No functional changes in this patch, just naming changes.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] change some log_error to log_debug
David Teigland [Tue, 9 Jan 2007 15:38:39 +0000 (09:38 -0600)] 
[DLM] change some log_error to log_debug

Some common, non-error messages should use log_debug instead of log_error
so they can be turned off.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix gfs2_rename deadlock
S. Wendy Cheng [Thu, 18 Jan 2007 21:07:03 +0000 (16:07 -0500)] 
[GFS2] Fix gfs2_rename deadlock

Second round of gfs2_rename lock re-ordering to allow Anaconda to add a
root partition on top of gfs2. Prior to this patch, the recursive
lock detector in glock.c can be triggered due to attempting to lock
the rgrp twice. This fixes it by checking to see whether the rgrp
is already locked.
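
The "check before locking" idea can be sketched as follows. This is a toy Python model, not the gfs2 code: `Rgrp`, `lock_rgrp_once` and the owner strings are hypothetical, and contention between different owners is deliberately ignored.

```python
class Rgrp:
    """Toy resource group with a single, non-recursive lock."""

    def __init__(self, addr):
        self.addr = addr
        self.holder = None

    def lock(self, owner):
        if self.holder == owner:
            # Stands in for the recursive lock detector firing.
            raise RuntimeError("recursive lock attempt")
        self.holder = owner   # contention between owners is not modelled

    def unlock(self, owner):
        assert self.holder == owner
        self.holder = None


def lock_rgrp_once(rgrp, owner):
    """Take the rgrp lock unless this owner already holds it.

    Returns True if the lock was taken here, so the caller knows
    whether it is responsible for dropping it again.
    """
    if rgrp.holder == owner:
        return False
    rgrp.lock(owner)
    return True
```

The return value matters: only the call that actually took the lock should later release it.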

This fixes Red Hat bugzilla #221237

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] BZ 217008 fsfuzzer fix.
Russell Cattelan [Mon, 8 Jan 2007 23:47:51 +0000 (17:47 -0600)] 
[GFS2] BZ 217008 fsfuzzer fix.

Update the quilt header comments to match the
code changes.

Change gfs2_lookup_simple to return an error in the case
of a NULL inode.
The callers of gfs2_lookup_simple do not check for NULL
in the no entry case and as such would end up dereferencing a NULL ptr.

This fixes:
http://projects.info-pull.com/mokb/MOKB-15-11-2006.html

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix ordering of page disposal vs. glock_dq
Steven Whitehouse [Mon, 8 Jan 2007 14:31:40 +0000 (14:31 +0000)] 
[GFS2] Fix ordering of page disposal vs. glock_dq

In case of unlinked files with dirty pages GFS2 wasn't clearing
the pages in quite the right order. This patch clears the pages
earlier (before the glock_dq) to avoid the situation that the
release of the glock results in attempting to write back data that
has already been deallocated.

This fixes Red Hat bugzilla: #220117

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] Fix spin lock already unlocked bug
Patrick Caulfield [Tue, 2 Jan 2007 17:08:54 +0000 (17:08 +0000)] 
[DLM] Fix spin lock already unlocked bug

I just noticed this message when testing some other changes I'd made to
lowcomms (to use workqueues) but the problem seems to be in the current
git trees too. I'm amazed no-one has seen it.

    BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] Fix schedule() calls
Patrick Caulfield [Tue, 2 Jan 2007 17:01:05 +0000 (17:01 +0000)] 
[DLM] Fix schedule() calls

I was a little over-enthusiastic turning schedule() calls into cond_resched() when fixing the DLM for Andrew Morton.

These four should really be calls to schedule() or the dlm can busy-wait.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fix change nlink deadlock
S. Wendy Cheng [Thu, 18 Jan 2007 20:56:34 +0000 (15:56 -0500)] 
[GFS2] Fix change nlink deadlock

Bugzilla 215088

Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2
partition. The gfs2_rename() apparently needs block allocation for the
new name (into the directory) where it requires rg locks. At the same
time, while updating the nlink count for the replaced file,
gfs2_change_nlink() tries to return the inode meta-data back to resource
group where it needs rg locks too. Our logic doesn't allow the same
process (the RHEL installer) to acquire these locks recursively, which
results in a BUG call. This only happens within the rename code path and
only if the destination file exists before the rename operation.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] Fail over to readpage for stuffed files
Steven Whitehouse [Fri, 15 Dec 2006 21:49:51 +0000 (16:49 -0500)] 
[GFS2] Fail over to readpage for stuffed files

This is partially derived from a patch written by Russell Cattelan.
It fixes a bug where there is a race between readpages and truncate
by ignoring readpages for stuffed files. This is ok because a stuffed
file will never be more than one block (minus sizeof(struct gfs2_dinode))
in size and block size is never larger than page size, so we do not lose
anything efficiency-wise by not doing readahead for stuffed files. They
will have already been "read ahead" by the action of reading the inode
in, in the first place.
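
The size bound in that argument is easy to state as code. In this sketch the dinode size is passed in as a parameter because the commit message does not give a number; both helper names are hypothetical.

```python
def max_stuffed_size(block_size, dinode_size):
    """Largest payload that fits in the inode's own block."""
    return block_size - dinode_size


def fits_in_one_page(block_size, dinode_size, page_size=4096):
    """True if a stuffed file can never span more than one page."""
    return max_stuffed_size(block_size, dinode_size) <= page_size
```

For example, with a 4096-byte block and an assumed 232-byte dinode header, `max_stuffed_size(4096, 232)` gives 3864 bytes, comfortably inside a single 4k page, so a per-page readpage call loses nothing compared with readahead.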

This is the remaining part of the fix for Red Hat bugzilla #218966
which had not yet made it upstream.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Russell Cattelan <cattelan@redhat.com>
17 years ago[GFS2] Fix DIO deadlock
Steven Whitehouse [Thu, 14 Dec 2006 18:24:26 +0000 (18:24 +0000)] 
[GFS2] Fix DIO deadlock

This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs
due to trying to take the i_mutex while holding a glock. The correct
locking order is defined as i_mutex -> glock in all cases.

I've left dealing with allocating writes. I know that we need to do
that, but for now this should do the trick. We don't need to take the
i_mutex on write, because the VFS has already taken it for us. On read
we don't need it since the glock is enough protection. The reason that
I've made some of the checks into a separate function is that we'll need
to do the checks again in the allocating write case eventually, so this
is partly in preparation for this. Likewise the return value test of !=
1 might look a bit odd and that's because we'll need a third return value
in case of requiring an allocation.

I've made the change to deferred mode on the glock to ensure flushing
read caches on other nodes. I notice that (using blktrace to look at
what's going on) we appear to do a better job of large I/Os than ext3
after this patch (in terms of not splitting up the I/Os).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>
17 years ago[DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions
Adrian Bunk [Tue, 19 Dec 2006 21:04:03 +0000 (13:04 -0800)] 
[DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions

Remove the following unused functions:

- lowcomms_send_message()
- lowcomms_max_buffer_size()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix lost flags in stub replies
David Teigland [Wed, 13 Dec 2006 16:40:26 +0000 (10:40 -0600)] 
[DLM] fix lost flags in stub replies

When the dlm fakes an unlock/cancel reply from a failed node using a stub
message struct, it wasn't setting the flags in the stub message.  So, in
the process of receiving the fake message the lkb flags would be updated
and cleared from the zero flags in the message.  The problem observed in
tests was the loss of the USER flag which caused the dlm to think a user
lock was a kernel lock and subsequently fail an assertion checking the
validity of the ast/callback field.
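
A small model of how a zero-flag stub can wipe a flag on receipt. The 16-bit shared mask, the `USER` bit value and both function names are assumptions of this sketch, not values taken from the dlm source.

```python
USER = 0x0001          # hypothetical bit for "this is a user lock"
SHARED_MASK = 0xFFFF   # assumed: low bits are taken from the message


def receive_flags(lkb_flags, msg_flags):
    """Merge message flags into the lkb: low bits come from the message."""
    return (lkb_flags & ~SHARED_MASK) | (msg_flags & SHARED_MASK)


def make_stub_reply(lkb_flags, copy_flags=True):
    """Build a fake reply on behalf of a failed node."""
    # Before the fix the stub's flags were left at zero.
    return {"flags": lkb_flags if copy_flags else 0}
```

Feeding a stub built with `copy_flags=False` through `receive_flags` clears `USER`, which matches the symptom described above; copying the lkb's flags into the stub keeps it intact.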

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix receive_request() lvb copying
David Teigland [Wed, 13 Dec 2006 16:39:20 +0000 (10:39 -0600)] 
[DLM] fix receive_request() lvb copying

LVB's are not sent as part of new requests, but the code receiving the
request was copying data into the lvb anyway.  The space in the message
where it mistakenly thought the lvb lived actually contained the resource
name, so it wound up incorrectly copying this name data into the lvb.  Fix
is to just create the lvb, not copy junk into it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix send_args() lvb copying
David Teigland [Wed, 13 Dec 2006 16:38:45 +0000 (10:38 -0600)] 
[DLM] fix send_args() lvb copying

The send_args() function is used to copy parameters into a message for a
number of different message types.  Only some of those types are set up
beforehand (in create_message) to include space for sending lvb data.
send_args was wrongly copying the lvb for all message types as long as the
lock had an lvb.  This means that the lvb data was being written past the
end of the message into unknown space.
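
The overrun can be illustrated with a byte-buffer model. Everything here is a stand-in: the header and lvb lengths, the set of message types that reserve lvb space, and the simplified signatures of `create_message` and `send_args` are assumptions of the sketch.

```python
HEADER_LEN = 16   # hypothetical fixed header size
LVB_LEN = 32      # hypothetical lvb size
# Hypothetical set of message types that reserve room for an lvb.
TYPES_WITH_LVB = {"convert", "unlock", "request_reply",
                  "convert_reply", "grant"}


def create_message(msg_type, has_lvb):
    """Allocate a buffer, adding lvb space only for some types."""
    length = HEADER_LEN
    if has_lvb and msg_type in TYPES_WITH_LVB:
        length += LVB_LEN
    return {"type": msg_type, "buf": bytearray(length)}


def send_args(msg, lvb):
    """Copy the lvb only when the message type reserved space for it."""
    if lvb is None or msg["type"] not in TYPES_WITH_LVB:
        return False
    end = HEADER_LEN + len(lvb)
    if end > len(msg["buf"]):
        raise ValueError("lvb would overrun the message buffer")
    msg["buf"][HEADER_LEN:end] = lvb
    return True
```

The type check in `send_args` mirrors the one in `create_message`, so the copy can only happen when the space was actually allocated.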

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] add version check
David Teigland [Wed, 13 Dec 2006 16:37:55 +0000 (10:37 -0600)] 
[DLM] add version check

Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix old rcom messages
David Teigland [Wed, 13 Dec 2006 16:37:16 +0000 (10:37 -0600)] 
[DLM] fix old rcom messages

A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery.  There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.

Each recovery sequence already has a locally unique sequence number
associated with it.  This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq.  When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply.  When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.
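
The sequence-number check can be sketched like this. The field names `rc_seq` and `rc_seq_reply` come from the description above; the `Lockspace` class, the dict-based messages and the helper names are hypothetical.

```python
class Lockspace:
    """Toy lockspace holding the local recovery sequence number."""

    def __init__(self):
        self.recover_seq = 0      # plays the role of ls_recover_seq

    def start_recovery(self):
        self.recover_seq += 1

    def make_request(self):
        # Stamp the outgoing recovery message with the current sequence.
        return {"rc_seq": self.recover_seq}

    def accept_reply(self, reply):
        # Replies belonging to an earlier recovery sequence are ignored.
        return reply["rc_seq_reply"] == self.recover_seq


def make_reply(request):
    """Remote side: echo the sender's sequence number back."""
    return {"rc_seq_reply": request["rc_seq"]}
```

A reply is accepted only while the recovery sequence that produced the request is still current; once `start_recovery()` runs again, the same reply is rejected.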

An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.

The protocol version number is changed to reflect the different rcom
structures.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[DLM] fix resend rcom lock
David Teigland [Wed, 13 Dec 2006 16:36:37 +0000 (10:36 -0600)] 
[DLM] fix resend rcom lock

There's a chance the new master of a resource hasn't learned it's the new
master before another node sends it a lock during recovery.  The node
sending the lock needs to resend if this happens.

- A sends a master lookup for resource R to C
- B sends a master lookup for resource R to C
- C receives A's lookup, assigns A to be master of R and
  sends a reply back to A
- C receives B's lookup and sends a reply back to B saying
  that A is the master
- B receives lookup reply from C and sends its lock for R to A
- A receives lock from B, doesn't think it's the master of R
  and sends an error back to B
- A receives lookup reply from C and becomes master of R
- B gets error back from A and resends its lock back to A
  (this resending is what this patch does)
- A receives lock from B, it now sees it's the master of R
  and takes the lock
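
The steps above can be replayed in a toy model. `Node`, `send_lock` and the `events` list (callbacks that fire between attempts, standing in for the delayed lookup reply) are all hypothetical constructs for this sketch.

```python
class Node:
    """Toy node that accepts locks only for resources it masters."""

    def __init__(self, name):
        self.name = name
        self.master_of = set()
        self.locks = []

    def receive_lock(self, resource, sender):
        if resource not in self.master_of:
            return False          # error goes back to the sender
        self.locks.append((resource, sender))
        return True


def send_lock(target, resource, sender, events, max_tries=3):
    """Send a lock, resending after an error. Returns the attempt count.

    'events' holds callbacks run between attempts; they model messages
    (such as the lookup reply) that arrive at the target in the meantime.
    """
    for attempt in range(1, max_tries + 1):
        if target.receive_lock(resource, sender):
            return attempt
        if events:
            events.pop(0)()
    return None
```

For example, with `a = Node("A")` and `events = [lambda: a.master_of.add("R")]`, the call `send_lock(a, "R", "B", events)` fails once, lets A learn that it is the master, and succeeds on the second attempt.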

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years ago[GFS2] don't try to lockfs after shutdown
David Teigland [Wed, 6 Dec 2006 17:46:33 +0000 (11:46 -0600)] 
[GFS2] don't try to lockfs after shutdown

If an fs has already been shut down, a lockfs callback should do nothing.
An fs that's been shut down can't acquire locks or do anything with
respect to the cluster.

Also, remove FIXME comment in withdraw function.  The missing bits of the
withdraw procedure are now all done by user space.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
17 years agoLinux 2.6.20 v2.6.20
Linus Torvalds [Sun, 4 Feb 2007 18:44:54 +0000 (10:44 -0800)] 
Linux 2.6.20

17 years ago[PATCH] EFI x86: pass firmware call parameters on the stack
Frédéric Riss [Tue, 30 Jan 2007 20:41:17 +0000 (21:41 +0100)] 
[PATCH] EFI x86: pass firmware call parameters on the stack

When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <frederic.riss@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] fix rtl8150
Al Viro [Sun, 4 Feb 2007 03:02:17 +0000 (03:02 +0000)] 
[PATCH] fix rtl8150

That code doesn't do what its author apparently thought it would do...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
Linus Torvalds [Sat, 3 Feb 2007 19:26:39 +0000 (11:26 -0800)] 
Merge /linux/kernel/git/jejb/scsi-rc-fixes-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash
  [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect
  [SCSI] qla4xxx: bug fixes
  [SCSI] Fix scsi_add_device() for async scanning

17 years ago[PATCH] x86-64: define dma noncoherent API functions
Jeff Garzik [Sat, 3 Feb 2007 09:14:03 +0000 (01:14 -0800)] 
[PATCH] x86-64: define dma noncoherent API functions

x86-64 is missing these:

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] Altix: more ACPI PRT support
John Keller [Sat, 3 Feb 2007 09:14:02 +0000 (01:14 -0800)] 
[PATCH] Altix: more ACPI PRT support

The SN Altix platform does not conform to the IOSAPIC IRQ routing model.
Add code in acpi_unregister_gsi() to check if (acpi_irq_model ==
ACPI_IRQ_MODEL_PLATFORM) and return.

Due to an oversight, this code was not added previously when
similar code was added to acpi_register_gsi().

http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2

Signed-off-by: John Keller <jpk@sgi.com>
Acked-by: Len Brown <lenb@kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] revert blockdev direct io back to 2.6.19 version
Andrew Morton [Sat, 3 Feb 2007 09:14:01 +0000 (01:14 -0800)] 
[PATCH] revert blockdev direct io back to 2.6.19 version

Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
slowdown due to the recent special-casing of direct-io against
blockdevs.  We don't know why either of these things is occurring.

The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
release.

Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: Ken Chen <kenchen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] alpha: fix epoll syscall enumerations
Mike Frysinger [Sat, 3 Feb 2007 09:13:55 +0000 (01:13 -0800)] 
[PATCH] alpha: fix epoll syscall enumerations

We went and named them __NR_sys_foo instead of __NR_foo.

It may be too late to change this, but we can at least add the proper names
now.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] net/smc911x: match up spin lock/unlock
Peter Korsgaard [Sat, 3 Feb 2007 09:13:50 +0000 (01:13 -0800)] 
[PATCH] net/smc911x: match up spin lock/unlock

smc911x_phy_configure's error handling unconditionally unlocks the
spinlock even if it wasn't locked. Patch fixes it.

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] kexec: Avoid migration of already disabled irqs (ia64)
Magnus Damm [Sat, 3 Feb 2007 09:13:48 +0000 (01:13 -0800)] 
[PATCH] kexec: Avoid migration of already disabled irqs (ia64)

This patch fixes up ia64 kexec support for HP rx2620 hardware.  It does
this by skipping migration of already disabled irqs.  This is most likely a
problem on other ia64 platforms as well, but I've only been able to
reproduce it on one machine so far.

The full story is that handle_bad_irq() gets invoked before starting the
new kernel without this patch.  This seems to happen when fixup_irqs()
calls generic_handle_irq() on already migrated (and disabled) irqs.  So by
avoiding migration of disabled irqs we stay away from handle_bad_irq().

The code has been tested on three different ia64 machines, all with good
results.  It is possible to trigger the same bug by offlining a processor
using echo 0 > /sys/devices/system/cpu/cpuX/online.

More detailed information is available in the following mail thread:
http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Zou, Nanhai <nanhai.zou@intel.com>
Acked-by: Jay Lan <jlan@sgi.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2
Ken Chen [Sat, 3 Feb 2007 09:13:45 +0000 (01:13 -0800)] 
[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2

An AIO bug was reported in which a sleeping function is called in softirq
context:

BUG: warning at kernel/mutex.c:132/__mutex_lock_common()
Call Trace:
     [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0
     [<a000000100577ba0>] mutex_lock+0x20/0x40
     [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0
     [<a00000010018c0c0>] __put_ioctx+0xc0/0x240
     [<a00000010018d470>] aio_complete+0x2f0/0x420
     [<a00000010019cc80>] finished_one_bio+0x200/0x2a0
     [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200
     [<a00000010019d260>] dio_bio_end_aio+0x60/0x80
     [<a00000010014acd0>] bio_endio+0x110/0x1c0
     [<a0000001002770e0>] __end_that_request_first+0x180/0xba0
     [<a000000100277b90>] end_that_request_chunk+0x30/0x60
     [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod]
     [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod]
     [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod]
     [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod]
     [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod]
     [<a000000100277d20>] blk_done_softirq+0x160/0x1c0
     [<a000000100083e00>] __do_softirq+0x200/0x240
     [<a000000100083eb0>] do_softirq+0x70/0xc0

See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2

flush_workqueue() is not allowed to be called in the softirq context.
However, aio_complete(), called from the I/O interrupt, can potentially call
put_ioctx() with the last ref count on the ioctx, which triggers the bug.  It
is simply incorrect to perform ioctx freeing from aio_complete().

The bug is triggerable from a race between io_destroy() and aio_complete().
A possible scenario:

cpu0                               cpu1
io_destroy                         aio_complete
  wait_for_all_aios {                __aio_put_req
     ...                                 ctx->reqs_active--;
     if (!ctx->reqs_active)
        return;
  }
  ...
  put_ioctx(ioctx)

                                     put_ioctx(ctx);
                                        __put_ioctx
                                          bam! Bug trigger!

The real problem is that the condition check of ctx->reqs_active in
wait_for_all_aios() is incorrect: access to reqs_active is not
properly protected by a spin lock.

This patch adds that protective spin lock, and at the same time removes
all duplicate ref counting for each kiocb, as reqs_active is already used
as a ref count for each active ioctx.  This also ensures that the buggy call
to flush_workqueue() in softirq context is eliminated.
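
The locking rule described above can be sketched in user space.  This is an
illustration only, assuming a pthread mutex in place of the kernel spin lock;
struct aio_ctx, all_reqs_done() and complete_one() are hypothetical names and
not the functions changed by this patch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ioctx: a counter guarded by one lock. */
struct aio_ctx {
    pthread_mutex_t lock;
    int reqs_active;      /* number of requests still in flight */
};

/* Waiter side: only read reqs_active while holding the lock. */
static bool all_reqs_done(struct aio_ctx *ctx)
{
    bool done;

    pthread_mutex_lock(&ctx->lock);
    done = (ctx->reqs_active == 0);
    pthread_mutex_unlock(&ctx->lock);
    return done;
}

/*
 * Completion side: the decrement happens under the same lock, so the
 * waiter cannot observe zero while a completer is still inside its
 * critical section.  Returns true if this was the last request.
 */
static bool complete_one(struct aio_ctx *ctx)
{
    bool last;

    pthread_mutex_lock(&ctx->lock);
    ctx->reqs_active--;
    last = (ctx->reqs_active == 0);
    pthread_mutex_unlock(&ctx->lock);
    return last;
}
```

The design point is that both the check and the decrement go through one
lock, so the counter alone can serve as the reference count.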

Signed-off-by: "Ken Chen" <kenchen@google.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Suparna Bhattacharya <suparna@in.ibm.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: <stable@kernel.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[NETFILTER]: nf_conntrack_h323: fix compile error with CONFIG_IPV6=m, CONFIG_NF_CONNT...
Adrian Bunk [Sat, 3 Feb 2007 03:33:52 +0000 (19:33 -0800)] 
[NETFILTER]: nf_conntrack_h323: fix compile error with CONFIG_IPV6=m, CONFIG_NF_CONNTRACK_H323=y

Fix this by letting NF_CONNTRACK_H323 depend on (IPV6 || IPV6=n).

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=n
Patrick McHardy [Sat, 3 Feb 2007 03:33:11 +0000 (19:33 -0800)] 
[NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=n

  CC      net/netfilter/nf_conntrack_netlink.o
net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event':
net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark'
make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash
Nagendra Singh Tomar [Fri, 2 Feb 2007 12:04:56 +0000 (17:34 +0530)] 
[SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash

sd_probe() calls class_device_add() even before initializing the
sdkp->device variable. class_device_add() eventually results in the user mode
udev program being called. The udev program can read the allow_restart
attribute of the newly created scsi device. This results in a crash, as
the show function for allow_restart (i.e. sd_show_allow_restart) returns the
attribute value by reading the sdkp->device->allow_restart variable, and
sdkp->device is not initialized before the user mode hotplug helper is called.

The patch below solves it by calling class_device_add() only after the
necessary fields in the scsi_disk structure are initialized properly.

Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
17 years agoMerge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Fri, 2 Feb 2007 17:14:48 +0000 (09:14 -0800)] 
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Initialize nbytes for internal sg commands
  libata: Fix ata_busy_wait() kernel docs
  pata_via: Correct missing comments
  pata_atiixp: propogate cable detection hack from drivers/ide to the new driver
  ahci/pata_jmicron: fix JMicron quirk

17 years agolibata: Initialize nbytes for internal sg commands
Brian King [Tue, 30 Jan 2007 17:32:26 +0000 (11:32 -0600)] 
libata: Initialize nbytes for internal sg commands

Some LLDDs, like ipr, use nbytes and pad_len to determine
the total data transfer length of a command. Make sure
nbytes gets initialized for internally generated commands.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agolibata: Fix ata_busy_wait() kernel docs
Alan [Wed, 31 Jan 2007 17:47:24 +0000 (17:47 +0000)] 
libata: Fix ata_busy_wait() kernel docs

> Looks like you should use ata_busy_wait() here, rather than reproducing
> the same code again.

It waits in 10uS chunks while 1uS chunks were used in the workaround.
Could indeed do that once I know the fix is right. While I'm at it the
ata_busy_wait kerneldoc is borked so here's a fix

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agopata_via: Correct missing comments
Alan [Wed, 31 Jan 2007 17:14:38 +0000 (17:14 +0000)] 
pata_via: Correct missing comments

The 8237S was added to the chipsets but not to the comments. Fix this

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agopata_atiixp: propogate cable detection hack from drivers/ide to the new driver
Alan [Wed, 31 Jan 2007 17:10:46 +0000 (17:10 +0000)] 
pata_atiixp: propogate cable detection hack from drivers/ide to the new driver

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agoahci/pata_jmicron: fix JMicron quirk
Tejun Heo [Fri, 2 Feb 2007 05:51:09 +0000 (14:51 +0900)] 
ahci/pata_jmicron: fix JMicron quirk

For all JMicrons except for 361 and 368, AHCI mode enable bits in the
Control(1) should be set.  This used to be done in both ahci and
pata_jmicron, but when the programming was moved to a PCI quirk, it was
removed from the ahci part while still left in pata_jmicron.

The implemented JMicron PCI quirk was incorrect in that it didn't
program AHCI mode enable bits.  If pata_jmicron is loaded first and
programs those bits, the ahci ports work; otherwise, ahci device
detection fails miserably.

This patch makes the JMicron PCI quirk clear SATA IDE mode bits and set
AHCI mode bits, and removes the respective part from pata_jmicron.
Tested on JMB361, 363 and 368.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agoMerge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Fri, 2 Feb 2007 16:13:23 +0000 (08:13 -0800)] 
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  spidernet : fix memory leak in spider_net_stop
  e100: fix napi ifdefs removing needed code
  netxen patches

17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/bnx2-2.6
Linus Torvalds [Fri, 2 Feb 2007 16:10:58 +0000 (08:10 -0800)] 
Merge /pub/scm/linux/kernel/git/davem/bnx2-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/bnx2-2.6:
  [BNX2]: PHY workaround for 5709 A0.

17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 2 Feb 2007 16:10:30 +0000 (08:10 -0800)] 
Merge /pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET_SCHED]: act_ipt: fix regression in ipt action

17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Fri, 2 Feb 2007 16:10:17 +0000 (08:10 -0800)] 
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC32]: Fix over-optimization by GCC near ip_fast_csum.

17 years ago[PATCH] MAINTAINERS: ufs entry
Evgeniy Dushistov [Fri, 2 Feb 2007 08:36:34 +0000 (11:36 +0300)] 
[PATCH] MAINTAINERS: ufs entry

Mark the ufs file system as maintained, and add me as maintainer,
to help people find the appropriate person to assign bugs to.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoRevert "[PATCH] fix typo in geode_configre()@cyrix.c"
Linus Torvalds [Fri, 2 Feb 2007 16:07:42 +0000 (08:07 -0800)] 
Revert "[PATCH] fix typo in geode_configre()@cyrix.c"

This reverts commit e4f0ae0ea63caceff37a13f281a72652b7ea71ba.

It's not wrong, but it's not right either, and everybody seems to agree
that the right fix is probably to do the ccr3 write after the ccr4 one
(and that we also should clean it up a bit).  And after that we need to
really validate that all the bits that we write to ccr4 actually do
work.

The old 2.6.19 code was insane, and basically didn't change ccr4 at all
(even though it certainly looks like it was the *intent* to do so).  So
let's revert the change that may fix things, just because it's not what
was actually ever tested when the code was written, even if it _was_ the
intent.

There's a discussion on http://lkml.org/lkml/2007/1/9/63 that was
started by the patch that now gets reverted, and that discussion may
well contain the proper long-term fix.

Suggested-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agospidernet : fix memory leak in spider_net_stop
Jens Osterkamp [Thu, 1 Feb 2007 11:07:47 +0000 (12:07 +0100)] 
spidernet : fix memory leak in spider_net_stop

We forget to call spider_net_free_rx_chain_contents which does the
actual dev_kfree_skb. New skbs are allocated from skbuff_head_cache
on each "ifconfig up" letting the cache grow infinitely.

This patch fixes it.

Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agoe100: fix napi ifdefs removing needed code
Auke Kok [Wed, 31 Jan 2007 19:02:46 +0000 (11:02 -0800)] 
e100: fix napi ifdefs removing needed code

From: Auke Kok <auke-jan.h.kok@intel.com>

The e100 driver is NAPI mode only. We need to netif_poll_disable
during suspend and shutdown. The non-NAPI driver code was removed
and is only available in the out-of-tree e100 kernel driver.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agoMerge ../linux-2.6
Jeff Garzik [Fri, 2 Feb 2007 13:31:55 +0000 (08:31 -0500)] 
Merge ../linux-2.6

17 years ago[BNX2]: PHY workaround for 5709 A0.
Michael Chan [Fri, 2 Feb 2007 08:46:35 +0000 (00:46 -0800)] 
[BNX2]: PHY workaround for 5709 A0.

5709 A0 copper devices will not link up with some link partners
without this workaround.

Update driver to 1.5.5.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET_SCHED]: act_ipt: fix regression in ipt action
Patrick McHardy [Fri, 2 Feb 2007 08:40:36 +0000 (00:40 -0800)] 
[NET_SCHED]: act_ipt: fix regression in ipt action

The x_tables patch broke target module autoloading in the ipt action
by replacing the ipt_find_target call (which does autoloading) by
xt_find_target (which doesn't do autoloading). Additionally xt_find_target
may return ERR_PTR values in case of an error, which are not handled.

Use xt_request_find_target, which does both autoloading and ERR_PTR
handling properly. Also don't forget to drop the target module reference
again when xt_check_target fails.
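
The ERR_PTR convention mentioned above can be illustrated in user space.
This is a sketch, not the act_ipt code: err_ptr(), ptr_err() and is_err() are
lowercase re-implementations written for this example, and find_target() and
init_action() are hypothetical helpers.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (user-space imitation). */
static void *err_ptr(intptr_t error) { return (void *)error; }

/* Recover the errno value from such a pointer. */
static intptr_t ptr_err(const void *ptr) { return (intptr_t)ptr; }

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct target { const char *name; };

static struct target known_target = { "MARK" };

/* Hypothetical lookup: a valid target, or an encoded -ENOENT. */
static struct target *find_target(const char *name)
{
    if (strcmp(name, known_target.name) == 0)
        return &known_target;
    return err_ptr(-ENOENT);
}

/* The caller must test is_err() before dereferencing the result. */
static int init_action(const char *name)
{
    struct target *t = find_target(name);

    if (is_err(t))
        return (int)ptr_err(t);
    return 0;
}
```

A caller that only tests the result against NULL would treat the encoded
error as a valid pointer, which is the unhandled case described above.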

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC32]: Fix over-optimization by GCC near ip_fast_csum.
Bob Breuer [Fri, 2 Feb 2007 04:24:35 +0000 (20:24 -0800)] 
[SPARC32]: Fix over-optimization by GCC near ip_fast_csum.

In some cases such as:
iph->check = 0;
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
GCC may optimize out the previous store.

Observed as a failure of NFS over udp (bad checksums on ip fragments)
when compiled with GCC 3.4.2.

Signed-off-by: Bob Breuer <breuerr@mc.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[PATCH] Remove avr32@atmel.com from MAINTAINERS
Haavard Skinnemoen [Thu, 1 Feb 2007 15:49:31 +0000 (16:49 +0100)] 
[PATCH] Remove avr32@atmel.com from MAINTAINERS

avr32@atmel.com is a technical support address and is not really
appropriate for sending patches. Lots of annoying automatics getting
in the way.

I'm still the maintainer of all the entries touched by this patch, so
nothing changes with regard to the "Supported" status of the AVR32
architecture or the macb driver.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] via82cxxx: fix typo ("cx7000" should be corrected to "cx700")
Bartlomiej Zolnierkiewicz [Thu, 1 Feb 2007 13:12:27 +0000 (14:12 +0100)] 
[PATCH] via82cxxx: fix typo ("cx7000" should be corrected to "cx700")

Noticed by JosephChan@via.com.tw.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] sysrq: showBlockedTasks is sysrq-W
Randy Dunlap [Thu, 1 Feb 2007 07:48:17 +0000 (23:48 -0800)] 
[PATCH] sysrq: showBlockedTasks is sysrq-W

Change SysRq showBlockedTasks from sysrq-X to sysrq-W and show that in the
Help message.

It was previously done via X, but X is already used for Xmon on ppc & powerpc
platforms and this collision needs to be avoided.

All callers of register_sysrq_key() are now marked in the sysrq op/key table.
I didn't mark 'h' as Help because Help is just printed for any unknown key,
such as '?'.

Added some omitted sysrq key entries in the sysrq.txt file.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] procfs: Fix listing of /proc/NOT_A_TGID/task
Guillaume Chazarain [Thu, 1 Feb 2007 07:48:14 +0000 (23:48 -0800)] 
[PATCH] procfs: Fix listing of /proc/NOT_A_TGID/task

Listing /proc/PID/task where PID is not a TGID should not result in
duplicated entries.

[g ~]$ pidof thunderbird-bin
2751
[g ~]$ ls /proc/2751/task
2751  2770  2771  2824  2826  2834  2835  2851  2853
[g ~]$ ls /proc/2770/task
2751  2770  2771  2824  2826  2834  2835  2851  2853
2770  2771  2824  2826  2834  2835  2851  2853
[g ~]$

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] KVM: fix lockup on 32-bit intel hosts with nx disabled in the bios
Avi Kivity [Thu, 1 Feb 2007 07:48:13 +0000 (23:48 -0800)] 
[PATCH] KVM: fix lockup on 32-bit intel hosts with nx disabled in the bios

Intel hosts, without long mode, and with nx support disabled in the bios
have an efer that is readable but not writable.  This causes a lockup on
switch to guest mode (even though it should exit with reason 34 according
to the documentation).

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] pci: remove warning messages
Andrew Morton [Thu, 1 Feb 2007 07:48:13 +0000 (23:48 -0800)] 
[PATCH] pci: remove warning messages

Remove these recently-added warnings.  They don't tell us anything very
interesting and Kumar says "On an embedded PPC reference system I see this
message 6 times when I've got no cards in the PCI slots."

Acked-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] via quirk update
Jean Delvare [Thu, 1 Feb 2007 07:48:12 +0000 (23:48 -0800)] 
[PATCH] via quirk update

Add special handling for the VT82C686.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] uml-i386: fix build breakage with CONFIG_HIGHMEM
Al Viro [Thu, 1 Feb 2007 13:53:04 +0000 (13:53 +0000)] 
[PATCH] uml-i386: fix build breakage with CONFIG_HIGHMEM

missing helper used by arch/i386/mm/highmem.c, which is pulled
into build on that configuration.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] radio modems sitting on serial port are not for s390
Al Viro [Thu, 1 Feb 2007 13:52:59 +0000 (13:52 +0000)] 
[PATCH] radio modems sitting on serial port are not for s390

Won't build (request_irq()/free_irq()), even if you manage to find an
s390 box with 8250-compatible UART they are expecting.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] sanitize sections for sparc32 smp
Al Viro [Thu, 1 Feb 2007 13:52:33 +0000 (13:52 +0000)] 
[PATCH] sanitize sections for sparc32 smp

a) sun4d_boot_one_cpu() should be __cpuinit (called only from
   __cpuinit __cpu_up(), for one thing, leads to calls of __cpuinit
   functions for another).
b) got externs in arch/sparc/kernel/smp.c to match reality.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] efi_set_rtc_mmss() is not __init
Al Viro [Thu, 1 Feb 2007 13:52:54 +0000 (13:52 +0000)] 
[PATCH] efi_set_rtc_mmss() is not __init

fix the extern in efi.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] __crc_... is intended to be absolute
Al Viro [Thu, 1 Feb 2007 13:52:23 +0000 (13:52 +0000)] 
[PATCH] __crc_... is intended to be absolute

i386 boot/compressed/relocs checks for absolute symbols and warns about
unexpected ones.  If you build with modversions, you get ~2500 warnings
about __crc_<symbol>.  These suckers are really absolute symbols - we
do _not_ want to modify them on relocation.

They are generated by genksyms - EXPORT_... generates a weak alias, then
genksyms produces an ld script with __crc_<symbol> = <checksum> and it's
fed to ld to produce the final object file.  Their only use is to match
kernel and module at modprobe time; they _must_ be absolute.

boot/compressed/relocs has a whitelist of known absolute symbols, but
it doesn't know about __crc_... stuff.  As a result, we get shitloads
of false positives on any ld(1) version.
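
One way to recognise such symbols is a plain prefix test.  The sketch below
assumes that matching on the "__crc_" prefix is sufficient; is_crc_symbol()
is a hypothetical helper, and how boot/compressed/relocs actually extends its
whitelist is not shown in this message.

```c
#include <stdbool.h>
#include <string.h>

/* True if the symbol name starts with the "__crc_" prefix. */
static bool is_crc_symbol(const char *name)
{
    static const char prefix[] = "__crc_";

    /* sizeof includes the trailing NUL, so compare one byte fewer. */
    return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

A checker could skip its warning for any absolute symbol for which this
returns true.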

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] fork_idle() should be __cpuinit, not __devinit
Al Viro [Thu, 1 Feb 2007 13:52:48 +0000 (13:52 +0000)] 
[PATCH] fork_idle() should be __cpuinit, not __devinit

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] endianness bug: ntohl() misspelled as >> 24 in fh_verify().
Al Viro [Thu, 1 Feb 2007 13:52:43 +0000 (13:52 +0000)] 
[PATCH] endianness bug: ntohl() misspelled as >> 24 in fh_verify().
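
As a general illustration of why a ">> 24" is no substitute for ntohl(), and
not a reconstruction of the fh_verify() code, the sketch below decodes the
same four wire bytes both ways.  decode_be32(), load_le_host() and shift24()
are hypothetical helpers; host byte order is modelled explicitly so the
result does not depend on the machine running it.

```c
#include <stdint.h>

/* What ntohl() yields on any host: the big-endian value of the bytes. */
static uint32_t decode_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* How a little-endian host sees the same bytes as a native word. */
static uint32_t load_le_host(const unsigned char b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

/* The shortcut: shift the raw native word right by 24 bits. */
static uint32_t shift24(uint32_t raw)
{
    return raw >> 24;
}
```

On a little-endian host the shift agrees with ntohl() only for values below
256; on a big-endian host, where the raw word already equals the decoded
value, the shift yields 0 for any such small value.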

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] ide section fixes
Al Viro [Thu, 1 Feb 2007 13:52:38 +0000 (13:52 +0000)] 
[PATCH] ide section fixes

a) cleanup_module() should be __exit
b) externs should match reality

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] mca_nmi_hook() can be called at any point
Al Viro [Thu, 1 Feb 2007 13:52:28 +0000 (13:52 +0000)] 
[PATCH] mca_nmi_hook() can be called at any point

... and having it __init is a bad idea.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[PATCH] fix frv headers_check
Al Viro [Thu, 1 Feb 2007 13:08:45 +0000 (13:08 +0000)] 
[PATCH] fix frv headers_check

a) registers.h is really needed there
b) include of asm-generic/termios should be under __KERNEL__
c) includes of asm-generic/{memory_model,page} should be under
   __KERNEL__ (nothing in there that would work in userland)
d) a lot of stuff in ptrace.h should be under __KERNEL__.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Thu, 1 Feb 2007 00:58:12 +0000 (16:58 -0800)] 
Merge /pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETFILTER]: xt_hashlimit: fix ip6tables dependency
  [SCTP]: Force update of the rto when processing HB-ACK
  [IPV6]: fix BUG of ndisc_send_redirect()
  [IPV6]: Fix up some CONFIG typos
  [NETFILTER]: SIP conntrack: fix out of bounds memory access
  [NETFILTER]: SIP conntrack: fix skipping over user info in SIP headers
  [NETFILTER]: xt_connbytes: fix division by zero
  [MAINTAINERS]: netfilter@ is subscribers-only

17 years agoRevert "[PATCH] mm: micro optimise zone_watermark_ok"
Linus Torvalds [Thu, 1 Feb 2007 00:43:36 +0000 (16:43 -0800)] 
Revert "[PATCH] mm: micro optimise zone_watermark_ok"

This reverts commit e80ee884ae0e3794ef2b65a18a767d502ad712ee.

Pawel Sikora had a boot-time oops due to it - because the sign change
invalidates the following comparisons, since 'free_pages' can be
negative.

The micro-optimization just isn't worth it.
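
The effect of the sign change can be shown with a reduced model.  This is a
sketch only: ok_signed() and ok_unsigned() are hypothetical functions that
keep just one subtraction and one comparison, not the real
zone_watermark_ok().

```c
#include <stdbool.h>

/* Signed arithmetic: a request larger than the free count goes negative. */
static bool ok_signed(unsigned long free, unsigned long request, long min)
{
    long free_pages = (long)free - (long)request;

    return free_pages > min;
}

/* Unsigned arithmetic: the same subtraction wraps to a huge value. */
static bool ok_unsigned(unsigned long free, unsigned long request,
                        unsigned long min)
{
    unsigned long free_pages = free - request;  /* wraps when request > free */

    return free_pages > min;
}
```

With 8 free pages, a request of 16 and a minimum of 4, the signed version
reports failure while the unsigned version reports success, because the
wrapped value compares greater than any realistic watermark.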

Bisected-by: Pawel Sikora <pluto@agmk.net>
Acked-by: Andrew Morton <akpm@osdl.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agonetxen patches
Al Viro [Tue, 2 Jan 2007 10:39:10 +0000 (10:39 +0000)] 
netxen patches

Have fun.

From 24f4a1a77431575a9cdfaae25adda85842099f70 Mon Sep 17 00:00:00 2001
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon, 1 Jan 2007 15:22:56 -0500
Subject: [PATCH] netxen trivial annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[NETFILTER]: xt_hashlimit: fix ip6tables dependency
Patrick McHardy [Wed, 31 Jan 2007 05:36:09 +0000 (21:36 -0800)] 
[NETFILTER]: xt_hashlimit: fix ip6tables dependency

IP6_NF_IPTABLES=m, CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=y results in a
linker error since ipv6_find_hdr is defined in ip6_tables.c. Fix similar
to Adrian Bunk's H.323 conntrack patch: selecting ip6_tables to be built
as a module requires hashlimit to be built as a module as well.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years agoLinux 2.6.20-rc7 v2.6.20-rc7
Linus Torvalds [Wed, 31 Jan 2007 03:42:57 +0000 (19:42 -0800)] 
Linux 2.6.20-rc7

Ok, so I said there wouldn't be another -rc.

I lied.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>