Ingo Molnar [Mon, 10 Nov 2008 08:16:27 +0000 (09:16 +0100)]
Merge commit 'v2.6.28-rc4' into x86/apic
Linus Torvalds [Mon, 10 Nov 2008 00:36:15 +0000 (16:36 -0800)]
Linux 2.6.28-rc4
Arjan van de Ven [Sun, 9 Nov 2008 20:45:10 +0000 (12:45 -0800)]
regression: disable timer peek-ahead for 2.6.28
It's showing up as regressions; disabling it very likely just papers
over an underlying issue, but time is running out for 2.6.28, lets get
back to this for 2.6.29
Fixes: #11826 and #11893
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Nov 2008 00:20:49 +0000 (16:20 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/sam/kbuild-fixes
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: Fixup deb-pkg target to generate separate firmware deb
Jonathan McDowell [Sat, 13 Sep 2008 16:08:31 +0000 (17:08 +0100)]
kbuild: Fixup deb-pkg target to generate separate firmware deb
The below is a simplistic fix for "make deb-pkg"; it splits the
firmware out to a linux-firmware-image package and adds an
(unversioned) Suggests to the linux package for this firmware.
Signed-Off-By: Jonathan McDowell <noodles@earth.li>
Acked-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Linus Torvalds [Sun, 9 Nov 2008 20:47:04 +0000 (12:47 -0800)]
Don't ask twice about not including staging drivers
The "Exclude staging drivers" question is there so that we don't build
staging drivers for allyesconfig or allnoconfig settings, but it's very
irritating when you've already said "no" to staging drivers earlier.
There is absolutely no point in declining twice - once you've declined
the staging drivers, you're done.
So make the second question depend on the first question having been
answered in the affirmative.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 9 Nov 2008 20:25:44 +0000 (12:25 -0800)]
Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
Fix nfsd truncation of readdir results
Linus Torvalds [Sun, 9 Nov 2008 20:20:56 +0000 (12:20 -0800)]
Merge branch 'cpus4096' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
cpumask: introduce new API, without changing anything, v3
cpumask: new API, v2
cpumask: introduce new API, without changing anything
Doug Nazar [Wed, 5 Nov 2008 11:16:28 +0000 (06:16 -0500)]
Fix nfsd truncation of readdir results
Commit
8d7c4203 "nfsd: fix failure to set eof in readdir in some
situations" introduced a bug: on a directory in an exported ext3
filesystem with dir_index unset, a READDIR will only return about 250
entries, even if the directory was larger.
Bisected it back to this commit; reverting it fixes the problem.
It turns out that in this case ext3 reads a block at a time, then
returns from readdir, which means we can end up with buf.full==0 but
with more entries in the directory still to be read. Before
8d7c4203
(but after
c002a6c797 "Optimise NFS readdir hack slightly"), this would
cause us to return the READDIR result immediately, but with the eof bit
unset. That could cause a performance regression (because the client
would need more roundtrips to the server to read the whole directory),
but no loss in correctness, since the cleared eof bit caused the client
to send another readdir. After
8d7c4203, the setting of the eof bit
made this a correctness problem.
So, move nfserr_eof into the loop and remove the buf.full check so that
we loop until buf.used==0. The following seems to do the right thing
and reduces the network traffic since we don't return a READDIR result
until the buffer is full.
Tested on an empty directory & large directory; eof is properly sent and
there are no more short buffers.
Signed-off-by: Doug Nazar <nazard@dragoninc.ca>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Rusty Russell [Sat, 8 Nov 2008 09:24:19 +0000 (20:24 +1100)]
cpumask: introduce new API, without changing anything, v3
Impact: cleanup
Clean up based on feedback from Andrew Morton and others:
- change to inline functions instead of macros
- add __init to bootmem method
- add a missing debug check
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Miklos Szeredi [Sun, 9 Nov 2008 14:23:57 +0000 (15:23 +0100)]
net: unix: fix inflight counting bug in garbage collector
Previously I assumed that the receive queues of candidates don't
change during the GC. This is only half true, nothing can be received
from the queues (see comment in unix_gc()), but buffers could be added
through the other half of the socket pair, which may still have file
descriptors referring to it.
This can result in inc_inflight_move_tail() erronously increasing the
"inflight" counter for a unix socket for which dec_inflight() wasn't
previously called. This in turn can trigger the "BUG_ON(total_refs <
inflight_refs)" in a later garbage collection run.
Fix this by only manipulating the "inflight" counter for sockets which
are candidates themselves. Duplicating the file references in
unix_attach_fds() is also needed to prevent a socket becoming a
candidate for GC while the skb that contains it is not yet queued.
Reported-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolas Pitre [Sun, 9 Nov 2008 05:27:53 +0000 (00:27 -0500)]
clarify usage expectations for cnt32_to_63()
Currently, all existing users of cnt32_to_63() are fine since the CPU
architectures where it is used don't do read access reordering, and user
mode preemption is disabled already. It is nevertheless a good idea to
better elaborate usage requirements wrt preemption, and use an explicit
memory barrier on SMP to avoid different CPUs accessing the counter
value in the wrong order. On UP a simple compiler barrier is
sufficient.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 9 Nov 2008 19:14:16 +0000 (11:14 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: struct device - replace bus_id with dev_name(), dev_set_name()
mmc: increase SD write timeout for crappy cards
Takashi Iwai [Thu, 30 Oct 2008 14:57:05 +0000 (15:57 +0100)]
regulator: Use menuconfig in Kconfig
Use menuconfig instead of flat configs so that you can disable/enable
regulator items with one selection. Also, use depends instead of
reverse selections to make life easier, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Kay Sievers [Sat, 8 Nov 2008 20:37:46 +0000 (21:37 +0100)]
mmc: struct device - replace bus_id with dev_name(), dev_set_name()
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 26 Oct 2008 11:37:25 +0000 (12:37 +0100)]
mmc: increase SD write timeout for crappy cards
It seems that some cards are slightly out of spec and occasionally
will not be able to complete a write in the alloted 250 ms [1].
Incease the timeout slightly to allow even these cards to function
properly.
[1] http://lkml.org/lkml/2008/9/23/390
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Linus Torvalds [Sat, 8 Nov 2008 18:24:28 +0000 (10:24 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: optimize sched_clock() a bit
sched: improve sched_clock() performance
Linus Torvalds [Sat, 8 Nov 2008 18:22:38 +0000 (10:22 -0800)]
Merge branch 'oprofile-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'oprofile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: Fix p6 counter overflow check
Cell OProfile: Incorrect local array size in activate spu profiling function
Revert "Cell OProfile: Incorrect local array size in activate spu profiling function"
oprofile: fix memory ordering
Cell OProfile: Incorrect local array size in activate spu profiling function
Change UTF8 chars in Kconfig help text about Oprofile AMD barcelona
Linus Torvalds [Sat, 8 Nov 2008 18:22:00 +0000 (10:22 -0800)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
Staging: make usbip depend on CONFIG_NET
Staging: only build the tree if we really want to
Rafael J. Wysocki [Sat, 8 Nov 2008 12:53:33 +0000 (13:53 +0100)]
Fix __pfn_to_page(pfn) for CONFIG_DISCONTIGMEM=y
Fix the __pfn_to_page(pfn) macro so that it doesn't evaluate its
argument twice in the CONFIG_DISCONTIGMEM=y case, because 'pfn' may
be a result of a funtion call having side effects.
For example, the hibernation code applies pfn_to_page(pfn) to the
result of a function returning the pfn corresponding to the next set
bit in a bitmap and the current bit position is modified on each
call. This leads to "interesting" failures for CONFIG_DISCONTIGMEM=y
due to the current behavior of __pfn_to_page(pfn).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Sat, 8 Nov 2008 16:05:38 +0000 (17:05 +0100)]
sched: optimize sched_clock() a bit
sched_clock() uses cycles_2_ns() needlessly - which is an irq-disabling
variant of __cycles_2_ns().
Most of the time sched_clock() is called with irqs disabled already.
The few places that call it with irqs enabled need to be updated.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sat, 8 Nov 2008 15:19:55 +0000 (16:19 +0100)]
sched: improve sched_clock() performance
in scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:
659567 system_call 41222.9375
686796 schedule 435.7843
718382 __switch_to 665.1685
823875 switch_mm 4526.7857
1883122 native_read_tsc 55385.9412
9761990 total 2.8468
this is large part due to the rdtsc_barrier() that is done before
and after reading the TSC.
But sched_clock() is not a precise clock in the GTOD sense, using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().
This improves lat_ctx performance by about 5%.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Greg Kroah-Hartman [Wed, 29 Oct 2008 17:44:55 +0000 (10:44 -0700)]
Staging: make usbip depend on CONFIG_NET
Thanks to Randy Dunlap for finding this problem.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Greg Kroah-Hartman [Sat, 8 Nov 2008 05:12:17 +0000 (21:12 -0800)]
Staging: only build the tree if we really want to
This Kconfig change allows the common 'make allmodconfig' and
'make allyesconfig' build options to skip the staging tree, which is
probably what you want to have happen anyway.
This makes the linux-next developer's life a lot easier so he doesn't
have to worry about changes that break the staging tree, that's for me
to worry about...
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ingo Molnar [Fri, 7 Nov 2008 18:22:10 +0000 (19:22 +0100)]
Merge branch 'oprofile-for-tip' of git://git./linux/kernel/git/rric/oprofile into x86/urgent
Linus Torvalds [Fri, 7 Nov 2008 18:09:28 +0000 (10:09 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP
[IA64] fix boot panic caused by offline CPUs
[IA64] reorder Kconfig options to match x86
[IA64] Build VT-D iommu support into generic kernel
[IA64] remove dead BIO_VMERGE_BOUNDARY definition
[IA64] remove duplicated #include from pci-dma.c
[IA64] use common header for software IO/TLB
[IA64] fix the difference between node_mem_map and node_start_pfn
[IA64] Add error_recovery_info field to SAL section header
[IA64] Add UV watchlist support.
[IA64] Simplify SGI uv vs. sn2 driver issues
Jay Lan [Fri, 7 Nov 2008 17:51:55 +0000 (09:51 -0800)]
[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP
IA64 kdump kernel failed to initialize /proc/vmcore in 2.6.28-rc2.
A bug was introduced in this patch commit:
d9a9855d0b06ca6d6cc92596fedcc03f8512e062
always reserve elfcore header memory in crash kernel
The problem was that the call to reserve_elfcorehdr() should be placed
in CONFIG_CRASH_DUMP rather than in CONFIG_CRASH_KERNEL, which does
not exist.
Signed-off-by: Jay Lan <jlan@sgi.com>
Acked-by: Simon Hormon <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Linus Torvalds [Fri, 7 Nov 2008 17:18:14 +0000 (09:18 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fix range check on mmapped sysfs resource files
PCI: remove excess kernel-doc notation
PCI: annotate return value of pci_ioremap_bar with __iomem
PCI: fix VPD limit quirk for Broadcom 5708S
Linus Torvalds [Fri, 7 Nov 2008 17:17:59 +0000 (09:17 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, xen: fix use of pgd_page now that it really does return a page
Linus Torvalds [Fri, 7 Nov 2008 17:17:46 +0000 (09:17 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fine-tune SD_SIBLING_INIT
sched: fine-tune SD_MC_INIT
sched: fix memory leak in a failure path
sched: fix a bug in sched domain degenerate
Linus Torvalds [Fri, 7 Nov 2008 17:17:21 +0000 (09:17 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
xen: make sure stray alias mappings are gone before pinning
vmap: cope with vm_unmap_aliases before vmalloc_init()
Andi Kleen [Fri, 7 Nov 2008 13:02:49 +0000 (14:02 +0100)]
oprofile: Fix p6 counter overflow check
Fix the counter overflow check for CPUs with counter width > 32
I had a similar change in a different patch that I didn't submit
and I didn't notice the problem earlier because it was always
tested together.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Alan Cox [Fri, 7 Nov 2008 16:07:02 +0000 (16:07 +0000)]
trivial: MPT fusion - remove long dead code
This triggers false bug reports as it does a bogus kmalloc with locks held
but is never really compiled into the kernel.
Closes #8329
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Fri, 7 Nov 2008 16:03:46 +0000 (16:03 +0000)]
trivial: dmi_scan typo
As we've lost our trivial maintainer for the moment I'll send this
directly. Only touches a comment
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 7 Nov 2008 16:15:18 +0000 (08:15 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
ext4: calculate journal credits correctly
ext4: wait on all pending commits in ext4_sync_fs()
ext4: Convert to host order before using the values.
ext4: fix missing ext4_unlock_group in error path
jbd2: deregister proc on failure in jbd2_journal_init_inode
jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space
jbd: don't give up looking for space so easily in __log_wait_for_space
Ingo Molnar [Fri, 7 Nov 2008 15:09:23 +0000 (16:09 +0100)]
sched: fine-tune SD_SIBLING_INIT
fine-tune the HT sched-domains parameters as well.
On a HT capable box, this increases lat_ctx performance from 23.87
usecs to 1.49 usecs:
# before
$ ./lat_ctx -s 0 2
"size=0k ovr=1.89
2 23.87
# after
$ ./lat_ctx -s 0 2
"size=0k ovr=1.84
2 1.49
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Fri, 7 Nov 2008 14:26:50 +0000 (15:26 +0100)]
sched: fine-tune SD_MC_INIT
Tune SD_MC_INIT the same way as SD_CPU_INIT:
unset SD_BALANCE_NEWIDLE, and set SD_WAKE_BALANCE.
This improves vmark by 5%:
vmark 132102 125968 125497 messages/sec avg 127855.66 .984
vmark 139404 131719 131272 messages/sec avg 134131.66 1.033
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
# *DOCUMENTATION*
Frederic Bohe [Fri, 7 Nov 2008 14:21:01 +0000 (09:21 -0500)]
ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
When initializing an uninitialized block group in ext4_new_inode(),
its block group checksum must be re-calculated. This fixes a race
when several threads try to allocate a new inode in an UNINIT'd group.
There is some question whether we need to be initializing the block
bitmap in ext4_new_inode() at all, but for now, if we are going to
init the block group, let's eliminate the race.
Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Aneesh Kumar K.V [Fri, 7 Nov 2008 14:06:45 +0000 (09:06 -0500)]
ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
We need to make sure we mark the buffer_heads as dirty and uptodate
so that block_write_full_page write them correctly.
This fixes mmap corruptions that can occur in low memory situations.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Rusty Russell [Fri, 7 Nov 2008 00:12:29 +0000 (11:12 +1100)]
cpumask: new API, v2
- add cpumask_of()
- add free_bootmem_cpumask_var()
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge [Tue, 28 Oct 2008 08:23:06 +0000 (19:23 +1100)]
xen: make sure stray alias mappings are gone before pinning
Xen requires that all mappings of pagetable pages are read-only, so
that they can't be updated illegally. As a result, if a page is being
turned into a pagetable page, we need to make sure all its mappings
are RO.
If the page had been used for ioremap or vmalloc, it may still have
left over mappings as a result of not having been lazily unmapped.
This change makes sure we explicitly mop them all up before pinning
the page.
Unlike aliases created by kmap, the there can be vmalloc aliases even
for non-high pages, so we must do the flush unconditionally.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jeremy Fitzhardinge [Tue, 28 Oct 2008 08:22:34 +0000 (19:22 +1100)]
vmap: cope with vm_unmap_aliases before vmalloc_init()
Xen can end up calling vm_unmap_aliases() before vmalloc_init() has
been called. In this case its safe to make it a simple no-op.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Li Zefan [Fri, 7 Nov 2008 06:47:21 +0000 (14:47 +0800)]
sched: fix memory leak in a failure path
Impact: fix rare memory leak in the sched-domains manual reconfiguration code
In the failure path, rd is not attached to a sched domain,
so it causes a leak.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Li Zefan [Thu, 6 Nov 2008 01:45:16 +0000 (09:45 +0800)]
sched: fix a bug in sched domain degenerate
Impact: re-add incorrectly eliminated sched domain layers
(1) on i386 with SCHED_SMT and SCHED_MC enabled
# mount -t cgroup -o cpuset xxx /mnt
# echo 0 > /mnt/cpuset.sched_load_balance
# mkdir /mnt/0
# echo 0 > /mnt/0/cpuset.cpus
# dmesg
CPU0 attaching sched-domain:
domain 0: span 0 level CPU
groups: 0
(2) on i386 with SCHED_MC enabled but SCHED_SMT disabled
# same with (1)
# dmesg
CPU0 attaching NULL sched-domain.
The bug is that some sched domains may be skipped unintentionally when
degenerating (optimizing) sched domains.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Fri, 7 Nov 2008 00:43:44 +0000 (16:43 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: Fix recursive descent in __scm_destroy().
iwl3945: fix deadlock on suspend
iwl3945: do not send scan command if channel count zero
iwl3945: clear scanning bits upon failure
ath5k: correct handling of rx status fields
zd1211rw: Add 2 device IDs
Fix logic error in rfkill_check_duplicity
iwlagn: avoid sleep in softirq context
iwlwifi: clear scanning bits upon failure
Revert "ath5k: honor FIF_BCN_PRBRESP_PROMISC in STA mode"
tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
netfilter: netns ct: walk netns list under RTNL
ipv6: fix run pending DAD when interface becomes ready
net/9p: fix printk format warnings
net: fix packet socket delivery in rx irq handler
xfrm: Have af-specific init_tempsel() initialize family field of temporary selector
Linus Torvalds [Thu, 6 Nov 2008 23:57:24 +0000 (15:57 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "x86: default to reboot via ACPI"
x86: align DirectMap in /proc/meminfo
AMD IOMMU: fix lazy IO/TLB flushing in unmap path
x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
x86: mention ACPI in top-level Kconfig menu
x86: size NR_IRQS on 32-bit systems the same way as 64-bit
x86: don't allow nr_irqs > NR_IRQS
x86/docs: remove noirqbalance param docs
x86: don't use tsc_khz to calculate lpj if notsc is passed
x86, voyager: fix smp_intr_init() compile breakage
AMD IOMMU: fix detection of NP capable IOMMUs
Linus Torvalds [Thu, 6 Nov 2008 23:56:29 +0000 (15:56 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] xsc3: fix xsc3_l2_inv_range
[ARM] mm: fix page table initialization
[ARM] fix naming of MODULE_START / MODULE_END
ARM: OMAP: Fix define for twl4030 irqs
ARM: OMAP: Fix get_irqnr_and_base to clear spurious interrupt bits
ARM: OMAP: Fix debugfs_create_*'s error checking method for arm/plat-omap
ARM: OMAP: Fix compiler warnings in gpmc.c
[ARM] fix VFP+softfloat binaries
Linus Torvalds [Thu, 6 Nov 2008 23:55:34 +0000 (15:55 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: dv1394: fix possible deadlock in multithreaded clients
ieee1394: raw1394: fix possible deadlock in multithreaded clients
ieee1394: struct device - replace bus_id with dev_name(), dev_set_name()
firewire: struct device - replace bus_id with dev_name(), dev_set_name()
Linus Torvalds [Thu, 6 Nov 2008 23:53:47 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Block: use round_jiffies_up()
Add round_jiffies_up and related routines
block: fix __blkdev_get() for removable devices
generic-ipi: fix the smp_mb() placement
blk: move blk_delete_timer call in end_that_request_last
block: add timer on blkdev_dequeue_request() not elv_next_request()
bio: define __BIOVEC_PHYS_MERGEABLE
block: remove unused ll_new_mergeable()
David S. Miller [Thu, 6 Nov 2008 23:52:00 +0000 (15:52 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6
Linus Torvalds [Thu, 6 Nov 2008 23:50:54 +0000 (15:50 -0800)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] SAM9 watchdog - supported on all SAM9 and CAP9 processors
[WATCHDOG] SAM9 watchdog - update for moved headers
Linus Torvalds [Thu, 6 Nov 2008 23:50:11 +0000 (15:50 -0800)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: linear: Fix a division by zero bug for very small arrays.
md: fix bug in raid10 recovery.
md: revert the recent addition of a call to the BLKRRPART ioctl.
Linus Torvalds [Thu, 6 Nov 2008 23:46:28 +0000 (15:46 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix "unused variable" warning in pci_dlpar.c
powerpc/cell: Fix compile error in ras.c
powerpc/ps3: Fix compile error in ps3-lpm.c
Linus Torvalds [Thu, 6 Nov 2008 23:45:57 +0000 (15:45 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: fix printk format warnings
unsigned fid->fid cannot be negative
9p: rdma: remove duplicated #include
p9: Fix leak of waitqueue in request allocation path
9p: Remove unneeded free of fcall for Flush
9p: Make all client spin locks IRQ safe
9p: rdma: Set trans prior to requesting async connection ops
Linus Torvalds [Thu, 6 Nov 2008 23:45:40 +0000 (15:45 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: re-tune balancing
sched: fix buddies for group scheduling
sched: backward looking buddy
sched: fix fair preempt check
sched: cleanup fair task selection
David S. Miller [Thu, 6 Nov 2008 23:45:32 +0000 (15:45 -0800)]
net: Fix recursive descent in __scm_destroy().
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Wed, 5 Nov 2008 17:38:47 +0000 (17:38 +0000)]
Fix accidental implicit cast in HR-timer conversion
Fix the hrtimer_add_expires_ns() function. It should take a 'u64 ns' argument,
but rather takes an 'unsigned long ns' argument - which might only be 32-bits.
On FRV, this results in the kernel locking up because hrtimer_forward() passes
the result of a 64-bit multiplication to this function, for which the compiler
discards the top 32-bits - something that didn't happen when ktime_add_ns() was
called directly.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 6 Nov 2008 23:43:13 +0000 (15:43 -0800)]
Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
[JFFS2] fix race condition in jffs2_lzo_compress()
[MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
[JFFS2] Fix lack of locking in thread_should_wake()
[JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFER
[MTD] [NAND] OMAP2: remove duplicated #include
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:58 +0000 (12:53 -0800)]
fat: i_blocks warning fix
blkcnt_t type depends on CONFIG_LSF. Use unsigned long long always for
printk(). But lazy to type it, so add "llu" and use it.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:57 +0000 (12:53 -0800)]
fat: ->i_pos race fix
i_pos is 64bits value, hence it's not atomic to update.
Important place is fat_write_inode() only, other places without lock
are just for printk().
This adds lock for "BITS_PER_LONG == 32" kernel.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:57 +0000 (12:53 -0800)]
fat: mmu_private race fix
mmu_private is 64bits value, hence it's not atomic to update.
So, the access rule for mmu_private is we must hold ->i_mutex. But,
fat_get_block() path doesn't follow the rule on non-allocation path.
This fixes by using i_size instead if non-allocation path.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:56 +0000 (12:53 -0800)]
fat: Add printf attribute to fat_fs_panic()
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:56 +0000 (12:53 -0800)]
fat: Fix _fat_bmap() race
fat_get_cluster() assumes the requested blocknr isn't truncated during
read. _fat_bmap() doesn't follow this rule.
This protects it by ->i_mutex.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:55 +0000 (12:53 -0800)]
fat: Fix ATTR_RO for directory
FAT has the ATTR_RO (read-only) attribute. But on Windows, the ATTR_RO
of the directory will be just ignored actually, and is used by only
applications as flag. E.g. it's setted for the customized folder by
Explorer.
http://msdn2.microsoft.com/en-us/library/
aa969337.aspx
This adds "rodir" option. If user specified it, ATTR_RO is used as
read-only flag even if it's the directory. Otherwise, inode->i_mode
is not used to hold ATTR_RO (i.e. fat_mode_can_save_ro() returns 0).
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Fix ATTR_RO in the case of (~umask & S_WUGO) == 0
If inode->i_mode doesn't have S_WUGO, current code assumes it means
ATTR_RO. However, if (~[ufd]mask & S_WUGO) == 0, inode->i_mode can't
hold S_WUGO. Therefore the updated directory entry will always have
ATTR_RO.
This adds fat_mode_can_hold_ro() to check it. And if inode->i_mode
can't hold, uses -i_attrs to hold ATTR_RO instead.
With this, we don't set ATTR_RO unless users change it via ioctl() if
(~[ufd]mask & S_WUGO) == 0.
And on FAT_IOCTL_GET_ATTRIBUTES path, this adds ->i_mutex to it for
not returning the partially updated attributes by FAT_IOCTL_SET_ATTRIBUTES
to userland.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Cleanup FAT attribute stuff
This adds three helpers:
fat_make_attrs() - makes FAT attributes from inode.
fat_make_mode() - makes mode_t from FAT attributes.
fat_save_attrs() - saves FAT attributes to inode.
Then this replaces: MSDOS_MKMODE() by fat_make_mode(), fat_attr() by
fat_make_attrs(), ->i_attrs = attr & ATTR_UNUSED by fat_save_attrs().
And for root inode, those is used with ATTR_DIR instead of bogus
ATTR_NONE.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:53 +0000 (12:53 -0800)]
fat: Cleanup msdos_lookup()
Use same style with vfat_lookup().
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:52 +0000 (12:53 -0800)]
fat: Kill d_invalidate() in vfat_lookup()
d_invalidate() for positive dentry doesn't work in some cases
(vfsmount, nfsd, and maybe others). shrink_dcache_parent() by
d_invalidate() is pointless for vfat usage at all.
So, this kills it, and intead of it uses d_move().
To save old behavior, this returns alias simply for directory (don't
change pwd, etc..). the directory lookup shouldn't be important for
performance.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:51 +0000 (12:53 -0800)]
fat: Fix/Cleanup dcache handling for vfat
- Add comments for handling dcache of vfat.
- Separate case-sensitive case and case-insensitive to
vfat_revalidate() and vfat_ci_revalidate().
vfat_revalidate() doesn't need to drop case-insensitive negative
dentry on creation path.
- Current code is missing to set ->d_revalidate to the negative dentry
created by unlink/etc..
This sets ->d_revalidate always, and returns 1 for positive
dentry. Now, we don't need to change ->d_op dynamically anymore,
so this just uses sb->s_root->d_op to set ->d_op.
- d_find_alias() may return DCACHE_DISCONNECTED dentry. It's not
the interesting dentry there. This checks it.
- Add missing LOOKUP_PARENT check. We don't need to drop the valid
negative dentry for (LOOKUP_CREATE | LOOKUP_PARENT) lookup.
- For consistent filename on creation path, this drops negative dentry
if we can't see intent.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:51 +0000 (12:53 -0800)]
vfat: Fix vfat_find() error path in vfat_lookup()
Current vfat_lookup() creates negetive dentry blindly if vfat_find()
returned a error. It's wrong. If the error isn't -ENOENT, just return
error.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:50 +0000 (12:53 -0800)]
fat: use fat_detach() in fat_clear_inode()
Use fat_detach() instead of opencoding it.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:49 +0000 (12:53 -0800)]
fat: Fix fat_ent_update_ptr() for FAT12
This fixes the missing update for bhs/nr_bhs in case the caller
accessed from block boundary to first block of boundary.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:49 +0000 (12:53 -0800)]
fat: improve fat_hash()
fat_hash() is using the algorithm known as bad. Instead of it, this
uses hash_32(). The following is the summary of test.
old hash:
hash func (1000 times): 33489 cycles
total inodes in hash table: 70926
largest bucket contains: 696
smallest bucket contains: 54
new hash:
hash func (1000 times): 33129 cycles
total inodes in hash table: 70926
largest bucket contains: 315
smallest bucket contains: 236
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darren Jenkins [Thu, 6 Nov 2008 20:53:48 +0000 (12:53 -0800)]
fat: cleanup fat_parse_long() error handling
Coverity CID 2332 & 2333 RESOURCE_LEAK
In fat_search_long() if fat_parse_long() returns a -ve value we return
without first freeing unicode. This patch free's them on this error path.
The above was false positive on current tree, but this change is more
clean, so apply as cleanup.
[hirofumi@mail.parknet.co.jp: fix coding style]
Signed-off-by: Darren Jenkins <darrenrjenkins@gmail.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:47 +0000 (12:53 -0800)]
fat: use generic_file_llseek() for directory
Since fat_dir_ioctl() was already fixed (i.e. called under ->i_mutex),
and __fat_readdir() doesn't take BKL anymore. So, BKL for ->llseek()
is pointless, and we have to use generic_file_llseek().
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:47 +0000 (12:53 -0800)]
fat: Fix and cleanup timestamp conversion
This cleans date_dos2unix()/fat_date_unix2dos() up. New code should be
much more readable.
And this fixes those old functions. Those doesn't handle 2100
correctly. 2100 isn't leap year, but old one handles it as leap year.
Also, with this, centi sec is handled and is fixed.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:46 +0000 (12:53 -0800)]
fat: split include/msdos_fs.h
This splits __KERNEL__ stuff in include/msdos_fs.h into fs/fat/fat.h.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:45 +0000 (12:53 -0800)]
fat: move fs/vfat/* and fs/msdos/* to fs/fat
This just moves those files, but change link order from MSDOS, VFAT to
VFAT, MSDOS.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bart Trojanowski [Thu, 6 Nov 2008 20:53:44 +0000 (12:53 -0800)]
fat: document additional vfat mount options
While debugging a sync mount regression on vfat I noticed that there were
mount options parsed by the driver that were not documented.
[hirofumi@mail.parknet.co.jp: fix some parts]
Signed-off-by: Bart Trojanowski <bart@jukie.net>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Victor [Thu, 6 Nov 2008 20:53:42 +0000 (12:53 -0800)]
SAM9 watchdog: update for moved headers
The architecture header files were recently moved from
include/asm-arm/mach-at91/ to arch/arm/mach-at91/include/mach/. The SAM9
watchdog driver still includes a header from the old location.
Signed-off-by: Andrew Victor <linux@maxim.org.za>
Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frans Pop [Thu, 6 Nov 2008 20:53:41 +0000 (12:53 -0800)]
rtc-cmos: fix boot log message
-rtc0: alarms up to one month, y3k, 114 bytes nvram, , hpet irqs irqs
+rtc0: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Thu, 6 Nov 2008 20:53:40 +0000 (12:53 -0800)]
atmel_serial: keep clock off when it's not needed
The atmel_serial driver is mismanaging its clock by leaving it on at all
times ... the whole point of clock management is to leave it off unless
it's actively needed, which conserves power!!
Although the kernel doesn't actually hang without my fix, it does
discard quite a lot of early console output.
The result still looks correct:
usart users= 1 on
35000000 Hz, for atmel_usart.0
usart users= 0 off
35000000 Hz, for atmel_usart.2
when using ttyS0 as serial console.
[haavard.skinnemoen@atmel.com: Make sure clock is enabled early for console]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Thu, 6 Nov 2008 20:53:39 +0000 (12:53 -0800)]
Documentation/kernel-parameters.txt: update 'isolcpus' kernel option
cpuset can be used to move a process onto or off an isolated CPU.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Thu, 6 Nov 2008 20:53:37 +0000 (12:53 -0800)]
fbdev: fix fb_compat_ioctl() deadlocks
commit
3e680aae4e53ab54cdbb0c29257dae0cbb158e1c ("fb: convert
lock/unlock_kernel() into local fb mutex") introduced several deadlocks
in the fb_compat_ioctl() path, as mutex_lock() doesn't allow recursion,
unlike lock_kernel(). This broke frame buffer applications on 64-bit
systems with a 32-bit userland.
commit
120a37470c2831fea49fdebaceb5a7039f700ce6 ("framebuffer compat_ioctl
deadlock") fixed one of the deadlocks.
This patch fixes the remaining deadlocks:
- Revert commit
120a37470c2831fea49fdebaceb5a7039f700ce6,
- Extract the core logic of fb_ioctl() into a new function do_fb_ioctl(),
- Change all callsites of fb_ioctl() where info->lock is already held to
call do_fb_ioctl() instead,
- Add sparse annotations to all routines that take info->lock.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Thu, 6 Nov 2008 20:53:36 +0000 (12:53 -0800)]
memory hotplug: fix page_zone() calculation in test_pages_isolated()
My last bugfix here (adding zone->lock) introduced a new problem: Using
page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
pfn will then be >= end_pfn, which may be in a different zone or not
present at all. This may lead to an addressing exception in page_zone()
or spin_lock_irqsave().
Now I use __first_valid_page() again after the loop to find a valid page
for page_zone().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Nathan Fontenot <nfont@austin.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arthur Jones [Thu, 6 Nov 2008 20:53:35 +0000 (12:53 -0800)]
ext3: wait on all pending commits in ext3_sync_fs
In ext3_sync_fs, we only wait for a commit to finish if we started it, but
there may be one already in progress which will not be synced.
In the case of a data=ordered umount with pending long symlinks which are
delayed due to a long list of other I/O on the backing block device, this
causes the buffer associated with the long symlinks to not be moved to the
inode dirty list in the second phase of fsync_super. Then, before they
can be dirtied again, kjournald exits, seeing the UMOUNT flag and the
dirty pages are never written to the backing block device, causing long
symlink corruption and exposing new or previously freed block data to
userspace.
This can be reproduced with a script created
by Eric Sandeen <sandeen@redhat.com>:
#!/bin/bash
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
rm -f /mnt/test2/*
dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
touch
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
ln -s
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
/mnt/test2/link
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
ls /mnt/test2/
umount /mnt/test2
To ensure all commits are synced, we flush all journal commits now when
sync_fs'ing ext3.
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org> [2.6.everything]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qinghuang Feng [Thu, 6 Nov 2008 20:53:34 +0000 (12:53 -0800)]
mm/oom_kill.c: fix badness() kerneldoc
Paramter @mem has been removed since v2.6.26, now delete it's comment.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dann frazier [Thu, 6 Nov 2008 20:53:34 +0000 (12:53 -0800)]
cciss: add P700m to list of supported controllers
P700m support was added in:
9cff3b383dad193b0762c27278a16237e10b53dc
Update cciss.txt to match.
Signed-off-by: dann frazier <dannf@hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tim Hockin [Thu, 6 Nov 2008 20:53:33 +0000 (12:53 -0800)]
Documentation/email-clients.txt: add some info about gmail
Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Thu, 6 Nov 2008 20:53:32 +0000 (12:53 -0800)]
cgroups: fix invalid cgrp->dentry before cgroup has been completely removed
This fixes an oops when reading /proc/sched_debug.
A cgroup won't be removed completely until finishing cgroup_diput(), so we
shouldn't invalidate cgrp->dentry in cgroup_rmdir(). Otherwise, when a
group is being removed while cgroup_path() gets called, we may trigger
NULL dereference BUG.
The bug can be reproduced:
# cat test.sh
#!/bin/sh
mount -t cgroup -o cpu xxx /mnt
for (( ; ; ))
{
mkdir /mnt/sub
rmdir /mnt/sub
}
# ./test.sh &
# cat /proc/sched_debug
BUG: unable to handle kernel NULL pointer dereference at
00000038
IP: [<
c045a47f>] cgroup_path+0x39/0x90
...
Call Trace:
[<
c0420344>] ? print_cfs_rq+0x6e/0x75d
[<
c0421160>] ? sched_debug_show+0x72d/0xc1e
...
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Thu, 6 Nov 2008 20:53:31 +0000 (12:53 -0800)]
vmemmap: warn about page_structs with remote distance
It's insufficient to simply compare node ids when warning about offnode
page_structs since it's possible to still have local affinity.
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Lameter [Thu, 6 Nov 2008 20:53:30 +0000 (12:53 -0800)]
mm: move migrate_prep out from under mmap_sem
Move the migrate_prep outside the mmap_sem for the following system calls
1. sys_move_pages
2. sys_migrate_pages
3. sys_mbind()
It really does not matter when we flush the lru. The system is free to
add pages onto the lru even during migration which will make the page
migration either skip the page (mbind, migrate_pages) or return a busy
state (move_pages).
Fixes this lockdep warning (and potential deadlock):
Some VM place has
mmap_sem -> kevent_wq via lru_add_drain_all()
net/core/dev.c::dev_ioctl() has
rtnl_lock -> mmap_sem (*) the ioctl has copy_from_user() and it can do page fault.
linkwatch_event has
kevent_wq -> rtnl_lock
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anatolij Gustschin [Thu, 6 Nov 2008 20:53:29 +0000 (12:53 -0800)]
fbdev: add new framebuffer driver for Fujitsu MB862xx GDCs
Add a framebuffer driver for the Fujitsu Carmine/Coral-P(A)/Lime graphics
controllers. Lime GDC support is known to work on PPC440EPx based lwmon5
and MPC8544E based socrates embedded boards, both equipped with Lime GDC.
Carmine/Coral-P PCI GDC support is known to work on PPC440EPx based
Sequoia board and also on x86 platform.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matteo Fortini <m.fortini@selcomgroup.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Thu, 6 Nov 2008 20:53:29 +0000 (12:53 -0800)]
oom: do not dump task state for non thread group leaders
When /proc/sys/vm/oom_dump_tasks is enabled, it's only necessary to dump
task state information for thread group leaders. The kernel log gets
quickly overwhelmed on machines with a massive number of threads by
dumping non-thread group leaders.
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 6 Nov 2008 20:53:28 +0000 (12:53 -0800)]
MAINTAINERS: make IOAT easier to find
Searching MAINTAINERS for "ioat" comes up empty. Fix this.
Cc: "Dan Williams" <dan.j.williams@intel.com>
Cc: "Sosnowski, Maciej" <maciej.sosnowski@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Thu, 6 Nov 2008 20:53:27 +0000 (12:53 -0800)]
hugetlb: pull gigantic page initialisation out of the default path
As we can determine exactly when a gigantic page is in use we can optimise
the common regular page cases by pulling out gigantic page initialisation
into its own function. As gigantic pages are never released to buddy we
do not need a destructor. This effectivly reverts the previous change to
the main buddy allocator. It also adds a paranoid check to ensure we
never release gigantic pages from hugetlbfs to the main buddy.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Thu, 6 Nov 2008 20:53:26 +0000 (12:53 -0800)]
hugetlbfs: handle pages higher order than MAX_ORDER
When working with hugepages, hugetlbfs assumes that those hugepages are
smaller than MAX_ORDER. Specifically it assumes that the mem_map is
contigious and uses that to optimise access to the elements of the mem_map
that represent the hugepage. Gigantic pages (such as 16GB pages on
powerpc) by definition are of greater order than MAX_ORDER (larger than
MAX_ORDER_NR_PAGES in size). This means that we can no longer make use of
the buddy alloctor guarentees for the contiguity of the mem_map, which
ensures that the mem_map is at least contigious for maximmally aligned
areas of MAX_ORDER_NR_PAGES pages.
This patch adds new mem_map accessors and iterator helpers which handle
any discontiguity at MAX_ORDER_NR_PAGES boundaries. It then uses these to
implement gigantic page versions of copy_huge_page and clear_huge_page,
and to allow follow_hugetlb_page handle gigantic pages.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Miller [Thu, 6 Nov 2008 20:53:25 +0000 (12:53 -0800)]
cciss: fix regression firmware not displayed in procfs
This regression was introduced by commit
6ae5ce8e8d4de666f31286808d2285aa6a50fa40 ("cciss: remove redundant code").
This patch fixes a regression where the controller firmware version is not
displayed in procfs. The previous patch would be called anytime something
changed. This will get called only once for each controller.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Miller [Thu, 6 Nov 2008 20:53:24 +0000 (12:53 -0800)]
cciss: fix sysfs broken symlink regression
Regression introduced by commit
6ae5ce8e8d4de666f31286808d2285aa6a50fa40
("cciss: remove redundant code").
This patch fixes a broken symlink in sysfs that was introduced by the
above commit. We broke it in 2.6.27-rc on or about
20080804. Some
installers are broken if this symlink does not exist and they may not
detect the logical drives configured on the controller. It does not
require being backported into 2.6.26.x or earlier kernels.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Thu, 6 Nov 2008 20:53:23 +0000 (12:53 -0800)]
autofs4: collect version check return
The function check_dev_ioctl_version() returns an error code upon fail but
it isn't captured and returned in validate_dev_ioctl() as it should be.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>