linux-2.6
19 years ago[PATCH] x86: GDT alignment fix
Zachary Amsden [Fri, 6 Jan 2006 08:11:47 +0000 (00:11 -0800)] 
[PATCH] x86: GDT alignment fix

Make GDT page aligned and page padded to support running inside of a
hypervisor.  This prevents false sharing of the GDT page with other hot
data, which is not allowed in Xen, and causes performance problems in
VMware.

Rather than go back to the old method of statically allocating the GDT
(which wastes unneded space for non-present CPUs), the GDT for APs is
allocated dynamically.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mips: remove include/asm-mips/riscos-syscall.h
Domen Puncer [Fri, 6 Jan 2006 08:11:46 +0000 (00:11 -0800)] 
[PATCH] mips: remove include/asm-mips/riscos-syscall.h

Remove nowhere referenced file ("grep riscos -r ." didn't find anything).

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] frv: improve signal handling
David Howells [Fri, 6 Jan 2006 08:11:45 +0000 (00:11 -0800)] 
[PATCH] frv: improve signal handling

The attached patch improves the signal handling:

 (1) It makes do_signal() static as it isn't called from anywhere outside of
     the arch code.

 (2) It removes the regs argument to all the static functions within that file,
     using __frame instead (which is the same thing held in a global register).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] frv: fix signal handling
David Howells [Fri, 6 Jan 2006 08:11:44 +0000 (00:11 -0800)] 
[PATCH] frv: fix signal handling

The attached patch makes FRV signal handling work properly:

 (1) After do_notify_resume() has been called, the work flags must be checked
     again (there may be another signal to deliver or the process might require
     rescheduling for instance).

 (2) After the signal frame is set up on the userspace stack, ptrace() should
     be given an opportunity to single-step into the signal handler.

 (3) The error state from setting up a signal frame should be passed back up
     the call chain.

 (4) The segfault handler shouldn't be preemptively reset in the arch if we
     fail to deliver a SEGV signal: force_sig() will take care of that.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] FRV: Make futex code compilable on nommu [try #2]
David Howells [Fri, 6 Jan 2006 08:11:44 +0000 (00:11 -0800)] 
[PATCH] FRV: Make futex code compilable on nommu [try #2]

Make the futex code compilable and usable on NOMMU by making the attempt to
handle page faults conditional on CONFIG_MMU.  If this is not enabled, then
we can assume that EFAULT returned from futex_atomic_op_inuser() is not
recoverable, and that the address lies outside of valid memory.

handle_mm_fault() is made to BUG if called on NOMMU without attempting to
invoke the actual handler (__handle_mm_fault).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] FRV: Implement futex operations for FRV
David Howells [Fri, 6 Jan 2006 08:11:43 +0000 (00:11 -0800)] 
[PATCH] FRV: Implement futex operations for FRV

The attached patch implements futex operations for the FRV architecture. The
operations are applicable to both MMU and no-MMU modes; though the EFAULT
handling will be a little bit of wasted space on the latter.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] NOMMU: Make SYSV IPC SHM use ramfs facilities on NOMMU
David Howells [Fri, 6 Jan 2006 08:11:42 +0000 (00:11 -0800)] 
[PATCH] NOMMU: Make SYSV IPC SHM use ramfs facilities on NOMMU

The attached patch makes the SYSV IPC shared memory facilities use the new
ramfs facilities on a no-MMU kernel.

The following changes are made:

 (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to
     allow the IPC SHM facilities to commune with the tiny-shmem and shmem
     code.

 (2) ramfs files now need resizing using do_truncate() rather than by modifying
     the inode size directly (see shmem_file_setup()). This causes ramfs to
     attempt to bind a block of pages of sufficient size to the inode.

 (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] NOMMU: Provide shared-writable mmap support on ramfs
David Howells [Fri, 6 Jan 2006 08:11:41 +0000 (00:11 -0800)] 
[PATCH] NOMMU: Provide shared-writable mmap support on ramfs

The attached patch makes ramfs support shared-writable mmaps by:

 (1) Attempting to perform a contiguous block allocation to the requested size
     when truncate attempts to increase the file from zero size, such as
     happens when:

fd = shm_open("/file/on/ramfs", ...):
ftruncate(fd, size_requested);
addr = mmap(NULL, subsize, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED,
    fd, offset);

 (2) Permitting any shared-writable mapping over any contiguous set of extant
     pages. get_unmapped_area() will return the address into the actual ramfs
     pages. The mapping may start anywhere and be of any size, but may not go
     over the end of file. Multiple mappings may overlap in any way.

 (3) Not permitting a file to be shrunk if it would truncate any shared
     mappings (private mappings are copied).

Thus this patch provides support for POSIX shared memory on NOMMU kernels,
with certain limitations such as there being a large enough block of pages
available to support the allocation and it only working on directly mappable
filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] therm_adt746x: Quiet fan speed change messages
Ben Collins [Fri, 6 Jan 2006 08:11:40 +0000 (00:11 -0800)] 
[PATCH] therm_adt746x: Quiet fan speed change messages

Only output the messages about fan speed changes with a verbose=1 module
param.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Re-add embed_config.c to ml300/ep405
Peter Korsgaard [Fri, 6 Jan 2006 08:11:39 +0000 (00:11 -0800)] 
[PATCH] ppc32: Re-add embed_config.c to ml300/ep405

Commit 3e9e7c1d0b7a36fb8affb973a054c5098e27baa8 (ppc32: cleanup AMCC PPC40x
eval boards to support U-Boot) broke the kernel for ML300 / EP405.

It still compiles as there's a weak definition of the function in
misc-embedded.c, but the kernel crashes as the bd_t fixup isn't performed.

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Allows compilation of a MPC52xx kernel without PCI
Sylvain Munaut [Fri, 6 Jan 2006 08:11:38 +0000 (00:11 -0800)] 
[PATCH] ppc32: Allows compilation of a MPC52xx kernel without PCI

Some custom cards might not need PCI, without this patch, compilation fails.

Signed-off-by: Roger Blofeld <blofeldus@yahoo.com>
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Fix MPC52xx PCI init in cas the bootloader didn't do it
Sylvain Munaut [Fri, 6 Jan 2006 08:11:37 +0000 (00:11 -0800)] 
[PATCH] ppc32: Fix MPC52xx PCI init in cas the bootloader didn't do it

We were counting on the bootloader to init some stuff, like get the bus out of
reset and enable accesses.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Fix MPC52xx configuration space access
Sylvain Munaut [Fri, 6 Jan 2006 08:11:36 +0000 (00:11 -0800)] 
[PATCH] ppc32: Fix MPC52xx configuration space access

This patch takes care of an errata of the MPC5200 by avoiding 32 bits access
in type 1 configuration accesses.  All others accesses are still 32 bits wide.
 It also adds some mb() since the simple out_be(...) are not sufficient in
this case.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Remove __init qualifier from mpc52xx pci resources fixups
Sylvain Munaut [Fri, 6 Jan 2006 08:11:35 +0000 (00:11 -0800)] 
[PATCH] ppc32: Remove __init qualifier from mpc52xx pci resources fixups

The mpc52xx_pci_fixup_resources is not only called at init but also when there
is a pci hotplug like when a cardbus card is plugged in.  So that function is
needed after init too.

Thanks to Asier Llano Palacios for reporting this.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Modify Freescale MPC52xx IRQ mapping to _not_ use irq 0
Sylvain Munaut [Fri, 6 Jan 2006 08:11:35 +0000 (00:11 -0800)] 
[PATCH] ppc32: Modify Freescale MPC52xx IRQ mapping to _not_ use irq 0

AFAIK IRQ number 0 is a perfectly valid IRQ number.  But it seems there are
numerous places where it's considered to be invalid or "no irq" value.  Since
that value is problematic, the IRQ mapping is changed to not use it.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Fix static IO mapping for Freescale MPC52xx
Sylvain Munaut [Fri, 6 Jan 2006 08:11:34 +0000 (00:11 -0800)] 
[PATCH] ppc32: Fix static IO mapping for Freescale MPC52xx

The current iomapping used MBAR_SIZE for the size argument of
io_block_mapping, resulting in a call to setbat with a size argument of 64k
which is invalid.

This patch correct this and maps the whole 0xf0000000->0xffffffff range so
that devices on the local bus are also included in the BAT mapping.

Thanks to Bernhard Kuhn from Metrowerks for pointing this out.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32/serial: Change mpc52xx_uart.c to use the Low Density Serial port major
Sylvain Munaut [Fri, 6 Jan 2006 08:11:32 +0000 (00:11 -0800)] 
[PATCH] ppc32/serial: Change mpc52xx_uart.c to use the Low Density Serial port major

Before this patch we were just using the "classic" /dev/ttySx devices.
However when another on the system is loaded that uses those (like drivers for
serial PCMCIA), that creates a conflict for the minors.  Therefore, we now use
/dev/ttyPSC[0:5] (note the 0-based numbering !) with some minors we've been
assigned in the "Low Density Serial port major"

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32/serial: Fix compiler errors with GCC 4.x in mpc52xx_uart.c
Sylvain Munaut [Fri, 6 Jan 2006 08:11:31 +0000 (00:11 -0800)] 
[PATCH] ppc32/serial: Fix compiler errors with GCC 4.x in mpc52xx_uart.c

Signed-off-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: Remove useless file arch/ppc/platforms/mpc5200.c
Sylvain Munaut [Fri, 6 Jan 2006 08:11:30 +0000 (00:11 -0800)] 
[PATCH] ppc32: Remove useless file arch/ppc/platforms/mpc5200.c

That file is a left-over of the 'old' OCP model that should have been erased
during the change to platform model but I forgot it ...

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] macintosh: don't store i2c_add_driver() return if no further processing done
Arthur Othieno [Fri, 6 Jan 2006 08:11:29 +0000 (00:11 -0800)] 
[PATCH] macintosh: don't store i2c_add_driver() return if no further processing done

therm_pm72.c and windfarm_lm75_sensor.c both store the return from
i2c_add_driver() but do no further processing on the result.  Simply return
what i2c_add_driver() did, instead.

Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] arch/ppc/kernel/idle.c: don't declare cpu variable in non-SMP kernels
Otavio Salvador [Fri, 6 Jan 2006 08:11:26 +0000 (00:11 -0800)] 
[PATCH] arch/ppc/kernel/idle.c: don't declare cpu variable in non-SMP kernels

Disable declaration of cpu variable in default_idle function when
building non-SMP kernels.

Signed-off-by: Otavio Salvador <otavio@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ppc32: remove "jumbo" member from ocp_func_emac_data
Eugene Surovegin [Fri, 6 Jan 2006 08:11:26 +0000 (00:11 -0800)] 
[PATCH] ppc32: remove "jumbo" member from ocp_func_emac_data

Remove the not needed anymore "jumbo" member from ocp_func_emac_data.
Jumbo frame support is handled by PPC4xx EMAC driver internally now.

Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] security/: possible cleanups
Adrian Bunk [Fri, 6 Jan 2006 08:11:25 +0000 (00:11 -0800)] 
[PATCH] security/: possible cleanups

make needlessly global code static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Keys: Remove key duplication
David Howells [Fri, 6 Jan 2006 08:11:24 +0000 (00:11 -0800)] 
[PATCH] Keys: Remove key duplication

Remove the key duplication stuff since there's nothing that uses it, no way
to get at it and it's awkward to deal with for LSM purposes.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] selinux: more ARRAY_SIZE cleanups
Tobias Klauser [Fri, 6 Jan 2006 08:11:23 +0000 (00:11 -0800)] 
[PATCH] selinux: more ARRAY_SIZE cleanups

Further ARRAY_SIZE cleanups under security/selinux.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] selinux: ARRAY_SIZE cleanups
Nicolas Kaiser [Fri, 6 Jan 2006 08:11:22 +0000 (00:11 -0800)] 
[PATCH] selinux: ARRAY_SIZE cleanups

Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]).

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: page_state opt docs
Nick Piggin [Fri, 6 Jan 2006 08:11:21 +0000 (00:11 -0800)] 
[PATCH] mm: page_state opt docs

Comment the new locking rules for page_state statistics.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: page_state opt
Nick Piggin [Fri, 6 Jan 2006 08:11:20 +0000 (00:11 -0800)] 
[PATCH] mm: page_state opt

Optimise page_state manipulations by introducing interrupt unsafe accessors
to page_state fields.  Callers must provide their own locking (either
disable interrupts or not update from interrupt context).

Switch over the hot callsites that can easily be moved under interrupts off
sections.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] atomic_long_t & include/asm-generic/atomic.h V2
Christoph Lameter [Fri, 6 Jan 2006 08:11:20 +0000 (00:11 -0800)] 
[PATCH] atomic_long_t & include/asm-generic/atomic.h V2

Several counters already have the need to use 64 atomic variables on 64 bit
platforms (see mm_counter_t in sched.h).  We have to do ugly ifdefs to fall
back to 32 bit atomic on 32 bit platforms.

The VM statistics patch that I am working on will also make more extensive
use of atomic64.

This patch introduces a new type atomic_long_t by providing definitions in
asm-generic/atomic.h that works similar to the c "long" type.  Its 32 bits
on 32 bit platforms and 64 bits on 64 bit platforms.

Also cleans up the determination of the mm_counter_t in sched.h.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] build_zonelists_node(): rename args
Christoph Lameter [Fri, 6 Jan 2006 08:11:19 +0000 (00:11 -0800)] 
[PATCH] build_zonelists_node(): rename args

Give j and r meaningful names.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Fix zone policy determination
Christoph Lameter [Fri, 6 Jan 2006 08:11:18 +0000 (00:11 -0800)] 
[PATCH] Fix zone policy determination

The use k in the inner loop means that the highest zone nr is always used
if any zone of a node is populated.  This means that the policy zone is not
correctly determined on arches that do no use HIGHMEM like ia64.

Change the loop to decrement k which also simplifies the BUG_ON.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: move determination of policy_zone into page allocator
Christoph Lameter [Fri, 6 Jan 2006 08:11:17 +0000 (00:11 -0800)] 
[PATCH] mm: move determination of policy_zone into page allocator

Currently the function to build a zonelist for a BIND policy has the side
effect to set the policy_zone.  This seems to be a bit strange.  policy
zone seems to not be initialized elsewhere and therefore 0.  Do we police
ZONE_DMA if no bind policy has been used yet?

This patch moves the determination of the zone to apply policies to into
the page allocator.  We determine the zone while building the zonelist for
nodes.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: simplify build_zonelists_node by removing the case statement.
Christoph Lameter [Fri, 6 Jan 2006 08:11:16 +0000 (00:11 -0800)] 
[PATCH] mm: simplify build_zonelists_node by removing the case statement.

Simplify build_zonelists_node by removing the case statement.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: add populated_zone() helper
Con Kolivas [Fri, 6 Jan 2006 08:11:15 +0000 (00:11 -0800)] 
[PATCH] mm: add populated_zone() helper

There are numerous places we check whether a zone is populated or not.

Provide a helper function to check for populated zones and convert all
checks for zone->present_pages.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] consolidate lru_add_drain() and lru_drain_cache()
Andrew Morton [Fri, 6 Jan 2006 08:11:14 +0000 (00:11 -0800)] 
[PATCH] consolidate lru_add_drain() and lru_drain_cache()

Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Li Shaohua <shaohua.li@intel.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] vmscan: balancing fix
Andrew Morton [Fri, 6 Jan 2006 08:11:14 +0000 (00:11 -0800)] 
[PATCH] vmscan: balancing fix

Revert a patch which went into 2.6.8-rc1.  The changelog for that patch was:

  The shrink_zone() logic can, under some circumstances, cause far too many
  pages to be reclaimed.  Say, we're scanning at high priority and suddenly
  hit a large number of reclaimable pages on the LRU.

  Change things so we bale out when SWAP_CLUSTER_MAX pages have been
  reclaimed.

Problem is, this change caused significant imbalance in inter-zone scan
balancing by truncating scans of larger zones.

Suppose, for example, ZONE_HIGHMEM is 10x the size of ZONE_NORMAL.  The zone
balancing algorithm would require that if we're scanning 100 pages of
ZONE_HIGHMEM, we should scan 10 pages of ZONE_NORMAL.  But this logic will
cause the scanning of ZONE_HIGHMEM to bale out after only 32 pages are
reclaimed.  Thus effectively causing smaller zones to be scanned relatively
harder than large ones.

Now I need to remember what the workload was which caused me to write this
patch originally, then fix it up in a different way...

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: pfault optimisation
Nick Piggin [Fri, 6 Jan 2006 08:11:13 +0000 (00:11 -0800)] 
[PATCH] mm: pfault optimisation

This atomic operation is superfluous: the pte will be added with the
referenced bit set, and the page will be referenced through this mapping after
the page fault handler returns anyway.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: rmap optimisation
Nick Piggin [Fri, 6 Jan 2006 08:11:12 +0000 (00:11 -0800)] 
[PATCH] mm: rmap optimisation

Optimise rmap functions by minimising atomic operations when we know there
will be no concurrent modifications.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: bad_page optimisation
Nick Piggin [Fri, 6 Jan 2006 08:11:11 +0000 (00:11 -0800)] 
[PATCH] mm: bad_page optimisation

Cut down size slightly by not passing bad_page the function name (it should be
able to be determined by dump_stack()).  And cut down the number of printks in
bad_page.

Also, cut down some branching in the destroy_compound_page path.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: dma32 zone statistics
Nick Piggin [Fri, 6 Jan 2006 08:11:10 +0000 (00:11 -0800)] 
[PATCH] mm: dma32 zone statistics

Add dma32 to zone statistics.  Also attempt to arrange struct page_state a
bit better (visually).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] kill last zone_reclaim() bits
Andrew Morton [Fri, 6 Jan 2006 08:11:09 +0000 (00:11 -0800)] 
[PATCH] kill last zone_reclaim() bits

Remove the last bits of Martin's ill-fated sys_set_zone_reclaim().

Cc: Martin Hicks <mort@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] find_lock_page(): call __lock_page() directly.
Nikita Danilov [Fri, 6 Jan 2006 08:11:08 +0000 (00:11 -0800)] 
[PATCH] find_lock_page(): call __lock_page() directly.

As find_lock_page() already checks with TestSetPageLocked() that page is
locked, there is no need to call lock_page() that will try-lock page again
(chances of page being unlocked in between are small).  Call __lock_page()
directly, this saves one atomic operation.

Also, mark truncate-while-slept path as unlikely while we are here.

(akpm: ug.  But this is actually a common path for normal old read()s against
a page which is under readahead I/O so ho-hum.)

Signed-off-by: Nikita Danilov <danilov@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] FRV: Clean up bootmem allocator's page freeing algorithm
David Howells [Fri, 6 Jan 2006 08:11:08 +0000 (00:11 -0800)] 
[PATCH] FRV: Clean up bootmem allocator's page freeing algorithm

The attached patch cleans up the way the bootmem allocator frees pages.

A new function, __free_pages_bootmem(), is provided in mm/page_alloc.c that is
called from mm/bootmem.c to turn pages over to the main allocator.  All the
bits of code to initialise pages (clearing PG_reserved and setting the page
count) are moved to here.  The checks on page validity are removed, on the
assumption that the struct page arrays will have been prepared correctly.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Cleanup bootmem allocator and fix alloc_bootmem_low
Ravikiran G Thirumalai [Fri, 6 Jan 2006 08:11:01 +0000 (00:11 -0800)] 
[PATCH] Cleanup bootmem allocator and fix alloc_bootmem_low

Patch cleans up the alloc_bootmem fix for swiotlb.  Patch removes
alloc_bootmem_*_limit api and fixes alloc_boot_*low api to do the right
thing -- allocate from low32 memory.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: page_alloc cleanups
Nick Piggin [Fri, 6 Jan 2006 08:11:01 +0000 (00:11 -0800)] 
[PATCH] mm: page_alloc cleanups

Small cleanups that does not change generated code with the gcc's I've tested
with.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: page_state fixes
Nick Piggin [Fri, 6 Jan 2006 08:11:00 +0000 (00:11 -0800)] 
[PATCH] mm: page_state fixes

read_page_state and __get_page_state only traverse online CPUs, which will
cause results to fluctuate when CPUs are plugged in or out.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: remove pcp low
Nick Piggin [Fri, 6 Jan 2006 08:10:59 +0000 (00:10 -0800)] 
[PATCH] mm: remove pcp low

struct per_cpu_pages.low is useless.  Remove it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: remove bad_range
Nick Piggin [Fri, 6 Jan 2006 08:10:58 +0000 (00:10 -0800)] 
[PATCH] mm: remove bad_range

bad_range is supposed to be a temporary check.  It would be a pity to throw it
out.  Make it depend on CONFIG_DEBUG_VM instead.

CONFIG_HOLES_IN_ZONE systems were relying on this to check pfn_valid in the
page allocator.  Add that to page_is_buddy instead.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: microopt conditions
Nick Piggin [Fri, 6 Jan 2006 08:10:57 +0000 (00:10 -0800)] 
[PATCH] mm: microopt conditions

Micro optimise some conditionals where we don't need lazy evaluation.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: set_page_refs opt
Nick Piggin [Fri, 6 Jan 2006 08:10:57 +0000 (00:10 -0800)] 
[PATCH] mm: set_page_refs opt

Inline set_page_refs.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: pagealloc opt
Nick Piggin [Fri, 6 Jan 2006 08:10:56 +0000 (00:10 -0800)] 
[PATCH] mm: pagealloc opt

Slightly optimise some page allocation and freeing functions by taking
advantage of knowing whether or not interrupts are disabled.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: free_pages_and_swap_cache opt
Hugh Dickins [Fri, 6 Jan 2006 08:10:55 +0000 (00:10 -0800)] 
[PATCH] mm: free_pages_and_swap_cache opt

Minor optimization (though it doesn't help in the PREEMPT case, severely
constrained by small ZAP_BLOCK_SIZE).  free_pages_and_swap_cache works in
chunks of 16, calling release_pages which works in chunks of PAGEVEC_SIZE.
But PAGEVEC_SIZE was dropped from 16 to 14 in 2.6.10, so we're now doing more
spin_lock_irq'ing than necessary: use PAGEVEC_SIZE throughout.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] sparsemem: provide pfn_to_nid
Andy Whitcroft [Fri, 6 Jan 2006 08:10:54 +0000 (00:10 -0800)] 
[PATCH] sparsemem: provide pfn_to_nid

Before SPARSEMEM is initialised we cannot provide an efficient pfn_to_nid()
implmentation; before initialisation is complete we use early_pfn_to_nid()
to provide location information.  Until recently there was no non-init user
of this functionality.  Provide a post init pfn_to_nid() implementation.

Note that this implmentation assumes that the pfn passed has been validated
with pfn_valid().  The current single user of this function already has
this check.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] flatmem split out memory model
Andy Whitcroft [Fri, 6 Jan 2006 08:10:53 +0000 (00:10 -0800)] 
[PATCH] flatmem split out memory model

There are three places we define pfn_to_nid().  Two in linux/mmzone.h and one
in asm/mmzone.h.  These in essence represent the three memory models.  The
definition in linux/mmzone.h under !NEED_MULTIPLE_NODES is both the FLATMEM
definition and the optimisation for single NUMA nodes; the one under SPARSEMEM
is the NUMA sparsemem one; the one in asm/mmzone.h under DISCONTIGMEM is the
discontigmem one.  This is not in the least bit obvious, particularly the
connection between the non-NUMA optimisations and the memory models.

Two patches:

flatmem-split-out-memory-model: simplifies the selection of pfn_to_nid()
implementations.  The selection is based primarily off the memory model
selected.  Optimisations for non-NUMA are applied where needed.

sparse-provide-pfn_to_nid: implement pfn_to_nid() for SPARSEMEM

This patch:

pfn_to_nid is memory model specific

The pfn_to_nid() call is memory model specific.  It represents the locality
identifier for the memory passed.  Classically this would be a NUMA node,
but not a chunk of memory under DISCONTIGMEM.

The SPARSEMEM and FLATMEM memory model non-NUMA versions of pfn_to_nid()
are folded together under NEED_MULTIPLE_NODES, while DISCONTIGMEM has its
own optimisation.  This is all very confusing.

This patch splits out each implementation of pfn_to_nid() so that we can
see them and the optimisations to each.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Shut up warnings in ipc/shm.c
Russell King [Fri, 6 Jan 2006 08:10:52 +0000 (00:10 -0800)] 
[PATCH] Shut up warnings in ipc/shm.c

Fix two warnings in ipc/shm.c

ipc/shm.c:122: warning: statement with no effect
ipc/shm.c:560: warning: statement with no effect

by converting the macros to empty inline functions.  For safety, let's do
all three.  This also has the advantage that typechecking gets performed
even without CONFIG_SHMEM enabled.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES
Mike Kravetz [Fri, 6 Jan 2006 08:10:51 +0000 (00:10 -0800)] 
[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES

The NODES_SPAN_OTHER_NODES config option was created so that DISCONTIGMEM
could handle pSeries numa layouts.  However, support for DISCONTIGMEM has
been replaced by SPARSEMEM on powerpc.  As a result, this config option and
supporting code is no longer needed.

I have already sent a patch to Paul that removes the option from powerpc
specific code.  This removes the arch independent piece.  Doesn't really
matter which is applied first.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: pfn_to_pgdat not used in common code
Andy Whitcroft [Fri, 6 Jan 2006 08:10:50 +0000 (00:10 -0800)] 
[PATCH] mm: pfn_to_pgdat not used in common code

pfn_to_pgdat() isn't used in common code.  Remove definition.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: kvaddr_to_nid not used in common code
Andy Whitcroft [Fri, 6 Jan 2006 08:10:50 +0000 (00:10 -0800)] 
[PATCH] mm: kvaddr_to_nid not used in common code

kvaddr_to_nid() isn't used in common code nor in i386 code.  Remove these
definitions.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] hugepages: fold find_or_alloc_pages into huge_no_page()
Christoph Lameter [Fri, 6 Jan 2006 08:10:49 +0000 (00:10 -0800)] 
[PATCH] hugepages: fold find_or_alloc_pages into huge_no_page()

The number of parameters for find_or_alloc_page increases significantly after
policy support is added to huge pages.  Simplify the code by folding
find_or_alloc_huge_page() into hugetlb_no_page().

Adam Litke objected to this piece in an earlier patch but I think this is a
good simplification.  Diffstat shows that we can get rid of almost half of the
lines of find_or_alloc_page().  If we can find no consensus then lets simply
drop this patch.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Remove old node based policy interface from mempolicy.c
Christoph Lameter [Fri, 6 Jan 2006 08:10:47 +0000 (00:10 -0800)] 
[PATCH] Remove old node based policy interface from mempolicy.c

mempolicy.c contains provisional interface for huge page allocation based on
node numbers.  This is in use in SLES9 but was never used (AFAIK) in upstream
versions of Linux.

Huge page allocations now use zonelists to figure out where to allocate pages.
 The use of zonelists allows us to find the closest hugepage which was the
consideration of the NUMA distance for huge page allocations.

Remove the obsolete functions.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Add NUMA policy support for huge pages.
Christoph Lameter [Fri, 6 Jan 2006 08:10:46 +0000 (00:10 -0800)] 
[PATCH] Add NUMA policy support for huge pages.

The huge_zonelist() function in the memory policy layer provides an list of
zones ordered by NUMA distance.  The hugetlb layer will walk that list looking
for a zone that has available huge pages but is also in the nodeset of the
current cpuset.

This patch does not contain the folding of find_or_alloc_huge_page() that was
controversial in the earlier discussion.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: dequeue a huge page near to this node
Christoph Lameter [Fri, 6 Jan 2006 08:10:45 +0000 (00:10 -0800)] 
[PATCH] mm: dequeue a huge page near to this node

This was discussed at
http://marc.theaimsgroup.com/?l=linux-kernel&m=113166526217117&w=2

This patch changes the dequeueing to select a huge page near the node
executing instead of always beginning to check for free nodes from node 0.
This will result in a placement of the huge pages near the executing
processor improving performance.

The existing implementation can place the huge pages far away from the
executing processor causing significant degradation of performance.  The
search starting from zero also means that the lower zones quickly run out
of memory.  Selecting a huge page near the process distributed the huge
pages better.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Hugetlb: Copy on Write support
David Gibson [Fri, 6 Jan 2006 08:10:44 +0000 (00:10 -0800)] 
[PATCH] Hugetlb: Copy on Write support

Implement copy-on-write support for hugetlb mappings so MAP_PRIVATE can be
supported.  This helps us to safely use hugetlb pages in many more
applications.  The patch makes the following changes.  If needed, I also have
it broken out according to the following paragraphs.

1. Add a pair of functions to set/clear write access on huge ptes.  The
   writable check in make_huge_pte is moved out to the caller for use by COW
   later.

2. Hugetlb copy-on-write requires special case handling in the following
   situations:

   - copy_hugetlb_page_range() - Copied pages must be write protected so
     a COW fault will be triggered (if necessary) if those pages are written
     to.

   - find_or_alloc_huge_page() - Only MAP_SHARED pages are added to the
     page cache.  MAP_PRIVATE pages still need to be locked however.

3. Provide hugetlb_cow() and calls from hugetlb_fault() and
   hugetlb_no_page() which handles the COW fault by making the actual copy.

4. Remove the check in hugetlbfs_file_map() so that MAP_PRIVATE mmaps
   will be allowed.  Make MAP_HUGETLB exempt from the depricated VM_RESERVED
   mapping check.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Hugetlb: Reorganize hugetlb_fault to prepare for COW
Adam Litke [Fri, 6 Jan 2006 08:10:43 +0000 (00:10 -0800)] 
[PATCH] Hugetlb: Reorganize hugetlb_fault to prepare for COW

This patch splits the "no_page()" type activity into its own function,
hugetlb_no_page().  hugetlb_fault() becomes the entry point for hugetlb faults
and delegates to the appropriate handler depending on the type of fault.
Right now we still have only hugetlb_no_page() but a later patch introduces a
COW fault.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Hugetlb: Rename find_lock_page to find_or_alloc_huge_page
Adam Litke [Fri, 6 Jan 2006 08:10:42 +0000 (00:10 -0800)] 
[PATCH] Hugetlb: Rename find_lock_page to find_or_alloc_huge_page

find_lock_huge_page() isn't a great name, since it does extra things not
analagous to find_lock_page().  Rename it find_or_alloc_huge_page() which is
closer to the mark.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Hugetlb: Remove duplicate i_size check
Adam Litke [Fri, 6 Jan 2006 08:10:40 +0000 (00:10 -0800)] 
[PATCH] Hugetlb: Remove duplicate i_size check

cleanup

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store
Badari Pulavarty [Fri, 6 Jan 2006 08:10:38 +0000 (00:10 -0800)] 
[PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store

Here is the patch to implement madvise(MADV_REMOVE) - which frees up a
given range of pages & its associated backing store.  Current
implementation supports only shmfs/tmpfs and other filesystems return
-ENOSYS.

"Some app allocates large tmpfs files, then when some task quits and some
client disconnect, some memory can be released.  However the only way to
release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli

Databases want to use this feature to drop a section of their bufferpool
(shared memory segments) - without writing back to disk/swap space.

This feature is also useful for supporting hot-plug memory on UML.

Concerns raised by Andrew Morton:

- "We have no plan for holepunching!  If we _do_ have such a plan (or
  might in the future) then what would the API look like?  I think
  sys_holepunch(fd, start, len), so we should start out with that."

- Using madvise is very weird, because people will ask "why do I need to
  mmap my file before I can stick a hole in it?"

- None of the other madvise operations call into the filesystem in this
  manner.  A broad question is: is this capability an MM operation or a
  filesytem operation?  truncate, for example, is a filesystem operation
  which sometimes has MM side-effects.  madvise is an mm operation and with
  this patch, it gains FS side-effects, only they're really, really
  significant ones."

Comments:

- Andrea suggested the fs operation too but then it's more efficient to
  have it as a mm operation with fs side effects, because they don't
  immediatly know fd and physical offset of the range.  It's possible to
  fixup in userland and to use the fs operation but it's more expensive,
  the vmas are already in the kernel and we can use them.

Short term plan &  Future Direction:

- We seem to need this interface only for shmfs/tmpfs files in the short
  term.  We have to add hooks into the filesystem for correctness and
  completeness.  This is what this patch does.

- In the future, plan is to support both fs and mmap apis also.  This
  also involves (other) filesystem specific functions to be implemented.

- Current patch doesn't support VM_NONLINEAR - which can be addressed in
  the future.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] reiser4: vfs: add truncate_inode_pages_range()
Hans Reiser [Fri, 6 Jan 2006 08:10:36 +0000 (00:10 -0800)] 
[PATCH] reiser4: vfs: add truncate_inode_pages_range()

This patch makes truncate_inode_pages_range from truncate_inode_pages.
truncate_inode_pages became a one-liner call to truncate_inode_pages_range.

Reiser4 needs truncate_inode_pages_ranges because it tries to keep
correspondence between existences of metadata pointing to data pages and pages
to which those metadata point to.  So, when metadata of certain part of file
is removed from filesystem tree, only pages of corresponding range are to be
truncated.

(Needed by the madvise(MADV_REMOVE) patch)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] memhotplug: register_memory should be global
Andy Whitcroft [Fri, 6 Jan 2006 08:10:35 +0000 (00:10 -0800)] 
[PATCH] memhotplug: register_memory should be global

register_memory is global and declared so in linux/memory.h.  Update the
HOTPLUG specific definition to match.  This fixes a compile warning when
HOTPLUG is enabled.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] memhotplug: register_ and unregister_memory_notifier should be global
Andy Whitcroft [Fri, 6 Jan 2006 08:10:35 +0000 (00:10 -0800)] 
[PATCH] memhotplug: register_ and unregister_memory_notifier should be global

Both register_memory_notifer and unregister_memory_notifier are global and
declared so in linux/memory.h.  Update the HOTPLUG specific definitions to
match.  This fixes a compile warning when HOTPLUG is enabled.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] memhotplug: __add_section remove unused pgdat definition
Andy Whitcroft [Fri, 6 Jan 2006 08:10:33 +0000 (00:10 -0800)] 
[PATCH] memhotplug: __add_section remove unused pgdat definition

__add_section defines an unused pointer to the zones pgdat.  Remove this
definition.  This fixes a compile warning.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] mm: fix __alloc_pages cpuset ALLOC_* flags
Paul Jackson [Fri, 6 Jan 2006 08:10:32 +0000 (00:10 -0800)] 
[PATCH] mm: fix __alloc_pages cpuset ALLOC_* flags

Two changes to the setting of the ALLOC_CPUSET flag in
mm/page_alloc.c:__alloc_pages()

- A bug fix - the "ignoring mins" case should not be honoring ALLOC_CPUSET.
  This case of all cases, since it is handling a request that will free up
  more memory than is asked for (exiting tasks, e.g.) should be allowed to
  escape cpuset constraints when memory is tight.

- A logic change to make it simpler.  Honor cpusets even on GFP_ATOMIC
  (!wait) requests.  With this, cpuset confinement applies to all requests
  except ALLOC_NO_WATERMARKS, so that in a subsequent cleanup patch, I can
  remove the ALLOC_CPUSET flag entirely.  Since I don't know any real reason
  this logic has to be either way, I am choosing the path of the simplest
  code.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] swsusp: resume_store() retval fix
Andrew Morton [Fri, 6 Jan 2006 08:09:50 +0000 (00:09 -0800)] 
[PATCH] swsusp: resume_store() retval fix

- This function returns -EINVAL all the time.  Fix.

- Decruftify it a bit too.

- Writing to it doesn't seem to do what it's suppoed to do.

Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] alpha: dma_map_page() fix
Andrew Morton [Fri, 6 Jan 2006 08:09:50 +0000 (00:09 -0800)] 
[PATCH] alpha: dma_map_page() fix

Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.
NeilBrown [Fri, 6 Jan 2006 08:09:49 +0000 (00:09 -0800)] 
[PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.

The hash.h hash_long function, when used on a 64 bit machine, ignores many
of the middle-order bits.  (The prime chosen it too bit-sparse).

IP addresses for clients of an NFS server are very likely to differ only in
the low-order bits.  As addresses are stored in network-byte-order, these
bits become middle-order bits in a little-endian 64bit 'long', and so do
not contribute to the hash.  Thus you can have the situation where all
clients appear on one hash chain.

So, until hash_long is fixed (or maybe forever), us a hash function that
works well on IP addresses - xor the bytes together.

Thanks to "Iozone" <capps@iozone.org> for identifying this problem.

Cc: "Iozone" <capps@iozone.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] nbd: fix TX/RX race condition
Herbert Xu [Fri, 6 Jan 2006 08:09:47 +0000 (00:09 -0800)] 
[PATCH] nbd: fix TX/RX race condition

Janos Haar of First NetCenter Bt.  reported numerous crashes involving the
NBD driver.  With his help, this was tracked down to bogus bio vectors
which in turn was the result of a race condition between the
receive/transmit routines in the NBD driver.

The bug manifests itself like this:

CPU0 CPU1
do_nbd_request
add req to queuelist
nbd_send_request
send req head
for each bio
kmap
send
nbd_read_stat
nbd_find_request
nbd_end_request
kunmap

When CPU1 finishes nbd_end_request, the request and all its associated
bio's are freed.  So when CPU0 calls kunmap whose argument is derived from
the last bio, it may crash.

Under normal circumstances, the race occurs only on the last bio.  However,
if an error is encountered on the remote NBD server (such as an incorrect
magic number in the request), or if there were a bug in the server, it is
possible for the nbd_end_request to occur any time after the request's
addition to the queuelist.

The following patch fixes this problem by making sure that requests are not
added to the queuelist until after they have been completed transmission.

In order for the receiving side to be ready for responses involving
requests still being transmitted, the patch introduces the concept of the
active request.

When a response matches the current active request, its processing is
delayed until after the tranmission has come to a stop.

This has been tested by Janos and it has been successful in curing this
race condition.

From: Herbert Xu <herbert@gondor.apana.org.au>

  Here is an updated patch which removes the active_req wait in
  nbd_clear_queue and the associated memory barrier.

  I've also clarified this in the comment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: <djani22@dynamicweb.hu>
Cc: Paul Clements <Paul.Clements@SteelEye.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] hfsplus oops fix
Joshua Kwan [Fri, 6 Jan 2006 08:09:45 +0000 (00:09 -0800)] 
[PATCH] hfsplus oops fix

nls_utf8 is available, and the check in hfsplus_fill_super checks the wrong
pointer for NULLness (it checks the saved nls, not the new one that it
needs to use.)

Signed-off-by: Joshua Kwan <joshk@triplehelix.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] i386: PTRACE_POKEUSR: allow changing RF bit in EFLAGS register.
Chuck Ebbert [Fri, 6 Jan 2006 04:11:29 +0000 (23:11 -0500)] 
[PATCH] i386: PTRACE_POKEUSR: allow changing RF bit in EFLAGS register.

Setting RF (resume flag) allows a debugger to resume execution after a
code breakpoint without tripping the breakpoint again.  It is reset by
the CPU after execution of one instruction.

Requested by Stephane Eranian:
  "I am trying to the user HW debug registers on i386 and I am running
   into a problem with ptrace() not allowing access to EFLAGS_RF for
   POKEUSER (see FLAG_MASK).  [ ...  ] It avoids the need to remove the
   breakpoint, single step, and reinstall.  The equivalent functionality
   exists on IA-64 and is allowed by ptrace()"

Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years agoMerge http://oss.oracle.com/git/ocfs2
Linus Torvalds [Fri, 6 Jan 2006 04:43:11 +0000 (20:43 -0800)] 
Merge http://oss.oracle.com/git/ocfs2

19 years agoUpdate MAINTAINERS - Jody is no longer at Steamballoon.
Jody McIntyre [Fri, 6 Jan 2006 04:04:08 +0000 (23:04 -0500)] 
Update MAINTAINERS - Jody is no longer at Steamballoon.

19 years agoMerge with http://kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Jody McIntyre [Fri, 6 Jan 2006 03:22:50 +0000 (22:22 -0500)] 
Merge ... /linux/kernel/git/torvalds/linux-2.6.git

19 years ago[NET]: Change 1500 to ETH_DATA_LEN in some files
Kris Katterjohn [Fri, 6 Jan 2006 00:35:42 +0000 (16:35 -0800)] 
[NET]: Change 1500 to ETH_DATA_LEN in some files

These patches add the header linux/if_ether.h and change 1500 to
ETH_DATA_LEN in some files.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[IPVS]: Another file needs linux/interrupt.h
Andrew Morton [Thu, 5 Jan 2006 22:57:36 +0000 (14:57 -0800)] 
[IPVS]: Another file needs linux/interrupt.h

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6
Linus Torvalds [Thu, 5 Jan 2006 23:55:49 +0000 (15:55 -0800)] 
Merge git://git./linux/kernel/git/brodo/pcmcia-2.6

19 years ago[PATCH] pcmcia: add some IDs for ide-cs and dtl1_cs
Richard Purdie [Thu, 5 Jan 2006 09:56:03 +0000 (09:56 +0000)] 
[PATCH] pcmcia: add some IDs for ide-cs and dtl1_cs

Add some PCMCIA device IDs for the microdrive found in the Sharp Zaurus
and a different revision of the Socket CF+ Bluetooth card.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: cleanup cs.c, reduce size
Daniel Ritz [Fri, 30 Dec 2005 14:12:35 +0000 (15:12 +0100)] 
[PATCH] pcmcia: cleanup cs.c, reduce size

kill the socket_shutdown()/shutdown_socket() confusion by making it
one single function. move cs_socket_put() in there. nicer to read and
smaller:

original:
   text    data     bss     dec     hex filename
  25181    1076      32   26289    66b1 drivers/pcmcia/pcmcia_core.ko

patched:
   text    data     bss     dec     hex filename
  24973    1076      32   26081    65e1 drivers/pcmcia/pcmcia_core.ko

Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] drivers/pcmcia/cistpl.c: fix endian warnings
Alexey Dobriyan [Fri, 23 Dec 2005 20:51:08 +0000 (23:51 +0300)] 
[PATCH] drivers/pcmcia/cistpl.c: fix endian warnings

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: kzalloc conversion
Dominik Brodowski [Sun, 11 Dec 2005 20:18:26 +0000 (21:18 +0100)] 
[PATCH] pcmcia: kzalloc conversion

Convert users of kmalloc and memset to kzalloc

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: no probing of ioports on PARISC
Dominik Brodowski [Sun, 11 Dec 2005 19:47:44 +0000 (20:47 +0100)] 
[PATCH] pcmcia: no probing of ioports on PARISC

Do not wildly probe the IO ports we're trying to use on PARISC.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: export stored values in sysfs
Dominik Brodowski [Thu, 8 Dec 2005 22:50:36 +0000 (23:50 +0100)] 
[PATCH] pcmcia: export stored values in sysfs

Export the stored values instead of re-reading everything in the socket
information sysfs files, and make them accessible to all users, not only
to root.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] 8xx PCMCIA: support for MPC885ADS and MPC866ADS
Vitaly Bordug [Thu, 8 Dec 2005 15:56:12 +0000 (13:56 -0200)] 
[PATCH] 8xx PCMCIA: support for MPC885ADS and MPC866ADS

This adds PCMCIA support for both MPC885ADS and MPC866ADS.

This is established not together with FADS, because 885 does not have
io_block_mapping() for BCSR area.
Also, some cleanups done both for 885ADS and MBX.

Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com>
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] m8xx_pcmcia: support MAP_AUTOSZ required for CF cards
Vitaly Bordug [Thu, 8 Dec 2005 15:53:20 +0000 (13:53 -0200)] 
[PATCH] m8xx_pcmcia: support MAP_AUTOSZ required for CF cards

This fixes misconfiguration that could result in odd work of some old CF
cards.

Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com>
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: properly handle static mem, but dynamic io sockets
Dominik Brodowski [Wed, 7 Dec 2005 11:32:20 +0000 (12:32 +0100)] 
[PATCH] pcmcia: properly handle static mem, but dynamic io sockets

Some PCMCIA sockets have statically mapped memory windows, but dynamically
mapped IO windows. Using the "nonstatic" socket library is inpractical for
them, as they do neither need a resource database (as we can trust the
kernel resource database on m68k and ppc) nor lots of other features of that
library. Let them get a small "iodyn" socket library (105 lines of code)
instead.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: fix sound drivers
Dominik Brodowski [Thu, 5 Jan 2006 23:27:16 +0000 (00:27 +0100)] 
[PATCH] pcmcia: fix sound drivers

Update the PCMCIA sound drivers to handle the recent changes to the PCMCIA
core. A part of this merge was done by Takashi Iwai <tiwai@suse.de>.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: unify attach, EVENT_CARD_INSERTION handlers into one probe callback
Dominik Brodowski [Mon, 14 Nov 2005 20:25:51 +0000 (21:25 +0100)] 
[PATCH] pcmcia: unify attach, EVENT_CARD_INSERTION handlers into one probe callback

Unify the EVENT_CARD_INSERTION and "attach" callbacks to one unified
probe() callback. As all in-kernel drivers are changed to this new
callback, there will be no temporary backwards-compatibility. Inside a
probe() function, each driver _must_ set struct pcmcia_device
*p_dev->instance and instance->handle correctly.

With these patches, the basic driver interface for 16-bit PCMCIA drivers
now has the classic four callbacks known also from other buses:

        int (*probe)            (struct pcmcia_device *dev);
        void (*remove)          (struct pcmcia_device *dev);

        int (*suspend)          (struct pcmcia_device *dev);
        int (*resume)           (struct pcmcia_device *dev);

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: remove dev_list from drivers
Dominik Brodowski [Mon, 14 Nov 2005 20:25:35 +0000 (21:25 +0100)] 
[PATCH] pcmcia: remove dev_list from drivers

The linked list of devices managed by each PCMCIA driver is, in very most
cases, unused. Therefore, remove it from many drivers.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: remove old detach mechanism
Dominik Brodowski [Mon, 14 Nov 2005 20:25:23 +0000 (21:25 +0100)] 
[PATCH] pcmcia: remove old detach mechanism

Remove the old "detach" mechanism as it is unused now.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: unify detach, REMOVAL_EVENT handlers into one remove callback
Dominik Brodowski [Mon, 14 Nov 2005 20:23:14 +0000 (21:23 +0100)] 
[PATCH] pcmcia: unify detach, REMOVAL_EVENT handlers into one remove callback

Unify the "detach" and REMOVAL_EVENT handlers to one "remove" function.
Old functionality is preserved, for the moment.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: merge suspend into device model
Dominik Brodowski [Thu, 5 Jan 2006 23:02:03 +0000 (00:02 +0100)] 
[PATCH] pcmcia: merge suspend into device model

Merge the suspend and resume methods for 16-bit PCMCIA cards into the
device model -- for both runtime power management and suspend to ram/disk.

Bugfix in ds.c by Richard Purdie
Signed-Off-By: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
19 years ago[PATCH] pcmcia: new suspend core
Dominik Brodowski [Mon, 14 Nov 2005 20:21:18 +0000 (21:21 +0100)] 
[PATCH] pcmcia: new suspend core

Move the suspend and resume methods out of the event handler, and into
special functions. Also use these functions for pre- and post-reset, as
almost all drivers already do, and the remaining ones can easily be
converted.

Bugfix to include/pcmcia/ds.c
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>