linux-2.6
17 years ago xen: add the Xenbus sysfs and virtual device hotplug driver
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: add the Xenbus sysfs and virtual device hotplug driver

This communicates with the machine control software via a registry
residing in a controlling virtual machine. This allows dynamic
creation, destruction and modification of virtual device
configurations (network devices, block devices and CPUs, to name some
examples).

[ Greg, would you mind giving this a review?  Thanks -J ]

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Greg KH <greg@kroah.com>
17 years ago xen: Add grant table support
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: Add grant table support

Add Xen 'grant table' driver which allows granting of access to
selected local memory pages by other virtual machines and,
symmetrically, the mapping of remote memory pages which other virtual
machines have granted access to.

This driver is a prerequisite for many of the Xen virtual device
drivers, which grant the 'device driver domain' restricted and
temporary access to only those memory pages that are currently
involved in I/O operations.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: use the hvc console infrastructure for Xen console
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: use the hvc console infrastructure for Xen console

Implement a Xen back-end for hvc console.

* * *
Add early printk support via hvc console, enable using
"earlyprintk=xen" on the kernel command line.

From: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Olof Johansson <olof@lixom.net>
17 years ago xen: hack to prevent bad segment register reload
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: hack to prevent bad segment register reload

The hypervisor saves and restores the segment registers as part of the
state it saves while context switching.  If, during a context switch,
the next process doesn't use the TLS segments, it invalidates the GDT
entry, causing the segment register reload to fault.  This fault
effectively doubles the cost of a context switch.

This patch is a band-aid workaround which clears the usermode %gs
after it has been saved for the previous process, but before it gets
reloaded for the next, and it avoids having the hypervisor attempt to
erroneously reload it.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: lazy-mmu operations
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: lazy-mmu operations

This patch uses the lazy-mmu hooks to batch mmu operations where
possible.  This is primarily useful for batching operations applied to
active pagetables, which happens during mprotect, munmap, mremap and
the like (mmap does not do bulk pagetable operations, so it isn't
helped).

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: Add support for preemption
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: Add support for preemption

Add Xen support for preemption.  This is mostly a cleanup of existing
preempt_enable/disable calls, or just comments to explain the current
usage.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: SMP guest support
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:06 +0000 (18:37 -0700)] 
xen: SMP guest support

This is a fairly straightforward Xen implementation of smp_ops.

Xen has its own IPI mechanisms, and has no dependency on any
APIC-based IPI.  The smp_ops hooks and the flush_tlb_others pv_op
allow a Xen guest to avoid all APIC code in arch/i386 (the only apic
operation is a single apic_read for the apic version number).

One subtle point which needs to be addressed is unpinning pagetables
when another cpu may have a lazy tlb reference to the pagetable. Xen
will not allow an in-use pagetable to be unpinned, so we must find any
other cpus with a reference to the pagetable and get them to shoot
down their references.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andi Kleen <ak@suse.de>
17 years ago xen: Implement sched_clock
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: Implement sched_clock

Implement xen_sched_clock, which returns the number of ns the current
vcpu has been actually in an unstolen state (ie, running or blocked,
vs runnable-but-not-running, or offline) since boot.
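
As a rough illustration of the split described above, the sketch below
sums per-state nanosecond counters into "unstolen" time and its
complement.  The struct, the function names and the state numbering are
assumptions made for this example; it is not the code from the patch.

```c
#include <stdint.h>

/* Runstate indices; the numbering is an assumption of this sketch. */
enum {
	RUNSTATE_RUNNING  = 0,	/* vcpu is executing on a physical cpu */
	RUNSTATE_RUNNABLE = 1,	/* wants to run, but no physical cpu is free */
	RUNSTATE_BLOCKED  = 2,	/* idle by its own choice */
	RUNSTATE_OFFLINE  = 3,	/* not available to run at all */
	RUNSTATE_COUNT
};

/* Hypothetical snapshot: ns spent in each state since boot. */
struct runstate_sketch {
	uint64_t time[RUNSTATE_COUNT];
};

/* Unstolen time: running plus blocked. */
uint64_t sched_clock_sketch(const struct runstate_sketch *rs)
{
	return rs->time[RUNSTATE_RUNNING] + rs->time[RUNSTATE_BLOCKED];
}

/* The complement: runnable-but-not-running plus offline. */
uint64_t not_unstolen_ns_sketch(const struct runstate_sketch *rs)
{
	return rs->time[RUNSTATE_RUNNABLE] + rs->time[RUNSTATE_OFFLINE];
}
```

The two sums partition the counters, so together they always add up to
the total time covered by the snapshot.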

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: john stultz <johnstul@us.ibm.com>
17 years ago xen: Account for stolen time
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: Account for stolen time

This patch accounts for the time stolen from our VCPUs.  Stolen time is
time where a vcpu is runnable and could be running, but all available
physical CPUs are being used for something else.

This accounting gets run on each timer interrupt, just as a way to get
it run relatively often, and when interesting things are going on.
Stolen time is not really used by much in the kernel; it is reported
in /proc/stat, and that's about it.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
17 years ago xen: ignore RW mapping of RO pages in pagetable_init
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: ignore RW mapping of RO pages in pagetable_init

When setting up the initial pagetable, which includes mappings of all
low physical memory, ignore a mapping which tries to set the RW bit on
an RO pte.  An RO pte indicates a page which is part of the current
pagetable, and so it cannot be allowed to become RW.

Once xen_pagetable_setup_done is called, set_pte reverts to its normal
behaviour.
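
The filtering rule can be sketched in a few lines of userspace C.  The
bit values and the function name below are assumptions chosen for the
example; the point is only that an existing present, read-only entry
strips the RW bit from whatever replaces it, while other bits pass
through.

```c
#include <stdint.h>

/* Flag bits as used in this sketch (assumed x86-style layout). */
#define PTE_PRESENT_SKETCH	UINT64_C(0x1)
#define PTE_RW_SKETCH		UINT64_C(0x2)

/*
 * Return the value that would actually be written during initial
 * pagetable setup: if the slot already holds a present, read-only
 * entry, the new entry is not allowed to gain RW.
 */
uint64_t filter_init_pte_sketch(uint64_t old_pte, uint64_t new_pte)
{
	if ((old_pte & PTE_PRESENT_SKETCH) && !(old_pte & PTE_RW_SKETCH))
		new_pte &= ~PTE_RW_SKETCH;
	return new_pte;
}
```

An empty slot, or one that is already RW, leaves the new entry
untouched.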

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: ebiederm@xmission.com (Eric W. Biederman)
17 years ago xen: Complete pagetable pinning
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: Complete pagetable pinning

Xen requires all active pagetables to be marked read-only.  When the
base of the pagetable is loaded into %cr3, the hypervisor validates
the entire pagetable and only allows the load to proceed if it all
checks out.

This is pretty slow, so to mitigate this cost Xen has a notion of
pinned pagetables.  Pinned pagetables are pagetables which are
considered to be active even if no processor's cr3 is pointing to it.
This means that it must remain read-only and all updates are validated
by the hypervisor.  This makes context switches much cheaper, because
the hypervisor doesn't need to revalidate the pagetable each time.

This also adds a new paravirt hook which is called during setup once
the zones and memory allocator have been initialized.  When the
init_mm pagetable is first built, the struct page array does not yet
exist, and so there's nowhere to put the init_mm pagetable's PG_pinned
flags.  Once the zones are initialized and the struct page array
exists, we can set the PG_pinned flags for those pages.

This patch also adds the Xen support for pte pages allocated out of
highmem (highpte) by implementing xen_kmap_atomic_pte.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Zach Amsden <zach@vmware.com>
17 years ago xen: add pinned page flag
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: add pinned page flag

Add a new definition for PG_owner_priv_1 to define PG_pinned on Xen
pagetable pages.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: configuration
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: configuration

Put config options for Xen after the core pieces are in place.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: time implementation
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: time implementation

Xen maintains a base clock which measures nanoseconds since system
boot.  This is provided to guests via a shared page which contains a
base time in ns, a tsc timestamp at that point and tsc frequency
parameters.  Guests can compute the current time by reading the tsc
and using it to extrapolate the current time from the basetime.  The
hypervisor makes sure that the frequency parameters are updated
regularly, particularly if the tsc changes rate or stops.
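
The extrapolation can be sketched as follows.  This is a minimal
userspace model, assuming the frequency parameters are a 32-bit
fixed-point multiplier plus a power-of-two shift; the struct and
function names are hypothetical, and the locking or versioning a real
reader needs to get a consistent snapshot is left out.

```c
#include <stdint.h>

/* Hypothetical mirror of the shared time fields described above. */
struct time_info_sketch {
	uint64_t tsc_timestamp;		/* tsc value when system_time was sampled */
	uint64_t system_time;		/* ns since boot at tsc_timestamp */
	uint32_t tsc_to_system_mul;	/* ns per (scaled) tick, as a 32-bit fraction */
	int8_t   tsc_shift;		/* power-of-two scale applied to the tsc delta */
};

/* Extrapolate the current time in ns from a tsc reading. */
uint64_t xen_time_ns_sketch(const struct time_info_sketch *t, uint64_t tsc_now)
{
	uint64_t delta = tsc_now - t->tsc_timestamp;
	uint64_t hi, lo;

	if (t->tsc_shift < 0)
		delta >>= -t->tsc_shift;
	else
		delta <<= t->tsc_shift;

	/* (delta * mul) >> 32, split in halves to stay within 64 bits */
	hi = delta >> 32;
	lo = delta & 0xffffffffu;
	return t->system_time + hi * t->tsc_to_system_mul +
	       ((lo * t->tsc_to_system_mul) >> 32);
}
```

With a multiplier of 0x80000000 (0.5 ns per tick, i.e. a 2GHz tsc) and
no shift, a delta of 2000 ticks adds 1000ns to the base time.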

This is implemented as a clocksource, so the interface to the rest of
the kernel is a simple clocksource which simply returns the current
time directly in nanoseconds.

Xen also provides a simple timer mechanism, which allows a timeout to
be set in the future.  When that time arrives, a timer event is sent
to the guest.  There are two timer interfaces:
 - An old one which also delivers a stream of (unused) ticks at 100Hz,
   and on the same event, the actual timer events.  The 100Hz ticks
   cause a lot of spurious wakeups, but are basically harmless.
 - The new timer interface doesn't have the 100Hz ticks, and can also
   fail if the specified time is in the past.

This code presents the Xen timer as a clockevent driver, and uses the
new interface by preference.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
17 years ago xen: event channels
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:05 +0000 (18:37 -0700)] 
xen: event channels

Xen implements interrupts in terms of event channels.  Each guest
domain gets 1024 event channels which can be used for a variety of
purposes, such as Xen timer events, inter-domain events,
inter-processor events (IPI) or for real hardware IRQs.

Within the kernel, we map the event channels to IRQs, and implement
the whole interrupt handling using a Xen irq_chip.

Rather than setting NR_IRQS to 1024 under PARAVIRT in order to
accommodate Xen, we create a dynamic mapping between event channels and
IRQs.  Ideally, Linux will eventually move towards dynamically
allocating per-irq structures, and we can use a 1:1 mapping between
event channels and irqs.
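
A dynamic mapping of this kind might look like the sketch below: two
small tables, with an IRQ handed out the first time an event channel is
bound.  The table sizes, names and first-free allocation policy are
assumptions for illustration, not the patch's actual data structures.

```c
#define NR_EVENT_CHANNELS	1024	/* per-domain limit quoted above */
#define NR_IRQS_SKETCH		64	/* hypothetical, deliberately smaller */

int evtchn_to_irq[NR_EVENT_CHANNELS];	/* -1 means unbound */
int irq_to_evtchn[NR_IRQS_SKETCH];	/* -1 means free */

/* Mark every event channel unbound and every irq free. */
void evtchn_map_init(void)
{
	int i;

	for (i = 0; i < NR_EVENT_CHANNELS; i++)
		evtchn_to_irq[i] = -1;
	for (i = 0; i < NR_IRQS_SKETCH; i++)
		irq_to_evtchn[i] = -1;
}

/* Return the irq bound to evtchn, allocating the first free one if needed. */
int bind_evtchn_to_irq_sketch(int evtchn)
{
	int irq;

	if (evtchn < 0 || evtchn >= NR_EVENT_CHANNELS)
		return -1;
	if (evtchn_to_irq[evtchn] != -1)
		return evtchn_to_irq[evtchn];

	for (irq = 0; irq < NR_IRQS_SKETCH; irq++) {
		if (irq_to_evtchn[irq] == -1) {
			irq_to_evtchn[irq] = evtchn;
			evtchn_to_irq[evtchn] = irq;
			return irq;
		}
	}
	return -1;	/* out of irqs */
}

/* Release an irq so it can be reused by another event channel. */
void unbind_irq_sketch(int irq)
{
	if (irq < 0 || irq >= NR_IRQS_SKETCH || irq_to_evtchn[irq] == -1)
		return;
	evtchn_to_irq[irq_to_evtchn[irq]] = -1;
	irq_to_evtchn[irq] = -1;
}
```

Only event channels that are actually bound consume an IRQ, which is why
the IRQ table can be much smaller than the event channel space.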

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric W. Biederman <ebiederm@xmission.com>
17 years ago xen: virtual mmu
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
xen: virtual mmu

Xen pagetable handling, including the machinery to implement direct
pagetables.

Xen presents the real CPU's pagetables directly to guests, with no
added shadowing or other layer of abstraction.  Naturally this means
the hypervisor must maintain close control over what the guest can put
into the pagetable.

When the guest modifies the pte/pmd/pgd, it must convert its
domain-specific notion of a "physical" pfn into a global machine frame
number (mfn) before inserting the entry into the pagetable.  Xen will
check to make sure the domain is allowed to create a mapping of the
given mfn.
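
The pfn-to-mfn step can be pictured with a simple lookup table.  In the
sketch below the table, the 4K page shift and the function names are all
assumptions for illustration; the hypervisor's validation of the
resulting mapping is not modelled.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH	12	/* assumed 4K pages */
#define PTE_FLAGS_MASK_SKETCH	((UINT64_C(1) << PAGE_SHIFT_SKETCH) - 1)

/* Translate a domain-local pfn to a machine frame via a lookup table. */
uint64_t pfn_to_mfn_sketch(const uint64_t *p2m, uint64_t pfn)
{
	return p2m[pfn];
}

/* Build a pte value from a pfn: the entry must carry the mfn, not the pfn. */
uint64_t make_pte_sketch(const uint64_t *p2m, uint64_t pfn, uint64_t flags)
{
	uint64_t mfn = pfn_to_mfn_sketch(p2m, pfn);

	return (mfn << PAGE_SHIFT_SKETCH) | (flags & PTE_FLAGS_MASK_SKETCH);
}
```

The guest's own bookkeeping stays in pfn terms; only the value written
into the pagetable is expressed in machine frames.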

Xen also requires that all mappings the guest has of its own active
pagetable are read-only.  This is relatively easy to implement in
Linux because all pagetables share the same pte pages for kernel
mappings, so updating the pte in one pagetable will implicitly update
the mapping in all pagetables.

Normally a pagetable becomes active when you point to it with cr3 (or
the Xen equivalent), but when you do so, Xen must check the whole
pagetable for correctness, which is clearly a performance problem.

Xen solves this with pinning which keeps a pagetable effectively
active even if it's currently unused, which means that all the normal
update rules are enforced.  This means that it need not revalidate the
pagetable when loading cr3.

This patch has a first-cut implementation of pinning, but it is more
fully implemented in a later patch.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago xen: Core Xen implementation
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
xen: Core Xen implementation

This patch is a rollup of all the core pieces of the Xen
implementation, including:
 - booting and setup
 - pagetable setup
 - privileged instructions
 - segmentation
 - interrupt flags
 - upcalls
 - multicall batching

BOOTING AND SETUP

The vmlinux image is decorated with ELF notes which tell the Xen
domain builder what the kernel's requirements are; the domain builder
then constructs the address space accordingly and starts the kernel.

Xen has its own entrypoint for the kernel (contained in an ELF note).
The ELF notes are set up by xen-head.S, which is included into head.S.
In principle it could be linked separately, but it seems to provoke
lots of binutils bugs.

Because the domain builder starts the kernel in a fairly sane state
(32-bit protected mode, paging enabled, flat segments set up), there's
not a lot of setup needed before starting the kernel proper.  The main
steps are:
  1. Install the Xen paravirt_ops, which is simply a matter of a
     structure assignment.
  2. Set init_mm to use the Xen-supplied pagetables (analogous to the
     head.S generated pagetables in a native boot).
  3. Reserve address space for Xen, since it takes a chunk at the top
     of the address space for its own use.
  4. Call start_kernel()

PAGETABLE SETUP

Once we hit the main kernel boot sequence, it will end up calling back
via paravirt_ops to set up various pieces of Xen specific state.  One
of the critical things which requires a bit of extra care is the
construction of the initial init_mm pagetable.  Because Xen places
tight constraints on pagetables (an active pagetable must always be
valid, and must always be mapped read-only to the guest domain), we
need to be careful when constructing the new pagetable to keep these
constraints in mind.  It turns out that the easiest way to do this is
to use the initial Xen-provided pagetable as a template, and then just
insert new mappings for memory where a mapping doesn't already exist.

This means that during pagetable setup, it uses a special version of
xen_set_pte which ignores any attempt to remap a read-only page as
read-write (since Xen will map its own initial pagetable as RO), but
lets other changes to the ptes happen, so that things like NX are set
properly.

PRIVILEGED INSTRUCTIONS AND SEGMENTATION

When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
This means that it is more privileged than user-mode in ring 3, but it
still can't run privileged instructions directly.  Non-performance
critical instructions are dealt with by taking a privilege exception
and trapping into the hypervisor and emulating the instruction, but
more performance-critical instructions have their own specific
paravirt_ops.  In many cases we can avoid having to do any hypercalls
for these instructions, or the Xen implementation is quite different
from the normal native version.

The privileged instructions fall into the broad classes of:
  Segmentation: setting up the GDT and the GDT entries, LDT,
     TLS and so on.  Xen doesn't allow the GDT to be directly
     modified; all GDT updates are done via hypercalls where the new
     entries can be validated.  This is important because Xen uses
     segment limits to prevent the guest kernel from damaging the
     hypervisor itself.
  Traps and exceptions: Xen uses a special format for trap entrypoints,
     so when the kernel wants to set an IDT entry, it needs to be
     converted to the form Xen expects.  Xen sets int 0x80 up specially
     so that the trap goes straight from userspace into the guest kernel
     without going via the hypervisor.  sysenter isn't supported.
  Kernel stack: The esp0 entry is extracted from the tss and provided to
     Xen.
  TLB operations: the various TLB calls are mapped into corresponding
     Xen hypercalls.
  Control registers: all the control registers are privileged.  The most
     important is cr3, which points to the base of the current pagetable,
     and we handle it specially.

Another instruction we treat specially is CPUID, even though it's not
privileged.  We want to control what CPU features are visible to the
rest of the kernel, and so CPUID ends up going into a paravirt_op.
Xen implements this mainly to disable the ACPI and APIC subsystems.

INTERRUPT FLAGS

Xen maintains its own separate flag for masking events, which is
contained within the per-cpu vcpu_info structure.  Because the guest
kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
ignored (and must be, because even if a guest domain disables
interrupts for itself, it can't disable them overall).

(A note on terminology: "events" and interrupts are effectively
synonymous.  However, rather than using an "enable flag", Xen uses a
"mask flag", which blocks event delivery when it is non-zero.)

There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
are implemented to manage the Xen event mask state.  The only thing
worth noting is that when events are unmasked, we need to explicitly
see if there's a pending event and call into the hypervisor to make
sure it gets delivered.
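
A minimal userspace model of the mask-flag handling is sketched below.
The struct layout, the function names and the counter standing in for
the hypercall are assumptions for illustration; a real implementation
also needs compiler barriers and care around preemption, which are
omitted here.

```c
#include <stdint.h>

#define X86_EFLAGS_IF_SKETCH	0x200UL	/* IF bit as seen by the rest of the kernel */

/* Hypothetical slice of the per-cpu vcpu_info structure. */
struct vcpu_info_sketch {
	uint8_t evtchn_upcall_pending;	/* an event arrived while masked */
	uint8_t evtchn_upcall_mask;	/* non-zero blocks event delivery */
};

int forced_callbacks;	/* counts stand-in "hypercalls" in this sketch */

/* Stand-in for calling into the hypervisor to deliver pending events. */
void force_evtchn_callback_sketch(void)
{
	forced_callbacks++;
}

/* cli equivalent: mask event delivery. */
void irq_disable_sketch(struct vcpu_info_sketch *v)
{
	v->evtchn_upcall_mask = 1;
}

/* sti equivalent: unmask, then explicitly check for a pending event. */
void irq_enable_sketch(struct vcpu_info_sketch *v)
{
	v->evtchn_upcall_mask = 0;
	if (v->evtchn_upcall_pending)
		force_evtchn_callback_sketch();
}

/* save_fl equivalent: translate the mask flag into an IF-style value. */
unsigned long save_fl_sketch(const struct vcpu_info_sketch *v)
{
	return v->evtchn_upcall_mask ? 0 : X86_EFLAGS_IF_SKETCH;
}

/* restore_fl equivalent: the inverse translation, with the same check. */
void restore_fl_sketch(struct vcpu_info_sketch *v, unsigned long flags)
{
	if (flags & X86_EFLAGS_IF_SKETCH)
		irq_enable_sketch(v);
	else
		irq_disable_sketch(v);
}
```

Note the inverted sense: the rest of the kernel thinks in terms of an
enable bit, while the shared structure holds a mask.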

UPCALLS

Xen needs a couple of upcall (or callback) functions to be implemented
by each guest.  One is the event upcall, which is how events
(interrupts, effectively) are delivered to the guests.  The other is
the failsafe callback, which is used to report errors in either
reloading a segment register, or caused by iret.  These are
implemented in i386/kernel/entry.S so they can jump into the normal
iret_exc path when necessary.

MULTICALL BATCHING

Xen provides a multicall mechanism, which allows multiple hypercalls
to be issued at once in order to mitigate the cost of trapping into
the hypervisor.  This is particularly useful for context switches,
since the 4-5 hypercalls they would normally need (reload cr3, update
TLS, maybe update LDT) can be reduced to one.  This patch implements a
generic batching mechanism for hypercalls, which gets used in many
places in the Xen code.
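
The batching idea can be sketched as a small queue that is flushed with
a single trap.  Everything below (buffer size, struct layout, function
names, and the counter standing in for the trap) is an assumption made
for illustration rather than the interface this patch adds.

```c
#include <stddef.h>
#include <string.h>

#define MC_BATCH_SKETCH	8	/* hypothetical batch size */

/* One queued hypercall: an operation number and its arguments. */
struct mc_entry_sketch {
	unsigned long op;
	unsigned long args[6];
};

struct mc_buffer_sketch {
	struct mc_entry_sketch entries[MC_BATCH_SKETCH];
	size_t n;		/* entries currently queued */
	unsigned long traps;	/* stand-in for trips into the hypervisor */
};

/* Start with an empty queue and no traps counted. */
void mc_init_sketch(struct mc_buffer_sketch *b)
{
	memset(b, 0, sizeof(*b));
}

/* Issue everything queued so far as one multicall. */
void mc_flush_sketch(struct mc_buffer_sketch *b)
{
	if (b->n == 0)
		return;
	b->traps++;	/* a real flush would trap into the hypervisor here */
	b->n = 0;
}

/* Queue one operation, flushing first if the buffer is full. */
void mc_queue_sketch(struct mc_buffer_sketch *b, unsigned long op,
		     unsigned long arg0)
{
	if (b->n == MC_BATCH_SKETCH)
		mc_flush_sketch(b);
	b->entries[b->n].op = op;
	b->entries[b->n].args[0] = arg0;
	b->n++;
}
```

Queueing the three context-switch operations and flushing once costs one
trap in this model instead of three.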

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ian Pratt <ian.pratt@xensource.com>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Adrian Bunk <bunk@stusta.de>
17 years ago xen: Add Xen interface header files
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
xen: Add Xen interface header files

Add Xen interface header files. These are taken fairly directly from
the Xen tree, but somewhat rearranged to suit the kernel's conventions.

Define macros and inline functions for doing hypercalls into the
hypervisor.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago Add nosegneg capability to the vsyscall page notes
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
Add nosegneg capability to the vsyscall page notes

Add the "nosegneg" fake capability to the vsyscall page notes. This is
used by the runtime linker to select a glibc version which then
disables negative-offset accesses to the thread-local segment via
%gs. These accesses require emulation in Xen (because segments are
truncated to protect the hypervisor address space) and avoiding them
provides a measurable performance boost.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Zachary Amsden <zach@vmware.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
17 years ago Add a sched_clock paravirt_op
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
Add a sched_clock paravirt_op

The tsc-based get_scheduled_cycles interface is not a good match for
Xen's runstate accounting, which reports everything in nanoseconds.

This patch replaces this interface with a sched_clock interface, which
matches both Xen and VMI's requirements.

In order to do this, we:
   1. replace get_scheduled_cycles with sched_clock
   2. hoist cycles_2_ns into a common header
   3. update vmi accordingly

One thing to note: because sched_clock is implemented as a weak
function in kernel/sched.c, we must define a real function in order to
override this weak binding.  This means the usual paravirt_ops
technique of using an inline function won't work in this case.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Cc: john stultz <johnstul@us.ibm.com>
17 years ago paravirt: helper to disable all IO space
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
paravirt: helper to disable all IO space

In a virtual environment, device drivers such as legacy IDE will waste
quite a lot of time probing for their devices which will never appear.
This helper function allows a paravirt implementation to lay claim to
the whole iomem and ioport space, thereby disabling all device drivers
trying to claim IO resources.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
17 years ago Allocate and free vmalloc areas
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
Allocate and free vmalloc areas

Allocate/release a chunk of vmalloc address space:
 alloc_vm_area reserves a chunk of address space, and makes sure all
 the pagetables are constructed for that address range - but no pages.

 free_vm_area releases the address space range.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: "Jan Beulich" <JBeulich@novell.com>
Cc: "Andi Kleen" <ak@muc.de>
17 years ago paravirt: export __supported_pte_mask
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:04 +0000 (18:37 -0700)] 
paravirt: export __supported_pte_mask

__supported_pte_mask is needed when constructing pte values.  Xen
device drivers need to do this to make mappings of foreign pages (ie,
pages granted to us by other domains).

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
17 years ago paravirt: make siblingmap functions visible
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: make siblingmap functions visible

Paravirt implementations need to set the sibling map on new cpus.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
17 years ago paravirt: unstatic smp_store_cpu_info
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: unstatic smp_store_cpu_info

Paravirt implementations need to store cpu info when bringing up cpus.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
17 years ago paravirt: unstatic leave_mm
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: unstatic leave_mm

Make leave_mm globally visible, specifically so that Xen can use it to
shoot-down lazy uses of cr3.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
17 years ago paravirt: increase IRQ limit
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: increase IRQ limit

When running with CONFIG_PARAVIRT, we may want lots of IRQs even if
there's no IO APIC.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
17 years ago paravirt: add a hook for once the allocator is ready
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: add a hook for once the allocator is ready

Add a hook so that the paravirt backend knows when the allocator is
ready.  This is useful for the obvious reason that the allocator is
available, but the other side-effect of having the bootmem allocator
available is that each page now has an associated "struct page".

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
17 years ago paravirt: add an "mm" argument to alloc_pt
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
paravirt: add an "mm" argument to alloc_pt

It's useful to know which mm is allocating a pagetable.  Xen uses this
to determine whether the pagetable being added to is pinned or not.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
17 years ago use elfnote.h to generate vsyscall notes.
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
use elfnote.h to generate vsyscall notes.

Use existing elfnote.h to generate vsyscall notes, rather than doing
it locally.  Changes elfnote.h a bit to suit, since this is the first
asm user, and it wasn't quite right.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.com>
17 years ago usermodehelper: Tidy up waiting
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:03 +0000 (18:37 -0700)] 
usermodehelper: Tidy up waiting

Rather than using a tri-state integer for the wait flag in
call_usermodehelper_exec, define a proper enum, and use that.  I've
preserved the integer values so that any callers I've missed should
still work OK.
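
For illustration, such an enum might look like the sketch below.  The
enumerator names and the specific values (-1, 0, 1) are assumptions of
this example, since the message only says that the old integer values
are preserved; the helper function is likewise hypothetical.

```c
/* Tri-state wait flag as an enum; names and values are assumed. */
enum umh_wait_sketch {
	UMH_NO_WAIT   = -1,	/* don't wait at all */
	UMH_WAIT_EXEC = 0,	/* wait for the exec, but not the process */
	UMH_WAIT_PROC = 1	/* wait for the process to complete */
};

/* Does the caller block until the helper process has exited? */
int umh_blocks_until_exit_sketch(enum umh_wait_sketch wait)
{
	return wait == UMH_WAIT_PROC;
}
```

Because the enumerators keep their integer values, a caller that still
passes a bare -1, 0 or 1 selects the same behaviour as before.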

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>
17 years ago Add common orderly_poweroff()
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:02 +0000 (18:37 -0700)] 
Add common orderly_poweroff()

Various pieces of code around the kernel want to be able to trigger an
orderly poweroff.  This pulls them together into a single
implementation.

By default the poweroff command is /sbin/poweroff, but it can be set
via sysctl: kernel/poweroff_cmd.  This is split at whitespace, so it
can include command-line arguments.

This patch replaces four other instances of invoking either "poweroff"
or "shutdown -h now": two sbus drivers, and acpi thermal
management.

sparc64 has its own "powerd"; still need to determine whether it should
be replaced by orderly_poweroff().

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David S. Miller <davem@davemloft.net>
17 years ago usermodehelper: split setup from execution
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:02 +0000 (18:37 -0700)] 
usermodehelper: split setup from execution

Rather than having hundreds of variations of call_usermodehelper for
various pieces of usermode state which could be set up, split the
info allocation and initialization from the actual process execution.

This means the general pattern becomes:
 info = call_usermodehelper_setup(path, argv, envp); /* basic state */
 call_usermodehelper_<SET EXTRA STATE>(info, stuff...); /* extra state */
 call_usermodehelper_exec(info, wait); /* run process and free info */

This patch introduces wrappers for all the existing calling styles for
call_usermodehelper_*, but folds their implementations into one.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Björn Steinbrink <B.Steinbrink@gmx.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
17 years ago add argv_split()
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:02 +0000 (18:37 -0700)] 
add argv_split()

argv_split() is a helper function which takes a string, splits it at
whitespace, and returns a NULL-terminated argv vector.  This is
deliberately simple - it does no quote processing of any kind.
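
The behaviour described above can be sketched in userspace C.  This is
not the kernel implementation: it uses malloc/calloc where the kernel
would use its own allocator, and the function names and signature are
assumptions for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Free a vector returned by argv_split_sketch(). */
void argv_free_sketch(char **argv)
{
	char **p;

	for (p = argv; *p; p++)
		free(*p);
	free(argv);
}

/* Count whitespace-separated words in s. */
static size_t count_words_sketch(const char *s)
{
	size_t n = 0;

	while (*s) {
		while (isspace((unsigned char)*s))
			s++;
		if (*s) {
			n++;
			while (*s && !isspace((unsigned char)*s))
				s++;
		}
	}
	return n;
}

/* Split str at whitespace into a NULL-terminated vector; no quote handling. */
char **argv_split_sketch(const char *str, int *argcp)
{
	size_t argc = count_words_sketch(str);
	char **argv = calloc(argc + 1, sizeof(*argv));
	size_t i = 0;

	if (!argv)
		return NULL;

	while (*str) {
		const char *start;
		size_t len;

		while (isspace((unsigned char)*str))
			str++;
		if (!*str)
			break;
		start = str;
		while (*str && !isspace((unsigned char)*str))
			str++;
		len = (size_t)(str - start);

		argv[i] = malloc(len + 1);
		if (!argv[i]) {
			argv_free_sketch(argv);	/* remaining slots are NULL */
			return NULL;
		}
		memcpy(argv[i], start, len);
		argv[i][len] = '\0';
		i++;
	}

	if (argcp)
		*argcp = (int)argc;
	return argv;
}
```

Leading, trailing and repeated whitespace produce no empty words, and a
quoted string such as "a b" is still split at the space.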

[ Seems to me that this is something which is already being done in
  the kernel, but I couldn't find any other implementations, either to
  steal or replace.  Keep an eye out. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
17 years ago add kstrndup
Jeremy Fitzhardinge [Wed, 18 Jul 2007 01:37:02 +0000 (18:37 -0700)] 
add kstrndup

Add a kstrndup function, modelled on strndup.  Like strndup this
returns a string copied into its own allocated memory, but it copies
no more than the specified number of bytes from the source.
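
A userspace analogue of the semantics, as a minimal sketch: malloc
stands in for the kernel allocator, there is no gfp argument, and the
function name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copy at most max bytes of s into freshly allocated, NUL-terminated memory. */
char *strndup_sketch(const char *s, size_t max)
{
	size_t len = 0;
	char *buf;

	/* Length of s, but never looking past max bytes. */
	while (len < max && s[len])
		len++;

	buf = malloc(len + 1);
	if (buf) {
		memcpy(buf, s, len);
		buf[len] = '\0';
	}
	return buf;
}
```

The result is always terminated, so a limit shorter than the source
yields a truncated copy rather than an unterminated buffer.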

Remove private strndup() from irda code.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Panagiotis Issaris <takis@issaris.org>
Cc: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
17 years ago zs: move to the serial subsystem
Maciej W. Rozycki [Wed, 18 Jul 2007 07:49:11 +0000 (00:49 -0700)] 
zs: move to the serial subsystem

This is a reimplementation of the zs driver for the serial subsystem.  Any
resemblance to the old driver is purely coincidental.  ;-) I do hope I got
the handling of modem lines right -- better do not tackle me about the
issue unless you feel too good...

Any users of the old driver: please note the numbers of the serial lines
have now been swapped, i.e.  ttyS0 <-> ttyS1 and ttyS2 <-> ttyS3.  It has
to do with the modem lines mentioned above; basically the port A in a given
chip has to be initialised before the port B if you want to use the latter
as the serial console (which is usually the case), as operations on modem
lines of the serial line associated with the port B access both ports (see
the comment at the top of the driver for the details of wiring used).
Please update your scripts.

This is also the reason each SCC now requests an IRQ once only (as seen in
"/proc/interrupts") -- the handler takes care of both ports at once as the
line associated with the port B has to take status update interrupts from
both ports (and yet the line of the port A takes its own for itself too).
The old driver never got it right...

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoserial: add early_serial_setup() back to header file
Yinghai Lu [Wed, 18 Jul 2007 07:49:10 +0000 (00:49 -0700)] 
serial: add early_serial_setup() back to header file

early_serial_setup() was removed from serial.h, but was never added to
serial_8250.h.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agofbdev: make fb_append_extra_logo() depend on fb=y
Arnd Bergmann [Wed, 18 Jul 2007 07:49:09 +0000 (00:49 -0700)] 
fbdev: make fb_append_extra_logo() depend on fb=y

We can't show the extra logo from boot code if FB is built as a module.
Make the FB_LOGO_EXTRA depend on FB=y.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agodm: fix memory leak in dm_create_persistent() when starting metadata update thread...
Jesper Juhl [Wed, 18 Jul 2007 07:49:08 +0000 (00:49 -0700)] 
dm: fix memory leak in dm_create_persistent() when starting metadata update thread fails

If, in dm_create_persistent(), the call to create_singlethread_workqueue()
fails then we'll return without freeing the memory allocated to 'ps', thus
leaking sizeof(struct pstore) bytes.  This patch fixes the leak.
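
The shape of the fix can be sketched in user space as below.  All names
here (pstore_sketch, make_workqueue, create_persistent_sketch) are
hypothetical stand-ins, and malloc()/free() replace the kernel
allocator; this is not the actual patch.

```c
#include <assert.h>
#include <stdlib.h>

struct pstore_sketch {
    void *workqueue;
};

/* Stand-in for create_singlethread_workqueue(): NULL means failure. */
static void *make_workqueue(int should_fail)
{
    static int token;
    return should_fail ? NULL : (void *)&token;
}

/* Returns 0 on success and stores the new object in *out,
 * or -1 on failure without leaking the partially built object. */
int create_persistent_sketch(struct pstore_sketch **out, int should_fail)
{
    struct pstore_sketch *ps;

    assert(out != NULL);

    ps = malloc(sizeof(*ps));
    if (ps == NULL)
        return -1;

    ps->workqueue = make_workqueue(should_fail);
    if (ps->workqueue == NULL) {
        free(ps);   /* the missing step: release 'ps' on this error path */
        return -1;
    }

    *out = ps;
    return 0;
}
```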

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoslob: Kill off duplicate kzalloc() definition.
Paul Mundt [Wed, 18 Jul 2007 00:18:36 +0000 (09:18 +0900)] 
slob: Kill off duplicate kzalloc() definition.

With the slab zeroing allocations cleanups Christoph stubbed in a generic
kzalloc(), which was missed on SLOB. Follow the SLAB/SLUB changes and
kill off the __kzalloc() wrapper that SLOB was using.

Reported-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoRevert drivers/ide/ide.c scsi_cmd_ioctl() usage changes
Linus Torvalds [Tue, 17 Jul 2007 22:57:42 +0000 (15:57 -0700)] 
Revert drivers/ide/ide.c scsi_cmd_ioctl() usage changes

The old IDE driver is not ready to take generic SCSI commands, even if
it uses them for some specific issues (ie the tray open/close ioctls for
IDE CD-ROM's). Pointed out by Bartlomiej.

I'm sure we'll have it fixed properly soon enough, but for now we should
not allow it to cause problems.

Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMake the "z/VM unit record device driver" depend on S390
Linus Torvalds [Tue, 17 Jul 2007 22:43:56 +0000 (15:43 -0700)] 
Make the "z/VM unit record device driver" depend on S390

I really don't see anybody else wanting to select it ;)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMerge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Tue, 17 Jul 2007 22:29:33 +0000 (15:29 -0700)] 
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] Fix broken logic, SIGA flags must be bitwise ORed
  [S390] cio: Dont print trailing \0 in modalias_show().
  [S390] Simplify stack trace.
  [S390] z/VM unit record device driver
  [S390] vmcp cleanup
  [S390] qdio: output queue stall on FCP and network devices
  [S390] Fix disassembly of RX_URRD, SI_URD & PC-relative instructions.
  [S390] Update default configuration.

17 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Tue, 17 Jul 2007 22:28:18 +0000 (15:28 -0700)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: (21 commits)
  [WATCHDOG] at32ap700x_wdt.c - Fix compilation warnings
  [WATCHDOG] at32ap700x_wdt.c - Add spinlock support
  [WATCHDOG] at32ap700x_wdt.c - Add nowayout + MAGICCLOSE features
  [WATCHDOG] at32ap700x_wdt.c - timeout module parameter patch
  [WATCHDOG] at32ap700x_wdt.c - checkpatch.pl-0.05 clean-up's
  [WATCHDOG] change s3c2410_wdt to using dev_() macros for output
  [WATCHDOG] s3c2410_wdt announce initialisation
  [WATCHDOG] at32ap700x-wdt: add iounmap if probe function fails
  [WATCHDOG] at32ap700x-wdt: add missing iounmap in _remove
  [WATCHDOG] watchdog-driver-for-at32ap700x-devices-fix-2
  [WATCHDOG] watchdog-driver-for-at32ap700x-devices-fix
  [WATCHDOG] Watchdog driver for AT32AP700X devices
  [WATCHDOG] Mixcom Watchdog - CodingStyle clean-up
  [WATCHDOG] Mixcom Watchdog - clean-up printk's
  [WATCHDOG] Mixcom Watchdog - clean-up printk's
  [WATCHDOG] Mixcom Watchdog - checkcard part 2
  [WATCHDOG] Mixcom Watchdog - checkcard
  [WATCHDOG] Mixcom Watchdog - get rid of port offset's
  [WATCHDOG] Mixcom Watchdog - update "Documentation"
  [WATCHDOG] Remove the redundant check for pwrite() in EP93XXX watchdog.
  ...

17 years agoMerge branch 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block
Linus Torvalds [Tue, 17 Jul 2007 22:26:31 +0000 (15:26 -0700)] 
Merge branch 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block

* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block:
  bsg: fix missing space in version print
  Don't define empty struct bsg_class_device if !CONFIG_BLK_DEV_BSG
  bsg: Kconfig updates
  bsg: minor cleanup
  bsg: device hash table cleanup
  bsg: fix initialization error handling bugs
  bsg: mark FUJITA Tomonori as bsg maintainer
  bsg: convert to dynamic major
  bsg: address various review comments

17 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh...
Linus Torvalds [Tue, 17 Jul 2007 22:23:50 +0000 (15:23 -0700)] 
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: fix debug compilation error

17 years agoMerge branch 'isdn-cleanup' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Tue, 17 Jul 2007 22:23:37 +0000 (15:23 -0700)] 
Merge branch 'isdn-cleanup' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6

* 'isdn-cleanup' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  [ISDN] HiSax hfc_pci: minor cleanups
  [ISDN] HiSax bkm_a4t: split setup into two smaller functions
  [ISDN] HiSax enternow: split setup into 3 smaller functions
  [ISDN] HiSax netjet_u: split setup into 3 smaller functions
  [ISDN] HiSax netjet_s: code movement, prep for hotplug
  [ISDN] HiSax: move card state alloc/setup code into separate functions
  [ISDN] HiSax: move card setup into separate function

17 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Tue, 17 Jul 2007 22:19:27 +0000 (15:19 -0700)] 
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().
  [SPARC64]: Update defconfig.
  [SPARC64]: Kill explicit %gl register reference.

17 years agoMerge branch 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Tue, 17 Jul 2007 22:19:06 +0000 (15:19 -0700)] 
Merge branch 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6

* 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  arch/i386/* fs/* ipc/*: mark variables with uninitialized_var()
  drivers/*: mark variables with uninitialized_var()

17 years agoMerge branch 'warnings' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
Linus Torvalds [Tue, 17 Jul 2007 22:18:33 +0000 (15:18 -0700)] 
Merge branch 'warnings' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6

* 'warnings' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  drivers/atm/ambassador: kill uninit'd var warning, and fix bug
  [libata] sata_mv: use pci_try_set_mwi()
  drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd var warning
  drivers/net/wan/sbni: kill uninit'd var warning
  drivers/mtd/ubi/eba: minor cleanup: tighten scope of a local var
  drivers/telephony/ixj: cleanup and fix gcc warning
  drivers/net/wan/pc300_drv: fix bug caught by gcc warning
  drivers/usb/misc/auerswald: fix status check, remove redundant check
  [netdrvr] eepro100, ne2k-pci: abort resume if pci_enable_device() fails
  [netdrvr] natsemi: Fix device removal bug
  kernel/auditfilter: kill bogus uninit'd-var compiler warning

17 years agosmp_call_function_single() should be a macro on UP
Al Viro [Tue, 17 Jul 2007 21:29:46 +0000 (22:29 +0100)] 
smp_call_function_single() should be a macro on UP

... or we end up with header include order problems from hell.

E.g. on m68k this is 100% fatal - local_irq_enable() there
wants preempt_count(), which wants task_struct fields, which
we won't have when we are in smp.h pulled from sched.h.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years ago[SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().
Oleg Nesterov [Tue, 17 Jul 2007 21:37:54 +0000 (14:37 -0700)] 
[SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().

From: Oleg Nesterov <oleg@tv-sign.ru>

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[ISDN] HiSax hfc_pci: minor cleanups
Jeff Garzik [Mon, 16 Jul 2007 01:48:07 +0000 (21:48 -0400)] 
[ISDN] HiSax hfc_pci: minor cleanups

* trim trailing whitespace
* remove CONFIG_PCI ifdefs, this driver is always PCI (Kconfig enforced)
* remove return statements at the tail of a function
* remove indentation levels by returning an error code immediately.
  Makes the code much more readable, and easier to update to PCI hotplug
  API.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax bkm_a4t: split setup into two smaller functions
Jeff Garzik [Sun, 15 Jul 2007 23:58:24 +0000 (19:58 -0400)] 
[ISDN] HiSax bkm_a4t: split setup into two smaller functions

No behavior changes, just code movement.  Prep for PCI hotplug API.

Well, CONFIG_PCI useless ifdef was removed.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax enternow: split setup into 3 smaller functions
Jeff Garzik [Sun, 15 Jul 2007 23:25:45 +0000 (19:25 -0400)] 
[ISDN] HiSax enternow: split setup into 3 smaller functions

No behavior changes, just code movement.  Prep for PCI hotplug API.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax netjet_u: split setup into 3 smaller functions
Jeff Garzik [Sun, 15 Jul 2007 20:59:01 +0000 (16:59 -0400)] 
[ISDN] HiSax netjet_u: split setup into 3 smaller functions

No behavior changes, just code movement.  Prep for PCI hotplug API.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax netjet_s: code movement, prep for hotplug
Jeff Garzik [Sun, 15 Jul 2007 08:25:35 +0000 (04:25 -0400)] 
[ISDN] HiSax netjet_s: code movement, prep for hotplug

1) Remove CONFIG_PCI ifdefs.  PCI is required in Kconfig.

2) Break up setup_netjet_s() into three separate internal functions.
This helps facilitate upcoming use of PCI hotplug API, and in addition
makes the code much easier to follow.

No code is changed, just moved around.  I even kept the out-of-favor
"return(0)" style used in the current source code.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax: move card state alloc/setup code into separate functions
Jeff Garzik [Tue, 17 Jul 2007 21:14:23 +0000 (17:14 -0400)] 
[ISDN] HiSax: move card state alloc/setup code into separate functions

Just code movement.  No code changes or cleanups besides that which
is required to call the new functions from the old code site.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[ISDN] HiSax: move card setup into separate function
Jeff Garzik [Sun, 15 Jul 2007 01:58:34 +0000 (21:58 -0400)] 
[ISDN] HiSax: move card setup into separate function

No behavior changes, just code movement.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agomixart: Add missing vmalloc.h include
Frank Lichtenheld [Tue, 17 Jul 2007 17:50:53 +0000 (19:50 +0200)] 
mixart: Add missing vmalloc.h include

Fixes the following build error:
  CC      sound/pci/mixart/mixart_hwdep.o
sound/pci/mixart/mixart_hwdep.c: In function ‘mixart_hwdep_dsp_load’:
sound/pci/mixart/mixart_hwdep.c:610: error: implicit declaration of function ‘vmalloc’
sound/pci/mixart/mixart_hwdep.c:617: error: implicit declaration of function ‘vfree’

Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agohppb: Add missing dma-mapping.h include
Frank Lichtenheld [Tue, 17 Jul 2007 17:30:38 +0000 (19:30 +0200)] 
hppb: Add missing dma-mapping.h include

This fixes the following build-error:

 CC      drivers/parisc/hppb.o
drivers/parisc/hppb.c: In function ‘hppb_probe’:
drivers/parisc/hppb.c:73: error: implicit declaration of function ‘ccio_request_resource’

Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoarch/i386/* fs/* ipc/*: mark variables with uninitialized_var()
Jeff Garzik [Tue, 17 Jul 2007 09:40:59 +0000 (05:40 -0400)] 
arch/i386/* fs/* ipc/*: mark variables with uninitialized_var()

Mark variables with uninitialized_var() if such a warning appears,
and analysis proves that the var is initialized properly on all paths
it is used.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/*: mark variables with uninitialized_var()
Jeff Garzik [Tue, 17 Jul 2007 09:39:58 +0000 (05:39 -0400)] 
drivers/*: mark variables with uninitialized_var()

Mark variables in drivers/* with uninitialized_var() if such a warning
appears, and analysis proves that the var is initialized properly on all
paths it is used.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/atm/ambassador: kill uninit'd var warning, and fix bug
Jeff Garzik [Tue, 17 Jul 2007 06:32:21 +0000 (02:32 -0400)] 
drivers/atm/ambassador: kill uninit'd var warning, and fix bug

An uninitialized variable warning illuminated an area where indeed the
variable was being used without initialization.  Unfortunately, after
verifying all such paths were fixed, the warning still appears.  So we
follow the initialization practice of other variables in this function.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[libata] sata_mv: use pci_try_set_mwi()
Jeff Garzik [Tue, 17 Jul 2007 06:21:50 +0000 (02:21 -0400)] 
[libata] sata_mv: use pci_try_set_mwi()

Because sometimes in life, it's ok to fail.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/infiniband/hw/mthca/mthca_qp: kill uninit'd var warning
Jeff Garzik [Tue, 17 Jul 2007 06:03:49 +0000 (02:03 -0400)] 
drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd var warning

drivers/infiniband/hw/mthca/mthca_qp.c: In function
  ‘mthca_tavor_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1594: warning: ‘f0’ may be used
  uninitialized in this function
drivers/infiniband/hw/mthca/mthca_qp.c: In function
  ‘mthca_arbel_post_send’:
drivers/infiniband/hw/mthca/mthca_qp.c:1949: warning: ‘f0’ may be used
  uninitialized in this function

Initializing 'f0' is not strictly necessary in either case, AFAICS.

I was considering use of uninitialized_var(), but looking at the
complex flow of control in each function, I feel it is wiser and
safer to simply zero the var and be certain of ourselves.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/net/wan/sbni: kill uninit'd var warning
Jeff Garzik [Tue, 17 Jul 2007 05:56:32 +0000 (01:56 -0400)] 
drivers/net/wan/sbni: kill uninit'd var warning

It's actually convenient in the code to initialize this and a sister
variable to zero.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/mtd/ubi/eba: minor cleanup: tighten scope of a local var
Jeff Garzik [Tue, 17 Jul 2007 05:49:56 +0000 (01:49 -0400)] 
drivers/mtd/ubi/eba: minor cleanup: tighten scope of a local var

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/telephony/ixj: cleanup and fix gcc warning
Jeff Garzik [Tue, 17 Jul 2007 05:35:08 +0000 (01:35 -0400)] 
drivers/telephony/ixj: cleanup and fix gcc warning

1) Fix gcc uninit'd var warnings by adding 'default' switch stmt labels
in two cases.  It was lightning-strikes unlikely that a problem would
ever arise, but not impossible.

2) Tighten the scope of 'blankword' in two cases.
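
The first point can be illustrated with a minimal sketch.  The function
name and the codec numbers below are hypothetical and are not taken from
the ixj driver; the point is only that a 'default' label guarantees the
variable is assigned on every path through the switch.

```c
#include <assert.h>

/* Map a (hypothetical) codec id to a frame size in bytes. */
int frame_bytes_sketch(int codec)
{
    int bytes;

    switch (codec) {
    case 0:
        bytes = 10;
        break;
    case 1:
        bytes = 20;
        break;
    default:
        /* Without this label 'bytes' would be returned unset for
         * any other codec value, which is what gcc warns about. */
        bytes = 0;
        break;
    }
    return bytes;
}

/* Usage: an unknown codec now yields a defined value. */
void frame_bytes_demo(void)
{
    assert(frame_bytes_sketch(1) == 20);
    assert(frame_bytes_sketch(99) == 0);
}
```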

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/net/wan/pc300_drv: fix bug caught by gcc warning
Jeff Garzik [Tue, 17 Jul 2007 05:32:29 +0000 (01:32 -0400)] 
drivers/net/wan/pc300_drv: fix bug caught by gcc warning

The warning

drivers/net/wan/pc300_drv.c: In function ‘cpc_open’:
drivers/net/wan/pc300_drv.c:2942: warning: ‘br’ may be used
uninitialized in this function

was valid.  Ensure 'br' is initialized in all cases.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agodrivers/usb/misc/auerswald: fix status check, remove redundant check
Jeff Garzik [Tue, 17 Jul 2007 05:08:29 +0000 (01:08 -0400)] 
drivers/usb/misc/auerswald: fix status check, remove redundant check

1) We should only set 'actual_length' output variable if usb length is
known to be good.

2) No need to check actual_length for NULL.  The only caller always
passes non-NULL value.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[netdrvr] eepro100, ne2k-pci: abort resume if pci_enable_device() fails
Jeff Garzik [Tue, 17 Jul 2007 04:15:54 +0000 (00:15 -0400)] 
[netdrvr] eepro100, ne2k-pci: abort resume if pci_enable_device() fails

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[netdrvr] natsemi: Fix device removal bug
Jeff Garzik [Tue, 17 Jul 2007 04:01:09 +0000 (00:01 -0400)] 
[netdrvr] natsemi: Fix device removal bug

This episode illustrates how an overused warning can train people to
ignore that warning, which winds up hiding bugs.

The warning

drivers/net/natsemi.c: In function ‘natsemi_remove1’:
drivers/net/natsemi.c:3222: warning: ignoring return value of
‘device_create_file’, declared with attribute warn_unused_result

is oft-ignored, even though at close inspection one notices this occurs
in the /remove/ function, not normally where creation occurs.  A quick
s/create/remove/ and we are fixed, with the warning gone.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years agokernel/auditfilter: kill bogus uninit'd-var compiler warning
Jeff Garzik [Tue, 17 Jul 2007 01:25:01 +0000 (21:25 -0400)] 
kernel/auditfilter: kill bogus uninit'd-var compiler warning

Kill this warning...

kernel/auditfilter.c: In function ‘audit_receive_filter’:
kernel/auditfilter.c:1213: warning: ‘ndw’ may be used uninitialized in this function
kernel/auditfilter.c:1213: warning: ‘ndp’ may be used uninitialized in this function

...with a simplification of the code.  audit_put_nd() can accept NULL
arguments, just like kfree().  It is cleaner to init two existing vars
to NULL, remove the redundant test variable 'putnd_needed' branches, and call
audit_put_nd() directly.
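
A user-space sketch of the pattern follows.  The types and function
names are hypothetical stand-ins rather than the audit code itself; the
idea shown is that a release helper which tolerates NULL lets both
pointers start as NULL and be released unconditionally on one exit path.

```c
#include <assert.h>
#include <stddef.h>

struct nd_sketch {
    int refs;
};

/* Release helper that accepts NULL arguments, in the spirit of kfree(). */
void put_nd_sketch(struct nd_sketch *ndp, struct nd_sketch *ndw)
{
    if (ndp != NULL)
        ndp->refs--;
    if (ndw != NULL)
        ndw->refs--;
}

/* No 'needs put' flag: the pointers are NULL until acquired. */
int receive_filter_sketch(int fail_early, struct nd_sketch *a,
                          struct nd_sketch *b)
{
    struct nd_sketch *ndp = NULL, *ndw = NULL;
    int err = 0;

    assert(a != NULL && b != NULL);

    if (fail_early) {
        err = -1;
        goto out;       /* nothing acquired yet; the put is a no-op */
    }

    ndp = a;
    ndp->refs++;
    ndw = b;
    ndw->refs++;

out:
    put_nd_sketch(ndp, ndw);
    return err;
}
```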

As a desired side effect, the warning goes away.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
17 years ago[SPARC64]: Update defconfig.
David S. Miller [Tue, 17 Jul 2007 08:20:17 +0000 (01:20 -0700)] 
[SPARC64]: Update defconfig.

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC64]: Kill explicit %gl register reference.
David S. Miller [Tue, 17 Jul 2007 04:33:19 +0000 (21:33 -0700)] 
[SPARC64]: Kill explicit %gl register reference.

Older binutils can't handle it.  Use SET_GL() instead,
which is explicitly for this purpose.

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years agoIntroduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check
Satyam Sharma [Tue, 17 Jul 2007 09:30:08 +0000 (15:00 +0530)] 
Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check

Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]
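
The combined check can be modelled in user space roughly as below.  The
struct layouts and the has_cap_fowner flag are assumptions made for
illustration; in the kernel the capability test is a function call, not
a field, and the helper is a macro in fs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_sketch {
    unsigned int fsuid;
    bool has_cap_fowner;    /* stand-in for a CAP_FOWNER capability test */
};

struct inode_sketch {
    unsigned int i_uid;
};

/* True if the task owns the inode or holds the owner-override capability.
 * Keeping both tests in one helper prevents callers from checking only
 * the fsuid and forgetting the capability. */
bool is_owner_or_cap_sketch(const struct task_sketch *task,
                            const struct inode_sketch *inode)
{
    assert(task != NULL && inode != NULL);
    return task->fsuid == inode->i_uid || task->has_cap_fowner;
}
```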

The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
Linus Torvalds [Tue, 17 Jul 2007 18:50:26 +0000 (11:50 -0700)] 
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (80 commits)
  KVM: Use CPU_DYING for disabling virtualization
  KVM: Tune hotplug/suspend IPIs
  KVM: Keep track of which cpus have virtualization enabled
  SMP: Allow smp_call_function_single() to current cpu
  i386: Allow smp_call_function_single() to current cpu
  x86_64: Allow smp_call_function_single() to current cpu
  HOTPLUG: Adapt thermal throttle to CPU_DYING
  HOTPLUG: Adapt cpuset hotplug callback to CPU_DYING
  HOTPLUG: Add CPU_DYING notifier
  KVM: Clean up #includes
  KVM: Remove kvmfs in favor of the anonymous inodes source
  KVM: SVM: Reliably detect if SVM was disabled by BIOS
  KVM: VMX: Remove unnecessary code in vmx_tlb_flush()
  KVM: MMU: Fix Wrong tlb flush order
  KVM: VMX: Reinitialize the real-mode tss when entering real mode
  KVM: Avoid useless memory write when possible
  KVM: Fix x86 emulator writeback
  KVM: Add support for in-kernel pio handlers
  KVM: VMX: Fix interrupt checking on lightweight exit
  KVM: Adds support for in-kernel mmio handlers
  ...

17 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Tue, 17 Jul 2007 18:31:57 +0000 (11:31 -0700)] 
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Clean away some code inside some non-existent CONFIG ifdefs
  [IA64] ar.itc access must really be after xtime_lock.sequence has been read
  [IA64] correctly count CPU objects in the ia64/sn hwperf interface
  [IA64] arbitrary speed tty ioctl support
  [IA64] use machvec=dig on hpzx1 platforms

17 years agoatl1: missing include
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
atl1: missing include

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomark a bunch of ISA|EISA|PCI drivers as such
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
mark a bunch of ISA|EISA|PCI drivers as such

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomissing exports of csum_...
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
missing exports of csum_...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoum_kmalloc() remnants
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
um_kmalloc() remnants

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agosparc32 has working dma-mapping only with CONFIG_PCI
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
sparc32 has working dma-mapping only with CONFIG_PCI

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agosaner typechecking in generic unaligned.h
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
saner typechecking in generic unaligned.h

Verify that types would match for assignment (under sizeof, so we are safe from
side effects or any code actually getting generated), then explicitly cast
everywhere to the fixed-sized types.  Kills a bunch of bogus warnings about
constants being truncated (gcc, sparse), finds a pile of endianness problems
hidden by old noise (sparse).
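
The "under sizeof" trick can be sketched as follows.  The macro and
function names are hypothetical and this is not the header's actual
code; it only shows that an assignment placed inside sizeof is
type-checked but never evaluated, and that the store itself then uses a
fixed-size type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The compiler checks that 'rhs' may be assigned to 'lhs', but the
 * operand of sizeof is not evaluated: no code, no side effects. */
#define check_assignable_sketch(lhs, rhs) ((void)sizeof((lhs) = (rhs)))

/* Store a 16-bit value at a possibly unaligned byte address. */
static void put_u16_sketch(uint16_t val, unsigned char *dst)
{
    memcpy(dst, &val, sizeof(val));
}

void unaligned_demo(void)
{
    unsigned char buf[3] = { 0 };
    uint16_t slot = 0;
    uint16_t back;
    int calls = 0;

    check_assignable_sketch(slot, (uint16_t)(++calls));
    assert(calls == 0);     /* the increment never ran */
    assert(slot == 0);      /* and nothing was stored */

    /* Explicit cast to the fixed-size type, then an unaligned store. */
    put_u16_sketch((uint16_t)0xBEEF, buf + 1);
    memcpy(&back, buf + 1, sizeof(back));
    assert(back == 0xBEEF);
}
```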

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoalpha __init fixes
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
alpha __init fixes

__init and __initdata stuff used from __devinit one

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoalpha termios.h hadn't been updated
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
alpha termios.h hadn't been updated

... fortunately, termios and ktermios there are identical, so no
run-time breakage happened.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agono USB on M32R
Al Viro [Tue, 17 Jul 2007 07:49:35 +0000 (08:49 +0100)] 
no USB on M32R

Won't build due to lack of dma-mapping.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomd: change bitmap_unplug and others to void functions
NeilBrown [Tue, 17 Jul 2007 11:06:13 +0000 (04:06 -0700)] 
md: change bitmap_unplug and others to void functions

bitmap_unplug only ever returns 0, so it may as well be void.  Two callers try
to print a message if it returns non-zero, but that message is already printed
by bitmap_file_kick.

write_page returns an error which is not consistently checked.  It always
causes BITMAP_WRITE_ERROR to be set on an error, and that can more
conveniently be checked.

When the return of write_page is checked, an error causes bitmap_file_kick to
be called - so move that call into write_page - and protect against recursive
calls into bitmap_file_kick.

bitmap_update_sb returns an error that is never checked.

So make these 'void' and be consistent about checking the bit.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomd: check that internal bitmap does not overlap other data
NeilBrown [Tue, 17 Jul 2007 11:06:12 +0000 (04:06 -0700)] 
md: check that internal bitmap does not overlap other data

We currently trust user-space completely to set up metadata describing a
consistent array.  In particular, that the metadata, data, and bitmap do not
overlap.

But userspace can be buggy, and it is better to report an error than corrupt
data.  So put in some appropriate checks.
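
The kind of check meant here can be sketched as a plain range-overlap
test.  The function name and the half-open [start, start + len)
convention are assumptions for illustration, not the md code; lengths
are assumed non-zero and the sums are assumed not to overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [a_start, a_start + a_len) and [b_start, b_start + b_len)
 * share at least one sector. */
bool ranges_overlap_sketch(uint64_t a_start, uint64_t a_len,
                           uint64_t b_start, uint64_t b_len)
{
    assert(a_len > 0 && b_len > 0);
    return a_start < b_start + b_len && b_start < a_start + a_len;
}
```

A caller would apply this pairwise to the metadata, data, and bitmap
regions and return an error if any pair overlaps.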

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomd: improve the is_mddev_idle test fix
NeilBrown [Tue, 17 Jul 2007 11:06:12 +0000 (04:06 -0700)] 
md: improve the is_mddev_idle test fix

Don't use 'unsigned' variable to track sync vs non-sync IO, as the only thing
we want to do with them is a signed comparison, and fix up the comment which
had become quite wrong.
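
Why the signedness matters can be seen in a small sketch.  The function
names and the numbers are hypothetical, not the md code: when the
current counter is slightly below the last recorded one, the unsigned
difference wraps to a huge value and looks like heavy activity, whereas
the signed difference is simply negative.

```c
#include <assert.h>
#include <stdbool.h>

/* Unsigned arithmetic: a small negative difference wraps around. */
bool busy_unsigned_sketch(unsigned int curr, unsigned int last,
                          unsigned int threshold)
{
    return curr - last > threshold;
}

/* Signed arithmetic: the comparison behaves as intended. */
bool busy_signed_sketch(int curr, int last, int threshold)
{
    return curr - last > threshold;
}

/* Usage: 100 - 110 must not count as "busy". */
void idle_check_demo(void)
{
    assert(busy_unsigned_sketch(100, 110, 64));   /* wrapped: wrongly busy */
    assert(!busy_signed_sketch(100, 110, 64));    /* -10 > 64 is false */
}
```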

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agomd: improve message about invalid superblock during autodetect
NeilBrown [Tue, 17 Jul 2007 11:06:11 +0000 (04:06 -0700)] 
md: improve message about invalid superblock during autodetect

People try to use raid auto-detect with version-1 superblocks (which is not
supported) and get confused when they are told they have an invalid
superblock.

So be more explicit, and say it is not a valid v0.90 superblock.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoUse menuconfig objects II - MD
Jan Engelhardt [Tue, 17 Jul 2007 11:06:11 +0000 (04:06 -0700)] 
Use menuconfig objects II - MD

Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: add TI TWL92330/Menelaus Power Management chip driver
Tony Lindgren [Tue, 17 Jul 2007 11:06:09 +0000 (04:06 -0700)] 
OMAP: add TI TWL92330/Menelaus Power Management chip driver

Add Texas Instruments TWL92330/Menelaus Power Management chip driver.  This
includes voltage regulators, dual-slot memory card transceivers and a
real-time clock (RTC).

The RTC support is integrated into this driver; it is not a separate
module.  Passes 'rtctest' on OMAP H4 EVM, other than lack of "periodic"
(1/N second) IRQs.  System wakeup alarms (from suspend-to-RAM) work too.

The battery keeps the RTC active over power off, so once you set the clock
(rdate/ntpdate/etc, then "hwclock -w"), RTC_HCTOSYS at boot time will
behave as expected.

Cc: "Jean Delvare" <khali@linux-fr.org>
Cc: "Tony Lindgren" <tony@atomide.com>
Cc: "David Brownell" <david-b@pacbell.net>
Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for the Siemens SX1 mobile phone
Vovan888@gmail [Tue, 17 Jul 2007 11:06:09 +0000 (04:06 -0700)] 
OMAP: LCD panel support for the Siemens SX1 mobile phone

- Add support for LCD panel on Siemens sx1 mobile phone.

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for the TI OMAP OSK board
Dirk Behme [Tue, 17 Jul 2007 11:06:07 +0000 (04:06 -0700)] 
OMAP: LCD panel support for the TI OMAP OSK board

- Adds TFT LCD panel support for TI OMAP OSK board.

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for the TI OMAP1510 Innovator board
Imre Deak [Tue, 17 Jul 2007 11:06:07 +0000 (04:06 -0700)] 
OMAP: LCD panel support for the TI OMAP1510 Innovator board

- Add TFT LCD panel support for TI OMAP1510 Innovator EVM.

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for the TI OMAP1610 Innovator board
Imre Deak [Tue, 17 Jul 2007 11:06:06 +0000 (04:06 -0700)] 
OMAP: LCD panel support for the TI OMAP1610 Innovator board

- Add TFT LCD panel support for TI OMAP1610 Innovator EVM.

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for the Palm Zire71
Marek Vasut [Tue, 17 Jul 2007 11:06:06 +0000 (04:06 -0700)] 
OMAP: LCD panel support for the Palm Zire71

- Adds support for TFT LCD panel on Palm Zire71

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoOMAP: LCD panel support for Palm Tungsten|T
Marek Vasut [Tue, 17 Jul 2007 11:06:05 +0000 (04:06 -0700)] 
OMAP: LCD panel support for Palm Tungsten|T

- Add TFT LCD panel support for Palm Tungsten|T

Signed-off-by: Trilok Soni <soni.trilok@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>