linux-2.6
17 years ago[POWERPC] Fix bogus BUG_ON() in hugetlb_get_unmapped_area()
David Gibson [Thu, 21 Dec 2006 22:23:03 +0000 (09:23 +1100)] 
[POWERPC] Fix bogus BUG_ON() in hugetlb_get_unmapped_area()

The powerpc specific version of hugetlb_get_unmapped_area() makes some
unwarranted assumptions about what checks have been made to its
parameters by its callers.  This will lead to a BUG_ON() if a 32-bit
process attempts to make a hugepage mapping which extends above
TASK_SIZE (4GB).

I'm not sure if these assumptions came about because they were valid
with earlier versions of the get_unmapped_area() path, or if it was
always broken.  Nonetheless this patch fixes the logic, and removes
the crash.
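
As a rough illustration of the idea only (not the code from this patch), the
sketch below validates a requested range and returns an error instead of
asserting.  The function name, the SKETCH_TASK_SIZE_32 constant and the
choice of -ENOMEM are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical: 4GB address-space limit of a 32-bit task. */
#define SKETCH_TASK_SIZE_32 (UINT64_C(1) << 32)

/*
 * Hypothetical helper: reject a mapping that would extend above
 * task_size by returning an error code, rather than assuming the
 * caller already checked and crashing when it did not.
 */
static int sketch_check_hugepage_range(uint64_t addr, uint64_t len,
                                       uint64_t task_size)
{
    assert(task_size > 0);          /* precondition of the sketch itself */

    if (len > task_size)
        return -ENOMEM;             /* can never fit */
    if (addr > task_size - len)
        return -ENOMEM;             /* would extend above task_size */
    return 0;
}
```

A caller would propagate the negative value to userspace as a failed mapping
request; a range ending exactly at task_size is accepted in this sketch.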

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
17 years agoMerge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jmorris/selin...
Linus Torvalds [Mon, 8 Jan 2007 23:08:22 +0000 (15:08 -0800)] 
Merge branch 'for-linus' of /linux/kernel/git/jmorris/selinux-2.6

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  selinux: Delete mls_copy_context

17 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Mon, 8 Jan 2007 23:07:31 +0000 (15:07 -0800)] 
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] PNX8550: Fix system timer support
  [MIPS] TX49: Fix use of CDEX build_store_reg()
  [MIPS] pnx8550: Fix write_config_byte() PCI config space accessor
  [MIPS] Fix build errors on SEAD
  [MIPS] SMTC build fix
  [MIPS] csum_partial and copy in parallel
  [MIPS] Malta: Add missing MTD file.

17 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Mon, 8 Jan 2007 23:06:39 +0000 (15:06 -0800)] 
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] Provide basic printk_clock() implementation
  [ARM] Resolve fuse and direct-IO failures due to missing cache flushes
  [ARM] pass vma for flush_anon_page()
  [ARM] Fix potential MMCI bug
  [ARM] Fix kernel-mode undefined instruction aborts
  [ARM] 4082/1: iop3xx: fix iop33x gpio register offset
  [ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes
  [ARM] 4079/1: iop: Update MAINTAINERS

17 years agoRevert "[PATCH] x86-64: Try multiple timer variants in check_timer"
Linus Torvalds [Mon, 8 Jan 2007 23:04:46 +0000 (15:04 -0800)] 
Revert "[PATCH] x86-64: Try multiple timer variants in check_timer"

This reverts commit b026872601976f666bae77b609dc490d1834bf77, which has
been linked to several problem reports with IO-APIC and the timer.
Machines either don't boot because the timer doesn't happen, or we get
double timer interrupts because we end up double-routing the timer irq
through multiple interfaces.

See for example

http://lkml.org/lkml/2006/12/16/101
http://lkml.org/lkml/2007/1/3/9
http://bugzilla.kernel.org/show_bug.cgi?id=7789

for some of the discussion.

Patches to fix this cleanup exist (and have been confirmed to work fine
at least for some of the affected cases) and we'll revisit it for
2.6.21, but this late in the -rc series we're better off just reverting
the incomplete commit that caused the problems.

Suggested-by: Adrian Bunk <bunk@stusta.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Yinghai Lu <yinghai.lu@amd.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years agoselinux: Delete mls_copy_context
Venkat Yekkirala [Tue, 12 Dec 2006 19:02:41 +0000 (13:02 -0600)] 
selinux: Delete mls_copy_context

This deletes mls_copy_context() in favor of mls_context_cpy() and
replaces mls_scopy_context() with mls_context_cpy_low().

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
17 years ago[MIPS] PNX8550: Fix system timer support
Vitaly Wool [Thu, 28 Dec 2006 14:14:05 +0000 (17:14 +0300)] 
[MIPS] PNX8550: Fix system timer support

This patch restores proper time accounting for PNX8550-based boards.
It also removes an #ifdef in the generic code, which then becomes
unnecessary.

It's functionally identical to the previous patch with the same name but
it has minor comments from Atsushi and Sergei taken into account.

Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] TX49: Fix use of CDEX build_store_reg()
Atsushi Nemoto [Sun, 17 Dec 2006 15:38:21 +0000 (00:38 +0900)] 
[MIPS] TX49: Fix use of CDEX build_store_reg()

The commit a923660d786a53e78834b19062f7af2535f7f8ad accidentally
prevents TX49 from using CDEX.  Use build_dst_pref() only if prefetch
for store is actually available.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] pnx8550: Fix write_config_byte() PCI config space accessor
Davy Chan [Fri, 5 Jan 2007 05:56:46 +0000 (13:56 +0800)] 
[MIPS] pnx8550: Fix write_config_byte() PCI config space accessor

There's a serious typo in the function:
  arch/mips/pci/ops-pnx8550.c:write_config_byte()

The parameter passed to the function config_access() is PCI_CMD_CONFIG_READ
instead of PCI_CMD_CONFIG_WRITE. This renders any attempts to write
a single byte to the PCI configuration registers useless.

This problem does not exist for write_config_word() nor write_config_dword().

This problem has been there since kernel v2.6.17 and is still there
as of kernel v2.6.19.1.
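
To show why the wrong command constant makes the write a no-op, here is a
deliberately simplified userspace model.  Everything in it (the enum, the
256-byte array standing in for config space, and all sketch_* names) is
hypothetical and is not the driver's real code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PCI_CMD_CONFIG_READ / PCI_CMD_CONFIG_WRITE. */
enum sketch_cmd { SKETCH_CMD_CONFIG_READ, SKETCH_CMD_CONFIG_WRITE };

/* Hypothetical model of one device's configuration space. */
static uint8_t sketch_cfg[256];

/* Hypothetical accessor: the command decides the direction of transfer. */
static int sketch_config_access(enum sketch_cmd cmd, unsigned where,
                                uint8_t *val)
{
    assert(val != 0);
    if (where >= sizeof sketch_cfg)
        return -1;
    if (cmd == SKETCH_CMD_CONFIG_WRITE)
        sketch_cfg[where] = *val;   /* value reaches the device */
    else
        *val = sketch_cfg[where];   /* value is only read back */
    return 0;
}

/* The typo: a "write" helper that issues a read command. */
static int sketch_buggy_write_config_byte(unsigned where, uint8_t val)
{
    return sketch_config_access(SKETCH_CMD_CONFIG_READ, where, &val);
}

/* The fix: issue the write command. */
static int sketch_fixed_write_config_byte(unsigned where, uint8_t val)
{
    return sketch_config_access(SKETCH_CMD_CONFIG_WRITE, where, &val);
}
```

In this model the buggy helper returns success yet leaves the register
unchanged, which matches the symptom described above.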

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] Fix build errors on SEAD
Atsushi Nemoto [Sun, 7 Jan 2007 16:27:40 +0000 (01:27 +0900)] 
[MIPS] Fix build errors on SEAD

Quick and dirty fix for build errors on SEAD.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] SMTC build fix
Atsushi Nemoto [Sun, 7 Jan 2007 15:50:34 +0000 (00:50 +0900)] 
[MIPS] SMTC build fix

Pass "irq" to __DO_IRQ_SMTC_HOOK() macro.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] csum_partial and copy in parallel
Atsushi Nemoto [Tue, 12 Dec 2006 16:22:06 +0000 (01:22 +0900)] 
[MIPS] csum_partial and copy in parallel

Implement optimized asm versions of csum_partial_copy_nocheck,
csum_partial_copy_from_user and csum_and_copy_to_user which can
calculate the checksum and copy in parallel, based on memcpy.S.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[MIPS] Malta: Add missing MTD file.
Ralf Baechle [Tue, 12 Dec 2006 11:52:34 +0000 (11:52 +0000)] 
[MIPS] Malta: Add missing MTD file.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
17 years ago[ARM] Provide basic printk_clock() implementation
Russell King [Mon, 8 Jan 2007 19:49:12 +0000 (19:49 +0000)] 
[ARM] Provide basic printk_clock() implementation

Current sched_clock() implementations on ARM cause unbootable kernels
with PRINTK_TIME support enabled.  To avoid this, provide a basic
printk_clock() implementation which avoids sched_clock() being called
before the page tables have been set up.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years ago[ARM] Resolve fuse and direct-IO failures due to missing cache flushes
Russell King [Sat, 30 Dec 2006 23:17:40 +0000 (23:17 +0000)] 
[ARM] Resolve fuse and direct-IO failures due to missing cache flushes

fuse does not work on ARM due to cache incoherency issues - fuse wants
to use get_user_pages() to copy data from the current process into
kernel space.  However, since this accesses userspace via the kernel
mapping, the kernel mapping can be out of date wrt data written to
userspace.

This can lead to unpredictable behaviour (in the case of fuse) or data
corruption for direct-IO.

This resolves debian bug #402876

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years ago[ARM] pass vma for flush_anon_page()
Russell King [Sat, 30 Dec 2006 22:24:19 +0000 (22:24 +0000)] 
[ARM] pass vma for flush_anon_page()

Since get_user_pages() may be used with processes other than the
current process and calls flush_anon_page(), flush_anon_page() has to
cope in some way with non-current processes.

It may not be appropriate, or even desirable to flush a region of
virtual memory cache in the current process when that is different to
the process that we want the flush to occur for.

Therefore, pass the vma into flush_anon_page() so that the architecture
can work out whether the 'vmaddr' is for the current process or not.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years ago[ARM] Fix potential MMCI bug
Russell King [Mon, 8 Jan 2007 16:42:51 +0000 (16:42 +0000)] 
[ARM] Fix potential MMCI bug

The MMCI driver might end up aborting the initial command and leaving
the data part of the command sequence still in place.  Avoid this
problem by ensuring that any data sequence is properly cleared out
when a command completes.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years agoLinux 2.6.20-rc4 v2.6.20-rc4
Linus Torvalds [Sun, 7 Jan 2007 05:45:51 +0000 (21:45 -0800)] 
Linux 2.6.20-rc4

17 years ago[ARM] Fix kernel-mode undefined instruction aborts
Russell King [Sat, 6 Jan 2007 22:53:48 +0000 (22:53 +0000)] 
[ARM] Fix kernel-mode undefined instruction aborts

If the kernel attempts to execute a CP1 or CP2 instruction and it
aborts, and a FP emulator is not loaded, we try to return as if to
a user context, instead of the proper kernel context.  Since the
fault came from kernel mode, we must use the kernel return paths.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years agoRevert "[PATCH] binfmt_elf: randomize PIE binaries (2nd try)"
Linus Torvalds [Sat, 6 Jan 2007 21:28:21 +0000 (13:28 -0800)] 
Revert "[PATCH] binfmt_elf: randomize PIE binaries (2nd try)"

This reverts commit 59287c0913cc9a6c75712a775f6c1c1ef418ef3b.

Hugh Dickins reports that it causes random failures on x86 with SuSE
10.2, and points out

  "Isn't that randomization, anywhere from 0x10000 to ELF_ET_DYN_BASE,
   sure to place the ET_DYN from time to time just where the comment
   says it's trying to avoid? I assume that somehow results in the error
   reported."

(where the comment in question is the existing comment in the source
code about mmap/brk clashes).

Suggested-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Marcus Meissner <meissner@suse.de>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[ARM] 4082/1: iop3xx: fix iop33x gpio register offset
Dan Williams [Thu, 4 Jan 2007 01:14:49 +0000 (02:14 +0100)] 
[ARM] 4082/1: iop3xx: fix iop33x gpio register offset

iop33x gpio offset is correct in include/asm-arm/arch-iop33x/iop33x.h, but
include/asm-arm/hardware/iop3xx.h adds 4.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years ago[ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes
Ben Dooks [Sun, 24 Dec 2006 00:36:35 +0000 (01:36 +0100)] 
[ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes

Include <asm/io.h> to fix the warning:

arch/arm/kernel/traps.c:647:6: warning: symbol '__readwrite_bug' was not declared. Should it be static?

Include <linux/mc146818rtc.h> to fix the warning:
arch/arm/kernel/time.c:42:1: warning: symbol 'rtc_lock' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years ago[ARM] 4079/1: iop: Update MAINTAINERS
Dan Williams [Tue, 2 Jan 2007 17:32:37 +0000 (18:32 +0100)] 
[ARM] 4079/1: iop: Update MAINTAINERS

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
Linus Torvalds [Sat, 6 Jan 2007 08:10:55 +0000 (00:10 -0800)] 
Merge /pub/scm/linux/kernel/git/gregkh/driver-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  [PATCH] Driver core: Fix prefix driver links in /sys/module by bus-name

17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
Linus Torvalds [Sat, 6 Jan 2007 08:10:37 +0000 (00:10 -0800)] 
Merge /pub/scm/linux/kernel/git/gregkh/pci-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
  [PATCH] PCI: disable PCI_MULTITHREAD_PROBE

17 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Sat, 6 Jan 2007 08:10:21 +0000 (00:10 -0800)] 
Merge /pub/scm/linux/kernel/git/gregkh/usb-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: asix: Fix AX88772 device PHY selection
  USB: usblp.c - add Kyocera Mita FS 820 to list of "quirky" printers
  sisusb_con warning fixes
  USB: Fixed bug in endpoint release function.
  USB: small update to Documentation/usb/acm.txt
  USB storage: fix ipod ejecting issue
  USB Storage: unusual_devs: add supertop drives
  USB: omap_udc build fixes (sync with linux-omap)
  USB: funsoft is borken on sparc
  USB: fix interaction between different interfaces in an "Option" usb device
  UHCI: support device_may_wakeup
  UHCI: make test for ASUS motherboard more specific

17 years agoMerge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
Linus Torvalds [Sat, 6 Jan 2007 08:09:14 +0000 (00:09 -0800)] 
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  i2c/m41t00: Do not forget to write year
  i2c-mv64xxx: Fix random oops at boot
  i2c: Migration aids for i2c_adapter.dev removal
  i2c-pnx: Add entry to MAINTAINERS
  i2c-pnx: Fix interrupt handler, get rid of EARLY config option

17 years ago[PATCH] connector: some fixes for ia64 unaligned access errors
Erik Jacobson [Sat, 6 Jan 2007 00:37:05 +0000 (16:37 -0800)] 
[PATCH] connector: some fixes for ia64 unaligned access errors

On ia64, the various functions that make up cn_proc.c cause kernel
unaligned access errors.

If you are using these, for example, to get notification about all tasks
forking and exiting, you get multiple unaligned access errors per process.

Use put_unaligned() in the appropriate places to fix this.
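
For readers unfamiliar with the technique, a portable userspace approximation
of an unaligned store and load is sketched below.  It uses memcpy() rather
than the kernel's put_unaligned() macro, and the sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Store a 32-bit value at an address that may not be 4-byte aligned.
 * Copying byte-wise avoids the direct *(uint32_t *)dst = value store
 * that traps or is emulated slowly on strict-alignment machines.
 */
static void sketch_put_unaligned_u32(uint32_t value, void *dst)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Matching load from a possibly unaligned address. */
static uint32_t sketch_get_unaligned_u32(const void *src)
{
    uint32_t value;

    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}
```

Compilers typically turn such fixed-size memcpy() calls into whatever access
sequence is safe for the target, so the helper costs little on machines that
tolerate unaligned accesses.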

Signed-off-by: Erik Jacobson <erikj@sgi.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] shrink_all_memory(): fix lru_pages handling
Andrew Morton [Sat, 6 Jan 2007 00:37:05 +0000 (16:37 -0800)] 
[PATCH] shrink_all_memory(): fix lru_pages handling

At the end of shrink_all_memory() we forget to recalculate lru_pages: it can
be zero.

Fix that up, and add a helper function for this operation too.

Also, recalculate lru_pages each time around the inner loop to get the
balancing correct.

Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] fix garbage instead of zeroes in UFS
Evgeniy Dushistov [Sat, 6 Jan 2007 00:37:04 +0000 (16:37 -0800)] 
[PATCH] fix garbage instead of zeroes in UFS

This looks like the problem that Al Viro pointed out some time ago:

ufs's get_block callback allocates 16k of disk at a time, and links that
entire 16k into the file's metadata.  But because get_block is called for only
a single buffer_head (a 2k buffer_head in this case?) we are only able to tell
the VFS that this 2k is buffer_new().

So when ufs_getfrag_block() is later called to map some more data in the file,
and when that data resides within the remaining 14k of this fragment,
ufs_getfrag_block() will incorrectly return a !buffer_new() buffer_head.

I don't see a _right_ way to zero the whole block: if we use the inode
page cache, some pages may lie outside the inode limits (inode size) and
will be lost; if we use the blockdev page cache, it is possible to zero real
data if the inode page cache is used later.

The simplest way, as far as I can see, is to use the block device page cache,
and not only mark it dirty but also sync it during "nullification".  I ran the
simple test collection that I use to check that create, open, write, read and
close work on ufs, and I see that this patch makes the ufs code 18% slower
than before.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] fix OOM killing of swapoff
Hugh Dickins [Sat, 6 Jan 2007 00:37:03 +0000 (16:37 -0800)] 
[PATCH] fix OOM killing of swapoff

These days, if you swapoff when there isn't enough memory, OOM killer gives
"BUG: scheduling while atomic" and the machine hangs: badness() needs to do
its PF_SWAPOFF return after the task_unlock (tasklist_lock is also held
here, so p isn't going to be freed: PF_SWAPOFF might get turned off at any
moment, but that doesn't really matter).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] fix the toshiba_acpi write_lcd return value
Matthijs van Otterdijk [Sat, 6 Jan 2007 00:37:03 +0000 (16:37 -0800)] 
[PATCH] fix the toshiba_acpi write_lcd return value

write_lcd() in toshiba_acpi returns 0 on success since the big ACPI patch
merged in 2.6.20-rc2.  It should return count.
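
The reason the return value matters can be sketched with a generic
"write everything" loop of the kind userspace callers commonly use.  The
handlers and the loop below are hypothetical models, not toshiba_acpi code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical write-handler signature: returns bytes consumed. */
typedef long (*sketch_write_fn)(const char *buf, size_t count);

/* Models the regression: reports that nothing was consumed. */
static long sketch_write_returns_zero(const char *buf, size_t count)
{
    (void)buf;
    (void)count;
    return 0;
}

/* Models the fix: reports that the whole buffer was consumed. */
static long sketch_write_returns_count(const char *buf, size_t count)
{
    (void)buf;
    return (long)count;
}

/*
 * Call fn until count bytes are consumed.  Returns the number of calls
 * made, or -1 on error or if max_calls is reached without finishing.
 */
static int sketch_write_all(sketch_write_fn fn, const char *buf,
                            size_t count, int max_calls)
{
    size_t done = 0;
    int calls = 0;

    assert(fn != NULL);
    while (done < count) {
        long n;

        if (calls == max_calls)
            return -1;              /* no progress: give up */
        n = fn(buf + done, count - done);
        calls++;
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    return calls;
}
```

With the handler that returns count the loop finishes after one call; with
the one that returns 0 it never makes progress and only the artificial
max_calls cap ends it.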

Signed-off-by: Matthijs van Otterdijk <thotter@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] qconf: fix SIGSEGV on empty menu items
Cyrill V. Gorcunov [Sat, 6 Jan 2007 00:37:02 +0000 (16:37 -0800)] 
[PATCH] qconf: fix SIGSEGV on empty menu items

qconf may cause SIGSEGV by trying to show debug information on empty menu
items.

Signed-off-by: Cyrill V. Gorcunov <gorcunov@gmail.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] Check for populated zone in __drain_pages
Christoph Lameter [Sat, 6 Jan 2007 00:37:02 +0000 (16:37 -0800)] 
[PATCH] Check for populated zone in __drain_pages

Both process_zones() and drain_node_pages() check for populated zones
before touching pagesets.  However, __drain_pages does not do so.

This may result in a NULL pointer dereference for pagesets in unpopulated
zones if a NUMA setup is combined with cpu hotplug.

Initially the unpopulated zone has the pcp pointers pointing to the boot
pagesets.  Since the zone is not populated the boot pageset pointers will
not be changed during page allocator and slab bootstrap.

If a cpu is later brought down (first call to __drain_pages()) then the pcp
pointers for cpus in unpopulated zones are set to NULL since __drain_pages
does not first check for an unpopulated zone.

If the cpu is then brought up again then we call process_zones() which will
ignore the unpopulated zone.  So the pageset pointers will still be NULL.

If the cpu is then again brought down then __drain_pages will attempt to
drain pages by following the NULL pageset pointer for unpopulated zones.
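
The shape of the missing check can be illustrated with a small model.  The
struct layout and the sketch_* names are hypothetical; only the idea of
skipping zones with no present pages comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a zone. */
typedef struct {
    unsigned long present_pages;    /* 0 means the zone is unpopulated */
    int *pageset_count;             /* may be NULL for unpopulated zones */
} sketch_zone;

static int sketch_populated_zone(const sketch_zone *zone)
{
    return zone->present_pages != 0;
}

/* Drain every populated zone; returns how many zones were drained. */
static unsigned sketch_drain_pages(sketch_zone *zones, size_t nr_zones)
{
    unsigned drained = 0;
    size_t i;

    for (i = 0; i < nr_zones; i++) {
        if (!sketch_populated_zone(&zones[i]))
            continue;               /* the check this commit adds */
        assert(zones[i].pageset_count != NULL);
        *zones[i].pageset_count = 0;
        drained++;
    }
    return drained;
}
```

Without the early continue, the loop would dereference the NULL pointer of
the unpopulated zone, which is the failure the commit message walks through.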

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] hpt37x: Two important bug fixes
Alan [Sat, 6 Jan 2007 00:37:01 +0000 (16:37 -0800)] 
[PATCH] hpt37x: Two important bug fixes

The HPT37x driver very carefully handles DMA completions and the needed
fixups are done on pci registers 0x50 and 0x52.  This is unfortunate
because the actual registers are 0x50 and 0x54.  Fixing this offset cures
the second channel problems reported.

Secondly there are some problems with the HPT370 and certain ATA drives.
The filter code however only filters ATAPI devices due to a reversed type
check.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] pata_optidma: typo in Kconfig
Alexey Dobriyan [Sat, 6 Jan 2007 00:37:00 +0000 (16:37 -0800)] 
[PATCH] pata_optidma: typo in Kconfig

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Simplify test for interrupt window
Dor Laor [Sat, 6 Jan 2007 00:37:00 +0000 (16:37 -0800)] 
[PATCH] KVM: Simplify test for interrupt window

No need to test for rflags.if as both VT and SVM specs assure us that on an
exit caused by the interrupt window opening, 'if' is set.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Simplify mmu_alloc_roots()
Ingo Molnar [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)] 
[PATCH] KVM: Simplify mmu_alloc_roots()

Small optimization/cleanup:

    page == page_header(page->page_hpa)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Make loading cr3 more robust
Ingo Molnar [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)] 
[PATCH] KVM: Make loading cr3 more robust

Prevent the guest's loading of a corrupt cr3 (pointing at no guest physical
page) from crashing the host.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Add missing dirty bit
Avi Kivity [Sat, 6 Jan 2007 00:36:59 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Add missing dirty bit

If we emulate a write, we fail to set the dirty bit on the guest pte, leading
the guest to believe the page is clean, and thus lose data.  Bad.

Fix by setting the guest pte dirty bit under such conditions.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Don't set guest cr3 from vmx_vcpu_setup()
Avi Kivity [Sat, 6 Jan 2007 00:36:58 +0000 (16:36 -0800)] 
[PATCH] KVM: Don't set guest cr3 from vmx_vcpu_setup()

It overwrites the right cr3 set from mmu setup.  Happens only with the test
harness.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Add missing 'break'
Avi Kivity [Sat, 6 Jan 2007 00:36:58 +0000 (16:36 -0800)] 
[PATCH] KVM: Add missing 'break'

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Avoid oom on cr3 switch
Ingo Molnar [Sat, 6 Jan 2007 00:36:57 +0000 (16:36 -0800)] 
[PATCH] KVM: Avoid oom on cr3 switch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Initialize vcpu->kvm a little earlier
Avi Kivity [Sat, 6 Jan 2007 00:36:57 +0000 (16:36 -0800)] 
[PATCH] KVM: Initialize vcpu->kvm a little earlier

Fixes oops on early close of /dev/kvm.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Improve reporting of vmwrite errors
Avi Kivity [Sat, 6 Jan 2007 00:36:56 +0000 (16:36 -0800)] 
[PATCH] KVM: Improve reporting of vmwrite errors

This will allow us to see the root cause when a vmwrite error happens.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: add audit code to check mappings, etc are correct
Avi Kivity [Sat, 6 Jan 2007 00:36:56 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: add audit code to check mappings, etc are correct

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Destroy mmu while we still have a vcpu left
Avi Kivity [Sat, 6 Jan 2007 00:36:55 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Destroy mmu while we still have a vcpu left

mmu_destroy flushes the guest tlb (indirectly), which needs a valid vcpu.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Flush guest tlb when reducing permissions on a pte
Avi Kivity [Sat, 6 Jan 2007 00:36:55 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Flush guest tlb when reducing permissions on a pte

If we reduce permissions on a pte, we must flush the cached copy of the pte
from the guest's tlb.

This is implemented at the moment by flushing the entire guest tlb, and can be
improved by flushing just the relevant virtual address, if it is known.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace
Avi Kivity [Sat, 6 Jan 2007 00:36:54 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects
Avi Kivity [Sat, 6 Jan 2007 00:36:53 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects

The mmu sometimes needs memory for reverse mapping and parent pte chains.
However, we can't allocate from within the mmu because of the atomic context.

So, move the allocations to a central place that can be executed before the
main mmu machinery, where we can bail out on failure before any damage is
done.

(error handling is deferred for now, but the basic structure is there)

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Free pages on kvm destruction
Avi Kivity [Sat, 6 Jan 2007 00:36:52 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Free pages on kvm destruction

Because mmu pages have attached rmap and parent pte chain structures, we need
to zap them before freeing so the attached structures are freed.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Treat user-mode faults as a hint that a page is no longer a page...
Avi Kivity [Sat, 6 Jan 2007 00:36:52 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Treat user-mode faults as a hint that a page is no longer a page table

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Fix cmpxchg8b emulation
Avi Kivity [Sat, 6 Jan 2007 00:36:51 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Fix cmpxchg8b emulation

cmpxchg8b uses edx:eax as the compare operand, not edi:eax.

cmpxchg8b is used by 32-bit pae guests to set page table entries atomically,
and it is emulated when it touches shadowed guest page tables.

Also, implement it for 32-bit hosts.
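
For reference, the architectural behaviour of cmpxchg8b can be modelled in
plain C as below.  This is a non-atomic sketch of the instruction's register
usage, not the emulator code from this patch; the sketch_regs struct and
function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file holding only what cmpxchg8b uses. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    int zf;                         /* zero flag result */
} sketch_regs;

/*
 * Compare the 64-bit memory operand with edx:eax.  If equal, store
 * ecx:ebx to memory and set ZF; otherwise load memory into edx:eax
 * and clear ZF.  (The real instruction does this atomically when
 * used with a lock prefix; this model does not.)
 */
static void sketch_cmpxchg8b(uint64_t *mem, sketch_regs *r)
{
    uint64_t expected;

    assert(mem != 0 && r != 0);
    expected = ((uint64_t)r->edx << 32) | r->eax;
    if (*mem == expected) {
        *mem = ((uint64_t)r->ecx << 32) | r->ebx;
        r->zf = 1;
    } else {
        r->eax = (uint32_t)*mem;
        r->edx = (uint32_t)(*mem >> 32);
        r->zf = 0;
    }
}
```

Building the compare operand from edi instead of edx, as the old emulation
did, makes the comparison fail for essentially every real page table update.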

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Never free a shadow page actively serving as a root
Avi Kivity [Sat, 6 Jan 2007 00:36:51 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Never free a shadow page actively serving as a root

We always need cr3 to point to something valid, so if we detect that we're
freeing a root page, simply push it back to the top of the active list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Page table write flood protection
Avi Kivity [Sat, 6 Jan 2007 00:36:50 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Page table write flood protection

In fork() (or when we protect a page that is no longer a page table), we can
experience floods of writes to a page, which have to be emulated.  This is
expensive.

So, if we detect such a flood, zap the page so subsequent writes can proceed
natively.
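
One simple way to detect such a flood is to count consecutive emulated writes
that hit the same guest frame.  The sketch below shows that idea only; the
state struct, the function name and the caller-supplied threshold are
assumptions, not the values used by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping for the last emulated pte write. */
typedef struct {
    uint64_t last_gfn;              /* guest frame of the previous write */
    unsigned count;                 /* consecutive writes to last_gfn */
} sketch_flood_state;

/*
 * Record an emulated write to gfn.  Returns nonzero once `threshold`
 * consecutive writes have hit the same frame, which the caller would
 * treat as a hint to zap the shadow page.
 */
static int sketch_note_pt_write(sketch_flood_state *s, uint64_t gfn,
                                unsigned threshold)
{
    assert(s != 0 && threshold > 0);

    if (s->count > 0 && gfn == s->last_gfn) {
        s->count++;
    } else {
        s->last_gfn = gfn;          /* different frame: restart the run */
        s->count = 1;
    }
    return s->count >= threshold;
}
```

A write to any other frame resets the run, so ordinary scattered page table
updates never reach the threshold in this model.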

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: If an empty shadow page is not empty, report more info
Avi Kivity [Sat, 6 Jan 2007 00:36:50 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: If an empty shadow page is not empty, report more info

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Ensure freed shadow pages are clean
Avi Kivity [Sat, 6 Jan 2007 00:36:49 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Ensure freed shadow pages are clean

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Move is_empty_shadow_page() above kvm_mmu_free_page()
Avi Kivity [Sat, 6 Jan 2007 00:36:49 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Move is_empty_shadow_page() above kvm_mmu_free_page()

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Handle misaligned accesses to write protected guest page tables
Avi Kivity [Sat, 6 Jan 2007 00:36:48 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Handle misaligned accesses to write protected guest page tables

A misaligned access affects two shadow ptes instead of just one.

Since a misaligned access is unlikely to occur on a real page table, just zap
the page out of existence, avoiding further trouble.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Remove release_pt_page_64()
Avi Kivity [Sat, 6 Jan 2007 00:36:48 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Remove release_pt_page_64()

Unused.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Remove invlpg interception
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Remove invlpg interception

Since we write protect shadowed guest page tables, there is no need to trap
page invalidations (the guest will always change the mapping before issuing
the invlpg instruction).

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: oom handling
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: oom handling

When beginning to process a page fault, make sure we have enough shadow pages
available to service the fault.  If not, free some pages.
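
A minimal sketch of the idea, assuming a simple pool with counters.  The
structure and function names are hypothetical; the real code frees pages
by zapping shadow page tables rather than adjusting counters.

```c
/* Hypothetical bookkeeping for the shadow page pool. */
struct shadow_pool_sketch {
        unsigned free_pages;    /* shadow pages ready for use */
        unsigned used_pages;    /* shadow pages currently holding mappings */
};

/*
 * Make sure at least `needed` shadow pages are free before the fault is
 * handled.  Returns 0 on success, -1 if nothing is left to reclaim.
 */
static int make_room_sketch(struct shadow_pool_sketch *pool, unsigned needed)
{
        while (pool->free_pages < needed) {
                if (pool->used_pages == 0)
                        return -1;
                /* A real implementation would zap one shadow page here. */
                pool->used_pages--;
                pool->free_pages++;
        }
        return 0;
}
```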

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: kvm_mmu_put_page() only removes one link to the page
Avi Kivity [Sat, 6 Jan 2007 00:36:47 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: kvm_mmu_put_page() only removes one link to the page

...  and so must not free it unconditionally.

Move the freeing to kvm_mmu_zap_page().

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Implement child shadow unlinking
Avi Kivity [Sat, 6 Jan 2007 00:36:46 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Implement child shadow unlinking

When removing a page table, we must maintain the parent_pte field of all
child shadow page tables.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: If emulating an instruction fails, try unprotecting the page
Avi Kivity [Sat, 6 Jan 2007 00:36:45 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: If emulating an instruction fails, try unprotecting the page

A page table may have been recycled into a regular page, and so any
instruction can be executed on it.  Unprotect the page and let the cpu do its
thing.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Zap shadow page table entries on writes to guest page tables
Avi Kivity [Sat, 6 Jan 2007 00:36:45 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Zap shadow page table entries on writes to guest page tables

Iterate over all shadow pages which correspond to the given guest page table
and remove the mappings.

A subsequent page fault will reestablish the new mapping.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Support emulated writes into RAM
Avi Kivity [Sat, 6 Jan 2007 00:36:44 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Support emulated writes into RAM

As the mmu write protects guest page tables, we emulate those writes.  Since
they are not mmio, there is no need to go to userspace to perform them.

So, perform the writes in the kernel if possible, and notify the mmu about
them so it can take the appropriate action.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Let the walker extract the target page gfn from the pte
Avi Kivity [Sat, 6 Jan 2007 00:36:44 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Let the walker extract the target page gfn from the pte

This fixes a problem where set_pte_common() looked for shadowed pages based on
the page directory gfn (a huge page) instead of the actual gfn being mapped.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Write protect guest pages when a shadow is created for them
Avi Kivity [Sat, 6 Jan 2007 00:36:43 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Write protect guest pages when a shadow is created for them

When we cache a guest page table into a shadow page table, we need to prevent
further access to that page by the guest, as that would render the cache
incoherent.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Shadow page table caching
Avi Kivity [Sat, 6 Jan 2007 00:36:43 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Shadow page table caching

Define a hashtable for caching shadow page tables. Look up the cache on
context switch (cr3 change) or during page faults.

The key to the cache is a combination of
- the guest page table frame number
- the number of paging levels in the guest
   * we can cache real mode, 32-bit mode, pae, and long mode page
     tables simultaneously.  this is useful for smp bootup.
- the guest page table level
   * some kernels use a page as both a page table and a page directory.  this
     allows multiple shadow pages to exist for that page, one per level
- the "quadrant"
   * 32-bit mode page tables span 4MB, whereas a shadow page table spans
     2MB.  similarly, a 32-bit page directory spans 4GB, while a shadow
     page directory spans 1GB.  the quadrant allows caching up to 4 shadow page
     tables for one guest page in one level.
- a "metaphysical" bit
   * for real mode, and for pse pages, there is no guest page table, so set
     the bit to avoid write protecting the page.
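
The key described above might be sketched as follows.  All names, field
widths and the hash function are hypothetical and chosen for illustration;
only the five components and the 2MB/4MB and 1GB/4GB spans come from the
description.

```c
#include <stdint.h>

/* Hypothetical cache key; one field per component listed above. */
struct shadow_key {
        uint64_t gfn;           /* guest page table frame number */
        unsigned glevels;       /* guest paging levels (0 for real mode) */
        unsigned level;         /* level the guest page is used at */
        unsigned quadrant;      /* part of the guest page this shadow covers */
        unsigned metaphysical;  /* 1 when no guest page table exists */
};

static int shadow_key_equal(const struct shadow_key *a,
                            const struct shadow_key *b)
{
        return a->gfn == b->gfn && a->glevels == b->glevels &&
               a->level == b->level && a->quadrant == b->quadrant &&
               a->metaphysical == b->metaphysical;
}

/* Illustrative bucket selection: hash on the frame number only. */
static unsigned shadow_hash(uint64_t gfn, unsigned buckets)
{
        return (unsigned)(gfn % buckets);
}

/* A 32-bit guest page table spans 4MB, a shadow one 2MB: bit 21 picks the half. */
static unsigned pt_quadrant(uint32_t gva)
{
        return (gva >> 21) & 1;
}

/* A 32-bit guest page directory spans 4GB, a shadow one 1GB: bits 30-31. */
static unsigned pd_quadrant(uint32_t gva)
{
        return (gva >> 30) & 3;
}
```

Two keys that differ only in the quadrant refer to different shadow pages
for the same guest page, which is what allows up to four shadows per guest
page at one level.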

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Make kvm_mmu_alloc_page() return a kvm_mmu_page pointer
Avi Kivity [Sat, 6 Jan 2007 00:36:42 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Make kvm_mmu_alloc_page() return a kvm_mmu_page pointer

This allows further manipulation on the shadow page table.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Make the shadow page tables also special-case pae
Avi Kivity [Sat, 6 Jan 2007 00:36:41 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Make the shadow page tables also special-case pae

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Use the guest pdptrs instead of mapping cr3 in pae mode
Avi Kivity [Sat, 6 Jan 2007 00:36:41 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Use the guest pdptrs instead of mapping cr3 in pae mode

This lets us not write protect a partial page, and is anyway what a real
processor does.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Special treatment for shadow pae root pages
Avi Kivity [Sat, 6 Jan 2007 00:36:40 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Special treatment for shadow pae root pages

Since we're not going to cache the pae-mode shadow root pages, allocate a
single pae shadow that will hold the four lower-level pages, which will act as
roots.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Fold fetch_guest() into init_walker()
Avi Kivity [Sat, 6 Jan 2007 00:36:40 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Fold fetch_guest() into init_walker()

It is never necessary to fetch a guest entry from an intermediate page table
level (except for large pages), so avoid some confusion by always descending
into the lowest possible level.

Rename init_walker() to walk_addr() as it is no longer restricted to
initialization.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Load the pae pdptrs on cr3 change like the processor does
Avi Kivity [Sat, 6 Jan 2007 00:36:39 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Load the pae pdptrs on cr3 change like the processor does

In pae mode, a load of cr3 loads the four third-level page table entries in
addition to cr3 itself.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Teach the page table walker to track guest page table gfns
Avi Kivity [Sat, 6 Jan 2007 00:36:39 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Teach the page table walker to track guest page table gfns

Saving the table gfns removes the need to walk the guest and host page tables
in lockstep.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: MMU: Implement simple reverse mapping
Avi Kivity [Sat, 6 Jan 2007 00:36:38 +0000 (16:36 -0800)] 
[PATCH] KVM: MMU: Implement simple reverse mapping

Keep in each host page frame's page->private a pointer to the shadow pte which
maps it.  If there are multiple shadow ptes mapping the page, set bit 0 of
page->private, and use the rest as a pointer to a linked list of all such
mappings.

Reverse mappings are needed because when we cache shadow page tables, we
must protect the guest page tables from being modified by the guest, as that
would invalidate the cached ptes.
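
The bit 0 encoding can be sketched as below, with a plain uintptr_t
standing in for page->private.  The node type and helper names are
hypothetical; the sketch relies on pointers to ptes and list nodes being
at least 2-byte aligned, so bit 0 of a real pointer is always clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node used when several shadow ptes map one page. */
struct rmap_node {
        uint64_t *shadow_pte;
        struct rmap_node *next;
};

/* Single mapping: store the shadow pte pointer directly (bit 0 clear). */
static uintptr_t rmap_encode_single(uint64_t *spte)
{
        return (uintptr_t)spte;
}

/* Multiple mappings: store the list head with bit 0 set as a tag. */
static uintptr_t rmap_encode_list(struct rmap_node *head)
{
        return (uintptr_t)head | 1;
}

static int rmap_is_list(uintptr_t word)
{
        return (int)(word & 1);
}

static struct rmap_node *rmap_list(uintptr_t word)
{
        return (struct rmap_node *)(word & ~(uintptr_t)1);
}

/* Count the shadow ptes that map the page described by `word`. */
static size_t rmap_count(uintptr_t word)
{
        struct rmap_node *node;
        size_t n = 0;

        if (!word)
                return 0;
        if (!rmap_is_list(word))
                return 1;
        for (node = rmap_list(word); node; node = node->next)
                n++;
        return n;
}
```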

Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Prevent stale bits in cr0 and cr4
Avi Kivity [Sat, 6 Jan 2007 00:36:38 +0000 (16:36 -0800)] 
[PATCH] KVM: Prevent stale bits in cr0 and cr4

Hardware virtualization implementations allow the guests to freely change some
of the bits in cr0 and cr4, but trap when changing the other bits.  This is
useful to avoid excessive exits due to changing, for example, the ts flag.

It also means the kvm's copy of cr0 and cr4 may be stale with respect to these
bits.  Most of the time this doesn't matter as these bits are not very
interesting.  Other times, however (for example when returning cr0 to
userspace), they are, so get the fresh contents of these bits from the guest
by means of a new arch operation.
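
Refreshing the cached register can be pictured as a masked merge.  The
helper below is a hypothetical sketch rather than the arch operation the
patch adds; the only hardware fact it relies on is that the ts flag is
bit 3 of cr0.

```c
#include <stdint.h>

#define CR0_TS (1u << 3)        /* task-switched flag, bit 3 of cr0 */

/*
 * Take the bits the guest may change freely from the hardware value and
 * keep every other bit from the cached copy.
 */
static uint64_t refresh_guest_owned_bits(uint64_t cached, uint64_t hw,
                                         uint64_t guest_owned_mask)
{
        return (cached & ~guest_owned_mask) | (hw & guest_owned_mask);
}
```

With only CR0_TS in the mask, a guest that set ts behind kvm's back turns a
cached 0x80000001 into 0x80000009, and no other bit is disturbed.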

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] Update the rtc-rs5c372 driver
David Brownell [Sat, 6 Jan 2007 00:36:37 +0000 (16:36 -0800)] 
[PATCH] Update the rtc-rs5c372 driver

 Bugfixes:
  - Handle RTCs which are configured to use 12-hour mode.
  - Never report bogus/un-initialized times.
  - Displaying "raw trim" requires not masking it first!
  - Fix the sysfs and procfs display of crystal and trim data.

 Features:
  - Handle other RTCs in this family, notably rv5c386/rv5c387.
  - Declare the other registers.
  - Provide alarm get/set functionality.
  - Handle AIE and UIE; but no IRQ handling yet.

 Cleanup:
  - Shrink object by not including needless sysfs or procfs support
  - We don't need no steenkin' forward declarations.  (Except one.)

Until the I2C framework merges "new style" driver support, matching
the driver model better, using rv5c chips or alarm IRQs requires a
separate board-specific patch.  (And an IRQ handler, handing off labor
through a work_struct...)

This uses the "method 3" register reads, but notes that it's done
to work around an evident i2c adapter driver bug.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] fix BUG_ON(!PageSlab) from fallback_alloc
Hugh Dickins [Sat, 6 Jan 2007 00:36:36 +0000 (16:36 -0800)] 
[PATCH] fix BUG_ON(!PageSlab) from fallback_alloc

pdflush hit the BUG_ON(!PageSlab(page)) in kmem_freepages called from
fallback_alloc: cache_grow already freed those pages when alloc_slabmgmt
failed.  But it wouldn't have freed them if __GFP_NO_GROW, so make sure
fallback_alloc doesn't waste its time on that case.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Pekka J Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] fix memory corruption from misinterpreted bad_inode_ops return values
Eric Sandeen [Sat, 6 Jan 2007 00:36:36 +0000 (16:36 -0800)] 
[PATCH] fix memory corruption from misinterpreted bad_inode_ops return values

CVE-2006-5753 is for a case where an inode can be marked bad, switching
the ops to bad_inode_ops, which are all connected as:

static int return_EIO(void)
{
        return -EIO;
}

#define EIO_ERROR ((void *) (return_EIO))

static struct inode_operations bad_inode_ops =
{
        .create         = EIO_ERROR,
...etc...

The problem here is that the void cast causes return types to not be
promoted, and for ops such as listxattr which expect more than 32 bits of
return value, the 32-bit -EIO is interpreted as a large positive 64-bit
number, i.e. 0x00000000fffffffb instead of 0xfffffffffffffffb.

This goes particularly badly when the return value is taken as a number of
bytes to copy into, say, a user's buffer for example...
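
The effect can be reproduced in userspace without calling through a
mismatched function pointer (which would be undefined behaviour).  The two
helpers below are illustrative only: one models the sign extension the
caller expects, the other models the upper 32 bits being left as zero.

```c
#include <errno.h>
#include <stdint.h>

/* What a caller expecting a 64-bit return wants: -EIO sign-extended. */
static int64_t widen_signed(int32_t ret)
{
        return (int64_t)ret;
}

/* What it effectively gets when only the low 32 bits carry the value. */
static int64_t widen_zero_extended(int32_t ret)
{
        return (int64_t)(uint32_t)ret;
}
```

widen_signed(-EIO) is still negative, whereas widen_zero_extended(-EIO) is
a positive number just under 2^32, which a caller would happily treat as a
byte count.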

I originally had coded up the fix by creating a return_EIO_<TYPE> macro
for each return type, like this:

static int return_EIO_int(void)
{
        return -EIO;
}
#define EIO_ERROR_INT ((void *) (return_EIO_int))

static struct inode_operations bad_inode_ops =
{
        .create = EIO_ERROR_INT,
...etc...

but Al felt that it was probably better to create an EIO-returner for each
actual op signature.  Since so few ops share a signature, I just went ahead
& created an EIO function for each individual file & inode op that returns
a value.
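
A userspace sketch of the per-signature approach follows.  The struct, the
stub names and the simplified signatures are hypothetical stand-ins, not
the kernel's inode_operations; the point is only that each stub has
exactly the return type of the op it fills, so no cast is needed.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

struct dentry;  /* opaque here; only pointers to it are used */

/* One stub per op signature, each returning -EIO in its own return type. */
static int bad_create_stub(struct dentry *dentry, int mode)
{
        (void)dentry;
        (void)mode;
        return -EIO;
}

static ssize_t bad_listxattr_stub(struct dentry *dentry, char *buffer,
                                  size_t size)
{
        (void)dentry;
        (void)buffer;
        (void)size;
        return -EIO;
}

/* Simplified stand-in for an operations table. */
struct ops_sketch {
        int (*create)(struct dentry *, int);
        ssize_t (*listxattr)(struct dentry *, char *, size_t);
};

static const struct ops_sketch bad_ops_sketch = {
        .create         = bad_create_stub,
        .listxattr      = bad_listxattr_stub,
};
```

Because the compiler now sees matching prototypes, the ssize_t result of
listxattr is a properly sign-extended negative value.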

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] ip2 warning fix
Andrew Morton [Sat, 6 Jan 2007 00:36:35 +0000 (16:36 -0800)] 
[PATCH] ip2 warning fix

Make this:

drivers/char/ip2/ip2main.c: In function 'ip2_loadmain':
drivers/char/ip2/ip2main.c:654: warning: control may reach end of non-void function 'iiSetAddress' being inlined
drivers/char/ip2/ip2main.c:808: warning: control may reach end of non-void function 'iiInitialize' being inlined

go away.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] i386: modpost smpboot code warning fix
Vivek Goyal [Sat, 6 Jan 2007 00:36:34 +0000 (16:36 -0800)] 
[PATCH] i386: modpost smpboot code warning fix

o Currently synchronize_tsc_ap() is of type __init. It is called by
  smp_callin() which is of type __cpuinit. So synchronize_tsc_ap()
  should be of type __cpuinit.

o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and
  CONFIG_HOTPLUG_CPU=y

WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary'

o tsc is of type __initdata. It should be of type __cpuinitdata.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] i386: fix another modpost warning
Vivek Goyal [Sat, 6 Jan 2007 00:36:34 +0000 (16:36 -0800)] 
[PATCH] i386: fix another modpost warning

o MODPOST generates warning for i386 if kernel is compiled with
  CONFIG_RELOCATABLE=y

WARNING: vmlinux - Section mismatch: reference to .init.data: from .data between 'this_cpu' (at offset 0xc05194d0) and 'cpuinfo_op'

o this_cpu pointer should be of type __cpuinitdata.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] i386: fix modpost warning in SMP trampoline code
Vivek Goyal [Sat, 6 Jan 2007 00:36:33 +0000 (16:36 -0800)] 
[PATCH] i386: fix modpost warning in SMP trampoline code

o MODPOST generates warning for i386 if kernel is compiled with
  CONFIG_RELOCATABLE=y

WARNING: vmlinux - Section mismatch: reference to .init.text:startup_32_smp
from .data between 'trampoline_data' (at offset 0xc0519cf8) and 'boot_gdt'

o trampoline code/data can go into init section if CPU hotplug is not
  enabled.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] Sanely size hash tables when using large base pages
Paul Mundt [Sat, 6 Jan 2007 00:36:30 +0000 (16:36 -0800)] 
[PATCH] Sanely size hash tables when using large base pages

At the moment the inode/dentry cache hash tables (common by way of
alloc_large_system_hash()) are incorrectly sized by their respective
detection logic when we attempt to use large base pages on systems with
little memory.

This results in odd behaviour when using a 64kB PAGE_SIZE, such as:

Dentry cache hash table entries: 8192 (order: -1, 32768 bytes)
Inode-cache hash table entries: 4096 (order: -2, 16384 bytes)

The mount cache hash table is seemingly the only one that gets this right
by directly taking PAGE_SIZE into account.

The following patch attempts to catch the bogus values and round them up to
at least 0-order.
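
The rounding can be sketched as a simple clamp.  The function name and
signature are hypothetical; bucket_size and page_size are assumed to be
powers of two so that the result stays a power of two.

```c
#include <stddef.h>

/*
 * If the table would be smaller than one page, grow the entry count so
 * that it fills exactly one page (order 0).
 */
static unsigned long clamp_hash_entries(unsigned long entries,
                                        size_t bucket_size,
                                        unsigned long page_size)
{
        if (entries * bucket_size < page_size)
                entries = page_size / bucket_size;
        return entries;
}
```

With the figures quoted above (8192 entries in 32768 bytes, so 4 bytes per
entry) and a 64kB PAGE_SIZE, the clamp yields 16384 entries, which is one
full page.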

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] i386: Restore CONFIG_PHYSICAL_START option
Vivek Goyal [Sat, 6 Jan 2007 00:36:30 +0000 (16:36 -0800)] 
[PATCH] i386: Restore CONFIG_PHYSICAL_START option

o Relocatable bzImage support removed the CONFIG_PHYSICAL_START option on
  the assumption that it was no longer required, since people can build a
  second kernel as relocatable and load it anywhere, so there was no need
  to compile the kernel for a custom address. But Magnus uses vmlinux
  images for the second kernel in a Xen environment and wants to continue
  to do so.

o Restoring the CONFIG_PHYSICAL_START option for the time being. I think
  down the line we can get rid of it.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] profiling: fix sched profiling typo
Ingo Molnar [Sat, 6 Jan 2007 00:36:29 +0000 (16:36 -0800)] 
[PATCH] profiling: fix sched profiling typo

Fix sched profiling typo, introduced by the sleep profiling patch.  This
bug caused profile=sched to not work.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] swsusp: Do not fail if resume device is not set
Rafael J. Wysocki [Sat, 6 Jan 2007 00:36:28 +0000 (16:36 -0800)] 
[PATCH] swsusp: Do not fail if resume device is not set

In the kernels later than 2.6.19 there is a regression that makes swsusp
fail if the resume device is not explicitly specified.

It can be fixed by adding an additional parameter to
mm/swapfile.c:swap_type_of() allowing us to pass the (struct block_device
*) corresponding to the first available swap back to the caller.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] adfs: fix filename handling
James Bursa [Sat, 6 Jan 2007 00:36:28 +0000 (16:36 -0800)] 
[PATCH] adfs: fix filename handling

Fix filenames on adfs discs being terminated at the first character greater
than 128 (adfs filenames are Latin 1).  I saw this problem when using a
loopback adfs image on a 2.6.17-rc5 x86_64 machine, and the patch fixed it
there.
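
One way such a bug can arise is a terminator test done on signed chars.
The sketch below is an assumption about the failure mode, not a copy of
the adfs code: it treats any byte below '!' as the end of the name and
contrasts a signed comparison with an unsigned one.

```c
#include <stddef.h>

/* Signed comparison: Latin 1 bytes with the top bit set look negative. */
static size_t name_len_signed(const unsigned char *name, size_t max)
{
        size_t i;

        for (i = 0; i < max; i++)
                if ((signed char)name[i] < '!')
                        break;
        return i;
}

/* Unsigned comparison: only genuine control characters end the name. */
static size_t name_len_unsigned(const unsigned char *name, size_t max)
{
        size_t i;

        for (i = 0; i < max; i++)
                if (name[i] < '!')
                        break;
        return i;
}
```

For the bytes 'c', 'a', 'f', 0xe9 followed by a carriage return, the signed
variant stops after three characters and the unsigned variant keeps all
four.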

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] atiixp: Old drivers/ide layer driver for the ATIIXP hang fix
Alan [Sat, 6 Jan 2007 00:36:27 +0000 (16:36 -0800)] 
[PATCH] atiixp: Old drivers/ide layer driver for the ATIIXP hang fix

When the old IDE layer calls into methods in the driver during error
handling it is essentially random whether ide_lock is already held.  This
causes a deadlock in the atiixp driver which also uses ide_lock internally
for locking.

Switch to a private lock instead.

[akpm@osdl.org: cleanup]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] Fix BUG at drivers/scsi/scsi_lib.c:1118 caused by "pktsetup dvd /dev/sr0"
Christoph Hellwig [Sat, 6 Jan 2007 00:36:26 +0000 (16:36 -0800)] 
[PATCH] Fix BUG at drivers/scsi/scsi_lib.c:1118 caused by "pktsetup dvd /dev/sr0"

Fix http://bugzilla.kernel.org/show_bug.cgi?id=7667

This is because the packet driver tries to send down read/write BLOCK_PC
commands that don't use a bio and do not use sg lists.

The right fix is to replace all the packet_command stuff in the packet
driver by scsi_execute() which needs to be lifted from scsi code to
the block code for that.

Fix the bug for now.  It's not the full way to a generic execute block pc
infrastructure but fixes the bug for the time being.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] rtc-at91rm9200 build fix
David Brownell [Sat, 6 Jan 2007 00:36:25 +0000 (16:36 -0800)] 
[PATCH] rtc-at91rm9200 build fix

The at91rm9200 RTC driver needs some assistance to build, because of recent
header file rearrangement.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Cc: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Improve interrupt response
Dor Laor [Sat, 6 Jan 2007 00:36:24 +0000 (16:36 -0800)] 
[PATCH] KVM: Improve interrupt response

The current interrupt injection mechanism might delay an interrupt under
the following circumstances:

 - if injection fails because the guest is not interruptible (rflags.IF clear,
   or after a 'mov ss' or 'sti' instruction).  Userspace can check rflags,
   but the other cases are not testable under the current API.
 - if injection fails because of a fault during delivery.  This probably
   never happens under normal guests.
 - if injection fails due to a physical interrupt causing a vmexit so that
   it can be handled by the host.

In all cases the guest proceeds without processing the interrupt, reducing
the interactive feel and interrupt throughput of the guest.

This patch fixes the situation by allowing userspace to request an exit
when the 'interrupt window' opens, so that it can re-inject the interrupt
at the right time.  Guest interactivity is very visibly improved.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Recover after an arch module load failure
Yoshimi Ichiyanagi [Sat, 6 Jan 2007 00:36:24 +0000 (16:36 -0800)] 
[PATCH] KVM: Recover after an arch module load failure

If we load the wrong arch module, it leaves behind kvm_arch_ops set, which
prevents loading of the correct arch module later.

Fix by not setting kvm_arch_ops until we're sure it's good.
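
The ordering can be sketched as follows.  Every name here is a
hypothetical stand-in (current_ops plays the role of kvm_arch_ops, and the
two callbacks are example checks); the point is that the global is
assigned only after all checks have passed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of an arch ops table. */
struct arch_ops_sketch {
        int (*cpu_has_support)(void);
        int (*disabled_by_bios)(void);
};

static struct arch_ops_sketch *current_ops;     /* stands in for kvm_arch_ops */

static int register_arch_sketch(struct arch_ops_sketch *ops)
{
        if (current_ops)
                return -EEXIST;
        if (!ops->cpu_has_support())
                return -EOPNOTSUPP;
        if (ops->disabled_by_bios())
                return -EOPNOTSUPP;

        current_ops = ops;      /* set only once the module is known good */
        return 0;
}

/* Example modules: one for the wrong cpu type, one for the right one. */
static int yes(void) { return 1; }
static int no(void) { return 0; }

static struct arch_ops_sketch wrong_arch = {
        .cpu_has_support = no, .disabled_by_bios = no,
};
static struct arch_ops_sketch right_arch = {
        .cpu_has_support = yes, .disabled_by_bios = no,
};
```

Registering wrong_arch fails and leaves current_ops NULL, so a later
attempt with right_arch still succeeds.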

Signed-off-by: Yoshimi Ichiyanagi <ichiyanagi.yoshimi@lab.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Use raw_smp_processor_id() instead of smp_processor_id() where applicable
Ingo Molnar [Sat, 6 Jan 2007 00:36:23 +0000 (16:36 -0800)] 
[PATCH] KVM: Use raw_smp_processor_id() instead of smp_processor_id() where applicable

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] KVM: Fix GFP_KERNEL alloc in atomic section bug
Ingo Molnar [Sat, 6 Jan 2007 00:36:23 +0000 (16:36 -0800)] 
[PATCH] KVM: Fix GFP_KERNEL alloc in atomic section bug

KVM does kmalloc() in an atomic section while having preemption disabled via
vcpu_load().  Fix this by moving the ->*_msr setup from the vcpu_setup method
to the vcpu_create method.

(This is also a small speedup for setting up a vcpu, which can in theory be
more frequent than the vcpu_create method).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] via82cxxx: fix cable detection
Bartlomiej Zolnierkiewicz [Sat, 6 Jan 2007 00:36:21 +0000 (16:36 -0800)] 
[PATCH] via82cxxx: fix cable detection

This patch fixes a 2.6.15 regression, is straightforward and tested.

Cable detection got broken probably while converting the driver to support
multiple controllers.  Cable detection is done by examining how BIOS
configured the attached devices.  The current code is broken in that it
examines the status *after* modifying Clk66 configuration ending up
detecting 40c cables as 80c.  This patch fixes it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
17 years ago[PATCH] PCI: prevent down_read when pci_devices is empty
Ard van Breemen [Sat, 6 Jan 2007 00:36:21 +0000 (16:36 -0800)] 
[PATCH] PCI: prevent down_read when pci_devices is empty

The pci_find_subsys gets called very early by obsolete ide setup parameters.
This is a bogus call since pci is not initialized yet, so the list is empty.
But in the meantime, interrupts get enabled by down_read.  This can result in
a kernel panic when the irq controller gets initialized.

This patch checks if the device list is empty before taking the semaphore, and
hence will not enable irq's.  Furthermore it will inform that it is called
while pci_devices is empty as a reminder that the ide code needs to be fixed.

The pci_get_subsys can get called in the same manner, and as such is patched
in the same manner.
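
The early-out can be sketched in userspace as below.  The list type, the
counter that stands in for the semaphore and the function name are all
hypothetical; the sketch only shows the empty check happening before the
lock would be taken.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device list; device_list stands in for pci_devices. */
struct dev_sketch {
        unsigned vendor;
        struct dev_sketch *next;
};

static struct dev_sketch *device_list;
static int lock_acquisitions;   /* counts simulated down_read() calls */

static struct dev_sketch *find_device_sketch(unsigned vendor)
{
        struct dev_sketch *dev;

        /* Bail out before touching the lock if nothing is registered yet. */
        if (!device_list) {
                fprintf(stderr, "find_device_sketch: device list is empty\n");
                return NULL;
        }

        lock_acquisitions++;    /* down_read() would be taken here */
        for (dev = device_list; dev; dev = dev->next)
                if (dev->vendor == vendor)
                        break;
        /* up_read() would be released here */
        return dev;
}
```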

[akpm@osdl.org: cleanups]
Signed-off-by: Ard van Breemen <ard@telegraafnet.nl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>