linux-2.6
16 years ago[POWERPC] mpc512x: Device tree for MPC5121 ADS
John Rigby [Mon, 28 Jan 2008 17:28:54 +0000 (04:28 +1100)] 
[POWERPC] mpc512x: Device tree for MPC5121 ADS

Minimal /dts-v1/ device tree for mpc5121 ads.

port-number property in uart nodes
will go away after the driver learns to use aliases

Signed-off-by: John Rigby <jrigby@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
16 years ago[POWERPC] mpc512x: Basic platform support
John Rigby [Mon, 28 Jan 2008 17:28:53 +0000 (04:28 +1100)] 
[POWERPC] mpc512x: Basic platform support

512x is very similar to 83xx and most
of this is patterned after code from 83xx.

New platform:
    changed:
arch/powerpc/Kconfig
arch/powerpc/platforms/Kconfig
arch/powerpc/platforms/Kconfig.cputype
arch/powerpc/platforms/Makefile
    new:
arch/powerpc/platforms/512x/*
include/asm-powerpc/mpc512x.h

Signed-off-by: John Rigby <jrigby@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
16 years ago[POWERPC] Cell IOMMU fixed mapping support
Michael Ellerman [Wed, 30 Jan 2008 00:03:44 +0000 (11:03 +1100)] 
[POWERPC] Cell IOMMU fixed mapping support

This patch adds support for setting up a fixed IOMMU mapping on certain
cell machines.  For 64-bit devices this avoids the performance overhead of
mapping and unmapping pages at runtime.  32-bit devices are unable to use
the fixed mapping.

The fixed mapping is established at boot, and maps all of physical memory
1:1 into device space at some offset.  On machines with < 30 GB of memory
we setup the fixed mapping immediately above the normal IOMMU window.

For example a machine with 4GB of memory would end up with the normal
IOMMU window from 0-2GB and the fixed mapping window from 2GB to 6GB. In
this case a 64-bit device wishing to DMA to 1GB would be told to DMA to
3GB, plus any offset required by firmware.  The firmware offset is encoded
in the "dma-ranges" property.

On machines with 30GB or more of memory, we are unable to place the fixed
mapping above the normal IOMMU window as we would run out of address space.
Instead we move the normal IOMMU window to coincide with the hash page
table, this region does not need to be part of the fixed mapping as no
device should ever be DMA'ing to it.  We then setup the fixed mapping
from 0 to 32GB.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Split out the ioid fetching/checking logic
Michael Ellerman [Tue, 29 Jan 2008 14:14:02 +0000 (01:14 +1100)] 
[POWERPC] Split out the ioid fetching/checking logic

Split out the ioid fetching and checking logic so we can use it elsewhere
in a subsequent patch.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Add support to cell_iommu_setup_page_tables() for multiple windows
Michael Ellerman [Tue, 29 Jan 2008 14:14:01 +0000 (01:14 +1100)] 
[POWERPC] Add support to cell_iommu_setup_page_tables() for multiple windows

Add support to cell_iommu_setup_page_tables() for handling two windows,
the dynamic window and the fixed window.  A fixed window size of 0
indicates that there is no fixed window at all.

Currently there are no callers who pass a non-zero fixed window, but the
upcoming fixed IOMMU mapping patch will change that.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Split out the IOMMU logic from cell_dma_dev_setup()
Michael Ellerman [Tue, 29 Jan 2008 14:14:01 +0000 (01:14 +1100)] 
[POWERPC] Split out the IOMMU logic from cell_dma_dev_setup()

Split the IOMMU logic out from cell_dma_dev_setup() into a separate
function.  If we're not using dma_direct_ops or dma_iommu_ops we don't
know what the hell's going on, so BUG.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Split cell_iommu_setup_hardware() into two parts
Michael Ellerman [Tue, 29 Jan 2008 14:14:00 +0000 (01:14 +1100)] 
[POWERPC] Split cell_iommu_setup_hardware() into two parts

Split cell_iommu_setup_hardware() into two parts.  Split the page table
setup into cell_iommu_setup_page_tables() and the bits that kick the
hardware into cell_iommu_enable_hardware().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Split out the logic that allocates struct iommus
Michael Ellerman [Tue, 29 Jan 2008 14:13:59 +0000 (01:13 +1100)] 
[POWERPC] Split out the logic that allocates struct iommus

Split out the logic that allocates a struct iommu into a separate
function.  This can fail however the calling code has never cared - so
just return if we can't allocate an iommu.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Allocate the hash table under 1G on cell
Michael Ellerman [Tue, 29 Jan 2008 14:13:59 +0000 (01:13 +1100)] 
[POWERPC] Allocate the hash table under 1G on cell

In order to support the fixed IOMMU mapping (in a subsequent patch),
we need the hash table to be inside the IOMMUs DMA window.  This is
usually 2G, but let's make sure the hash table is under 1G as that
will satisfy the IOMMU requirements and also means the hash table will
be on node 0.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years ago[POWERPC] Add set_dma_ops() to match get_dma_ops()
Michael Ellerman [Tue, 29 Jan 2008 14:13:58 +0000 (01:13 +1100)] 
[POWERPC] Add set_dma_ops() to match get_dma_ops()

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
16 years agoMerge branch 'linux-2.6'
Paul Mackerras [Thu, 31 Jan 2008 00:25:51 +0000 (11:25 +1100)] 
Merge branch 'linux-2.6'

16 years agoMerge branch 'for-2.6.25' of git://git.secretlab.ca/git/linux-2.6-mpc52xx
Paul Mackerras [Wed, 30 Jan 2008 23:50:17 +0000 (10:50 +1100)] 
Merge branch 'for-2.6.25' of git://git.secretlab.ca/git/linux-2.6-mpc52xx

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Wed, 30 Jan 2008 22:36:35 +0000 (09:36 +1100)] 
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] use SGI_HAS_INDYDOG for INDYDOG depends

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Wed, 30 Jan 2008 22:35:32 +0000 (09:35 +1100)] 
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
  lguest: use __PAGE_KERNEL instead of _PAGE_KERNEL
  lguest: Use explicit includes rateher than indirect
  lguest: get rid of lg variable assignments
  lguest: change gpte_addr header
  lguest: move changed bitmap to lg_cpu
  lguest: move last_pages to lg_cpu
  lguest: change last_guest to last_cpu
  lguest: change spte_addr header
  lguest: per-vcpu lguest pgdir management
  lguest: make pending notifications per-vcpu
  lguest: makes special fields be per-vcpu
  lguest: per-vcpu lguest task management
  lguest: replace lguest_arch with lg_cpu_arch.
  lguest: make registers per-vcpu
  lguest: make emulate_insn receive a vcpu struct.
  lguest: map_switcher_in_guest() per-vcpu
  lguest: per-vcpu interrupt processing.
  lguest: per-vcpu lguest timers
  lguest: make hypercalls use the vcpu struct
  lguest: make write() operation smp aware
  ...

Manual conflict resolved (maybe even correctly, who knows) in
drivers/lguest/x86/core.c

16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Wed, 30 Jan 2008 22:32:24 +0000 (09:32 +1100)] 
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/selinux-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  security: compile capabilities by default
  selinux: make selinux_set_mnt_opts() static
  SELinux: Add warning messages on network denial due to error
  SELinux: Add network ingress and egress control permission checks
  NetLabel: Add auditing to the static labeling mechanism
  NetLabel: Introduce static network labels for unlabeled connections
  SELinux: Allow NetLabel to directly cache SIDs
  SELinux: Enable dynamic enable/disable of the network access checks
  SELinux: Better integration between peer labeling subsystems
  SELinux: Add a new peer class and permissions to the Flask definitions
  SELinux: Add a capabilities bitmap to SELinux policy version 22
  SELinux: Add a network node caching mechanism similar to the sel_netif_*() functions
  SELinux: Only store the network interface's ifindex
  SELinux: Convert the netif code to use ifindex values
  NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function
  NetLabel: Add secid token support to the NetLabel secattr struct
  NetLabel: Consolidate the LSM domain mapping/hashing locks
  NetLabel: Cleanup the LSM domain hash functions
  NetLabel: Remove unneeded RCU read locks

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Linus Torvalds [Wed, 30 Jan 2008 22:31:37 +0000 (09:31 +1100)] 
Merge git://git./linux/kernel/git/gregkh/driver-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  PPC: Fix powerpc vio_find_name to not use devices_subsys
  Driver core: add bus_find_device_by_name function
  Module: check to see if we have a built in module with the same name
  x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
  Driver core: Fix up build when CONFIG_BLOCK=N

16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
Linus Torvalds [Wed, 30 Jan 2008 22:30:10 +0000 (09:30 +1100)] 
Merge branch 'for-linus' of git://git./linux/kernel/git/avi/kvm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (249 commits)
  KVM: Move apic timer migration away from critical section
  KVM: Put kvm_para.h include outside __KERNEL__
  KVM: Fix unbounded preemption latency
  KVM: Initialize the mmu caches only after verifying cpu support
  KVM: MMU: Fix dirty page setting for pages removed from rmap
  KVM: Portability: Move kvm_fpu to asm-x86/kvm.h
  KVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD
  KVM: MMU: Merge shadow level check in FNAME(fetch)
  KVM: MMU: Move kvm_free_some_pages() into critical section
  KVM: MMU: Switch to mmu spinlock
  KVM: MMU: Avoid calling gfn_to_page() in mmu_set_spte()
  KVM: Add kvm_read_guest_atomic()
  KVM: MMU: Concurrent guest walkers
  KVM: Disable vapic support on Intel machines with FlexPriority
  KVM: Accelerated apic support
  KVM: local APIC TPR access reporting facility
  KVM: Print data for unimplemented wrmsr
  KVM: MMU: Add cache miss statistic
  KVM: MMU: Coalesce remote tlb flushes
  KVM: Expose ioapic to ia64 save/restore APIs
  ...

16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
Linus Torvalds [Wed, 30 Jan 2008 22:29:31 +0000 (09:29 +1100)] 
Merge branch 'for-linus' of git://git./linux/kernel/git/teigland/dlm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: (21 commits)
  dlm: static initialization improvements
  dlm: clean ups
  dlm: Sanity check namelen before copying it
  dlm: keep cached master rsbs during recovery
  dlm: change error message to debug
  dlm: fix possible use-after-free
  dlm: limit dir lookup loop
  dlm: reject normal unlock when lock is waiting for lookup
  dlm: validate messages before processing
  dlm: reject messages from non-members
  dlm: another call to confirm_master in receive_request_reply
  dlm: recover locks waiting for overlap replies
  dlm: clear ast_type when removing from astqueue
  dlm: use fixed errno values in messages
  dlm: swap bytes for rcom lock reply
  dlm: align midcomms message buffer
  dlm: close othercons
  dlm: use dlm prefix on alloc and free functions
  dlm: don't print common non-errors
  dlm: proper prototypes
  ...

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
Linus Torvalds [Wed, 30 Jan 2008 22:28:49 +0000 (09:28 +1100)] 
Merge git://git./linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (21 commits)
  [SCSI] Revert "[SCSI] aacraid: fib context lock for management ioctls"
  [SCSI] bsg: copy the cmd_type field to the subordinate request for bidi
  [SCSI] handle scsi_init_queue failure properly
  [SCSI] destroy scsi_bidi_sdb_cache in scsi_exit_queue
  [SCSI] scsi_debug: add XDWRITEREAD_10 support
  [SCSI] scsi_debug: add bidi data transfer support
  [SCSI] scsi_debug: add get_data_transfer_info helper function
  [SCSI] remove use_sg_chaining
  [SCSI] bidirectional: fix up for the new blk_end_request code
  [SCSI] bidirectional command support
  [SCSI] implement scsi_data_buffer
  [SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable
  [SCSI] aic7xxx: fix warnings with CONFIG_PM disabled
  [SCSI] aic79xx: fix warnings with CONFIG_PM disabled
  [SCSI] aic7xxx: fix ahc_done check SCB_ACTIVE for tagged transactions
  [SCSI] sgiwd93: use cached memory access to make driver work on IP28
  [SCSI] zfcp: fix sense_buffer access bug
  [SCSI] ncr53c8xx: fix sense_buffer access bug
  [SCSI] aic79xx: fix sense_buffer access bug
  [SCSI] hptiop: fix sense_buffer access bug
  ...

16 years agodocbook: fix block api fatal error
Randy Dunlap [Wed, 30 Jan 2008 19:51:00 +0000 (11:51 -0800)] 
docbook: fix block api fatal error

Fix docbook fatal error:
docproc: linux-2.6.24-git8/block/ll_rw_blk.c: No such file or directory

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodocbook: fix drivers/base/class warning
Randy Dunlap [Wed, 30 Jan 2008 19:51:08 +0000 (11:51 -0800)] 
docbook: fix drivers/base/class warning

Fix kernel-doc empty line warning:
Warning(linux-2.6.24-git8//drivers/base/class.c:866): bad line:

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago[SCSI] Revert "[SCSI] aacraid: fib context lock for management ioctls"
James Bottomley [Tue, 29 Jan 2008 21:17:15 +0000 (16:17 -0500)] 
[SCSI] Revert "[SCSI] aacraid: fib context lock for management ioctls"

This reverts commit a119ee8ee3045bf559d4cf02d72b112f3de2a15b.

Adaptec found this was causing system lockups.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] bsg: copy the cmd_type field to the subordinate request for bidi
James Bottomley [Sat, 26 Jan 2008 02:05:55 +0000 (20:05 -0600)] 
[SCSI] bsg: copy the cmd_type field to the subordinate request for bidi

This fixes a problem in SCSI where we use the (previously
uninitialised) cmd_type via blk_pc_request() to set up the transfer in
scsi_init_sgtable().

Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] handle scsi_init_queue failure properly
FUJITA Tomonori [Fri, 25 Jan 2008 14:25:14 +0000 (23:25 +0900)] 
[SCSI] handle scsi_init_queue failure properly

scsi_init_queue is expected to clean up allocated things when it
fails.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] destroy scsi_bidi_sdb_cache in scsi_exit_queue
FUJITA Tomonori [Fri, 25 Jan 2008 14:25:13 +0000 (23:25 +0900)] 
[SCSI] destroy scsi_bidi_sdb_cache in scsi_exit_queue

Needs to call kmem_cache_destroy for scsi_bidi_sdb_cache in
scsi_exit_queue.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] scsi_debug: add XDWRITEREAD_10 support
FUJITA Tomonori [Tue, 22 Jan 2008 16:32:01 +0000 (01:32 +0900)] 
[SCSI] scsi_debug: add XDWRITEREAD_10 support

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] scsi_debug: add bidi data transfer support
FUJITA Tomonori [Tue, 22 Jan 2008 16:32:00 +0000 (01:32 +0900)] 
[SCSI] scsi_debug: add bidi data transfer support

This enables fill_from_dev_buffer and fetch_to_dev_buffer to handle
bidi commands.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] scsi_debug: add get_data_transfer_info helper function
FUJITA Tomonori [Tue, 22 Jan 2008 16:31:59 +0000 (01:31 +0900)] 
[SCSI] scsi_debug: add get_data_transfer_info helper function

This adds get_data_transfer_info helper function that get lha and
sectors for READ_* and WRITE_* commands (and XDWRITEREAD_10 later).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] remove use_sg_chaining
James Bottomley [Tue, 15 Jan 2008 17:11:46 +0000 (11:11 -0600)] 
[SCSI] remove use_sg_chaining

With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.

Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] bidirectional: fix up for the new blk_end_request code
Kiyoshi Ueda [Fri, 18 Jan 2008 17:02:15 +0000 (12:02 -0500)] 
[SCSI] bidirectional: fix up for the new blk_end_request code

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] bidirectional command support
Boaz Harrosh [Thu, 13 Dec 2007 11:50:53 +0000 (13:50 +0200)] 
[SCSI] bidirectional command support

At the block level bidi request uses req->next_rq pointer for a second
bidi_read request.
At Scsi-midlayer a second scsi_data_buffer structure is used for the
bidi_read part. This bidi scsi_data_buffer is put on
request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
  second sgtable was allocated.

- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
  from this command This API is to isolate users from the mechanics of
  bidi.

- Define scsi_end_bidi_request() to do what scsi_end_request() does but
  for a bidi request. This is necessary because bidi commands are a bit
  tricky here. (See comments in body)

- scsi_release_buffers() will also release the bidi_read scsi_data_buffer

- scsi_io_completion() on bidi commands will now call
  scsi_end_bidi_request() and return.

- The previous work done in scsi_init_io() is now done in a new
  scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
  The new scsi_init_io() will call the above twice if needed also for
  the bidi_read command. Only at this point is a command bidi.

- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
  confused by a get-sense command that looks like bidi. This is done
  by puting NULL at request->next_rq, and restoring.

[jejb: update to sg_table and resolve conflicts
also update to blk-end-request and resolve conflicts]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] implement scsi_data_buffer
Boaz Harrosh [Thu, 13 Dec 2007 11:47:40 +0000 (13:47 +0200)] 
[SCSI] implement scsi_data_buffer

In preparation for bidi we abstract all IO members of scsi_cmnd,
that will need to duplicate, into a substructure.

- Group all IO members of scsi_cmnd into a scsi_data_buffer
  structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
  scsi_cmnd. And work on it.
- Adjust scsi_init_io() and  scsi_release_buffers() for above
  change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use
  accessors where appropriate.

- fix Documentation about scsi_cmnd in scsi_host.h

- scsi_error.c
  * Changed needed members of struct scsi_eh_save.
  * Careful considerations in scsi_eh_prep/restore_cmnd.

- sd.c and sr.c
  * sd and sr would adjust IO size to align on device's block
    size so code needs to change once we move to scsi_data_buff
    implementation.
  * Convert code to use scsi_for_each_sg
  * Use data accessors where appropriate.

- tgt: convert libsrp to use scsi_data_buffer

- isd200: This driver still bangs on scsi_cmnd IO members,
  so need changing

[jejb: rebased on top of sg_table patches fixed up conflicts
and used the synergy to eliminate use_sg and sg_count]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable
Boaz Harrosh [Fri, 14 Dec 2007 00:14:27 +0000 (16:14 -0800)] 
[SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable

If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib than tgt code is much more
insulated from scsi_lib changes. As a bonus it will also gain bidi
capability when it comes.

[jejb: rebase on to sg_table and fix up rejections]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] aic7xxx: fix warnings with CONFIG_PM disabled
FUJITA Tomonori [Sat, 26 Jan 2008 15:08:19 +0000 (00:08 +0900)] 
[SCSI] aic7xxx: fix warnings with CONFIG_PM disabled

  CC [M]  drivers/scsi/aic7xxx/aic7xxx_osm_pci.o
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:148: warning: 'ahc_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:166: warning: 'ahc_linux_pci_dev_resume' defined but not used

This moves aic7xxx_pci_driver struct, removes some forward declarations,
and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] aic79xx: fix warnings with CONFIG_PM disabled
FUJITA Tomonori [Sat, 26 Jan 2008 15:08:18 +0000 (00:08 +0900)] 
[SCSI] aic79xx: fix warnings with CONFIG_PM disabled

  CC [M]  drivers/scsi/aic7xxx/aic79xx_osm_pci.o
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:101: warning: 'ahd_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:121: warning: 'ahd_linux_pci_dev_resume' defined but not used

This moves aic79xx_pci_driver struct, removes some forward
declarations, and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] aic7xxx: fix ahc_done check SCB_ACTIVE for tagged transactions
David Milburn [Fri, 25 Jan 2008 18:16:18 +0000 (12:16 -0600)] 
[SCSI] aic7xxx: fix ahc_done check SCB_ACTIVE for tagged transactions

The driver only needs to check the SCB_ACTIVE flag if the SCB is not
in the untagged queue.

If the driver is in error recovery, you may end panic'ing on a TUR
that is in the untagged queue.

Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
SCB 3 done'd twice

This patch is included in Adaptec's 6.3.11 driver on their website.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] sgiwd93: use cached memory access to make driver work on IP28
Thomas Bogendoerfer [Sat, 26 Jan 2008 23:25:53 +0000 (00:25 +0100)] 
[SCSI] sgiwd93: use cached memory access to make driver work on IP28

SGI IP28 machines would need special treatment (enable adding addtional
wait states) when accessing memory uncached. To avoid this pain I
changed the driver to use only cached access to memory.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] zfcp: fix sense_buffer access bug
FUJITA Tomonori [Sun, 27 Jan 2008 03:41:50 +0000 (12:41 +0900)] 
[SCSI] zfcp: fix sense_buffer access bug

The commit de25deb18016f66dcdede165d07654559bb332bc changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] ncr53c8xx: fix sense_buffer access bug
FUJITA Tomonori [Sun, 27 Jan 2008 03:41:51 +0000 (12:41 +0900)] 
[SCSI] ncr53c8xx: fix sense_buffer access bug

The commit de25deb18016f66dcdede165d07654559bb332bc changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] aic79xx: fix sense_buffer access bug
FUJITA Tomonori [Sun, 27 Jan 2008 03:41:09 +0000 (12:41 +0900)] 
[SCSI] aic79xx: fix sense_buffer access bug

The commit de25deb18016f66dcdede165d07654559bb332bc changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] hptiop: fix sense_buffer access bug
FUJITA Tomonori [Sun, 27 Jan 2008 01:22:26 +0000 (10:22 +0900)] 
[SCSI] hptiop: fix sense_buffer access bug

&cmnd->sense_buffer now zeroes the wrong thing.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years ago[SCSI] sym53c8xx: fix bad memset argument in sym_set_cam_result_error
Nathan Lynch [Sat, 26 Jan 2008 22:07:30 +0000 (16:07 -0600)] 
[SCSI] sym53c8xx: fix bad memset argument in sym_set_cam_result_error

On a big powerpc box I got the following oops with 2.6.24-git2:

sym0: <1010-66> rev 0x1 at pci 0000:d0:01.0 irq 215
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
 target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31)
scsi 0:0:8:0: Direct-Access     IBM      ST318305LC       C509 PQ: 0
ANSI: 3
 target0:0:8: tagged command queuing enabled, command queue depth 16.
 target0:0:8: Beginning Domain Validation
 target0:0:8: asynchronous
 target0:0:8: wide asynchronous
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000038460
cpu 0x25: Vector: 300 (Data Access) at [c00000000f567840]
    pc: c000000000038460: .memcpy+0x60/0x280
    lr: d000000000050280: .sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
    sp: c00000000f567ac0
   msr: 8000000000009032
   dar: 0
 dsisr: 42000000
  current = 0xc000006d1e0af0a0
  paca    = 0xc0000000004afc00
    pid   = 0, comm = swapper
enter ? for help
[link register   ] d000000000050280
.sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
[c00000000f567ac0c00000000f567b80 (unreliable)
[c00000000f567b80d0000000000552b8 .sym_complete_error+0x12c/0x1bc [sym53c8xx]
[c00000000f567c20d0000000000561a4 .sym_int_sir+0xaa4/0x1718 [sym53c8xx]
[c00000000f567d00d000000000057e8c .sym_interrupt+0x4e4/0x6ec [sym53c8xx]
[c00000000f567dc0d00000000004fdf4 .sym53c8xx_intr+0x6c/0xdc [sym53c8xx]
[c00000000f567e50c0000000000a83e0 .handle_IRQ_event+0x7c/0xec
[c00000000f567ef0c0000000000aa344 .handle_fasteoi_irq+0x130/0x1f0
[c00000000f567f90c00000000002a538 .call_handle_irq+0x1c/0x2c
[c000004d5e0b3a90c00000000000c320 .do_IRQ+0x108/0x1d0
[c000004d5e0b3b20c000000000004790 hardware_interrupt_entry+0x18/0x1c

The memset() in sym_set_cam_result_error() would appear to be trashing
the scsi_cmnd struct instead of clearing sense_buffer.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
16 years agodlm: static initialization improvements
Denis Cheng [Tue, 29 Jan 2008 05:50:16 +0000 (13:50 +0800)] 
dlm: static initialization improvements

also change name_prefix from char pointer to char array.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: clean ups
David Teigland [Tue, 29 Jan 2008 20:52:10 +0000 (14:52 -0600)] 
dlm: clean ups

A couple small clean-ups.  Remove unnecessary wrapper-functions in
rcom.c, and remove unnecessary casting and an unnecessary ASSERT in
util.c.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: Sanity check namelen before copying it
Patrick Caulfeld [Thu, 17 Jan 2008 10:25:28 +0000 (10:25 +0000)] 
dlm: Sanity check namelen before copying it

The 32/64 compatibility code in the DLM does not check the validity of
the lock name length passed into it, so it can easily overwrite memory
if the value is rubbish (as early versions of libdlm can cause with
unlock calls, it doesn't zero the field).

This patch restricts the length of the name to the amount of data
actually passed into the call.

Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: keep cached master rsbs during recovery
David Teigland [Wed, 16 Jan 2008 19:02:31 +0000 (13:02 -0600)] 
dlm: keep cached master rsbs during recovery

To prevent the master of an rsb from changing rapidly, an unused rsb is kept
on the "toss list" for a period of time to be reused.  The toss list was
being cleared completely for each recovery, which is unnecessary.  Much of
the benefit of the toss list can be maintained if nodes keep rsb's in their
toss list that they are the master of.  These rsb's need to be included
when the resource directory is rebuilt during recovery.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: change error message to debug
David Teigland [Wed, 16 Jan 2008 17:03:41 +0000 (11:03 -0600)] 
dlm: change error message to debug

The invalid lockspace messages are normal and can appear relatively
often.  They should be suppressed without debugging enabled.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: fix possible use-after-free
David Teigland [Mon, 14 Jan 2008 21:48:58 +0000 (15:48 -0600)] 
dlm: fix possible use-after-free

The dlm_put_lkb() can free the lkb and its associated ua structure,
so we can't depend on using the ua struct after the put.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: limit dir lookup loop
David Teigland [Wed, 9 Jan 2008 16:37:39 +0000 (10:37 -0600)] 
dlm: limit dir lookup loop

In a rare case we may need to repeat a local resource directory lookup
due to a race with removing the rsb and removing the resdir record.
We'll never need to do more than a single additional lookup, though,
so the infinite loop around the lookup can be removed.  In addition
to being unnecessary, the infinite loop is dangerous since some other
unknown condition may appear causing the loop to never break.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: reject normal unlock when lock is waiting for lookup
David Teigland [Wed, 9 Jan 2008 16:30:45 +0000 (10:30 -0600)] 
dlm: reject normal unlock when lock is waiting for lookup

Non-forced unlocks should be rejected if the lock is waiting on the
rsb_lookup list for another lock to establish the master node.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: validate messages before processing
David Teigland [Wed, 9 Jan 2008 15:59:41 +0000 (09:59 -0600)] 
dlm: validate messages before processing

There was some hit and miss validation of messages that has now been
cleaned up and unified.  Before processing a message, the new
validate_message() function checks that the lkb is the appropriate type,
process-copy or master-copy, and that the message is from the correct
nodeid for the the given lkb.  Other checks and assertions on the
lkb type and nodeid have been removed.  The assertions were particularly
bad since they would panic the machine instead of just ignoring the bad
message.

Although other recent patches have made processing old message unlikely,
it still may be possible for an old message to be processed and caught
by these checks.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: reject messages from non-members
David Teigland [Tue, 8 Jan 2008 22:24:00 +0000 (16:24 -0600)] 
dlm: reject messages from non-members

Messages from nodes that are no longer members of the lockspace should be
ignored.  When nodes are removed from the lockspace, recovery can
sometimes complete quickly enough that messages arrive from a removed node
after recovery has completed.  When processed, these messages would often
cause an error message, and could in some cases change some state, causing
problems.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: another call to confirm_master in receive_request_reply
David Teigland [Tue, 8 Jan 2008 21:37:47 +0000 (15:37 -0600)] 
dlm: another call to confirm_master in receive_request_reply

When a failed request (EBADR or ENOTBLK) is unlocked/canceled instead of
retried, there may be other lkb's waiting on the rsb_lookup list for it
to complete.  A call to confirm_master() is needed to move on to the next
waiting lkb since the current one won't be retried.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: recover locks waiting for overlap replies
David Teigland [Mon, 7 Jan 2008 22:15:05 +0000 (16:15 -0600)] 
dlm: recover locks waiting for overlap replies

When recovery looks at locks waiting for replies, it fails to consider
locks that have already received a reply for their first remote operation,
but not received a reply for secondary, overlapping unlock/cancel.  The
appropriate stub reply needs to be called for these waiters.

Appears when we start doing recovery in the presence of a many overlapping
unlock/cancel ops.

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: clear ast_type when removing from astqueue
David Teigland [Mon, 7 Jan 2008 21:55:18 +0000 (15:55 -0600)] 
dlm: clear ast_type when removing from astqueue

The lkb_ast_type field indicates whether the lkb is on the astqueue list.
When clearing locks for a process, lkb's were being removed from the astqueue
list without clearing the field.  If release_lockspace then happened
immediately afterward, it could try to remove the lkb from the list a second
time.

Appears when process calls libdlm dlm_release_lockspace() which first
closes the ls dev triggering clear_proc_locks, and then removes the ls
(a write to control dev) causing release_lockspace().

Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: use fixed errno values in messages
David Teigland [Tue, 15 Jan 2008 21:43:24 +0000 (15:43 -0600)] 
dlm: use fixed errno values in messages

Some errno values differ across platforms. So if we return things like
-EINPROGRESS from one node it can get misinterpreted or rejected on
another one.

This patch fixes up the errno values passed on the wire so that they
match the x86 ones (so as not to break the protocol), and re-instates
the platform-specific ones at the other end.

Many thanks to Fabio for testing this patch.
Initial patch from Patrick.

Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: swap bytes for rcom lock reply
Fabio M. Di Nitto [Tue, 15 Jan 2008 21:13:36 +0000 (15:13 -0600)] 
dlm: swap bytes for rcom lock reply

DLM_RCOM_LOCK_REPLY messages need byte swapping.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
16 years agodlm: align midcomms message buffer
Fabio M. Di Nitto [Wed, 30 Jan 2008 16:56:42 +0000 (10:56 -0600)] 
dlm: align midcomms message buffer

gcc does not guarantee that an auto buffer is 64bit aligned.
This change allows sparc64 to work.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
16 years agoKVM: Move apic timer migration away from critical section
Avi Kivity [Wed, 16 Jan 2008 10:49:30 +0000 (12:49 +0200)] 
KVM: Move apic timer migration away from critical section

Migrating the apic timer in the critical section is not very nice, and is
absolutely horrible with the real-time port.  Move migration to the regular
vcpu execution path, triggered by a new bitflag.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Put kvm_para.h include outside __KERNEL__
Glauber de Oliveira Costa [Tue, 15 Jan 2008 15:10:15 +0000 (13:10 -0200)] 
KVM: Put kvm_para.h include outside __KERNEL__

kvm_para.h potentially contains definitions that are to be used by userspace,
so it should not be included inside the __KERNEL__ block. To protect its own
data structures, kvm_para.h already includes its own __KERNEL__ block.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Acked-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Fix unbounded preemption latency
Avi Kivity [Tue, 15 Jan 2008 16:27:32 +0000 (18:27 +0200)] 
KVM: Fix unbounded preemption latency

When preparing to enter the guest, if an interrupt comes in while
preemption is disabled but interrupts are still enabled, we miss a
preemption point.  Fix by explicitly checking whether we need to
reschedule.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Initialize the mmu caches only after verifying cpu support
Avi Kivity [Sun, 13 Jan 2008 11:23:56 +0000 (13:23 +0200)] 
KVM: Initialize the mmu caches only after verifying cpu support

Otherwise we re-initialize the mmu caches, which will fail since the
caches are already registered, which will cause us to deinitialize said caches.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Fix dirty page setting for pages removed from rmap
Izik Eidus [Sat, 12 Jan 2008 21:49:09 +0000 (23:49 +0200)] 
KVM: MMU: Fix dirty page setting for pages removed from rmap

Right now rmap_remove won't set the page as dirty if the shadow pte
pointed to this page had write access and then it became readonly.
This patches fixes that, by setting the page as dirty for spte changes from
write to readonly access.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move kvm_fpu to asm-x86/kvm.h
Christian Ehrhardt [Tue, 8 Jan 2008 07:04:50 +0000 (08:04 +0100)] 
KVM: Portability: Move kvm_fpu to asm-x86/kvm.h

This patch moves kvm_fpu asm-x86/kvm.h to allow every architecture to
define an own representation used for KVM_GET_FPU/KVM_SET_FPU.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Acked-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD
Sheng Yang [Wed, 2 Jan 2008 06:49:22 +0000 (14:49 +0800)] 
KVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD

When executing a test program called "crashme", we found the KVM guest cannot
survive more than ten seconds, then encounterd kernel panic. The basic concept
of "crashme" is generating random assembly code and trying to execute it.

After some fixes on emulator insn validity judgment, we found it's hard to
get the current emulator handle the invalid instructions correctly, for the
#UD trap for hypercall patching caused troubles. The problem is, if the opcode
itself was OK, but combination of opcode and modrm_reg was invalid, and one
operand of the opcode was memory (SrcMem or DstMem), the emulator will fetch
the memory operand first rather than checking the validity, and may encounter
an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem.

In the patch, we simply check that if the invalid opcode wasn't vmcall/vmmcall,
then return from emulate_instruction() and inject a #UD to guest. With the
patch, the guest had been running for more than 12 hours.

Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Merge shadow level check in FNAME(fetch)
Dong, Eddie [Wed, 2 Jan 2008 06:29:08 +0000 (14:29 +0800)] 
KVM: MMU: Merge shadow level check in FNAME(fetch)

Remove the redundant level check when fetching
shadow pte for present & non-present spte.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Move kvm_free_some_pages() into critical section
Avi Kivity [Mon, 31 Dec 2007 13:27:49 +0000 (15:27 +0200)] 
KVM: MMU: Move kvm_free_some_pages() into critical section

If some other cpu steals mmu pages between our check and an attempt to
allocate, we can run out of mmu pages.  Fix by moving the check into the
same critical section as the allocation.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Switch to mmu spinlock
Marcelo Tosatti [Fri, 21 Dec 2007 00:18:26 +0000 (19:18 -0500)] 
KVM: MMU: Switch to mmu spinlock

Convert the synchronization of the shadow handling to a separate mmu_lock
spinlock.

Also guard fetch() by mmap_sem in read-mode to protect against alias
and memslot changes.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Avoid calling gfn_to_page() in mmu_set_spte()
Avi Kivity [Sun, 30 Dec 2007 10:29:05 +0000 (12:29 +0200)] 
KVM: MMU: Avoid calling gfn_to_page() in mmu_set_spte()

Since gfn_to_page() is a sleeping function, and we want to make the core mmu
spinlocked, we need to pass the page from the walker context (which can sleep)
to the shadow context (which cannot).

[marcelo: avoid recursive locking of mmap_sem]

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Add kvm_read_guest_atomic()
Marcelo Tosatti [Fri, 21 Dec 2007 00:18:23 +0000 (19:18 -0500)] 
KVM: Add kvm_read_guest_atomic()

In preparation for a mmu spinlock, add kvm_read_guest_atomic()
and use it in fetch() and prefetch_page().

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Concurrent guest walkers
Marcelo Tosatti [Fri, 21 Dec 2007 00:18:22 +0000 (19:18 -0500)] 
KVM: MMU: Concurrent guest walkers

Do not hold kvm->lock mutex across the entire pagefault code,
only acquire it in places where it is necessary, such as mmu
hash list, active list, rmap and parent pte handling.

Allow concurrent guest walkers by switching walk_addr() to use
mmap_sem in read-mode.

And get rid of the lockless __gfn_to_page.

[avi: move kvm_mmu_pte_write() locking inside the function]
[avi: add locking for real mode]
[avi: fix cmpxchg locking]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Disable vapic support on Intel machines with FlexPriority
Avi Kivity [Wed, 26 Dec 2007 11:57:04 +0000 (13:57 +0200)] 
KVM: Disable vapic support on Intel machines with FlexPriority

FlexPriority accelerates the tpr without any patching.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Accelerated apic support
Avi Kivity [Thu, 25 Oct 2007 14:52:32 +0000 (16:52 +0200)] 
KVM: Accelerated apic support

This adds a mechanism for exposing the virtual apic tpr to the guest, and a
protocol for letting the guest update the tpr without causing a vmexit if
conditions allow (e.g. there is no interrupt pending with a higher priority
than the new tpr).

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: local APIC TPR access reporting facility
Avi Kivity [Mon, 22 Oct 2007 14:50:39 +0000 (16:50 +0200)] 
KVM: local APIC TPR access reporting facility

Add a facility to report on accesses to the local apic tpr even if the
local apic is emulated in the kernel.  This is basically a hack that
allows userspace to patch Windows which tends to bang on the tpr a lot.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Print data for unimplemented wrmsr
Avi Kivity [Wed, 19 Dec 2007 10:02:40 +0000 (12:02 +0200)] 
KVM: Print data for unimplemented wrmsr

This can help diagnosing what the guest is trying to do.  In many cases
we can get away with partial emulation of msrs.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Add cache miss statistic
Avi Kivity [Tue, 18 Dec 2007 17:47:18 +0000 (19:47 +0200)] 
KVM: MMU: Add cache miss statistic

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Coalesce remote tlb flushes
Eddie Dong [Mon, 17 Dec 2007 22:08:27 +0000 (06:08 +0800)] 
KVM: MMU: Coalesce remote tlb flushes

Host side TLB flush can be merged together if multiple
spte need to be write-protected.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Expose ioapic to ia64 save/restore APIs
Zhang Xiantao [Mon, 17 Dec 2007 12:27:27 +0000 (20:27 +0800)] 
KVM: Expose ioapic to ia64 save/restore APIs

IA64 also needs to see ioapic structure in irqchip.

Signed-off-by: xiantao.zhang@intel.com <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Move kvm_vcpu_kick() to x86.c
Zhang Xiantao [Mon, 17 Dec 2007 06:21:40 +0000 (14:21 +0800)] 
KVM: Move kvm_vcpu_kick() to x86.c

Moving kvm_vcpu_kick() to x86.c. Since it should be
common for all archs, put its declarations in <linux/kvm_host.h>

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Move ioapic code to common directory.
Zhang Xiantao [Mon, 17 Dec 2007 06:16:14 +0000 (14:16 +0800)] 
KVM: Move ioapic code to common directory.

Move ioapic code to common, since IA64 also needs it.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Move irqchip declarations into new ioapic.h and lapic.h
Zhang Xiantao [Mon, 17 Dec 2007 05:59:56 +0000 (13:59 +0800)] 
KVM: Move irqchip declarations into new ioapic.h and lapic.h

This allows reuse of ioapic in ia64.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Move drivers/kvm/* to virt/kvm/
Avi Kivity [Sun, 16 Dec 2007 09:13:16 +0000 (11:13 +0200)] 
KVM: Move drivers/kvm/* to virt/kvm/

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Move arch dependent files to new directory arch/x86/kvm/
Avi Kivity [Sun, 16 Dec 2007 09:02:48 +0000 (11:02 +0200)] 
KVM: Move arch dependent files to new directory arch/x86/kvm/

This paves the way for multiple architecture support.  Note that while
ioapic.c could potentially be shared with ia64, it is also moved.

Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: VMX: Add printk_ratelimit in vmx_intr_assist
Ryan Harper [Thu, 13 Dec 2007 16:21:10 +0000 (10:21 -0600)] 
KVM: VMX: Add printk_ratelimit in vmx_intr_assist

Add printk_ratelimit check in front of printk.  This prevents spamming
of the message during 32-bit ubuntu 6.06server install.  Previously, it
would hang during the partition formatting stage.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move kvm_vm_stat to x86.h
Zhang Xiantao [Fri, 14 Dec 2007 02:23:23 +0000 (10:23 +0800)] 
KVM: Portability: Move kvm_vm_stat to x86.h

This patch moves kvm_vm_stat to x86.h, and every arch
can define its own kvm_vm_stat in $arch.h

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move round_robin_prev_vcpu and tss_addr to kvm_arch
Zhang Xiantao [Fri, 14 Dec 2007 02:20:16 +0000 (10:20 +0800)] 
KVM: Portability: Move round_robin_prev_vcpu and tss_addr to kvm_arch

This patches moves two fields round_robin_prev_vcpu and tss to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: move vpic and vioapic to kvm_arch
Zhang Xiantao [Fri, 14 Dec 2007 02:17:34 +0000 (10:17 +0800)] 
KVM: Portability: move vpic and vioapic to kvm_arch

This patches moves two fields vpid and vioapic to kvm_arch

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move mmu-related fields to kvm_arch
Zhang Xiantao [Fri, 14 Dec 2007 02:01:48 +0000 (10:01 +0800)] 
KVM: Portability: Move mmu-related fields to kvm_arch

This patches moves mmu-related fields to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move memslot aliases to new struct kvm_arch
Zhang Xiantao [Fri, 14 Dec 2007 01:54:20 +0000 (09:54 +0800)] 
KVM: Portability: Move memslot aliases to new struct kvm_arch

This patches create kvm_arch to hold arch-specific kvm fileds
and moves fields naliases and aliases to kvm_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move kvm_vcpu_stat to x86.h
Zhang Xiantao [Fri, 14 Dec 2007 01:49:26 +0000 (09:49 +0800)] 
KVM: Portability: Move kvm_vcpu_stat to x86.h

This patches moves kvm_vcpu_stat to x86.h, so every
arch can define its own kvm_vcpu_stat structure.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Expand the KVM_VCPU_COMM in kvm_vcpu structure.
Zhang Xiantao [Fri, 14 Dec 2007 01:45:31 +0000 (09:45 +0800)] 
KVM: Portability: Expand the KVM_VCPU_COMM in kvm_vcpu structure.

This patches removes KVM_COMM macro, original it is hold
kvm_vcpu common fields.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move kvm_vcpu definition back to kvm.h
Zhang Xiantao [Fri, 14 Dec 2007 01:41:22 +0000 (09:41 +0800)] 
KVM: Portability: Move kvm_vcpu definition back to kvm.h

This patches moves kvm_vcpu definition to kvm.h, and finally
kvm.h includes x86.h.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Split mmu-related static inline functions to mmu.h
Zhang Xiantao [Fri, 14 Dec 2007 01:35:10 +0000 (09:35 +0800)] 
KVM: Portability: Split mmu-related static inline functions to mmu.h

Since these functions need to know the details of kvm or kvm_vcpu structure,
it can't be put in x86.h.  Create mmu.h to hold them.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Introduce kvm_vcpu_arch
Zhang Xiantao [Thu, 13 Dec 2007 15:50:52 +0000 (23:50 +0800)] 
KVM: Portability: Introduce kvm_vcpu_arch

Move all the architecture-specific fields in kvm_vcpu into a new struct
kvm_vcpu_arch.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: Portability: Move kvm{pic,ioapic} accesors to x86 specific code
Zhang Xiantao [Tue, 11 Dec 2007 12:36:00 +0000 (20:36 +0800)] 
KVM: Portability: Move kvm{pic,ioapic} accesors to x86 specific code

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: emulated cmpxchg8b should be atomic on i386
Marcelo Tosatti [Wed, 12 Dec 2007 15:46:12 +0000 (10:46 -0500)] 
KVM: MMU: emulated cmpxchg8b should be atomic on i386

Emulate cmpxchg8b atomically on i386. This is required to avoid a guest
pte walker from seeing a splitted write.

[avi: make it compile]

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: support writing 0 to K8 performance counter control registers
Joerg Roedel [Tue, 11 Dec 2007 14:36:57 +0000 (15:36 +0100)] 
KVM: SVM: support writing 0 to K8 performance counter control registers

This lets SVM ignore writes of the value 0 to the performance counter control
registers.  Thus enabling them will still fail in the guest, but a write of 0
which keeps them disabled is accepted.  This is required to boot Windows
Vista 64bit.

[avi: avoid fall-thru in switch statement]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: LAPIC: minor debugging compile fix
Joerg Roedel [Wed, 12 Dec 2007 11:37:24 +0000 (12:37 +0100)] 
KVM: LAPIC: minor debugging compile fix

This patch fixes a compile error of the LAPIC code with APIC debugging enabled.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: MMU: Fix SMP shadow instantiation race
Marcelo Tosatti [Wed, 12 Dec 2007 00:12:27 +0000 (19:12 -0500)] 
KVM: MMU: Fix SMP shadow instantiation race

There is a race where VCPU0 is shadowing a pagetable entry while VCPU1
is updating it, which results in a stale shadow copy.

Fix that by comparing the contents of the cached guest pte with the
current guest pte after write-protecting the guest pagetable.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoKVM: SVM: Exit to userspace if write to cr8 and not using in-kernel apic
Joerg Roedel [Thu, 6 Dec 2007 20:02:25 +0000 (21:02 +0100)] 
KVM: SVM: Exit to userspace if write to cr8 and not using in-kernel apic

With this patch KVM on SVM will exit to userspace if the guest writes to CR8
and the in-kernel APIC is disabled.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>