linux-2.6
15 years agoMerge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core
Ingo Molnar [Wed, 19 Nov 2008 09:04:25 +0000 (10:04 +0100)] 
Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core

Conflicts:
kernel/trace/ftrace.c

[ We conflicted here because we backported a few fixes to
  tracing/urgent - which has different internal APIs. ]

15 years agoftrace: fix selftest locking
Ingo Molnar [Wed, 19 Nov 2008 09:00:15 +0000 (10:00 +0100)] 
ftrace: fix selftest locking

Impact: fix self-test boot crash

Self-test failure forgot to re-lock the BKL - crashing the next
initcall:

Testing tracer irqsoff: .. no entries found ..FAILED!
initcall init_irqsoff_tracer+0x0/0x11 returned 0 after 3906 usecs
calling  init_mmio_trace+0x0/0xf @ 1
------------[ cut here ]------------
Kernel BUG at c0c0a915 [verbose debug info unavailable]
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file:

Pid: 1, comm: swapper Not tainted (2.6.28-rc5-tip #53704)
EIP: 0060:[<c0c0a915>] EFLAGS: 00010286 CPU: 1
EIP is at unlock_kernel+0x10/0x2b
EAX: ffffffff EBX: 00000000 ECX: 00000000 EDX: f7030000
ESI: c12da19c EDI: 00000000 EBP: f7039f54 ESP: f7039f54
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f7038000 task=f7030000 task.ti=f7038000)
Stack:
 f7039f6c c0164d30 c013fed8 a7d8d7b4 00000000 00000000 f7039f74 c12fb78a
 f7039fd0 c0101132 c12fb77d 00000000 6f727200 6f632072 2d206564 c1002031
 0000000f f7039fa2 f7039fb0 3531b171 00000000 00000000 0000002f c12ca480
Call Trace:
 [<c0164d30>] ? register_tracer+0x66/0x13f
 [<c013fed8>] ? ktime_get+0x19/0x1b
 [<c12fb78a>] ? init_mmio_trace+0xd/0xf
 [<c0101132>] ? do_one_initcall+0x4a/0x111
 [<c12fb77d>] ? init_mmio_trace+0x0/0xf
 [<c015c7e6>] ? init_irq_proc+0x46/0x59
 [<c12e851d>] ? kernel_init+0x104/0x152
 [<c12e8419>] ? kernel_init+0x0/0x152
 [<c01038b7>] ? kernel_thread_helper+0x7/0x10
Code: 58 14 43 75 0a b8 00 9b 2d c1 e8 51 43 7a ff 64 a1 00 a0 37 c1 89 58 14 5b 5d c3 55 64 8b 15 00 a0 37 c1 83 7a 14 00 89 e5 79 04 <0f> 0b eb fe 8b 42 14 48 85 c0 89 42 14 79 0a b8 00 9b 2d c1 e8
EIP: [<c0c0a915>] unlock_kernel+0x10/0x2b SS:ESP 0068:f7039f54
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill init!

So clean up the flow a bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tip/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Ingo Molnar [Wed, 19 Nov 2008 08:00:50 +0000 (09:00 +0100)] 
Merge branch 'tip/urgent' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent

15 years agoftrace: fix dyn ftrace filter selection
Steven Rostedt [Wed, 19 Nov 2008 01:33:02 +0000 (20:33 -0500)] 
ftrace: fix dyn ftrace filter selection

Impact: clean up and fix for dyn ftrace filter selection

The previous logic of the dynamic ftrace selection of enabling
or disabling functions was complex and incorrect. This patch simplifies
the code and corrects the usage. This simplification also makes the
code more robust.

Here is the correct logic:

  Given a function that can be traced by dynamic ftrace:

  If the function is not to be traced, disable it if it was enabled.
  (this is if the function is in the set_ftrace_notrace file)

  (filter is on if there exists any functions in set_ftrace_filter file)

  If the filter is on, and we are enabling functions:
    If the function is in set_ftrace_filter, enable it if it is not
      already enabled.
    If the function is not in set_ftrace_filter, disable it if it is not
      already disabled.

  Otherwise, if the filter is off and we are enabling function tracing:
    Enable the function if it is not already enabled.

  Otherwise, if we are disabling function tracing:
    Disable the function if it is not already disabled.

This code now sets or clears the ENABLED flag in the record, and at the
end it will enable the function if the flag is set, or disable the function
if the flag is cleared.

The parameters for the function that does the above logic is also
simplified. Instead of passing in confusing "new" and "old" where
they might be swapped if the "enabled" flag is not set. The old logic
even had one of the above always NULL and had to be filled in. The new
logic simply passes in one parameter called "nop". A "call" is calculated
in the code, and at the end of the logic, when we know we need to either
disable or enable the function, we can then use the "nop" and "call"
properly.

This code is more robust than the previous version.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: make filtered functions effective on setting
Steven Rostedt [Wed, 19 Nov 2008 04:57:14 +0000 (23:57 -0500)] 
ftrace: make filtered functions effective on setting

Impact: fix filter selection to apply when set

It can be confusing when the set_filter_functions is set (or cleared)
and the functions being recorded by the dynamic tracer does not
match.

This patch causes the code to be updated if the function tracer is
enabled and the filter is changed.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: fix set_ftrace_filter
Steven Rostedt [Sat, 8 Nov 2008 03:36:02 +0000 (22:36 -0500)] 
ftrace: fix set_ftrace_filter

Impact: fix of output of set_ftrace_filter

The commit "ftrace: do not show freed records in
             available_filter_functions"

Removed a bit too much from the set_ftrace_filter code, where we now see
all functions in the set_ftrace_filter file even when we set a filter.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: preemptoff selftest not working
Heiko Carstens [Tue, 18 Nov 2008 17:06:35 +0000 (18:06 +0100)] 
ftrace: preemptoff selftest not working

Impact: fix preemptoff and preemptirqsoff tracer self-tests

I was wondering why the preemptoff and preemptirqsoff tracer selftests
don't work on s390. After all its just that they get called from
non-preemptible context:

kernel_init() will execute all initcalls, however the first line in
kernel_init() is lock_kernel(), which causes the preempt_count to be
increased. Any later calls to add_preempt_count() (especially those
from the selftests) will therefore not result in a call to
trace_preempt_off() since the check below in add_preempt_count()
will be false:

        if (preempt_count() == val)
                trace_preempt_off(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));

Hence the trace buffer will be empty.

Fix this by releasing the BKL during the self-tests.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotrace: introduce missing mutex_unlock()
Vegard Nossum [Tue, 18 Nov 2008 18:22:13 +0000 (19:22 +0100)] 
trace: introduce missing mutex_unlock()

Impact: fix tracing buffer mutex leak in case of allocation failure

This error was spotted by this semantic patch:

  http://www.emn.fr/x-info/coccinelle/mut.html

It looks correct as far as I can tell. Please review.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'linus' into tracing/urgent
Ingo Molnar [Tue, 18 Nov 2008 20:37:07 +0000 (21:37 +0100)] 
Merge branch 'linus' into tracing/urgent

15 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Tue, 18 Nov 2008 16:07:51 +0000 (08:07 -0800)] 
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: hold extra reference to bio in blk_rq_map_user_iov()
  relay: fix cpu offline problem
  Release old elevator on change elevator
  block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash
  block/md: fix md autodetection
  block: make add_partition() return pointer to hd_struct
  block: fix add_partition() error path

15 years agosuspend: use WARN not WARN_ON to print the message
Arjan van de Ven [Tue, 18 Nov 2008 14:56:51 +0000 (06:56 -0800)] 
suspend: use WARN not WARN_ON to print the message

By using WARN(), kerneloops.org can collect which component is causing
the delay and make statistics about that. suspend_test_finish() is
currently the number 2 item but unless we can collect who's causing
it we're not going to be able to fix the hot topic ones..

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 18 Nov 2008 16:06:35 +0000 (08:06 -0800)] 
Merge branch 'tracing-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  kernel/profile.c: fix section mismatch warning
  function tracing: fix wrong pos computing when read buffer has been fulfilled
  tracing: fix mmiotrace resizing crash
  ring-buffer: no preempt for sched_clock()
  ring-buffer: buffer record on/off switch

15 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 18 Nov 2008 16:06:21 +0000 (08:06 -0800)] 
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  cpuset: fix regression when failed to generate sched domains
  sched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers
  sched: fix kernel warning on /proc/sched_debug access
  sched: correct sched-rt-group.txt pathname in init/Kconfig

15 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 18 Nov 2008 16:06:00 +0000 (08:06 -0800)] 
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  swiotlb: use coherent_dma_mask in alloc_coherent
  MAINTAINERS: remove me as RAID maintainer

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney...
Linus Torvalds [Tue, 18 Nov 2008 16:05:43 +0000 (08:05 -0800)] 
Merge branch 'for-linus' of git://git./linux/kernel/git/cooloney/blackfin-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
  Blackfin arch: fix a broken define in dma-mapping
  Blackfin arch: fix bug - Turn on DEBUG_DOUBLEFAULT, booting SMP kernel crash
  Blackfin arch: fix bug - shared lib function in L2 failed be called
  Blackfin arch: fix incorrect limit check for bf54x check_gpio
  Blackfin arch: fix bug - Cpufreq assumes clocks in kHz and not Hz.
  Blackfin arch: dont warn when running a kernel on the oldest supported silicon
  Blackfin arch: fix bug - kernel build with write back policy fails to be booted up
  Blackfin arch: fix bug - dmacopy test case fail on all platform
  Blackfin arch: Fix typo when adding CONFIG_DEBUG_VERBOSE
  Blackfin arch: don't copy bss when copying L1
  Blackfin arch: fix bug - Fail to boot jffs2 kernel for BF561 with SMP patch
  Blackfin arch: handle case of d_path() returning error in decode_address()

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Tue, 18 Nov 2008 16:05:05 +0000 (08:05 -0800)] 
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Fix resume of GPIO unsol event for STAC/IDT
  ALSA: hda - Add quirks for HP Pavilion DV models
  ALSA: hda - Fix GPIO initialization in patch_stac92hd71bxx()
  ALSA: hda - Check model type instead of SSID in patch_92hd71bxx()
  ALSA: sound/pci/pcxhr/pcxhr.c: introduce missing kfree and pci_disable_device
  ALSA: hda: STAC_VREF_EVENT value change
  ALSA: hda - Missing NULL check in hda_beep.c
  ALSA: hda - Add digital beep playback switch for STAC/IDT codecs

15 years agotracing: kernel/trace/trace.c: introduce missing kfree()
Julia Lawall [Fri, 14 Nov 2008 18:05:31 +0000 (19:05 +0100)] 
tracing: kernel/trace/trace.c: introduce missing kfree()

Impact: fix memory leak

Error handling code following a kzalloc should free the allocated data.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,l;
position p1,p2;
expression *ptr != NULL;
@@

(
if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S
|
x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
)
<... when != x
     when != if (...) { <+...x...+> }
x->f = E
...>
(
 return \(0\|<+...x...+>\|ptr\);
|
 return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoblock: hold extra reference to bio in blk_rq_map_user_iov()
Jens Axboe [Tue, 18 Nov 2008 14:07:05 +0000 (15:07 +0100)] 
block: hold extra reference to bio in blk_rq_map_user_iov()

If the size passed in is OK but we end up mapping too many segments,
we call the unmap path directly like from IO completion. But from IO
completion we have an extra reference to the bio, so this error case
goes OOPS when it attempts to free and already free bio.

Fix it by getting an extra reference to the bio before calling the
unmap failure case.

Reported-by: Petr Vandrovec <vandrove@vc.cvut.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agorelay: fix cpu offline problem
Lai Jiangshan [Fri, 14 Nov 2008 09:44:59 +0000 (10:44 +0100)] 
relay: fix cpu offline problem

relay_open() will close allocated buffers when failed.
but if cpu offlined, some buffer will not be closed.
this patch fixed it.

and did cleanup for relay_reset() too.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoRelease old elevator on change elevator
Zhaolei [Fri, 14 Nov 2008 08:44:33 +0000 (09:44 +0100)] 
Release old elevator on change elevator

We should release old elevator when change to use a new one.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash
Zhang, Yanmin [Fri, 14 Nov 2008 07:26:30 +0000 (08:26 +0100)] 
block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash

We run into system boot failure with kernel 2.6.28-rc. We found it on a
couple of machines, including T61 notebook, nehalem machine, and another
HPC NX6325 notebook.  All the machines use FedoraCore 8 or FedoraCore 9.
With kernel prior to 2.6.28-rc, system boot doesn't fail.

I debug it and locate the root cause. Pls. see
http://bugzilla.kernel.org/show_bug.cgi?id=11899
https://bugzilla.redhat.com/show_bug.cgi?id=471517

As a matter of fact, there are 2 bugs.

1)root=/dev/sda1, system boot randomly fails. Mostly, boot for 5 times
and fails once. nash has a bug. Some of its functions misuse return
value 0.  Sometimes, 0 means timeout and no uevent available. Sometimes,
0 means nash gets an uevent, but the uevent isn't block-related (for
exmaple, usb). If by coincidence, kernel tells nash that uevents are
available, but kernel also set timeout, nash might stops collecting
other uevents in queue if current uevent isn't block-related.  I work
out a patch for nash to fix it.
http://bugzilla.kernel.org/attachment.cgi?id=18858

2) root=LABEL=/, system always can't boot. initrd init reports
switchroot fails. Here is an executation branch of nash when booting:
    (1) nash read /sys/block/sda/dev; Assume major is 8 (on my desktop)
    (2) nash query /proc/devices with the major number; It found line
"8 sd";
    (3) nash use 'sd' to search its own probe table to find device (DISK)
type for the device and add it to its own list;
    (4) Later on, it probes all devices in its list to get filesystem
labels; scsi register "8 sd" always.

When major is 259, nash fails to find the device(DISK) type. I enables
CONFIG_DEBUG_BLOCK_EXT_DEVT=y when compiling kernel, so 259 is picked up
for device /dev/sda1, which causes nash to fail to find device (DISK)
type.

To fixing issue 2), I create a patch for nash and another patch for
kernel.

http://bugzilla.kernel.org/attachment.cgi?id=18859
http://bugzilla.kernel.org/attachment.cgi?id=18837

Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new
block device in proc/devices.

With 2 patches on nash and 1 patch on kernel, I boot my machines for
dozens of times without failure.

Signed-off-by Zhang Yanmin <yanmin.zhang@linux.intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock/md: fix md autodetection
Tejun Heo [Mon, 10 Nov 2008 06:30:47 +0000 (15:30 +0900)] 
block/md: fix md autodetection

Block ext devt conversion missed md_autodetect_dev() call in
rescan_partitions() leaving md autodetect unable to see partitions.
Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: make add_partition() return pointer to hd_struct
Tejun Heo [Mon, 10 Nov 2008 06:29:58 +0000 (15:29 +0900)] 
block: make add_partition() return pointer to hd_struct

Make add_partition() return pointer to the new hd_struct on success
and ERR_PTR() value on failure.  This change will be used to fix md
autodetection bug.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoblock: fix add_partition() error path
Tejun Heo [Mon, 10 Nov 2008 06:28:59 +0000 (15:28 +0900)] 
block: fix add_partition() error path

Partition stats structure was not freed on devt allocation failure
path.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years agoMerge branches 'topic/fix/hda' and 'topic/fix/misc' into for-linus
Takashi Iwai [Tue, 18 Nov 2008 12:49:39 +0000 (13:49 +0100)] 
Merge branches 'topic/fix/hda' and 'topic/fix/misc' into for-linus

15 years agoALSA: hda - Fix resume of GPIO unsol event for STAC/IDT
Takashi Iwai [Tue, 18 Nov 2008 09:55:36 +0000 (10:55 +0100)] 
ALSA: hda - Fix resume of GPIO unsol event for STAC/IDT

Use cached write for setting the GPIO unsolicited event mask to be
restored properly at resume.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Add quirks for HP Pavilion DV models
Takashi Iwai [Tue, 18 Nov 2008 09:48:41 +0000 (10:48 +0100)] 
ALSA: hda - Add quirks for HP Pavilion DV models

Added the quirk entries for HP Pavilion DV5 and DV7 with model=hp-m4.

Reference: Novell bnc#445321, bnc#445161
https://bugzilla.novell.com/show_bug.cgi?id=445321
https://bugzilla.novell.com/show_bug.cgi?id=445161

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoBlackfin arch: fix a broken define in dma-mapping
Mike Frysinger [Tue, 18 Nov 2008 09:48:22 +0000 (17:48 +0800)] 
Blackfin arch: fix a broken define in dma-mapping

dma_mapping_error is an actual function, so fix broken define with a
real inline stub

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
15 years agoBlackfin arch: fix bug - Turn on DEBUG_DOUBLEFAULT, booting SMP kernel crash
Graf Yang [Tue, 18 Nov 2008 09:48:22 +0000 (17:48 +0800)] 
Blackfin arch: fix bug - Turn on DEBUG_DOUBLEFAULT, booting SMP kernel crash

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
15 years agoALSA: hda - Fix GPIO initialization in patch_stac92hd71bxx()
Takashi Iwai [Tue, 18 Nov 2008 09:45:15 +0000 (10:45 +0100)] 
ALSA: hda - Fix GPIO initialization in patch_stac92hd71bxx()

Fixed the GPIO mask and co initialization in patch_stac92hd71bxx()
so that the gpio_maks for HP_M4 model is set properly.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoMerge branches 'tracing/branch-tracer' and 'tracing/urgent' into tracing/core
Ingo Molnar [Tue, 18 Nov 2008 07:52:13 +0000 (08:52 +0100)] 
Merge branches 'tracing/branch-tracer' and 'tracing/urgent' into tracing/core

15 years agokernel/profile.c: fix section mismatch warning
Rakib Mullick [Tue, 18 Nov 2008 04:15:24 +0000 (10:15 +0600)] 
kernel/profile.c: fix section mismatch warning

Impact: fix section mismatch warning in kernel/profile.c

Here, profile_nop function has been called from a non-init function
create_hash_tables(void). Which generetes a section mismatch warning.
Previously, create_hash_tables(void) was a init function. So, removing
__init from create_hash_tables(void) requires profile_nop to be
non-init.

This patch makes profile_nop function inline and fixes the
following warning:

 WARNING: vmlinux.o(.text+0x6ebb6): Section mismatch in reference from
 the function create_hash_tables() to the function
 .init.text:profile_nop()
 The function create_hash_tables() references
 the function __init profile_nop().
 This is often because create_hash_tables lacks a __init
 annotation or the annotation of profile_nop is wrong.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agocpuset: fix regression when failed to generate sched domains
Li Zefan [Tue, 18 Nov 2008 06:02:03 +0000 (14:02 +0800)] 
cpuset: fix regression when failed to generate sched domains

Impact: properly rebuild sched-domains on kmalloc() failure

When cpuset failed to generate sched domains due to kmalloc()
failure, the scheduler should fallback to the single partition
'fallback_doms' and rebuild sched domains, but now it only
destroys but not rebuilds sched domains.

The regression was introduced by:

| commit dfb512ec4834116124da61d6c1ee10fd0aa32bd6
| Author: Max Krasnyansky <maxk@qualcomm.com>
| Date:   Fri Aug 29 13:11:41 2008 -0700
|
|    sched: arch_reinit_sched_domains() must destroy domains to force rebuild

After the above commit, partition_sched_domains(0, NULL, NULL) will
only destroy sched domains and partition_sched_domains(1, NULL, NULL)
will create the default sched domain.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Tue, 18 Nov 2008 04:53:31 +0000 (20:53 -0800)] 
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  prevent cifs_writepages() from skipping unwritten pages
  Fixed parsing of mount options when doing DFS submount
  [CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch
  [CIFS] Fix build break
  cifs: reinstate sharing of tree connections
  [CIFS] minor cleanup to cifs_mount
  cifs: reinstate sharing of SMB sessions sans races
  cifs: disable sharing session and tcon and add new TCP sharing code
  [CIFS] clean up server protocol handling
  [CIFS] remove unused list, add new cifs sock list to prepare for mount/umount fix
  [CIFS] Fix cifs reconnection flags
  [CIFS] Can't rely on iov length and base when kernel_recvmsg returns error

15 years agoprevent cifs_writepages() from skipping unwritten pages
Dave Kleikamp [Tue, 18 Nov 2008 03:49:05 +0000 (03:49 +0000)] 
prevent cifs_writepages() from skipping unwritten pages

Fixes a data corruption under heavy stress in which pages could be left
dirty after all open instances of a inode have been closed.

In order to write contiguous pages whenever possible, cifs_writepages()
asks pagevec_lookup_tag() for more pages than it may write at one time.
Normally, it then resets index just past the last page written before calling
pagevec_lookup_tag() again.

If cifs_writepages() can't write the first page returned, it wasn't resetting
index, and the next call to pagevec_lookup_tag() resulted in skipping all of
the pages it previously returned, even though cifs_writepages() did nothing
with them.  This can result in data loss when the file descriptor is about
to be closed.

This patch ensures that index gets set back to the next returned page so
that none get skipped.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Shirish S Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
15 years agoFixed parsing of mount options when doing DFS submount
Igor Mammedov [Thu, 23 Oct 2008 09:58:42 +0000 (13:58 +0400)] 
Fixed parsing of mount options when doing DFS submount

Since these hit the same routines, and are relatively small, it is easier to review
them as one patch.

Fixed incorrect handling of the last option in some cases
Fixed prefixpath handling convert path_consumed into host depended string length (in bytes)
Use non default separator if it is provided in the original mount options

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
15 years agoRemove -mno-spe flags as they dont belong
Kumar Gala [Sat, 15 Nov 2008 18:02:34 +0000 (12:02 -0600)] 
Remove -mno-spe flags as they dont belong

For some unknown reason at Steven Rostedt added in disabling of the SPE
instruction generation for e500 based PPC cores in commit
6ec562328fda585be2d7f472cfac99d3b44d362a.

We are removing it because:

1. It generates e500 kernels that don't work
2. its not the correct set of flags to do this
3. we handle this in the arch/powerpc/Makefile already
4. its unknown in talking to Steven why he did this

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Tested-and-Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'for-linus' of git://git.o-hand.com/linux-mfd
Linus Torvalds [Mon, 17 Nov 2008 18:45:39 +0000 (10:45 -0800)] 
Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd

* 'for-linus' of git://git.o-hand.com/linux-mfd:
  mfd: Correct WM8350 I2C return code usage
  mfd: fix event masking for da9030

15 years ago[CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier...
Steve French [Mon, 17 Nov 2008 16:03:00 +0000 (16:03 +0000)] 
[CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch

set tcon->ses earlier

If the inital tree connect fails, we'll end up calling cifs_put_smb_ses
with a NULL pointer. Fix it by setting the tcon->ses earlier.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Mon, 17 Nov 2008 15:54:47 +0000 (07:54 -0800)] 
Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  rtc: rtc-sun4v fixes, revised
  sparc: Fix tty compile warnings.
  sparc: struct device - replace bus_id with dev_name(), dev_set_name()

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Mon, 17 Nov 2008 15:53:25 +0000 (07:53 -0800)] 
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  rtnetlink: propagate error from dev_change_flags in do_setlink()
  isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply
  Phonet: refuse to send bigger than MTU packets
  e1000e: fix IPMI traffic
  e1000e: fix warn_on reload after phy_id error
  phy: fix phy address bug
  e100: fix dma error in direction for mapping
  igb: use dev_printk instead of printk
  qla3xxx: Cleanup: Fix link print statements.
  igb: Use device_set_wakeup_enable
  e1000: Use device_set_wakeup_enable
  e1000e: Use device_set_wakeup_enable
  via-velocity: enable perfect filtering for multicast packets
  phy: Add support for Marvell 88E1118 PHY
  mlx4_en: Pause parameters per port
  phylib: fix premature freeing of struct mii_bus
  atl1: Do not enumerate options unsupported by chip
  atl1e: fix broken multicast by removing unnecessary crc inversion
  gianfar: Fix DMA unmap invocations
  net/ucc_geth: Fix oops in uec_get_ethtool_stats()
  ...

15 years agosched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers
Oleg Nesterov [Mon, 17 Nov 2008 14:39:47 +0000 (15:39 +0100)] 
sched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers

Impact: fix potential NULL dereference

Contrary to ad474caca3e2a0550b7ce0706527ad5ab389a4d4 changelog, other
acct_group_xxx() helpers can be called after exit_notify() by timer tick.
Thanks to Roland for pointing out this. Somehow I missed this simple fact
when I read the original patch, and I am afraid I confused Frank during
the discussion. Sorry.

Fortunately, these helpers work with current, we can check ->exit_state
to ensure that ->signal can't go away under us.

Also, add the comment and compiler barrier to account_group_exec_runtime(),
to make sure we load ->signal only once.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing: branch tracer, fix writing to trace/trace_options
Aneesh Kumar K.V [Sun, 16 Nov 2008 10:37:58 +0000 (16:07 +0530)] 
tracing: branch tracer, fix writing to trace/trace_options

Impact: fix trace_options behavior

writing to trace/trace_options use the index of the array
to find the value of the flag. With branch tracer flag
defined conditionally, this breaks writing to trace_options
with branch tracer disabled.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-return...
Ingo Molnar [Mon, 17 Nov 2008 08:36:22 +0000 (09:36 +0100)] 
Merge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/tracepoints' and 'tracing/urgent' into tracing/core

15 years agoswiotlb: use coherent_dma_mask in alloc_coherent
FUJITA Tomonori [Mon, 17 Nov 2008 07:24:34 +0000 (16:24 +0900)] 
swiotlb: use coherent_dma_mask in alloc_coherent

Impact: fix DMA buffer allocation coherency bug in certain configs

This patch fixes swiotlb to use dev->coherent_dma_mask in
swiotlb_alloc_coherent().

coherent_dma_mask is a subset of dma_mask (equal to it most of
the time), enumerating the address range that a given device
is able to DMA to/from in a cache-coherent way.

But currently, swiotlb uses dev->dma_mask in alloc_coherent()
implicitly via address_needs_mapping(), but alloc_coherent is really
supposed to use coherent_dma_mask.

This bug could break drivers that uses smaller coherent_dma_mask than
dma_mask (though the current code works for the majority that use the
same mask for coherent_dma_mask and dma_mask).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agortnetlink: propagate error from dev_change_flags in do_setlink()
Johannes Berg [Mon, 17 Nov 2008 07:20:31 +0000 (23:20 -0800)] 
rtnetlink: propagate error from dev_change_flags in do_setlink()

Unlike ifconfig, iproute doesn't report an error when setting
an interface up fails:

(example: put wireless network mac80211 interface into repeater mode
with iwconfig but do not set a peer MAC address, it should fail with
-ENOLINK)

without patch:
# ip link set wlan0 up ; echo $?
0
#

with patch:
# ip link set wlan0 up ; echo $?
RTNETLINK answers: Link has been severed
2
#

Propagate the return value from dev_change_flags() to fix this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoisdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply
Harvey Harrison [Mon, 17 Nov 2008 07:03:45 +0000 (23:03 -0800)] 
isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply

commit a144ea4b7a13087081ab5402fa9ad0bcfd249e67 [IPV4]: annotate struct in_ifaddr

Missed this extra byteswap as the isdn inlines hide the htonl inside
put_u32 which causes an extra byteswap on little-endian arches.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago[CIFS] Fix build break
Steve French [Mon, 17 Nov 2008 03:57:13 +0000 (03:57 +0000)] 
[CIFS] Fix build break

Signed-off-by: Steve French <sfrench@us.ibm.com>
15 years agoPhonet: refuse to send bigger than MTU packets
Rémi Denis-Courmont [Mon, 17 Nov 2008 03:48:49 +0000 (19:48 -0800)] 
Phonet: refuse to send bigger than MTU packets

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agocifs: reinstate sharing of tree connections
Jeff Layton [Sat, 15 Nov 2008 16:12:47 +0000 (11:12 -0500)] 
cifs: reinstate sharing of tree connections

Use a similar approach to the SMB session sharing. Add a list of tcons
attached to each SMB session. Move the refcount to non-atomic. Protect
all of the above with the cifs_tcp_ses_lock. Add functions to
properly find and put references to the tcons.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
15 years agoe1000e: fix IPMI traffic
Jeff Kirsher [Fri, 14 Nov 2008 06:45:23 +0000 (06:45 +0000)] 
e1000e: fix IPMI traffic

Some users reported that they have machines with BMCs enabled that cannot
receive IPMI traffic after e1000e is loaded.
http://marc.info/?l=e1000-devel&m=121909039127414&w=2
http://marc.info/?l=e1000-devel&m=121365543823387&w=2

This fixes the issue if they load with the new parameter = 0 by disabling
crc stripping, but leaves the performance feature on for most users.
Based on work done by Hong Zhang.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: fix warn_on reload after phy_id error
Jeff Kirsher [Fri, 14 Nov 2008 06:45:07 +0000 (06:45 +0000)] 
e1000e: fix warn_on reload after phy_id error

If the driver fails to initialize the first time due to the failure in the
phy_id check the kernel triggers a warn_on on the second try to load the
driver because the driver did not free the msi/x resources in the first
load because of the previous failure in phy_id check.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agounitialized return value in mm/mlock.c: __mlock_vma_pages_range()
Helge Deller [Sun, 16 Nov 2008 23:30:57 +0000 (00:30 +0100)] 
unitialized return value in mm/mlock.c: __mlock_vma_pages_range()

Fix an unitialized return value when compiling on parisc (with CONFIG_UNEVICTABLE_LRU=y):
mm/mlock.c: In function `__mlock_vma_pages_range':
mm/mlock.c:165: warning: `ret' might be used uninitialized in this function

Signed-off-by: Helge Deller <deller@gmx.de>
[ It isn't ever really used uninitialized, since no caller should ever
  call this function with an empty range.  But the compiler is correct
  that from a local analysis standpoint that is impossible to see, and
  fixing the warning is appropriate.  ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agostop_machine: fix race with return value (fixes Bug #11989)
Rusty Russell [Sun, 16 Nov 2008 21:52:18 +0000 (08:22 +1030)] 
stop_machine: fix race with return value (fixes Bug #11989)

Bug #11989: Suspend failure on NForce4-based boards due to chanes in
stop_machine

We should not access active.fnret outside the lock; in theory the next
stop_machine could overwrite it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoFix broken ownership of /proc/sys/ files
Al Viro [Sun, 16 Nov 2008 22:19:10 +0000 (22:19 +0000)] 
Fix broken ownership of /proc/sys/ files

D'oh...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-and-tested-by: Peter Palfrader <peter@palfrader.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomfd: Correct WM8350 I2C return code usage
Mark Brown [Wed, 12 Nov 2008 16:34:02 +0000 (17:34 +0100)] 
mfd: Correct WM8350 I2C return code usage

The vendor BSP used for the WM8350 development provided an I2C driver
which incorrectly returned zero on succesful sends rather than the
number of transmitted bytes, an error which was then propagated into the
WM8350 I2C accessors.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
15 years agomfd: fix event masking for da9030
Mike Rapoport [Sat, 8 Nov 2008 00:28:19 +0000 (01:28 +0100)] 
mfd: fix event masking for da9030

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
15 years agoacpi: fix oops in acpi_system_wakeup_device_seq_show
Linus Torvalds [Sun, 16 Nov 2008 18:09:34 +0000 (10:09 -0800)] 
acpi: fix oops in acpi_system_wakeup_device_seq_show

Commit 0794469da3f7b2093575cbdfc1108308dd3641ce: ("ACPI: struct device -
replace bus_id with dev_name(), dev_set_name()") introduced a bug by
testing 'dev_name(ldev)' instead of 'ldev->bus' for NULL when printing
out the bus information.

So if ldev->bus was NULL, we'd oops.

Reported-and-tested-by: Bruno Prémont <bonbons@linux-vserver.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agophy: fix phy address bug
Giulio Benetti [Thu, 13 Nov 2008 21:53:13 +0000 (21:53 +0000)] 
phy: fix phy address bug

PHYID returns 0xffff and not 0xffffffff when not found and in some
case(at91sam9263) 0x0. Maybe this patch could be useful.

Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe100: fix dma error in direction for mapping
Jesse Brandeburg [Fri, 14 Nov 2008 13:51:54 +0000 (13:51 +0000)] 
e100: fix dma error in direction for mapping

The e100 driver triggers BUG_ON(buf->direction != dir)
by doing pci_map_single(..., PCI_DMA_BIDIRECTIONAL)
and pci_dma_sync_single_for_device(..., PCI_DMA_TODEVICE).

Changing the DMA direction, especially with dmabounce will result
in unexpected behaviour.

Reported-by: Anders Grafstrom <grfstrm@users.sourceforge.net>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoigb: use dev_printk instead of printk
Bjorn Helgaas [Thu, 13 Nov 2008 06:20:10 +0000 (06:20 +0000)] 
igb: use dev_printk instead of printk

Use dev_printk() instead of printk() to give a little more context
and use consistent format.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqla3xxx: Cleanup: Fix link print statements.
Ron Mercer [Tue, 11 Nov 2008 07:54:54 +0000 (07:54 +0000)] 
qla3xxx: Cleanup: Fix link print statements.

Removed debug print statements and improved conditionals around informational statements.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoigb: Use device_set_wakeup_enable
\"Rafael J. Wysocki\ [Fri, 7 Nov 2008 20:30:37 +0000 (20:30 +0000)] 
igb: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
igb_set_wol().  Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000: Use device_set_wakeup_enable
\"Rafael J. Wysocki\ [Fri, 7 Nov 2008 20:30:19 +0000 (20:30 +0000)] 
e1000: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol().  Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: Use device_set_wakeup_enable
\"Rafael J. Wysocki\ [Wed, 12 Nov 2008 09:52:32 +0000 (09:52 +0000)] 
e1000e: Use device_set_wakeup_enable

Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol().  Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.

Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agomarkers/tracpoints: fix non-modular build
Ingo Molnar [Sun, 16 Nov 2008 08:50:34 +0000 (09:50 +0100)] 
markers/tracpoints: fix non-modular build

fix:

 kernel/marker.c: In function 'marker_module_notify':
 kernel/marker.c:905: error: 'MODULE_STATE_COMING' undeclared (first use in this function)
 [...]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agovia-velocity: enable perfect filtering for multicast packets
Joey Zhuo [Sun, 16 Nov 2008 08:39:35 +0000 (00:39 -0800)] 
via-velocity: enable perfect filtering for multicast packets

Signed-off-by: Joey Zhuo <joeyzhuo@via.com.tw>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotracepoints: format documentation
Ingo Molnar [Sun, 16 Nov 2008 07:54:36 +0000 (08:54 +0100)] 
tracepoints: format documentation

Impact: documentation update

Properly format Documentation/tracepoints.txt - it was full of
overlong lines and other typographical problems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints, docs: marker_synchronize_unregister->tracepoint_synchronize_unregister
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:49 +0000 (17:47 -0500)] 
tracepoints, docs: marker_synchronize_unregister->tracepoint_synchronize_unregister

Impact: documentation update.

Signed-off-by: Zhaolei <zhaolei@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: documentation fix for teardown
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:48 +0000 (17:47 -0500)] 
tracepoints: documentation fix for teardown

Impact: documentation update

Need a tracepoint_synchronize_unregister() before the end of exit()
to make sure every probe callers have exited the non preemptible
section and thus are not executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:47 +0000 (17:47 -0500)] 
tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()

Impact: API *CHANGE*. Must update all tracepoint users.

Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
structure in a single spot for all the kernel. It helps reducing memory
consumption, especially when declaring a lot of tracepoints, e.g. for
kmalloc tracing.

*API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
to do it. The name previously used was misleading.

Updates scheduler instrumentation to follow this API change.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: use modules notifiers
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:46 +0000 (17:47 -0500)] 
tracepoints: use modules notifiers

Impact: cleanup

Use module notifiers for tracepoint updates rather than adding a hook in
module.c.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: do not put arguments in name
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:45 +0000 (17:47 -0500)] 
tracepoints: do not put arguments in name

Impact: cleanup

That's overkill, takes space. We have a global tracepoint registery in
header files anyway.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: use unregister return value
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:44 +0000 (17:47 -0500)] 
tracepoints: use unregister return value

Impact: bugfix.

Unregistering a tracepoint can fail. Return the error value.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: use rcu_*_sched_notrace
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:43 +0000 (17:47 -0500)] 
tracepoints: use rcu_*_sched_notrace

Make sure tracepoints can be called within ftrace callbacks.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: fix disable
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:42 +0000 (17:47 -0500)] 
tracepoints: fix disable

Impact: fix race

Set the probe array pointer to NULL when the tracepoint is disabled.
The probe array point not being NULL could generate a race condition
where the reader would dereference a freed pointer.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracepoints: samples, fix teardown
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:41 +0000 (17:47 -0500)] 
tracepoints: samples, fix teardown

Impact: fix a bug in sample tracepoints

Need a tracepoint_synchronize_unregister() before the end of exit() to
make sure every probe callers have exited the non preemptible section
and thus are not executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: create DEFINE_MARKER and GET_MARKER (new API)
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:40 +0000 (17:47 -0500)] 
markers: create DEFINE_MARKER and GET_MARKER (new API)

Impact: new API.

Allow markers to be used only for declaration, without function call
associated. Useful to create specialized probes.

The problem we had is that two function calls were required when one
wanted to put a marker in a tracepoint probe. Now the marker can be used
simply for trace data type declaration, leaving the trace write work
within the tracepoint probe without any additional function call.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: auto enable tracepoints (new API : trace_mark_tp())
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:39 +0000 (17:47 -0500)] 
markers: auto enable tracepoints (new API : trace_mark_tp())

Impact: new API

Add a new API trace_mark_tp(), which declares a marker within a
tracepoint probe. When the marker is activated, the tracepoint is
automatically enabled.

No branch test is used at the marker site, because it would be a
duplicate of the branch already present in the tracepoint.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: use module notifier
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:38 +0000 (17:47 -0500)] 
markers: use module notifier

Impact: cleanup

Use module notifiers instead of adding a hook in module.c.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: use rcu_*_sched_notrace and notrace
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:37 +0000 (17:47 -0500)] 
markers: use rcu_*_sched_notrace and notrace

Make marker critical code use notrace to make sure they can be used as an
ftrace callback.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: add missing stdargs.h include, needed due to va_list usage
Arnaldo Carvalho de Melo [Fri, 14 Nov 2008 22:47:36 +0000 (17:47 -0500)] 
markers: add missing stdargs.h include, needed due to va_list usage

Impact: build fix (for future changes)

That seemed to cause built issue when marker.h is included early, even
though stdargs.h is included in kernel.h.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agomarkers: fix unregister
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:35 +0000 (17:47 -0500)] 
markers: fix unregister

Impact: fix marker registers/unregister race

get_marker() can return a NULL entry because the mutex is released in
the middle of those functions. Make sure we check to see if it has been
concurrently removed.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agorcu: add rcu_read_*_sched_notrace()
Mathieu Desnoyers [Fri, 14 Nov 2008 22:47:34 +0000 (17:47 -0500)] 
rcu: add rcu_read_*_sched_notrace()

Impact: new API, useful for tracepoints and markers.

Add _notrace version to rcu_read_*_sched().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Reviewed-by: Paul E McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agofunction tracing: fix wrong pos computing when read buffer has been fulfilled
walimis [Sat, 15 Nov 2008 07:19:06 +0000 (15:19 +0800)] 
function tracing: fix wrong pos computing when read buffer has been fulfilled

Impact: make output of available_filter_functions complete

phenomenon:

The first value of dyn_ftrace_total_info is not equal with
`cat available_filter_functions | wc -l`, but they should be equal.

root cause:

When printing functions with seq_printf in t_show, if the read buffer
is just overflowed by current function record, then this function
won't be printed to user space through read buffer, it will
just be dropped. So we can't see this function printing.

So, every time the last function to fill the read buffer, if overflowed,
will be dropped.

This also applies to set_ftrace_filter if set_ftrace_filter has
more bytes than read buffer.

fix:

Through checking return value of seq_printf, if less than 0, we know
this function doesn't be printed. Then we decrease position to force
this function to be printed next time, in next read buffer.

Another little fix is to show correct allocating pages count.

Signed-off-by: walimis <walimisdev@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMAINTAINERS: remove me as RAID maintainer
Ingo Molnar [Sun, 16 Nov 2008 07:27:53 +0000 (08:27 +0100)] 
MAINTAINERS: remove me as RAID maintainer

Neil has been the maintainer of the RAID/MD code for a long time,
remove me as a co-maintainer.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agosched: fix kernel warning on /proc/sched_debug access
Ingo Molnar [Sun, 16 Nov 2008 07:07:15 +0000 (08:07 +0100)] 
sched: fix kernel warning on /proc/sched_debug access

Luis Henriques reported that with CONFIG_PREEMPT=y + CONFIG_PREEMPT_DEBUG=y +
CONFIG_SCHED_DEBUG=y + CONFIG_LATENCYTOP=y enabled, the following warning
triggers when using latencytop:

> [  775.663239] BUG: using smp_processor_id() in preemptible [00000000] code: latencytop/6585
> [  775.663303] caller is native_sched_clock+0x3a/0x80
> [  775.663314] Pid: 6585, comm: latencytop Tainted: G        W 2.6.28-rc4-00355-g9c7c354 #1
> [  775.663322] Call Trace:
> [  775.663343]  [<ffffffff803a94e4>] debug_smp_processor_id+0xe4/0xf0
> [  775.663356]  [<ffffffff80213f7a>] native_sched_clock+0x3a/0x80
> [  775.663368]  [<ffffffff80213e19>] sched_clock+0x9/0x10
> [  775.663381]  [<ffffffff8024550d>] proc_sched_show_task+0x8bd/0x10e0
> [  775.663395]  [<ffffffff8034466e>] sched_show+0x3e/0x80
> [  775.663408]  [<ffffffff8031039b>] seq_read+0xdb/0x350
> [  775.663421]  [<ffffffff80368776>] ? security_file_permission+0x16/0x20
> [  775.663435]  [<ffffffff802f4198>] vfs_read+0xc8/0x170
> [  775.663447]  [<ffffffff802f4335>] sys_read+0x55/0x90
> [  775.663460]  [<ffffffff8020c67a>] system_call_fastpath+0x16/0x1b
> ...

This breakage was caused by me via:

  7cbaef9: sched: optimize sched_clock() a bit

Change the calls to cpu_clock().

Reported-by: Luis Henriques <henrix@sapo.pt>
15 years agotracing/function-return-tracer: support for dynamic ftrace on function return tracer
Frederic Weisbecker [Sun, 16 Nov 2008 05:02:06 +0000 (06:02 +0100)] 
tracing/function-return-tracer: support for dynamic ftrace on function return tracer

This patch adds the support for dynamic tracing on the function return tracer.
The whole difference with normal dynamic function tracing is that we don't need
to hook on a particular callback. The only pro that we want is to nop or set
dynamically the calls to ftrace_caller (which is ftrace_return_caller here).

Some security checks ensure that we are not trying to launch dynamic tracing for
return tracing while normal function tracing is already running.

An example of trace with getnstimeofday set as a filter:

ktime_get_ts+0x22/0x50 -> getnstimeofday (2283 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1396 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1825 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1426 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1524 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1434 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1502 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1404 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1397 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1051 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1314 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1344 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1163 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1390 ns)
ktime_get_ts+0x22/0x50 -> getnstimeofday (1374 ns)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/function-return-tracer: add a barrier to ensure return stack index is increme...
Frederic Weisbecker [Sat, 15 Nov 2008 01:37:44 +0000 (02:37 +0100)] 
tracing/function-return-tracer: add a barrier to ensure return stack index is incremented in memory

Impact: fix possible race condition in ftrace function return tracer

This fixes a possible race condition if index incrementation
is not immediately flushed in memory.

Thanks for Andi Kleen and Steven Rostedt for pointing out this issue
and give me this solution.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tracing/ftrace' into tracing/function-return-tracer
Ingo Molnar [Sun, 16 Nov 2008 06:57:13 +0000 (07:57 +0100)] 
Merge branch 'tracing/ftrace' into tracing/function-return-tracer

15 years agotracing/branch-tracer: fix a trace recursion on branch tracer
Frederic Weisbecker [Sun, 16 Nov 2008 04:59:52 +0000 (05:59 +0100)] 
tracing/branch-tracer: fix a trace recursion on branch tracer

Impact: fix crash when enabling the branch-tracer

When the branch tracer inserts an event through
probe_likely_condition(), it calls local_irq_save() and then results
in a trace recursion.

local_irq_save() -> trace_hardirqs_off() -> trace_hardirqs_off_caller()
-> unlikely()

The trace_branch.c file is protected by DISABLE_BRANCH_PROFILING but
that doesn't prevent from external call to functions that use
unlikely().

My box crashed each time I tried to set this tracer (sudden and hard
reboot).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/ftrace: change the type of the init() callback
Frederic Weisbecker [Sun, 16 Nov 2008 04:57:26 +0000 (05:57 +0100)] 
tracing/ftrace: change the type of the init() callback

Impact: extend the ->init() method with the ability to fail

This bring a way to know if the initialization of a tracer successed.
A tracer must return 0 on success and a traditional error (ie:
-ENOMEM) if it fails.

If a tracer fails to init, it is free to print a detailed warn. The
tracing api will not and switch to a new tracer will just return the
error from the init callback.

Note: this will be used for the return tracer.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/ftrace: fix unexpected -EINVAL when longest tracer name is set
Frederic Weisbecker [Sun, 16 Nov 2008 04:53:19 +0000 (05:53 +0100)] 
tracing/ftrace: fix unexpected -EINVAL when longest tracer name is set

Impact: fix confusing write() -EINVAL when changing the tracer

The following commit d9e540762f5cdd89f24e518ad1fd31142d0b9726 remade
alive the bug which made the set of a new tracer returning -EINVAL if
this is the longest name of tracer. This patch corrects it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: make filtered functions effective on setting
Steven Rostedt [Sat, 15 Nov 2008 21:31:41 +0000 (16:31 -0500)] 
ftrace: make filtered functions effective on setting

Impact: set filtered functions at time the filter is set

It can be confusing when the set_filter_functions is set (or cleared)
and the functions being recorded by the dynamic tracer does not
match.

This patch causes the code to be updated if the function tracer is
enabled and the filter is changed.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: fix dyn ftrace filter
Steven Rostedt [Sat, 15 Nov 2008 21:31:41 +0000 (16:31 -0500)] 
ftrace: fix dyn ftrace filter

Impact: correct implementation of dyn ftrace filter

The old decisions made by the filter algorithm was complex and incorrect.
This lead to inconsistent enabling or disabling of functions when
the filter was used.

This patch simplifies that code and in doing so, corrects the usage
of the filters.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: allow NULL pointers in mcount_loc
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)] 
ftrace: allow NULL pointers in mcount_loc

Impact: make ftrace_convert_nops() more permissive

Due to the way different architecture linkers combine the data sections
of the mcount_loc (the section that lists all the locations that
call mcount), there may be zeros added in that section. This is usually
due to strange alignments that the linker performs, that pads in zeros.

This patch makes the conversion code to nops skip any pointer in
the mcount_loc section that is NULL.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: pass module struct to arch dynamic ftrace functions
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)] 
ftrace: pass module struct to arch dynamic ftrace functions

Impact: allow archs more flexibility on dynamic ftrace implementations

Dynamic ftrace has largly been developed on x86. Since x86 does not
have the same limitations as other architectures, the ftrace interaction
between the generic code and the architecture specific code was not
flexible enough to handle some of the issues that other architectures
have.

Most notably, module trampolines. Due to the limited branch distance
that archs make in calling kernel core code from modules, the module
load code must create a trampoline to jump to what will make the
larger jump into core kernel code.

The problem arises when this happens to a call to mcount. Ftrace checks
all code before modifying it and makes sure the current code is what
it expects. Right now, there is not enough information to handle modifying
module trampolines.

This patch changes the API between generic dynamic ftrace code and
the arch dependent code. There is now two functions for modifying code:

  ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into
       a nop, where the original text is calling addr. (mod is the
       module struct if called by module init)

  ftrace_make_caller(rec, addr) - convert the code rec->ip that should
       be a nop into a caller to addr.

The record "rec" now has a new field called "arch" where the architecture
can add any special attributes to each call site record.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: replace raw_local_irq_save with local_irq_save
Steven Rostedt [Sat, 15 Nov 2008 20:48:29 +0000 (15:48 -0500)] 
ftrace: replace raw_local_irq_save with local_irq_save

Impact: fix lockdep disabling itself when function tracing is enabled

The raw_local_irq_saves used in ftrace is causing problems with
lockdep. (it thinks the irq flags are out of sync and disables
itself with a warning)

The raw ops here are not needed, and the normal local_irq_save is fine.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: do not process freed records
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)] 
ftrace: do not process freed records

Impact: keep from converting freed records

When the tracer is started or stopped, it converts all code pointed
to by the saved records into callers to ftrace or nops. When modules
are unloaded, their records are freed, but they still exist within
the record pages.

This patch changes the code to skip over freed records.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoftrace: disable ftrace on anomalies in trace start and stop
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)] 
ftrace: disable ftrace on anomalies in trace start and stop

Impact: robust feature to disable ftrace on start or stop tracing on error

Currently only the initial conversion to nops will disable ftrace
on an anomaly. But if an anomaly happens on start or stopping of the
tracer, it will silently fail.

This patch adds a check there too, to disable ftrace and warn if the
conversion fails.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>