Allow times and time system calls to return small negative values At the moment, the times() system call will appear to fail for a period shortly after boot, while the value it want to return is between -4095 and -1. The same thing will also happen for the time() system call on 32-bit platforms some time in 2106 or so. On some platforms, such as x86, this is unavoidable because of the system call ABI, but other platforms such as powerpc have a separate error indication from the return value, so system calls can in fact return small negative values without indicating an error. On those platforms, force_successful_syscall_return() provides a way to indicate that the system call return value should not be treated as an error even if it is in the range which would normally be taken as a negative error number. This adds a force_successful_syscall_return() call to the time() and times() system calls plus their 32-bit compat versions, so that they don't erroneously indicate an error on those platforms whose system call ABI has a separate error indication. This will not affect anything on other platforms. Joakim Tjernlund added the fix for time() and the compat versions of time() and times(), after I did the fix for times(). Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
select: add a timespec_add_safe() function For the select() rework, it's important to be able to add timespec structures in an overflow-safe manner. This patch adds a timespec_add_safe() function for this which is similar in operation to ktime_add_safe(), but works on a struct timespec. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Make constants in kernel/timeconst.h fixed 64 bits Force constants in kernel/timeconst.h (except shift counts) to be 64 bits, using U64_C() constructor macros, and eliminate constants that cannot be represented at all in 64 bits. This avoids warnings with some gcc versions. Drop generating 64-bit constants, since we have no real hope of getting a full set (operation on 64-bit values requires a 128-bit intermediate result, which gcc only supports on 64-bit platforms, and only with libgcc support on some.) Note that the use of these constants does not depend on if we are on a 32- or 64-bit architecture. This resolves Bugzilla 10153. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
remove div_long_long_rem x86 is the only arch right now, which provides an optimized for div_long_long_rem and it has the downside that one has to be very careful that the divide doesn't overflow. The API is a little akward, as the arguments for the unsigned divide are signed. The signed version also doesn't handle a negative divisor and produces worse code on 64bit archs. There is little incentive to keep this API alive, so this converts the few users to the new API. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
convert a few do_div users This converts a few users of do_div to div_[su]64 and this demonstrates nicely how it can reduce some expressions to one-liners. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kernel: explicitly include required header files under kernel/ Following an experimental deletion of the unnecessary directive #include <linux/slab.h> from the header file <linux/percpu.h>, these files under kernel/ were exposed as needing to include one of <linux/slab.h> or <linux/gfp.h>, so explicit includes were added where necessary. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
avoid overflows in kernel/time.c When the conversion factor between jiffies and milli- or microseconds is not a single multiply or divide, as for the case of HZ == 300, we currently do a multiply followed by a divide. The intervening result, however, is subject to overflows, especially since the fraction is not simplified (for HZ == 300, we multiply by 300 and divide by 1000). This is exposed to the user when passing a large timeout to poll(), for example. This patch replaces the multiply-divide with a reciprocal multiplication on 32-bit platforms. When the input is an unsigned long, there is no portable way to do this on 64-bit platforms there is no portable way to do this since it requires a 128-bit intermediate result (which gcc does support on 64-bit platforms but may generate libgcc calls, e.g. on 64-bit s390), but since the output is a 32-bit integer in the cases affected, just simplify the multiply-divide (*3/10 instead of *300/1000). The reciprocal multiply used can have off-by-one errors in the upper half of the valid output range. This could be avoided at the expense of having to deal with a potential 65-bit intermediate result. Since the intent is to avoid overflow problems and most of the other time conversions are only semiexact, the off-by-one errors were considered an acceptable tradeoff. At Ralf Baechle's suggestion, this version uses a Perl script to compute the necessary constants. We already have dependencies on Perl for kernel compiles. This does, however, require the Perl module Math::BigInt, which is included in the standard Perl distribution starting with version 5.8.0. In order to support older versions of Perl, include a table of canned constants in the script itself, and structure the script so that Math::BigInt isn't required if pulling values from said table. Running the script requires that the HZ value is available from the Makefile. Thus, this patch also adds the Kconfig variable CONFIG_HZ to the architectures which didn't already have it (alpha, cris, frv, h8300, m32r, m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or sh64 architectures, since Paul Mundt has dealt with those separately in the sh tree. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Ralf Baechle <ralf@linux-mips.org>, Cc: Sam Ravnborg <sam@ravnborg.org>, Cc: Paul Mundt <lethal@linux-sh.org>, Cc: Richard Henderson <rth@twiddle.net>, Cc: Michael Starvik <starvik@axis.com>, Cc: David Howells <dhowells@redhat.com>, Cc: Yoshinori Sato <ysato@users.sourceforge.jp>, Cc: Hirokazu Takata <takata@linux-m32r.org>, Cc: Geert Uytterhoeven <geert@linux-m68k.org>, Cc: Roman Zippel <zippel@linux-m68k.org>, Cc: William L. Irwin <sparclinux@vger.kernel.org>, Cc: Chris Zankel <chris@zankel.net>, Cc: H. Peter Anvin <hpa@zytor.com>, Cc: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
time: fix typo in comments Fix typo in comments. BTW: I have to fix coding style in arch/ia64/kernel/time.c also, otherwise checkpatch.pl will be complaining. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
speed up jiffies conversion functions if HZ==USER_HZ Avoid calling do_div(x, 1) in this case. Cc: David Fries <david@fries.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
system timer: fix crash in <100Hz system timer The kernel has a divide by zero crash when trying to run the system timer less than 100Hz. The problem is x/(HZ/USER_HZ) and related. Now x*(USER_HZ/HZ) will be used if HZ<USER_HZ. I'm running the Linux kernel under qemu and went to run a slower system timer to take less CPU (and battery) on the host. I found that the kernel paniced under emulation because of a divide by zero in three places. Here is the patch. The base git was updated today 01-05-2008. I went for a 20Hz system time by adding config HZ_20 etc to kernel/Kconfig.hz. With this patch I verified the system timer by looking at /proc/interrupts. [akpm@linux-foundation.org: partially clean up the macro maze] Signed-off-by: David Fries <david@fries.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
timekeeping: update xtime_cache when time(zone) changes xtime_cache needs to be updated whenever xtime and or wall_to_monotic are changed. Otherwise users of xtime_cache might see a stale (and in the case of timezone changes utterly wrong) value until the next update happens. Fixup the obvious places, which miss this update. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <johnstul@us.ibm.com> Tested-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday(). On platforms that copy sys_tz into the vdso (currently only x86_64, soon to include powerpc), it is possible for the vdso to get out of sync if a user calls (admittedly unusual) settimeofday(NULL, ptr). This patch adds a hook for architectures that set CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also updatee their copy in the vdso. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Andi Kleen <ak@suse.de> Cc: Tony Luck <tony.luck@intel.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clean up duplicate includes in kernel/ This patch cleans up duplicate includes in kernel/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
time: introduce xtime_seconds improve performance of sys_time(). sys_time() returns time in seconds, but it does so by calling do_gettimeofday() and then returning the tv_sec portion of the GTOD time. But the data structure "xtime", which is updated by every timer/scheduler tick, already offers HZ granularity time. the patch improves the sysbench oltp macrobenchmark by 4-5% on an AMD dual-core system: v2.6.23: #threads 1: transactions: 4073 (407.23 per sec.) 2: transactions: 8530 (852.81 per sec.) 3: transactions: 8321 (831.88 per sec.) 4: transactions: 8407 (840.58 per sec.) 5: transactions: 8070 (806.74 per sec.) v2.6.23 + sys_time-speedup.patch: 1: transactions: 4281 (428.09 per sec.) 2: transactions: 8910 (890.85 per sec.) 3: transactions: 8659 (865.79 per sec.) 4: transactions: 8676 (867.34 per sec.) 5: transactions: 8532 (852.91 per sec.) and by 4-5% on an Intel dual-core system too: 2.6.23: 1: transactions: 4560 (455.94 per sec.) 2: transactions: 10094 (1009.30 per sec.) 3: transactions: 9755 (975.36 per sec.) 4: transactions: 9859 (985.78 per sec.) 5: transactions: 9701 (969.72 per sec.) 2.6.23 + sys_time-speedup.patch: 1: transactions: 4779 (477.84 per sec.) 2: transactions: 10103 (1010.14 per sec.) 3: transactions: 10141 (1013.93 per sec.) 4: transactions: 10371 (1036.89 per sec.) 5: transactions: 10178 (1017.50 per sec.) (the more CPUs the system has, the more speedup this patch gives for this particular workload.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). This avoids use of the kernel-internal "xtime" variable directly outside of the actual time-related functions. Instead, use the helper functions that we already have available to us. This doesn't actually change any behaviour, but this will allow us to fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ (because much of the realtime information is maintained as separate offsets to 'xtime'), which has caused interfaces that use xtime directly to get a time that is out of sync with the real-time clock by up to a third of a second or so. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert "sys_time() speedup" This basically reverts commit 4e44f3497d41db4c3b9051c61410dee8ae4fb49c, while waiting for it to be re-done more completely. There are cases of people mixing "time()" with higher-resolution time sources, and we need to take the nanosecond offsets into account. Ingo has a patch that does that, but it's still under some discussion. In the meantime, just revert back to the old simple situation of just doing the whole exact timesource calculations. But rather than using do_gettimeofday(), use the internal nanosecond resolution getnstimeofday(), which at least avoids one unnecessary conversion (since we really don't care about whether the fractional seconds are nanoseconds or microseconds - we'll just throw them away). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[IA64] remove time interpolator Remove time_interpolator code (This is generic code, but only user was ia64. It has been superseded by the CONFIG_GENERIC_TIME code). Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Peter Keilty <peter.keilty@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>