The base address of a (per cpu) clock base is a useful debug info.
Add it and bump the version number of timer_lists.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The per cpu clock events device output of timer_list lacks an
association of the device to the cpu which is annoying when looking at
the output of /proc/timer_list from a 128 way system.
Add the CPU number info and mark the broadcast device in the device
list printout.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current timer_list output prints the address of the on stack copy
of the active hrtimer instead of the hrtimer itself.
Print the address of the real timer instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We did not restart the tick device from irq_enter() to avoid double
reprogramming and extra events in the return immediate to idle case.
But long lasting softirqs can lead to a situation where jiffies become
stale:
idle()
tick stopped (reprogrammed to next pending timer)
halt()
interrupt
jiffies updated from irq_enter()
interrupt handler
softirq function 1 runs 20ms
softirq function 2 arms a 10ms timer with a stale jiffies value
jiffies updated from irq_exit()
timer wheel has now an already expired timer
(the one added in function 2)
timer fires and timer softirq runs
This was discovered when debugging a timer problem which happend only
when the ath5k driver is active. The debugging proved that there is a
softirq function running for more than 20ms, which is a bug by itself.
To solve this we restart the tick timer right from irq_enter(), but do
not go through the other functions which are necessary to return from
idle when need_resched() is set.
Reported-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Elias Oltmanns <eo@nebensachen.de>
We have two separate nohz function calls in irq_enter() for no good
reason. Just call a single NOHZ function from irq_enter() and call
the bits in the tick code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Greetings,
103638d added a bit of avoidable overhead to the fast-path.
Use sysctl_sched_min_granularity instead of sched_slice() to restrict buddy wakeups.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If NR_CPUS isn't a multiple of 32, we get a truncated string of sched
domains by catting /proc/schedstat. This is caused by the wrong mask_len.
This patch fixes it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is basically a genericization of Jens Axboe's block layer
remote softirq changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This fixes the broken 77af7e3403e7314c47b0c07fbc5e4ef21d939532
("softirq, warning fix: correct a format to avoid a warning") fix
correctly.
The type of a pointer subtraction is not "int", nor is it "long". It
can be either (or something else). It's "ptrdiff_t", and the printk
format for it is "%td".
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
UIO: Fix mapping of logical and virtual memory
UIO: add automata sercos3 pci card support
UIO: Change driver name of uio_pdrv
UIO: Add alignment warnings for uio-mem
Driver core: add bus_sort_breadthfirst() function
NET: convert the phy_device file to use bus_find_device_by_name
kobject: Cleanup kobject_rename and !CONFIG_SYSFS
kobject: Fix kobject_rename and !CONFIG_SYSFS
sysfs: Make dir and name args to sysfs_notify() const
platform: add new device registration helper
sysfs: use ilookup5() instead of ilookup5_nowait()
PNP: create device attributes via default device attributes
Driver core: make bus_find_device_by_name() more robust
usb: turn dev_warn+WARN_ON combos into dev_WARN
debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
debug: Introduce a dev_WARN() function
sysfs: fix deadlock
device model: Do a quickcheck for driver binding before doing an expensive check
Driver core: Fix cleanup in device_create_vargs().
Driver core: Clarify device cleanup.
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
module: remove CONFIG_KMOD in comment after #endif
remove CONFIG_KMOD from fs
remove CONFIG_KMOD from drivers
Manually fix conflict due to include cleanups in drivers/md/md.c
Make the needlessly global kretprobe_table_lock() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional change. Just return NULL for kzalloc failure immediately,
rather than wrapping the whole function body in the body of an "if".
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchs adds the CONFIG_AIO option which allows to remove support
for asynchronous I/O operations, that are not necessarly used by
applications, particularly on embedded devices. As this is a
size-reduction option, it depends on CONFIG_EMBEDDED. It allows to
save ~7 kilobytes of kernel code/data:
text data bss dec hex filename
1115067 119180 217088 1451335 162547 vmlinux
1108025 119048 217088 1444161 160941 vmlinux.new
-7042 -132 0 -7174 -1C06 +/-
This patch has been originally written by Matt Mackall
<mpm@selenic.com>, and is part of the Linux Tiny project.
[randy.dunlap@oracle.com: build fix]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
name and nlen parameters passed to ->strategy hook are unused, remove
them. In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nothing arch specific in get/settimeofday. The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.
Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are fowned upon. I'll kill the externs in various other files
in a sparate patch.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add documentation in kerneldoc for new printk format extensions
This patch documents the new %pS/%pF options in printk in kernel doc.
Hope I didn't miss any other extension.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
People can use the real name an an index into MAINTAINERS to find the
current email address.
Signed-off-by: Francois Cami <francois.cami@free.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 6dd06c9fbe025f542bce4cdb91790c0f91962722 ("module: make
module_address_lookup safe") introduced double returns in the function
kallsyms_lookup(), it's weird. The second one should be removed.
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
utsname() is quite expensive to calculate. Cache it in a local.
text data bss dec hex filename
before: 11136 720 16 11872 2e60 kernel/sys.o
after: 11096 720 16 11832 2e38 kernel/sys.o
Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On sethostname() and setdomainname(), previous information may be retained
if it was longer than than the new hostname/domainname.
This can be demonstrated trivially by calling sethostname() first with a
long name, then with a short name, and then calling uname() to retrieve
the full buffer that contains the hostname (and possibly parts of the old
hostname), one just has to look past the terminating zero.
I don't know if we should really care that much (hence the RFC); the only
scenarios I can possibly think of is administrator putting something
sensitive in the hostname (or domain name) by accident, and changing it
back will not undo the mistake entirely, though it's not like we can
recover gracefully from "rm -rf /" either... The other scenario is
namespaces (CLONE_NEWUTS) where some information may be unintentionally
"inherited" from the previous namespace (a program wants to hide the
original name and does clone + sethostname, but some information is still
left).
I think the patch may be defended on grounds of the principle of least
surprise. But I am not adamant :-)
(I guess the question now is whether userspace should be able to
write embedded NULs into the buffer or not...)
At least the observation has been made and the patch has been presented.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Way too often, I have a machine that exhibits some kind of crappy
behavior. The CPU looks wedged in the kernel or it is spending way too
much system time and I wonder what is responsible.
I try to run readprofile. But, of course, Ubuntu doesn't enable it by
default. Dang!
The reason we boot-time enable it is that it takes a big bufffer that we
generally can only bootmem alloc. But, does it hurt to at least try and
runtime-alloc it?
To use:
echo 2 > /sys/kernel/profile
Then run readprofile like normal.
This should fix the compile issue with allmodconfig. I've compile-tested
on a bunch more configs now including a few more architectures.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a process wants to set the limit of open files to RLIM_INFINITY it
gets EPERM even if it has CAP_SYS_RESOURCE capability.
For example, BIND does:
...
#elif defined(NR_OPEN) && defined(__linux__)
/*
* Some Linux kernels don't accept RLIM_INFINIT; the maximum
* possible value is the NR_OPEN defined in linux/fs.h.
*/
if (resource == isc_resource_openfiles && rlim_value == RLIM_INFINITY) {
rl.rlim_cur = rl.rlim_max = NR_OPEN;
unixresult = setrlimit(unixresource, &rl);
if (unixresult == 0)
return (ISC_R_SUCCESS);
}
#elif ...
If we allow setting RLIMIT_NOFILE to RLIM_INFINITY we increase portability
- you don't have to check if OS is linux and then use different schema for
limits.
The spec says "Specifying RLIM_INFINITY as any resource limit value on a
successful call to setrlimit() shall inhibit enforcement of that resource
limit." and we're presently not doing that.
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's somewhat unlikely that it happens, but right now a race window
between interrupts or machine checks or oopses could corrupt the tainted
bitmap because it is modified in a non atomic fashion.
Convert the taint variable to an unsigned long and use only atomic bit
operations on it.
Unfortunately this means the intvec sysctl functions cannot be used on it
anymore.
It turned out the taint sysctl handler could actually be simplified a bit
(since it only increases capabilities) so this patch actually removes
code.
[akpm@linux-foundation.org: remove unneeded include]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using "def_bool n" is pointless, simply using bool here appears more
appropriate.
Further, retaining such options that don't have a prompt and aren't
selected by anything seems also at least questionable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
is_sync_wait() is used to distinguish between sync and async waits.
Basically sync waits are the ones initialized with init_waitqueue_entry()
and async ones with init_waitqueue_func_entry(). The sync/async
distinction is used only in prepare_to_wait[_exclusive]() and its only
function is to skip setting the current task state if the wait is async.
This has a few problems.
* No one uses it. None of func_entry users use prepare_to_wait()
functions, so the code path never gets executed.
* The distinction is bogus. Maybe back when func_entry is used only
by aio but it's now also used by epoll and in future possibly by 9p
and poll/select.
* Taking @state as argument and ignoring it silenly depending on how
@wait is initialized is just a bad error-prone API.
* It prevents func_entry waits from using wait->private for no good
reason.
This patch kills is_sync_wait() and the associated code paths from
prepare_to_wait[_exclusive](). As there was no user of these code paths,
this patch doesn't cause any behavior difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove a CVS keyword that wasn't updated for a long time from a comment.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We currently use a PM notifier to disable user mode helpers before suspend
and hibernation and to re-enable them during resume. However, this is not
an ideal solution, because if any drivers want to upload firmware into
memory before suspend, they have to use a PM notifier for this purpose and
there is no guarantee that the ordering of PM notifiers will be as
expected (ie. the notifier that disables user mode helpers has to be run
after the driver's notifier used for uploading the firmware).
For this reason, it seems better to move the disabling and enabling of
user mode helpers to separate functions that will be called by the PM core
as necessary.
[akpm@linux-foundation.org: remove unneeded ifdefs]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds an additional field to the mm_owner callbacks. This field
is required to get to the mm that changed. Hold mmap_sem in write mode
before calling the mm_owner_changed callback
[hugh@veritas.com: fix mmap_sem deadlock]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Base infrastructure to enable per-module debug messages.
I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently, <debugfs>/dynamic_printk/modules. When, CONFIG_DYNAMIC_PRINTK_DEBUG,
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
affect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.
The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.
Future plans include extending this functionality to subsystems, that define
their own debug levels and flags.
Usage:
Dynamic debugging is controlled by the debugfs file,
<debugfs>/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:
<module_name> <enabled=0/1>
.
.
.
<module_name> : Name of the module in which the debug call resides
<enabled=0/1> : whether the messages are enabled or not
For example:
snd_hda_intel enabled=0
fixup enabled=1
driver enabled=0
Enable a module:
$echo "set enabled=1 <module_name>" > dynamic_printk/modules
Disable a module:
$echo "set enabled=0 <module_name>" > dynamic_printk/modules
Enable all modules:
$echo "set enabled=1 all" > dynamic_printk/modules
Disable all modules:
$echo "set enabled=0 all" > dynamic_printk/modules
Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.
[gkh: minor cleanups and tweaks to make the build work quietly]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
probe_irq_off() is disfunctional as the local nr_irqs is referenced
instead of the global one for the for_each_irq_desc() iterator.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This code is not ready, but we need to rip it out instead of rebasing
as we would lose the APIC/IO_APIC unification otherwise.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>