1311 Commits

Author SHA1 Message Date
Ingo Molnar
5e9ad7df9f [S390] ftrace: update system call tracer support
Commit fb34a08c3 ("tracing: Add trace events for each syscall
entry/exit") changed the lowlevel API to ftrace syscall tracing
but did not update s390 which started making use of it recently.

This broke the s390 build, as reported by Paul Mundt.

Update the callbacks with the syscall number and the syscall
return code values. This allows per syscall tracepoints,
syscall argument enumeration /debug/tracing/events/syscalls/
and perfcounters support and integration on s390 too.

Reported-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <tip-fb34a08c3469b2be9eae626ccb96476b4687b810@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-19 14:16:15 +02:00
Martin Schwidefsky
23970e389e timekeeping: Introduce read_boot_clock
Add the new function read_boot_clock to get the exact time the system
has been started. For architectures without support for exact boot
time a new weak function is added that returns 0.  Use the exact boot
time to initialize wall_to_monotonic, or xtime if the read_boot_clock
returned 0.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.296703241@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:47 +02:00
Martin Schwidefsky
d4f587c67f timekeeping: Increase granularity of read_persistent_clock()
The persistent clock of some architectures (e.g. s390) have a
better granularity than seconds. To reduce the delta between the
host clock and the guest clock in a virtualized system change the 
read_persistent_clock function to return a struct timespec.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.013873340@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Martin Schwidefsky
155ec60226 timekeeping: Introduce struct timekeeper
Add struct timekeeper to keep the internal values timekeeping.c needs
in regard to the currently selected clock source. This moves the
timekeeping intervals, xtime_nsec and the ntp error value from struct
clocksource to struct timekeeper. The raw_time is removed from the
clocksource as well. It gets treated like xtime as a global variable.
Eventually xtime raw_time should be moved to struct timekeeper.

[ tglx: minor cleanup ]

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134809.613209842@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Martin Schwidefsky
f1b82746c1 clocksource: Cleanup clocksource selection
If a non high-resolution clocksource is first set as override clock
and then registered it becomes active even if the system is in one-shot
mode. Move the override check from sysfs_override_clocksource to the
clocksource selection. That fixes the bug and simplifies the code. The
check in clocksource_register for double registration of the same
clocksource is removed without replacement.

To find the initial clocksource a new weak function in jiffies.c is
defined that returns the jiffies clocksource. The architecture code
can then override the weak function with a more suitable clocksource,
e.g. the TOD clock on s390.

[ tglx: Folded in a fix from John Stultz ]

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134808.388024160@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Tejun Heo
384be2b18a Merge branch 'percpu-for-linus' into percpu-for-next
Conflicts:
	arch/sparc/kernel/smp_64.c
	arch/x86/kernel/cpu/perf_counter.c
	arch/x86/kernel/setup_percpu.c
	drivers/cpufreq/cpufreq_ondemand.c
	mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids.  As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14 14:45:31 +09:00
David S. Miller
aa11d958d1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	arch/microblaze/include/asm/socket.h
2009-08-12 17:44:53 -07:00
Linus Torvalds
17d11ba149 Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Avoid redelivery of edge interrupt before next edge
  KVM: MMU: limit rmap chain length
  KVM: ia64: fix build failures due to ia64/unsigned long mismatches
  KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc
  KVM: fix ack not being delivered when msi present
  KVM: s390: fix wait_queue handling
  KVM: VMX: Fix locking imbalance on emulation failure
  KVM: VMX: Fix locking order in handle_invalid_guest_state
  KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages
  KVM: SVM: force new asid on vcpu migration
  KVM: x86: verify MTRR/PAT validity
  KVM: PIT: fix kpit_elapsed division by zero
  KVM: Fix KVM_GET_MSR_INDEX_LIST
2009-08-09 14:58:21 -07:00
Roel Kluin
53cb780adb [S390] KVM: Read buffer overflow
Check whether index is within bounds before testing the element.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:40 +02:00
Hendrik Brueckner
677c1dd706 [S390] kernel: Storing machine flags early in lowcore
Currently, the machine_flags are stored late in the startup
initialization which results in failing machine type checks
(e.g. for MACHINE_IS_VM).
To allow these checks, store the machine flags in the lowcore
when the machine type has been detected.

Moving the machine_flags to the lowcore has been introduced with
git commit 25097bf153391f7be4c591d47061b3dc4990dac2

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:39 +02:00
Jan Engelhardt
0d6038ee76 net: implement a SO_DOMAIN getsockoption
This sockopt goes in line with SO_TYPE and SO_PROTOCOL. It makes it
possible for userspace programs to pass around file descriptors — I
am referring to arguments-to-functions, but it may even work for the
fd passing over UNIX sockets — without needing to also pass the
auxiliary information (PF_INET6/IPPROTO_TCP).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 13:02:57 -07:00
Jan Engelhardt
49c794e946 net: implement a SO_PROTOCOL getsockoption
Similar to SO_TYPE returning the socket type, SO_PROTOCOL allows to
retrieve the protocol used with a given socket.

I am not quite sure why we have that-many copies of socket.h, and why
the values are not the same on all arches either, but for where hex
numbers dominate, I use 0x1029 for SO_PROTOCOL as that seems to be
the next free unused number across a bunch of operating systems, or
so Google results make me want to believe. SO_PROTOCOL for others
just uses the next free Linux number, 38.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 13:02:56 -07:00
Christian Borntraeger
d3bc2f91b4 KVM: s390: fix wait_queue handling
There are two waitqueues in kvm for wait handling:
vcpu->wq for virt/kvm/kvm_main.c and
vpcu->arch.local_int.wq for the s390 specific wait code.

the wait handling in kvm_s390_handle_wait was broken by using different
wait_queues for add_wait queue and remove_wait_queue.

There are two options to fix the problem:
o  move all the s390 specific code to vcpu->wq and remove
   vcpu->arch.local_int.wq
o  move all the s390 specific code to vcpu->arch.local_int.wq

This patch chooses the 2nd variant for two reasons:
o  s390 does not use kvm_vcpu_block but implements its own enabled wait
   handling.
   Having a separate wait_queue make it clear, that our wait mechanism is
   different
o  the patch is much smaller

Report-by:  Julia Lawall <julia@diku.dk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05 13:59:46 +03:00
Stanislaw Gruszka
a42548a188 cputime: Optimize jiffies_to_cputime(1)
For powerpc with CONFIG_VIRT_CPU_ACCOUNTING
jiffies_to_cputime(1) is not compile time constant and run time
calculations are quite expensive. To optimize we use
precomputed value. For all other architectures is is
preprocessor definition.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <1248862529-6063-5-git-send-email-sgruszka@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03 14:48:36 +02:00
Linus Torvalds
760dcc6e18 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] zcrypt: fix scheduling of hrtimer ap_poll_timer
  [S390] vdso: clock_gettime of CLOCK_THREAD_CPUTIME_ID with noexec=on
  [S390] vdso: fix per cpu area allocation
  [S390] hibernation: fix register corruption on machine checks
  [S390] hibernation: fix lowcore handling
2009-07-27 12:16:38 -07:00
Benjamin Herrenschmidt
9e1b32caa5 mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()

Upcoming paches to support the new 64-bit "BookE" powerpc architecture
will need to have the virtual address corresponding to PTE page when
freeing it, due to the way the HW table walker works.

Basically, the TLB can be loaded with "large" pages that cover the whole
virtual space (well, sort-of, half of it actually) represented by a PTE
page, and which contain an "indirect" bit indicating that this TLB entry
RPN points to an array of PTEs from which the TLB can then create direct
entries. Thus, in order to invalidate those when PTE pages are deleted,
we need the virtual address to pass to tlbilx or tlbivax instructions.

The old trick of sticking it somewhere in the PTE page struct page sucks
too much, the address is almost readily available in all call sites and
almost everybody implemets these as macros, so we may as well add the
argument everywhere. I added it to the pmd and pud variants for consistency.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-27 12:10:38 -07:00
Martin Schwidefsky
1277580fe5 [S390] vdso: clock_gettime of CLOCK_THREAD_CPUTIME_ID with noexec=on
The combination of noexec=on and a clock_gettime call with clock id
CLOCK_THREAD_CPUTIME_ID is broken. The vdso code switches to the
access register mode to get access to the per-cpu data structure to
execute the magic ectg instruction. After the ectg instruction the
code always switches back to the primary mode but for noexec=on the
correct mode is the secondary mode. The effect of the bug is that the
user space program looses the access to all mappings without PROT_EXEC,
e.g. the stack. The problem is fixed by restoring the mode that has
been active before the switch to the access register mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:02 +02:00
Heiko Carstens
3a6ba4600d [S390] vdso: fix per cpu area allocation
vdso per cpu area allocation in smp_prepare_cpus() happens with GFP_KERNEL
but irqs disabled. Triggers this one:

Badness at kernel/lockdep.c:2280
Modules linked in:
CPU: 0 Not tainted 2.6.30 #2
Process swapper (pid: 1, task: 000000003fe88000, ksp: 000000003fe87eb8)
Krnl PSW : 0400c00180000000 0000000000083360 (lockdep_trace_alloc+0xec/0xf8)
[...]
Call Trace:
([<00000000000832b6>] lockdep_trace_alloc+0x42/0xf8)
 [<00000000000b1880>] __alloc_pages_internal+0x3e8/0x5c4
 [<00000000000b1b4a>] __get_free_pages+0x3a/0xb0
 [<0000000000026546>] vdso_alloc_per_cpu+0x6a/0x18c
 [<00000000005eff82>] smp_prepare_cpus+0x322/0x594
 [<00000000005e8232>] kernel_init+0x76/0x398
 [<000000000001bb1e>] kernel_thread_starter+0x6/0xc
 [<000000000001bb18>] kernel_thread_starter+0x0/0xc

Fix this by moving the allocation out of the irqs disabled section.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:01 +02:00
Heiko Carstens
c63b196afc [S390] hibernation: fix register corruption on machine checks
swsusp_arch_suspend() actually saves all cpu register contents on
hibernation.
Machine checks must be disabled since swsusp_arch_suspend() stores
register contents to their lowcore save areas. That's the same
place where register contents on machine checks would be saved.
To avoid register corruption disable machine checks.
We must also disable machine checks in the new psw mask for
program checks, since swsusp_arch_suspend() may generate program
checks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:00 +02:00
Heiko Carstens
5f954c3426 [S390] hibernation: fix lowcore handling
Our swsusp_arch_suspend() backend implementation disables prefixing
by setting the contents of the prefix register to 0.
However afterwards common code functions are called which might
access percpu data structures.
Since the lowcore contains e.g. the percpu base pointer this isn't
a good idea. So fix this by copying the hibernating cpu's lowcore to
absolute address zero.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:00 +02:00
Herbert Xu
9fadfd1adf crypto: sha512-s390 - Add export/import support
This patch adds export/import support to sha512-s390 (which includes
sha384-s390).  The exported type is defined by struct sha512_state,
which is basically the entire descriptor state of sha512_generic.

Since sha512-s390 only supports a 64-bit byte count the import
function will reject anything that exceeds that.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-22 14:38:13 +08:00
Sachin Sant
2a549c364a crypto: s390 - Fix sha build failure
Use struct s390_sha_ctx instead of sha1/sha256_state struct to fix
s390 crypto build break.

Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-16 19:58:42 +08:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Herbert Xu
f63559bef3 crypto: sha256-s390 - Add export/import support
This patch adds export/import support to sha256-s390.  The exported
type is defined by struct sha256_state, which is basically the entire
descriptor state of sha256_generic.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-11 18:23:34 +08:00
Herbert Xu
406f104b41 crypto: sha1-s390 - Add export/import support
This patch adds export/import support to sha1-s390.  The exported
type is defined by struct sha1_state, which is basically the entire
descriptor state of sha1_generic.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-11 18:23:34 +08:00
Linus Torvalds
eee33abe59 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] define KTIME_SCALAR for 32-bit s390
  [S390] add generic atomic64 support for 31 bit
  [S390] improve suspend/resume error messages
  [S390] set SCHED_OMIT_FRAME_POINTER for s390
  [S390] add __ucmpdi2() helper function
  [S390] perf_counter build fix
  [S390] shutdown actions: save/return rc from init function
  [S390] dasd: correct debugfeature sense dump
  [S390] udelay: disable lockdep to avoid false positives
  [S390] monreader: fix dev_set_drvdata conversion
  [S390] sclp: fix compile error for !SCLP_CONSOLE
2009-07-10 19:12:51 -07:00
Peter Zijlstra
c99e6efe1b sched: INIT_PREEMPT_COUNT
Pull the initial preempt_count value into a single
definition site.

Maintainers for: alpha, ia64 and m68k, please have a look,
your arch code is funny.

The header magic is a bit odd, but similar to the KERNEL_DS
one, CPP waits with expanding these macros until the
INIT_THREAD_INFO macro itself is expanded, which is in
arch/*/kernel/init_task.c where we've already included
sched.h so we're good.

Cc: tony.luck@intel.com
Cc: rth@twiddle.net
Cc: geert@linux-m68k.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10 14:24:05 -07:00
Tejun Heo
023bf6f1b8 linker script: unify usage of discard definition
Discarded sections in different archs share some commonality but have
considerable differences.  This led to linker script for each arch
implementing its own /DISCARD/ definition, which makes maintaining
tedious and adding new entries error-prone.

This patch makes all linker scripts to move discard definitions to the
end of the linker script and use the common DISCARDS macro.  As ld
uses the first matching section definition, archs can include default
discarded sections by including them earlier in the linker script.

ia64 is notable because it first throws away some ia64 specific
subsections and then include the rest of the sections into the final
image, so those sections must be discarded before the inclusion.

defconfig compile tested for x86, x86-64, powerpc, powerpc64, ia64,
alpha, sparc, sparc64 and s390.  Michal Simek tested microblaze.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Cc: linux-arch@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Tony Luck <tony.luck@intel.com>
2009-07-09 11:27:40 +09:00
Martin Schwidefsky
07606309ff [S390] define KTIME_SCALAR for 32-bit s390
32-bit s390 has efficient support for 64/32-bit conversions, define
KTIME_SCALAR to enable the use of the plain scalar nanosecond based
representation of ktime.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:54 +02:00
Heiko Carstens
25ca1251dc [S390] add generic atomic64 support for 31 bit
Performance counters need 64 bit atomic operations.
To keep the patch small we use the simple generic atomic64_t implementation.
The native implementation follows with the next kernel.

Fixes this build bug:

In file included from kernel/sched.c:42:
include/linux/perf_counter.h:427: error: expected specifier-qualifier-list before 'atomic64_t'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:54 +02:00
Martin Schwidefsky
fca3e357d5 [S390] set SCHED_OMIT_FRAME_POINTER for s390
The frame pointer is useless for s390 in the sched.c code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:53 +02:00
Heiko Carstens
5075baca2e [S390] add __ucmpdi2() helper function
Provide __ucmpdi2() helper function on 31 bit so we don't run
again and again in compile errors like this one:

kernel/built-in.o: In function `T.689':
perf_counter.c:(.text+0x56c86): undefined reference to `__ucmpdi2'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:53 +02:00
Heiko Carstens
2651fa2bcb [S390] perf_counter build fix
Add PERF_COUNTER_INDEX_OFFSET define to fix this build bug:

kernel/perf_counter.c: In function 'perf_counter_index':
kernel/perf_counter.c:1889: error: 'PERF_COUNTER_INDEX_OFFSET' undeclared

Same fix as for FRV since s390 doesn't support hw counters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:52 +02:00
Frank Munzert
81088819d5 [S390] shutdown actions: save/return rc from init function
We always returned -EINVAL when setting of a shutdown action failed. This was
misleading, if for example the hardware did not support the shutdown action.
Now we save each shutdown action's init return code and return it when the
action is being set.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:52 +02:00
Heiko Carstens
bb8c29caff [S390] udelay: disable lockdep to avoid false positives
Our udelay implementation enables interrupts to receive a special timer
interrupt regardless of the context it is called from.
This might lead to false positive lockdep reports. Since lockdep isn't
aware of the fact that only a single interrupt source is enabled it
warns about possible deadlocks that in reality won't happen, like
the one below.
To fix this disable lockdep before enabling interrupts.

[ 254.040888] =================================
[ 254.040904] [ INFO: inconsistent lock state ]
[ 254.040910] 2.6.30 #9
[ 254.040914] ---------------------------------
[ 254.040920] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 254.040927] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 254.040934] (sch->lock){?.-...}, at: [<00000000002e4778>] ccw_device_timeout+0x48/0x2f0
[ 254.040961] {IN-HARDIRQ-W} state was registered at:
[ 254.040969] [<0000000000096f74>] __lock_acquire+0x9d4/0x188c
[ 254.040985] [<0000000000097f68>] lock_acquire+0x13c/0x16c
[ 254.040998] [<00000000004527e0>] _spin_lock+0x74/0xb8
[ 254.041016] [<0000000000457eb2>] do_IRQ+0xde/0x208
[ 254.041031] [<000000000002d190>] io_return+0x0/0x8
[ 254.041049] [<0000000000029faa>] vtime_stop_cpu+0xbe/0x114
[ 254.041066] irq event stamp: 259629
[ 254.041076] hardirqs last enabled at (259628): [<000000000045238e>] _spin_unlock_irq+0x5e/0x9c
[ 254.041095] hardirqs last disabled at (259629): [<000000000045292e>] _spin_lock_irq+0x4a/0xc4
[ 254.041126] softirqs last enabled at (259614): [<000000000006500e>] __do_softirq+0x296/0x2b0
[ 254.041137] softirqs last disabled at (259619): [<0000000000024cf6>] do_softirq+0x102/0x108
[ 254.041147]
[ 254.041148] other info that might help us debug this:
[ 254.041153] 2 locks held by swapper/0:
[ 254.041157] #0: (&priv->timer){+.-...}, at: [<000000000006bf9a>] run_timer_softirq+0x19a/0x340
[ 254.041170] #1: (sch->lock){?.-...}, at: [<00000000002e4778>] ccw_device_timeout+0x48/0x2f0
[ 254.041182]
[ 254.041310] Call Trace:
[ 254.041313] ([<00000000000174fc>] show_trace+0x16c/0x170)
[ 254.041321] [<0000000000017578>] show_stack+0x78/0x104
[ 254.041327] [<000000000044d0ca>] dump_stack+0xc6/0xd4
[ 254.041342] [<00000000000949b4>] print_usage_bug+0x1c8/0x1fc
[ 254.041353] [<0000000000094e8a>] mark_lock+0x4a2/0x670
[ 254.041364] [<00000000000950e2>] mark_held_locks+0x8a/0xb4
[ 254.041375] [<0000000000095398>] trace_hardirqs_on_caller+0x74/0x1ac
[ 254.041388] [<00000000000954fa>] trace_hardirqs_on+0x2a/0x38
[ 254.041402] [<000000000025f1ec>] __udelay_disabled+0xac/0xfc
[ 254.041419] [<000000000025f432>] __udelay+0x12a/0x148
[ 254.041433] [<00000000002d64d8>] cio_commit_config+0x170/0x290
[ 254.041451] [<00000000002d6978>] cio_disable_subchannel+0x120/0x1cc
[ 254.041468] [<00000000002e32a4>] ccw_device_recog_done+0x54/0x2f4
[ 254.041485] [<00000000002e3638>] ccw_device_sense_id_done+0x50/0x90
[ 254.041508] [<00000000002e615a>] snsid_callback+0xfa/0x3a8
[ 254.041515] [<00000000002dd96c>] ccwreq_stop+0x80/0x90
[ 254.041523] [<00000000002dda8e>] ccw_request_timeout+0xc2/0xd0
[ 254.041530] [<00000000002e2f70>] ccw_device_request_event+0x58/0x90
[ 254.041537] [<00000000002e47ae>] ccw_device_timeout+0x7e/0x2f0
[ 254.041555] [<000000000006c02a>] run_timer_softirq+0x22a/0x340
[ 254.041566] [<0000000000064eb0>] __do_softirq+0x138/0x2b0
[ 254.041578] [<0000000000024cf6>] do_softirq+0x102/0x108
[ 254.041590] [<00000000000647ce>] irq_exit+0xee/0x114
[ 254.041603] [<0000000000457d88>] do_extint+0x130/0x17c
[ 254.041617] [<000000000002d41e>] ext_no_vtime+0x1e/0x22
[ 254.041631] [<0000000000029faa>] vtime_stop_cpu+0xbe/0x114
[ 254.041646] ([<0000000000029f58>] vtime_stop_cpu+0x6c/0x114)
[ 254.041662] [<000000000001d842>] cpu_idle+0x122/0x1c0
[ 254.041679] [<00000000004482c6>] start_secondary+0xce/0xe0
[ 254.041696] [<0000000000000000>] 0x0
[ 254.041715] [<0000000000000000>] 0x0
[ 254.041745] INFO: lockdep is turned off.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:51 +02:00
Tejun Heo
c43768cbb7 Merge branch 'master' into for-next
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes.  As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.

Conflicts:
	arch/alpha/include/asm/percpu.h
	arch/mn10300/kernel/vmlinux.lds.S
	include/linux/percpu-defs.h
2009-07-04 07:13:18 +09:00
Christian Borntraeger
ef50f7ac7e KVM: s390: Allow stfle instruction in the guest
2.6.31-rc introduced an architecture level set checker based on facility
bits. e.g. if the kernel is compiled to run only on z9, several facility
bits are checked very early and the kernel refuses to boot if a z9 specific
facility is missing.
Until now kvm on s390 did not implement the store facility extended (STFLE)
instruction. A 2.6.31-rc kernel that was compiled for z9 or higher did not
boot in kvm. This patch implements stfle.

This patch should go in before 2.6.31.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-28 14:10:30 +03:00
Tejun Heo
9a0ef2923a s390: switch to dynamic percpu allocator
64bit s390 shares the same problem with alpha regarding percpu symbol
addressing from modules.  It needs assembly magic to force GOTENT
reference when building module as the percpu address will be outside
the usual 4G range from the module text.  This can be solved by using
weak percpu variable definitions.

This patch makes s390 use weak definitions and switch to dynamic
percpu allocator.  Please note that weak attribute is not added if
!SMP as percpu variables behave exactly the same as normal variables
on UP.

Compile tested.  Generation of GOTENT reference verified.

This patch is based on Ivan Kokshaysky's alpha percpu patch.

[ Impact: use dynamic percpu allocator ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-06-24 15:13:53 +09:00
Tejun Heo
405d967dc7 linker script: throw away .discard section
x86 throws away .discard section but no other archs do.  Also,
.discard is not thrown away while linking modules.  Make every arch
and module linking throw it away.  This will be used to define dummy
variables for percpu declarations and definitions.

This patch is based on Ivan Kokshaysky's alpha percpu patch.

[ Impact: always throw away everything in .discard ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
2009-06-24 15:13:38 +09:00
Tejun Heo
e74e396204 percpu: use dynamic percpu allocator as the default percpu allocator
This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use
dynamic percpu allocator.  The first chunk is allocated using
embedding helper and 8k is reserved for modules.  This ensures that
the new allocator behaves almost identically to the original allocator
as long as static percpu variables are concerned, so it shouldn't
introduce much breakage.

s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing
range limit the addressing model imposes.  Unfortunately, this breaks
if the address is specified using a variable, so for now, the two
archs aren't converted.

The following architectures are affected by this change.

* sh
* arm
* cris
* mips
* sparc(32)
* blackfin
* avr32
* parisc (broken, under investigation)
* m32r
* powerpc(32)

As this change makes the dynamic allocator the default one,
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert -
CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted
archs.  These archs implement their own setup_per_cpu_areas() and the
conversion is not trivial.

* powerpc(64)
* sparc(64)
* ia64
* alpha
* s390

Boot and batch alloc/free tests on x86_32 with debug code (x86_32
doesn't use default first chunk initialization).  Compile tested on
sparc(32), powerpc(32), arm and alpha.

Kyle McMartin reported that this change breaks parisc.  The problem is
still under investigation and he is okay with pushing this patch
forward and fixing parisc later.

[ Impact: use dynamic allocator for most archs w/o custom percpu setup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
2009-06-24 15:13:35 +09:00
Martin Schwidefsky
da6330fccc [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:25 +02:00
Heiko Carstens
acf018004f [S390] kprobes: defer setting of ctlblk state
get_krobe_ctlblk returns a per cpu kprobe control block which holds
the state of the current cpu wrt to kprobe.
When inserting/removing a kprobe the state of the cpu which replaces
the code is changed to KPROBE_SWAP_INST. This however is done when
preemption is still enabled. So the state of the current cpu doesn't
necessarily reflect the real state.
To fix this move the code that changes the state to non-preemptible
context.

Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:24 +02:00
Martin Schwidefsky
12310e9c1b [S390] Enable tick based perf_counter on s390.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:24 +02:00
Martin Schwidefsky
e98bbaafcd [S390] lockless idle time accounting
Replace the spinlock used in the idle time accounting with a sequence
counter mechanism analog to seqlock.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:23 +02:00
Heiko Carstens
4a9c75255e [S390] pm: fix build error for !SMP
Fix build error for !SMP:

arch/s390/power/built-in.o: In function `swsusp_arch_resume':
(.text+0x1b4): undefined reference to `smp_get_phys_cpu_id'
arch/s390/power/built-in.o: In function `swsusp_arch_resume':
(.text+0x288): undefined reference to `smp_switch_boot_cpu_in_resume'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:22 +02:00
Jan Glauber
6618241b47 [S390] qdio: Sanitize do_QDIO sanity checks
Remove unneeded sanity checks from do_QDIO since this is the hot path.
Change the type of bufnr and count to unsigned int so the check for the
maximum value works.

Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:21 +02:00
Pekka Enberg
66d51f3e81 [S390] s390: remove DEBUG_MALLOC
The kernel now has kmemleak and kmemtrace so there's no reason to keep
this ugly s390 hack around. I am not sure how it's supposed to work on
SMP anyway as it uses a global variable to temporarily store the return
value of all kmalloc() calls:

  void *b;

  #define kmalloc(x...) (PRINT_INFO(" kmalloc %p\n",b=kmalloc(x)),b)

Cc: <linux-s390@vger.kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:18 +02:00
Heiko Carstens
d7d1104fa4 [S390] time: convert from bootmem to slab
The slab allocator is earlier available so convert the
bootmem allocations to slab/gfp allocations.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-06-22 12:08:16 +02:00
Linus Torvalds
d06063cc22 Move FAULT_FLAG_xyz into handle_mm_fault() callers
This allows the callers to now pass down the full set of FAULT_FLAG_xyz
flags to handle_mm_fault().  All callers have been (mechanically)
converted to the new calling convention, there's almost certainly room
for architectures to clean up their code and then add FAULT_FLAG_RETRY
when that support is added.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-21 13:08:22 -07:00
Linus Torvalds
b0b7065b64 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
  tracing/urgent: warn in case of ftrace_start_up inbalance
  tracing/urgent: fix unbalanced ftrace_start_up
  function-graph: add stack frame test
  function-graph: disable when both x86_32 and optimize for size are configured
  ring-buffer: have benchmark test print to trace buffer
  ring-buffer: do not grab locks in nmi
  ring-buffer: add locks around rb_per_cpu_empty
  ring-buffer: check for less than two in size allocation
  ring-buffer: remove useless compile check for buffer_page size
  ring-buffer: remove useless warn on check
  ring-buffer: use BUF_PAGE_HDR_SIZE in calculating index
  tracing: update sample event documentation
  tracing/filters: fix race between filter setting and module unload
  tracing/filters: free filter_string in destroy_preds()
  ring-buffer: use commit counters for commit pointer accounting
  ring-buffer: remove unused variable
  ring-buffer: have benchmark test handle discarded events
  ring-buffer: prevent adding write in discarded area
  tracing/filters: strloc should be unsigned short
  tracing/filters: operand can be negative
  ...

Fix up kmemcheck-induced conflict in kernel/trace/ring_buffer.c manually
2009-06-20 10:56:46 -07:00