Add a pointer to the system call table to the thread_info structure.
The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
for the lifetime of a process. With the pointer to the correct system
call table in thread_info the system call code in entry64.S path can
drop the check for TIF_31BIT which saves a couple of instructions.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Store the stack pointers in the lowcore for the kernel stack, the async
stack and the panic stack with the offset required for the first user.
This avoids an unnecessary add instruction on the system call path.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds the last breaking event address as parameter
for 31 bit compat program signal handlers as it is already
done for 64 bit programs.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since commit 5f954c34 ([S390] hibernation: fix lowcore handling)
the absolute zero lowcore is lost during suspend/resume.
For example, this leads to the problem that the re-IPL device
for kdump is no longer set after resume.
With this patch during suspend a buffer is allocated in the new PM
notifier "suspend_pm_cb" and then the absolute zero lowcore is saved
to that buffer. The resume code then copies back this buffer to
absolute zero and afterwards the PM notifier releases the memory.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
dumpstack() did not always print a sane callchain when being called.
The reason is that show_trace() accessed register 15 directly to get
the current stack pointer and passed that pointer to __show_trace()
which expects a valid stack frame pointer as argument.
However due to tail call optimization the stack frame may not exist
anymore when __show_trace() gets called and therefore an invalid
stack frame pointer gets passed.
To prevent that disable tail call optimization for call chain walking
functions.
So move all the show_* functions to a dumpstack.c file like other
architectures have it already and add a -fno-optimize-sibling-calls
compile flag to both dumpstack.c and stacktrace.c to prevent tail
call optimization.
Fixes callchains that looked e.g. like this:
[ 12.868258] Call Trace:
[ 12.868262] ([<0000000000008000>] 0x8000)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pass buffer length in extra parameter.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For s390 the page table mapping for the crashkernel memory is removed to
protect the pre-loaded kdump kernel and ramdisk. Because the crashkernel
memory is not included in the page tables for suspend/resume it is not
included in the suspend image. Therefore after resume the resumed system
does no longer contain the pre-loaded kdump kernel and when kdump is
triggered it fails.
This patch adds a PM notifier that creates the page tables before suspend
is done and removes them for resume. This ensures that the kdump kernel
is included in the suspend image.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The size of the vmemmap must be a multiple of PAGES_PER_SECTION, since the
common code always initializes the vmemmap in such pieces.
So we must round up in order to not have a too small vmemmap.
Fixes an IPL crash on 31 bit with more than 1920MB.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The current machine check code uses the registers stored by the machine
in the lowcore at __LC_GPREGS_SAVE_AREA as the registers of the interrupted
context. The registers 0-7 of a user process can get clobbered if a machine
checks interrupts the execution of a critical section in entry[64].S.
The reason is that the critical section cleanup code may need to modify
the PSW and the registers for the previous context to get to the end of a
critical section. If registers 0-7 have to be replaced the relevant copy
will be in the registers, which invalidates the copy in the lowcore. The
machine check handler needs to explicitly store registers 0-7 to the stack.
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
... and convert a bunch of SYSCALL_DEFINE ones to SYSCALL_DEFINE<n>,
killing the boilerplate crap around them.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull second set of s390 patches from Martin Schwidefsky:
"The main part of this merge are Heikos uaccess patches. Together with
commit 09884964335e ("mm: do not grow the stack vma just because of an
overrun on preceding vma") the user string access is hopefully fixed
for good.
In addition some bug fixes and two cleanup patches."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/module: fix compile warning
qdio: remove unused parameters
s390/uaccess: fix kernel ds access for page table walk
s390/uaccess: fix strncpy_from_user string length check
input: disable i8042 PC Keyboard controller for s390
s390/dis: Fix invalid array size
s390/uaccess: remove pointless access_ok() checks
s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
s390/uaccess: shorten strncpy_from_user/strnlen_user
s390/dasd: fix unresponsive device after all channel paths were lost
s390/mm: ignore change bit for vmemmap
s390/page table dumper: add support for change-recording override bit
Pull signal/compat fixes from Al Viro:
"Fixes for several regressions introduced in the last signal.git pile,
along with fixing bugs in truncate and ftruncate compat (on just about
anything biarch at least one of those two had been done wrong)."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
compat: restore timerfd settime and gettime compat syscalls
[regression] braino in "sparc: convert to ksignal"
fix compat truncate/ftruncate
switch lseek to COMPAT_SYSCALL_DEFINE
lseek() and truncate() on sparc really need sign extension
Get rid of this one (false positive):
arch/s390/kernel/module.c: In function ‘apply_relocate_add’:
arch/s390/kernel/module.c:404:5: warning: ‘rc’ may be used uninitialized
in this function [-Wmaybe-uninitialized]
arch/s390/kernel/module.c:225:6: note: ‘rc’ was declared here
Play safe and preinitialize rc with an error value, so we see an error
if new users indeed don't initialize it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We are using sizeof operator for an array given as function argument,
which is incorrect.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
access_ok() always returns 'true' on s390. Therefore all calls
are quite pointless and can be removed.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
lockdep, but it's a mechanical change.
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRJAcuAAoJENkgDmzRrbjxsw0P/3eXb+LddYnx0V0uHYdKpCUf
4vdW7X0fX3Z+aUK69IWRL/6ahoO4TpaHYGHBDjEoivyQ0GDq14X7JNWsYYt3LdMf
3wmDgRc2cn/mZOJbFeVpNV8ox5l/xc0CUvV+iQ8tMjfQItXMXgWUFZKMECsXKSO6
eex3lrw9M2jAX2uL8LQPp9W8xtKu24nSZRC6tH5riE/8fCzi1cZPPAqfxP5c8Lee
ZXtbCRSyAFENZLpKyMe1PC7HvtJyi5NDn9xwOQiXULZV/VOlvP94DGBLIKCM/6dn
4QvZxpG0P0uOlpCgRAVLyh/z7g4XY4VF/fHopLCmEcqLsvgD+V2LQpQ9zWUalLPC
Z+pUpz2vu0gIddPU1nR8R6oGpEdJ8O12aJle62p/RSXWZGx12qUQ+Tamu0tgKcv1
AsiJfbUGNDYfxgU6sHsoQjl2f68LTVckCU1C1LqEbW/S104EIORtGx30CHM4LRiO
32kDC5TtgYDBKQAIqJ4bL48ZMh+9W3uX40p7xzOI5khHQjvswUKa3jcxupU0C1uv
lx8KXo7pn8WT33QGysWC782wJCgJuzSc2vRn+KQoqoynuHGM6agaEtR59gil3QWO
rQEcxH63BBRDgHlg4FM9IkJwwsnC3PWKL8gbX0uAWXAPMbgapJkuuGZAwt0WDGVK
+GszxsFkCjlW0mK0egTb
=tiSY
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"The sweeping change is to make add_taint() explicitly indicate whether
to disable lockdep, but it's a mechanical change."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Add option to not sign modules during modules_install
MODSIGN: Add -s <signature> option to sign-file
MODSIGN: Specify the hash algorithm on sign-file command line
MODSIGN: Simplify Makefile with a Kconfig helper
module: clean up load_module a little more.
modpost: Ignore ARC specific non-alloc sections
module: constify within_module_*
taint: add explicit flag to show whether lock dep is still OK.
module: printk message when module signature fail taints kernel.
Pull signal handling cleanups from Al Viro:
"This is the first pile; another one will come a bit later and will
contain SYSCALL_DEFINE-related patches.
- a bunch of signal-related syscalls (both native and compat)
unified.
- a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE
(fixing several potential problems with missing argument
validation, while we are at it)
- a lot of now-pointless wrappers killed
- a couple of architectures (cris and hexagon) forgot to save
altstack settings into sigframe, even though they used the
(uninitialized) values in sigreturn; fixed.
- microblaze fixes for delivery of multiple signals arriving at once
- saner set of helpers for signal delivery introduced, several
architectures switched to using those."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits)
x86: convert to ksignal
sparc: convert to ksignal
arm: switch to struct ksignal * passing
alpha: pass k_sigaction and siginfo_t using ksignal pointer
burying unused conditionals
make do_sigaltstack() static
arm64: switch to generic old sigaction() (compat-only)
arm64: switch to generic compat rt_sigaction()
arm64: switch compat to generic old sigsuspend
arm64: switch to generic compat rt_sigqueueinfo()
arm64: switch to generic compat rt_sigpending()
arm64: switch to generic compat rt_sigprocmask()
arm64: switch to generic sigaltstack
sparc: switch to generic old sigsuspend
sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE
sparc: kill sign-extending wrappers for native syscalls
kill sparc32_open()
sparc: switch to use of generic old sigaction
sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE
mips: switch to generic sys_fork() and sys_clone()
...
Pull s390 update from Martin Schwidefsky:
"The most prominent change in this patch set is the software dirty bit
patch for s390. It removes __HAVE_ARCH_PAGE_TEST_AND_CLEAR_DIRTY and
the page_test_and_clear_dirty primitive which makes the common memory
management code a bit less obscure.
Heiko fixed most of the PCI related fallout, more often than not
missing GENERIC_HARDIRQS dependencies. Notable is one of the 3270
patches which adds an export to tty_io to be able to resize a tty.
The rest is the usual bunch of cleanups and bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
s390/module: Add missing R_390_NONE relocation type
drivers/gpio: add missing GENERIC_HARDIRQ dependency
drivers/input: add couple of missing GENERIC_HARDIRQS dependencies
s390/cleanup: rename SPP to LPP
s390/mm: implement software dirty bits
s390/mm: Fix crst upgrade of mmap with MAP_FIXED
s390/linker skript: discard exit.data at runtime
drivers/media: add missing GENERIC_HARDIRQS dependency
s390/bpf,jit: add vlan tag support
drivers/net,AT91RM9200: add missing GENERIC_HARDIRQS dependency
iucv: fix kernel panic at reboot
s390/Kconfig: sort list of arch selected config options
phylib: remove !S390 dependeny from Kconfig
uio: remove !S390 dependency from Kconfig
dasd: fix sysfs cleanup in dasd_generic_remove
s390/pci: fix hotplug module init
s390/pci: cleanup clp page allocation
s390/pci: cleanup clp inline assembly
s390/perf: cpum_cf: fallback to software sampling events
s390/mm: provide PAGE_SHARED define
...
Pull scheduler changes from Ingo Molnar:
"Main changes:
- scheduler side full-dynticks (user-space execution is undisturbed
and receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready, from Frederic
Weisbecker.
- Initial sched.h split-up changes, by Clark Williams
- select_idle_sibling() performance improvement by Mike Galbraith:
" 1 tbench pair (worst case) in a 10 core + SMT package:
pre 15.22 MB/sec 1 procs
post 252.01 MB/sec 1 procs "
- sched_rr_get_interval() ABI fix/change. We think this detail is not
used by apps (so it's not an ABI in practice), but lets keep it
under observation.
- misc RT scheduling cleanups, optimizations"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
cputime: Remove irqsave from seqlock readers
sched, powerpc: Fix sched.h split-up build failure
cputime: Restore CPU_ACCOUNTING config defaults for PPC64
sched/rt: Move rt specific bits into new header file
sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
sched: Move sched.h sysctl bits into separate header
sched: Fix signedness bug in yield_to()
sched: Fix select_idle_sibling() bouncing cow syndrome
sched/rt: Further simplify pick_rt_task()
sched/rt: Do not account zero delta_exec in update_curr_rt()
cputime: Safely read cputime of full dynticks CPUs
kvm: Prepare to add generic guest entry/exit callbacks
cputime: Use accessors to read task cputime stats
cputime: Allow dynamic switch between tick/virtual based cputime accounting
cputime: Generic on-demand virtual cputime accounting
cputime: Move default nsecs_to_cputime() to jiffies based cputime file
cputime: Librarize per nsecs resolution cputime definitions
cputime: Avoid multiplication overflow on utime scaling
context_tracking: Export context state for generic vtime
...
Fix up conflict in kernel/context_tracking.c due to comment additions.
Allow loading of kernel modules that have relocations
of type R_390_NONE.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The set-program-parameter (SPP) instruction has been renamed to
load-program-parameter (LPP) (see SA23-2260). Reflect this change
and rename all macro/instruction references.
Also remove the duplicate SPP/LPP entry in the kernel disassembler
instruction list.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Discard exit.data section at run time, not link time, since exit.text
references exit.data and causes this build error:
`.exit.data' referenced in section `.exit.text' of drivers/built-in.o:
defined in discarded section `.exit.data' of drivers/built-in.o
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The CPU-measurement counter facility does not support sampling events
and returns -EINVAL in that case. This return code lets the perf tool
fail. To fall back to software sampling events, return -ENOENT instead.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix name clash with some common code device drivers and add "tod"
to all tod clock access function names.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a zfcpdump is triggered and a second dump on the same CEC is
already in progress for another LPAR, diagnose 308 returns with
an error code until the first dump is finished. Currently the
second Linux stops with a disabled wait PSW in that case.
This is improved now by by triggering diag 308 in a loop until
it works.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Given enough debug options some modules can grow large enough
that the GOT table gets bigger than 4K. On s390 the modules
are compiled with -fpic which limits the GOT to 4K. The end
result is a module that is loaded but won't work.
Add a sanity check to apply_rela and return with an error if
a relocation error is detected for a module.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>