x86_64 UML is unable to load modules if more than 504MiB
of memory are used.
This happens because on x86_64 the UML process has a quite high
start address (typically around 0x6000000).
If UML's memory is larger than 504MiB VMALLOC_START happens to be after
0x8000000. This is no problem unless one loads a module which was built
with R_X86_64_32S relocations.
Symbols with a location > 0x8000000 cannot be used with R_X86_64_32S
To deal with this x86_64 UML has to be compiled with -mcmodel=large
such that no R_X86_64_32S relocations are used.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reported-by: 전하늘 <allskyee@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[richard@nod.at: Re-export SUBARCH in arch/um/Makefile]
Signed-off-by: Richard Weinberger <richard@nod.at>
... the same one that controls whether elf_aux.o is included into the
build, bringing the vsyscall_e... into it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
just provide get_current_pid() to the userland side of things
instead of get_current() + manual poking in its results
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
... and switch chan_interrupt() to directly calling close_one_chan(),
so we can lose delay_free_irq argument of close_chan() as well.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
... since chan_interrupt() might schedule it if there's too much
incoming data. Kill task argument of chan_interrupt(), while
we are at it - it's always &line->task.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
put references to in and out chans associated with line into
explicit struct chan * fields in it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
move config-independent parts of initialization into
register_lines(), call setup_one_line() after it instead
of abusing ->init_str.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Current code doesn't update the symlinks in /sys/dev/char when we add/remove
tty lines. Fixing that allows to stop messing with ->valid before the driver
registration, which is a Good Thing(tm) - we shouldn't have it set before we
really have the things set up and ready for line_open().
We need tty_driver available to call tty_{un,}register_device(), so we just
stash a reference to it into struct line_driver.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Pull parse_chan_pair() call into setup_one_line(), under the mutex.
We really don't want open() to succeed before parse_chan_pair() had
been done (or after it has failed, BTW). We also want "remove con<n>"
to free irqs, etc., same as "config con<n>=none".
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
If two processes are opening the same line, the second to get
into line_open() will decide that it doesn't need to do anything
(correctly) or wait for anything. The latter, unfortunately,
is incorrect - the first opener might not be through yet. We
need to have exclusion covering the entire line_init(), including
the blocking parts. Moreover, the next patch will need to
widen the exclusion on mconsole side of things, also including
the blocking bits, so let's just convert that sucker to mutex...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
make line_setup() act on a separate array of conf strings + default conf,
have lines array initialized explicitly by that data, bury LINE_INIT()
macro from hell.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled],
We run all interrupt handlers with interrupts disabled
and we even check and yell when an interrupt handler
returns with interrupts enabled (see commit [b738a50a:
genirq: Warn when handler enables interrupts]).
So now this flag is a NOOP and can be removed.
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits)
audit: no leading space in audit_log_d_path prefix
audit: treat s_id as an untrusted string
audit: fix signedness bug in audit_log_execve_info()
audit: comparison on interprocess fields
audit: implement all object interfield comparisons
audit: allow interfield comparison between gid and ogid
audit: complex interfield comparison helper
audit: allow interfield comparison in audit rules
Kernel: Audit Support For The ARM Platform
audit: do not call audit_getname on error
audit: only allow tasks to set their loginuid if it is -1
audit: remove task argument to audit_set_loginuid
audit: allow audit matching on inode gid
audit: allow matching on obj_uid
audit: remove audit_finish_fork as it can't be called
audit: reject entry,always rules
audit: inline audit_free to simplify the look of generic code
audit: drop audit_set_macxattr as it doesn't do anything
audit: inline checks for not needing to collect aux records
audit: drop some potentially inadvisable likely notations
...
Use evil merge to fix up grammar mistakes in Kconfig file.
Bad speling and horrible grammar (and copious swearing) is to be
expected, but let's keep it to commit messages and comments, rather than
expose it to users in config help texts or printouts.
Every arch calls:
if (unlikely(current->audit_context))
audit_syscall_entry()
which requires knowledge about audit (the existance of audit_context) in
the arch code. Just do it all in static inline in audit.h so that arch's
can remain blissfully ignorant.
Signed-off-by: Eric Paris <eparis@redhat.com>
The audit system previously expected arches calling to audit_syscall_exit to
supply as arguments if the syscall was a success and what the return code was.
Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
by converting from negative retcodes to an audit internal magic value stating
success or failure. This helper was wrong and could indicate that a valid
pointer returned to userspace was a failed syscall. The fix is to fix the
layering foolishness. We now pass audit_syscall_exit a struct pt_reg and it
in turns calls back into arch code to collect the return value and to
determine if the syscall was a success or failure. We also define a generic
is_syscall_success() macro which determines success/failure based on if the
value is < -MAX_ERRNO. This works for arches like x86 which do not use a
separate mechanism to indicate syscall failure.
We make both the is_syscall_success() and regs_return_value() static inlines
instead of macros. The reason is because the audit function must take a void*
for the regs. (uml calls theirs struct uml_pt_regs instead of just struct
pt_regs so audit_syscall_exit can't take a struct pt_regs). Since the audit
function takes a void* we need to use static inlines to cast it back to the
arch correct structure to dereference it.
The other major change is that on some arches, like ia64, MIPS and ppc, we
change regs_return_value() to give us the negative value on syscall failure.
THE only other user of this macro, kretprobe_example.c, won't notice and it
makes the value signed consistently for the audit functions across all archs.
In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
audit code as the return value. But the ptrace_64.h code defined the macro
regs_return_value() as regs[3]. I have no idea which one is correct, but this
patch now uses the regs_return_value() function, so it now uses regs[3].
For powerpc we previously used regs->result but now use the
regs_return_value() function which uses regs->gprs[3]. regs->gprs[3] is
always positive so the regs_return_value(), much like ia64 makes it negative
before calling the audit code when appropriate.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
Acked-by: Richard Weinberger <richard@nod.at> [for uml]
Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
* 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Move <asm/asm-offsets.h> from trace_syscalls.c to asm/syscall.h
x86, um: Fix typo in 32-bit system call modifications
um: Use $(srctree) not $(KBUILD_SRC)
x86, um: Mark system call tables readonly
x86, um: Use the same style generated syscall tables as native
um: Generate headers before generating user-offsets.s
um: Run host archheaders, allow use of host generated headers
kbuild, headers.sh: Don't make archheaders explicitly
x86, syscall: Allow syscall offset to be symbolic
x86, syscall: Re-fix typo in comment
x86: Simplify syscallhdr.sh
x86: Generate system call tables and unistd_*.h from tables
checksyscalls: Use arch/x86/syscalls/syscall_32.tbl as source
x86: Machine-readable syscall tables and scripts to process them
trace: Include <asm/asm-offsets.h> in trace_syscalls.c
x86-64, ia32: Move compat_ni_syscall into C and its own file
x86-64, syscall: Adjust comment spacing and remove typo
kbuild: Add support for an "archheaders" target
kbuild: Add support for installing generated asm headers
frv, h8300, m68k, microblaze, openrisc, score, um and xtensa currently
do not register a CPU device. Add the config option GENERIC_CPU_DEVICES
which causes a generic CPU device to be registered for each present CPU,
and make all these architectures select it.
Richard Weinberger <richard@nod.at> covered UML and suggested using
per_cpu.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
PM / Hibernate: Implement compat_ioctl for /dev/snapshot
PM / Freezer: fix return value of freezable_schedule_timeout_killable()
PM / shmobile: Allow the A4R domain to be turned off at run time
PM / input / touchscreen: Make st1232 use device PM QoS constraints
PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
PM / shmobile: Remove the stay_on flag from SH7372's PM domains
PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
PM: Drop generic_subsys_pm_ops
PM / Sleep: Remove forward-only callbacks from AMBA bus type
PM / Sleep: Remove forward-only callbacks from platform bus type
PM: Run the driver callback directly if the subsystem one is not there
PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
PM / Sleep: Merge internal functions in generic_ops.c
PM / Sleep: Simplify generic system suspend callbacks
PM / Hibernate: Remove deprecated hibernation snapshot ioctls
PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
ARM: S3C64XX: Implement basic power domain support
PM / shmobile: Use common always on power domain governor
...
Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
XBT_FORCE_SLEEP bit
Those two APIs were provided to optimize the calls of
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq disabled section. This way no interrupt happening in-between would
needlessly process any RCU job.
Now we are talking about an optimization for which benefits
have yet to be measured. Let's start simple and completely decouple
idle rcu and dyntick idle logics to simplify.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
It is assumed that rcu won't be used once we switch to tickless
mode and until we restart the tick. However this is not always
true, as in x86-64 where we dereference the idle notifiers after
the tick is stopped.
To prepare for fixing this, add two new APIs:
tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().
If no use of RCU is made in the idle loop between
tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
must instead call the new *_norcu() version such that the arch doesn't
need to call rcu_idle_enter() and rcu_idle_exit().
Otherwise the arch must call tick_nohz_enter_idle() and
tick_nohz_exit_idle() and also call explicitly:
- rcu_idle_enter() after its last use of RCU before the CPU is put
to sleep.
- rcu_idle_exit() before the first use of RCU after the CPU is woken
up.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The tick_nohz_stop_sched_tick() function, which tries to delay
the next timer tick as long as possible, can be called from two
places:
- From the idle loop to start the dytick idle mode
- From interrupt exit if we have interrupted the dyntick
idle mode, so that we reprogram the next tick event in
case the irq changed some internal state that requires this
action.
There are only few minor differences between both that
are handled by that function, driven by the ts->inidle
cpu variable and the inidle parameter. The whole guarantees
that we only update the dyntick mode on irq exit if we actually
interrupted the dyntick idle mode, and that we enter in RCU extended
quiescent state from idle loop entry only.
Split this function into:
- tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
dynticks idle mode unconditionally if it can, and enters into RCU
extended quiescent state.
- tick_nohz_irq_exit() which only updates the dynticks idle mode
when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).
To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
into tick_nohz_idle_exit().
This simplifies the code and micro-optimize the irq exit path (no need
for local_irq_save there). This also prepares for the split between
dynticks and rcu extended quiescent state logics. We'll need this split to
further fix illegal uses of RCU in extended quiescent states in the idle
loop.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
$(KBUILD_SRC) is not defined without O=, use $(srctree).
Reported-and-tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In case we need generated header files for the values in
user-offsets.h, make sure we build generated header files before
user-offsets.s is built.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Run the "archheaders" target for the host architecture, for
architectures (like x86, now) that want to generate some of the
necessary header files.
Add $(HOST_DIR)/include/generated to the include path so we then pick
them up.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This converts the um clocksource to use clocksource_register_hz/khz
This is untested, so any assistance in testing would be appreciated!
CC: Jeff Dike <jdike@addtoit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
* 'for-linus' of git://github.com/richardweinberger/linux: (90 commits)
um: fix ubd cow size
um: Fix kmalloc argument order in um/vdso/vma.c
um: switch to use of drivers/Kconfig
UserModeLinux-HOWTO.txt: fix a typo
UserModeLinux-HOWTO.txt: remove ^H characters
um: we need sys/user.h only on i386
um: merge delay_{32,64}.c
um: distribute exports to where exported stuff is defined
um: kill system-um.h
um: generic ftrace.h will do...
um: segment.h is x86-only and needed only there
um: asm/pda.h is not needed anymore
um: hw_irq.h can go generic as well
um: switch to generic-y
um: clean Kconfig up a bit
um: a couple of missing dependencies...
um: kill useless argument of free_chan() and free_one_chan()
um: unify ptrace_user.h
um: unify KSTK_...
um: fix gcov build breakage
...
ubd_file_size() cannot use ubd_dev->cow.file because at this time
ubd_dev->cow.file is not initialized.
Therefore, ubd_file_size() will always report a wrong disk size when
COW files are used.
Reading from /dev/ubd* would crash the kernel.
We have to read the correct disk size from the COW file's backing
file.
Signed-off-by: Richard Weinberger <richard@nod.at>
CC: stable@kernel.org
ksyms.c is down to the stuff defined in various USER_OBJS
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>