Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (31 commits)
  [S390] disassembler: mark exception causing instructions
  [S390] Enable exception traces by default
  [S390] return address of compat signals
  [S390] sysctl: get rid of dead declaration
  [S390] dasd: fix fixpoint divide exception in define_extent
  [S390] dasd: add sanity check to detect path connection error
  [S390] qdio: fix kernel panic for zfcp 31-bit
  [S390] Add s390x description to Documentation/kdump/kdump.txt
  [S390] Add VMCOREINFO_SYMBOL(high_memory) to vmcoreinfo
  [S390] dasd: fix expiration handling for recovery requests
  [S390] outstanding interrupts vs. smp_send_stop
  [S390] ipc: call generic sys_ipc demultiplexer
  [S390] zcrypt: Fix error return codes.
  [S390] zcrypt: Rework length parameter checking.
  [S390] cleanup trap handling
  [S390] Remove Kerntypes leftovers
  [S390] topology: increase poll frequency if change is anticipated
  [S390] entry[64].S improvements
  [S390] make arch/s390 subdirectories depend on config option
  [S390] kvm: move cmf host id constant out of lowcore
  ...

Fix up conflicts in arch/s390/kernel/{smp.c,topology.c} due to the
sysdev removal clashing with "topology: get rid of ifdefs" which moved
some of that code around.
Linus Torvalds 2012-01-09 08:11:13 -08:00
commit 72f318897e
49 changed files with 1999 additions and 2032 deletions

@ -66,7 +66,6 @@ GRTAGS
GSYMS
GTAGS
Image
Kerntypes
Module.markers
Module.symvers
PENDING

@ -17,8 +17,8 @@ You can use common commands, such as cp and scp, to copy the
memory image to a dump file on the local disk, or across the network to
a remote system.
Kdump and kexec are currently supported on the x86, x86_64, ppc64 and ia64
architectures.
Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
and s390x architectures.
When the system kernel boots, it reserves a small section of memory for
the dump-capture kernel. This ensures that ongoing Direct Memory Access
@ -34,11 +34,18 @@ Similarly on PPC64 machines first 32KB of physical memory is needed for
booting regardless of where the kernel is loaded and to support 64K page
size kexec backs up the first 64KB memory.
For s390x, when kdump is triggered, the crashkernel region is exchanged
with the region [0, crashkernel region size] and then the kdump kernel
runs in [0, crashkernel region size]. Therefore no relocatable kernel is
needed for s390x.
All of the necessary information about the system kernel's core image is
encoded in the ELF format, and stored in a reserved area of memory
before a crash. The physical address of the start of the ELF header is
passed to the dump-capture kernel through the elfcorehdr= boot
parameter.
parameter. Optionally the size of the ELF header can also be passed
when using the elfcorehdr=[size[KMG]@]offset[KMG] syntax.
With the dump-capture kernel, you can access the memory image, or "old
memory," in two ways:
@ -291,6 +298,10 @@ Boot into System Kernel
The region may be automatically placed on ia64, see the
dump-capture kernel config option notes above.
On s390x, typically use "crashkernel=xxM". The value of xx is dependent
on the memory consumption of the kdump system. In general this is not
dependent on the memory size of the production system.
Load the Dump-capture Kernel
============================
@ -308,6 +319,8 @@ For ppc64:
- Use vmlinux
For ia64:
- Use vmlinux or vmlinuz.gz
For s390x:
- Use image or bzImage
If you are using an uncompressed vmlinux image then use the following command
@ -337,6 +350,8 @@ For i386, x86_64 and ia64:
For ppc64:
"1 maxcpus=1 noirqdistrib reset_devices"
For s390x:
"1 maxcpus=1 cgroup_disable=memory"
Notes on loading the dump-capture kernel:
@ -362,6 +377,20 @@ Notes on loading the dump-capture kernel:
dump. Hence generally it is useful either to build a UP dump-capture
kernel or specify maxcpus=1 option while loading dump-capture kernel.
* For s390x there are two kdump modes: If an ELF header is specified with
the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
is done on all other architectures. If no elfcorehdr= kernel parameter is
specified, the s390x kdump kernel dynamically creates the header. The
second mode has the advantage that for CPU and memory hotplug, kdump does
not have to be reloaded with kexec_load().
* For s390x systems with many attached devices the "cio_ignore" kernel
parameter should be used for the kdump kernel in order to prevent allocation
of kernel memory for devices that are not relevant for kdump. The same
applies to systems that use SCSI/FCP devices. In that case the
"allow_lun_scan" zfcp module parameter should be set to zero before
setting FCP devices online.
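The elfcorehdr=[size[KMG]@]offset[KMG] syntax described in this document can be illustrated with a small user-space parser. This is only a sketch under stated assumptions: parse_size() and parse_elfcorehdr() are hypothetical names that do not appear in the kernel, and the suffix handling merely approximates what the kernel's memparse() does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse a number with an optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the suffix character */
		break;
	default:
		break;
	}
	return v;
}

/*
 * Hypothetical parser for "[size[KMG]@]offset[KMG]".
 * Returns 0 on success and -1 on malformed input. *size is set to 0
 * when the optional size part is absent.
 */
static int parse_elfcorehdr(const char *arg, unsigned long long *size,
			    unsigned long long *offset)
{
	char *end;
	unsigned long long first;

	assert(arg && size && offset);
	first = parse_size(arg, &end);
	if (end == arg)
		return -1;		/* no digits at all */
	if (*end == '@') {
		const char *rest = end + 1;

		*size = first;
		*offset = parse_size(rest, &end);
		if (end == rest)
			return -1;	/* '@' without an offset */
	} else {
		*size = 0;
		*offset = first;
	}
	return *end == '\0' ? 0 : -1;
}
```

With this sketch, "64K@0x2000000" yields a size of 64 KiB at offset 0x2000000, while a plain "512M" yields only an offset and leaves the size at 0.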
Kernel Panic
============

@ -41,7 +41,6 @@ ldd
Debugging modules
The proc file system
Starting points for debugging scripting languages etc.
Dumptool & Lcrash
SysRq
References
Special Thanks
@ -2455,39 +2454,6 @@ jdb <filename> another fully interactive gdb style debugger.
Dumptool & Lcrash ( lkcd )
==========================
Michael Holzheu & others here at IBM have a fairly mature port of
SGI's lcrash tool which allows one to look at kernel structures in a
running kernel.
It also complements a tool called dumptool which dumps all the kernel's
memory pages & registers to either a tape or a disk.
This can be used by tech support or an ambitious end user to do
post mortem debugging of a machine like gdb core dumps.
Going into how to use this tool in detail will be explained
in other documentation supplied by IBM with the patches & the
lcrash homepage http://oss.sgi.com/projects/lkcd/ & the lcrash manpage.
How they work
-------------
Lcrash is a perfectly normal program, however, it requires 2
additional files, Kerntypes which is built using a patch to the
linux kernel sources in the linux root directory & the System.map.
Kerntypes is an objectfile whose sole purpose in life
is to provide stabs debug info to lcrash, to do this
Kerntypes is built from kerntypes.c which just includes the most commonly
referenced header files used when debugging, lcrash can then read the
.stabs section of this file.
Debugging a live system it uses /dev/mem
alternatively for post mortem debugging it uses the data
collected by dumptool.
SysRq
=====
This is now supported by linux for s/390 & z/Architecture.

@ -1,6 +1,7 @@
obj-y += kernel/
obj-y += mm/
obj-y += crypto/
obj-y += appldata/
obj-y += hypfs/
obj-y += kvm/
obj-y += kernel/
obj-y += mm/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_CRYPTO_HW) += crypto/
obj-$(CONFIG_S390_HYPFS_FS) += hypfs/
obj-$(CONFIG_APPLDATA_BASE) += appldata/
obj-$(CONFIG_MATHEMU) += math-emu/

@ -193,18 +193,13 @@ config HOTPLUG_CPU
Say N if you want to disable CPU hotplug.
config SCHED_MC
def_bool y
prompt "Multi-core scheduler support"
depends on SMP
help
Multi-core scheduler support improves the CPU scheduler's decision
making when dealing with multi-core CPU chips at a cost of slightly
increased overhead in some places.
def_bool n
config SCHED_BOOK
def_bool y
prompt "Book scheduler support"
depends on SMP && SCHED_MC
depends on SMP
select SCHED_MC
help
Book scheduler support improves the CPU scheduler's decision making
when dealing with machines that have several books.

@ -99,7 +99,6 @@ core-y += arch/s390/
libs-y += arch/s390/lib/
drivers-y += drivers/s390/
drivers-$(CONFIG_MATHEMU) += arch/s390/math-emu/
# must be linked after kernel
drivers-$(CONFIG_OPROFILE) += arch/s390/oprofile/

@ -23,4 +23,4 @@ $(obj)/compressed/vmlinux: FORCE
install: $(CONFIGURE) $(obj)/image
sh -x $(srctree)/$(obj)/install.sh $(KERNELRELEASE) $(obj)/image \
System.map Kerntypes "$(INSTALL_PATH)"
System.map "$(INSTALL_PATH)"

@ -22,6 +22,6 @@ enum die_val {
DIE_NMI_IPI,
};
extern void die(const char *, struct pt_regs *, long);
extern void die(struct pt_regs *, const char *);
#endif

@ -97,47 +97,52 @@ struct _lowcore {
__u32 gpregs_save_area[16]; /* 0x0180 */
__u32 cregs_save_area[16]; /* 0x01c0 */
/* Save areas. */
__u32 save_area_sync[8]; /* 0x0200 */
__u32 save_area_async[8]; /* 0x0220 */
__u32 save_area_restart[1]; /* 0x0240 */
__u8 pad_0x0244[0x0248-0x0244]; /* 0x0244 */
/* Return psws. */
__u32 save_area[16]; /* 0x0200 */
psw_t return_psw; /* 0x0240 */
psw_t return_mcck_psw; /* 0x0248 */
psw_t return_psw; /* 0x0248 */
psw_t return_mcck_psw; /* 0x0250 */
/* CPU time accounting values */
__u64 sync_enter_timer; /* 0x0250 */
__u64 async_enter_timer; /* 0x0258 */
__u64 mcck_enter_timer; /* 0x0260 */
__u64 exit_timer; /* 0x0268 */
__u64 user_timer; /* 0x0270 */
__u64 system_timer; /* 0x0278 */
__u64 steal_timer; /* 0x0280 */
__u64 last_update_timer; /* 0x0288 */
__u64 last_update_clock; /* 0x0290 */
__u64 sync_enter_timer; /* 0x0258 */
__u64 async_enter_timer; /* 0x0260 */
__u64 mcck_enter_timer; /* 0x0268 */
__u64 exit_timer; /* 0x0270 */
__u64 user_timer; /* 0x0278 */
__u64 system_timer; /* 0x0280 */
__u64 steal_timer; /* 0x0288 */
__u64 last_update_timer; /* 0x0290 */
__u64 last_update_clock; /* 0x0298 */
/* Current process. */
__u32 current_task; /* 0x0298 */
__u32 thread_info; /* 0x029c */
__u32 kernel_stack; /* 0x02a0 */
__u32 current_task; /* 0x02a0 */
__u32 thread_info; /* 0x02a4 */
__u32 kernel_stack; /* 0x02a8 */
/* Interrupt and panic stack. */
__u32 async_stack; /* 0x02a4 */
__u32 panic_stack; /* 0x02a8 */
__u32 async_stack; /* 0x02ac */
__u32 panic_stack; /* 0x02b0 */
/* Address space pointer. */
__u32 kernel_asce; /* 0x02ac */
__u32 user_asce; /* 0x02b0 */
__u32 current_pid; /* 0x02b4 */
__u32 kernel_asce; /* 0x02b4 */
__u32 user_asce; /* 0x02b8 */
__u32 current_pid; /* 0x02bc */
/* SMP info area */
__u32 cpu_nr; /* 0x02b8 */
__u32 softirq_pending; /* 0x02bc */
__u32 percpu_offset; /* 0x02c0 */
__u32 ext_call_fast; /* 0x02c4 */
__u64 int_clock; /* 0x02c8 */
__u64 mcck_clock; /* 0x02d0 */
__u64 clock_comparator; /* 0x02d8 */
__u32 machine_flags; /* 0x02e0 */
__u32 ftrace_func; /* 0x02e4 */
__u8 pad_0x02e8[0x0300-0x02e8]; /* 0x02e8 */
__u32 cpu_nr; /* 0x02c0 */
__u32 softirq_pending; /* 0x02c4 */
__u32 percpu_offset; /* 0x02c8 */
__u32 ext_call_fast; /* 0x02cc */
__u64 int_clock; /* 0x02d0 */
__u64 mcck_clock; /* 0x02d8 */
__u64 clock_comparator; /* 0x02e0 */
__u32 machine_flags; /* 0x02e8 */
__u32 ftrace_func; /* 0x02ec */
__u8 pad_0x02f8[0x0300-0x02f0]; /* 0x02f0 */
/* Interrupt response block */
__u8 irb[64]; /* 0x0300 */
@ -229,57 +234,62 @@ struct _lowcore {
psw_t mcck_new_psw; /* 0x01e0 */
psw_t io_new_psw; /* 0x01f0 */
/* Entry/exit save area & return psws. */
__u64 save_area[16]; /* 0x0200 */
psw_t return_psw; /* 0x0280 */
psw_t return_mcck_psw; /* 0x0290 */
/* Save areas. */
__u64 save_area_sync[8]; /* 0x0200 */
__u64 save_area_async[8]; /* 0x0240 */
__u64 save_area_restart[1]; /* 0x0280 */
__u8 pad_0x0288[0x0290-0x0288]; /* 0x0288 */
/* Return psws. */
psw_t return_psw; /* 0x0290 */
psw_t return_mcck_psw; /* 0x02a0 */
/* CPU accounting and timing values. */
__u64 sync_enter_timer; /* 0x02a0 */
__u64 async_enter_timer; /* 0x02a8 */
__u64 mcck_enter_timer; /* 0x02b0 */
__u64 exit_timer; /* 0x02b8 */
__u64 user_timer; /* 0x02c0 */
__u64 system_timer; /* 0x02c8 */
__u64 steal_timer; /* 0x02d0 */
__u64 last_update_timer; /* 0x02d8 */
__u64 last_update_clock; /* 0x02e0 */
__u64 sync_enter_timer; /* 0x02b0 */
__u64 async_enter_timer; /* 0x02b8 */
__u64 mcck_enter_timer; /* 0x02c0 */
__u64 exit_timer; /* 0x02c8 */
__u64 user_timer; /* 0x02d0 */
__u64 system_timer; /* 0x02d8 */
__u64 steal_timer; /* 0x02e0 */
__u64 last_update_timer; /* 0x02e8 */
__u64 last_update_clock; /* 0x02f0 */
/* Current process. */
__u64 current_task; /* 0x02e8 */
__u64 thread_info; /* 0x02f0 */
__u64 kernel_stack; /* 0x02f8 */
__u64 current_task; /* 0x02f8 */
__u64 thread_info; /* 0x0300 */
__u64 kernel_stack; /* 0x0308 */
/* Interrupt and panic stack. */
__u64 async_stack; /* 0x0300 */
__u64 panic_stack; /* 0x0308 */
__u64 async_stack; /* 0x0310 */
__u64 panic_stack; /* 0x0318 */
/* Address space pointer. */
__u64 kernel_asce; /* 0x0310 */
__u64 user_asce; /* 0x0318 */
__u64 current_pid; /* 0x0320 */
__u64 kernel_asce; /* 0x0320 */
__u64 user_asce; /* 0x0328 */
__u64 current_pid; /* 0x0330 */
/* SMP info area */
__u32 cpu_nr; /* 0x0328 */
__u32 softirq_pending; /* 0x032c */
__u64 percpu_offset; /* 0x0330 */
__u64 ext_call_fast; /* 0x0338 */
__u64 int_clock; /* 0x0340 */
__u64 mcck_clock; /* 0x0348 */
__u64 clock_comparator; /* 0x0350 */
__u64 vdso_per_cpu_data; /* 0x0358 */
__u64 machine_flags; /* 0x0360 */
__u64 ftrace_func; /* 0x0368 */
__u64 gmap; /* 0x0370 */
__u64 cmf_hpp; /* 0x0378 */
__u32 cpu_nr; /* 0x0338 */
__u32 softirq_pending; /* 0x033c */
__u64 percpu_offset; /* 0x0340 */
__u64 ext_call_fast; /* 0x0348 */
__u64 int_clock; /* 0x0350 */
__u64 mcck_clock; /* 0x0358 */
__u64 clock_comparator; /* 0x0360 */
__u64 vdso_per_cpu_data; /* 0x0368 */
__u64 machine_flags; /* 0x0370 */
__u64 ftrace_func; /* 0x0378 */
__u64 gmap; /* 0x0380 */
__u8 pad_0x0388[0x0400-0x0388]; /* 0x0388 */
/* Interrupt response block. */
__u8 irb[64]; /* 0x0380 */
__u8 irb[64]; /* 0x0400 */
/* Per cpu primary space access list */
__u32 paste[16]; /* 0x03c0 */
__u32 paste[16]; /* 0x0440 */
__u8 pad_0x0400[0x0e00-0x0400]; /* 0x0400 */
__u8 pad_0x0480[0x0e00-0x0480]; /* 0x0480 */
/*
* 0xe00 contains the address of the IPL Parameter Information

@ -128,28 +128,11 @@ static inline int is_zero_pfn(unsigned long pfn)
* effect, this also makes sure that 64 bit module code cannot be used
* as system call address.
*/
extern unsigned long VMALLOC_START;
extern unsigned long VMALLOC_END;
extern struct page *vmemmap;
#ifndef __s390x__
#define VMALLOC_SIZE (96UL << 20)
#define VMALLOC_END 0x7e000000UL
#define VMEM_MAP_END 0x80000000UL
#else /* __s390x__ */
#define VMALLOC_SIZE (128UL << 30)
#define VMALLOC_END 0x3e000000000UL
#define VMEM_MAP_END 0x40000000000UL
#endif /* __s390x__ */
/*
* VMEM_MAX_PHYS is the highest physical address that can be added to the 1:1
* mapping. This needs to be calculated at compile time since the size of the
* VMEM_MAP is static but the size of struct page can change.
*/
#define VMEM_MAX_PAGES ((VMEM_MAP_END - VMALLOC_END) / sizeof(struct page))
#define VMEM_MAX_PFN min(VMALLOC_START >> PAGE_SHIFT, VMEM_MAX_PAGES)
#define VMEM_MAX_PHYS ((VMEM_MAX_PFN << PAGE_SHIFT) & ~((16 << 20) - 1))
#define vmemmap ((struct page *) VMALLOC_END)
#define VMEM_MAX_PHYS ((unsigned long) vmemmap)
/*
* A 31 bit pagetable entry of S390 has following format:

@ -80,8 +80,6 @@ struct thread_struct {
unsigned int acrs[NUM_ACRS];
unsigned long ksp; /* kernel stack pointer */
mm_segment_t mm_segment;
unsigned long prot_addr; /* address of protection-excep. */
unsigned int trap_no;
unsigned long gmap_addr; /* address of last gmap fault. */
struct per_regs per_user; /* User specified PER registers */
struct per_event per_event; /* Cause of the last PER trap */

@ -324,7 +324,8 @@ struct pt_regs
psw_t psw;
unsigned long gprs[NUM_GPRS];
unsigned long orig_gpr2;
unsigned int svc_code;
unsigned int int_code;
unsigned long int_parm_long;
};
/*

@ -352,7 +352,7 @@ typedef void qdio_handler_t(struct ccw_device *, unsigned int, int,
* @no_output_qs: number of output queues
* @input_handler: handler to be called for input queues
* @output_handler: handler to be called for output queues
* @queue_start_poll: polling handlers (one per input queue or NULL)
* @queue_start_poll_array: polling handlers (one per input queue or NULL)
* @int_parm: interruption parameter
* @input_sbal_addr_array: address of no_input_qs * 128 pointers
* @output_sbal_addr_array: address of no_output_qs * 128 pointers
@ -372,7 +372,8 @@ struct qdio_initialize {
unsigned int no_output_qs;
qdio_handler_t *input_handler;
qdio_handler_t *output_handler;
void (**queue_start_poll) (struct ccw_device *, int, unsigned long);
void (**queue_start_poll_array) (struct ccw_device *, int,
unsigned long);
int scan_threshold;
unsigned long int_parm;
void **input_sbal_addr_array;

@ -56,6 +56,7 @@ enum {
ec_schedule = 0,
ec_call_function,
ec_call_function_single,
ec_stop_cpu,
};
/*

@ -23,7 +23,6 @@ extern void __cpu_die (unsigned int cpu);
extern int __cpu_up (unsigned int cpu);
extern struct mutex smp_cpu_state_mutex;
extern int smp_cpu_polarization[];
extern void arch_send_call_function_single_ipi(int cpu);
extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);

@ -4,8 +4,8 @@
#ifdef CONFIG_64BIT
#define SECTION_SIZE_BITS 28
#define MAX_PHYSADDR_BITS 42
#define MAX_PHYSMEM_BITS 42
#define MAX_PHYSADDR_BITS 46
#define MAX_PHYSMEM_BITS 46
#else

@ -27,7 +27,7 @@ static inline long syscall_get_nr(struct task_struct *task,
struct pt_regs *regs)
{
return test_tsk_thread_flag(task, TIF_SYSCALL) ?
(regs->svc_code & 0xffff) : -1;
(regs->int_code & 0xffff) : -1;
}
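The rename from svc_code to int_code means one field now serves both system calls and program checks, and the diff applies two different masks to it: "& 0xffff" in syscall_get_nr() and "& 127" in setup_frame32(). The following user-space sketch shows the two extractions side by side. The helper names are hypothetical, and it assumes only what the masks themselves suggest, namely that the interruption code sits in the low halfword and that any bits above it are ignored by these callers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: system call number as syscall_get_nr() derives
 * it, i.e. the low 16 bits of int_code, or -1 when the task is not
 * inside a system call.
 */
static inline int svc_number(uint32_t int_code, int in_syscall)
{
	return in_syscall ? (int)(int_code & 0xffff) : -1;
}

/*
 * Hypothetical helper: trap number as passed to 31-bit signal handlers
 * in gprs[4], i.e. the low 7 bits of int_code.
 */
static inline unsigned int signal_trap_number(uint32_t int_code)
{
	return int_code & 127;
}
```

Masking with 127 discards everything above bit 6, so a code of 0x91 in the low byte is reported to the handler as 0x11.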
static inline void syscall_rollback(struct task_struct *task,

@ -20,8 +20,6 @@
struct task_struct;
extern int sysctl_userprocess_debug;
extern struct task_struct *__switch_to(void *, void *);
extern void update_per_regs(struct task_struct *task);

@ -4,6 +4,10 @@
#include <linux/cpumask.h>
#include <asm/sysinfo.h>
struct cpu;
#ifdef CONFIG_SCHED_BOOK
extern unsigned char cpu_core_id[NR_CPUS];
extern cpumask_t cpu_core_map[NR_CPUS];
@ -16,8 +20,6 @@ static inline const struct cpumask *cpu_coregroup_mask(int cpu)
#define topology_core_cpumask(cpu) (&cpu_core_map[cpu])
#define mc_capable() (1)
#ifdef CONFIG_SCHED_BOOK
extern unsigned char cpu_book_id[NR_CPUS];
extern cpumask_t cpu_book_map[NR_CPUS];
@ -29,19 +31,45 @@ static inline const struct cpumask *cpu_book_mask(int cpu)
#define topology_book_id(cpu) (cpu_book_id[cpu])
#define topology_book_cpumask(cpu) (&cpu_book_map[cpu])
#endif /* CONFIG_SCHED_BOOK */
int topology_cpu_init(struct cpu *);
int topology_set_cpu_management(int fc);
void topology_schedule_update(void);
void store_topology(struct sysinfo_15_1_x *info);
void topology_expect_change(void);
#define POLARIZATION_UNKNWN (-1)
#else /* CONFIG_SCHED_BOOK */
static inline void topology_schedule_update(void) { }
static inline int topology_cpu_init(struct cpu *cpu) { return 0; }
static inline void topology_expect_change(void) { }
#endif /* CONFIG_SCHED_BOOK */
#define POLARIZATION_UNKNOWN (-1)
#define POLARIZATION_HRZ (0)
#define POLARIZATION_VL (1)
#define POLARIZATION_VM (2)
#define POLARIZATION_VH (3)
#ifdef CONFIG_SMP
extern int cpu_polarization[];
static inline void cpu_set_polarization(int cpu, int val)
{
#ifdef CONFIG_SCHED_BOOK
cpu_polarization[cpu] = val;
#endif
}
static inline int cpu_read_polarization(int cpu)
{
#ifdef CONFIG_SCHED_BOOK
return cpu_polarization[cpu];
#else
return POLARIZATION_HRZ;
#endif
}
#ifdef CONFIG_SCHED_BOOK
void s390_init_cpu_topology(void);
#else
static inline void s390_init_cpu_topology(void)

@ -398,6 +398,7 @@
#define __ARCH_WANT_SYS_SIGNAL
#define __ARCH_WANT_SYS_UTIME
#define __ARCH_WANT_SYS_SOCKETCALL
#define __ARCH_WANT_SYS_IPC
#define __ARCH_WANT_SYS_FADVISE64
#define __ARCH_WANT_SYS_GETPGRP
#define __ARCH_WANT_SYS_LLSEEK

@ -32,7 +32,8 @@ extra-y += head.o init_task.o vmlinux.lds
extra-y += $(if $(CONFIG_64BIT),head64.o,head31.o)
obj-$(CONFIG_MODULES) += s390_ksyms.o module.o
obj-$(CONFIG_SMP) += smp.o topology.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_SCHED_BOOK) += topology.o
obj-$(CONFIG_SMP) += $(if $(CONFIG_64BIT),switch_cpu64.o, \
switch_cpu.o)
obj-$(CONFIG_HIBERNATION) += suspend.o swsusp_asm64.o

@ -45,7 +45,8 @@ int main(void)
DEFINE(__PT_PSW, offsetof(struct pt_regs, psw));
DEFINE(__PT_GPRS, offsetof(struct pt_regs, gprs));
DEFINE(__PT_ORIG_GPR2, offsetof(struct pt_regs, orig_gpr2));
DEFINE(__PT_SVC_CODE, offsetof(struct pt_regs, svc_code));
DEFINE(__PT_INT_CODE, offsetof(struct pt_regs, int_code));
DEFINE(__PT_INT_PARM_LONG, offsetof(struct pt_regs, int_parm_long));
DEFINE(__PT_SIZE, sizeof(struct pt_regs));
BLANK();
DEFINE(__SF_BACKCHAIN, offsetof(struct stack_frame, back_chain));
@ -108,7 +109,9 @@ int main(void)
DEFINE(__LC_PGM_NEW_PSW, offsetof(struct _lowcore, program_new_psw));
DEFINE(__LC_MCK_NEW_PSW, offsetof(struct _lowcore, mcck_new_psw));
DEFINE(__LC_IO_NEW_PSW, offsetof(struct _lowcore, io_new_psw));
DEFINE(__LC_SAVE_AREA, offsetof(struct _lowcore, save_area));
DEFINE(__LC_SAVE_AREA_SYNC, offsetof(struct _lowcore, save_area_sync));
DEFINE(__LC_SAVE_AREA_ASYNC, offsetof(struct _lowcore, save_area_async));
DEFINE(__LC_SAVE_AREA_RESTART, offsetof(struct _lowcore, save_area_restart));
DEFINE(__LC_RETURN_PSW, offsetof(struct _lowcore, return_psw));
DEFINE(__LC_RETURN_MCCK_PSW, offsetof(struct _lowcore, return_mcck_psw));
DEFINE(__LC_SYNC_ENTER_TIMER, offsetof(struct _lowcore, sync_enter_timer));
@ -150,7 +153,6 @@ int main(void)
DEFINE(__LC_LAST_BREAK, offsetof(struct _lowcore, breaking_event_addr));
DEFINE(__LC_VDSO_PER_CPU, offsetof(struct _lowcore, vdso_per_cpu_data));
DEFINE(__LC_GMAP, offsetof(struct _lowcore, gmap));
DEFINE(__LC_CMF_HPP, offsetof(struct _lowcore, cmf_hpp));
DEFINE(__GMAP_ASCE, offsetof(struct gmap, asce));
#endif /* CONFIG_32BIT */
return 0;

@ -33,7 +33,7 @@ s390_base_mcck_handler_fn:
.previous
ENTRY(s390_base_ext_handler)
stmg %r0,%r15,__LC_SAVE_AREA
stmg %r0,%r15,__LC_SAVE_AREA_ASYNC
basr %r13,0
0: aghi %r15,-STACK_FRAME_OVERHEAD
larl %r1,s390_base_ext_handler_fn
@ -41,7 +41,7 @@ ENTRY(s390_base_ext_handler)
ltgr %r1,%r1
jz 1f
basr %r14,%r1
1: lmg %r0,%r15,__LC_SAVE_AREA
1: lmg %r0,%r15,__LC_SAVE_AREA_ASYNC
ni __LC_EXT_OLD_PSW+1,0xfd # clear wait state bit
lpswe __LC_EXT_OLD_PSW
@ -53,7 +53,7 @@ s390_base_ext_handler_fn:
.previous
ENTRY(s390_base_pgm_handler)
stmg %r0,%r15,__LC_SAVE_AREA
stmg %r0,%r15,__LC_SAVE_AREA_SYNC
basr %r13,0
0: aghi %r15,-STACK_FRAME_OVERHEAD
larl %r1,s390_base_pgm_handler_fn
@ -61,7 +61,7 @@ ENTRY(s390_base_pgm_handler)
ltgr %r1,%r1
jz 1f
basr %r14,%r1
lmg %r0,%r15,__LC_SAVE_AREA
lmg %r0,%r15,__LC_SAVE_AREA_SYNC
lpswe __LC_PGM_OLD_PSW
1: lpswe disabled_wait_psw-0b(%r13)
@ -142,7 +142,7 @@ s390_base_mcck_handler_fn:
.previous
ENTRY(s390_base_ext_handler)
stm %r0,%r15,__LC_SAVE_AREA
stm %r0,%r15,__LC_SAVE_AREA_ASYNC
basr %r13,0
0: ahi %r15,-STACK_FRAME_OVERHEAD
l %r1,2f-0b(%r13)
@ -150,7 +150,7 @@ ENTRY(s390_base_ext_handler)
ltr %r1,%r1
jz 1f
basr %r14,%r1
1: lm %r0,%r15,__LC_SAVE_AREA
1: lm %r0,%r15,__LC_SAVE_AREA_ASYNC
ni __LC_EXT_OLD_PSW+1,0xfd # clear wait state bit
lpsw __LC_EXT_OLD_PSW
@ -164,7 +164,7 @@ s390_base_ext_handler_fn:
.previous
ENTRY(s390_base_pgm_handler)
stm %r0,%r15,__LC_SAVE_AREA
stm %r0,%r15,__LC_SAVE_AREA_SYNC
basr %r13,0
0: ahi %r15,-STACK_FRAME_OVERHEAD
l %r1,2f-0b(%r13)
@ -172,7 +172,7 @@ ENTRY(s390_base_pgm_handler)
ltr %r1,%r1
jz 1f
basr %r14,%r1
lm %r0,%r15,__LC_SAVE_AREA
lm %r0,%r15,__LC_SAVE_AREA_SYNC
lpsw __LC_PGM_OLD_PSW
1: lpsw disabled_wait_psw-0b(%r13)

@ -278,9 +278,6 @@ asmlinkage long sys32_ipc(u32 call, int first, int second, int third, u32 ptr)
{
if (call >> 16) /* hack for backward compatibility */
return -EINVAL;
call &= 0xffff;
switch (call) {
case SEMTIMEDOP:
return compat_sys_semtimedop(first, compat_ptr(ptr),

@ -501,8 +501,12 @@ static int setup_frame32(int sig, struct k_sigaction *ka,
/* We forgot to include these in the sigcontext.
To avoid breaking binary compatibility, they are passed as args. */
regs->gprs[4] = current->thread.trap_no;
regs->gprs[5] = current->thread.prot_addr;
if (sig == SIGSEGV || sig == SIGBUS || sig == SIGILL ||
sig == SIGTRAP || sig == SIGFPE) {
/* set extra registers only for synchronous signals */
regs->gprs[4] = regs->int_code & 127;
regs->gprs[5] = regs->int_parm_long;
}
/* Place signal number on stack to allow backtrace from handler. */
if (__put_user(regs->gprs[2], (int __force __user *) &frame->signo))
@ -544,9 +548,9 @@ static int setup_rt_frame32(int sig, struct k_sigaction *ka, siginfo_t *info,
/* Set up to return from userspace. If provided, use a stub
already in userspace. */
if (ka->sa.sa_flags & SA_RESTORER) {
regs->gprs[14] = (__u64) ka->sa.sa_restorer;
regs->gprs[14] = (__u64) ka->sa.sa_restorer | PSW32_ADDR_AMODE;
} else {
regs->gprs[14] = (__u64) frame->retcode;
regs->gprs[14] = (__u64) frame->retcode | PSW32_ADDR_AMODE;
err |= __put_user(S390_SYSCALL_OPCODE | __NR_rt_sigreturn,
(u16 __force __user *)(frame->retcode));
}

@ -1578,10 +1578,15 @@ void show_code(struct pt_regs *regs)
ptr += sprintf(ptr, "%s Code:", mode);
hops = 0;
while (start < end && hops < 8) {
*ptr++ = (start == 32) ? '>' : ' ';
opsize = insn_length(code[start]);
if (start + opsize == 32)
*ptr++ = '#';
else if (start == 32)
*ptr++ = '>';
else
*ptr++ = ' ';
addr = regs->psw.addr + start - 32;
ptr += sprintf(ptr, ONELONG, addr);
opsize = insn_length(code[start]);
if (start + opsize >= end)
break;
for (i = 0; i < opsize; i++)
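The marker selection in the show_code() hunk can be isolated into a small function. In that hunk the byte at buffer offset 32 corresponds to the PSW address (addr is computed as psw.addr + start - 32), so an instruction that ends exactly at offset 32 is the one executed immediately before the PSW address and gets '#', while the instruction at the PSW address itself gets '>'. This is a sketch only; insn_marker() is a hypothetical name, not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the marker choice in show_code():
 *   '#'  instruction ends right at the PSW address (offset 32)
 *   '>'  instruction starts at the PSW address
 *   ' '  any other instruction
 */
static char insn_marker(int start, int opsize)
{
	assert(opsize > 0);
	if (start + opsize == 32)
		return '#';
	if (start == 32)
		return '>';
	return ' ';
}
```

For a 4-byte instruction at offset 28 this returns '#', and for whatever instruction follows at offset 32 it returns '>'.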

@ -434,18 +434,22 @@ static void __init append_to_cmdline(size_t (*ipl_data)(char *, size_t))
}
}
static void __init setup_boot_command_line(void)
static inline int has_ebcdic_char(const char *str)
{
int i;
/* convert arch command line to ascii */
for (i = 0; i < ARCH_COMMAND_LINE_SIZE; i++)
if (COMMAND_LINE[i] & 0x80)
break;
if (i < ARCH_COMMAND_LINE_SIZE)
EBCASC(COMMAND_LINE, ARCH_COMMAND_LINE_SIZE);
COMMAND_LINE[ARCH_COMMAND_LINE_SIZE-1] = 0;
for (i = 0; str[i]; i++)
if (str[i] & 0x80)
return 1;
return 0;
}
static void __init setup_boot_command_line(void)
{
COMMAND_LINE[ARCH_COMMAND_LINE_SIZE - 1] = 0;
/* convert arch command line to ascii if necessary */
if (has_ebcdic_char(COMMAND_LINE))
EBCASC(COMMAND_LINE, ARCH_COMMAND_LINE_SIZE);
/* copy arch command line */
strlcpy(boot_command_line, strstrip(COMMAND_LINE),
ARCH_COMMAND_LINE_SIZE);
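The check in has_ebcdic_char() relies on ASCII being a 7-bit code: any byte with the high bit set cannot be plain ASCII, whereas EBCDIC letters and digits all have that bit set. A user-space sketch of the same heuristic follows. The function name is hypothetical, and the heuristic is only as strong as that assumption; an EBCDIC string made up solely of characters below 0x80, such as blanks (0x40), would not be detected.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for has_ebcdic_char(): returns 1 if the
 * NUL-terminated string contains a byte with the high bit set.
 */
static int has_high_bit_char(const char *str)
{
	int i;

	assert(str != 0);
	for (i = 0; str[i]; i++)
		if (str[i] & 0x80)
			return 1;
	return 0;
}
```

Note that the reworked setup_boot_command_line() terminates COMMAND_LINE before calling the check, which is what makes a NUL-terminated scan like this one safe.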

File diff suppressed because it is too large

@ -6,15 +6,15 @@
#include <asm/ptrace.h>
extern void (*pgm_check_table[128])(struct pt_regs *, long, unsigned long);
extern void (*pgm_check_table[128])(struct pt_regs *);
extern void *restart_stack;
asmlinkage long do_syscall_trace_enter(struct pt_regs *regs);
asmlinkage void do_syscall_trace_exit(struct pt_regs *regs);
void do_protection_exception(struct pt_regs *, long, unsigned long);
void do_dat_exception(struct pt_regs *, long, unsigned long);
void do_asce_exception(struct pt_regs *, long, unsigned long);
void do_protection_exception(struct pt_regs *regs);
void do_dat_exception(struct pt_regs *regs);
void do_asce_exception(struct pt_regs *regs);
void do_per_trap(struct pt_regs *regs);
void syscall_trace(struct pt_regs *regs, int entryexit);
@ -28,7 +28,7 @@ void do_extint(struct pt_regs *regs, unsigned int, unsigned int, unsigned long);
void do_restart(void);
int __cpuinit start_secondary(void *cpuvoid);
void __init startup_init(void);
void die(const char * str, struct pt_regs * regs, long err);
void die(struct pt_regs *regs, const char *str);
void __init time_init(void);

File diff suppressed because it is too large

@ -329,8 +329,8 @@ iplstart:
#
# reset files in VM reader
#
stidp __LC_SAVE_AREA # store cpuid
tm __LC_SAVE_AREA,0xff # running VM ?
stidp __LC_SAVE_AREA_SYNC # store cpuid
tm __LC_SAVE_AREA_SYNC,0xff# running VM ?
bno .Lnoreset
la %r2,.Lreset
lhi %r3,26

@ -208,6 +208,7 @@ void machine_kexec_cleanup(struct kimage *image)
void arch_crash_save_vmcoreinfo(void)
{
VMCOREINFO_SYMBOL(lowcore_ptr);
VMCOREINFO_SYMBOL(high_memory);
VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS);
}

@ -63,71 +63,83 @@ void detect_memory_layout(struct mem_chunk chunk[])
}
EXPORT_SYMBOL(detect_memory_layout);
/*
* Move memory chunks array from index "from" to index "to"
*/
static void mem_chunk_move(struct mem_chunk chunk[], int to, int from)
{
int cnt = MEMORY_CHUNKS - to;
memmove(&chunk[to], &chunk[from], cnt * sizeof(struct mem_chunk));
}
/*
* Initialize memory chunk
*/
static void mem_chunk_init(struct mem_chunk *chunk, unsigned long addr,
unsigned long size, int type)
{
chunk->type = type;
chunk->addr = addr;
chunk->size = size;
}
/*
* Create memory hole with given address, size, and type
*/
void create_mem_hole(struct mem_chunk chunks[], unsigned long addr,
void create_mem_hole(struct mem_chunk chunk[], unsigned long addr,
unsigned long size, int type)
{
unsigned long start, end, new_size;
int i;
unsigned long lh_start, lh_end, lh_size, ch_start, ch_end, ch_size;
int i, ch_type;
for (i = 0; i < MEMORY_CHUNKS; i++) {
if (chunks[i].size == 0)
if (chunk[i].size == 0)
continue;
if (addr + size < chunks[i].addr)
continue;
if (addr >= chunks[i].addr + chunks[i].size)
continue;
start = max(addr, chunks[i].addr);
end = min(addr + size, chunks[i].addr + chunks[i].size);
new_size = end - start;
if (new_size == 0)
continue;
if (start == chunks[i].addr &&
end == chunks[i].addr + chunks[i].size) {
/* Remove chunk */
chunks[i].type = type;
} else if (start == chunks[i].addr) {
/* Make chunk smaller at start */
if (i >= MEMORY_CHUNKS - 1)
panic("Unable to create memory hole");
memmove(&chunks[i + 1], &chunks[i],
sizeof(struct mem_chunk) *
(MEMORY_CHUNKS - (i + 1)));
chunks[i + 1].addr = chunks[i].addr + new_size;
chunks[i + 1].size = chunks[i].size - new_size;
chunks[i].size = new_size;
chunks[i].type = type;
i += 1;
} else if (end == chunks[i].addr + chunks[i].size) {
/* Make chunk smaller at end */
if (i >= MEMORY_CHUNKS - 1)
panic("Unable to create memory hole");
memmove(&chunks[i + 1], &chunks[i],
sizeof(struct mem_chunk) *
(MEMORY_CHUNKS - (i + 1)));
chunks[i + 1].addr = start;
chunks[i + 1].size = new_size;
chunks[i + 1].type = type;
chunks[i].size -= new_size;
/* Define chunk properties */
ch_start = chunk[i].addr;
ch_size = chunk[i].size;
ch_end = ch_start + ch_size - 1;
ch_type = chunk[i].type;
/* Is memory chunk hit by memory hole? */
if (addr + size <= ch_start)
continue; /* No: memory hole in front of chunk */
if (addr > ch_end)
continue; /* No: memory hole after chunk */
/* Yes: Define local hole properties */
lh_start = max(addr, chunk[i].addr);
lh_end = min(addr + size - 1, ch_end);
lh_size = lh_end - lh_start + 1;
if (lh_start == ch_start && lh_end == ch_end) {
/* Hole covers complete memory chunk */
mem_chunk_init(&chunk[i], lh_start, lh_size, type);
} else if (lh_end == ch_end) {
/* Hole starts in memory chunk and covers chunk end */
mem_chunk_move(chunk, i + 1, i);
mem_chunk_init(&chunk[i], ch_start, ch_size - lh_size,
ch_type);
mem_chunk_init(&chunk[i + 1], lh_start, lh_size, type);
i += 1;
} else if (lh_start == ch_start) {
/* Hole ends in memory chunk */
mem_chunk_move(chunk, i + 1, i);
mem_chunk_init(&chunk[i], lh_start, lh_size, type);
mem_chunk_init(&chunk[i + 1], lh_end + 1,
ch_size - lh_size, ch_type);
break;
} else {
/* Create memory hole */
if (i >= MEMORY_CHUNKS - 2)
panic("Unable to create memory hole");
memmove(&chunks[i + 2], &chunks[i],
sizeof(struct mem_chunk) *
(MEMORY_CHUNKS - (i + 2)));
chunks[i + 1].addr = addr;
chunks[i + 1].size = size;
chunks[i + 1].type = type;
chunks[i + 2].addr = addr + size;
chunks[i + 2].size =
chunks[i].addr + chunks[i].size - (addr + size);
chunks[i + 2].type = chunks[i].type;
chunks[i].size = addr - chunks[i].addr;
i += 2;
/* Hole splits memory chunk */
mem_chunk_move(chunk, i + 2, i);
mem_chunk_init(&chunk[i], ch_start,
lh_start - ch_start, ch_type);
mem_chunk_init(&chunk[i + 1], lh_start, lh_size, type);
mem_chunk_init(&chunk[i + 2], lh_end + 1,
ch_end - lh_end, ch_type);
break;
}
}
}
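The four cases handled by the reworked create_mem_hole() (hole covers the whole chunk, hole covers the chunk end, hole covers the chunk start, hole splits the chunk) can be exercised in user space. The sketch below follows the structure of the new code but is not the kernel implementation: the table size, the CHUNK_RAM/CHUNK_HOLE values and the names chunk_move(), chunk_set() and punch_hole() are assumptions made for illustration, and the assert() bounds checks are an addition that the kernel code in the diff does not have.

```c
#include <assert.h>
#include <string.h>

#define MEMORY_CHUNKS	8	/* small table, for illustration only */
#define CHUNK_RAM	0	/* hypothetical type values */
#define CHUNK_HOLE	1

struct mem_chunk {
	unsigned long addr;
	unsigned long size;
	int type;
};

/* Shift the entries starting at "from" up to "to"; the tail is dropped. */
static void chunk_move(struct mem_chunk c[], int to, int from)
{
	memmove(&c[to], &c[from], (MEMORY_CHUNKS - to) * sizeof(c[0]));
}

static void chunk_set(struct mem_chunk *c, unsigned long addr,
		      unsigned long size, int type)
{
	c->addr = addr;
	c->size = size;
	c->type = type;
}

static void punch_hole(struct mem_chunk c[], unsigned long addr,
		       unsigned long size, int type)
{
	for (int i = 0; i < MEMORY_CHUNKS; i++) {
		if (c[i].size == 0)
			continue;
		unsigned long ch_start = c[i].addr, ch_size = c[i].size;
		unsigned long ch_end = ch_start + ch_size - 1;
		int ch_type = c[i].type;

		if (addr + size <= ch_start || addr > ch_end)
			continue;	/* hole does not touch this chunk */
		/* Part of the hole that lies inside this chunk */
		unsigned long lh_start = addr > ch_start ? addr : ch_start;
		unsigned long lh_end = addr + size - 1 < ch_end ?
				       addr + size - 1 : ch_end;
		unsigned long lh_size = lh_end - lh_start + 1;

		if (lh_start == ch_start && lh_end == ch_end) {
			/* Hole covers the complete chunk */
			chunk_set(&c[i], lh_start, lh_size, type);
		} else if (lh_end == ch_end) {
			/* Hole starts inside the chunk and covers its end */
			assert(i + 1 < MEMORY_CHUNKS);
			chunk_move(c, i + 1, i);
			chunk_set(&c[i], ch_start, ch_size - lh_size, ch_type);
			chunk_set(&c[i + 1], lh_start, lh_size, type);
			i += 1;		/* hole may continue in later chunks */
		} else if (lh_start == ch_start) {
			/* Hole covers the chunk start and ends inside it */
			assert(i + 1 < MEMORY_CHUNKS);
			chunk_move(c, i + 1, i);
			chunk_set(&c[i], lh_start, lh_size, type);
			chunk_set(&c[i + 1], lh_end + 1, ch_size - lh_size,
				  ch_type);
			break;
		} else {
			/* Hole splits the chunk into three pieces */
			assert(i + 2 < MEMORY_CHUNKS);
			chunk_move(c, i + 2, i);
			chunk_set(&c[i], ch_start, lh_start - ch_start, ch_type);
			chunk_set(&c[i + 1], lh_start, lh_size, type);
			chunk_set(&c[i + 2], lh_end + 1, ch_end - lh_end,
				  ch_type);
			break;
		}
	}
}
```

Working with inclusive end addresses (ch_end, lh_end) is what lets the four cases be told apart by simple equality tests. The two cases in which the hole ends inside the chunk can stop with break, since no later chunk can be affected.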

@ -17,11 +17,11 @@
#
ENTRY(store_status)
/* Save register one and load save area base */
stg %r1,__LC_SAVE_AREA+120(%r0)
stg %r1,__LC_SAVE_AREA_RESTART
lghi %r1,SAVE_AREA_BASE
/* General purpose registers */
stmg %r0,%r15,__LC_GPREGS_SAVE_AREA-SAVE_AREA_BASE(%r1)
lg %r2,__LC_SAVE_AREA+120(%r0)
lg %r2,__LC_SAVE_AREA_RESTART
stg %r2,__LC_GPREGS_SAVE_AREA-SAVE_AREA_BASE+8(%r1)
/* Control registers */
stctg %c0,%c15,__LC_CREGS_SAVE_AREA-SAVE_AREA_BASE(%r1)

@ -95,6 +95,15 @@ struct mem_chunk __initdata memory_chunk[MEMORY_CHUNKS];
int __initdata memory_end_set;
unsigned long __initdata memory_end;
unsigned long VMALLOC_START;
EXPORT_SYMBOL(VMALLOC_START);
unsigned long VMALLOC_END;
EXPORT_SYMBOL(VMALLOC_END);
struct page *vmemmap;
EXPORT_SYMBOL(vmemmap);
/* An array with a pointer to the lowcore of every CPU. */
struct _lowcore *lowcore_ptr[NR_CPUS];
EXPORT_SYMBOL(lowcore_ptr);
@ -278,6 +287,15 @@ static int __init early_parse_mem(char *p)
}
early_param("mem", early_parse_mem);
static int __init parse_vmalloc(char *arg)
{
if (!arg)
return -EINVAL;
VMALLOC_END = (memparse(arg, &arg) + PAGE_SIZE - 1) & PAGE_MASK;
return 0;
}
early_param("vmalloc", parse_vmalloc);
unsigned int user_mode = HOME_SPACE_MODE;
EXPORT_SYMBOL_GPL(user_mode);
@ -383,7 +401,6 @@ setup_lowcore(void)
__ctl_set_bit(14, 29);
}
#else
lc->cmf_hpp = -1ULL;
lc->vdso_per_cpu_data = (unsigned long) &lc->paste[0];
#endif
lc->sync_enter_timer = S390_lowcore.sync_enter_timer;
@ -479,8 +496,7 @@ EXPORT_SYMBOL_GPL(real_memory_size);
static void __init setup_memory_end(void)
{
unsigned long memory_size;
unsigned long max_mem;
unsigned long vmax, vmalloc_size, tmp;
int i;
@ -490,12 +506,9 @@ static void __init setup_memory_end(void)
memory_end_set = 1;
}
#endif
memory_size = 0;
real_memory_size = 0;
memory_end &= PAGE_MASK;
max_mem = memory_end ? min(VMEM_MAX_PHYS, memory_end) : VMEM_MAX_PHYS;
memory_end = min(max_mem, memory_end);
/*
* Make sure all chunks are MAX_ORDER aligned so we don't need the
* extra checks that HOLES_IN_ZONE would require.
@ -515,23 +528,48 @@ static void __init setup_memory_end(void)
chunk->addr = start;
chunk->size = end - start;
}
real_memory_size = max(real_memory_size,
chunk->addr + chunk->size);
}
/* Choose kernel address space layout: 2, 3, or 4 levels. */
#ifdef CONFIG_64BIT
vmalloc_size = VMALLOC_END ?: 128UL << 30;
tmp = (memory_end ?: real_memory_size) / PAGE_SIZE;
tmp = tmp * (sizeof(struct page) + PAGE_SIZE) + vmalloc_size;
if (tmp <= (1UL << 42))
vmax = 1UL << 42; /* 3-level kernel page table */
else
vmax = 1UL << 53; /* 4-level kernel page table */
#else
vmalloc_size = VMALLOC_END ?: 96UL << 20;
vmax = 1UL << 31; /* 2-level kernel page table */
#endif
/* vmalloc area is at the end of the kernel address space. */
VMALLOC_END = vmax;
VMALLOC_START = vmax - vmalloc_size;
/* Split remaining virtual space between 1:1 mapping & vmemmap array */
tmp = VMALLOC_START / (PAGE_SIZE + sizeof(struct page));
tmp = VMALLOC_START - tmp * sizeof(struct page);
tmp &= ~((vmax >> 11) - 1); /* align to page table level */
tmp = min(tmp, 1UL << MAX_PHYSMEM_BITS);
vmemmap = (struct page *) tmp;
/* Take care that memory_end is set and <= vmemmap */
memory_end = min(memory_end ?: real_memory_size, tmp);
/* Fixup memory chunk array to fit into 0..memory_end */
for (i = 0; i < MEMORY_CHUNKS; i++) {
struct mem_chunk *chunk = &memory_chunk[i];
real_memory_size = max(real_memory_size,
chunk->addr + chunk->size);
if (chunk->addr >= max_mem) {
if (chunk->addr >= memory_end) {
memset(chunk, 0, sizeof(*chunk));
continue;
}
if (chunk->addr + chunk->size > max_mem)
chunk->size = max_mem - chunk->addr;
memory_size = max(memory_size, chunk->addr + chunk->size);
if (chunk->addr + chunk->size > memory_end)
chunk->size = memory_end - chunk->addr;
}
if (!memory_end)
memory_end = memory_size;
}
void *restart_stack __attribute__((__section__(".data")));
@ -655,7 +693,6 @@ static int __init verify_crash_base(unsigned long crash_base,
static void __init reserve_kdump_bootmem(unsigned long addr, unsigned long size,
int type)
{
create_mem_hole(memory_chunk, addr, size, type);
}
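
The 64-bit layout arithmetic added to setup_memory_end() above can be replayed in user space to see which page table depth a given memory size selects. This is only a sketch under stated assumptions: `STRUCT_PAGE_SIZE` (56 bytes) and `MAX_PHYSMEM_BITS` (46) are guesses for illustration and depend on the kernel configuration, and `struct layout` and `choose_layout` are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE		4096ULL
#define STRUCT_PAGE_SIZE	56ULL	/* assumption: sizeof(struct page) */
#define MAX_PHYSMEM_BITS	46	/* assumption: config dependent */

struct layout {
	uint64_t vmax;		/* end of the kernel address space */
	uint64_t vmalloc_start;	/* vmalloc area is [vmalloc_start, vmax) */
	uint64_t vmemmap;	/* start of the struct page array */
	uint64_t memory_end;	/* end of the 1:1 mapping */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * mem_bytes:    memory_end if set, else real_memory_size
 * vmalloc_size: value of the vmalloc= parameter, 0 for the 128 GiB default
 */
static struct layout choose_layout(uint64_t mem_bytes, uint64_t vmalloc_size)
{
	struct layout l;
	uint64_t tmp;

	if (!vmalloc_size)
		vmalloc_size = 128ULL << 30;
	/* Precondition of this sketch: vmalloc area fits below vmax */
	assert(vmalloc_size < (1ULL << 42));

	/* Space needed: memory + its struct pages + vmalloc area */
	tmp = mem_bytes / PAGE_SIZE;
	tmp = tmp * (STRUCT_PAGE_SIZE + PAGE_SIZE) + vmalloc_size;
	l.vmax = tmp <= (1ULL << 42) ? 1ULL << 42 : 1ULL << 53;

	/* vmalloc area sits at the end of the address space */
	l.vmalloc_start = l.vmax - vmalloc_size;

	/* Split the rest between the 1:1 mapping and the vmemmap array */
	tmp = l.vmalloc_start / (PAGE_SIZE + STRUCT_PAGE_SIZE);
	tmp = l.vmalloc_start - tmp * STRUCT_PAGE_SIZE;
	tmp &= ~((l.vmax >> 11) - 1);	/* align to page table level */
	tmp = min_u64(tmp, 1ULL << MAX_PHYSMEM_BITS);
	l.vmemmap = tmp;

	l.memory_end = min_u64(mem_bytes, tmp);
	return l;
}
```

Under these assumptions, a 4 GiB machine with the default vmalloc size stays on the 3-level layout (vmax = 1UL << 42), while an 8 TiB machine needs the 4-level layout (vmax = 1UL << 53).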


@ -302,9 +302,13 @@ static int setup_frame(int sig, struct k_sigaction *ka,
/* We forgot to include these in the sigcontext.
To avoid breaking binary compatibility, they are passed as args. */
regs->gprs[4] = current->thread.trap_no;
regs->gprs[5] = current->thread.prot_addr;
regs->gprs[6] = task_thread_info(current)->last_break;
if (sig == SIGSEGV || sig == SIGBUS || sig == SIGILL ||
sig == SIGTRAP || sig == SIGFPE) {
/* set extra registers only for synchronous signals */
regs->gprs[4] = regs->int_code & 127;
regs->gprs[5] = regs->int_parm_long;
regs->gprs[6] = task_thread_info(current)->last_break;
}
/* Place signal number on stack to allow backtrace from handler. */
if (__put_user(regs->gprs[2], (int __user *) &frame->signo))
@ -434,13 +438,13 @@ void do_signal(struct pt_regs *regs)
* call information.
*/
current_thread_info()->system_call =
test_thread_flag(TIF_SYSCALL) ? regs->svc_code : 0;
test_thread_flag(TIF_SYSCALL) ? regs->int_code : 0;
signr = get_signal_to_deliver(&info, &ka, regs, NULL);
if (signr > 0) {
/* Whee! Actually deliver the signal. */
if (current_thread_info()->system_call) {
regs->svc_code = current_thread_info()->system_call;
regs->int_code = current_thread_info()->system_call;
/* Check for system call restarting. */
switch (regs->gprs[2]) {
case -ERESTART_RESTARTBLOCK:
@ -457,7 +461,7 @@ void do_signal(struct pt_regs *regs)
regs->gprs[2] = regs->orig_gpr2;
regs->psw.addr =
__rewind_psw(regs->psw,
regs->svc_code >> 16);
regs->int_code >> 16);
break;
}
}
@ -488,11 +492,11 @@ void do_signal(struct pt_regs *regs)
/* No handlers present - check for system call restart */
clear_thread_flag(TIF_SYSCALL);
if (current_thread_info()->system_call) {
regs->svc_code = current_thread_info()->system_call;
regs->int_code = current_thread_info()->system_call;
switch (regs->gprs[2]) {
case -ERESTART_RESTARTBLOCK:
/* Restart with sys_restart_syscall */
regs->svc_code = __NR_restart_syscall;
regs->int_code = __NR_restart_syscall;
/* fallthrough */
case -ERESTARTNOHAND:
case -ERESTARTSYS:


@ -69,9 +69,7 @@ enum s390_cpu_state {
};
DEFINE_MUTEX(smp_cpu_state_mutex);
int smp_cpu_polarization[NR_CPUS];
static int smp_cpu_state[NR_CPUS];
static int cpu_management;
static DEFINE_PER_CPU(struct cpu, cpu_devices);
@ -149,29 +147,59 @@ void smp_switch_to_ipl_cpu(void (*func)(void *), void *data)
sp -= sizeof(struct pt_regs);
regs = (struct pt_regs *) sp;
memcpy(&regs->gprs, &current_lc->gpregs_save_area, sizeof(regs->gprs));
regs->psw = lc->psw_save_area;
regs->psw = current_lc->psw_save_area;
sp -= STACK_FRAME_OVERHEAD;
sf = (struct stack_frame *) sp;
sf->back_chain = regs->gprs[15];
sf->back_chain = 0;
smp_switch_to_cpu(func, data, sp, stap(), __cpu_logical_map[0]);
}
static void smp_stop_cpu(void)
{
while (sigp(smp_processor_id(), sigp_stop) == sigp_busy)
cpu_relax();
}
void smp_send_stop(void)
{
int cpu, rc;
cpumask_t cpumask;
int cpu;
u64 end;
/* Disable all interrupts/machine checks */
__load_psw_mask(psw_kernel_bits | PSW_MASK_DAT);
trace_hardirqs_off();
/* stop all processors */
for_each_online_cpu(cpu) {
if (cpu == smp_processor_id())
continue;
do {
rc = sigp(cpu, sigp_stop);
} while (rc == sigp_busy);
cpumask_copy(&cpumask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &cpumask);
if (oops_in_progress) {
/*
* Give the other cpus the opportunity to complete
* outstanding interrupts before stopping them.
*/
end = get_clock() + (1000000UL << 12);
for_each_cpu(cpu, &cpumask) {
set_bit(ec_stop_cpu, (unsigned long *)
&lowcore_ptr[cpu]->ext_call_fast);
while (sigp(cpu, sigp_emergency_signal) == sigp_busy &&
get_clock() < end)
cpu_relax();
}
while (get_clock() < end) {
for_each_cpu(cpu, &cpumask)
if (cpu_stopped(cpu))
cpumask_clear_cpu(cpu, &cpumask);
if (cpumask_empty(&cpumask))
break;
cpu_relax();
}
}
/* stop all processors */
for_each_cpu(cpu, &cpumask) {
while (sigp(cpu, sigp_stop) == sigp_busy)
cpu_relax();
while (!cpu_stopped(cpu))
cpu_relax();
}
@ -187,7 +215,7 @@ static void do_ext_call_interrupt(unsigned int ext_int_code,
{
unsigned long bits;
if (ext_int_code == 0x1202)
if ((ext_int_code & 0xffff) == 0x1202)
kstat_cpu(smp_processor_id()).irqs[EXTINT_EXC]++;
else
kstat_cpu(smp_processor_id()).irqs[EXTINT_EMS]++;
@ -196,6 +224,9 @@ static void do_ext_call_interrupt(unsigned int ext_int_code,
*/
bits = xchg(&S390_lowcore.ext_call_fast, 0);
if (test_bit(ec_stop_cpu, &bits))
smp_stop_cpu();
if (test_bit(ec_schedule, &bits))
scheduler_ipi();
@ -204,6 +235,7 @@ static void do_ext_call_interrupt(unsigned int ext_int_code,
if (test_bit(ec_call_function_single, &bits))
generic_smp_call_function_single_interrupt();
}
/*
@ -369,7 +401,7 @@ static int smp_rescan_cpus_sigp(cpumask_t avail)
if (cpu_known(cpu_id))
continue;
__cpu_logical_map[logical_cpu] = cpu_id;
smp_cpu_polarization[logical_cpu] = POLARIZATION_UNKNWN;
cpu_set_polarization(logical_cpu, POLARIZATION_UNKNOWN);
if (!cpu_stopped(logical_cpu))
continue;
set_cpu_present(logical_cpu, true);
@ -403,7 +435,7 @@ static int smp_rescan_cpus_sclp(cpumask_t avail)
if (cpu_known(cpu_id))
continue;
__cpu_logical_map[logical_cpu] = cpu_id;
smp_cpu_polarization[logical_cpu] = POLARIZATION_UNKNWN;
cpu_set_polarization(logical_cpu, POLARIZATION_UNKNOWN);
set_cpu_present(logical_cpu, true);
if (cpu >= info->configured)
smp_cpu_state[logical_cpu] = CPU_STATE_STANDBY;
@ -656,7 +688,7 @@ int __cpuinit __cpu_up(unsigned int cpu)
- sizeof(struct stack_frame));
memset(sf, 0, sizeof(struct stack_frame));
sf->gprs[9] = (unsigned long) sf;
cpu_lowcore->save_area[15] = (unsigned long) sf;
cpu_lowcore->gpregs_save_area[15] = (unsigned long) sf;
__ctl_store(cpu_lowcore->cregs_save_area, 0, 15);
atomic_inc(&init_mm.context.attach_count);
asm volatile(
@ -806,7 +838,7 @@ void __init smp_prepare_boot_cpu(void)
S390_lowcore.percpu_offset = __per_cpu_offset[0];
current_set[0] = current;
smp_cpu_state[0] = CPU_STATE_CONFIGURED;
smp_cpu_polarization[0] = POLARIZATION_UNKNWN;
cpu_set_polarization(0, POLARIZATION_UNKNOWN);
}
void __init smp_cpus_done(unsigned int max_cpus)
@ -868,7 +900,8 @@ static ssize_t cpu_configure_store(struct device *dev,
rc = sclp_cpu_deconfigure(__cpu_logical_map[cpu]);
if (!rc) {
smp_cpu_state[cpu] = CPU_STATE_STANDBY;
smp_cpu_polarization[cpu] = POLARIZATION_UNKNWN;
cpu_set_polarization(cpu, POLARIZATION_UNKNOWN);
topology_expect_change();
}
}
break;
@ -877,7 +910,8 @@ static ssize_t cpu_configure_store(struct device *dev,
rc = sclp_cpu_configure(__cpu_logical_map[cpu]);
if (!rc) {
smp_cpu_state[cpu] = CPU_STATE_CONFIGURED;
smp_cpu_polarization[cpu] = POLARIZATION_UNKNWN;
cpu_set_polarization(cpu, POLARIZATION_UNKNOWN);
topology_expect_change();
}
}
break;
@ -892,35 +926,6 @@ out:
static DEVICE_ATTR(configure, 0644, cpu_configure_show, cpu_configure_store);
#endif /* CONFIG_HOTPLUG_CPU */
static ssize_t cpu_polarization_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
int cpu = dev->id;
ssize_t count;
mutex_lock(&smp_cpu_state_mutex);
switch (smp_cpu_polarization[cpu]) {
case POLARIZATION_HRZ:
count = sprintf(buf, "horizontal\n");
break;
case POLARIZATION_VL:
count = sprintf(buf, "vertical:low\n");
break;
case POLARIZATION_VM:
count = sprintf(buf, "vertical:medium\n");
break;
case POLARIZATION_VH:
count = sprintf(buf, "vertical:high\n");
break;
default:
count = sprintf(buf, "unknown\n");
break;
}
mutex_unlock(&smp_cpu_state_mutex);
return count;
}
static DEVICE_ATTR(polarization, 0444, cpu_polarization_show, NULL);
static ssize_t show_cpu_address(struct device *dev,
struct device_attribute *attr, char *buf)
{
@ -928,13 +933,11 @@ static ssize_t show_cpu_address(struct device *dev,
}
static DEVICE_ATTR(address, 0444, show_cpu_address, NULL);
static struct attribute *cpu_common_attrs[] = {
#ifdef CONFIG_HOTPLUG_CPU
&dev_attr_configure.attr,
#endif
&dev_attr_address.attr,
&dev_attr_polarization.attr,
NULL,
};
@ -1055,11 +1058,20 @@ static int __devinit smp_add_present_cpu(int cpu)
rc = sysfs_create_group(&s->kobj, &cpu_common_attr_group);
if (rc)
goto out_cpu;
if (!cpu_online(cpu))
goto out;
rc = sysfs_create_group(&s->kobj, &cpu_online_attr_group);
if (!rc)
return 0;
if (cpu_online(cpu)) {
rc = sysfs_create_group(&s->kobj, &cpu_online_attr_group);
if (rc)
goto out_online;
}
rc = topology_cpu_init(c);
if (rc)
goto out_topology;
return 0;
out_topology:
if (cpu_online(cpu))
sysfs_remove_group(&s->kobj, &cpu_online_attr_group);
out_online:
sysfs_remove_group(&s->kobj, &cpu_common_attr_group);
out_cpu:
#ifdef CONFIG_HOTPLUG_CPU
@ -1111,61 +1123,16 @@ static ssize_t __ref rescan_store(struct device *dev,
static DEVICE_ATTR(rescan, 0200, NULL, rescan_store);
#endif /* CONFIG_HOTPLUG_CPU */
static ssize_t dispatching_show(struct device *dev,
struct device_attribute *attr,
char *buf)
static int __init s390_smp_init(void)
{
ssize_t count;
mutex_lock(&smp_cpu_state_mutex);
count = sprintf(buf, "%d\n", cpu_management);
mutex_unlock(&smp_cpu_state_mutex);
return count;
}
static ssize_t dispatching_store(struct device *dev,
struct device_attribute *attr,
const char *buf,
size_t count)
{
int val, rc;
char delim;
if (sscanf(buf, "%d %c", &val, &delim) != 1)
return -EINVAL;
if (val != 0 && val != 1)
return -EINVAL;
rc = 0;
get_online_cpus();
mutex_lock(&smp_cpu_state_mutex);
if (cpu_management == val)
goto out;
rc = topology_set_cpu_management(val);
if (!rc)
cpu_management = val;
out:
mutex_unlock(&smp_cpu_state_mutex);
put_online_cpus();
return rc ? rc : count;
}
static DEVICE_ATTR(dispatching, 0644, dispatching_show,
dispatching_store);
static int __init topology_init(void)
{
int cpu;
int rc;
int cpu, rc;
register_cpu_notifier(&smp_cpu_nb);
#ifdef CONFIG_HOTPLUG_CPU
rc = device_create_file(cpu_subsys.dev_root, &dev_attr_rescan);
if (rc)
return rc;
#endif
rc = device_create_file(cpu_subsys.dev_root, &dev_attr_dispatching);
if (rc)
return rc;
for_each_present_cpu(cpu) {
rc = smp_add_present_cpu(cpu);
if (rc)
@ -1173,4 +1140,4 @@ static int __init topology_init(void)
}
return 0;
}
subsys_initcall(topology_init);
subsys_initcall(s390_smp_init);


@ -60,74 +60,22 @@ out:
}
/*
* sys_ipc() is the de-multiplexer for the SysV IPC calls..
*
* This is really horribly ugly.
* sys_ipc() is the de-multiplexer for the SysV IPC calls.
*/
SYSCALL_DEFINE5(s390_ipc, uint, call, int, first, unsigned long, second,
unsigned long, third, void __user *, ptr)
{
struct ipc_kludge tmp;
int ret;
switch (call) {
case SEMOP:
return sys_semtimedop(first, (struct sembuf __user *)ptr,
(unsigned)second, NULL);
case SEMTIMEDOP:
return sys_semtimedop(first, (struct sembuf __user *)ptr,
(unsigned)second,
(const struct timespec __user *) third);
case SEMGET:
return sys_semget(first, (int)second, third);
case SEMCTL: {
union semun fourth;
if (!ptr)
return -EINVAL;
if (get_user(fourth.__pad, (void __user * __user *) ptr))
return -EFAULT;
return sys_semctl(first, (int)second, third, fourth);
}
case MSGSND:
return sys_msgsnd (first, (struct msgbuf __user *) ptr,
(size_t)second, third);
break;
case MSGRCV:
if (!ptr)
return -EINVAL;
if (copy_from_user (&tmp, (struct ipc_kludge __user *) ptr,
sizeof (struct ipc_kludge)))
return -EFAULT;
return sys_msgrcv (first, tmp.msgp,
(size_t)second, tmp.msgtyp, third);
case MSGGET:
return sys_msgget((key_t)first, (int)second);
case MSGCTL:
return sys_msgctl(first, (int)second,
(struct msqid_ds __user *)ptr);
case SHMAT: {
ulong raddr;
ret = do_shmat(first, (char __user *)ptr,
(int)second, &raddr);
if (ret)
return ret;
return put_user (raddr, (ulong __user *) third);
break;
}
case SHMDT:
return sys_shmdt ((char __user *)ptr);
case SHMGET:
return sys_shmget(first, (size_t)second, third);
case SHMCTL:
return sys_shmctl(first, (int)second,
(struct shmid_ds __user *) ptr);
default:
return -ENOSYS;
}
return -EINVAL;
if (call >> 16)
return -EINVAL;
/* The s390 sys_ipc variant has only five parameters instead of six
* like the generic variant. The only difference is the handling of
* the SEMTIMEDOP subcall where on s390 the third parameter is used
* as a pointer to a struct timespec where the generic variant uses
* the fifth parameter.
* Therefore we can call the generic variant by simply passing the
* third parameter also as fifth parameter.
*/
return sys_ipc(call, first, second, third, ptr, third);
}
#ifdef CONFIG_64BIT


@ -1,22 +1,22 @@
/*
* Copyright IBM Corp. 2007
* Copyright IBM Corp. 2007,2011
* Author(s): Heiko Carstens <heiko.carstens@de.ibm.com>
*/
#define KMSG_COMPONENT "cpu"
#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/init.h>
#include <linux/device.h>
#include <linux/bootmem.h>
#include <linux/sched.h>
#include <linux/workqueue.h>
#include <linux/bootmem.h>
#include <linux/cpuset.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/delay.h>
#include <linux/cpu.h>
#include <linux/smp.h>
#include <linux/cpuset.h>
#include <asm/delay.h>
#include <linux/mm.h>
#define PTF_HORIZONTAL (0UL)
#define PTF_VERTICAL (1UL)
@ -31,7 +31,6 @@ struct mask_info {
static int topology_enabled = 1;
static void topology_work_fn(struct work_struct *work);
static struct sysinfo_15_1_x *tl_info;
static struct timer_list topology_timer;
static void set_topology_timer(void);
static DECLARE_WORK(topology_work, topology_work_fn);
/* topology_lock protects the core linked list */
@ -41,11 +40,12 @@ static struct mask_info core_info;
cpumask_t cpu_core_map[NR_CPUS];
unsigned char cpu_core_id[NR_CPUS];
#ifdef CONFIG_SCHED_BOOK
static struct mask_info book_info;
cpumask_t cpu_book_map[NR_CPUS];
unsigned char cpu_book_id[NR_CPUS];
#endif
/* smp_cpu_state_mutex must be held when accessing this array */
int cpu_polarization[NR_CPUS];
static cpumask_t cpu_group_map(struct mask_info *info, unsigned int cpu)
{
@ -71,7 +71,7 @@ static cpumask_t cpu_group_map(struct mask_info *info, unsigned int cpu)
static struct mask_info *add_cpus_to_mask(struct topology_cpu *tl_cpu,
struct mask_info *book,
struct mask_info *core,
int z10)
int one_core_per_cpu)
{
unsigned int cpu;
@ -85,18 +85,16 @@ static struct mask_info *add_cpus_to_mask(struct topology_cpu *tl_cpu,
for_each_present_cpu(lcpu) {
if (cpu_logical_map(lcpu) != rcpu)
continue;
#ifdef CONFIG_SCHED_BOOK
cpumask_set_cpu(lcpu, &book->mask);
cpu_book_id[lcpu] = book->id;
#endif
cpumask_set_cpu(lcpu, &core->mask);
if (z10) {
if (one_core_per_cpu) {
cpu_core_id[lcpu] = rcpu;
core = core->next;
} else {
cpu_core_id[lcpu] = core->id;
}
smp_cpu_polarization[lcpu] = tl_cpu->pp;
cpu_set_polarization(lcpu, tl_cpu->pp);
}
}
return core;
@ -111,13 +109,11 @@ static void clear_masks(void)
cpumask_clear(&info->mask);
info = info->next;
}
#ifdef CONFIG_SCHED_BOOK
info = &book_info;
while (info) {
cpumask_clear(&info->mask);
info = info->next;
}
#endif
}
static union topology_entry *next_tle(union topology_entry *tle)
@ -127,66 +123,75 @@ static union topology_entry *next_tle(union topology_entry *tle)
return (union topology_entry *)((struct topology_container *)tle + 1);
}
static void tl_to_cores(struct sysinfo_15_1_x *info)
static void __tl_to_cores_generic(struct sysinfo_15_1_x *info)
{
#ifdef CONFIG_SCHED_BOOK
struct mask_info *book = &book_info;
struct cpuid cpu_id;
#else
struct mask_info *book = NULL;
#endif
struct mask_info *core = &core_info;
struct mask_info *book = &book_info;
union topology_entry *tle, *end;
int z10 = 0;
#ifdef CONFIG_SCHED_BOOK
get_cpu_id(&cpu_id);
z10 = cpu_id.machine == 0x2097 || cpu_id.machine == 0x2098;
#endif
spin_lock_irq(&topology_lock);
clear_masks();
tle = info->tle;
end = (union topology_entry *)((unsigned long)info + info->length);
while (tle < end) {
#ifdef CONFIG_SCHED_BOOK
if (z10) {
switch (tle->nl) {
case 1:
book = book->next;
book->id = tle->container.id;
break;
case 0:
core = add_cpus_to_mask(&tle->cpu, book, core, z10);
break;
default:
clear_masks();
goto out;
}
tle = next_tle(tle);
continue;
}
#endif
switch (tle->nl) {
#ifdef CONFIG_SCHED_BOOK
case 2:
book = book->next;
book->id = tle->container.id;
break;
#endif
case 1:
core = core->next;
core->id = tle->container.id;
break;
case 0:
add_cpus_to_mask(&tle->cpu, book, core, z10);
add_cpus_to_mask(&tle->cpu, book, core, 0);
break;
default:
clear_masks();
goto out;
return;
}
tle = next_tle(tle);
}
out:
}
static void __tl_to_cores_z10(struct sysinfo_15_1_x *info)
{
struct mask_info *core = &core_info;
struct mask_info *book = &book_info;
union topology_entry *tle, *end;
tle = info->tle;
end = (union topology_entry *)((unsigned long)info + info->length);
while (tle < end) {
switch (tle->nl) {
case 1:
book = book->next;
book->id = tle->container.id;
break;
case 0:
core = add_cpus_to_mask(&tle->cpu, book, core, 1);
break;
default:
clear_masks();
return;
}
tle = next_tle(tle);
}
}
static void tl_to_cores(struct sysinfo_15_1_x *info)
{
struct cpuid cpu_id;
get_cpu_id(&cpu_id);
spin_lock_irq(&topology_lock);
clear_masks();
switch (cpu_id.machine) {
case 0x2097:
case 0x2098:
__tl_to_cores_z10(info);
break;
default:
__tl_to_cores_generic(info);
}
spin_unlock_irq(&topology_lock);
}
@ -196,7 +201,7 @@ static void topology_update_polarization_simple(void)
mutex_lock(&smp_cpu_state_mutex);
for_each_possible_cpu(cpu)
smp_cpu_polarization[cpu] = POLARIZATION_HRZ;
cpu_set_polarization(cpu, POLARIZATION_HRZ);
mutex_unlock(&smp_cpu_state_mutex);
}
@ -215,8 +220,7 @@ static int ptf(unsigned long fc)
int topology_set_cpu_management(int fc)
{
int cpu;
int rc;
int cpu, rc;
if (!MACHINE_HAS_TOPOLOGY)
return -EOPNOTSUPP;
@ -227,7 +231,7 @@ int topology_set_cpu_management(int fc)
if (rc)
return -EBUSY;
for_each_possible_cpu(cpu)
smp_cpu_polarization[cpu] = POLARIZATION_UNKNWN;
cpu_set_polarization(cpu, POLARIZATION_UNKNOWN);
return rc;
}
@ -239,22 +243,18 @@ static void update_cpu_core_map(void)
spin_lock_irqsave(&topology_lock, flags);
for_each_possible_cpu(cpu) {
cpu_core_map[cpu] = cpu_group_map(&core_info, cpu);
#ifdef CONFIG_SCHED_BOOK
cpu_book_map[cpu] = cpu_group_map(&book_info, cpu);
#endif
}
spin_unlock_irqrestore(&topology_lock, flags);
}
void store_topology(struct sysinfo_15_1_x *info)
{
#ifdef CONFIG_SCHED_BOOK
int rc;
rc = stsi(info, 15, 1, 3);
if (rc != -ENOSYS)
return;
#endif
stsi(info, 15, 1, 2);
}
@ -296,12 +296,30 @@ static void topology_timer_fn(unsigned long ignored)
set_topology_timer();
}
static struct timer_list topology_timer =
TIMER_DEFERRED_INITIALIZER(topology_timer_fn, 0, 0);
static atomic_t topology_poll = ATOMIC_INIT(0);
static void set_topology_timer(void)
{
topology_timer.function = topology_timer_fn;
topology_timer.data = 0;
topology_timer.expires = jiffies + 60 * HZ;
add_timer(&topology_timer);
if (atomic_add_unless(&topology_poll, -1, 0))
mod_timer(&topology_timer, jiffies + HZ / 10);
else
mod_timer(&topology_timer, jiffies + HZ * 60);
}
void topology_expect_change(void)
{
if (!MACHINE_HAS_TOPOLOGY)
return;
/* This is racy, but it doesn't matter since it is just a heuristic.
* Worst case is that we poll in a higher frequency for a bit longer.
*/
if (atomic_read(&topology_poll) > 60)
return;
atomic_add(60, &topology_poll);
set_topology_timer();
}
static int __init early_parse_topology(char *p)
@ -313,23 +331,6 @@ static int __init early_parse_topology(char *p)
}
early_param("topology", early_parse_topology);
static int __init init_topology_update(void)
{
int rc;
rc = 0;
if (!MACHINE_HAS_TOPOLOGY) {
topology_update_polarization_simple();
goto out;
}
init_timer_deferrable(&topology_timer);
set_topology_timer();
out:
update_cpu_core_map();
return rc;
}
__initcall(init_topology_update);
static void __init alloc_masks(struct sysinfo_15_1_x *info,
struct mask_info *mask, int offset)
{
@ -357,10 +358,108 @@ void __init s390_init_cpu_topology(void)
store_topology(info);
pr_info("The CPU configuration topology of the machine is:");
for (i = 0; i < TOPOLOGY_NR_MAG; i++)
printk(" %d", info->mag[i]);
printk(" / %d\n", info->mnest);
printk(KERN_CONT " %d", info->mag[i]);
printk(KERN_CONT " / %d\n", info->mnest);
alloc_masks(info, &core_info, 1);
#ifdef CONFIG_SCHED_BOOK
alloc_masks(info, &book_info, 2);
#endif
}
static int cpu_management;
static ssize_t dispatching_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
ssize_t count;
mutex_lock(&smp_cpu_state_mutex);
count = sprintf(buf, "%d\n", cpu_management);
mutex_unlock(&smp_cpu_state_mutex);
return count;
}
static ssize_t dispatching_store(struct device *dev,
struct device_attribute *attr,
const char *buf,
size_t count)
{
int val, rc;
char delim;
if (sscanf(buf, "%d %c", &val, &delim) != 1)
return -EINVAL;
if (val != 0 && val != 1)
return -EINVAL;
rc = 0;
get_online_cpus();
mutex_lock(&smp_cpu_state_mutex);
if (cpu_management == val)
goto out;
rc = topology_set_cpu_management(val);
if (rc)
goto out;
cpu_management = val;
topology_expect_change();
out:
mutex_unlock(&smp_cpu_state_mutex);
put_online_cpus();
return rc ? rc : count;
}
static DEVICE_ATTR(dispatching, 0644, dispatching_show,
dispatching_store);
static ssize_t cpu_polarization_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
int cpu = dev->id;
ssize_t count;
mutex_lock(&smp_cpu_state_mutex);
switch (cpu_read_polarization(cpu)) {
case POLARIZATION_HRZ:
count = sprintf(buf, "horizontal\n");
break;
case POLARIZATION_VL:
count = sprintf(buf, "vertical:low\n");
break;
case POLARIZATION_VM:
count = sprintf(buf, "vertical:medium\n");
break;
case POLARIZATION_VH:
count = sprintf(buf, "vertical:high\n");
break;
default:
count = sprintf(buf, "unknown\n");
break;
}
mutex_unlock(&smp_cpu_state_mutex);
return count;
}
static DEVICE_ATTR(polarization, 0444, cpu_polarization_show, NULL);
static struct attribute *topology_cpu_attrs[] = {
&dev_attr_polarization.attr,
NULL,
};
static struct attribute_group topology_cpu_attr_group = {
.attrs = topology_cpu_attrs,
};
int topology_cpu_init(struct cpu *cpu)
{
return sysfs_create_group(&cpu->dev.kobj, &topology_cpu_attr_group);
}
static int __init topology_init(void)
{
if (!MACHINE_HAS_TOPOLOGY) {
topology_update_polarization_simple();
goto out;
}
set_topology_timer();
out:
update_cpu_core_map();
return device_create_file(cpu_subsys.dev_root, &dev_attr_dispatching);
}
device_initcall(topology_init);
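
The polling heuristic introduced by set_topology_timer() and topology_expect_change() above can be modelled outside the kernel with C11 atomics. This is a rough sketch only: `dec_unless_zero`, `next_poll_delay_ms` and `expect_change` are hypothetical names, `dec_unless_zero` imitates what atomic_add_unless(&v, -1, 0) is used for here, and the delays are given in milliseconds (HZ / 10 becomes 100 ms, HZ * 60 becomes 60000 ms) instead of jiffies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Decrement *v unless it is already zero; true if it was decremented. */
static bool dec_unless_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Fast polling while budget is left, slow polling otherwise. */
static unsigned int next_poll_delay_ms(atomic_int *budget)
{
	assert(budget != NULL);
	return dec_unless_zero(budget) ? 100 : 60000;
}

/*
 * Request a period of fast polling. The read and the add are not one
 * atomic step, so two callers may both add 60. As in the original
 * comment, the worst case is polling fast for a bit longer.
 */
static void expect_change(atomic_int *budget)
{
	assert(budget != NULL);
	if (atomic_load(budget) > 60)
		return;
	atomic_fetch_add(budget, 60);
}
```

Each call to expect_change() buys up to 60 fast polls, roughly six seconds at 100 ms each, after which the timer falls back to the 60 second interval.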


@ -43,9 +43,9 @@
#include <asm/debug.h>
#include "entry.h"
void (*pgm_check_table[128])(struct pt_regs *, long, unsigned long);
void (*pgm_check_table[128])(struct pt_regs *regs);
int show_unhandled_signals;
int show_unhandled_signals = 1;
#define stack_pointer ({ void **sp; asm("la %0,0(15)" : "=&d" (sp)); sp; })
@ -234,7 +234,7 @@ void show_regs(struct pt_regs *regs)
static DEFINE_SPINLOCK(die_lock);
void die(const char * str, struct pt_regs * regs, long err)
void die(struct pt_regs *regs, const char *str)
{
static int die_counter;
@ -243,7 +243,7 @@ void die(const char * str, struct pt_regs * regs, long err)
console_verbose();
spin_lock_irq(&die_lock);
bust_spinlocks(1);
printk("%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter);
printk("%s: %04x [#%d] ", str, regs->int_code & 0xffff, ++die_counter);
#ifdef CONFIG_PREEMPT
printk("PREEMPT ");
#endif
@ -254,7 +254,7 @@ void die(const char * str, struct pt_regs * regs, long err)
printk("DEBUG_PAGEALLOC");
#endif
printk("\n");
notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV);
notify_die(DIE_OOPS, str, regs, 0, regs->int_code & 0xffff, SIGSEGV);
show_regs(regs);
bust_spinlocks(0);
add_taint(TAINT_DIE);
@ -267,8 +267,7 @@ void die(const char * str, struct pt_regs * regs, long err)
do_exit(SIGSEGV);
}
static void inline report_user_fault(struct pt_regs *regs, long int_code,
int signr)
static inline void report_user_fault(struct pt_regs *regs, int signr)
{
if ((task_pid_nr(current) > 1) && !show_unhandled_signals)
return;
@ -276,7 +275,7 @@ static void inline report_user_fault(struct pt_regs *regs, long int_code,
return;
if (!printk_ratelimit())
return;
printk("User process fault: interruption code 0x%lX ", int_code);
printk("User process fault: interruption code 0x%X ", regs->int_code);
print_vma_addr("in ", regs->psw.addr & PSW_ADDR_INSN);
printk("\n");
show_regs(regs);
@ -287,19 +286,28 @@ int is_valid_bugaddr(unsigned long addr)
return 1;
}
static inline void __kprobes do_trap(long pgm_int_code, int signr, char *str,
struct pt_regs *regs, siginfo_t *info)
static inline void __user *get_psw_address(struct pt_regs *regs)
{
if (notify_die(DIE_TRAP, str, regs, pgm_int_code,
pgm_int_code, signr) == NOTIFY_STOP)
return (void __user *)
((regs->psw.addr - (regs->int_code >> 16)) & PSW_ADDR_INSN);
}
static void __kprobes do_trap(struct pt_regs *regs,
int si_signo, int si_code, char *str)
{
siginfo_t info;
if (notify_die(DIE_TRAP, str, regs, 0,
regs->int_code, si_signo) == NOTIFY_STOP)
return;
if (regs->psw.mask & PSW_MASK_PSTATE) {
struct task_struct *tsk = current;
tsk->thread.trap_no = pgm_int_code & 0xffff;
force_sig_info(signr, info, tsk);
report_user_fault(regs, pgm_int_code, signr);
info.si_signo = si_signo;
info.si_errno = 0;
info.si_code = si_code;
info.si_addr = get_psw_address(regs);
force_sig_info(si_signo, &info, current);
report_user_fault(regs, si_signo);
} else {
const struct exception_table_entry *fixup;
fixup = search_exception_tables(regs->psw.addr & PSW_ADDR_INSN);
@ -311,18 +319,11 @@ static inline void __kprobes do_trap(long pgm_int_code, int signr, char *str,
btt = report_bug(regs->psw.addr & PSW_ADDR_INSN, regs);
if (btt == BUG_TRAP_TYPE_WARN)
return;
die(str, regs, pgm_int_code);
die(regs, str);
}
}
}
static inline void __user *get_psw_address(struct pt_regs *regs,
long pgm_int_code)
{
return (void __user *)
((regs->psw.addr - (pgm_int_code >> 16)) & PSW_ADDR_INSN);
}
void __kprobes do_per_trap(struct pt_regs *regs)
{
siginfo_t info;
@ -339,26 +340,19 @@ void __kprobes do_per_trap(struct pt_regs *regs)
force_sig_info(SIGTRAP, &info, current);
}
static void default_trap_handler(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
static void default_trap_handler(struct pt_regs *regs)
{
if (regs->psw.mask & PSW_MASK_PSTATE) {
report_user_fault(regs, pgm_int_code, SIGSEGV);
report_user_fault(regs, SIGSEGV);
do_exit(SIGSEGV);
} else
die("Unknown program exception", regs, pgm_int_code);
die(regs, "Unknown program exception");
}
#define DO_ERROR_INFO(name, signr, sicode, str) \
static void name(struct pt_regs *regs, long pgm_int_code, \
unsigned long trans_exc_code) \
static void name(struct pt_regs *regs) \
{ \
siginfo_t info; \
info.si_signo = signr; \
info.si_errno = 0; \
info.si_code = sicode; \
info.si_addr = get_psw_address(regs, pgm_int_code); \
do_trap(pgm_int_code, signr, str, regs, &info); \
do_trap(regs, signr, sicode, str); \
}
DO_ERROR_INFO(addressing_exception, SIGILL, ILL_ILLADR,
@ -388,42 +382,34 @@ DO_ERROR_INFO(special_op_exception, SIGILL, ILL_ILLOPN,
DO_ERROR_INFO(translation_exception, SIGILL, ILL_ILLOPN,
"translation exception")
static inline void do_fp_trap(struct pt_regs *regs, void __user *location,
int fpc, long pgm_int_code)
static inline void do_fp_trap(struct pt_regs *regs, int fpc)
{
siginfo_t si;
si.si_signo = SIGFPE;
si.si_errno = 0;
si.si_addr = location;
si.si_code = 0;
int si_code = 0;
/* FPC[2] is Data Exception Code */
if ((fpc & 0x00000300) == 0) {
/* bits 6 and 7 of DXC are 0 iff IEEE exception */
if (fpc & 0x8000) /* invalid fp operation */
si.si_code = FPE_FLTINV;
si_code = FPE_FLTINV;
else if (fpc & 0x4000) /* div by 0 */
si.si_code = FPE_FLTDIV;
si_code = FPE_FLTDIV;
else if (fpc & 0x2000) /* overflow */
si.si_code = FPE_FLTOVF;
si_code = FPE_FLTOVF;
else if (fpc & 0x1000) /* underflow */
si.si_code = FPE_FLTUND;
si_code = FPE_FLTUND;
else if (fpc & 0x0800) /* inexact */
si.si_code = FPE_FLTRES;
si_code = FPE_FLTRES;
}
do_trap(pgm_int_code, SIGFPE,
"floating point exception", regs, &si);
do_trap(regs, SIGFPE, si_code, "floating point exception");
}
static void __kprobes illegal_op(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
static void __kprobes illegal_op(struct pt_regs *regs)
{
siginfo_t info;
__u8 opcode[6];
__u16 __user *location;
int signal = 0;
location = get_psw_address(regs, pgm_int_code);
location = get_psw_address(regs);
if (regs->psw.mask & PSW_MASK_PSTATE) {
if (get_user(*((__u16 *) opcode), (__u16 __user *) location))
@ -467,44 +453,31 @@ static void __kprobes illegal_op(struct pt_regs *regs, long pgm_int_code,
* If we get an illegal op in kernel mode, send it through the
* kprobes notifier. If kprobes doesn't pick it up, SIGILL
*/
if (notify_die(DIE_BPT, "bpt", regs, pgm_int_code,
if (notify_die(DIE_BPT, "bpt", regs, 0,
3, SIGTRAP) != NOTIFY_STOP)
signal = SIGILL;
}
#ifdef CONFIG_MATHEMU
if (signal == SIGFPE)
do_fp_trap(regs, location,
current->thread.fp_regs.fpc, pgm_int_code);
else if (signal == SIGSEGV) {
info.si_signo = signal;
info.si_errno = 0;
info.si_code = SEGV_MAPERR;
info.si_addr = (void __user *) location;
do_trap(pgm_int_code, signal,
"user address fault", regs, &info);
} else
do_fp_trap(regs, current->thread.fp_regs.fpc);
else if (signal == SIGSEGV)
do_trap(regs, signal, SEGV_MAPERR, "user address fault");
else
#endif
if (signal) {
info.si_signo = signal;
info.si_errno = 0;
info.si_code = ILL_ILLOPC;
info.si_addr = (void __user *) location;
do_trap(pgm_int_code, signal,
"illegal operation", regs, &info);
}
if (signal)
do_trap(regs, signal, ILL_ILLOPC, "illegal operation");
}
#ifdef CONFIG_MATHEMU
void specification_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
void specification_exception(struct pt_regs *regs)
{
__u8 opcode[6];
__u16 __user *location = NULL;
int signal = 0;
location = (__u16 __user *) get_psw_address(regs, pgm_int_code);
location = (__u16 __user *) get_psw_address(regs);
if (regs->psw.mask & PSW_MASK_PSTATE) {
get_user(*((__u16 *) opcode), location);
@ -539,30 +512,21 @@ void specification_exception(struct pt_regs *regs, long pgm_int_code,
signal = SIGILL;
if (signal == SIGFPE)
do_fp_trap(regs, location,
current->thread.fp_regs.fpc, pgm_int_code);
else if (signal) {
siginfo_t info;
info.si_signo = signal;
info.si_errno = 0;
info.si_code = ILL_ILLOPN;
info.si_addr = location;
do_trap(pgm_int_code, signal,
"specification exception", regs, &info);
}
do_fp_trap(regs, current->thread.fp_regs.fpc);
else if (signal)
do_trap(regs, signal, ILL_ILLOPN, "specification exception");
}
#else
DO_ERROR_INFO(specification_exception, SIGILL, ILL_ILLOPN,
"specification exception");
#endif
static void data_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
static void data_exception(struct pt_regs *regs)
{
__u16 __user *location;
int signal = 0;
location = get_psw_address(regs, pgm_int_code);
location = get_psw_address(regs);
if (MACHINE_HAS_IEEE)
asm volatile("stfpc %0" : "=m" (current->thread.fp_regs.fpc));
@ -627,32 +591,18 @@ static void data_exception(struct pt_regs *regs, long pgm_int_code,
else
signal = SIGILL;
if (signal == SIGFPE)
do_fp_trap(regs, location,
current->thread.fp_regs.fpc, pgm_int_code);
else if (signal) {
siginfo_t info;
info.si_signo = signal;
info.si_errno = 0;
info.si_code = ILL_ILLOPN;
info.si_addr = location;
do_trap(pgm_int_code, signal, "data exception", regs, &info);
}
do_fp_trap(regs, current->thread.fp_regs.fpc);
else if (signal)
do_trap(regs, signal, ILL_ILLOPN, "data exception");
}
static void space_switch_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
static void space_switch_exception(struct pt_regs *regs)
{
siginfo_t info;
/* Set user psw back to home space mode. */
if (regs->psw.mask & PSW_MASK_PSTATE)
regs->psw.mask |= PSW_ASC_HOME;
/* Send SIGILL. */
info.si_signo = SIGILL;
info.si_errno = 0;
info.si_code = ILL_PRVOPC;
info.si_addr = get_psw_address(regs, pgm_int_code);
do_trap(pgm_int_code, SIGILL, "space switch event", regs, &info);
do_trap(regs, SIGILL, ILL_PRVOPC, "space switch event");
}
void __kprobes kernel_stack_overflow(struct pt_regs * regs)
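
The flag decoding that do_fp_trap() performs on the floating point control word can be tried in user space. The following is a minimal sketch, assuming the FPE_* constants from <signal.h>; the names fpc_to_si_code and fpc_demo are hypothetical and not part of this commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/*
 * Hypothetical helper mirroring the decode in do_fp_trap(): map the
 * floating point control word to a SIGFPE si_code. Returns 0 when the
 * Data Exception Code does not denote an IEEE exception or when none
 * of the tested flag bits is set.
 */
static int fpc_to_si_code(unsigned int fpc)
{
	/* Bits 6 and 7 of the DXC must both be 0 for an IEEE exception. */
	if ((fpc & 0x00000300) != 0)
		return 0;
	if (fpc & 0x8000)	/* invalid fp operation */
		return FPE_FLTINV;
	if (fpc & 0x4000)	/* div by 0 */
		return FPE_FLTDIV;
	if (fpc & 0x2000)	/* overflow */
		return FPE_FLTOVF;
	if (fpc & 0x1000)	/* underflow */
		return FPE_FLTUND;
	if (fpc & 0x0800)	/* inexact */
		return FPE_FLTRES;
	return 0;
}

/* Small demonstration: the first matching flag wins. */
void fpc_demo(void)
{
	assert(fpc_to_si_code(0x4000) == FPE_FLTDIV);
	assert(fpc_to_si_code(0x8000 | 0x4000) == FPE_FLTINV);
}
```

The tests are ordered as in the diff, so an invalid-operation flag takes precedence over a divide-by-zero flag when both are set.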


@ -125,8 +125,7 @@ static inline int user_space_fault(unsigned long trans_exc_code)
return trans_exc_code != 3;
}
static inline void report_user_fault(struct pt_regs *regs, long int_code,
int signr, unsigned long address)
static inline void report_user_fault(struct pt_regs *regs, long signr)
{
if ((task_pid_nr(current) > 1) && !show_unhandled_signals)
return;
@ -134,10 +133,12 @@ static inline void report_user_fault(struct pt_regs *regs, long int_code,
return;
if (!printk_ratelimit())
return;
printk("User process fault: interruption code 0x%lX ", int_code);
printk(KERN_ALERT "User process fault: interruption code 0x%X ",
regs->int_code);
print_vma_addr(KERN_CONT "in ", regs->psw.addr & PSW_ADDR_INSN);
printk("\n");
printk("failing address: %lX\n", address);
printk(KERN_CONT "\n");
printk(KERN_ALERT "failing address: %lX\n",
regs->int_parm_long & __FAIL_ADDR_MASK);
show_regs(regs);
}
@ -145,24 +146,18 @@ static inline void report_user_fault(struct pt_regs *regs, long int_code,
* Send SIGSEGV to task. This is an external routine
* to keep the stack usage of do_page_fault small.
*/
static noinline void do_sigsegv(struct pt_regs *regs, long int_code,
int si_code, unsigned long trans_exc_code)
static noinline void do_sigsegv(struct pt_regs *regs, int si_code)
{
struct siginfo si;
unsigned long address;
address = trans_exc_code & __FAIL_ADDR_MASK;
current->thread.prot_addr = address;
current->thread.trap_no = int_code;
report_user_fault(regs, int_code, SIGSEGV, address);
report_user_fault(regs, SIGSEGV);
si.si_signo = SIGSEGV;
si.si_code = si_code;
si.si_addr = (void __user *) address;
si.si_addr = (void __user *)(regs->int_parm_long & __FAIL_ADDR_MASK);
force_sig_info(SIGSEGV, &si, current);
}
static noinline void do_no_context(struct pt_regs *regs, long int_code,
unsigned long trans_exc_code)
static noinline void do_no_context(struct pt_regs *regs)
{
const struct exception_table_entry *fixup;
unsigned long address;
@ -178,55 +173,48 @@ static noinline void do_no_context(struct pt_regs *regs, long int_code,
* Oops. The kernel tried to access some bad page. We'll have to
* terminate things with extreme prejudice.
*/
address = trans_exc_code & __FAIL_ADDR_MASK;
if (!user_space_fault(trans_exc_code))
address = regs->int_parm_long & __FAIL_ADDR_MASK;
if (!user_space_fault(regs->int_parm_long))
printk(KERN_ALERT "Unable to handle kernel pointer dereference"
" at virtual kernel address %p\n", (void *)address);
else
printk(KERN_ALERT "Unable to handle kernel paging request"
" at virtual user address %p\n", (void *)address);
die("Oops", regs, int_code);
die(regs, "Oops");
do_exit(SIGKILL);
}
static noinline void do_low_address(struct pt_regs *regs, long int_code,
unsigned long trans_exc_code)
static noinline void do_low_address(struct pt_regs *regs)
{
/* Low-address protection hit in kernel mode means
NULL pointer write access in kernel mode. */
if (regs->psw.mask & PSW_MASK_PSTATE) {
/* Low-address protection hit in user mode 'cannot happen'. */
die ("Low-address protection", regs, int_code);
die (regs, "Low-address protection");
do_exit(SIGKILL);
}
do_no_context(regs, int_code, trans_exc_code);
do_no_context(regs);
}
static noinline void do_sigbus(struct pt_regs *regs, long int_code,
unsigned long trans_exc_code)
static noinline void do_sigbus(struct pt_regs *regs)
{
struct task_struct *tsk = current;
unsigned long address;
struct siginfo si;
/*
* Send a sigbus, regardless of whether we were in kernel
* or user mode.
*/
address = trans_exc_code & __FAIL_ADDR_MASK;
tsk->thread.prot_addr = address;
tsk->thread.trap_no = int_code;
si.si_signo = SIGBUS;
si.si_errno = 0;
si.si_code = BUS_ADRERR;
si.si_addr = (void __user *) address;
si.si_addr = (void __user *)(regs->int_parm_long & __FAIL_ADDR_MASK);
force_sig_info(SIGBUS, &si, tsk);
}
static noinline void do_fault_error(struct pt_regs *regs, long int_code,
unsigned long trans_exc_code, int fault)
static noinline void do_fault_error(struct pt_regs *regs, int fault)
{
int si_code;
@ -238,24 +226,24 @@ static noinline void do_fault_error(struct pt_regs *regs, long int_code,
/* User mode accesses just cause a SIGSEGV */
si_code = (fault == VM_FAULT_BADMAP) ?
SEGV_MAPERR : SEGV_ACCERR;
do_sigsegv(regs, int_code, si_code, trans_exc_code);
do_sigsegv(regs, si_code);
return;
}
case VM_FAULT_BADCONTEXT:
do_no_context(regs, int_code, trans_exc_code);
do_no_context(regs);
break;
default: /* fault & VM_FAULT_ERROR */
if (fault & VM_FAULT_OOM) {
if (!(regs->psw.mask & PSW_MASK_PSTATE))
do_no_context(regs, int_code, trans_exc_code);
do_no_context(regs);
else
pagefault_out_of_memory();
} else if (fault & VM_FAULT_SIGBUS) {
/* Kernel mode? Handle exceptions or die */
if (!(regs->psw.mask & PSW_MASK_PSTATE))
do_no_context(regs, int_code, trans_exc_code);
do_no_context(regs);
else
do_sigbus(regs, int_code, trans_exc_code);
do_sigbus(regs);
} else
BUG();
break;
@ -273,12 +261,12 @@ static noinline void do_fault_error(struct pt_regs *regs, long int_code,
* 11 Page translation -> Not present (nullification)
* 3b Region third trans. -> Not present (nullification)
*/
static inline int do_exception(struct pt_regs *regs, int access,
unsigned long trans_exc_code)
static inline int do_exception(struct pt_regs *regs, int access)
{
struct task_struct *tsk;
struct mm_struct *mm;
struct vm_area_struct *vma;
unsigned long trans_exc_code;
unsigned long address;
unsigned int flags;
int fault;
@ -288,6 +276,7 @@ static inline int do_exception(struct pt_regs *regs, int access,
tsk = current;
mm = tsk->mm;
trans_exc_code = regs->int_parm_long;
/*
* Verify that the fault happened in user space, that
@ -387,45 +376,46 @@ out:
return fault;
}
void __kprobes do_protection_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
void __kprobes do_protection_exception(struct pt_regs *regs)
{
unsigned long trans_exc_code;
int fault;
trans_exc_code = regs->int_parm_long;
/* Protection exception is suppressing, decrement psw address. */
regs->psw.addr = __rewind_psw(regs->psw, pgm_int_code >> 16);
regs->psw.addr = __rewind_psw(regs->psw, regs->int_code >> 16);
/*
* Check for low-address protection. This needs to be treated
* as a special case because the translation exception code
* field is not guaranteed to contain valid data in this case.
*/
if (unlikely(!(trans_exc_code & 4))) {
do_low_address(regs, pgm_int_code, trans_exc_code);
do_low_address(regs);
return;
}
fault = do_exception(regs, VM_WRITE, trans_exc_code);
fault = do_exception(regs, VM_WRITE);
if (unlikely(fault))
do_fault_error(regs, 4, trans_exc_code, fault);
do_fault_error(regs, fault);
}
void __kprobes do_dat_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
void __kprobes do_dat_exception(struct pt_regs *regs)
{
int access, fault;
access = VM_READ | VM_EXEC | VM_WRITE;
fault = do_exception(regs, access, trans_exc_code);
fault = do_exception(regs, access);
if (unlikely(fault))
do_fault_error(regs, pgm_int_code & 255, trans_exc_code, fault);
do_fault_error(regs, fault);
}
#ifdef CONFIG_64BIT
void __kprobes do_asce_exception(struct pt_regs *regs, long pgm_int_code,
unsigned long trans_exc_code)
void __kprobes do_asce_exception(struct pt_regs *regs)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
unsigned long trans_exc_code;
trans_exc_code = regs->int_parm_long;
if (unlikely(!user_space_fault(trans_exc_code) || in_atomic() || !mm))
goto no_context;
@ -440,12 +430,12 @@ void __kprobes do_asce_exception(struct pt_regs *regs, long pgm_int_code,
/* User mode accesses just cause a SIGSEGV */
if (regs->psw.mask & PSW_MASK_PSTATE) {
do_sigsegv(regs, pgm_int_code, SEGV_MAPERR, trans_exc_code);
do_sigsegv(regs, SEGV_MAPERR);
return;
}
no_context:
do_no_context(regs, pgm_int_code, trans_exc_code);
do_no_context(regs);
}
#endif
@ -459,14 +449,15 @@ int __handle_fault(unsigned long uaddr, unsigned long pgm_int_code, int write)
regs.psw.mask |= PSW_MASK_IO | PSW_MASK_EXT;
regs.psw.addr = (unsigned long) __builtin_return_address(0);
regs.psw.addr |= PSW_ADDR_AMODE;
uaddr &= PAGE_MASK;
regs.int_code = pgm_int_code;
regs.int_parm_long = (uaddr & PAGE_MASK) | 2;
access = write ? VM_WRITE : VM_READ;
fault = do_exception(&regs, access, uaddr | 2);
fault = do_exception(&regs, access);
if (unlikely(fault)) {
if (fault & VM_FAULT_OOM)
return -EFAULT;
else if (fault & VM_FAULT_SIGBUS)
do_sigbus(&regs, pgm_int_code, uaddr);
do_sigbus(&regs);
}
return fault ? -EFAULT : 0;
}
@ -509,7 +500,7 @@ int pfault_init(void)
.reserved = __PF_RES_FIELD };
int rc;
if (!MACHINE_IS_VM || pfault_disable)
if (pfault_disable)
return -1;
asm volatile(
" diag %1,%0,0x258\n"
@ -530,7 +521,7 @@ void pfault_fini(void)
.refversn = 2,
};
if (!MACHINE_IS_VM || pfault_disable)
if (pfault_disable)
return;
asm volatile(
" diag %0,0,0x258\n"
@ -643,8 +634,6 @@ static int __init pfault_irq_init(void)
{
int rc;
if (!MACHINE_IS_VM)
return 0;
rc = register_external_interrupt(0x2603, pfault_interrupt);
if (rc)
goto out_extint;
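
The fault handlers above no longer receive the interruption code and translation exception code as arguments; they read both from pt_regs. The following is a minimal user-space sketch of that data flow. The struct, the macro values (64-bit build, 4 KiB pages) and all sketch_* names are assumptions for illustration; the kernel headers define the real PAGE_MASK and __FAIL_ADDR_MASK.

```c
#include <assert.h>

/* Assumed mask values for a 64-bit build with 4 KiB pages. */
#define SKETCH_PAGE_MASK	(~0xfffUL)
#define SKETCH_FAIL_ADDR_MASK	(~0xfffUL)

/* Hypothetical stand-in for the two pt_regs fields the handlers read. */
struct sketch_regs {
	unsigned int int_code;		/* program interruption code */
	unsigned long int_parm_long;	/* translation exception code */
};

/* Build the value __handle_fault() stores: page address with low bits 2. */
static unsigned long sketch_make_int_parm(unsigned long uaddr)
{
	return (uaddr & SKETCH_PAGE_MASK) | 2;
}

/* Recover the failing address the way do_sigsegv() and do_sigbus() do. */
static unsigned long sketch_fail_address(const struct sketch_regs *regs)
{
	return regs->int_parm_long & SKETCH_FAIL_ADDR_MASK;
}

/* Small demonstration of the round trip. */
void sketch_fault_demo(void)
{
	struct sketch_regs regs = { 4, sketch_make_int_parm(0x12345678UL) };

	assert(sketch_fail_address(&regs) == 0x12345000UL);
}
```

Keeping both values in the register frame lets each helper take a single pointer, which is why the callers in the diff shrink to do_sigsegv(regs, si_code) and do_no_context(regs).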


@ -93,18 +93,22 @@ static unsigned long setup_zero_pages(void)
void __init paging_init(void)
{
unsigned long max_zone_pfns[MAX_NR_ZONES];
unsigned long pgd_type;
unsigned long pgd_type, asce_bits;
init_mm.pgd = swapper_pg_dir;
S390_lowcore.kernel_asce = __pa(init_mm.pgd) & PAGE_MASK;
#ifdef CONFIG_64BIT
/* A three level page table (4TB) is enough for the kernel space. */
S390_lowcore.kernel_asce |= _ASCE_TYPE_REGION3 | _ASCE_TABLE_LENGTH;
pgd_type = _REGION3_ENTRY_EMPTY;
if (VMALLOC_END > (1UL << 42)) {
asce_bits = _ASCE_TYPE_REGION2 | _ASCE_TABLE_LENGTH;
pgd_type = _REGION2_ENTRY_EMPTY;
} else {
asce_bits = _ASCE_TYPE_REGION3 | _ASCE_TABLE_LENGTH;
pgd_type = _REGION3_ENTRY_EMPTY;
}
#else
S390_lowcore.kernel_asce |= _ASCE_TABLE_LENGTH;
asce_bits = _ASCE_TABLE_LENGTH;
pgd_type = _SEGMENT_ENTRY_EMPTY;
#endif
S390_lowcore.kernel_asce = (__pa(init_mm.pgd) & PAGE_MASK) | asce_bits;
clear_table((unsigned long *) init_mm.pgd, pgd_type,
sizeof(unsigned long)*2048);
vmem_map_init();
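
The 64-bit branch of paging_init() above picks the top-level table type from the end of the vmalloc area: a region-third table covers 4 TB (1UL << 42), and anything beyond that needs a region-second table. The following is a minimal sketch of that decision only; the enum and function names are hypothetical and the real code sets _ASCE_TYPE_* bits and the matching empty-entry value.

```c
#include <assert.h>

/* Hypothetical labels for the two table types chosen in paging_init(). */
enum sketch_asce_type {
	SKETCH_ASCE_REGION2,	/* four-level table */
	SKETCH_ASCE_REGION3	/* three-level table, covers up to 4 TB */
};

/* Choose the kernel table type from the end of the vmalloc area. */
static enum sketch_asce_type sketch_pick_asce(unsigned long long vmalloc_end)
{
	if (vmalloc_end > (1ULL << 42))
		return SKETCH_ASCE_REGION2;
	return SKETCH_ASCE_REGION3;
}

/* Small demonstration: exactly 4 TB still fits a three-level table. */
void sketch_asce_demo(void)
{
	assert(sketch_pick_asce(1ULL << 42) == SKETCH_ASCE_REGION3);
	assert(sketch_pick_asce((1ULL << 42) + 1) == SKETCH_ASCE_REGION2);
}
```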


@ -33,17 +33,6 @@
#define FRAG_MASK 0x03
#endif
unsigned long VMALLOC_START = VMALLOC_END - VMALLOC_SIZE;
EXPORT_SYMBOL(VMALLOC_START);
static int __init parse_vmalloc(char *arg)
{
if (!arg)
return -EINVAL;
VMALLOC_START = (VMALLOC_END - memparse(arg, &arg)) & PAGE_MASK;
return 0;
}
early_param("vmalloc", parse_vmalloc);
unsigned long *crst_table_alloc(struct mm_struct *mm)
{
@ -267,7 +256,10 @@ static int gmap_alloc_table(struct gmap *gmap,
struct page *page;
unsigned long *new;
/* since we don't free the gmap table until gmap_free we can unlock */
spin_unlock(&gmap->mm->page_table_lock);
page = alloc_pages(GFP_KERNEL, ALLOC_ORDER);
spin_lock(&gmap->mm->page_table_lock);
if (!page)
return -ENOMEM;
new = (unsigned long *) page_to_phys(page);


@ -1718,7 +1718,7 @@ dasd_3990_erp_action_1B_32(struct dasd_ccw_req * default_erp, char *sense)
erp->startdev = device;
erp->memdev = device;
erp->magic = default_erp->magic;
erp->expires = 0;
erp->expires = default_erp->expires;
erp->retries = 256;
erp->buildclk = get_clock();
erp->status = DASD_CQR_FILLED;
@ -2363,7 +2363,7 @@ static struct dasd_ccw_req *dasd_3990_erp_add_erp(struct dasd_ccw_req *cqr)
erp->memdev = device;
erp->block = cqr->block;
erp->magic = cqr->magic;
erp->expires = 0;
erp->expires = cqr->expires;
erp->retries = 256;
erp->buildclk = get_clock();
erp->status = DASD_CQR_FILLED;


@ -705,6 +705,16 @@ struct dasd_device *dasd_alias_get_start_dev(struct dasd_device *base_device)
if (lcu->pav == NO_PAV ||
lcu->flags & (NEED_UAC_UPDATE | UPDATE_PENDING))
return NULL;
if (unlikely(!(private->features.feature[8] & 0x01))) {
/*
* PAV enabled but prefix not, very unlikely
* seems to be a lost pathgroup
* use base device to do IO
*/
DBF_DEV_EVENT(DBF_ERR, base_device, "%s",
"Prefix not enabled with PAV enabled\n");
return NULL;
}
spin_lock_irqsave(&lcu->lock, flags);
alias_device = group->next;


@ -752,24 +752,13 @@ dasd_eckd_cdl_reclen(int recid)
return sizes_trk0[recid];
return LABEL_SIZE;
}
/*
* Generate device unique id that specifies the physical device.
*/
static int dasd_eckd_generate_uid(struct dasd_device *device)
/* create unique id from private structure. */
static void create_uid(struct dasd_eckd_private *private)
{
struct dasd_eckd_private *private;
struct dasd_uid *uid;
int count;
unsigned long flags;
struct dasd_uid *uid;
private = (struct dasd_eckd_private *) device->private;
if (!private)
return -ENODEV;
if (!private->ned || !private->gneq)
return -ENODEV;
uid = &private->uid;
spin_lock_irqsave(get_ccwdev_lock(device->cdev), flags);
memset(uid, 0, sizeof(struct dasd_uid));
memcpy(uid->vendor, private->ned->HDA_manufacturer,
sizeof(uid->vendor) - 1);
@ -792,6 +781,23 @@ static int dasd_eckd_generate_uid(struct dasd_device *device)
private->vdsneq->uit[count]);
}
}
}
/*
* Generate device unique id that specifies the physical device.
*/
static int dasd_eckd_generate_uid(struct dasd_device *device)
{
struct dasd_eckd_private *private;
unsigned long flags;
private = (struct dasd_eckd_private *) device->private;
if (!private)
return -ENODEV;
if (!private->ned || !private->gneq)
return -ENODEV;
spin_lock_irqsave(get_ccwdev_lock(device->cdev), flags);
create_uid(private);
spin_unlock_irqrestore(get_ccwdev_lock(device->cdev), flags);
return 0;
}
@ -811,6 +817,21 @@ static int dasd_eckd_get_uid(struct dasd_device *device, struct dasd_uid *uid)
return -EINVAL;
}
/*
* compare device UID with data of a given dasd_eckd_private structure
* return 0 for match
*/
static int dasd_eckd_compare_path_uid(struct dasd_device *device,
struct dasd_eckd_private *private)
{
struct dasd_uid device_uid;
create_uid(private);
dasd_eckd_get_uid(device, &device_uid);
return memcmp(&device_uid, &private->uid, sizeof(struct dasd_uid));
}
static void dasd_eckd_fill_rcd_cqr(struct dasd_device *device,
struct dasd_ccw_req *cqr,
__u8 *rcd_buffer,
@ -1005,59 +1026,120 @@ static int dasd_eckd_read_conf(struct dasd_device *device)
int conf_len, conf_data_saved;
int rc;
__u8 lpm, opm;
struct dasd_eckd_private *private;
struct dasd_eckd_private *private, path_private;
struct dasd_path *path_data;
struct dasd_uid *uid;
char print_path_uid[60], print_device_uid[60];
private = (struct dasd_eckd_private *) device->private;
path_data = &device->path_data;
opm = ccw_device_get_path_mask(device->cdev);
lpm = 0x80;
conf_data_saved = 0;
/* get configuration data per operational path */
for (lpm = 0x80; lpm; lpm>>= 1) {
if (lpm & opm) {
rc = dasd_eckd_read_conf_lpm(device, &conf_data,
&conf_len, lpm);
if (rc && rc != -EOPNOTSUPP) { /* -EOPNOTSUPP is ok */
DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
"Read configuration data returned "
"error %d", rc);
return rc;
}
if (conf_data == NULL) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
"No configuration data "
"retrieved");
/* no further analysis possible */
path_data->opm |= lpm;
continue; /* no error */
}
/* save first valid configuration data */
if (!conf_data_saved) {
kfree(private->conf_data);
private->conf_data = conf_data;
private->conf_len = conf_len;
if (dasd_eckd_identify_conf_parts(private)) {
private->conf_data = NULL;
private->conf_len = 0;
kfree(conf_data);
continue;
}
conf_data_saved++;
}
switch (dasd_eckd_path_access(conf_data, conf_len)) {
case 0x02:
path_data->npm |= lpm;
break;
case 0x03:
path_data->ppm |= lpm;
break;
}
path_data->opm |= lpm;
if (conf_data != private->conf_data)
kfree(conf_data);
if (!(lpm & opm))
continue;
rc = dasd_eckd_read_conf_lpm(device, &conf_data,
&conf_len, lpm);
if (rc && rc != -EOPNOTSUPP) { /* -EOPNOTSUPP is ok */
DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
"Read configuration data returned "
"error %d", rc);
return rc;
}
if (conf_data == NULL) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
"No configuration data "
"retrieved");
/* no further analysis possible */
path_data->opm |= lpm;
continue; /* no error */
}
/* save first valid configuration data */
if (!conf_data_saved) {
kfree(private->conf_data);
private->conf_data = conf_data;
private->conf_len = conf_len;
if (dasd_eckd_identify_conf_parts(private)) {
private->conf_data = NULL;
private->conf_len = 0;
kfree(conf_data);
continue;
}
/*
* build the device UID so that other path data
* can be compared to it
*/
dasd_eckd_generate_uid(device);
conf_data_saved++;
} else {
path_private.conf_data = conf_data;
path_private.conf_len = DASD_ECKD_RCD_DATA_SIZE;
if (dasd_eckd_identify_conf_parts(
&path_private)) {
path_private.conf_data = NULL;
path_private.conf_len = 0;
kfree(conf_data);
continue;
}
if (dasd_eckd_compare_path_uid(
device, &path_private)) {
uid = &path_private.uid;
if (strlen(uid->vduit) > 0)
snprintf(print_path_uid,
sizeof(print_path_uid),
"%s.%s.%04x.%02x.%s",
uid->vendor, uid->serial,
uid->ssid, uid->real_unit_addr,
uid->vduit);
else
snprintf(print_path_uid,
sizeof(print_path_uid),
"%s.%s.%04x.%02x",
uid->vendor, uid->serial,
uid->ssid,
uid->real_unit_addr);
uid = &private->uid;
if (strlen(uid->vduit) > 0)
snprintf(print_device_uid,
sizeof(print_device_uid),
"%s.%s.%04x.%02x.%s",
uid->vendor, uid->serial,
uid->ssid, uid->real_unit_addr,
uid->vduit);
else
snprintf(print_device_uid,
sizeof(print_device_uid),
"%s.%s.%04x.%02x",
uid->vendor, uid->serial,
uid->ssid,
uid->real_unit_addr);
dev_err(&device->cdev->dev,
"Not all channel paths lead to "
"the same device, path %02X leads to "
"device %s instead of %s\n", lpm,
print_path_uid, print_device_uid);
return -EINVAL;
}
path_private.conf_data = NULL;
path_private.conf_len = 0;
}
switch (dasd_eckd_path_access(conf_data, conf_len)) {
case 0x02:
path_data->npm |= lpm;
break;
case 0x03:
path_data->ppm |= lpm;
break;
}
path_data->opm |= lpm;
if (conf_data != private->conf_data)
kfree(conf_data);
}
return 0;
}
@ -1090,12 +1172,61 @@ static int verify_fcx_max_data(struct dasd_device *device, __u8 lpm)
return 0;
}
static int rebuild_device_uid(struct dasd_device *device,
struct path_verification_work_data *data)
{
struct dasd_eckd_private *private;
struct dasd_path *path_data;
__u8 lpm, opm;
int rc;
rc = -ENODEV;
private = (struct dasd_eckd_private *) device->private;
path_data = &device->path_data;
opm = device->path_data.opm;
for (lpm = 0x80; lpm; lpm >>= 1) {
if (!(lpm & opm))
continue;
memset(&data->rcd_buffer, 0, sizeof(data->rcd_buffer));
memset(&data->cqr, 0, sizeof(data->cqr));
data->cqr.cpaddr = &data->ccw;
rc = dasd_eckd_read_conf_immediately(device, &data->cqr,
data->rcd_buffer,
lpm);
if (rc) {
if (rc == -EOPNOTSUPP) /* -EOPNOTSUPP is ok */
continue;
DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
"Read configuration data "
"returned error %d", rc);
break;
}
memcpy(private->conf_data, data->rcd_buffer,
DASD_ECKD_RCD_DATA_SIZE);
if (dasd_eckd_identify_conf_parts(private)) {
rc = -ENODEV;
} else /* first valid path is enough */
break;
}
if (!rc)
rc = dasd_eckd_generate_uid(device);
return rc;
}
static void do_path_verification_work(struct work_struct *work)
{
struct path_verification_work_data *data;
struct dasd_device *device;
struct dasd_eckd_private path_private;
struct dasd_uid *uid;
__u8 path_rcd_buf[DASD_ECKD_RCD_DATA_SIZE];
__u8 lpm, opm, npm, ppm, epm;
unsigned long flags;
char print_uid[60];
int rc;
data = container_of(work, struct path_verification_work_data, worker);
@ -1112,64 +1243,129 @@ static void do_path_verification_work(struct work_struct *work)
ppm = 0;
epm = 0;
for (lpm = 0x80; lpm; lpm >>= 1) {
if (lpm & data->tbvpm) {
memset(data->rcd_buffer, 0, sizeof(data->rcd_buffer));
memset(&data->cqr, 0, sizeof(data->cqr));
data->cqr.cpaddr = &data->ccw;
rc = dasd_eckd_read_conf_immediately(device, &data->cqr,
data->rcd_buffer,
lpm);
if (!rc) {
switch (dasd_eckd_path_access(data->rcd_buffer,
DASD_ECKD_RCD_DATA_SIZE)) {
case 0x02:
npm |= lpm;
break;
case 0x03:
ppm |= lpm;
break;
}
opm |= lpm;
} else if (rc == -EOPNOTSUPP) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
"path verification: No configuration "
"data retrieved");
opm |= lpm;
} else if (rc == -EAGAIN) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
if (!(lpm & data->tbvpm))
continue;
memset(&data->rcd_buffer, 0, sizeof(data->rcd_buffer));
memset(&data->cqr, 0, sizeof(data->cqr));
data->cqr.cpaddr = &data->ccw;
rc = dasd_eckd_read_conf_immediately(device, &data->cqr,
data->rcd_buffer,
lpm);
if (!rc) {
switch (dasd_eckd_path_access(data->rcd_buffer,
DASD_ECKD_RCD_DATA_SIZE)
) {
case 0x02:
npm |= lpm;
break;
case 0x03:
ppm |= lpm;
break;
}
opm |= lpm;
} else if (rc == -EOPNOTSUPP) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
"path verification: No configuration "
"data retrieved");
opm |= lpm;
} else if (rc == -EAGAIN) {
DBF_EVENT_DEVID(DBF_WARNING, device->cdev, "%s",
"path verification: device is stopped,"
" try again later");
epm |= lpm;
} else {
dev_warn(&device->cdev->dev,
"Reading device feature codes failed "
"(rc=%d) for new path %x\n", rc, lpm);
continue;
}
if (verify_fcx_max_data(device, lpm)) {
epm |= lpm;
} else {
dev_warn(&device->cdev->dev,
"Reading device feature codes failed "
"(rc=%d) for new path %x\n", rc, lpm);
continue;
}
if (verify_fcx_max_data(device, lpm)) {
opm &= ~lpm;
npm &= ~lpm;
ppm &= ~lpm;
continue;
}
/*
* save conf_data for comparison after
* rebuild_device_uid may have changed
* the original data
*/
memcpy(&path_rcd_buf, data->rcd_buffer,
DASD_ECKD_RCD_DATA_SIZE);
path_private.conf_data = (void *) &path_rcd_buf;
path_private.conf_len = DASD_ECKD_RCD_DATA_SIZE;
if (dasd_eckd_identify_conf_parts(&path_private)) {
path_private.conf_data = NULL;
path_private.conf_len = 0;
continue;
}
/*
* compare path UID with device UID only if at least
* one valid path is left
* otherwise the device UID may have changed and
* the first working path UID will be used as device UID
*/
if (device->path_data.opm &&
dasd_eckd_compare_path_uid(device, &path_private)) {
/*
* the comparison was not successful
* rebuild the device UID with at least one
* known path in case a z/VM hyperswap command
* has changed the device
*
* after this compare again
*
* if either the rebuild or the recompare fails
* the path can not be used
*/
if (rebuild_device_uid(device, data) ||
dasd_eckd_compare_path_uid(
device, &path_private)) {
uid = &path_private.uid;
if (strlen(uid->vduit) > 0)
snprintf(print_uid, sizeof(print_uid),
"%s.%s.%04x.%02x.%s",
uid->vendor, uid->serial,
uid->ssid, uid->real_unit_addr,
uid->vduit);
else
snprintf(print_uid, sizeof(print_uid),
"%s.%s.%04x.%02x",
uid->vendor, uid->serial,
uid->ssid,
uid->real_unit_addr);
dev_err(&device->cdev->dev,
"The newly added channel path %02X "
"will not be used because it leads "
"to a different device %s\n",
lpm, print_uid);
opm &= ~lpm;
npm &= ~lpm;
ppm &= ~lpm;
continue;
}
}
/*
* There is a small chance that a path is lost again between
* above path verification and the following modification of
* the device opm mask. We could avoid that race here by using
* yet another path mask, but we rather deal with this unlikely
* situation in dasd_start_IO.
*/
spin_lock_irqsave(get_ccwdev_lock(device->cdev), flags);
if (!device->path_data.opm && opm) {
device->path_data.opm = opm;
dasd_generic_path_operational(device);
} else
device->path_data.opm |= opm;
device->path_data.npm |= npm;
device->path_data.ppm |= ppm;
device->path_data.tbvpm |= epm;
spin_unlock_irqrestore(get_ccwdev_lock(device->cdev), flags);
}
/*
* There is a small chance that a path is lost again between
* above path verification and the following modification of
* the device opm mask. We could avoid that race here by using
* yet another path mask, but we rather deal with this unlikely
* situation in dasd_start_IO.
*/
spin_lock_irqsave(get_ccwdev_lock(device->cdev), flags);
if (!device->path_data.opm && opm) {
device->path_data.opm = opm;
dasd_generic_path_operational(device);
} else
device->path_data.opm |= opm;
device->path_data.npm |= npm;
device->path_data.ppm |= ppm;
device->path_data.tbvpm |= epm;
spin_unlock_irqrestore(get_ccwdev_lock(device->cdev), flags);
dasd_put_device(device);
if (data->isglobal)
@ -1441,11 +1637,6 @@ dasd_eckd_check_characteristics(struct dasd_device *device)
device->default_expires = value;
}
/* Generate device unique id */
rc = dasd_eckd_generate_uid(device);
if (rc)
goto out_err1;
dasd_eckd_get_uid(device, &temp_uid);
if (temp_uid.type == UA_BASE_DEVICE) {
block = dasd_alloc_block();
@ -2206,7 +2397,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp_cmd_single(
sizeof(struct PFX_eckd_data));
} else {
if (define_extent(ccw++, cqr->data, first_trk,
last_trk, cmd, startdev) == -EAGAIN) {
last_trk, cmd, basedev) == -EAGAIN) {
/* Clock not in sync and XRC is enabled.
* Try again later.
*/


@ -22,12 +22,9 @@
static struct kmem_cache *qdio_q_cache;
static struct kmem_cache *qdio_aob_cache;
struct qaob *qdio_allocate_aob()
struct qaob *qdio_allocate_aob(void)
{
struct qaob *aob;
aob = kmem_cache_zalloc(qdio_aob_cache, GFP_ATOMIC);
return aob;
return kmem_cache_zalloc(qdio_aob_cache, GFP_ATOMIC);
}
EXPORT_SYMBOL_GPL(qdio_allocate_aob);
@ -180,7 +177,8 @@ static void setup_queues(struct qdio_irq *irq_ptr,
setup_queues_misc(q, irq_ptr, qdio_init->input_handler, i);
q->is_input_q = 1;
q->u.in.queue_start_poll = qdio_init->queue_start_poll[i];
q->u.in.queue_start_poll = qdio_init->queue_start_poll_array ?
qdio_init->queue_start_poll_array[i] : NULL;
setup_storage_lists(q, irq_ptr, input_sbal_array, i);
input_sbal_array += QDIO_MAX_BUFFERS_PER_Q;

View File

@ -56,11 +56,6 @@
#define PCIXCC_MAX_ICA_RESPONSE_SIZE 0x77c /* max size type86 v2 reply */
#define PCIXCC_MAX_XCRB_MESSAGE_SIZE (12*1024)
#define PCIXCC_MAX_XCRB_RESPONSE_SIZE PCIXCC_MAX_XCRB_MESSAGE_SIZE
#define PCIXCC_MAX_XCRB_DATA_SIZE (11*1024)
#define PCIXCC_MAX_XCRB_REPLY_SIZE (5*1024)
#define PCIXCC_MAX_RESPONSE_SIZE PCIXCC_MAX_XCRB_RESPONSE_SIZE
#define PCIXCC_CLEANUP_TIME (15*HZ)
@ -265,7 +260,7 @@ static int ICACRT_msg_to_type6CRT_msgX(struct zcrypt_device *zdev,
* @ap_msg: pointer to AP message
* @xcRB: pointer to user input data
*
* Returns 0 on success or -EFAULT.
* Returns 0 on success or -EFAULT, -EINVAL.
*/
struct type86_fmt2_msg {
struct type86_hdr hdr;
@ -295,19 +290,12 @@ static int XCRB_msg_to_type6CPRB_msgX(struct zcrypt_device *zdev,
CEIL4(xcRB->request_control_blk_length) +
xcRB->request_data_length;
if (ap_msg->length > PCIXCC_MAX_XCRB_MESSAGE_SIZE)
return -EFAULT;
if (CEIL4(xcRB->reply_control_blk_length) > PCIXCC_MAX_XCRB_REPLY_SIZE)
return -EFAULT;
if (CEIL4(xcRB->reply_data_length) > PCIXCC_MAX_XCRB_DATA_SIZE)
return -EFAULT;
replylen = CEIL4(xcRB->reply_control_blk_length) +
CEIL4(xcRB->reply_data_length) +
sizeof(struct type86_fmt2_msg);
if (replylen > PCIXCC_MAX_XCRB_RESPONSE_SIZE) {
xcRB->reply_control_blk_length = PCIXCC_MAX_XCRB_RESPONSE_SIZE -
(sizeof(struct type86_fmt2_msg) +
CEIL4(xcRB->reply_data_length));
}
return -EINVAL;
replylen = sizeof(struct type86_fmt2_msg) +
CEIL4(xcRB->reply_control_blk_length) +
xcRB->reply_data_length;
if (replylen > PCIXCC_MAX_XCRB_MESSAGE_SIZE)
return -EINVAL;
/* prepare type6 header */
msg->hdr = static_type6_hdrX;
@ -326,7 +314,7 @@ static int XCRB_msg_to_type6CPRB_msgX(struct zcrypt_device *zdev,
return -EFAULT;
if (msg->cprbx.cprb_len + sizeof(msg->hdr.function_code) >
xcRB->request_control_blk_length)
return -EFAULT;
return -EINVAL;
function_code = ((unsigned char *)&msg->cprbx) + msg->cprbx.cprb_len;
memcpy(msg->hdr.function_code, function_code, sizeof(msg->hdr.function_code));
@ -678,7 +666,7 @@ static void zcrypt_pcixcc_receive(struct ap_device *ap_dev,
break;
case PCIXCC_RESPONSE_TYPE_XCRB:
length = t86r->fmt2.offset2 + t86r->fmt2.count2;
length = min(PCIXCC_MAX_XCRB_RESPONSE_SIZE, length);
length = min(PCIXCC_MAX_XCRB_MESSAGE_SIZE, length);
memcpy(msg->message, reply->message, length);
break;
default:
@ -1043,7 +1031,7 @@ static int zcrypt_pcixcc_probe(struct ap_device *ap_dev)
struct zcrypt_device *zdev;
int rc = 0;
zdev = zcrypt_device_alloc(PCIXCC_MAX_RESPONSE_SIZE);
zdev = zcrypt_device_alloc(PCIXCC_MAX_XCRB_MESSAGE_SIZE);
if (!zdev)
return -ENOMEM;
zdev->ap_dev = ap_dev;
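
The reworked zcrypt length check above replaces three separate limits with one bound on the total reply size and reports a violation as -EINVAL instead of -EFAULT. The following is a minimal user-space sketch of that check. The SKETCH_CEIL4 definition (round up to a multiple of 4) is an assumption about the driver's CEIL4 macro, the header length is passed in rather than taken from struct type86_fmt2_msg, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed behaviour of the driver's CEIL4: round up to a multiple of 4. */
#define SKETCH_CEIL4(x)			((((x) + 3) / 4) * 4)
/* Same value as PCIXCC_MAX_XCRB_MESSAGE_SIZE in the diff. */
#define SKETCH_MAX_XCRB_MESSAGE_SIZE	(12 * 1024)

/*
 * Return 0 if header, padded control block and data fit into one
 * message, -EINVAL otherwise. The sketch does not guard against
 * wraparound of the sum for very large inputs.
 */
static int sketch_check_reply_len(size_t hdr_len, size_t cblk_len,
				  size_t data_len)
{
	size_t replylen = hdr_len + SKETCH_CEIL4(cblk_len) + data_len;

	if (replylen > SKETCH_MAX_XCRB_MESSAGE_SIZE)
		return -EINVAL;
	return 0;
}

/* Small demonstration: an oversized control block is rejected. */
void sketch_zcrypt_demo(void)
{
	assert(sketch_check_reply_len(64, 5, 100) == 0);
	assert(sketch_check_reply_len(64, 12 * 1024, 0) == -EINVAL);
}
```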


@ -4552,7 +4552,7 @@ static int qeth_qdio_establish(struct qeth_card *card)
init_data.no_output_qs = card->qdio.no_out_queues;
init_data.input_handler = card->discipline.input_handler;
init_data.output_handler = card->discipline.output_handler;
init_data.queue_start_poll = queue_start_poll;
init_data.queue_start_poll_array = queue_start_poll;
init_data.int_parm = (unsigned long) card;
init_data.input_sbal_addr_array = (void **) in_sbal_ptrs;
init_data.output_sbal_addr_array = (void **) out_sbal_ptrs;