linux/arch/ia64/include/asm
Hidetoshi Seto 4295ab3488 [IA64] kdump: Mask MCA/INIT on frozen cpus
Summary:

  An INIT asserted on the kdump kernel invokes the INIT handler not only
  on the cpu that is running the kdump kernel, but also on the BSP of the
  panicked kernel, because the (badly) frozen BSP can be thawed by INIT.

Description:

  kdump_cpu_freeze() is called on every cpu except the one that initiates
  the panic and/or kdump, to stop/offline that cpu (on ia64, this means we
  pass control of the cpu to SAL, or put it in a spinloop).  Note that
  CPU0 (BSP) always goes to the spinloop, so if the panic happened on an
  AP, there are at least 2 cpus (= the AP and the BSP) that have not been
  returned to SAL.

  On the spinning cpus, interrupts are disabled (rsm psr.i), but INIT can
  still be delivered, because psr.mc, which masks it, is not set unless
  kdump_cpu_freeze() is called from MCA/INIT context.

  Therefore, assume that a panic happened on an AP, kdump was invoked,
  new INIT handlers for the kdump kernel were registered, and then an
  INIT is asserted.  From the viewpoint of SAL there are 2 online cpus,
  so the INIT will be delivered to both of them.  This most likely means
  that not only the AP (= the cpu executing kdump) enters the newly
  registered INIT handler, but the BSP (= the other cpu, spinning in the
  panicked kernel) enters the same INIT handler as well.  The registers
  on the BSP still hold the old settings (for the panicked kernel), so
  the result of running the handler with the wrong settings is
  unpredictable.  I believe this is not desirable behavior.

How to Reproduce:

  Start kdump on one of the APs (e.g. cpu1):
    # taskset 0x2 echo c > /proc/sysrq-trigger
  Then assert INIT after the kdump kernel has booted and the new INIT
  handler for the kdump kernel has been registered.

Expected results:

  An INIT handler is invoked only on the AP.

Actual results:

  An INIT handler is invoked on the AP and BSP.

Sample of results:

  I got the following console log by asserting INIT after the prompt
  "root:/>".  It seems that two monarchs appeared from one INIT, and one
  of them eventually panicked.  It also seems that the panicked one
  assumed there were 4 online cpus and that none of the others reached
  rendezvous:

    :
    [  0 %]dropping to initramfs shell
    exiting this shell will reboot your system
    root:/> Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=0
    ia64_init_handler: Promoting cpu 0 to monarch.
    Delaying for 5 seconds...
    All OS INIT slaves have reached rendezvous
    Processes interrupted by INIT - 0 (cpu 0 task 0xa000000100af0000)
    :
    <<snip>>
    :
    Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=1
    Delaying for 5 seconds...
    mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
    OS INIT slave did not rendezvous on cpu 1 2 3
    INIT swapper 0[0]: bugcheck! 0 [1]
    :
    <<snip>>
    :
    Kernel panic - not syncing: Attempted to kill the idle task!

Proposed fix:

  To avoid this problem, this patch inserts ia64_set_psr_mc() to mask
  INIT on the cpus that are about to be frozen.  This masking has no
  effect if kdump_cpu_freeze() is called from the INIT handler when
  kdump_on_init == 1, because psr.mc has already been set to 1 before
  entering OS_INIT.  I confirmed that weird logs like the one above
  disappear after applying this patch.
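
  The effect of the masking can be illustrated with a small userspace toy
  model. This is only a sketch under simplifying assumptions, not kernel
  code: struct toy_cpu, toy_freeze_spin(), toy_assert_init() and
  toy_scenario() are hypothetical names invented for the illustration, and
  the model assumes that a cpu enters the OS INIT handler exactly when it
  has not been returned to SAL and its psr.mc bit is clear.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4

/* Toy per-cpu state. Hypothetical names, not kernel structures. */
struct toy_cpu {
    bool returned_to_sal; /* control handed back to SAL */
    bool psr_i;           /* external interrupts enabled */
    bool psr_mc;          /* when set, INIT is masked on this cpu */
};

/* Model of freezing a cpu into a spinloop in the panicked kernel. */
static void toy_freeze_spin(struct toy_cpu *c, bool mask_init)
{
    c->returned_to_sal = false;
    c->psr_i = false;     /* corresponds to "rsm psr.i" */
    if (mask_init)
        c->psr_mc = true; /* stands in for ia64_set_psr_mc() */
}

/* Count the cpus that would enter the OS INIT handler on one INIT. */
static int toy_assert_init(const struct toy_cpu cpus[], int n)
{
    int entered = 0;

    for (int i = 0; i < n; i++)
        if (!cpus[i].returned_to_sal && !cpus[i].psr_mc)
            entered++;
    return entered;
}

/*
 * Scenario from the description: the panic happens on cpu1 (an AP),
 * cpu0 (BSP) always spins, and the remaining cpus go back to SAL.
 */
static int toy_scenario(bool mask_init)
{
    struct toy_cpu cpus[NCPUS];
    const int panic_cpu = 1;

    assert(panic_cpu > 0 && panic_cpu < NCPUS);
    for (int i = 0; i < NCPUS; i++) {
        cpus[i] = (struct toy_cpu){ .psr_i = true };
        if (i == panic_cpu)
            continue;                 /* this cpu runs the kdump kernel */
        if (i == 0)
            toy_freeze_spin(&cpus[i], mask_init);
        else
            cpus[i].returned_to_sal = true;
    }
    return toy_assert_init(cpus, NCPUS);
}
```

  In this model, toy_scenario(false) counts both the AP and the BSP as
  entering the handler, which matches the "Actual results" above, while
  toy_scenario(true) counts only the AP, which matches the "Expected
  results".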

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Haren Myneni <hbabu@us.ibm.com>
Cc: kexec@lists.infradead.org
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-09-14 16:17:05 -07:00
..
native ia64/pv_ops: paravirtualize gate.S. 2009-03-26 11:01:46 -07:00
sn [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
uv sgi-gru: add macros for using the UV hub to send interrupts 2009-04-02 19:05:05 -07:00
xen Fix ia64 compilation IS_ERR and PTE_ERR errors. 2009-07-17 06:34:50 -07:00
acpi-ext.h ACPI: remove private acpica headers from driver files 2008-12-31 01:15:22 -05:00
acpi.h
agp.h
asmmacro.h
atomic.h asm-generic: rename atomic.h to atomic-long.h 2009-06-11 21:02:17 +02:00
auxvec.h
bitops.h ia64: boolean __test_and_clear_bit 2009-08-11 14:52:10 -07:00
bitsperlong.h asm-generic: introduce asm/bitsperlong.h 2009-06-11 21:02:14 +02:00
break.h ia64/xen: reserve "break" numbers used for xen hypercalls. 2008-10-17 09:52:52 -07:00
bug.h
bugs.h
byteorder.h byteorder: make swab.h include asm/swab.h like a regular header 2009-01-14 19:56:50 -08:00
cache.h
cacheflush.h [IA64] Add Variable Page Size and IA64 Support in Intel IOMMU 2008-10-17 12:14:13 -07:00
checksum.h
compat.h
cpu.h
cputime.h
current.h
cyclone.h
delay.h
device.h [IA64] Add Variable Page Size and IA64 Support in Intel IOMMU 2008-10-17 12:14:13 -07:00
div64.h
dma-mapping.h dma-mapping: ia64: add CONFIG_DMA_API_DEBUG support 2009-06-18 13:03:58 -07:00
dma.h
dmi.h
elf.h [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY 2008-10-16 15:40:05 +02:00
emergency-restart.h
errno.h
esi.h
fb.h
fcntl.h
fpswa.h
fpu.h Revert "Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h" 2009-07-17 06:35:05 -07:00
ftrace.h ftrace, ia64: IA64 dynamic ftrace support 2009-01-14 12:11:31 +01:00
futex.h
gcc_intrin.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
hardirq.h preempt-count: force hardirq-count to max of 10 2009-02-12 11:19:05 -05:00
hpsim.h
hugetlb.h
hw_irq.h [IA64] remove obsolete hw_interrupt_type 2009-06-15 14:35:10 -07:00
ia32.h
ia64regs.h
idle.h [IA64] xen_domu_defconfig: fix build issues/warnings 2009-05-05 11:43:13 -07:00
intel_intrin.h
intrinsics.h Pull pvops into release branch 2009-03-31 14:25:08 -07:00
io.h [IA64] remove dead BIO_VMERGE_BOUNDARY definition 2008-11-04 11:31:58 -08:00
ioctl.h
ioctls.h
iommu.h intel-iommu: Fix one last ia64 build problem in Pass Through Support 2009-06-05 20:49:53 +01:00
iosapic.h
ipcbuf.h
irq_regs.h
irq.h ia64: cpumask fix for is_affinity_mask_valid() 2009-01-04 15:39:24 +01:00
Kbuild [IA64] unexport fpswa.h 2009-06-15 14:32:54 -07:00
kdebug.h
kexec.h
kmap_types.h kmap_types: make most arches use generic header file 2009-06-16 19:47:51 -07:00
kprobes.h
kregs.h [IA64] Fix annoying IA64_TR_ALLOC_MAX message. 2008-10-17 13:47:53 -07:00
kvm_host.h KVM: Enable snooping control for supported hardware 2009-06-10 11:48:50 +03:00
kvm_para.h
kvm.h Merge branch 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-26 16:11:41 -07:00
libata-portmap.h
linkage.h
local.h
machvec_dig_vtd.h remove dma operations in struct ia64_machine_vector 2009-01-06 14:06:50 +01:00
machvec_dig.h
machvec_hpsim.h
machvec_hpzx1_swiotlb.h remove hwsw_dma_ops 2009-01-06 14:06:52 +01:00
machvec_hpzx1.h remove dma operations in struct ia64_machine_vector 2009-01-06 14:06:50 +01:00
machvec_init.h [IA64] SN specific version of dma_get_required_mask() 2009-01-15 10:42:16 -08:00
machvec_sn2.h Merge branch 'linus' into core/iommu 2009-01-16 10:09:10 +01:00
machvec_uv.h
machvec_xen.h ia64/xen: define xen machine vector for domU. 2008-10-17 10:08:56 -07:00
machvec.h Merge branch 'linus' into core/iommu 2009-01-16 10:09:10 +01:00
mc146818rtc.h
mca_asm.h
mca.h [IA64] kdump: Mask MCA/INIT on frozen cpus 2009-09-14 16:17:05 -07:00
meminit.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
mman.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
mmu_context.h cpumask: use mm_cpumask() wrapper: ia64 2009-03-16 14:12:48 +10:30
mmu.h
mmzone.h mm: clean up for early_pfn_to_nid() 2009-02-18 15:37:55 -08:00
module.h ia64/pv_ops/bp/module: support binary patching for kernel module. 2009-03-26 11:02:51 -07:00
msgbuf.h
msidef.h ia64: Move the macro definitions related to MSI to one header file. 2009-03-24 11:03:12 +02:00
mutex.h
nodedata.h
numa.h
page.h
pal.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
param.h
paravirt_patch.h ia64/pv_op/binarypatch: add helper functions to support binary patching for paravirt_ops. 2009-03-26 11:02:31 -07:00
paravirt_privop.h [IA64] fix allmodconfig compilation breakage. 2009-04-20 09:46:29 -07:00
paravirt.h ia64/pv_ops: implement binary patching optimization for native. 2009-03-26 11:02:42 -07:00
parport.h
patch.h
pci.h Delete pcibios_select_root 2009-06-17 14:04:42 -07:00
percpu.h percpu: make PER_CPU_BASE_SECTION overridable by arches 2009-02-09 10:30:29 +01:00
perfmon_default_smpl.h
perfmon.h
pgalloc.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
pgtable.h IA64: includecheck fix: ia64, pgtable.h 2009-08-11 14:52:11 -07:00
poll.h
posix_types.h
processor.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
ptrace_offsets.h
ptrace.h remove __ARCH_WANT_COMPAT_SYS_PTRACE 2008-11-30 11:00:15 -08:00
pvclock-abi.h ia64/xen: add a necessary header file to compile include/xen/interface/xen.h 2008-10-17 09:57:28 -07:00
resource.h
rse.h
rwsem.h
sal.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
scatterlist.h
sections.h [IA64] Put the space for cpu0 per-cpu area into .data section 2008-09-29 16:39:19 -07:00
segment.h
sembuf.h
serial.h
setup.h
shmbuf.h
shmparam.h
sigcontext.h
siginfo.h signals: demultiplexing SIGTRAP signal 2008-09-23 13:26:52 +02:00
signal.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
smp.h FRV: Fix the section attribute on UP DECLARE_PER_CPU() 2009-04-21 19:39:59 -07:00
socket.h net: new user space API for time stamping of incoming and outgoing packets 2009-02-15 22:43:33 -08:00
sockios.h
sparsemem.h
spinlock_types.h
spinlock.h ia64: implement interrupt-enabling rwlocks 2009-04-02 19:05:11 -07:00
stat.h
statfs.h
string.h
swab.h headers_check fix: ia64, swab.h 2009-02-01 11:01:25 +05:30
swiotlb.h swiotlb: replace architecture-specific swiotlb.h with linux/swiotlb.h 2008-12-28 10:04:00 +01:00
sync_bitops.h ia64/xen: introduce sync bitops which is necessary for ia64/xen support. 2008-10-17 09:53:33 -07:00
syscall.h [IA64] utrace Convert compat ptrace to use compat_sys_ptrace 2008-10-06 10:45:29 -07:00
system.h
termbits.h
termios.h
thread_info.h sched: INIT_PREEMPT_COUNT 2009-07-10 14:24:05 -07:00
timex.h ia64/pv_ops/pv_time_ops: add sched_clock hook. 2009-03-26 10:50:42 -07:00
tlb.h mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() 2009-07-27 12:10:38 -07:00
tlbflush.h
topology.h Pull cpumask into release branch 2009-03-31 14:24:52 -07:00
types.h [IA64] Convert ia64 to use int-ll64.h 2009-06-17 09:33:49 -07:00
uaccess.h
ucontext.h
unaligned.h
uncached.h
unistd.h [IA64] ia64 does not need umount2() syscall 2009-06-16 13:13:50 -07:00
unwind.h
user.h
ustack.h
vga.h
xor.h