CONFIG_PPC47x doesn't exist in Kconfig and no 476 processor calls this
function ppc44x_pin_tlb() as it has it's own ppc47x_pin_tlb().
This code is probably an artifact of the original 476 code that
shouldn't have made it upstream.
Signed-off-by: Christoph Egger <siccegge@cs.fau.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
Needed if you want to use swiotlb, harmless otherwise.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
Currituck doesn't need nor use SDR so aborting the pci setup if there is
no sdr-base would be bad.
Add a flag to ppc4xx_pciex_hwops for the backends to state if they need
SDR and then only complain and abort if they do and it's not found in
the device tree.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEBLOCK_NODE_MAP -
there's no user of early_node_map[] left. Kill early_node_map[] and
replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP. Also,
relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h
as page_alloc.c would no longer host an alternative implementation.
This change is ultimately one to one mapping and shouldn't cause any
observable difference; however, after the recent changes, there are
some functions which now would fit memblock.c better than page_alloc.c
and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK
doesn't make much sense on some of them. Further cleanups for
functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice.
-v2: Fix compile bug introduced by mis-spelling
CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in
mmzone.h. Reported by Stephen Rothwell.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
powerpc doesn't access early_node_map[] directly and enabling
HAVE_MEMBLOCK_NODE_MAP is trivial - replacing add_active_range() calls
with memblock_set_node() and selecting HAVE_MEMBLOCK_NODE_MAP is
enough.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
The only function of memblock_analyze() is now allowing resize of
memblock region arrays. Rename it to memblock_allow_resize() and
update its users.
* The following users remain the same other than renaming.
arm/mm/init.c::arm_memblock_init()
microblaze/kernel/prom.c::early_init_devtree()
powerpc/kernel/prom.c::early_init_devtree()
openrisc/kernel/prom.c::early_init_devtree()
sh/mm/init.c::paging_init()
sparc/mm/init_64.c::paging_init()
unicore32/mm/init.c::uc32_memblock_init()
* In the following users, analyze was used to update total size which
is no longer necessary.
powerpc/kernel/machine_kexec.c::reserve_crashkernel()
powerpc/kernel/prom.c::early_init_devtree()
powerpc/mm/init_32.c::MMU_init()
powerpc/mm/tlb_nohash.c::__early_init_mmu()
powerpc/platforms/ps3/mm.c::ps3_mm_add_memory()
powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups()
sh/kernel/machine_kexec.c::reserve_crashkernel()
* x86/kernel/e820.c::memblock_x86_fill() was directly setting
memblock_can_resize before populating memblock and calling analyze
afterwards. Call memblock_allow_resize() before start populating.
memblock_can_resize is now static inside memblock.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "H. Peter Anvin" <hpa@zytor.com>
* early_init_devtree(): Total memory size is aligned to PAGE_SIZE;
however, alignment isn't enforced if memory_limit is explicitly
specified. Simplify the logic and always apply PAGE_SIZE alignment.
* MMU_init(): memblock regions is truncated by directly modifying
memblock.memory.cnt. This is incomplete (reserved array is not
truncated) and unnecessarily low level hindering further memblock
improvments. Use memblock_enforce_memory_limit() instead.
* wii_memory_fixups(): Unnecessarily low level direct manipulation of
memblock regions. The same result can be achieved using properly
abstracted operations. Reimplement using memblock API.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
memblock_init() initializes arrays for regions and memblock itself;
however, all these can be done with struct initializers and
memblock_init() can be removed. This patch kills memblock_init() and
initializes memblock with struct initializer.
The only difference is that the first dummy entries don't have .nid
set to MAX_NUMNODES initially. This doesn't cause any behavior
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "H. Peter Anvin" <hpa@zytor.com>
24aa07882b (memblock, x86: Replace memblock_x86_reserve/free_range()
with generic ones) removed arch/x86/include/asm/memblock.h and dropped
its inclusion from include/linux/memblock.h which breaks other
architectures which depended on the generic memblock.h pulling in the
arch specific one.
However, the proper fix isn't adding back the asm inclusion. memblock
doesn't have any arch dependent part and doesn't need arch specific
header file and asm/memblock.h files are either practically empty or
contain mostly unrelated arch specific stuff.
* In microblaze, sh, powerpc, sparc and openrisc, asm/memblock.h is
either empty or just contains unused MEMBLOCK_DBG() macro. Remove
them.
* In arm and unicore32, asm/memblock.h contains arch specific stuff.
Include it directly from its users. It might be a good idea to
rename the header file to avoid confusion.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Most distros use it so we may as well enable it and get regular compile
testing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When issuing a system reset we almost always oops in the oops_to_nvram
code because multiple CPUs are using the deflate work area. Add a
spinlock to protect it.
To play it safe I'm using trylock to avoid locking up if the NVRAM
code oopses. This means we might miss multiple CPUs oopsing at exactly
the same time but I think it's best to play it safe for now. Once we
are happy with the reliability we can change it to a full spinlock.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This fixes a problem where a CPU thread coming out of nap mode can
think it has valid values in the nonvolatile GPRs (r14 - r31) as saved
away in power7_idle, but in fact the values have been trashed because
the thread was used for KVM in the mean time. The result is that the
thread crashes because code that called power7_idle (e.g.,
pnv_smp_cpu_kill_self()) goes to use values in registers that have
been trashed.
The bit field in SRR1 that tells whether state was lost only reflects
the most recent nap, which may not have been the nap instruction in
power7_idle. So we need an extra PACA field to indicate that state
has been lost even if SRR1 indicates that the most recent nap didn't
lose state. We clear this field when saving the state in power7_idle,
we set it to a non-zero value when we use the thread for KVM, and we
test it in power7_wakeup_noloss.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At present, on the powernv platform, if you off-line a CPU that was
online, and then try to on-line it again, the kernel generates a
warning message "OPAL Error -1 starting CPU n". Furthermore, if the
CPU is a secondary thread that was used by KVM while it was off-line,
the CPU fails to come online.
The first problem is fixed by only calling OPAL to start the CPU the
first time it is on-lined, as indicated by the cpu_start field of its
PACA being zero. The second problem is fixed by restoring the
cpu_start field to 1 instead of 0 when using the CPU within KVM.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The minimum RMO size field in ibm,client-architecture is currently
ignored, but a future firmware version will rectify that. Since we
always get at least 128MB of RMO right now, asking for 64MB is
likely to result in boot failures.
We should bump it to at least 128MB, but considering all the boot
issues we have on 128MB RMO boxes and all new machines have virtual
RMO, we may as well set our minimum to 256MB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With CONFIG_STRICT_DEVMEM=y, user space cannot read any part of /dev/mem.
Since this breaks librtas, punch a hole in /dev/mem to allow access to the
rmo_buffer that librtas needs.
Anton Blanchard reported the problem and helped with the fix.
A quick test for this patch:
# cat /proc/rtas/rmo_buffer
000000000f190000 10000
# python -c "print 0x000000000f190000 / 0x10000"
3865
# dd if=/dev/mem of=/tmp/foo count=1 bs=64k skip=3865
1+0 records in
1+0 records out
65536 bytes (66 kB) copied, 0.000205235 s, 319 MB/s
# dd if=/dev/mem of=/tmp/foo
dd: reading `/dev/mem': Operation not permitted
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00022519 s, 0.0 kB/s
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
So I've had one of these for a while and it looks like the vendor never
bothered submitting the support upstream.
This adds it using ppc40x_simple and provides a device-tree.
There are some changes to the boot wrapper because the way u-boot works
on this thing, it seems to expect a multipart image with the kernel,
initrd and dtb in it.
The USB support is missing as it needs the yet unmerged driver for
the DWC OTG part and the GPIOs may need further definition in the dts.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Refresh ps3_defconfig to latest kernel sources and
change the options:
CONFIG_PPP=m to CONFIG_PPP=n.
CONFIG_NAMESPACES=y to CONFIG_NAMESPACES=n
CONFIG_NUMA=y to CONFIG_NUMA=n
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add an __init annotation to the ps3_smp_probe() routine.
Fixes build warnings like these when
CONFIG_DEBUG_SECTION_MISMATCH=y:
WARNING: Section mismatch in reference from the function .ps3_smp_probe()
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix some PS3 repository.c build warnings when DEBUG is
defined. Also change most pr_debug calls to pr_devel calls.
Fixes warnings like these:
format '%lx' expects type 'long unsigned int', but argument 7 has type 'u64'
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1 hcall #91 should be named lv1_read_repository_node, and
not lv1_get_repository_node_value. Adjust the lv1 hcall table
and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_get_version_info hcall takes 2, not 1 output
arguments. Adjust the lv1 hcall table and all calls.
Usage:
int lv1_get_version_info(u64 *version_number, u64 *vendor_id)
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_get_virtual_address_space_id_of_ppe hcall takes 0, not 1 input
arguments. Adjust the lv1 hcall table and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_net_stop_tx_dma and net_stop_rx_dma hcalls take 2, not 3 input
arguments. Adjust the lv1 hcall table and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
General code cleanup for PS3 interrupt.c:
o Fill out comments for structure members.
o Move variables ipi_debug_brk_mask and lock from struct ps3_bmp to
struct ps3_private.
o Fix pr_debug build errors when DEBUG is defined.
o Convert bit operation to set_bit().
o Convert DBG macro from pr_debug to pr_devel
o Add new macro FAIL to replace pr_debug calls
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We might enter the secondary CPU capture code twice, eg if we have to
unstick some CPUs with a system reset. In this case we don't want to
overwrite the state on CPUs that had made it into the capture code OK,
so use the cpus_state_saved cpumask for that and make it local to
crash_ipi_callback.
For controlling progress now use atomic_t cpus_in_crash to count how
many CPUs have made it into the kdump code, and time_to_dump to tell
everyone it's time to dump.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If we enter the kdump code via system reset, wait a bit before
sending the IPI to capture all secondary CPUs. Without it we race
with the hypervisor that is issuing the system reset to each CPU.
If the IPI gets there first the system reset oops output then shows
the register state of the IPI handler which is not what we want.
I took the opportunity to add defines for all the various delays
we have. There's no need for cpu_relax when we are doing an mdelay,
so remove them too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I have an intermittent kdump fail where the hypervisor fails an H_EOI.
As a result our CPPR is never reset to 0xff and we no longer accept
interrupts.
This patch calls icp_hv_set_cppr to reset the CPPR if H_EOI fails,
fixing the kdump fail.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We've had a 180 second panic timeout on ppc64 for as long as I
can remember. This patch reduces it to 10 seconds on pseries for a few
reasons:
- Almost all pseries machines have a hypervisor console so panic
output will be available in a scrollback buffer.
- The 180 seconds impacts our availability, users (other than
kernel hackers) just want the box to come back around so it
can continue its work.
- I spend a lot of my life staring at the 180 second panic timeout.
Many pseries machines take minutes to power cycle, so it's quicker
to sit through the 180 seconds than it is to power cycle.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Our die() code was based off a very old x86 version. Update it to
mirror the current x86 code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Remove some unnecessary defines and fix some spelling mistakes.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We can handle recursion caused by system reset by reusing the crash
shutdown fault handler.
Since we don't have an OS triggerable NMI, if all CPUs don't make it
into kdump then we tell the user to issue a system reset. However if
we have a panic timeout set we cannot wait forever and must continue
the kdump.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have a lot of complicated logic that handles possible recursion between
kdump and a system reset exception. We can solve this in a much simpler
way using the same setjmp/longjmp tricks xmon does.
As a first step, this patch removes the old system reset code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I've been seeing truncated output when people send system reset info
to me. We should see a backtrace for every CPU, but the panic() code
takes the box down before they all make it out to the console. The
panic code runs unlocked so we also see corrupted console output.
If we are going to panic, then delay 1 second before calling into the
panic code. Move oops_exit inside the die lock and put a newline
between oopses for clarity.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch makes pseries_idle_driver not to be registered when
power_save=off kernel boot option is specified. The
cpuidle_disable variable used here is similar to
its usage on x86. If cpuidle_disable is set then
sysfs entries for cpuidle framework are not created
and the required drivers are not loaded.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch enables cpuidle for pSeries and pSeries_idle is
directly called from the idle loop. As a result of pSeries_idle, cpuidle
driver registered with cpuidle subsystem comes into action. On
failure of loading of the driver or cpuidle framework default idle
is executed as part of the function. This patch
also removes the routines pseries_shared_idle_sleep and
pseries_dedicated_idle_sleep as they are now implemented as part of
pseries_idle cpuidle driver.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements a back-end cpuidle driver for pSeries
based on pseries_dedicated_idle_loop and pseries_shared_idle_loop
routines. The driver is built only if CONFIG_CPU_IDLE is set. This
cpuidle driver uses global registration of idle states and
not per-cpu.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch provides cpu_idle_wait() routine for the powerpc
platform which is required by the cpuidle subsystem. This
routine is required to change the idle handler on SMP systems.
The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.
cpuidle_disable variable is to enable/disable cpuidle
framework if power_save option is set during the boot
time.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's only used inside the same file where it's defined. There's
also no point exporting it anymore.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Open Firmware on OPAL machines seems to have issues if we close
stdin and/or we try to print things after calling "quiesce" so
we avoid doing both.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Define HUGETLB_NEED_PRELOAD in mmu-book3e.h for CONFIG_PPC64 instead
of having a much more complicated #if block. This is easier to read
and maintain.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This avoids an extra find_vma() and is less error-prone.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Allow hugetlb to be enabled on 64b FSL_BOOK3E. No platforms enable
it by default yet.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For 64-bit FSL_BOOKE implementations, gigantic pages need to be
reserved at boot time by the memblock code based on the command line.
This adds the call that handles the reservation, and fixes some code
comments.
It also removes the previous pr_err when reserve_hugetlb_gpages
is called on a system without hugetlb enabled - the way the code is
structured, the call is unconditional and the resulting error message
spurious and confusing.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>