My changes to _tlbie to fix 4xx unfortunately broke 8xx build in a
couple of places. This fixes it.
Spotted by Olof Johansson.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Arch-independent zone-sizing is using indices instead of symbolic names to
offset within an array related to zones (max_zone_pfns). The unintended
impact is that ZONE_DMA and ZONE_NORMAL is initialised on powerpc instead
of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set. As a result, the
the machine fails to boot but will boot with CONFIG_HIGHMEM turned off.
The following patch properly initialises the max_zone_pfns[] array and uses
symbolic names instead of indices in each architecture using
arch-independent zone-sizing. Two users have successfully booted their
powerpcs with it (one an ibook G4). It has also been boot tested on x86,
x86_64, ppc64 and ia64. Please merge for 2.6.19-rc2.
Credit to Benjamin Herrenschmidt for identifying the bug and rolling the
first fix. Additional credit to Johannes Berg and Andreas Schwab for
reporting the problem and testing on powerpc.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
32-bit CHRP machines are now supported only in arch/powerpc, as are
all 64-bit PowerPC processors. This means that we don't use
Open Firmware on any platform in arch/ppc any more.
This makes PReP support a single-platform option like every other
platform support option in arch/ppc now, thus CONFIG_PPC_MULTIPLATFORM
is gone from arch/ppc. CONFIG_PPC_PREP is the option that selects
PReP support and is generally what has replaced
CONFIG_PPC_MULTIPLATFORM within arch/ppc.
_machine is all but dead now, being #defined to 0.
Updated Makefiles, comments and Kconfig options generally to reflect
these changes.
Signed-off-by: Paul Mackerras <paulus@samba.org>
set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().
This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes it possible to build kernels for PReP and/or CHRP
with ARCH=ppc by removing the (non-building) powermac support.
It's now also possible to select PReP and CHRP independently.
Powermac users should now build with ARCH=powerpc instead of
ARCH=ppc. (This does mean that it is no longer possible to
build a 32-bit kernel for a G5.)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently 8xx fails to boot due to endless pagefaults.
Seems the bug is exposed by the change which avoids flushing the
TLB when not necessary (in case the pte has not changed), introduced
recently:
__handle_mm_fault():
entry = pte_mkyoung(entry);
if (!pte_same(old_entry, entry)) {
ptep_set_access_flags(vma, address, pte, entry, write_access);
update_mmu_cache(vma, address, entry);
lazy_mmu_prot_update(entry);
} else {
/*
* This is needed only for protection faults but the arch code
* is not yet telling us if this is a protection fault or not.
* This still avoids useless tlb flushes for .text page faults
* with threads.
*/
if (write_access)
flush_tlb_page(vma, address);
}
The "update_mmu_cache()" call was unconditional before, which caused the TLB
to be flushed by:
if (pfn_valid(pfn)) {
struct page *page = pfn_to_page(pfn);
if (!PageReserved(page)
&& !test_bit(PG_arch_1, &page->flags)) {
if (vma->vm_mm == current->active_mm) {
#ifdef CONFIG_8xx
/* On 8xx, cache control instructions (particularly
* "dcbst" from flush_dcache_icache) fault as write
* operation if there is an unpopulated TLB entry
* for the address in question. To workaround that,
* we invalidate the TLB here, thus avoiding dcbst
* misbehaviour.
*/
_tlbie(address);
#endif
__flush_dcache_icache((void *) address);
} else
flush_dcache_icache_page(page);
set_bit(PG_arch_1, &page->flags);
}
Which worked to due to pure luck: PG_arch_1 was always unset before, but
now it isnt.
The root of the problem are the changes against the 8xx TLB handlers introduced
during v2.6. What happens is the TLBMiss handlers load the zeroed pte into
the TLB, causing the TLBError handler to be invoked (thats two TLB faults per
pagefault), which then jumps to the generic MM code to setup the pte.
The bug is that the zeroed TLB is not invalidated (the same reason
for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
The "two exception" approach requires a TLB flush (to nuke the zeroed TLB)
at each PTE update for correct behaviour:
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change the phys_mem_access_prot() function to take a pfn instead of an
address. This allows mmap64() to work on /dev/mem for addresses above 4G
on 32-bit architectures. We start with a pfn in mmap_mem(), so there's no
need to convert to an address; in fact, it's actively bad, since the
conversion can overflow when the address is above 4G.
Similarly fix the ppc32 page_is_ram() function to avoid a conversion to an
address by directly comparing to max_pfn. Working with max_pfn instead of
high_memory fixes page_is_ram() to give the right answer for highmem pages.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here is a new patch that removes all notion of the pmac, prep,
chrp and openfirmware initialization sections, and then unifies
the sections.h files without those __pmac, etc, sections identifiers
cluttering things up.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here is the default configuration for Marvell EV64360BP board. It has been
tested on the board.
Signed-off-by: Lee Nicks <allinux@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
flush_dcache_icache_page() will be called on an instruction page fault. We
can't sleep in the fault handler, so use kmap_atomic() instead of just
kmap() for the Book-E case.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On 8xx, in the case where a pagefault happens for a process who's not
the owner of the vma in question (ptrace for instance), the flush
operation is performed via the physical address.
Unfortunately, that results in a strange, unexplainable "icbi"
instruction fault, most likely due to a CPU bug (see oops below).
Avoid that by flushing the page via its kernel virtual address.
Oops: kernel access of bad area, sig: 11 [#2]
NIP: C000543C LR: C000B060 SP: C0F35DF0 REGS: c0f35d40 TRAP: 0300 Not tainted
MSR: 00009022 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 10
DAR: 00000010, DSISR: C2000000
TASK = c0ea8430[761] 'gdbserver' THREAD: c0f34000
Last syscall: 26
GPR00: 00009022 C0F35DF0 C0EA8430 00F59000 00000100 FFFFFFFF 00F58000
00000001
GPR08: C021DAEF C0270000 00009032 C0270000 22044024 10025428 01000800
00000001
GPR16: 007FFF3F 00000001 00000000 7FBC6AC0 00F61022 00000001 C0839300
C01E0000
GPR24: 00CD0889 C082F568 3000AC18 C02A7A00 C0EA15C8 00F588A9 C02ACB00
C02ACB00
NIP [c000543c] __flush_dcache_icache_phys+0x38/0x54
LR [c000b060] flush_dcache_icache_page+0x20/0x30
Call trace:
[c000b154] update_mmu_cache+0x7c/0xa4
[c005ae98] do_wp_page+0x460/0x5ec
[c005c8a0] handle_mm_fault+0x7cc/0x91c
[c005ccec] get_user_pages+0x2fc/0x65c
[c0027104] access_process_vm+0x9c/0x1d4
[c00076e0] sys_ptrace+0x240/0x4a4
[c0002bd0] ret_from_syscall+0x0/0x44
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The proposed _tlbie call at update_mmu_cache() is safe because:
Addresses for which update_mmu_cache() gets invocated are never inside the
static kernel virtual mapping, meaning that there is no risk for the
_tlbie() here to be thrashing the pinned entry, as Dan suspected.
The intermediate TLB state in which this bug can be triggered is not
visible by userspace or any other contexts, except the page fault handling
path. So there is no need to worry about userspace dcbxxx users.
The other solution to this is to avoid dcbst misbehaviour in the first
place, which involves changing in-kernel "dcbst" callers to use 8xx
specific SPR's.
Summary:
On 8xx, cache control instructions (particularly "dcbst" from
flush_dcache_icache) fault as write operation if there is an unpopulated
TLB entry for the address in question. To workaround that, we invalidate
the TLB here, thus avoiding dcbst misbehaviour.
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch kills the whole embedded System.map mecanism and the
bootloader-passed System.map that was used to provide symbol resolution in
xmon. Instead, xmon now uses kallsyms like ppc64 does.
No hurry getting that in Linus tree, let it be tested in -mm for a while
first and make sure it doesn't break various embedded configs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove PG_highmem, to save a page flag. Use is_highmem() instead. It'll
generate a little more code, but we don't use PageHigheMem() in many places.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On ppc32, the platform code can supply a "progress" function that is
used to show progress through the boot. These functions are usually
in an init section and so can't be called after the init pages are
freed. Now that the cpu bringup code can be called after the system
is booted (for hotplug cpu) we can get the situation where the
progress function can be called after boot. The simple fix is to set
the progress function pointer to NULL when the init pages are freed,
and that is what this patch does (note that all callers already check
whether the function pointer is NULL before trying to call it).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!