This follows the ARM change from Aaro Koskinen:
When unmapping N pages (e.g. shared memory) the amount of TLB
flushes done can be (N*PAGE_SIZE/ZAP_BLOCK_SIZE)*N although it
should be N at maximum. With PREEMPT kernel ZAP_BLOCK_SIZE is 8
pages, so there is a noticeable performance penalty when
unmapping a large VMA and the system is spending its time in
flush_tlb_range().
The problem is that tlb_end_vma() is always flushing the full VMA
range. The subrange that needs to be flushed can be calculated by
tlb_remove_tlb_entry(). This approach was suggested by Hugh
Dickins, and is also used by other arches.
The speed increase is roughly 3x for 8M mappings and for larger
mappings even more.
Bits and peices are taken from the ARM patch as well as the existing
arch/um implementation that is quite similar.
The end result is a significant reduction in both partial and full TLB
flushes initiated through flush_tlb_range().
At the same time, the nommu implementation was broken, had a superfluous
cache flush, and subsequently would have triggered a BUG_ON() if a
code-path had triggered it. Tidy this up for correctness and provide a
nopped-out implementation there.
More background on the initial discussion can be found at:
http://marc.info/?t=123609820900002&r=1&w=2http://marc.info/?t=123660375800003&r=1&w=2
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for extended ASIDs (up to 16-bits) on newer SH-X3 cores
that implement the PTAEX register and respective functionality. Presently
only the 65nm SH7786 (90nm only supports legacy 8-bit ASIDs).
The main change is in how the PTE is written out when loading the entry
in to the TLB, as well as in how the TLB entry is selectively flushed.
While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
arrays for extra bits, extended ASID mode splits out the address arrays.
While we don't use the memory-mapped data array access, the address
array accesses are necessary for selective TLB flushes, so these are
implemented newly and replace the generic SH-4 implementation.
With this, TLB flushes in switch_mm() are almost non-existent on newer
parts.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains CONFIG_SUSPEND support to the SuperH
architecture. If enabled, SuperH Mobile processors will
register their suspend callbacks during boot.
To suspend, use "echo mem > /sys/power/state". To allow
wakeup, make sure "/sys/device/platform/../power/wakeup"
contains "enabled". Additional per-device driver patches
are most likely needed.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds the clk_set_parent/clk_get_parent routines to the sh
clock framework.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds DMA support for newer SH-4A CPUs, particularly SH7763/64/80/85.
This also enables multi IRQ support for platforms that have multiple
vectors bound to the same IRQ source.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a method for supporting fixed PMB mappings inherited from
the bootloader, as an alternative to the dynamic PMB mapping currently
used by the kernel. In the future these methods will be combined.
P1/P2 area is handled like a regular 29-bit physical address, and local
bus device are assigned P3 area addresses.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add Suspend-to-disk / swsusp / CONFIG_HIBERNATION support
to the SuperH architecture.
To suspend, use "swapon /dev/sda2; echo disk > /sys/power/state"
To resume, pass "resume=/dev/sda2" on the kernel command line.
The patch "pm: rework includes, remove arch ifdefs V2" is
needed to allow the generic swsusp code to build properly.
Hibernation is not enabled with this patch though, a patch
setting ARCH_HIBERNATION_POSSIBLE will be submitted later.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds preliminary support for the SH7786-based Urquell board.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds preliminary support for the SH7786 CPU subtype.
While this is a dual-core CPU, only UP is supported for now. L2 cache
support is likewise not yet implemented.
More information on this particular CPU subtype is available at:
http://www.renesas.com/fmwk.jsp?cnt=sh7786_root.jsp&fp=/products/mpumcu/superh_family/sh7780_series/sh7786_group/
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The SH-3 does not support 'pref'-based prefetching, only SH-2A and SH-4A
parts do. Remove SH-3 from the list.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Prefetch early exception data. There is unused space in our
exception handler cache line anyway, so this is almost free.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove EXPEVT vector from the stack, lookup_exception_vector()
for sh3/sh4/sh4a is already using k2 to get the vector.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Rework and simplify the sched_clock and clocksource code. Instead
of registering the clocksource in a shared file we move it into the
tmu driver. Also, add code to handle sched_clock in the case of no
clocksource.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Fix macros start_thread and user_stack_pointer.
When these macros aren't called with a variable named regs as second
argument, this will result in a build failure.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that atomic_t is a generic opaque type for all architectures, it is
unwise to use intimate knowledge of its internals when manipulating it.
Instead of relying on the "counter" member being at offset 0 from the
beginning of an atomic_t, explicitly reference the member. This guards
us from any changes to the layout of the beginning of the atomic_t type.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When dereferencing the memory address contained in a register and
modifying the value at that memory address, the register should not be
listed in the inline asm outputs. The value at the memory address is an
output (which is taken care of with the "memory" clobber), not the register.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This corrects a deadlock encountered on ap325 in the cases where the
mutex is contended and the slow-path needs to be fallen back upon.
Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com>
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The T-bit manipulation for syscall error checking had the side effect of
spuriously returning ERESTART* errno values over EINTR. So, we simplify
the error checking a bit and leave the T-bit alone.
Reported-by: Kaz Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch updates the SuperH gpio code to make use of gpiolib. The
gpiolib callbacks get() and set() are lockless, but we use our own
spinlock for the other operations to make sure hardware register
bitfield accesses stay atomic.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch optimizes the gpio data register handling for gpio_set_value().
Instead of using the good old spinlock-plus-read-modify-write strategy
we now use a shadow register and atomic operations.
This improves the bitbanging mmc performance on Migo-R from 26 Kbytes/s
to 40 Kbytes/s.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch modifies the table based SuperH gpio implementation to
make use of direct table lookups. With this change the functions
gpio_get_value() and gpio_set_value() are O(1).
Tested on Migo-R using bitbanging mmc. Performance is improved from
11 KBytes/s to 26 Kbytes/s.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Bring sh in line with all the other ports. Not sure how sh missed this
change as all the other arches were being updated ...
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[CVE-2009-0029] s390 specific system call wrappers
[CVE-2009-0029] System call wrappers part 33
[CVE-2009-0029] System call wrappers part 32
[CVE-2009-0029] System call wrappers part 31
[CVE-2009-0029] System call wrappers part 30
[CVE-2009-0029] System call wrappers part 29
[CVE-2009-0029] System call wrappers part 28
[CVE-2009-0029] System call wrappers part 27
[CVE-2009-0029] System call wrappers part 26
[CVE-2009-0029] System call wrappers part 25
[CVE-2009-0029] System call wrappers part 24
[CVE-2009-0029] System call wrappers part 23
[CVE-2009-0029] System call wrappers part 22
[CVE-2009-0029] System call wrappers part 21
[CVE-2009-0029] System call wrappers part 20
[CVE-2009-0029] System call wrappers part 19
[CVE-2009-0029] System call wrappers part 18
[CVE-2009-0029] System call wrappers part 17
[CVE-2009-0029] System call wrappers part 16
[CVE-2009-0029] System call wrappers part 15
...
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove __attribute__((weak)) from common code sys_pipe implemantation.
IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations
with the same name. Just rename them.
For sys_pipe2 there is no architecture specific implementation.
Cc: Richard Henderson <rth@twiddle.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Make VMAs per mm_struct as for MMU-mode linux. This solves two problems:
(1) In SYSV SHM where nattch for a segment does not reflect the number of
shmat's (and forks) done.
(2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
that a VMA might be shared and already have its vm_mm assigned to another
process or a dead process.
A new struct (vm_region) is introduced to track a mapped region and to remember
the circumstances under which it may be shared and the vm_list_struct structure
is discarded as it's no longer required.
This patch makes the following additional changes:
(1) Regions are now allocated with alloc_pages() rather than kmalloc() and
with no recourse to __GFP_COMP, so the pages are not composite. Instead,
each page has a reference on it held by the region. Anything else that is
interested in such a page will have to get a reference on it to retain it.
When the pages are released due to unmapping, each page is passed to
put_page() and will be freed when the page usage count reaches zero.
(2) Excess pages are trimmed after an allocation as the allocation must be
made as a power-of-2 quantity of pages.
(3) VMAs are added to the parent MM's R/B tree and mmap lists. As an MM may
end up with overlapping VMAs within the tree, the VMA struct address is
appended to the sort key.
(4) Non-anonymous VMAs are now added to the backing inode's prio list.
(5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
the backing region. The VMA and region structs will be split if
necessary.
(6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
segment instead of all the attachments at that addresss. Multiple
shmat()'s return the same address under NOMMU-mode instead of different
virtual addresses as under MMU-mode.
(7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
(8) /proc/maps is now the global list of mapped regions, and may list bits
that aren't actually mapped anywhere.
(9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
of RAM currently allocated by mmap to hold mappable regions that can't be
mapped directly. These are copies of the backing device or file if not
anonymous.
These changes make NOMMU mode more similar to MMU mode. The downside is that
NOMMU mode requires some extra memory to track things over NOMMU without this
patch (VMAs are no longer shared, and there are now region structs).
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
The atomic_t type cannot currently be used in some header files because it
would create an include loop with asm/atomic.h. Move the type definition
to linux/types.h to break the loop.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: New APIs
The old node_to_cpumask/node_to_pcibus returned a cpumask_t: these
return a pointer to a struct cpumask. Part of removing cpumasks from
the stack.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul Mundt <lethal@linux-sh.org>
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.
What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We don't really want this enabled by default, but it is still quite
useful for debugging. So, make it conditional and leave it off by
default.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that the rest of the boards that were using cf-enabler "generically"
have switched to setting up their mappings on their own, only the mach-se
boards were left using it. All of the cf-enabler using mach-se boards
use a special initialization of the MRSHPC windows rather than going
through the special PTE as other SH-4 platforms do. This consolidates
the MRSHPC setup logic, hooks it up on the boards that care, and gets rid
of any and all remaining references to cf-enabler.
This has been long overdue, as cf-enabler has been the bane of
arch/sh/kernel for the last 7 years. Good riddance.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Nothing is using this any more, so get rid of it before anyone gets the
bright idea to start using it again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the reworked kgdb support, we always detach and reinitialize the
stub. This was mostly a feature for handoffs between sh-ipl+g and the
kgdb stub, but virtually no sh-ipl+g versions ever had this working
right in the first place.
Given that the sh-ipl+g stubs in general use today don't even support
the GDB stub, and we have already killed off the special casing in the
sh-sci serial driver, kill off this now unused symbol too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This migrates from the old bitrotted kgdb stub implementation and moves
to the generic stub. In the process support for SH-2/SH-2A is also added,
which the old stub never provided.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>