linux/arch/powerpc/kernel
Anton Blanchard ae01f84b93 powerpc: Optimise per cpu accesses on 64bit
Now that we dynamically allocate the paca array, it takes an extra load
whenever we want to access another cpu's paca. One place we do that a lot
is per cpu variables. A simple example:

DEFINE_PER_CPU(unsigned long, vara);
unsigned long test4(int cpu)
{
	return per_cpu(vara, cpu);
}

This takes 4 loads, 5 if you include the actual load of the per cpu variable:

    ld r11,-32760(r30)  # load address of paca pointer
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,9       # get offset into paca (each entry is 512 bytes)
    ld r0,0(r11)        # load paca pointer
    add r3,r0,r3        # paca + offset
    ld r11,64(r3)       # load paca[cpu].data_offset

    ldx r3,r9,r11       # load per cpu variable

If we remove the ppc64 specific per_cpu_offset(), we get the generic one
which indexes into a statically allocated array. This removes one load and
one add:

    ld r11,-32760(r30)  # load address of __per_cpu_offset
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,3       # get offset into __per_cpu_offset (each entry 8 bytes)
    ldx r11,r11,r3      # load __per_cpu_offset[cpu]

    ldx r3,r9,r11       # load per cpu variable
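
The two lookups above can be modelled in plain C. This is only a userspace sketch, not the kernel code: `struct paca_model`, `per_cpu_offset_model`, `offset_via_paca()` and `offset_via_table()` are hypothetical names, and the layout (512 byte entries, data_offset at byte 64) is taken from the comments in the first listing.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary size for the sketch */

/* Hypothetical stand-in for the paca; the real structure has many more fields. */
struct paca_model {
	char pad[64];			/* data_offset sits at byte 64 in the listing */
	unsigned long data_offset;	/* per cpu offset for this cpu */
	char rest[512 - 64 - sizeof(unsigned long)];	/* pad entry to 512 bytes */
};

static struct paca_model paca_storage[NR_CPUS];

/* Reached through a pointer, as with a dynamically allocated paca array. */
static struct paca_model *paca = paca_storage;

/* Models the statically allocated array the generic code indexes. */
static unsigned long per_cpu_offset_model[NR_CPUS];

/* Old path: load the paca pointer, scale cpu by 512, add, load data_offset. */
static unsigned long offset_via_paca(int cpu)
{
	return paca[cpu].data_offset;
}

/* New path: scale cpu by sizeof(unsigned long), one indexed load. */
static unsigned long offset_via_table(int cpu)
{
	return per_cpu_offset_model[cpu];
}
```

The extra load in the old path comes from `paca` being a pointer that has to be fetched before it can be indexed, whereas the array's address is known at link time.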

Having all the offsets in one array also helps when iterating over a per cpu
variable across a number of cpus, such as in the scheduler. Before, we needed
to load one paca cacheline when calculating each per cpu offset. Now we
have 16 (128 / sizeof(long)) per cpu offsets in each cacheline.
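
The cacheline arithmetic can be checked with a small sketch. It assumes 128 byte cachelines and 8 byte longs as stated above, 512 byte paca entries as in the first listing, and an array that starts on a cacheline boundary; `cachelines_touched()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES	128	/* cacheline size used in the text above */
#define PACA_STRIDE	512	/* each paca entry is 512 bytes */
#define OFFSET_STRIDE	8	/* sizeof(long) on 64bit */

/*
 * Count the distinct cachelines touched when reading one value per cpu
 * from an array whose elements are 'stride' bytes apart. Assumes the
 * array is cacheline aligned and each value fits within one cacheline.
 */
static size_t cachelines_touched(size_t ncpus, size_t stride)
{
	size_t lines = 0;
	size_t last_line = (size_t)-1;	/* no cacheline visited yet */
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		size_t line = (cpu * stride) / CACHELINE_BYTES;

		if (line != last_line) {
			lines++;
			last_line = line;
		}
	}
	return lines;
}
```

Under these assumptions, walking 64 cpus touches 64 cachelines with the paca stride and 4 with the offset array stride.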

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:28:30 +10:00
vdso32 powerpc: Rework VDSO gettimeofday to prevent time going backwards 2010-07-09 11:26:16 +10:00
vdso64 powerpc: Rework VDSO gettimeofday to prevent time going backwards 2010-07-09 11:26:16 +10:00
.gitignore powerpc: Ignore generated vmlinux.lds in git 2008-10-07 14:26:18 +11:00
align.c powerpc: Handle VSX alignment faults correctly in little-endian mode 2009-12-18 14:55:43 +11:00
asm-offsets.c powerpc: Optimise per cpu accesses on 64bit 2010-07-09 11:28:30 +10:00
audit.c [PATCH] audit signal recipients 2007-05-11 05:38:25 -04:00
btext.c powerpc: Use the common ascii hex helpers 2008-08-20 16:34:57 +10:00
cacheinfo.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
cacheinfo.h powerpc: Rewrite sysfs processor cache info code 2009-01-08 16:25:10 +11:00
clock.c [POWERPC] clk.h interface for platforms 2007-10-03 09:11:56 +10:00
compat_audit.c [PATCH] add SIGNAL syscall class (v3) 2007-05-11 05:38:25 -04:00
cpu_setup_6xx.S powerpc: Use names rather than numbers for SPRGs (v2) 2009-08-20 10:12:27 +10:00
cpu_setup_44x.S AMCC PPC 460SX redwood SoC platform initial framework 2009-02-14 14:41:29 -05:00
cpu_setup_fsl_booke.S powerpc/fsl-booke: Enable L1 cache on e500v1/e500v2/e500mc CPUs 2009-06-15 21:45:30 -05:00
cpu_setup_pa6t.S
cpu_setup_ppc970.S powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bit 2008-09-15 11:08:35 -07:00
cputable.c powerpc/e500mc: Implement machine check handler. 2010-05-21 07:41:52 -05:00
crash_dump.c powerpc: Unify opcode definitions and support 2009-02-23 10:48:56 +11:00
crash.c powerpc: Fix default_machine_crash_shutdown #ifdef botch 2010-07-08 18:11:45 +10:00
dbell.c powerpc: Add support for using doorbells for SMP IPI 2009-02-23 15:53:03 +11:00
dma-iommu.c powerpc: Change archdata dma_data to a union 2009-09-24 15:31:43 +10:00
dma-swiotlb.c powerpc: remove unnecessary sync_single_range_* in swiotlb_dma_ops 2010-05-27 09:12:52 -07:00
dma.c powerpc: remove unnecessary sync_single_range_* in swiotlb_dma_ops 2010-05-27 09:12:52 -07:00
e500-pmu.c powerpc/perf: e500 support 2010-03-05 03:04:08 -06:00
entry_32.S powerpc/47x: Base ppc476 support 2010-05-05 09:11:10 -04:00
entry_64.S powerpc/perf_event: Fix oops due to perf_event_do_pending call 2010-05-12 14:34:00 +10:00
exceptions-64e.S powerpc/book3e-64: Remove duplicated #include 2009-09-24 15:31:41 +10:00
exceptions-64s.S powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors 2010-06-22 19:40:50 +10:00
firmware.c powerpc: Make powerpc_firmware_features __read_mostly 2010-02-09 13:56:07 +11:00
fpu.S powerpc: Use names rather than numbers for SPRGs (v2) 2009-08-20 10:12:27 +10:00
fsl_booke_entry_mapping.S powerpc/kexec: Add support for FSL-BookE 2010-05-24 21:25:32 -05:00
ftrace.c Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-20 10:56:46 -07:00
head_8xx.S powerpc/8xx: Use SPRG2 and DAR registers to stash r11 and cr. 2010-04-07 18:00:34 +10:00
head_32.S KVM: PPC: Add KVM intercept handlers 2010-05-17 12:18:52 +03:00
head_40x.S powerpc: Use names rather than numbers for SPRGs (v2) 2009-08-20 10:12:27 +10:00
head_44x.S powerpc/4xx: Simple platform for the ISS 4xx simulator 2010-05-05 11:11:56 -04:00
head_64.S KVM: PPC: Name generic 64-bit code generic 2010-05-17 12:18:14 +03:00
head_booke.h powerpc/booke: Add Stack Marking support to Booke Exception Prolog 2010-05-05 08:01:52 -04:00
head_fsl_booke.S powerpc/kexec: Add support for FSL-BookE 2010-05-24 21:25:32 -05:00
hw_breakpoint.c powerpc, hw_breakpoint: Tell generic code we have no instruction breakpoints 2010-06-30 13:54:58 +10:00
ibmebus.c of: Remove duplicate fields from of_platform_driver 2010-05-22 00:10:40 -06:00
idle_6xx.S powerpc: Fix for getting CPU number in power_save_ppc32_restore() 2008-09-03 20:53:47 +10:00
idle_e500.S powerpc: Fix for getting CPU number in power_save_ppc32_restore() 2008-09-03 20:53:47 +10:00
idle_power4.S
idle.c sysctl: Drop & in front of every proc_handler. 2009-11-18 08:37:40 -08:00
init_task.c Use new __init_task_data macro in arch init_task.c files. 2009-09-21 06:27:08 +02:00
io.c powerpc: tiny memcpy_(to|from)io optimisation 2009-11-04 16:43:12 -07:00
iomap.c [POWERPC] Add 64-bit resources support to pci_iomap 2007-09-20 07:36:52 -05:00
iommu.c powerpc: Remove unused 'protect4gb' boot parameter 2010-05-21 17:31:13 +10:00
irq.c powerpc: Fix logic error in fixup_irqs 2010-07-08 18:11:44 +10:00
isa-bridge.c [POWERPC] Remove leftover printk in isa-bridge.c 2008-05-09 20:22:59 +10:00
kgdb.c powerpc,kgdb: Introduce low level trap catching 2010-05-20 21:04:25 -05:00
kprobes.c powerpc/kprobes: Remove resume_execution() in kprobes 2010-06-02 17:50:37 +10:00
l2cr_6xx.S Convert files to UTF-8 and some cleanups 2007-10-19 23:21:04 +02:00
legacy_serial.c Fix spelling of 'platform' in comments and doc 2010-02-05 12:22:34 +01:00
lparcfg.c powerpc/pseries: Export data from new hcall H_EM_GET_PARMS 2010-04-07 18:00:29 +10:00
machine_kexec_32.c kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE 2008-08-15 08:35:42 -07:00
machine_kexec_64.c Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
machine_kexec.c powerpc: Allow mem=x cmdline to work with 4G+ 2009-05-15 16:43:41 +10:00
Makefile powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors 2010-06-22 19:40:50 +10:00
misc_32.S powerpc: Unconditionally enabled irq stacks 2010-06-15 15:02:37 +10:00
misc_64.S powerpc: Unconditionally enabled irq stacks 2010-06-15 15:02:37 +10:00
misc.S perf: Always build the powerpc perf_arch_fetch_caller_regs version 2010-04-03 12:42:00 +02:00
module_32.c powerpc/ppc32: ftrace, dynamic ftrace to handle modules 2008-11-20 10:52:53 -08:00
module_64.c powerpc: Unify opcode definitions and support 2009-02-23 10:48:56 +11:00
module.c module: cleanup FIXME comments about trimming exception table entries. 2009-06-12 21:47:05 +09:30
mpc7450-pmu.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
msi.c powerpc/PCI: include pci.h in powerpc MSI implementation 2009-03-25 08:54:29 -07:00
nvram_64.c arch/powerpc: Fix continuation line formats 2010-02-09 13:55:05 +11:00
of_device.c arch/powerpc: Move dma_mask from of_device into pdev_archdata 2010-05-22 00:10:40 -06:00
of_platform.c of: Remove duplicate fields from of_platform_driver 2010-05-22 00:10:40 -06:00
paca.c powerpc/kexec: Fix race in kexec shutdown 2010-05-21 17:31:11 +10:00
pci_32.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
pci_64.c of: add 'of_' prefix to machine_is_compatible() 2010-02-09 08:33:00 -07:00
pci_dn.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
pci_of_scan.c powerpc/pci: Check devices status property when scanning OF tree 2010-05-21 17:31:09 +10:00
pci-common.c PCI: clear bridge resource range if BIOS assigned bad one 2010-06-11 13:24:51 -07:00
perf_callchain.c perf: Fix inconsistency between IP and callchain sampling 2010-01-28 14:31:20 +01:00
perf_event_fsl_emb.c powerpc/perf: e500 support 2010-03-05 03:04:08 -06:00
perf_event.c powerpc/perf_event: Fix for power_pmu_disable() 2010-07-08 18:11:37 +10:00
pmc.c powerpc: Convert pmc_owner_lock to raw_spinlock 2010-02-19 14:52:33 +11:00
power4-pmu.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
power5-pmu.c powerpc: perf_event: Enable SDAR in continous sample mode 2009-10-28 16:13:02 +11:00
power5+-pmu.c powerpc: perf_event: Enable SDAR in continous sample mode 2009-10-28 16:13:02 +11:00
power6-pmu.c powerpc: perf_event: Enable SDAR in continous sample mode 2009-10-28 16:13:02 +11:00
power7-pmu.c powerpc: perf_event: Enable SDAR in continous sample mode 2009-10-28 16:13:02 +11:00
ppc32.h powerpc: Add VSX context save/restore, ptrace and signal support 2008-07-01 11:28:50 +10:00
ppc970-pmu.c powerpc: perf_event: Enable SDAR in continous sample mode 2009-10-28 16:13:02 +11:00
ppc_ksyms.c powerpc: Don't export cvt_fd & _df when CONFIG_PPC_FPU is not set 2010-05-31 11:51:54 +10:00
ppc_save_regs.S powerpc: Prepare xmon_save_regs for use with kdump 2008-12-23 15:13:28 +11:00
proc_powerpc.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
process.c Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
prom_init_check.sh powerpc: Fix compile errors in prom_init_check for gcc 4.5 2010-07-08 18:11:39 +10:00
prom_init.c powerpc: Linux cannot run with 0 cores 2010-07-08 18:11:42 +10:00
prom_parse.c powerpc: Fix of_node_put() exit path in of_irq_map_one() 2009-04-20 12:18:43 -06:00
prom.c powerpc: Dynamically allocate pacas 2010-03-09 11:52:52 +11:00
ptrace32.c headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
ptrace.c powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors 2010-06-22 19:40:50 +10:00
reloc_64.S powerpc: Make the 64-bit kernel as a position-independent executable 2008-09-15 11:08:38 -07:00
rtas_flash.c powerpc: rtas_flash needs to use rtas_data_buf 2010-06-15 15:02:37 +10:00
rtas_pci.c powerpc/pci: Clean up direct access to sysdata by RTAS 2009-05-21 15:44:23 +10:00
rtas-proc.c powerpc: Move /proc/ppc64 to /proc/powerpc update 2010-01-15 13:26:17 +11:00
rtas-rtc.c
rtas.c powerpc/pseries: Migration code reorganization / hibernation prep 2010-07-09 11:26:17 +10:00
rtasd.c powerpc/rtasd: Don't start event scan if scan rate is zero 2010-05-21 17:29:39 +10:00
setup_32.c powerpc: Unconditionally enabled irq stacks 2010-06-15 15:02:37 +10:00
setup_64.c powerpc: Optimise per cpu accesses on 64bit 2010-07-09 11:28:30 +10:00
setup-common.c powerpc/cpumask: Update some comments 2010-05-06 17:41:59 +10:00
setup.h
signal_32.c powerpc/booke: Add support for advanced debug registers 2010-02-17 14:03:17 +11:00
signal_64.c powerpc: Sanitize stack pointer in signal handling code 2009-03-27 16:58:24 +11:00
signal.c powerpc, hw_breakpoint: Enable hw-breakpoints while handling intervening signals 2010-06-22 19:40:50 +10:00
signal.h powerpc: Sanitize stack pointer in signal handling code 2009-03-27 16:58:24 +11:00
smp-tbsync.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
smp.c powerpc: Clean up obsolete code relating to decrementer and timebase 2010-07-09 11:26:16 +10:00
softemu8xx.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
stacktrace.c powerpc: Removed duplicated include in stacktrace.c 2008-07-28 16:30:47 +10:00
suspend.c PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures 2008-07-24 10:47:21 -07:00
swsusp_32.S powerpc/swsusp_32: Fix TLB invalidation 2010-01-15 13:20:07 +11:00
swsusp_64.c [POWERPC] powermac: Suspend to disk on G5 2007-05-07 20:31:14 +10:00
swsusp_asm64.S powerpc: Fix 64-bit hibernation with 64k pages 2008-10-07 14:26:20 +11:00
swsusp_booke.S powerpc/fsl-booke: Add hibernation support for FSL BookE processors 2010-05-21 07:41:53 -05:00
swsusp.c powerpc/mm: Split mmu_context handling 2008-12-21 14:21:15 +11:00
sys_ppc32.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
syscalls.c Add generic sys_olduname() 2010-03-12 15:52:32 -08:00
sysfs.c powerpc: Use smt_snooze_delay=-1 to always busy loop 2010-05-21 17:31:12 +10:00
systbl_chk.c [POWERPC] Fix a couple of copyright symbols 2008-01-25 22:52:50 +11:00
systbl_chk.sh [POWERPC] Fix a couple of copyright symbols 2008-01-25 22:52:50 +11:00
systbl.S [POWERPC] Align the sys_call_table 2007-10-11 14:36:47 +10:00
tau_6xx.c tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
time.c powerpc: Clean up obsolete code relating to decrementer and timebase 2010-07-09 11:26:16 +10:00
traps.c powerpc, hw_breakpoint: Handle concurrent alignment interrupts 2010-06-22 19:40:50 +10:00
udbg_16550.c trivial: fix typo "for for" in multiple files 2009-09-21 15:14:54 +02:00
udbg.c powerpc: gamecube/wii: early debugging using usbgecko 2009-12-12 22:24:31 -07:00
vdso.c tree-wide: fix a very frequent spelling mistake 2009-11-09 09:40:54 +01:00
vecemu.c
vector.S powerpc: Fix usage of 64-bit instruction in 32-bit altivec code 2009-12-09 18:10:12 +11:00
vio.c Merge remote branch 'origin' into secretlab/next-devicetree 2010-05-22 00:36:56 -06:00
vmlinux.lds.S Rename .data.read_mostly to .data..read_mostly. 2010-03-03 11:26:00 +01:00