timer_interrupt() was calculating per_cpu_offset several times, having to
start from the TOC each time because of potential aliasing issues.

Placing both decrementer per_cpu variables in a struct and calculating
the address once with __get_cpu_var results in better code on both
32-bit and 64-bit.
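
As a rough userspace illustration of the idea (not the kernel code
itself): the sketch below groups two per-CPU values in one struct and
resolves the per-CPU address a single time per tick. The names
decrementer_state, this_cpu_state(), timer_tick() and NR_CPUS are
hypothetical, and the array lookup only stands in for what
__get_cpu_var does in the kernel.

```c
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Both per-CPU values live in one struct, so one address
 * calculation gives access to both fields. */
struct decrementer_state {
    uint64_t next_tb;     /* next timebase value to fire at */
    uint32_t event_count; /* number of ticks handled */
};

/* Stand-in for a per-CPU variable: one slot per CPU. */
static struct decrementer_state decrementers[NR_CPUS];

/* Stand-in for __get_cpu_var: resolve this CPU's copy. */
static struct decrementer_state *this_cpu_state(int cpu)
{
    return &decrementers[cpu];
}

/* The per-CPU address is computed once and reused for every field,
 * instead of being recomputed for each separate variable. */
static void timer_tick(int cpu, uint64_t now, uint64_t period)
{
    struct decrementer_state *st = this_cpu_state(cpu);

    st->event_count++;
    st->next_tb = now + period;
}
```

With separate per-CPU variables, each access would need its own
offset calculation; with the struct, the compiler can keep the single
pointer in a register across both updates.
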
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>