linux/arch/x86/include/asm
Linus Torvalds 34ddc81a23 i387: re-introduce FPU state preloading at context switch time
After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870ef3 ("i387:
do not preload FPU state at task switch time").

However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably

 - properly abstracted as a true FPU state switch, rather than as
   open-coded save and restore with various hacks.

   In particular, implementing it as a proper FPU state switch allows us
   to optimize the CR0.TS flag accesses: there is no reason to set the
   TS bit only to then almost immediately clear it again.  CR0 accesses
   are quite slow and expensive, don't flip the bit back and forth for
   no good reason.

 - Make sure that the same model works for both x86-32 and x86-64, so
   that there are no gratuitous differences between the two due to the
   way they save and restore segment state differently due to
   architectural differences that really don't matter to the FPU state.

 - Avoid exposing the "preload" state to the context switch routines,
   and in particular allow the concept of lazy state restore: if nothing
   else has used the FPU in the meantime, and the process is still on
   the same CPU, we can avoid restoring state from memory entirely, just
   re-expose the state that is still in the FPU unit.

   That optimized lazy restore isn't actually implemented here, but the
   infrastructure is set up for it.  Of course, older CPU's that use
   'fnsave' to save the state cannot take advantage of this, since the
   state saving also trashes the state.

In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage.  Hopefully it's easier to
follow as a result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-18 14:03:48 -08:00
..
numachip x86: Add NumaChip support 2011-12-05 17:17:24 +01:00
uv x86/uv: Fix uv_gpa_to_soc_phys_ram() shift 2012-01-26 10:58:27 +01:00
visws
xen Merge branch 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen 2011-11-06 20:15:05 -08:00
a.out-core.h
a.out.h
acpi.h Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 2011-05-29 11:18:09 -07:00
aes.h
agp.h
alternative-asm.h x86: Fix atomic64_xxx_cx8() functions 2012-01-04 15:01:56 +01:00
alternative.h asm alternatives: remove incorrect alignment notes 2011-09-15 13:28:33 -07:00
amd_nb.h x86/PCI: amd: factor out MMCONFIG discovery 2012-01-06 12:11:19 -08:00
apb_timer.h Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-07-23 10:34:47 -07:00
apic_flat_64.h x86: Make flat_init_apic_ldr() available 2011-12-05 17:17:07 +01:00
apic.h x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS 2011-12-23 11:01:43 -08:00
apicdef.h x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping 2011-12-23 11:01:01 -08:00
apm.h
arch_hweight.h
archrandom.h x86, random: Verify RDRAND functionality and allow it to be disabled 2011-07-31 14:02:19 -07:00
asm-offsets.h
asm.h x86: Fix write lock scalability 64-bit issue 2011-07-21 09:03:36 +02:00
atomic64_32.h x86, atomic: atomic64_read() take a const pointer 2012-01-09 19:33:24 -08:00
atomic64_64.h x86: Use xadd helper more widely 2011-08-29 13:44:12 -07:00
atomic.h x86: Use xadd helper more widely 2011-08-29 13:44:12 -07:00
auxvec.h
bios_ebda.h x86: Better comments for get_bios_ebda() 2011-04-29 14:13:15 -07:00
bitops.h x86_64, asm: Optimise fls(), ffs() and fls64() 2011-12-15 15:16:49 -08:00
bitsperlong.h
boot.h x86: support XZ-compressed kernel 2011-01-13 08:03:25 -08:00
bootparam.h x86: Add missing bzImage fields to struct setup_header 2011-12-09 17:35:33 -08:00
bug.h
bugs.h
byteorder.h
cache.h
cacheflush.h x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
calgary.h
calling.h x86, asm: Flip RESTORE_ARGS arguments logic 2011-06-03 14:38:53 -07:00
ce4100.h x86: ce4100: Set pci ops via callback instead of module init 2011-03-14 15:13:23 +01:00
checksum_32.h
checksum_64.h
checksum.h
clocksource.h clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option 2011-07-21 13:34:05 -07:00
cmpxchg_32.h x86: Fix and improve cmpxchg_double{,_local}() 2012-01-04 15:01:54 +01:00
cmpxchg_64.h x86: Fix and improve cmpxchg_double{,_local}() 2012-01-04 15:01:54 +01:00
cmpxchg.h x86: Properly parenthesize cmpxchg() macro arguments 2012-01-26 21:18:29 +01:00
compat.h compat: sync compat_stats with statfs. 2011-10-28 14:58:53 +02:00
cpu.h x86: Fix mwait_usable section mismatch 2011-02-14 12:08:28 +01:00
cpufeature.h x86/amd: Add missing feature flag for fam15h models 10h-1fh processors 2012-01-26 12:06:38 +01:00
cpumask.h
cputime.h
current.h
debugreg.h x86: Add counter when debug stack is used with interrupts enabled 2011-12-21 15:38:56 -05:00
delay.h asm-generic: move archictures to common delay.h 2011-07-22 18:46:24 +02:00
desc_defs.h
desc.h x86: Keep current stack in NMI breakpoints 2011-12-21 15:38:55 -05:00
device.h iommu: Rename the DMAR and INTR_REMAP config options 2011-09-21 10:22:03 +02:00
div64.h x86/div64: Add a micro-optimization shortcut if base is power of two 2011-12-05 18:16:11 +01:00
dma-mapping.h doc: fix broken references 2011-09-27 18:08:04 +02:00
dma.h x86, NUMA: Enable emulation on 32bit too 2011-05-02 17:24:48 +02:00
dmi.h
dwarf2.h x86-64: Fix CFI data for interrupt frames 2011-09-28 19:04:52 +02:00
e820.h Revert "x86, efi: Calling __pa() with an ioremap()ed address is invalid" 2011-12-12 18:25:56 +01:00
edac.h
efi.h Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-11 19:12:33 -08:00
elf.h x86, amd: Avoid cache aliasing penalties on AMD family 15h 2011-08-05 12:26:44 -07:00
emergency-restart.h
entry_arch.h x86, mce: Replace MCE_SELF_VECTOR by irq_work 2011-06-16 12:10:08 +02:00
errno.h
fb.h
fcntl.h
fixmap.h x86/intel config: Revamp configuration to allow for Moorestown and Medfield 2011-12-18 09:17:02 +01:00
floppy.h
frame.h x86: Unify rwlock assembly implementation 2011-07-21 09:03:31 +02:00
ftrace.h ftrace/x86: mcount offset calculation 2011-05-16 14:55:57 -04:00
futex.h futex: Sanitize futex ops argument types 2011-03-11 12:23:31 +01:00
gart.h x86, gart: Set DISTLBWALKPRB bit always 2011-04-18 09:26:48 -07:00
genapic.h
geode.h
gpio.h x86/gpio: Implement x86 gpio_to_irq convert function 2011-01-11 12:46:15 +01:00
hardirq.h x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat 2011-12-18 10:46:48 +01:00
highmem.h mm: stack based kmap_atomic() 2010-10-26 16:52:08 -07:00
hpet.h
hugetlb.h
hw_breakpoint.h
hw_irq.h iommu: Rename the DMAR and INTR_REMAP config options 2011-09-21 10:22:03 +02:00
hypertransport.h
hyperv.h Staging: hv: vmbus: Retry vmbus_post_msg() before giving up 2011-08-25 15:23:19 -07:00
hypervisor.h xen: HVM X2APIC support 2011-01-07 10:03:50 -05:00
i387.h i387: re-introduce FPU state preloading at context switch time 2012-02-18 14:03:48 -08:00
i8259.h
ia32_unistd.h x86: Generate system call tables and unistd_*.h from tables 2011-11-17 13:35:37 -08:00
ia32.h
idle.h x86 idle: clarify AMD erratum 400 workaround 2011-05-29 03:38:57 -04:00
inat_types.h
inat.h
init.h x86, mm: Unify zone_sizes_init() 2011-11-11 10:22:55 +01:00
insn.h x86, perf: Add a build-time sanity test to the x86 decoder 2011-11-10 12:38:51 +01:00
inst.h
intel_scu_ipc.h x86,mrst: Power control commands update 2011-12-05 12:42:11 +01:00
io_apic.h x86, ioapic: Consolidate gsi routing info into 'struct ioapic' 2011-05-20 13:41:01 +02:00
io.h x86: don't include xen/xen.h in <asm/io.h> unless XEN is enabled 2011-08-03 22:00:38 -10:00
ioctl.h
ioctls.h
iomap.h mm: stack based kmap_atomic() 2010-10-26 16:52:08 -07:00
iommu_table.h
iommu.h iommu: Add option to group multi-function devices 2011-11-15 12:22:31 +01:00
ipcbuf.h
ipi.h x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only 2011-01-28 14:54:05 +01:00
irq_controller.h x86: dtb: Add irq domain abstraction 2011-02-23 22:27:53 +01:00
irq_regs.h
irq_remapping.h iommu: Rename the DMAR and INTR_REMAP config options 2011-09-21 10:22:03 +02:00
irq_vectors.h x86/irq: Standardize on CONFIG_SPARSE_IRQ=y 2011-10-13 12:12:12 +02:00
irq.h x86: Add device tree support 2011-02-23 22:27:52 +01:00
irqflags.h tracing, x86/irq: Do not trace arch_local_{*,irq_*}() functions 2011-07-07 19:22:32 +02:00
ist.h
jump_label.h jump label: Add _ASM_ALIGN for x86 and x86_64 2011-04-04 13:42:51 -04:00
Kbuild x86: Generate system call tables and unistd_*.h from tables 2011-11-17 13:35:37 -08:00
kdebug.h ptrace: unify show_regs() prototype 2011-07-26 16:49:43 -07:00
kexec.h
kgdb.h kgdbts: unify/generalize gdb breakpoint adjustment 2011-05-26 17:12:36 -07:00
kmap_types.h
kmemcheck.h
kprobes.h
kvm_emulate.h KVM: x86: fix missing checks in syscall emulation 2012-02-01 11:43:40 +02:00
kvm_host.h KVM: Add generic RDPMC support 2011-12-27 11:24:35 +02:00
kvm_para.h KVM guest: KVM Steal time registration 2011-07-24 11:49:36 +03:00
kvm.h
ldt.h
lguest_hcall.h lguest: update comments 2011-07-22 14:39:50 +09:30
lguest.h
linkage.h x86: Get rid of asmregparm 2011-05-24 14:33:35 +02:00
local64.h
local.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
mach_timer.h time: x86: Remove CLOCK_TICK_RATE from mach_timer.h 2011-11-21 19:00:57 -08:00
mach_traps.h x86/mrst: Avoid reporting wrong nmi status 2011-11-10 16:21:01 +01:00
math_emu.h
mc146818rtc.h x86: Use "do { } while(0)" for empty lock_cmos()/unlock_cmos() macros 2011-12-18 09:14:31 +01:00
mca_dma.h
mca.h
mce.h mce: fix warning messages about static struct mce_device 2012-01-16 17:08:42 -08:00
microcode.h x86, microcode, AMD: Add a vendor-specific exit function 2011-12-14 12:46:47 +01:00
mman.h
mmconfig.h
mmu_context.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
mmu.h x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct 2011-05-25 16:16:41 +02:00
mmx.h
mmzone_32.h x86, mm: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/ 2011-07-12 21:58:11 -07:00
mmzone_64.h Fix node_start/end_pfn() definition for mm/page_cgroup.c 2011-06-27 14:13:09 -07:00
mmzone.h
module.h x86, cpu: Move AMD Elan Kconfig under "Processor family" 2011-04-08 13:01:25 -07:00
mpspec_def.h x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion 2011-01-05 14:09:23 +01:00
mpspec.h x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit 2011-01-28 14:54:09 +01:00
mrst-vrtc.h x86: mrst: Add vrtc driver which serves as a wall clock device 2010-11-11 11:34:27 +01:00
mrst.h Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty 2012-01-09 12:09:24 -08:00
msgbuf.h
mshyperv.h
msidef.h
msr-index.h x86: TSC deadline definitions 2011-09-25 19:53:00 +03:00
msr.h x86: Document rdmsr_safe restrictions 2011-12-05 14:28:37 +01:00
mtrr.h
mutex_32.h
mutex_64.h
mutex.h
mwait.h
nmi.h x86, nmi: Add in logic to handle multiple events and unknown NMIs 2011-10-10 06:57:01 +02:00
nops.h x86, cpu: Clean up and unify the NOP selection infrastructure 2011-04-18 16:40:21 -07:00
numa_32.h x86, NUMA: Move NUMA init logic from numa_64.c to numa.c 2011-05-02 14:18:53 +02:00
numa_64.h x86, NUMA: Move NUMA init logic from numa_64.c to numa.c 2011-05-02 14:18:53 +02:00
numa.h x86, NUMA: Make numa_init_array() static 2011-05-02 17:24:48 +02:00
numaq.h x86-32, NUMA: Update numaq to use new NUMA init protocol 2011-05-02 14:18:53 +02:00
olpc_ofw.h x86, olpc: Use device tree for platform identification 2011-03-15 14:17:23 -07:00
olpc.h x86, olpc-xo1-sci: Add GPE handler and ebook switch functionality 2011-07-06 14:44:38 -07:00
page_32_types.h
page_32.h
page_64_types.h
page_64.h
page_types.h x86-64, NUMA: Revert NUMA affine page table allocation 2011-03-04 10:26:36 +01:00
page.h
param.h
paravirt_types.h Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip 2011-08-12 20:46:24 -07:00
paravirt.h KVM guest: Add a pv_ops stub for steal time 2011-07-14 12:59:44 +03:00
parport.h
pat.h
pci_64.h
pci_x86.h PCI: Pull PCI 'latency timer' setup up into the core 2012-01-06 12:10:42 -08:00
pci-direct.h
pci-functions.h
pci.h x86/PCI: Expand the x86_msi_ops to have a restore MSIs. 2012-01-06 14:02:26 -08:00
percpu.h Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2012-01-09 13:08:28 -08:00
perf_event_p4.h x86, perf: P4 PMU - Fix typos in comments and style cleanup 2011-07-21 20:41:54 +02:00
perf_event.h perf events: Enable raw event support for Intel unhalted_reference_cycles event 2011-12-21 10:26:32 +01:00
pgalloc.h tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
pgtable_32_types.h
pgtable_32.h mm: remove pte_*map_nested() 2010-10-26 16:52:08 -07:00
pgtable_64_types.h
pgtable_64.h thp: add x86 32bit support 2011-01-13 17:32:44 -08:00
pgtable_types.h x86-64: Map the HPET NX 2011-06-05 21:30:33 +02:00
pgtable-2level_types.h
pgtable-2level.h thp: add x86 32bit support 2011-01-13 17:32:44 -08:00
pgtable-3level_types.h
pgtable-3level.h x86: Flush TLB if PGD entry is changed in i386 PAE mode 2011-03-18 11:44:01 +01:00
pgtable.h x86: Use "do { } while(0)" for empty flush_tlb_fix_spurious_fault() macro 2011-12-18 09:14:18 +01:00
poll.h
posix_types_32.h
posix_types_64.h
posix_types.h
prctl.h
probe_roms.h x86: Introduce pci_map_biosrom() 2011-03-15 15:34:15 -07:00
processor-cyrix.h
processor-flags.h x86: Fix rflags in FAKE_STACK_FRAME 2011-12-06 10:02:38 +01:00
processor.h i387: move TS_USEDFPU flag from thread_info to task_struct 2012-02-18 10:19:41 -08:00
prom.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
proto.h
ptrace-abi.h x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
ptrace.h x86-64: Add user_64bit_mode paravirt op 2011-08-04 16:13:49 -07:00
pvclock-abi.h
pvclock.h KVM: Fix instruction size issue in pvclock scaling 2011-08-30 14:42:30 +03:00
reboot_fixups.h
reboot.h x86, nmi: Wire up NMI handlers to new routines 2011-10-10 06:56:57 +02:00
required-features.h
resource.h
resume-trace.h
rio.h
rtc.h
rwlock.h x86: Fix write lock scalability 64-bit issue 2011-07-21 09:03:36 +02:00
rwsem.h x86: Use xadd helper more widely 2011-08-29 13:44:12 -07:00
scatterlist.h
seccomp_32.h
seccomp_64.h
seccomp.h
sections.h
segment.h x86, asm: Fix binutils 2.16 issue with __USER32_CS 2011-06-03 14:39:14 -07:00
sembuf.h
serial.h
serpent.h crypto: serpent - add 4-way parallel i586/SSE2 assembler implementation 2011-11-21 16:13:23 +08:00
setup_arch.h
setup.h x86/intel config: Revamp configuration to allow for Moorestown and Medfield 2011-12-18 09:17:02 +01:00
shmbuf.h
shmparam.h
sigcontext32.h
sigcontext.h
sigframe.h
siginfo.h
signal.h
smp.h x86, NMI: Add NMI IPI selftest 2011-12-05 12:00:16 +01:00
smpboot_hooks.h x86: Serialize SMP bootup CMOS accesses on rtc_lock 2011-07-21 09:20:59 +02:00
socket.h
sockios.h
sparsemem.h
spinlock_types.h x86, ticketlock: Make __ticket_spin_trylock common 2011-08-29 13:46:34 -07:00
spinlock.h x86/cmpxchg: add a locked add() helper 2011-11-25 10:42:59 -08:00
stackprotector.h
stacktrace.h x86: Remove warning and warning_symbol from struct stacktrace_ops 2011-05-12 15:31:28 +02:00
stat.h
statfs.h
string_32.h
string_64.h
string.h
suspend_32.h PM / Hibernate: Remove arch_prepare_suspend() 2011-05-24 23:35:55 +02:00
suspend_64.h PM / Hibernate: Remove arch_prepare_suspend() 2011-05-24 23:35:55 +02:00
suspend.h
svm.h KVM: SVM: copy instruction bytes from VMCB 2011-01-12 11:31:07 +02:00
swab.h
swiotlb.h
sync_bitops.h
sys_ia32.h
syscall.h x86: Move <asm/asm-offsets.h> from trace_syscalls.c to asm/syscall.h 2012-01-07 14:10:18 -08:00
syscalls.h
system.h xen/pm_idle: Make pm_idle be default_idle under Xen. 2011-12-03 10:49:58 -08:00
tce.h
termbits.h
termios.h
thread_info.h i387: move TS_USEDFPU flag from thread_info to task_struct 2012-02-18 10:19:41 -08:00
time.h x86: i8253: Consolidate definitions of global_clock_event 2011-06-09 15:01:40 +02:00
timer.h sched, x86: Avoid unnecessary overflow in sched_clock 2011-11-16 19:51:25 +01:00
timex.h
tlb.h
tlbflush.h x86-32, mm: Add an initial page table for core bootstrapping 2010-10-20 14:23:55 -07:00
topology.h Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
trampoline.h x86, trampoline: Use the unified trampoline setup for ACPI wakeup 2011-02-17 21:05:06 -08:00
traps.h x86-64: Rework vsyscall emulation and add vsyscall= parameter 2011-08-10 19:26:46 -05:00
tsc.h x86, tsc: Skip TSC synchronization checks for tsc=reliable 2011-12-05 18:00:31 +01:00
types.h remove dma64_addr_t 2011-03-23 19:47:18 -07:00
uaccess_32.h sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
uaccess_64.h sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
uaccess.h x86-64: Set siginfo and context on vsyscall emulation faults 2011-12-05 12:17:27 +01:00
ucontext.h
unaligned.h
unistd.h x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits 2012-01-19 12:57:09 -08:00
user32.h
user_32.h
user_64.h
user.h
vdso.h x86-64: Clean up vdso/kernel shared variables 2011-05-24 14:51:28 +02:00
vga.h
vgtod.h x86-64: Move vread_tsc and vread_hpet into the vDSO 2011-07-14 17:57:05 -07:00
virtext.h
vm86.h
vmx.h KVM: APIC: avoid instruction emulation for EOI writes 2011-09-25 19:52:17 +03:00
vsyscall.h x86-64: Rework vsyscall emulation and add vsyscall= parameter 2011-08-10 19:26:46 -05:00
vvar.h x86-64: Give vvars their own page 2011-06-05 21:30:32 +02:00
x2apic.h x86, x2apic: Move the common bits to x2apic.h 2011-05-20 13:41:11 +02:00
x86_init.h Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
xcr.h
xor_32.h
xor_64.h
xor.h
xsave.h