linux/arch/x86/lib
Ma Ling 3b4b682bec x86, mem: Optimize memmove for small size and unaligned cases
movs instruction will combine data to accelerate moving data,
however we need to concern two cases about it.

1. movs instruction need long lantency to startup,
   so here we use general mov instruction to copy data.
2. movs instruction is not good for unaligned case,
   even if src offset is 0x10, dest offset is 0x0,
   we avoid and handle the case by general mov instruction.

Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <1284664360-6138-1-git-send-email-ling.ma@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-09-24 18:57:11 -07:00
..
.gitignore x86: Gitignore: arch/x86/lib/inat-tables.c 2009-11-04 13:11:28 +01:00
atomic64_32.c x86-32: Rewrite 32-bit atomic64 functions in assembly 2010-02-25 20:47:30 -08:00
atomic64_386_32.S x86, asm: Use a lower case name for the end macro in atomic64_386_32.S 2010-08-12 07:04:16 -07:00
atomic64_cx8_32.S x86-32: Fix atomic64_inc_not_zero return value convention 2010-03-01 11:39:03 -08:00
cache-smp.c x86, lib: Add wbinvd smp helpers 2010-01-22 16:05:42 -08:00
checksum_32.S
clear_page_64.S x86, alternatives: Use 16-bit numbers for cpufeature index 2010-07-07 10:36:28 -07:00
cmpxchg8b_emu.S x86: Provide an alternative() based cmpxchg64() 2009-09-30 22:55:59 +02:00
cmpxchg.c x86, asm: Merge cmpxchg_486_u64() and cmpxchg8b_emu() 2010-07-28 17:05:11 -07:00
copy_page_64.S x86, alternatives: Use 16-bit numbers for cpufeature index 2010-07-07 10:36:28 -07:00
copy_user_64.S x86, alternatives: Fix one more open-coded 8-bit alternative number 2010-07-13 14:56:16 -07:00
copy_user_nocache_64.S x86: wrong register was used in align macro 2008-07-30 10:10:39 -07:00
csum-copy_64.S
csum-partial_64.c x86: fix csum_partial() export 2008-05-13 19:38:47 +02:00
csum-wrappers_64.c
delay.c x86, delay: tsc based udelay should have rdtsc_barrier 2009-06-25 16:47:40 -07:00
getuser.S x86: use _types.h headers in asm where available 2009-02-13 11:35:01 -08:00
inat.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
insn.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
iomap_copy_64.S
Makefile x86, asm: Move cmpxchg emulation code to arch/x86/lib 2010-07-28 16:53:49 -07:00
memcpy_32.c x86, mem: Optimize memmove for small size and unaligned cases 2010-09-24 18:57:11 -07:00
memcpy_64.S x86, mem: Optimize memcpy by avoiding memory false dependece 2010-08-23 14:56:41 -07:00
memmove_64.c x86, mem: Optimize memmove for small size and unaligned cases 2010-09-24 18:57:11 -07:00
memset_64.S x86, alternatives: Use 16-bit numbers for cpufeature index 2010-07-07 10:36:28 -07:00
mmx_32.c x86: clean up mmx_32.c 2008-04-17 17:40:47 +02:00
msr-reg-export.c x86, msr: change msr-reg.o to obj-y, and export its symbols 2009-09-04 10:00:09 -07:00
msr-reg.S x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too 2009-09-03 21:26:34 +02:00
msr-smp.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
msr.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
putuser.S x86: merge putuser asm functions. 2008-07-09 09:14:13 +02:00
rwlock_64.S
rwsem_64.S Fix the x86_64 implementation of call_rwsem_wait() 2010-05-04 15:24:14 -07:00
semaphore_32.S
string_32.c x86: coding style fixes to arch/x86/lib/string_32.c 2008-08-15 16:53:25 +02:00
strstr_32.c x86: coding style fixes to arch/x86/lib/strstr_32.c 2008-08-15 16:53:24 +02:00
thunk_32.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
thunk_64.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
usercopy_32.c x86: Turn the copy_from_user check into an (optional) compile time warning 2009-10-01 11:31:04 +02:00
usercopy_64.c x86, 64-bit: Clean up user address masking 2009-06-20 15:40:00 -07:00
x86-opcode-map.txt x86: Add Intel FMA instructions to x86 opcode map 2009-10-29 08:47:47 +01:00