linux/include/asm-generic
Denys Vlasenko 133fd9f5cd vsprintf: further optimize decimal conversion
Previous code was using optimizations which were developed to work well
even on narrow-word CPUs (by today's standards).  But Linux runs only on
32-bit and wider CPUs.  We can use that.

First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
divide by 10 much larger numbers, and thus we can print groups of 9 digits
instead of groups of 5 digits.

Next: there are two algorithms to print larger numbers.  One is generic:
divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
It's conceptually simple, but requires an (unsigned long long) /
1000000000 division.

Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
manipulates them cleverly and generates groups of 4 decimal digits.  It so
happens that it does NOT require long long division.

If long is > 32 bits, division of 64-bit values is relatively easy, and we
will use the first algorithm.  If long long is > 64 bits (strange
architecture with VERY large long long), second algorithm can't be used,
and we again use the first one.

Else (if long is 32 bits and long long is 64 bits) we use second one.

And third: there is a simple optimization which takes fast path not only
for zero as was done before, but for all one-digit numbers.

In all tested cases new code is faster than old one, in many cases by 30%,
in few cases by more than 50% (for example, on x86-32, conversion of
12345678).  Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
case.

This patch is based upon an original from Michal Nazarewicz.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:27 -07:00
..
bitops Add #includes needed to permit the removal of asm/system.h 2012-03-28 18:30:03 +01:00
4level-fixup.h
atomic64.h
atomic-long.h
atomic.h Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
audit_change_attr.h
audit_dir_write.h
audit_read.h
audit_signal.h
audit_write.h
auxvec.h
barrier.h Create asm-generic/barrier.h 2012-03-28 18:30:03 +01:00
bitops.h
bitsperlong.h vsprintf: further optimize decimal conversion 2012-05-31 17:49:27 -07:00
bug.h consolidate WARN_...ONCE() static variables 2012-03-23 16:58:31 -07:00
bugs.h
cache.h
cacheflush.h
checksum.h
cmpxchg-local.h
cmpxchg.h asm-generic: add linux/types.h to cmpxchg.h 2012-04-02 14:41:27 -07:00
cputime.h
current.h
delay.h
device.h
div64.h
dma-coherent.h common: add dma_mmap_from_coherent() function 2012-05-21 15:06:09 +02:00
dma-contiguous.h drivers: add Contiguous Memory Allocator 2012-05-21 15:09:37 +02:00
dma-mapping-broken.h
dma-mapping-common.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
dma.h
emergency-restart.h
errno-base.h
errno.h
exec.h Split arch_align_stack() out from asm-generic/system.h 2012-03-28 18:30:03 +01:00
fb.h
fcntl.h
ftrace.h
futex.h
getorder.h bitops: Add missing parentheses to new get_order macro 2012-02-24 10:39:27 -08:00
gpio.h gpiolib: Remove 'const' from data argument of gpiochip_find() 2012-05-18 23:01:05 -06:00
hardirq.h
hw_irq.h
ide_iops.h
int-l64.h
int-ll64.h
io-64-nonatomic-hi-lo.h asm-generic: architecture independent readq/writeq for 32bit environment 2012-02-21 16:47:28 -08:00
io-64-nonatomic-lo-hi.h asm-generic: architecture independent readq/writeq for 32bit environment 2012-02-21 16:47:28 -08:00
io.h
ioctl.h
ioctls.h
iomap.h [PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional 2012-02-27 09:43:30 -06:00
ipcbuf.h
irq_regs.h
irq.h
irqflags.h
Kbuild
Kbuild.asm
kdebug.h
kmap_types.h
kvm_para.h KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ block 2012-05-21 17:47:52 +03:00
libata-portmap.h
linkage.h
local64.h
local.h
memory_model.h
mm_hooks.h
mman-common.h coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP 2012-03-23 16:58:42 -07:00
mman.h
mmu_context.h
mmu.h
module.h
msgbuf.h
mutex-dec.h
mutex-null.h
mutex-xchg.h
mutex.h
page.h
param.h
parport.h
pci_iomap.h [PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional 2012-02-27 09:43:30 -06:00
pci-bridge.h PCI: work around Stratus ftServer broken PCIe hierarchy 2012-04-30 15:21:02 -06:00
pci-dma-compat.h
pci.h PCI: collapse pcibios_resource_to_bus 2012-02-23 20:19:04 -07:00
percpu.h
pgalloc.h
pgtable-nopmd.h
pgtable-nopud.h
pgtable.h mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition 2012-05-29 16:22:24 -07:00
poll.h epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() 2012-02-24 11:42:50 -08:00
posix_types.h posix_types: Introduce __kernel_[u]long_t 2012-02-20 12:48:47 -08:00
ptrace.h
resource.h
rtc.h
rwsem.h
scatterlist.h
sections.h
segment.h
sembuf.h
serial.h
setup.h
shmbuf.h
shmparam.h
siginfo.h Linux 3.4-rc5 2012-05-04 12:46:40 +10:00
signal-defs.h
signal.h
sizes.h
socket.h net: Add framework to allow sending packets with customized CRC. 2012-02-24 01:37:35 -08:00
sockios.h
spinlock.h
stat.h
statfs.h asm-generic: Use __BITS_PER_LONG in statfs.h 2012-04-30 12:55:15 -07:00
string.h
swab.h
switch_to.h Split the switch_to() wrapper out of asm-generic/system.h 2012-03-28 18:30:03 +01:00
syscall.h asm/syscall.h: add syscall_get_arch 2012-04-14 11:13:19 +10:00
syscalls.h
termbits.h
termios-base.h
termios.h
timex.h
tlb.h
tlbflush.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
topology.h
types.h
uaccess-unaligned.h
uaccess.h
ucontext.h
unaligned.h
unistd.h compat: use sys_sendfile64() implementation for sendfile syscall 2012-03-27 13:36:57 -04:00
user.h
vga.h
vmlinux.lds.h ftrace: Sort all function addresses, not just per page 2012-05-16 19:58:44 -04:00
word-at-a-time.h word-at-a-time: make the interfaces truly generic 2012-05-26 11:33:40 -07:00
xor.h