This exploits an approach based on the sieve of Eratosthenes, a popular
method for generating prime numbers.
Tables are identical to previous ones.
Tested with FATE with/without --enable-hardcoded-tables.
Sample benchmark (Haswell, GNU/Linux+gcc):
prev:
7860100 decicycles in cbrt_tableinit, 1 runs, 0 skips
7777490 decicycles in cbrt_tableinit, 2 runs, 0 skips
[...]
7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips
new:
2099480 decicycles in cbrt_tableinit, 1 runs, 0 skips
2044470 decicycles in cbrt_tableinit, 2 runs, 0 skips
[...]
1796544 decicycles in cbrt_tableinit, 256 runs, 0 skips
1791631 decicycles in cbrt_tableinit, 512 runs, 0 skips
Both small and large run counts are given: this function is called only
once, so the small run counts may give a better picture. The small-run
numbers are fairly consistent, and there is a steady downward trend from
small to large run counts, at which point the timing stabilizes at a new value.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
On systems having cbrt, there is no reason to use the slow pow function.
Sample benchmark (x86-64, Haswell, GNU/Linux):
new:
5124920 decicycles in cbrt_tableinit, 1 runs, 0 skips
old:
12321680 decicycles in cbrt_tableinit, 1 runs, 0 skips
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Use macros from aac_defines.h for adding suffixes
instead of local macros.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Add fixed point implementation of functions for generating tables
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
cbrtf() took floats, but it represented 1/3 exactly;
even if it did not, more precision should in theory be better
for the table generation.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
You cannot count on them being present on all systems, and you
cannot include libm.h in a host tool, so just hard code baseline
implementations.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* qatar/master:
  ffmpeg: get rid of the -vglobal option.
  dct32: Add AVX implementation of 32-point DCT
  dct32: Change pass 6 permutation to allow for AVX implementation
  dct32: port SSE 32-point DCT to YASM
  multiple inclusion guard cleanup
  avio: document buffer must created with av_malloc() and friends
  avio: check AVIOContext malloc failure
  swscale: point out an alternative to sws_getContext
  svq3: Do initialization after parsing the extradata
  add changelog entries for 0.7_beta2
  mp3lame: add #include required for AV_RB32 macro.
Conflicts:
  Changelog
  libavcodec/svq3.c
  libavcodec/x86/dct32_sse.c
  libavfilter/vsrc_buffer.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>