FFmpeg/libavcodec/arm
Martin Storsjö cef914e083 arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
This makes it similar to put_epel16_v6, and gives a 10-25%
speedup of this function.

Before:                   Cortex A7       A8       A9      A53     A72
vp8_put_epel16_h6v6_neon:    3058.0   2218.5   2459.8   2183.0  1572.2
After:
vp8_put_epel16_h6v6_neon:    2670.8   1934.2   2244.4   1729.4  1503.9

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:18 +02:00
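
The before/after figures in the commit message above can be turned into per-core percentages with a few lines of arithmetic. The sketch below is illustrative only: the numbers are copied from the commit message, the unit is not stated there (lower is assumed to be better), and the helper names `reduction_percent` and `speedup_percent` are hypothetical, not part of FFmpeg.

```python
# Benchmark figures for vp8_put_epel16_h6v6_neon, copied from the commit
# message above. The unit is not stated in the source; lower is assumed
# to mean faster.
CORES = ["A7", "A8", "A9", "A53", "A72"]
BEFORE = [3058.0, 2218.5, 2459.8, 2183.0, 1572.2]
AFTER = [2670.8, 1934.2, 2244.4, 1729.4, 1503.9]


def reduction_percent(before, after):
    """Share of the original time that was saved, in percent."""
    return (before - after) / before * 100.0


def speedup_percent(before, after):
    """How much faster the new code runs relative to the old, in percent."""
    return (before / after - 1.0) * 100.0


# Per-core results under both definitions (hypothetical helper names).
reduction = {c: reduction_percent(b, a) for c, b, a in zip(CORES, BEFORE, AFTER)}
speedup = {c: speedup_percent(b, a) for c, b, a in zip(CORES, BEFORE, AFTER)}

for core in CORES:
    print(f"Cortex-{core}: {reduction[core]:5.1f}% less time, "
          f"{speedup[core]:5.1f}% faster")
```

Under either definition the Cortex-A53 shows the largest gain and the Cortex-A72 the smallest, and the A9 and A72 figures come out below 10%.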
aac.h arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation 2012-12-07 16:54:03 +00:00
aacpsdsp_init_arm.c aacps: NEON optimisations 2012-05-05 22:04:21 +01:00
aacpsdsp_neon.S ARM: Move asm.S from libavcodec to libavutil 2012-06-08 13:14:38 -04:00
ac3dsp_arm.S ARM: Move asm.S from libavcodec to libavutil 2012-06-08 13:14:38 -04:00
ac3dsp_armv6.S ARM: swap source operands in some add instructions 2012-09-20 17:07:18 +01:00
ac3dsp_init_arm.c dsputil: Move apply_window_int16 to ac3dsp 2013-12-08 17:57:15 +01:00
ac3dsp_neon.S dsputil: Move apply_window_int16 to ac3dsp 2013-12-08 17:57:15 +01:00
apedsp_init_arm.c dsputil: Move APE-specific bits into apedsp 2014-05-29 06:41:15 -07:00
apedsp_neon.S dsputil: Move APE-specific bits into apedsp 2014-05-29 06:41:15 -07:00
asm-offsets.h mpegvideo: move the MpegEncContext fields used from arm asm to the beginning 2014-04-29 14:49:42 +02:00
audiodsp_arm.h dsputil: Split audio operations off into a separate context 2014-06-22 06:20:15 -07:00
audiodsp_init_arm.c dsputil: Split audio operations off into a separate context 2014-06-22 06:20:15 -07:00
audiodsp_init_neon.c audiodsp: reorder arguments for vector_clipf 2016-09-22 09:47:52 +02:00
audiodsp_neon.S audiodsp: reorder arguments for vector_clipf 2016-09-22 09:47:52 +02:00
blockdsp_arm.h blockdsp: drop the high_bit_depth parameter 2016-09-22 09:47:52 +02:00
blockdsp_init_arm.c blockdsp: drop the high_bit_depth parameter 2016-09-22 09:47:52 +02:00
blockdsp_init_neon.c blockdsp: drop the high_bit_depth parameter 2016-09-22 09:47:52 +02:00
blockdsp_neon.S dsputil: Split clear_block*/fill_block* off into a separate context 2014-06-18 14:07:23 -07:00
cabac.h arm: get_cabac inline asm 2014-03-09 00:45:34 +01:00
dca.h dcadec: simplify decoding of VQ high frequencies 2014-02-28 13:03:22 +01:00
dcadsp_init_arm.c dca: remove unused decode_hf function and quant_d tables 2015-12-24 13:58:18 +01:00
dcadsp_neon.S dca: remove unused decode_hf function and quant_d tables 2015-12-24 13:58:18 +01:00
dcadsp_vfp.S dcadec: remove scaling in lfe_interpolation_fir 2014-02-28 13:00:47 +01:00
fft_fixed_init_arm.c fft: Split MDCT bits off from FFT 2016-03-01 10:18:28 +01:00
fft_fixed_neon.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
fft_init_arm.c fft: Split MDCT bits off from FFT 2016-03-01 10:18:28 +01:00
fft_neon.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
fft_vfp.S arm: Use .data.rel.ro for const data with relocations 2014-12-09 11:43:25 +02:00
flacdsp_arm.S flacdsp: arm optimised lpc filter 2012-09-15 23:54:21 +01:00
flacdsp_init_arm.c flacdsp: arm optimised lpc filter 2012-09-15 23:54:21 +01:00
fmtconvert_init_arm.c arm: Remove a redundant check in fmtconvert_init_arm.c 2017-10-24 09:07:01 +03:00
fmtconvert_neon.S arm: add ff_int32_to_float_fmul_array8_neon 2015-12-14 16:45:02 +01:00
fmtconvert_vfp.S arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
g722dsp_init_arm.c g722: Add ARM NEON implementation for g722_apply_qmf() 2015-02-15 22:47:21 +02:00
g722dsp_neon.S g722: Add ARM NEON implementation for g722_apply_qmf() 2015-02-15 22:47:21 +02:00
h264chroma_init_arm.c h264chroma: Change type of stride parameters to ptrdiff_t 2016-09-29 14:48:04 +02:00
h264cmc_neon.S h264chroma: Change type of stride parameters to ptrdiff_t 2016-09-29 14:48:04 +02:00
h264dsp_init_arm.c h264: Move start code search functions into separate source files. 2014-08-04 22:22:54 +02:00
h264dsp_neon.S dsputil: Separate h264 qpel 2013-01-24 10:44:43 +01:00
h264idct_neon.S arm: Fix SIGBUS on ARM when compiled with binutils 2.29 2017-09-02 22:18:20 +03:00
h264pred_init_arm.c h264: arm: use intra pred8x8 functions only for chroma_format_idc <= 1 2015-07-18 00:28:49 +02:00
h264pred_neon.S ARM: Move asm.S from libavcodec to libavutil 2012-06-08 13:14:38 -04:00
h264qpel_init_arm.c qpeldsp: Mark source pointer in qpel_mc_func function pointer const 2014-07-25 02:52:54 -07:00
h264qpel_neon.S dsputil: Separate h264 qpel 2013-01-24 10:44:43 +01:00
hevc_idct.S hevc: Add NEON 32x32 IDCT 2017-05-04 14:08:39 +02:00
hevc_mc.S hevc: Add hevc_get_pixel_4/8/12/16/24/32/48/64 2017-12-08 23:41:01 +02:00
hevcdsp_init_arm.c hevc: Add hevc_get_pixel_4/8/12/16/24/32/48/64 2017-12-08 23:41:01 +02:00
hpeldsp_arm.h arm: Use full filenames as multiple inclusion guards 2014-01-14 00:04:52 +01:00
hpeldsp_arm.S hpeldsp: arm: Update comments left behind in 25841dfe80 2016-09-29 14:48:03 +02:00
hpeldsp_armv6.S arm: hpeldsp: fix put_pixels8_y2_{,no_rnd_}armv6 2014-03-08 18:31:57 +01:00
hpeldsp_init_arm.c dsputil: Refactor duplicated CALL_2X_PIXELS / PIXELS16 macros 2014-03-22 06:17:29 -07:00
hpeldsp_init_armv6.c arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp 2013-04-19 23:19:08 +03:00
hpeldsp_init_neon.c arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp 2013-04-19 23:19:08 +03:00
hpeldsp_neon.S arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp 2013-04-19 23:19:08 +03:00
idct.h idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
idctdsp_arm.h dsputil: Split off IDCT bits into their own context 2014-06-30 07:58:46 -07:00
idctdsp_arm.S idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
idctdsp_armv6.S dsputil: Split off IDCT bits into their own context 2014-06-30 07:58:46 -07:00
idctdsp_init_arm.c idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
idctdsp_init_armv5te.c idct: Move arm-specific declarations to a header in the arm directory 2014-07-20 13:02:17 -07:00
idctdsp_init_armv6.c idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
idctdsp_init_neon.c idct: Move arm-specific declarations to a header in the arm directory 2014-07-20 13:02:17 -07:00
idctdsp_neon.S dsputil: Split off IDCT bits into their own context 2014-06-30 07:58:46 -07:00
int_neon.S dsputil: Move APE-specific bits into apedsp 2014-05-29 06:41:15 -07:00
jrevdct_arm.S Drop DCTELEM typedef 2013-01-22 18:32:56 -08:00
Makefile hevc: Add hevc_get_pixel_4/8/12/16/24/32/48/64 2017-12-08 23:41:01 +02:00
mathops.h arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation 2012-12-07 16:54:03 +00:00
mdct_fixed_init_arm.c fft: Split MDCT bits off from FFT 2016-03-01 10:18:28 +01:00
mdct_fixed_neon.S ARM: set Tag_ABI_align_preserved in all asm files 2012-10-02 19:47:56 +01:00
mdct_init_arm.c fft: Split MDCT bits off from FFT 2016-03-01 10:18:28 +01:00
mdct_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
mdct_vfp.S armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) 2014-07-18 01:34:08 +03:00
me_cmp_armv6.S dsputil: Split motion estimation compare bits off into their own context 2014-07-17 09:07:10 -07:00
me_cmp_init_arm.c motion_est: convert stride to ptrdiff_t 2014-11-24 01:30:10 +00:00
mlpdsp_armv5te.S arm: mlpdsp: handle pic offset calculation in a macro 2014-12-09 22:00:08 +01:00
mlpdsp_armv6.S cosmetics: Fix spelling mistakes 2016-05-04 18:16:21 +02:00
mlpdsp_init_arm.c truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output. 2014-03-26 19:54:32 +02:00
mpegaudiodsp_fixed_armv6.S ARM: Move asm.S from libavcodec to libavutil 2012-06-08 13:14:38 -04:00
mpegaudiodsp_init_arm.c Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
mpegvideo_arm.c mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes 2014-08-15 01:26:33 -07:00
mpegvideo_arm.h mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes 2014-08-15 01:26:33 -07:00
mpegvideo_armv5te_s.S ARM: use standard syntax for all LDRD/STRD instructions 2012-08-01 10:32:24 +01:00
mpegvideo_armv5te.c cosmetics: Fix spelling mistakes 2016-05-04 18:16:21 +02:00
mpegvideo_neon.S arm: Add X() around all references to extern symbols 2014-02-07 15:13:58 +02:00
mpegvideoencdsp_armv6.S dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc 2014-07-06 14:26:53 -07:00
mpegvideoencdsp_init_arm.c dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc 2014-07-06 14:26:53 -07:00
neon.S ARM: make some NEON macros reusable 2011-12-02 19:59:18 +00:00
neontest.c lavc: add clobber tests for the new encoding/decoding API 2016-09-28 10:01:52 +02:00
pixblockdsp_armv6.S dsputil: Split off pixel block routines into their own context 2014-07-09 08:05:26 -07:00
pixblockdsp_init_arm.c pixblockdsp: Change type of stride parameters to ptrdiff_t 2016-09-14 14:12:36 +02:00
rdft_init_arm.c rdft: arm: Split RDFT initialization into a separate file 2016-02-26 14:34:58 +01:00
rdft_neon.S ARM: set Tag_ABI_align_preserved in all asm files 2012-10-02 19:47:56 +01:00
rv34dsp_init_arm.c rv34: Drop now unnecessary dsputil dependencies 2013-02-06 11:30:54 +01:00
rv34dsp_neon.S Drop DCTELEM typedef 2013-01-22 18:32:56 -08:00
rv40dsp_init_arm.c qpeldsp: Mark source pointer in qpel_mc_func function pointer const 2014-07-25 02:52:54 -07:00
rv40dsp_neon.S ARM: Move asm.S from libavcodec to libavutil 2012-06-08 13:14:38 -04:00
sbrdsp_init_arm.c ARM: allow runtime masking of CPU features 2012-04-22 12:30:45 +01:00
sbrdsp_neon.S ARM: generate position independent code to access data symbols 2012-07-01 11:25:06 +01:00
simple_idct_arm.S cosmetics: Fix spelling mistakes 2016-05-04 18:16:21 +02:00
simple_idct_armv5te.S simple_idct: arm: Drop disabled code variant 2016-08-17 12:21:54 +02:00
simple_idct_armv6.S idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
simple_idct_neon.S idct: Change type of array stride parameters to ptrdiff_t 2016-09-29 14:48:03 +02:00
startcode_armv6.S h264: Move start code search functions into separate source files. 2014-08-04 22:22:54 +02:00
startcode.h h264: Move start code search functions into separate source files. 2014-08-04 22:22:54 +02:00
synth_filter_neon.S ARM: set Tag_ABI_align_preserved in all asm files 2012-10-02 19:47:56 +01:00
synth_filter_vfp.S arm: cosmetics: Consistently use lowercase for shift operators 2014-07-18 11:17:40 +03:00
vc1dsp_init_arm.c vc-1: Add platform-specific start code search routine to VC1DSPContext. 2014-08-04 22:22:54 +02:00
vc1dsp_init_neon.c arm: Avoid using .dn register aliases 2017-05-15 09:52:18 +03:00
vc1dsp_neon.S arm: vc1dsp: Add commas between macro arguments 2018-03-30 15:47:24 +03:00
vc1dsp.h vc1: arm: Add NEON assembly 2013-12-20 14:53:39 +02:00
videodsp_arm.h lavc: add missing files for arm 2012-12-20 14:07:23 +01:00
videodsp_armv5te.S arm: use a local label instead of the function symbol in ff_prefetch_arm 2015-07-20 23:10:29 +02:00
videodsp_init_arm.c Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
videodsp_init_armv5te.c Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
vorbisdsp_init_arm.c Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
vorbisdsp_neon.S Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. 2013-01-19 22:21:10 -08:00
vp3dsp_init_arm.c vp3: Change type of stride parameters to ptrdiff_t 2016-08-26 11:36:26 +02:00
vp3dsp_neon.S arm: Add a missing # as prefix for an immediate constant 2014-01-07 19:30:13 +02:00
vp6dsp_init_arm.c vp56: Separate VP5 and VP6 dsp initialization 2016-08-26 11:50:22 +02:00
vp6dsp_neon.S vp56: Mark VP6-only optimizations as such. 2013-08-23 14:42:19 +02:00
vp8_armv6.S ARM: swap source operands in some add instructions 2012-09-20 17:07:18 +01:00
vp8.h arm: asm decode_block_coeffs_internal is vp8 specific 2014-04-04 10:39:29 +02:00
vp8dsp_armv6.S vp8: Update some assembly comments left unchanged in bd66f073fe 2016-08-26 11:36:53 +02:00
vp8dsp_init_arm.c On2 VP7 decoder 2014-04-04 04:00:11 +02:00
vp8dsp_init_armv6.c On2 VP7 decoder 2014-04-04 04:00:11 +02:00
vp8dsp_init_neon.c On2 VP7 decoder 2014-04-04 04:00:11 +02:00
vp8dsp_neon.S arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2 2019-02-19 11:46:18 +02:00
vp8dsp.h On2 VP7 decoder 2014-04-04 04:00:11 +02:00
vp9dsp_init_arm.c arm: vp9lpf: Implement the mix2_44 function with one single filter pass 2017-02-24 00:03:09 +02:00
vp9itxfm_neon.S arm/aarch64: vp9: Fix vertical alignment 2017-03-16 23:09:00 +02:00
vp9lpf_neon.S arm/aarch64: vp9: Fix vertical alignment 2017-03-16 23:09:00 +02:00
vp9mc_neon.S arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter 2017-02-11 00:08:37 +02:00
vp56_arith.h arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation 2012-12-07 16:54:03 +00:00