FFmpeg/libavcodec/arm
Martin Storsjö 3fbbad2984 arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit
The theoretical maximum value of E is 193, so we can just
saturate the addition to 255.

Before:                     Cortex A7      A8      A9     A53  A53/AArch64
vp9_loop_filter_v_4_8_neon:     143.0   127.7   114.8    88.0         87.7
vp9_loop_filter_v_8_8_neon:     241.0   197.2   173.7   140.0        136.7
vp9_loop_filter_v_16_8_neon:    497.0   419.5   379.7   293.0        275.7
vp9_loop_filter_v_16_16_neon:   965.2   818.7   731.4   579.0        452.0
After:
vp9_loop_filter_v_4_8_neon:     136.0   125.7   112.6    84.0         83.0
vp9_loop_filter_v_8_8_neon:     234.0   195.5   171.5   136.0        133.7
vp9_loop_filter_v_16_8_neon:    490.0   417.5   377.7   289.0        271.0
vp9_loop_filter_v_16_16_neon:   951.2   814.7   732.3   571.0        446.7

This is cherrypicked from libav commit
c582cb8537.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-11 13:14:50 +02:00
..
aac.h
aacpsdsp_init_arm.c
aacpsdsp_neon.S
ac3dsp_arm.S
ac3dsp_armv6.S
ac3dsp_init_arm.c
ac3dsp_neon.S
asm-offsets.h
audiodsp_arm.h
audiodsp_init_arm.c
audiodsp_init_neon.c
audiodsp_neon.S
blockdsp_arm.h blockdsp: remove high bitdepth parameter 2015-10-02 04:38:40 +02:00
blockdsp_init_arm.c blockdsp: remove high bitdepth parameter 2015-10-02 04:38:40 +02:00
blockdsp_init_neon.c blockdsp: reindent after parameter removal 2015-10-03 23:34:56 +02:00
blockdsp_neon.S
cabac.h
dca.h avcodec/dca: remove old decoder 2016-01-31 17:09:38 +01:00
fft_fixed_init_arm.c Merge commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555' 2016-04-12 15:43:09 +01:00
fft_fixed_neon.S Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46' 2014-12-09 11:58:13 +01:00
fft_init_arm.c Merge commit '4c297249ac0f513a610a62691ce96d6b62f65b94' 2016-04-12 15:43:34 +01:00
fft_neon.S Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46' 2014-12-09 11:58:13 +01:00
fft_vfp.S Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46' 2014-12-09 11:58:13 +01:00
flacdsp_arm.S
flacdsp_init_arm.c lavc/flac: Fix encoding and decoding with high lpc. 2015-05-17 02:08:58 +02:00
fmtconvert_init_arm.c Merge commit '90b1b9350c0a97c4065ae9054b83e57f48a0de1f' 2016-01-02 11:21:36 +01:00
fmtconvert_neon.S Merge commit '90b1b9350c0a97c4065ae9054b83e57f48a0de1f' 2016-01-02 11:21:36 +01:00
fmtconvert_vfp.S
g722dsp_init_arm.c Merge commit '702458538d4e52809bcef460d39baabf061b16b5' 2015-02-16 02:16:29 +01:00
g722dsp_neon.S Merge commit '702458538d4e52809bcef460d39baabf061b16b5' 2015-02-16 02:16:29 +01:00
h264chroma_init_arm.c
h264cmc_neon.S avcodec: fix vc1dsp dependencies 2016-09-25 13:11:45 +02:00
h264dsp_init_arm.c lavc/arm: Use the neon vertical chroma loop filter also for H.264 4:2:2. 2015-01-31 10:05:24 +01:00
h264dsp_neon.S
h264idct_neon.S
h264pred_init_arm.c Merge commit '256ef19844892c6cf8e0386e3287bae970ec6320' 2015-07-18 02:13:22 +02:00
h264pred_neon.S
h264qpel_init_arm.c
h264qpel_neon.S
hevcdsp_arm.h hevcdsp: fix compilation for arm and aarch64 2015-03-12 20:01:01 +01:00
hevcdsp_deblock_neon.S hevcdsp: HEVC deblocking ARM NEON register clobber fix 2015-02-16 13:27:41 +01:00
hevcdsp_idct_neon.S Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' 2017-01-31 15:31:34 +01:00
hevcdsp_init_arm.c hevcdsp: fix compilation for arm and aarch64 2015-03-12 20:01:01 +01:00
hevcdsp_init_neon.c Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' 2017-01-31 15:31:34 +01:00
hevcdsp_qpel_neon.S avcodec/hevcdsp: ARM NEON optimized qpel functions 2015-02-25 18:39:51 +01:00
hpeldsp_arm.h
hpeldsp_arm.S
hpeldsp_armv6.S
hpeldsp_init_arm.c
hpeldsp_init_armv6.c
hpeldsp_init_neon.c
hpeldsp_neon.S
idct.h
idctdsp_arm.h
idctdsp_arm.S
idctdsp_armv6.S
idctdsp_init_arm.c Merge commit '7c6eb0a1b7bf1aac7f033a7ec6d8cacc3b5c2615' 2015-07-27 22:10:35 +02:00
idctdsp_init_armv5te.c
idctdsp_init_armv6.c Merge commit '7c6eb0a1b7bf1aac7f033a7ec6d8cacc3b5c2615' 2015-07-27 22:10:35 +02:00
idctdsp_init_neon.c
idctdsp_neon.S
int_neon.S
jrevdct_arm.S
lossless_audiodsp_init_arm.c
lossless_audiodsp_neon.S
Makefile arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
mathops.h
mdct_fixed_neon.S
mdct_neon.S
mdct_vfp.S
me_cmp_armv6.S
me_cmp_init_arm.c
mlpdsp_armv5te.S Merge commit '4c81613df499ba81d64ea102b38d0c6686cc304c' 2014-12-10 00:51:26 +01:00
mlpdsp_armv6.S Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
mlpdsp_init_arm.c
mpegaudiodsp_fixed_armv6.S
mpegaudiodsp_init_arm.c
mpegvideo_arm.c
mpegvideo_arm.h
mpegvideo_armv5te_s.S
mpegvideo_armv5te.c Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
mpegvideo_neon.S
mpegvideoencdsp_armv6.S
mpegvideoencdsp_init_arm.c
neon.S
neontest.c avcodec: fix arguments on xmm/neon clobber test wrappers 2016-10-02 02:15:47 -03:00
pixblockdsp_armv6.S
pixblockdsp_init_arm.c
rdft_init_arm.c arm/rdft_init: fix license header 2016-04-12 15:01:19 -03:00
rdft_neon.S
rv34dsp_init_arm.c
rv34dsp_neon.S
rv40dsp_init_arm.c
rv40dsp_neon.S
sbrdsp_init_arm.c
sbrdsp_neon.S
simple_idct_arm.S Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' 2016-06-21 21:55:34 +02:00
simple_idct_armv5te.S
simple_idct_armv6.S
simple_idct_neon.S
startcode_armv6.S
startcode.h
synth_filter_init_arm.c avcodec/synth_filter: split off remaining code from dcadec files 2016-01-25 14:57:38 -03:00
synth_filter_neon.S
synth_filter_vfp.S
vc1dsp_init_arm.c
vc1dsp_init_neon.c
vc1dsp_neon.S
vc1dsp.h
videodsp_arm.h
videodsp_armv5te.S arm: use a local label instead of the function symbol in ff_prefetch_arm 2015-07-20 23:10:29 +02:00
videodsp_init_arm.c
videodsp_init_armv5te.c
vorbisdsp_init_arm.c
vorbisdsp_neon.S
vp3dsp_init_arm.c
vp3dsp_neon.S
vp6dsp_init_arm.c
vp6dsp_neon.S
vp8_armv6.S
vp8.h
vp8dsp_armv6.S Merge commit '5f74bd31a9bd1ac7655103b11743c12d38e0419f' 2016-11-17 15:05:07 +01:00
vp8dsp_init_arm.c
vp8dsp_init_armv6.c
vp8dsp_init_neon.c
vp8dsp_neon.S Merge commit 'e8b96a77010dd62624c3c65c357d7ae3b397ceaa' 2016-11-14 15:21:49 +01:00
vp8dsp.h
vp9dsp_init_10bpp_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init_12bpp_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init_16bpp_arm_template.c arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
vp9dsp_init_arm.c arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9dsp_init.h arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9itxfm_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 itxfm 2017-01-24 22:35:56 +02:00
vp9itxfm_neon.S arm: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling 2017-03-11 13:14:48 +02:00
vp9lpf_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter 2017-01-24 22:35:59 +02:00
vp9lpf_neon.S arm/aarch64: vp9lpf: Keep the comparison to E within 8 bit 2017-03-11 13:14:50 +02:00
vp9mc_16bpp_neon.S arm: Add NEON optimizations for 10 and 12 bit vp9 MC 2017-01-24 22:35:50 +02:00
vp9mc_neon.S arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter 2017-03-11 13:14:47 +02:00
vp56_arith.h