third_party_ffmpeg/libavcodec/x86
Linjie Fu 8b8492452d lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8
Fix overflow for coeff -32768 in function ADD_RES_SSE_16_32_8 with no
performance drop (SSE2/AVX/AVX2).

./checkasm --test=hevc_add_res --bench

Mainline:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_32x32_8_sse2: 127.5
    hevc_add_res_32x32_8_avx: 127.0
    hevc_add_res_32x32_8_avx2: 86.5

Add overflow test case:
  - hevc_add_res.add_residual [FAILED]

After:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_32x32_8_sse2: 126.8
    hevc_add_res_32x32_8_avx: 128.3
    hevc_add_res_32x32_8_avx2: 86.8

Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
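
The commit message above names the failing input (coeff -32768) but not the mechanism. The scalar C sketch below shows one way such an overflow can arise. It assumes the 8-bit path splits each residual into a positive and a negated part, saturates both to bytes, and then applies a saturating add and subtract. That structure is an assumption made for illustration, not a transcription of hevc_add_res.asm, and every function name here is hypothetical.

```c
#include <stdint.h>

/* Reference behaviour: add a 16-bit residual to an 8-bit pixel, clip to [0, 255]. */
uint8_t add_res_ref(uint8_t dst, int16_t coeff)
{
    int v = dst + coeff;
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Negation with 16-bit wraparound, as a packed-word subtract from zero behaves.
 * Note that -(-32768) does not fit in 16 bits and wraps back to -32768. */
int neg_wrap16(int16_t x)
{
    unsigned u = (0u - (unsigned)x) & 0xFFFFu;
    return u >= 0x8000u ? (int)u - 0x10000 : (int)u;
}

/* Signed 16-bit to unsigned 8-bit saturation (one lane of a packuswb-style pack). */
uint8_t pack_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Unsigned saturating byte add and subtract. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    int s = a + b;
    return (uint8_t)(s > 255 ? 255 : s);
}

uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}

/* ASSUMED pre-fix shape: split the residual into a positive part and a
 * negated part, saturate each to a byte, then add one and subtract the other. */
uint8_t add_res_split(uint8_t dst, int16_t coeff)
{
    uint8_t pos = pack_u8(coeff);             /* 0 when coeff is negative */
    uint8_t neg = pack_u8(neg_wrap16(coeff)); /* 0 when coeff is positive */
    return sat_sub_u8(sat_add_u8(dst, pos), neg);
}
```

Under these assumptions, add_res_split(100, -32768) leaves the pixel at 100, because the wrapped negation is still negative and saturates to 0, whereas add_res_ref(100, -32768) clips to 0. For every other 16-bit residual the two functions agree, which is consistent with the failure only appearing once an overflow test case was added.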
aacencdsp_init.c
aacencdsp.asm
aacpsdsp_init.c lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis 2017-06-28 12:22:39 +02:00
aacpsdsp.asm lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis 2017-06-28 12:22:39 +02:00
ac3dsp_downmix.asm Merge commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb' 2017-03-22 12:49:29 +01:00
ac3dsp_init.c Merge commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb' 2017-03-22 12:49:29 +01:00
ac3dsp.asm
alacdsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
alacdsp.asm
audiodsp_init.c Merge commit '12004a9a7f20e44f4da2ee6c372d5e1794c8d6c5' 2017-03-20 22:35:07 +01:00
audiodsp.asm Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' 2017-10-21 12:15:57 -03:00
blockdsp_init.c libavcodec/blockdsp : add AVX version 2017-10-03 19:47:37 -03:00
blockdsp.asm x86/blockdsp: use three operand form for an instruction 2017-10-04 23:51:44 -03:00
bswapdsp_init.c libavcodec/bswapdsp : add AVX2 func for bswap_buf (swap uint32_t) 2017-10-29 15:21:35 +01:00
bswapdsp.asm avcodec/x86/bswapdsp : use macro for 128 bits constants loading in xmm or ymm 2017-12-02 18:25:25 +01:00
cabac.h
cavsdsp.c avcodec/x86/cavsdsp: Delete #include "libavcodec/x86/idctdsp.h". 2017-07-21 02:08:33 +02:00
cavsidct.asm cavs: add a sse2 idct implementation. 2017-04-06 10:03:28 -04:00
celt_pvq_init.c celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled 2019-03-31 23:36:43 +02:00
celt_pvq_search.asm x86/opus_dsp: rename to celt_pvq 2019-03-31 23:35:00 +02:00
constants.c x86/constants: make pb_80 32 byte wide 2017-11-21 10:57:03 -03:00
constants.h x86/constants: make pb_80 32 byte wide 2017-11-21 10:57:03 -03:00
dcadsp_init.c build: Generalize yasm/nasm-related variable names 2017-03-01 10:18:15 +01:00
dcadsp.asm
dct32.asm Merge commit '6eef263aca281fb582e1fa3d841ac20ef747a252' 2017-10-12 13:48:35 -03:00
dct_init.c
dirac_dwt_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
dirac_dwt.asm Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6' 2017-09-26 18:48:06 -03:00
diracdsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
diracdsp.asm avcodec/x86/diracdsp: Fix high bits on Windows x86_64 2020-01-31 00:04:22 +01:00
dnxhdenc_init.c
dnxhdenc.asm Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6' 2017-09-26 18:48:06 -03:00
exrdsp_init.c libavcodec/exr : add x86 SIMD for predictor 2017-10-01 17:35:30 -03:00
exrdsp.asm avcodec/x86/exrdsp : use ymm constant for pb_80 2017-11-23 20:00:13 +01:00
fdct.c Mark some arrays that never change as const. 2017-02-01 10:42:59 +01:00
fdct.h
fdctdsp_init.c
fft_init.c
fft.asm avcodec/fft: fix INTERL macro on 3dnow 2017-11-25 13:11:45 -03:00
fft.h
flac_dsp_gpl.asm
flacdsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
flacdsp.asm
fmtconvert_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
fmtconvert.asm
fpel.asm Merge commit '009adfd4fbdd78a890a4a65d6f141c467bb027fa' 2017-03-21 15:02:31 +01:00
fpel.h
g722dsp_init.c
g722dsp.asm
h263_loopfilter.asm
h263dsp_init.c
h264_cabac.c Merge commit '0a35f128f3c6e0ae9a0a2236c557602c108da269' 2017-04-08 14:30:13 +02:00
h264_chromamc_10bit.asm Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c' 2017-03-21 15:20:45 -03:00
h264_chromamc.asm Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c' 2017-03-21 15:20:45 -03:00
h264_deblock_10bit.asm avcodec/x86: deduplicate PASS8ROWS macro 2017-02-18 20:26:49 +01:00
h264_deblock.asm avcodec/h264: enable sse2 chroma deblock/loop filter functions 2017-02-27 13:22:06 +01:00
h264_idct_10bit.asm
h264_idct.asm h264_idct: enable unmacro on newer NASM versions 2018-02-12 10:50:37 +00:00
h264_intrapred_10bit.asm Merge commit '5801f9ed245ca5ebb57b0b5183de7a24aaece133' 2017-03-23 11:58:01 +01:00
h264_intrapred_init.c h264pred: added AVX2 implementation for tm_vp8 16x16. 2017-03-20 09:45:42 -04:00
h264_intrapred.asm Merge commit '5801f9ed245ca5ebb57b0b5183de7a24aaece133' 2017-03-23 11:58:01 +01:00
h264_qpel_8bit.asm
h264_qpel_10bit.asm
h264_qpel.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
h264_weight_10bit.asm
h264_weight.asm
h264chroma_init.c Merge commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c' 2017-03-21 15:20:45 -03:00
h264dsp_init.c avcodec/h264dsp: change loop filter stride argument to ptrdiff_t 2019-02-20 15:27:43 -03:00
hevc_add_res.asm lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8 2020-03-27 10:57:40 +01:00
hevc_deblock.asm avcodec/x86: deduplicate PASS8ROWS macro 2017-02-18 20:26:49 +01:00
hevc_idct.asm Merge commit '112cee0241f5799edff0e4682b9e8639b046dc78' 2017-03-23 15:58:46 +01:00
hevc_mc.asm
hevc_sao_10bit.asm avcodec: increase AV_INPUT_BUFFER_PADDING_SIZE to 64 2018-01-11 23:46:31 -03:00
hevc_sao.asm avcodec: increase AV_INPUT_BUFFER_PADDING_SIZE to 64 2018-01-11 23:46:31 -03:00
hevcdsp_init.c Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b' 2017-03-24 12:33:25 +01:00
hevcdsp.h Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b' 2017-03-24 12:33:25 +01:00
hpeldsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
hpeldsp_rnd_template.c
hpeldsp_vp3_init.c Revert "Merge commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553'" 2017-02-01 02:01:07 +01:00
hpeldsp_vp3.asm Merge commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5' 2017-01-31 14:49:29 -03:00
hpeldsp.asm Merge commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5' 2017-01-31 14:49:29 -03:00
hpeldsp.h Revert "Merge commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553'" 2017-02-01 02:01:07 +01:00
huffyuvdsp_init.c avcodec/huffyuvdsp : add add_int16 AVX2 func 2017-11-21 09:41:58 +01:00
huffyuvdsp_template.asm avcodec/huffyuvdsp : add add_int16 AVX2 func 2017-11-21 09:41:58 +01:00
huffyuvdsp.asm avcodec/huffyuvdsp : add add_int16 AVX2 func 2017-11-21 09:41:58 +01:00
huffyuvencdsp_init.c avcodec/huffyuvdspenc : add diff_int16 AVX2 func 2017-11-21 09:42:08 +01:00
huffyuvencdsp.asm avcodec/huffyuvdspenc : add diff_int16 AVX2 func 2017-11-21 09:42:08 +01:00
idctdsp_init.c mpeg4video: Add support for MPEG-4 Simple Studio Profile. 2018-04-02 13:06:23 +01:00
idctdsp.asm
idctdsp.h avcodec/x86/idctdsp: Remove duplicate include 2017-03-26 19:17:30 +02:00
imdct36.asm Merge commit '6eef263aca281fb582e1fa3d841ac20ef747a252' 2017-10-12 13:48:35 -03:00
inline_asm.h
jpeg2000dsp_init.c x86/jpeg2000dsp: add ff_ict_float_{fma3,fma4} 2017-11-20 18:33:58 -03:00
jpeg2000dsp.asm x86/jpeg2000dsp: add ff_ict_float_{fma3,fma4} 2017-11-20 18:33:58 -03:00
lossless_audiodsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
lossless_audiodsp.asm
lossless_videodsp_init.c x86/lossless_videodsp: rename ff_add_left_pred_int16_sse4 to ff_add_left_pred_int16_unaligned_ssse3 2017-12-10 00:51:01 -03:00
lossless_videodsp.asm x86/lossless_videodsp: rename ff_add_left_pred_int16_sse4 to ff_add_left_pred_int16_unaligned_ssse3 2017-12-10 00:51:01 -03:00
lossless_videoencdsp_init.c avcodec/utvideoenc : add SIMD (avx) for sub_left_prediction 2018-01-28 20:23:11 +01:00
lossless_videoencdsp.asm avcodec/utvideoenc : add SIMD (avx) for sub_left_prediction 2018-01-28 20:23:11 +01:00
lpc.c Merge commit '4efab89332ea39a77145e8b15562b981d9dbde68' 2017-01-31 15:08:19 -03:00
Makefile avcodec/Makefile: add missing pngdsp dependency to the lscr decoder 2019-05-14 16:47:56 -03:00
mathops.h
mdct15_init.c mdct15: simplify x86 exptab permutation 2018-05-07 23:44:40 +01:00
mdct15.asm mdct15: simplify the fft15 x86 SIMD 2018-05-07 23:27:41 +01:00
me_cmp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
me_cmp.asm
mlpdsp_init.c Merge commit 'fd9212f2edfe9b107c3c08ba2df5fd2cba5ab9e3' 2017-09-26 16:02:40 -03:00
mlpdsp.asm
mpegaudiodsp.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
mpegvideo.c avcodec/x86/mpegvideo: Use intra scantable in dct_unquantize_h263_intra_mmx() 2017-06-20 00:07:51 +02:00
mpegvideodsp.c avcodec/x86/mpegvideodsp: Fix signedness bug in need_emu 2017-11-14 04:54:31 +01:00
mpegvideoenc_qns_template.c
mpegvideoenc_template.c avcodec/x86/mpegenc: support transpose permuation type 2017-06-20 12:12:13 +02:00
mpegvideoenc.c lavc/mpegvideoenc: reformat inv_zigzag_direct16 so the zigzag pattern is visible 2017-05-19 11:17:58 +02:00
mpegvideoencdsp_init.c
mpegvideoencdsp.asm
opusdsp_init.c x85/opusdsp: enable the functions on all FMA3 CPUs 2019-09-11 20:50:45 -03:00
opusdsp.asm x86/opusdps: clear the high bits from some gprs 2019-09-11 20:42:31 -03:00
pixblockdsp_init.c avcodec/me_cmp: Fix crashes on ARM due to misalignment 2017-08-21 23:19:18 +02:00
pixblockdsp.asm Merge commit 'de452e503734ebb0fdbce86e9d16693b3530fad3' 2017-03-20 15:58:32 +01:00
pngdsp_init.c
pngdsp.asm
proresdsp_init.c avcodec/proresdsp indent after prev commit 2018-12-02 12:55:35 +01:00
proresdsp.asm avcodec/x86: allow future 8-bit simple idct to use slightly different coefficients 2017-06-20 16:12:25 +02:00
qpel.asm
qpeldsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
qpeldsp.asm
rnd_template.c
rv34dsp_init.c x86/rv34dsp: add ff_rv34_idct_dc_add_sse2 2017-02-02 17:51:21 -03:00
rv34dsp.asm x86/rv34dsp: add ff_rv34_idct_dc_add_sse2 2017-02-02 17:51:21 -03:00
rv40dsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
rv40dsp.asm Merge commit '6eef263aca281fb582e1fa3d841ac20ef747a252' 2017-10-12 13:48:35 -03:00
sbcdsp_init.c sbcenc: add MMX optimizations 2018-03-07 22:26:53 +01:00
sbcdsp.asm sbcenc: add MMX optimizations 2018-03-07 22:26:53 +01:00
sbrdsp_init.c
sbrdsp.asm Revert "x86/sbrdsp: remove unnecessary sign extend instruction in apply_noise_main" 2017-07-05 10:29:15 -03:00
simple_idct10_template.asm avcodec/x86: allow future 8-bit simple idct to have "DC only hack" 2017-06-28 17:27:35 +02:00
simple_idct10.asm avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions 2017-06-28 17:27:35 +02:00
simple_idct.asm avcodec/x86: move simple_idct to external assembly 2017-05-30 13:20:42 +02:00
simple_idct.h avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions 2017-06-28 17:27:35 +02:00
snowdsp.c
svq1enc_init.c
svq1enc.asm
synth_filter_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
synth_filter.asm
takdsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
takdsp.asm
ttadsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
ttadsp.asm
ttaencdsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
ttaencdsp.asm
utvideodsp_init.c avcodec/utvideodsp : add avx2 version for the dsp 2017-11-21 09:00:42 +01:00
utvideodsp.asm x86/utvideodsp: reuse shared constants 2017-11-21 10:57:14 -03:00
v210-init.c libavcodec Adding ff_v210_planar_unpack AVX2 2019-05-02 19:21:37 +02:00
v210.asm x86/v210dec: use named registers 2019-05-03 01:20:18 -03:00
v210enc_init.c
v210enc.asm
vc1dsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
vc1dsp_loopfilter.asm Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6' 2017-09-26 18:48:06 -03:00
vc1dsp_mc.asm Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6' 2017-09-26 18:48:06 -03:00
vc1dsp_mmx.c
vc1dsp.h
videodsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
videodsp.asm Merge commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9' 2017-03-23 18:35:49 -03:00
vorbisdsp_init.c
vorbisdsp.asm x86/vorbisdsp: optimize ff_vorbis_inverse_coupling_sse 2017-06-15 23:20:05 -03:00
vp3dsp_init.c vp4: prevent unaligned memory access in loop filter 2019-10-30 10:06:38 +01:00
vp3dsp.asm Merge commit '6892df9294d93322d43255ada299507465bc93c8' 2017-03-19 18:41:26 +01:00
vp6dsp_init.c Merge commit '721d57e608dc4fd6c86f27c5ae76ef559d646220' 2017-03-19 17:15:24 -03:00
vp6dsp.asm Merge commit 'd9d26a3674f31f482f54e936fcb382160830877a' 2017-03-19 14:54:25 -03:00
vp8dsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
vp8dsp_loopfilter.asm Merge commit '802727b538b484e3f9d1345bfcc4ab24cfea8898' 2017-03-19 15:18:31 -03:00
vp8dsp.asm Merge commit '307eb1a8ee363db1fcf869e427a8deb6d9538881' 2017-10-21 12:28:39 -03:00
vp9dsp_init_10bpp.c
vp9dsp_init_12bpp.c
vp9dsp_init_16bpp_template.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
vp9dsp_init_16bpp.c avcodec/x86/vp9dsp_init_16bpp: Fix linking to missing ff_vp9_ipred_dr_32x32_16_avx2() on 32bit 2017-06-28 00:31:33 +02:00
vp9dsp_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
vp9dsp_init.h vp9: re-split the decoder/format/dsp interface header files. 2017-03-28 18:04:26 -04:00
vp9intrapred_16bpp.asm avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation 2017-06-27 16:10:50 -04:00
vp9intrapred.asm
vp9itxfm_16bpp.asm
vp9itxfm_template.asm
vp9itxfm.asm
vp9lpf_16bpp.asm
vp9lpf.asm
vp9mc_16bpp.asm
vp9mc.asm Merge commit 'e99ecda55082cb9dde8fd349361e169dc383943a' 2017-03-16 20:25:39 +01:00
vp56_arith.h
w64xmmtest.c Merge commit 'de2ae3c1fae5a2eb539b9abd7bc2a9ca8c286ff0' 2017-03-21 14:43:53 +01:00
xvididct_init.c build: Generalize yasm/nasm-related variable names 2017-06-21 17:00:29 -03:00
xvididct.asm
xvididct.h Merge commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428' 2017-03-21 14:29:52 -03:00