* commit 'ab3554e1a7c04a5ea30f9c905de92348478ef7c8':
configure: Drop check_lib()/require() in favor of check_lib2()/require2()
Merged-by: Clément Bœsch <u@pkh.me>
* commit '6ce93757ee6b81fe727bfdc9f546fd0ddf9139c3':
ppc: Update #endif comments
This commit is mostly a noop as we seem to support PPC LE (see
902ce2a6c4). Only the h264 chunks are updated.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '75d642a944d5579e4ef20ff3701422a64692afcf':
vaapi_vp8: Explicitly include libva vp8 decode header
vaapi_decode: Ignore the profile when not useful
lavc/vaapi: Add VP8 decode hwaccel
vp8: Add hwaccel hooks
This merge is a noop as these commits are already under review on the
mailing list. doc/libav-merge.txt is updated to track its progress.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '52730e0f867fe77b7d2353d8b44e92edb7079ca5':
iir_filter: Change type of array stride parameters to ptrdiff_t
The merge also updates the MIPS code and drops the extra log.h include.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '3aa9d37d03da3c9b482d19b3988659287815280e':
build: Fix directory dependencies of tests/pixfmts.mak target
This might not be necessary given the mkdir calls in configure, but it
probably doesn't hurt.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '0e5dde739943168d6f61d3fb40b3f622e7abfeff':
configure: Fix --disable-pod2man / --disable-texi2html
This commit is a noop; we have a dedicated documentation option for this
purpose.
Merged-by: Clément Bœsch <u@pkh.me>
configure has the --disable-manpages option for this purpose, and
--disable-pod2man is currently ignored because of that. This is also
consistent with the other documentation options.
* commit '2610c9528f86286e4c6e174411a26ff5b4815cde':
configure: Move initial VAAPI check to a more sensible place
This commit is a noop, see 17989dcf54
Merged-by: Clément Bœsch <u@pkh.me>
* commit '4fb311c804098d78e5ce5f527f9a9c37536d3a08':
Drop memalign hack
Merged, as this may indeed be unneeded since 46e3936fb0.
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'f01f7a7846529b7c3ef343f117eaa2c0a1457af0':
hwcontext_dxva2: use the special UC copy for downloading frames
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'd7bc52bf456deba0f32d9fe5c288ec441f1ebef5':
imgutils: add a function for copying image data from GPU mapped memory
Merged-by: Clément Bœsch <u@pkh.me>
* commit '851960f6f8cf1f946fe42fa36cf6598fac68072c':
lavc: Remove old vaapi decode infrastructure
avconv_vaapi: Convert to use hw_frames_ctx only
vaapi_mpeg4: Convert to use the new VAAPI hwaccel code
vaapi_vc1: Convert to use the new VAAPI hwaccel code
vaapi_mpeg2: Convert to use the new VAAPI hwaccel code
vaapi_h264: Convert to use the new VAAPI hwaccel code
lavc: Rewrite VAAPI decode infrastructure
This merge is a noop, these commits have already been cherry-picked.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '72eba6558ee4f10239ba3f472c0b033ec70082a7':
wmavoice: Simplify GetBitContext initialization
This commit is a noop. We don't have that code anymore since 3deb4b54a2.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '728e80cd2e1d4b7c3e26489efcd77bd7a9e84a99':
High Definition Compatible Digital (HDCD) decoder filter, using libhdcd
This commit is a noop, we have that code natively.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '95f80293456d9d4b1b096621260c38bc90325ec0':
avprobe: Fix memory leak
This commit is a noop, ffprobe is not affected.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '8db804e8f549d5b86a1edf62736e0ef80f160da9':
mov: Remove old b-frame/video delay heuristic
This commit is a noop, see 425be3c810
Merged-by: Clément Bœsch <u@pkh.me>
* commit 'eb96505b761eb02b6a3efc76d854afa6a41941ff':
mov: Remove ancient heuristic hack
This commit is a noop, see 04f8d31287
Merged-by: Clément Bœsch <u@pkh.me>
Fixes timeout with 847/clusterfuzz-testcase-5291877358108672
Fixes timeout with 850/clusterfuzz-testcase-5721296509861888
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 864/clusterfuzz-testcase-4774385942528000
See: [FFmpeg-devel] [PATCH 1/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: 2147483647 - -14133 cannot be represented in type 'int'
See: [FFmpeg-devel] [PATCH 2/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: -9 - 2147483647 cannot be represented in type 'int'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
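The two overflow reports cited above can be reproduced arithmetically. The following is a minimal sketch, not the fix applied in h264_direct: it assumes a 32-bit `int`, and the helper name `sub_overflows_int32` is hypothetical. It only checks whether a subtraction would leave the representable range.

```python
# Range of a 32-bit signed integer (assumed width of C 'int' here).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def sub_overflows_int32(a, b):
    """Return True if a - b falls outside the range of a 32-bit signed int."""
    # Python integers have arbitrary precision, so this difference is exact.
    result = a - b
    return result < INT_MIN or result > INT_MAX

# Operands taken from the two runtime error reports referenced above.
print(sub_overflows_int32(2147483647, -14133))
print(sub_overflows_int32(-9, 2147483647))
```

Both operand pairs fall outside the 32-bit range, which is why the sanitizer flags them as undefined behavior in C.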
This work is sponsored by, and copyright, Google.
This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.
This gives a pretty substantial speedup for the smaller subpartitions.
The code size increases from 21512 bytes to 31400 bytes.
The idct16/32_end macros are moved above the individual functions; the
instructions themselves are unchanged, but since new functions are added
at the same place where the code is moved from, the diff looks rather
messy.
Before:
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 1902.7
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1903.0
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 2201.1
vp9_inv_dct_dct_16x16_sub12_add_10_neon: 2510.0
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2821.3
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1011.6
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 9716.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9704.9
vp9_inv_dct_dct_32x32_sub8_add_10_neon: 10641.7
vp9_inv_dct_dct_32x32_sub12_add_10_neon: 11555.7
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 12499.8
vp9_inv_dct_dct_32x32_sub20_add_10_neon: 13403.7
vp9_inv_dct_dct_32x32_sub24_add_10_neon: 14335.8
vp9_inv_dct_dct_32x32_sub28_add_10_neon: 15253.6
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16179.5
After:
vp9_inv_dct_dct_16x16_sub1_add_10_neon: 282.8
vp9_inv_dct_dct_16x16_sub2_add_10_neon: 1142.4
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1139.0
vp9_inv_dct_dct_16x16_sub8_add_10_neon: 1772.9
vp9_inv_dct_dct_16x16_sub12_add_10_neon: 2515.2
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2823.5
vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon: 6944.4
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 6944.2
vp9_inv_dct_dct_32x32_sub8_add_10_neon: 7609.8
vp9_inv_dct_dct_32x32_sub12_add_10_neon: 9953.4
vp9_inv_dct_dct_32x32_sub16_add_10_neon: 10770.1
vp9_inv_dct_dct_32x32_sub20_add_10_neon: 13418.8
vp9_inv_dct_dct_32x32_sub24_add_10_neon: 14330.7
vp9_inv_dct_dct_32x32_sub28_add_10_neon: 15257.1
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16190.6
Signed-off-by: Martin Storsjö <martin@martin.st>
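To put the speedup for the smaller subpartitions in relative terms, the before/after cycle counts can be divided directly. This is a small sketch using a subset of the figures listed above; the dictionary layout and the `speedup` helper are illustrative only and not part of the checkasm tooling.

```python
# Cycle counts copied from the benchmark tables above (10 bpp, NEON).
before = {
    "16x16_sub2": 1902.7, "16x16_sub4": 1903.0, "16x16_sub8": 2201.1,
    "32x32_sub2": 9716.5, "32x32_sub4": 9704.9, "32x32_sub8": 10641.7,
    "32x32_sub16": 12499.8,
}
after = {
    "16x16_sub2": 1142.4, "16x16_sub4": 1139.0, "16x16_sub8": 1772.9,
    "32x32_sub2": 6944.4, "32x32_sub4": 6944.2, "32x32_sub8": 7609.8,
    "32x32_sub16": 10770.1,
}

def speedup(cycles_before, cycles_after):
    """Ratio of old to new cycle count; values above 1.0 mean faster."""
    return cycles_before / cycles_after

for name in before:
    print(f"{name}: {speedup(before[name], after[name]):.2f}x")
```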
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/aarch64/vp9itxfm_16bpp_neon.o from
26288 to 21512 bytes.
This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.
Before:
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1887.4
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2801.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9691.4
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16154.9
After:
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 1899.5
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 2827.2
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 9714.7
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 16175.9
Signed-off-by: Martin Storsjö <martin@martin.st>
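The code size figure quoted above for vp9itxfm_16bpp_neon.o can be expressed as a relative change. A minimal sketch follows; the helper name `size_change_percent` is hypothetical.

```python
def size_change_percent(old_bytes, new_bytes):
    """Relative change in object size, in percent (negative means smaller)."""
    return (new_bytes - old_bytes) / old_bytes * 100.0

# Object size before and after, as stated in the commit message above.
old_size, new_size = 26288, 21512
print(f"saved {old_size - new_size} bytes "
      f"({size_change_percent(old_size, new_size):+.1f}%)")
```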
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_16bpp_neon.o from
17500 to 14516 bytes.
This gives a small slowdown of a couple of tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.
Before:                                 Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub4_add_10_neon:    4237.4   3561.5   3971.8   2525.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon:   6371.9   5452.0   5779.3   3910.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon:   22068.8  17867.5  19555.2  13871.6
vp9_inv_dct_dct_32x32_sub32_add_10_neon:  37268.9  38684.2  32314.2  23969.0
After:
vp9_inv_dct_dct_16x16_sub4_add_10_neon:    4375.1   3571.9   4283.8   2567.2
vp9_inv_dct_dct_16x16_sub16_add_10_neon:   6415.6   5578.9   5844.6   3948.3
vp9_inv_dct_dct_32x32_sub4_add_10_neon:   22653.7  18079.7  19603.7  13905.3
vp9_inv_dct_dct_32x32_sub32_add_10_neon:  37593.2  38862.2  32235.8  24070.9
Signed-off-by: Martin Storsjö <martin@martin.st>
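The per-core cost of the change can be read off the tables above by subtracting the "Before" row from the "After" row for each function. The sketch below parses the rows as plain text and prints the difference in cycles per core; the helper names `parse` and `deltas` are illustrative, not part of any FFmpeg tooling.

```python
CORES = ["A7", "A8", "A9", "A53"]

BEFORE = """
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 4237.4 3561.5 3971.8 2525.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 6371.9 5452.0 5779.3 3910.5
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22068.8 17867.5 19555.2 13871.6
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37268.9 38684.2 32314.2 23969.0
"""

AFTER = """
vp9_inv_dct_dct_16x16_sub4_add_10_neon: 4375.1 3571.9 4283.8 2567.2
vp9_inv_dct_dct_16x16_sub16_add_10_neon: 6415.6 5578.9 5844.6 3948.3
vp9_inv_dct_dct_32x32_sub4_add_10_neon: 22653.7 18079.7 19603.7 13905.3
vp9_inv_dct_dct_32x32_sub32_add_10_neon: 37593.2 38862.2 32235.8 24070.9
"""

def parse(table):
    """Map each function name to its list of per-core cycle counts."""
    rows = {}
    for line in table.strip().splitlines():
        name, values = line.split(":")
        rows[name.strip()] = [float(v) for v in values.split()]
    return rows

def deltas(before, after):
    """Per-core difference (after - before); positive means slower."""
    return {
        name: [round(a - b, 1) for b, a in zip(before[name], after[name])]
        for name in before
    }

before_rows, after_rows = parse(BEFORE), parse(AFTER)
for name, diff in deltas(before_rows, after_rows).items():
    print(name, dict(zip(CORES, diff)))
```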