FFmpeg

mirror of https://github.com/xenia-project/FFmpeg.git synced 2024-12-12 13:46:17 +00:00

Author	SHA1	Message	Date
Clément Bœsch	4563a86f01	Merge commit 'ab3554e1a7c04a5ea30f9c905de92348478ef7c8' * commit 'ab3554e1a7c04a5ea30f9c905de92348478ef7c8': configure: Drop check_lib()/require() in favor of check_lib2()/require2() Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 12:23:02 +01:00
Clément Bœsch	8e9dfe0d29	Merge commit '468bfe38c66d4d020984158e53b09a6a5749f394' * commit '468bfe38c66d4d020984158e53b09a6a5749f394': ppc: mpegvideo: Add proper runtime AltiVec detection Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 12:08:11 +01:00
Clément Bœsch	7c54e5870f	Merge commit '6ce93757ee6b81fe727bfdc9f546fd0ddf9139c3' * commit '6ce93757ee6b81fe727bfdc9f546fd0ddf9139c3': ppc: Update #endif comments This commit is mostly a noop as we seem to support PPC LE (see `902ce2a6c4`). Only the h264 chunks are updated. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 12:06:51 +01:00
Clément Bœsch	9e8fd5c423	Merge commit 'caccb3a0cdc7ee32cbed7eab156d35025133eadc' * commit 'caccb3a0cdc7ee32cbed7eab156d35025133eadc': audiodsp: ppc: Add VSX variant Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:57:32 +01:00
Clément Bœsch	3c8f7a8f6b	Merge commit 'e89cef40506d990a982aefedfde7d3ca4f88c524' * commit 'e89cef40506d990a982aefedfde7d3ca4f88c524': checkasm: Read the unsigned value as it should Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:55:20 +01:00
Clément Bœsch	9785b1e21b	Merge commit '75d642a944d5579e4ef20ff3701422a64692afcf' * commit '75d642a944d5579e4ef20ff3701422a64692afcf': vaapi_vp8: Explicitly include libva vp8 decode header vaapi_decode: Ignore the profile when not useful lavc/vaapi: Add VP8 decode hwaccel vp8: Add hwaccel hooks This merge is a noop as these commits are already under review on the mailing list. doc/libav-merge.txt is updated to track its progress. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:54:29 +01:00
Clément Bœsch	eed8ccde3e	Merge commit '131a85a1fed9966bbd38517f76abfac0237e39dc' * commit '131a85a1fed9966bbd38517f76abfac0237e39dc': utvideo: Change type of array stride parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:33:48 +01:00
Clément Bœsch	8316a0e08b	Merge commit '52730e0f867fe77b7d2353d8b44e92edb7079ca5' * commit '52730e0f867fe77b7d2353d8b44e92edb7079ca5': iir_filter: Change type of array stride parameters to ptrdiff_t The merge also updates the MIPS code and drops the extra log.h include. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:27:48 +01:00
Clément Bœsch	d36a423445	Merge commit '6b52762951fa138eef59e2628dabb389e0500e40' * commit '6b52762951fa138eef59e2628dabb389e0500e40': error_resilience: Change type of array stride parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:10:46 +01:00
Clément Bœsch	100026bed6	Merge commit 'ec903058447ad5be34d89533962e9ae1aa1c78f7' * commit 'ec903058447ad5be34d89533962e9ae1aa1c78f7': configure: Simplify clock_gettime() test nanosleep check also updated. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:04:50 +01:00
Clément Bœsch	38343651a8	Merge commit '3aa9d37d03da3c9b482d19b3988659287815280e' * commit '3aa9d37d03da3c9b482d19b3988659287815280e': build: Fix directory dependencies of tests/pixfmts.mak target This might not be necessary given our mkdirs in the configure, but it probably doesn't hurt. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 11:01:02 +01:00
Clément Bœsch	4ae80c3753	Merge commit '0e5dde739943168d6f61d3fb40b3f622e7abfeff' * commit '0e5dde739943168d6f61d3fb40b3f622e7abfeff': configure: Fix --disable-pod2man / --disable-texi2html This commit is a noop, we have dedicated documentation option for this purpose. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 10:47:01 +01:00
Clément Bœsch	d0db00c808	configure: remove pod2man from the config list The configure has the --disable-manpages option for this purpose, and --disable-pod2man is currently ignored due to that. This is also consistent with the other documentation options.	2017-03-20 10:45:48 +01:00
Clément Bœsch	715f781834	Merge commit 'b8c2d407efa41c3db6813ad67fadd51b814765bd' * commit 'b8c2d407efa41c3db6813ad67fadd51b814765bd': configure: Simplify libopenjpeg check This commit is a noop, our libopenjpeg check is already "simpler". Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:48:22 +01:00
Clément Bœsch	6d6f79c737	Merge commit '2610c9528f86286e4c6e174411a26ff5b4815cde' * commit '2610c9528f86286e4c6e174411a26ff5b4815cde': configure: Move initial VAAPI check to a more sensible place This commit is a noop, see `17989dcf54` Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:46:33 +01:00
Clément Bœsch	7317b69630	Merge commit '5b5ed92d92252a685e891a5d636870e223b63228' * commit '5b5ed92d92252a685e891a5d636870e223b63228': sanm: Change type of array pitch parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:43:52 +01:00
Clément Bœsch	64926292a6	lavc/copy_block: style fix	2017-03-20 09:23:15 +01:00
Clément Bœsch	21c18b0878	Merge commit '73f5e17a203713c4ac4e5a821809823b383b195f' * commit '73f5e17a203713c4ac4e5a821809823b383b195f': copy_block: Change type of array stride parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:22:36 +01:00
Clément Bœsch	e59d8d030f	Merge commit '21e500ba647aec233d5930d3d1081489d0d53ceb' * commit '21e500ba647aec233d5930d3d1081489d0d53ceb': svq1dec: Change type of array pitch parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:17:34 +01:00
Clément Bœsch	bb3ad401fc	Merge commit '746c56b7730ce09397d3a8354acc131285e9d829' * commit '746c56b7730ce09397d3a8354acc131285e9d829': indeo: Change type of array pitch parameters to ptrdiff_t Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 09:07:57 +01:00
Clément Bœsch	3835283293	Merge commit '4fb311c804098d78e5ce5f527f9a9c37536d3a08' * commit '4fb311c804098d78e5ce5f527f9a9c37536d3a08': Drop memalign hack Merged, as this may indeed be unneeded since `46e3936fb0`. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:54:44 +01:00
Clément Bœsch	a5cf6628d6	Merge commit 'f01f7a7846529b7c3ef343f117eaa2c0a1457af0' * commit 'f01f7a7846529b7c3ef343f117eaa2c0a1457af0': hwcontext_dxva2: use the special UC copy for downloading frames Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:37:40 +01:00
Clément Bœsch	8200b16a9c	Merge commit 'd7bc52bf456deba0f32d9fe5c288ec441f1ebef5' * commit 'd7bc52bf456deba0f32d9fe5c288ec441f1ebef5': imgutils: add a function for copying image data from GPU mapped memory Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:34:10 +01:00
Clément Bœsch	5d23543277	Merge commit '24da430324735f95880c4a4a54298dc8023125bb' * commit '24da430324735f95880c4a4a54298dc8023125bb': Changelog: mark the release 12 branch This commit is a noop. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:26:09 +01:00
Clément Bœsch	518961bc99	Merge commit '851960f6f8cf1f946fe42fa36cf6598fac68072c' * commit '851960f6f8cf1f946fe42fa36cf6598fac68072c': lavc: Remove old vaapi decode infrastructure avconv_vaapi: Convert to use hw_frames_ctx only vaapi_mpeg4: Convert to use the new VAAPI hwaccel code vaapi_vc1: Convert to use the new VAAPI hwaccel code vaapi_mpeg2: Convert to use the new VAAPI hwaccel code vaapi_h264: Convert to use the new VAAPI hwaccel code lavc: Rewrite VAAPI decode infrastructure This merge is a noop, these commits have already been cherry-picked. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:25:01 +01:00
Clément Bœsch	464fcc979c	Merge commit '72eba6558ee4f10239ba3f472c0b033ec70082a7' * commit '72eba6558ee4f10239ba3f472c0b033ec70082a7': wmavoice: Simplify GetBitContext initialization This commit is a noop. We don't have that code anymore since `3deb4b54a2`. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:21:09 +01:00
Clément Bœsch	e514a1d404	Merge commit '80fc75d51e3312e1890591048eb6a3d499b6e49d' * commit '80fc75d51e3312e1890591048eb6a3d499b6e49d': Changelog: Mention mov with multiple stsd Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:19:03 +01:00
Clément Bœsch	45982bdcd0	Merge commit '728e80cd2e1d4b7c3e26489efcd77bd7a9e84a99' * commit '728e80cd2e1d4b7c3e26489efcd77bd7a9e84a99': High Definition Compatible Digital (HDCD) decoder filter, using libhdcd This commit is a noop, we have that code natively. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:17:09 +01:00
Clément Bœsch	b1a80bdb62	Merge commit '95f80293456d9d4b1b096621260c38bc90325ec0' * commit '95f80293456d9d4b1b096621260c38bc90325ec0': avprobe: Fix memory leak This commit is a noop, ffprobe is not affected. Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:12:57 +01:00
Clément Bœsch	5e5e793552	doc/APIchanges: fill date & hash for AV_PIX_FMT_FLAG_BAYER	2017-03-20 08:10:54 +01:00
Clément Bœsch	6557d784d2	Merge commit '8db804e8f549d5b86a1edf62736e0ef80f160da9' * commit '8db804e8f549d5b86a1edf62736e0ef80f160da9': mov: Remove old b-frame/video delay heuristic This commit is a noop, see `425be3c810` Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:09:15 +01:00
Clément Bœsch	64722057b4	Merge commit 'eb96505b761eb02b6a3efc76d854afa6a41941ff' * commit 'eb96505b761eb02b6a3efc76d854afa6a41941ff': mov: Remove ancient heuristic hack This commit is a noop, see `04f8d31287` Merged-by: Clément Bœsch <u@pkh.me>	2017-03-20 08:08:31 +01:00
Clément Bœsch	e811f84a2e	swscale: cosmetics in is{RGB,BGR}inInt Reduce diff with Libav.	2017-03-20 08:02:30 +01:00
Clément Bœsch	d6635daded	swscale: remove unused is{RGB,BGR}inBytes	2017-03-20 08:02:30 +01:00
Clément Bœsch	ff6bc16c5a	swscale: use a (more correct) function for isPacked	2017-03-20 08:02:30 +01:00
Clément Bœsch	2b9a52bcca	swscale: use a function for isAnyRGB	2017-03-20 08:02:30 +01:00
Clément Bœsch	c30875e8b2	swscale: use a function for isBayer	2017-03-20 08:02:30 +01:00
Clément Bœsch	9c2436e1e7	lavu: add AV_PIX_FMT_FLAG_BAYER	2017-03-20 08:02:30 +01:00
Clément Bœsch	f052b1b40f	swscale: use a function for isGray	2017-03-20 08:02:30 +01:00
Clément Bœsch	08e1376d81	fate: add fate-sws-pixdesc-query Test the pixel format querying within libswscale.	2017-03-20 08:02:30 +01:00
Michael Niedermayer	23f3f92361	avcodec/mjpegdec: quant_matrixes can be up to 65535, use uint16_t Fixes invalid shift Fixes: 870/clusterfuzz-testcase-5649105424482304 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-03-20 01:38:04 +01:00
Michael Niedermayer	656a17e126	avcodec/mjpegdec: Check quant_matrixes values for being non zero Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-03-20 01:38:02 +01:00
Michael Niedermayer	98da63b3f5	avcodec/vp56: Check avctx->error_concealment before enabling EC Fixes timeout with 847/clusterfuzz-testcase-5291877358108672 Fixes timeout with 850/clusterfuzz-testcase-5721296509861888 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-03-20 01:33:08 +01:00
Michael Niedermayer	a84d610b37	avcodec/h264_direct: Fix runtime error: signed integer overflow: -9 - 2147483647 cannot be represented in type 'int' Fixes: 864/clusterfuzz-testcase-4774385942528000 See: [FFmpeg-devel] [PATCH 1/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: 2147483647 - -14133 cannot be represented in type 'int' See: [FFmpeg-devel] [PATCH 2/2] avcodec/h264_direct: Fix runtime error: signed integer overflow: -9 - 2147483647 cannot be represented in type 'int' Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-03-20 01:33:08 +01:00
Michael Niedermayer	5d996b5649	avcodec/tiff: Check stripsize strippos for overflow Fixes: 861/clusterfuzz-testcase-5688284384591872 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-03-20 01:33:08 +01:00
Martin Storsjö	61b8a9ea29	aarch64: vp9itxfm16: Do a simpler half/quarter idct16/idct32 when possible
    This work is sponsored by, and copyright, Google.
    This avoids loading and calculating coefficients that we know will be zero, and avoids filling the temp buffer with zeros in places where we know the second pass won't read.
    This gives a pretty substantial speedup for the smaller subpartitions.
    The code size increases from 21512 bytes to 31400 bytes.
    The idct16/32_end macros are moved above the individual functions; the instructions themselves are unchanged, but since new functions are added at the same place where the code is moved from, the diff looks rather messy.
    Before:
    vp9_inv_dct_dct_16x16_sub1_add_10_neon:     284.6
    vp9_inv_dct_dct_16x16_sub2_add_10_neon:    1902.7
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:    1903.0
    vp9_inv_dct_dct_16x16_sub8_add_10_neon:    2201.1
    vp9_inv_dct_dct_16x16_sub12_add_10_neon:   2510.0
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:   2821.3
    vp9_inv_dct_dct_32x32_sub1_add_10_neon:    1011.6
    vp9_inv_dct_dct_32x32_sub2_add_10_neon:    9716.5
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:    9704.9
    vp9_inv_dct_dct_32x32_sub8_add_10_neon:   10641.7
    vp9_inv_dct_dct_32x32_sub12_add_10_neon:  11555.7
    vp9_inv_dct_dct_32x32_sub16_add_10_neon:  12499.8
    vp9_inv_dct_dct_32x32_sub20_add_10_neon:  13403.7
    vp9_inv_dct_dct_32x32_sub24_add_10_neon:  14335.8
    vp9_inv_dct_dct_32x32_sub28_add_10_neon:  15253.6
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:  16179.5
    After:
    vp9_inv_dct_dct_16x16_sub1_add_10_neon:     282.8
    vp9_inv_dct_dct_16x16_sub2_add_10_neon:    1142.4
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:    1139.0
    vp9_inv_dct_dct_16x16_sub8_add_10_neon:    1772.9
    vp9_inv_dct_dct_16x16_sub12_add_10_neon:   2515.2
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:   2823.5
    vp9_inv_dct_dct_32x32_sub1_add_10_neon:    1012.7
    vp9_inv_dct_dct_32x32_sub2_add_10_neon:    6944.4
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:    6944.2
    vp9_inv_dct_dct_32x32_sub8_add_10_neon:    7609.8
    vp9_inv_dct_dct_32x32_sub12_add_10_neon:   9953.4
    vp9_inv_dct_dct_32x32_sub16_add_10_neon:  10770.1
    vp9_inv_dct_dct_32x32_sub20_add_10_neon:  13418.8
    vp9_inv_dct_dct_32x32_sub24_add_10_neon:  14330.7
    vp9_inv_dct_dct_32x32_sub28_add_10_neon:  15257.1
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:  16190.6
    Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-19 22:54:37 +02:00
Martin Storsjö	eabc5abf94	arm: vp9itxfm16: Do a simpler half/quarter idct16/idct32 when possible
    This work is sponsored by, and copyright, Google.
    This avoids loading and calculating coefficients that we know will be zero, and avoids filling the temp buffer with zeros in places where we know the second pass won't read.
    This gives a pretty substantial speedup for the smaller subpartitions.
    The code size increases from 14516 bytes to 22484 bytes.
    The idct16/32_end macros are moved above the individual functions; the instructions themselves are unchanged, but since new functions are added at the same place where the code is moved from, the diff looks rather messy.
    Before:                              Cortex      A7       A8       A9      A53
    vp9_inv_dct_dct_16x16_sub1_add_10_neon:       454.0    270.7    418.5    295.4
    vp9_inv_dct_dct_16x16_sub2_add_10_neon:      3840.2   3244.8   3700.1   2337.9
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:      4212.5   3575.4   3996.9   2571.6
    vp9_inv_dct_dct_16x16_sub8_add_10_neon:      5174.4   4270.5   4615.5   3031.9
    vp9_inv_dct_dct_16x16_sub12_add_10_neon:     5676.0   4908.5   5226.5   3491.3
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:     6403.9   5589.0   5839.8   3948.5
    vp9_inv_dct_dct_32x32_sub1_add_10_neon:      1710.7    944.7   1582.1   1045.4
    vp9_inv_dct_dct_32x32_sub2_add_10_neon:     21040.7  16706.1  18687.7  13193.1
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:     22197.7  18282.7  19577.5  13918.6
    vp9_inv_dct_dct_32x32_sub8_add_10_neon:     24511.5  20911.5  21472.5  15367.5
    vp9_inv_dct_dct_32x32_sub12_add_10_neon:    26939.5  24264.3  23239.1  16830.3
    vp9_inv_dct_dct_32x32_sub16_add_10_neon:    29419.5  26845.1  25020.6  18259.9
    vp9_inv_dct_dct_32x32_sub20_add_10_neon:    31146.4  29633.5  26803.3  19721.7
    vp9_inv_dct_dct_32x32_sub24_add_10_neon:    33376.3  32507.8  28642.4  21174.2
    vp9_inv_dct_dct_32x32_sub28_add_10_neon:    35629.4  35439.6  30416.5  22625.7
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:    37269.9  37914.9  32271.9  24078.9
    After:
    vp9_inv_dct_dct_16x16_sub1_add_10_neon:       454.0    276.0    418.5    295.1
    vp9_inv_dct_dct_16x16_sub2_add_10_neon:      2336.2   1886.0   2251.0   1458.6
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:      2531.0   2054.7   2402.8   1591.1
    vp9_inv_dct_dct_16x16_sub8_add_10_neon:      3848.6   3491.1   3845.7   2554.8
    vp9_inv_dct_dct_16x16_sub12_add_10_neon:     5703.8   4831.6   5230.8   3493.4
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:     6399.5   5567.0   5832.4   3951.5
    vp9_inv_dct_dct_32x32_sub1_add_10_neon:      1722.1    938.5   1577.3   1044.5
    vp9_inv_dct_dct_32x32_sub2_add_10_neon:     15003.5  11576.8  13105.8   9602.2
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:     15768.5  12677.2  13726.0  10138.1
    vp9_inv_dct_dct_32x32_sub8_add_10_neon:     17278.8  14825.4  14907.5  11185.7
    vp9_inv_dct_dct_32x32_sub12_add_10_neon:    22335.7  21544.5  20379.5  15019.8
    vp9_inv_dct_dct_32x32_sub16_add_10_neon:    24165.6  23881.7  21938.6  16308.2
    vp9_inv_dct_dct_32x32_sub20_add_10_neon:    31082.2  30860.9  26835.3  19711.3
    vp9_inv_dct_dct_32x32_sub24_add_10_neon:    33102.6  31922.8  28638.3  21161.0
    vp9_inv_dct_dct_32x32_sub28_add_10_neon:    35104.9  34867.5  30411.7  22621.2
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:    37438.1  39103.4  32217.8  24067.6
    Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-19 22:54:33 +02:00
Martin Storsjö	d564c9018f	aarch64: vp9itxfm16: Move the load_add_store macro out from the itxfm16 pass2 function This allows reusing the macro for a separate implementation of the pass2 function. Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-19 22:54:30 +02:00
Martin Storsjö	0f2705e66b	aarch64: vp9itxfm16: Make the larger core transforms standalone functions
    This work is sponsored by, and copyright, Google.
    This reduces the code size of libavcodec/aarch64/vp9itxfm_16bpp_neon.o from 26288 to 21512 bytes.
    This gives a small slowdown of a couple of tens of cycles, but makes it more feasible to add more optimized versions of these transforms.
    Before:
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:    1887.4
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:   2801.5
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:    9691.4
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:  16154.9
    After:
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:    1899.5
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:   2827.2
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:    9714.7
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:  16175.9
    Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-19 22:54:26 +02:00
Martin Storsjö	0ea603203d	arm: vp9itxfm16: Make the larger core transforms standalone functions
    This work is sponsored by, and copyright, Google.
    This reduces the code size of libavcodec/arm/vp9itxfm_16bpp_neon.o from 17500 to 14516 bytes.
    This gives a small slowdown of a couple tens of cycles, up to around 150 cycles for the full case of the largest transform, but makes it more feasible to add more optimized versions of these transforms.
    Before:                              Cortex      A7       A8       A9      A53
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:      4237.4   3561.5   3971.8   2525.3
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:     6371.9   5452.0   5779.3   3910.5
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:     22068.8  17867.5  19555.2  13871.6
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:    37268.9  38684.2  32314.2  23969.0
    After:
    vp9_inv_dct_dct_16x16_sub4_add_10_neon:      4375.1   3571.9   4283.8   2567.2
    vp9_inv_dct_dct_16x16_sub16_add_10_neon:     6415.6   5578.9   5844.6   3948.3
    vp9_inv_dct_dct_32x32_sub4_add_10_neon:     22653.7  18079.7  19603.7  13905.3
    vp9_inv_dct_dct_32x32_sub32_add_10_neon:    37593.2  38862.2  32235.8  24070.9
    Signed-off-by: Martin Storsjö <martin@martin.st>	2017-03-19 22:54:19 +02:00

84235 Commits