* commit '218d6844b37d339ffbf2044ad07d8be7767e2734':
h264dsp: Factorize code into a new function, h264_find_start_code_candidate
Conflicts:
libavcodec/h264_parser.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This performs the start code search which was previously part of
h264_find_frame_end() - the most CPU intensive part of the function.
By itself, this results in a performance regression:
                   Before            After
                Mean   StdDev    Mean   StdDev   Change
Overall time   2925.6   26.2    3068.5   31.7    -4.7%
but this can be more than made up for by platform-optimised
implementations of the function.
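
For illustration only, the split described above could look roughly like
the sketch below. It assumes a "candidate" is simply the first zero byte
in the buffer, since every start code begins with one. The names
find_start_code_candidate() and find_start_code() are hypothetical and
are not the functions in libavcodec. The real h264_find_frame_end()
tracks parser state across calls, which this sketch does not attempt.

```c
#include <stdint.h>

/* Hypothetical helper (assumption: a candidate is the first zero byte).
 * Returns the index of the first zero byte in buf, or size if none. */
static int find_start_code_candidate(const uint8_t *buf, int size)
{
    int i;
    for (i = 0; i < size; i++)
        if (buf[i] == 0)
            break;
    return i;
}

/* Hypothetical caller: returns the offset of the first 00 00 01
 * sequence in buf, or -1 if there is none. The byte-by-byte scan is
 * delegated to the helper, so only the helper needs a fast version. */
static int find_start_code(const uint8_t *buf, int size)
{
    int pos = 0;
    while (pos < size) {
        /* Skip ahead to the next zero byte. */
        pos += find_start_code_candidate(buf + pos, size - pos);
        if (pos + 2 >= size)
            return -1;              /* not enough bytes left for 00 00 01 */
        if (buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
        pos++;                      /* false candidate, keep searching */
    }
    return -1;
}
```

Keeping the inner scan behind its own function is what lets a platform
swap in an optimised version without touching the surrounding logic.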
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (25 commits)
rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MC
ape: Use unsigned integer maths
arm: dsputil: fix overreads in put/avg_pixels functions
h264: K&R formatting cosmetics for header files (part II/II)
h264: K&R formatting cosmetics for header files (part I/II)
rtmp: Implement check bandwidth notification.
rtmp: Support 'rtmp_swfurl', an option which specifies the URL of the SWF player.
rtmp: Support 'rtmp_flashver', an option which overrides the version of the Flash plugin.
rtmp: Support 'rtmp_tcurl', an option which overrides the URL of the target stream.
cmdutils: Add fallback case to switch in check_stream_specifier().
sctp: be consistent with socket option level
configure: Add _XOPEN_SOURCE=600 to Solaris preprocessor flags.
vcr1enc: drop pointless empty encode_init() wrapper function
vcr1: drop pointless write-only AVCodecContext member from VCR1Context
vcr1: group encoder code together to save #ifdefs
vcr1: cosmetics: K&R prettyprinting, typos, parentheses, dead code, comments
mov: make one comment slightly more specific
lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX
lavfi: move audio-related functions to a separate file.
lavfi: remove some audio-related function from public API.
...
Conflicts:
cmdutils.c
libavcodec/h264.h
libavcodec/h264_mvpred.h
libavcodec/vcr1.c
libavfilter/avfilter.c
libavfilter/avfilter.h
libavfilter/defaults.c
libavfilter/internal.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
libschroedinger: Switch to function names more in line with Libav style.
Move code shared between libdirac and libschroedinger to libschroedinger.
lavfi: uninline avfilter_copy_buffer_ref_props().
lavf: add missing '*' in a doxy.
h264: Remove a commented-out function pointer typedef.
txd: Remove write-only variable in txd_decode_frame().
mmvideo.c: Remove unused variable in mm_decode_pal().
build: cosmetics: Add missing end-of-line backslashes to item lists.
build: cosmetics: Split HEADERS/OBJS/PROGS lists into one entry per line.
libschroedinger: Move a function to avoid a forward declaration.
pthread: warn on high thread counts
vf_yadif: fix missing error handling for avfilter_poll_frame()
avprobe: allow showing only one container/stream property.
lavfi: support audio in avfilter_copy_frame_props().
lavfi: avfilter_merge_formats: handle case where inputs are same
lavc: add sample rate and channel layout to AVFrame.
zerocodec: check if the previous frame is missing
doc: clarify check for NULL pointer style
Conflicts:
doc/APIchanges
doc/developer.texi
ffprobe.c
libavcodec/Makefile
libavcodec/avcodec.h
libavcodec/libdirac_libschro.c
libavcodec/libdirac_libschro.h
libavcodec/mmvideo.c
libavcodec/txd.c
libavcodec/version.h
libavcodec/zerocodec.c
libavfilter/Makefile
libavfilter/avfilter.c
libavfilter/version.h
libavformat/Makefile
libavutil/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (35 commits)
flvdec: Do not call parse_keyframes_index with a NULL stream
libspeexdec: include system headers before local headers
libspeexdec: return meaningful error codes
libspeexdec: cosmetics: reindent
libspeexdec: decode one frame at a time.
swscale: fix signed shift overflows in ff_yuv2rgb_c_init_tables()
Move timefilter code from lavf to lavd.
mov: add support for hdvd and pgapmetadata atoms
mov: rename function _stik, some indentation cosmetics
mov: rename function _int8 to remove ambiguity, some indentation cosmetics
mov: parse the gnre atom
mp3on4: check for allocation failures in decode_init_mp3on4()
mp3on4: create a separate flush function for MP3onMP4.
mp3on4: ensure that the frame channel count does not exceed the codec channel count.
mp3on4: set channel layout
mp3on4: fix the output channel order
mp3on4: allocate temp buffer with av_malloc() instead of on the stack.
mp3on4: copy MPADSPContext from first context to all contexts.
fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
...
Conflicts:
libavcodec/arm/h264dsp_init_arm.c
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
libavcodec/h264_ps.c
libavcodec/h264dsp_template.c
libavcodec/h264idct_template.c
libavcodec/h264pred.c
libavcodec/h264pred_template.c
libavcodec/x86/h264dsp_mmx.c
libavdevice/Makefile
libavdevice/jack_audio.c
libavformat/Makefile
libavformat/flvdec.c
libavformat/flvenc.c
libavutil/pixfmt.h
libswscale/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
swscale: remove misplaced comment.
ffmpeg: fix streaming to ffserver.
swscale: split out RGB48 output functions from yuv2packed[12X]_c().
build: move vpath directives to main Makefile
swscale: fix JPEG-range YUV scaling artifacts.
build: move ALLFFLIBS to a more logical place
ARM: factor some repetitive code into macros
Fix SVQ3 after adding 4:4:4 H.264 support
H.264: fix CODEC_FLAG_GRAY
4:4:4 H.264 decoding support
ac3enc: fix allocation of floating point samples.
Conflicts:
ffmpeg.c
libavcodec/dsputil_template.c
libavcodec/h264.c
libavcodec/mpegvideo.c
libavcodec/snow.c
libswscale/swscale.c
libswscale/swscale_internal.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (30 commits)
AVOptions: make default_val a union, as proposed in AVOption2.
arm/h264pred: add missing argument type.
h264dsp_mmx: place bracket outside #if/#endif block.
lavf/utils: fix ff_interleave_compare_dts corner case.
fate: add 10-bit H264 tests.
h264: do not print "too many references" warning for intra-only.
Enable decoding of high bit depth h264.
Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
Add support for higher QP values in h264.
Add the notion of pixel size in h264 related functions.
Make the h264 loop filter bit depth aware.
Template dsputil_template.c with respect to pixel size, etc.
Template h264idct_template.c with respect to pixel size, etc.
Preparatory patch for high bit depth h264 decoding support.
Move some functions in dsputil.c into a new file dsputil_template.c.
Move the functions in h264idct into a new file h264idct_template.c.
Move the functions in h264pred.c into a new file h264pred_template.c.
Preparatory patch for high bit depth h264 decoding support.
Add pixel formats for 9- and 10-bit yuv420p.
Choose h264 chroma dc dequant function dynamically.
...
Conflicts:
doc/APIchanges
ffmpeg.c
ffplay.c
libavcodec/alpha/dsputil_alpha.c
libavcodec/arm/dsputil_init_arm.c
libavcodec/arm/dsputil_init_armv6.c
libavcodec/arm/dsputil_init_neon.c
libavcodec/arm/dsputil_iwmmxt.c
libavcodec/arm/h264pred_init_arm.c
libavcodec/bfin/dsputil_bfin.c
libavcodec/dsputil.c
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
libavcodec/h264_loopfilter.c
libavcodec/h264_ps.c
libavcodec/h264_refs.c
libavcodec/h264dsp.c
libavcodec/h264idct.c
libavcodec/h264pred.c
libavcodec/mlib/dsputil_mlib.c
libavcodec/options.c
libavcodec/ppc/dsputil_altivec.c
libavcodec/ppc/dsputil_ppc.c
libavcodec/ppc/h264_altivec.c
libavcodec/ps2/dsputil_mmi.c
libavcodec/sh4/dsputil_align.c
libavcodec/sh4/dsputil_sh4.c
libavcodec/sparc/dsputil_vis.c
libavcodec/utils.c
libavcodec/version.h
libavcodec/x86/dsputil_mmx.c
libavformat/options.c
libavformat/utils.c
libavutil/pixfmt.h
libswscale/swscale.c
libswscale/swscale_internal.h
libswscale/swscale_template.c
tests/ref/seek/lavf_avi
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<suffix>] (e.g. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
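
As a rough illustration of the naming scheme, the sketch below generates
clear_blocks_8_c and clear_blocks_10_c from one macro and picks one at
init time. The coefficient types, the 6 * 64 block size, the void *
signature and select_clear_blocks() are assumptions made for this
example and are not taken from dsputil.

```c
#include <stdint.h>
#include <string.h>

/* Assumed signature; the real DSPContext member may differ. */
typedef void (*clear_blocks_fn)(void *blocks);

/* Expands to clear_blocks_<depth>_c, following the
 * <base name>_<bit depth>_<suffix> pattern. */
#define DEFINE_CLEAR_BLOCKS(depth, coef_type)               \
static void clear_blocks_ ## depth ## _c(void *blocks)      \
{                                                           \
    /* 6 blocks of 64 coefficients: an assumed layout */    \
    memset(blocks, 0, sizeof(coef_type) * 6 * 64);          \
}

DEFINE_CLEAR_BLOCKS(8,  int16_t)   /* clear_blocks_8_c  */
DEFINE_CLEAR_BLOCKS(10, int32_t)   /* clear_blocks_10_c */

/* Hypothetical selector: choose the variant for the decoded bit depth.
 * Depths above 8 share one variant here, since in this sketch the
 * function depends only on the coefficient size. */
static clear_blocks_fn select_clear_blocks(int bit_depth)
{
    return bit_depth > 8 ? clear_blocks_10_c : clear_blocks_8_c;
}
```

Sharing one variant across all depths above 8 is the kind of binary size
saving the note above hints at.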
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<suffix>] (e.g. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
No speed improvement, but necessary for some future stuff.
Also opens up the possibility of asm chroma dc idct/dequant.
Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk
Since we no longer have non-transposed scantables, the problem it warns about
no longer exists.
Originally committed as revision 26339 to svn://svn.ffmpeg.org/ffmpeg/trunk
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.
Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
This moves the H264-specific functions from DSPContext to the new
H264DSPContext. The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.
The qpel and chroma MC functions are not moved as these are used by
non-h264 code.
Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk