The return value provides no useful information and removing the printing
avoids the following warning:
libavcodec/h264_refs.c:788:15: warning: 'i' may be used uninitialized in this function [-Wuninitialized]
The function requires increasing the fuzz factor for the ac3/eac3 encode
tests and even so makes fate fail. It only provides a slight encoding
speedup for legacy CPUs that do not support SS2. Thus its benefit is not
worth the trouble it creates and fixing it would be a waste of time.
Intra codecs do not need an update_thread_context() function and never
call ff_thread_finish_setup(). They rely on ff_thread_get_buffer()
calling it. So call it even if the get_buffer2 function pointer is
avcodec_default_get_buffer2 and it has not been called before.
Based on the 2007 GSoC project from Kamil Nowosad <k.nowosad@students.mimuw.edu.pl>
Updated to current programming standards, style and many more small
fixes by Diego Biurrun <diego@biurrun.de>.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The code represents a considerable maintenance burden and it is not
clear that it gives a noticeable benefit to outweigh this after 10
years of improvements in compiler technology since its creation.
Previously, if dct_bits was set to 32, we used separate 32-bit
versions of these functions. Since dct_bits now is removed,
remove the unused 32-bit versions of the functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
The data offsets are relative to the bistream header, which is 16 bytes
after the start of the data.
Fixes invalid reads with corrupted files.
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
Also add an additional sanity check to the alt_quant table.
Fixes invalid reads with corrupted files.
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
They can be different if the last keyframe failed to decode correctly.
Fixes possible invalid reads in such a case.
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.
Also remove the unused type == 0 cases from the plain C version
of the idct.
Signed-off-by: Martin Storsjö <martin@martin.st>
While this change isn't bitexact, the IDCTs weren't bitexact to
start with either.
This simplifies decoupling vp3 from dsputil.
Signed-off-by: Martin Storsjö <martin@martin.st>
It is only used for error resilience. This allows building the
h264 decoder without dsputil, if error resilience is disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
Put a copy of the 8bit functions only in dsputil, where they are
used for some other things (e.g. mpeg4qpel, mspel, cavsqpel).
Signed-off-by: Martin Storsjö <martin@martin.st>
The pointers that get assigned ff_cropTbl were made const in
9e0f14f1, but other variables that transitively are assigned
based on these variables were missed.
Signed-off-by: Martin Storsjö <martin@martin.st>
These are widely used throughout libavcodec, nothing dsputil-specific.
Change ff_cropTbl to a statically initialized table, to avoid
initializing it with a function call.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes the vp3 decoder less dependent on dsputil, and will aid
in making it (eventually) dsputil-independent.
Signed-off-by: Martin Storsjö <martin@martin.st>
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.
Signed-off-by: Martin Storsjö <martin@martin.st>
In the non-bitexact mode, vp3 currently decodes to the same
frame crcs as before 28f9ab702 (and the output visually looks
correct).
Signed-off-by: Martin Storsjö <martin@martin.st>
Timing on Arrandale:
C SSE
Win32: 57 44
Win64: 47 38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
It can be 0 or -1 for invalid files, which may result in invalid memory
access.
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
This can happen when the number of skipped lines is not consistent with
the number of coded lines.
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
This fixes standalone compilation of the msmpeg4v2, msmpeg4v3
and wmv2 encoders, that previously failed to link due to the
decoder codepaths requiring error_resilience.
Signed-off-by: Martin Storsjö <martin@martin.st>
Allows use of AVHWAccel based decoders with frame based multithreading.
The decoders will be forced into an non-concurrent mode by delaying
ff_thread_finish_setup() calls after decoding of the current frame
is finished.
This wastes memory by unnecessarily using multiple threads and thus
copies of the decoder context but allows seamless switching between
hardware accelerated and frame threaded software decoding when the
hardware decoder does not support the stream.
Since c977039e58 plane count for
PIX_FMT_HWACCEL pixel formats is 0 instead of 1. The created dummy
AVBuffers are still bogus since AVFrame does not hold frame data when
AVHWAccels are used.
Error resilience is enabled by the h264 decoder, unless explicitly
disabled. --disable-everything --enable-decoder=h264 will produce
a h264 decoder with error resilience enabled, while
--disable-everything --enable-decoder=h264 --disable-error-resilience
will produce a h264 decoder with error resilience disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
Also move the declaration to internal.h, and add restrict qualifiers
to the declaration (as in the implementation).
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows dropping the mpegvideo dependency from a number of
components.
This also fixes standalone building of the h264 parser, which
was broken in 64e438697.
Signed-off-by: Martin Storsjö <martin@martin.st>
AVCodecContext.bits_per_raw_sample is updated from the previous thread
in the generic update function before the codec specific update_thread
function is called. The check for reinitialization of dsp functions uses
bits_per_raw_sample. When called from update_thread_context it will be
already at the current value and the dsp functions aren't updated if
only the bit depth changes.
This ensures the hwaccel privdata does not leak when a frame buffer could
not be allocated (and toggle the assert when the frame is re-used).
Having no frame buffer available is quite common when using the DXVA2
hwaccel in situations where the DXVA2 renderer is being re-allocated, for
example when moving between displays.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This ensures the hwaccel privdata does not leak when a frame buffer could
not be allocated (and toggle the assert when the frame is re-used).
Having no frame buffer available is quite common when using the DXVA2
hwaccel in situations where the DXVA2 renderer is being re-allocated, for
example when moving between displays.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Number of planes is not always equal to the number of components even
for formats marked with PIX_FMT_PLANAR -- e.g. NV12 has three components
in two planes.
The total frame size is a combination of the 12 bits in the sequence
header and 2 more bits in the the sequence extension. While the
specification explicitly forbids the dimensions from the sequence header
from being 0 (thus ruling out multiples of 4096), such videos
apparrently exist in the wild so we should attempt to decode them.
Based on a patch by Michael Niedermayer <michaelni@gmx.at>
Fixes Bug 416.
When `off' is 0, `0x537F6103 << 32' in the following expression invokes
undefined behavior, the result of which is not necessarily 0.
(0x537F6103 >> (off * 8)) | (0x537F6103 << (32 - (off * 8)))
Avoid oversized shifting.
CC: libav-stable@libav.org
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This will help in supporting old versions, e.g. version 3.93 uses the same
range coder but different predictor and version 3.82 uses different range
coder and predictor. Also this should not make decoding newer versions slower
by introducing additional checks on versions.
AVCodecContext release_buffer() shall be NULL for audio codecs using
get_buffer. The backward compatibility code hence have to check before
calling it.
Returning 0 may result in an infinite loop in valid calling programs. A
decoder should never return 0 without producing any output.
CC:libav-stable@libav.org
When there is just 1 byte remanining in the buffer, nothing will be read
and the loop will continue forever. Check that there are at least 8
bytes, which are always read at the beginning.
CC:libav-stable@libav.org
The loop a few lines below the xan_unpack() call accesses up to
dec_size * 2 bytes into y_buffer, so dec_size must be limited to
buffer_size / 2.
CC:libav-stable@libav.org
This is the most that can be represented with the current channel layout
system. This limit should be raised/removed when a better system is
implemented.
This misfeature is most likely completely useless and conflicts with
removing the mpegvideo-specific fields from AVFrame. In the improbable
case it is actually useful, it should be reimplemented in a better way.
When parsing the Xing/Info tag, don't set the bit rate if it's an Info tag.
When parsing the stream, don't override the bit rate if it's already set,
otherwise calculate the mean bit rate from parsed frames. This way, the bit
rate will be set correctly both for CBR and VBR streams.
CC:libav-stable@libav.org
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This makes the decoder independent of mpegvideo.
This copy of the draw_horiz_band code is simplified compared to
the "generic" mpegvideo one which still has a number of special
cases for different codecs.
Signed-off-by: Martin Storsjö <martin@martin.st>
Make av_get_codec_tag_string() show codec tag string characters in a more
intelligible ways. For example the ascii char "@" is used as a number, so
should be displayed like "[64]" rather than as a printable character.
Apart alphanumeric chars, only the characters ' ' and '.' are used
literally in codec tags, all the other characters represent numbers.
This also avoids relying on locale-dependent character class functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Martin Storsjö <martin@martin.st>
get_uint returns an unsigned value, use an unsigned to store
blocksize to make sure the comparison logic is correct and report
correctly the error for the channel count not supported.
Prevent the loop shorten_decode_close from writing and freeing out of
the array boundary.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The decoder assumes a single bit depth for all the planes
while the specification allows different bit depths for luma
and chroma.
Avoid the possible problems described in CVE-2013-2277
CC: libav-stable@libav.org
This reverts commit f90ff772e7.
The code should be put back in h264_qpel_8bit.asm, but unfortunately
it is unconditionally used from dsputil_mmx.c since 71155d7.
The external assembly function uses mmxext instructions and should not be
masqueraded as an mmx-only function. Instead, use the mmx-only inline
assembly function.
The specification does not prevent an encoder to write the amplitude 0
as 0 amplitude_bits.
Our get_bits() implementation might not support a zero sized read
properly, thus the additional branch.
The value is used to calculate output LSP curve and a division by zero
and out of array accesses would occur.
CVE-2013-0894
CC: libav-stable@libav.org
Reported-by: Dale Curtis <dalecurtis@chromium.org>
Found-by: inferno@chromium.org
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The init functions marked as av_cold have to be executed in any case,
so there is no gain from trying to mark paths leading to such functions
as unlikely.
This also allows libx264 to modify its i_qfactor value
when using the "-tune" setting. Previously it had a static
value of 1.25
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This fixes crashes in chromium on win64 on machines with AVX
(crashes that apparently aren't triggered by fate).
Signed-off-by: Martin Storsjö <martin@martin.st>
This gets rid of a number of warnings about casts discarding
qualifiers from the pointer target, present since 7ebfb466a.
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows more transparent mixing of get_bits and whole-byte access
without having to touch get_bits internals.
Signed-off-by: Martin Storsjö <martin@martin.st>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-alpha and alpha-Y planes are cleared in the idct_put/add()
calls. For the alpha U/V planes, we only care about the DC for entropy
context prediction purposes, the rest of the data is unused.
Signed-off-by: Martin Storsjö <martin@martin.st>
This was caused by unconditionally referencing a conditionally compiled
table. Now the code is also compiled conditionally.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The library might provide an encoder in the future, so it's better to
check for the presence of the decoder rather than just the library.
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>