This fixes a race condition and use of the wrong field, which become shared
instead of per thread during some AVFrame changes.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Sometimes the input buffers get directly used as raw images and
SIMD optimized video/image filters can sometimes read more than 16 bytes
over the end.
a specific example is the AVX 24bpp to yuv code
This also fixes fate-vsynth3-rgb
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Since error resilience uses AVFrame pointers instead of references it
has to copy NULL pointers too. After a codec flush the last/next frame
pointers in MpegEncContext are NULL and the old pointers remaining in
ERContext are invalid. Fixes a crash in vlc for android thumbnailer.
Reported and debugged by Adrien Maglo <magsoft@videolan.org>.
grayscale is coded as 4 pixels at a time, the encoder lacks support
for the case where width%4 != 0, and will simply encode less data
leaving random data after decoding at the right side
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '27860819d508068f056cf48473af04868791ad77':
ppc: Consistently use convenience macro for runtime CPU detection
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '570d4b21863b6254d6bbca9c528bede471bb4478':
x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
Merged-by: Michael Niedermayer <michaelni@gmx.at>
We know that the called function (ff_chroma_inter_body_mmxext)
doesn't touch the redzone, and thus will be kept intact - thus,
this doesn't fix any bug per se.
However, valgrind's memcheck tool intentionally assumes that the
redzone is clobbered on every function call and function return
(see a long comment in valgrind/memcheck/mc_main.c). This avoids
false positives in that tool, at the cost of an extra stack pointer
adjustment.
The other alternative would be a valgrind suppression for this issue,
but that's an extra burden for everybody that wants to run libavcodec
within valgrind.
Signed-off-by: Martin Storsjö <martin@martin.st>
The actual predictor value, set by the trellis code, never
was written back into the variable that was written into
the block header. This was accidentally removed in b304244b.
This significantly improves the audio quality of the trellis
case, which was plain broken since b304244b.
Encoding IMA QT with trellis still actually gives a slightly
worse quality than without trellis, since the trellis encoder
doesn't use the exact same way of rounding as in
adpcm_ima_qt_compress_sample and adpcm_ima_qt_expand_nibble.
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
This very slightly improves compression
Found-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is probably not the simplest solution but as this is needed for a bugfix,
simplification is left for later.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes a regression since fb3e380 similar to ticket #2661,
reported by fluffrabbit at aol dot com.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
There's an SSE2 version already, and technically the SSE version
on x86_64 was wrong (using pshufd and pshuflw, SSE2 instructions).
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This avoids returning a initial frame after seeking which does
not match what would be received when decoding from the begin.
Suggested-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was broken in 095be4fb - samples+ch (for the previous
non-planar case) equals &samples_p[ch][0]. The confusion
probably stemmed from the IMA WAV case where it originally
was &samples[avctx->channels + ch], which was correctly
changed into &samples_p[ch][1].
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
The actual predictor value, set by the trellis code, never
was written back into the variable that was written into
the block header. This was accidentally removed in b304244b.
This significantly improves the audio quality of the trellis
case, which was plain broken since b304244b.
Encoding IMA QT with trellis still actually gives a slightly
worse quality than without trellis, since the trellis encoder
doesn't use the exact same way of rounding as in
adpcm_ima_qt_compress_sample and adpcm_ima_qt_expand_nibble.
Fixes part of Ticket3701
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was broken in 095be4fb - samples+ch (for the previous
non-planar case) equals &samples_p[ch][0]. The confusion
probably stemmed from the IMA WAV case where it originally
was &samples[avctx->channels + ch], which was correctly
changed into &samples_p[ch][1].
Fixes part of Ticket3701
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is done by padding the coefficient buffer with 0s, because the order
may be only a multiple of 4, and the DSP function requires batches of 8.
However, no sample with such a case was found, so request one if it uses
that kind of order.
Approximate relative speedup depending on instruction set:
plain C: -6%
mmxext: 51%
sse2: 54%
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2f7065190ad48744014a02288799d03adfa613e0':
libfdk-aac: Relicense the library wrappers to the ISC license
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This reduces the number of different licenses used within libav,
and is preferrable since it has less ambiguous wordings than
the BSD license with respect to the duties of the user of the code.
Fraunhofer have now indicated that they're allowed to contribute
code under this license as well.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes the mp3 decoder produce the same result when repeatly flushing and decoding
Suggested-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
An invalid entry already has the property of having a negative number
of bits, so remove the check on the reserved value, and rearrange the
code as a consequence.
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Overall time: 7.878s
319076 decicycles in 422, 262096 runs, 48 skips
159875 decicycles in gray, 262057 runs, 87 skips
Overall time: 7.394s
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It's no longer used inside another specific macro, so rename it.
Also remove duplicated definition and realign code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These where intended to maintain the previous behavior before dca_dmix_code()
but it is unclear (to me) which way is correct and no sample seem to trigger
the case, also they are incomplete for the purprose of error checking
Found-by: Niels Möller <nisse@lysator.liu.se>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
When the joint table does not contain a valid entry, the decoding restarts
from scratch. By implementing the trick of jumping to the 2nd level of the
individual table (and inlining the whole), a speed improvement of 5-10%
is possible.
On a 1000-frames YUV4:2:0 video, before:
362851 decicycles in 422, 262094 runs, 50 skips
182488 decicycles in gray, 262087 runs, 57 skips
Object size: 23584
Overall time: 8.377
After:
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Object size: 23188
Overall time: 7.878
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It was instead using the highest available asm functions, which completely
kills the point of being a reference C context.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>