With particularly complex images, and especially at high bit depths and
with 5-level transforms, the coefficients would overflow, causing huge
artifacts to appear. This was discovered thanks to the fate tests, which
will have to be redone, as this fixes a multitude of problems and
increases PSNR.
There is a slight performance drop associated with this change: the
encoder becomes 1.15 times slower. This is necessary, however, in order
to avoid undefined behavior and overflows.
It would be worth templating the transforms to keep the performance for
8-bit images, as 32-bit coefficients are unnecessary in that case, but
the primary use of the encoder is to encode video at 10 bits.
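As a rough illustration of why bit depth and transform depth matter, the
sketch below bounds the low-pass magnitude for a deliberately simplified,
unnormalised 2D Haar sum (gain of 4 per level). This is an assumption made
for illustration only, not the encoder's actual filters, and the helper
names are hypothetical.

```c
#include <stdint.h>

/* Worst-case low-pass magnitude after `levels` 2D levels of an
 * unnormalised Haar sum: gain 2 horizontally and 2 vertically per level.
 * Simplified model, NOT the real VC-2 lifting filters. */
static int64_t worst_case_lowpass(int bit_depth, int levels)
{
    int64_t v = ((int64_t)1 << bit_depth) - 1; /* largest input sample */
    for (int i = 0; i < levels; i++)
        v *= 4;
    return v;
}

/* Does the value fit in a 16-bit signed coefficient? */
static int fits_in_int16(int64_t v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}

/* Does the value fit in a 32-bit signed coefficient? */
static int fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

Under this model, a 10-bit input with 5 levels exceeds the 16-bit range but
stays well inside the 32-bit range.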
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Commit ca2f19b9cc modified the meaning of
H264SliceContext.gb: it is now initialised at the start of the NAL unit
header, rather than at the start of the slice header. The VAAPI slice
decoder uses the offset after parsing to determine the offset of the
slice data in the bitstream, so with the changed meaning we no longer
need to add the extra byte to account for the NAL unit header because
it is now included directly.
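A minimal sketch of the offset arithmetic described above, assuming the
usual one-byte H.264 NAL unit header. The function names are hypothetical
and do not correspond to the VAAPI hwaccel code.

```c
/* Size of the plain H.264 NAL unit header in bits (one byte). */
#define NAL_HEADER_BITS 8u

/* Old meaning: the bit reader started at the slice header, so the NAL
 * unit header had to be added by hand. */
static unsigned slice_data_bit_offset_old(unsigned bits_read_from_slice_header)
{
    return NAL_HEADER_BITS + bits_read_from_slice_header;
}

/* New meaning: the bit reader starts at the NAL unit header, so its
 * position after parsing already is the slice data offset. */
static unsigned slice_data_bit_offset_new(unsigned bits_read_from_nal_header)
{
    return bits_read_from_nal_header;
}
```

Both helpers give the same offset for the same slice. Only the starting
point of the reader differs.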
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
The only user so far is wmalossless at 24 bits. The few samples tested show
an order of 8, so more unrolling or an AVX2 version does not make sense.
Timings: 68 -> 49 cycles
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '9fa888c02801fff2e8817c24068f5296bbe60000':
intrax8: Keep a reference to the decoder blocks
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* commit 'c2084ffcbfc11d1b6ed3a4a0df9cafd56fbb896f':
intrax8: Use the generic horizband function
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This ports the fix from 033a533 to the new parser module in preparation
for using it in the h264 decoder.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
When only one sublayer is present, no information is coded. Only when at
least two are present are all 8 sublayers written.
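The rule can be sketched as follows. The 2-bit entry layout and the zero
fill for unused slots are assumptions modelled on HEVC-style
profile/tier/level syntax, and the function name is hypothetical.

```c
#include <stdint.h>

#define SUBLAYER_SLOTS 8

/* Fills `out` with the per-sublayer 2-bit entries and returns how many
 * entries are coded: 0 for a single sublayer, otherwise always 8.
 * `flags` must hold at least num_sub_layers - 1 entries. */
static int write_sublayer_entries(const uint8_t *flags, int num_sub_layers,
                                  uint8_t out[SUBLAYER_SLOTS])
{
    int coded = num_sub_layers - 1; /* entries carrying real flags */

    if (num_sub_layers <= 1)
        return 0;                   /* nothing is coded at all */

    for (int i = 0; i < SUBLAYER_SLOTS; i++)
        out[i] = i < coded ? (flags[i] & 3) : 0; /* unused slots are zero */

    return SUBLAYER_SLOTS;
}
```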
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
It is impossible to pass the "aspect" parameter to an encoder from the
ffmpeg CLI because the option from lavc/options_table.h is eclipsed by an
option with the same name in ffmpeg_opt.c, which has a different meaning
(DAR, not SAR).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The slice prefix is 0 in the reference encoder and the decoder ignores it.
Writing 0 there seems like the best temporary solution.
The padding could have contained uninitialized data, but reference VC2
encoders put 0xFF there, hence the memset value.
Overall this allows producing bitstreams with no random data for use by fate.
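A minimal sketch of the padding fill, assuming a slice buffer of which only
the first `used` bytes carry coded data. The helper name and parameters are
hypothetical and not taken from the encoder.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill the unused tail of a slice buffer with 0xFF so that the output
 * never contains uninitialized bytes. */
static void pad_slice(uint8_t *buf, size_t used, size_t total)
{
    assert(used <= total);
    memset(buf + used, 0xFF, total - used);
}
```

With the tail always set to a fixed value, two encodes of the same input
produce identical bytes, which is what a reference test needs.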
Add frames_before and frames_after as hints that there will be frames before
or after the frames produced in this session. This may help with
concatenation issues like bit rate spikes.
Signed-off-by: Rick Kern <kernrj@gmail.com>
Handle AV_PIX_FMT_VIDEOTOOLBOX.
This results in better energy usage and faster encoding, especially on iOS.
When the buffer comes from the media server, no memcpy calls are needed.
Signed-off-by: Rick Kern <kernrj@gmail.com>
Reverts one hunk from 7966ddfc0b.
The new code from 7966ddfc0b only covers extradata-based SPS.
Fixes: ffplay -ss 13 58af5798-fa2c-42a2-997d-dc8e49de2d8a.flv
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'a7829a2a3f8e6ec0b9f2673c11f56916800aeb33':
h264: reimplement 3aa661ec5 in a more explicit way
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>