mirror of
https://github.com/jellyfin/jellyfin-ffmpeg.git
synced 2024-10-07 11:23:26 +00:00
avfilter/tonemapx: add simd optimized tonemapx
This includes NEON for ARMv8, SSE for x86-64-v2 and AVX+FMA for x86-64-v3.

Test result with 4K HEVC 10bit HLG input, encoding with libx264 veryfast using bt2390:

Intel Core i9-12900:
tonemapx.c: 57fps
tonemapx.sse: 74fps
tonemapx.avx: 77fps

Apple M1 Max:
tonemapx.c: 43fps
tonemapx.neon: 57fps

For comparison, original zscale+tonemap simd results:

Intel Core i9-12900:
tonemap.avx: 40fps
tonemap.sse: 40fps
tonemap.c: 32fps

Apple M1 Max:
tonemap.neon: 44fps
tonemap.c: 35fps

The original implementation is so memory heavy that dual-channel desktop CPUs easily become memory bound, due to the intermediate RGBF32 framebuffer shared with zscale. Tonemapx lowers the bandwidth requirement, which brings a significant performance gain to bandwidth-limited platforms. Even on the bandwidth-rich M1 Max it still provides a significant performance boost due to a better cache hit rate.
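
To put the numbers above in perspective, here is a rough sketch that computes the relative speedups from the reported frame rates and estimates how large an intermediate RGBF32 frame is. The frame rates are copied from the commit message. Everything else is an assumption made for illustration: the 3840x2160 resolution (the message only says "4K"), the count of one write plus one read of the intermediate buffer per frame, and the helper names, none of which come from the patch.

```python
# Frame rates (fps) copied from the commit message.
FPS = {
    "i9-12900": {
        "tonemapx.c": 57, "tonemapx.sse": 74, "tonemapx.avx": 77,
        "tonemap.c": 32, "tonemap.sse": 40, "tonemap.avx": 40,
    },
    "M1 Max": {
        "tonemapx.c": 43, "tonemapx.neon": 57,
        "tonemap.c": 35, "tonemap.neon": 44,
    },
}


def speedup(cpu, new, old):
    """Ratio of two reported frame rates on the same CPU."""
    return FPS[cpu][new] / FPS[cpu][old]


def rgbf32_frame_bytes(width=3840, height=2160):
    """Size of one planar RGB float32 frame: 3 channels, 4 bytes per sample.

    The 3840x2160 default is an assumption; the commit only says "4K".
    """
    return width * height * 3 * 4


def intermediate_traffic_gb_per_s(fps, passes=2):
    """Memory traffic caused by the intermediate buffer alone, in GB/s.

    passes=2 assumes a single write and a single read per frame. This is an
    illustrative lower bound, not a measurement of the real filter chain.
    """
    return rgbf32_frame_bytes() * passes * fps / 1e9


if __name__ == "__main__":
    for cpu, new, old in [
        ("i9-12900", "tonemapx.avx", "tonemap.avx"),
        ("i9-12900", "tonemapx.sse", "tonemap.sse"),
        ("M1 Max", "tonemapx.neon", "tonemap.neon"),
    ]:
        print(f"{cpu}: {new} vs {old}: {speedup(cpu, new, old):.2f}x")
    print(f"RGBF32 frame: {rgbf32_frame_bytes() / 1e6:.1f} MB")
    print(f"Traffic at 40 fps: {intermediate_traffic_gb_per_s(40):.1f} GB/s")
```

Under these assumptions a single intermediate frame is about 99.5 MB, far larger than typical CPU caches, and at 40 fps the buffer alone accounts for roughly 8 GB/s of traffic. That is in line with the commit's explanation that removing the intermediate buffer helps both bandwidth-limited desktops and cache behaviour on the M1 Max.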
parent
443d842d1e
commit
d37d7386e6
2673
debian/patches/0080-test-perf-tonemapx-filter.patch
vendored
Normal file
File diff suppressed because it is too large