*******************************************************************************
** Background
*******************************************************************************

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON) to accelerate baseline JPEG compression and decompression on x86, x86-64,
and ARM systems. On such systems, libjpeg-turbo is generally 2-4x as fast as
libjpeg, all else being equal. On other types of systems, libjpeg-turbo can
still outperform libjpeg by a significant amount, by virtue of its
highly-optimized Huffman coding routines. In many cases, the performance of
libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API. libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru. The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


*******************************************************************************
** License
*******************************************************************************

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to LICENSE.txt for a roll-up of license terms.


*******************************************************************************
** Using libjpeg-turbo
*******************************************************************************

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

  TurboJPEG API: This API provides an easy-to-use interface for compressing
  and decompressing JPEG images in memory. It also provides some functionality
  that would not be straightforward to achieve using the underlying libjpeg
  API, such as generating planar YUV images and performing multiple
  simultaneous lossless transforms on an image. The Java interface for
  libjpeg-turbo is written on top of the TurboJPEG API.

  libjpeg API: This is the de facto industry-standard API for compressing and
  decompressing JPEG images. It is more difficult to use than the TurboJPEG
  API but also more powerful. The libjpeg API implementation in libjpeg-turbo
  is both API/ABI-compatible and mathematically compatible with libjpeg v6b.
  It can also optionally be configured to be API/ABI-compatible with libjpeg v7
  and v8 (see below).

There is no significant performance advantage to either API when both are used
to perform similar operations.

=====================
Colorspace Extensions
=====================

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering. This is implemented with ten new
colorspace constants:

  JCS_EXT_RGB   /* red/green/blue */
  JCS_EXT_RGBX  /* red/green/blue/x */
  JCS_EXT_BGR   /* blue/green/red */
  JCS_EXT_BGRX  /* blue/green/red/x */
  JCS_EXT_XBGR  /* x/blue/green/red */
  JCS_EXT_XRGB  /* x/red/green/blue */
  JCS_EXT_RGBA  /* red/green/blue/alpha */
  JCS_EXT_BGRA  /* blue/green/red/alpha */
  JCS_EXT_ABGR  /* alpha/blue/green/red */
  JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting cinfo.in_color_space (compression) or cinfo.out_color_space
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.

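To make "the appropriate position in the pixel" concrete, the sketch below
tabulates the byte layout implied by the comments in the constant list. It is
an illustration only and does not call libjpeg-turbo: the pixel_layout type,
the layouts table, find_layout(), and red_at() are hypothetical names, and the
offsets are read off the list above rather than taken from the library
headers. The alpha variants (RGBA, BGRA, ABGR, ARGB) use the same offsets as
the corresponding X variants.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper table: byte layout of one pixel for each ordering
   named in this section.  Offsets follow the comments in the constant list
   (e.g. "x/blue/green/red" puts red in byte 3). */
typedef struct {
  const char *name;   /* constant name, used for lookup only */
  int pixel_size;     /* bytes per pixel */
  int r, g, b;        /* byte offsets of red, green, and blue */
} pixel_layout;

static const pixel_layout layouts[] = {
  { "JCS_EXT_RGB",  3, 0, 1, 2 },
  { "JCS_EXT_RGBX", 4, 0, 1, 2 },
  { "JCS_EXT_BGR",  3, 2, 1, 0 },
  { "JCS_EXT_BGRX", 4, 2, 1, 0 },
  { "JCS_EXT_XBGR", 4, 3, 2, 1 },
  { "JCS_EXT_XRGB", 4, 1, 2, 3 },
};

/* Look up a layout by constant name; returns NULL if the name is unknown. */
static const pixel_layout *find_layout(const char *name)
{
  size_t i;
  for (i = 0; i < sizeof(layouts) / sizeof(layouts[0]); i++)
    if (strcmp(layouts[i].name, name) == 0)
      return &layouts[i];
  return NULL;
}

/* Return the red sample of pixel number x in a row stored with layout l. */
static unsigned char red_at(const unsigned char *row, const pixel_layout *l,
                            size_t x)
{
  return row[x * (size_t)l->pixel_size + (size_t)l->r];
}
```

For example, in a row of XBGR pixels, red_at(row, find_layout("JCS_EXT_XBGR"),
1) reads byte 7 of the row, which is the fourth byte of the second pixel.
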
Your application can check for the existence of these extensions at compile
time with:

  #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes. If an application expects the X
byte to be used as an alpha channel, then it should specify JCS_EXT_RGBA,
JCS_EXT_BGRA, JCS_EXT_ABGR, or JCS_EXT_ARGB. When these colorspace constants
are used, the X byte is guaranteed to be 0xFF, which is interpreted as opaque.

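If an application has to decompress with one of the X orderings (for instance,
because the alpha channel extensions are not available in the library it was
built against) but still needs opaque pixels, it can fill in the X byte itself
after decompression. The following is a minimal sketch of that fallback, not
something libjpeg-turbo provides; force_opaque() is a hypothetical name, and
the caller is assumed to know where the X byte sits in each 4-byte pixel.

```c
#include <stddef.h>

/* Hypothetical fallback: make a row of 4-byte pixels opaque by hand.
   x_offset is 3 for RGBX/BGRX rows and 0 for XBGR/XRGB rows. */
static void force_opaque(unsigned char *row, size_t width, size_t x_offset)
{
  size_t i;
  for (i = 0; i < width; i++)
    row[i * 4 + x_offset] = 0xFF;   /* 0xFF is interpreted as opaque */
}
```

This costs an extra pass over every row, which is why requesting one of the
alpha channel colorspaces is preferable whenever it is available.
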
Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

  #ifdef JCS_ALPHA_EXTENSIONS

jcstest.c, located in the libjpeg-turbo source tree, demonstrates how to check
for the existence of the colorspace extensions at compile time and run time.

===================================
libjpeg v7 and v8 API/ABI Emulation
===================================

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures. Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases. Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base. Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo. It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling. libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of --with-jpeg7 or --with-jpeg8 to configure, or an
argument of -DWITH_JPEG7=1 or -DWITH_JPEG8=1 to cmake, you can build a version
of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that programs
that are built against libjpeg v7 or v8 can be run with libjpeg-turbo. The
following section describes which libjpeg v7+ features are supported and which
aren't.

Support for libjpeg v7 and v8 Features:
---------------------------------------

Fully supported:

-- libjpeg: IDCT scaling extensions in decompressor
   libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
   1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
   and 1/2 are SIMD-accelerated).

-- libjpeg: arithmetic coding

-- libjpeg: In-memory source and destination managers
   See notes below.

-- cjpeg: Separate quality settings for luminance and chrominance
   Note that the libjpeg v7+ API was extended to accommodate this feature only
   for convenience purposes. It has always been possible to implement this
   feature with libjpeg v6b (see rdswitch.c for an example).

-- cjpeg: 32-bit BMP support

-- cjpeg: -rgb option

-- jpegtran: lossless cropping

-- jpegtran: -perfect option

-- jpegtran: forcing width/height when performing lossless crop

-- rdjpgcom: -raw option

-- rdjpgcom: locale awareness


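The IDCT scaling factors listed under "Fully supported" are all of the form
N/8, with N ranging from 1 to 16 (N = 8 being the unscaled case). The sketch
below shows how an application might predict the size of a scaled image before
allocating its output buffer. It is an approximation written for this
document, not library code: scaled_dimension() and div_round_up() are
hypothetical names, and the sketch assumes that a fractional result is rounded
up to the next whole pixel.

```c
/* Ceiling division, standing in for the rounding-up assumed above. */
static unsigned long div_round_up(unsigned long a, unsigned long b)
{
  return (a + b - 1) / b;
}

/* Scaled size of one image dimension for a scaling factor of scale_num/8.
   Returns 0 if scale_num is outside the 1..16 range listed above. */
static unsigned long scaled_dimension(unsigned long size,
                                      unsigned int scale_num)
{
  if (scale_num < 1 || scale_num > 16)
    return 0;
  return div_round_up(size * scale_num, 8);
}
```

With these assumptions, a 1920-pixel-wide image decompressed at 1/2 (N = 4) is
960 pixels wide, and a 100-pixel-wide image decompressed at 3/8 is 38 pixels
wide (37.5 rounded up).
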
Not supported:

NOTE: As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement. The reader is invited to peruse the research at
http://www.libjpeg-turbo.org/About/SmartScale and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

-- libjpeg: DCT scaling in compressor
   cinfo.scale_num and cinfo.scale_denom are silently ignored.
   There is no technical reason why DCT scaling could not be supported when
   emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
   below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
   8/9 would be available, which is of limited usefulness.

-- libjpeg: SmartScale
   cinfo.block_size is silently ignored.
   SmartScale is an extension to the JPEG format that allows for DCT block
   sizes other than 8x8. Providing support for this new format would be
   feasible (particularly without full acceleration). However, until/unless
   the format becomes either an official industry standard or, at minimum, an
   accepted solution in the community, we are hesitant to implement it, as
   there is no sense of whether or how it might change in the future. It is
   our belief that SmartScale has not demonstrated sufficient usefulness either
   as a lossless format or as a means of quality enhancement, and thus, our
   primary interest in providing this feature would be as a means of supporting
   additional DCT scaling factors.

-- libjpeg: Fancy downsampling in compressor
   cinfo.do_fancy_downsampling is silently ignored.
   This requires the DCT scaling feature, which is not supported.

-- jpegtran: Scaling
   This requires both the DCT scaling and SmartScale features, which are not
   supported.

-- Lossless RGB JPEG files
   This requires the SmartScale feature, which is not supported.

What About libjpeg v9?
----------------------

libjpeg v9 introduced yet another field to the JPEG compression structure
(color_transform), thus making the ABI backward incompatible with that of
libjpeg v8. This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding. Further, there was actually no reason to extend
the API in this manner, as the color transform could have just as easily been
activated by way of a new JPEG colorspace constant, thus preserving backward
ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats. Thus, at this time, it is our belief that
there is not sufficient technical justification for software to upgrade from
libjpeg v8 to libjpeg v9, and therefore, not sufficient technical justification
for us to emulate the libjpeg v9 ABI.

=====================================
In-Memory Source/Destination Managers
=====================================

By default, libjpeg-turbo 1.3 and later includes the jpeg_mem_src() and
jpeg_mem_dest() functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well. This allows the use of those functions by
programs that need them without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of --without-mem-srcdst to configure or
an argument of -DWITH_MEM_SRCDST=0 to CMake prior to building libjpeg-turbo.
This will restore the pre-1.3 behavior, in which jpeg_mem_src() and
jpeg_mem_dest() are only included when emulating the libjpeg v8 API/ABI.

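Because the in-memory managers may or may not be present in a given build, an
application may want to guard its use of them at compile time. The following
is a sketch under stated assumptions rather than a documented recipe:
JPEG_LIB_VERSION is the version macro supplied by the libjpeg headers, while
MEM_SRCDST_SUPPORTED is assumed here to be the macro that the libjpeg-turbo
headers define when the managers are included (check the jconfig.h installed
on your system). HAVE_MEM_SRCDST and have_mem_srcdst() are hypothetical names.

```c
/* In a real program, JPEG_LIB_VERSION comes from <jpeglib.h>.  The fallback
   below is a stand-in so that this sketch compiles on its own; 62 corresponds
   to libjpeg v6b API/ABI emulation. */
#ifndef JPEG_LIB_VERSION
#define JPEG_LIB_VERSION 62
#endif

/* Assumption: MEM_SRCDST_SUPPORTED is defined by the libjpeg-turbo headers
   when the in-memory managers are included in a v6b or v7 build. */
#if JPEG_LIB_VERSION >= 80 || defined(MEM_SRCDST_SUPPORTED)
#define HAVE_MEM_SRCDST 1
#else
#define HAVE_MEM_SRCDST 0
#endif

/* Returns nonzero if jpeg_mem_src()/jpeg_mem_dest() were declared at
   compile time and may therefore be called. */
static int have_mem_srcdst(void)
{
  return HAVE_MEM_SRCDST;
}
```

Code that calls jpeg_mem_src() or jpeg_mem_dest() would then be wrapped in
#if HAVE_MEM_SRCDST, with a file-based or custom source/destination manager as
the fallback.
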
On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.0.0 to 62.1.0 if using libjpeg v6b API/ABI
emulation and from 7.0.0 to 7.1.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used. Thus, if a program
is built against libjpeg-turbo 1.3+ and uses jpeg_mem_src() or jpeg_mem_dest(),
that program will not fail if run against an older version of libjpeg-turbo or
against libjpeg v7 or earlier until the program actually tries to call
jpeg_mem_src() or jpeg_mem_dest(). Such is not the case on Windows. If a
program is built against the libjpeg-turbo 1.3+ DLL and uses jpeg_mem_src() or
jpeg_mem_dest(), then it must use the libjpeg-turbo 1.3+ DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions. See their respective man pages for more
details.


*******************************************************************************
** Mathematical Compatibility
*******************************************************************************

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b. The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
   slightly more accurate than the implementation in libjpeg v6b, but not by
   any amount perceptible to human vision (generally in the range of 0.01 to
   0.08 dB gain in PSNR).
-- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
   (and slightly faster) floating point IDCT algorithm introduced in libjpeg
   v8a as opposed to the algorithm used in libjpeg v6b. It should be noted,
   however, that this algorithm basically brings the accuracy of the floating
   point IDCT in line with the accuracy of the slow integer IDCT. The floating
   point DCT/IDCT algorithms are mainly a legacy feature, and they do not
   produce significantly more accuracy than the slow integer algorithms (to put
   numbers on this, the typical difference in PSNR between the two algorithms
   is less than 0.10 dB, whereas changing the quality level by 1 in the upper
   range of the quality scale is typically more like a 1.0 dB difference).
-- If the floating point algorithms in libjpeg-turbo are not implemented using
   SIMD instructions on a particular platform, then the accuracy of the
   floating point DCT/IDCT can depend on the compiler settings.

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

-- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
   implements those scaling algorithms differently than libjpeg v6b does, and
   libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

-- When using chrominance subsampling, because libjpeg v8 implements this
   with its DCT/IDCT scaling algorithms rather than with a separate
   downsampling/upsampling algorithm. In our testing, the subsampled/upsampled
   output of libjpeg v8 is less accurate than that of libjpeg v6b for this
   reason.

-- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
   "non-smooth") chrominance upsampling, because libjpeg v8 does not support
   merged upsampling with scaling factors > 1.


*******************************************************************************
** Performance Pitfalls
*******************************************************************************

===============
Restart Markers
===============

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that makes the rest of the libjpeg infrastructure happy, so it is
necessary to use the slow Huffman decoder when decompressing a JPEG image that
has restart markers. This can cause the decompression performance to drop by
as much as 20%, but the performance will still be much greater than that of
libjpeg. Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.

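An application that wants to know in advance whether an image is likely to
take the slow Huffman path can look in the JPEG header for a DRI (Define
Restart Interval) marker segment with a nonzero interval. The following is a
minimal sketch of such a check and is not part of libjpeg-turbo:
uses_restart_markers() is a hypothetical name, only the marker segments that
precede the first scan are examined, and the sketch assumes that a nonzero
restart interval is what causes the slow decoder to be selected.

```c
#include <stddef.h>

/* Returns 1 if the JPEG stream declares a nonzero restart interval before its
   first scan, 0 if it does not, and -1 if the header cannot be parsed. */
static int uses_restart_markers(const unsigned char *buf, size_t len)
{
  size_t pos = 2;

  if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)   /* SOI must come first */
    return -1;

  while (pos + 4 <= len) {
    unsigned char marker;
    size_t seg_len;

    if (buf[pos] != 0xFF)
      return -1;
    marker = buf[pos + 1];
    if (marker == 0xFF) {         /* fill byte before a marker; skip it */
      pos++;
      continue;
    }
    /* Segment length is big-endian and includes its own two bytes. */
    seg_len = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
    if (seg_len < 2 || pos + 2 + seg_len > len)
      return -1;
    if (marker == 0xDD) {         /* DRI: 2-byte length, 2-byte interval */
      if (seg_len != 4)
        return -1;
      return (buf[pos + 4] | buf[pos + 5]) != 0;
    }
    if (marker == 0xDA)           /* SOS: entropy-coded data follows */
      return 0;
    pos += 2 + seg_len;
  }
  return -1;
}
```

The function only needs the first few hundred bytes of most files, so it can
be run on a partially read buffer, provided that the buffer extends through
the start of the first scan.
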
===============================================
Fast Integer Forward DCT at High Quality Levels
===============================================

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100. Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases. This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
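
The advice above can be folded into a small helper that an application calls
before setting up compression. This is a sketch only: the dct_method enum
mirrors the names of the DCT method constants in jpeglib.h but is redeclared
here as a stand-in so that the example compiles without the library headers,
and choose_forward_dct() is a hypothetical function, not a libjpeg-turbo API.

```c
/* Stand-in for the DCT method constants declared in jpeglib.h, repeated here
   only so that the sketch compiles without the library headers. */
typedef enum { JDCT_ISLOW, JDCT_IFAST, JDCT_FLOAT } dct_method;

/* Hypothetical helper: pick the forward DCT for a given JPEG quality (1-100),
   honoring the caller's preference except where the fast integer DCT would
   force the non-SIMD quantization function. */
static dct_method choose_forward_dct(int quality, dct_method preferred)
{
  if (preferred == JDCT_IFAST && quality >= 98)
    return JDCT_ISLOW;
  return preferred;
}
```

In a real program, the returned value would be assigned to the compression
structure's DCT method field after the quality has been chosen.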