mirror of
https://gitee.com/openharmony/third_party_mesa3d
synced 2024-11-27 09:31:03 +00:00
b8030ab1ea
We also update and improve the docs in isl.h which get pulled into this new chapter. Acked-by: Luis Strano <luis.strano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
90 lines
4.5 KiB
ReStructuredText
90 lines
4.5 KiB
ReStructuredText
Auxiliary surface compression
|
|
=============================
|
|
|
|
Most lossless image compression on Intel hardware, be that CCS, MCS, or HiZ,
|
|
works by way of some chunk of auxiliary data (often a surface) which is used
|
|
together with the main surface to provide compression. Even though this means
|
|
more memory is allocated, the scheme allows us to reduce our over-all memory
|
|
bandwidth since the auxiliary data is much smaller than the main surface.
|
|
|
|
The simplest example of this is single-sample fast clears
|
|
(:cpp:enumerator:`isl_aux_usage::ISL_AUX_USAGE_CCS_D`) on Ivy Bridge through
|
|
Broadwell and later. For this scheme, the auxiliary surface stores a single
|
|
bit for each cache-line-pair in the main surface. If that bit is set, then the
|
|
entire cache line pair contains only the clear color as provided in the
|
|
``RENDER_SURFACE_STATE`` for the image. If the bit is unset, then it's not
|
|
clear and you should look at the main surface. Since a cache line is 64B, this
|
|
yields a scale-down factor of 1:1024.
|
|
|
|
Even the simple fast-clear scheme saves us bandwidth in two places. The first
|
|
is when we go to clear the surface. If we're doing a full-surface clear or
|
|
clearing to the same color that was used to clear before, we don't have to
|
|
touch the main surface at all. All we have to do is record the clear color and
|
|
smash the aux data to ``0xff``. The hardware then knows to ignore whatever is
|
|
in the main surface and look at the clear color instead. The second is when we
|
|
go to render. Say we're doing some color blending. Instead of the blend unit
|
|
having to read back actual surface contents to blend with, it looks at the
|
|
clear bit and blends with the clear color recorded with the surface state
|
|
instead. Depending on the geometry and cache utilization, this can save as
|
|
much as one whole read of the surface worth of bandwidth.
|
|
|
|
The difficulty with a scheme like this comes when we want to do something else
|
|
with that surface. What happens if the sampler doesn't support this fast-clear
|
|
scheme (it doesn't on IVB)? In that case, we have to do a *resolve* where we
|
|
run a special pipeline that reads the auxiliary data and applies it to the main
|
|
surface. In the case of fast clears, this means that, for every 1 bit in the
|
|
auxiliary surface, the corresponding pair of cache lines in the main surface
|
|
gets filled with the clear color. At the end of the resolve operation, the
|
|
main surface contents are the actual contents of the surface.
|
|
|
|
Types of surface compression
|
|
----------------------------
|
|
|
|
Intel hardware has several different compression schemes that all work along
|
|
similar lines:
|
|
|
|
.. doxygenenum:: isl_aux_usage
|
|
.. doxygenfunction:: isl_aux_usage_has_fast_clears
|
|
.. doxygenfunction:: isl_aux_usage_has_compression
|
|
.. doxygenfunction:: isl_aux_usage_has_hiz
|
|
.. doxygenfunction:: isl_aux_usage_has_mcs
|
|
.. doxygenfunction:: isl_aux_usage_has_ccs
|
|
|
|
Creating auxiliary surfaces
|
|
---------------------------
|
|
|
|
Each type of data compression requires some type of auxiliary data on the side.
|
|
For most, this involves a second auxiliary surface. ISL provides helpers for
|
|
creating each of these types of surfaces:
|
|
|
|
.. doxygenfunction:: isl_surf_get_hiz_surf
|
|
.. doxygenfunction:: isl_surf_get_mcs_surf
|
|
.. doxygenfunction:: isl_surf_supports_ccs
|
|
.. doxygenfunction:: isl_surf_get_ccs_surf
|
|
|
|
Compression state tracking
|
|
--------------------------
|
|
|
|
All of the Intel auxiliary surface compression schemes share a common concept
|
|
of a main surface which may or may not contain correct up-to-date data and some
|
|
auxiliary data which says how to interpret it. The main surface is divided
|
|
into blocks of some fixed size and some smaller block in the auxiliary data
|
|
controls how that main surface block is to be interpreted. We then have to do
|
|
resolves depending on the different HW units which need to interact with a
|
|
given surface.
|
|
|
|
To help drivers keep track of what all is going on and when resolves need to be
|
|
inserted, ISL provides a finite state machine which tracks the current state of
|
|
the main surface and auxiliary data and their relationship to each other. The
|
|
states are encoded with the :cpp:enum:`isl_aux_state` enum. ISL also provides
|
|
helper functions for operating the state machine and determining what aux op
|
|
(if any) is required to get to the right state for a given operation.
|
|
|
|
.. doxygenenum:: isl_aux_state
|
|
.. doxygenfunction:: isl_aux_state_has_valid_primary
|
|
.. doxygenfunction:: isl_aux_state_has_valid_aux
|
|
.. doxygenenum:: isl_aux_op
|
|
.. doxygenfunction:: isl_aux_prepare_access
|
|
.. doxygenfunction:: isl_aux_state_transition_aux_op
|
|
.. doxygenfunction:: isl_aux_state_transition_write
|