Add AMDPAL Code Conventions section to AMD docs
Summary:
This is a first version of the AMDPAL code conventions. Further updates will undoubtedly be required to fully document AMDPAL.

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D45246

llvm-svn: 329188
parent 0daeb0bf1f
commit 68d742f9d8
@@ -3781,6 +3781,125 @@ the ``s_trap`` instruction with the following usage:
     debugger            ``s_trap 0xff``                 Reserved for debugger.
     =================== =============== =============== =======================

AMDPAL
------

This section provides code conventions used when the target triple OS is
``amdpal`` (see :ref:`amdgpu-target-triples`) for passing runtime parameters
from the application/runtime to each invocation of a hardware shader. These
parameters include both generic, application-controlled parameters called
*user data* as well as system-generated parameters that are a product of the
draw or dispatch execution.

User Data
~~~~~~~~~

Each hardware stage has a set of 32-bit *user data registers* which can be
written from a command buffer and then loaded into SGPRs when waves are launched
via a subsequent dispatch or draw operation. This is the way most arguments are
passed from the application/runtime to a hardware shader.

Compute User Data
~~~~~~~~~~~~~~~~~

Compute shader user data mappings are simpler than those of graphics shaders,
and are fixed.

Note that there are always 10 available *user data entries* in registers;
entries beyond that limit must be fetched from memory (via the spill table
pointer) by the shader.

.. table:: PAL Compute Shader User Data Registers
     :name: pal-compute-user-data-registers

     ============= ================================
     User Register Description
     ============= ================================
     0             Global Internal Table (32-bit pointer)
     1             Per-Shader Internal Table (32-bit pointer)
     2 - 11        Application-Controlled User Data (10 32-bit values)
     12            Spill Table (32-bit pointer)
     13 - 14       Thread Group Count (64-bit pointer)
     15            GDS Range
     ============= ================================

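The fixed compute mapping can be sketched as a small lookup. This is only an illustration: the register numbers come from the table above, while the function name, the ``Location`` type, and the way a spilled entry is reported are assumptions made for this example. The layout of the spill table itself is not described in this section.

```python
from typing import NamedTuple

# Register numbers taken from the compute table above; names are hypothetical.
FIRST_APP_USER_DATA_REG = 2   # user registers 2-11 hold application user data
NUM_APP_USER_DATA_REGS = 10   # "always 10 available user data entries"
SPILL_TABLE_REG = 12          # user register holding the spill table pointer


class Location(NamedTuple):
    kind: str   # "register" or "spill"
    index: int  # user register number, or the entry number for spilled entries


def locate_compute_entry(entry: int) -> Location:
    """Return where a compute shader finds application user data entry `entry`."""
    if entry < 0:
        raise ValueError("user data entry index must be non-negative")
    if entry < NUM_APP_USER_DATA_REGS:
        return Location("register", FIRST_APP_USER_DATA_REG + entry)
    # Assumption: the entry is simply reported as spilled; how it is laid out
    # behind the spill table pointer is not specified in this section.
    return Location("spill", entry)
```

For example, ``locate_compute_entry(9)`` refers to user register 11, while ``locate_compute_entry(10)`` is reported as spilled and has to be loaded through the pointer in user register 12.
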
Graphics User Data
~~~~~~~~~~~~~~~~~~

Graphics pipelines support a much more flexible user data mapping:

.. table:: PAL Graphics Shader User Data Registers
     :name: pal-graphics-user-data-registers

     ============= ================================
     User Register Description
     ============= ================================
     0             Global Internal Table (32-bit pointer)
     +             Per-Shader Internal Table (32-bit pointer)
     + 1-15        Application Controlled User Data
                   (1-15 Contiguous 32-bit Values in Registers)
     +             Spill Table (32-bit pointer)
     +             Draw Index (First Stage Only)
     +             Vertex Offset (First Stage Only)
     +             Instance Offset (First Stage Only)
     ============= ================================

The placement of the global internal table remains fixed in the first *user
data SGPR register*. Otherwise all parameters are optional, and can be mapped
to any desired *user data SGPR register*, with the following restrictions:

* Draw Index, Vertex Offset, and Instance Offset can only be used by the first
  active hardware stage in a graphics pipeline (i.e. where the API vertex
  shader runs).

* Application-controlled user data must be mapped into a contiguous range of
  user data registers.

* The application-controlled user data range supports compaction remapping, so
  only *entries* that are actually consumed by the shader must be assigned to
  corresponding *registers*. Note that in order to support an efficient runtime
  implementation, the remapping must pack *registers* in the same order as
  *entries*, with unused *entries* removed.

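The compaction rule in the last restriction can be illustrated with a short sketch. The function name and parameters are hypothetical, and the starting register is an arbitrary choice for the example, since graphics mappings are flexible. The sketch only shows the ordering constraint: used entries are packed into a contiguous register range in ascending entry order, with unused entries dropped.

```python
from typing import Dict, Iterable


def compact_user_data(used_entries: Iterable[int],
                      first_register: int,
                      num_registers: int) -> Dict[int, int]:
    """Map used user data entries to a contiguous register range.

    Registers are assigned in the same order as the entries, skipping
    entries the shader does not consume. Entries that do not fit in
    `num_registers` are left unmapped (they would need the spill table).
    """
    mapping: Dict[int, int] = {}
    for slot, entry in enumerate(sorted(set(used_entries))):
        if slot >= num_registers:
            break  # no registers left for the remaining entries
        mapping[entry] = first_register + slot
    return mapping
```

With ``compact_user_data([7, 0, 3], first_register=2, num_registers=15)``, entries 0, 3 and 7 land in registers 2, 3 and 4 respectively; the unused entries in between take up no registers.
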
.. _pal_global_internal_table:

Global Internal Table
~~~~~~~~~~~~~~~~~~~~~

The global internal table is a table of *shader resource descriptors* (SRDs) that
define how certain engine-wide, runtime-managed resources should be accessed
from a shader. The majority of these resources have HW-defined formats, and it
is up to the compiler to write/read data as required by the target hardware.

The following table illustrates the required format:

.. table:: PAL Global Internal Table
     :name: pal-git-table

     ============= ================================
     Offset        Description
     ============= ================================
     0-3           Graphics Scratch SRD
     4-7           Compute Scratch SRD
     8-11          ES/GS Ring Output SRD
     12-15         ES/GS Ring Input SRD
     16-19         GS/VS Ring Output #0
     20-23         GS/VS Ring Output #1
     24-27         GS/VS Ring Output #2
     28-31         GS/VS Ring Output #3
     32-35         GS/VS Ring Input SRD
     36-39         Tessellation Factor Buffer SRD
     40-43         Off-Chip LDS Buffer SRD
     44-47         Off-Chip Param Cache Buffer SRD
     48-51         Sample Position Buffer SRD
     52            vaRange::ShadowDescriptorTable High Bits
     ============= ================================

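As a rough illustration of how a compiler might index this table, the sketch below turns a table row into a byte offset and size. It assumes the offsets in the table are counted in 32-bit dwords, which the table does not state explicitly; the dictionary keys and the function name are also hypothetical. Only a few rows are reproduced.

```python
from typing import Dict, Tuple

DWORD_BYTES = 4  # assumption: table offsets are in 32-bit dwords

# (first offset, last offset) copied from the table above; keys are hypothetical.
GLOBAL_INTERNAL_TABLE: Dict[str, Tuple[int, int]] = {
    "graphics_scratch": (0, 3),
    "compute_scratch": (4, 7),
    "tess_factor_buffer": (36, 39),
    "sample_position_buffer": (48, 51),
    "shadow_descriptor_table_high_bits": (52, 52),
}


def table_byte_range(name: str) -> Tuple[int, int]:
    """Return (byte offset, byte size) of an entry, under the dword assumption."""
    first, last = GLOBAL_INTERNAL_TABLE[name]
    return first * DWORD_BYTES, (last - first + 1) * DWORD_BYTES
```

Under that assumption, ``table_byte_range("compute_scratch")`` gives a 16-byte descriptor starting at byte 16 of the table.
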
The pointer to the global internal table passed to the shader as user data
is a 32-bit pointer. The top 32 bits should be assumed to be the same as
the top 32 bits of the pipeline address, so the shader may use the program
counter's top 32 bits.

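The address reconstruction described above amounts to combining two 32-bit halves. The following is a minimal sketch of that arithmetic only; the function and parameter names are hypothetical, and a real shader would do this with SGPR operations rather than in Python.

```python
MASK_32 = 0xFFFFFFFF


def global_internal_table_address(user_data_low32: int, program_counter: int) -> int:
    """Build a 64-bit address from the 32-bit user data pointer.

    The upper 32 bits are taken from the program counter, on the stated
    assumption that the table shares its top 32 bits with the pipeline.
    """
    high = program_counter & (MASK_32 << 32)  # keep only bits 63..32 of the PC
    low = user_data_low32 & MASK_32           # keep only bits 31..0 of the pointer
    return high | low
```

For instance, a user data value of ``0x12345678`` combined with a program counter of ``0x00007FFF00001000`` yields ``0x00007FFF12345678``.
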
Unspecified OS
--------------
