Name mangling did not account for the vector size in the template type of a texture.
This adds that. The mangle is as it ever was for the vec4 case, which leaves
all GLSL behavior and most HLSL behavior uneffected. For vec1-3 the size is added
to the mangle.
Current limitation: textures cannot presently be templatized on structured types,
so this works only for vectors of basic types.
Fixes#895.
OpSpecConstantOp contains an embedded opcode which is given as a literal
argument to the OpSpecConstantOp. The subsequent arguments are as the
embedded op would expect, which may be a mixture of IDs and literals. This
adds support for that to the remapper binary parser. Upon seeing such an
embedded op, the parser flips over to parsing the argument list as
appropriate for that opcode.
Fixes#882.
Also, provides an option to auto-assign locations.
Existing tests use this option, to avoid the error message,
however, it is not fully implemented yet.
This modifies function parameter passing to pass the counter
buffer associated with a struct buffer to a function as a
hidden parameter. Similarly function declarations will have
hidden parameters added to accept the associated counter buffers.
There is a limitation: if a SB type may or may not have an associated
counter, passing it as a function parameter will assume that it does, and
the counter will appear in the linkage whether or not there is a counter
method used on the object.
This implements mytex.mips[mip][coord] for texture types. There is
some error testing, but not comprehensive. The constructs can be
nested, e.g in this case the inner .mips is parsed before the completion
of the outer [][] operator.
tx.mips[tx.mips[a][b].x][c]
Using GS methods such as Append() in non-GS stages should be ignored, but was
creating errors due to the lack of a stream output symbol for the non-GS stage.
This adds infrastructure suitable for any front end to create SPIR-V loop
control flags. The only current front end doing so is HLSL.
[unroll] turns into spv::LoopControlUnrollMask
[loop] turns into spv::LoopControlDontUnrollMask
no specification means spv::LoopControlMaskNone
Multisample textures support a GetSamplePosition() method intended to query
positions given a sample index. This cannot be truly implemented in SPIR-V,
but #753 requested returning standard positions for the 1..16 cases, which
this PR adds. Anything besides that returns (0,0). If the standard positions
are not used, this will be wrong.
This should be revisited when there is a real query available.
Vector conditions properly convert the true/false expression types to same
width vector as the condition.
Scalar conditions make the true/false expressions convert to each other.
Byte address buffers were failing to detect that they were byte address
buffers when used as fn parameters.
Note: this detection is a little awkward, and could be simplified if
it was easy to obtain the declared builtin type for an object.
Adds --hlsl-iomap option to perform IO mapping in HLSL register space.
--shift-cbuffer-binding is now a synonym for --shift-ubo-binding.
The idea way to do this seems to be passing in a dedicated IO resolver, but
that would require more intrusive restructuring, so maybe best for its
own PR.
The TDefaultHlslIoResolver class and the former TDefaultIoResolver class
share quite a bit of mechanism in a common base class.
TODO: tbuffers are landing in the wrong register class, which needs some
investigation. They're either wrong upstream, or the detection in the
resolver is wrong.
Some texture and SB operations can take non-integer indexes, which should be
cast to integers before use if they are not already. This adds makeIntegerIndex()
for the purpose. Int types are left alone.
(This was done before for operator[], but needs to apply to some other things
too, hence its extraction into common function now)
This adds TProgram::getUniformBlockCounterIndex(int index), which returns the
index the block of the counter buffer associated with the block of the passed in
index, if any, or -1 if none.
This is WIP, heavy on the IP part. There's not yet enough to use in real workloads.
Currently present:
* Creation of separate counter buffers for structured buffer types needing them.
* IncrementCounter / DecrementCounter methods
* Postprocess to remove unused counter buffers from linkage
* Associated counter buffers are given @count suffix (invalid as a user identifier)
Not yet present:
* reflection queries to obtain bindings for counter buffers
* Append/Consume buffers
* Ability to use SB references passed as fn parameters
The prior decomposition of isfinite was not setting the return type on the
sequence node. (Sequence was used because there's an internal temporary
to avoid the complex rvalue problem).
HLSL requires vec2 tessellation coordinate declarations in some cases
(e.g, isoline topology), where SPIR-V requires the TessCoord qualified
builtin to be a vec3 in all cases. This alters the IO form of the
variable to be a vec3, which will be copied to the shader's declared
type if needed. This is not a validation; the shader type must be correct.
Previously, patch constant functions only accepted OutputPatch. This
adds InputPatch support, via a pseudo-builtin variable type, so that
the patch can be tracked clear through from the qualifier.
The prior implementation of GS did not work with the new EP wrapping architecture.
This fixes it: the Append() method now looks up the actual output rather
than the internal sanitized temporary type, and writes to that.
In the hull shader, the PCF output does not participate in an argument list,
so has no defined ordering. It is always put at the end of the linkage. That
means the DS input reading PCF data must be be at the end of the DS linkage
as well, no matter where it may appear in the argument list. This change
makes sure that happens.
The detection is by looking for arguments that contain tessellation factor
builtins, even as a struct member. The whole struct is taken as the PCF output
if any members are so qualified.
The SPIR-V generator had assumed tessellation modes such as
primitive type and vertex order would only appear in tess eval
(domain) shaders. SPIR-V allows either, and HLSL allows and
possibly requires them to be in the hull shader.
This change:
1. Passes them through for either tessellation stage, and,
2. Does not set up defaults in the domain stage for HLSl compilation,
to avoid conflicting definitions.
Unknown how extensive the semantics need to be yet. Need real
feedback from workloads. This is just done as part of unifying it
with the class/struct namespaces and grammar productions.
HLSL HS outputs a per ctrl point value, and the DS reads an array
of that type. (It also has a per patch frequency). The per-ctrl-pt
frequency is arrayed on just one side, as opposed to SPIR-V which
is arrayed on both. To match semantics, the compiler creates an
array behind the scenes and indexes it by invocation ID, assigning
the HS return value to it.
SPIR-V requires that tessellation factor arrays be size 4 (outer) or 2 (inner).
HLSL allows other sizes such as 3, or even scalars. This commit converts
between them by forcing the IO types to be the SPIR-V size, and allowing
copies between the internal and IO types to handle these cases.
This PR emulates per control point inputs to patch constant functions.
Without either an extension to look across SIMD lanes or a dedicated
stage, the emulation must use separate invocations of the wrapped
entry point to obtain the per control point values. This is provided
since shaders are wanting this functionality now, but such an extension
is not yet available.
Entry point arguments qualified as an invocation ID are replaced by the
current control point number when calling the wrapped entry point. There
is no particular optimization for the case of the entry point not having
such an input but the PCF still accepting ctrl pt frequency data. It'll
work, but anyway makes no so much sense.
The wrapped entry point must return the per control point data by value.
At this time it is not supported as an output parameter.
Makes some white-space differences in most output, plus a few cases
where more could have been put out but was cut short by the previous
fix-sized buffer.
Added version check (version >= 150) for GL_(core/compatibility)_profile macros.
Added GL_core_profile standard macro check to "150.vert" test file.
Fixed version check for GL_core_profile macros, and removed bad token character from 150.vert test.
Updated 150.vert.out test base-result with google-test suite.
The non-LOD form of image size query is prohibited in certain cases:
see the OpImageQuerySize and OpImageQuerySizeLod sections of the SPIR-V
spec for details. Sometimes we were generating the non-LOD form when
we should have been using the LOD form. Sometimes the LOD form is required
even if the underlying HLSL query did not supply a MIP level itself,
in which case level 0 is now queried.
This change propagates the storage qualifier from the buffer object to its contained
array type so that isStructBufferType() realizes it is one. That propagation was
happening before only for global variable declarations, so compilation defects would
result if the use of a function parameter happened before a global declaration.
This fixes that case, whether or not there ever is a global declaration, and
regardless of the relative order.
This changes the hlsl.structbuffer.fn.frag test to exercise the alternate order.
There are no differences to generated SPIR-V for the cases which successfully compiled before.
The f16tof32 opcode was indexing a vector with a float 0, rather
than an int 0. It may have made no functional difference due to the
identical bit pattern, but code looking at the type could be
confused.
This PR adds the ability to pass structuredbuffer types by reference
as function parameters.
It also changes the representation of structuredbuffers from anonymous
blocks with named members, to named blocks with pseudonymous members.
That should not be an externally visible change.
New command line option --shift-ssbo-binding mirrors --shift-ubo-binding, etc.
New reflection query getLocalSize(int dim) queries local size, e.g, CS threads.
This is a partial implemention of structurebuffers supporting:
* structured buffer types of:
* StructuredBuffer
* RWStructuredBuffer
* ByteAddressBuffer
* RWByteAddressBuffer
* Atomic operations on RWByteAddressBuffer
* Load/Load[234], Store/Store[234], GetDimensions methods (where allowed by type)
* globallycoherent flag
But NOT yet supporting:
* AppendStructuredBuffer / ConsumeStructuredBuffer types
* IncrementCounter/DecrementCounter methods
Please note: the stride returned by GetDimensions is as calculated by glslang for std430,
and may not match other environments in all cases.
This obsoletes WIP PR #704, which was built on the pre entry point wrapping master. New version
here uses entry point wrapping.
This is a limited implementation of tessellation shaders. In particular, the following are not functional,
and will be added as separate stages to reduce the size of each PR.
* patchconstantfunctions accepting per-control-point input values, such as
const OutputPatch <hs_out_t, 3> cpv are not implemented.
* patchconstantfunctions whose signature requires an aggregate input type such as
a structure containing builtin variables. Code to synthesize such calls is not
yet present.
These restrictions will be relaxed as soon as possible. Simple cases can compile now: see for example
Test/hulsl.hull.1.tesc - e.g, writing to inner and outer tessellation factors.
PCF invocation is synthesized as an entry point epilogue protected behind a barrier and a test on
invocation ID == 0. If there is an existing invocation ID variable it will be used, otherwise one is
added to the linkage. The PCF and the shader EP interfaces are unioned and builtins appearing in
the PCF but not the EP are also added to the linkage and synthesized as shader inputs.
Parameter matching to (eventually arbitrary) PCF signatures is by builtin variable type. Any user
variables in the PCF signature will result in an error. Overloaded PCF functions will also result in
an error.
[domain()], [partitioning()], [outputtopology()], [outputcontrolpoints()], and [patchconstantfunction()]
attributes to the shader entry point are in place, with the exception of the Pow2 partitioning mode.
This removes pervertex output blocks, in favor of using only
loose variables. The pervertex blocks are not required and were
only partly implemented, and were adding some complication.
This change goes with wrap-entry-point.
Structs are split to remove builtin members to create valid SPIR-V. In this
process, an outer structure array dimension may be propegated onto the
now-removed builtin variables. For example, a mystruct[3].position ->
position[3]. The copy between the split and unsplit forms would handle
this in some cases, but not if the array dimension was at different levels
of aggregate.
It now does this, but may not handle arbitrary composite types. Unclear if
that has any semantic meaning for builtins though.
This introduces parallel types for IO-type containing aggregates used as
non-entry point function parameters or return types, or declared as variables.
Further uses of the same original type will share the same sanitized deep
structure.
This is intended to be used with the wrap-entry-point branch.
Previously, a type graph would turn into a type tree. That is,
a deep node that is shared would have multiple copies made.
This is important when creating IO and non-IO versions of deep types.
This needs some render testing, but is destined to be part of master.
This also leads to a variety of other simplifications.
- IO are global symbols, so only need one list of linkage nodes (deferred)
- no longer need parse-context-wide 'inEntryPoint' state, entry-point is localized
- several parts of splitting/flattening are now localized
When copying split types with mixtures of user variables and buitins,
where the builtins are extracted, there is a parallel structures traversal.
The traversal was not obtaining the derefenced types in the array case.
- Add support for invocation functions with "InclusiveScan" and
"ExclusiveScan" modes.
- Add support for invocation functions taking int64/uint64/doube/float16
as inout data types.
doc.cpp: Add capabilities, scope to the opcodes. Add opcode and
capability strings.
GLSL.ext.KHR.h: Add extension
string.
GlslangToSpv.cpp: Fix handling of opcodes to generate
appropriate SPIR-V.
spirv.hpp: Add capability and opcode
enums.
spv.shaderGroupVote.comp.out: Update SPIR-V output for test
shader.
Since EOpMatrixSwizzle is a new op, existing back-ends only work when the
front end first decomposes it to other operations. So far, this is only
being done for simple assignment into matrix swizzles.
This partially addressess issue #670, for when the matrix swizzle
degenerates to a component or column: m[c], m[c][r] (where HLSL
swaps rows and columns for user's view).
An error message is given for the arbitrary cases not covered.
These cases will work for arbitrary use of l-values.
Future work will handle more arbitrary swizzles, which might
not work as arbitrary l-values.
This encapsulates where the string could overflow, removing 40 lines
of fragile code. It also improves handling of numbers that are too long.
There are a couple of open issues that could related to this function
being more rational (locale dependence, 1.#INF).
(Still adding tests: do not commit)
This fixes PR #632 so that:
(a) The 4 PerVertex builtins are added to an interface block for all stages except fragment.
(b) Other builtin qualified variables are added as "loose" linkage members.
(c) Arrayness from the PerVertex builtins is moved to the PerVertex block.
(d) Sometimes, two PerVertex blocks are created, one for in, one for out (e.g, for some GS that
both reads and writes a Position)
Any previous use would only be for "", which would probably mean changing
include(...) -> includeLocal(...)
See comments about includeLocal() being an additional search over
includeSystem(), not a superset search.
This also removed ForbidIncluder, as
- the message in ForbidIncluder was redundant: error results were
already returned to the caller, which then gives the error it
wants to
- there is a trivial default implementation that a subclass can
override any subset of (I still like abstract base classes though)
- trying to get less implementation out of the interface file anyway
Reads and write syntax to UAV objects is turned into EOpImageLoad/Store
operations. This translation did not support destination swizzles,
for example, "mybuffer[tc].zyx = 3;", so such statements would fail to
compile. Now they work.
Parial updates are explicitly prohibited.
New test: hlsl.rw.swizzle.frag
This PR adds support for default function parameters in the following cases:
1. Simple constants, such as void fn(int x, float myparam = 3)
2. Expressions that can be const folded, such a ... myparam = sin(some_const)
3. Initializer lists that can be const folded, such as ... float2 myparam = {1,2}
New tests are added: hlsl.params.default.frag and hlsl.params.default.err.frag
(for testing error situations, such as ambiguity or non-const-foldable).
In order to avoid sampler method ambiguity, the hlsl better() lambda now
considers sampler matches. Previously, all sampler types looked identical
since only the basic type of EbtSampler was considered.
HLSL allows type keywords to also be identifiers, so a sequence such as "float half = 3" is
valid, or more bizzarely, something like "float.float = int.uint + bool;"
There are places this is not supported. E.g, it's permitted for struct members, but not struct
names or functions. Also, vector or matrix types such as "float3" are not permitted as
identifiers.
This PR adds that support, as well as support for the "half" type. In production shaders,
this was seen with variables named "half". The PR attempts to support this without breaking
useful grammar errors such as "; expected" at the end of unterminated statements, so it errs
on that side at the possible expense of failing to accept valid constructs containing a type
keyword identifier. If others are discovered, they can be added.
Also, half is now accepted as a valid type, alongside the min*float types.