This change adds a folding rule which transforms x * y - a and a - x * y
into FMA(x, y, -a) and FMA(-x, y, a), respectively.
While the SPIR-V instruction count remains the same, target instruction
sets typically feature FMA instruction variants that can negate an
operand. Also this transformation may unlock further optimizations which
eliminate the negation.
(Google bug: b/226145988)
* Add move folding for composite instructions
Fold chains of insert into construct
If a chain of OpCompositeInsert instruction write to every element of a
composite object, then we can replace it with an OpCompositeConstruct.
Fold a construct fed by extracts to a single extract
We already fold an OpCompositeConstruct when it is simlpy reconstructing
an object that was decomposed by a series of OpCompositeExtract
instructions. However, we do not do that if that object is an element
of a larger object.
I have updated the rule, so that if the original object is a an element
of a larger object, then the OpCompositeConstruct is replaced with a
single OpCompositeExtract from the larger object.
Fixes#4371.
* Handle 64-bit integers in local access chain convert
The local access chain convert pass does on run on module that have
64-bit integers, even if they have nothing to to with access chains.
This is very limiting because other passes rely on the access chains
being removed. So this commit will add this functionality to the pass.
spirv validation require OpFunctionCall with memory object, usually this
is non issue as all the functions are inlined.
This pass deal with some case for
DontInline function. accesschain input operand would be replaced new
created variable
This reverts commit 671f6e633f.
PR #4783 was reverted because it caused OpenCL CTS failures for clvk.
The was in clspv, which was not adding the no contract decoration when
it was required. This has been fixed in
https://github.com/google/clspv/pull/845. We can now reapply #4783.
Adding Fma instruction can speed up the code. This was requested by
swiftshader, so they do not have to do this analysis themselves. It can
also help reduce the code size, and the work the ICD compilers have to
do.
spread-volatile-semantics pass spreads Volatile semantics for builtin
variables used by certain execution models based on
VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V"). Therefore, shaders without execution model (e.g., used only
for linkage) are not the target of the pass. This commit lets the pass
just return SuccessWithoutChange in that case.
Clang added -Wunqualified-std-cast-call in
https://reviews.llvm.org/D119670, which warns on unqualified std::move
and std::forward calls. This change qualifies these calls to allow the
project to build on HEAD Clang -Werror.
If the body of the module does not have any ids change, compact ids will
not change the id bound. This can cause problems because the id bound
could be much higher than the largest id in that is used. It should be
reset any time it is not the larger id used + 1.
Fixes#4604
When folding a vector shuffle feeding a vector shuffle, we do not
propagate an 0xFFFFFFFF, which has a special meaning, correctly. We
adjust the value making it lose it meaning as an undefined value.
Fixes#4581
CCP does not want to fold an instruction unless it folds to a constant.
There is an asser to check for this. The question if a spec constant
counts as a constant. The constant folder considers a spec constant a
constand, but CCP does not. I've fixed the assert in CCP to match what
the folder does. It should not require any new changes to CCP.
Swift shader needs a way to inline all functions, even those marked as
DontInline. See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR. We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.
SwiftShader will still have to modify their code to add this pass before
the other passes are run.
The function `BuildInvalideAnalyses` will be rebuilt for every analysis that
has been requested, but it is not necessary. It also can cause problems
because if the CFG needs to be rebuilt, so do the dominator trees.
This change will make the functionality match the description of the
function.
* Optimize DefUseManager allocations
Saves around 30-35% of compilation time.
For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction. For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes. Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).
Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
Incrementally compute the hash instead of collecting words
Avoids allocating temporary space in a std::vector and std::u32string, and making three passes over all the hashed data.
Switch to using std::vector to prevent processing duplicates instead of std::unordered_set: avoids an allocation/deletion every call to ComputeHashValue, and ends up faster due to much better cache behaviour and smaller constant-factor when searching the (generally very small) list.
In my test case, made Type::HashValue go from 7.5% of compilation time to .5%
Scalar replacement generates a null when there value for a member will
not be used. The null is used to make sure things are
deterministic in case there is an error.
However, some type cannot be null, so we will change that to use undef.
To keep the code simpler we will always use the undef.
Fixes#3996
spirv-diff is a new tool that produces diff-style output comparing two
SPIR-V modules. The instructions between the src and dst modules are
matched as best as the tool can, and output is produced (in src
id-space) that shows which instructions are removed in src, added in dst
or modified between them. The order of instructions are not retained.
Matching instructions between two SPIR-V modules is not trivial, and
thus a number of heuristics are applied in this tool. In particular,
without debug information, it's hard to match functions as they can be
reordered. As such, this tool is primarily useful to produce the diff
of two SPIR-V modules derived from the same source.
This tool can be useful in a number of scenarios:
- Compare the SPIR-V before and after modifying a shader
- Compare the SPIR-V produced from a shader before and after compiler
codegen changes.
- Compare the SPIR-V produced from a shader before and after some
transformation or optimization.
- Compare the SPIR-V produced from a shader with different compilers.
The handling of the RayQueryKHR type is not complete in the type
manager. The tests were not picking this up. I've added a test to make
sure that the `GenerateAllTypes` function actually does generate all of
the types. Once it is added there other tests should pick up on the
other parts that were missing.
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").
Handle variables used by multiple entry points:
1. Update error check to make it working regardless of the order of
entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
it needs the Volatile semantics for E1 while it does not for E2
- If VulkanMemoryModel capability is enabled, which means we have to
set memory operation of load instructions for the variable, we
update load instructions in E1, but do not update the ones in E2.
- If VulkanMemoryModel capability is disabled, which means we have
to add Volatile decoration for the variable, we report an error
because E1 needs to add Volatile decoration for the variable while
E2 does not.
For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
* PassManager: Print errors occurring during disassembly
Otherwise one could be greeted by the following text when running
spirv-opt withe the `--print-all` flag:
; IR before pass wrap-opkill
; IR before pass eliminate-dead-branches
; IR before pass merge-return
With this commit, one will instead get:
error: line 143: Invalid opcode: 400
warning: line 0: Disassembly failed before pass wrap-opkill
error: line 143: Invalid opcode: 400
warning: line 0: Disassembly failed before pass eliminate-dead-branches
error: line 143: Invalid opcode: 400
warning: line 0: Disassembly failed before pass merge-return
* PassManager: Use the right target environment when disassembling
Disassembly would fail if features from a newer version of SPIR-V than
1.2 were used.
The comment in `Array::GetExtraHashWords` is misleading because getting
the hash words is split up between the generic `Type::GetHashWords` and
the type specific `Type::GetExtraHashWords`. While `IsSameImpl` is
self-contained. Removing the comment since it is misleading and no
comment is really needed.
Fixes#3248
The pass to remove the nonsemantic information and instructions
is used for drivers or tools that may not support them. Debug
information was only partially handle, which is causing a
problem. We need to either fully remove debug information or
not remove it all. Since I can see it being useful to keep the
debug information even when the nonsemantic instructions are
removed, I propose we do not remove debug info.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4269
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions. This
makes it so that the name of the pass no longer reflects what the pass actually
does. This change renames the pass so that it reflects what the pass actaully does.
The change in
commit 4ac8e5e541
Author: Greg Fischer <greg@lunarg.com>
Date: Wed Sep 15 12:38:34 2021 -0600
Add preserve_interface mode to aggressive_dead_code_elim (#4520)
Broke the C++ ABI for spirv-tools shared libraries on Linux, for not a great reason.
Restore the previous ABI.