The work order was changed in r228186 from SCC order
to RPO with an arbitrary sorting function. The sorting
function attempted to move inner loop nodes earlier. This
was was apparently relying on an assumption that every block
in a given loop / the same loop depth would be seen before
visiting another loop. In the broken testcase, a block
outside of the loop was encountered before moving onto
another block in the same loop. The testcase would then
structurize such that one blocks unconditional successor
could never be reached.
Revert to plain RPO for the analysis phase. This fixes
detecting edges as backedges that aren't really.
The processing phase does use another visited set, and
I'm unclear on whether the order there is as important.
An arbitrary order doesn't work, and triggers some infinite
loops. The reversed RPO list seems to work and is closer
to the order that was used before, minus the arbitary
custom sorting.
A few of the changed tests now produce smaller code,
and a few are slightly worse looking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321751 91177308-0d34-0410-b5e6-96231b3b80d8
The test needs to be changed; it was exercising UB and that likely wasn't the intent of the test author. I simply removed the checks because I have absolutely no idea what this test was trying to accomplish. With multiple check patterns, no explanation, and no familiarity on my part with the ISA a true fix is going to have to come from someone familiar with the target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321591 91177308-0d34-0410-b5e6-96231b3b80d8
The test in question was checking for a particular intepretation of undefined behavior. Relax the test to check that we simply don't crash.
Sorry for the breakage, I don't generally build AMDGPU locally and just saw the failure this morning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321589 91177308-0d34-0410-b5e6-96231b3b80d8
Two issues were found about machine inst scheduler when compiling ProRender
with -g for amdgcn target:
GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it
should not since DBG_VALUE is not mapped in LiveIntervals.
when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and
ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D41132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320650 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add isRenamable() predicate to MachineOperand. This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand. Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).
Reviewers: qcolombet, MatzeB, hfinkel
Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D39400
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320503 91177308-0d34-0410-b5e6-96231b3b80d8
-amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs
Differential Revision: https://reviews.llvm.org/D40091
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320084 91177308-0d34-0410-b5e6-96231b3b80d8
D40231 requires to run case with -O0 to prevent InstructionSimplify from
transforming an extractelement with undef index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319907 91177308-0d34-0410-b5e6-96231b3b80d8
This was only searching for explicit defs,
and asserting for any implicit or variadic
instruction defs, like inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319826 91177308-0d34-0410-b5e6-96231b3b80d8
Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319808 91177308-0d34-0410-b5e6-96231b3b80d8
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.
The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319732 91177308-0d34-0410-b5e6-96231b3b80d8
Move the entire optimization to one place. Before it was possible
to adjust dmask without changing the register class of the output
instruction, since they were done in separate places. Fix all
lane sizes and move all of the optimization into the DAG folding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319705 91177308-0d34-0410-b5e6-96231b3b80d8
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8
Two issues found when doing codegen for splitting vector with non-zero alloca addr space:
DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating
SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to
infer the correct pointer info, which ends up with a dummy pointer info for the target to lower
store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent MachinePointerInfo which is known in alloca address space but without other information.
TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for
multiplication of index and then add it to the pointer. However the pointer may be in an addr
space which has different size than addr space 0. The fix is to use the pointer value type for
index multiplication.
Differential Revision: https://reviews.llvm.org/D39758
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319622 91177308-0d34-0410-b5e6-96231b3b80d8
output
As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.
Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.
Differential Revision: https://reviews.llvm.org/D40421
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319445 91177308-0d34-0410-b5e6-96231b3b80d8
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8
GFX9 does not enable bounds checking for the resource descriptors
used for private access, so it should be OK to use vaddr with
a potentially negative value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319393 91177308-0d34-0410-b5e6-96231b3b80d8
GFX9 stopped using m0 for most DS instructions. Select
a different instruction without the use. I think this will
be less error prone than trying to manually maintain m0
uses as needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319270 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that store-merge is only generates type-safe stores, do a second
pass just before instruction selection to allow lowered intrinsics to
be merged as well.
Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33675
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319036 91177308-0d34-0410-b5e6-96231b3b80d8
AMDGPU backend errors with "unsupported call to function" upon
encountering a call to llvm.log{,10}.{f16,f32} intrinsics. This patch
adds custom lowering to avoid that error on both R600 and SI.
Reviewers: arsenm, jvesely
Subscribers: tstellar
Differential Revision: https://reviews.llvm.org/D29942
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319025 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This bug seems to have gone unnoticed because critical cases with LDS
instructions are eliminated by the peephole optimizer.
However, equivalent situations arise with buffer loads and stores
as well, so this fixes regressions since r317751 ("AMDGPU: Merge
S_BUFFER_LOAD_DWORD_IMM into x2, x4").
Fixes at least:
KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs
KHR-GL45.cull_distance.functional
piglit tes-input-gl_ClipDistance.shader_test
... and probably more
Change-Id: I0e371536288eb8e6afeaa241a185266fd45d129d
Reviewers: arsenm, mareko, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D40303
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318829 91177308-0d34-0410-b5e6-96231b3b80d8
DAGTypeLegalizer::SplitInteger uses default pointer size as shift amount constant type,
which causes less performant ISA in amdgcn---amdgiz target since the default pointer
type is i64 whereas the desired shift amount type is i32.
This patch fixes that by using TLI.getScalarShiftAmountTy in DAGTypeLegalizer::SplitInteger.
The X86 change is necessary since splitting i512 requires shifting amount of 256, which
cannot be held by i8.
Differential Revision: https://reviews.llvm.org/D40148
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318727 91177308-0d34-0410-b5e6-96231b3b80d8
Manually update test r600.amdgpu-alias-analysis.ll for amdgiz environment
since it cannot be done by script.
The two pointers are swapped in the output because PrintResults in
AliasAnalysisEvaluator.cpp sorts the strings obtained from printAsOperand
before printing them.
Differential Revision: https://reviews.llvm.org/D40131
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318660 91177308-0d34-0410-b5e6-96231b3b80d8