Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value.
This issues causes amdgcn target to assert when compiling some user codes with -g.
Differential Revision: https://reviews.llvm.org/D42394
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323214 91177308-0d34-0410-b5e6-96231b3b80d8
- Change inserted add ( V_ADD_{I|U}32_e32 ) to _e64 version ( V_ADD_{I|U}32_e64 ) so that the add uses a vreg for the carry; this prevents inserted v_add from killing VCC; the _e64 version doesn't accept a literal in its encoding, so we need to introduce a mov instr as well to get the imm into a register.
- Change pass name to "SI Load Store Optimizer"; this removes the '/', which complicates scripts.
Differential Revision: https://reviews.llvm.org/D42124
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323153 91177308-0d34-0410-b5e6-96231b3b80d8
test/CodeGen/MIR is for testing the MIR parser/printer. Tests for passes
and targets belong to test/CodeGen/TARGETNAME.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322925 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
A recent change
321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores
can allow the machine instruction scheduler to move an image store past
an image load using the same descriptor.
V2: Fixed by marking image ops as mayAlias and isAliased. This may be
overly conservative, and we may need to revisit.
V3: Reverted test change done on 321556.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: llvm-commits, t-tye, yaxunl, wdng, kzhuravl
Differential Revision: https://reviews.llvm.org/D41969
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322419 91177308-0d34-0410-b5e6-96231b3b80d8
While updating clang tests for having clang set dso_local I noticed
that:
- There are *a lot* of tests to update.
- Many of the updates are redundant.
They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322317 91177308-0d34-0410-b5e6-96231b3b80d8
Planning to add support for named vregs. This puts is in a conundrum since
physregs are named as well. To rectify this we need to use a sigil other than
'%' for physregs in MIR. We've settled on using '$' for physregs but first we
must repurpose it from external symbols using it, which is what this commit is
all about. We think '&' will have familiar semantics for C/C++ users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322146 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In the case of an fp_extend of v1f16 to v1f32 where the v1f16 is the
result of a bitcast from i16, avoid creating an illegal fp16_to_fp where
the input is not a vector and the result is a v1f32.
V2: The fix is now to avoid vector scalarization creating a v1->scalar
bitcast.
Reviewers: srhines, t.p.northover
Subscribers: nhaehnle, llvm-commits, dstuttard, t-tye, yaxunl, wdng, kzhuravl, arsenm
Differential Revision: https://reviews.llvm.org/D41126
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322120 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I had a case where multiple nested uniform ifs resulted in code that did
v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and
s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first
ensuring that bits for inactive lanes were clear.
There was already code for inserting an "s_and_b64 vcc, exec, vcc" to
clear bits for inactive lanes in the case that the branch is instruction
selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in
SIFixSGPRCopies. I have added the same code into SILowerControlFlow for
the case that the branch is instruction selected as s_cbranch_vccnz.
This de-optimizes the code in some cases where the s_and is not needed,
because vcc is the result of a v_cmp, or multiple v_cmp instructions
combined by s_and/s_or. We should add a pass to re-optimize those cases.
Reviewers: arsenm, kzhuravl
Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle
Differential Revision: https://reviews.llvm.org/D41292
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322119 91177308-0d34-0410-b5e6-96231b3b80d8
Since register classes and banks are already printed with the register
definition, don't print it at the end of every instruction anymore.
This follows MIR in this regard and is another step to the unification
of the two formats.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322086 91177308-0d34-0410-b5e6-96231b3b80d8
The work order was changed in r228186 from SCC order
to RPO with an arbitrary sorting function. The sorting
function attempted to move inner loop nodes earlier. This
was was apparently relying on an assumption that every block
in a given loop / the same loop depth would be seen before
visiting another loop. In the broken testcase, a block
outside of the loop was encountered before moving onto
another block in the same loop. The testcase would then
structurize such that one blocks unconditional successor
could never be reached.
Revert to plain RPO for the analysis phase. This fixes
detecting edges as backedges that aren't really.
The processing phase does use another visited set, and
I'm unclear on whether the order there is as important.
An arbitrary order doesn't work, and triggers some infinite
loops. The reversed RPO list seems to work and is closer
to the order that was used before, minus the arbitary
custom sorting.
A few of the changed tests now produce smaller code,
and a few are slightly worse looking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321751 91177308-0d34-0410-b5e6-96231b3b80d8
The test needs to be changed; it was exercising UB and that likely wasn't the intent of the test author. I simply removed the checks because I have absolutely no idea what this test was trying to accomplish. With multiple check patterns, no explanation, and no familiarity on my part with the ISA a true fix is going to have to come from someone familiar with the target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321591 91177308-0d34-0410-b5e6-96231b3b80d8
The test in question was checking for a particular intepretation of undefined behavior. Relax the test to check that we simply don't crash.
Sorry for the breakage, I don't generally build AMDGPU locally and just saw the failure this morning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321589 91177308-0d34-0410-b5e6-96231b3b80d8
Two issues were found about machine inst scheduler when compiling ProRender
with -g for amdgcn target:
GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it
should not since DBG_VALUE is not mapped in LiveIntervals.
when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and
ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D41132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320650 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add isRenamable() predicate to MachineOperand. This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand. Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).
Reviewers: qcolombet, MatzeB, hfinkel
Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D39400
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320503 91177308-0d34-0410-b5e6-96231b3b80d8
-amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs
Differential Revision: https://reviews.llvm.org/D40091
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320084 91177308-0d34-0410-b5e6-96231b3b80d8
D40231 requires to run case with -O0 to prevent InstructionSimplify from
transforming an extractelement with undef index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319907 91177308-0d34-0410-b5e6-96231b3b80d8
This was only searching for explicit defs,
and asserting for any implicit or variadic
instruction defs, like inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319826 91177308-0d34-0410-b5e6-96231b3b80d8
Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319808 91177308-0d34-0410-b5e6-96231b3b80d8
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.
The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319732 91177308-0d34-0410-b5e6-96231b3b80d8
Move the entire optimization to one place. Before it was possible
to adjust dmask without changing the register class of the output
instruction, since they were done in separate places. Fix all
lane sizes and move all of the optimization into the DAG folding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319705 91177308-0d34-0410-b5e6-96231b3b80d8
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8
Two issues found when doing codegen for splitting vector with non-zero alloca addr space:
DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating
SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to
infer the correct pointer info, which ends up with a dummy pointer info for the target to lower
store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent MachinePointerInfo which is known in alloca address space but without other information.
TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for
multiplication of index and then add it to the pointer. However the pointer may be in an addr
space which has different size than addr space 0. The fix is to use the pointer value type for
index multiplication.
Differential Revision: https://reviews.llvm.org/D39758
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319622 91177308-0d34-0410-b5e6-96231b3b80d8
output
As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.
Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.
Differential Revision: https://reviews.llvm.org/D40421
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319445 91177308-0d34-0410-b5e6-96231b3b80d8
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8
GFX9 does not enable bounds checking for the resource descriptors
used for private access, so it should be OK to use vaddr with
a potentially negative value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319393 91177308-0d34-0410-b5e6-96231b3b80d8