There's only one use of this for the convenience
of a pattern. I think v_mov_b64_pseudo should also be
moved, but SIFoldOperands does currently make use of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279901 91177308-0d34-0410-b5e6-96231b3b80d8
When global-isel fails on a MachineFunction MF, MF will be cleaned up
and given to SDISel.
Thanks to this fallback, we can already perform correctness test even if
we support only a small portion of the functions in a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279891 91177308-0d34-0410-b5e6-96231b3b80d8
Adds a baseline test for lowering shuffles where the width of the output
vector is not twice the size of the input vectors. Many of those sequences
are suboptimal, and will hopefully be improved in follow-up patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279888 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to scheduler loads together
without out explicitly clustering them, which would give us non-sorted
offsets.
Also, we will want to do this if we move the load/store optimizer before
the scheduler.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23776
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279870 91177308-0d34-0410-b5e6-96231b3b80d8
In the code to detect fixed-point conversions and make use of AArch64's special
instructions, we weren't prepared for weird types. The fptosi direction got
fixed recently, but not the similar sitofp code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279852 91177308-0d34-0410-b5e6-96231b3b80d8
It's unclear how the old
%res(32) = G_ICMP { s32, s32 } intpred(eq), %0, %1
is actually different from an s1 verison
%res(1) = G_ICMP { s1, s32 } intpred(eq), %0, %1
so we'll remove it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279843 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In fuctions that contained debug info but were empty otherwise,
the ARM load/store optimizer could abort. This was because
function MergeReturnIntoLDM handled the special case where a
Machine Basic BLock is empty by calling MBB.empty(). However, this
returns false in presence of debug info, although the function
should be considered empty in the eyes of the load/store optimizer.
This has been fixed by handling the case where searching through the
block finds only debug instructions.
Reviewers: rengolin, dexonsmith, llvm-commits, jmolloy
Subscribers: t.p.northover, aemerson, rengolin, samparker
Differential Revision: https://reviews.llvm.org/D23847
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279820 91177308-0d34-0410-b5e6-96231b3b80d8
This was for some reason skipping operands that are subregisters
instead of keeping the same subregister index.
v_movreld_b32 expects src0 to be the subregister of the tied
super register use/def.
e.g.
v_movreld_b32 v0, v9, <imp-def, tied3> v[0:3], <imp-use, tied2> v[0:3]
was being replaced with
v[4:7] = copy v[0:3]
v_movreld_b32 v0, v9, <imp-def, tied3> v[4:7], <imp-use, tied2> v[4:7],
which really writes to v[0:3]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279804 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289),
due to miscompiles in the test suite. The FastISel change was left in, because
it apparently fixes an unrelated issue.
(Recommit of r279782 which was broken due to a bad merge.)
This fixes 4 out of the 5 test failures in PR29112.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279788 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts most of r274613 and its follow-ups (r276347, r277289), due to
miscompiles in the test suite. The FastISel change was left in, because it
apparently fixes an unrelated issue.
This fixes 4 out of the 5 test failures in PR29112.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279782 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit variants of these operations don't depend on the bits not being
operated on, so they also naturally model operations narrower than the actual
register width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279760 91177308-0d34-0410-b5e6-96231b3b80d8
Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths,
and change associated asserts to assert in the right direction...
This fixes PR29111.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279755 91177308-0d34-0410-b5e6-96231b3b80d8
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.
Differential Revision: http://reviews.llvm.org/D23850
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279698 91177308-0d34-0410-b5e6-96231b3b80d8
tracksSubRegLiveness only depends on the Subtarget and a cl::opt, there
is not need to change it or save/parse it in a .mir file.
Make the field const and move the initialization LiveIntervalAnalysis to the
MachineRegisterInfo constructor. Also cleanup some code and fix some
instances which better use MachineRegisterInfo::subRegLivenessEnabled() instead
of TargetSubtargetInfo::enableSubRegLiveness().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279676 91177308-0d34-0410-b5e6-96231b3b80d8
The cost of predicating a diamond is only the instructions that are not shared
between the two branches. Additionally If a predicate clobbering instruction
occurs in the shared portion of the branches (e.g. a cond move), it may still
be possible to if convert the sub-cfg. This change handles these two facts by
rescanning the non-shared portion of a diamond sub-cfg to recalculate both the
predication cost and whether both blocks are pred-clobbering.
Fixed 2 bugs before recommitting. Branch instructions must be compared and found
identical before diamond conversion. Also, predicate-clobbering instructions in
the shared prefix disqualifies a potential diamond conversion. Includes tests
for both.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279670 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch implements readlane/readfirstlane intrinsics.
TODO: need to define a new register class to consider the case
that the source could be a vector register or M0.
Reviewed by:
arsenm and tstellarAMD
Differential Revision:
http://reviews.llvm.org/D22489
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279660 91177308-0d34-0410-b5e6-96231b3b80d8
These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279652 91177308-0d34-0410-b5e6-96231b3b80d8
Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr)
This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly.
Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279646 91177308-0d34-0410-b5e6-96231b3b80d8
The register allocator can split a live interval of a register into a set
of smaller intervals. After the allocation of registers is complete, the
rewriter will modify the IR to replace virtual registers with the corres-
ponding physical registers. At this stage, if a register corresponding
to a subregister of a virtual register is used, the rewriter will check
if that subregister is undefined, and if so, it will add the <undef> flag
to the machine operand. The function verifying liveness of the subregis-
ter would assume that it is undefined, unless any of the subranges of the
live interval proves otherwise.
The problem is that the live intervals created during splitting do not
have any subranges, even if the original parent interval did. This could
result in the <undef> flag placed on a register that is actually defined.
Differential Revision: http://reviews.llvm.org/D21189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279625 91177308-0d34-0410-b5e6-96231b3b80d8
Extend instruction definitions from nearly all ISAs to include
appropriate instruction itineraries. Change MIPS16s gp prologue
generation to use real instructions instead of using a pseudo
instruction.
Reviewers: dsanders, vkalintiris
Differential Review: https://reviews.llvm.org/D23548
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279623 91177308-0d34-0410-b5e6-96231b3b80d8
Re-apply this patch, hopefully I will get away without any warnings
in the constructor now.
This patch removes the MachineFunctionAnalysis. Instead we keep a
map from IR Function to MachineFunction in the MachineModuleInfo.
This allows the insertion of ModulePasses into the codegen pipeline
without breaking it because the MachineFunctionAnalysis gets dropped
before a module pass.
Peak memory should stay unchanged without a ModulePass in the codegen
pipeline: Previously the MachineFunction was freed at the end of a codegen
function pipeline because the MachineFunctionAnalysis was dropped; With
this patch the MachineFunction is freed after the AsmPrinter has
finished.
Differential Revision: http://reviews.llvm.org/D23736
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279602 91177308-0d34-0410-b5e6-96231b3b80d8