Reverting due to assertion failure in
lib/CodeGen/SelectionDAG/InstrEmitter.cpp
This reverts commit r272792.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272799 91177308-0d34-0410-b5e6-96231b3b80d8
[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.
Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.
This refixes PR9817 which was being incompletely checked in the
testsuite.
Reviewers: jyknight
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D21037
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272792 91177308-0d34-0410-b5e6-96231b3b80d8
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272126 91177308-0d34-0410-b5e6-96231b3b80d8
Although this was intended to be NFC, the test case wiggle shows a change in
code scheduling/RA caused by a difference in the SDLoc() generation.
Depending on how you look at it, this is the (dis)advantage of exact checking
in regression tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271526 91177308-0d34-0410-b5e6-96231b3b80d8
This should have been converting the size to bytes, but wasn't really.
These should probably all be using getStoreSize instead.
I haven't been able to come up with a meaningful testcase for this.
I can trigger it using combinations of struct loads and stores,
but can't observe a difference in non-broken testcases.
isAlias is only really used during store merging, so I'm not sure how
to get into the vector splitting situation the comment describes
since store merging is only done before type legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271356 91177308-0d34-0410-b5e6-96231b3b80d8
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"
%typ = type { i32, i32 }
define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
%b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
%1 = load i32, i32* %b, align 4
%2 = ptrtoint i32* %b to i64
%3 = and i64 %2, -35184372088833
%4 = inttoptr i64 %3 to i32*
%_msld = load i32, i32* %4, align 4
%zzz = add i32 %1, %_msld
ret i32 %zzz
}
Fix this by checking ResNo.
I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them. In fact, they might not be triggerable at all,
at least with current targets. Still, better safe than sorry.
Differential Revision: http://reviews.llvm.org/D19202
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267420 91177308-0d34-0410-b5e6-96231b3b80d8
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267328 91177308-0d34-0410-b5e6-96231b3b80d8
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267098 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes a bug (PR26827) when using anti-aliasing in store
merging. This sets the chain users of the component stores to point to
the new store instead of the component stores chain parent.
Reviewers: jyknight
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18909
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266217 91177308-0d34-0410-b5e6-96231b3b80d8
xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well.
The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles.
I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result.
Differential Revision: http://reviews.llvm.org/D18944
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265998 91177308-0d34-0410-b5e6-96231b3b80d8
In Memcpy lowering we had missed a dependence from the load of the
operation to successor operations. This causes us to potentially
construct an in initial DAG with a memory dependence not fully
represented in the chain sub-DAG but rather require looking at the
entire DAG breaking alias analysis by allowing incorrect repositioning
of memory operations.
To work around this, r200033 changed DAGCombiner::GatherAllAliases to be
conservative if any possible issues to happen. Unfortunately this check
forbade many non-problematic situations as well. For example, it's
common for incoming argument lowering to add a non-aliasing load hanging
off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would
find that other (unvisited) use of the EntryNode chain, and just give up
entirely. Furthermore, the check was incomplete: it would not actually
detect all such potentially problematic DAG constructions, because
GatherAllAliases did not guarantee to visit all chain nodes going up to
the root EntryNode. This is in general fine -- giving up early will just
miss a potential optimization, not generate incorrect results. But, for
this non-chain dependency detection code, it's possible that you could
have a load attached to a higher-up chain node than any which were
visited. If that load aliases your store, but the only dependency is
through the value operand of a non-aliasing store, it would've been
missed by this code, and potentially reordered.
With the dependence added, this check can be removed and Alias Analysis
can be much more aggressive. This fixes code quality regression in the
Consecutive Store Merge cleanup (D14834).
Test Change:
ppc64-align-long-double.ll now may see multiple serializations
of its stores
Differential Revision: http://reviews.llvm.org/D18062
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265836 91177308-0d34-0410-b5e6-96231b3b80d8
Change isConsecutiveLoads to check that loads are non-volatile as this
is a requirement for any load merges. Propagate change to two callers.
Reviewers: RKSimon
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265013 91177308-0d34-0410-b5e6-96231b3b80d8
When merging stores in DAGCombiner, add check to ensure that no
dependenices exist that would cause the construction of a cycle in our
DAG. This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.
This test is only done in cases where Antialias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.
Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight
Subscribers: llvm-commits, tberghammer, danalbert, srhines
Differential Revision: http://reviews.llvm.org/D18336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264461 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:
(and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)
DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.
This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.
Reviewers: t.p.northover, jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18247
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263935 91177308-0d34-0410-b5e6-96231b3b80d8
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.
Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).
Differential Revision: http://reviews.llvm.org/D17691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263159 91177308-0d34-0410-b5e6-96231b3b80d8
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization
combine.
This fixes PR26835.
Differential Revision: http://reviews.llvm.org/D17878
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262746 91177308-0d34-0410-b5e6-96231b3b80d8
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.
This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.
Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.
This patch fixes PR17193 (and a long time FIXME in the tests).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262738 91177308-0d34-0410-b5e6-96231b3b80d8
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.
Differential Revision: http://reviews.llvm.org/D17691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262599 91177308-0d34-0410-b5e6-96231b3b80d8
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.
This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.
This patch fixes PR17193 (and a long time FIXME in the tests).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262507 91177308-0d34-0410-b5e6-96231b3b80d8
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262397 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262387 91177308-0d34-0410-b5e6-96231b3b80d8
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262358 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.
The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.
Reviewers: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D10970
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262316 91177308-0d34-0410-b5e6-96231b3b80d8
In the case where op = add, y = base_ptr, and x = offset, this
transform:
(op y, (op x, c1)) -> (op (op x, y), c1)
breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.
This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262148 91177308-0d34-0410-b5e6-96231b3b80d8