Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.
This will also make it easier to turn it on in buildbots.
Reviewers: chandlerc
Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits
Differential Revision: http://reviews.llvm.org/D19723
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268050 91177308-0d34-0410-b5e6-96231b3b80d8
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"
%typ = type { i32, i32 }
define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
%b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
%1 = load i32, i32* %b, align 4
%2 = ptrtoint i32* %b to i64
%3 = and i64 %2, -35184372088833
%4 = inttoptr i64 %3 to i32*
%_msld = load i32, i32* %4, align 4
%zzz = add i32 %1, %_msld
ret i32 %zzz
}
Fix this by checking ResNo.
I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them. In fact, they might not be triggerable at all,
at least with current targets. Still, better safe than sorry.
Differential Revision: http://reviews.llvm.org/D19202
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267420 91177308-0d34-0410-b5e6-96231b3b80d8
This code was specific to vector operations with scalar operands:
all the opcodes in FoldValue (via FoldConstantArithmetic) can't
match those criteria.
Replace it with an assert if that ever changes: at that point,
we might need to add back a splat BUILD_VECTOR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266100 91177308-0d34-0410-b5e6-96231b3b80d8
In Memcpy lowering we had missed a dependence from the load of the
operation to successor operations. This causes us to potentially
construct an in initial DAG with a memory dependence not fully
represented in the chain sub-DAG but rather require looking at the
entire DAG breaking alias analysis by allowing incorrect repositioning
of memory operations.
To work around this, r200033 changed DAGCombiner::GatherAllAliases to be
conservative if any possible issues to happen. Unfortunately this check
forbade many non-problematic situations as well. For example, it's
common for incoming argument lowering to add a non-aliasing load hanging
off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would
find that other (unvisited) use of the EntryNode chain, and just give up
entirely. Furthermore, the check was incomplete: it would not actually
detect all such potentially problematic DAG constructions, because
GatherAllAliases did not guarantee to visit all chain nodes going up to
the root EntryNode. This is in general fine -- giving up early will just
miss a potential optimization, not generate incorrect results. But, for
this non-chain dependency detection code, it's possible that you could
have a load attached to a higher-up chain node than any which were
visited. If that load aliases your store, but the only dependency is
through the value operand of a non-aliasing store, it would've been
missed by this code, and potentially reordered.
With the dependence added, this check can be removed and Alias Analysis
can be much more aggressive. This fixes code quality regression in the
Consecutive Store Merge cleanup (D14834).
Test Change:
ppc64-align-long-double.ll now may see multiple serializations
of its stores
Differential Revision: http://reviews.llvm.org/D18062
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265836 91177308-0d34-0410-b5e6-96231b3b80d8
Change isConsecutiveLoads to check that loads are non-volatile as this
is a requirement for any load merges. Propagate change to two callers.
Reviewers: RKSimon
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265013 91177308-0d34-0410-b5e6-96231b3b80d8
When merging stores in DAGCombiner, add check to ensure that no
dependenices exist that would cause the construction of a cycle in our
DAG. This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.
This test is only done in cases where Antialias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.
Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight
Subscribers: llvm-commits, tberghammer, danalbert, srhines
Differential Revision: http://reviews.llvm.org/D18336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264461 91177308-0d34-0410-b5e6-96231b3b80d8
Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264085 91177308-0d34-0410-b5e6-96231b3b80d8
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.
Original Message:
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.
Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.
It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).
This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262902 91177308-0d34-0410-b5e6-96231b3b80d8
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.
This reverts r262886.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262892 91177308-0d34-0410-b5e6-96231b3b80d8
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.
Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.
It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).
This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262886 91177308-0d34-0410-b5e6-96231b3b80d8
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.
Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.
Part of llvm.org/PR26808.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262500 91177308-0d34-0410-b5e6-96231b3b80d8
This was causing assertions later from using the wrong pointer
size with LDS operations. getOptimalMemOpType should also have
address space arguments later.
This avoids assertions in existing tests exposed by
a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261580 91177308-0d34-0410-b5e6-96231b3b80d8
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.
Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators. (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)
There should be NFC here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261498 91177308-0d34-0410-b5e6-96231b3b80d8
The code change is simple enough: instead of attaching an anonymous SDLoc to splatted
vector constants, use the scalar constant's existing SDLoc since that is what is passed
into getConstant() as a param. But this changes instruction scheduling, so I'll explain
why that happens.
The motivation for this patch starts near:
http://reviews.llvm.org/rL258833
...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'.
But when I made that change locally, several x86 codegen tests wiggled.
It turns out that the lack of SDLoc consistency in getConstant() changes the way
ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG
scheduler algorithms use IROrder for tie-breaking.
Differential Revision: http://reviews.llvm.org/D16972
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260582 91177308-0d34-0410-b5e6-96231b3b80d8
I reinvented this functionality in http://reviews.llvm.org/D16828 because it was
hidden away as a static function. The changes in x86 are not based on a complete
audit. I suspect there are other possible uses there, and there are almost certainly
more potential users in other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260295 91177308-0d34-0410-b5e6-96231b3b80d8
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.
The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.
The performance fix keeps the containers required by the
hasPredecessorHelper (which is a lazy DFS) and reuse them. Since
hasPredecessorHelper is called in a loop, the overall efficiency reduced
from O(n^2) to O(n), where n is the number of SDNodes.
The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259691 91177308-0d34-0410-b5e6-96231b3b80d8
When generating calls to memcpy, memmove, and memset, use void* as the return
type rather than void, to match the standard signatures for these functions.
This has no practical effect for most targets, since the return values of
these calls aren't being used anyway, and most calling conventions tolerate
this kind of mismatch. However, this change will help support future
optimizations to utilize the return value to avoid holding the argument
value live across a call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258691 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258482 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
#include <string>
#include <iostream>
static const char kData[] = "asdf jkl;";
int main() {
std::string s(kData + 3, sizeof(kData) - 3);
std::cout << s << '\n';
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258465 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.
This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.
Differential Revision: http://reviews.llvm.org/D16090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258296 91177308-0d34-0410-b5e6-96231b3b80d8
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.
Differential Revision: http://reviews.llvm.org/D15957
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257940 91177308-0d34-0410-b5e6-96231b3b80d8
Pulled out the similar CONCAT_VECTORS creation code from the 2/3 operand getNode() calls (to handle all UNDEF and all BUILD_VECTOR cases). Added a similar handler to the general getNode() call as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256709 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256. This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view. On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.
This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls. If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.
The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86). NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction. All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator). X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.
I call this a "partial fix" because there are still cases that can't be
lowered. A fatal error is reported in these cases.
Reviewers: arsenm, theraven, compnerd, hfinkel
Subscribers: hfinkel, llvm-commits, alex
Differential Revision: http://reviews.llvm.org/D7241
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255441 91177308-0d34-0410-b5e6-96231b3b80d8
PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255289 91177308-0d34-0410-b5e6-96231b3b80d8
Almost all these changes are conditioned and only apply to the new
x86-64 f128 type configuration, which will be enabled in a follow up
patch. They are required together to make new f128 work. If there is
any error, we should fix or revert them as a whole.
These changes should have no impact to current configurations.
* Relax type legalization checks to accept new f128 type configuration,
whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has
TLI.isTypeLegal true.
* Relax GetSoftenedFloat to return in some cases f128 type SDValue,
which is TLI.isTypeLegal but not "softened" to i128 node.
* Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration,
to generate optimized bitwise operators for libm functions.
* Enhance related Lower* functions to handle f128 type.
* Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions
to keep new f128 type in register, and convert f128 operators to library calls.
* Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
* Add ExpandConstant to handle i128 constants, ExpandNode
to handle ISD::Constant node.
* Add one more parameter to getCommonSubClass and firstCommonClass,
to guarantee that returned common sub class will contain the specified
simple value type.
This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
* Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
* Fix printOperand to handle null operand.
* Enhance ISD::BITCAST node to handle f128 constant.
* Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
* Enhance X86AsmPrinter to emit f128 values in comments.
Differential Revision: http://reviews.llvm.org/D15134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254653 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.
Reviewers: MatzeB, andreadb, tstellarAMD
Subscribers: arsenm, jyknight, llvm-commits
Differential Revision: http://reviews.llvm.org/D14945
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254085 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for vector constant folding of integer/float comparisons.
This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCASE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations)
Differential Revision: http://reviews.llvm.org/D14683
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253504 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Don't call `computeKnownBitsFromRangeMetadata` for extended loads --
this can cause a mismatch between the width of the !range metadata and
the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the
future). This isn't a problem now, but will be after a future change.
Note: this can be made more aggressive in the future.
Reviewers: nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14107
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251486 91177308-0d34-0410-b5e6-96231b3b80d8
While technically this is untested dead code, it has out-of-tree users.
This reverts a part of r250434.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250717 91177308-0d34-0410-b5e6-96231b3b80d8
We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical).
This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases.
After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode().
Differential Revision: http://reviews.llvm.org/D13665
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250118 91177308-0d34-0410-b5e6-96231b3b80d8