Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting
... = array[zext(a)]
... = array[zext(a) + 1]
to
... = array[sext(a)]
... = array[zext(a) + 1],
the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.
Also, with r211281 and r211084, we annotate more "nuw" tags on
computations involving CUDA intrinsics such as threadIdx.x. These
annotations help a lot with splitting GEPs, making the benefit of the
reverted optimization only marginal.
Test Plan: make check-all
Reviewers: eliben, meheff
Reviewed By: meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D4542
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213209 91177308-0d34-0410-b5e6-96231b3b80d8
In the original version of the patch, the behaviour was as described in
the comment. The behaviour was changed before committing, but the
comment was not updated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213117 91177308-0d34-0410-b5e6-96231b3b80d8
This patch modifies the existing DiagnosticInfo system to create a generic base
class from which diagnostic-based warnings are derived. This is used by
the loop vectorizer to trigger a warning when vectorization is forced and
fails. Several tests have been added to verify this behavior.
Reviewed by: Arnold Schwaighofer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213110 91177308-0d34-0410-b5e6-96231b3b80d8
Specifically, when building a union query, if we are dominated by an identical
query then use the result of that query instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213047 91177308-0d34-0410-b5e6-96231b3b80d8
not properly handle the case where the predecessor block was the entry block to
the function. The only in-tree client of this is JumpThreading, which worked
around the issue in its own code. This patch moves the solution into the helper
so that JumpThreading (and other clients) do not have to replicate the same fix
everywhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212875 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the ASan instrumentation pass creates a string with the global's
name for each instrumented global (to include global names in the error report).
The name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g., there is no __cxa_demangle on Android).
Instead, create a string with the fully qualified global name in Clang, and pass it
to the ASan instrumentation pass in llvm.asan.globals metadata. If there is no
metadata for some global, ASan falls back to the original algorithm.
This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212872 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets
created as an argument to a GEP partway through an iteration, causing
-instcombine to optimize the GEP before the multiply.
rdar://problem/17615671
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212742 91177308-0d34-0410-b5e6-96231b3b80d8
This is the one remaining place I see where passing
isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for
loads) -- I think I got the others in r212720. Most of the other remaining
callers of isSafeToSpeculativelyExecute only use it for call sites (or
otherwise exclude loads).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212730 91177308-0d34-0410-b5e6-96231b3b80d8
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferenceable loads, and will also be important if/when we add a
dereferenceable pointer attribute.
This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212720 91177308-0d34-0410-b5e6-96231b3b80d8
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a smaller type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.
In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer through
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).
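As a hedged illustration (hypothetical IR, not taken from the patch's tests):
a load through a bitcast from a pointer to a larger object to a pointer to a
smaller type can now be recognized as dereferenceable.
define i32 @f() {
  %a = alloca [4 x i32]                  ; 16 dereferenceable bytes
  %p = bitcast [4 x i32]* %a to i32*     ; larger pointee -> smaller pointee
  %v = load i32* %p                      ; loads 4 of those 16 bytes: safe
  ret i32 %v
}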
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212686 91177308-0d34-0410-b5e6-96231b3b80d8
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into the
LLVMSupport library. Implement two users of this class:
* DFSanABIList in the DFSan instrumentation pass.
* SanitizerBlacklist in the Clang CodeGen library.
The latter will be modified to use actual source-level information from the
frontend (source file names) instead of unstable LLVM IR properties (such as
the LLVM Module identifier).
Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.
No functionality change.
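For reference, a minimal file in the format this class parses might look like
the following (hypothetical entries; "fun", "src" and "global" are the usual
sanitizer special-case prefixes, and "*" is a wildcard):
# comment lines start with '#'
fun:*BadFunction*
src:path/to/ignored_file.cc
global:ExampleGlobal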
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212643 91177308-0d34-0410-b5e6-96231b3b80d8
In PR20059 (http://llvm.org/pr20059), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).
This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.
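A hedged sketch of the kind of pattern at stake (hypothetical IR): the shuffle
replaces a possibly-zero divisor lane with a constant before the trapping srem,
so eliminating it is not safe.
define <2 x i32> @f(<2 x i32> %x, <2 x i32> %y) {
  ; lane 1 of the divisor is forced to 1 so it can never be zero
  %safe = shufflevector <2 x i32> %y, <2 x i32> <i32 1, i32 1>, <2 x i32> <i32 0, i32 2>
  %r = srem <2 x i32> %x, %safe
  ret <2 x i32> %r
}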
Differential Revision: http://reviews.llvm.org/D4424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212629 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 5b55a47e94.
A test case was found to crash after this was applied. I'll file a bug to track fixing this with the test case needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212550 91177308-0d34-0410-b5e6-96231b3b80d8
All blacklisting logic is now moved to the frontend (Clang).
If a function (or source file it is in) is blacklisted, it doesn't
get sanitize_address attribute and is therefore not instrumented.
If a global variable (or source file it is in) is blacklisted, it is
reported to be blacklisted by the entry in llvm.asan.globals metadata,
and is not modified by the instrumentation.
The latter may lead to certain false positives - not all the globals
created by Clang are described in llvm.asan.globals metadata (e.g.,
RTTI descriptors are not), so we may start reporting errors on them
even if the "module" they appear in is blacklisted. We assume it's fine
to take such risk:
1) errors on these globals are rare and usually indicate wild memory access;
2) we can lazily add descriptors for these globals into llvm.asan.globals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212505 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds a check for trapping ops to an existing loop over phi nodes in SimplifyCondBranchToCondBranch(), and bails out of the optimization if we find one of those.
The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected.
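A hedged illustration of the hazard (hypothetical IR): a phi input that is a
trapping constant expression must not be evaluated speculatively when folding
the branch.
%v = phi i32 [ sdiv (i32 7, i32 ptrtoint (i32* @g to i32)), %bb1 ], [ 0, %bb2 ]
; the sdiv traps if @g ends up at address zero, so it may only be evaluated
; on the %bb1 path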
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212490 91177308-0d34-0410-b5e6-96231b3b80d8
This is useful for functions that are not actually available externally but
referenced by a vtable of some kind. Clang emits functions like this for the MS
ABI.
PR20182.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212337 91177308-0d34-0410-b5e6-96231b3b80d8
Exposes more constant globals that can be removed by
the global optimizer. A specific example is the removal
of the static global block address array in
clang/test/CodeGen/indirect-goto.c. This change impacts only
lower optimization levels; with LTO, interprocedural
constant propagation already runs before global opt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212284 91177308-0d34-0410-b5e6-96231b3b80d8
With this change all values passed through blacklisted functions
become fully initialized. The previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic to all other operations.
This makes blacklist applicable in a wider range of situations.
It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212268 91177308-0d34-0410-b5e6-96231b3b80d8
With this change all values passed through blacklisted functions
become fully initialized. The previous behavior was to initialize all
loads in blacklisted functions, but apply normal shadow propagation
logic to all other operations.
This makes blacklist applicable in a wider range of situations.
It also makes code for blacklisted functions a lot shorter, which
works as yet another workaround for PR17409.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212265 91177308-0d34-0410-b5e6-96231b3b80d8
These don't need to be mutable and callers being added soon in CodeGen
won't have access to non-const Module&.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212202 91177308-0d34-0410-b5e6-96231b3b80d8
See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the
original feature request.
Introduce llvm.asan.globals metadata, which Clang (or any other frontend)
may use to report extra information about global variables to the ASan
instrumentation pass in the backend. This metadata replaces
llvm.asan.dynamically_initialized_globals, which was used to detect init-order
bugs. llvm.asan.globals contains the following data for each global (a hedged
sketch follows the list):
1) source location (file/line/column info);
2) whether it is dynamically initialized;
3) whether it is blacklisted (shouldn't be instrumented).
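A hedged sketch of one such entry (hypothetical layout and names, using the
metadata syntax of the time; the exact schema is defined by the pass):
!llvm.asan.globals = !{!0}
!0 = metadata !{[10 x i32]* @array, metadata !1, i1 true, i1 false}  ; global, location, dyn-init, blacklisted
!1 = metadata !{metadata !"/path/to/file", i32 17, i32 8}            ; file, line, column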
Source location data is then emitted in the binary and can be picked up
by the ASan runtime in case it needs to print an error report involving a global.
For example:
0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40
These source locations are printed even if the binary doesn't have any
debug info.
This is an ABI-breaking change. ASan initialization is renamed to
__asan_init_v4(). Pre-built libraries compiled with older Clang will not work
with the fresh runtime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212188 91177308-0d34-0410-b5e6-96231b3b80d8
It is not safe to negate the smallest signed integer; doing so yields
the same number back (e.g., in i8, negating -128 wraps around to -128).
This fixes PR20186.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212164 91177308-0d34-0410-b5e6-96231b3b80d8
This patch reduces the stack memory consumption of the InstCombine
function isOnlyCopiedFromConstantGlobal(), which under certain conditions
could overflow the stack because of excessive recursion.
For example, in a case like this:
%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0, i32* %1
%2 = getelementptr inbounds i32* %1, i64 1
store i32 1, i32* %2
%3 = getelementptr inbounds i32* %2, i64 1
store i32 2, i32* %3
%4 = getelementptr inbounds i32* %3, i64 1
store i32 3, i32* %4
%5 = getelementptr inbounds i32* %4, i64 1
store i32 4, i32* %5
%6 = getelementptr inbounds i32* %5, i64 1
store i32 5, i32* %6
...
This piece of code crashes LLVM when trying to apply instcombine on
desktop machines. On embedded devices this could happen at a much lower
recursion depth. Some instructions (getelementptrs and bitcasts) make
the function recursively call itself on their uses, which is what makes
the example above consume so much stack (it becomes a recursive
depth-first tree visit with a very large depth).
The patch changes the algorithm to be semantically equivalent but
iterative instead of recursive, and changes the visiting order from a
depth-first visit to a breadth-first visit (visit all the instructions
of the current level before the ones of the next one).
Now, if a lot of memory is required, a heap allocation is done instead of
the stack allocation, avoiding the possible crash.
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D4355
Patch by Marcello Maggioni! We don't generally commit large stress tests
that look for out-of-memory conditions, so I didn't request that one be
added to the patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212133 91177308-0d34-0410-b5e6-96231b3b80d8
Matching behavior with DeadArgumentElimination (and leveraging some
now-common infrastructure), keep track of the function from debug info
metadata if arguments are promoted.
This may produce interesting debug info - since the arguments may be
missing or of different types... but at least backtraces, inlining, etc.
will be correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212128 91177308-0d34-0410-b5e6-96231b3b80d8
Update DeadArgumentElimination to use this, with the intent of reusing
the functionality for ArgumentPromotion as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212122 91177308-0d34-0410-b5e6-96231b3b80d8
There were transforms whose *intent* was to downgrade the linkage of
external objects to internal linkage.
However, they fired on things with private linkage as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212104 91177308-0d34-0410-b5e6-96231b3b80d8
This both improves basic debug info quality and fixes a larger
hole: whenever we inline a call/invoke without a location, debug info for
the entire inlined body is lost, along with other badness that the debug
info emission code is currently working around but shouldn't have to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212065 91177308-0d34-0410-b5e6-96231b3b80d8
This probably isn't necessary since msan started to unpoison the return
value shadow memory before all calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212061 91177308-0d34-0410-b5e6-96231b3b80d8
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.
COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.
This functionality is only representable in COFF and ELF; Mach-O has no
similar mechanism.
Differential Revision: http://reviews.llvm.org/D4178
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211920 91177308-0d34-0410-b5e6-96231b3b80d8
If both instructions to be replaced are marked invariant, the resulting
instruction is invariant.
rdar://13358910
Fix by Erik Eckstein!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211801 91177308-0d34-0410-b5e6-96231b3b80d8
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic: no relocation exists to
represent such an access.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211762 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211753 91177308-0d34-0410-b5e6-96231b3b80d8
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211749 91177308-0d34-0410-b5e6-96231b3b80d8
[LLVM part]
These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix. Metadata name
changes:
llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*
This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata.
Patch by Mark Heffernan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211710 91177308-0d34-0410-b5e6-96231b3b80d8
Origin history should only be recorded for uninitialized values, because it is
meaningless otherwise. This change moves __msan_chain_origin to the runtime
library side and makes it conditional on the corresponding shadow value.
Previous code was correct, but _very_ inefficient.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211700 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes exponential compilation complexity in PR19835, caused by
LICM::sink not handling the following pattern well:
f = op g
e = op f, g
d = op e
c = op d, e
b = op c
a = op b, c
When an instruction with N uses is sunk, each of its operands gets N
new uses (all of them phi nodes). In the example above, if a had 1
use, c would have 2, e would have 4, and g would have 8.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211673 91177308-0d34-0410-b5e6-96231b3b80d8
The induction variable's start value needs to be defined before we branch
(for the overflow check) to the scalar preheader where it is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211460 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the remaining code related to the old implementation.
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
This one was the final patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211457 91177308-0d34-0410-b5e6-96231b3b80d8
Added a short description of the new comparison algorithm, which introduces
a total ordering over the set of functions.
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211456 91177308-0d34-0410-b5e6-96231b3b80d8
This patch activates the new implementation.
From now on, the merging process should take O(N*log(N)) time,
where N is the size of the module (we are free to measure it in
functions or in instructions). Internally, FnTree is a
binary tree, so every lookup operation takes O(log(N)) time.
This is still not the last patch in the series; we also have to
clean up the pass from old code and update the pass comments.
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211445 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the following old FunctionComparator methods:
* enumerate
* isEquivalentOperation
* isEquivalentGEP
* isEquivalentType
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211444 91177308-0d34-0410-b5e6-96231b3b80d8
introduced among the set of functions.
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211442 91177308-0d34-0410-b5e6-96231b3b80d8
methods.
This patch changes the return type of the FunctionComparator::compare() and
FunctionComparator::compare(const BasicBlock*, const BasicBlock*)
methods from bool (equal or not) to {-1, 0, 1} (less, equal, greater).
This patch belongs to a patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211437 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Different range metadata can lead to different optimizations in later
passes, possibly breaking the semantics of the merged function. So range
metadata must be taken into consideration when comparing Load
instructions.
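A hedged illustration (hypothetical IR): these two functions are no longer
considered equivalent, since later passes may exploit the differing ranges.
define i32 @f(i32* %p) {
  %v = load i32* %p, align 4, !range !0
  ret i32 %v
}
define i32 @g(i32* %p) {
  %v = load i32* %p, align 4, !range !1
  ret i32 %v
}
!0 = metadata !{i32 0, i32 10}
!1 = metadata !{i32 0, i32 1000}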
Thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211391 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorize them as vector shuffles if that is profitable.
These vector-shuffle patterns can later be converted to instructions such as addsubpd on X86.
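A hedged sketch of the idea (hypothetical IR): alternating subtract/add on
even/odd lanes can be expressed as one fsub, one fadd, and a shuffle picking
even lanes from the former and odd lanes from the latter.
%sub = fsub <2 x double> %a, %b
%add = fadd <2 x double> %a, %b
%res = shufflevector <2 x double> %sub, <2 x double> %add, <2 x i32> <i32 0, i32 3>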
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211339 91177308-0d34-0410-b5e6-96231b3b80d8
We would previously put dllimport variables in switch lookup tables, which
doesn't work because the address cannot be used in a constant initializer.
This is basically the same problem that we have in PR19955.
Putting TLS variables in switch tables also doesn't work, because the
address of such a variable is not constant.
Differential Revision: http://reviews.llvm.org/D4220
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211331 91177308-0d34-0410-b5e6-96231b3b80d8
* Find factorization opportunities using identity values.
* Find factorization opportunities by treating shl(X, C) as mul(X, shl(1, C))
  (see the sketch below).
* Keep the NSW flag while simplifying instructions using factorization.
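A hedged example of the shl-as-mul case (hypothetical IR): (x << 2) + x*5
reassociates to x*9, since x << 2 is x*4.
%shl = shl i32 %x, 2        ; x * 4
%mul = mul i32 %x, 5
%add = add i32 %shl, %mul   ; reassociates to mul i32 %x, 9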
This fixes PR19263.
Differential Revision: http://reviews.llvm.org/D3799
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211261 91177308-0d34-0410-b5e6-96231b3b80d8
InstCombineMulDivRem has:
// Canonicalize (X+C1)*CI -> X*CI+C1*CI.
InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z) iff W == Y
These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.
The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.
To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.
This fixes PR20079.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211258 91177308-0d34-0410-b5e6-96231b3b80d8
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211247 91177308-0d34-0410-b5e6-96231b3b80d8
Multiplication by an integer with a number of trailing zero bits leaves
the same number of lower bits of the result initialized to zero.
This change makes MSan take this into account in the case of multiplication by
a compile-time constant.
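A hedged sketch of the propagation (hypothetical IR; %sx stands for the shadow
of %x): multiplying by 8 zeroes the low three bits, so the shadow shifts too.
%r  = mul i32 %x, 8
%sr = shl i32 %sx, 3   ; low three bits of the shadow become defined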
We don't handle the general, non-constant, case because
(a) it's not going to be cheap (computation-wise);
(b) multiplication by a partially uninitialized value in user code is
a bad idea anyway.
The constant case must be handled because it can arise from LLVM optimizing
completely valid user code, as the test case in compiler-rt demonstrates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211092 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove that "add a, b" does not overflow
unsigned, and thus convert it to "add nuw a, b".
Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll
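A hedged illustration (hypothetical IR): with both sign bits clear, each
operand is at most 2^31 - 1, so the sum is at most 2^32 - 2 and cannot wrap.
%a = and i32 %x, 2147483647   ; sign bit known zero
%b = lshr i32 %y, 1           ; sign bit known zero
%s = add i32 %a, %b           ; provably no unsigned overflow: add nuw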
Test Plan: make check-all
Reviewers: eliben, rafael, meheff, chandlerc
Reviewed By: chandlerc
Subscribers: chandlerc, llvm-commits
Differential Revision: http://reviews.llvm.org/D4144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211084 91177308-0d34-0410-b5e6-96231b3b80d8
r199771 accidentally broke the logic that makes sure that SROA only splits
loads on byte boundaries. If such a split happens, some bits get lost
when reassembling loads of wider types, causing data corruption.
Move the width check up to reject such splits early, avoiding the
corruption. Fixes PR19250.
Patch by: Björn Steinbrink <bsteinbr@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211082 91177308-0d34-0410-b5e6-96231b3b80d8
[This is resubmitting r210721, which was reverted due to suspected breakage
which turned out to be unrelated].
Some extra review comments were addressed. See D4090 and D4147 for more details.
The Clang change that produces this metadata was committed in r210667
Patch by Mark Heffernan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211076 91177308-0d34-0410-b5e6-96231b3b80d8
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one case's range of values.
So, if we have a program with something like this:
switch (a) {
case 0:
do0();
break;
case 1:
do1();
break;
case 2:
do2();
break;
default:
break;
}
the code produced is something like this:
if (a < 1) {
if (a == 0) {
do0();
}
} else {
if (a < 2) {
if (a == 1) {
do1();
}
} else {
if (a == 2) {
do2();
}
}
}
This code is inefficient because the check (a == 1) guarding do1() is
not needed: having failed the initial (a < 1) test, we know (a >= 1),
and together with (a < 2) this already implies (a == 1), without an
extra basic block spawned to check it.
The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks the equality with the case
value/range, the algorithm can decide whether that block is really needed
based on the already checked bounds.
For example, the above with "a = 1" would work like this:
The bounds start as LB: NONE, UB: NONE.
As (a < 1) is emitted, the bounds on the else path become LB: 1, UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After emitting the check (a < 2), the bounds in the if branch become
LB: 1, UB: 1. This is because knowing that "a" is smaller than 2 makes
the upper bound 2 - 1 = 1.
When it is time to emit the leaf block for "case 1:" we notice that 1
fits exactly between the LB and UB, which means that if we arrive at
that block there is no need to emit a block that checks (a == 1).
Patch by: Marcello Maggioni <hayarms@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211038 91177308-0d34-0410-b5e6-96231b3b80d8
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.
Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
constant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.
Piggyback a minor refactor in InstCombineCasts.cpp.
Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll, and add one new test in
constant-fold-address-space-pointer.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211004 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the GlobalMerge pass from Transforms/Scalar
to CodeGen, because GlobalMerge depends on TargetMachine.
At the same time, the macro INITIALIZE_TM_PASS is moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210951 91177308-0d34-0410-b5e6-96231b3b80d8
Init-order and use-after-return modes can currently be enabled
by runtime flags. use-after-scope mode is not really working at the
moment.
The only problem I see is that users won't be able to disable extra
instrumentation for init-order and use-after-scope with a top-level Clang flag.
But this instrumentation was implicitly enabled for quite a while and
we didn't hear from users hurt by it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210924 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.
As a result, cmpxchg instructions must return a flag indicating
success in addition to the iN value originally loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.
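A hedged usage sketch of the new form (hypothetical IR):
%res = cmpxchg weak i32* %ptr, i32 %expected, i32 %new seq_cst seq_cst
%old = extractvalue { i32, i1 } %res, 0   ; the value loaded
%ok  = extractvalue { i32, i1 } %res, 1   ; 1 if the store succeeded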
At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.
By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.
Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.
Summary for out of tree users:
------------------------------
+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
Enable value forwarding for loads from `calloc()` without an intervening
store.
This change extends GVN to handle the following case:
%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This load is trivially constant zero
%3 = load i32* %2, align 4
This is analogous to the handling for `malloc()` in the same places.
`malloc()` returns `undef`; `calloc()` returns a zero value. Note that
it is correct to return zero even for out of bounds GEPs since the
result of such a GEP would be undefined.
Patch by Philip Reames!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210828 91177308-0d34-0410-b5e6-96231b3b80d8
This is a minimal change to remove the header. I will remove the occurrences
of "using std::error_code" in a followup patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210803 91177308-0d34-0410-b5e6-96231b3b80d8
Pass initialization requires initializing the TargetMachine for
backend-specific passes. This commit creates a new macro INITIALIZE_TM_PASS
to simplify this kind of initialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210641 91177308-0d34-0410-b5e6-96231b3b80d8
This commit improves the global merge pass and supports global symbol merging.
Global symbol merging is not enabled by default. For AArch64, we need some
more backend fixes to make it really benefit ADRP CSE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210640 91177308-0d34-0410-b5e6-96231b3b80d8
never be true in a well-defined context. The checking for null pointers
has been moved into the caller logic so it does not rely on undefined behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210497 91177308-0d34-0410-b5e6-96231b3b80d8
Instructions from __nodebug__ functions don't have file:line
information even when inlined into functions without __nodebug__. As a
result, intrinsics (SSE and other) from the <*intrin.h> clang headers _never_
have file:line information.
With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.
Fixes PR19001.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210459 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a crash on MMX intrinsics, as well as a corner case in handling of
all unsigned pack intrinsics.
PR19953.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210454 91177308-0d34-0410-b5e6-96231b3b80d8
For each array index that is in the form of zext(a), convert it to sext(a)
if we can prove zext(a) <= max signed value of typeof(a). The conversion
helps to split zext(x + y) into sext(x) + sext(y).
Reviewed in http://reviews.llvm.org/D4060
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210444 91177308-0d34-0410-b5e6-96231b3b80d8
zext(a + b) != zext(a) + zext(b), even if a + b >= 0 && b >= 0.
For example, with a = i4 0b1111 and b = i4 0b0001:
zext (a + b) to i8 = zext 0b0000 to i8 = 0b00000000
(zext a to i8) + (zext b to i8) = 0b00001111 + 0b00000001 = 0b00010000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210439 91177308-0d34-0410-b5e6-96231b3b80d8
The reverted messages were:
"PR19753: Optimize comparisons with "ashr exact" of a constant."
"Added support to optimize comparisons with "lshr exact" of a constant."
They were not correctly handling signed/unsigned operation differences,
causing PR19958.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210393 91177308-0d34-0410-b5e6-96231b3b80d8
addrspacecast X addrspace(M)* to Y addrspace(N)*
-->
bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*
Update all affected tests and add several new tests in addrspacecast.ll.
This patch is based on http://reviews.llvm.org/D2186 (authored by Matt
Arsenault) with fixes and more tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210375 91177308-0d34-0410-b5e6-96231b3b80d8
If we have common uses on separate paths in the tree, process the one with greater common depth first.
This makes sure that we do not assume we need to extract a load when it is actually going to be part of a vectorized tree.
Review: http://reviews.llvm.org/D3800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210310 91177308-0d34-0410-b5e6-96231b3b80d8
Most issues involve mishandling of s/zext.
Fixes:
1. When rebuilding new indices, s/zext should be distributed to
sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5, but not
sext(a + b) + 5. This also affects the logic of recursively looking for a
constant offset; we need to include s/zext in the context of the search.
2. Function find should return the bitwidth of the constant offset instead of
always sign-extending it to i64.
3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP
indices to pointer-size before computing the address. Therefore, gep base,
zext(a + b) != gep base, a + b
Improvements:
1. Add an optimization for splitting sext(a + b): if a + b is proven
non-negative (e.g., used as an index of an inbounds GEP) and one of a, b is
non-negative, sext(a + b) = sext(a) + sext(b)
2. Function Distributable checks whether both sext and zext can be distributed
to the operands of a binary operator. This helps us split zext(sext(a + b)) into
zext(sext(a)) + zext(sext(b)) when a + b does not overflow, signed or unsigned.
Refactoring:
Merge some common logic of handling add/sub/or in find.
Testing:
Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes
we made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210291 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector. This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses. This is represented with a
ppc_altivec_vperm intrinsic function.
The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.
This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.
The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210282 91177308-0d34-0410-b5e6-96231b3b80d8
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.
This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210280 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements two things:
1. If we know one number is positive and another is negative, we return true,
as signed addition of two oppositely signed numbers will never overflow.
2. Implemented a TODO: if one of the operands has only one non-zero bit, and
the other operand has a known-zero bit in a more significant place
(not including the sign bit), the ripple may go up to and fill the zero, but
won't change the sign, e.g., (x & ~4) + 1.
We make sure that we are ignoring 0 at the MSB.
Patch by Suyog Sarda.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210186 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.
This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like
@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
i32 ptrtoint (i32* @bar to i32)) to i32*)
An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).
Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just use "const char*" for that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210062 91177308-0d34-0410-b5e6-96231b3b80d8
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid misunderstanding it
again in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210052 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r210029.
It was not correctly handling cases where LHS and RHS had multiple but different
sign bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210048 91177308-0d34-0410-b5e6-96231b3b80d8
Instrumentation passes now use the attributes
address_safety/thread_safety/memory_safety, which are added by the Clang frontend.
Clang parses the blacklist file and adds the attributes accordingly.
Currently the blacklist is still used in the ASan module pass to disable instrumentation
for certain global variables. We should fix this as well by collecting the
set of globals we're going to instrument in Clang and passing it to ASan
in metadata (as we already do for dynamically-initialized globals and init-order
checking).
This change also removes -tsan-blacklist and -msan-blacklist LLVM commandline
flags in favor of -fsanitize-blacklist= Clang flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210038 91177308-0d34-0410-b5e6-96231b3b80d8
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing
Differential Revision: http://reviews.llvm.org/D3777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210006 91177308-0d34-0410-b5e6-96231b3b80d8
Handle "X + ~X" -> "-1" in the function Value *Reassociate::OptimizeAdd(Instruction *I, SmallVectorImpl<ValueEntry> &Ops);
This patch implements:
TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1".
Patch by Rahul Jain!
Differential Revision: http://reviews.llvm.org/D3835
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209973 91177308-0d34-0410-b5e6-96231b3b80d8
This helps turn more branches into selects. On R600,
vectors are cheap and anything that helps
remove branches is very good.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209914 91177308-0d34-0410-b5e6-96231b3b80d8
original fix would actually trigger the *exact* same crasher as the
original bug for a different reason. Awesomesauce.
Working on test cases now, but wanted to get bots healthier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209860 91177308-0d34-0410-b5e6-96231b3b80d8
across PHI nodes. The code was computing the Idxs from the 'GEP'
variable's indices when what it wanted was Op1's indices. This caused an
ASan heap-overflow for me that pinpointed the issue when Op1 had more
indices than GEP did. =] I'll let Louis add a specific test case for
this if he wants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209857 91177308-0d34-0410-b5e6-96231b3b80d8
The loop vectorizer uses backedge-taken-count + 1 as the loop iteration count.
If this expression overflows, the generated code was invalid.
In case of overflow the code now jumps to the scalar loop.
Fixes PR17288.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209854 91177308-0d34-0410-b5e6-96231b3b80d8
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is a PHI in between the two GEPs:
GEP1--\
|-->PHI1-->GEP3
GEP2--/
This patch checks to see if GEP1 and GEP2 are similar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):
GEP1--\ --\ --\
|-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123
GEP2--/ --/ --/
This also breaks certain use chains that are preventing GEP->GEP merges that
the existing instcombine would merge otherwise.
Tests included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209843 91177308-0d34-0410-b5e6-96231b3b80d8
During loop unroll, loop exits from the current loop may end up in a different
outer loop. This requires re-forming LCSSA recursively, one level down from
the outermost loop where loop exits land during unrolling. This fixes PR18861.
Differential Revision: http://reviews.llvm.org/D2976
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209796 91177308-0d34-0410-b5e6-96231b3b80d8
Clang knows about the sanitizer blacklist and it makes no sense to
add a global to the list of llvm.asan.dynamically_initialized_globals if it
will be blacklisted in the instrumentation pass anyway. Instead, we should
do as much blacklisting as possible (if not all) in the frontend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209790 91177308-0d34-0410-b5e6-96231b3b80d8
Don't assume that dynamically initialized globals are all initialized from
the _GLOBAL__<module_name>I_ function. Instead, scan llvm.global_ctors and
insert poison/unpoison calls to each function there.
Patch by Nico Weber!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209780 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209776 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r209746.
It looks like it is causing a crash while building libcxx. I am trying to get a
reduced testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209762 91177308-0d34-0410-b5e6-96231b3b80d8
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is a PHI in between the two GEPs:
GEP1--\
|-->PHI1-->GEP3
GEP2--/
This patch checks to see if GEP1 and GEP2 are similar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):
GEP1--\ --\ --\
|-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123
GEP2--/ --/ --/
This also breaks certain use chains that are preventing GEP->GEP merges that
the existing instcombine would merge otherwise.
Tests included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209755 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements two things:
1. If we know one number is positive and another is negative, we return true,
as signed addition of two oppositely signed numbers will never overflow.
2. Implemented a TODO: if one of the operands has only one non-zero bit, and
the other operand has a known-zero bit in a more significant place
(not including the sign bit), the ripple may go up to and fill the zero, but
won't change the sign, e.g., (x & ~4) + 1.
We make sure that we are ignoring 0 at the MSB.
Patch by Suyog Sarda.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209746 91177308-0d34-0410-b5e6-96231b3b80d8
This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can
extract a constant offset from "s/zext and/or/xor A, B".
Added a new test @ext_or to verify this enhancement.
While refactoring the code, I also extracted some common logic into the
function Distributable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209670 91177308-0d34-0410-b5e6-96231b3b80d8
Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio.
Fixed the argument order to select (the mask semantics of blendv* are the
inverse of select) and fixed the tests.
Added parentheses to the assert condition.
Ran clang-format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209667 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Implemented an InstCombine transformation that takes a blendv* intrinsic
call and translates it into an IR select, if the mask is constant.
This will eventually get lowered into blends with immediates if possible,
or pblendvb (with an option to further optimize if we can transform the
pblendvb into a blend+immediate instruction, depending on the selector).
It will also enable optimizations by the IR passes, which otherwise give up
at the sight of the intrinsic.
Both the transformation and the lowering of its result to asm got shiny
new tests.
The transformation is a bit convoluted because of blendvp[sd]'s
definition:
Its mask is a floating point value! This forces us to convert it and get
the highest bit. I suppose this happened because the mask has type
__m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.
I will send an email to llvm-dev to discuss if we want to change this or
not.
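A hedged sketch of the transformation (hypothetical IR): with a constant mask,
each lane's sign bit decides the select condition.
declare <4 x float> @llvm.x86.sse41.blendvps(<4 x float>, <4 x float>, <4 x float>)
%r = call <4 x float> @llvm.x86.sse41.blendvps(<4 x float> %a, <4 x float> %b,
       <4 x float> <float -1.0, float 0.0, float -1.0, float 0.0>)
; becomes (lanes whose mask sign bit is set take %b):
%r = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x float> %b, <4 x float> %a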
Reviewers: grosbach, delena, nadav
Differential Revision: http://reviews.llvm.org/D3859
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209643 91177308-0d34-0410-b5e6-96231b3b80d8
and via the command line, mirroring similar functionality in LoopUnroll. In
situations where clients used custom unrolling thresholds, their intent could
previously be foiled by LoopRotate having a hardcoded threshold.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209617 91177308-0d34-0410-b5e6-96231b3b80d8
This extension point allows adding passes that perform peephole optimizations
similar to the instruction combiner. These passes will be inserted after
each instance of the instruction combiner pass.
Differential Revision: http://reviews.llvm.org/D3905
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209595 91177308-0d34-0410-b5e6-96231b3b80d8
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.
"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.
This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209577 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed a TODO in r207783:
add the extracted constant offset using a GEP instead of ugly
ptrtoint+add+inttoptr. Using GEPs simplifies future optimizations and makes the IR
easier to understand.
Updated all affected tests, and added a new test in split-gep.ll to cover a
corner case where emitting uglygep is necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209537 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.
-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.
-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.
The patch also:
1- Adds support in the inliner for the two new remarks and a
test case.
2- Moves emitOptimizationRemark* functions to the llvm namespace.
3- Adds an LLVMContext argument instead of making them member functions
of LLVMContext.
Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3682
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209442 91177308-0d34-0410-b5e6-96231b3b80d8
This commit introduces a canonical representation for the formulae.
Basically, as soon as a formula has more than one base register, the scaled
register field is used for one of them. The register put into the scaled
register field is preferably loop-variant.
The commit refactors how the formulae are built in order to produce such
representation.
This yields a more accurate, but still perfectible, cost model.
<rdar://problem/16731508>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209230 91177308-0d34-0410-b5e6-96231b3b80d8