This patch adds a test for MIPS64R6 relocations, it corrects check
expressions for R_MIPS_26 and R_MIPS_PC16 relocations in MIPS64R2 test, and
it adds run for big endian in MIPS64R2 test.
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D11217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246311 91177308-0d34-0410-b5e6-96231b3b80d8
When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.
Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed. To fix this, attempt to use FindBetterChain
on any possibly neighboring stores in visitSTORE.
Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store would be visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246307 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch removes two remaining places where pointer value comparisons
are used to order functions: comparing range annotation metadata, and comparing
block address constants. (These are both rare cases, and so no actual
non-determinism was observed from either case).
The fix for range metadata is simple: the annotation always consists of a pair
of integers, so we just order by those integers.
The fix for block addresses is more subtle. Two constants are the same if they
are the same basic block in the same function, or if they refer to corresponding
basic blocks in each respective function. Note that in the first case, merging
is trivially correct. In the second, the correctness of merging relies on the
fact that the the values of block addresses cannot be compared. This change is
actually an enhancement, as these functions could not previously be merged (see
merge-block-address.ll).
There is still a problem with cross function block addresses, in that constants
pointing to a basic block in a merged function is not updated.
This also more robustly compares floating point constants by all fields of their
semantics, and fixes a dyn_cast/cast mixup.
Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers llvm-commits
Differential revision: http://reviews.llvm.org/D12376
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246305 91177308-0d34-0410-b5e6-96231b3b80d8
handle more allocas with loads past the end of the alloca.
I suspect there are some related crashers with slightly different
patterns, but I'll fix those and add test cases as I find them.
Thanks to David Majnemer for the excellent test case reduction here.
Made this super simple to debug and fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246289 91177308-0d34-0410-b5e6-96231b3b80d8
This patch includes a fix for a llvm-readobj test. With this patch,
the tool does no longer print out COFF headers for the short import
file, but that's probably desirable because the header for the short
import file is dummy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246283 91177308-0d34-0410-b5e6-96231b3b80d8
COFF short import files are special kind of files that contains only
DLL-exported symbol names. That's different from object files because
it has no data except symbol names.
This change implements a SymbolicFile interface for the short import
files so that symbol names can be accessed through that interface.
llvm-ar is now able to read the file and create symbol table entries
for short import files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246276 91177308-0d34-0410-b5e6-96231b3b80d8
FIXME:
- Introduce explicit mapping.
- Investigate crash on win32. Could we introduce crash handler in examples?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246267 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Change the coloring algorithm in WinEHPrepare to visit a funclet's exits
in its parents' contexts and so properly classify the continuations of
nested funclets.
Also change the placement of cloned blocks to be deterministic and to
maintain the relative order of each funclet's blocks.
Add a lit test showing various patterns that require cloning, the last
several of which don't have CHECKs yet because they require cloning
entire funclets which is NYI.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12353
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246245 91177308-0d34-0410-b5e6-96231b3b80d8
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
is constant we can change all variables with constants in the same BasicBlock
http://reviews.llvm.org/D11918
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246243 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to this patch, we hadn't been marking StratifiedSets with the
appropriate StratifiedAttrs when handling the result of no-args call
instructions. This caused us to report NoAlias when handed, for
example, an escaped alloca and a result from an opaque function. Now we
properly mark the return value of said functions.
Thanks again to Chandler, Richard, and Nick for pinging me about this.
Differential review: http://reviews.llvm.org/D12408
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246240 91177308-0d34-0410-b5e6-96231b3b80d8
more than 2 instructions.
I introduced this regression a while back and did not noticed it because I
somehow forgot to push the initial test cases for the pass!
Fix that as well!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246239 91177308-0d34-0410-b5e6-96231b3b80d8
llvm::splitCodeGen is a function that implements the core of parallel LTO
code generation. It uses llvm::SplitModule to split the module into linkable
partitions and spawning one code generation thread per partition. The function
produces multiple object files which can be linked in the usual way.
This has been threaded through to LTOCodeGenerator (and llvm-lto for testing
purposes). Separate patches will add parallel LTO support to the gold plugin
and lld.
Differential Revision: http://reviews.llvm.org/D12260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246236 91177308-0d34-0410-b5e6-96231b3b80d8
We can now run 32-bit programs with empty catch bodies. The next step
is to change PEI so that we get funclet prologues and epilogues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246235 91177308-0d34-0410-b5e6-96231b3b80d8
Any call which is side effect free is trivially OK to speculate. We
already had similar logic in EarlyCSE and GVN but we were missing it
from isSafeToSpeculativelyExecute.
This fixes PR24601.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246232 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR24602: r245689 introduced an unguarded use of
SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails
because of opaque (hoisted) constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246217 91177308-0d34-0410-b5e6-96231b3b80d8
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.
Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246194 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the analysis diagnostics produced when loops with
floating-point recurrences or memory operations are identified. The new messages
say "cannot prove it is safe to reorder * operations; allow reordering by
specifying #pragma clang loop vectorize(enable)". Depending on the type of
diagnostic the message will include additional options such as ffast-math or
__restrict__.
This patch also allows the vectorize(enable) pragma to override the low pointer
memory check threshold. When the hint is given a higher threshold is used.
See the clang patch for the options produced for each diagnostic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246187 91177308-0d34-0410-b5e6-96231b3b80d8
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.
Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246186 91177308-0d34-0410-b5e6-96231b3b80d8
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.
Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246158 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike scalar operations, we can perform vector operations on element types that
are smaller than the native integer types. We type-promote scalar operations if
they are smaller than a native type (e.g., i8 arithmetic is promoted to i32
arithmetic on Arm targets). This patch detects and removes type-promotions
within the reduction detection framework, enabling the vectorization of small
size reductions.
In the legality phase, we look through the ANDs and extensions that InstCombine
creates during promotion, keeping track of the smaller type. In the
profitability phase, we use the smaller type and ignore the ANDs and extensions
in the cost model. Finally, in the code generation phase, we truncate the result
of the reduction to allow InstCombine to rewrite the entire expression in the
smaller type.
This fixes PR21369.
http://reviews.llvm.org/D12202
Patch by Matt Simpson <mssimpso@codeaurora.org>!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246149 91177308-0d34-0410-b5e6-96231b3b80d8
Globals in address spaces other than one may have 0 as a valid address,
so we should not assume that they can be null.
Reviewed by Philip Reames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246137 91177308-0d34-0410-b5e6-96231b3b80d8
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence.
We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence.
Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences.
Differential Revision: http://reviews.llvm.org/D11434
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246134 91177308-0d34-0410-b5e6-96231b3b80d8
When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints.
I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead.
Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them.
Differential Revision: http://reviews.llvm.org/D12004
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246133 91177308-0d34-0410-b5e6-96231b3b80d8
This patch ensures that every analysis diagnostic produced by the vectorizer
will be printed if the loop has a vectorization hint on it. The condition has
also been improved to prevent printing when a disabling hint is specified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246132 91177308-0d34-0410-b5e6-96231b3b80d8
As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine.
Note: Duplicate case values are disallowed by the LangRef and verifier.
Differential Revision: http://reviews.llvm.org/D11995
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246125 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly.
Differential Revision: http://reviews.llvm.org/D12166#fae6eca7
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246104 91177308-0d34-0410-b5e6-96231b3b80d8