sprintf doesn't read or copy the terminating null byte from its string
operands. sprintf will append its own after processing all of the
format specifiers.
This fixes PR27526.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267580 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of using the maximum IR weight as the basic block weight, this patch uses a voting algorithm to find the most likely weight for the basic block. This can effectively avoid cases where some IR instructions are annotated incorrectly due to code motion in the profiled binary.
This patch also updates the propagate.ll unit test to include discriminators in the input file so that it is testing something meaningful.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19301
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267519 91177308-0d34-0410-b5e6-96231b3b80d8
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.
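For illustration, a minimal IR sketch (hypothetical; in practice the loads would sit inside the parallel loop the metadata refers to):
```
define i32 @diamond(i1 %c, i32* %p) {
entry:
  br i1 %c, label %a, label %b
a:
  %va = load i32, i32* %p, !llvm.mem.parallel_loop_access !0
  br label %m
b:
  %vb = load i32, i32* %p, !llvm.mem.parallel_loop_access !0
  br label %m
m:
  ; when SimplifyCFG merges the two identical loads, the surviving
  ; load may keep the !llvm.mem.parallel_loop_access attachment
  %v = phi i32 [ %va, %a ], [ %vb, %b ]
  ret i32 %v
}

!0 = distinct !{!0}
```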
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267515 91177308-0d34-0410-b5e6-96231b3b80d8
I really thought we were doing this already, but we were not. Given this input:
void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}
we did not vectorize the loop. Even with "assume_safety", the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent dereferenceability) was not elided.
One subtlety: As implemented, it will still prefer to use a masked-load
intrinsic (given target support) over the speculated load. The choice here
seems architecture-specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.
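For the d[i] load above, the two options look roughly like this, a 4-wide sketch with illustrative names (%mask is p[i] != 0):
```
declare <4 x i32> @llvm.masked.load.v4i32(<4 x i32>*, i32, <4 x i1>, <4 x i32>)

define <4 x i32> @cond_load(<4 x i1> %mask, <4 x i32>* %d, <4 x i32> %res) {
  ; option 1 would speculate a plain vector load of %d; option 2, as
  ; implemented, prefers a masked load under the select's mask:
  %dv = call <4 x i32> @llvm.masked.load.v4i32(<4 x i32>* %d, i32 4,
                                               <4 x i1> %mask, <4 x i32> undef)
  %add = add <4 x i32> %res, %dv
  %out = select <4 x i1> %mask, <4 x i32> %add, <4 x i32> %res
  ret <4 x i32> %out
}
```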
The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if-conversion is okay.
Differential Revision: http://reviews.llvm.org/D19512
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267514 91177308-0d34-0410-b5e6-96231b3b80d8
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267491 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is what was the "instcombine" portion of D14185, with an additional
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll).
The patch causes instcombine to replace sequences of extractelement-insertvalue-store
that act essentially like a bitcast followed by a store with the equivalent bitcast and store.
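A hedged sketch of the kind of sequence targeted (illustrative, not taken from the new test):
```
define void @store_pair(<2 x i64> %v, [2 x i64]* %p) {
  %e0 = extractelement <2 x i64> %v, i32 0
  %e1 = extractelement <2 x i64> %v, i32 1
  %a0 = insertvalue [2 x i64] undef, i64 %e0, 0
  %a1 = insertvalue [2 x i64] %a0, i64 %e1, 1
  ; this acts like a bitcast followed by a store, i.e. roughly:
  ;   %q = bitcast [2 x i64]* %p to <2 x i64>*
  ;   store <2 x i64> %v, <2 x i64>* %q
  store [2 x i64] %a1, [2 x i64]* %p
  ret void
}
```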
Differential Revision: http://reviews.llvm.org/D14260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267482 91177308-0d34-0410-b5e6-96231b3b80d8
The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is exactly where you really want SRA, so that the elements that can be eliminated are!).
There seems to be no reason why we can't SRA constants too, so let's do it.
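One hedged example of a constant global that can't simply be folded into its uses but can still be SRA'd (names illustrative):
```
; the i32 element's address escapes, so @g as a whole can't be deleted,
; but SRA can split it so the dead array element is eliminated
@g = internal constant { i32, [64 x i8] } { i32 42, [64 x i8] zeroinitializer }

declare void @use(i32*)

define void @f() {
  call void @use(i32* getelementptr inbounds ({ i32, [64 x i8] }, { i32, [64 x i8] }* @g, i32 0, i32 0))
  ret void
}
```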
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267393 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
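For example (sketch), when only lane 0 of the result is used:
```
declare <4 x float> @llvm.x86.sse.div.ss(<4 x float>, <4 x float>)

define float @div_lane0(<4 x float> %a, <4 x float> %b) {
  %d = call <4 x float> @llvm.x86.sse.div.ss(<4 x float> %a, <4 x float> %b)
  ; only element 0 is demanded, so this reduces to a scalar fdiv
  ; of the two lane-0 elements
  %e = extractelement <4 x float> %d, i32 0
  ret float %e
}
```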
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267359 91177308-0d34-0410-b5e6-96231b3b80d8
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass-through elements.
3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass-through elements) directly; see the sketch below.
We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass-through patterns).
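A sketch of case 3: if only an upper lane is demanded, the pass-through operand can be used directly:
```
declare <4 x float> @llvm.x86.sse.add.ss(<4 x float>, <4 x float>)

define float @lane2(<4 x float> %a, <4 x float> %b) {
  ; addss only computes lane 0; lanes 1-3 pass through from %a, so
  ; this simplifies to extractelement <4 x float> %a, i32 2
  %r = call <4 x float> @llvm.x86.sse.add.ss(<4 x float> %a, <4 x float> %b)
  %e = extractelement <4 x float> %r, i32 2
  ret float %e
}
```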
Differential Revision: http://reviews.llvm.org/D19318
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267357 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
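Sketch of the roundss operand semantics in case 2 (the immediate value is illustrative):
```
declare <4 x float> @llvm.x86.sse41.round.ss(<4 x float>, <4 x float>, i32)

define <4 x float> @round_ss(<4 x float> %a, <4 x float> %b) {
  ; lane 0 of the result comes from rounding %b's lane 0; lanes 1-3 are
  ; taken from %a, so only %b's lane 0 and %a's upper lanes are demanded
  %r = call <4 x float> @llvm.x86.sse41.round.ss(<4 x float> %a, <4 x float> %b, i32 4)
  ret <4 x float> %r
}
```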
Differential Revision: http://reviews.llvm.org/D17490
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267356 91177308-0d34-0410-b5e6-96231b3b80d8
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*. It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.
Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actual DIType. The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.
This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata. Although as
of this commit they aren't serving a useful purpose, there are patches
under review to reuse them for CodeView support.
The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267296 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.
LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
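A hedged usage sketch; the @llvm.load.relative.i32 spelling and the i8* signature are assumptions here:
```
declare i8* @llvm.load.relative.i32(i8*, i32)

define i8* @table_entry(i8* %table, i32 %index) {
  ; load the 32-bit relative offset stored at %table + 4*%index,
  ; then add %table to it to form the absolute pointer
  %byteoff = mul i32 %index, 4
  %entry = call i8* @llvm.load.relative.i32(i8* %table, i32 %byteoff)
  ret i8* %entry
}
```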
Differential Revision: http://reviews.llvm.org/D18367
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267223 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code turned out to be completely correct when audited. Thus, only minor code changes and a couple of new tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267215 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.
Patch by Anna Thomas!
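A minimal sketch of the pattern (using malloc as the allocation function):
```
declare noalias i8* @malloc(i64)

define i1 @never_equal() {
  %a = call noalias i8* @malloc(i64 4)
  %b = call noalias i8* @malloc(i64 4)
  ; two distinct allocations made within the function can be
  ; folded to compare unequal
  %c = icmp eq i8* %a, %b
  ret i1 %c
}
```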
Reviewers: majnemer, reames, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19390
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267214 91177308-0d34-0410-b5e6-96231b3b80d8
Extend the type canonicalization logic to work for unordered atomic loads and stores.
Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before. Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered. If you see problems, feel free to revert this change, but please make sure you collect a test case.
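For instance (sketch), an unordered atomic float load can now be canonicalized like its non-atomic counterpart:
```
define float @load_float(float* %p) {
  ; may be canonicalized to an atomic i32 load of the bitcast pointer,
  ; with the loaded value bitcast back to float
  %v = load atomic float, float* %p unordered, align 4
  ret float %v
}
```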
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267210 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This change shortens a memset when the beginning of the memset region is overwritten by later stores.
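A sketch of the pattern (sizes illustrative):
```
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)

define void @f(i64* %p) {
  %p8 = bitcast i64* %p to i8*
  call void @llvm.memset.p0i8.i64(i8* %p8, i8 0, i64 32, i32 8, i1 false)
  ; the store overwrites the first 8 bytes of the memset, so the memset
  ; can be shortened to the trailing 24 bytes starting at %p8 + 8
  store i64 -1, i64* %p, align 8
  ret void
}
```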
Reviewers: hfinkel, eeckstein, dberlin, mcrosier
Subscribers: mgrang, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18906
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267197 91177308-0d34-0410-b5e6-96231b3b80d8
Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267196 91177308-0d34-0410-b5e6-96231b3b80d8
In the next change, I am generalizing the function
findStringMetadataForLoop and I want to make sure I don't break this.
Looks like there was no coverage for this so far.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267182 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively.
I've simplified the existing tests and added additional tests to cover the new
cases mentioned above. I've also added tests for all the cases where the
first compare doesn't imply anything about the second compare.
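A sketch of the sgt case (the second compare becomes known given the first):
```
define i1 @implied(i32 %a, i32 %b) {
entry:
  %gt = icmp sgt i32 %a, %b
  br i1 %gt, label %taken, label %not
taken:
  ; %a sgt %b implies %a sge %b, so %ge is known true here
  %ge = icmp sge i32 %a, %b
  ret i1 %ge
not:
  ret i1 false
}
```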
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267171 91177308-0d34-0410-b5e6-96231b3b80d8
A followup commit will replace these tests with simplified and more inclusive
tests; the diff would be unreadable if this were done in a single commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267170 91177308-0d34-0410-b5e6-96231b3b80d8
We take the intersection of overflow flags while CSE'ing.
This permits us to treat two instructions with different overflow
behavior as replaceable.
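Sketch: the surviving instruction keeps only the flags common to both:
```
define i32 @cse(i32 %a, i32 %b) {
  %x = add nsw i32 %a, %b
  %y = add i32 %a, %b
  ; %y can be CSE'd into %x, but the surviving add must drop nsw:
  ; the intersection of {nsw} and {} is empty
  %r = mul i32 %x, %y
  ret i32 %r
}
```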
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267153 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When optimizing PHIs whose inputs are floating point binary
operators, we preserve all IR flags except the fast-math
flags.
This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by taking the AND of the IR flags of
all inputs to the PHI, which also handles the fast-math
flags.
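A sketch of the effect on fast-math flags (the flags are illustrative):
```
define float @phi_fmf(i1 %c, float %x, float %y) {
entry:
  br i1 %c, label %a, label %b
a:
  %s0 = fadd fast float %x, %y
  br label %m
b:
  %s1 = fadd nnan float %x, %y
  br label %m
m:
  ; if the two fadds are combined through the phi, the result keeps
  ; only the AND of their flags: just nnan here
  %r = phi float [ %s0, %a ], [ %s1, %b ]
  ret float %r
}
```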
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267139 91177308-0d34-0410-b5e6-96231b3b80d8
We assumed that flags were only present on binary operators. This is
not true, they may also be present on calls and fcmps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267113 91177308-0d34-0410-b5e6-96231b3b80d8
EarlyCSE had inconsistent behavior with regard to flagged instructions:
- In some cases, it would pessimize if the available instruction had
different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
which had no flags with an instruction which has flags.
Fix this by making our flag handling consistent, utilizing
andIRFlags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267111 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If we know that a pointer allocated within a function does not escape,
we can fold away comparisons that are done with global pointers.
Patch by Anna Thomas!
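A minimal sketch (again with malloc as the allocation function):
```
@g = global i32 0

declare noalias i8* @malloc(i64)

define i1 @never_global() {
  %m = call noalias i8* @malloc(i64 4)
  %p = bitcast i8* %m to i32*
  ; %p never escapes this function, so it cannot equal @g and the
  ; compare folds to false
  %c = icmp eq i32* %p, @g
  ret i1 %c
}
```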
Reviewers: reames, majnemer, sanjoy
Subscribers: mgrang, mcrosier, majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D19276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267035 91177308-0d34-0410-b5e6-96231b3b80d8
This builds on r266999, which made FindAvailableLoadedValue do the right thing. Tests included show the newly enabled transforms and those which remain disabled, either due to conservatism or correctness requirements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267006 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing. At the moment, the code added is dead, but separating it makes follow on changes far more obvious.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266999 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
`llvm.guard(false)` always bails out of the current compilation unit, so
we can prune any control flow following it.
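A sketch, assuming the full intrinsic spelling @llvm.experimental.guard with a deopt bundle:
```
declare void @llvm.experimental.guard(i1, ...)

define i32 @f(i32 %x) {
entry:
  ; a guard on false always bails out, so everything that follows is
  ; unreachable and can be pruned
  call void (i1, ...) @llvm.experimental.guard(i1 false) [ "deopt"() ]
  %y = add i32 %x, 1
  ret i32 %y
}
```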
Reviewers: hfinkel, pcc, reames
Subscribers: majnemer, reames, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19245
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266955 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The function importer has already decided what symbols need to be pulled
in. Also, these magically added ones will not be in the export list
for the source module, which can, for instance, confuse the
internalizer.
Reviewers: tejohnson, rafael
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19096
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266948 91177308-0d34-0410-b5e6-96231b3b80d8
No matter what value you OR into A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value toward -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table for the known result of (or A, B) s< A:
```
              A is
           +    -   +/-
B is  +    F    F    F
      -    T    F    ?
     +/-   ?    F    ?
```
The related optimizations are flipping 'slt' to 'sge', which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.
There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.
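For instance (sketch), the base unsigned fold:
```
define i1 @or_uge(i32 %a, i32 %b) {
  ; (or %a, %b) is always UGE %a, so %c folds to true
  %or = or i32 %a, %b
  %c = icmp uge i32 %or, %a
  ret i1 %c
}
```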
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266939 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch prevents importing from (and therefore exporting from) any
module with a "llvm.used" local value. Local values need to be promoted
and renamed when importing, and their presence on the llvm.used variable
indicates that there are opaque uses that won't see the rename. One such
example is a use in inline assembly.
See also the discussion at:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html
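A sketch of the blocking pattern (names illustrative):
```
; a local value referenced from llvm.used; module-level inline asm may
; refer to @local by name and would not see a promotion rename
@local = internal global i32 42
@llvm.used = appending global [1 x i8*] [i8* bitcast (i32* @local to i8*)],
             section "llvm.metadata"
```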
As part of this, move collectUsedGlobalVariables out of Transforms/Utils
and into IR/Module so that it can be used more widely. There are several
other places in LLVM that used copies of this code that can be cleaned
up as a follow on NFC patch.
Reviewers: joker.eph
Subscribers: pcc, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18986
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266877 91177308-0d34-0410-b5e6-96231b3b80d8
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer. I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.
This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.
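Usage sketch (assuming the merged intrinsic is @llvm.thread.pointer returning i8*):
```
declare i8* @llvm.thread.pointer()

define i8* @get_tp() {
  %tp = call i8* @llvm.thread.pointer()
  ret i8* %tp
}
```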
Differential Revision: http://reviews.llvm.org/D19098
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266818 91177308-0d34-0410-b5e6-96231b3b80d8
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.
Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.
Should fix the 32-bit part of PR25526.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266679 91177308-0d34-0410-b5e6-96231b3b80d8