This patch modifies the LiveDebugValues pass to use more efficient set
data structures as outlined in PR26055. Both VarLocSet and VarLocList are
now SparseBitVectors which allows us to perform much faster bitvector
arithmetic on them.
The speedup can be in the order of minutes especially on ASANified code.
The change is not NFC in the assembler output because the inserted
DBG_VALUEs are now sorted by variable and location.
Many thanks to Daniel Berlin for helping design the improved algorithm and
reviewing the patch.
https://llvm.org/bugs/show_bug.cgi?id=26055http://reviews.llvm.org/D20178
rdar://problem/24091200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270776 91177308-0d34-0410-b5e6-96231b3b80d8
Getting accurate locations for loops is important, because those locations are
used by the frontend to generate optimization remarks. Currently, optimization
remarks for loops often appear on the wrong line, often the first line of the
loop body instead of the loop itself. This is confusing because that line might
itself be another loop, or might be somewhere else completely if the body was
inlined function call. This happens because of the way we find the loop's
starting location. First, we look for a preheader, and if we find one, and its
terminator has a debug location, then we use that. Otherwise, we look for a
location on an instruction in the loop header.
The fallback heuristic is not bad, but will almost always find the beginning of
the body, and not the loop statement itself. The preheader location search
often fails because there's often not a preheader, and even when there is a
preheader, depending on how it was formed, it sometimes carries the location of
some preceeding code.
I don't see any good theoretical way to fix this problem. On the other hand,
this seems like a straightforward solution: Put the debug location in the
loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert
llvm.loop metadata with appropriate locations when generating debugging
information. With these changes, our loop remarks have much more accurate
locations.
Differential Revision: http://reviews.llvm.org/D19738
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270771 91177308-0d34-0410-b5e6-96231b3b80d8
Since r268966 the modern Verifier pass defaults to stripping invalid debug info
in nonasserts builds. This patch ports this behavior back to the legacy
Verifier pass as well. The primary motivation is that the clang frontend
accepts bitcode files as input but is still using the legacy pass pipeline.
Background: The problem I'm trying to solve with this sequence of patches is
that historically we've done a really bad job at verifying debug info. We want
to be able to make the verifier stricter without having to worry about breaking
bitcode compatibility with existing producers. For example, we don't necessarily
want IR produced by an older version of clang to be rejected by an LTO link just
because of malformed debug info, and rather provide an option to strip it. Note
that merely outdated (but well-formed) debug info would continue to be
auto-upgraded in this scenario.
http://reviews.llvm.org/D20629
<rdar://problem/26448800>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270768 91177308-0d34-0410-b5e6-96231b3b80d8
As a result of D18634 we no longer infer certain attributes on linkonce_odr
functions at compile time, and may only infer them at LTO time. The readnone
attribute in particular is required for virtual constant propagation (part
of whole-program virtual call optimization) to work correctly.
This change moves the whole-program virtual call optimization pass after
the function attribute inference passes, and enables the attribute inference
passes at opt level 1, so that virtual constant propagation has a chance to
work correctly for linkonce_odr functions.
Differential Revision: http://reviews.llvm.org/D20643
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270765 91177308-0d34-0410-b5e6-96231b3b80d8
They were originally separated to handle the co-recursion between
the ValueMapper and the ValueMaterializer. This recursion does not
exist anymore: the ValueMapper now uses a Worklist and the
ValueMaterializer is scheduling job on the Worklist.
Differential Revision: http://reviews.llvm.org/D20593
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270758 91177308-0d34-0410-b5e6-96231b3b80d8
We have need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270751 91177308-0d34-0410-b5e6-96231b3b80d8
There was a typo in r267758. It caused invalid accesses when
given something like "void @free(...)", as NumParams == 0, and
we then try to look at the 0th parameter.
Turns out, most of these were untested; add both attribute
and missing-prototype checks for all libc libfuncs.
Differential Revision: http://reviews.llvm.org/D20543
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270750 91177308-0d34-0410-b5e6-96231b3b80d8
This is probably correct for all uses except cross-module IR linking,
where we need to move the comdat from the source module to the
destination module.
Fixes PR27870.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D20631
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270743 91177308-0d34-0410-b5e6-96231b3b80d8
While here, convert the logic of the pass to use static function(s).
This is in preparation for porting this pass to the new PM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270734 91177308-0d34-0410-b5e6-96231b3b80d8
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270731 91177308-0d34-0410-b5e6-96231b3b80d8
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826
There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.
Differential Revision: http://reviews.llvm.org/D20601
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270729 91177308-0d34-0410-b5e6-96231b3b80d8
This should actually address PR27855. This results in adding references to the system libs inside generated dylibs so that they get correctly pulled in when linking against the dylib.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270723 91177308-0d34-0410-b5e6-96231b3b80d8
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
Fixes PR19797.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270720 91177308-0d34-0410-b5e6-96231b3b80d8
There aren't any checks with prefix PROMOTE, should be PROMOTE_MOD1
which wasn't being tested (but works as expected).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270719 91177308-0d34-0410-b5e6-96231b3b80d8
searching for external symbols, and fall back to the SymbolResolver::findSymbol
method if the former returns null.
This makes RuntimeDyld behave more like a static linker: Symbol definitions
from within the current module's "logical dylib" will be preferred to
external definitions. We can build on this behavior in the future to properly
support weak symbol handling.
Custom symbol resolvers that override the findSymbolInLogicalDylib method may
notice changes due to this patch. Clients who have not overridden this method
should generally be unaffected, however users of the OrcMCJITReplacement class
may notice changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270716 91177308-0d34-0410-b5e6-96231b3b80d8
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments are
consistent with the function name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270715 91177308-0d34-0410-b5e6-96231b3b80d8
Move the now index-based ODR resolution and internalization routines out
of ThinLTOCodeGenerator.cpp and into either LTO.cpp (index-based
analysis) or FunctionImport.cpp (index-driven optimizations).
This is to enable usage by other linkers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270698 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
**Description**
This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code.
When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call.
`getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`.
Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr.
This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.
**Minimal reproducer:**
```
int foo(int a, int b, int c);
int baz();
void bar()
{
int arr[20];
int i = 0;
for (i = 0; i < 4; ++i)
arr[i] = baz();
for (; i < 20; ++i)
arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}
```
**Clang command line:**
```
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir
```
**Expected result:**
The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated.
Reviewers: sanjoy
Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D20058
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270695 91177308-0d34-0410-b5e6-96231b3b80d8
There's already a ARMTargetParser,now adding a similar one for aarch64.
so we can use it to do ARCH/CPU/FPU parsing in clang and llvm, instead of
string comparison.
Patch by Jojo Ma.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270687 91177308-0d34-0410-b5e6-96231b3b80d8