In some places, like InstCombine, we resize a DenseMap to fit the elements
we intend to put in it, then insert those elements (to avoid continual
reallocations as it grows). But .resize(foo) doesn't actually do what
people think; it resizes to foo buckets (which is really an
implementation detail the user of DenseMap probably shouldn't care about),
not the space required to fit foo elements. DenseMap grows if 3/4 of its
buckets are full, so this actually causes one forced reallocation every
time instead of avoiding a reallocation.
This patch makes .resize(foo) do the intuitive thing: it grows to the size
necessary to fit foo elements without new allocations.
Also include a test to verify that .resize() actually does what we think it
does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263522 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:
safe_foo:
// check preconditions for foo
.alt_entry fast_foo
fast_foo:
// body of foo, can assume preconditions.
The .alt_entry flag is also implicitly set on assembly aliases of the form:
a = b + C
where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263521 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263516 91177308-0d34-0410-b5e6-96231b3b80d8
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.
Differential Revision: http://reviews.llvm.org/D18160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263515 91177308-0d34-0410-b5e6-96231b3b80d8
(Resubmitting after fixing missing file issue)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.
A companion clang patch will immediately succeed this patch to reflect
this renaming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263513 91177308-0d34-0410-b5e6-96231b3b80d8
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.
This reverts commit r263303.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263512 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.
Without this change, we'll happily unroll e.g. the following loop
for (int i = 0; i < N; ++i) {
if (i == 0) convergent_op();
foo();
}
into
int i = 0;
if (N % 2 == 1) {
convergent_op();
foo();
++i;
}
for (; i < N - 1; i += 2) {
if (i == 0) convergent_op();
foo();
foo();
}.
This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.
In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.
Reviewers: resistor
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263509 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This now try to reorder instructions in order to help create the optimizable pattern.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer
Differential Revision: http://reviews.llvm.org/D16523
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263503 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This form was replaced by a form taking an instruction instead of opcode and
return type in r258391. After committing this change (and some depending,
follow-up changes) it turned out in the review thread to be controversial. The
discussion didn't come to a conclusion yet. I'm re-adding the old form to fix
the API regression and to provide a better base for discussion, possibly on
llvm-dev.
A difference to the original function is that it can't be called with GEPs
(similarly to how it was already the case for compares). In order to support
opaque pointers in the future, folding GEPs needs to be passed the source
element type, which is not possible with the current API.
Reviewers: dberlin, reames
Subscribers: dblaikie, eddyb
Differential Revision: http://reviews.llvm.org/D17901
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263501 91177308-0d34-0410-b5e6-96231b3b80d8
If anybody is actually using this, it probably doesn't do what they
think it does. This actually causes the dylib to *export* a
__cxa_atexit symbol, so anything that links it probably loses their
exit time destructors as well as disabling LLVM's.
This just removes the option entirely. If somebody does need this
behaviour we should figure out a more principled way to do it.
This is effectively a revert of r223805.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263498 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
llvm-config --libs does not produce correct output since commit r260263
(llvm-config: Add preliminary Windows support) changed naming format of
the libraries. This patch updates llvm-config to recognize new naming
format and output correct linker flags.
Ref: https://llvm.org/bugs/show_bug.cgi?id=26581
Patch by Vedran Miletić
Reviewers: ehsan, rnk, pxli168
Subscribers: pxli168
Differential Revision: http://reviews.llvm.org/D17300
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263497 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.
Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18077
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263495 91177308-0d34-0410-b5e6-96231b3b80d8
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.
A companion clang patch will immediately succeed this patch to reflect
this renaming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263490 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This check was added in rL152620, and has started causing downstream warnings in Julia:
```
In file included from /home/tkelman/Julia/julia-0.5/src/codegen.cpp:22:0:
/home/tkelman/Julia/julia-0.5/usr/include/llvm/ExecutionEngine/JITEventListener.h:84:5: warning: "LLVM_USE_INTEL_JITEVENTS" is not defined [-Wundef]
#if LLVM_USE_INTEL_JITEVENTS
^
/home/tkelman/Julia/julia-0.5/usr/include/llvm/ExecutionEngine/JITEventListener.h💯5: warning: "LLVM_USE_OPROFILE" is not defined [-Wundef]
#if LLVM_USE_OPROFILE
^
```
Patch by Tony Kelman.
Reviewers: loladiro
Differential Revision: http://reviews.llvm.org/D17254
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263487 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r263472.
There is an LNT failure on clang-ppc64be-linux-lnt. Turn this off,
while I am investigating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263485 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in:
https://llvm.org/bugs/show_bug.cgi?id=26636
This doesn't accomplish anything on its own. It's the first step towards preserving
and using branch weights with selects.
The next step would be to make sure we're propagating the info in all of the other
places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's
an easy fix to make this happen; we have to look at each transform individually to
determine how to correctly propagate the weights.
Along with that step, we need to then use the weights when making subsequent transform
decisions such as discussed in http://reviews.llvm.org/D16836.
The inliner test is independent but closely related. It verifies that metadata is
preserved when both branches and selects are cloned.
Differential Revision: http://reviews.llvm.org/D18133
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263482 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously we had a notion of convergent functions but not of convergent
calls. This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.
Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent. As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.
Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage. Re-landed here with no changes.
Reviewers: chandlerc, jingyue
Subscribers: llvm-commits, tra, jhen, hfinkel
Differential Revision: http://reviews.llvm.org/D17739
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263481 91177308-0d34-0410-b5e6-96231b3b80d8
Some instructions were missing isBranch, isCall, or isTerminator
flags. This didn't really affect code generation since most of
the affected patterns were used only for the AsmParser and/or
disassembler.
However, it could affect tools using the MC layer to disassemble
and parse binary code (e.g. via MCInstrDesc::mayAffectControlFlow).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263478 91177308-0d34-0410-b5e6-96231b3b80d8
The two issues that were discovered got fixed (r263058, r263173).
The pass can be disabled with -mllvm -enable-loop-load-elim=0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263472 91177308-0d34-0410-b5e6-96231b3b80d8
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.
This is an historical accident effect from r129188.
When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.
Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:
- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.
The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.
Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263460 91177308-0d34-0410-b5e6-96231b3b80d8
When the SP in not changed because of realignment/VLAs etc., we restore the SP
by using the previous value of SP and not the FP. Breaking the dependency will
help in cases when the epilog of a callee is close to the epilog of the caller;
for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of
the callee.
http://reviews.llvm.org/D18060
Patch by Aditya Kumar and Evandro Menezes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263458 91177308-0d34-0410-b5e6-96231b3b80d8
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
objections.
1. x86 already does this kind of optimization for multiple scalar loads -> vector load.
2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.
Differential Revision: http://reviews.llvm.org/D18094
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263446 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
MIPSR6 introduces a class of branches called compact branches. Unlike the
traditional MIPS branches which have a delay slot, compact branches do not
have a delay slot. The instruction following the compact branch is only
executed if the branch is not taken and must not be a branch.
It works by generating compact branches for MIPS32R6 when the delay slot
filler cannot fill a delay slot. Then, inspecting the generated code for
forbidden slot hazards (a compact branch with an adjacent branch or other
CTI) and inserting nops to clear this hazard.
Patch by Simon Dardis.
Reviewers: vkalintiris, dsanders
Subscribers: MatzeB, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D16353
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263444 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When multiple threads perform an atomic op with the same arguments, they
will usually see different return values.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18101
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263440 91177308-0d34-0410-b5e6-96231b3b80d8
On the z13, it turns out to be more efficient to access a full
floating-point register than just the upper half (as done e.g.
by the LE and LER instructions).
Current code already takes this into account when loading from
memory by using the LDE instruction in place of LE. However,
we still generate LER, which shows the same performance issues
as LE in certain circumstances.
This patch changes the back-end to emit LDR instead of LER to
implement FP32 register-to-register copies on z13.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263431 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
With the addition of checks to ensure that operands have a strict ordering
it has become tricky to manage the order in the way I originally intended.
This patch linearizes the ordering which simplifies the implementation but
requires an order that is arbitrary in places. Here are some examples:
* uimm4 < uimm5 < uimm6
* simm4 < uimm4 < simm5 < uimm5
* uimm5 < uimm5_plus1 (1..32) < uimm5_plus32 (32..63) < uimm6
The term 'superset' starts to break down here since the *_plus* classes
are not true supersets of uimm5 (but they are still subsets of uimm6).
* uimm5 < uimm5_64, and uimm5 < vsplat_uimm5
This is entirely arbitrary. We need an ordering and what we pick is
unimportant since only one is possible for a given mnemonic.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D17723
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263423 91177308-0d34-0410-b5e6-96231b3b80d8