Summary:
Fix LSRInstance::HoistInsertPosition() to check the original insert
position block first for a canonical insertion point that is dominated
by all inputs. This leads to SCEV being able to reuse more instructions
since it currently tracks the instructions it creates for reuse by
keeping a table of <Value, insert point> pairs.
Reviewers: atrick
Subscribers: mcrosier, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18001
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263644 91177308-0d34-0410-b5e6-96231b3b80d8
- Ensure we test X86 + X64
- sitopfp / uitofp requires testing for SSE2 and SSE42 as well (part of the fix for PR26953)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263640 91177308-0d34-0410-b5e6-96231b3b80d8
And emit an error if it fails.
This prevents illegal instructions from getting sent to the GPU, which
would potentially result in a hang.
This is a candidate for the stable branch(es).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263627 91177308-0d34-0410-b5e6-96231b3b80d8
Fork off compatibility.ll for the 3.8 release. The *.bc file in this
commit was produced using a Release build of the release_38 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263620 91177308-0d34-0410-b5e6-96231b3b80d8
This patch introduces the Error classs for lightweight, structured,
recoverable error handling. It includes utilities for creating, manipulating
and handling errors. The scheme is similar to exceptions, in that errors are
described with user-defined types. Unlike exceptions however, errors are
represented as ordinary return types in the API (similar to the way
std::error_code is used).
For usage notes see the LLVM programmer's manual, and the Error.h header.
Usage examples can be found in unittests/Support/ErrorTest.cpp.
Many thanks to David Blaikie, Mehdi Amini, Kevin Enderby and others on the
llvm-dev and llvm-commits lists for lots of discussion and review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263609 91177308-0d34-0410-b5e6-96231b3b80d8
We can currently only match zeroable vector elements of the same size as the shuffle type - these tests demonstrate the problem and a solution will be shortly added in an updated D14261
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263606 91177308-0d34-0410-b5e6-96231b3b80d8
These were printing as "UnknownCode3", since we were looking for them
inside PARAMATTR blocks instead of PARAMATTR_GROUP blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263597 91177308-0d34-0410-b5e6-96231b3b80d8
The latent bug that LLE exposed in the LoopVectorizer was resolved
(PR26952).
The pass can be disabled with -mllvm -enable-loop-load-elim=0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263595 91177308-0d34-0410-b5e6-96231b3b80d8
There is something strange going on with debug info (.eh_frame_hdr)
disappearing when msan.module_ctor are placed in comdat sections.
Moving this functionality under flag, disabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263579 91177308-0d34-0410-b5e6-96231b3b80d8
Annoyingly, ErrorOr allows to *not check* the error when things go
well. It will crash badly when there is an error though. It should
runtime assert when it is used without being checked!
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263577 91177308-0d34-0410-b5e6-96231b3b80d8
Record all variable defs with a summary record to aid in building a
complete reference graph and locating constant variable defs to import.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263576 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This change adds a PACKAGE_VENDOR variable. When set it makes the version output more closely resemble the clang version output.
Reviewers: aprantl, bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18159
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263566 91177308-0d34-0410-b5e6-96231b3b80d8
This was a latent bug that got exposed by the change to add LoopSimplify
as a dependence to LoopLoadElimination. Since LoopInfo was corrupted
after LV, LoopSimplify mis-compiled nbench in the test-suite (more
details in the PR).
The problem was that when we create the blocks for predicated stores we
didn't add those to any loops.
The original testcase for store predication provides coverage for this
assuming we verify LI on the way out of LV.
Fixes PR26952.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263565 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Static LDS size is saved in MachineFunctionInfo::LDSSize,
We define a pseudo instruction with usesCustomInserter bit set. Then, in EmitInstrWithCustomInserter,
we replace this pseudo instruction with a mov of MachineFunctionInfo::LDSSize.
Reviewers:
arsenm
tstellarAMD
Subscribers
llvm-commits, arsenm
Differential Revision:
http://reviews.llvm.org/D18064
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263563 91177308-0d34-0410-b5e6-96231b3b80d8
If both are different aliases to the same value the sorting becomes
non-deterministic as array_pod_sort is not stable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263550 91177308-0d34-0410-b5e6-96231b3b80d8
The PE TLS directory contains information about where the TLS data
resides in the image, what functions should be executed when threads are
created, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263537 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
LLVMGetTargetDataLayout was removed from the C API,
and then TargetMachine.TargetData was removed. Later,
LLVMCreateTargetMachineData was added to the C API,
and we now expose this via the Go API.
Reviewers: deadalnix, pcc
Subscribers: cierniak, llvm-commits, axw
Differential Revision: http://reviews.llvm.org/D18173
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263530 91177308-0d34-0410-b5e6-96231b3b80d8
Address review suggestions from dblaikie: change a few dyn_cast to cast
and fold a cast into if condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263526 91177308-0d34-0410-b5e6-96231b3b80d8
Since the static getGlobalIdentifier and getGUID methods are now called
for global values other than functions, reflect that by moving these
methods to the GlobalValue class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263524 91177308-0d34-0410-b5e6-96231b3b80d8
In some places, like InstCombine, we resize a DenseMap to fit the elements
we intend to put in it, then insert those elements (to avoid continual
reallocations as it grows). But .resize(foo) doesn't actually do what
people think; it resizes to foo buckets (which is really an
implementation detail the user of DenseMap probably shouldn't care about),
not the space required to fit foo elements. DenseMap grows if 3/4 of its
buckets are full, so this actually causes one forced reallocation every
time instead of avoiding a reallocation.
This patch makes .resize(foo) do the intuitive thing: it grows to the size
necessary to fit foo elements without new allocations.
Also include a test to verify that .resize() actually does what we think it
does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263522 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:
safe_foo:
// check preconditions for foo
.alt_entry fast_foo
fast_foo:
// body of foo, can assume preconditions.
The .alt_entry flag is also implicitly set on assembly aliases of the form:
a = b + C
where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263521 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263516 91177308-0d34-0410-b5e6-96231b3b80d8
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.
Differential Revision: http://reviews.llvm.org/D18160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263515 91177308-0d34-0410-b5e6-96231b3b80d8
(Resubmitting after fixing missing file issue)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.
A companion clang patch will immediately succeed this patch to reflect
this renaming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263513 91177308-0d34-0410-b5e6-96231b3b80d8
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.
This reverts commit r263303.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263512 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.
Without this change, we'll happily unroll e.g. the following loop
for (int i = 0; i < N; ++i) {
if (i == 0) convergent_op();
foo();
}
into
int i = 0;
if (N % 2 == 1) {
convergent_op();
foo();
++i;
}
for (; i < N - 1; i += 2) {
if (i == 0) convergent_op();
foo();
foo();
}.
This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.
In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.
Reviewers: resistor
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263509 91177308-0d34-0410-b5e6-96231b3b80d8