This is a follow-on from the discussion in http://reviews.llvm.org/D12154.
This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.
A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx
This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449
Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.
Differential Revision: http://reviews.llvm.org/D12288
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245950 91177308-0d34-0410-b5e6-96231b3b80d8
My theory is that somehow Python's refcounting and GC strategy isn't
closing the subprocess handle in a timely fashion. This accesses the
private '_handle' field of the Popen object, but I see no other way to
do this. If this doesn't address the problem on the sanitizer-windows
buildbot, we can revert this change. If it does, then let's keep the
hack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245946 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models. To do this, we need the sample profile pass to
be a module pass.
This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245940 91177308-0d34-0410-b5e6-96231b3b80d8
While introducing support for MinVersionLoadCommand in llvm-readobj I noticed there's
no API to extract Major/Minor/Update components conveniently. Currently consumers
do the bit twiddling on their own, but this will change from now on.
I'll convert llvm-objdump (and llvm-readobj) in a later commit.
Differential Revision: http://reviews.llvm.org/D12282
Reviewed by: rafael
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245938 91177308-0d34-0410-b5e6-96231b3b80d8
As of r245924, _ftol2 is no longer used for fptoui on MS platforms.
Remove the dead code associated with it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245925 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes two issues in x86 fptoui lowering.
1) Makes conversions from f80 go through the right path on AVX-512.
2) Implements an inline sequence for fptoui i64 instead of a library
call. This improves performance by 6X on SSE3+ and 3X otherwise.
Incidentally, it also removes the use of ftol2 for fptoui, which was
wrong to begin with, as ftol2 converts to a signed i64, producing
wrong results for values >= 2^63.
Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D11316
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245924 91177308-0d34-0410-b5e6-96231b3b80d8
It doesn't solve the problem, when for example we load something, and
then assume that it is the same as some constant value, because
globalopt will fail on unknown load instruction. The proposed solution
would be to skip some instructions that we can't evaluate and they are
safe to skip (f.e. load, assume and many others) and see if they are
required to perform optimization (f.e. we don't care about ephemeral
instructions that may appear using @llvm.assume())
http://reviews.llvm.org/D12266
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245919 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 433bfd94e4b7e3cc3f8b08f8513ce47817941b0c.
Broke some bot, have to see why it passed locally.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245917 91177308-0d34-0410-b5e6-96231b3b80d8
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.
It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.
Reviewers: ributzka
Differential Revision: http://reviews.llvm.org/D12263
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245916 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix some grammatical and typographical errors.
- Try to improve upon some awkward/nonstandard phrasings.
- Expand slightly the treatment of how you specify arguments to cmake.
- Update the list of possible LLVM_BUILD_TESTS and state where to find the
definitive list.
- Correct the name of llvm-tblgen.
- Expand slightly the treatment of several build options, including
LLVM_LIT_TOOLS_DIR, LLVM_ENABLE_FFI, and LLVM_EXTERNAL_project_SOURCE_DIR.
Patch by Brian R. Gaeke!
Differential Revision: http://reviews.llvm.org/D11862
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245911 91177308-0d34-0410-b5e6-96231b3b80d8
We might end up with a trivial copy as the addend, and if so, we should ignore
the corresponding FMA instruction. The trivial copy can be coalesced away later,
so there's nothing to do here. We should not, however, assert. Fixes PR24544.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245907 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently std::vector::erase(const_iterator) (as opposed to the
non-const iterator) is a part of C++11 but it seems this is not available
on all the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245900 91177308-0d34-0410-b5e6-96231b3b80d8
This change moves LTOCodeGenerator's ownership of the merged module to a
field of type std::unique_ptr<Module>. This helps simplify parts of the code
and clears the way for the module to be consumed by LLVM CodeGen (see D12132
review comments).
Differential Revision: http://reviews.llvm.org/D12205
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245891 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes PR24546, which demonstrates a segfault during the VSX
swap removal pass. The problem is that debug value instructions were
not excluded from the list of instructions to be analyzed for webs of
related computation. I've added the test case from the PR as a crash
test in test/CodeGen/PowerPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245862 91177308-0d34-0410-b5e6-96231b3b80d8
Arranging the language specific property section into readable groupings and adding a couple of notes about pass order, extensions, and the like.
For the record, suggestion for word smithing are welcomed. I'm happy to revise; I'm just trying to get *something* in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245855 91177308-0d34-0410-b5e6-96231b3b80d8
This change just groups the suggestions by broad topic. I'm planning a couple of follow on changes to improve the readability of this document.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245854 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for dfsan on aarch64-linux with 42-bit VMA
(current default config for 64K pagesize kernels). The support is
enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time
for both clang/llvm and compiler-rt. The default VMA is 39 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245840 91177308-0d34-0410-b5e6-96231b3b80d8
The FP16_TO_FP node only uses the bottom 16 bits of its input, so the
following pattern can be optimised by removing the AND:
(FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)
This is a common pattern for ARM targets when functions have __fp16
arguments, as they are passed as floats (so that they get passed in the
correct registers), but then bitcast and truncated to ignore the top 16
bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245832 91177308-0d34-0410-b5e6-96231b3b80d8