Further unify the handling of function-local metadata with global
metadata, by exposing the same interface in ValueEnumerator. Both
contexts use the same accessors:
- getMDStrings(): get the strings for this block.
- getNonMDStrings(): get the non-strings for this block.
A future commit will start adding strings to the function-block.
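A rough sketch of the shared interface (the accessor names are from this
commit; the return types here are an assumption, not the actual
signatures):

  #include "llvm/ADT/ArrayRef.h"
  namespace llvm { class Metadata; }

  // Both the module-level and the function-level enumeration expose the
  // same two accessors, so the writer can emit strings and non-strings
  // through one code path in either context.
  class ValueEnumerator {
  public:
    // MDString entries assigned to the current block, in index order.
    llvm::ArrayRef<const llvm::Metadata *> getMDStrings() const;
    // Every other metadata node assigned to the current block.
    llvm::ArrayRef<const llvm::Metadata *> getNonMDStrings() const;
  };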
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265224 91177308-0d34-0410-b5e6-96231b3b80d8
A follow-up commit will start using function metadata blocks more
heavily. This commit adds some error checking to confirm that metadata
is fully resolved before (and after) materializing each function.
This is valid even when reading very old bitcode from before the
metadata/value split. The global metadata block always came before the
function blocks. However, in case this somehow causes a regression
(i.e., an old LLVM did produce such bitcode after all), I'm committing it
separately.
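A minimal sketch of the kind of check described above (assuming
MDNode::isResolved() is the relevant predicate; this is not the actual
BitcodeReader code):

  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/IR/Metadata.h"
  #include "llvm/Support/Casting.h"
  using namespace llvm;

  // Walk the metadata seen so far and confirm nothing is still an
  // unresolved forward reference before/after materializing a function.
  static bool allMetadataResolved(ArrayRef<Metadata *> MDs) {
    for (Metadata *MD : MDs)
      if (auto *N = dyn_cast_or_null<MDNode>(MD))
        if (!N->isResolved())
          return false;
    return true;
  }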
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265223 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This should make the code more readable, especially all the map declarations.
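A hypothetical illustration of the kind of cleanup (not the actual diff):
a type alias keeps the verbose map declarations readable.

  #include <cstdint>
  #include <map>
  #include <memory>
  #include <string>
  #include <vector>

  // Before: std::map<uint64_t, std::vector<std::unique_ptr<std::string>>> M;
  // After: the alias is declared once and reused everywhere.
  using ValueIdToSummariesTy =
      std::map<uint64_t, std::vector<std::unique_ptr<std::string>>>;
  ValueIdToSummariesTy ValueIdToSummaries;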
Reviewers: tejohnson
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18721
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265215 91177308-0d34-0410-b5e6-96231b3b80d8
Incremental LTO will use a cache to store object files.
This patch handles the pruning part of the cache, exposing
a few knobs:
- Pruning interval: the implementation keeps a "timestamp" file in the
directory and will scan it only after a given interval since the last
modification of the timestamp file. This is for performance; we don't
want to continuously scan the folder.
- Entry expiration: the time after which a file that hasn't been used is
removed from the cache.
- Maximum size: expressed as a percentage of the available disk space, it
helps avoid blowing up the disk space.
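A rough sketch of those three knobs as a policy struct (illustrative
only; field names and defaults are assumptions, not the actual API):

  #include <chrono>

  struct CachePruningPolicy {
    // Rescan the cache directory at most this often (tracked via the
    // timestamp file described above).
    std::chrono::seconds Interval{20 * 60};
    // Drop entries that have not been used for this long.
    std::chrono::seconds Expiration{7 * 24 * 60 * 60};
    // Cap the cache relative to the available disk space.
    unsigned MaxSizePercentageOfAvailableSpace = 75;
  };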
http://reviews.llvm.org/D18422
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265209 91177308-0d34-0410-b5e6-96231b3b80d8
Use a helper function to find all the direct call sites in a function.
Also split the code into a separate file, as it will be used by the
indirect-call-promotion transformation.
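A minimal sketch of what such a helper does (illustrative, not the actual
code from the patch): collect every call in F whose callee is known
directly rather than through a pointer.

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  static void findDirectCallSites(Function &F,
                                  SmallVectorImpl<CallInst *> &Sites) {
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (auto *CI = dyn_cast<CallInst>(&I))
          if (CI->getCalledFunction()) // null for indirect calls
            Sites.push_back(CI);
  }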
Differential Revision: http://reviews.llvm.org/D18704
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265199 91177308-0d34-0410-b5e6-96231b3b80d8
These functions can be dropped by the compiler if they are no longer
referenced in the current module. However, there is a chance that
another module is still referencing them because of the import.
Multiple solutions can be used:
- Always import LinkOnce when a caller is imported. This ensures that
every module with a call to a LinkOnce has the definition and will
be able to emit it if it emits the call.
- Turn the LinkOnce into Weak, so that it is always emitted.
- Turn all LinkOnce into available_externally and come back after all
modules are codegen'ed to emit only one copy of the linkonce, when
there is still a reference to it.
This patch implements the second option, with an optimization: only
*one* module will turn the LinkOnce into Weak, while the others will
turn it into available_externally, so that exactly one copy is emitted
for the whole compilation.
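A minimal sketch of that resolution rule (illustrative; the real
implementation works on the summaries, not a simple flag):

  #include "llvm/IR/GlobalValue.h"

  // Exactly one module keeps an emitted copy by promoting to weak; every
  // other module demotes its copy to available_externally.
  static llvm::GlobalValue::LinkageTypes
  resolveLinkOnceForModule(bool IsTheChosenModule) {
    return IsTheChosenModule ? llvm::GlobalValue::WeakODRLinkage
                             : llvm::GlobalValue::AvailableExternallyLinkage;
  }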
http://reviews.llvm.org/D18346
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265190 91177308-0d34-0410-b5e6-96231b3b80d8
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.
This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.
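A minimal sketch of the two supported positions (the attribute name is
from this commit; the exact C++ helper calls are an assumption about the
API):

  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  void markSwiftError(Function &F, AllocaInst &ErrorSlot) {
    // swifterror on a function parameter (here: argument 0).
    F.addParamAttr(0, Attribute::SwiftError);
    // swifterror on a stack slot.
    ErrorSlot.setSwiftError(true);
  }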
Differential Revision: http://reviews.llvm.org/D18092
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265189 91177308-0d34-0410-b5e6-96231b3b80d8
ThreadModel::Single is already handled by ARMPassConfig adding
LowerAtomicPass to the pass list, which lowers all atomics to non-atomic
ops and deletes fences.
So by the time we get to ISel, there are no atomic fences left, and they
don't need special handling.
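Paraphrased sketch of that pass-config decision (not the verbatim
ARMPassConfig source):

  #include "llvm/IR/LegacyPassManager.h"
  #include "llvm/Transforms/Scalar.h" // createLowerAtomicPass()

  // With ThreadModel::Single, lower atomics to plain ops and strip fences
  // early, so instruction selection never sees them.
  void addSingleThreadedIRPasses(llvm::legacy::PassManagerBase &PM,
                                 bool SingleThreaded) {
    if (SingleThreaded)
      PM.add(llvm::createLowerAtomicPass());
  }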
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265178 91177308-0d34-0410-b5e6-96231b3b80d8
Was there really no other way to splat a byte in SSE2?
punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]
pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265172 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Patch by: Vedran Miletić
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: jvesely, scchan, kanarayan, arsenm
Differential Revision: http://reviews.llvm.org/D17280
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265170 91177308-0d34-0410-b5e6-96231b3b80d8
Note however that this is identical to the existing SSE2 run.
What we really want is yet another run for an SSE2 machine that
also has fast unaligned 16-byte accesses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265167 91177308-0d34-0410-b5e6-96231b3b80d8
Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 -
where we noticed that an intermediate splat was being generated for memsets of
non-zero chars.
That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.
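For reference, the arithmetic behind that integer multiply: a byte is
broadcast into a 32-bit lane by multiplying with 0x01010101.

  #include <cassert>
  #include <cstdint>

  int main() {
    uint8_t C = 0x2a;                           // memset value
    uint32_t Splat = uint32_t(C) * 0x01010101u; // 0x2a2a2a2a
    assert(Splat == 0x2a2a2a2au);
    return 0;
  }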
The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.
Note that the SSE1 path is not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265161 91177308-0d34-0410-b5e6-96231b3b80d8
A catchswitch cannot be preceded by another instruction in the same
basic block (other than a PHI node).
Instead, insert the extract element right after the materialization of
the vectorized value. This isn't optimal but is a reasonable compromise
given the constraints of WinEH.
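A hypothetical illustration of that insertion-point choice (not the
actual SLPVectorizer change):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // If the user's block ends in a catchswitch, nothing but PHIs may
  // precede it there, so place the extractelement right after the
  // vectorized value instead of in the user's block.
  Instruction *pickExtractInsertPoint(Instruction *VecVal,
                                      BasicBlock *UserBB) {
    if (isa<CatchSwitchInst>(UserBB->getTerminator()))
      return VecVal->getNextNode();
    return &*UserBB->getFirstInsertionPt();
  }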
This fixes PR27163.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265157 91177308-0d34-0410-b5e6-96231b3b80d8
Follow-up to D18566 - where we noticed that an intermediate splat was being
generated for memsets of non-zero chars.
That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.
The tests that were added in the last patch are now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.
In the new tests, the splat via shuffling looks ok to me, but there might be some
room for improvement depending on uarch there.
Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.
Differential Revision: http://reviews.llvm.org/D18676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265148 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids undefined behavior when casting pointers to it. Also make
sure that we don't cast to a derived StringMapEntry before checking for
the tombstone, as the derived type may have different alignment
requirements.
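A hypothetical illustration of the pattern (generic, not the actual
StringMap source): test against the tombstone sentinel while the pointer
still has the base type, and only cast to the derived entry type once it
is known to be live.

  template <typename DerivedEntryT, typename BaseT>
  DerivedEntryT *asLiveEntry(BaseT *Bucket, BaseT *Tombstone) {
    if (!Bucket || Bucket == Tombstone)
      return nullptr;                            // empty or tombstone
    return static_cast<DerivedEntryT *>(Bucket); // safe: a real entry
  }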
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265145 91177308-0d34-0410-b5e6-96231b3b80d8
Since revision 235394, we no longer perform target specific combines on
build_vector nodes. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265138 91177308-0d34-0410-b5e6-96231b3b80d8