An alias has the address of what it points to, so it also has the same
alignment.
This allows a few optimizations to see past aliases for free.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208103 91177308-0d34-0410-b5e6-96231b3b80d8
Obviously we can't expect the two backends to produce identical diagnostics,
since what's possible depends quite a bit on how the .td files are structured.
I think the ARM64 diagnostics are basically of the same quality in all the
changed cases, so I've split the CHECK lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208084 91177308-0d34-0410-b5e6-96231b3b80d8
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.
Patch by Jameson Nash!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208029 91177308-0d34-0410-b5e6-96231b3b80d8
The number of tail call to loop conversions remains the same (1618 by my count).
The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208017 91177308-0d34-0410-b5e6-96231b3b80d8
Visibility is meaningless when the linkage is local. Change
`-internalize` to reset the visibility to `default`.
<rdar://problem/16141113>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207979 91177308-0d34-0410-b5e6-96231b3b80d8
Windows on ARM does not conform to AEABI. However, memset would be emitted
using the AEABI signature, resulting in inverted parameters. Handle this
special case appropriately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207943 91177308-0d34-0410-b5e6-96231b3b80d8
Add handling for FK_SecRel_4 (4-byte section relative relocations). These are
used by the generation of DWARF debug information (the abbrevations use section
relative relocations). This will also be used in generation of CodeView line
tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207941 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise we use the same threshold as for complete unrolling, which is
way too high. This made us unroll any loop smaller than 150 instructions
by 8 times, but only if someone specified -march=core2 or better,
which happens to be the default on darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207940 91177308-0d34-0410-b5e6-96231b3b80d8
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.
Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.
This reverts r207746 which reverted the revert of the revert of r205018 or so.
Fixes the test case in PR19621.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207939 91177308-0d34-0410-b5e6-96231b3b80d8
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix. This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive. This apparently has been been broken
for a while. However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.
Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix. Add an explicit test for cygwin's behaviour of export
directives.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207926 91177308-0d34-0410-b5e6-96231b3b80d8
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.
This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.
Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207920 91177308-0d34-0410-b5e6-96231b3b80d8
See PR19608 for the details but to summarize it was easy to modify the .ll
file to get the desired def-use ordering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207887 91177308-0d34-0410-b5e6-96231b3b80d8
This makes it *really* easy to debug leaks FYI:
ASAN_OPTIONS=detect_leaks=1 ./bin/llvm-lit -v <path to test>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207874 91177308-0d34-0410-b5e6-96231b3b80d8
Reading line tables in llvm-cov was pretty broken, but would happen to
work as long as no line in the table was 0. It's not clear to me
whether a line of zero *should* show up in these tables, but deciding
to read a string in the middle of the line table is certainly the
wrong thing to do if it does.
I've also added some comments, as trying to figure out what this block
of code was doing was fairly unpleasant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207866 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
* Updated the documentation
* Added a test for >2 arguments
* Added a check for the lexical concatenation
* Made the existing test a bit stricter.
Reviewers: t.p.northover
Reviewed By: t.p.northover
Subscribers: t.p.northover, llvm-commits
Differential Revision: http://reviews.llvm.org/D3485
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207865 91177308-0d34-0410-b5e6-96231b3b80d8
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.
GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.
Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207856 91177308-0d34-0410-b5e6-96231b3b80d8
address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where
PRE is applied to a load that is not partially redundant.
<rdar://problem/16638765>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207853 91177308-0d34-0410-b5e6-96231b3b80d8
.file records are supposed to have a section identifier of 65534
(IMAGE_SCN_DEBUG) rather than 0. This is spelt out clearly within the PE/COFF
specification. Fix this minor oversight with the implementation for support for
.file records.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207851 91177308-0d34-0410-b5e6-96231b3b80d8
v2: move code to AMDGPUISelLowering.cpp
squash with tests (both EG and SI)
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207845 91177308-0d34-0410-b5e6-96231b3b80d8
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207838 91177308-0d34-0410-b5e6-96231b3b80d8
Given the following C code llvm currently generates suboptimal code for
x86-64:
__m128 bss4( const __m128 *ptr, size_t i, size_t j )
{
float f = ptr[i][j];
return (__m128) { f, f, f, f };
}
=================================================
define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
%a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
%a2 = load <4 x float>* %a1, align 16, !tbaa !1
%a3 = trunc i64 %j to i32
%a4 = extractelement <4 x float> %a2, i32 %a3
%a5 = insertelement <4 x float> undef, float %a4, i32 0
%a6 = insertelement <4 x float> %a5, float %a4, i32 1
%a7 = insertelement <4 x float> %a6, float %a4, i32 2
%a8 = insertelement <4 x float> %a7, float %a4, i32 3
ret <4 x float> %a8
}
=================================================
shlq $4, %rsi
addq %rdi, %rsi
movslq %edx, %rax
vbroadcastss (%rsi,%rax,4), %xmm0
retq
=================================================
The movslq is uneeded, but is present because of the trunc to i32 and then
sext back to i64 that the backend adds for vbroadcastss.
We can't remove it because it changes the meaning. The IR that clang
generates is already suboptimal. What clang really should emit is:
%a4 = extractelement <4 x float> %a2, i64 %j
This patch makes that legal. A separate patch will teach clang to do it.
Differential Revision: http://reviews.llvm.org/D3519
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207801 91177308-0d34-0410-b5e6-96231b3b80d8
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel
Test Plan: simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3527
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207790 91177308-0d34-0410-b5e6-96231b3b80d8