Call CC_MipsN_VarArg only when the function being analyzed is a vararg function.
The original code examined the OutputArg::IsFixed flag to determine whether
CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this
flag is often set to false when the function being analyzed is a non-variadic
function.
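For illustration only, a minimal C++ sketch of the intended selection logic; the names (chooseCallingConv, CalleeIsVarArg) are hypothetical and not the actual MipsISelLowering code:

  // Sketch: pick the calling-convention function from the callee's own
  // variadic-ness rather than from the per-argument OutputArg::IsFixed flag.
  enum class MipsCC { CC_MipsN, CC_MipsN_VarArg };

  MipsCC chooseCallingConv(bool CalleeIsVarArg) {
    // Use the vararg convention only when the analyzed function is variadic.
    return CalleeIsVarArg ? MipsCC::CC_MipsN_VarArg : MipsCC::CC_MipsN;
  }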
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174442 91177308-0d34-0410-b5e6-96231b3b80d8
base point of a load, and the overall alignment of the load. This caused infinite loops in DAG combine with the
original application of this patch.
ORIGINAL COMMIT LOG:
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
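As a rough C++ sketch of the pattern (not the actual DAGCombiner code; Node and replaceWithAlignedLoad are stand-ins):

  #include <vector>

  struct Node { unsigned Alignment; };   // stand-in for an SDNode

  // After building the replacement load with the inferred, higher alignment,
  // put it back on the combine worklist so later combines in the same phase
  // (e.g. target-specific ones) still get to see it.
  void replaceWithAlignedLoad(Node *Old, unsigned NewAlign,
                              std::vector<Node *> &Worklist) {
    Node *NewLoad = new Node{NewAlign};  // stands in for building the new load
    (void)Old;                           // the old load would be replaced here
    Worklist.push_back(NewLoad);         // the previously missing step
  }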
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174431 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, when a fragment is relaxed, its size is modified but its
offset is not (it gets laid out as a side effect of checking whether
it needs relaxation); all subsequent fragments are then invalidated
because their offsets need to change. When bundling is enabled, a
relaxed fragment needs to be laid out again, because the increase in
size may push it over a bundle boundary. So instead of only
invalidating subsequent fragments, also invalidate the fragment that
gets relaxed, which causes it to be laid out again.
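A schematic C++ sketch of the new invalidation rule (illustrative only, not the actual MCAsmLayout code):

  #include <cstddef>
  #include <vector>

  struct Fragment { bool LayoutValid = true; };

  // With bundling enabled, invalidate the relaxed fragment itself as well as
  // everything after it, so the relaxed fragment is laid out again and a size
  // increase that crosses a bundle boundary is handled correctly.
  void invalidateLayout(std::vector<Fragment> &Frags, std::size_t RelaxedIdx) {
    for (std::size_t I = RelaxedIdx; I < Frags.size(); ++I)
      Frags[I].LayoutValid = false;
  }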
This patch also removes some trailing whitespace and fixes the
bundling-related debug output of MCFragments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174401 91177308-0d34-0410-b5e6-96231b3b80d8
Emitting the function name allows us to check for it in the FileCheck
tests so we can make sure FileCheck is checking the output of the
correct function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174392 91177308-0d34-0410-b5e6-96231b3b80d8
In the loop vectorizer cost model, we used to ignore stores/loads of a pointer
type when computing the widest type within a loop. This meant that if we had
only stores/loads of pointers in a loop, we would return a widest type of 8 bits
(instead of 32 or 64 bits) and therefore a vector factor that was too big.
Now, if we see a consecutive store/load of pointers we use the size of a pointer
(from data layout).
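A simplified C++ sketch of the widest-type computation (the real code operates on llvm::Type and the DataLayout; the names here are illustrative):

  #include <algorithm>

  struct Ty { bool IsPointer; unsigned SizeInBits; };

  // For a consecutive load/store of a pointer, use the pointer size reported
  // by the data layout instead of treating the value as a tiny opaque type.
  unsigned updateWidestType(unsigned CurrentWidest, const Ty &AccessedTy,
                            unsigned PointerSizeInBits /* from data layout */) {
    unsigned Bits = AccessedTy.IsPointer ? PointerSizeInBits
                                         : AccessedTy.SizeInBits;
    return std::max(CurrentWidest, Bits);
  }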
This problem occurred in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced
test case is the first test in vector_ptr_load_store.ll).
radar://13139343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174377 91177308-0d34-0410-b5e6-96231b3b80d8
It caused hangs when compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174374 91177308-0d34-0410-b5e6-96231b3b80d8
The sh_link field in the ELF section header of .ARM.exidx should
be set to the section index of the corresponding text
section.
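For reference, a tiny C++ sketch of that relationship using the Elf32_Shdr definition from <elf.h> (Linux); this is illustrative, not the MC object-writer code:

  #include <elf.h>

  // The section header of .ARM.exidx points at its text section via sh_link.
  void linkExidxToText(Elf32_Shdr &ExidxHeader, unsigned TextSectionIndex) {
    ExidxHeader.sh_link = TextSectionIndex;
  }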
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174372 91177308-0d34-0410-b5e6-96231b3b80d8
and enables the instruction printer to print aliased
instructions.
Due to the use of RegisterOperands, a change in common
code (utils/TableGen/AsmWriterEmitter.cpp) is required
to get the correct register value when the operand is a RegisterOperand.
Contributor: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174358 91177308-0d34-0410-b5e6-96231b3b80d8
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174343 91177308-0d34-0410-b5e6-96231b3b80d8
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and the code assumes the low bits are zero. A warning should be good enough
when we are not sure whether the source code assumes the low bits are zero.
rdar://13127907
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174336 91177308-0d34-0410-b5e6-96231b3b80d8
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174330 91177308-0d34-0410-b5e6-96231b3b80d8
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp
The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.
Fix some trivially foldable X86 tests too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174325 91177308-0d34-0410-b5e6-96231b3b80d8
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations whose values come from
loads and those whose values come from elsewhere. Therefore, we are pessimistic
for now (the performance problem showed up in example 14 of gcc-loops).
radar://13096933
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174300 91177308-0d34-0410-b5e6-96231b3b80d8
The main lists of debug info metadata attached to the compile_unit had an extra
layer of metadata nodes they went through for no apparent reason. This patch
removes that (& still passes just as much of the GDB 7.5 test suite). If anyone
can show evidence as to why these extra metadata nodes are there I'm open to
reverting this patch & documenting why they're there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174266 91177308-0d34-0410-b5e6-96231b3b80d8
1) allows the use of RIP-relative addressing in 32-bit LEA instructions under
x86-64 (ILP32 and LP64)
2) separates the size of address registers in 64-bit LEA instructions from
being controlled by ILP32/LP64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174208 91177308-0d34-0410-b5e6-96231b3b80d8
Prepare it for vectors of pointers and handle simple cases. We don't handle
complicated cases because accumulateConstantOffset bails on pointer vectors.
Fixes selfhost on i386.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174179 91177308-0d34-0410-b5e6-96231b3b80d8
Only Linux is supported at the moment, and other platforms quickly fault. As a
result these tests would fail on non-Linux hosts. It may be worth making the
tests more generic again as more platforms are supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174170 91177308-0d34-0410-b5e6-96231b3b80d8
remaining use of AliasAnalysis concepts such as isIdentifiedObject to
prove pointer inequality.
@external_compare in test/Transforms/InstSimplify/compare.ll shows a simple
case where a noalias argument can be equal to a global variable address, and
while AliasAnalysis can get away with saying that these pointers don't alias,
instsimplify cannot say that they are not equal.
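A C++ analogue of the situation (using the GCC/Clang __restrict__ extension to stand in for a noalias argument; not the actual test case):

  int g;

  // Nothing stops a caller from passing &g here, so the comparison may be
  // true even though accesses through p are known not to alias other accesses.
  bool mayEqualGlobal(int *__restrict__ p) {
    return p == &g;   // must not be folded to false
  }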
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174122 91177308-0d34-0410-b5e6-96231b3b80d8
be equal, since there's nothing preventing a caller from correctly predicting
the stack location of an alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174119 91177308-0d34-0410-b5e6-96231b3b80d8
The AttrBuilder is for building a collection of attributes. The Attribute object
holds only one attribute. So it's not really useful for the Attribute object to
have a creator which takes an AttrBuilder.
This has two consequences:
1. The AttrBuilder no longer holds its internal attributes in a bit-mask form.
2. The attributes are now ordered alphabetically (hence why the tests have changed).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174110 91177308-0d34-0410-b5e6-96231b3b80d8
This is a re-worked version of r174048.
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}
With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28
Looking up %argc.addr in ValueMap will return null; since %argc.addr is already
correctly set up, we can use identity mapping.
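A generic C++ sketch of the identity-mapping idea (not the actual LLVM value-mapping code):

  #include <map>

  // If a value has no entry in the map because it does not need remapping,
  // keep the original value rather than substituting null.
  template <typename V>
  V *remapOrKeep(V *Val, const std::map<V *, V *> &ValueMap) {
    auto It = ValueMap.find(Val);
    return It != ValueMap.end() ? It->second : Val;   // identity fallback
  }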
rdar://problem/13089880
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174093 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.
This initial commit should have support for:
+ Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
(except the late addition CRC instructions).
+ CodeGen features required for C++03 and C99.
+ Compilation for the "small" memory model: code+static data <
4GB.
+ Absolute and position-independent code.
+ GNU-style (i.e. "__thread") TLS.
+ Debugging information.
The principal omission, currently, is performance tuning.
This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.
Further reviews would be gratefully received.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174054 91177308-0d34-0410-b5e6-96231b3b80d8
register for inline asm. This conforms to how gcc allows for effective
casting of inputs into gprs (fprs is already handled).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174008 91177308-0d34-0410-b5e6-96231b3b80d8
for example, a one-past-the-end pointer from one global variable may
be equal to the base pointer of another global variable.
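A concrete C++ illustration of that fact:

  int a[4];
  int b;

  // If 'a' and 'b' happen to be laid out adjacently, the one-past-the-end
  // pointer of 'a' compares equal to '&b', so this cannot be folded to false
  // merely because the two globals are distinct objects.
  bool onePastEndMayEqualOtherGlobal() {
    return a + 4 == &b;
  }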
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173995 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first commit of a large series which will add support for the
QPX vector instruction set to the PowerPC backend. This instruction set is
used on the IBM Blue Gene/Q supercomputers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173973 91177308-0d34-0410-b5e6-96231b3b80d8
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29
!27 = metadata !{null}
With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28
Looking up %argc.addr in ValueMap will return null; since %argc.addr is already
correctly set up, we can use identity mapping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173946 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a new --with-python option to allow configuration of the python binary
for building. If not specified, $PATH will be searched for common python binary
names (python, python2, python3). If specified, and the path is not executable,
it will attempt to search $PATH.
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
Reviewed-by: Eric Christopher <echristo@gmail.com>, Daniel Dunbar <daniel@zuster.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173890 91177308-0d34-0410-b5e6-96231b3b80d8
and update ELF header e_flags.
Currently gathering information such as symbol,
section and data is done by collecting it in an
MCAssembler object. From MCAssembler and MCAsmLayout
objects ELFObjectWriter::WriteObject() forms and
streams out the ELF object file.
This patch just adds a few members to the MCAssembler
class to store and access the e_flags settings. It
allows for runtime additions to the e_flags by
assembler directives. The standalone assembler can
get to MCAssembler from getParser().getStreamer().getAssembler().
This patch is the generic infrastructure and will be
followed by patches for ARM and Mips for their target
specific use.
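A rough sketch of how a target directive handler might use this; the accessor names below are assumptions for illustration, not verified API:

  // Stand-in for the relevant slice of MCAssembler.
  struct Assembler {
    unsigned EFlags = 0;
    unsigned getELFHeaderEFlags() const { return EFlags; }
    void setELFHeaderEFlags(unsigned F) { EFlags = F; }
  };

  // Reached via getParser().getStreamer().getAssembler() in the standalone
  // assembler, as described above; OR a target-specific bit into e_flags.
  void handleTargetDirective(Assembler &Asm, unsigned FlagBit) {
    Asm.setELFHeaderEFlags(Asm.getELFHeaderEFlags() | FlagBit);
  }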
Contributor: Jack Carter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173882 91177308-0d34-0410-b5e6-96231b3b80d8
Changing ARMBaseTargetMachine to return ARMTargetLowering instead of
the generic one (similar to the x86 code).
Tests show which instructions are given a cast cost when necessary,
or zero cost when not. Downcasts to 16 bits are not lowered in NEON,
so those costs are not there yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173849 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8
The AttributeSetNode contains all of the attributes. This removes one (hopefully
the last) use of the Attribute class as a container of multiple attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173761 91177308-0d34-0410-b5e6-96231b3b80d8
The common code in the post-RA scheduler to break anti-dependencies on the
critical path contained a flaw. In the reported case, an anti-dependency
between the overlapping registers %X4 and %R4 exists:
%X29<def> = OR8 %X4, %X4
%R4<def>, %X3<def,dead,tied3> = LBZU 1, %X3<kill,tied1>
The unpatched code breaks the dependency by replacing %R4 and its uses
with %R3, the first register on the available list. However, %R3 and
%X3 overlap, so this creates two overlapping definitions on the same
instruction.
The fix is straightforward, preventing selection of a register that
overlaps any other defined register on the same instruction.
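Schematically, the check looks like this C++ sketch (registersOverlap is an assumed stand-in for the target's register-overlap query, not actual API):

  #include <vector>

  bool registersOverlap(unsigned RegA, unsigned RegB);   // assumed query

  // Reject a rename candidate that overlaps any other register defined by the
  // same instruction (e.g. %R3 overlapping %X3 in the example above).
  bool isSafeRenameReg(unsigned Candidate,
                       const std::vector<unsigned> &OtherDefsOnSameInstr) {
    for (unsigned Def : OtherDefsOnSameInstr)
      if (registersOverlap(Candidate, Def))
        return false;
    return true;
  }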
The test case is reduced from the bug report, and verifies that we no
longer produce "lbzu 3, 1(3)" when breaking this anti-dependency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173706 91177308-0d34-0410-b5e6-96231b3b80d8
It is way too slow. Change the default option value to 0.
Always do exact shadow propagation for unsigned ICmp with constants; it is
cheap (under 1% CPU time) and required for correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173682 91177308-0d34-0410-b5e6-96231b3b80d8
Fix that by adding a cast to the shift expander. This came up with vector shifts
on SSE-less X86 CPUs.
<2 x i64> = shl <2 x i64> <2 x i64>
-> i64,i64 = shl i64 i64; shl i64 i64
-> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64
Now we cast the last two i64s to the right type. Fixes the crash in PR14668.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173615 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for LLVM to accept metadata that doesn't include a top-level
lexical block in a function. Specifically, LLVM couldn't handle this when there
were file changes relating to these blocks. I've updated a few test cases to
ensure other functionality (such as inlining) isn't affected by this change, but
haven't pervasively updated all the test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173592 91177308-0d34-0410-b5e6-96231b3b80d8
This catches many cases where we can emit a more efficient shuffle for a
specific mask or when the mask contains undefs. Once the splat is lowered to
unpacks we can't do that anymore.
There is a possibility of moving the promotion after pshufb matching, but I'm
not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so
I avoided that for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173569 91177308-0d34-0410-b5e6-96231b3b80d8
This provides a place to add customized operation cost information and
control some other target-specific IR-level transformations.
The only non-trivial logic in this checkin assigns a higher cost to
unaligned loads and stores (covered by the included test case).
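A simplified C++ sketch of the kind of hook this enables (illustrative numbers, not the actual PowerPC cost model):

  // Charge more for memory accesses whose alignment is below the natural
  // alignment of the accessed type.
  unsigned memoryOpCost(unsigned TypeSizeInBytes, unsigned AlignmentInBytes) {
    const unsigned BaseCost = 1;
    return AlignmentInBytes < TypeSizeInBytes ? BaseCost * 4 : BaseCost;
  }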
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173520 91177308-0d34-0410-b5e6-96231b3b80d8