The point is to ensure that the attribute in question
(DW_AT_data_member_location) is associated with the prior tag, so ensure
that we don't see another tag starting between the intended tag and the
desired attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193884 91177308-0d34-0410-b5e6-96231b3b80d8
In a failed attempt to allow the gnu-public-names.ll test case to not
hardcode the size of the unit that the pubnames section referred to I've
at least managed to have unit headers and pubnames headers print out in
a similar style.
This failed to achieve the desired goal because the header in a unit
specifies the length of the unit without the length element of the
header whereas the length in the pubnames includes this element, so the
numbers are off by 4 bytes. I don't know of any arithmetic powers in
FileCheck so the test case can't simply say "CU_LENGTH + 4".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193872 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a pointer to a single-element struct we can still build wide loads
and stores to it (if there is no padding).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193860 91177308-0d34-0410-b5e6-96231b3b80d8
Add a Virtualization ARM subtarget feature along with adding proper build
attribute emission for Tag_Virtualization_use (encodes Virtualization and
TrustZone) and Tag_MPextension_use.
Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to
something that is more maintainable. This changes the focus of this
testcase away from testing CPU defaults (which is tested elsewhere), onto
specifically testing that attributes are encoded correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193859 91177308-0d34-0410-b5e6-96231b3b80d8
Fix Tag_ABI_HardFP_use build attribute to handle single precision FP,
replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add
some tests for Tag_ABI_VFP_args.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193856 91177308-0d34-0410-b5e6-96231b3b80d8
This adds another heuristic to BPI, similar to the existing heuristic that
considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by
Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and
equality comparisons with -1 are also unlikely to succeed. Local
experimentation supports this hypothesis: This yields a 1-2% speedup in the
test-suite sqlite benchmark on the PPC A2 core, with no significant
regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193855 91177308-0d34-0410-b5e6-96231b3b80d8
When a dependence check fails we can still try to vectorize loops with runtime
array bounds checks.
This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the
scalar performance on a corei7-avx.
radar://15339680
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193853 91177308-0d34-0410-b5e6-96231b3b80d8
Given that backend does not handle "invoke asm" correctly ("invoke asm" will be
handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right
setup for LPadToCallSiteMap) and we already made the assumption that inline asm
does not throw in InstCombiner::visitCallSite, we are going to make the same
assumption in Inliner to make sure we don't convert "call asm" to "invoke asm".
If it becomes necessary to add support for "invoke asm" later on, we will need
to modify the backend as well as remove the assumptions that inline asm does
not throw.
Fix rdar://15317907
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193808 91177308-0d34-0410-b5e6-96231b3b80d8
There are two ways one could implement hiding of linkonce_odr symbols in LTO:
* LLVM tells the linker which symbols can be hidden if not used from native
files.
* The linker tells LLVM which symbols are not used from other object files,
but will be put in the dso symbol table if present.
GOLD's API is the second option. It was implemented almost 1:1 in llvm by
passing the list down to internalize.
LLVM already had partial support for the first option. It is also very similar
to how ld64 handles hiding these symbols when *not* doing LTO.
This patch then
* removes the APIs for the DSO list.
* marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr
global values and other linkonce_odr whose address is not used.
* makes the gold plugin responsible for handling the API mismatch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193800 91177308-0d34-0410-b5e6-96231b3b80d8
That way the test won't start faililng when someone adds a new attribute
and wants to use the next logical enum (38) for bitcode. The new
bitcode file tries to use the number 48 as an attribute instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193787 91177308-0d34-0410-b5e6-96231b3b80d8
Two of the tests are new test cases (cross-module-a.ll, multi-module-a.ll)
not yet supported on MIPS, while XFAIL for the other two tests was
accidentally removed in r193570 and this change reverts those lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193781 91177308-0d34-0410-b5e6-96231b3b80d8
We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the
corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs,
that is why we keep the maps in DwarfDebug instead of CompileUnit.
We make the assumption that if a DIE is not added to an owner yet, we assume
it belongs to the current CU. Since DIEs for the type system are added to
their owners immediately after creation, and other DIEs belong to the current
CU, the assumption should be true.
A testing case is added to show that we only create a single DIE for a type
MDNode and we use ref_addr to refer to the type DIE.
We also add a testing case to show ref_addr relocations for non-darwin
platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193779 91177308-0d34-0410-b5e6-96231b3b80d8
As on other hosts, the CPU identification instruction is priveleged,
so we need to look through /proc/cpuinfo. I copied the PowerPC way of
handling "generic".
Several tests were implicitly assuming z10 and so failed on z196.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193742 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193739 91177308-0d34-0410-b5e6-96231b3b80d8
When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
%tmp = zext <16 x i8> %a to <16 x i32>
store <16 x i32> %tmp, <16 x i32>*%p
ret void
}
Generates:
.section __TEXT,__text,regular,pure_instructions
.section __TEXT,__const
.align 5
LCPI0_0:
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.long 255 ## 0xff
.section __TEXT,__text,regular,pure_instructions
.globl _bar
.align 4, 0x90
_bar:
vpunpckhbw %xmm0, %xmm0, %xmm1
vpunpckhwd %xmm0, %xmm1, %xmm2
vpmovzxwd %xmm1, %xmm1
vinsertf128 $1, %xmm2, %ymm1, %ymm1
vmovaps LCPI0_0(%rip), %ymm2
vandps %ymm2, %ymm1, %ymm1
vpmovzxbw %xmm0, %xmm3
vpunpckhwd %xmm0, %xmm3, %xmm3
vpmovzxbd %xmm0, %xmm0
vinsertf128 $1, %xmm3, %ymm0, %ymm0
vandps %ymm2, %ymm0, %ymm0
vmovaps %ymm0, (%rdi)
vmovaps %ymm1, 32(%rdi)
vzeroupper
ret
So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
- the number of vector elements is even,
- the source type is legal,
- the type of a split source is illegal,
- the type of an extended (by doubling element size) source is legal, and
- the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and the continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more effecient, termination condition for the loop of "split the
operation until the destination type is legal."
With this change, the above example now compiles to:
_bar:
vpxor %xmm1, %xmm1, %xmm1
vpunpcklbw %xmm1, %xmm0, %xmm2
vpunpckhwd %xmm1, %xmm2, %xmm3
vpunpcklwd %xmm1, %xmm2, %xmm2
vinsertf128 $1, %xmm3, %ymm2, %ymm2
vpunpckhbw %xmm1, %xmm0, %xmm0
vpunpckhwd %xmm1, %xmm0, %xmm3
vpunpcklwd %xmm1, %xmm0, %xmm0
vinsertf128 $1, %xmm3, %ymm0, %ymm0
vmovaps %ymm0, 32(%rdi)
vmovaps %ymm2, (%rdi)
vzeroupper
ret
This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.
rdar://14735100
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193727 91177308-0d34-0410-b5e6-96231b3b80d8
With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr
if they are also unnamed_addr or don't have their address taken.
There is not a lot of documentation about .weak_def_can_be_hidden, but
from the old discussion about linkonce_odr_auto_hide and the name of
the directive this looks correct: these symbols can be hidden.
Testing this with the ld64 in Xcode 5 linking clang reduces the number of
exported symbols from 21053 to 19049.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193718 91177308-0d34-0410-b5e6-96231b3b80d8
Also corrected the definition of the intrinsics for these instructions (the
result register is also the first operand), and added intrinsics for bsel and
bseli to clang (they already existed in the backend).
These four operations are mostly equivalent to bsel, and bseli (the difference
is which operand is tied to the result). As a result some of the tests changed
as described below.
bitwise.ll:
- bsel.v test adapted so that the mask is unknown at compile-time. This stops
it emitting bmnzi.b instead of the intended bsel.v.
- The bseli.b test now tests the right thing. Namely the case when one of the
values is an uimm8, rather than when the condition is a uimm8 (which is
covered by bmnzi.b)
compare.ll:
- bsel.v tests now (correctly) emits bmnz.v instead of bsel.v because this
is the same operation (see MSA.txt).
i8.ll
- CHECK-DAG-ized test.
- bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
because this is the same operation (see MSA.txt).
- bseli.b still emits bseli.b though because the immediate makes it
distinguishable from bmnzi.b.
vec.ll:
- CHECK-DAG-ized test.
- bmz.v tests now (correctly) emits bmnz.v with swapped operands (see
MSA.txt).
- bsel.v tests now (correctly) emits bmnz.v with swapped operands (see
MSA.txt).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193693 91177308-0d34-0410-b5e6-96231b3b80d8
This required correcting the definition of the bins[lr]i intrinsics because
the result is also the first operand.
It also required removing the (arbitrary) check for 32-bit immediates in
MipsSEDAGToDAGISel::selectVSplat().
Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
because the constant is legalized into a ConstantPool. Similar things can
happen with binsri.d with more than 10 bits set in the mask. The resulting
code when this happens is correct but not optimal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193687 91177308-0d34-0410-b5e6-96231b3b80d8