Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks.
Minor follow up to D15477.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258000 91177308-0d34-0410-b5e6-96231b3b80d8
%RBP can't be handled explicitly. We generate the following code:
pushq %rbp
movq %rsp, %rbp
...
movq %rbx, (%rbp) ## 8-byte Spill
where %rbp will be overwritten by the spilled value.
The fix is to let PEI handle %RBP.
PR26136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257997 91177308-0d34-0410-b5e6-96231b3b80d8
Entry block count was not counted and is corrected. Also
introduce a new metric that is MaxInternalBlockCount which
show command shows (as before).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257987 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.
Fixes PR26163.
Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257979 91177308-0d34-0410-b5e6-96231b3b80d8
Out-of-tree projects that don't support this can disable the test for
themselves rather than having it disabled in LLVM itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257960 91177308-0d34-0410-b5e6-96231b3b80d8
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.
Differential Revision: http://reviews.llvm.org/D15957
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257940 91177308-0d34-0410-b5e6-96231b3b80d8
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.
PR26136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257930 91177308-0d34-0410-b5e6-96231b3b80d8
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.
I will commit fixes to other platforms as well.
PR26136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257929 91177308-0d34-0410-b5e6-96231b3b80d8
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.
I will commit fixes to other platforms as well.
PR26136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257925 91177308-0d34-0410-b5e6-96231b3b80d8
The fix uniques the bundle of getelementptr indices we are about to vectorize
since it's possible for the same index to be used by multiple instructions.
The original commit message is below.
[SLP] Vectorize the index computations of getelementptr instructions.
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257918 91177308-0d34-0410-b5e6-96231b3b80d8
# The first commit's message is:
Revert "[ARM] Add DSP build attribute and extension targeting"
This reverts commit b11cc50c0b.
# This is the 2nd commit message:
Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"
This reverts commit 837d08454e.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257916 91177308-0d34-0410-b5e6-96231b3b80d8
This means that LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN will not be set
in a few cases.
This should have no impact in ld64 since it doesn't use lazy loading
when merging modules and that is when it checks
LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257915 91177308-0d34-0410-b5e6-96231b3b80d8
Added forgotten ELFDumper.cpp to commit.
Initial commit message:
[llvm-readobj] Add support for TLSDESC_PLT and TLSDESC_GOT dynamic section tags to the llvm-readobj.
If module uses uses lazy TLSDESC relocations it should define DT_TLSDESC_PLT and DT_TLSDESC_GOT entries.
They were unknown for llvm-readobj before this patch.
Differential revision: http://reviews.llvm.org/D16224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257914 91177308-0d34-0410-b5e6-96231b3b80d8
Initial commit message:
[llvm-readobj] Add support for TLSDESC_PLT and TLSDESC_GOT dynamic section tags to the llvm-readobj.
If module uses uses lazy TLSDESC relocations it should define DT_TLSDESC_PLT and DT_TLSDESC_GOT entries.
They were unknown for llvm-readobj before this patch.
Differential revision: http://reviews.llvm.org/D16224
----
Added : /llvm/trunk/test/tools/llvm-readobj/Inputs/dynamic-table-so.aarch64
Modified : /llvm/trunk/test/tools/llvm-readobj/Inputs/dynamic-table.c
Modified : /llvm/trunk/test/tools/llvm-readobj/dynamic.test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257913 91177308-0d34-0410-b5e6-96231b3b80d8
platforms.
With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.
This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311
(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)
Differential Revision: http://reviews.llvm.org/D16145
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257902 91177308-0d34-0410-b5e6-96231b3b80d8
This contains a fix for the issue that caused the revert:
we no longer assume that we can insert instructions after the
instruction that produces the base pointer. We previously
assumed that this would be ok, because the instruction produces
a value and therefore is not a terminator. This is false for invoke
instructions. We will now insert these new instruction directly
at the location of the users.
Original commit message:
[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs
Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.
InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.
This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.
Reviewers: majnemer, jmolloy
Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits
Differential Revision: http://reviews.llvm.org/D15146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257897 91177308-0d34-0410-b5e6-96231b3b80d8
There are several requirements that ended up with this design;
1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
2. Bitreversals and byteswaps are very related in their matching logic.
3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
4. Bswaps are best matched early in InstCombine.
The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.
We can then extend the matching logic in one place only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257875 91177308-0d34-0410-b5e6-96231b3b80d8
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.
r257550 commit message was:
```
The follow extra changes were made to test cases:
Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:
LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
LLVM :: DebugInfo/Generic/varargs.ll
Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):
LLVM :: DebugInfo/Generic/restrict.ll
LLVM :: DebugInfo/Generic/tu-composite.ll
LLVM :: Linker/type-unique-type-array-a.ll
LLVM :: Linker/type-unique-simple2.ll
Fixing an incorrect DW_OP_deref
LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll
Fixing a missing DW_OP_deref
LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll
Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.
The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.
One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.
Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
variable as that would depend on the target, even though this test is
supposed to be generic
- I had to manually declared size/align for reference type. See also the
discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``
```
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257850 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This pass may modify the Cmp operands. However, the flag reg may be used by both the branch and CSEL.
Modifying CMP will have side effect on CSEL.
Reviewers: t.p.northover
Subscribers: llvm-commits, aemerson, rengolin
Differential Revision: http://reviews.llvm.org/D16147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257844 91177308-0d34-0410-b5e6-96231b3b80d8
classes.
OrcRemoteTargetClient::RCMemoryManager will now register EH frames with the
server automatically. This allows remote-execution of code that uses exceptions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257816 91177308-0d34-0410-b5e6-96231b3b80d8
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.
Differential Revision: http://reviews.llvm.org/D14829
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257800 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: If SROA creates only one piece (e.g. because the other is not needed),
it still needs to create a bit_piece expression if that bit piece is smaller
than the original size of the alloca.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16187
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257795 91177308-0d34-0410-b5e6-96231b3b80d8
Since r230276, we support an improved legalization for f64->f16,
which goes through a temporary f32, improving codegen when
f32->f16 is legal but not f64->f16. This requires unsafe-fp-math.
However, that legalization assumed that the second step, producing
a pseudo-softened f16, had type i16. That's not true on targets
with illegal i16, such as ARM.
Use the initial f64->f16 result type instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257794 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The dbg.declare -> dbg.value conversion did not check which operand of
the store instruction the alloca was passed to. As a result code that stored the
address of an alloca, rather than storing to the alloca, would still trigger
the conversion routine, leading to the insertion of an incorrect dbg.value
intrinsic.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16169
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257787 91177308-0d34-0410-b5e6-96231b3b80d8
Some patterns of select+compare allow us to know exactly the value of the uppermost bits in the select result. For example:
%b = icmp ugt i32 %a, 5
%c = select i1 %b, i32 2, i32 %a
Here we know that %c is bounded by 5, and therefore KnownZero = ~APInt(5).getActiveBits() = ~7.
There are several such patterns, and this patch attempts to understand a reasonable subset of them - namely when the base values are the same (as above), and when they are related by a simple (add nsw), for example (add nsw %a, 4) and %a.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257769 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Since globals may escape as function arguments (even when they have been
found to be non-escaping, because of optimizations such as memcpyoptimizer
that replaces stores with memcpy), all arguments to a function are checked
during query to make sure they are identifiable. At that time, also ensure
we return a conservative result only if the arguments don't alias to our global.
Reviewers: hfinkel, jmolloy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16140
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257750 91177308-0d34-0410-b5e6-96231b3b80d8
We rely on HasOpaqueSPAdjustment not changing after we've calculated
things based on it. Things like whether or not we can use 'rep;movs' to
copy bytes around, that sort of thing. If it changes, invariants in the
backend will quietly break. This situation arose when we had a call to
memcpy *and* a COPY of the FLAGS register where we would attempt to
reference local variables using %esi, a register that was clobbered by
the 'rep;movs'.
This fixes PR26124.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257730 91177308-0d34-0410-b5e6-96231b3b80d8
Clang generates good display names for codeview since r255744, and the
change to make LLVM use them was accidentally included in r257658.
This change just updates the comments and test case to reflect reality
better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257723 91177308-0d34-0410-b5e6-96231b3b80d8
platforms.
With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.
This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257719 91177308-0d34-0410-b5e6-96231b3b80d8
Previous implementation in http://reviews.llvm.org/D10522
created external references to __emutls_v.* variables.
Such references are inaccurate and cannot be handled by
all linkers, e.g. Android dynamic and gold linkers for aarch64.
Now a new LowerEmuTLS pass to go through all global variables,
and add emutls_v.* and emutls_t.* variables.
These __emutls* variables have the same linkage and
visibility as the associated user defined TLS variable.
Also removed old code that dump __emutls* variables in AsmPrinter.cpp,
and updated TLS unit tests.
Differential Revision: http://reviews.llvm.org/D15300
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257718 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a detailed profile summary in llvm-profdata. The summary is in the
form of one or more triples of the form (P, N, M) which is interpreted as if
we look at the Top-N counts in the profile, their sum accounts for P percentage
of the sum of all counts in the program and the minimum count in the Top-N is M.
Differential Revision: http://reviews.llvm.org/D16005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257680 91177308-0d34-0410-b5e6-96231b3b80d8
This rewrites and expands the existing codeview dumping functionality in
llvm-readobj using techniques similar to those in lib/Object. This defines a
number of new records and enums useful for reading memory mapped codeview
sections in COFF objects.
The dumper is intended as a testing tool for LLVM as it grows more codeview
output capabilities.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D16104
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257658 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It is off by default, but can be used
with --misched=si
Patch by: Axel Davy
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D11885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257609 91177308-0d34-0410-b5e6-96231b3b80d8
The global entry point prologue currently assumes that the TOC
associated with a function is less than 2GB away from the function
entry point. This is always true when using the medium or small
code model, but may not be the case when using the large code model.
This patch adds a new variant of the ELFv2 global entry point prologue
that lifts the 2GB restriction when building with -mcmodel=large.
This works by emitting a quadword containing the distance from the
function entry point to its associated TOC immediately before the
entry point, and then using a prologue like:
ld r2,-8(r12)
add r2,r2,r12
Since creation of the entry point prologue is now split across two
separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits
the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog
code), I've switched to using named labels instead of just temporaries
to indicate the locations of the global and local entry points and the
new TOC offset data word.
These names are provided by new routines in PPCFunctionInfo modeled
after the existing PPCFunctionInfo::getPICOffsetSymbol.
Note that a corresponding change was committed to GCC here:
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D15500
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257597 91177308-0d34-0410-b5e6-96231b3b80d8
Make x86 OptimizeLEAs pass remove LEA instruction if there is another LEA
(in the same basic block) which calculates address differing only be a
displacement. Works only for -Oz.
Differential Revision: http://reviews.llvm.org/D13295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257589 91177308-0d34-0410-b5e6-96231b3b80d8
This patch turns off the fast-math optimization attribute on the caller
if the callee's fast-math attribute is not turned on.
For example,
- before inlining
caller: "less-precise-fpmad"="true"
callee: "less-precise-fpmad"="false"
- after inlining
caller: "less-precise-fpmad"="false"
Alternatively, it's possible to block inlining if the caller's and
callee's attributes don't match. If this approach is preferable to the
one in this patch, we can discuss post-commit.
rdar://problem/19836465
Differential Revision: http://reviews.llvm.org/D7802
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257575 91177308-0d34-0410-b5e6-96231b3b80d8
AnalyzeBranch on X86 (and, previously, SPARC, which implementation was
copied from X86) tries to modify the branches based on block
layout (e.g. checking isLayoutSuccessor), when AllowModify is true.
The rest of the architectures leave that up to the caller, which can
call InsertBranch, RemoveBranch, and ReverseBranchCondition as
appropriate. That appears to be the preferred way to do it nowadays.
This commit makes SPARC like the rest: replaces AnalyzeBranch with an
implementation cribbed from AArch64, and adds a ReverseBranchCondition
implementation.
Additionally, a test-case has been added (also cribbed from AArch64)
demonstrating that redundant branch sequences no longer get emitted.
E.g., it used to emit code like this:
bne .LBB1_2
nop
ba .LBB1_1
nop
.LBB1_2:
And now emits:
cmp %i0, 42
be .LBB1_1
nop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257572 91177308-0d34-0410-b5e6-96231b3b80d8
The version numbers of the darwin kernel are different from the version
numbers of OS X, so we need adjustments if we had "*-*-darwin" triples.
Use the existing utility functions in TargetTriple for this.
Fixes rdar://22056966
Differential Revision: http://reviews.llvm.org/D14601
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257555 91177308-0d34-0410-b5e6-96231b3b80d8
The line tables for CodeView make a distinction between expressions and
statements. As it turns out, MSVC always emits them as statements and
we always emit them as expressions. Let's switch to statements to match
the CodeView that they emit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257553 91177308-0d34-0410-b5e6-96231b3b80d8
This change has us print out fields we didn't previously understand. To
improve readability, we now group column information with it's
respective line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257552 91177308-0d34-0410-b5e6-96231b3b80d8
The follow extra changes were made to test cases:
Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:
LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
LLVM :: DebugInfo/Generic/varargs.ll
Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):
LLVM :: DebugInfo/Generic/restrict.ll
LLVM :: DebugInfo/Generic/tu-composite.ll
LLVM :: Linker/type-unique-type-array-a.ll
LLVM :: Linker/type-unique-simple2.ll
Fixing an incorrect DW_OP_deref
LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll
Fixing a missing DW_OP_deref
LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll
Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.
The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.
One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.
Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
variable as that would depend on the target, even though this test is
supposed to be generic
- I had to manually declared size/align for reference type. See also the
discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257550 91177308-0d34-0410-b5e6-96231b3b80d8
to only print the first private header.
Which for Mach-O files only prints the Mach header and not the subsequent load
commands. Which is used by scripts to match what the darwin otool(1) with the
-h flag does without the -l flag.
For non-Mach-O files it has the same functionality as -private-headers (with
the trailing ’s’).
rdar://24158331
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257548 91177308-0d34-0410-b5e6-96231b3b80d8
VMOVs are not strictly speaking cheap, but they are as expensive as a vector
copy (VORR), so we should prefer rematerialization over splitting when it
applies.
rdar://problem/23754176
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257545 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext
to find a value to describe the variable (in the expectation that those
zext/sext instruction will go away later). However, those values do not
cover the entire variable and thus need a DW_OP_bit_piece.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16061
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257534 91177308-0d34-0410-b5e6-96231b3b80d8
CodeView, unlike DWARF, can associate code with a range of columns.
However, LLVM can only represent a single column position internally.
We used to claim that the end column and start column were the same
which yielded less than satisfactory results: we would stop printing at
the _beginning_ of the source expression! Instead, mark the column-end
as 'zero' to indicate that we don't have one (as per the documentation
for IDiaLineNumber::get_lineNumberEnd).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257528 91177308-0d34-0410-b5e6-96231b3b80d8
Add -no-integrated-as to this test, since it's testing inline asm strings
that aren't actually valid assembly syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257519 91177308-0d34-0410-b5e6-96231b3b80d8
Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle
the weighted predicates as well.
This latent bug was triggered by r255398, because it added use of the
branch-weighted predicates.
While here, switch over an enum instead of an int to get the compiler to enforce
totality in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257518 91177308-0d34-0410-b5e6-96231b3b80d8
A request has been made to the official registry, but an official value is
not yet available. This patch uses a temporary value in order to support
development. When an official value is recieved, the value of EM_WEBASSEMBLY
will be updated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257517 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the way labels are referenced. Instead of referencing the
basic-block label name (eg. .LBB0_0), instructions now just have an immediate
which indicates the depth in the control-flow stack to find a label to jump to.
This makes them much closer to what we expect to have in the binary encoding,
and avoids the problem of basic-block label names not being explicit in the
binary encoding.
Also, it terminates blocks and loops with end_block and end_loop instructions,
rather than basic-block label names, for similar reasons.
This will also fix problems where two constructs appear to have the same label,
because we no longer explicitly use labels, so consumers that need labels will
presumably create their own labels, and presumably they won't reuse labels
when they do.
This patch does make the code a little more awkward to read; as a partial
mitigation, this patch also introduces comments showing where the labels are,
and comments on each branch showing where it's branching to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257505 91177308-0d34-0410-b5e6-96231b3b80d8
The findExternalCalls routine ignores calls to functions already
defined in the dest module. This was not handling the case where
the definition in the current module is actually an alias to a
function call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257493 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes three bugs, in all of which state is not or incorrecly reset between
objects (i.e. when reusing the same pass manager to create multiple object
files):
1) AttributeSection needs to be reset to nullptr, because otherwise the backend
will try to emit into the old object file's attribute section causing a
segmentation fault.
2) MappingSymbolCounter needs to be reset, otherwise the second object file
will start where the first one left off.
3) The MCStreamer base class resets the Streamer's e_flags settings. Since
EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again
after the MCStreamer was rest.
Also rename Reset (uppser case) to EHReset to avoid confusion with
reset (lower case).
Reviewers: rengolin
Differential Revision: http://reviews.llvm.org/D15950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257473 91177308-0d34-0410-b5e6-96231b3b80d8
(64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64'
(a zero-extended 64-bit load).
However, a change in r248784 teaches the instruction combiner that only
the lower 64 bits of the input to a 128-bit vcvtph2ps are used. This means
the instruction combiner will ordinarily optimize away the upper 64-bit
insertelement instruction in the zero-extension and so we no longer select
the memory-register form. To fix this a new pattern has been added.
Differential Revision: http://reviews.llvm.org/D16067
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257470 91177308-0d34-0410-b5e6-96231b3b80d8
Function::copyAttributesFrom will copy the personality function, prefix
data and prolog data from the source function to the new function, and
is invoked when the IRMover copies the function prototype. This puts a
reference to a constant in the source module on a function in the dest
module, which causes an error when deleting the source module after
importing, since the personality function in the source module still has
uses (this would presumably also be an issue for the prologue and prefix
data). Remove the copies added to the dest copy when creating the new
prototype, as they are mapped properly when/if we link the function body.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257420 91177308-0d34-0410-b5e6-96231b3b80d8
Currently WebAssembly has two kinds of relocations; data addresses and
function addresses. This adds ELF relocations for them, as well as an
MC symbol kind to indicate which type of relocation is needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257416 91177308-0d34-0410-b5e6-96231b3b80d8
Currently we're unrolling loops more in minsize than in optsize, which
means -Oz will have a larger code size than -Os. That doesn't make any
sense.
This resolves the FIXME about this in LoopUnrollPass and extends the
optsize test to make sure we use the smaller threshold for minsize as
well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257402 91177308-0d34-0410-b5e6-96231b3b80d8
This is a continuation of adding FMF to call instructions:
http://reviews.llvm.org/rL255555
The intent of the patch is to preserve the current behavior of the transform except
that we use the sqrt instruction's 'fast' attribute as a trigger rather than the
function-level attribute.
But this raises a bug noted by the new FIXME comment.
In order to do this transform:
sqrt((x * x) * y) ---> fabs(x) * sqrt(y)
...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'.
If any of those ops is strict, we should bail out.
Differential Revision: http://reviews.llvm.org/D15937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257400 91177308-0d34-0410-b5e6-96231b3b80d8
The old lowering for uint_to_fp failed opencl conformance.
It might be OK for fast math mode, but I'm not sure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257393 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes the memory sanitizer origin store instrumentation for
array types. This can be triggered by cases where frontend lowers
function return to array type instead of aggregation.
For instance, the C code:
--
struct mypair {
int64_t x;
int y;
};
mypair my_make_pair(int64_t x, int y) {
mypair p;
p.x = x;
p.y = y;
return p;
}
int foo (int p)
{
mypair z = my_make_pair(p, 0);
return z.y + z.x;
}
--
It will be lowered with target set to aarch64-linux and -O0 to:
--
[...]
define i32 @_Z3fooi(i32 %p) #0 {
[...]
%call = call [2 x i64] @_Z12my_make_pairxi(i64 %conv, i32 0)
%1 = bitcast %struct.mypair* %z to [2 x i64]*
store [2 x i64] %call, [2 x i64]* %1, align 8
[...]
--
The origin store will emit a 'icmp' to test each store value again the
TLS origin array. However since 'icmp' does not support ArrayType the
memory instrumentation phase will bail out with an error.
This patch change it by using the same strategy used for struct type on
array.
It fixes the 'test/msan/insertvalue_origin.cc' for aarch64 (the -O0 case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257375 91177308-0d34-0410-b5e6-96231b3b80d8
intermittent XPASSes on some builders.
These can be reinstated when we have proper support for small-code model in
the JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257359 91177308-0d34-0410-b5e6-96231b3b80d8
The hardware instruction's output on 0 is -1 rather than 32.
Eliminate a test and select to -1. This removes an extra instruction
from the compatability function with HSAIL's firstbit instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257352 91177308-0d34-0410-b5e6-96231b3b80d8
The new ORC remote-JITing support provides a superset of the old code's
functionality, so we can replace the old stuff. As a bonus, a couple of
previously XFAILed tests have started passing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257343 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It actually takes an offset into the current PC-region.
This fixes the 'expr' command in lldb.
Reviewers: vkalintiris, jaydeep, bhushan
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D16054
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257339 91177308-0d34-0410-b5e6-96231b3b80d8
This is a recommit of r257253 which was reverted in r257270.
Previous testcase can make failure on some targets due to using opt with O3 option.
Original Summary:
Merge MBBICommon and MBBI's MMOs.
Differential Revision: http://reviews.llvm.org/D15990
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257317 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The code was simply ensuring that the catchpad's pred is its catchswitch,
which was letting cases slip through where the flow edge was the unwind
edge of the catchswitch rather than one of its catch clauses.
Reviewers: andrew.w.kaylor, rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257275 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Funclet-based EH personalities/tables likely can't handle these, and they
can't be generated at source, so make them officially illegal in IR as
well.
Reviewers: andrew.w.kaylor, rnk, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15963
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257274 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
A funclet EH pad may be exited by an unwind edge, which may be a
cleanupret exiting its cleanuppad, an invoke exiting a funclet, or an
unwind out of a nested funclet transitively exiting its parent. Funclet
EH personalities require all such exceptional exits from a given funclet to
have the same unwind destination, and EH preparation / state numbering /
table generation implicitly depends on this. Formalize it as a rule of
the IR in the LangRef and verifier.
Reviewers: rnk, majnemer, andrew.w.kaylor
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257273 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Funclet EH personalities require a tree-like nesting among funclets
(enforced by the ParentPad linkage in the IR), and also require that
unwind edges conform to certain rules with respect to the tree:
- An unwind edge may exit 0 or more ancestor pads
- An unwind edge must enter exactly one EH pad, which must be distinct
from any exited pads
- A cleanupret's edge must exit its cleanuppad
Describe these rules in the LangRef, and enforce them in the verifier.
Reviewers: rnk, majnemer, andrew.w.kaylor
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15961
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257272 91177308-0d34-0410-b5e6-96231b3b80d8
AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through any bitcast to check for a load node to allow broadcasts to occur.
This is a re-commit of r257055 after r257264 fixed 32-bit broadcast loads of i64 scalars.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257266 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is analogous to r256079, which removed an overly strong assertion, and
r256812, which simplified the code by replacing three conditionals by one.
Reviewers: reames
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D16019
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257250 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates.
The motivation for this is that we recently found a bug in the vector splitting code where depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is actually an incredibly common order to visit instructions in in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications.
Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR.
At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested.
Differential Revision: http://reviews.llvm.org/D15982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257244 91177308-0d34-0410-b5e6-96231b3b80d8
Look for PHI/Select in the same BB of the form
bb:
%p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ...
%s = select p, trueval, falseval
And expand the select into a branch structure. This later enables
jump-threading over bb in this pass.
Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold
select if the associated PHI has at least one constant. If the unfolded
select is not jump-threaded, it will be folded again in the later
optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257198 91177308-0d34-0410-b5e6-96231b3b80d8
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257191 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
r255334 matches bit-reverse pattern in InstCombine and generates calls to Instrinsic::bitreverse.
RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5.
Patch by Z. Zheng <zhaoshiz@codeaurora.org>
Reviewers: apazos, jmolloy, weimingz
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D15932
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257188 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
During legalization if i16, do not ASSERTZEXT the result of FP_TO_FP16.
Directly return an FP_TO_FP16 node with return type as the
promote-to-type of i16.
This patch also removes extraneous length check. This legalization
should be valid even if integer and float types are of different
lengths.
This patch breaks a hard-float test for fp16 args. The test is changed
to allow a vmov to zero-out the top bits, and also ensure that the
return value is in an FP register.
Reviewers: ab, jmolloy
Subscribers: srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D15438
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257184 91177308-0d34-0410-b5e6-96231b3b80d8
StackColoring rewrites the frame indicies of operations involving
allocas if it can find that the life time of two objects do not overlap.
MSVC EH needs to be kept aware of this if happens in the event that a
catch object has moved around. However, we represent the non-existance
of a catch object with a sentinel frame index (INT_MAX). This sentinel
also happens to be the EmptyKey of the SlotRemap DenseMap. Testing for
whether or not we need to translate the frame index fails in this case
because we call the count method on the DenseMap with the EmptyKey,
leading to assertions. Instead, check if it is our sentinel value
before trying to look into the DenseMap.
This fixes PR26073.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257182 91177308-0d34-0410-b5e6-96231b3b80d8
The function importer was still materializing metadata when modules were
loaded for function importing. We only want to materialize it when we
are going to invoke the metadata linking postpass. Materializing it
before function importing is not only unnecessary, but also causes
metadata referenced by imported functions to be mapped in early, and
then not connected to the rest of the module level metadata when it is
ultimately linked in.
Augmented the test case to specifically check for the metadata being
properly connected, which it wasn't before this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257171 91177308-0d34-0410-b5e6-96231b3b80d8
In setInsertionPoint if the value is not a PHI, Instruction or
Argument it should be a Constant, not a ConstantExpr.
Original commit message:
[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs
Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.
InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.
This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.
Reviewers: majnemer, jmolloy
Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits
Differential Revision: http://reviews.llvm.org/D15146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257164 91177308-0d34-0410-b5e6-96231b3b80d8
a top-down manner into a true top-down or RPO pass over the call graph.
There are specific patterns of function attributes, notably the
norecurse attribute, which are most effectively propagated top-down
because all they us caller information.
Walk in RPO over the call graph SCCs takes the form of a module pass run
immediately after the CGSCC pass managers postorder walk of the SCCs,
trying again to deduce norerucrse for each singular SCC in the call
graph.
This removes a very legacy pass manager specific trick of using a lazy
revisit list traversed during finalization of the CGSCC pass. There is
no analogous finalization step in the new pass manager, and a lazy
revisit list is just trying to produce an RPO iteration of the call
graph. We can do that more directly if more expensively. It seems
unlikely that this will be the expensive part of any compilation though
as we never examine the function bodies here. Even in an LTO run over
a very large module, this should be a reasonable fast set of operations
over a reasonably small working set -- the function call graph itself.
In the future, if this really is a compile time performance issue, we
can look at building support for both post order and RPO traversals
directly into a pass manager that builds and maintains the PO list of
SCCs.
Differential Revision: http://reviews.llvm.org/D15785
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257163 91177308-0d34-0410-b5e6-96231b3b80d8
Windows EH keeping track of which frame index corresponds to a catchpad
in order to inform the runtime where the catch parameter should be
initialized. LLVM's optimizations are able to prove that the memory
used by the catch parameter can be reused with another memory
optimization, changing it's frame index.
We need to keep WinEHFuncInfo up to date with respect to this or we will
miscompile/assert.
This fixes PR26069.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257158 91177308-0d34-0410-b5e6-96231b3b80d8
Done in InstrProfWriter to eliminate the need for client
code to do the sorting. The operation is done once and reused
many times so it is more efficient. Update unit test to remove
sorting. Also update expected output of affected tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257145 91177308-0d34-0410-b5e6-96231b3b80d8
This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839.
For a PIC TLS variable access in a function, prologue (mflr followed by std and
stdu) gets scheduled after a tls_get_addr call. tls_get_addr messed up LR but
no one saves/restores it.
Also added a test for save/restore clobbered registers during calling __tls_get_addr.
Patch by Tim Shen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257137 91177308-0d34-0410-b5e6-96231b3b80d8
The early return seems to be missed. This causes a radical and wrong loop
optimization on powerpc. It isn't reproducible on x86_64, because
"UseInterleaved" is false.
Patch by Tim Shen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257134 91177308-0d34-0410-b5e6-96231b3b80d8
.zero is confusing when used with two arguments. Documentation:
This directive emits SIZE 0-valued bytes. SIZE must be an absolute
expression. This directive is actually an alias for the '.skip'
directive so in can take an optional second argument of the value to
store in the bytes instead of zero. Using '.zero' in this way would be
confusing however.
Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=18353
Hexagon and Sparc do the same, and it's all the same to WebAssembly so
let's pick the less confusing of the two.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257111 91177308-0d34-0410-b5e6-96231b3b80d8
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257107 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.
One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.
Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
variable as that would depend on the target, even though this test is
supposed to be generic
- I had to manually declared size/align for reference type. See also the
discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257105 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In rL242338, debugger tuning was introduced, and the tuning for FreeBSD
was set to lldb by default. However, for the foreseeable future we
still need to default to gdb tuning, since lldb is not ready for all of
FreeBSD's architectures, and some system tools (like objcopy, etc) have
not yet been adapted to cope with the lldb tuned format, which has
.apple sections.
Therefore, let FreeBSD use gdb by default for now.
Reviewers: emaste, probinson
Subscribers: llvm-commits, emaste
Differential Revision: http://reviews.llvm.org/D15966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257103 91177308-0d34-0410-b5e6-96231b3b80d8
We marked values which are 'undef' as constant instead of undefined
which violates SCCP's invariants. If we can figure out that a
computation results in 'undef', leave it in the undefined state.
This fixes PR16052.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257102 91177308-0d34-0410-b5e6-96231b3b80d8
The fix for PR23999 made us mark loads of null as producing the constant
undef which upsets the lattice. Instead, keep the load as "undefined".
This fixes PR26044.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257087 91177308-0d34-0410-b5e6-96231b3b80d8
The MC assembler doesn't like using the empty string as a private label
prefix because then it treats all labels as private. This commit reverts
back to the default prefix, which is .L, which is common in ELF targets
and consistent with the LLVM name mangler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257083 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.
Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.
Some tests are updated, note that the fsub.ll test explicitly checks that
the move is elided.
With the IR generated by current Mesa, the changes are obviously relatively
minor:
7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)
Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)
Reviewers: tstellarAMD, arsenm, mareko
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15875
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Somehow, I first interpreted the docs as saying space for xnack_mask is only
reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and
went back to actually test what is happening, and it turns out that xnack_mask
is always reserved at least on Tonga and Carrizo, in the sense that flat_scr
is always fixed below the SGPRs that are used to implement xnack_mask, whether
or not they are actually used.
I confirmed this by writing a shader using inline assembly to tease out the
aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where
we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so
xnack_mask is s[76:77] and vcc is s[78:79]).
This patch changes both the calculation of the total number of SGPRs and the
various register reservations to account for this.
It ought to be possible to use the gap left by xnack_mask when the feature
isn't used, but this patch doesn't try to do that. (Note that the same applies
to vcc.)
Note that previously, even before my earlier change in r256794, the SGPRs that
alias to xnack_mask could end up being used as well when flat_scr was unused
and the total number of SGPRs happened to fall on the right alignment
(e.g. highest regular SGPR being used s29 and VCC used would lead to number
of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there
were some conflict due to such aliasing, we should have noticed that already.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15898
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257073 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.
InstCombine was already doing this when comparing two GEPs if the
base pointers were the same. However, in the case where we have
complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs,
conversions to or from integers, etc) the value of the original
base pointer will be hidden to the optimizer and this transformation
will be disabled.
This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the
relevant uses of GEPs to GEPs with a common base pointer. The
GEP comparison will be converted to a comparison done on indices.
Reviewers: majnemer, jmolloy
Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits
Differential Revision: http://reviews.llvm.org/D15146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257064 91177308-0d34-0410-b5e6-96231b3b80d8
See PR25822 for a more full summary, but we were conflating the concepts of "capture" and "escape". We were proving nocapture and using that proof to infer noescape, which is not true. Escaped-ness is a function-local property - as soon as a value is used in a call argument it escapes. Capturedness is a related but distinct property. It implies a *temporally limited* escape. Consider:
static int a;
int b;
int g(int * nocapture arg);
int f() {
a = 2; // Even though a escapes to g, it is not captured so can be treated as non-escaping here.
g(&a); // But here it must be treated as escaping.
g(&b); // Now that g(&a) has returned we know it was not captured so we can treat it as non-escaping again.
}
The original commit did not sufficiently understand this nuance and so caused PR25822 and PR26046.
r248576 included both a performance improvement (which has been backed out) and a related conformance fix (which has been kept along with its testcase).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257058 91177308-0d34-0410-b5e6-96231b3b80d8
AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through bitcasts to check for a load node to allow broadcasts to occur.
Follow up to D15310
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257055 91177308-0d34-0410-b5e6-96231b3b80d8
Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257046 91177308-0d34-0410-b5e6-96231b3b80d8
Darwin TLS accesses most closely resemble ELF's general-dynamic situation,
since they have to be able to handle all possible situations. The descriptors
and so on are obviously slightly different though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257039 91177308-0d34-0410-b5e6-96231b3b80d8
Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them. Add a couple of tests which show this working.
Also, add a triple to the same test file to hopefully fix a failing bot.
It turns out we do han
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257025 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong.
The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate.
Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases.
Differential Revision: http://reviews.llvm.org/D15632
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257022 91177308-0d34-0410-b5e6-96231b3b80d8
Follow-up to r257000: DIImportedEntity can reach a DISubprogram via
its entity, but also via its scope. Handle the latter case as well.
PR26037.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257019 91177308-0d34-0410-b5e6-96231b3b80d8
We need to know whether or not a given basic block is in a loop for the analysis
to be correct.
Loop information may be incomplete on irreducible CFGs, therefore we may
generate incorrect code if we use it in those situations.
This fixes PR25988.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257012 91177308-0d34-0410-b5e6-96231b3b80d8
It is illegal to have a null entity in a DIImportedEntity, so
we must link in a DISubprogram metadata node referenced by one,
even if the associated function is not linked in or inlined anywhere.
Fixes PR26037.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257000 91177308-0d34-0410-b5e6-96231b3b80d8
This is a conservative fix, I expect Amaury to relax this.
Follow-up for r256923
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256999 91177308-0d34-0410-b5e6-96231b3b80d8
With r256990, bogner introduced comprehensive tests for constant arrays
and vectors. We no longer need the existing ones because they are
redundant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256991 91177308-0d34-0410-b5e6-96231b3b80d8
I added a couple of tests in r256982, but vedantk suggested that they
fit better into compatibility.ll, since they could catch format breaks
later on there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256990 91177308-0d34-0410-b5e6-96231b3b80d8
In r254991 I allowed ConstantDataVectors to contain elements of
HalfTy, but I missed updating the bitcode reader and writer to handle
this, so now we crash if we try to emit bitcode on programs that have
constant vectors of half.
This fixes the issue and adds test coverage for reading and writing
constant sequences in bitcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256982 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is admittedly something that you could only run into by manually
playing around with shader assembly because the SITypeWriter pass is
skipped for compute.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15902
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256980 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch adds a check in SplitLandingPadPredecessors to see if the original landingpad instruction has any uses. If not, we don't need to create a PHINode for it in the joint block since it's gonna be a dead code anyway. The motivation for this patch is that we found a bug that SplitLandingPadPredecessors created a PHINode of token type landingpad, which failed the verifier since PHINode can not be token type. However, the created PHINode will never be used in our code pattern. This patch will workaround this bug, and we might add supports in SplitLandingPadPredecessors to handle token type landingpad with uses in the future.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15835
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256972 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: As per title. This will allow the optimizer to pick up on it.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15923
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256969 91177308-0d34-0410-b5e6-96231b3b80d8
TLS calls need the stack frame to be properly set up and this
implies that such calls need ADJUSTSTACK_xxx markers.
Fixes PR25820.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256959 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM_ENABLE_TIMESTAMPS controls if timestamps are embedded into llvm's
binaries. Turning it off is useful for deterministic builds.
r246905 made it so that the define suddenly also controls if the binaries that
the llvm binaries _create_ embed timestamps or not – but this shouldn't be a
configure-time option. r256203/r256204 added a driver option to toggle this on
and off, so this patch now passes this driver option in LLVM_ENABLE_TIMESTAMPS
builds so that if LLVM_ENABLE_TIMESTAMPS is set, the build of LLVM is
deterministic – but the built clang can still write timestamps into other
executables when requested.
This also allows removing some of the test machinery added in r292012 to work
around this problem.
See PR24740 for background.
http://reviews.llvm.org/D15783
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256958 91177308-0d34-0410-b5e6-96231b3b80d8
FindIDom() can fail in two different ways - it can either return nullptr or the
block itself, depending on the circumstances. Some users of FindIDom() check
one error condition, while others check the other.
Change it to always return nullptr on failure.
This fixes PR26004.
Differential Revision: http://reviews.llvm.org/D15847
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256955 91177308-0d34-0410-b5e6-96231b3b80d8
The first instruction in a block is what the rend() iterator points to, so
if it moves, we need to re-evaluate rend() so that we continue to iterate
through the rest of the instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256953 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc.
Examples:
-print-after-all -print-funcs=foo,bar
Reviewers: mcrosier, joker.eph
Subscribers: tejohnson, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D15776
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256952 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In buildSchedGraph(), when adding memory dependencies for loads, move
the call to adjustChainDeps() after the call to
addChainDependency(AliasChain) to handle the case where
addChainDependency(AliasChain) ends up not adding a dependency and
instead putting the SU on the RejectMemNodes list. The call to
adjustChainDeps() must be done after the call to addChainDependency() in
order to process the SU added to the RejectMemNodes list to create
memory dependencies for it.
Reviewers: hfinkel, atrick, jonpa, resistor
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D15927
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256950 91177308-0d34-0410-b5e6-96231b3b80d8
In general, disabling comments in the output reduces the chances of a
CHECK line accidentally matching a comment instead of its intended text.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256946 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This reverts commit 5a9e526f29.
As per discussion in D15665
This also add a test case so that regression introduced by that diff are not reintroduced.
Reviewers: vaivaswatha, jmolloy, hfinkel, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256932 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that alias the load, as long as no operation alias the store.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15903
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256923 91177308-0d34-0410-b5e6-96231b3b80d8
Most of the properties of memset_pattern16 can be now covered by the generic attributes and inferred by InferFunctionAttrs. The only exceptions are:
- We don't yet have a writeonly attribute for the first argument.
- We don't have an attribute for modeling the access size facts encoded in MemoryLocation.cpp.
Differential Revision: http://reviews.llvm.org/D15879
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256911 91177308-0d34-0410-b5e6-96231b3b80d8
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.
This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.
Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.
Differential Revision: http://reviews.llvm.org/D15544
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256890 91177308-0d34-0410-b5e6-96231b3b80d8