The usual CodeGenPrepare trickery, on a target-specific intrinsic.
Without this, the expansion of atomics will usually have the zext
be hoisted out of the loop, defeating the various patterns we have
to catch this precise case.
Differential Revision: http://reviews.llvm.org/D9930
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238054 91177308-0d34-0410-b5e6-96231b3b80d8
We changed the test to test non-constant values in r238049.
We can also use CHECK-NEXT to be a little stricter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238052 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238051 91177308-0d34-0410-b5e6-96231b3b80d8
The problem was that I slipped a change required for shrink-wrapping, namely I
used getFirstTerminator instead of the getLastNonDebugInstr that was here before
the refactoring, whereas the surrounding code is not yet patched for that.
Original message:
[X86] Refactor the prologue emission to prepare for shrink-wrapping.
- Add a late pass to expand pseudo instructions (tail call and EH returns).
Instead of doing it in the prologue emission.
- Factor some static methods in X86FrameLowering to ease code sharing.
NFC.
Related to <rdar://problem/20821487>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238035 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for the ISA 2.07 additions involving the
branch history rolling buffer and event-based branching. These will
not be used by typical applications, so built-in support is not
required. They will only be available via inline assembly.
Assembly/disassembly tests are included in the patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238032 91177308-0d34-0410-b5e6-96231b3b80d8
MachO and COFF quite reasonably only define the size for common symbols.
We used to try to figure out the "size" by computing the gap from one symbol to
the next.
This would not be correct in general, since a part of a section can belong to no
visible symbol (padding, private globals).
It was also really expensive, since we would walk every symbol to find the size
of one.
If a caller really wants this, it can sort all the symbols once and get all the
gaps ("size") in O(n log n) instead of O(n^2).
On MachO this also has the advantage of centralizing all the checks for an
invalid n_sect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238028 91177308-0d34-0410-b5e6-96231b3b80d8
The list of subtarget features for the 7em triple contains 't2xtpk',
which actually disables that subtarget feature. Correct that to
'+t2xtpk' and test that the instructions enabled by that feature do
actually work.
Differential Revision: http://reviews.llvm.org/D9936
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238022 91177308-0d34-0410-b5e6-96231b3b80d8
This change does a few things:
- Move some InstCombine transforms to InstSimplify
- Run SimplifyCall from within InstCombine::visitCallInst
- Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237995 91177308-0d34-0410-b5e6-96231b3b80d8
PR23608 pointed out that using the preheader to gain a context instruction isn't always legal because a loop might not have a preheader. When looking into that, I realized that using the preheader to determine legality for sinking is questionable at best. Given no test covers that case and the original commit didn't seem to intend it, I restructured the code to only ask context sensative queries for hoising of loads and stores. This is effectively a partial revert of 237593.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237985 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
x = &a[i];
y = &a[i + j];
=>
y = x + j;
along with some refactoring work such as extracting method
findClosestMatchingDominator.
Depends on D9786 which provides the ScalarEvolution::getGEPExpr interface.
Test Plan: nary-gep.ll
Reviewers: meheff, broune
Reviewed By: broune
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9802
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237971 91177308-0d34-0410-b5e6-96231b3b80d8
A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform
into undef instead of %X.
This fixes PR23624.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237968 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This supersedes http://reviews.llvm.org/D4010, hopefully properly
dealing with the JIT case and also adds an actual test case.
DwarfContext was basically already usable for the JIT (and back when
we were overwriting ELF files it actually worked out of the box by
accident), but in order to resolve relocations correctly it needs
to know the load address of the section.
Rather than trying to get this out of the ObjectFile or requiring
the user to create a new ObjectFile just to get some debug info,
this adds the capability to pass in that info directly.
As part of this I separated out part of the LoadedObjectInfo struct
from RuntimeDyld, since it is now required at a higher layer.
Reviewers: lhames, echristo
Reviewed By: echristo
Subscribers: vtjnash, friss, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D6961
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237961 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is a 2nd attempt at committing the initial MIR serialization patch.
The first commit (r237708) made the incremental buildbots unstable and was
reverted in r237730. The original commit didn't add a terminating null
character to the LLVM IR source which was passed to LLParser, and this
sometimes caused the test 'llvmIR.mir' to fail with a parsing error because
the LLVM IR source didn't have a null character immediately after the end
and thus LLLexer encountered some garbage characters that ultimately caused
the error.
This commit also includes the other test fixes I committed in
r237712 (llc path fix) and r237723 (remove target triple) which
also got reverted in r237730.
--Original Commit Message--
MIR Serialization: print and parse LLVM IR using MIR format.
This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR
using the MIR format. This pass is then added as a last pass when a
'stop-after' option is used in llc. The new library adds the initial
functionality for parsing of MIR files as well. This commit also
extends the llc tool so that it can recognize and parse MIR input files.
Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames
Differential Revision: http://reviews.llvm.org/D9616
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237954 91177308-0d34-0410-b5e6-96231b3b80d8
My recent patch to add support for ISA 2.07 vector pack/unpack
instructions didn't properly check for availability of the vpkudum
instruction when recognizing it as a special vector shuffle case.
This causes us to leave the vector shuffle in place (rather than
converting it to a vector permute) so that it can be recognized later
as a vpkudum, but that pattern is invalid for processors prior to
POWER8. Thus LLVM crashes with an "unable to select" message. We
observed this since one of our buildbots is configured to generate
code for a POWER7.
This patch fixes the problem by checking for availability of the
vpkudum instruction during custom lowering of vector shuffles.
I've added a test case variant for the vpkudum pattern when the
instruction isn't available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237952 91177308-0d34-0410-b5e6-96231b3b80d8
so DWARF skeleton CUs can be expression in IR. A skeleton CU is a
(typically empty) DW_TAG_compile_unit that has a DW_AT_(GNU)_dwo_name and
a DW_AT_(GNU)_dwo_id attribute. It is used to refer to external debug info.
This is a prerequisite for clang module debugging as discussed in
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html.
In order to refer to external types stored in split DWARF (dwo) objects,
such as clang modules, we need to emit skeleton CUs, which identify the
dwarf object (i.e., the clang module) by filename (the SplitDebugFilename)
and a hash value, the dwo_id.
This patch only contains the IR changes. The idea is that a CUs with a
non-zero dwo_id field will be emitted together with a DW_AT_GNU_dwo_name
and DW_AT_GNU_dwo_id attribute.
http://reviews.llvm.org/D9488
rdar://problem/20091852
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237949 91177308-0d34-0410-b5e6-96231b3b80d8
On X86 (and similar OOO cores) unrolling is very limited, and even if the
runtime unrolling is otherwise profitable, the expense of a division to compute
the trip count could greatly outweigh the benefits. On the A2, we unroll a lot,
and the benefits of unrolling are more significant (seeing a 5x or 6x speedup
is not uncommon), so we're more able to tolerate the expense, on average, of a
division to compute the trip count.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237947 91177308-0d34-0410-b5e6-96231b3b80d8
http://reviews.llvm.org/D9891
Following up on the VSX single precision loads and stores added earlier, this
adds support for elementary arithmetic operations on single precision values
in VSX registers. These instructions utilize the new VSSRC register class.
Instructions added:
xsaddsp
xsdivsp
xsmulsp
xsresp
xsrsqrtesp
xssqrtsp
xssubsp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237937 91177308-0d34-0410-b5e6-96231b3b80d8
Predicate UseAVX depricates pattern selection on AVX-512.
This predicate is necessary for DAG selection to select EVEX form.
But mapping SSE intrinsics to AVX-512 instructions is not ready yet.
So I replaced UseAVX with HasAVX for intrinsics patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237903 91177308-0d34-0410-b5e6-96231b3b80d8
One of the testcases introduced by D9365 had incorrect !dereferenceable metadata on load. It must fail but it doesn't due to incorrect order of CHECK/CHECK-NOT commands in test. Fixed both.
Reviewed By: sanjoy
Differential Revision: http://reviews.llvm.org/D9877
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237897 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization).
It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner.
Differential Revision: http://reviews.llvm.org/D9848
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237885 91177308-0d34-0410-b5e6-96231b3b80d8
We had not been trying hard enough to resolve def names inside multiclasses
that had complex concatenations, etc. Now we'll try harder.
Patch by Amaury Sechet!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237877 91177308-0d34-0410-b5e6-96231b3b80d8
In effect a partial revert of r237858, which was a dumb shortcut.
Looking at the dependencies of the destination should be the proper
fix: if the new memset would depend on anything other than itself,
the transformation isn't correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237874 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR23599, another miscompile introduced by r235232: when there is
another dependency on the destination of the created memset (i.e., the
part of the original destination that the memcpy doesn't depend on)
between the memcpy and the original memset, we would insert the created
memset after the memcpy, and thus after the other dependency.
Instead, insert the created memset right after the old one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237858 91177308-0d34-0410-b5e6-96231b3b80d8
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.
Discussed with Tim Northover on IRC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237837 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases it won't get cleaned up properly leading to crashes
downstream. PR23353.
Based on a patch by Davide Italiano.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237828 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.
> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237821 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
During icmp lowering it can happen that a constant value can be larger than expected (see the code around the change).
APInt::getMinSignedBits() must be checked again as the shift before can change the constant sign to positive.
I'm not sure it is the best fix possible though.
Test Plan: Regression test included.
Reviewers: resistor, chandlerc, spatel, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D9147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237812 91177308-0d34-0410-b5e6-96231b3b80d8
fixed extract-insert i1 element,
load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage.
added a bunch of new tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237793 91177308-0d34-0410-b5e6-96231b3b80d8
It works, but I've noticed that I missed several callers of createMCAsmInfo()
and many don't have a TargetMachine to provide.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237792 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
-check-prefix replaces the default CHECK prefix rather than adding to it and
must be explicitly re-added.
Also added the N32 cases.
Reviewers: petarj
Reviewed By: petarj
Subscribers: tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9668
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237790 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For N32/N64, private labels begin with '.L' but for O32 they begin with '$'.
MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: jfb, sandeep, llvm-commits, rafael
Differential Revision: http://reviews.llvm.org/D9821
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237789 91177308-0d34-0410-b5e6-96231b3b80d8
This change implements support for lowering of the gc.relocates tied to the invoke statepoint.
This is acomplished by storing frame indices of the lowered values in "StatepointRelocatedValues" map inside FunctionLoweringInfo instead of storing them in per-basic block structure StatepointLowering.
After this change StatepointLowering is used only during "LowerStatepoint" call and it is not necessary to store it as a field in SelectionDAGBuilder anymore.
Differential Revision: http://reviews.llvm.org/D7798
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237786 91177308-0d34-0410-b5e6-96231b3b80d8
We know that _tls_index is zero for local-exec TLS variables because
they are always defined in the executable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237772 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a new GC strategy for supporting the CoreCLR runtime.
This strategy is currently identical to Statepoint-example GC,
but is necessary for several upcoming changes specific to CoreCLR, such as:
1. Base-pointers not explicitly reported for interior pointers
2. Different format for stack-map encoding
3. Location of Safe-point polls: polls are only needed before loop-back edges and before tail-calls (not needed at function-entry)
4. Runtime specific handshake between calls to managed/unmanaged functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237753 91177308-0d34-0410-b5e6-96231b3b80d8
We were special casing a handful of intrinsics as not needing a safepoint before them. After running into another valid case - memset - I took a closer look and realized that almost no intrinsics need to have a safepoint poll before them. Restructure the code to make that apparent so that we stop hitting these bugs. The only intrinsics which need a safepoint poll before them are ones which can run arbitrary code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237744 91177308-0d34-0410-b5e6-96231b3b80d8