The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.
Differential Revision: http://reviews.llvm.org/D14150
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253544 91177308-0d34-0410-b5e6-96231b3b80d8
Optimizations like LoadPRE in GVN will insert new instructions.
If the insertion point is in a already processed BB, they should
get a value number explicitly. If the insertion point is after
current instruction, then just leave it. However, current GVN framework
has no support for it.
In this patch, we just bail out if a VN can't be found.
Dfferential Revision: http://reviews.llvm.org/D14670
A test/Transforms/GVN/pr25440.ll
M lib/Transforms/Scalar/GVN.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253536 91177308-0d34-0410-b5e6-96231b3b80d8
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for vector constant folding of integer/float comparisons.
This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCASE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations)
Differential Revision: http://reviews.llvm.org/D14683
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253504 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out we decide whether to use SjLj exceptions or some alternative in
two separate places in the backend, and they disagreed with each other. This
led to inconsistent code and is generally a terrible idea.
So make them consistent and add an assert that they *do* match (unfortunately
MCAsmInfo isn't available in opt, so it can't be used to initialise the CodeGen
version directly).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253502 91177308-0d34-0410-b5e6-96231b3b80d8
This change introduces an instrumentation intrinsic instruction for
value profiling purposes, the lowering of the instrumentation intrinsic
and raw reader updates. The raw profile data files for llvm-profdata
testing are updated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253484 91177308-0d34-0410-b5e6-96231b3b80d8
These tests aren't testing that the result is valid syntax; they're testing
that the compiler emits the inline asm operands correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253469 91177308-0d34-0410-b5e6-96231b3b80d8
This also takes the push/pop syntax another step forward, introducing stack
slot numbers to make it easier to see how expressions are connected. For
example, the value pushed in $push7 is popped in $pop7.
And, this begins an experiment with making get_local and set_local implicit
when an operation directly uses or defines a register. This greatly reduces
clutter. If this experiment succeeds, it may make sense to do this for
const instructions as well.
And, this introduces more special code for ARGUMENTS; hopefully this code
will soon be obviated by proper support for live-in virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253465 91177308-0d34-0410-b5e6-96231b3b80d8
Starting on an input stream that is not at offset 0 would trigger the
assert in WinCOFFObjectWriter.cpp:1065:
assert(getStream().tell() <= (*i)->Header.PointerToRawData &&
"Section::PointerToRawData is insane!");
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253464 91177308-0d34-0410-b5e6-96231b3b80d8
The virtual register containing the address for returned value on
stack should in the DAG be represented with a CopyFromReg node and not
a Register node. Otherwise, InstrEmitter will not make sure that it
ends up in the right register class for the target instruction.
SystemZ needs this, becuause the reg class for address registers is a
subset of the general 64 bit register class.
test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with
-verify-machineinstrs.
Reviewed by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253461 91177308-0d34-0410-b5e6-96231b3b80d8
This time I've found a linux box and checked it there. This test now passes.
Because I'd introduced an undefined reference in @bar, gold now returns an error. This doesn't matter for the test itself, because it also emits the remarks the test is checking for. But it does cause LIT to notice a nonzero return code which it faults on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253454 91177308-0d34-0410-b5e6-96231b3b80d8
Let's try again. This time using the right function signature. It's a real pity I can't run this on a darwin machine...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253453 91177308-0d34-0410-b5e6-96231b3b80d8
It needs the same fixes as in test/LTO/X86/remarks.ll, but this test appears not to get run on my system (but does on the buildbot). Strange.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253452 91177308-0d34-0410-b5e6-96231b3b80d8
Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253451 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253438 91177308-0d34-0410-b5e6-96231b3b80d8
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.
It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.
There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
linker since it knowns where the dynamic relocations are.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253436 91177308-0d34-0410-b5e6-96231b3b80d8
Most linked executables do not have a symbol table in COFF.
However, it is pretty typical to have some export entries. Use those
entries to inform the disassembler about potential function definitions
and call targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253429 91177308-0d34-0410-b5e6-96231b3b80d8
When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.
if (...)
do {
B0;
... // some branches
B1;
} while(...);
else
B2;
B3;
Differential revision: http://reviews.llvm.org/D10825
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253414 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change tries to make the root cause of instrumented profile data merge failures clearer.
Previous:
$ llvm-profdata merge test_0.profraw test_1.profraw -o test_merged.profdata
test_1.profraw: foo: Function count mismatch
test_1.profraw: bar: Function count mismatch
test_1.profraw: baz: Function count mismatch
...
Changed:
$ llvm-profdata merge test_0.profraw test_1.profraw -o test_merged.profdata
test_1.profraw: foo: Function basic block count change detected (counter mismatch)
Make sure that all profile data to be merged is generated from the same binary.
test_1.profraw: bar: Function basic block count change detected (counter mismatch)
test_1.profraw: baz: Function basic block count change detected (counter mismatch)
...
Reviewers: dnovillo, davidxl, bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14739
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253384 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.
The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.
The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This
intrinsic generates no code and simply marks the alloca in use.
Reviewers: JCTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14668
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253378 91177308-0d34-0410-b5e6-96231b3b80d8
The instruction combiner previously removed types from filter clauses in Landing Pad instructions if the type had previously been seen in a catch clause. This is incorrect and prevents unexpected exception handlers from rethrowing the caught type.
Differential Revision: http://reviews.llvm.org/D14669
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253370 91177308-0d34-0410-b5e6-96231b3b80d8
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call arguments types that should be checked.
Differential Revision: http://reviews.llvm.org/D14693
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253363 91177308-0d34-0410-b5e6-96231b3b80d8
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.
I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.
From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%
I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change
Differential Revision: http://reviews.llvm.org/D14743
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253349 91177308-0d34-0410-b5e6-96231b3b80d8
SELECT_CC has the nasty property of having operands with unrelated
types. So if you do something like:
f32 = select_cc f16, f16, f32, f32, cc
You'd only look for the action for <select_cc, f32>, but never f16.
If the types are all legal, but the op isn't (as for f16 on AArch64,
or for f128 on x86_64/AArch64?), then you get into trouble.
For f128, we have softenSetCCOperands to handle this case.
Similarly, for f16, we can directly promote the CC operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253344 91177308-0d34-0410-b5e6-96231b3b80d8
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with mutli-def targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253339 91177308-0d34-0410-b5e6-96231b3b80d8
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.
This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.
Differential Revision: http://reviews.llvm.org/D14709
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253338 91177308-0d34-0410-b5e6-96231b3b80d8