As noted in D34071, there are some IR optimization opportunities that could be
handled by normal IR passes if this expansion wasn't happening so late in CGP.
Regardless of that, it seems wasteful to knowingly produce suboptimal IR here,
so I'm proposing this change:
  %s = sub i32 %x, %y
  %r = icmp ne i32 %s, 0
    =>
  %r = icmp ne i32 %x, %y
Changing the predicate to 'eq' mimics what InstCombine would do, so that's just
an efficiency improvement if we decide this expansion should happen sooner.
The fact that the PowerPC backend doesn't eliminate the 'subf.' might be
something for PPC folks to investigate separately.
Differential Revision: https://reviews.llvm.org/D34416
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306471 91177308-0d34-0410-b5e6-96231b3b80d8
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.
At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306470 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently this replacement can end up substituting the same register
as the original. Avoid restarting the loop when there has been no
change in the register uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306441 91177308-0d34-0410-b5e6-96231b3b80d8
- DenseMap should be faster than std::map
- Use the `InsertRes = insert(...); if (!InsertRes.inserted)` pattern rather
  than `if (!X.contains(...)) { X.insert(...); }` to save one map
  lookup.
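For illustration, a minimal standalone sketch of the single-lookup pattern with
DenseMap (the names Cache, computeValue, and lookupOrCompute are made up for
this example, not taken from the patch):

  #include "llvm/ADT/DenseMap.h"

  static int computeValue(int Key) { return Key * 2; } // stand-in computation

  int lookupOrCompute(llvm::DenseMap<int, int> &Cache, int Key) {
    // One hash lookup: insert a placeholder and check whether the key was new.
    auto InsertRes = Cache.insert({Key, 0});
    if (InsertRes.second)                    // key was not present before
      InsertRes.first->second = computeValue(Key);
    return InsertRes.first->second;
    // versus two lookups: if (!Cache.count(Key)) Cache.insert({Key, ...});
  }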
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306436 91177308-0d34-0410-b5e6-96231b3b80d8
When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set the dereferenceable flag on the newly created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, and it may also miss optimization opportunities.
This patch sets the dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable.
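A hedged sketch of the idea (a fragment only, not the exact patch; the Loads
variable is assumed to hold the LoadSDNodes being merged):

  // Only mark the merged load dereferenceable if every load being merged
  // carries the MODereferenceable flag on its memory operand.
  MachineMemOperand::Flags MMOFlags = MachineMemOperand::MOLoad;
  if (llvm::all_of(Loads, [](const LoadSDNode *Ld) {
        return Ld->getMemOperand()->isDereferenceable();
      }))
    MMOFlags |= MachineMemOperand::MODereferenceable;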
Differential Revision: https://reviews.llvm.org/D34679
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306404 91177308-0d34-0410-b5e6-96231b3b80d8
Remove invalid shortcut in fixupKills(): A register needs to be marked
live even when we are not adding a kill flag. This is because a
partially live register must not get a kill flag, but it still needs to
be fully marked live when walking backwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306352 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes bug 33597.
Use of substituteRegister in the tied operand case messes
up the register use iterator, causing some uses to be left
unprocessed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306333 91177308-0d34-0410-b5e6-96231b3b80d8
This is the dual problem to legalizing G_INSERTs so most of the code and
testing was cribbed from there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306328 91177308-0d34-0410-b5e6-96231b3b80d8
Revert "[MBP] do not rotate loop if it creates extra branch"
It breaks the sanitizer build bots. Need to fix this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306276 91177308-0d34-0410-b5e6-96231b3b80d8
This is a last fix for the corner case of PR32214. Actually, it is not really a corner case in general.
We should not do a loop rotation if it creates an additional branch.
Consider the case where we have a loop chain H, M, B, C, where
H is the header, with a viable fallthrough from the pre-header and an exit from the loop,
M is some middle block,
B is the backedge to the header, but also with an exit from the loop,
C is some cold block of the loop.
Suppose H is determined to be the best exit. If we rotate the loop to M, B, C, H, we can introduce an extra branch.
Let's compute the change in the number of branches:
+1 branch from the pre-header to the header
-1 branch from the header to the exit
+1 branch from the header to the middle block, if there is one
-1 branch from the cold block to the header, if there is one
So if C is not a predecessor of H, then we introduce an extra branch.
This change prohibits rotation of the loop if both of the following are true:
1) The best exit has the next element in the chain as a successor.
2) The last element in the chain is not a predecessor of the first element of the chain.
Reviewers: iteratee, xur
Reviewed By: iteratee
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34271
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306272 91177308-0d34-0410-b5e6-96231b3b80d8
The compiler fails with an assertion during legalization of SETCC for <3 x i8> operands.
The result is extended to <4 x i8> and then truncated to <4 x i1>. This does not happen on AVX2, because the final result of SETCC is <4 x i32>.
Differential Revision: https://reviews.llvm.org/D34503
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306242 91177308-0d34-0410-b5e6-96231b3b80d8
When SelectionDAG expands a memcpy (or memmove) call into a sequence of load and store instructions, it disregards the dereferenceable flag even when the source pointer is known to be dereferenceable.
This results in an assertion failure if SelectionDAG commonizes a load instruction generated for the memcpy with another load instruction for the source pointer.
This patch makes SelectionDAG set the dereferenceable flag for the load instructions properly to avoid the assertion failure.
Differential Revision: https://reviews.llvm.org/D34467
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306209 91177308-0d34-0410-b5e6-96231b3b80d8
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306120 91177308-0d34-0410-b5e6-96231b3b80d8
G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to
be taught how to emulate it with alternatives. We use G_MERGE_VALUES where
possible, and a sequence of G_INSERTs if not.
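A rough sketch of the emulation strategy (simplified, with assumed helper and
parameter names; the real code lives in MachineIRBuilder::buildSequence and
does proper type and size checking through MRI):

  #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
  #include "llvm/CodeGen/MachineRegisterInfo.h"
  using namespace llvm;

  void emulateSequence(MachineIRBuilder &MIRBuilder, MachineRegisterInfo &MRI,
                       unsigned Res, ArrayRef<unsigned> Ops,
                       ArrayRef<uint64_t> Indices,
                       bool OpsFormContiguousMerge) {
    if (OpsFormContiguousMerge) {
      // All pieces are the same size and tile the result exactly.
      MIRBuilder.buildMerge(Res, Ops);
      return;
    }
    // Otherwise start from an undef value and insert each piece at its offset.
    unsigned Cur = MRI.createGenericVirtualRegister(MRI.getType(Res));
    MIRBuilder.buildUndef(Cur);
    for (unsigned i = 0, e = Ops.size(); i != e; ++i) {
      unsigned Next = (i + 1 == e)
                          ? Res
                          : MRI.createGenericVirtualRegister(MRI.getType(Res));
      MIRBuilder.buildInsert(Next, Cur, Ops[i], Indices[i]);
      Cur = Next;
    }
  }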
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306119 91177308-0d34-0410-b5e6-96231b3b80d8
Move GlobalAddress offset decomposition from the initial match into the
comparison check, and remove the possibility of constructing a new
offsetted global address when examining addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305917 91177308-0d34-0410-b5e6-96231b3b80d8
Convert to range-based for loops in the machine scheduler.
This makes the code neater and easier to read,
and also keeps the machine scheduler implementation
up to date with C++11 features.
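As an illustration of the mechanical change (a made-up standalone example, not
the scheduler code itself):

  #include <vector>

  struct SUnit { unsigned NodeNum; };
  static void visit(SUnit &SU) { (void)SU; /* ... */ }

  void visitAll(std::vector<SUnit> &SUnits) {
    // Before: an explicit iterator loop.
    //   for (std::vector<SUnit>::iterator I = SUnits.begin(), E = SUnits.end();
    //        I != E; ++I)
    //     visit(*I);
    // After: the equivalent C++11 range-based for.
    for (SUnit &SU : SUnits)
      visit(SU);
  }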
Reviewed by: Matthias Braun
Differential Revision: https://reviews.llvm.org/D34320
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305887 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for combining a build vector into a shuffle when the build vector
consists of elements extracted from two vectors (vec1, vec2), where vec2 is
half the size of vec1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305883 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When we're building with XRay instrumentation, we use a trick that
preserves references from the function to a function sled index. This
index table lives in a separate section, and without this trick the
linker is free to garbage-collect this section and all the segments it
refers to. Until we're able to tell the linkers to preserve these
sections, we use this reference trick to keep around both the index and
the entries in the instrumentation map.
Before this change we emitted two synthetic references: one to the label in
the instrumentation map, and one to the entry in the function map index.
This change removes the first synthetic reference and only emits one
synthetic reference to the index -- the index entry has the references
to the labels in the instrumentation map, so the linker will still
preserve those if the function itself is preserved.
This reduces the amount of synthetic references we emit from 16 bytes to
just 8 bytes on x86_64, and similarly on other platforms.
Reviewers: dblaikie
Subscribers: javed.absar, kpw, pelikan, llvm-commits
Differential Revision: https://reviews.llvm.org/D34340
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305880 91177308-0d34-0410-b5e6-96231b3b80d8
Right now areMemoryOpsAliased has an assertion justified as:

  MMO1 should have a value due it comes from operation we'd like to use
  as implicit null check.

  assert(MMO1->getValue() && "MMO1 should have a Value!");
However, it is possible for that invariant to not be upheld in the
following situation (conceptually):
  Null check %RAX
NotNullSucc:
  %RAX = LEA %RSP, 16  // I0
  %RDX = MOV64rm %RAX  // I1
With the current code, we will have an early exit from
ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable.
However, I1 will look plausible (since it loads from %RAX) and
will go ahead and call areMemoryOpsAliased(I1, I0). This will cause
us to fail the assert mentioned above since I1 does not load from an
IR level value and thus is allowed to have a non-Value base address.
The fix is to bail out earlier whenever we see an unsuitable
instruction overwrite PointerReg. This guarantees that when we call
areMemoryOpsAliased, we are looking at an instruction that loads from
or stores to an IR-level value.
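A hedged sketch of what the earlier bail-out can look like (a fragment with
assumed variable names, not the exact patch):

  // While scanning NotNullSucc: once an unsuitable instruction clobbers
  // PointerReg, stop considering later memory operations entirely instead
  // of trying to reason about them across the clobber.
  if (Suitability == SR_Unsuitable && MI.modifiesRegister(PointerReg, TRI))
    return false;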
Original Patch Author: sanjoy
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34385
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305879 91177308-0d34-0410-b5e6-96231b3b80d8
The instruction it falls over on is an IMPLICIT_DEF that also happens
to be the only instruction in its lexical scope. That LexicalScope has
never been created because its range is empty. This patch skips over
all meta-instructions instead of just DBG_VALUEs.
Thanks to David Blaikie for providing a testcase!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305853 91177308-0d34-0410-b5e6-96231b3b80d8
This does some improvements/cleanup to the recently introduced
scavengeRegisterBackwards() functionality:
- Rewrite the findSurvivorBackwards algorithm to use the existing
  LiveRegUnit::accumulateBackward() code. This also avoids the Available
  and Candidates bitsets and needs just one LiveRegUnit instance
  (= 1 bitset).
- Pick registers in allocation order instead of register number order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305817 91177308-0d34-0410-b5e6-96231b3b80d8
We were incorrectly sign extending into the high word (as you would for
SMULO) when legalizing UMULO in terms of a wider full multiplication.
Patch by James Duley.
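The distinction in plain C++ terms (an illustrative standalone example, not
the legalizer code):

  #include <cstdint>

  // Checking 32-bit unsigned multiply overflow via a 64-bit multiply: the
  // operands must be zero-extended.  Sign extending them (as is correct for
  // the signed SMULO case) produces the wrong high word whenever an operand
  // has its top bit set.
  bool umulo32(uint32_t X, uint32_t Y, uint32_t &Lo) {
    uint64_t Wide = static_cast<uint64_t>(X) * static_cast<uint64_t>(Y);
    Lo = static_cast<uint32_t>(Wide);
    return (Wide >> 32) != 0;   // overflow iff the high word is non-zero
  }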
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305800 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As part of this change:
* Emitted instructions now have named MachineInstr variables associated
with them. This isn't particularly important yet but it's a small step
towards multiple-insn emission.
* constrainSelectedInstRegOperands() is no longer hardcoded. It's now added
as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses
an alternate constraint mechanism ConstrainOperandToRegClassAction() which
supports arbitrary constraints such as that defined by COPY_TO_REGCLASS.
Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar
Reviewed By: ab
Subscribers: javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305791 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In some cases legalization ends up with non-symmetric merge/unmerge nodes.
Transform them into symmetric merge/unmerge nodes.
Reviewers: t.p.northover, qcolombet, zvi
Reviewed By: t.p.northover
Subscribers: rovka, kristof.beyls, guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D33626
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305783 91177308-0d34-0410-b5e6-96231b3b80d8
The recursive implementation of CalcNodeSethiUllmanNumber may
overflow the stack on extremely long pred chains. This patch replaces it
with an equivalent iterative implementation.
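A minimal standalone sketch of the recursion-to-worklist idea (simplified node
type and numbering scheme, not the actual CalcNodeSethiUllmanNumber code):

  #include <vector>

  struct Node {
    std::vector<Node *> Preds;
    int SethiUllman = 0;                 // 0 means "not yet computed"
  };

  // Compute numbers with an explicit stack so that arbitrarily long pred
  // chains cannot overflow the call stack.
  void computeSethiUllman(Node *Root) {
    std::vector<Node *> Worklist{Root};
    while (!Worklist.empty()) {
      Node *N = Worklist.back();
      bool PredsReady = true;
      for (Node *P : N->Preds)
        if (P->SethiUllman == 0) {       // defer N until its preds are done
          Worklist.push_back(P);
          PredsReady = false;
        }
      if (!PredsReady)
        continue;
      Worklist.pop_back();
      if (N->SethiUllman != 0)
        continue;                        // already computed via another path
      int Max = 0, Extra = 0;            // simplified Sethi-Ullman merge
      for (Node *P : N->Preds) {
        if (P->SethiUllman > Max) { Max = P->SethiUllman; Extra = 0; }
        else if (P->SethiUllman == Max) ++Extra;
      }
      N->SethiUllman = Max + Extra;
      if (N->SethiUllman == 0)
        N->SethiUllman = 1;              // leaves get a number of 1
    }
  }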
Differential Revision: https://reviews.llvm.org/D33769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305775 91177308-0d34-0410-b5e6-96231b3b80d8
This is the last step needed to avoid regressions for x86 before we flip the switch to allow
expansion of the smallest set of memcmp() via CGP. The DAG version checks for constant strings,
so we need to do that here too.
FWIW, the two-constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that
code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some
one-constant tests because its alignment requirements are too strict (it shouldn't require
alignment for a constant operand).
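A hedged fragment of what such a check might look like (the variable names and
the exact bail-out condition are assumptions, not taken from the patch):

  // If a memcmp operand is a constant string, skip the CGP expansion and
  // leave the call for the constant-folding paths to handle.
  StringRef Str;
  if (getConstantStringInfo(CI->getArgOperand(0), Str) ||
      getConstantStringInfo(CI->getArgOperand(1), Str))
    return false;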
Differential Revision: https://reviews.llvm.org/D34071
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305734 91177308-0d34-0410-b5e6-96231b3b80d8
As all store merge checks are based on the memory operation
performed, allow the use of truncated stores and extended loads as valid
input candidates for merging.
Relanding after fixing selection between truncated and normal store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305701 91177308-0d34-0410-b5e6-96231b3b80d8
Use llvm::make_unique to avoid ambiguity with MSVC.
This patch adds a generic MacroFusion pass that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.
Differential Revision: https://reviews.llvm.org/D34144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305690 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch adds a generic MacroFusion pass that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.
Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB
Reviewed By: MatzeB
Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305677 91177308-0d34-0410-b5e6-96231b3b80d8
Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse
to place spills as the very first instruction of a basic block and thus
artificially increase pressure (test in
test/CodeGen/PowerPC/scavenging.mir:spill_at_begin).
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.
This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.
Differential Revision: http://reviews.llvm.org/D21885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305625 91177308-0d34-0410-b5e6-96231b3b80d8