Filling no-ops is done just before emitting of assembly,
when the instruction stream is final. No-ops are inserted
to align the instructions so the dual-issue of the pipeline
is utilized. This speeds up generated code with a minimum of
1% on a select set of algorithms.
This pass may be redundant if the instruction scheduler and
all subsequent passes that modify the instruction stream
(prolog+epilog inserter, register scavenger, are there others?)
are made aware of the instruction alignments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123226 91177308-0d34-0410-b5e6-96231b3b80d8
phi nodes. It is called from MergeBlockIntoPredecessor which is
called from GVN, which claims to preserve these.
I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123222 91177308-0d34-0410-b5e6-96231b3b80d8
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123206 91177308-0d34-0410-b5e6-96231b3b80d8
There's an inherent tension in DAGCombine between assuming
that things will be put in canonical form, and the Depth
mechanism that disables transformations when recursion gets
too deep. It would not surprise me if there's a lot of little
bugs like this one waiting to be discovered. The mechanism
seems fragile and I'd suggest looking at it from a design viewpoint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123191 91177308-0d34-0410-b5e6-96231b3b80d8
These functions not longer assert when passed 0, but simply return false instead.
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123155 91177308-0d34-0410-b5e6-96231b3b80d8
perform rounding other than truncation in the IR. Common C code for this
turns into really an LLVM intrinsic call that blocks a lot of further
optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123135 91177308-0d34-0410-b5e6-96231b3b80d8
a + {b,+,stride} into {a+b,+,stride} (because a is LIV),
then the resultant AddRec is NUW/NSW if the client says it
is.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123133 91177308-0d34-0410-b5e6-96231b3b80d8
void f(int* begin, int* end) { std::fill(begin, end, 0); }
which turns into a != exit expression where one pointer is
strided and (thanks to step #1) known to not overflow, and
the other is loop invariant.
The observation here is that, though the IV is strided by
4 in this case, that the IV *has* to become equal to the
end value. It cannot "miss" the end value by stepping over
it, because if it did, the strided IV expression would
eventually wrap around.
Handle this by turning A != B into "A-B != 0" where the A-B
part is known to be NUW.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123131 91177308-0d34-0410-b5e6-96231b3b80d8
when no virtual registers have been allocated.
It was only used to resize IndexedMaps, so provide an IndexedMap::resize()
method such that
Map.grow(MRI.getLastVirtReg());
can be replaced with the simpler
Map.resize(MRI.getNumVirtRegs());
This works correctly when no virtuals are allocated, and it bypasses the to/from
index conversions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123130 91177308-0d34-0410-b5e6-96231b3b80d8
physical register numbers.
This makes the hack used in LiveInterval official, and lets LiveInterval be
oblivious of stack slots.
The isPhysicalRegister() and isVirtualRegister() predicates don't know about
this, so when a variable may contain a stack slot, isStackSlot() should always
be tested first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123128 91177308-0d34-0410-b5e6-96231b3b80d8
without informing memdep. This could cause nondeterminstic weirdness
based on where instructions happen to get allocated, and will hopefully
breath some life into some broken testers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123124 91177308-0d34-0410-b5e6-96231b3b80d8
Also, switch to a more clear 'sink' function with its declaration to
avoid any confusion about 'g'. Thanks for the suggestion Frits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123113 91177308-0d34-0410-b5e6-96231b3b80d8
of using a Location class with the same information.
When making a copy of a MachineOperand that was already stored in a
MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise
the register use-def lists become inconsistent.
Add MachineOperand::clearParent() to do that. An alternative would be a custom
MachineOperand copy constructor that cleared ParentMI. I didn't want to do that
because of the performance impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123109 91177308-0d34-0410-b5e6-96231b3b80d8
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123107 91177308-0d34-0410-b5e6-96231b3b80d8
depending on TRI::FirstVirtualRegister.
Also use TRI::printReg instead of printing virtual registers directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123101 91177308-0d34-0410-b5e6-96231b3b80d8
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123098 91177308-0d34-0410-b5e6-96231b3b80d8
larger memsets. Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:
_mad_synth_mute: ## @mad_synth_mute
## BB#0: ## %entry
pushq %rax
movl $4096, %esi ## imm = 0x1000
callq ___bzero
popq %rax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123089 91177308-0d34-0410-b5e6-96231b3b80d8
that it was leaving in loops after rotation (between the original latch
block and the original header.
With this change, it is possible for rotated loops to have just a single
basic block, which is useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123075 91177308-0d34-0410-b5e6-96231b3b80d8
1. Rip out LoopRotate's domfrontier updating code. It isn't
needed now that LICM doesn't use DF and it is super complex
and gross.
2. Make DomTree updating code a lot simpler and faster. The
old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
SplitCriticalEdge instead of doing an overcomplex
reimplementation of it.
No behavior change, except for the name of the inserted preheader.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123072 91177308-0d34-0410-b5e6-96231b3b80d8
they all ready do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.
The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123064 91177308-0d34-0410-b5e6-96231b3b80d8
Add a unnamed_addr bit to global variables and functions. This will be used
to indicate that the address is not significant and therefore the constant
or function can be merged with others.
If an optimization pass can show that an address is not used, it can set this.
Examples of things that can have this set by the FE are globals created to
hold string literals and C++ constructors.
Adding unnamed_addr to a non-const global should have no effect unless
an optimization can transform that global into a constant.
Aliases are not allowed to have unnamed_addr since I couldn't figure
out any use for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123063 91177308-0d34-0410-b5e6-96231b3b80d8
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.
Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123060 91177308-0d34-0410-b5e6-96231b3b80d8
1. Take a flags argument instead of a bool. This makes
it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
more efficient. For lookup failures, don't drop null values
into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
and LoopUnswitch, kill it.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123058 91177308-0d34-0410-b5e6-96231b3b80d8
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123057 91177308-0d34-0410-b5e6-96231b3b80d8
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.
This allows memory instructions to be moved around INLINEASM instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123044 91177308-0d34-0410-b5e6-96231b3b80d8