addresses a longstanding deficiency noted in many FIXMEs scattered
across all the targets.
This effectively moves the problem up one level, replacing eleven
FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path
through FastISel where we actually supply a DebugLoc, fixing Radar
7421831.
llvm-svn: 106243
Given a copy instruction, CoalescerPair can determine which registers to
coalesce in order to eliminate the copy. It deals with all the subreg fun to
determine a tuple (DstReg, SrcReg, SubIdx) such that:
- SrcReg is a virtual register that will disappear after coalescing.
- DstReg is a virtual or physical register whose live range will be extended.
- SubIdx is 0 when DstReg is a physical register.
- SrcReg can be joined with DstReg:SubIdx.
CoalescerPair::isCoalescable() determines if another copy instruction is
compatible with the same tuple. This fixes some NEON miscompilations where
shuffles are getting coalesced as if they were copies.
The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy
later.
llvm-svn: 105997
the Profile method. Currently this only works with the default FoldingSetTraits
implementation.
The point of this is to allow nodes to not store context values which are only
used during profiling. A better solution would thread this value through the
folding algorithms, but then those would need to be (1) templated and
(2) non-opaque.
llvm-svn: 105819
with gcc-4.6. The warning is wrong, since Sub *is* used (perhaps gcc is confused
because the use of Sub is constant folded away?), but since it is trivial to avoid,
and massively reduces the amount of warning spew, just workaround the wrong warning.
llvm-svn: 105788
'class llvm::DAGDeltaAlgorithm' has virtual functions and accessible non-virtual destructor
Not sure if this is the best solution, but this class has state and some of the
classes that inherit from it also do, so it looks appropriate.
llvm-svn: 105675
scrounging through SCEVUnknown contents and SCEVNAryExpr operands;
instead just do a simple deterministic comparison of the precomputed
hash data.
Also, since this is more precise, it eliminates the need for the slow
N^2 duplicate detection code.
llvm-svn: 105540
encapsulation to force the users of these classes to know about the internal
data structure of the Operands structure. It also can lead to errors, like in
the MSIL writer.
llvm-svn: 105539
In file included from X86InstrInfo.cpp:16:
X86GenInstrInfo.inc:2789: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2790: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2792: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2793: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2808: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2809: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2816: error: integer constant is too large for 'long' type
X86GenInstrInfo.inc:2817: error: integer constant is too large for 'long' type
llvm-svn: 105524
instruction defines subregisters.
Any existing subreg indices on the original instruction are preserved or
composed with the new subreg index.
Also substitute multiple operands mentioning the original register by using the
new MachineInstr::substituteRegister() function. This is necessary because there
will soon be <imp-def> operands added to non read-modify-write partial
definitions. This instruction:
%reg1234:foo = FLAP %reg1234<imp-def>
will reMaterialize(%reg3333, bar) like this:
%reg3333:bar-foo = FLAP %reg333:bar<imp-def>
Finally, replace the TargetRegisterInfo pointer argument with a reference to
indicate that it cannot be NULL.
llvm-svn: 105358
implementation that is correct for most targets. Tablegen will override where
needed.
Add MachineOperand::subst{Virt,Phys}Reg methods that correctly handle existing
subreg indices when sustituting registers.
llvm-svn: 104985
optimization level.
This only really affects llc for now because both the llvm-gcc and clang front
ends override the default register allocator. I intend to remove that code later.
llvm-svn: 104904
implementing pop with a linear search for a "best" element. The priority
queue was a neat idea, but in practice the comparison functions depend
on dynamic information.
llvm-svn: 104718
A Register with subregisters must also provide SubRegIndices for adressing the
subregisters. TableGen automatically inherits indices for sub-subregisters to
minimize typing.
CompositeIndices may be specified for the weirder cases such as the XMM sub_sd
index that returns the same register, and ARM NEON Q registers where both D
subregs have ssub_0 and ssub_1 sub-subregs.
It is now required that all subregisters are named by an index, and a future
patch will also require inherited subregisters to be named. This is necessary to
allow composite subregister indices to be reduced to a single index.
llvm-svn: 104704
A Register with subregisters must also provide SubRegIndices for adressing the
subregisters. TableGen automatically inherits indices for sub-subregisters to
minimize typing.
CompositeIndices may be specified for the weirder cases such as the XMM sub_sd
index that returns the same register, and ARM NEON Q registers where both D
subregs have ssub_0 and ssub_1 sub-subregs.
It is now required that all subregisters are named by an index, and a future
patch will also require inherited subregisters to be named. This is necessary to
allow composite subregister indices to be reduced to a single index.
llvm-svn: 104654
structure that represents a mapping without any dependencies on SubRegIndex
numbering.
This brings us closer to being able to remove the explicit SubRegIndex
numbering, and it is now possible to specify any mapping without inventing
*_INVALID register classes.
llvm-svn: 104563
This is the beginning of purely symbolic subregister indices, but we need a bit
of jiggling before the explicit numeric indices can be completely removed.
llvm-svn: 104492
that are aliases of the specified register.
- Rename modifiesRegister to definesRegister since it's looking a def of the
specific register or one of its super-registers. It's not looking for def of a
sub-register or alias that could change the specified register.
- Added modifiesRegister to look for defs of aliases.
llvm-svn: 104377
reads or writes a register.
This takes partial redefines and undef uses into account.
Don't actually use it yet. That caused miscompiles.
llvm-svn: 104372
If the size of the string is greater than the zero fill size, the function will attempt to write a very large string of zeros to the object file (~4GB on 32 bit platforms). This assertion will catch the scenario and crash the program before the write occurs.
llvm-svn: 104334
isn't ideal if we want to be able to use another object file format.
Add a createObjectStreamer() factory method so that the correct object
file streamer can be instantiated for a given target triple.
llvm-svn: 104318
pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot
of long latency instructions so a strict register pressure reduction
scheduler does not work well.
Early experiments show this speeds up some NEON loops by over 30%.
llvm-svn: 104216
partial redefines.
We are going to treat a partial redefine of a virtual register as a
read-modify-write:
%reg1024:6 = OP
Unless the register is fully clobbered:
%reg1024:6 = OP, %reg1024<imp-def>
MachineInstr::readsVirtualRegister() knows the difference. The first case is a
read, the second isn't.
llvm-svn: 104149
- Of questionable utility, since in general anything which wants to do this should probably be within a target specific hook, which can rely on the sections being of the appropriate type. However, it can be useful for short term hacks.
llvm-svn: 103980
variable has not yet been used in an expression. This allows us to support a few
cases that show up in real code (mostly because gcc generates it for Objective-C
on Darwin), without giving up a reasonable semantic model for assignment.
llvm-svn: 103950
instructions.
e.g.
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1027<def> = EXTRACT_SUBREG %reg1026, 6
%reg1028<def> = EXTRACT_SUBREG %reg1026<kill>, 5
...
%reg1029<def> = REG_SEQUENCE %reg1028<kill>, 5, %reg1027<kill>, 6, %reg1028, 7, %reg1027, 8, %reg1028, 9, %reg1027, 10, %reg1030<kill>, 11, %reg1032<kill>, 12
After REG_SEQUENCE is eliminated, we are left with:
%reg1026<def> = VLDMQ %reg1025<kill>, 260, pred:14, pred:%reg0
%reg1029:6<def> = EXTRACT_SUBREG %reg1026, 6
%reg1029:5<def> = EXTRACT_SUBREG %reg1026<kill>, 5
The regular coalescer will not be able to coalesce reg1026 and reg1029 because it doesn't
know how to combine sub-register indices 5 and 6. Now 2-address pass will consult the
target whether sub-registers 5 and 6 of reg1026 can be combined to into a larger
sub-register (or combined to be reg1026 itself as is the case here). If it is possible,
it will be able to replace references of reg1026 with reg1029 + the larger sub-register
index.
llvm-svn: 103835
the variable actually tracks.
N.B., several back-ends are using "HasCalls" as being synonymous for something
that adjusts the stack. This isn't 100% correct and should be looked into.
llvm-svn: 103802
on RAUW of functions, this is a correctness issue instead of a mere memory
usage problem.
No testcase until the new MergeFunctions can land.
llvm-svn: 103653
- This provides a convenient alternative to using something llvm::prior or
manual iterator access, for example::
if (T *Prev = foo->getPrevNode())
...
instead of::
iterator it(foo);
if (it != begin()) {
--it;
...
}
- Chris, please review.
llvm-svn: 103647
be diced into atoms, and adjust getAtom() to take this into account.
- This fixes relocations to symbols in fixed size literal sections, for
example.
llvm-svn: 103532
and the others use the regular addPassesToEmitFile hook now, and
llc no longer needs a bunch of redundant code to handle the
whole-file case.
llvm-svn: 103492
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.
llvm-svn: 103481
- This eliminates getAtomForAddress() (which was a linear search) and
simplifies getAtom().
- This also fixes some correctness problems where local labels at the same
address as non-local labels could be assigned to the wrong atom.
llvm-svn: 103480
string of features for that target. However LTO was using that string to pass
into the "create target machine" stuff. That stuff needed the feature string to
be in a particular form. In particular, it needed the CPU specified first and
then the attributes. If there isn't a CPU specified, it required it to be blank
-- e.g., ",+altivec". Yuck.
Modify the getDefaultSubtargetFeatures method to be a non-static member
function. For all attributes for a specific subtarget, it will add them in like
normal. It will also take a CPU string so that it can satisfy this horrible
syntax.
llvm-svn: 103451
This includes a patch by Roman Divacky to fix the initial crash.
Move the actual addition of passes from *PassManager::add to
*PassManager::addImpl. That way, when adding printer passes we won't
recurse infinitely.
Finally, check to make sure that we are actually adding a FunctionPass
to a FunctionPassManager before doing a print before or after it.
Immutable passes are strange in this way because they aren't
FunctionPasses yet they can be and are added to the FunctionPassManager.
llvm-svn: 103425