When reporting an error for a defm, we would previously only report the
location of the outer defm, which is not always where the error is.
Now we also print the location of the expanded multiclass defs:
lib/Target/X86/X86InstrSSE.td:2902:12: error: foo
defm ADD : basic_sse12_fp_binop_s<0x58, "add", fadd, SSE_ALU_ITINS_S>,
^
lib/Target/X86/X86InstrSSE.td:2801:11: note: instantiated from multiclass
defm PD : sse12_fp_packed<opc, !strconcat(OpcodeStr, "pd"), OpNode, VR128,
^
lib/Target/X86/X86InstrSSE.td:194:5: note: instantiated from multiclass
def rm : PI<opc, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2),
^
llvm-svn: 162409
TableGen sometimes synthesizes missing sub-register indexes. Emit these
indexes as enumerators in the target namespace along with the
user-defined ones.
Also take this opportunity to stop creating new Record objects for
synthetic indexes.
llvm-svn: 161964
Now that the weird X86 sub_ss and sub_sd sub-register indexes are gone,
there is no longer a need for the CompositeIndices construct in .td
files. Sub-register index composition can be specified on the
SubRegIndex itself using the ComposedOf field.
Also enforce unique names for sub-registers in TableGen. The same
sub-register cannot be available with multiple sub-register indexes.
llvm-svn: 160842
Register units are already used internally in TableGen to compute
register pressure sets and overlapping registers. This patch makes them
available to the code generators.
The register unit lists are differentially encoded so they can be reused
for many related registers. This keeps the total size of the lists below
200 bytes for most targets. ARM has the largest table at 560 bytes.
Add an MCRegUnitIterator for traversing the register unit lists. It
provides an abstract interface so the representation can be changed in
the future without changing all clients.
llvm-svn: 157650
TableGen already computes register units as the basic unit of
interference. We can use that to compute the set of overlapping
registers.
This means that we can easily compute overlap sets for one register at a
time. There is no benefit to computing all registers at once.
llvm-svn: 156960
Besides the weight, we also want to store up to two root registers per
unit. Most units will have a single root, the leaf register they
represent. Units created for ad hoc aliasing get two roots: The two
aliasing registers.
The root registers can be used to compute the set of overlapping
registers.
llvm-svn: 156792
Register units can be used to compute if two registers overlap:
A overlaps B iff units(A) intersects units(B).
With this change, the above holds true even on targets that use ad hoc
aliasing (currently only ARM). This means that register units can be
used to implement regsOverlap() more efficiently, and the register
allocator can use the concept to model interference.
When there is no ad hoc aliasing, the register units correspond to the
maximal cliques in the register overlap graph. This is optimal, no other
register unit assignment can have fewer units.
With ad hoc aliasing, weird things are possible, and we don't try too
hard to compute the maximal cliques. The current approach is always
correct, and it works very well (probably optimally) as long as the ad
hoc aliasing doesn't have cliques larger than pairs. It seems unlikely
that any target would need more.
llvm-svn: 156763
The ad hoc aliasing specified in the 'Aliases' list in .td files is
currently only used by computeOverlaps(). It will soon be needed to
build accurate register units as well, so build the undirected graph in
CodeGenRegister::buildObjectGraph() instead.
Aliasing is a symmetric relationship with only one direction specified
in the .td files. Make sure both directions are represented in
getExplicitAliases().
llvm-svn: 156762
TableGen creates new register classes and sub-register indices based on
the sub-register structure present in the register bank. So far, it has
been doing that on a per-register basis, but that is not very efficient.
This patch teaches TableGen to compute topological signatures for
registers, and use that to reduce the amount of redundant computation.
Registers get the same TopoSig if they have identical sub-register
structure.
TopoSigs are not currently exposed outside TableGen.
llvm-svn: 156761
Don't compute the SuperRegs list until the sub-register graph is
completely finished. This guarantees that the list of super-registers is
properly topologically ordered, and has no duplicates.
llvm-svn: 156629
The sub-registers explicitly listed in SubRegs in the .td files form a
tree. In a complicated register bank, it is possible to have
sub-register relationships across sub-trees. For example, the ARM NEON
double vector Q0_Q1 is a tree:
Q0_Q1 = [Q0, Q1], Q0 = [D0, D1], Q1 = [D2, D3]
But we also define the DPair register D1_D2 = [D1, D2] which is fully
contained in Q0_Q1.
This patch teaches TableGen to find such sub-register relationships, and
assign sub-register indices to them. In the example, TableGen will
create a dsub_1_dsub_2 sub-register index, and add D1_D2 as a
sub-register of Q0_Q1.
This will eventually enable the coalescer to handle copies of skewed
sub-registers.
llvm-svn: 156587
The .td files specify a tree of sub-registers. Store that tree as
ExplicitSubRegs lists in CodeGenRegister instead of extracting it from
the Record when needed.
llvm-svn: 156555
This mapping is for internal use by TableGen. It will not be exposed in
the generated files.
Unfortunately, the mapping is not completely well-defined. The X86 xmm
registers appear with multiple sub-register indices in the ymm
registers. This is because of the odd idempotent sub_sd and sub_ss
sub-register indices. I hope to be able to eliminate them entirely, so
we can require the sub-registers to form a tree.
For now, just place the canonical sub_xmm index in the mapping, and
ignore the idempotents.
llvm-svn: 156519
This is still a topological ordering such that every register class gets
a smaller enum value than its sub-classes.
Placing the smaller spill sizes first makes a difference for the
super-register class bit masks. When looking for a super-register class,
we usually want the smallest possible kind of super-register. That is
now available as the first bit set in the bit mask.
llvm-svn: 156222
This manually enumerated list of super-register classes has been
superceeded by the automatically computed super-register class masks
available through SuperRegClassIterator.
llvm-svn: 156151
This is a new algorithm that finds sets of register units that can be
used to model registers pressure. This handles arbitrary, overlapping
register classes. Each register class is associated with a (small)
list of pressure sets. These are the dimensions of pressure affected
by the register class's liveness.
llvm-svn: 154374
This is a new algorithm that associates registers with weighted
register units to accuretely model their effect on register
pressure. This handles registers with multiple overlapping
subregisters. It is possible, but almost inconceivable that the
algorithm fails to find an exact solution for a target description. If
an exact solution cannot be found, an inexact, but reasonable solution
will be chosen.
llvm-svn: 154373
First small step toward modeling multi-register multi-pressure. In the
future, register units can also be used to model liveness and
aliasing.
llvm-svn: 153794
It is simpler to define a composite index directly:
def ssub_2 : SubRegIndex<[dsub_1, ssub_0]>;
def ssub_3 : SubRegIndex<[dsub_1, ssub_1]>;
Than specifying the composite indices on each register:
CompositeIndices = [(ssub_2 dsub_1, ssub_0),
(ssub_3 dsub_1, ssub_1)] in ...
This also makes it clear that SubRegIndex composition is supposed to be
unique.
llvm-svn: 149556
The final tie breaker comparison also needs to return +/-1, or 0.
This is not a less() function.
This could cause otherwise identical super-classes to be ordered
unstably, depending on what the system qsort routine does with a bad
compare function.
llvm-svn: 149549
This class is used to represent SubRegIndex instances instead of the raw
Record pointers that were used before.
No functional change intended.
llvm-svn: 149418
When set, this bit indicates that a register is completely defined by
the value of its sub-registers.
Use the CoveredBySubRegs property to infer which super-registers are
call-preserved given a list of callee-saved registers. For example, the
ARM registers D8-D15 are callee-saved. This now automatically implies
that Q4-Q7 are call-preserved.
Conversely, Win64 callees save XMM6-XMM15, but the corresponding
YMM6-YMM15 registers are not call-preserved because they are not fully
defined by their sub-registers.
llvm-svn: 148363
Targets can now add CalleeSavedRegs defs to their *CallingConv.td file.
TableGen will use this to create a *_SaveList array suitable for
returning from getCalleeSavedRegs() as well as a *_RegMask bit mask
suitable for returning from getCallPreservedMask().
llvm-svn: 148346
Use information computed while inferring new register classes to emit
accurate, table-driven implementations of getMatchingSuperRegClass().
Delete the old manual, error-prone implementations in the targets.
llvm-svn: 146873
Teach TableGen to create the missing register classes needed for
getMatchingSuperRegClass() to return maximal results. The function is
still not auto-generated, so it still returns inexact results.
This produces these new register classes:
ARM:
QQPR_with_dsub_0_in_DPR_8
QQQQPR_with_dsub_0_in_DPR_8
X86:
GR64_with_sub_32bit_in_GR32_NOAX
GR64_with_sub_32bit_in_GR32_NOAX_and_GR32_NOSP
GR64_with_sub_16bit_in_GR16_NOREX
GR64_with_sub_32bit_in_GR32_NOAX_and_GR32_NOREX
GR64_TC_and_GR64_with_sub_32bit_in_GR32_NOAX
GR64_with_sub_32bit_in_GR32_NOAX_and_GR32_NOREX_NOSP
GR64_TCW64_and_GR64_with_sub_32bit_in_GR32_NOAX
GR64_TC_and_GR64_with_sub_32bit_in_GR32_NOAX_and_GR32_NOREX
GR64_with_sub_32bit_in_GR32_TC
GR64_with_sub_32bit_in_GR32_ABCD_and_GR32_NOAX
GR64_with_sub_32bit_in_GR32_NOAX_and_GR32_TC
GR64_with_sub_32bit_in_GR32_AD
GR64_with_sub_32bit_in_GR32_AD_and_GR32_NOAX
The other targets in the tree are not weird enough to be affected.
llvm-svn: 146872