77 Commits

Author SHA1 Message Date
Keno Fischer
5acb804311 [ExecutionDepsFix] Don't recurse over the CFG
Summary:
Use an explicit work queue instead, to avoid accidentally
causing stack overflows for input with very large CFGs.

Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D31681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299569 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 17:42:56 +00:00
Keno Fischer
54eda94389 [ExecutionDepsFix] Don't revisit true dependencies
If an instruction has a true dependency, it makes sense for to use that
register for any undef read operands in the same instruction (we'll have
to wait for that register to become available anyway). This logic
was already implemented. However, the code would then still try to
revisit that instruction and break the dependency (and always fail,
since by definition a true dependency has to be live before the
instruction). Avoid revisiting such instructions as a performance
optimization. No functional change.

Differential Revision: https://reviews.llvm.org/D30173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299467 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 20:30:47 +00:00
Matthias Braun
8ff4fe417f ExecutionDepsFix: Let targets specialize the pass; NFC
Let targets specialize the pass with the register class so we can get a
parameterless default constructor and can put the pass into the pass
registry to enable testing with -run-pass=.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298184 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-18 05:08:58 +00:00
Matthias Braun
76900ddfb6 ExecutionDepsFix: Normalize names; NFC
Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and
ExecutionDepsFix to the last one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298183 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-18 05:05:40 +00:00
Craig Topper
6a0edd3bec [ExecutionDepsFix] Don't make copies of LiveReg objects when collecting operands for soft instructions
Summary:
While collecting operands we make copies of the LiveReg objects which are stored in the LiveRegs array. If the instruction uses the same register multiple times we end up with multiple copies. Later we iterate through the collected list of LiveReg objects and merge DomainValues. In the process of doing this the merge function can change the contents of the original LiveReg object in the LiveRegs array, but not the copies that have been made. So when we get to the second usage of the register we end up seeing a stale copy of the LiveReg object.

To fix this I've stopped copying and now just store a pointer to the original LiveReg object. Another option might be to avoid adding the same register to the Regs array twice, but this approach seemed simpler.

The included test case exposes this bug due to an AVX-512 masked OR instruction using the same register for the passthru operand and one of the inputs to the OR operation.

Fixes PR30284.

Reviewers: RKSimon, stoklund, MatzeB, spatel, myatsina

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30242

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296260 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-25 18:12:25 +00:00
Craig Topper
732985157f [ExecutionDepsFix] Use range-based for loop. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296093 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 06:38:24 +00:00
Keno Fischer
2a3b42cf37 [ExecutionDepsFix] Improve clearance calculation for loops
Summary:
In revision rL278321, ExecutionDepsFix learned how to pick a better
register for undef register reads, e.g. for instructions such as
`vcvtsi2sdq`. While this revision improved performance on a good number
of our benchmarks, it unfortunately also caused significant regressions
(up to 3x) on others. This regression turned out to be caused by loops
such as:

PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT
      ^                                  |
      +----------------------------------+

In the previous version of the clearance calculation, we would visit
the blocks in order, remembering for each whether there were any
incoming backedges from blocks that we hadn't processed yet and if
so queuing up the block to be re-processed. However, for loop structures
such as the above, this is clearly insufficient, since the block B
does not have any unknown backedges, so we do not see the false
dependency from the previous interation's Def of xmm registers in B.

To fix this, we need to consider all blocks that are part of the loop
and reprocess them one the correct clearance values are known. As
an optimization, we also want to avoid reprocessing any later blocks
that are not part of the loop.

In summary, the iteration order is as follows:
Before: PH A B C D A'
Corrected (Naive): PH A B C D A' B' C' D'
Corrected (w/ optimization): PH A B C A' B' C' D

To facilitate this optimization we introduce two new counters for each
basic block. The first counts how many of it's predecssors have
completed primary processing. The second counts how many of its
predecessors have completed all processing (we will call such a block
*done*. Now, the criteria to reprocess a block is as follows:
    - All Predecessors have completed primary processing
    - For x the number of predecessors that have completed primary
      processing *at the time of primary processing of this block*,
      the number of predecessors that are done has reached x.

The intuition behind this criterion is as follows:
We need to perform primary processing on all predecessors in order to
find out any direct defs in those predecessors. When predecessors are
done, we also know that we have information about indirect defs (e.g.
in block B though that were inherited through B->C->A->B). However,
we can't wait for all predecessors to be done, since that would
cause cyclic dependencies. However, it is guaranteed that all those
predecessors that are prior to us in reverse postorder will be done
before us. Since we iterate of the basic blocks in reverse postorder,
the number x above, is precisely the count of the number of predecessors
prior to us in reverse postorder.

Reviewers: myatsina
Differential Revision: https://reviews.llvm.org/D28759

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293571 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:37:03 +00:00
Matthias Braun
f3e629e3ec LivePhysReg: Use reference instead of pointer in init(); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289002 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-08 00:15:51 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Matthias Braun
690a3cbc95 MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.

Differential Revision: http://reviews.llvm.org/D23850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279698 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 01:27:13 +00:00
Marina Yatsina
83260f2394 Fix for PR29010
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=29010
Root cause of the bug is that the register class of the machine instruction operand does not fully reflect if this registers that can be allocated.
Both for i386 and x86_64 the operand's register class is VR128RegClass and thus contains xmm0-xmm15, though in i386 we can only use xmm0-xmm8.
In order to get the actual allocable registers of the class we need to use RegisterClassInfo.

Differential Revision: https://reviews.llvm.org/D23613



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278954 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 19:07:40 +00:00
Marina Yatsina
653bfcb46d Fixing bug committed in rev. 278321
In theory the indices of RC (and thus the index used for LiveRegs) may differ from the indices of OpRC.
Fixed the code to extract the correct RC index.
OpRC contains the first X consecutive elements of RC, and thus their indices are currently de facto the same, therefore a test cannot be added at this point.

Differential Revision: https://reviews.llvm.org/D23491



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278923 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 11:40:21 +00:00
Marina Yatsina
ac9ca3bbe7 Avoid false dependencies of undef machine operands
This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref.

Pseudo example:

loop:
xmm0 = ...
xmm1 = vcvtsi2sdl eax, xmm0<undef>
... = inst xmm0
jmp loop

In this example, selecting xmm0 as the undef register creates false dependency between loop iterations.
This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction.
Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem.

Differential Revision: https://reviews.llvm.org/D22466



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278321 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 07:32:08 +00:00
Marina Yatsina
a264c851a4 ExecutionDepsFix - Fix bug in clearance calculation
The clearance calculation did not take into account registers defined as outputs or clobbers in inline assembly machine instructions because these register defs are implicit.

Differential Revision: http://reviews.llvm.org/D22580



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276266 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 12:37:07 +00:00
Duncan P. N. Exon Smith
567409db69 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 00:01:54 +00:00
Andrew Kaylor
7b7e9c726b Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268457 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 22:32:30 +00:00
Matthias Braun
02073cb41c livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 00:24:32 +00:00
Matthias Braun
b4756d6b2d LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268336 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 00:08:46 +00:00
Derek Schuff
9b3da26fa8 Add MachineFunctionProperty checks for AllVRegsAllocated for target passes
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.

Reviewers: qcolombet

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265313 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-04 17:09:25 +00:00
Sanjay Patel
c32ece6aee use range-based for-loops; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256566 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-29 17:15:22 +00:00
Duncan P. N. Exon Smith
9cb8e12ce1 CodeGen: Start removing implicit conversions to/from list iterators, NFC
Start removing implicit conversions to/from list iterators in CodeGen,
ala r249782 for IR.  A lot more to go after this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249851 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-09 16:54:49 +00:00
Matthias Braun
af5ff60200 Save LaneMask with livein registers
With subregister liveness enabled we can detect the case where only
parts of a register are live in, this is expressed as a 32bit lanemask.
The current code only keeps registers in the live-in list and therefore
enumerated all subregisters affected by the lanemask. This turned out to
be too conservative as the subregister may also cover additional parts
of the lanemask which are not live. Expressing a given lanemask by
enumerating a minimum set of subregisters is computationally expensive
so the best solution is to simply change the live-in list to store the
lanemasks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets

Differential Revision: http://reviews.llvm.org/D12442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247171 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 18:08:03 +00:00
Matthias Braun
56dd2d0886 MachineBasicBlock: Add liveins() method returning an iterator_range
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245895 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-24 22:59:52 +00:00
Matthias Braun
28fbb4be5d MachineRegisterInfo: Introduce isPhysRegUsed()
This method checks whether a physical regiser or any of its aliases are
used in the function.

Using this function in SIRegisterInfo::findUnusedReg() should also fix
this reported failure:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/292143.html
http://reviews.llvm.org/rL242173#inline-533

The report doesn't come with a testcase and I don't know enough about
AMDGPU to create one myself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245329 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-18 18:54:27 +00:00
Pete Cooper
9f2f1650c4 Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern

for (auto I = x.rbegin(), E = x.end(); I != E; ++I)

we can use make_range to construct the reverse range and iterate using
that instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243163 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-24 21:13:43 +00:00
Matthias Braun
2addf067a2 MachineRegisterInfo: Remove UsedPhysReg infrastructure
We have a detailed def/use lists for every physical register in
MachineRegisterInfo anyway, so there is little use in maintaining an
additional bitset of which ones are used.

Removing it frees us from extra book keeping. This simplifies
VirtRegMap.

Differential Revision: http://reviews.llvm.org/D10911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242173 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 17:52:07 +00:00
Alexander Kornienko
cd52a7a381 Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 09:49:53 +00:00
Alexander Kornienko
cf0db29df2 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 15:57:42 +00:00
Sanjay Patel
044d1f10ac remove function names from comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232328 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-15 18:16:04 +00:00
Sanjay Patel
3445f7efbd fix typo: NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-15 18:11:35 +00:00
Matthias Braun
ae6bbac733 ExecutionDepsFix: Indizes -> Indices.
Translate german to english.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 18:56:20 +00:00
Eric Christopher
be76ce0ca1 Fix typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 18:20:23 +00:00
Matthias Braun
cd56c19f3c ExecutionDepsFix: Correctly handle wide registers.
The ExecutionDepsFix previously mapped each register to 1 or zero
registers of the register class it was called with and therefore
simulating liveness for.  This was problematic for cases involving wider
registers like Q0 on ARM where ExecutionDepsFix gets invoked for the Dxx
registers. In these cases the wide register would get mapped to the last
matching D register, while it should have been all matching D registers.
This commit changes the AliasMap to use a SmallVector to map registers
to potentially multiple destination regclass registers. This is required
to avoid regressions with subregister liveness tracking enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224447 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 19:13:47 +00:00
Aaron Ballman
51c2bdca72 Fixing -Wsign-compare warnings; NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224337 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 14:04:11 +00:00
Michael Ilseman
9ecdca9115 Silence more static analyzer warnings.
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 18:48:43 +00:00
Craig Topper
a5babc8a31 Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries.
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 05:50:14 +00:00
Eric Christopher
ded375f282 Remove unnecessary TargetMachine.h includes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219672 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:22:08 +00:00
Eric Christopher
6035518e3b Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214838 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 02:39:49 +00:00
Eric Christopher
9f85dccfc6 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214781 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:25:23 +00:00
Eric Christopher
68c7a1cb98 Clean up language and grammar.
Based on a patch by jfcaron3@gmail.com!
PR19806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209216 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-20 17:11:11 +00:00
Chandler Carruth
8677f2ff9a [Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.

Other sub-trees will follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 02:02:50 +00:00
Craig Topper
4ba844388c [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206142 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-14 00:51:57 +00:00
Craig Topper
9f998de891 [C++11] Add 'override' keyword to virtual methods that override their base class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-07 09:26:03 +00:00
Juergen Ributzka
57c38e3faa Convert register liveness tracking to work on a sub-register level instead of just register units.
Reviewed by Andy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197315 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-14 06:52:56 +00:00
Andrew Trick
e670304f64 comment typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197278 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-13 22:23:54 +00:00
Andrew Trick
a23bd2e761 Revert "Convert liveness tracking to work on a sub-register level instead of just register units."
This reverts commit r197253.

This was a great change, but Juergen should be the commit author.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197262 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-13 19:04:08 +00:00
Andrew Trick
edf1070ca7 Convert liveness tracking to work on a sub-register level instead of just register units.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197253 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-13 18:36:56 +00:00
Andrew Trick
51dee24ca6 Improve on r192635, ExeDepsFix for avx, and add a test case.
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.

This was blocking the switch to MI-Sched.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192669 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-15 03:39:43 +00:00
Andrew Trick
a6a9ac5aa1 Fix the ExecutionDepsFix pass to handle AVX instructions.
This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.

Fixing this requires conservative liveness analysis. This is awkard
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.

I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction. The out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192635 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-14 22:19:03 +00:00
Craig Topper
f22fd3f7b5 Use SmallVectorImpl instead of SmallVector for iterators and references to avoid specifying the vector size unnecessarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185512 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-03 05:11:49 +00:00