llvm-mirror/lib
Hal Finkel 725757ccc8 Add a scheduling model (with itinerary) for the PPC POWER7
This adds a scheduling model for the POWER7 (P7) core, and enables the
machine-instruction scheduler when targeting the P7. Scheduling for the P7,
like earlier ooo PPC cores, requires considering both dispatch group hazards
and functional unit resources and latencies. These are both modeled in a
combined itinerary. Dispatch group formation is still handled by the post-RA
scheduler (which still needs to be updated for the P7, but nevertheless does a
pretty good job).

One interesting aspect of this change is that I've also enabled the use of AA
during CodeGen for the P7 (just as it is for the embedded cores). The benchmark
results seem to support this decision (see below), and while this is normally
useful for in-order cores, and not for ooo cores like the P7, I think that the
dispatch slot hazards are enough like in-order resources to make AA useful.

Significant test-suite performance differences (where negative is a speedup,
and positive is a regression) vs. the current situation:

MultiSource/Benchmarks/BitBench/drop3/drop3
  with AA: N/A
  without AA: -28.7614% +/- 19.8356%
(significantly against AA)

MultiSource/Benchmarks/FreeBench/neural/neural
  with AA: -17.7406% +/- 11.2712%
  without AA: N/A
(significantly in favor of AA)

MultiSource/Benchmarks/SciMark2-C/scimark2
  with AA: -11.2079% +/- 1.80543%
  without AA: -11.3263% +/- 2.79651%

MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt
  with AA: -41.8649% +/- 17.0053%
  without AA: -34.5256% +/- 23.7072%

MultiSource/Benchmarks/mafft/pairlocalalign
  with AA: 25.3016% +/- 17.8614%
  without AA: 38.6629% +/- 14.9391%
(significantly in favor of AA)

MultiSource/Benchmarks/sim/sim
  with AA: N/A
  without AA: 13.4844% +/- 7.18195%
(significantly in favor of AA)

SingleSource/Benchmarks/BenchmarkGame/Large/fasta
  with AA: 15.0664% +/- 6.70216%
  without AA: 12.7747% +/- 8.43043%

SingleSource/Benchmarks/BenchmarkGame/puzzle
  with AA: 82.2713% +/- 26.3567%
  without AA: 75.7525% +/- 41.1842%

SingleSource/Benchmarks/Misc/flops-2
  with AA: -37.1621% +/- 20.7964%
  without AA: -35.2342% +/- 20.2999%
(significantly in favor of AA)

These are 99.5% confidence intervals from 5 runs per configuration. Of these
results, four seem significantly in favor of using AA during CodeGen, and one
seems significantly against. I'm not making this decision based on these
numbers alone, but these results seem consistent with results I have from
other tests, and so I think that, on balance, using AA is a win.
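
As a reading aid, here is a minimal sketch of how the "mean +/- half-width" figures above can be interpreted. It assumes that an interval excluding zero counts as a significant change and that "N/A" means no significant change was measured; neither rule is stated in this commit. The helper names are hypothetical and are not part of LLVM or its test suite. The check is not claimed to reproduce the parenthetical labels above.

```python
# Sketch only: interpret "mean +/- half_width" percent changes, where a
# negative mean is a speedup and a positive mean is a regression.

def interval(mean, half_width):
    """Return the (low, high) bounds of mean +/- half_width, in percent."""
    return (mean - half_width, mean + half_width)

def describe(mean, half_width):
    """Classify one result; assumes 'significant' means the interval excludes 0."""
    low, high = interval(mean, half_width)
    if low <= 0 <= high:
        return "no significant change"
    return "speedup" if mean < 0 else "regression"

def overlaps(a, b):
    """True if two (mean, half_width) results have overlapping intervals."""
    a_low, a_high = interval(*a)
    b_low, b_high = interval(*b)
    return a_low <= b_high and b_low <= a_high

# Figures copied from the results above: (mean, half_width) in percent.
results = {
    "mafft/pairlocalalign": {"with AA": (25.3016, 17.8614),
                             "without AA": (38.6629, 14.9391)},
    "SciMark2-C/scimark2": {"with AA": (-11.2079, 1.80543),
                            "without AA": (-11.3263, 2.79651)},
}

for name, configs in results.items():
    for config, (mean, half_width) in configs.items():
        print(name, config, describe(mean, half_width))
    print(name, "intervals overlap:",
          overlaps(configs["with AA"], configs["without AA"]))
```

Under these assumptions, both mafft configurations classify as regressions and both scimark2 configurations as speedups, and in each pair the two intervals overlap.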

llvm-svn: 195981
2013-11-30 20:55:12 +00:00
Analysis [PM] Split the CallGraph out from the ModulePass which creates the 2013-11-26 04:19:30 +00:00
AsmParser Make it explicit that nulls are not allowed in names. 2013-11-19 21:12:39 +00:00
Bitcode Fix spacing, forward declare order. 2013-11-18 02:51:33 +00:00
CodeGen Reverse the order of eviction checks for possible compile time savings. No functionality. 2013-11-29 23:49:38 +00:00
DebugInfo DebugInfo: Avoid emitting pubtype entries for type DIEs that just indirect to a type unit. 2013-11-26 00:22:37 +00:00
ExecutionEngine Use the mangler consistently instead of using getGlobalPrefix directly. 2013-11-28 08:59:52 +00:00
IR Fix spurious return introduced by my earlier patch to DebugInfo 2013-11-26 18:54:37 +00:00
IRReader [llvm-c] Expose IRReader interface 2013-11-06 09:21:15 +00:00
Linker Revert "Move copying of global initializers below the cloning of functions." 2013-11-09 00:43:18 +00:00
LTO Use array_pod_sort instead of std::sort 2013-11-16 16:15:56 +00:00
MC The global prefix is always one char. Don't use a string for it. 2013-11-28 17:00:49 +00:00
Object Path: Recognize COFF import library file magic. 2013-11-15 21:22:02 +00:00
Option Use startswith_lower() where possible. 2013-11-04 19:22:50 +00:00
Support Lift self-copy protection up to the header file and add self-move 2013-11-26 00:54:44 +00:00
TableGen Fix most memory leaks in tablegen. 2013-10-31 04:07:41 +00:00
Target Add a scheduling model (with itinerary) for the PPC POWER7 2013-11-30 20:55:12 +00:00
Transforms Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)). 2013-11-28 21:38:05 +00:00
CMakeLists.txt Move LTO support library to a component, allowing it to be tested 2013-09-24 23:52:22 +00:00
LLVMBuild.txt Move LTO support library to a component, allowing it to be tested 2013-09-24 23:52:22 +00:00
Makefile Reformat Makefile. No other changes. 2013-10-30 04:03:03 +00:00