llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-03-07 03:47:20 +00:00

Andrew Trick 192311ab9a MI-Sched: handle latency of in-order operations with the new machine model.

The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.
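
The idea in the paragraph above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the code from this commit:
`ToyOp`, `issueCycles`, and the one-op-per-cycle issue width are hypothetical
names and simplifications invented for illustration. An op that uses an
unbuffered resource stalls issue until its operands are ready, while a
buffered op on an out-of-order core is assumed to issue immediately.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified op description (not an LLVM type).
struct ToyOp {
  unsigned readyCycle;   // cycle at which the op's operands are available
  bool usesUnbuffered;   // true if the op uses an "unbuffered" resource
};

// Returns the issue cycle of each op when issued in the given order.
// Assumption: one op issues per cycle.
std::vector<unsigned> issueCycles(const std::vector<ToyOp>& ops,
                                  unsigned microOpBufferSize) {
  std::vector<unsigned> cycles;
  unsigned curr = 0;
  for (const ToyOp& op : ops) {
    // With no micro-op buffer the whole core is treated as in-order.
    // With a buffer, only ops on unbuffered resources stall on latency.
    const bool stalls = (microOpBufferSize == 0) || op.usesUnbuffered;
    if (stalls)
      curr = std::max(curr, op.readyCycle);
    cycles.push_back(curr);
    ++curr;  // the next op can issue no earlier than the following cycle
  }
  assert(cycles.size() == ops.size());
  return cycles;
}
```

In this model, an unbuffered op pushes back every op scheduled after it,
which gives the scheduler a reason to place independent work ahead of it.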

MI-Sched for armv7 was evaluated on Swift, where it was not enabled only
because of a performance bug related to predication. However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds the MI-Sched functionality needed to reach performance goals
on A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance using a set of options that estimates the performance impact once MI-Sched is the default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)
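
One plausible reading of that methodology is sketched below. The exact
formula and the cutoff for "tiny changes" are not stated in this message, so
the percentage definition, the threshold parameter, and all function names
here are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Best (minimum) run time across repeated runs of one benchmark.
double minRunTime(const std::vector<double>& runs) {
  assert(!runs.empty());
  return *std::min_element(runs.begin(), runs.end());
}

// Assumed definition: positive means the candidate ran faster than the
// baseline, negative means it ran slower.
double percentImprovement(double baseline, double candidate) {
  assert(baseline > 0.0);
  return 100.0 * (baseline - candidate) / baseline;
}

// Drops changes smaller in magnitude than a hypothetical threshold.
bool isSignificant(double percent, double thresholdPercent) {
  return std::fabs(percent) >= thresholdPercent;
}
```

For example, baseline runs of 10.0 s and 10.4 s against candidate runs of
8.3 s and 8.0 s compare 10.0 s with 8.0 s under this definition.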

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

llvm-svn: 196516

2013-12-05 17:55:58 +00:00

autoconf

Update to reflect the next release.

2013-11-20 10:10:50 +00:00

bindings

[OCaml] Add a slash accidentally omitted from Makefile

2013-11-28 09:03:28 +00:00

cmake

[CMake] add_lit_target: Let lit.site.cfg free from "--param build_mode" on single configuration builds, like autoconf build.

2013-12-04 11:15:17 +00:00

docs

Correct word hyphenations

2013-12-05 05:44:44 +00:00

examples

[weak vtables] Place class definitions into anonymous namespaces to prevent weak vtables.

2013-11-19 03:08:35 +00:00

include

MI-Sched: handle latency of in-order operations with the new machine model.

2013-12-05 17:55:58 +00:00

lib

MI-Sched: handle latency of in-order operations with the new machine model.

2013-12-05 17:55:58 +00:00

projects

Revert "Revert "Windows: Add support for unicode command lines""

2013-10-07 01:00:07 +00:00

test

MI-Sched: handle latency of in-order operations with the new machine model.

2013-12-05 17:55:58 +00:00

tools

Correct word hyphenations

2013-12-05 05:44:44 +00:00

unittests

Use present fast-math flags when applicable in CreateBinOp

2013-12-05 00:32:09 +00:00

utils

Correct word hyphenations

2013-12-05 05:44:44 +00:00

.arcconfig

Add .arcconfig to the repository. Useful if someone wants to use phabricator's command line tool.

2012-12-01 12:07:58 +00:00

.clang-format

Add a clang-format file so that the tool can automatically detect the

2013-09-02 07:19:04 +00:00

.gitignore

Add extra vim swap file pattern

2012-10-09 23:48:34 +00:00

CMakeLists.txt

CMake: optionally enable LLVM to be compiled with -std=c++11 (default: off)

2013-11-26 10:33:53 +00:00

CODE_OWNERS.TXT

Update email address.

2013-12-04 09:42:49 +00:00

configure

Update to reflect the next release.

2013-11-20 10:10:50 +00:00

CREDITS.TXT

Update CREDITS

2013-11-17 11:44:36 +00:00

LICENSE.TXT

Be more specific and capitalize filenames.

2013-05-21 21:22:34 +00:00

llvm.spec.in

…

LLVMBuild.txt

Remove the very substantial, largely unmaintained legacy PGO

2013-10-02 15:42:23 +00:00

Makefile

Remove the very substantial, largely unmaintained legacy PGO

2013-10-02 15:42:23 +00:00

Makefile.common

Makefile.common: Update a description, s/Source/SOURCES/ , according to MakefileGuide.html#control-variables .

2012-12-07 01:43:23 +00:00

Makefile.config.in

Add an autoconf option for turning on -gsplit-dwarf by default

2013-06-25 01:12:25 +00:00

Makefile.rules

Teach the Makefile build system how to handle SOURCES which include

2013-11-14 23:51:29 +00:00

README.txt

Revert "Test commit to check e-mail address. Please discard this."

2013-10-04 10:59:13 +00:00

README.txt

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for the Low Level
Virtual Machine, a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.TXT.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you're writing a package for LLVM, see docs/Packaging.rst for our
suggestions.

Languages

C++ 96.9%

C 1%

Python 1%

CMake 0.6%

OCaml 0.2%

Other 0.1%