98076 Commits

Author SHA1 Message Date
Bill Wendling
178e2b5358 Use 'unsigned char' to get this past gcc error message:
error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence'

llvm-svn: 196004
2013-12-01 03:36:07 +00:00
Hal Finkel
725757ccc8 Add a scheduling model (with itinerary) for the PPC POWER7
This adds a scheduling model for the POWER7 (P7) core, and enables the
machine-instruction scheduler when targeting the P7. Scheduling for the P7,
like earlier ooo PPC cores, requires considering both dispatch group hazards,
and functional unit resources and latencies. These are both modeled in a
combined itinerary. Dispatch group formation is still handled by the post-RA
scheduler (which still needs to be updated for the P7, but nevertheless does a
pretty good job).

One interesting aspect of this change is that I've also enabled the use of AA
during CodeGen for the P7 (just as it is for the embedded cores). The benchmark
results seem to support this decision (see below), and while this is normally
useful for in-order cores, and not for ooo cores like the P7, I think that the
dispatch slot hazards are enough like in-order resources to make the AA useful.

Test suite significant performance differences (where negative is a speedup,
and positive is a regression) vs. the current situation:

MultiSource/Benchmarks/BitBench/drop3/drop3
  with AA: N/A
  without AA: -28.7614% +/- 19.8356%
(significantly against AA)

MultiSource/Benchmarks/FreeBench/neural/neural
  with AA: -17.7406% +/- 11.2712%
  without AA: N/A
(significantly in favor of AA)

MultiSource/Benchmarks/SciMark2-C/scimark2
  with AA: -11.2079% +/- 1.80543%
  without AA: -11.3263% +/- 2.79651%

MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt
  with AA: -41.8649% +/- 17.0053%
  without AA: -34.5256% +/- 23.7072%

MultiSource/Benchmarks/mafft/pairlocalalign
  with AA: 25.3016% +/- 17.8614%
  without AA: 38.6629% +/- 14.9391%
(significantly in favor of AA)

MultiSource/Benchmarks/sim/sim
  with AA: N/A
  without AA: 13.4844% +/- 7.18195%
(significantly in favor of AA)

SingleSource/Benchmarks/BenchmarkGame/Large/fasta
  with AA: 15.0664% +/- 6.70216%
  without AA: 12.7747% +/- 8.43043%

SingleSource/Benchmarks/BenchmarkGame/puzzle
  with AA: 82.2713% +/- 26.3567%
  without AA: 75.7525% +/- 41.1842%

SingleSource/Benchmarks/Misc/flops-2
  with AA: -37.1621% +/- 20.7964%
  without AA: -35.2342% +/- 20.2999%
(significantly in favor of AA)

These are 99.5% confidence intervals from 5 runs per configuration. Regarding
the choice to turn on AA during CodeGen, of these results, four seem
significantly in favor of using AA, and one seems significantly against. I'm
not making this decision based on these numbers alone, but these results
seem consistent with results I have from other tests, and so I think that, on
balance, using AA is a win.

llvm-svn: 195981
2013-11-30 20:55:12 +00:00
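Intervals like the ones quoted in the message above could be computed along the following lines. This is a minimal sketch, not the author's actual tooling: it assumes a two-sided Student's t interval on the mean of 5 samples, hardcodes the t critical value for 4 degrees of freedom at the 99.5% level (about 5.598, taken from a t-table), and uses made-up sample values. The function names are hypothetical.

```python
import math
import statistics

# Two-sided 99.5% Student's t critical value for 4 degrees of freedom
# (i.e. 5 runs). Hardcoded from a t-table; an assumption of this sketch.
T_CRIT_995_DF4 = 5.598


def confidence_interval(samples, t_crit=T_CRIT_995_DF4):
    """Return (mean, half_width) of a t-based confidence interval.

    The default t_crit is only valid for exactly 5 samples.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, t_crit * sem


def excludes_zero(mean, half_width):
    """True when the whole interval lies on one side of zero."""
    return abs(mean) > half_width


# Hypothetical percentage changes from 5 runs (not data from the commit).
runs = [-12.0, -10.5, -11.8, -10.9, -11.3]
mean, half = confidence_interval(runs)
print(f"{mean:.4f}% +/- {half:.4f}%")
```

With only 5 runs the critical value is large, which is one reason intervals of this kind come out wide.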
Hal Finkel
fa2b249f38 Split some PPC itinerary classes
In preparation for adding scheduling definitions for the POWER7, split some PPC
itinerary classes so that the P7's latencies and hazards can be better
described. For the most part, this means differentiating indexed from non-indexed
pre-increment loads and stores. Also, differentiate single from
double-precision sqrt.

No functionality change intended (except for a more-specific latency for
single-precision sqrt on the A2).

llvm-svn: 195980
2013-11-30 20:41:13 +00:00
Hal Finkel
14673817db Convert a PPC test from grep to FileCheck
Convert this test to FileCheck, and improve it to check for the instructions it
is trying to exclude instead of checking for register use (especially because
grepping for r1 can be thrown off, for example, by a use of r12).

llvm-svn: 195979
2013-11-30 20:04:33 +00:00
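The r1/r12 problem mentioned in the message above is a plain substring-matching pitfall. The sketch below illustrates it with a made-up line of assembly; it is not the test from the commit, and the line content is hypothetical.

```python
import re

# Hypothetical line of assembly output that uses r12 but not r1.
line = "addi r12, r3, 8"

# Plain substring search, which is what a bare `grep r1` amounts to.
substring_hit = "r1" in line

# Word-boundary match: only matches r1 as a whole token.
whole_token_hit = re.search(r"\br1\b", line) is not None

print(substring_hit)    # True: "r12" contains "r1"
print(whole_token_hit)  # False: r1 itself is not used
```

Checking for the specific instructions that should be absent, as the commit does, sidesteps the problem entirely rather than patching the pattern.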
Hal Finkel
ded988ca4c Desensitize a couple of PPC regression tests
Use CHECK-DAG to make these regression tests more resilient against changes in
instruction scheduling.

llvm-svn: 195978
2013-11-30 19:52:28 +00:00
Hal Finkel
1cdcead814 Update the cpu specified on some PPC regression tests
Some of these tests did not specify a cpu but were also sensitive to
instruction scheduling and/or register assignment choices. A few other
similarly-sensitive tests specified a cpu (often the POWER7), and while the P7
currently uses the default model for PPC64, this will soon change. For those
tests which should not really be cpu-dependent anyway, the cpu is set to the
generic 'ppc64'.

llvm-svn: 195977
2013-11-30 19:39:27 +00:00
Zoran Jovanovic
335dc8689e Test case for issue with microMIPS long branch.
llvm-svn: 195976
2013-11-30 19:13:15 +00:00
Zoran Jovanovic
b3e74abf46 Fixed issue with microMIPS long branch.
llvm-svn: 195975
2013-11-30 19:12:28 +00:00
Daniel Sanders
65ab9582ba [mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex.
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.

llvm-svn: 195973
2013-11-30 13:47:57 +00:00
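The size window in the message above (512 to 32,768 bytes) matches the gap between a signed 10-bit and a signed 16-bit immediate. The following is a minimal sketch of that range check, assuming plain two's-complement byte offsets; the real encoding may scale the offset by the element size, which is ignored here. The helper name is hypothetical.

```python
def fits_signed(value, bits):
    """True if value is representable as a two's-complement integer of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


# Hypothetical frame offset inside the problematic window.
offset = 600
print(fits_signed(offset, 16))  # True: fine for a 16-bit offset field
print(fits_signed(offset, 10))  # False: too large for a 10-bit offset field
```

An offset that passes the 16-bit check but fails the 10-bit one is exactly the case where the address has to be materialized separately before the load or store.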
Daniel Sanders
f397466fb3 [mips][msa] A small refactor to reduce patch noise in my next commit
No functional change. An if-statement has been split into two nested if-statements.

llvm-svn: 195972
2013-11-30 13:15:21 +00:00
Juergen Ributzka
7150312963 Force CPU type to unbreak unit tests on Haswell machines.
llvm-svn: 195971
2013-11-30 03:07:16 +00:00
Andrew Trick
b7e697ed41 Reverse the order of eviction checks for possible compile time savings. No functionality change.
llvm-svn: 195969
2013-11-29 23:49:38 +00:00
Reed Kotler
95269c69db Part 1 of 3 patches that completes very long conditional branches
in constant islands for Mips16. We introduce JalB16 as a synonym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations. 

llvm-svn: 195968
2013-11-29 22:32:56 +00:00
Zoran Jovanovic
b8cffe14c6 Revert revision 195965.
llvm-svn: 195967
2013-11-29 22:10:02 +00:00
Petar Jovanovic
f12c338160 mips: XFAIL llvm-cov test
XFAIL llvm-cov.test for MIPS until big-endian issues are fixed for llvm-cov.
The test does pass on MIPS little-endian.

llvm-svn: 195966
2013-11-29 21:59:09 +00:00
Zoran Jovanovic
797919cb22 Fixed issue with microMIPS long branch.
llvm-svn: 195965
2013-11-29 21:41:24 +00:00
Hal Finkel
2d7cc00415 Adjust PPC A2 input operand latencies
On the PPC A2, instructions are only issued after their input operands are
ready. Model this by specifying that input operands are read at dispatch (0
cycles after issue). This changes all input operand latencies from 1 to 0.

Significant test-suite performance changes (these are 99.5% confidence
intervals on 6 runs for both before and after):

speedups:
MultiSource/Benchmarks/sim/sim
	-1.21915% +/- 0.175063%
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt
	-1.23946% +/- 1.05133%
SingleSource/Benchmarks/Misc/flops-2
	-1.24237% +/- 0.681362%
MultiSource/Applications/JM/lencod/lencod
	-1.33992% +/- 0.757498%
MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt
	-1.51802% +/- 1.21468%
MultiSource/Benchmarks/TSVC/GlobalDataFlow-flt/GlobalDataFlow-flt
	-2.18818% +/- 1.28605%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt
	-2.21977% +/- 1.19499%
SingleSource/Benchmarks/BenchmarkGame/spectral-norm
	-2.29822% +/- 0.671871%
MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl
	-2.40975% +/- 0.355931%
SingleSource/Benchmarks/Misc/fp-convert
	-2.41899% +/- 1.04751%
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl
	-2.50349% +/- 0.126765%
SingleSource/Benchmarks/Misc/flops-3
	-3.00214% +/- 0.700795%
MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt
	-3.56995% +/- 3.2929%
MultiSource/Applications/sgefa/sgefa
	-4.24908% +/- 2.00413%
MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk
	-18.1294% +/- 3.96489%

regressions:
MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
	1.03249% +/- 0.178547%
MultiSource/Applications/hexxagon/hexxagon
	1.16597% +/- 0.285235%
MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt
	1.39576% +/- 1.07855%
SingleSource/Benchmarks/Misc-C++/stepanov_v1p2
	1.71539% +/- 0.173182%
MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1
	1.90013% +/- 0.866472%
MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl
	2.39854% +/- 1.05914%
MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl
	2.4402% +/- 0.817904%
MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl
	5.87997% +/- 3.3172%
MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc
	9.02643% +/- 5.79591%
MultiSource/Benchmarks/VersaBench/bmm/bmm
	10.3517% +/- 1.227%

Obviously, there are data points on both sides of this; but I think, overall,
this supports making the change.

llvm-svn: 195951
2013-11-29 07:04:59 +00:00
Lang Hames
7883c4d5ae Teach LocalStackSlotAllocation that stackmaps/patchpoints don't have range
constraints on their frame offsets.

llvm-svn: 195950
2013-11-29 06:35:30 +00:00
Hal Finkel
69f21285ed Create a PPC440 SchedMachineModel
Some of the older PPC processor definitions don't have associated
SchedMachineModels; correct this for the PPC440.

llvm-svn: 195949
2013-11-29 06:32:17 +00:00
Hal Finkel
c5a38fd3e6 Fixup PPC440 load/store operand latencies
The operand latencies for loads and stores in the PPC440 itinerary were wrong
(the store operands are all inputs, and the "with update" (pre-increment)
instructions need a latency for the additional output).

llvm-svn: 195948
2013-11-29 06:19:43 +00:00
Hal Finkel
a9d93b1740 Adjust PPC440 operand latencies
The operand latencies for the PPC440 should be specified relative to dispatch,
not relative to the initial fetch-and-decode stages. Because most instructions
(ignoring bypass) wait in dispatch until their operands are ready, this is
modeled as reading input operands "at dispatch" (0 cycles after issue), and so
every input and output operand has 4 cycles subtracted from it.

This could alter scheduling slightly, but I don't expect a large effect.

llvm-svn: 195947
2013-11-29 05:59:00 +00:00
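The adjustment described in the message above is a uniform shift of the reference point from fetch to dispatch. A minimal sketch, assuming operand cycles are kept as a simple list; the entry shown is made up and the names are hypothetical, not taken from the PPC440 itinerary.

```python
# Cycles spent in the stages before dispatch; the message gives 4 for the PPC440.
PRE_DISPATCH_CYCLES = 4


def rebase_to_dispatch(operand_cycles, shift=PRE_DISPATCH_CYCLES):
    """Shift every operand cycle so that dispatch becomes cycle 0."""
    return [c - shift for c in operand_cycles]


# Hypothetical itinerary entry: one output operand followed by two inputs.
old = [6, 4, 4]
print(rebase_to_dispatch(old))  # [2, 0, 0]
```

Because every operand moves by the same amount, the differences between operand cycles are unchanged by the shift.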
Hal Finkel
f086fc01ab Don't model the fetch and decode units for the PPC440
Modeling the fetch and decode units in the PPC440 itinerary does not add
anything to the hazard detection capability (and so modeling them just wastes
compile time).

No functionality change intended.

llvm-svn: 195946
2013-11-29 05:58:38 +00:00
Lang Hames
82e8d4faa9 Remove unused variable from r195944.
llvm-svn: 195945
2013-11-29 03:36:53 +00:00
Lang Hames
067c025250 Refactor a lot of patchpoint/stackmap related code to simplify and make it
target independent.

Most of the x86 specific stackmap/patchpoint handling was necessitated by the
use of the native address-mode format for frame index operands. PEI has now
been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing
us to use a simple, platform independent register/offset pair for frame
indexes on stackmap/patchpoints.

Notes:
  - Folding is now platform independent and automatically supported.
  - Emitting patchpoints with direct memory references now just involves calling
    the TargetLoweringBase::emitPatchPoint utility method from the target's
    XXXTargetLowering::EmitInstrWithCustomInserter method. (See
    X86TargetLowering for an example).
  - No more ugly platform-specific operand parsers.

This patch shouldn't change the generated output for X86. 

llvm-svn: 195944
2013-11-29 03:07:54 +00:00
Hao Liu
b9fa1067c7 AArch64: The pattern match should check the range of the immediate value.
Otherwise we can generate illegal instructions,
e.g. shrn2 v0.4s, v1.2d, #35. The legal range should be [1, 16].

llvm-svn: 195941
2013-11-29 02:11:22 +00:00
Jiangning Liu
844201423a Add missing test case for bsl_f64 support of AArch64 NEON.
llvm-svn: 195939
2013-11-29 01:38:08 +00:00
Jiangning Liu
afc7f71eb3 Add missing pattern for supporting intrinsic function vbsl_f64 with
a double-precision floating-point argument.

llvm-svn: 195938
2013-11-29 01:37:15 +00:00
Kevin Qin
b95721d200 [AArch64 NEON] Fix an assertion failure when disassembling the SHLL instruction.
llvm-svn: 195936
2013-11-29 01:29:16 +00:00
Stephen Canon
d8aaca93a6 Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).
llvm-svn: 195934
2013-11-28 21:38:05 +00:00
Rafael Espindola
3f4a857bd5 Refactor to remove a bit of duplication. No functionality change.
llvm-svn: 195933
2013-11-28 20:12:44 +00:00
Benjamin Kramer
fd8fd4246f Silence sign-compare warning and reduce nesting.
No functionality change.

llvm-svn: 195932
2013-11-28 19:58:56 +00:00
Rafael Espindola
7e7db10302 Remove an always true parameter.
llvm-svn: 195931
2013-11-28 19:35:07 +00:00
NAKAMURA Takumi
9b851e876f [CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.
I think, in principle, intrinsics_gen may be added explicitly.
That said, it can be added incidentally, since each target already has a dependency on llvm-tblgen.
Almost all source files depend on both CommonTableGen and intrinsics_gen.

Explicit add_dependencies() have been pruned under lib/Target.

llvm-svn: 195929
2013-11-28 17:04:31 +00:00
NAKAMURA Takumi
6f84abe1df [CMake] OptionTests can also be freed from add_dependencies() with add_public_tablegen_target().
llvm-svn: 195928
2013-11-28 17:04:13 +00:00
NAKAMURA Takumi
99f544b37e [CMake] Make add_public_tablegen_target responsible for providing the dependency on CommonTableGen.
add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS.
LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope.

llvm-svn: 195927
2013-11-28 17:04:04 +00:00
Rafael Espindola
02c35a15de The global prefix is always one char. Don't use a string for it.
llvm-svn: 195926
2013-11-28 17:00:49 +00:00
NAKAMURA Takumi
46b765a4a3 [CMake] Prune include_directories() in llvm/lib/Target, take #2.
I forgot to commit them. They were staged in my local repo.

llvm-svn: 195924
2013-11-28 15:30:37 +00:00
Daniel Sanders
86f254d104 [mips] Revert test commit r195922.
llvm-svn: 195923
2013-11-28 15:26:33 +00:00
Daniel Sanders
5b3619e21b [mips] A test commit to test my Herald and Audit workflow
Will be reverted in the next commit

llvm-svn: 195922
2013-11-28 15:25:43 +00:00
NAKAMURA Takumi
5dbd3bcf3d [CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() sets them.
llvm-svn: 195921
2013-11-28 14:53:30 +00:00
NAKAMURA Takumi
128a461f6a Add newline at eof.
llvm-svn: 195920
2013-11-28 14:52:52 +00:00
Daniel Sanders
ae524adc6e Add myself as code owner of the MIPS backend (lib/Target/Mips/*)
llvm-svn: 195915
2013-11-28 09:36:44 +00:00
Peter Zotov
f783e4243e [OCaml] Add a slash accidentally omitted from Makefile
llvm-svn: 195912
2013-11-28 09:03:28 +00:00
Rafael Espindola
05d05c4e8a Use the mangler consistently instead of using getGlobalPrefix directly.
llvm-svn: 195911
2013-11-28 08:59:52 +00:00
Hal Finkel
e49bc01fba Don't share functional units among the PPC itineraries
Instead of sharing functional unit names between the various PPC itineraries,
give each core its own unit names prefixed with the core name.  This follows
the convention used by other backends (such as ARM), and removes a non-obvious
ordering dependency between the various PPCSchedule*.td files.

No functionality change intended.

llvm-svn: 195908
2013-11-28 06:05:59 +00:00
Jiangning Liu
7f44dcb9f4 Remove the variable only used by assert to avoid the build failure
caused by build options [-Werror,-Wunused-variable].

llvm-svn: 195905
2013-11-28 01:34:55 +00:00
Hao Liu
2f617213ef AArch64: Fix a bug about disassembling post-index load single element to 4 vectors
llvm-svn: 195903
2013-11-28 01:07:45 +00:00
Reed Kotler
deb5d6d05e Check in conditional branches for constant islands. Still need to finish
conditional branches for very large targets. That will be the next small
patch. Everything should now in principle work as well (functionality-wise)
as without constant islands, so we decided at Mips/Imagination to
make constant islands the default for Mips16 now so that it will get
exercised a lot. This port is still experimental, though hopefully we will
change that status soon. Some more cleanup and code review is in order
but things are converging fast.

llvm-svn: 195902
2013-11-28 00:56:37 +00:00
Akira Hatanaka
a964ea01fe [mips] Redefine TAILCALL as a pseudo instruction.
No functionality change.

llvm-svn: 195896
2013-11-27 23:58:32 +00:00
David Blaikie
91268863a6 DebugInfo: Do not include variables only referenced by templates in aranges.
ARanges included even extern variables referenced by pointer non-type
template parameters even though that variable isn't part of this
compilation unit.

llvm-svn: 195895
2013-11-27 23:53:52 +00:00