Commit Graph

137907 Commits

Author SHA1 Message Date
Yaxun Liu
145ae71240 AMDGPU: Remove a useless variable which caused build failure for lld.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280841 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 18:31:11 +00:00
Wei Mi
ebe55c6a3e Don't reduce the width of vector mul if the target doesn't support SSE2.
The patch is to fix PR30298, which is caused by rL272694. The solution is to
bail out if the target has no SSE2.

Differential Revision: https://reviews.llvm.org/D24288


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280837 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 18:22:17 +00:00
Hans Wennborg
9107e64bb5 Add more triple to conditional-tailcall.ll test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280835 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 18:19:31 +00:00
Chad Rosier
371bac0453 Typo. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280834 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 18:15:12 +00:00
Saleem Abdulrasool
c4c2318e72 CodeGen: ensure that libcalls are always AAPCS CC
The original commit was too aggressive about marking LibCalls as AAPCS.  The
libcalls contain libc/libm/libunwind calls which are not AAPCS, but C.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280833 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 17:56:09 +00:00
Hans Wennborg
4a83266436 X86: Fold tail calls into conditional branches where possible (PR26302)
When branching to a block that immediately tail calls, it is possible to fold
the call directly into the branch if the call is direct and there is no stack
adjustment, saving one byte.

Example:

  define void @f(i32 %x, i32 %y) {
  entry:
    %p = icmp eq i32 %x, %y
    br i1 %p, label %bb1, label %bb2
  bb1:
    tail call void @foo()
    ret void
  bb2:
    tail call void @bar()
    ret void
  }

before:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     .LBB0_2
          jmp     foo
  .LBB0_2:
          jmp     bar

after:

  f:
          movl    4(%esp), %eax
          cmpl    8(%esp), %eax
          jne     bar
  .LBB0_1:
          jmp     foo

I don't expect any significant size savings from this (on a Clang bootstrap I
saw 288 bytes), but it does make the code a little tighter.

This patch only does 32-bit, but 64-bit would work similarly.

Differential Revision: https://reviews.llvm.org/D24108

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280832 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 17:52:14 +00:00
Davide Italiano
fab897dd11 [lib/LTO] Add a way to run a custom pipeline
Differential Revision:  https://reviews.llvm.org/D24095

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280830 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 17:46:16 +00:00
Yaxun Liu
6874fa846b AMDGPU: Add hidden kernel arguments to runtime metadata
OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden argument should be included in the runtime metadata. Also updated kernel argument kind metadata.

Differential Revision: https://reviews.llvm.org/D23424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280829 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 17:44:00 +00:00
Reid Kleckner
eadc14fcd6 [codeview] Add new directives to record inlined call site line info
Summary:
Previously we were trying to represent this with the "contains" list of
the .cv_inline_linetable directive, which was not enough information.
Now we directly represent the chain of inlined call sites, so we know
what location to emit when we encounter a .cv_loc directive of an inner
inlined call site while emitting the line table of an outer function or
inlined call site. Fixes PR29146.

Also fixes PR29147, where we would crash when .cv_loc directives crossed
sections. Now we write down the section of the first .cv_loc directive,
and emit an error if any other .cv_loc directive for that function is in
a different section.

Also fixes issues with discontiguous inlined source locations, like in
this example:

  volatile int unlikely_cond = 0;
  extern void __declspec(noreturn) abort();
  __forceinline void f() {
    if (!unlikely_cond) abort();
  }
  int main() {
    unlikely_cond = 0;
    f();
    unlikely_cond = 0;
  }

Previously our tables gave bad location information for the 'abort'
call, and the debugger wouldn't snow the inlined stack frame for 'f'.
It is important to emit good line tables for this code pattern, because
it comes up whenever an asan bug occurs in an inlined function. The
__asan_report* stubs are generally placed after the normal function
epilogue, leading to discontiguous regions of inlined code.

Reviewers: majnemer, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280822 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 16:15:31 +00:00
Chad Rosier
452b6b26eb [LoopInterchange] Improve debug output. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280820 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 16:07:17 +00:00
Chad Rosier
e760d89ddf [LoopInterchange] Improve debug output. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280819 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 15:56:59 +00:00
Justin Lebar
600740bf18 [LSV] Use the original loads' names for the extractelement instructions.
Summary:
LSV replaces multiple adjacent loads with one vectorized load and a
bunch of extractelement instructions.  This patch makes the
extractelement instructions' names match those of the original loads,
for (hopefully) improved readability.

Reviewers: asbirlea, tstellarAMD

Subscribers: arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 15:49:48 +00:00
Sanjay Patel
05601b46da [x86] move combines of 'select of 2 constants' to its own function; NFC
There are missing folds here and possibly folds that could be made generic.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280817 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 15:47:34 +00:00
Simon Pilgrim
0639c00041 Fix typo in test - it should be masking bits0-15 not bit16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280816 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 15:19:07 +00:00
Andrea Di Biagio
959abb8a67 Regenerate vector bitcast folding tests using update_test_checks.py.
Two tests have been merged together, regenerated and then moved to
a more appropriate directory. No functional change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280814 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 14:50:07 +00:00
Simon Pilgrim
56463ddaa2 [X86][SSE] Added or combine tests for known bits of vectors
Part of the yak shaving for D24253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 14:49:50 +00:00
Simon Pilgrim
b362b6eae0 [X86][SSE] Added and+or+zext combine tests for known bits of vectors
Part of the yak shaving for D24253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280810 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 14:00:52 +00:00
Simon Pilgrim
9313182d93 [X86][SSE] Added and+or combine tests currently failing with vectors
(and (or x, C), D) -> D if (C & D) == D

Part of the yak shaving for D24253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280809 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 13:40:03 +00:00
Pablo Barrio
ba3ea4d3a9 [ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)
Summary:
This saves a library call to __aeabi_uidivmod. However, the
processor must feature hardware division in order to benefit from
the transformation.

Reviewers: scott-0, jmolloy, compnerd, rengolin

Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits

Differential Revision: https://reviews.llvm.org/D24133

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:49:15 +00:00
Andrea Di Biagio
2fdf2bfd1f [InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic.
This fixes a similar issue to the one already fixed by r280804
(revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts
in the extrq/extrqi combining logic. However, it turns out that even the
insertq/insertqi logic was affected by the same problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:47:53 +00:00
Andrea Di Biagio
fd11478c26 [InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls.
This patch fixes an assertion failure caused by unsafe dynamic casts on the
constant operands of sse4a intrinsic calls to extrq/extrqi

The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently
checks if the input operands are constants. Internally, that logic relies on
dyn_casts of values returned by calls to method Constant::getAggregateElement.
However, method getAggregateElemet may return nullptr if the constant element
cannot be retrieved. So, all the dyn_casts can potentially fail. This is what
happens for example if a constexpr value is passed in input to an extrq/extrqi
intrinsic call.

This patch fixes the problem by using a dyn_cast_or_null (instead of a simple
dyn_cast) on the result of each call to Constant::getAggregateElement.

Added reproducible test cases to x86-sse4a.ll.

Differential Revision: https://reviews.llvm.org/D24256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:03:03 +00:00
Renato Golin
17c896a4b9 Revert "[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address."
This reverts commit r280796, as it broke the AArch64 bots for no reason.

The tests were passing and we should try to keep them passing, so a proper
review should make that happen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280802 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 10:54:42 +00:00
Vasileios Kalintiris
c7d75117d4 [mips] Disable the TImode shift libcalls for 32-bit targets.
Summary:
The o32 ABI doesn't not support the TImode helpers. For the time being,
disable just the shift libcalls as they break recursive builds on MIPS.

Reviewers: sdardis

Subscribers: llvm-commits, sdardis

Differential Revision: https://reviews.llvm.org/D24259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280798 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 10:01:18 +00:00
Sagar Thakur
51cc6bb2bb [EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address.
Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses.

Reviewed by bruening
Differential: D23801


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280796 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 09:45:37 +00:00
James Molloy
e50466a8d1 [SimplifyCFG] Followup fix to r280790
In failure cases it's not guaranteed that the PHI we're inspecting is actually in the successor block! In this case we need to bail out early, and never query getIncomingValueForBlock() as that will cause an assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280794 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 09:01:22 +00:00
James Molloy
80fedcb5bf [SimplifyCFG] Update workaround for PR30188 to also include loads
I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores.

Fixes PR30188 (again).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 08:40:20 +00:00
Diana Picus
1317eef21c [CMake] Use CMake's default RPATH for the unit tests
In the top-level CMakeLists.txt, we set CMAKE_BUILD_WITH_INSTALL_RPATH to ON,
and then for the unit tests we set it to <test>/../../lib. This works for tests
that live in unittest/<whatever>, but not for those that live in subdirectories
e.g. unittest/Transforms/IPO or unittest/ExecutionEngine/Orc. When building
with BUILD_SHARED_LIBRARIES, such tests don't manage to find their libraries.

Since the tests are run from the build directory, it makes sense to set their
RPATH for the build tree, rather than the install tree. This is the default in
CMake since 2.6, so all we have to do is set CMAKE_BUILD_WITH_INSTALL_RPATH to
OFF for the unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 08:37:15 +00:00
James Molloy
40fa86fff8 [SimplifyCFG] Check PHI uses more accurately
PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG.

Fixes PR30292.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280790 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 08:15:54 +00:00
Hal Finkel
21c68fac74 [PowerPC] Fix address-offset folding for plain addi
When folding an addi into a memory access that can take an immediate offset, we
were implicitly assuming that the existing offset was zero. This was incorrect.
If we're dealing with an addi with a plain constant, we can add it to the
existing offset (assuming that doesn't overflow the immediate, etc.), but if we
have anything else (i.e. something that will become a relocation expression),
we'll go back to requiring the existing immediate offset to be zero (because we
don't know what the requirements on that relocation expression might be - e.g.
maybe it is paired with some addis in some relevant way).

On the other hand, when dealing with a plain addi with a regular constant
immediate, the alignment restrictions (from the TOC base pointer, etc.) are
irrelevant.

I've added the test case from PR30280, which demonstrated the bug, but also
demonstrates a missed optimization opportunity (i.e. we don't need the memory
accesses at all).

Fixes PR30280.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280789 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 07:36:11 +00:00
Elena Demikhovsky
78405002cf AVX512F: FMA intrinsic + FNEG - sequence optimization
The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set.
FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))).
It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead.
I added pattern match for integer XOR.

Differential Revision: https://reviews.llvm.org/D24221



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280785 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 06:54:28 +00:00
Matt Arsenault
1da68f6209 AMDGPU: Make some scalar instructions commutable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280784 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 06:25:55 +00:00
Matt Arsenault
774415bbeb Remove unnecessary call to getAllocatableRegClass
This reapplies r252565 and r252674, effectively reverting r252956.

This allows VS_32/VS_64 to be unallocatable like they should be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280783 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 06:16:45 +00:00
Craig Topper
4cf44f1d2b [X86] Add hasSideEffects=0 to some instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 04:46:15 +00:00
Craig Topper
83d6e111bc [AVX-512] Add support for commuting masked instructions in findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280781 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 04:46:11 +00:00
Saleem Abdulrasool
4ccc33a9ab Revert "CodeGen: ensure that libcalls are always AAPCS CC"
This reverts SVN r280683.  Revert until I figure out why this is breaking lli
tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280778 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 03:17:19 +00:00
Nick Lewycky
36a0b39fae Fix typo in comment, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280774 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 01:49:41 +00:00
Davide Italiano
5c494e7f36 [LTO] Rename variables to be more explicative.
Thanks to Mehdi for the suggestion!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280772 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 01:08:31 +00:00
Davide Italiano
19c8b5e70c [opt] Remove an unused argument to runPassPipeline().
I have plans to use this API also in libLTO (and maybe lld).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280770 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 00:48:47 +00:00
Zachary Turner
8eddc0f657 Re-add "Make FieldList records print as a YAML sequence"
This was originally submitted in r280549, and reverted in r280577
due to breaking one MSVC buildbot.  The issue is that MSVC 2013
doesn't synthesize move constructors.  So even though i was
writing std::move(A) it was copying it, leading to a bogus ArrayRef.
The solution here is to simply remove the std::vector<> from the
type, since it is unused and unnecessary.  This way the ArrayRef
continues to point into the original memory backing the CVType.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 23:45:47 +00:00
Hal Finkel
8d0ebe9402 [DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike)
I might have called this "r246507, the sequel". It fixes the same issue, as the
issue has cropped up in a few more places. The underlying problem is that
isSetCCEquivalent can pick up select_cc nodes with a result type that is not
legal for a setcc node to have, and if we use that type to create new setcc
nodes, nothing fixes that (and so we've violated the contract that the
infrastructure has with the backend regarding setcc node types).

Fixes PR30276.

For convenience, here's the commit message from r246507, which explains the
problem is greater detail:

[DAGCombine] Fixup SETCC legality checking

SETCC is one of those special node types for which operation actions (legality,
etc.) is keyed off of an operand type, not the node's value type. This makes
sense because the value type of a legal SETCC node is determined by its
operands' value type (via the TLI function getSetCCResultType). When the
SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value
type, or directly with the value type provided by TLI.getSetCCResultType.

The first problem being fixed here is that DAGCombine had several places
querying TLI.isOperationLegal on SETCC, but providing the return of
getSetCCResultType, instead of the operand type directly. This does not mean
what the author thought, and "luckily", most in-tree targets have SETCC with
Custom lowering, instead of marking them Legal, so these checks return false
anyway.

The second problem being fixed here is that two of the DAGCombines could create
SETCC nodes with arbitrary (integer) value types; specifically, those that
would simplify:

  (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
     (which is possible for some combinations of (op1, op2))

If the operands of the and|or node are actual setcc nodes, then this is not an
issue (because the and|or must share the same type), but, the relevant code in
DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls
DAGCombiner::isSetCCEquivalent on each operand, and that function will
recognise setcc-like select_cc nodes with other return types. And, thus, when
creating new SETCC nodes, we need to be careful to respect the value-type
constraint. This is even true before type legalization, because it is quite
possible for the SELECT_CC node to have a legal type that does not happen to
match the corresponding TLI.getSetCCResultType type.

To be explicit, there is nothing that later fixes the value types of SETCC
nodes (if the type is legal, but does not happen to match
TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to
work only because, either MVT::i1 is not legal, or it is what
TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change,
however. For the time being, restrict the relevant transformations to produce
only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1
prior to type legalization).

Fixes PR24636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280767 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 23:02:23 +00:00
Vedant Kumar
423e0ff90b [llvm-cov] Use colors consistently in the summary
Use the same color for counts and percentages. There doesn't seem to be
a reason for them to be different, and the summary looks more consistent
this way.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 22:46:00 +00:00
Vedant Kumar
ff7a0df044 [llvm-cov] Clean up the summary class, delete dead code (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 22:45:57 +00:00
Dehao Chen
543123173c Explicitly require DominatorTreeAnalysis pass for instsimplify pass.
Summary: DominatorTreeAnalysis is always required by instsimplify.

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280760 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 22:17:16 +00:00
Ying Yi
84f34c0155 [llvm-cov] Add the project summary to the text coverage report for each source file.
This patch is a spin-off from https://reviews.llvm.org/D23922. It extends the text view to preserve the same feature as the html view.

Differential Revision: https://reviews.llvm.org/D24241


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280756 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 21:41:38 +00:00
Rafael Espindola
fec7b4848f Avoid using alignas and constexpr.
This requires removing the custom allocator, since Demangle cannot
depend on Support and so cannot use Compiler.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280750 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:36:24 +00:00
Konstantin Zhuravlyov
c87867f700 [AMDGPU] Wave and register controls
- Add missing test



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280749 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:29:10 +00:00
Chris Bieneman
dbbdf44025 [CMake] Cleanup LLVM_OPTIMIZED_TABLEGEN
This cleanup removes the need for the native support library to have its own target. That target was only needed because makefile builds were tripping over each other if two tablegen targets were building at the same time. This causes problems because the parallel make invocations through CMake can't communicate with each other. This is fixed by invoking make directly instead of through CMake which is how we handle this in External Project invocations.

The other part of the cleanup is to mark the custom commands as USES_TERMINAL. This is a bit of a hack, but we need to ensure that Ninja generators don't invoke multiple tablegen targets in the same build dir in parallel, because that too would be bad.

Marking as USES_TERMINAL does have some downside for Ninja because it results in decreased parallelism, but correct builds are worth the minor loss and LLVM_OPTIMZIED_TABLEGEN is such a huge win, it is worth it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280748 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:27:07 +00:00
Konstantin Zhuravlyov
1f99c41083 [AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:22:28 +00:00
Rafael Espindola
fe2ae4c80c Try to fix a circular dependency in the modules build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280746 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:16:19 +00:00
Tom Stellard
2ec1430711 AMDGPU/SI: Teach SIInstrInfo::FoldImmediate() to fold immediates into copies
Summary:
I put this code here, because I want to re-use it in a few other places.
This supersedes some of the immediate folding code we have in SIFoldOperands.
I think the peephole optimizers is probably a better place for folding
immediates into copies, since it does some register coalescing in the same time.

This will also make it easier to transition SIFoldOperands into a smarter pass,
where it looks at all uses of instruction at once to determine the optimal way to
fold operands.  Right now, the pass just considers one operand at a time.

Reviewers: arsenm

Subscribers: wdng, nhaehnle, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280744 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:00:26 +00:00