543 Commits

Author SHA1 Message Date
Matt Arsenault
0e0cc9459a Start replacing vector_extract/vector_insert with extractelt/insertelt
These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node name, and has stricter type checking so prefer it.

Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.

Example from AArch64:

def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
          (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;

Which is trying to do sext_inreg i8, i8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255359 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 19:20:16 +00:00
Rafael Espindola
cfc74b78b1 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253436 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-18 06:02:15 +00:00
Benjamin Kramer
6f975c928c Drop code after unreachable. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251278 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-26 09:55:45 +00:00
Benjamin Kramer
165b4f4e46 Convert assert(false) into llvm_unreachable where it makes sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251266 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-25 22:28:27 +00:00
Duncan P. N. Exon Smith
db991cb068 NVPTX: Remove implicit ilist iterator conversions, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250779 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-20 00:54:09 +00:00
Craig Topper
7a2d52ce5d Use std::find instead of manual loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250624 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-17 21:32:28 +00:00
Benjamin Kramer
5f506f370b [NVPTX] Remove dead code.
I left helpers that look useful for debugging alone. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250410 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-15 14:45:41 +00:00
Rafael Espindola
5aa05ff94c git-clang-format r249548.
Sorry for missing this the first time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 20:32:24 +00:00
Rafael Espindola
c6f2e2a584 Use non virtual destructors for sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249548 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 13:46:06 +00:00
Rafael Espindola
93ef2f5a97 Don't repeat names in comments and don't indent in namespaces. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249546 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 13:38:49 +00:00
Rafael Espindola
0eba49c22e Fix pr24486.
This extends the work done in r233995 so that now getFragment (in addition to
getSection) also works for variable symbols.

With that the existing logic to decide if a-b can be computed works even if
a or b are variables. Given that, the expression evaluation can avoid expanding
variables as aggressively and that in turn lets the relocation code see the
original variable.

In order for this to work with the asm streamer, there is now a dummy fragment
per section. It is used to assign a section to a symbol when no other fragment
exists.

This patch is a joint work by Maxim Ostapenko andy myself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249303 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-05 12:07:05 +00:00
Matthias Braun
63daa1436f MachineBasicBlock: Factor out common code into isReturnBlock()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248617 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 21:25:19 +00:00
Eric Christopher
973f7aa32a constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247864 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 23:38:13 +00:00
Daniel Sanders
47b167dd84 Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Eric has replied and has demanded the patch be reverted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247702 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 16:17:27 +00:00
Daniel Sanders
9781f90c7e Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 14:08:28 +00:00
Daniel Sanders
a6aa0c3bcc Revert r247684 - Replace Triple with a new TargetTuple ...
LLDB needs to be updated in the same commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:46:21 +00:00
Daniel Sanders
7b82808e13 Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:17:40 +00:00
Daniel Sanders
c413998d28 Fix namespace indentation and missing blank lines before 'public:' in *MCAsmInfo.h. NFC.
This is to reduce noise in a following commit.

Also fixes a couple missing spaces before the reference operator.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247679 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 12:27:06 +00:00
Bruce Mitchener
767c34a919 Fix typos.
Summary: This fixes a variety of typos in docs, code and headers.

Subscribers: jholewinski, sanjoy, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12626

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 01:17:08 +00:00
Chandler Carruth
6aaf0a68ac [ADT] Switch a bunch of places in LLVM that were doing single-character
splits to actually use the single character split routine which does
less work, and in a debug build is *substantially* faster.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247245 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 06:12:31 +00:00
Artem Belevich
d049b69952 [NVPTX] Added run NVVMReflect pass to NVPTX back-end.
The pass is needed to remove __nvvm_reflect calls when we link in
libdevice bitcode that comes with CUDA.

Differential Revision: http://reviews.llvm.org/D11663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247072 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 21:04:55 +00:00
Bjarke Hammersholt Roune
935e1b6640 [NVPTX] Let NVPTX backend detect integer min and max patterns.
Summary:
Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support.


Reviewers: jholewinski, meheff, jingyue

Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D12377

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246107 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 23:22:02 +00:00
Jingyue Wu
8724a428df [NVPTX] Allow undef value as global initializer
Summary:
__shared__ variable may now emit undef value as initializer, do not
throw error on that.

Test Plan: test/CodeGen/NVPTX/global-addrspace.ll

Patch by Xuetian Weng

Reviewers: jholewinski, tra, jingyue

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D12242

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245785 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-22 05:40:26 +00:00
Jingyue Wu
1670bbc481 [NVPTX] truncating 64-bit to 32-bit is free
Summary:
Add an LSR test that exercises isTruncateFree. Without this change, LSR creates
another indvar representing the truncated value.

Reviewers: jholewinski, eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D12058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245611 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 20:59:02 +00:00
Mark Heffernan
67278b1774 Use 32-bit divides instead of 64-bit divides where possible.
For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor
fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason
is that many index computations are carried out in 64-bits but never actually exceed the
capacity of a 32-bit word.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244684 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-11 22:16:34 +00:00
Benjamin Kramer
d3c712e50b Fix some comment typos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244402 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-08 18:27:36 +00:00
Bjarke Hammersholt Roune
b6d9668365 [NVPTX] Use LDG for pointer induction variables.
More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables.

Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244166 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-05 23:11:57 +00:00
Chandler Carruth
da49414c4b [TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'
rather than 'unsigned' for their costs.

For something like costs in particular there is a natural "negative"
value, that of savings or saved cost. As a consequence, there is a lot
of code that subtracts or creates negative values based on cost, all of
which is prone to awkwardness or bugs when dealing with an unsigned
type. Similarly, we *never* want these values to wrap, as that would
cause Very Bad code generation (likely percieved as an infinite loop as
we try to emit over 2^32 instructions or some such insanity).

All around 'int' seems a much better fit for these basic metrics. I've
added asserts to ensure that at least the TTI interface never returns
negative numbers here. If we ever have a use case for negative numbers,
we can remove this, but this way a bug where someone used '-1' to
produce a 'very large' cost will be caught by the assert.

This passes all tests, and is also UBSan clean.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D11741

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244080 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-05 18:08:10 +00:00
Craig Topper
84bbcfe200 De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-01 22:20:21 +00:00
Jingyue Wu
6ebf387fdd [NVPTX] allow register copy between float and int
Summary:
Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class
register copying (e.g. i32 to f32) becomes possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.

Reviewers: jholewinski

Subscribers: eliben, jholewinski, llvm-commits, bruno

Differential Revision: http://reviews.llvm.org/D11622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243839 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-01 18:02:12 +00:00
Jingyue Wu
bd11ccb2b9 [NVPTX] convert pointers in byval kernel arguments to global
Summary:
For example, in

  struct S {
    int *x;
    int *y;
  };
  __global__ void foo(S s) {
    int *b = s.y;
    // use b
  }

"b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for
accessing "b".

Reviewers: jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243790 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-31 21:44:14 +00:00
Jingyue Wu
c71235ab7d Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX
Summary: Use clang-tidy to simplify boolean conditional return statements

Reviewers: rafael, echristo, chandlerc, bkramer, craig.topper, dexonsmith, chapuni, eliben, jingyue, jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9983

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243734 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-31 05:09:47 +00:00
Jingyue Wu
7e90f694b6 Roll forward r242871
r242871 missed one place that should be guarded with isPhysicalReg. This patch
fixes that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243555 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 18:59:09 +00:00
Jingyue Wu
d63325d5ad Temporarily revert r242871
PR24299


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 15:26:11 +00:00
Jingyue Wu
eb653bd9d1 [NVPTX] run LSR before straight-line optimizations
Summary:
Straight-line optimizations can simplify the loop body and make LSR's
cost analysis more precise. This significantly improves several Eigen3
CUDA benchmarks.

With this change, EigenContractionKernel runs up to 40% faster
(753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h (cl-502)).
EigenConvolutionKernel2D runs up to 10% faster
(753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorConvolution.h (cl-605)).

I have some difficulties writing small tests that benefit from this
reordering due to a seemingly issue with LSR (being discussed at
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088244.html).

See the review thread for the compilation time impact of GVN. 

Reviewers: eliben, jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242982 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-23 04:59:07 +00:00
Jingyue Wu
9764983070 [BranchFolding] do not iterate the aliases of virtual registers
Summary:
MCRegAliasIterator only works for physical registers. So, do not run it
on virtual registers.

With this issue fixed, we can resurrect the BranchFolding pass in NVPTX
backend.

Reviewers: jholewinski, bkramer

Subscribers: henryhu, meheff, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242871 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-22 04:16:52 +00:00
Jingyue Wu
c9f86c1260 [NVPTX] make load on global readonly memory to use ldg
Summary:
[NVPTX] make load on global readonly memory to use ldg

Summary:
As describe in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and compiler can detect whether read-only data cache
is safe to use.

This patch will try to check whether ldg is safe to use and use them to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in 
S3D benchmark of shoc [2]. 

Patched by Xuetian Weng. 

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242713 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-20 21:28:54 +00:00
Eli Bendersky
27bd1ca0d9 Use inbounds GEPs for memcpy and memset lowering
Follow-up on discussion in http://reviews.llvm.org/D11220


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242542 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-17 16:42:33 +00:00
Eli Bendersky
ccd3aa85a9 Streamline the coding style in NVPTXLowerAggrCopies
Make the style consistent with LLVM style throughout and clang-format.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242439 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 20:42:38 +00:00
Jingyue Wu
1241b83d61 [NVPTX] enable SpeculativeExecution in NVPTX
Summary:
SpeculativeExecution enables a series straight line optimizations (such
as SLSR and NaryReassociate) on conditional code. For example,

  if (...)
    ... b * s ...
  if (...)
    ... (b + 1) * s ...

speculative execution can hoist b * s and (b + 1) * s from then-blocks,
so that we have

  ... b * s ...
  if (...)
    ...
  ... (b + 1) * s ...
  if (...)
    ...

Then, SLSR can rewrite (b + 1) * s to (b * s + s) because after
speculative execution b * s dominates (b + 1) * s.

The performance impact of this change is significant. It speeds up the
benchmarks running EigenFloatContractionKernelInternal16x16
(ba68f42fa6/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h (cl-526))
by roughly 2%. Some internal benchmarks that have the above code pattern
are improved by up to 40%. No significant slowdowns are observed on
Eigen CUDA microbenchmarks.

Reviewers: jholewinski, broune, eliben

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 20:13:48 +00:00
Benjamin Kramer
fc0d94c8ec [NVPTX] Don't leak dead instructions after unlinking them from the BasicBlock
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242417 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 16:51:48 +00:00
Eli Bendersky
9e05109e11 Correct lowering of memmove in NVPTX
This fixes https://llvm.org/bugs/show_bug.cgi?id=24056

Also a bit of refactoring along the way.

Differential Revision: http://reviews.llvm.org/D11220


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242413 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 16:27:19 +00:00
Mehdi Amini
9c5961b7ba Move most user of TargetMachine::getDataLayout to the Module one
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

This patch is quite boring overall, except for some uglyness in
ASMPrinter which has a getDataLayout function but has some clients
that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so
some methods are taking a DataLayout as parameter.

Reviewers: echristo

Subscribers: yaron.keren, rafael, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11090

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 06:11:10 +00:00
Mehdi Amini
a5574d611a Remove DataLayout from TargetLoweringObjectFile, redirect to Module
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: yaron.keren, rafael, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11079

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242385 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 06:04:17 +00:00
Mark Heffernan
9be1720729 Enable partial and runtime loop unrolling for NVPTX.
Enable partial and runtime loop unrolling for NVPTX backend via
TTI::UnrollingPreferences with a small threshold. This partially unrolls
small loops which are often unrolled by the PTX to SASS compiler
and unrolling earlier can be beneficial.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242049 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-13 18:33:21 +00:00
Duncan P. N. Exon Smith
16859aa242 MC: Remove MCSubtargetInfo() default constructor
Force all creators of `MCSubtargetInfo` to immediately initialize it,
merging the default constructor and the initializer into an initializing
constructor.  Besides cleaning up the code a little, this makes it clear
that the initializer is never called again later.

Out-of-tree backends need a trivial change: instead of calling:

    auto *X = new MCSubtargetInfo();
    InitXYZMCSubtargetInfo(X, ...);
    return X;

they should call:

    return createXYZMCSubtargetInfoImpl(...);

There's no real functionality change here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241957 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-10 22:43:42 +00:00
Jingyue Wu
020938f3ee [TTI] BasicTTIImpl assumes no vector registers
Summary:
Following the discussion on r241884, it's more reasonable to assume that a
target has no vector registers by default instead of letting every such
target overrides getNumberOfRegisters.

Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to
return 0 when Vector is true, and partially reverts r241884 which
modifies NVPTXTTIImpl::getNumberOfRegisters.

It also fixes a performance bug in LoopVectorizer. Even if a target has
no vector registers, vectorization may still help ILP. So, we need both
checks to be false before disabling loop vectorization all together.

Reviewers: hfinkel

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11108

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241942 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-10 21:14:54 +00:00
Eli Bendersky
98da4704dd Actually support volatile memcpys in NVPTX lowering
Differential Revision: http://reviews.llvm.org/D11091



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241914 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-10 15:40:33 +00:00
Jingyue Wu
dde12814c7 [NVPTX] declare no vector registers
Summary:
Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll)
produces code with complex control flow which hurts later optimizations. Since
NVPTX doesn't have vector registers in LLVM's sense
(NVPTXTTI::getRegisterBitWidth(true) == 32), we for now declare no vector
registers to effectively disable loop vectorization.

Reviewers: jholewinski

Subscribers: jingyue, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241884 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-10 04:31:56 +00:00
Eli Bendersky
89a5e2532d Replace index-loops by range-based loops
NFC


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241875 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-09 23:06:03 +00:00