Commit Graph

128857 Commits

Author SHA1 Message Date
Peter Collingbourne
c593fadd95 ARM: Support relative references using the PREL31 symbol variant.
Differential Revision: http://reviews.llvm.org/D17937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263156 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 19:30:18 +00:00
Balaram Makam
21374d486c Fix testcase to turn buildbot green. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263154 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 19:07:50 +00:00
Nicolai Haehnle
5dae380620 [TableGen] more helpful error message in MapTableEmitter
Differential Revision: http://reviews.llvm.org/D17275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263148 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 18:51:58 +00:00
Teresa Johnson
b738b28f97 Materialize metadata in IRLinker before value mapping
Summary:
Unless we plan to do later postpass metadata linking (ThinLTO special mode),
always invoke metadata materialization at the start of IRLinker::run().
This avoids the need for clients who use lazy metadata loading to
explicitly invoke materializeMetadata before the IRMover, which in
turn invokes IRLinker::run and needs materialized metadata for mapping.

Came up in the context of an LLD issue (D17982).

Reviewers: rafael

Subscribers: silvas, llvm-commits

Differential Revision: http://reviews.llvm.org/D17992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263143 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 18:47:03 +00:00
Tim Northover
07f1262d8a AArch64: remove pseudo-instructions used only for their patterns.
There's no real reason for these pseudos to exist, we should be writing real
patterns even if it is slightly less convenient. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263141 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 18:46:12 +00:00
Nicolai Haehnle
f0eb7094d4 AMDGPU/SI: add llvm.amdgcn.buffer.load/store.format intrinsics
Summary:
They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa
to implement the GL_ARB_shader_image_load_store extension.

The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide
whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling
and loads). However, this is not currently implemented.

For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller"
opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32
is actually implemented since GLSL also only has a vec4 variant of the store
instructions, although it's conceivable that Mesa will want to be smarter
about this in the future.

BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which
has a legacy name, pretends not to access memory, and does not capture the
full flexibility of the instruction.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17277

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263140 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 18:43:50 +00:00
Michael Kuperstein
b448651c49 [X86] Correctly select registers to pop into for x86_64
When trying to replace an add to esp with pops, we need to choose dead
registers to pop into. Registers clobbered by the call and not imp-def'd
by it should be safe. Except that it's not enough to check the register
itself isn't defined, we also need to make sure no overlapping registers
are defined either.
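
The overlap rule described above can be sketched with a toy model. This is not the X86 backend code: `ToyReg`, `overlaps`, and `isSafeToPopInto` are hypothetical names, and each register is modelled as a bitmask of the storage units it covers, so that a 64-bit register shares a unit with its 32-bit sub-register. The separate requirement that the candidate be clobbered by the call is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy register: a name plus a bitmask of the storage units it covers.
// (Hypothetical model for illustration, not an LLVM data structure.)
struct ToyReg {
  const char *name;
  std::uint32_t units;
};

// Two registers overlap when they share at least one unit.
bool overlaps(const ToyReg &a, const ToyReg &b) {
  return (a.units & b.units) != 0;
}

// A candidate is safe to pop into only if no register defined by the call
// overlaps it. Comparing names alone would miss the case where the call
// defines a sub-register (or super-register) of the candidate.
bool isSafeToPopInto(const ToyReg &candidate,
                     const std::vector<ToyReg> &definedByCall) {
  assert(candidate.units != 0 && "a register must cover at least one unit");
  for (const ToyReg &def : definedByCall)
    if (overlaps(candidate, def))
      return false;
  return true;
}
```

In this model, a call that defines a 32-bit register with units `0x1` rules out the 64-bit register with units `0x3`, even though the two have different names.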

This fixes PR26711.

Differential Revision: http://reviews.llvm.org/D18029

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263139 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 18:43:21 +00:00
Balaram Makam
e99741c2d5 [AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2
Summary:
Peephole optimization that generates a single TBZ/TBNZ instruction
for test and branch sequences like in the example below. This handles
the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC

Examples:
   and  w8, w8, #0x400
   cbnz w8, L1
 to
   tbnz w8, #10, L1
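
A minimal sketch of the arithmetic behind the example above, not the actual peephole code: the rewrite is valid when the AND mask has exactly one bit set, and the TBNZ bit number is that bit's index (0x400 gives 10). The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Return the index of the only set bit in `mask`, or -1 if `mask` is zero
// or has more than one bit set (i.e. is not a power of 2).
int singleBitIndex(std::uint64_t mask) {
  if (mask == 0 || (mask & (mask - 1)) != 0)
    return -1;
  int index = 0;
  while ((mask >> index) != 1)
    ++index;
  return index;
}

// Reference semantics of "and w8, w8, #mask ; cbnz w8, L1".
bool branchTakenAndCbnz(std::uint32_t value, std::uint32_t mask) {
  return (value & mask) != 0;
}

// Reference semantics of "tbnz w8, #bit, L1".
bool branchTakenTbnz(std::uint32_t value, unsigned bit) {
  assert(bit < 32 && "w registers are 32 bits wide");
  return ((value >> bit) & 1u) != 0;
}
```

For any single-bit mask, `branchTakenAndCbnz(v, mask)` and `branchTakenTbnz(v, singleBitIndex(mask))` agree for every `v`, which is why the two-instruction sequence can be replaced by one.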

Reviewers: MatzeB, jmolloy, mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17942

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263136 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 17:54:55 +00:00
Sanjay Patel
3ad244cde2 give regression test a meaningful name
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263135 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 17:52:19 +00:00
Alexandros Lamprineas
b4bb1d1359 [ARM] Cortex-R8 support
This patch adds Cortex-R8 to Target Parser and TableGen.
It also adds CodeGen tests for the build attributes.

Patch by Pablo Barrio.

Differential Revision: http://reviews.llvm.org/D17925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263132 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 17:38:41 +00:00
Mehdi Amini
fffed50461 Rename -discard-value-names into -lto-discard-value-names in libLLVMLTO
This is avoiding a naming conflict with opt and llc.
While opt and llc don't link to LTO usually, users that are building a
monolithic libLLVM.dylib and linking the tools to it would have a
runtime error because of the duplicate cl::opt registration.

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263127 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 17:06:52 +00:00
Changpeng Fang
de01cf1028 AMDGPU/SI: Define S_GETREG Intrinsic
Summary:
 Define s_getreg intrinsic to generate s_getreg instruction to read
hardware registers.

Reviewers: tstellarAMD, arsenm

Subscribers: llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D17892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263124 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 16:47:15 +00:00
Saleem Abdulrasool
9dd2ef1cf2 ARM: follow up improvements for SVN r263118
The initial change was insufficiently complete for always getting the semantics
of __builtin_longjmp correct.  The builtin is translated into a
`tInt_eh_sjlj_longjmp` DAG node.  This node set R7 as clobbered.  However, the
code would then follow up with a clobber of R11.  I had failed to notice the
imp-def,kill on R7 in the isel.  Unfortunately, it seems that it is not possible
to conditionalise the Defs list via an !if.  Instead, construct a new parallel
WIN node and prefer that when targeting windows.  This ensures that we now both
correctly model the __builtin_longjmp as well as construct the frame in a more
ABI conformant manner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 16:26:37 +00:00
Chandler Carruth
ffadaf5667 [SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out
of, and I misdiagnosed for months and months.

Andrea has had a patch for this forever, but I just couldn't see how
it was fixing the root cause of the problem. It didn't make sense to me,
even though the patch was perfectly good and the analysis of the actual
failure event was *fantastic*.

Well, I came back to it today because the patch has sat for *far* too
long and needs attention and decided I wouldn't let it go until I really
understood what was going on. After quite some time in the debugger,
I finally realized that in fact I had just missed an important case with
my previous attempt to fix PR22093 in r225149. Not only do we need to
handle loads that won't be split, but stores-of-loads that we won't
split. We *do* actually have enough logic in the presplitting to form
new slices for split stores.... *unless* we decided not to split them!

I'm so sorry that it took me this long to come to the realization that
this is the issue. It seems so obvious in hindsight (of course).
Anyways, the fix becomes *much* smaller and more focused. The fact that
we're left doing integer smashing is related to the FIXME in my original
commit: fundamentally, we're not aggressive about pre-splitting for
loads and stores to the same alloca. If we want to get aggressive about
this, it'll need both what Andrea had put into the proposed fix, but
also a *lot* more logic to essentially iteratively pre-split the alloca
until we can't do any more. As I said in that commit log, it's really
unclear that this is the right call. Instead, the integer blending and
letting targets lower this to narrower stores seems slightly better. But
we definitely shouldn't really go down that path just to fix this bug.

Again, tons of thanks are owed to Andrea and others at Sony for working
on this bug. I really should have seen what was going on here and
re-directed them sooner. =////

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263121 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 15:31:17 +00:00
David L Kreitzer
ce75835901 Unified the handling of returns in the X87 stackifier so that the stackifier
runs successfully on routines containing IRETs. This fixes PR26410.

Differential Revision: http://reviews.llvm.org/D17643


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263120 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 15:14:02 +00:00
NAKAMURA Takumi
7b7de7dc81 Fixup for r263114. llvm::AnalysisBase<CallGraphAnalysis> should be declared as extern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 15:13:00 +00:00
Saleem Abdulrasool
b3f28fc50b ARM: correct __builtin_longjmp on WoA
WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast
to AAPCS which states r7.  This adjusts __builtin_longjmp to not clobber r7 and
to properly restore the frame pointer on execution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263118 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 15:11:09 +00:00
Chandler Carruth
53fa006c35 [CG] Back out my pointless move ctor and add the explicit template
instantiation needed for the mingw dll build bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263114 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 14:33:10 +00:00
Chandler Carruth
df383be7e5 [SROA] Clean up some really weird code, no functionality changed.
We already have the instruction extracted into 'I', just cast that to
a store the way we do for loads. Also, we don't enter the if unless SI
is non-null, so don't test it again for null.

I'm pretty sure the entire test there can be nuked, but this is just the
trivial cleanup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263112 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 14:16:18 +00:00
Elena Demikhovsky
7d23bb996b AVX-512: Fixed a bug in i1 vector zero extending. (Skylake-avx512)
(failed on instruction selection phase)

Differential Revision: http://reviews.llvm.org/D17924



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263111 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 13:44:22 +00:00
Chandler Carruth
77d9151731 [CG] Try adding an explicit move constructor to see if that helps the
one build bot that is crashing on this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263110 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 13:43:06 +00:00
Valery Pykhtin
134ae13a5d [AMDGPU] Fix SMEM instructions encoding/operand namings
Differential Revision: http://reviews.llvm.org/D17651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263108 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 13:06:08 +00:00
Simon Pilgrim
c21bed9d78 [X86][AVX] Improve target shuffle combining of BLEND+zero
The BLEND+zero combine was failing to combine equivalent BLEND masks.

Follow up to D17483 and D17858

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263105 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:50:15 +00:00
Chandler Carruth
f1b2c7b945 [CG] Add a new pass manager printer pass for the old call graph and
actually finish wiring up the old call graph.

There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.

As for why I'm still keeping the old call graph alive, it's so that I can
port GlobalsAA to the new pass manager without forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until it's necessary. The old call graph
works just fine for GlobalsAA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263104 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:24:11 +00:00
Chandler Carruth
c9fd7655a0 [LCG] Spell the printing pass pipeline name for the lazy call graph
'lcg' instead of just 'cg'.

This makes it consistent with the analysis name of 'lcg'.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263103 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:24:06 +00:00
Simon Pilgrim
eda788c7b7 [X86][SSE] Basic combining of unary target shuffles of binary target shuffles.
This patch reorders the combining of target shuffle masks so that when a unary shuffle takes a binary shuffle as its input but only references one of its inputs it can correctly combine into a unary shuffle mask.

This is starting to encroach on the purpose of resolveTargetShuffleInputs, but I don't want to remove it until we definitely know we won't need it for full binary shuffle combining.

There is a lot more work before we can properly support binary target shuffle masks but this was an easy case to add support for.

Differential Revision: http://reviews.llvm.org/D17858

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263102 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:23:51 +00:00
Chandler Carruth
7a06f63f10 [CG] Actually hoist up the generic CallGraphPrinter pass from a weird
location in the opt tool to live along side the analysis in LLVM's
libraries.

No functionality changed here, but this will allow me to port the
printer to the new pass manager as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263101 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:08:44 +00:00
Chandler Carruth
562873f9be [CG] Rename the DOT printing pass to actually reference "DOT".
There is another pass by the generic name 'CallGraphPrinter' which is
actually just a call graph printer tucked away inside the opt tool. I'd
like to bring it out and make it follow the same patterns as the rest of
the CallGraph code, but doing so would end up conflicting with the name
of the DOT printing pass. So this makes the DOT printing pass name be
more precise.

No functionality changed here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263100 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 11:04:40 +00:00
Elena Demikhovsky
4196f55435 AVX-512: Fixed a bug in shuffle for v64i8 type
Operation SCALAR_TO_VECTOR for v64i8 and v32i16 should be lowered if BW feature is "on".

Differential Revision: http://reviews.llvm.org/D17994



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263097 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 08:32:09 +00:00
Vedant Kumar
ee68a6ce38 [opt] Fix description of the -disable-verify flag
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263096 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 06:58:53 +00:00
Mark Lacey
2a9c5591ed Add an LLVM_BUILTIN_DEBUGTRAP macro.
Summary:
This provides a macro that expands to __builtin_debugtrap() for clang,
and __debugbreak() for MSVC.

It intentionally expands to nothing for compilers that do not support a
similar mechanism that halts the debugger without otherwise crashing the
process.
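
A sketch of how such a macro could be laid out, assuming the usual `__has_builtin` feature test. `SKETCH_DEBUGTRAP` and `checkedIndex` are hypothetical names used for illustration; this is not the definition that was committed.

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h> // declares __debugbreak()
#endif

// Compilers without __has_builtin report that no builtin is available.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_debugtrap)
#define SKETCH_DEBUGTRAP() __builtin_debugtrap()
#elif defined(_MSC_VER)
#define SKETCH_DEBUGTRAP() __debugbreak()
#else
// No comparable mechanism: expand to nothing rather than crash the process.
#define SKETCH_DEBUGTRAP()
#endif

// Example use: stop in the debugger on a bad index, then carry on.
int checkedIndex(int index, int size) {
  assert(size >= 0);
  if (index < 0 || index >= size) {
    SKETCH_DEBUGTRAP();
  }
  return index;
}
```

Expanding to nothing in the fallback branch keeps call sites compiling everywhere, at the cost of the check silently doing nothing on unsupported compilers.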

Differential Revision: http://reviews.llvm.org/D18002

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263095 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 05:15:03 +00:00
Roman Levenstein
c461035332 Add support for a preserve_most calling convention to the AArch64 backend.
This change adds support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add tail-call support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263092 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 04:35:09 +00:00
Vedant Kumar
d449fa84ed [opt] Only create Verifier passes when requested
opt adds Verifier passes in AddOptimizationPasses even if
-disable-verify is on. Fix it so that the extra verification occurs
either when (1) -disable-verify is off, or (2) -verify-each is on.

Thanks to David Jones for pointing out this behavior!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263090 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 03:40:14 +00:00
Michael Zolotukhin
225dd82d63 [SLP] Add -slp-min-reg-size command line option.
MinVecRegSize is currently hardcoded to 128; this patch adds a cl::opt
to allow changing it. I tried not to change any existing behavior for the default
case.

Differential revision: http://reviews.llvm.org/D13278

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263089 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 02:49:47 +00:00
Mehdi Amini
0693d31138 Add an entry in the Release Notes for LLVMContext::discardValueNames()
From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 02:18:17 +00:00
Mehdi Amini
2de9927963 Add a flag to the LLVMContext to disable name for Value other than GlobalValue
Summary:
This is intended to be a performance flag, on the same level as clang
cc1 option "--disable-free". LLVM will never initialize it by default,
it will be up to the client creating the LLVMContext to request this
behavior. Clang will do it by default in Release build (just like
--disable-free).

"opt" and "llc" can opt in using the -discard-value-names command line
option.

When performing LTO on llvm-tblgen, the initial merging of IR peaks
at 92MB without this patch, and 86MB after this patch; setNameImpl()
drops from 6.5MB to 0.5MB.
The total link time goes from ~29.5s to ~27.8s.

Compared to a compile-time flag (like the IRBuilder one), it performs
very similarly. I profiled SROA and obtained these results:

 420ms with an IRBuilder that preserves names
 372ms with an IRBuilder that strips names
 375ms with an IRBuilder that preserves names, plus a runtime flag to strip them

Reviewers: chandlerc, dexonsmith, bogner

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17946

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 01:28:54 +00:00
Chandler Carruth
d18bb9e06b [gvn] Fix more indenting and formatting in regions of code that will
need to be changed for porting to the new pass manager.

Also sink the comment on the ValueTable class back to that class instead
of it dangling on an anonymous namespace.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 00:58:20 +00:00
Chandler Carruth
61c136c43e [gvn] Reformat a chunk of the GVN code that is strangely indented prior
to restructuring it for porting to the new pass manager.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 00:58:18 +00:00
Chandler Carruth
c5266b5293 [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticeable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263082 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-10 00:55:30 +00:00
Philip Reames
d8f0e4d761 [BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAA
MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA.

In principle, this can expose slightly more optimization opportunities, but since essentially all of our aliasing-aware memory optimization passes go through MDA, this will likely be NFC in practice.

Differential Revision: http://reviews.llvm.org/D15912



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263075 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 23:19:56 +00:00
Philip Reames
3239eb1016 [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not affect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirety of the fast path.

This patch only handles addressing computations, but in principle, we could implement a more general cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 23:13:12 +00:00
Philip Reames
78e37a90ad Fix the build
I screwed up rebasing 263072.  This change fixes the build and passes all make check.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263073 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 23:07:53 +00:00
Philip Reames
34c171d3ad [LICM] Store promotion when memory is thread local
This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provably thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us to perform store promotion even in cases where the store is not guaranteed to execute in the loop.
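
As a rough source-level picture of what store promotion does here (a hand-written model, not LICM output; the function names are hypothetical), the "before" loop stores to a fresh, unescaped allocation only on some iterations, and the "after" loop keeps the value in a local and stores once, unconditionally, after the loop. The extra store is acceptable only because no other thread can observe the location.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Before: the store to *acc is conditional, so it may never execute.
int sumPositivesBefore(const std::vector<int> &values) {
  std::unique_ptr<int> acc(new int(0)); // fresh allocation, never escapes
  assert(acc != nullptr);
  for (int v : values)
    if (v > 0)
      *acc += v; // load and store inside the loop
  return *acc;
}

// After: the value lives in a local during the loop. The final store runs
// even if the loop body never stored, which is the step that needs the
// thread-local reasoning.
int sumPositivesAfter(const std::vector<int> &values) {
  std::unique_ptr<int> acc(new int(0));
  assert(acc != nullptr);
  int promoted = *acc; // single load before the loop
  for (int v : values)
    if (v > 0)
      promoted += v;
  *acc = promoted; // single unconditional store after the loop
  return *acc;
}
```

Both functions return the same result for every input; the difference is only in how often memory is touched.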

Two key assumptions are worth drawing out: this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.

In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.

Differential Revision: http://reviews.llvm.org/D16783



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 22:59:30 +00:00
Sanjay Patel
d05dce8ca6 [x86] fix cost model inaccuracy for vector memory ops
The irony of this patch is that one CPU that is affected is AMD Jaguar, and Jaguar
has a completely double-pumped AVX implementation. But getting the cost model to
reflect that is a much bigger problem. The small goal here is simply to improve on
the lie that !AVX2 == SandyBridge.

Differential Revision: http://reviews.llvm.org/D18000



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263069 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 22:23:33 +00:00
Derek Schuff
70438f5c80 [WebAssembly] Update known gcc test failures
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 22:14:33 +00:00
Sanjay Patel
40847a4c84 [x86, AVX] optimize masked loads with constant masks
Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper.

Differential Revision: http://reviews.llvm.org/D17899


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 22:12:08 +00:00
Philip Reames
352b0048ba [ValueTracking] Extract isKnownPositive [NFCI]
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263062 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 21:31:47 +00:00
Philip Reames
37f4f50139 [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0)
When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min.
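
The identity can be checked exhaustively over a small range. The sketch below is plain C++ rather than InstCombine code, and `sminFoldHolds` is a hypothetical helper: for every strictly positive `a` and every `b` in the given range, it compares `smin(a, b) > 0` against `b > 0`.

```cpp
#include <algorithm>
#include <cassert>

// Returns true if (min(a, b) > 0) == (b > 0) for every a in [1, hi] and
// every b in [lo, hi]. `a` plays the role of the known-positive operand.
bool sminFoldHolds(int lo, int hi) {
  assert(lo <= hi && hi >= 1);
  for (int a = 1; a <= hi; ++a) {
    for (int b = lo; b <= hi; ++b) {
      bool original = std::min(a, b) > 0; // icmp sgt (smin a, b), 0
      bool folded = b > 0;                // icmp sgt b, 0
      if (original != folded)
        return false;
    }
  }
  return true;
}
```

Calling `sminFoldHolds(-128, 127)` covers the whole signed 8-bit range and returns true, matching the case analysis in the message.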

Differential Revision: http://reviews.llvm.org/D17873



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263059 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 21:05:07 +00:00
Adam Nemet
cc638e59fd [LLE] Add missing check for unit stride
I somehow missed this.  The case in GCC (global_alloc) was similar to
the new testcase except it had an array of structs rather than a two
dimensional array.

Fixes PR26885.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263058 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 20:47:55 +00:00
Evandro Menezes
1a0cda750f [AArch64] Minor reformatting (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263054 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 19:56:38 +00:00