Commit Graph

36122 Commits

Author SHA1 Message Date
Mehdi Amini
1ebf01e084 ThinLTOCodeGenerator: preserve linkonce when in "MustPreserved" set
If the linker specifically requested for a linkonce to be preserved,
we need to make sure we won't drop it even if all the uses in the
current module disappear.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267543
2016-04-26 10:35:01 +00:00
Renato Golin
f67be413be Revert "ARM: put correct symbol index on indirect pointers in __thread_ptr."
This reverts commit r267488, as it broke some ARM buildbots.

llvm-svn: 267541
2016-04-26 10:02:02 +00:00
Sanjoy Das
938a0f7381 Symbolize operand bundle blocks for bcanalyzer
Reviewers: joker.eph

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19523

llvm-svn: 267524
2016-04-26 05:59:08 +00:00
Craig Topper
b7db006017 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Craig Topper
4513730c09 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper
783834f3cf [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Dehao Chen
23b3e63a01 Tune basic block annotation algorithm.
Summary:
Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary.

This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19301

llvm-svn: 267519
2016-04-26 04:59:11 +00:00
Bill Seurer
ac21264a52 [powerpc] mark JIT tests as UNSUPPORTED on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as UNSUPPORTED for now.

If this is fixed remove the UNSUPPORTED flag "powerpc64-unknown-linux-gnu".

In r267516 I marked these as XFAIL but they succeed on some of the bots
on stage1.

llvm-svn: 267518
2016-04-26 03:59:19 +00:00
Richard Trieu
0f158489d6 Pass the test file in through stdin instead of by filename.
When passed in via filename, this test will fail if the path to the test
has the strings "f1" and "f2" in somewhere.  Pass the file through stdin
to prevent test failures due to coincidences in path names.

llvm-svn: 267517
2016-04-26 03:43:49 +00:00
Bill Seurer
fbfd7df6aa [powerpc] mark JIT tests as XFAIL on powerpc64 big endian
Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit 
r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian.  
To get the buildbots running I am marking these as XFAIL for now.

If this is fixed remove the XFAIL flag "powerpc64-unknown-linux-gnu".

llvm-svn: 267516
2016-04-26 02:33:22 +00:00
Hal Finkel
bee4e0db1c [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging
When SimplifyCFG merges identical instructions from both sides of a diamond, it
can preserve !llvm.mem.parallel_loop_access (as it does with most of the other
metadata). There's no real data or control dependency change in this case.

llvm-svn: 267515
2016-04-26 02:06:06 +00:00
Hal Finkel
7956f36dac [LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops
I really thought we were doing this already, but we were not. Given this input:

void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent deferenceability) was not elided.

One subtlety: As implemented, it will still prefer to use a masked-load
instrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.

The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if conversion is okay.

Differential Revision: http://reviews.llvm.org/D19512

llvm-svn: 267514
2016-04-26 02:00:36 +00:00
Dan Gohman
2ad6a4c0e6 [WebAssembly] Account for implicit operands when computing operand indices.
llvm-svn: 267511
2016-04-26 01:40:56 +00:00
Sanjay Patel
b5dda51232 [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488

llvm-svn: 267504
2016-04-26 00:47:39 +00:00
Justin Bogner
4168e588c8 PM: Port GlobalOpt to the new pass manager
llvm-svn: 267499
2016-04-26 00:28:01 +00:00
Ahmed Bougacha
b6c12fe106 [X86] Use LivePhysRegs in X86FixupBWInsts.
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.

Differential Revision: http://reviews.llvm.org/D19472

llvm-svn: 267495
2016-04-26 00:00:48 +00:00
Justin Bogner
504105c02c GlobalOpt: Convert a bunch of tests from grep to FileCheck
llvm-svn: 267493
2016-04-25 23:36:50 +00:00
Sanjay Patel
4f931e53db Add check for "branch_weights" with prof metadata
While we're here, fix the comment and variable names to make it
clear that these are raw weights, not percentages.

llvm-svn: 267491
2016-04-25 23:15:16 +00:00
James Y Knight
439f0092c7 [Sparc] Fix double-float fabs and fneg on little endian CPUs.
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.

However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.

Thus, this expansion must check the endianness to use the correct
subregister.

llvm-svn: 267489
2016-04-25 22:54:09 +00:00
Tim Northover
7337ed82b2 ARM: put correct symbol index on indirect pointers in __thread_ptr.
Otherwise the linker has no idea what should be resolved.

llvm-svn: 267488
2016-04-25 22:36:07 +00:00
Arch D. Robison
d92b3378c4 Optimize store of "bitcast" from vector to aggregate.
This patch is what was the "instcombine" portion of D14185, with an additional 
test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). 
The patch causes instcombine to replace sequences of extractelement-insertvalue-store 
that act essentially like a bitcast followed by a store.

Differential review: http://reviews.llvm.org/D14260

llvm-svn: 267482
2016-04-25 22:22:39 +00:00
Tim Northover
66f8d5ae59 ARM: put extern __thread stubs in a special section.
The linker needs to know that the symbols are thread-local to do its job
properly.

llvm-svn: 267473
2016-04-25 21:12:04 +00:00
Quentin Colombet
7f8c56085e Re-apply r267206 with a fix for the encoding problem: when the immediate of
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.

[AArch64] Fix optimizeCondBranch logic.

The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267465
2016-04-25 20:54:08 +00:00
Matt Arsenault
b60850cb10 AMDGPU: Implement addrspacecast
llvm-svn: 267452
2016-04-25 19:27:24 +00:00
Matt Arsenault
524b24258c AMDGPU: Add queue ptr intrinsic
llvm-svn: 267451
2016-04-25 19:27:18 +00:00
Evgeniy Stepanov
1e42bd82e8 [gold] Fix linkInModule and extend common.ll test.
Fix early exit from linkInModule. IRMover::move returns false on
success and true on error.

Add a few more cases of merged common linkage variables with
different sizes and alignments.

llvm-svn: 267437
2016-04-25 18:23:29 +00:00
Chad Rosier
879234fa76 Fix typo from r267432.
llvm-svn: 267436
2016-04-25 18:20:27 +00:00
Krzysztof Parzyszek
0644ccf3c4 [Hexagon] Use llvm-mc instead of llc in an MC testcase
Remember to svn add the new file.

llvm-svn: 267435
2016-04-25 18:09:36 +00:00
Krzysztof Parzyszek
9da6529b53 [Hexagon] Use llvm-mc instead of llc in an MC testcase
llvm-svn: 267434
2016-04-25 18:08:33 +00:00
Krzysztof Parzyszek
7bc31501a1 [Hexagon] Register save/restore functions do not follow regular conventions
Do not mark them as modifying any of the volatile registers by default.

llvm-svn: 267433
2016-04-25 17:49:44 +00:00
Chad Rosier
95ab8b0612 [ValueTracking] Add an additional test case for r266767 where one operand is a const.
llvm-svn: 267432
2016-04-25 17:41:48 +00:00
Zachary Turner
cf69a7a23d Resubmit "Refactor raw pdb dumper into library"
This fixes a number of endianness issues as well as an ODR
violation that hopefully causes everything to be happy.

llvm-svn: 267431
2016-04-25 17:38:08 +00:00
Chad Rosier
5a0c6f041b [ValueTracking] Improve isImpliedCondition when the dominating cond is false.
llvm-svn: 267430
2016-04-25 17:23:36 +00:00
Adrian Prantl
5d07243529 dsymutil: Only warn about clang module DWO id mismatches in verbose mode.
Until PR27449 (https://llvm.org/bugs/show_bug.cgi?id=27449) is fixed in
clang this warning is pointless, since ASTFileSignatures will change
randomly when a module is rebuilt.

rdar://problem/25610919

llvm-svn: 267427
2016-04-25 17:04:32 +00:00
Sanjay Patel
2bdc7d43ef add tests for potential CGP transform (PR27344)
llvm-svn: 267426
2016-04-25 16:56:52 +00:00
Marcin Koscielnicki
de3ced2d10 [PR27390] [CodeGen] Reject indexed loads in CombinerDAG.
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%typ = type { i32, i32 }

define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
  %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
  %1 = load i32, i32* %b, align 4
  %2 = ptrtoint i32* %b to i64
  %3 = and i64 %2, -35184372088833
  %4 = inttoptr i64 %3 to i32*
  %_msld = load i32, i32* %4, align 4
  %zzz = add i32 %1,  %_msld
  ret i32 %zzz
}

Fix this by checking ResNo.

I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them.  In fact, they might not be triggerable at all,
at least with current targets.  Still, better safe than sorry.

Differential Revision: http://reviews.llvm.org/D19202

llvm-svn: 267420
2016-04-25 15:43:44 +00:00
Hrvoje Varga
e25aaa57ef [mips][microMIPS] Revert commit r267137
Commit r267137 was the reason for failing tests in LLVM test suite.

llvm-svn: 267419
2016-04-25 15:40:08 +00:00
Zlatko Buljan
eabfffce01 [mips][microMIPS] Revert commit r266977
Commit r266977 was reason for failing LLVM test suite with error message: fatal error: error in backend: Cannot select: t17: i32 = rotr t2, t11 ...

llvm-svn: 267418
2016-04-25 15:34:57 +00:00
Sanjay Patel
74d1b00e49 [x86] auto-generate checks for cmov tests
llvm-svn: 267417
2016-04-25 15:26:57 +00:00
David Majnemer
c78d15d0c3 [WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH successors
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.

This resulted in split points which were insufficiently early if an
invoke was present.

This fixes PR27501.

N.B.  This removes getLandingPadSuccessor.

llvm-svn: 267412
2016-04-25 14:31:32 +00:00
Silviu Baranga
6c665bec7d [ARM] Add support for the X asm constraint
Summary:
This patch adds support for the X asm constraint.

To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).

Fixes PR26493

Reviewers: t.p.northover, echristo, rengolin

Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19061

llvm-svn: 267411
2016-04-25 14:29:18 +00:00
Artem Tamazov
7fa01faba1 [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.
Added hwreg(reg[,offset,width]) syntax.
Default offset = 0, default width = 32.
Possibility to specify 16-bit immediate kept.
Added out-of-range checks.
Disassembling is always to hwreg(...) format.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19329

llvm-svn: 267410
2016-04-25 14:13:51 +00:00
Krzysztof Parzyszek
18108f55f4 [Hexagon] Correctly set "Flags" in ELF header
llvm-svn: 267397
2016-04-25 12:49:47 +00:00
James Molloy
4181657e21 [GlobalOpt] Allow constant globals to be SRA'd
The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).

There seems to be no reason why we can't SRA constants too, so let's do it.

llvm-svn: 267393
2016-04-25 10:48:29 +00:00
Igor Kudrin
7e40f64026 [Coverage] Restore the correct count value after processing a nested region in case of combined regions.
If several regions cover the same area of code, we have to restore
the combined value for that area when return from a nested region.

This patch achieves that by combining regions before calling buildSegments.

Differential Revision: http://reviews.llvm.org/D18610

llvm-svn: 267390
2016-04-25 09:43:37 +00:00
Silviu Baranga
9d55b40815 [SCEV] Improve the run-time checking of the NoWrap predicate
Summary:
This implements a new method of run-time checking the NoWrap
SCEV predicates, which should be easier to optimize and nicer
for targets that don't correctly handle multiplication/addition
of large integer types (like i128).

If the AddRec is {a,+,b} and the backedge taken count is c,
the idea is to check that |b| * c doesn't have unsigned overflow,
and depending on the sign of b, that:

   a + |b| * c >= a (b >= 0) or
   a - |b| * c <= a (b <= 0)

where the comparisons above are signed or unsigned, depending on
the flag that we're checking.

The advantage of doing this is that we avoid extending to a larger
type and we avoid the multiplication of large types (multiplying
i128 can be expensive).

Reviewers: sanjoy

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D19266

llvm-svn: 267389
2016-04-25 09:27:16 +00:00
Marcin Koscielnicki
5a59940bb3 [PowerPC] [PR27387] Disallow r0 for ADD8TLS.
ADD8TLS, a variant of add instruction used for initial-exec TLS,
currently accepts r0 as a source register.  While add itself supports
r0 just fine, linker can relax it to a local-exec sequence, converting
it to addi - which doesn't support r0.

Differential Revision: http://reviews.llvm.org/D19193

llvm-svn: 267388
2016-04-25 09:24:34 +00:00
Michael Zuckerman
4cc752e542 Fixing wrong mask size error. From __mmask8 to __mmask16.
Was reviewed over the shoulder by AsafBadouh.
Connected to review http://reviews.llvm.org/D19195.

llvm-svn: 267379
2016-04-25 05:27:51 +00:00
Craig Topper
65fd322915 [X86] Add a complete set of tests for all operand sizes of cttz/ctlz with and without zero undef being lowered to bsf/bsr.
llvm-svn: 267373
2016-04-25 01:01:15 +00:00
Adrian Prantl
0510e5ff3d Verifier: Verify that each inlinable callsite of a debug-info-bearing function
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.

rdar://problem/25878916

This reaplies r267320 without changes after fixing an issue in the OpenMP IR
generator in clang.

llvm-svn: 267370
2016-04-24 22:23:13 +00:00
Rafael Espindola
feeb4744f9 Also check the IR.
llvm-svn: 267367
2016-04-24 21:42:56 +00:00
Rafael Espindola
5305051507 Add a test for how we handle protected visibility.
llvm-svn: 267366
2016-04-24 21:30:18 +00:00
Simon Pilgrim
386d31614a [X86][AVX] Added PR24935 test case
llvm-svn: 267362
2016-04-24 20:30:48 +00:00
Saleem Abdulrasool
13a27b5a96 ARM: fix __chkstk Frame Setup on WoA
This corrects the MI annotations for the stack adjustment following the __chkstk
invocation.  We were marking the original SP usage as a Def rather than Kill.
The (new) assigned value is the definition, the original reference is killed.

Adjust the ISelLowering to mark Kills and FrameSetup as well.

This partially resolves PR27480.

llvm-svn: 267361
2016-04-24 20:12:48 +00:00
Simon Pilgrim
4943bd650f [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.

llvm-svn: 267359
2016-04-24 18:35:59 +00:00
Simon Pilgrim
1b73eef3b4 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).

2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements

3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

Differential Revision: http://reviews.llvm.org/D19318

llvm-svn: 267357
2016-04-24 18:23:14 +00:00
Simon Pilgrim
cd433f92ba [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)

2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

Differential Revision: http://reviews.llvm.org/D17490

llvm-svn: 267356
2016-04-24 18:12:42 +00:00
Simon Pilgrim
30bc5e06bf [X86][SSE] Added SSSE3/AVX/AVX2 BITREVERSE tests
Codegen is pretty bad at the moment but could use PSHUFB quite efficiently 

llvm-svn: 267347
2016-04-24 15:45:06 +00:00
Simon Pilgrim
9549c9fa45 [X86][XOP] Fixed VPPERM permute op decoding (PR27472).
Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.

llvm-svn: 267346
2016-04-24 15:05:04 +00:00
Simon Pilgrim
1211a431ae [X86][SSE] Improved support for decoding target shuffle masks through bitcasts
Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transfered to xmm.

This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.

llvm-svn: 267343
2016-04-24 14:53:54 +00:00
Marcin Koscielnicki
6b999dbbc0 [SystemZ] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on s390x.  The previous attempt at this was D19101,
which was before LOAD_STACK_GUARD existed.  Compared to the previous
version, this always emits a rather ugly block of 4 instructions, involving
a thread pointer load that can't be shared with other potential users.
However, this is necessary for SSP - spilling the guard value (or thread
pointer used to load it) is counter to the goal, since it could be
overwritten along with the frame it protects.

Differential Revision: http://reviews.llvm.org/D19363

llvm-svn: 267340
2016-04-24 13:57:49 +00:00
Simon Pilgrim
79561eb759 [X86][SSE] Demonstrate issue with decoding shuffle masks that have been lowered as rematerialized constants on scalar unit
Found whilst investigating PR27472

llvm-svn: 267339
2016-04-24 13:45:30 +00:00
NAKAMURA Takumi
013ddcd61c llvm/test/tools/gold/X86/thinlto.ll: Possible fix corresponding to r267318.
llvm-svn: 267334
2016-04-24 08:02:00 +00:00
Duncan P. N. Exon Smith
edafeb956c BitcodeReader: Fix some holes in upgrade from r267296
Add tests for some missing cases to bitcode upgrade in r267296.

  - DICompositeType with an 'elements:' field, which will cause it to be
    involved in a cycle after the upgrade.

  - A DIDerivedType that references a class in 'extraData:'.

I updated test/Bitcode/dityperefs-3.8.ll with the missing cases and
regenerated test/Bitcode/dityperefs-3.8.ll.bc.

llvm-svn: 267332
2016-04-24 06:52:01 +00:00
Mehdi Amini
ae8cf7ca98 Add "hasSection" flag in the Summary
Reviewers: tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19405

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267329
2016-04-24 05:31:43 +00:00
Gerolf Hoflehner
f601ce3709 [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.

Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267328
2016-04-24 05:14:01 +00:00
Adrian Prantl
72cad9d8f8 Revert "Verifier: Verify that each inlinable callsite of a debug-info-bearing function"
This reverts commit r267320 while investigating an OpenMP buildbot failure.

llvm-svn: 267322
2016-04-24 03:47:37 +00:00
Adrian Prantl
dcfa52d905 Verifier: Verify that each inlinable callsite of a debug-info-bearing function
in a debug-info-bearing function has a debug location attached to it. Failure to
do so causes an "!dbg attachment points at wrong subprogram for function"
assertion failure when the inliner sets up inline scope info.

rdar://problem/25878916

llvm-svn: 267320
2016-04-24 03:23:02 +00:00
Mehdi Amini
9e35f9ec46 Reorganize GlobalValueSummary with a "Flags" bitfield.
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.

Differential Revision: http://reviews.llvm.org/D19404

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
2016-04-24 03:18:18 +00:00
Mehdi Amini
e000bb6ae8 Add a version field in the bitcode for the summary
Differential Revision: http://reviews.llvm.org/D19456

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267318
2016-04-24 03:18:11 +00:00
Mehdi Amini
c68b6482c1 Add an internalization step to the ThinLTOCodeGenerator
Keeping as much as possible internal/private is
known to help the optimizer. Let's try to benefit from
this in ThinLTO.
Note: this is early work, but is enough to build clang (and
all the LLVM tools). I still need to write some lit-tests...

Differential Revision: http://reviews.llvm.org/D19103

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267317
2016-04-24 03:18:01 +00:00
Craig Topper
917b9d7a47 [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
llvm-svn: 267311
2016-04-24 02:01:22 +00:00
Davide Italiano
e5bf42cb61 [RuntimeDyldELF] Handle GOTPCRELX/REX_GOTPCRELX.
llvm-svn: 267309
2016-04-24 01:36:37 +00:00
Davide Italiano
0fa7ff1c0f [MC/ELF] Make the relaxation test more interesting.
Add a case where we can't relax.

llvm-svn: 267308
2016-04-24 01:08:35 +00:00
Davide Italiano
734b1e3ae5 [MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX.
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).

llvm-svn: 267307
2016-04-24 01:03:57 +00:00
Mehdi Amini
deab28a4fa Relax test using CHECK-DAG instead of CHECK-NEXT
It seems we still have some ordering issue in the combined index
emission, but I can't figure out why right now.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267306
2016-04-24 00:25:15 +00:00
Mehdi Amini
9b7843d732 Fix test stability (was sensitive to the path)
This is a fixup for r267304.
The test was sensitive to the path in a subtle way:
the index in memory is sorted by GUID, which are hashes
that include the source filename for local globals.
Teresa recently added a directive at the IR level, so
we can specify it here to make the test independent of
the path.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267305
2016-04-24 00:03:57 +00:00
Mehdi Amini
87de206ac4 Store and emit original name in combined index
Summary:
As discussed in D18298, some local globals can't
be renamed/promoted (because they have a section, or because
they are referenced from inline assembly).
To be able to detect naming collision, we need to keep around
the "GUID" using their original name without taking the linkage
into account.

Reviewers: tejohnson

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19454

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267304
2016-04-23 23:38:17 +00:00
Mehdi Amini
bac8a53909 Always traverse GlobalVariable initializer when computing the export list
Summary:
We are always importing the initializer for a GlobalVariable.
So if a GlobalVariable is in the export-list, we pull in any
refs as well.

Reviewers: tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19102

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267303
2016-04-23 23:29:24 +00:00
Duncan P. N. Exon Smith
94512f24de DebugInfo: Remove MDString-based type references
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*.  It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.

Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType.  The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.

This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata.  Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.

The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":

  http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html

llvm-svn: 267296
2016-04-23 21:08:00 +00:00
Renato Golin
d3f349208c Revert "[AArch64] Fix optimizeCondBranch logic."
This reverts commit r267206, as it broke self-hosting on AArch64.

llvm-svn: 267294
2016-04-23 19:30:52 +00:00
Simon Pilgrim
581d3c8ed8 [X86][XOP] Added VPPERM -> BLEND-WITH-ZERO Test
Currently failing due to poor blend matching, found whilst investigating PR27472

llvm-svn: 267282
2016-04-23 11:14:18 +00:00
Benjamin Kramer
7020201be0 Use %T instead of cd'ing to Output directly.
%T expands to Output if not configured differently.

llvm-svn: 267281
2016-04-23 11:01:36 +00:00
Craig Topper
aebe1f85c0 [CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
llvm-svn: 267280
2016-04-23 05:20:47 +00:00
Teresa Johnson
d9e21e2b3b [gold] Gate value name discarding under save-temps
Summary:
This removes a couple of flags added to control this behavior, and
simply keeps all value names when save-temps is specified.

Reviewers: rafael

Subscribers: llvm-commits, pcc, davide

Differential Revision: http://reviews.llvm.org/D19384

llvm-svn: 267279
2016-04-23 05:15:59 +00:00
Duncan P. N. Exon Smith
5b72a15d9d BitcodeWriter: Emit uniqued subgraphs after all distinct nodes
Since forward references for uniqued node operands are expensive (and
those for distinct node operands are cheap due to
DistinctMDOperandPlaceholder), minimize forward references in uniqued
node operands.

Moreover, guarantee that when a cycle is broken by a distinct node, none
of the uniqued nodes have any forward references.  In
ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs
first, delaying distinct nodes until all uniqued nodes have been
handled.  This guarantees that uniqued nodes only have forward
references when there is a uniquing cycle (since r267276 changed
ValueEnumerator::organizeMetadata to partition distinct nodes in front
of uniqued nodes as a post-pass).

Note that a single uniqued subgraph can hit multiple distinct nodes at
its leaves.  Ideally these would themselves be emitted in post-order,
but this commit doesn't attempt that; I think it requires an extra pass
through the edges, which I'm not convinced is worth it (since
DistinctMDOperandPlaceholder makes forward references quite cheap
between distinct nodes).

I've added two testcases:

  - test/Bitcode/mdnodes-distinct-in-post-order.ll is just like
    test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes
    instead of uniqued ones.  This confirms that, in the absence of
    uniqued nodes, distinct nodes are still emitted in post-order.

  - test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal
    example where a naive post-order traversal would cause one uniqued
    node to forward-reference another.  IOW, it's the motivating test.

llvm-svn: 267278
2016-04-23 04:59:22 +00:00
Duncan P. N. Exon Smith
f5921278f9 BitcodeWriter: Emit distinct nodes before uniqued nodes
When an operand of a distinct node hasn't been read yet, the reader can
use a DistinctMDOperandPlaceholder.  This is much cheaper than forward
referencing from a uniqued node.  Change
ValueEnumerator::organizeMetadata to partition distinct nodes and
uniqued nodes to reduce the overhead of cycles broken by distinct nodes.

Mehdi measured this for me; this removes most of the RAUW from the
importing step of -flto=thin, even after a WIP patch that removes
string-based DITypeRefs (introducing many more cycles to the metadata
graph).

llvm-svn: 267276
2016-04-23 04:42:39 +00:00
Tim Northover
2b39bb5c9b llvm-objdump: deal with invalid ARM encodings slightly better.
Before we printed a warning to stderr and left the actual output stream in a
mess. This tries to print a .long or .short representation of what we saw (as
if there was a data-in-code directive).

This isn't guaranteed to restore synchronization in Thumb-mode (if the invalid
instruction was supposed to be 32-bits, we may be off-by-16 for the rest of the
function). But there's no certain way to deal with that, and it's invalid code
anyway (if the data really wasn't an instruction, the user can add proper
.data_in_code directives if they care)

llvm-svn: 267250
2016-04-22 23:23:31 +00:00
Tim Northover
5cc375dd3c MachO: remove weird ARM/Thumb interface from MachOObjectFile
Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.

Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.

The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.

llvm-svn: 267249
2016-04-22 23:21:13 +00:00
Matt Arsenault
3cbb6d7d74 AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size
llvm-svn: 267244
2016-04-22 22:59:16 +00:00
NAKAMURA Takumi
2e521b8375 Fix llvm/test/CodeGen/ARM/Windows/dbzchk.ll not to check mixed output, take #2.
llvm-svn: 267242
2016-04-22 22:51:48 +00:00
David Blaikie
a53e28e290 llvm-symbolizer: Avoid infinite recursion walking dwos where the dwo contains a dwo_name attribute
The dwo_name was added to dwo files to improve diagnostics in dwp, but
it confuses tools that attempt to load any dwo named by a dwo_name, even
ones inside dwos. Avoid this by keeping track of whether a unit is
already a dwo unit, and if so, not loading further dwos.

llvm-svn: 267241
2016-04-22 22:50:56 +00:00
Matt Arsenault
b274219b5d AMDGPU: Re-visit nodes in performAndCombine
This fixes test regressions when i64 loads/stores are made promote.

llvm-svn: 267240
2016-04-22 22:48:38 +00:00
Nico Weber
6dca3ae366 Revert r267210, it makes clang assert (PR27490).
llvm-svn: 267232
2016-04-22 22:08:42 +00:00
Andrew Kaylor
653d361880 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Sriraman Tallam
bdcd2a52a2 Differential Revision: http://reviews.llvm.org/D19040
llvm-svn: 267229
2016-04-22 21:41:58 +00:00
David Blaikie
20d90ebb0d llvm-symbolizer: prefer .dwo contents over fission-gmlt-like-data when .dwo file is present
Rather than relying on the gmlt-like data emitted into the .o/executable
which only contains the simple name of any inlined functions, use the
.dwo file if present.

Test symbolication with/without a .dwo, and the old test that was
testing behavior when no gmlt-like data was present. (I haven't included
a test of non-gmlt-like data + no .dwo (that would be akin to
symbolication with no debug info) but we could add one for completeness)

The test was simplified a bit to be a little clearer (unoptimized, force
inline, using a function call as the inlined entity) and regenerated
with ToT clang. For the no-gmlt-like-data case, I modified Clang back to
its old behavior temporarily & the .dwo file is identical so it is
shared between the two executables.

llvm-svn: 267227
2016-04-22 21:32:59 +00:00
Peter Collingbourne
d04766ba20 Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223
2016-04-22 21:18:02 +00:00
Matt Arsenault
953215125b DAGCombiner: Relax alignment restriction when changing store type
If the target allows the alignment, this should be OK.

llvm-svn: 267217
2016-04-22 21:01:41 +00:00
Philip Reames
53e3092198 [unordered] sink unordered stores at end of blocks
The existing code turned out to be completely correct when auditted.  Thus, only minor code changes and adding a couple of tests.

llvm-svn: 267215
2016-04-22 20:53:32 +00:00
Sanjoy Das
cbed8d8247 Fold compares for distinct allocations
Summary:
We can fold compares to false when two distinct allocations within a
function are compared for equality.

Patch by Anna Thomas!

Reviewers: majnemer, reames, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19390

llvm-svn: 267214
2016-04-22 20:52:25 +00:00
Peter Collingbourne
df3f55c3ab CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Philip Reames
29f7cc186f [unordered] Extend load/store type canonicalization to handle unordered operations
Extend the type canonicalization logic to work for unordered atomic loads and stores.  Note that while this change itself is fairly simple and low risk, there's a reasonable chance this will expose problems in the backends by suddenly generating IR they wouldn't have seen before.  Anything of this nature will be an existing bug in the backend (you could write an atomic float load), but this will definitely change the frequency with which such cases are encountered.  If you see problems, feel free to revert this change, but please make sure you collect a test case.  

llvm-svn: 267210
2016-04-22 20:33:48 +00:00
Matt Arsenault
9b242e0d0e DAGCombiner: Relax alignment restriction when changing load type
If the target allows the alignment, this should still be OK.

llvm-svn: 267209
2016-04-22 20:21:36 +00:00
Quentin Colombet
dfaf1211a5 [AArch64] Fix optimizeCondBranch logic.
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267206
2016-04-22 20:09:58 +00:00
Justin Bogner
0bc097ad38 PM: Port SinkingPass to the new pass manager
llvm-svn: 267199
2016-04-22 19:54:10 +00:00
Jun Bum Lim
b80664f84c [DeadStoreElimination] Shorten beginning of memset overwritten by later stores
Summary: This change will shorten memset if the beginning of memset is overwritten by later stores.

Reviewers: hfinkel, eeckstein, dberlin, mcrosier

Subscribers: mgrang, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18906

llvm-svn: 267197
2016-04-22 19:51:29 +00:00
Justin Bogner
fad7547388 PM: Port DCE to the new pass manager
Also add a very basic test, since apparently there aren't any tests
for DCE whatsoever to add the new pass version to.

llvm-svn: 267196
2016-04-22 19:40:41 +00:00
Matthias Braun
97996c46e1 MachineScheduler: Limit the size of the ready list.
Avoid quadratic complexity in unusually large basic blocks by limiting
the size of the ready lists.

Differential Revision: http://reviews.llvm.org/D19349

llvm-svn: 267189
2016-04-22 19:09:17 +00:00
Quentin Colombet
e89cf71564 [AArch64] When creating MRS instruction, make sure the destination register is
declared as a definition.

This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll.

llvm-svn: 267185
2016-04-22 18:46:17 +00:00
Adam Nemet
089177d9da [LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable
In the next change, I am generalizing the function
findStringMetadataForLoop and I want to make sure I don't break this.
Looks like there was no coverage for this so far.

llvm-svn: 267182
2016-04-22 18:34:50 +00:00
Quentin Colombet
e4d08c7e23 [AArch64][AdvSIMDScalar] Update the kill flags correctly.
We used to simply set the kill flags to true when transforming a scalar
instruction to a vector one.
SrcScalar1 = copy SrcVector1
... = opScalar SrcScalar1
=>
SrcScalar1 = copy SrcVector1
... = opVector SrcVector1<kill>

This is obviously wrong. The proper update consists in:
1. Propagate the kill status from the copy to the new opVector
2. Reset the kill status on the copy, since the live-range of
   SrcVector1 got extended.

This fixes some of the machine verifier errors for AArch64 with make check.

llvm-svn: 267180
2016-04-22 18:09:14 +00:00
Saleem Abdulrasool
deec7429ad test: split test into two runs
Rather than checking both stdout and stderr simultaneously, split it into two
tests.  This apparently breaks on Windows where MSVCRT does not buffer output
correctly.  NFC.

Thanks to chapuni for bringing the issue to my attention!

llvm-svn: 267179
2016-04-22 18:06:51 +00:00
Chad Rosier
44f5b9d174 [SimplifyCFG] Add final missing implications to isImpliedTrueByMatchingCmp.
Summary: eq imply [u|s]ge and [u|s]le are true.

Remove redundant logic by implementing isImpliedFalseByMatchingCmp(Pred1, Pred2)
as isImpliedTrueByMatchingCmp(Pred1, getInversePredicate(Pred2)).

llvm-svn: 267177
2016-04-22 17:57:34 +00:00
Sanjoy Das
cc00fe4f8f Have isKnownNotFullPoison be smarter around control flow
Summary:
(... while still not using a PostDomTree)

The way we use isKnownNotFullPoison from SCEV today, the new CFG walking
logic will not trigger for any realistic cases -- it will kick in only
for situations where we could have merged the contiguous basic blocks
anyway[0], since the poison generating instruction dominates all of its
non-PHI uses (which are the only uses we consider right now).

However, having this change in place will allow a later bugfix to break
fewer llvm-lit tests.

[0]: i.e. cases where block A branches to block B and B is A's only
successor and A is B's only predecessor.

Reviewers: broune, bjarke.roune

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19212

llvm-svn: 267175
2016-04-22 17:41:06 +00:00
Krzysztof Parzyszek
78673d03bc [Hexagon] Properly close live range in HexagonBlockRanges ---add testcase
llvm-svn: 267174
2016-04-22 17:30:13 +00:00
Chad Rosier
492b4554a0 [SimplifyCFG] Add missing implications to isImpliedTrueByMatchingCmp.
Summary: [u|s]gt and [u|s]lt imply [u|s]ge and [u|s]le are true, respectively.
I've simplified the existing tests and added additional tests to cover the new
cases mentioned above.  I've also added tests for all the cases where the
first compare doesn't imply anything about the second compare.

llvm-svn: 267171
2016-04-22 17:14:12 +00:00
Chad Rosier
aa97cc7ffd [SimplifyCFG] Simplify code review by temporarily removing this test file.
A followup commit will replace these tests with simplified and more inclusive
tests.  The diff is unreadable if this were to be done in a single commit.

llvm-svn: 267170
2016-04-22 17:14:08 +00:00
Konstantin Zhuravlyov
613d1574ee [AMDGPU] Insert nop pass: take care of outstanding feedback
- Switch few loops to range-based for loops
- Fix nop insertion at the end of BB
- Fix formatting
- Check for endpgm

Differential Revision: http://reviews.llvm.org/D19380

llvm-svn: 267167
2016-04-22 17:04:51 +00:00
Zoran Jovanovic
cc65c7cf04 [mips][microMIPS] Revert commit r266861.
Commit r266861 was the reason for failing tests in LLVM test suite.

llvm-svn: 267166
2016-04-22 16:53:15 +00:00
Krzysztof Parzyszek
e1e99be481 [Hexagon] Teach mux expansion how to deal with undef predicates
llvm-svn: 267165
2016-04-22 16:47:01 +00:00
Krzysztof Parzyszek
dc41008bf4 [Hexagon] Add definitions for trap/pause instructions
Also add tests for other instructions from HexagonSystemInst.td.

llvm-svn: 267162
2016-04-22 16:25:00 +00:00
David Majnemer
0c47b59618 [EarlyCSE] Don't add the overflow flags to the hash
We take the intersection of overflow flags while CSE'ing.
This permits us to consider two instructions with different overflow
behavior to be replaceable.

llvm-svn: 267153
2016-04-22 14:12:50 +00:00
Nirav Dave
2e439346e9 Emit code16 in assembly in 16-bit mode
Summary:
When generating assembly using -m16 we must explicitly mark it as
16-bit. Emit .code16 at beginning of file. Fixes wrong results when
using -fno-integrated-as.

Reviewers: dwmw2

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19392

llvm-svn: 267152
2016-04-22 13:36:11 +00:00
Simon Dardis
c64ea03044 [mips] Fix select patterns for MIPS64
When targetting MIPS64R6 some of the patterns for select were guarded by a
broken predicate. The predicate was supposed to test if a constant value
could fit in a 16 bit zero-extended field. Instead the value was tested to
fit in a 16 bit sign-extended field. For negative constants of native word
width this resulted in wrong code generation.

Reviewers: vkalintiris, dsanders

Differential Review: http://reviews.llvm.org/D19378

llvm-svn: 267151
2016-04-22 13:19:22 +00:00
Daniel Sanders
4115c5c9c7 Revert r267049, r26706[16789], r267071 - Refactor raw pdb dumper into library
r267049 broke multiple buildbots (e.g. clang-cmake-mips, and clang-x86_64-linux-selfhost-modules) which the follow-ups have not yet resolved and this is preventing subsequent committers from being notified about additional failures on the affected buildbots.

llvm-svn: 267148
2016-04-22 12:04:42 +00:00
Nikolay Haustov
951d602962 AMDGPU/SI: Add test missed in rL266865
llvm-svn: 267144
2016-04-22 11:39:43 +00:00
Silviu Baranga
52b925e4c9 [InstCombine] Preserve fast math flags when combining PHIs
Summary:
When optimizing PHIs which have inputs floating point binary
operators, we preserve all IR flags except the fast math
flags.

This change removes the logic which tracked some of the IR flags
(no wrap, exact) and replaces it by doing an and on the IR flags of
all inputs to the PHI - which will also handle the fast math
flags.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19370

llvm-svn: 267139
2016-04-22 11:21:36 +00:00
Hrvoje Varga
0102a446f7 [mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions
Differential Revision: http://reviews.llvm.org/D19354

llvm-svn: 267137
2016-04-22 11:18:40 +00:00
Zoran Jovanovic
057e3e3388 [mips][microMIPS] Add R_MICROMIPS_PC18_S3 relocation
Differential Revision: http://reviews.llvm.org/D15026

llvm-svn: 267130
2016-04-22 10:15:12 +00:00
Daniel Sanders
ffe901fc26 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Ashutosh Nema
4becb7c51e [X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode.
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so cost
of truncate needs to be changed but looks like it got changed only in SSE2 table
Whereas this change is also applicable for SSE4.1, so the cost of truncate needs
to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE 
v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from
SSE4.1, so it will fall back to SSE2.

Reviewers: Simon Pilgrim
llvm-svn: 267123
2016-04-22 08:34:05 +00:00
Vedant Kumar
b6cc52b7d8 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Zlatko Buljan
b2404307e8 [mips][microMIPS] Implement DVP, EVP and JALRC.HB instructions
Differential Revision: http://reviews.llvm.org/D18687

llvm-svn: 267114
2016-04-22 06:44:34 +00:00
David Majnemer
0b1305ba18 [GVN] Respect fast-math-flags on fcmps
We assumed that flags were only present on binary operators.  This is
not true, they may also be present on calls and fcmps.

llvm-svn: 267113
2016-04-22 06:37:51 +00:00
David Majnemer
56b0cb8635 [EarlyCSE] Take the intersection of flags on instructions
EarlyCSE had inconsistent behavior with regards to flag'd instructions:
- In some cases, it would pessimize if the available instruction had
  different flags by not performing CSE.
- In other cases, it would miscompile if it replaced an instruction
  which had no flags with an instruction which has flags.

Fix this by being more consistent with our flag handling by utilizing
andIRFlags.

llvm-svn: 267111
2016-04-22 06:37:45 +00:00
Nicolai Haehnle
f54e57a212 AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.

Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.

This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19191

llvm-svn: 267102
2016-04-22 04:04:08 +00:00
Craig Topper
6ea4b97a42 [AVX512] Teach lowering to use vplzcntd/q to implement 128/256-bit CTTZ_ZERO_UNDEF even without VLX support. We can just extend to 512-bits and extract like we do for CTLZ.
llvm-svn: 267100
2016-04-22 03:22:38 +00:00
Gerolf Hoflehner
d63bafa58b [MachineCombiner] Support for floating-point FMA on ARM64
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267098
2016-04-22 02:15:19 +00:00
Nico Weber
84e1414d06 Try to fix UNRESOLVED: LLVM :: CodeGen/AArch64/arm64-regress-opt-cmp.s on bots.
This test used to write a .s file until r266971 fixed that.  But on most bots,
the .s file still exists.  Add an rm statement to clean up the bots.  In a few
days, this statement can go away again.

llvm-svn: 267095
2016-04-22 01:08:56 +00:00
Saleem Abdulrasool
eed028bbcd ARM: fix test for Windows division
This was meant to be part of SVN r267080.  cbz cannot use a high register, which
would be silently truncated.  This has now been fixed.

llvm-svn: 267092
2016-04-22 01:03:38 +00:00
Dan Gohman
fdbfed8f83 [WebAssembly] Limit alignment hints to natural alignment.
This follows the current binary format rules.

llvm-svn: 267082
2016-04-21 23:59:48 +00:00
Saleem Abdulrasool
4fb09b352f ARM: restrict register class for WIN__DBZCHK
WIN__DBZCHK will insert a CBZ instruction into the stream.  This instruction
reserves 3 bits for the condition register (rn).  As such, we must ensure that
we restrict the register to a low register.  Use the tGPR class instead of GPR
to ensure that this is properly constrained.  In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.

llvm-svn: 267080
2016-04-21 23:53:19 +00:00
Mike Aizatsky
27db5e9a0c [sancov] using normalized filenames for blacklist checks.
Differential Revision: http://reviews.llvm.org/D19395

llvm-svn: 267078
2016-04-21 23:38:45 +00:00
Tim Northover
95d001461b MachO: enable .data_region directives everywhere
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.

rdar://25254790

llvm-svn: 267075
2016-04-21 23:00:17 +00:00
Zachary Turner
f283a5743f Fix pdbdump-headers.test after guid format change.
llvm-svn: 267067
2016-04-21 22:13:25 +00:00
Derek Bruening
468017deae [esan] EfficiencySanitizer instrumentation pass
Summary:
Adds an instrumentation pass for the new EfficiencySanitizer ("esan")
performance tuning family of tools.  Multiple tools will be supported
within the same framework.  Preliminary support for a cache fragmentation
tool is included here.

The shared instrumentation includes:
+ Turn mem{set,cpy,move} instrinsics into library calls.
+ Slowpath instrumentation of loads and stores via callouts to
  the runtime library.
+ Fastpath instrumentation will be per-tool.
+ Which memory accesses to ignore will be per-tool.

Reviewers: eugenis, vitalybuka, aizatsky, filcab

Subscribers: filcab, vkalintiris, pcc, silvas, llvm-commits, zhaoqin, kcc

Differential Revision: http://reviews.llvm.org/D19167

llvm-svn: 267058
2016-04-21 21:30:22 +00:00
Kevin Enderby
2afb9b31e6 Fix a typo in an error message. Caught by Sean Silva!
llvm-svn: 267056
2016-04-21 21:20:40 +00:00
Sanjay Patel
081b99ba47 add tests for disguised fabs/fneg
llvm-svn: 267053
2016-04-21 21:02:25 +00:00
Sanjay Patel
67f5c38cab use FileCheck; add test for disguised fabs
llvm-svn: 267051
2016-04-21 20:58:58 +00:00
Krzysztof Parzyszek
59ba2ba8f4 [Hexagon] Properly recognize register alt names
llvm-svn: 267038
2016-04-21 19:49:53 +00:00
Sanjoy Das
d82c062070 Folding compares with unescaped allocations
Summary:
If we know that the pointer allocated within a function does not escape,
we can fold away comparisons that are done with global pointers

Patch by Anna Thomas!

Reviewers: reames, majnemer, sanjoy

Subscribers: mgrang, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D19276

llvm-svn: 267035
2016-04-21 19:26:45 +00:00
Krzysztof Parzyszek
b6d9f909f0 [Hexagon] Expand handling of the small-data/bss section
llvm-svn: 267034
2016-04-21 18:56:45 +00:00
Matt Arsenault
a95bd86b3c DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component
If the extracted bits are restricted to the upper half or lower half,
this can be truncated.

llvm-svn: 267024
2016-04-21 18:03:06 +00:00
Philip Reames
1c0459c510 [instcombine][unordered] Extend load(select) transform to handle unordered loads
llvm-svn: 267023
2016-04-21 17:59:40 +00:00
Andrew Kaylor
fd49f275f8 Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Philip Reames
7e0ae29aca [unordered] unordered loads from null are still unreachable
llvm-svn: 267019
2016-04-21 17:45:05 +00:00
Marcin Koscielnicki
452981873a [PowerPC] [SSP] Fix stack guard load for 32-bit.
r266809 incorrectly used LD to load the stack guard, it should be LWZ.

Differential Revision: http://reviews.llvm.org/D19358

llvm-svn: 267017
2016-04-21 17:36:05 +00:00
Philip Reames
e04aebaa98 [instcombine][unordered] Implement *-load forwarding for unordered atomics
This builds on 266999 which made FindAvailableValue do the right thing.  Tests included show the newly enabled transforms and those which disabled either due to conservatism or correctness requirements.

llvm-svn: 267006
2016-04-21 17:03:33 +00:00
Amjad Aboud
65ad240089 Fixed Dwarf debug info emission to skip DILexicalBlockFile entries.
Before this fix, DILexicalBlockFile entries were skipped only in some cases and were not in other cases.

Differential Revision: http://reviews.llvm.org/D18724

llvm-svn: 267004
2016-04-21 16:58:49 +00:00
Philip Reames
7e4f2fa4ac [unordered] Add tests and conservative handling in support of future changes [NFCI]
This change adds a couple of test cases to make sure FindAvailableLoadedValue does the right thing.  At the moment, the code added is dead, but separating it makes follow on changes far more obvious.

llvm-svn: 266999
2016-04-21 16:51:08 +00:00
Rafael Espindola
7e0cfdf745 Fix recursive -only-needed.
We were assuming that only linkonce_odr GVs were lazy linked.

llvm-svn: 266995
2016-04-21 14:56:33 +00:00
Zoran Jovanovic
173f170236 [mips][microMIPS] Implement ldpc instruction
Differential Revision: http://reviews.llvm.org/D15009

llvm-svn: 266990
2016-04-21 14:32:12 +00:00
Zoran Jovanovic
2f0e417cac [mips][microMIPS] Add R_MICROMIPS_PC19_S2 relocation
Differential Revision: http://reviews.llvm.org/D14915

llvm-svn: 266988
2016-04-21 14:09:35 +00:00
Zoran Jovanovic
a3f572b81a [mips][microMIPS] Add R_MICROMIPS_PC26_S1 relocation
Differential Revision: http://reviews.llvm.org/D14822

llvm-svn: 266985
2016-04-21 13:43:26 +00:00
Zlatko Buljan
f385a81a9c [mips][microMIPS] Implement TLBP, TLBR, TLBWI and TLBWR instructions
Differential Revision: http://reviews.llvm.org/D18855

llvm-svn: 266980
2016-04-21 11:32:40 +00:00
Zlatko Buljan
d599e5d9a8 [mips][microMIPS] Implement LL, SC, MOVEP, ROTR, ROTRV and SYSCALL instructions and add tests for LWM32 and SWM32
Differential Revision: http://reviews.llvm.org/D19150

llvm-svn: 266977
2016-04-21 11:01:51 +00:00
Evgeny Astigeevich
8e76562ba2 Updated a test not to produce an empty s-file.
llvm-svn: 266971
2016-04-21 09:36:49 +00:00
Evgeny Astigeevich
4405b74d21 [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr
AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code.
A compare instruction is substituted with another instruction which does not
produce the same flags as the original compare instruction.
This patch contains:
1. Fix of the bug.
2. A regression test in MIR.
3. A new test to check that SUBS is replaced by SUB.

Differential Revision: http://reviews.llvm.org/D18838

llvm-svn: 266969
2016-04-21 08:54:08 +00:00
Craig Topper
b64a162a22 [AVX512] Add CTTZ support for v8i64 and v16i32 vectors.
llvm-svn: 266968
2016-04-21 07:30:06 +00:00
Craig Topper
22948cc4a9 [X86] Fix vector-tzcnt-512 test to disable CDI while enabling BWI for one of the runs. Update check patterns accordingly.
llvm-svn: 266967
2016-04-21 07:30:03 +00:00
Craig Topper
1c2be5dcd4 Fix test command line to explicitly disable CDI instructions for one test.
llvm-svn: 266966
2016-04-21 07:29:59 +00:00
Craig Topper
dc636097f0 [AVX512] Add support for lowering CTTZ v64i8 and v32i16 with BWI instructions.
llvm-svn: 266963
2016-04-21 06:39:34 +00:00
Mehdi Amini
86c526e25e ThinLTO: Resolve linkonce_odr aliases just like functions
This help to streamline the process of handling importing since
we don't need to special case alias everywhere: just like
linkonce_odr function, make sure at least one alias is emitted
by turning it weak.

Differential Revision: http://reviews.llvm.org/D19308

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266958
2016-04-21 05:47:17 +00:00
Sanjoy Das
785700a50c [SimplifyCFG] Fold llvm.guard(false) to unreachable
Summary:
`llvm.guard(false)` always bails out of the current compilation unit, so
we can prune any control flow following it.

Reviewers: hfinkel, pcc, reames

Subscribers: majnemer, reames, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19245

llvm-svn: 266955
2016-04-21 05:09:12 +00:00
Craig Topper
7e74d3bc64 [AVX512] Add support for popcount of v8i64 and v16i32 with and without BWI instructions.
Without BWI we have to split the vectors into 256-bit vectors so we can use AVX2 pshufb and then concatenate the results.

llvm-svn: 266950
2016-04-21 03:57:24 +00:00
Mehdi Amini
857b9f0901 ThinLTO/ModuleLinker: add a flag to not always pull-in linkonce when performing importing
Summary:
The function importer already decided what symbols need to be pulled
in. Also these magically added ones will not be in the export list
for the source module, which can confuse the internalizer for
instance.

Reviewers: tejohnson, rafael

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19096

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266948
2016-04-21 01:59:39 +00:00
Duncan P. N. Exon Smith
556324f139 BitcodeWriter: Emit metadata in post-order (again)
Emit metadata nodes in post-order.  The iterative algorithm from r266709
failed to maintain this property.  After understanding my mistake, it
wasn't too hard to write a test with llvm-bcanalyzer (and I've actually
made this change once before: see r220340).

This also reverts the "noisy" testcase change from r266709.  That should
have been more of a red flag :/.

Note: The same bug crept into the ValueMapper in r265456.  I'm still
working on the fix.

llvm-svn: 266947
2016-04-21 01:55:12 +00:00
Nick Lewycky
39244705f7 Add optimization for 'icmp slt (or A, B), A' and some related idioms based on knowledge of the sign bit for A and B.
No matter what value you OR in to A, the result of (or A, B) is going to be UGE A. When A and B are positive, it's SGE too. If A is negative, OR'ing a value into it can't make it positive, but can increase its value closer to -1, therefore (or A, B) is SGE A. Working through all possible combinations produces this truth table:

```
A is
+, -, +/-
F  F   F   +    B is
T  F   ?   -
?  F   ?   +/-
```

The related optimizations are flipping the 'slt' for 'sge' which always NOTs the result (if the result is known), and swapping the LHS and RHS while swapping the comparison predicate.

There are more idioms left to implement (aren't there always!) but I've stopped here because any more would risk becoming unreasonable for reviewers.

llvm-svn: 266939
2016-04-21 00:53:14 +00:00
Vedant Kumar
d3e6694129 [test/PGOProfile] Make tests independent of the raw profile version (NFC)
Differential Revision: http://reviews.llvm.org/D19290

llvm-svn: 266928
2016-04-20 22:24:01 +00:00
Kevin Enderby
92582f2b18 Thread Expected<...> up from libObject’s getName() for symbols to allow llvm-objdump to produce a good error message.
Produce another specific error message for a malformed Mach-O file when a symbol’s
string index is past the end of the string table.  The existing test case in test/Object/macho-invalid.test
for macho-invalid-symbol-name-past-eof now reports the error with the message indicating
that a symbol at a specific index has a bad sting index and that bad string index value.
 
Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same.  There is some
code for this that could be factored into a routine but I would like to leave that for
the code owners post-commit to do as they want for handling an llvm::Error.  An
example of how this could be done is shown in the diff in
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h which had a Check() routine
already for std::error_code so I added one like it for llvm::Error .

Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values.  So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.

Note there fixes needed to lld that goes along with this that I will commit right after this.
So expect lld not to built after this commit and before the next one.

llvm-svn: 266919
2016-04-20 21:24:34 +00:00
Kostya Serebryany
d328bd016e Rename asan-check-lifetime into asan-stack-use-after-scope
Summary:
This is done for consistency with asan-use-after-return.
I see no other users than tests.

Reviewers: aizatsky, kcc

Differential Revision: http://reviews.llvm.org/D19306

llvm-svn: 266906
2016-04-20 20:02:58 +00:00
Jacques Pienaar
0cb285e7fd [lanai] Add subword scheduling itineraries.
Differentiate between word and subword memory operations as they take different
amount of cycles to complete. This just adds a basic model of the subword
latency to the scheduler.

llvm-svn: 266898
2016-04-20 18:28:55 +00:00
Krzysztof Parzyszek
c603d6f55f [Hexagon] Fix handling of lcomm directive
Patch by Colin LeMahieu.

llvm-svn: 266882
2016-04-20 15:54:13 +00:00
Teresa Johnson
ff13fea2b3 Re-enable "[gold-plugin] Disable name for values other than GlobalValue"
This restores r266871 with a fix for gold tests relying on the value
names, when using a release compiler, by adding a way to disable the
default discarding. Update affected tests to use the new mechanism so
that value names are preserved as expected, regardless of how the
compiler was built.

llvm-svn: 266881
2016-04-20 15:16:57 +00:00
Teresa Johnson
e24ec3840e [ThinLTO] Prevent importing of "llvm.used" values
Summary:
This patch prevents importing from (and therefore exporting from) any
module with a "llvm.used" local value. Local values need to be promoted
and renamed when importing, and their presense on the llvm.used variable
indicates that there are opaque uses that won't see the rename. One such
example is a use in inline assembly.

See also the discussion at:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098047.html

As part of this, move collectUsedGlobalVariables out of Transforms/Utils
and into IR/Module so that it can be used more widely. There are several
other places in LLVM that used copies of this code that can be cleaned
up as a follow on NFC patch.

Reviewers: joker.eph

Subscribers: pcc, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18986

llvm-svn: 266877
2016-04-20 14:39:45 +00:00
Krzysztof Parzyszek
17749987f0 [RDF] Consider register as live if any alias is live
This only affects the recomputation of kill flags.

llvm-svn: 266875
2016-04-20 14:33:23 +00:00
Zoran Jovanovic
69b803dc41 [mips][microMIPS] Implement BGEC, BGEUC, BLTC, BLTUC, BEQC and BNEC instructions
Differential Revision: http://reviews.llvm.org/D14206

llvm-svn: 266873
2016-04-20 14:07:46 +00:00
Teresa Johnson
9f8bde4940 Revert "[gold-plugin] Disable name for values other than GlobalValue"
This reverts commit r266871. Setting the default based on the NDEBUG
flag is causing test failures. Need to figure out whether to change this
approach or update tests.

llvm-svn: 266872
2016-04-20 13:18:47 +00:00
Teresa Johnson
485edc2952 [gold-plugin] Disable name for values other than GlobalValue
Summary:
Applies Mehdi's optimization (r263086) to disable value names other than
for GlobalValues to LTO/ThinLTO performed via the gold-plugin, in the
same manner as it is applied in libLTO.

Reviewers: rafael, joker-eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19269

llvm-svn: 266871
2016-04-20 13:01:37 +00:00
Nikolay Haustov
b49bd2b1da AMDGPU/SI: Assembler: improvements to support trap handlers.
Add ParseAMDGPURegister which can be invoked recursively for parsing lists.
Rename getRegForName to getSpecialRegForName.
Support legacy SP3 register list syntax: [s2,s3,s4,s5] or [flat_scratch_lo,flat_scratch_hi].
Add 64-bit registers TBA, TMA where missing.
Add some tests.

Differential Revision: http://reviews.llvm.org/D19163

llvm-svn: 266865
2016-04-20 09:34:48 +00:00
Asaf Badouh
9cc5def4a3 [X86] enable PIE for functions
Call locally defined function directly for PIE/fPIE

Differential Revision: http://reviews.llvm.org/D19226

llvm-svn: 266863
2016-04-20 08:32:57 +00:00
Hrvoje Varga
c7e838369e [mips][microMIPS]Implement CFC*, CTC* and LDC* instructions
Differential Revision: http://reviews.llvm.org/D18640

llvm-svn: 266861
2016-04-20 06:34:48 +00:00
Craig Topper
6c544565e4 [AVX512] Add avx512cd+vl runs to vector-tzcnt-128/256 tests to show using the vplzcntd/q instructions.
llvm-svn: 266860
2016-04-20 05:19:01 +00:00
Craig Topper
73cbc9c236 [AVX512] Update vector-tzcnt-512 test to show how bad v32i16 and v64i8 is with avx512bw enabled.
llvm-svn: 266859
2016-04-20 05:18:58 +00:00
Craig Topper
f480c8bddb [AVX512] Add popcount support for v32i16 and v64i8.
llvm-svn: 266858
2016-04-20 05:18:55 +00:00
Mehdi Amini
45e686121e ThinLTO: never promote as external weak
This linkage is *not* intended to express that a declaration refers
to a weak symbol, but that the symbol might not be present at link
time. I don't believe it was the intent.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266856
2016-04-20 04:18:11 +00:00
Mehdi Amini
8039b6b99a FunctionImport: make sure we always select the right callee in presence of alias
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266854
2016-04-20 04:17:36 +00:00
Marcin Koscielnicki
64bfaf0336 [SystemZ] Add support for llvm.thread.pointer intrinsic.
Differential Revision: http://reviews.llvm.org/D19054

llvm-svn: 266844
2016-04-20 01:03:48 +00:00
Mandeep Singh Grang
28ddad394b [LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC.
Summary: Removed unwanted --check-prefix=CHECK from numerous unit tests.

Reviewers: t.p.northover, dblaikie, uweigand, MatzeB, tstellarAMD, mcrosier

Subscribers: mcrosier, dsanders

Differential Revision: http://reviews.llvm.org/D19279

llvm-svn: 266834
2016-04-19 23:51:52 +00:00
Tim Northover
8ead88dca1 ARM: fix assertion failure on -O0 cmpxchg.
Because lowering of CMP_SWAP_64 occurs during type legalization, there can be
i64 types produced by more than just a BUILD_PAIR or similar. My initial tests
used just incoming function args.

llvm-svn: 266828
2016-04-19 22:25:02 +00:00
Nicolai Haehnle
0c7a341af5 Add IntrWrite[Arg]Mem intrinsic property
Summary:
This property is used to mark an intrinsic that only writes to memory, but
neither reads from memory nor has other side effects.

An example where this is useful is the llvm.amdgcn.buffer.store.format.*
intrinsic, which corresponds to a store instruction that goes through a special
buffer descriptor rather than through a plain pointer.

With this property, the intrinsic should still be handled as having side
effects at the LLVM IR level, but machine scheduling can make smarter
decisions.

Reviewers: tstellarAMD, arsenm, joker.eph, reames

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18291

llvm-svn: 266826
2016-04-19 21:58:33 +00:00
Nicolai Haehnle
e4cebbe0dc AMDGPU: Guard VOPC instructions against incorrect commute
Summary:
The added testcase, which triggered this, was derived from a shader-db case
via bugpoint. A separate question is why scalar branching wasn't used.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19208

llvm-svn: 266825
2016-04-19 21:58:22 +00:00
Krzysztof Parzyszek
6a86876079 [Hexagon] Fix operand swapping in HexagonPeephole
Also, disable zero- and size-extend optimizations for now.

llvm-svn: 266821
2016-04-19 21:36:24 +00:00
Marcin Koscielnicki
0919e582e6 [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic.
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer.  I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.

This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.

Differential Revision: http://reviews.llvm.org/D19098

llvm-svn: 266818
2016-04-19 20:51:05 +00:00
Krzysztof Parzyszek
5607676c28 [Hexagon] Fix printing the address operand of S2_storerinewabs
llvm-svn: 266811
2016-04-19 20:20:33 +00:00
Tim Shen
a12f27cb73 [PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
2016-04-19 20:14:52 +00:00
Tim Shen
3a75cd4bf9 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Jacques Pienaar
88ae65cfbd [lanai] Add lowering for SETCCE i32.
* Add lowering for SETCCE i32.
* Add test to check lowering of i64 compares uses SETCCE expansion (outside of EQ and NE).
* Fix select.ll test and immediate form selection for RI operations.

llvm-svn: 266802
2016-04-19 19:15:25 +00:00
Duncan P. N. Exon Smith
2288252887 IR: Enable debug info type ODR uniquing for forward decls
Add a new method, DICompositeType::buildODRType, that will create or
mutate the DICompositeType for a given ODR identifier, and use it in
LLParser and BitcodeReader instead of DICompositeType::getODRType.

The logic is as follows:

  - If there's no node, create one with the given arguments.
  - Else, if the current node is a forward declaration and the new
    arguments would create a definition, mutate the node to match the
    new arguments.
  - Else, return the old node.

This adds a missing feature supported by the current DITypeIdentifierMap
(which I'm slowly making redudant).  The only remaining difference is
that the DITypeIdentifierMap has a "the-last-one-wins" rule, whereas
DICompositeType::buildODRType has a "the-first-one-wins" rule.

For now I'm leaving behind DICompositeType::getODRType since it has
obvious, low-level semantics that are convenient for unit testing.

llvm-svn: 266786
2016-04-19 18:00:19 +00:00
Duncan P. N. Exon Smith
b39d835190 Linker: Simplify test/Linker/dicompositetype-unique.ll, NFC
Simplify the test logic a little, sharing logic between the two linking
directions by specifying -check-prefix multiple times.  Now it's more
obvious what's hte same and different between the two directions, and
there is less CHECK duplication.  This is a prep for expanding the test.

llvm-svn: 266773
2016-04-19 17:43:43 +00:00
Chad Rosier
6c28047766 [ValueTracking] Improve isImpliedCondition for conditions with matching operands.
This patch improves SimplifyCFG to catch cases like:

  if (a < b) {
    if (a > b) <- known to be false
      unreachable;
  }

Phabricator Revision: http://reviews.llvm.org/D18905

llvm-svn: 266767
2016-04-19 17:19:14 +00:00
Mehdi Amini
f24757c4b7 Fix Gold test after r266750 (ModuleLinker: Do not import linkonce/weak as "external_weak")
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266752
2016-04-19 16:21:37 +00:00
Mehdi Amini
d8a373b882 ModuleLinker: Do not import linkonce/weak as "external_weak"
Summary:
There is no reason to have a weak reference because the external
definition will be weak.

Reviewers: rafael

Subscribers: llvm-commits, tejohnson

Differential Revision: http://reviews.llvm.org/D19267

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266750
2016-04-19 16:11:05 +00:00
Duncan P. N. Exon Smith
e6893b5521 IR: getOrInsertODRUniquedType => DICompositeType::getODRType, NFC
Lift the API for debug info ODR type uniquing up a layer.  Instead of
clients managing the map directly on the LLVMContext, add a static
method to DICompositeType called getODRType and handle the map in the
background.  Also adds DICompositeType::getODRTypeIfExists, so far just
for convenience in the unit tests.

This simplifies the logic in LLParser and BitcodeReader.  Because of
argument spam there are actually a few more lines of code now; I'll see
if I come up with a reasonable way to clean that up.

llvm-svn: 266742
2016-04-19 14:55:09 +00:00
Simon Pilgrim
2006c6eadb [InstCombine][X86] Added extra tests introduced for D17490
llvm-svn: 266732
2016-04-19 12:59:52 +00:00
Simon Pilgrim
501f5ba3b6 [InstCombine][X86] Regenerate SSE combine tests as part of setup for D17490
Regenerated with utils/update_test_checks.py 

llvm-svn: 266731
2016-04-19 12:56:46 +00:00
Simon Pilgrim
e1392bc92c [X86][AVX2] Prefer VPERMQ/VPERMPD over VINSERTI128/VINSERTF128 for unary shuffles
Using VPERMQ/VPERMPD allows memory folding of the (repeated) input where VINSERTI128/VINSERTF128 can not.

Differential Revision: http://reviews.llvm.org/D19228

llvm-svn: 266728
2016-04-19 12:26:40 +00:00
Sanjoy Das
91fd65c3a6 Introduce a "patchable-function" function attribute
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.

Reviewers: joker.eph, rnk, echristo, dberris

Subscribers: joker.eph, echristo, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19046

llvm-svn: 266715
2016-04-19 05:24:47 +00:00
Duncan P. N. Exon Smith
32da2f75ed BitcodeWriter: Break recursion when enumerating Metadata, almost NFC
Use a worklist instead of recursing through MDNode operands in
ValueEnumerator.  The actual record output order has changed slightly,
but otherwise there's no functionality change.

I had to update test/Bitcode/metadata-function-blocks.ll.  I renumbered
nodes so they continue to match the implicit record ids.

llvm-svn: 266709
2016-04-19 03:46:51 +00:00
Michael Kuperstein
3e5d8ebde9 Port DemandedBits to the new pass manager.
Differential Revision: http://reviews.llvm.org/D18679

llvm-svn: 266699
2016-04-18 23:55:01 +00:00
Paul Robinson
28e9fd46e3 [DWARF] Force a linkage_name on an inlined subprogram's abstract origin.
When we suppress linkage names, for a non-inlined subprogram the name
can still be found in the object-file symbol table, because we have
the code address of the subprogram.  This is not necessarily the case
for an inlined subprogram, so we still want to emit the linkage name
in the DWARF.  Put this on the abstract-origin DIE because it's common
to all inlined instances.

Differential Revision: http://reviews.llvm.org/D18706

llvm-svn: 266692
2016-04-18 22:41:41 +00:00
Tim Northover
5f6de253c5 ARM: use a pseudo-instruction for cmpxchg at -O0.
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.

Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.

Should fix the 32-bit part of PR25526.

llvm-svn: 266679
2016-04-18 21:48:55 +00:00
Simon Pilgrim
72df26628e [X86][SSE] Test case for PR2585
llvm-svn: 266669
2016-04-18 21:07:49 +00:00
Simon Pilgrim
2ca1c20043 [X86][AVX] Added extra memory folding tests for D19228
llvm-svn: 266662
2016-04-18 19:48:16 +00:00
Chad Rosier
067146c0fb [ValueTracking] Correct lit test comments. NFC.
llvm-svn: 266657
2016-04-18 19:11:45 +00:00
Sanjoy Das
f97ce19844 [BPI] Consider deoptimize calls as "unreachable"
Summary:
Calls to @llvm.experimental.deoptimize are expected to "never execute",
so optimize them as such.

Reviewers: chandlerc

Subscribers: junbuml, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19095

llvm-svn: 266654
2016-04-18 19:01:28 +00:00
Xinliang David Li
8a449b13f6 Port InstrProfiling pass to the new pass manager
Differential Revision: http://reviews.llvm.org/D18126

llvm-svn: 266637
2016-04-18 17:47:38 +00:00
Simon Pilgrim
3b3d060fd0 [X86][AVX] Added zero+blend vs vperm2f128 optsize tests cases (PR22984)
We should be trying to use vperm2f128 instead of zero+blend (if we're only the user of zero?) when optsize is enabled.

llvm-svn: 266632
2016-04-18 17:14:04 +00:00
Konstantin Zhuravlyov
809d827c49 [AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt
Also,
- Skip pass if machine module does not have debug info
- Minor comment changes
- Added test

Differential Revision: http://reviews.llvm.org/D19079

llvm-svn: 266626
2016-04-18 16:28:23 +00:00
Simon Pilgrim
d93cf6e183 [X86][AVX] Renamed vperm2f128 test to make it quicker to review
missed one the first time round...

llvm-svn: 266623
2016-04-18 16:08:19 +00:00
Simon Pilgrim
cb2fac28c3 [X86][AVX] Renamed vperm2f128 tests to make it quicker to review
llvm-svn: 266621
2016-04-18 15:37:45 +00:00
Igor Kudrin
32058f8126 Reapply "[Coverage] Prevent detection of false instantiations in case of macro expansion."
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.

This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).

This commit reapplies r266436, which was reverted by r266458,
with the .covmapping file serialized in v1 format.

Differential Revision: http://reviews.llvm.org/D18787

llvm-svn: 266620
2016-04-18 15:36:30 +00:00
Eric Liu
4b68ab25ac Revert "Replace the use of MaxFunctionCount module flag"
This reverts commit r266477.

This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData",
while "ProfileData" depends on "Object", which depends on "BitCode", which
depends on "Analysis".

llvm-svn: 266619
2016-04-18 15:31:11 +00:00
Artem Tamazov
cc6f4e6962 [AMDGPU][llvm-mc] s_setreg* - Fix order of operands
Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first.

Differential Revision: http://reviews.llvm.org/D19161

llvm-svn: 266617
2016-04-18 14:54:26 +00:00
Daniel Sanders
e94fb01a45 [mips][ias] Prevent double-filling of delay slots by generating '.set noreorder' regions.
Summary:
When clang is given -save-temps or -via-file-asm, any inline assembly in
the source is parsed twice. Once by the compiler, and again by the
assembler. We must take care to ensure that this doesn't lead to
double-filling delay slots.

Reviewers: sdardis, vkalintiris

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19166

llvm-svn: 266608
2016-04-18 12:35:36 +00:00
Renato Golin
4628935835 [ARM] AArch32 v8 NEON is still not IEEE-754 compliant
llvm-svn: 266603
2016-04-18 12:06:47 +00:00
Strahinja Petrovic
19968b2801 [PowerPC] add comment to test
Added comment in test for soft-float operations on ppc architecture.
Test commit.

llvm-svn: 266600
2016-04-18 11:52:14 +00:00
Craig Topper
8419372e6c Declare MVT::SimpleValueType as an int8_t sized enum. This removes 400 bytes from TargetLoweringBase and probably other places.
This required changing several places to print VT enums as strings instead of raw ints since the proper method to use to print became ambiguous. This is probably an improvement anyway.

This also appears to save ~8K from an x86 self host build of llc.

llvm-svn: 266562
2016-04-17 17:37:33 +00:00
Simon Pilgrim
bb26e7370b [X86][SSE] Added 16i8 -> 8i64 sext test
Shows poor codegen for AVX2

llvm-svn: 266560
2016-04-17 15:10:42 +00:00
Craig Topper
bb00d9af19 [AVX512] ISD::MUL v2i64/v4i64 should only be legal if DQI and VLX features are enabled.
llvm-svn: 266554
2016-04-17 07:25:39 +00:00
Duncan P. N. Exon Smith
57434d797d IR: Fix type-refs in testcase from r266548
There's a hole in the verifier right now: if a module has no compile
units, it never checks that all the string-based DITypeRefs get
resolved.  As a result, this testcase didn't fail the verifier, even
there were references to `!"has-uuid"` instead of `!"uuid"` (the former
was a composite type's 'name:' field, the latter its 'identifier:'
field).

I'm currently working on removing string-based type refs entirely, and
this testcase started failing (because the upgrade script can't resolve
the type refs).  Rather than fixing the (about-to-be-removed) hole in
the verifier, I'm just going to fix the test so that my upgrade script
handles it.

llvm-svn: 266553
2016-04-17 06:42:30 +00:00
Sanjoy Das
157d1c4576 Fix a typo in rL265762
I accidentally replaced `mayBeOverridden` with `!isInterposable`.
Remove the negation and add a test case that would've caught this.

Many thanks to Håkan Hjort for spotting this!

llvm-svn: 266551
2016-04-17 04:30:43 +00:00
Duncan P. N. Exon Smith
0c4233db2f IR: Use an explicit map for debug info type uniquing
Rather than relying on the structural equivalence of DICompositeType to
merge type definitions, use an explicit map on the LLVMContext that
LLParser and BitcodeReader consult when constructing new nodes.
Each non-forward-declaration DICompositeType with a non-empty
'identifier:' field is stored/loaded from the type map, and the first
definiton will "win".

This map is opt-in: clients that expect ODR types from different modules
to be merged must call LLVMContext::ensureDITypeMap.

  - Clients that just happen to load more than one Module in the same
    LLVMContext won't magically merge types.

  - Clients (like LTO) that want to continue to merge types based on ODR
    identifiers should opt-in immediately.

I have updated LTOCodeGenerator.cpp, the two "linking" spots in
gold-plugin.cpp, and llvm-link (unless -disable-debug-info-type-map) to
set this.

With this in place, it will be straightforward to remove the DITypeRef
concept (i.e., referencing types by their 'identifier:' string rather
than pointing at them directly).

llvm-svn: 266549
2016-04-17 03:58:21 +00:00
Duncan P. N. Exon Smith
a1404a67fa IR: Use ODR to unique DICompositeType members
Merge members that are describing the same member of the same ODR type,
even if other bits differ.  If the file or line differ, we don't care;
if anything else differs, it's an ODR violation (and we still don't
really care).

For DISubprogram declarations, this looks at the LinkageName and Scope.
For DW_TAG_member instances of DIDerivedType, this looks at the Name and
Scope.  In both cases, we know that the Scope follows ODR rules if it
has a non-empty identifier.

llvm-svn: 266548
2016-04-17 02:30:20 +00:00
Duncan P. N. Exon Smith
75936fee09 Linker: Clarify test/Linker/type-unique-odr-a.ll, NFC
Split up the long RUN and clarify the CHECK lines:

  - Explicitly confirm there are no other subprograms inside of "A".

  - Remove checks for "bar" and "baz", which were just implicitly
    checking that there were no other subprograms inside of "A".

This prepares for adding a RUN line which links the two files in the
opposite direction.

llvm-svn: 266543
2016-04-17 00:26:17 +00:00
Simon Pilgrim
fc44bd8574 [X86][AVX] Add shuffle combine tests for MOVDDUP/MOVSHDUP/MOVSLDUP
128, 256 and 512 bit implementations (some not yet supported by combineX86ShuffleChain)

llvm-svn: 266535
2016-04-16 20:30:59 +00:00
Simon Pilgrim
cd3aca2121 [X86][XOP] Added VPPERM constant mask decoding and target shuffle combining support
Added additional test that peeks through bitcast to v16i8 mask

llvm-svn: 266533
2016-04-16 17:52:07 +00:00
Simon Pilgrim
8fc00dfd01 [X86][XOP] More VPPERM shuffle mask decode tests
As requested by D18441

llvm-svn: 266531
2016-04-16 16:37:21 +00:00
Mehdi Amini
d3e6360b27 ThinLTO: Make aliases explicit in the summary
To be able to work accurately on the reference graph when taking
decision about internalizing, promoting, renaming, etc. We need
to have the alias information explicit.

Differential Revision: http://reviews.llvm.org/D18836

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266517
2016-04-16 06:56:44 +00:00