Commit Graph

150934 Commits

Author SHA1 Message Date
Wei Mi
71d7c09ce8 [GVN] Recommit the patch "Add phi-translate support in scalarpre".
The recommit fixes three bugs: The first one is to use CurrentBlock instead of
PREInstr's Parent as param of performScalarPREInsertion because the Parent
of a clone instruction may be uninitialized. The second one is stop PRE when
CurrentBlock to its predecessor is a backedge and an operand of CurInst is
defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306313 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 18:16:10 +00:00
Matt Arsenault
8e828b87b2 AMDGPU: Setup SP/FP in callee function prolog/epilog
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306312 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 17:53:59 +00:00
Eric Beckmann
69e4d36881 Replace trivial use of external rc.exe by writing our own .res file.
This patch removes the dependency on the external rc.exe tool by writing
a simple .res file using our own library. In this patch I also added an
explicit definition for the .res file magic.  Furthermore, I added a
unittest for embeded manifests and fixed a bug exposed by the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306311 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 17:43:30 +00:00
Zachary Turner
7226719a52 [llvm-pdbutil] Add a mode to bytes for dumping split debug chunks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306309 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 17:22:36 +00:00
Brian Gesiak
8ae15db2a4 [opt-viewer] Python 3 support in opt-stats.py
Summary: Minor changes that allow opt-stats.py to support both Python 2 and 3.

Reviewers: anemet, davidxl

Reviewed By: anemet

Subscribers: llvm-commits, fhahn

Differential Revision: https://reviews.llvm.org/D34564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306306 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 16:51:24 +00:00
Ulrich Weigand
18f8cae766 [SystemZ] Fix missing emergency spill slot corner case
We sometimes need emergency spill slots for the register scavenger.
This may be the case when code needs to access a stack slot that
has an offset of 4096 or more relative to the stack pointer.

To make that determination, processFunctionBeforeFrameFinalized
currently simply checks the total stack frame size of the current
function.  But this is not enough, since code may need to access
stack slots in the caller's stack frame as well, in particular
incoming arguments stored on the stack.

This commit fixes the problem by taking argument slots into account.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306305 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 16:50:32 +00:00
Simon Pilgrim
575411ebf8 [X86][SSE] Add combine tests for PMULDQ/PMULUDQ
Found several missed optimizations while investigating replacing _mm_mul_epi32/_mm_mul_epu32 with generic implementations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 16:22:52 +00:00
Marina Yatsina
2db1a71f59 [inline asm] dot operator while using imm generates wrong ir + asm - llvm part
Inline asm dot operator while using imm generates wrong ir and asm

This also fixes bugzilla 32987:
https://bugs.llvm.org//show_bug.cgi?id=32987

The clang part of the review that contains the test can be found here:
https://reviews.llvm.org/D33040

commit on behald of zizhar

Differential Revision:
https://reviews.llvm.org/D33039



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306300 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 16:03:42 +00:00
Ahmed Bougacha
699f4c431d [X86][AVX-512] Don't raise inexact in ceil, floor, round, trunc.
The non-AVX-512 behavior was changed in r248266 to match N1778
(C bindings for IEEE-754 (2008)), which defined the four functions
to not raise the inexact exception ("rint" is still defined as raising
it).

Update the AVX-512 lowering of these functions to match that: it should
not be different.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306299 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 16:00:24 +00:00
Tom Stellard
8d3ca7cfeb AMDGPU/GlobalISel: Mark 32-bit G_SHL as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D34589

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306298 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 15:56:52 +00:00
Simon Pilgrim
d71e04af3a [X86] Add test case for PR15981
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306296 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 15:53:11 +00:00
Simon Pilgrim
7c16260531 [llvm-stress] Add getRandom() helper that was going to be part of D34157. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306294 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 15:41:36 +00:00
Sanjay Patel
8bfecccf46 [x86] transform vector inc/dec to use -1 constant (PR33483)
Convert vector increment or decrement to sub/add with an all-ones constant:

add X, <1, 1...> --> sub X, <-1, -1...>
sub X, <1, 1...> --> add X, <-1, -1...>

The all-ones vector constant can be materialized using a pcmpeq instruction that is 
commonly recognized as an idiom (has no register dependency), so that's better than 
loading a splat 1 constant.

AVX512 uses 'vpternlogd' for 512-bit vectors because there is apparently no better
way to produce 512 one-bits.

The general advantages of this lowering are:
1. pcmpeq has lower latency than a memop on every uarch I looked at in Agner's tables, 
   so in theory, this could be better for perf, but...

2. That seems unlikely to affect any OOO implementation, and I can't measure any real 
   perf difference from this transform on Haswell or Jaguar, but...

3. It doesn't look like it from the diffs, but this is an overall size win because we 
   eliminate 16 - 64 constant bytes in the case of a vector load. If we're broadcasting 
   a scalar load (which might itself be a bug), then we're replacing a scalar constant 
   load + broadcast with a single cheap op, so that should always be smaller/better too.

4. This makes the DAG/isel output more consistent - we use pcmpeq already for padd x, -1 
   and psub x, -1, so we should use that form for +1 too because we can. If there's some
   reason to favor a constant load on some CPU, let's make the reverse transform for all
   of these cases (either here in the DAG or in a later machine pass).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=33483

Differential Revision: https://reviews.llvm.org/D34336


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306289 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 14:19:26 +00:00
Krzysztof Parzyszek
ca59b915b5 [Hexagon] Handle cases when the aligned stack pointer is missing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306288 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 14:17:58 +00:00
Jonas Paulsson
018c368d38 [SystemZ] Add a check against zero before calling getTestUnderMaskCond()
Csmith discovered that this function can be called with a zero argument,
in which case an assert for this triggered.

This patch also adds a guard before the other call to this function since
it was missing, although the test only covers the case where it was
discovered.

Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306287 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 13:38:27 +00:00
Michael Zuckerman
b76f903b2e [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess test.
Adding base tast (to trunk) for Store strid=4 vf=32. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306286 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 13:27:32 +00:00
Simon Pilgrim
151f40dcea [llvm-stress] Remove Rand32 helper function
To try and help avoid repeats of PR32585, remove Rand32 which is only called by Rand64


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306285 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 13:17:36 +00:00
Simon Pilgrim
2e6fc83eac [llvm-stress] Ensure that the C++11 random device respects its min/max values (PR32585)
As noted on PR32585, the change in D29780/rL295325 resulted in calls to Rand32() (values 0 -> 0xFFFFFFFF) but the min()/max() operators indicated it would be (0 -> 0x7FFFF).

This patch changes the random operator to call Rand() instead which does respect the 0 -> 0x7FFFF range and asserts that the value is in range as well.

Differential Revision: https://reviews.llvm.org/D34089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306281 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 10:16:34 +00:00
Mikael Holmen
4ae836e7b4 [IfConversion] Hoist removeBranch calls out of if/else clauses [NFC]
Summary:
Also added a comment.

Pulled out of https://reviews.llvm.org/D34099.

Reviewers: iteratee

Reviewed By: iteratee

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34388

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 09:33:04 +00:00
Craig Topper
8edc5b1c77 [IR] Rename BinaryOperator::init to AssertOK and remove argument. Replace default case in switch with llvm_unreachable since all valid opcodes are covered.
This method doesn't do any initializing. It just contains asserts. So renaming to AssertOK makes it consistent with similar instructions in other Instruction classes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306277 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 07:15:59 +00:00
Serguei Katkov
330bfeddce This reverts commit r306272.
Revert "[MBP] do not rotate loop if it creates extra branch"

It breaks the sanitizer build bots. Need to fix this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306276 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 06:51:45 +00:00
Tobias Grosser
5be3ca82d9 [bugpoint] Do not initialize disassembler passes
We added the initilization of disassembler passes in r306208 with the goal to
bring bugpoint in line with 'opt'. However, 'opt' does itself not initialize
dissassembler passes. As our goal was consistency, we drop the initialization
of dissassembler passes again from bugpoint.

Thanks to Chandler for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306275 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 06:50:50 +00:00
Hiroshi Inoue
6d7e03aae9 fix trivial typo in comment, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306274 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 06:32:04 +00:00
Serguei Katkov
0eb7237e75 [MBP] do not rotate loop if it creates extra branch
This is a last fix for the corner case of PR32214. Actually this is not really corner case in general.

We should not do a loop rotation if we create an additional branch due to it.
Consider the case where we have a loop chain H, M, B, C , where
H is header with viable fallthrough from pre-header and exit from the loop
M - some middle block
B - backedge to Header but with exit from the loop also.
C - some cold block of the loop.

Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch.
Let's compute the change in number of branches:
+1 branch from pre-header to header
-1 branch from header to exit
+1 branch from header to middle block if there is such
-1 branch from cold bock to header if there is one

So if C is not a predecessor of H then we introduce extra branch.

This change actually prohibits rotation of the loop if both true
1) Best Exit has next element in chain as successor.
2) Last element in chain is not a predecessor of first element of chain.

Reviewers: iteratee, xur
Reviewed By: iteratee
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34271


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306272 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 05:27:27 +00:00
Davide Italiano
6eebd6c274 [CFL-AA] Remove unneeded function declaration. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306268 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 03:55:41 +00:00
Chandler Carruth
5a057dc8ed [InstCombine] Factor the logic for propagating !nonnull and !range
metadata out of InstCombine and into helpers.

NFC, this just exposes the logic used by InstCombine when propagating
metadata from one load instruction to another. The plan is to use this
in SROA to address PR32902.

If anyone has better ideas about how to factor this or name variables,
I'm all ears, but this seemed like a pretty good start and lets us make
progress on the PR.

This is based on a patch by Ariel Ben-Yehuda (D34285).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306267 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 03:31:31 +00:00
Matt Arsenault
92c7507eee AMDGPU: Whitespace fixes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306265 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 03:01:36 +00:00
Matt Arsenault
ec6175c524 AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling
This should not be treated as a different version of
private_segment_buffer. These are distinct things with
different uses and register classes, and requires the
function argument info to have more context about the
function's type and environment.

Also add missing test coverage for the intrinsic, and
emit an error for HSA. This also encovers that the intrinsic
is broken unless there happen to be stack objects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306264 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 03:01:31 +00:00
Sylvestre Ledru
e9d67e46c2 fix various typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306262 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 02:45:39 +00:00
Chandler Carruth
e27904f6c7 [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.
This was reverted in r306252, but I already had the bug fixed and was
just trying to form a test case.

The original commit factored the logic for forming dedicated exits
inside of LoopSimplify into a helper that could be used elsewhere and
with an approach that required fewer intermediate data structures. See
that commit for full details including the change to the statistic, etc.

The code looked fine to me and my reviewers, but in fact didn't handle
indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.

If you have code that looks *just* right, you can end up leaking these
predecessors into a subsequent rewrite, and crash deep down when trying
to update PHI nodes for predecessors that don't exist.

I've added an assert that makes the bug much more obvious, and then
changed the code to reliably clear the vector so we don't get this bug
again in some other form as the code changes.

I've also added a test case that *does* manage to catch this while also
giving some nice positive coverage in the face of indirectbr.

The real code that found this came out of what I think is CPython's
interpreter loop, but any code with really "creative" interpreter loops
mixing indirectbr and other exit paths could manage to tickle the bug.
I was hard to reduce the original test case because in addition to
having a particular pattern of IR, the whole thing depends on the order
of the predecessors which is in turn depends on use list order. The test
case added here was designed so that in multiple different predecessor
orderings it should always end up going down the same path and tripping
the same bug. I hope. At least, it tripped it for me without
manipulating the use list order which is better than anything bugpoint
could do...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306257 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 22:45:31 +00:00
Chandler Carruth
5999c342a6 [LoopSimplify] Improve a test for loop simplify minorly. NFC.
I did some basic testing while looking for a bug in my recent change to
loop simplify and even though it didn't find the bug it seems like
a useful improvement anyways.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306256 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 22:24:02 +00:00
Davide Italiano
82f1a7fc01 [MemDep] Cleanup return after else & use auto. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306255 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 22:12:59 +00:00
Anna Thomas
dcc5fa654c [LoopDeletion] NFC: Move phi node value setting into prepass
Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator,
which caused loop deletion tests to hang due to infinite loop.
Had reverted it in rL306162.

rL306157 commit message:
Currently, the implementation of delete dead loops has a special case
when the loop being deleted is never executed. This special case
(updating of exit block's incoming values for phis) can be
run as a prepass for non-executable loops before performing
the actual deletion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306254 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 21:13:58 +00:00
Daniel Jasper
6cf9acbae6 Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."
This leads to a segfault. Chandler already has a test case and should be
able to recommit with a fix soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306252 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 17:58:25 +00:00
Craig Topper
3ace6d8a95 [TableGen] Remove some copies around PatternToMatch.
Summary:
This patch does a few things that should remove some copies around PatternsToMatch. These were noticed while reviewing code for D34341.

Change constructor to take Dstregs by value and move it into the class. Change one of the callers to add std::move to the argument so that it gets moved.

Make AddPatternToMatch take PatternToMatch by rvalue reference so we can move it into the PatternsToMatch vector. I believe we should have a implicit default move constructor available on PatternToMatch. I chose rvalue reference because both callers call it with temporaries already.

Reviewers: RKSimon, aymanmus, spatel

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34411

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306251 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 17:33:49 +00:00
Craig Topper
0b2cfb74b6 [IR] Use isIntOrIntVectorTy instead of writing it out the long way. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306250 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 17:33:48 +00:00
Craig Topper
37582000a6 [IR] Move repeated asserts in FCmpInst constructor to a helper method like we do for ICmpInst and other classes. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306249 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 17:33:46 +00:00
Simon Pilgrim
61b5c7bb8b [X86][SSE] Remove unused memopfsf32_128/memopfsf64_128 scalar memops
The 'scalar' simd bitops were dropped a while ago

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 17:04:58 +00:00
Simon Pilgrim
90dbf3c865 Strip trailing whitespace. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306247 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 16:57:46 +00:00
Simon Pilgrim
d33969c6c7 [X86] Add test case for PR15705
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306246 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 16:12:45 +00:00
Sanjay Patel
1e95676e58 [InstCombine] add (sext i1 X), 1 --> zext (not X)
http://rise4fun.com/Alive/i8Q

A narrow bitwise logic op is obviously better than math for value tracking, 
and zext is better than sext. Typically, the 'not' will be folded into an 
icmp predicate.

The IR difference would even survive through codegen for x86, so we would see 
worse code:

https://godbolt.org/g/C14HMF

one_or_zero(int, int):                      # @one_or_zero(int, int)
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setle   %al
        retq

one_or_zero_alt(int, int):                  # @one_or_zero_alt(int, int)
        xorl    %ecx, %ecx
        cmpl    %esi, %edi
        setg    %cl
        movl    $1, %eax
        subl    %ecx, %eax
        retq




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306243 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 14:15:28 +00:00
Elena Demikhovsky
e8c8f15850 AVX-512: Fixed a crash during legalization of <3 x i8> type
The compiler fails with assertion during legalization of SETCC for <3 x i8> operands.
The result is extended to <4 x i8> and then truncated <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>.

Differential Revision: https://reviews.llvm.org/D34503



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 13:36:20 +00:00
Xin Tong
5b97b27fed [AST] Fix a bug in aliasesUnknownInst. Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
Summary:
Make sure we are comparing the unknown instructions in the alias set and the instruction interested in.
I believe this is clearly a bug (missed opportunity). I can also add some test cases if desired.

Reviewers: hfinkel, davide, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34597

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306241 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 12:55:11 +00:00
Igor Breger
17d822b423 [GlobalISel][X86] Support vector type G_EXTRACT selection.
Summary:
Support vector type G_EXTRACT selection. For now G_EXTRACT marked as legal for any type, so nothing to do in legalizer.
Split from https://reviews.llvm.org/D33665

Reviewers: qcolombet, t.p.northover, zvi, guyblank

Reviewed By: guyblank

Subscribers: guyblank, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306240 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 11:42:17 +00:00
Dorit Nuzman
65b3f67e1c [AVX2] [TTI CostModel] Add cost of interleaved loads/stores for AVX2
The cost of an interleaved access was only implemented for AVX512. For other
X86 targets an overly conservative Base cost was returned, resulting in
avoiding vectorization where it is actually profitable to vectorize.
This patch starts to add costs for AVX2 for most prominent cases of
interleaved accesses (stride 3,4 chars, for now).

Note1: Improvements of up to ~4x were observed in some of EEMBC's rgb
workloads; There is also a known issue of 15-30% degradations on some of these
workloads, associated with an interleaved access followed by type
promotion/widening; the resulting shuffle sequence is currently inefficient and
will be improved by a series of patches that extend the X86InterleavedAccess pass
(such as D34601 and more to follow).

Note 2: The costs in this patch do not reflect port pressure penalties which can
be very dominant in the case of interleaved accesses since most of the shuffle
operations are restricted to a single port. Further tuning, that may incorporate
these considerations, will be done on top of the upcoming improved shuffle
sequences (that is, along with the abovementioned work to extend
X86InterleavedAccess pass).


Differential Revision: https://reviews.llvm.org/D34023



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306238 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 08:26:25 +00:00
Ed Schouten
fc7d8c45e2 Add support for Ananas platform
Ananas is a home-brew operating system, mainly for amd64 machines. After
using GCC for quite some time, it has switched to clang and never looked
back - yet, having to manually patch things is annoying, so it'd be much
nicer if this was in the official tree.

More information:

https://github.com/zhmu/ananas/
https://rink.nu/projects/ananas.html

Submitted by:	Rink Springer
Differential Revision:	https://reviews.llvm.org/D32937


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306237 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 08:19:37 +00:00
Craig Topper
750feae3fa [PatternMatch] Just check if value is a Constant before calling isAllOnesValue for not_match. We don't really need to check for a specific subclass of Constant. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306236 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 06:56:34 +00:00
Zachary Turner
e7f7e6d72a [pdb] Fix reading of llvm-generated PDBs by cvdump.
If you dump a pdb to yaml, and then round-trip it back to a pdb,
and run cvdump -l <file> on the new pdb, cvdump will generate
output such as this.

*** LINES

** Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj"

Error: Line number corrupted: invalid file id 0
  <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3

        5 00000010      6 00000013      7 00000018

Note the error message about the corrupted line number.

It turns out that the problem is that cvdump cannot find the
/names stream (e.g. the global string table), and the reason it
can't find the /names stream is because it doesn't understand
the NameMap that we serialize which tells pdb consumers which
stream has the string table.

Some experimentation shows that if we add items to the hash
table in a specific order before serializing it, cvdump can read
it. This suggests that either we're using the wrong hash function,
or we're serializing something incorrectly, but it will take some
deeper investigation to figure out how / why.  For now, this at
least allows cvdump to read our line information (and incidentally,
produces an identical byte sequence to what Microsoft tools
produce when writing the named stream map).

Differential Revision: https://reviews.llvm.org/D34491

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306233 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 03:51:42 +00:00
Xinliang David Li
0a14fbb39c [PGO] Implementate profile counter regiser promotion
Differential Revision: http://reviews.llvm.org/D34085


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306231 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 00:26:43 +00:00
Zachary Turner
f33ec6fb18 [Support] Don't use std::iterator, it's deprecated in C++17.
In converting this over to iterator_facade_base, some member
operators and methods are no longer needed since iterator_facade
implements them in the base class using CRTP.

Differential Revision: https://reviews.llvm.org/D34223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306230 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-25 00:00:08 +00:00