Commit Graph

106362 Commits

Author SHA1 Message Date
Hans Wennborg
36d48161a2 Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."
This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376
2017-09-15 18:40:26 +00:00
Eric Beckmann
5470a5de61 Fix Bug 30978 by emitting cv file checksums.
Summary:
The checksums had already been placed in the IR, this patch allows
MCCodeView to actually write it out to an MCStreamer.

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37157

llvm-svn: 313374
2017-09-15 18:20:28 +00:00
Craig Topper
877a55b379 [X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with a insertf128
The early out for AVX2 in lowerV2X128VectorShuffle is positioned in a weird spot below some shuffle mask equivalency checks.

But I think we want to allow VPERMQ for any unary shuffle.

Differential Revision: https://reviews.llvm.org/D37893

llvm-svn: 313373
2017-09-15 18:11:13 +00:00
Adrian Prantl
72a8ab736e llvm-dwarfdump: Factor out the printing of the section header (NFC)
llvm-svn: 313370
2017-09-15 17:39:50 +00:00
Craig Topper
a95407b7df [X86] Use SDNode::ops() instead of makeArrayRef and op_begin(). NFCI
llvm-svn: 313367
2017-09-15 17:09:05 +00:00
Craig Topper
82b1d47df9 [X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors
When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split.

This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues.

Fixes PR34605.

Differential Revision: https://reviews.llvm.org/D37858

llvm-svn: 313366
2017-09-15 17:09:03 +00:00
Craig Topper
ddf0ac4f90 [X86] Add isel pattern infrastructure to begin recognizing when we're inserting 0s into the upper portions of a vector register and the producing instruction as already produced the zeros.
Currently if we're inserting 0s into the upper elements of a vector register we insert an explicit move of the smaller register to implicitly zero the upper bits. But if we can prove that they are already zero we can skip that. This is based on a similar idea of what we do to avoid emitting explicit zero extends for GR32->GR64.

Unfortunately, this is harder for vector registers because there are several opcodes that don't have VEX equivalent instructions, but can write to XMM registers. Among these are SHA instructions and a MMX->XMM move. Bitcasts can also get in the way.

So for now I'm starting with explicitly allowing only VPMADDWD because we emit zeros in combineLoopMAddPattern. So that is placing extra instruction into the reduction loop.

I'd like to allow PSADBW as well after D37453, but that's currently blocked by a bitcast. We either need to peek through bitcasts or canonicalize insert_subvectors with zeros to remove bitcasts on the value being inserted.

Longer term we should probably have a cleanup pass that removes superfluous zeroing moves even when the producer is in another basic block which is something these isel tricks can't do. See PR32544.

Differential Revision: https://reviews.llvm.org/D37653

llvm-svn: 313365
2017-09-15 17:09:00 +00:00
Anna Thomas
9a38f41e5e [RuntimeUnroll] Add heuristic for unrolling multi-exit loop
Add a profitability heuristic to enable runtime unrolling of multi-exit
loop: There can be atmost two unique exit blocks for the loop and the
second exit block should be a deoptimizing block. Also, there can be one
other exiting block other than the latch exiting block. The reason for
the latter is so that we limit the number of branches in the unrolled
code to being at most the unroll factor.  Deoptimizing blocks are rarely
taken so these additional number of branches created due to the
unrolling are predictable, since one of their target is the deopt block.

Reviewers: apilipenko, reames, evstupac, mkuper

Subscribers: llvm-commits

Reviewed by: reames

Differential Revision: https://reviews.llvm.org/D35380

llvm-svn: 313363
2017-09-15 15:56:05 +00:00
Krzysztof Parzyszek
325bb38667 [Hexagon] Switch to parameterized register classes for HVX
This removes the duplicate HVX instruction set for the 128-byte mode.
Single instruction set now works for both modes (64- and 128-byte).

llvm-svn: 313362
2017-09-15 15:46:05 +00:00
Anna Thomas
b78a541c07 [RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup
During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi values from both original and remainder
loop.
In this process, we lookup the VMap for the mapped incoming phi values,
but did not update the VMap if a default entry was generated in the VMap
during the lookup. This default value is generated when constants or
values outside the current loop are looked up.
This patch fixes the assertion failure when null entries are present in
the VMap because of this lookup. Added a testcase that showcases the
problem.

llvm-svn: 313358
2017-09-15 13:29:33 +00:00
Ilya Biryukov
51785ee0c4 Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
This reverts commit r313348.

Reason: it caused buildbot failures.
llvm-svn: 313352
2017-09-15 10:15:00 +00:00
Sjoerd Meijer
386ba01b9c [AArch64] allow v8f16 types when FullFP16 is supported
This adds support for allowing v8f16 vector types, thus avoiding conversions
from/to single precision for these types. This is a follow up patch of
commits r311154 and r312104, which added support for scalars and v4f16
types, respectively.

Differential Revision: https://reviews.llvm.org/D37802

llvm-svn: 313351
2017-09-15 09:24:48 +00:00
Jonas Paulsson
cec44e7eee Recommit "[RegAlloc] Make sure live-ranges reflect the state of the IR when
removing them"

This was temporarily reverted, but now that the fix has been commited (r313197)
it should be put back in place.

https://bugs.llvm.org/show_bug.cgi?id=34502

This reverts commit 9ef93d9dc4c51568e858cf8203cd2c5ce8dca796.

llvm-svn: 313349
2017-09-15 07:47:38 +00:00
Dinar Temirbulatov
abd1b29772 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 313348
2017-09-15 06:56:39 +00:00
Lang Hames
29ed838bb2 [ORC] Fix a typo.
llvm-svn: 313346
2017-09-15 06:50:19 +00:00
Jatin Bhateja
856f7f79e2 [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343
2017-09-15 05:29:51 +00:00
Dinar Temirbulatov
f74bdecbc6 [SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI.
llvm-svn: 313341
2017-09-15 04:31:54 +00:00
Martin Pelikan
2f3bd56a98 [XRay] fix and clarify comments in the log file decoder
Summary:
For readers unfamiliar with the XRay code base, reference the compiler-rt
implementation even though we're not allowed to share any code and explain
our little-endian views more clearly.

For code clarity either get rid of obvious comments or explain their
intentions, fix typos, correct coding style according to LLVM's standards
and manually CSE long expressions to point out it is the same expression.

Reviewers: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34339

llvm-svn: 313340
2017-09-15 04:22:16 +00:00
Reid Kleckner
4ae1df5b81 [codeview] Use a type index of zero for static method "this" types
Otherwise VS won't show anything in the autos or watch window of static
methods.

llvm-svn: 313329
2017-09-15 00:59:07 +00:00
Alina Sbirlea
bf65988660 Refactor collectChildrenInLoop to LoopUtils [NFC]
Summary: Move to LoopUtils method that collects all children of a node inside a loop.

Reviewers: majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37870

llvm-svn: 313322
2017-09-15 00:04:16 +00:00
Sam Clegg
98d7dfcbb9 [WebAssembly] Use a separate wasm data segment for each global symbol
This is stepping stone towards honoring -fdata-sections
and letting the assembler decide how many wasm data
segments to create.

Differential Revision: https://reviews.llvm.org/D37834

llvm-svn: 313313
2017-09-14 23:07:53 +00:00
Eric Beckmann
c0f1f5876a Fix bug 34608 by moving private header out of public header.
WindowsManifestMerger.h should not include llvm/Config/config.h, since it is private.  The include has been moved to the source instead.

Summary:
The checksums had already been placed in the IR, this patch allows
MCCodeView to actually write it out to an MCStreamer.

Move private config.h header dependency out of public header file.

Addresses Bug 34608

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37863

llvm-svn: 313312
2017-09-14 23:01:13 +00:00
Craig Topper
3bcc062cdc [X86] Remove an unnecessary SmallVector from LowerBUILD_VECTOR.
I think this may have existed to convert from SDUse to SDValue, but it doesn't look like its needed now.

llvm-svn: 313311
2017-09-14 22:47:59 +00:00
Jan Sjodin
43e732c065 Fix warnings in r313297.
llvm-svn: 313302
2017-09-14 21:49:52 +00:00
Matt Arsenault
a8560d8853 AMDGPU: Fix violating constant bus restriction
You can't use madmk/madmk if it already uses an SGPR input.

llvm-svn: 313298
2017-09-14 20:54:29 +00:00
Jan Sjodin
242b2dcd0b Add AddresSpace to PseudoSourceValue.
Differential Revision: https://reviews.llvm.org/D35089

llvm-svn: 313297
2017-09-14 20:53:51 +00:00
Krzysztof Parzyszek
97d598594a Subtarget support for parameterized register class information
Implement "checkFeatures" and emitting HW mode check code.

Differential Revision: https://reviews.llvm.org/D31959

llvm-svn: 313295
2017-09-14 20:44:20 +00:00
Benjamin Kramer
b6a866ac7b Remove usages of deprecated std::unary_function and std::binary_function.
These are removed in C++17. We still have some users of
unary_function::argument_type, so just spell that typedef out. No
functionality change intended.

Note that many of the argument types are actually wrong :)

llvm-svn: 313287
2017-09-14 18:33:25 +00:00
Matt Arsenault
f18ea9e4aa AMDGPU: Fix assert on alloca of array of struct
llvm-svn: 313282
2017-09-14 18:02:29 +00:00
Jonas Devlieghere
df36b9b028 [test] Fix TestDWARFDieRangeInfoIntersects
Fixes heap buffer overflow triggered in DWARF verifier, detected by ASAN.

llvm-svn: 313280
2017-09-14 17:46:23 +00:00
Matt Arsenault
0a3745b2a6 AMDGPU: Stop modifying SP in call sequences
Because the stack growth direction and addressing is done
in the same direction, modifying SP at the beginning of the
call sequence was incorrect. If we had a stack passed argument,
we would end up skipping that number of bytes before pushing
arguments, leaving unused/inconsistent space.

The callee creates fixed stack objects in its frame, so
the space necessary for these is already logically allocated
in the callee, so we just let the callee increment SP if
it really requires it.

llvm-svn: 313279
2017-09-14 17:37:40 +00:00
Dehao Chen
c6dcd59361 Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: vitalybuka, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313277
2017-09-14 17:29:56 +00:00
Simon Dardis
c8036cbb4c [mips] Implement the 'dext' aliases and it's disassembly alias.
The other members of the dext family of instructions (dextm, dextu) are
traditionally handled by the assembler selecting the right variant of
'dext' depending on the values of the position and size operands.

When these instructions are disassembled, rather than reporting the
actual instruction, an equivalent aliased form of 'dext' is generated
and is reported. This is to mimic the behaviour of binutils.

Reviewers: slthakur, nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D34887

llvm-svn: 313276
2017-09-14 17:27:53 +00:00
Matt Arsenault
332360a091 AMDGPU: Make frame register caller preserved
Using SplitCSR for the frame register was very broken. Often
the copies in the prolog and epilog were optimized out, in addition
to them being inserted after the true prolog where the FP
was clobbered.

I have a hacky solution which works that continues to use
split CSR, but for now this is simpler and will get to working
programs.

llvm-svn: 313274
2017-09-14 17:14:57 +00:00
Adrian Prantl
180de22eaf llvm-dwarfdump: support dumping static archives.
llvm-svn: 313272
2017-09-14 17:01:53 +00:00
Krzysztof Parzyszek
f817a9b5c3 TableGen support for parameterized register class information
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.

This affects the way that types and type sets are printed, and the
tests relying on that have been updated.

There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)

For more information, please refer to the review page.

Differential Revision: https://reviews.llvm.org/D31951

llvm-svn: 313271
2017-09-14 16:56:21 +00:00
Krzysztof Parzyszek
0a6ba29aad [IfConversion] More simple, correct dead/kill liveness handling
Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D37611

llvm-svn: 313268
2017-09-14 15:53:11 +00:00
Simon Dardis
279ec81c7c [mips] Implement the 'dins' aliases.
Traditionally GAS has provided automatic selection between dins, dinsm and
dinsu. Binutils also disassembles all instructions in that family as 'dins'
rather than the actual instruction.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D34877

llvm-svn: 313267
2017-09-14 15:17:50 +00:00
Sanjay Patel
3e2b1fd515 [InstSimplify] fold sdiv/srem based on compare of dividend and divisor
This should bring signed div/rem analysis up to the same level as unsigned. 
We use icmp simplification to determine when the divisor is known greater than the dividend.

Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits.
There are extra tests for the signed-min-value special cases.

Alive proofs:
http://rise4fun.com/Alive/WI5

Differential Revision: https://reviews.llvm.org/D37713

llvm-svn: 313264
2017-09-14 14:59:07 +00:00
Aleksandar Beserminji
22c7d9115b Test commit.
llvm-svn: 313262
2017-09-14 14:34:04 +00:00
Sanjay Patel
cb5ae7de4b [InstSimplify] clean up div/rem handling; NFCI
The idea to make an 'isDivZero' helper was suggested for the signed case in D37713:
https://reviews.llvm.org/D37713

This clean-up makes it clear that D37713 is just filling the gap for signed div/rem,
removes unnecessary code, and allows us to remove a bit of duplicated code from the
planned improvement in D37713.

llvm-svn: 313261
2017-09-14 14:09:11 +00:00
Krzysztof Parzyszek
805df76dbe [Hexagon] Make getMemAccessSize return size in bytes
It used to return the actual field value from the instruction descriptor.
There is no reason for that, that value is not interesting in any way and
the specifics of its encoding in the descriptor should not be exposed.

llvm-svn: 313257
2017-09-14 12:06:40 +00:00
Ayman Musa
7c7c79c212 [X86] When applying the shuffle-to-zero-extend transformation on floating point, bitcast to integer first.
Fix issue described in PR34577.

Differential Revision: https://reviews.llvm.org/D37803

llvm-svn: 313256
2017-09-14 12:06:38 +00:00
Jonas Devlieghere
0a12d5c67d [dwarfdump] Add DWARF verifiers for address ranges
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:

 - Verify that all address ranges in a DIE are valid.
 - Verify that no ranges within the DIE overlap.
 - Verify that no ranges overlap with the ranges of a sibling.
 - Verify that children are completely contained in its (direct)
   parent's address range. (unless both are subprograms)

Differential revision: https://reviews.llvm.org/D37696

llvm-svn: 313255
2017-09-14 11:33:42 +00:00
Simon Dardis
c2db7efb6c [mips] Pick the right variant of DINS upfront and enable target instruction verification
This patch complements D16810 "[mips] Make isel select the correct DEXT variant
up front.". Now ISel picks the right variant of DINS, so now there is no need
to replace DINS with the appropriate variant during
MipsMCCodeEmitter::encodeInstruction().

This patch also enables target specific instruction verification for ins, dins,
dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that
are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these
constraints are not checked during instruction selection. Adding machine
verification should catch outstanding cases.

Finally, correct a bug that instruction verification uncovered, where the
position operand of a DINSU generated during lowering was being silently
and accidently corrected to the correct value.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D34809

llvm-svn: 313254
2017-09-14 10:58:00 +00:00
Jonas Devlieghere
e1e2b6ad2d Revert "[dwarfdump] Add DWARF verifiers for address ranges"
This reverts commit r313250.

llvm-svn: 313253
2017-09-14 10:49:15 +00:00
Simon Pilgrim
a9a617e651 [DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.

Original patch by @tstellar

Differential Revision: https://reviews.llvm.org/D19325

llvm-svn: 313251
2017-09-14 10:38:30 +00:00
Jonas Devlieghere
8bc89997a1 [dwarfdump] Add DWARF verifiers for address ranges
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:

 - Verify that all address ranges in a DIE are valid.
 - Verify that no ranges within the DIE overlap.
 - Verify that no ranges overlap with the ranges of a sibling.
 - Verify that children are completely contained in its (direct)
   parent's address range. (unless both are subprograms)

Differential revision: https://reviews.llvm.org/D37696

llvm-svn: 313250
2017-09-14 10:38:18 +00:00
Simon Pilgrim
efa82226b8 [SelectionDAG] ComputeNumSignBits - cleanup ROTL/ROTR wrapping to match DAGCombine etc.
Use RotAmt.urem(VTBits) instead of AND(RotAmt, VTBits - 1)

TBH I don't expect non-power-of-2 types to be created, but it makes the logic clearer and matches what we do in other rotation combines.

llvm-svn: 313245
2017-09-14 10:28:01 +00:00
Chandler Carruth
9586106cf5 [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

llvm-svn: 313242
2017-09-14 08:33:57 +00:00