144273 Commits

Author SHA1 Message Date
Nicolai Haehnle
ab43652716 [DAGCombine] require UnsafeFPMath for re-association of addition
Summary:
The affected transforms all implicitly use associativity of addition,
for which we usually require unsafe math to be enabled.
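
A minimal illustration of why associativity matters here, as a Python sketch rather than anything taken from the patch: IEEE-754 double addition is not associative, so regrouping the additions can change the result. The values below are arbitrary and chosen only to make the rounding visible.

```python
# Floating-point addition is not associative: regrouping changes the result.
# The operands are illustrative values, not taken from the commit.
a, b, c = 1e20, -1e20, 1.0

left = (a + b) + c   # a and b cancel exactly first, then c is added
right = a + (b + c)  # c is far below the rounding step of b, so it is lost

print(left, right)
```

A transform that silently rewrites one grouping into the other is therefore only valid when the user has opted in to unsafe floating-point math.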

The "Aggressive" flag is only meant to convey information about the
performance of the fused ops relative to a fmul+fadd sequence.

Fixes Bug 31626.

Reviewers: spatel, hfinkel, mehdi_amini, arsenm, tstellarAMD

Subscribers: jholewinski, nemanjai, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D28675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293635 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 14:35:37 +00:00
Sam Parker
d9605fec4b [ARM] Avoid using ARM instructions in Thumb mode
The Requires class overrides the target requirements of an instruction,
rather than adding to them, so all ARM instructions need to include the
IsARM predicate when they have overwritten requirements.

This caused the swp and swpb instructions to be allowed in thumb mode
assembly, and the ARM encoding of CDP to be selected in codegen (which
is different for conditional instructions).

Differential Revision: https://reviews.llvm.org/D29283



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293634 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 14:35:01 +00:00
Benjamin Kramer
3bfb126ba5 [X86] Silence unused variable warning in Release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293631 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 14:13:53 +00:00
Silviu Baranga
150da15398 [InstCombine] Make sure that LHS and RHS have the same type in
transformToIndexedCompare

If they don't have the same type, the size of the constant
index would need to be adjusted (and this wouldn't be always
possible).

Alternatively we could try the analysis with the initial
RHS value, which would guarantee that the two sides have
the same type. However it is unlikely that in practice this
would pass our transformation requirements.

Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293629 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 14:04:15 +00:00
Simon Pilgrim
438897d8de [X86][SSE] Detect unary PBLEND shuffles.
These can appear during shuffle combining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293628 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 13:58:01 +00:00
Simon Pilgrim
e8b7298325 [X86][SSE] Add support for combining PINSRW into a target shuffle.
Also add the ability to recognise PINSR(Vex, 0, Idx).

Target shuffle combines won't replace multiple insertions with a bit mask until a depth of 3 or more, so we avoid codesize bloat.

The unnecessary vpblendw in clearupper8xi16a will be fixed in an upcoming patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293627 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 13:51:10 +00:00
Nemanja Ivanovic
9ca0f8737c [PowerPC][Altivec] Add vmr extended mnemonic
Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in
the PPC back end.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29133


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293626 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 13:43:11 +00:00
Florian Hahn
2938de42db [LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC)
Summary:
rL293124 added the necessary infrastructure to properly add the cloned
top level loop to LoopInfo, which means we do not have to do it manually
in CloneLoopBlocks.

@mkuper sorry for not pointing this out during my review of D29156, I just
realized that today.


Reviewers: mzolotukhin, chandlerc, mkuper

Reviewed By: mkuper

Subscribers: llvm-commits, mkuper

Differential Revision: https://reviews.llvm.org/D29173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293615 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 11:13:44 +00:00
Simon Dardis
0f9a41d64a [mips] Addition of the immediate cases for the instructions [d]div, [d]divu
Related to http://reviews.llvm.org/D15772

Depends on http://reviews.llvm.org/D16888

Adds support for immediate operand for [D]DIV[U] instructions.

Patch By: Srdjan Obucina

Reviewers: zoran.jovanovic, vkalintiris, dsanders, obucina

Differential Revision: https://reviews.llvm.org/D16889



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293614 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 10:49:24 +00:00
Craig Topper
5f528dcc0a [AVX-512] Don't bother looking into the AVX512DQ execution domain fixing tables if AVX512DQ isn't supported since we can't do any conversion anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293608 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:55 +00:00
Craig Topper
25b197c99d [X86] Add AVX and SSE2 version of MOVSDmr to execution domain fixing table. AVX-512 already did this for the EVEX version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293607 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:53 +00:00
Craig Topper
bde6e7c9de [AVX-512] Fix copy and paste bug in execution domain fixing tables so that we can convert 256-bit movnt instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:50 +00:00
Justin Lebar
6731e5394b [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.
Summary:

This lets us lower to sqrt.approx and rsqrt.approx under more
circumstances.

* Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32,
  when fast-math is enabled.  Previously, we only would emit it for
  calls to @llvm.nvvm.sqrt.f.  (With this patch we no longer emit
  sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on instcombine to
  simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.)

* Now we emit the ftz version of rsqrt.approx when ftz is enabled.
  Previously, we only emitted rsqrt.approx when ftz was disabled.

Reviewers: hfinkel

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: https://reviews.llvm.org/D28508

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293605 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 05:58:22 +00:00
Craig Topper
374362d920 [X86] Update the broadcast fallback patterns to use shuffle instructions from the appropriate execution domain.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293603 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 05:18:29 +00:00
Craig Topper
04abfe94cf [X86] Add test cases for AVX1 broadcast fallback patterns when load can't be folded.
Also add test cases that do an insertelement to all elements for the 8 element vector tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293602 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 05:18:27 +00:00
Craig Topper
dc6cd899fc [AVX-512] Fix the ExeDomain for VMOVDDUP, VMOVSLDUP, and VMOVSHDUP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293601 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 05:18:24 +00:00
Matt Arsenault
3b595d2304 AMDGPU: Generalize matching of v_med3_f32
I think this is safe as long as all inputs are known to never
be NaNs.

Also add an intrinsic for fmed3 to be able to handle all safe
math cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293598 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 03:07:46 +00:00
Matt Arsenault
6a569f5700 InferAddressSpaces: Rename constant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293594 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:17:41 +00:00
Matt Arsenault
1394a16b28 InferAddressSpaces: Handle icmp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293593 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:17:32 +00:00
Craig Topper
3a54f7ad1b [X86] Remove patterns for X86VPermilpi with integer types. I don't think we've formed these since the shuffle lowering rewrite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293592 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:09:53 +00:00
Craig Topper
98925ead78 [X86] Remove duplicate patterns for X86VPermilpv that already exist in the instructions themselves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:09:51 +00:00
Craig Topper
afa112a7fa [X86] Remove patterns for selecting PSHUFD with FP types. We don't seem to do this anymore and the AVX case definitely should be using VPERMILPS anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293590 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:09:49 +00:00
Craig Topper
e45c0bf83f [X86] Remove 'else' after 'return'. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293589 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:09:46 +00:00
Craig Topper
1c8955ec1e [X86] Use integer broadcast instructions for integer broadcast patterns.
I'm not sure why we were using an FP instruction before and had to have a comment calling attention to it, but not justifying it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293588 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 02:09:43 +00:00
Matt Arsenault
35b092a75f InferAddressSpaces: Support memory intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293587 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 01:56:57 +00:00
Matt Arsenault
6be67912a8 InferAddressSpaces: Support atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293584 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 01:40:38 +00:00
Matt Arsenault
de6cb7e695 InferAddressSpaces: Don't replace volatile users
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293582 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 01:30:16 +00:00
Matt Arsenault
264e91f294 AMDGPU: Implement hook for InferAddressSpaces
For now just port some of the existing NVPTX tests
and from an old HSAIL optimization pass which
approximately did the same thing.

Don't enable the pass yet until more testing is done.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293580 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 01:20:54 +00:00
Matt Arsenault
9be098398c NVPTX: Move InferAddressSpaces to generic code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293579 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 01:10:58 +00:00
Eugene Zelenko
bfea59083d [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293578 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 00:56:17 +00:00
Saleem Abdulrasool
91f734b9ab TableGen: use fully qualified name for StringLiteral
Use the qualified name for StringLiteral (llvm::StringLiteral) when
generating the sources.  This is needed as the generated files may be
used out-of-tree (e.g. swift) where you may not have a
`using namespace llvm;` resulting in an undefined lookup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 00:45:01 +00:00
Eli Friedman
51c6922329 [SCEV] Simplify/generalize howFarToZero solving.
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.

The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.
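
As a rough illustration of the kind of equation being solved, here is a Python sketch of solving A*X = B modulo 2^W. It is not LLVM's code: the function name and signature are hypothetical, and it works on plain integers rather than SCEV expressions. The idea is to factor the power of two out of A, check that B is divisible by it, and multiply by the modular inverse of the odd part. It needs Python 3.8 or later for `pow` with a negative exponent.

```python
def solve_lin_equation_mod_pow2(a, b, width):
    """Return the smallest non-negative x with a*x == b (mod 2**width), or None.

    Hypothetical helper for illustration only.
    """
    modulus = 1 << width
    a %= modulus
    b %= modulus
    if a == 0:
        # 0*x == b only has a solution when b is 0 as well.
        return 0 if b == 0 else None
    # Number of trailing zero bits of a, i.e. a == 2**shift * odd.
    shift = (a & -a).bit_length() - 1
    if b % (1 << shift) != 0:
        return None  # b is not divisible by the power of two in a
    # Divide the whole equation by 2**shift; the odd part is now invertible.
    reduced_mod = 1 << (width - shift)
    inverse = pow(a >> shift, -1, reduced_mod)
    return ((b >> shift) * inverse) % reduced_mod


print(solve_lin_equation_mod_pow2(6, 4, 8))
print(solve_lin_equation_mod_pow2(4, 2, 8))
```

The second call has no solution because 4 contributes two factors of two while 2 only has one.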

Differential Revision: https://reviews.llvm.org/D28884



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 00:42:42 +00:00
Reid Kleckner
fa35d5c251 Remove LLVM_CONFIG from config headers
It appears to be dead, and it needlessly caused me to rebuild all of
LLVM when I changed CMAKE_INSTALL_PREFIX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293574 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 00:34:23 +00:00
Vedant Kumar
af0fd6790d Fix llvm-readobj build error after r293569
Clang complains about an ambiguous call to printNumber() because it
can't work out what size_t should convert to. I picked uint64_t.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293573 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:58:51 +00:00
Keno Fischer
2a3b42cf37 [ExecutionDepsFix] Improve clearance calculation for loops
Summary:
In revision rL278321, ExecutionDepsFix learned how to pick a better
register for undef register reads, e.g. for instructions such as
`vcvtsi2sdq`. While this revision improved performance on a good number
of our benchmarks, it unfortunately also caused significant regressions
(up to 3x) on others. This regression turned out to be caused by loops
such as:

PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT
      ^                                  |
      +----------------------------------+

In the previous version of the clearance calculation, we would visit
the blocks in order, remembering for each whether there were any
incoming backedges from blocks that we hadn't processed yet and if
so queuing up the block to be re-processed. However, for loop structures
such as the above, this is clearly insufficient, since the block B
does not have any unknown backedges, so we do not see the false
dependency from the previous iteration's Def of xmm registers in B.

To fix this, we need to consider all blocks that are part of the loop
and reprocess them once the correct clearance values are known. As
an optimization, we also want to avoid reprocessing any later blocks
that are not part of the loop.

In summary, the iteration order is as follows:
Before: PH A B C D A'
Corrected (Naive): PH A B C D A' B' C' D'
Corrected (w/ optimization): PH A B C A' B' C' D

To facilitate this optimization we introduce two new counters for each
basic block. The first counts how many of its predecessors have
completed primary processing. The second counts how many of its
predecessors have completed all processing (we will call such a block
*done*). Now, the criterion to reprocess a block is as follows:
    - All Predecessors have completed primary processing
    - For x the number of predecessors that have completed primary
      processing *at the time of primary processing of this block*,
      the number of predecessors that are done has reached x.

The intuition behind this criterion is as follows:
We need to perform primary processing on all predecessors in order to
find out any direct defs in those predecessors. When predecessors are
done, we also know that we have information about indirect defs (e.g.
defs in block B that were inherited through B->C->A->B). We cannot,
however, wait for all predecessors to be done, since that would
cause cyclic dependencies. Still, it is guaranteed that all those
predecessors that are prior to us in reverse postorder will be done
before us. Since we iterate over the basic blocks in reverse postorder,
the number x above is precisely the count of the number of predecessors
prior to us in reverse postorder.
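
The two-counter scheme described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the pass itself: the function and variable names are hypothetical, blocks are plain strings, "processing" a block just records its name, and the EXIT block of the example is left out. A trailing `'` marks a reprocessing visit.

```python
def clearance_visit_order(succs, rpo):
    """Return the visit order produced by the two-counter scheme (sketch)."""
    num_preds = {b: 0 for b in rpo}
    for block in rpo:
        for s in succs.get(block, ()):
            num_preds[s] += 1

    primary_completed = {b: False for b in rpo}
    incoming_processed = {b: 0 for b in rpo}  # preds past primary processing
    incoming_completed = {b: 0 for b in rpo}  # preds that are done
    primary_incoming = {b: 0 for b in rpo}    # the "x" from the description
    order = []

    def is_done(b):
        return (primary_completed[b]
                and incoming_completed[b] == primary_incoming[b]
                and incoming_processed[b] == num_preds[b])

    for block in rpo:
        primary_completed[block] = True
        primary_incoming[block] = incoming_processed[block]
        worklist = [(block, True)]
        while worklist:
            active, primary = worklist.pop()
            order.append(active if primary else active + "'")
            done = is_done(active)
            for s in succs.get(active, ()):
                if is_done(s):
                    continue
                if primary:
                    incoming_processed[s] += 1
                if done:
                    incoming_completed[s] += 1
                if is_done(s):
                    worklist.append((s, False))  # queue s for reprocessing
    return order


# The loop from the example: PH -> A -> B -> C -> D, with a backedge C -> A.
cfg = {"PH": ["A"], "A": ["B"], "B": ["C"], "C": ["D", "A"], "D": []}
print(" ".join(clearance_visit_order(cfg, ["PH", "A", "B", "C", "D"])))
```

On this graph the sketch visits the blocks in the optimized order listed above, with D processed only once, after the loop blocks have been reprocessed.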

Reviewers: myatsina
Differential Revision: https://reviews.llvm.org/D28759

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293571 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:37:03 +00:00
Sanjay Patel
126da6d52b [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants
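
A brute-force sanity check of the scalar identity on 8-bit values, as a Python sketch. It models nsw as "the shifted value still fits in a signed 8-bit integer" and assumes C1 >= C2; both are assumptions made for the illustration, and the helper names are hypothetical. A splat vector applies the same scalar rule to every lane.

```python
def fits_i8(value):
    """True if value is representable as a signed 8-bit integer."""
    return -128 <= value <= 127


def fold_holds_for_i8():
    """Check (X <<nsw C1) >>s C2 == X << (C1 - C2) for all valid i8 inputs."""
    for x in range(-128, 128):
        for c1 in range(8):
            if not fits_i8(x << c1):
                continue  # signed overflow: the nsw precondition is violated
            for c2 in range(c1 + 1):
                # Python's >> on ints is an arithmetic (sign-preserving) shift.
                if (x << c1) >> c2 != x << (c1 - c2):
                    return False
    return True


print(fold_holds_for_i8())
```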
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293570 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:35:52 +00:00
Derek Schuff
c20099fa53 [WebAssembly] Add wasm support for llvm-readobj
Create a WasmDumper subclass of ObjDumper to support Webassembly binary
files.

Patch by Sam Clegg

Differential Revision: https://reviews.llvm.org/D27355

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293569 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:30:52 +00:00
Matt Arsenault
e6e6b711b1 NVPTX: Trivial cleanups of NVPTXInferAddressSpaces
- Move DEBUG_TYPE below includes
- Change unknown address space constant to be consistent with other
  passes
- Grammar fixes in debug output

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293567 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:27:11 +00:00
Sanjay Patel
deb5ed0b1a [InstCombine] add vector test for (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293566 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:26:17 +00:00
Eugene Zelenko
ac9b2ba76d [Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:21:32 +00:00
Benjamin Kramer
cc71b6de7d [ICP] Fix bool conversion warning and actually write out the reason instead of dropping it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293564 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:11:29 +00:00
Matt Arsenault
4aff9df5c8 NVPTX: Refactor NVPTXInferAddressSpaces to check TTI
Add a new TTI hook for getting the generic address space value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293563 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:02:12 +00:00
Sanjay Patel
222619278d [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293562 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 23:01:05 +00:00
Simon Pilgrim
ab1d988959 [X86][SSE] Fix unsigned <= 0 warning in assert. NFCI.
Thanks to @mkuper

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293561 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:58:44 +00:00
Simon Pilgrim
1380699fd1 [X86][SSE] Generalize the number of decoded shuffle inputs. NFCI.
combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs.

This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293560 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:48:49 +00:00
Dehao Chen
7bac4378dd Expose isLegalToPromote as a global helper function so that SamplePGO pass can call it for legality check.
Summary: SamplePGO needs to check if it is legal to promote a target before it actually promotes it.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293559 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:46:37 +00:00
Dehao Chen
ff08fc9e81 Revert r292979 which causes compile time failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293557 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:26:05 +00:00
Sanjay Patel
fdcb936596 [InstCombine] add tests for more shift-shift patterns; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293555 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:24:36 +00:00
Eli Friedman
b643b21e99 Fix line endings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:04:23 +00:00
Tom Stellard
01e3167e09 AMDGPU: Fix release build broken by r293551
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293553 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 22:02:58 +00:00