2217 Commits

Author SHA1 Message Date
Renato Golin
99ec266022 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for values of up to 32 bits. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long-standing FIXME in the tests).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262738 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-04 19:19:36 +00:00
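As a rough illustration of the source-level pattern that 99ec266022 targets (a sketch, not code from the patch), the function below takes both the quotient and the remainder of the same 64-bit operands. The names `DivMod64` and `divmod64` are hypothetical, and whether a particular compiler build lowers this to a single `__aeabi_ldivmod` call depends on the target and version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: quotient and remainder of one division.
struct DivMod64 {
  int64_t quot;
  int64_t rem;
};

// Both operations use the same (a, b) pair, which is the shape a
// div+rem merge looks for before emitting one combined divmod call.
// Precondition: b != 0, and not (a == INT64_MIN && b == -1).
inline DivMod64 divmod64(int64_t a, int64_t b) {
  return DivMod64{a / b, a % b};
}
```

For any valid inputs the two results satisfy `quot * b + rem == a`, which is why a single library call can return both.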
Simon Pilgrim
946e6cb363 [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262599 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-03 09:43:28 +00:00
Renato Golin
ff17c53224 Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262594 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-03 08:57:44 +00:00
Renato Golin
4d7de4fa50 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for values of up to 32 bits. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long-standing FIXME in the tests).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262507 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 19:35:45 +00:00
Matt Arsenault
543afc9d41 DAGCombiner: Make sure an integer is being truncated
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 01:36:51 +00:00
Matt Arsenault
d06f393d79 DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU, where i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262397 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 21:31:53 +00:00
Vasileios Kalintiris
6e09ce7e5f Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262387 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 20:25:43 +00:00
Matt Arsenault
8ba165a405 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262358 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 18:01:37 +00:00
Vasileios Kalintiris
ef35f84f09 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that SETCC nodes always produce an i32 result.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 10:08:01 +00:00
Matt Arsenault
2bc40a1dbd DAGCombiner: Don't unnecessarily swap operands in ReassociateOps
In the case where op = add, y = base_ptr, and x = offset, this
transform:

(op y, (op x, c1)) -> (op (op x, y), c1)

breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.

This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required non-trivial code transformations to fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262148 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-27 19:57:45 +00:00
Matt Arsenault
f51a2196a5 DAGCombiner: Relax sqrt NaN folding check
This is OK for +0 since compares to +/-0 give the same result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262125 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-27 09:38:05 +00:00
Simon Pilgrim
01ad432bbe [DAGCombiner] Use getBitcast helper when possible. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-20 15:05:29 +00:00
Ahmed Bougacha
606927f7bb [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260901 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 18:07:29 +00:00
Pirama Arumuga Nainar
d313df118a Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 00:08:05 +00:00
Ahmed Bougacha
3248c624fa [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 22:54:12 +00:00
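The idiom named in 3248c624fa can be sketched outside LLVM as follows. `Value`, `tryFold`, `visitOld` and `visitNew` are hypothetical stand-ins, not LLVM types or functions; the only property assumed of the real `SDValue` is that it converts to `bool` when it holds a node.

```cpp
#include <cassert>

// Hypothetical stand-in for a value handle that may be empty.
struct Value {
  const int *node;
  explicit Value(const int *n = nullptr) : node(n) {}
  const int *getNode() const { return node; }
  explicit operator bool() const { return node != nullptr; }
};

// Hypothetical combine: succeeds only when the input points at zero.
inline Value tryFold(const int *input) {
  return (input && *input == 0) ? Value(input) : Value();
}

// Older style: the result outlives the check and is tested via getNode().
inline int visitOld(const int *input) {
  Value r = tryFold(input);
  if (r.getNode())
    return 1;
  return 0;
}

// Preferred style: declare and test in the condition, so the result is
// scoped to the branch that uses it.
inline int visitNew(const int *input) {
  if (Value r = tryFold(input))
    return 1;
  return 0;
}
```

Both functions return the same answers; the second form simply keeps the temporary out of the enclosing scope.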
Tim Shen
33bf0bd3ea [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.

The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.

The performance fix keeps the containers required by
hasPredecessorHelper (which is a lazy DFS) and reuses them. Since
hasPredecessorHelper is called in a loop, the overall complexity is reduced
from O(n^2) to O(n), where n is the number of SDNodes.

The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259691 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-03 20:58:55 +00:00
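The two parts of 33bf0bd3ea can be sketched with a toy graph. This is a minimal model under stated assumptions, not the LLVM implementation: `Node` and `hasPredecessorLazy` are hypothetical names, and standard containers stand in for LLVM's small sets and vectors. The caller owns `visited` and `worklist` and passes them to every query, so each node is expanded at most once across all queries.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical graph node: edges point at the node's operands.
struct Node {
  std::vector<const Node *> operands;
};

// Lazy DFS that resumes from where the previous query stopped.
// Returns true if `target` is reachable from the nodes the caller
// seeded into `visited` and `worklist`.
inline bool hasPredecessorLazy(const Node *target,
                               std::unordered_set<const Node *> &visited,
                               std::vector<const Node *> &worklist) {
  // A previous query may already have reached the target.
  if (visited.count(target))
    return true;
  while (!worklist.empty()) {
    const Node *n = worklist.back();
    worklist.pop_back();
    bool found = false;
    // Finish the whole operand list even after a hit, so that no
    // operand is dropped from the worklist by an early return.
    for (const Node *op : n->operands) {
      if (visited.insert(op).second)
        worklist.push_back(op);
      if (op == target)
        found = true;
    }
    if (found)
      return true;
  }
  return false;
}
```

Returning from inside the operand loop would leave later operands of `n` unvisited and unqueued, which is the kind of lost neighbor the correctness fix describes.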
Balaram Makam
60101204f1 AArch64: Implement missed conditional compare sequences.
Summary:
This is an extension to the existing implementation of r242436 which
restricts to only select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leafs. This will additionally handle the case
when a tree with select input is not a conjunction-disjunction tree
but some of the sub trees are conjunction-disjunction trees.

Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB

Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry

Differential Revision: http://reviews.llvm.org/D16291

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259387 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-01 19:13:07 +00:00
Matthias Braun
5e08bd340a Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259283 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 01:24:31 +00:00
Junmo Park
91de9d1201 [DAGCombiner] Don't add volatile or indexed stores to ChainedStores
Summary:
findBetterNeighborChains does not handle volatile or indexed stores.
However, it did not check when adding stores to ChainedStores.

Reviewers: arsenm

Differential Revision: http://reviews.llvm.org/D16463


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 06:23:33 +00:00
Simon Pilgrim
2a554ce9c0 Tidied up TRUNC combine code. NFC.
Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258644 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 21:50:40 +00:00
Dan Gohman
f2cde91200 [SelectionDAG] Fold more offsets into GlobalAddresses
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258482 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 03:57:34 +00:00
Reid Kleckner
99fdb962e1 Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
  #include <string>
  #include <iostream>
  static const char kData[] = "asdf jkl;";
  int main() {
    std::string s(kData + 3, sizeof(kData) - 3);
    std::cout << s << '\n';
  }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258465 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 01:09:29 +00:00
Dan Gohman
4d7ffe9779 [SelectionDAG] Fold more offsets into GlobalAddresses
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.

This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same time that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.

Differential Revision: http://reviews.llvm.org/D16090


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 07:03:08 +00:00
Sanjay Patel
47436fa2d1 [DAGCombiner] don't dereference an operand that doesn't exist (PR26070)
The bug was introduced with changes for x86-64 fp128:
http://reviews.llvm.org/rL254653

I don't know why an x86 change is here, so I'll follow up in:
http://reviews.llvm.org/D15134

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=26070



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257200 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 19:53:24 +00:00
Tim Shen
d6cee943e5 Test commit access - add a blank line in comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257192 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 19:20:23 +00:00
Dan Gohman
0fcad92ee3 [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.

This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.

Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.

Differential Revision: http://reviews.llvm.org/D15544


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256890 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 00:43:06 +00:00
Eric Christopher
54f1a90d45 Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,
x)) combines for ppc_fp128, since signbit computation is more
complicated.

Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html

Patch by Tim Shen!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 22:09:06 +00:00
Sanjay Patel
afd3f07154 fix return values to match bool return type; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254968 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-07 23:34:30 +00:00
Chih-Hung Hsieh
9f51f8f7e7 [X86] Part 1 to fix x86-64 fp128 calling convention.
Almost all these changes are conditioned and only apply to the new
x86-64 f128 type configuration, which will be enabled in a follow up
patch. They are required together to make new f128 work. If there is
any error, we should fix or revert them as a whole.
These changes should have no impact on current configurations.

* Relax type legalization checks to accept new f128 type configuration,
  whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has
  TLI.isTypeLegal true.
* Relax GetSoftenedFloat to return in some cases f128 type SDValue,
  which is TLI.isTypeLegal but not "softened" to i128 node.
* Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration,
  to generate optimized bitwise operators for libm functions.
* Enhance related Lower* functions to handle f128 type.
* Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions
  to keep new f128 type in register, and convert f128 operators to library calls.
* Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
* Add ExpandConstant to handle i128 constants, ExpandNode
  to handle ISD::Constant node.
* Add one more parameter to getCommonSubClass and firstCommonClass,
  to guarantee that returned common sub class will contain the specified
  simple value type.
  This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
* Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
* Fix printOperand to handle null operand.
* Enhance ISD::BITCAST node to handle f128 constant.
* Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
* Enhance X86AsmPrinter to emit f128 values in comments.

Differential Revision: http://reviews.llvm.org/D15134



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254653 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-03 22:02:40 +00:00
Craig Topper
96be2c60e3 Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254260 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-29 18:05:22 +00:00
Craig Topper
ce14a216e2 [SelectionDAG] Use std::any_of instead of a manually coded loop. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254242 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-29 04:37:11 +00:00
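The refactoring described in ce14a216e2 and its follow-up 96be2c60e3 can be illustrated generically. The sketch below is not the LLVM code: `Item`, `anyFlaggedLoop` and `anyFlagged` are hypothetical names chosen to show a hand-written search loop next to the equivalent `std::any_of` call with a lambda.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical element type with a boolean query.
struct Item {
  bool flagged;
  bool isFlagged() const { return flagged; }
};

// Manually coded loop.
inline bool anyFlaggedLoop(const std::vector<Item> &items) {
  for (const Item &i : items)
    if (i.isFlagged())
      return true;
  return false;
}

// Same query with std::any_of; the lambda replaces a
// std::bind/std::mem_fn adapter around the member function.
inline bool anyFlagged(const std::vector<Item> &items) {
  return std::any_of(items.begin(), items.end(),
                     [](const Item &i) { return i.isFlagged(); });
}
```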
Artyom Skrobov
824e14ddab Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-25 19:41:11 +00:00
Simon Pilgrim
f4a7279ca5 Remove duplicate getValueType() calls. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253823 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-22 16:49:38 +00:00
Jonas Paulsson
546611e398 [DAGCombiner] Bugfix for lost chain dependency.
When MergeConsecutiveStores() combines two loads and two stores into
wider loads and stores, the chain users of both of the original loads
must be transferred to the new load, because it may be that a chain
user only depends on one of the loads.

New test case: test/CodeGen/SystemZ/dag-combine-01.ll

Reviewed by James Y Knight.

Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253779 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-21 13:25:07 +00:00
Hans Wennborg
086b179985 X86: More efficient legalization of wide integer compares
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.

Example:

  define i32 @test_slt(i64 %a, i64 %b) {
  entry:
    %cmp = icmp slt i64 %a, %b
    br i1 %cmp, label %bb1, label %bb2
  bb1:
    ret i32 1
  bb2:
    ret i32 2
  }

Before this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          setae   %al
          cmpl    16(%esp), %ecx
          setge   %cl
          je      .LBB2_2
          movb    %cl, %al
  .LBB2_2:
          testb   %al, %al
          jne     .LBB2_4
          movl    $1, %eax
          retl
  .LBB2_4:
          movl    $2, %eax
          retl

After this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          sbbl    16(%esp), %ecx
          jge     .LBB1_2
          movl    $1, %eax
          retl
  .LBB1_2:
          movl    $2, %eax
          retl

Differential Revision: http://reviews.llvm.org/D14496

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253572 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 16:35:08 +00:00
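The cmpl/sbbl/jge sequence shown in 086b179985 relies on a subtract-with-borrow across the two 32-bit halves. A minimal C++ model of that arithmetic is sketched below; `slt64ViaHalves` is a hypothetical name, and the code models the flags rather than reproducing the legalizer.

```cpp
#include <cassert>
#include <cstdint>

// Signed a < b on 64-bit values, computed from 32-bit halves.
inline bool slt64ViaHalves(int64_t a, int64_t b) {
  const uint64_t ua = static_cast<uint64_t>(a);
  const uint64_t ub = static_cast<uint64_t>(b);
  const uint32_t aLo = static_cast<uint32_t>(ua);
  const uint32_t bLo = static_cast<uint32_t>(ub);
  // Reinterpret the high halves as signed. The unsigned-to-signed
  // conversion is modular on all mainstream compilers (and guaranteed
  // since C++20).
  const int32_t aHi = static_cast<int32_t>(static_cast<uint32_t>(ua >> 32));
  const int32_t bHi = static_cast<int32_t>(static_cast<uint32_t>(ub >> 32));
  // cmpl on the low halves sets the carry flag when aLo < bLo.
  const bool borrow = aLo < bLo;
  // sbbl on the high halves computes aHi - bHi - borrow. Doing it in
  // 64 bits avoids overflow, so "result < 0" matches the SF != OF
  // condition that the signed branch tests.
  const int64_t diff = static_cast<int64_t>(aHi) -
                       static_cast<int64_t>(bHi) - (borrow ? 1 : 0);
  return diff < 0;
}
```

The low-half comparison contributes only a borrow, so no separate setcc and select on the two halves is needed.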
Geoff Berry
e2eaa9712d [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252789 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 19:42:52 +00:00
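The fold discussed in e2eaa9712d rests on a simple identity: masking and then zero-extending gives the same value as zero-extending both operands and then masking. The sketch below checks only that identity; `zextOfAnd` and `andOfZext` are hypothetical names, and the commit itself is about when the rewrite is unprofitable, not about its correctness.

```cpp
#include <cassert>
#include <cstdint>

// (zext (and x, mask)): mask in 32 bits, then widen.
inline uint64_t zextOfAnd(uint32_t x, uint32_t mask) {
  return static_cast<uint64_t>(x & mask);
}

// (and (zext x), (zext mask)): widen both, then mask in 64 bits.
inline uint64_t andOfZext(uint32_t x, uint32_t mask) {
  return static_cast<uint64_t>(x) & static_cast<uint64_t>(mask);
}
```

Because both forms are equal, the choice between them is purely a cost question, for example whether the narrow `and` already has other users and already matches as a zero-extending load.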
Matt Arsenault
b8f7aeb218 Add target preference for GatherAllAliases max depth
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252775 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 18:44:33 +00:00
Sanjay Patel
512052a88e add a SelectionDAG method to check if no common bits are set in two nodes; NFCI
This was suggested in:
http://reviews.llvm.org/D13956

and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519

This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.

A corresponding function for IR instructions already exists in ValueTracking.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-09 23:31:38 +00:00
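A concrete-value sketch of the property that 512052a88e queries is given below. It works on plain integers rather than on known-bits information, and `haveNoCommonBits` is a hypothetical name. The link to `add` versus `or` is stated here as an assumption about why such a check is useful, not as something the commit message says.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position is set in both values.
inline bool haveNoCommonBits(uint32_t a, uint32_t b) {
  return (a & b) == 0;
}

// When no bits overlap, addition produces no carries, so it yields
// the same result as bitwise or.
inline bool addEqualsOr(uint32_t a, uint32_t b) {
  return a + b == (a | b);
}
```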
Tom Stellard
136bd632b6 DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252349 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 21:58:37 +00:00
James Y Knight
74615f5548 Fix two issues in MergeConsecutiveStores:
1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there's no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251816 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 18:48:08 +00:00
Sanjay Patel
6f6bce9783 Use the 'arcp' fast-math-flag when combining repeated FP divisors
This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. 
This was originally part of D8900.

Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and 
possibly other changes.

Differential Revision: http://reviews.llvm.org/D9708



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251450 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 20:27:25 +00:00
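The transformation that 6f6bce9783 gates on the 'arcp' flag can be sketched at source level: two divisions by the same divisor become one reciprocal and two multiplications. The names below are hypothetical, and the rewritten form is not bit-identical to the original in general, which is why a fast-math flag is required.

```cpp
#include <cassert>

// Hypothetical pair of results.
struct Quotients {
  double x;
  double y;
};

// Original form: two divisions by the same divisor.
inline Quotients divideBoth(double a, double b, double d) {
  return Quotients{a / d, b / d};
}

// Rewritten form: one division to get the reciprocal, then two
// multiplications. Results may differ in the last bit when 1/d is
// not exactly representable.
inline Quotients divideBothViaReciprocal(double a, double b, double d) {
  const double r = 1.0 / d;
  return Quotients{a * r, b * r};
}
```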
Steve King
6f257342a1 Fix llc crash processing S/UREM for -Oz builds caused by rL250825.
When taking the remainder of a value divided by a constant, visitREM()
attempts to convert the REM to a longer but faster sequence of instructions.
This conversion calls combine() on a speculative DIV instruction. Commit
rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes.
Flow eventually hits unreachable().

This patch adds a test case and a check to prevent visitREM() from trying
to convert the REM instruction in cases where a DIVREM is possible.
See http://reviews.llvm.org/D14035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251373 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 00:14:06 +00:00
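The "longer but faster sequence" mentioned in 6f257342a1 is based on rewriting a remainder in terms of a division, which can then be strength-reduced when the divisor is constant. A minimal sketch of that identity follows; `remViaDiv` is a hypothetical name and the code is not the visitREM() implementation.

```cpp
#include <cassert>
#include <cstdint>

// x % c expressed as x - (x / c) * c.
// Precondition: c != 0, and not (x == INT32_MIN && c == -1).
inline int32_t remViaDiv(int32_t x, int32_t c) {
  return x - (x / c) * c;
}
```

When the target can produce quotient and remainder together, this expansion is unnecessary, which is the case the patch excludes.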
Simon Pilgrim
2bc87a6f42 [DAGCombiner] Generalize masking of constant rotates.
We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded.

Followup to D13851.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251197 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-24 18:44:52 +00:00
Simon Pilgrim
d0ca754540 [X86][XOP] Add support for lowering vector rotations
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.

This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.

Differential Revision: http://reviews.llvm.org/D13851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251188 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-24 13:17:26 +00:00
Zia Ansari
02da4e7721 [X86] - Catch extra combine opportunities for redundant imuls.
When we fold "mul ((add x, c1), c2)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.

I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).

Differential Revision: http://reviews.llvm.org/D13740



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251028 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-22 16:14:45 +00:00
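The fold discussed in 02da4e7721 is the distributive law under wrapping integer arithmetic. The sketch below only checks that identity on 32-bit unsigned values; `mulOfAdd` and `addOfMul` are hypothetical names, and the profitability question (one multiply saved versus one add gained) is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// mul (add x, c1), c2
inline uint32_t mulOfAdd(uint32_t x, uint32_t c1, uint32_t c2) {
  return (x + c1) * c2;
}

// add (mul x, c2), c1*c2, where c1*c2 folds to a single constant
// when both are known at compile time.
inline uint32_t addOfMul(uint32_t x, uint32_t c1, uint32_t c2) {
  return x * c2 + c1 * c2;
}
```

Unsigned arithmetic wraps modulo 2^32, so the two forms agree even when intermediate results overflow.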
Artyom Skrobov
42c2a81d51 Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner.
Summary:
In addition to moving the code over, this patch amends the DIV,REM -> DIVREM
combining to run on all affected nodes at once: if the nodes are converted
to DIVREM one at a time, then the resulting DIVREM may get legalized by the
backend into something target-specific that we won't be able to recognize
and correlate with the remaining nodes.

The motivation is to "prepare terrain" for D13862: when we set DIV and REM
to be legalized to libcalls, instead of the DIVREM, we otherwise lose the
ability to combine them together. To prevent this, we need to take the
DIV,REM -> DIVREM combining out of the lowering stage.

Reviewers: RKSimon, eli.friedman, rengolin

Subscribers: john.brawn, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13733



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250825 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-20 13:06:02 +00:00
Artyom Skrobov
d27f5c0eb0 A doccomment for CombineTo, and some NFC refactorings
Summary:
Caching SDLoc(N), instead of recreating it in every single
function call, keeps the code denser and allows long lines to be unwrapped.

Reviewers: sunfish, atrick, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13726

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-14 17:18:35 +00:00
Artyom Skrobov
d86f867e9e Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC)
Summary: The two implementations had more code in common than not.

Reviewers: sunfish, MatzeB, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-14 16:54:14 +00:00
Matt Arsenault
1349eb7659 DAGCombiner: Don't stop finding better chain on 2 aliases
The comment says this was stopped because it was unlikely to be
profitable. This is not true if you want to combine vector loads
with multiple components.

For a simple case that looks like

t0 = load t0 ...
t1 = load t0 ...
t2 = load t0 ...
t3 = load t0 ...

t4 = store t0:1, t0:1
  t5 = store t4, t1:0
    t6 = store t5, t2:0
      t7 = store t6, t3:0

We want to get all of these stores onto a chain
that is a TokenFactor of these N loads. This mostly
solves the AMDGPU merge-stores.ll regressions
with -combiner-alias-analysis for merging vector
stores of vector loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250138 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-13 00:49:00 +00:00
Matt Arsenault
c08fe15c4f DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-12 23:59:50 +00:00