Commit Graph

446 Commits

Author SHA1 Message Date
Joerg Sonnenberger
732f95ff9a Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374784 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-14 16:15:14 +00:00
Dmitri Gribenko
e0cea29324 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374768 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-14 12:22:48 +00:00
Joerg Sonnenberger
314e3cde15 Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374743 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-13 23:00:15 +00:00
Simon Pilgrim
d88182269a CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI.
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374085 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 17:00:01 +00:00
Guillaume Chatelet
dd96b472b6 [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373207 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 13:34:44 +00:00
Jesper Antonsson
fd525cdd72 [CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix.
Summary:
An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst()  in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was:
 https://bugs.llvm.org/show_bug.cgi?id=41052
... and the previous fix was D59358.

Reviewers: aprantl, skatkov

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67838

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373084 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-27 13:01:37 +00:00
Sanjay Patel
9dcc14cb65 Revert [IR] allow fast-math-flags on phi of FP values
This reverts r372866 (git commit dec03223a97af0e4dfcb23da55c0f7f8c9b62d00)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372868 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 13:29:09 +00:00
Sanjay Patel
5d102992ac [IR] allow fast-math-flags on phi of FP values
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917

As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086

The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535

But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.

The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.

Differential Revision: https://reviews.llvm.org/D67564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372866 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 13:14:12 +00:00
David Green
3110f6b18e [CGP] Ensure sinking multiple instructions does not invalidate dominance checks
In MVE, as of rL371218, we are attempting to sink chains of instructions such as:
  %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0
  %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer
In certain situations though, we can end up breaking the dominance relations of
instructions. This happens when we sink the instruction into a loop, but cannot
remove the originals. The Use is updated, which might in fact be a Use from the
second instruction to the first.

This attempts to fix that by reversing the order of instruction that are sunk,
and ensuring that we update the uses on new instructions if they have already
been sunk, not the old ones.

Differential Revision: https://reviews.llvm.org/D67366


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371743 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 16:00:07 +00:00
Tim Northover
03dc1837d3 CodeGenPrep: add separate hook say when GEPs should be used for sinking. NFCI.
Up to now, we've decided whether to sink address calculations using GEPs or
normal arithmetic based on the useAA hook, but there are other reasons GEPs
might be preferred. So this patch splits the two questions, with a default
implementation falling back to useAA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371721 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 10:21:00 +00:00
Teresa Johnson
ef512ca8e6 Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371284 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-07 03:09:36 +00:00
Craig Topper
cd76070885 [CGP] Remove ModifiedDT from the makeBitReverse loop
I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag.

This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization.

Differential Revision: https://reviews.llvm.org/D66366

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369283 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-19 18:02:24 +00:00
Sanjay Patel
f67401bb31 [CodeGenPrepare] Fix use-after-free
If OptimizeExtractBits() encountered a shift instruction with no operands at all,
it would erase the instruction, but still return false.

This previously didn’t matter because its caller would always return after
processing the instruction, but https://reviews.llvm.org/D63233 changed the
function’s caller to fall through if it returned false, which would then cause
a use-after-free detectable by ASAN.

This change makes OptimizeExtractBits return true if it removes a shift
instruction with no users, terminating processing of the instruction.

Patch by: @brentdax (Brent Royal-Gordon)

Differential Revision: https://reviews.llvm.org/D66330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369168 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-16 23:10:34 +00:00
Jonas Devlieghere
114087caa6 [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369013 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-15 15:54:37 +00:00
Kang Zhang
8928b46f98 [NFC][CodeGen] Modify the type element of TailCalls to simplify the dupRetToEnableTailCallOpts()
Summary:
The old code can be simplified to define the element type of TailCalls as `BasicBlock` not `CallInst`. Also I use the for-range loop instead the for loop.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D64905


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367644 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 03:09:07 +00:00
Luis Marques
e36a3e35dd [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting
Some GEPs were not being split, presumably because that split would just be 
undone by the DAGCombiner. Not performing those splits can prevent important 
optimizations, such as preventing the element indices / member offsets from 
being (partially) folded into load/store instruction immediates. This patch:

- Makes the splits also occur in the cases where the base address and the GEP 
  are in the same BB.
- Ensures that the DAGCombiner doesn't reassociate them back again.

Differential Revision: https://reviews.llvm.org/D60294



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363544 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 10:54:12 +00:00
Sanjay Patel
d049b8952b [CodeGenPrepare][x86] shift both sides of a vector select when profitable
This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428

Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.

This extends CGP functionality introduced with:
rL201655

Differential Revision: https://reviews.llvm.org/D63233

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363511 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-16 15:29:03 +00:00
Sanjay Patel
31b94430d7 [CodeGenPrepare] propagate debuginfo when copying a shuffle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363409 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 15:05:35 +00:00
Matt Arsenault
a69b8271d5 TTI: Improve default costs for addrspacecast
For some reason multiple places need to do this, and the variant the
loop unroller and inliner use was not handling it.

Also, introduce a new wrapper to be slightly more precise, since on
AMDGPU some addrspacecasts are free, but not no-ops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362436 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 18:41:34 +00:00
Bjorn Pettersson
d7d580485b Use the DataLayout::typeSizeEqualsStoreSize helper. NFC
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361613 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-24 09:20:20 +00:00
Simon Pilgrim
c350a3e6fc [CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360328 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-09 10:51:26 +00:00
QingShan Zhang
23785c65ac [CodeGenPrepare] Don't split the store if it is volatile
We shouldn't split the store when it is volatile.

Differential Revision: https://reviews.llvm.org/D61169


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360228 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-08 07:32:12 +00:00
Roman Lebedev
fb75d57329 [NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it
Summary:
It is a common thing to loop over every `PHINode` in some `BasicBlock`
and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block.
`replaceSuccessorsPhiUsesWith()` already had code to do that,
it just wasn't a function.
So outline it into a new function, and use it.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61013

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359996 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-05 18:59:39 +00:00
Roman Lebedev
14d59d23ed [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it
Summary:
There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()`
and `PHINode::getNumOperands()`, but no function to replace every
specified `BasicBlock*` predecessor with some other specified `BasicBlock*`.
Clearly, there are a lot of places that could use that functionality.

Reviewers: chandlerc, craig.topper, spatel, danielcdh

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359995 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-05 18:59:30 +00:00
Sanjay Patel
addf0bb7c5 [CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try)
This is a subset of the original commit from rL359879
which was reverted because it could crash when using the 'RemovedInstructions'
structure that enables delayed deletion of dead instructions. The motivating
compile-time win does not require that change though. We should get most of
that win from this change alone.

Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.

See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html

Differential Revision: https://reviews.llvm.org/D61075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359969 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-04 12:46:32 +00:00
Evgeniy Stepanov
ff2899148a Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block"
This reverts commit r359879, which introduced a compiler crash.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359908 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 17:31:49 +00:00
Sanjay Patel
866bc92c5d [CodeGenPrepare] limit overflow intrinsic matching to a single basic block
Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.

Also, we were restarting the iterator loops when doing the overflow intrinsic
transforms by marking the dominator tree for update. That was done to prevent
iterating over a removed instruction. But we can postpone the deletion using
the existing "RemovedInsts" structure, and that means we don't need to update
the DT.

See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html

Differential Revision: https://reviews.llvm.org/D61075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359879 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 13:09:18 +00:00
Francis Visoiu Mistrih
a8196a8e04 [CGP] Look through bitcasts when duplicating returns for tail calls
The simple case of:

```
int *callee();
void *caller(void *a) {
  if (a == NULL)
    return callee();
  return a;
}
```

would generate a regular call instead of a tail call because we don't
look through the bitcast of the call to `callee` when duplicating the
return blocks.

Differential Revision: https://reviews.llvm.org/D60837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359041 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-23 21:57:46 +00:00
Eric Christopher
0c40f6d128 Include what's used in a few cpp files - these were getting transitive
includes from MCDwarf.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358254 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:16:33 +00:00
Evandro Menezes
d71ea05ab7 [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357731 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-04 22:40:06 +00:00
Teresa Johnson
9af297cc57 [CGP] Reset DT when optimizing select instructions
Summary:
A recent fix (r355751) caused a compile time regression because setting
the ModifiedDT flag in optimizeSelectInst means that each time a select
instruction is optimized the function walk in runOnFunction stops and
restarts again (which was needed to build a new DT before we started
building it lazily in r356937). Now that the DT is built lazily, a
simple fix is to just reset the DT at this point, rather than restarting
the whole function walk.

In the future other places that set ModifiedDT may want to switch to
just resetting the DT directly. But that will require an evaluation to
ensure that they don't otherwise need to restart the function walk.

Reviewers: spatel

Subscribers: jdoerfert, llvm-commits, xur

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357111 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 18:44:25 +00:00
Teresa Johnson
30cc7c8130 [CGP] Build the DominatorTree lazily
Summary:
In r355512 CGP was changed to build the DominatorTree only once per
function traversal, to avoid repeatedly building it each time it was
accessed. This solved one compile time issue but introduced another. In
the second case, we now were building the DT unnecessarily many times
when we performed many function traversals (i.e. more than once per
function when running CGP because of changes made each time).

Change to saving the DT in the CodeGenPrepare object, and building it
lazily when needed. It is reset whenever we need to rebuild it.

The case that exposed the issue there are 617 functions, and we walk
them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a
total of 12083 times (so previously we were building the DT 12083
times). With this patch we only build the DT 844 times (average of 1.37
times per function). We dropped the total time to compile this file from
538.11s without this patch to 339.63s with it.

There is still an issue as CGP is taking much longer than all other
passes even with this patch, and before a recent compiler release cut at
r355392 the total time to this compile was only 97 sec with a huge
reduction in CGP time. I suspect that one of the other recent changes to
CGP led to iterating each function many more times on average, but I
need to do some more investigation.

Reviewers: spatel

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356937 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 18:38:48 +00:00
Teresa Johnson
ff5f64e4c8 [CGP] Make several static functions member functions (NFC)
This is extracted from D59696 as suggested in the review. It is
preparation for making the DominatorTree a member variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356857 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-24 15:18:50 +00:00
Sanjay Patel
c4a68ddfbf [CodeGenPrepare] limit formation of overflow intrinsics (PR41129)
This is probably a bigger limitation than necessary, but since we don't have any evidence yet
that this transform led to real-world perf improvements rather than regressions, I'm making a
quick, blunt fix.

In the motivating x86 example from:
https://bugs.llvm.org/show_bug.cgi?id=41129
...and shown in the regression test, we want to avoid an extra instruction in the dominating
block because that could be costly.

The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version
is any better than the other yet.

Differential Revision: https://reviews.llvm.org/D59602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356665 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-21 13:57:07 +00:00
Sanjay Patel
a062393d06 [CGP] fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356572 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 16:47:53 +00:00
Sanjay Patel
0fd81a3c2d [CGP] convert chain of 'if' to 'switch'; NFC
This should be extended, but CGP does some strange things,
so I'm intentionally not changing the potential order of
any transforms yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356566 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 15:53:06 +00:00
Mikael Holmen
bc7dbf662a [CodeGenPrepare] avoid crashing from replacing a phi twice
Summary:
This is a fix to bug 41052:
https://bugs.llvm.org/show_bug.cgi?id=41052

While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi.

Patch by: JesperAntonsson

Reviewers: skatkov, aprantl

Reviewed By: aprantl

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 13:51:05 +00:00
Sanjay Patel
f1180b0c1f [CGP] add another bailout for degenerate code (PR41064)
This is almost the same as:
rL355345
...and should prevent any potential crashing from examples like:
https://bugs.llvm.org/show_bug.cgi?id=41064
...although the bug was masked by:
rL355823
...and I'm not sure how to repro the problem after that change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356218 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-14 23:14:31 +00:00
Tim Northover
99eb9152f9 CodeGenPrep: preserve inbounds attribute when sinking GEPs.
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355926 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 15:22:23 +00:00
Eugene Leviant
13df7e16ef [CGP] Fix UB when GEP is bound to trivial PHINode
Differential revision: https://reviews.llvm.org/D59140


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355904 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 10:10:29 +00:00
Sam Parker
64ecbb6deb [CGP] Limit distance between overflow math and cmp
Inserting an overflowing arithmetic intrinsic can increase register
pressure by producing two values at a point where only one is needed,
while the second use maybe several blocks away. This increase in
pressure is likely to be more detrimental on performance than
rematerialising one of the original instructions.
    
So, check that the arithmetic and compare instructions are no further
apart than their immediate successor/predecessor.

Differential Revision: https://reviews.llvm.org/D59024


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355823 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 13:19:46 +00:00
Sanjay Patel
87e0f26154 [CGP] fix comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355791 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-10 18:42:30 +00:00
Rong Xu
3441a8988a [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst
r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be
maintained correctly in optimizations in optimizeBlock() and optimizeInst().
Function optimizeSelectInst() does not update the flag.
This patch propagates the flag in optimizeSelectInst() back to
optimizeBlock().

This patch also removes ModifiedDT in CodeGenPrepare class (which is not used).
The property of ModifiedDT is now recorded in a ref parameter.

Differential Revision: https://reviews.llvm.org/D59139


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355751 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 22:46:18 +00:00
Teresa Johnson
f383764b62 [CGP] Avoid repeatedly building DominatorTree causing long compile-time (NFC)
Summary:
In r354298 a DominatorTree construction was added via new function
combineToUSubWithOverflow, which was subsequently restructured into
replaceMathCmpWithIntrinsic in r354689. We are hitting a very long
compile time due to this repeated construction, once per math cmp in
the function.

We shouldn't need to build the DominatorTree more than once per
function, except when a transformation invalidates it. There is already
a boolean flag that is returned from these methods indicating whether
the DT has been modified. We can simply build the DT once per
Function walk in CodeGenPrepare::runOnFunction, since any time a change
is made we break out of the Function walk and restart it.

I modified the code so that both replaceMathCmpWithIntrinsic as well as
mergeSExts (which was also building a DT) use the DT constructed by the
run method.

From -mllvm -time-passes:
Before this patch: CodeGen Prepare user time is 328s
With this patch: CodeGen Prepare user time is 21s

Reviewers: spatel

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355512 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-06 14:57:40 +00:00
Sanjay Patel
b4cc6b6cce [CodeGenPrepare] avoid crashing on non-canonical/degenerate code
The test is reduced from an example in the post-commit thread for:
rL354746
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html

While we must avoid dying here, the real question should be:
Why is non-canonical and/or degenerate code making it to CGP when
using the new pass manager?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355345 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-04 22:47:13 +00:00
Sanjay Patel
5478a36577 [CGP] add special-cases to form unsigned add with overflow (PR40486)
There's likely a missed IR canonicalization for at least 1 of these
patterns. Otherwise, we wouldn't have needed the pattern-matching
enhancement in D57516.

Note that -- unlike usubo added with D57789 -- the TLI hook for
this transform defaults to 'on'. So if there's any perf fallout
from this, targets should look at how they're lowering the uaddo
node in SDAG and/or override that hook.

The x86 diffs suggest that there's some missing pattern-matching
for forming inc/dec.

This should fix the remaining known problems in:
https://bugs.llvm.org/show_bug.cgi?id=40486
https://bugs.llvm.org/show_bug.cgi?id=31754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354746 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-24 15:31:27 +00:00
Sanjay Patel
dc14d72264 [CGP] move overflow intrinsic insertion to common location; NFCI
We need to enhance the uaddo matching to handle special-cases
as seen in PR40486 and PR31754. That means we won't necessarily
have a def-use pattern, so we'll need to check dominance to
determine where to place the intrinsic (as we already do for
usubo). This preliminary patch is just rearranging the code,
so the planned follow-up to improve uaddo will be more clear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354689 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-22 20:20:24 +00:00
Sanjay Patel
1f63d705ea [CGP] match a special-case of unsigned subtract overflow
This is the 'sub0' (negate) pattern from PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354519 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-20 21:23:04 +00:00
Sanjay Patel
09f3b5ecb1 [CGP] form usub with overflow from sub+icmp
The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487:
https://bugs.llvm.org/show_bug.cgi?id=31754
https://bugs.llvm.org/show_bug.cgi?id=40487
..and those are shown in the IR test file and x86 codegen file.

Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use.

This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo
formation by default. Only x86 overrides that setting for now although other targets will likely benefit
by forming usbuo too.

Differential Revision: https://reviews.llvm.org/D57789

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354298 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-18 23:33:05 +00:00
Craig Topper
e3696113b6 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353563 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 20:48:56 +00:00