222 Commits

Author SHA1 Message Date
Jun Bum Lim
40a6d15e7a [CodeGenPrep]Restructure promoting Ext to form ExtLoad
Summary:
Instead of just looking for a load which is mergeable with an Ext to form an ExtLoad, try to promote Exts as long as the cost is acceptable. This change is not an NFC, as it continues promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case.
This change was motivated by D26524.  Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP.

Reviewers: jmolloy, qcolombet, mcrosier, javed.absar

Reviewed By: qcolombet

Subscribers: rengolin, llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D27853

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298114 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 19:05:21 +00:00
Matt Arsenault
365e17251e CodeGenPrepare: Sink addressing modes for atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297903 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 22:35:20 +00:00
Michael Kuperstein
1872f69aec [CGP] Split some critical edges coming out of indirect branches
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.

This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about 100
defs of registers containing constants in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about 100 constants to the stack. The end result is that a
clang-compiled python interpreter can be about 2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.

Differential Revision: https://reviews.llvm.org/D29916


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296416 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 00:11:34 +00:00
Daniel Jasper
e5e8f2aec1 Revert "[CGP] Split some critical edges coming out of indirect branches"
This reverts commit r296149 as it leads to crashes when compiling for
PPC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296295 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-26 11:09:12 +00:00
Eli Friedman
0ee8e563c9 [CodeGenPrepare] Make -addr-sink-using-gep work with address spaces.
When we construct addressing modes, we use isNoopAddrSpaceCast to ignore
addrspacecast instructions. Make sure we insert the correct addrspacecast
when we reconstruct the addressing mode.

Differential Revision: https://reviews.llvm.org/D30114



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296167 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 20:51:36 +00:00
Michael Kuperstein
98ee128c8e [CGP] Split some critical edges coming out of indirect branches
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.

This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about 100
defs of registers containing constants in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about 100 constants to the stack. The end result is that a
clang-compiled python interpreter can be about 2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.

Differential Revision: https://reviews.llvm.org/D29916


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296149 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 18:41:32 +00:00
Michael Kuperstein
969577f54d Revert r296060 to pacify bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296064 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 01:22:19 +00:00
Michael Kuperstein
8981fc9888 [CGP] Split some critical edges coming out of indirect branches
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.

This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about 100
defs of registers containing constants in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about 100 constants to the stack. The end result is that a
clang-compiled python interpreter can be about 2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.

Differential Revision: https://reviews.llvm.org/D29916


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296060 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 00:56:21 +00:00
Geoff Berry
fc170d8f5d [CodeGenPrepare] Sink and duplicate more 'and' instructions.
Summary:
Rework the code that was sinking/duplicating (icmp and, 0) sequences
into blocks where they were being used by conditional branches to form
more tbz instructions on AArch64.  The new code is more general in that
it just looks for 'and's that have all icmp 0's as users, with a target
hook used to select which subset of 'and' instructions to consider.
This change also enables 'and' sinking for X86, where it is more widely
beneficial than on AArch64.

The 'and' sinking/duplicating code is moved into the optimizeInst phase
of CodeGenPrepare, where it can take advantage of the fact the
OptimizeCmpExpression has already sunk/duplicated any icmps into the
blocks where they are used.  One minor complication from this change is
that optimizeLoadExt needed to be updated to always mark 'and's it has
determined should be in the same block as their feeding load in the
InsertedInsts set to avoid an infinite loop of hoisting and sinking the
same 'and'.

This change fixes a regression on X86 in the tsan runtime caused by
moving GVNHoist to a later place in the optimization pipeline (see
PR31382).

Reviewers: t.p.northover, qcolombet, MatzeB

Subscribers: aemerson, mcrosier, sebpop, llvm-commits

Differential Revision: https://reviews.llvm.org/D28813

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295746 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 18:53:14 +00:00
Matt Arsenault
a647924c48 TargetLowering: Remove AddrSpace parameter from GetAddrModeArguments
It doesn't make any sense to pass it in to what is supposed to be parsing
the call, and it can be inferred from the pointer output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294412 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 07:09:03 +00:00
Igor Laevsky
5ac65c9c9a [CodeGenPrepare] Hoist all getSubtargetImpl calls to the beginning of the pass
Differential Revision: https://reviews.llvm.org/D29456



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294301 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 13:27:20 +00:00
Jun Bum Lim
c96f661d59 [CodeGenPrep]No negative cost in the ExtLd promotion
Summary: This change prevents the signed cost value from becoming negative, since the value is passed as an unsigned argument.
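
A minimal sketch of the kind of guard this summary describes, written as standalone C++ rather than the actual CodeGenPrepare code. The names `clampCost`, `extCost`, and `savedCost` are hypothetical; the only point is that a signed difference is floored at zero before it is handed to an unsigned parameter.

```cpp
// Hypothetical helper (not an LLVM API): compute a cost as a signed value,
// then floor it at zero so the conversion to unsigned cannot wrap around.
constexpr unsigned clampCost(int extCost, int savedCost) {
  return static_cast<unsigned>(extCost - savedCost < 0 ? 0
                                                       : extCost - savedCost);
}

// Without the clamp, 1 - 3 would convert to a very large unsigned value.
static_assert(clampCost(1, 3) == 0, "negative costs are floored at zero");
static_assert(clampCost(5, 2) == 3, "non-negative costs pass through");
```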

Reviewers: mcrosier, jmolloy, qcolombet, javed.absar

Reviewed By: mcrosier, qcolombet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28871

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293307 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 17:16:37 +00:00
Haicheng Wu
9c79f14436 [CodeGenPrepare] Fix a typo in the comment. NFC.
encode => endcode.

Differential Revision: https://reviews.llvm.org/D28866

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292438 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-18 21:12:10 +00:00
Wei Mi
1d866f19de Redo store splitting in CodeGenPrepare.
This is a follow-up patch to https://reviews.llvm.org/D22840 to address the
issue when a value to be merged into an int64 pair is in a different BB. Redo
the store splitting in CodeGenPrepare so we can match the pattern across multiple
BBs and move some instructions into the same BB. We still keep the code in dag
combine so that we can catch cases that show up after DAG combining runs.

Differential Revision: https://reviews.llvm.org/D25914



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290365 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 19:44:45 +00:00
George Burgess IV
1ced44b92a [Analysis] Centralize objectsize lowering logic.
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290214 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 23:46:36 +00:00
Jun Bum Lim
4d9c93dc3f [CodeGenPrep] Skip merging empty case blocks
This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block and the unit test failures in AVR and WebAssembly:

Summary: Merging an empty case block into the header block of a switch could cause ISel to add COPY instructions in the header of the switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289988 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 20:38:39 +00:00
Sanjoy Das
0499c46012 Inline stripInvariantGroupMetadata out of existence
As a one liner function, I don't think it is pulling its weight in terms
of helping readability.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289987 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 20:29:39 +00:00
Sanjoy Das
83c8485899 Fix CodeGenPrepare::stripInvariantGroupMetadata
`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs.  The
only reason it worked at all is that `getMetadataID` returns something
unrelated -- it returns the subclass ID of the receiver (which is used
in `dyn_cast` etc.).  That does not numerically match
`LLVMContext::MD_invariant_group` and ends up dropping `invariant_group`
along with every other metadata that does not numerically match
`LLVMContext::MD_invariant_group`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289973 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 18:52:33 +00:00
Jun Bum Lim
51900e49f9 Revert "[CodeGenPrep] Skip merging empty case blocks"
This reverts commit r289951.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289960 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 17:06:14 +00:00
Jun Bum Lim
6db0eaf697 [CodeGenPrep] Skip merging empty case blocks
This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block:

Summary: Merging an empty case block into the header block of a switch could cause ISel to add COPY instructions in the header of the switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289951 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 16:03:31 +00:00
Peter Collingbourne
06115803f9 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288458 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-02 02:24:42 +00:00
Joerg Sonnenberger
46cc79217b Revert r287553: [CodeGenPrep] Skip merging empty case blocks
It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line
670 ("Expected irreducible CFG").


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288052 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-28 18:56:54 +00:00
Justin Lebar
09220c80d3 [CodeGenPrepare] Don't sink non-cheap addrspacecasts.
Summary:
Previously, CGP would unconditionally sink addrspacecast instructions,
even going so far as to sink them into a loop.

Now we check that the cast is "cheap", as defined by TLI.

We introduce a new "is-cheap" function to TLI rather than using
isNopAddrSpaceCast because some GPU platforms want the ability to ask
for non-nop casts to be sunk.

Reviewers: arsenm, tra

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287591 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 22:49:15 +00:00
Justin Lebar
5db0e4c349 [CodeGenPrepare] Rewrite a loop in terms of llvm::none_of. NFC.
Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287590 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 22:49:11 +00:00
Jun Bum Lim
b68036c70c [CodeGenPrep] Skip merging empty case blocks
Summary: Merging an empty case block into the header block of a switch could cause
ISel to add COPY instructions in the header of the switch, instead of the case
block, if the case block is used as an incoming block of a PHI. This could
potentially increase dynamic instructions, especially when the switch is in a
loop. I added a test case which was reduced from the benchmark I was targeting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl

Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287553 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 16:47:28 +00:00
Simon Pilgrim
84d1e9104a Fix comment typos. NFC.
Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 13:47:59 +00:00
Chandler Carruth
0e7e3d2f38 Hoist check for TLI above all of the attempts to use it (including one
that is hidden inside a separate function call) and helpfully
before building expensive transaction infrastructure. This will avoid
crashing when running CGP in a generic mode if we ever managed to hit
this case.

Note that I spent some time looking at alternatives. CGP is actually
used without a TM or TLI in order to do some target-independent testing.
Further, all of the neighboring optimization techniques actually have
some paths that are effective even in the absence of TLI so this seemed
the correct scope at which to check and bypass logic. It still isn't
clear that long-term support for missing TM/TLI is the right
cost/benefit tradeoff for CGP -- we seem to get relatively little for it
and the code is just littered with checks (and assumptions which
I suspect are still missing some checks).

This at least fixes the potential bug in this code spotted by
PVS-Studio, so we've got that going for us. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285987 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-04 06:54:00 +00:00
Dehao Chen
977fc82cac Use profile info to set function section prefix to group hot/cold functions.
Summary:
The original implementation is in r261607, which was reverted in r269726 to accommodate the ProfileSummaryInfo analysis pass. The new implementation:
1. add a new metadata for function section prefix
2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function
3. output the section prefix set by CGP

Reviewers: davidxl, eraman

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D24989

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284533 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 20:42:47 +00:00
Andrea Di Biagio
fe42b012cd [CodeGenPrepare] When moving a zext near to its associated load, do not retain the original debug location.
CodeGenPrepare knows how to move a zext of a load into the same basic block
where the load lives. The goal is to help ISel match a zero-extending load
instead of two separated instructions.

CGP attempts to move a zext computation even if it lives in a basic block that
does not post-dominate the load's basic block. That means, the hoisted zext may
be speculated. Preserving the zext location would hurt the debugging experience
and the quality of sample pgo.
With this patch, when moving a zext near to its associated load, CGP no longer
propagates the zext's debug location. Instead, CGP conservatively reuses the
same debug location for the load and the zext.

An alternative approach would be to assign an artificial line-0 location to the
zext. However we don't want to over-use the 'line-0' for this particular case
because it would have a size cost in the line-table section for no additional
benefit.

Differential Revision: https://reviews.llvm.org/D25611


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284377 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 11:32:26 +00:00
Wolfgang Pieb
65236ec176 Preserve the debug location when CodeGenPrepare sinks a compare instruction into the
basic block of a user.

Patch by Andrea Di Biagio.

Differential Revision: https://reviews.llvm.org/D24632



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283500 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-06 21:43:45 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Dehao Chen
77ad09277a Fix the bug introduced in r281252.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281253 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 20:29:54 +00:00
Dehao Chen
8c8aa02da6 Lower consecutive select instructions correctly.
Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition checks and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same time to avoid inefficient code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095

Reviewers: davidxl

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D24147

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281252 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 20:23:28 +00:00
Michael Kuperstein
2a93bda6e9 [CGP] Be less conservative about tail-duplicating a ret to allow tail calls
CGP tail-duplicates rets into blocks that end with a call that feeds the ret.
This puts the call in tail position, potentially allowing the DAG builder to
lower it as a tail call. To avoid tail duplication in cases where we won't
form the tail call, CGP tries to predict whether this is going to be possible,
and avoids doing it when lowering as a tail call will definitely fail.
However, it was being too conservative by always throwing away calls to
functions with a signext/zeroext attribute on the return type.

Instead, we can use the same logic the builder uses to determine whether the
attributes work out.

Differential Revision: https://reviews.llvm.org/D24315


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280894 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 00:48:37 +00:00
Michael Kuperstein
9c0826c800 Don't reuse a variable name in a nested scope. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280853 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 20:29:49 +00:00
Xinliang David Li
da48846879 [Profile] preserve branch metadata lowering select in CGP
CGP currently drops the select's MD_prof profile data when
generating a conditional branch, which can lead to bad
code layout. The patch fixes the issue.

Differential Revision: http://reviews.llvm.org/D24169


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280600 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 21:26:36 +00:00
David Majnemer
975248e4fb Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278433 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 22:21:41 +00:00
Tim Northover
d7c16ef558 CodeGenPrep: use correct function to determine Global's alignment.
Elsewhere (particularly computeKnownBits) we assume that a global will be
aligned to the value returned by Value::getPointerAlignment. This is used to
boost the alignment on memcpy/memset, so any target-specific request can only
increase that value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275866 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:28:52 +00:00
Chad Rosier
b744845677 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments is
consistent with the function name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270715 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 16:22:14 +00:00
Jun Bum Lim
1bbd21ba14 Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent mistakes similar to the one fixed in r269433.

Reviewers: joker.eph, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20248

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269456 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 18:38:35 +00:00
Sanjay Patel
bc9409d1af [CGP] avoid crashing from weightlessness
It's possible that we have branch weights with 0 values.
In that case, don't try to create an impossible BranchProbability.
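
A minimal sketch of the guard described above, in standalone C++. The `Prob` struct and `takenProbability` function are hypothetical stand-ins, not LLVM's BranchProbability; the assumption is simply that a probability is a numerator over the sum of the two weights, so a zero sum must be rejected before dividing.

```cpp
#include <cstdint>

// Hypothetical stand-in for a probability stored as numerator/denominator.
struct Prob {
  std::uint64_t num;
  std::uint64_t den;
  bool valid; // false when no meaningful probability can be formed
};

// Form "taken" probability from two branch weights. If both weights are
// zero the denominator would be zero, so report an invalid result instead.
constexpr Prob takenProbability(std::uint64_t trueWeight,
                                std::uint64_t falseWeight) {
  return (trueWeight + falseWeight == 0)
             ? Prob{0, 1, false}
             : Prob{trueWeight, trueWeight + falseWeight, true};
}

static_assert(!takenProbability(0, 0).valid, "0/0 weights are rejected");
static_assert(takenProbability(3, 1).den == 4, "denominator is the sum");
```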



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268935 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 17:31:55 +00:00
David Majnemer
11dea5d5dd [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in its basic
block.  Don't sink if the user is an EH pad.

This fixes PR27536.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267767 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 19:36:38 +00:00
Sanjay Patel
6f5aa79cda [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.
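
The threshold test can be sketched in standalone C++ as follows. This is an illustration under assumptions, not the TLI hook itself: `isHighlyPredictable` is a hypothetical name, and the comparison uses integer arithmetic on two branch weights to check whether the more likely side exceeds 99% of the total.

```cpp
#include <cstdint>

// Hypothetical check: true when one side of the branch carries strictly more
// than 99% of the total weight. Written as max * 100 > sum * 99 to avoid
// floating point. Large weights could overflow; this sketch ignores that.
constexpr bool isHighlyPredictable(std::uint64_t trueWeight,
                                   std::uint64_t falseWeight) {
  return (trueWeight + falseWeight) != 0 &&
         (trueWeight > falseWeight ? trueWeight : falseWeight) * 100 >
             (trueWeight + falseWeight) * 99;
}

static_assert(isHighlyPredictable(1000, 1), "about 99.9% taken");
static_assert(!isHighlyPredictable(99, 1), "exactly 99% is not above 99%");
static_assert(!isHighlyPredictable(64, 4), "64 vs 4 is only about 94%");
```

Under this sketch, weights of 4 and 64 fall short of the threshold, since 64 out of 68 is roughly 94%.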

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. I.e.,
50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.

Differential Revision: http://reviews.llvm.org/D19488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267572 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 17:11:17 +00:00
Sanjay Patel
e59120290f [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 00:47:39 +00:00
Sanjay Patel
7ceecf02a1 replace duplicated static functions for profile metadata access with BranchInst member function; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267295 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-23 20:01:22 +00:00
Andrew Kaylor
1e455c5cfb Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267231 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 22:06:11 +00:00
Vedant Kumar
8866d94a61 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 06:51:37 +00:00
Andrew Kaylor
c852398cbc Initial implementation of optimization bisect support.
This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.
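
The combined skip check can be modeled with a small standalone C++ sketch. This is a hypothetical model, not the OptBisect class: the function names are invented for illustration, and it assumes that skippable passes are numbered from 1 in the order they run and that a limit of -1 means no limit.

```cpp
// Hypothetical model of a bisect limit. Assumptions: passes are numbered
// from 1 in execution order, and a limit of -1 disables bisection.
constexpr bool withinBisectLimit(int bisectLimit, int passNumber) {
  return bisectLimit == -1 || passNumber <= bisectLimit;
}

// A pass is skipped if the function is optnone or the limit is exceeded,
// mirroring the combined check described in the commit message.
constexpr bool shouldSkip(bool isOptNone, int bisectLimit, int passNumber) {
  return isOptNone || !withinBisectLimit(bisectLimit, passNumber);
}

static_assert(!shouldSkip(false, -1, 500), "no limit: every pass runs");
static_assert(!shouldSkip(false, 10, 10), "pass 10 runs with a limit of 10");
static_assert(shouldSkip(false, 10, 11), "pass 11 is skipped at limit 10");
static_assert(shouldSkip(true, -1, 1), "optnone is always skipped");
```

Bisecting then amounts to searching for the smallest limit at which the failure appears; the pass with that number is the first suspect.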

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-21 17:58:54 +00:00
Petar Jovanovic
396d592ab0 Calculate __builtin_object_size when pointer depends on a condition
This patch fixes the calculation of __builtin_object_size when the pointer
depends on a condition. Before this patch, the compiler did not know how to
calculate the object size when it found a condition that cannot be eliminated.
This patch enables the calculation even when the condition cannot be
eliminated, by choosing the minimum or maximum of the possible values as the
result. Whether the minimum or the maximum is chosen depends on the second
argument of the __builtin_object_size function.
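
A minimal sketch of that choice in standalone C++. The function `objectSizeOfSelect` is hypothetical and is not the code from this patch; it assumes the documented GCC convention that bit 1 of the second argument selects a minimum (types 2 and 3) rather than a maximum (types 0 and 1).

```cpp
#include <cstddef>

// Hypothetical model: a pointer is either to an object of sizeIfTrue bytes
// or to one of sizeIfFalse bytes, depending on a condition that cannot be
// folded away. Assumption: bit 1 of `type` asks for the minimum, otherwise
// the maximum is returned.
constexpr std::size_t objectSizeOfSelect(std::size_t sizeIfTrue,
                                         std::size_t sizeIfFalse, int type) {
  return (type & 2)
             ? (sizeIfTrue < sizeIfFalse ? sizeIfTrue : sizeIfFalse)
             : (sizeIfTrue > sizeIfFalse ? sizeIfTrue : sizeIfFalse);
}

// For p = cond ? buf8 : buf32, with buffers of 8 and 32 bytes:
static_assert(objectSizeOfSelect(8, 32, 0) == 32, "type 0 takes the maximum");
static_assert(objectSizeOfSelect(8, 32, 2) == 8, "type 2 takes the minimum");
```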

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266193 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-13 12:25:25 +00:00
Sanjay Patel
f3fc3d19a4 use range-loops; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265985 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-11 20:13:44 +00:00