232 Commits

Author SHA1 Message Date
Sanjay Patel
bc9409d1af [CGP] avoid crashing from weightlessness
It's possible that we have branch weights with 0 values.
In that case, don't try to create an impossible BranchProbability.
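For illustration, a minimal sketch of the guard this implies, assuming the BranchInst::extractProfMetadata() helper present in this tree (see r267295 below); it is not the exact upstream change:

  #include <cstdint>
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Return true only if the branch carries usable, non-zero weights.
  static bool hasNonZeroWeights(BranchInst *BI) {
    uint64_t TrueWeight = 0, FalseWeight = 0;
    if (!BI->extractProfMetadata(TrueWeight, FalseWeight))
      return false;                // no !prof metadata at all
    // Both weights may legitimately be zero; building a BranchProbability
    // with a zero denominator would assert, so treat this as "no data".
    return TrueWeight + FalseWeight != 0;
  }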



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268935 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 17:31:55 +00:00
David Majnemer
11dea5d5dd [CodeGenPrepare] Don't sink a cast past its user
The sink cast machinery is supposed to sink casts as close to their user
as possible.  However, an EH pad is the first instruction in its basic
block.  Don't sink if the user is an EH pad.
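For illustration, a minimal sketch of the rule (the helper name is made up for this example):

  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  // An EH pad must be the first instruction of its block, so there is no
  // legal insertion point for a cast sunk next to such a user.
  static bool okToSinkCastToUser(const Instruction &UserInst) {
    return !UserInst.isEHPad();
  }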

This fixes PR27536.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267767 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 19:36:38 +00:00
Sanjay Patel
6f5aa79cda [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not-taken is chosen as the threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on such a branch
almost all the time, so even a large misprediction penalty would be outweighed by the win from
all the times the branch is predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for 
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. That is,
50/50 taken/not-taken might still be 100% predictable.
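For illustration, a hedged sketch of this kind of check, reading the select's !prof branch weights directly; the helper name and structure are illustrative, not the patch's actual TLI hook:

  #include <algorithm>
  #include <cstdint>
  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Metadata.h"
  using namespace llvm;

  // True if the weights say one side is taken >99% of the time, i.e. a
  // branch formed from this select would be highly predictable.
  static bool isHighlyBiased(const SelectInst *SI) {
    MDNode *Prof = SI->getMetadata(LLVMContext::MD_prof);
    if (!Prof || Prof->getNumOperands() != 3)
      return false;                               // no usable branch_weights
    auto *T = mdconst::dyn_extract<ConstantInt>(Prof->getOperand(1));
    auto *F = mdconst::dyn_extract<ConstantInt>(Prof->getOperand(2));
    if (!T || !F)
      return false;
    uint64_t TW = T->getZExtValue(), FW = F->getZExtValue();
    if (TW + FW == 0)
      return false;
    // ">99%": the larger weight must exceed 99/100 of the total.
    return std::max(TW, FW) * 100 > (TW + FW) * 99;
  }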

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.

Differential Revision: http://reviews.llvm.org/D19488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267572 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 17:11:17 +00:00
Sanjay Patel
e59120290f [CodeGenPrepare] don't convert an unpredictable select into control flow
Suggested in the review of D19488:
http://reviews.llvm.org/D19488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 00:47:39 +00:00
Sanjay Patel
7ceecf02a1 replace duplicated static functions for profile metadata access with BranchInst member function; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267295 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-23 20:01:22 +00:00
Andrew Kaylor
1e455c5cfb Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267231 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 22:06:11 +00:00
Vedant Kumar
8866d94a61 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 06:51:37 +00:00
Andrew Kaylor
c852398cbc Initial implementation of optimization bisect support.
This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.
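For illustration, a hedged sketch of what the new check looks like from a pass's point of view, assuming the skipFunction() helper that replaces skipOptnoneFunction(); the pass itself is hypothetical:

  #include "llvm/IR/Function.h"
  #include "llvm/Pass.h"
  using namespace llvm;

  namespace {
  struct ExamplePass : public FunctionPass {   // hypothetical pass
    static char ID;
    ExamplePass() : FunctionPass(ID) {}
    bool runOnFunction(Function &F) override {
      // Consults both the 'optnone' attribute and the bisect limit set by
      // e.g.:  opt -O2 -opt-bisect-limit=42 input.ll -S -o output.ll
      if (skipFunction(F))
        return false;            // behave as if the pass did nothing
      // ... the real transformation would go here ...
      return false;
    }
  };
  }
  char ExamplePass::ID = 0;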

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-21 17:58:54 +00:00
Petar Jovanovic
396d592ab0 Calculate __builtin_object_size when pointer depends on a condition
This patch fixes the calculation of __builtin_object_size when the pointer
depends on a condition. Previously, the compiler did not know how to compute
the object size when it found a condition that could not be eliminated.
This patch enables computing __builtin_object_size even when the condition
cannot be eliminated, by choosing either the minimum or the maximum value
across the condition's arms. Which of the two is chosen depends on the
second argument of __builtin_object_size.
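For illustration, a small example of the min/max choice (assuming clang with optimization enabled; the commented values are the expected results, not something this patch alone guarantees in every case):

  #include <cstddef>

  char small_buf[16];
  char big_buf[64];

  // Types 0 and 1 ask for the maximum possible size, types 2 and 3 for
  // the minimum, so the unresolved condition picks one bound or the other.
  size_t upper_bound(bool c) {
    char *p = c ? small_buf : big_buf;
    return __builtin_object_size(p, 0);   // expected: 64 (maximum)
  }

  size_t lower_bound(bool c) {
    char *p = c ? small_buf : big_buf;
    return __builtin_object_size(p, 2);   // expected: 16 (minimum)
  }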

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266193 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-13 12:25:25 +00:00
Sanjay Patel
f3fc3d19a4 use range-loops; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265985 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-11 20:13:44 +00:00
Chuang-Yu Cheng
afa4f41e9c Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge
Presently, CodeGenPrepare deletes all nearly empty (only phi and branch)
basic blocks. This pass can delete loop preheaders which frequently creates
critical edges. A preheader can be a convenient place to spill registers to
the stack. If the entrance to a loop body is a critical edge, then spills
may occur in the loop body rather than immediately before it. This patch
protects loop preheaders from deletion in CodeGenPrepare even if they are
nearly empty.

Since the patch alters the CFG, it affects a large number of test cases.
In most cases, the changes are merely cosmetic (basic blocks have different
names or instruction orders change slightly). I am somewhat concerned about
the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not
deleted, then the MIPS backend does not take advantage of a branch delay
slot. Consequently, I would like some close review by a MIPS expert.

The patch also partially subsumes D16893 from George Burgess IV. George
correctly notes that CodeGenPrepare does not actually preserve the dominator
tree. I think the dominator tree was usually not valid when CodeGenPrepare
ran, but I am using LoopInfo to mark preheaders, so the dominator tree is
now always valid before CodeGenPrepare.
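For illustration, a hedged sketch of the preheader test this implies, using LoopInfo (the helper name is made up for this example):

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/CFG.h"
  using namespace llvm;

  // A nearly empty block is protected if it is the preheader of a loop
  // that one of its successors belongs to.
  static bool isProtectedPreheader(BasicBlock *BB, const LoopInfo &LI) {
    for (BasicBlock *Succ : successors(BB))
      if (Loop *L = LI.getLoopFor(Succ))
        if (L->getLoopPreheader() == BB)
          return true;
    return false;
  }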

Author: Tom Jablin (tjablin)
Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng

http://reviews.llvm.org/D16984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265397 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-05 14:06:20 +00:00
Peter Zotov
7ab7e0efeb [CodeGenPrepare] Fix r265264 (again).
Don't require TLI for SinkCmpExpression, just as it was not required
before r265264.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265271 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-03 19:32:13 +00:00
Peter Zotov
ad9ddc5a39 [CodeGenPrepare] Fix r265264.
The case where there was no TargetLowering was not handled,
leading to null pointer dereferences.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265265 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-03 17:11:53 +00:00
Peter Zotov
6a4952db2d [CodeGenPrepare] Avoid sinking soft-FP comparisons
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.

A common pattern that produces floating-point comparisons that are uniform
over a loop is an explicit check for division by zero. If the divisor is
hoisted out of the loop, the comparison can be as well, but hoisting the
function that unwinds is never legal, since that could cause side effects
in the loop body prior to the unwinding not to be executed.

Differential Revision: http://reviews.llvm.org/D18744

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265264 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-03 16:36:17 +00:00
George Burgess IV
58cad7988f Keep CodeGenPrepare from preserving the domtree.
CGP modifies the domtree in some cases, so saying that it preserves the
domtree is a lie. We'll be able to selectively preserve it with the new
pass manager.

Differential Revision: http://reviews.llvm.org/D16893


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264099 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-22 21:25:08 +00:00
Junmo Park
abc3287851 Minor code cleanups. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263200 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-11 07:05:32 +00:00
Philip Reames
3239eb1016 [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not affect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address live through the entirety of the fast path.

This patch only handles addressing computations, but in principle we could implement a more general cold-path scheduling heuristic that tries to reduce register pressure on the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.
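For illustration, a hedged sketch of the 'cold path' test described above, keyed off the cold attribute on calls; it is not the patch's exact predicate:

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Treat a block as cold if it contains a call explicitly marked 'cold';
  // only such blocks are candidates for duplicated addressing code.
  static bool blockLooksCold(const BasicBlock &BB) {
    for (const Instruction &I : BB)
      if (auto *CI = dyn_cast<CallInst>(&I))
        if (CI->hasFnAttr(Attribute::Cold))
          return true;
    return false;
  }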

Differential Revision: http://reviews.llvm.org/D17652



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 23:13:12 +00:00
Junmo Park
43a6e3e075 [CodeGenPrepare] Remove load-based heuristic
Summary:
Both the hardware and LLVM have changed since 2012.
Now, the load-based heuristic no longer shows big differences on OoO cores.

There are no notable regressions or improvements on spec2000/2006 (Cortex-A57, Core i5).

Reviewers: spatel, zansari
    
Differential Revision: http://reviews.llvm.org/D16836


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261809 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-25 00:23:27 +00:00
Duncan P. N. Exon Smith
14147eda61 ADT: Stop using getNodePtrUnchecked on end() iterators
Stop using `getNodePtrUnchecked()` when building IR.  Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.

This should have no functionality change for now, but is a path towards
removing some more UB from ilist.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-21 19:52:15 +00:00
Duncan P. N. Exon Smith
2d6fcba17e CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway.  This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over.  But having this pointer available in
ilist_iterator depends on UB in the first place.

Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.
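For illustration, the safe pattern in isolation (a trivial example, not the CodeGenPrepare code itself):

  #include "llvm/IR/BasicBlock.h"
  using namespace llvm;

  // Capture the Value* by dereferencing the iterator while it is still
  // live; the plain pointer can then outlive the iterator safely.
  static Value *valueAt(BasicBlock::iterator It) {
    Value *V = &*It;
    return V;
  }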

There should be no functionality change here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261493 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-21 19:37:45 +00:00
Yaron Keren
55307987a1 Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith's post-commit comment on r259192.
The clang part landed in r259232; this is the LLVM part of the patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259240 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-29 20:50:44 +00:00
Junmo Park
d92d3ed12b Minor code cleanups. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 09:42:39 +00:00
Sanjay Patel
36893e923d function names start with a lowercase letter; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258552 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 21:11:47 +00:00
Sanjay Patel
c25c05335a fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258330 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 18:59:16 +00:00
Manuel Jacob
75e1cfb035 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
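For illustration, the two equivalent spellings side by side (a trivial example, assuming a GlobalVariable):

  #include "llvm/IR/GlobalVariable.h"
  using namespace llvm;

  // Both expressions name the type of the global's contents; the new
  // accessor avoids peeking through the pointer type.
  static Type *contentType(const GlobalVariable &GV) {
    return GV.getValueType();  // == GV.getType()->getPointerElementType()
  }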
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257999 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-16 20:30:46 +00:00
James Y Knight
b707e2a85c Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into any executable linked against it, if the executable even
accesses the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257902 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 16:33:06 +00:00
James Molloy
d6a317790e [CodeGenPrepare] Try and appease sanitizers
dupRetToEnableTailCallOpts(BB) can invalidate BB. It must run *after* we iterate across BB!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257886 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 10:36:01 +00:00
James Molloy
cd54bb4874 [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that led to this design:
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses, or both. InstCombine uses it to find only bswaps; CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257875 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 09:20:19 +00:00
James Y Knight
ee7d060ab8 Revert "Stop increasing alignment of externally-visible globals on ELF platforms."
This reverts commit r257719, due to PR26144.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257775 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-14 16:33:21 +00:00
James Y Knight
326b7ceee0 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into any executable linked against it, if the executable even
accesses the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257719 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-13 23:59:19 +00:00
Junmo Park
a94cb08fda Remove extra whitespace. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257144 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 04:20:32 +00:00
Manuel Jacob
397864c712 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256811 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 04:03:00 +00:00
Eric Christopher
0ecf67f2fe Clarify that the bypassSlowDivision optimization operates on a single BB [v2]
Update some comments to be more explicit.

Change bypassSlowDivision and the functions it calls so that they take
BasicBlock*s and Instruction*s, rather than Function::iterator&s and
BasicBlock::iterator&s.

Change the APIs so that the caller is responsible for updating the
iterator, rather than the callee. This makes control flow much easier
to follow.
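For illustration, the caller-side pattern this change leads to, where the loop owns and advances the iterator (optimizeInst is a hypothetical stand-in for the callee):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  static void walkFunction(Function &F, void (*optimizeInst)(Instruction *)) {
    for (BasicBlock &BB : F)
      for (BasicBlock::iterator I = BB.begin(), E = BB.end(); I != E;) {
        Instruction *Inst = &*I++;  // advance before the callee may erase it
        optimizeInst(Inst);
      }
  }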

Patch by Justin Lebar!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256789 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 23:18:58 +00:00
Manuel Jacob
f463d69145 Remove unnecessary casts. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256101 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-19 18:38:42 +00:00
Sanjay Patel
59dc7be11d getParent() ^ 3 == getModule() ; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 17:24:23 +00:00
Reid Kleckner
f4e0a47be3 [CGP] Reimplement r255055 a different way
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255070 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 23:00:03 +00:00
Reid Kleckner
903c0998cc Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around"
This reverts commit r255055.

Breakage has been reported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255063 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 22:33:23 +00:00
Reid Kleckner
898fa74bf2 [CGP] Check that we have an insert point before moving llvm.dbg.value around
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255055 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 21:50:52 +00:00
Andrew Kaylor
e03f5e5c43 [WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253902 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-23 19:16:15 +00:00
Geoff Berry
2d2aadfa70 [CodeGenPrepare] Create more extloads and fewer ands
Summary:
Add 'and' instructions immediately after loads that only have their low
bits used, assuming that (and (load x) c) will be matched as an extload
and the ands/truncs fed by the extload will be removed by isel.
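For illustration, a hedged sketch of the insertion, assuming the low 8 bits are the ones demanded (the real patch derives the mask width from the actual demanded bits):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/Instructions.h"
  using namespace llvm;

  // Insert 'and %load, 0xff' right after the load and route all existing
  // users through it, so isel can fold load+and into an extending load.
  static void maskLoadLowBits(LoadInst *Load) {
    IRBuilder<> B(Load->getNextNode());         // just after the load
    Value *Mask = ConstantInt::get(Load->getType(), 0xff);
    Instruction *NewAnd = cast<Instruction>(B.CreateAnd(Load, Mask));
    Load->replaceAllUsesWith(NewAnd);
    NewAnd->setOperand(0, Load);                // restore the and's input
  }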

Reviewers: mcrosier, qcolombet, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14584

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253722 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-20 22:34:39 +00:00
Sanjay Patel
da754b5015 [CGP] despeculate expensive cttz/ctlz intrinsics
This is another step towards allowing SimplifyCFG to speculate harder, but then have 
CGP clean things up if the target doesn't like it.

Previous patches in this series:
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297

D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input 
(that definition can probably be blamed on x86).

For example, if we have the usual speculated-by-select expensive op pattern like this:

  %tobool = icmp eq i64 %A, 0
  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)   ; is_zero_undef == true
  %cond = select i1 %tobool, i64 64, i64 %0
  ret i64 %cond

There's an instcombine that will turn it into:

  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false)   ; is_zero_undef == false

This CGP patch is looking for that case and despeculating it back into:

  entry:
    %tobool = icmp eq i64 %A, 0
    br i1 %tobool, label %cond.end, label %cond.true

  cond.true:
    %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)    ; is_zero_undef == true
    br label %cond.end

  cond.end:
    %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
    ret i64 %cond

This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), 
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.

The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554

Differential Revision: http://reviews.llvm.org/D14630



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253573 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 16:37:10 +00:00
Pete Cooper
6d024c616a Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 05:56:52 +00:00
Pete Cooper
8b170f7f29 Change memcpy/memset/memmove to have dest and source alignments.
Note: this was reviewed (and more details are available) in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places where the code needs to be checked by an expert
to determine whether using only src/dest alignment is safe.  For those
places, we currently take the minimum of the src/dest alignments, which
matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang, which actually shows many alignment differences
now that IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note: future changes can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-18 22:17:24 +00:00
Igor Laevsky
e78962ca4a [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251957 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 18:37:40 +00:00
Sanjay Patel
9620204c3b [CGP] widen switch condition and case constants to target's register width (2nd try)
This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1:
  cmpwi  3, 99
  bgt    0, .LBB0_9
 BB#2:
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#3:
  cmplwi         4, 10
  bne    0, .LBB0_12
 BB#4:
  li 3, 1
  blr
.LBB0_5:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 65436
  beq    0, .LBB0_13
 BB#6:
  cmplwi         3, 65526
  beq    0, .LBB0_15
 BB#7:
  cmplwi         3, 65535
  bne    0, .LBB0_12
 BB#8:
  li 3, 4
  blr
.LBB0_9:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 100
  beq    0, .LBB0_14
...

After:
BB#0:
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
 BB#1:
  lis 3, 0
  ori 3, 3, 65525
  cmpw   4, 3
  bgt    0, .LBB0_9
 BB#2:
  cmplwi         4, 1000
  beq    0, .LBB0_14
 BB#3:
  cmplwi         4, 65436
  bne    0, .LBB0_13
 BB#4:
  li 3, 6
  blr
.LBB0_5:
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#6:
  cmplwi         4, 10
  beq    0, .LBB0_12
 BB#7:
  cmplwi         4, 100
  bne    0, .LBB0_13
 BB#8:
  li 3, 2
  blr
.LBB0_9:
  cmplwi         4, 65526
  beq    0, .LBB0_15
 BB#10:
  cmplwi         4, 65535
  bne    0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251857 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:22:49 +00:00
Sanjay Patel
2762d00d74 revert r251849; need to move tests to arch-specific folders
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251851 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:05:20 +00:00
Sanjay Patel
1e6387531e [CGP] widen switch condition and case constants to target's register width
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of 
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any 
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1: 
  cmpwi	 3, 99
  bgt	 0, .LBB0_9
 BB#2:            
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#3:            
  cmplwi	 4, 10
  bne	 0, .LBB0_12
 BB#4:                      
  li 3, 1
  blr
.LBB0_5:                             
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 65436
  beq	 0, .LBB0_13
 BB#6:                            
  cmplwi	 3, 65526
  beq	 0, .LBB0_15
 BB#7:                       
  cmplwi	 3, 65535
  bne	 0, .LBB0_12
 BB#8:                       
  li 3, 4
  blr
.LBB0_9:                       
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 100
  beq	 0, .LBB0_14
...

After:
BB#0:        
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi	 4, 999
  ble 0, .LBB0_5
 BB#1:          
  lis 3, 0
  ori 3, 3, 65525
  cmpw	 4, 3
  bgt	 0, .LBB0_9
 BB#2:         
  cmplwi	 4, 1000
  beq	 0, .LBB0_14
 BB#3:    
  cmplwi	 4, 65436
  bne	 0, .LBB0_13
 BB#4:       
  li 3, 6
  blr
.LBB0_5:   
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#6: 
  cmplwi	 4, 10
  beq	 0, .LBB0_12
 BB#7:             
  cmplwi	 4, 100
  bne	 0, .LBB0_13
 BB#8:             
  li 3, 2
  blr
.LBB0_9:       
  cmplwi	 4, 65526
  beq	 0, .LBB0_15
 BB#10:      
  cmplwi	 4, 65535
  bne	 0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251849 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 22:46:24 +00:00
Elena Demikhovsky
16ed8780c7 Scalarizer for masked.gather and masked.scatter intrinsics.
When the target does not support these intrinsics, they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added the isLegalMaskedGather() and isLegalMaskedScatter() APIs.
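For illustration, a hedged sketch of how the new hook gates the expansion, assuming the era's single-argument isLegalMaskedGather(); the scalarize callback is a hypothetical stand-in for the expansion itself:

  #include "llvm/Analysis/TargetTransformInfo.h"
  #include "llvm/IR/IntrinsicInst.h"
  using namespace llvm;

  // Keep the masked.gather intrinsic only when the target claims support;
  // otherwise expand it (per-lane loads, with conditional blocks when the
  // mask is not a constant).
  static void lowerGatherIfNeeded(IntrinsicInst *II,
                                  const TargetTransformInfo &TTI,
                                  void (*scalarize)(IntrinsicInst *)) {
    Type *DataTy = II->getType();   // a gather returns the loaded vector
    if (!TTI.isLegalMaskedGather(DataTy))
      scalarize(II);
  }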

Differential Revision: http://reviews.llvm.org/D13722



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251237 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-25 15:37:55 +00:00
Rafael Espindola
2fe94b6b08 Refactor: Simplify boolean conditional return statements in lib/CodeGen.
Patch by Richard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251213 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-24 23:11:13 +00:00
Elena Demikhovsky
c57d9bd443 Masked Load/Store optimization for scalar code
When we have to convert masked.load or masked.store to scalar code, we generate a chain of conditional basic blocks.
I added an optimization for the case of a constant mask vector.

Differential Revision: http://reviews.llvm.org/D13855



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250893 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-21 11:50:54 +00:00