Benjamin Kramer
355b353595
Stop emitting instructions with the name "tmp"; they eat up memory and have to be uniqued, without any benefit.
...
If someone prefers %tmp42 to %42, run instnamer.
llvm-svn: 140634
2011-09-27 20:39:19 +00:00
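An illustrative sketch (editorial addition, not from the commit): after this change, instcombine-created instructions are unnamed and print as numbered temporaries, as in this hypothetical function:
  define i32 @f(i32 %x) {
    %1 = add i32 %x, 1            ; unnamed temporary after this change
    ret i32 %1
  }
Running opt -instnamer assigns symbolic names (e.g. %tmp1) for readability.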
Nadav Rotem
43912ff374
Fixes following the CR by Chris and Duncan:
...
Optimize chained bitcasts of the form A->B->A.
Undo r138722 and change isEliminableCastPair to allow this case.
llvm-svn: 138756
2011-08-29 19:58:36 +00:00
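An illustrative IR sketch (editorial addition, not from the commits) of the A->B->A fold from this change and r138722 below; the types are hypothetical:
  define <4 x i32> @chain(<4 x i32> %v) {
    %b = bitcast <4 x i32> %v to <2 x i64>    ; A -> B
    %a = bitcast <2 x i64> %b to <4 x i32>    ; B -> A
    ret <4 x i32> %a                          ; folds to: ret <4 x i32> %v
  }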
Nadav Rotem
6280c8eecc
Bitcasts are transitive. Bitcast-Bitcast-X becomes Bitcast-X.
...
llvm-svn: 138722
2011-08-28 11:51:08 +00:00
Jay Foad
6513dac6e2
Convert GetElementPtrInst to use ArrayRef.
...
llvm-svn: 135904
2011-07-25 09:48:08 +00:00
Jay Foad
42463ed852
Convert IRBuilder::CreateGEP and IRBuilder::CreateInBoundsGEP to use
...
ArrayRef.
llvm-svn: 135761
2011-07-22 08:16:57 +00:00
Eli Friedman
c18314afef
Clean up includes of llvm/Analysis/ConstantFolding.h so it's included where it's used and not included where it isn't.
...
llvm-svn: 135628
2011-07-20 21:57:23 +00:00
Chris Lattner
e1fe7061ce
Land David Blaikie's patch to de-constify Type, with a few tweaks.
...
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Evan Cheng
ba4a50f10c
It's not safe to fold (fptrunc (sqrt (fpext x))) to (sqrtf x) if there is another use of sqrt. rdar://9763193
...
llvm-svn: 135058
2011-07-13 19:08:16 +00:00
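An illustrative IR sketch (editorial addition, not from the commit) of the hazard; names are hypothetical:
  declare double @sqrt(double)
  define float @f(float %x, double* %sink) {
    %ext = fpext float %x to double
    %s = call double @sqrt(double %ext)
    store double %s, double* %sink      ; a second use of the sqrt result
    %t = fptrunc double %s to float     ; folding this to sqrtf(%x) would
    ret float %t                        ; not remove the double sqrt above
  }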
Bob Wilson
d5c5f63f43
Reapply a fixed version of r133285.
...
This tightens up checking for overflow in alloca sizes, based on feedback
from Duncan and John about the change in r132926.
llvm-svn: 134749
2011-07-08 22:09:33 +00:00
Chad Rosier
0dc865af56
Revert r133285; it was causing odd failures on Dragonegg.
...
llvm-svn: 133301
2011-06-17 22:08:25 +00:00
Stuart Hastings
03f59f5916
Relocate NUW test to cover all binary ops in a dynamic alloca expr.
...
Follow-up to r132926. rdar://problem/9265821
llvm-svn: 133285
2011-06-17 20:21:52 +00:00
Stuart Hastings
65d0bc94b4
Avoid fusing bitcasts with dynamic allocas if the amount-to-allocate
...
might overflow. Re-typing the alloca to a larger type (e.g. double)
hoists a shift into the alloca, potentially exposing overflow in the
expression. rdar://problem/9265821
llvm-svn: 132926
2011-06-13 18:48:49 +00:00
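An illustrative IR sketch (editorial addition, not from the commit) of why the fold needs NUW; names are hypothetical:
  define double* @f(i32 %n) {
    %size = mul i32 %n, 8               ; no nuw: the product may wrap
    %p = alloca i8, i32 %size
    %q = bitcast i8* %p to double*
    ; fusing the bitcast into the alloca as "alloca double, i32 %n"
    ; hoists the shift/divide into the size expression and is only
    ; safe if the multiply cannot overflow
    ret double* %q
  }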
Eli Friedman
6937c422a0
Final step of instcombine debuginfo; switch a couple more places over to InsertNewInstWith, and use setDebugLoc for the cases which can't be easily handled by the automated mechanisms.
...
llvm-svn: 132167
2011-05-27 00:19:40 +00:00
Eli Friedman
2fa7bea638
More instcombine simplifications towards better debug locations.
...
llvm-svn: 131596
2011-05-18 23:11:30 +00:00
Eli Friedman
358d9a5af3
Use ReplaceInstUsesWith instead of replaceAllUsesWith where appropriate in instcombine.
...
llvm-svn: 131512
2011-05-18 00:32:01 +00:00
Benjamin Kramer
fd520474ca
While SimplifyDemandedBits constant folds this, we can't rely on it here.
...
It's possible to craft an input that hits the recursion limits such that
SimplifyDemandedBits doesn't simplify the icmp, but ComputeMaskedBits can
still infer which bits are zero.
No test case as it depends on too many other things. Fixes PR9609.
llvm-svn: 128777
2011-04-02 18:50:58 +00:00
Benjamin Kramer
d91d0d877e
Fix comment.
...
llvm-svn: 128745
2011-04-01 22:29:18 +00:00
Benjamin Kramer
eb9bd6ed23
Tweaks to the icmp+sext-to-shifts optimization to address Frits' comments:
...
- Localize the check for whether an icmp has one use to a place where we know we're
introducing something that's likely more expensive than a sext from i1.
- Add an assert to make sure a case that would lead to a miscompilation is
folded away earlier.
- Fix a typo.
llvm-svn: 128744
2011-04-01 22:22:11 +00:00
Benjamin Kramer
09e0a56ebc
Fix build.
...
llvm-svn: 128733
2011-04-01 20:15:16 +00:00
Benjamin Kramer
7c0178b9ec
InstCombine: Turn icmp + sext into bitwise/integer ops when the input has only one unknown bit.
...
int test1(unsigned x) { return (x&8) ? 0 : -1; }
int test3(unsigned x) { return (x&8) ? -1 : 0; }
before (x86_64):
_test1:
andl $8, %edi
cmpl $1, %edi
sbbl %eax, %eax
ret
_test3:
andl $8, %edi
cmpl $1, %edi
sbbl %eax, %eax
notl %eax
ret
after:
_test1:
shrl $3, %edi
andl $1, %edi
leal -1(%rdi), %eax
ret
_test3:
shll $28, %edi
movl %edi, %eax
sarl $31, %eax
ret
llvm-svn: 128732
2011-04-01 20:09:10 +00:00
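An illustrative IR-level sketch (editorial addition, not from the commit) of the transform for test1 above; names are hypothetical:
  define i32 @test1(i32 %x) {
    %and = and i32 %x, 8
    %cmp = icmp eq i32 %and, 0
    %sext = sext i1 %cmp to i32         ; (x&8) ? 0 : -1
    ret i32 %sext
  }
  ; becomes, approximately:
  ;   %shr = lshr i32 %x, 3
  ;   %bit = and i32 %shr, 1
  ;   %res = add i32 %bit, -1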
Benjamin Kramer
d74739be04
InstCombine: Move (sext icmp) transforms into their own method. No intended functionality change.
...
llvm-svn: 128731
2011-04-01 20:09:03 +00:00
Jay Foad
53632b7c03
Remove PHINode::reserveOperandSpace(). Instead, add a parameter to
...
PHINode::Create() giving the (known or expected) number of operands.
llvm-svn: 128537
2011-03-30 11:28:46 +00:00
Jay Foad
dc5a008237
(Almost) always call reserveOperandSpace() on newly created PHINodes.
...
llvm-svn: 128535
2011-03-30 11:19:20 +00:00
Devang Patel
2f204229ef
The llvm.dbg.declare intrinsic does not use any llvm::Values. It's magic!
...
llvm-svn: 127282
2011-03-08 22:12:11 +00:00
Chris Lattner
db204cbe42
convert ConstantVector::get to use ArrayRef.
...
llvm-svn: 125537
2011-02-15 00:14:00 +00:00
Chris Lattner
ee7f7c2494
Revert my ConstantVector patch; it seems to have made the llvm-gcc
...
builders unhappy.
llvm-svn: 125504
2011-02-14 18:15:46 +00:00
Chris Lattner
34f32cb4c2
Switch ConstantVector::get to use ArrayRef instead of a pointer+size
...
idiom. Change various clients to simplify their code.
llvm-svn: 125487
2011-02-14 07:55:32 +00:00
Chris Lattner
74ed5d30ca
Implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
...
This fixes rdar://8808586 which observed that we used to compile:
union xy {
struct x { _Bool b[15]; } x;
__attribute__((packed))
struct y {
__attribute__((packed)) unsigned long b0to7;
__attribute__((packed)) unsigned int b8to11;
__attribute__((packed)) unsigned short b12to13;
__attribute__((packed)) unsigned char b14;
} y;
};
struct x
foo(union xy *xy)
{
return xy->x;
}
into:
_foo: ## @foo
movq (%rdi), %rax
movabsq $1095216660480, %rcx ## imm = 0xFF00000000
andq %rax, %rcx
movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000
andq %rax, %rdx
movzbl %al, %esi
orq %rdx, %rsi
movq %rax, %rdx
andq $65280, %rdx ## imm = 0xFF00
orq %rsi, %rdx
movq %rax, %rsi
andq $16711680, %rsi ## imm = 0xFF0000
orq %rdx, %rsi
movl %eax, %edx
andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000
orq %rsi, %rdx
orq %rcx, %rdx
movabsq $280375465082880, %rcx ## imm = 0xFF0000000000
movq %rax, %rsi
andq %rcx, %rsi
orq %rdx, %rsi
movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000
andq %r8, %rax
orq %rsi, %rax
movzwl 12(%rdi), %edx
movzbl 14(%rdi), %esi
shlq $16, %rsi
orl %edx, %esi
movq %rsi, %r9
shlq $32, %r9
movl 8(%rdi), %edx
orq %r9, %rdx
andq %rdx, %rcx
movzbl %sil, %esi
shlq $32, %rsi
orq %rcx, %rsi
movl %edx, %ecx
andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000
orq %rsi, %rcx
movq %rdx, %rsi
andq $16711680, %rsi ## imm = 0xFF0000
orq %rcx, %rsi
movq %rdx, %rcx
andq $65280, %rcx ## imm = 0xFF00
orq %rsi, %rcx
movzbl %dl, %esi
orq %rcx, %rsi
andq %r8, %rdx
orq %rsi, %rdx
ret
We now compile this into:
_foo: ## @foo
## BB#0: ## %entry
movzwl 12(%rdi), %eax
movzbl 14(%rdi), %ecx
shlq $16, %rcx
orl %eax, %ecx
shlq $32, %rcx
movl 8(%rdi), %edx
orq %rcx, %rdx
movq (%rdi), %rax
ret
A small improvement :-)
llvm-svn: 123520
2011-01-15 06:32:33 +00:00
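An illustrative IR sketch (editorial addition, not from the commit) of the canonicalization, with hypothetical types: the mask is applied before the cast, so neighboring masks and casts can combine:
  define i64 @mask(i32 %x) {
    %ext = zext i32 %x to i64
    %and = and i64 %ext, 65280          ; and (zext x), C
    ret i64 %and
  }
  ; canonicalizes to (cast moved outside the and):
  ;   %and = and i32 %x, 65280
  ;   %ext = zext i32 %and to i64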
Bill Wendling
b1f6875ae3
Whitespace fixes. No functionality change.
...
llvm-svn: 122110
2010-12-17 23:27:41 +00:00
Nate Begeman
063d88d6fb
Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
...
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Chris Lattner
a58a97dafc
Fix a serious performance regression introduced by r108687 on Linux:
...
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well. Not doing so causes us to do
two sqrts when building with -fmath-errno (the default on Linux).
llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Chris Lattner
fc1da78d16
For completeness, allow undef also.
...
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner
b61cf1e296
Handle the constant case of vector insertion. For something
...
like this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.B;
A.A = 42;
return A;
}
we now generate:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
pshufd $16, %xmm2, %xmm2
movss LCPI1_1(%rip), %xmm0
pshufd $16, %xmm0, %xmm0
unpcklps %xmm2, %xmm0
ret
instead of:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss 12(%rax), %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
unpcklps %xmm0, %xmm1
addss LCPI1_0(%rip), %xmm2
movd %xmm2, %eax
shlq $32, %rax
addq $1109917696, %rax ## imm = 0x42280000
movd %rax, %xmm0
ret
llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
c70b0c0ee7
Optimize bitcasts from large integers to vectors into vector
...
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 ABI. We now compile something like
this:
struct S { float A, B, C, D; };
struct S g;
struct S bar() {
struct S A = g;
++A.A;
++A.C;
return A;
}
into all nice vector operations:
_bar: ## @bar
## BB#0: ## %entry
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm0
movss 4(%rax), %xmm2
movss 12(%rax), %xmm3
pshufd $16, %xmm2, %xmm2
unpcklps %xmm2, %xmm0
addss 8(%rax), %xmm1
pshufd $16, %xmm1, %xmm1
pshufd $16, %xmm3, %xmm2
unpcklps %xmm2, %xmm1
ret
instead of icky integer operations:
_bar: ## @bar
movq _g@GOTPCREL(%rip), %rax
movss LCPI1_0(%rip), %xmm1
movss (%rax), %xmm0
addss %xmm1, %xmm0
movd %xmm0, %ecx
movl 4(%rax), %edx
movl 12(%rax), %esi
shlq $32, %rdx
addq %rcx, %rdx
movd %rdx, %xmm0
addss 8(%rax), %xmm1
movd %xmm1, %eax
shlq $32, %rsi
addq %rax, %rsi
movd %rsi, %xmm1
ret
This resolves rdar://8360454
llvm-svn: 112343
2010-08-28 01:20:38 +00:00
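An illustrative IR sketch (editorial addition, not from the commit), assuming little-endian element layout as on x86-64; names are hypothetical:
  define <2 x float> @build(float %a, float %b) {
    %ai = bitcast float %a to i32
    %bi = bitcast float %b to i32
    %lo = zext i32 %ai to i64
    %wide = zext i32 %bi to i64
    %hi = shl i64 %wide, 32
    %int = or i64 %hi, %lo
    %vec = bitcast i64 %int to <2 x float>
    ret <2 x float> %vec
  }
  ; becomes, approximately:
  ;   %v0 = insertelement <2 x float> undef, float %a, i32 0
  ;   %v1 = insertelement <2 x float> %v0, float %b, i32 1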
Chris Lattner
80632e5fd9
Implement a pretty general logical shift propagation
...
framework, which is good at ripping through bitfield
operations. This generalizes a bunch of the existing
xforms that instcombine does, such as
(x << c) >> c -> and
to handle intermediate logical nodes. This is useful for
ripping up the "promote to large integer" code produced by
SRoA.
llvm-svn: 112304
2010-08-27 22:24:38 +00:00
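An illustrative IR sketch (editorial addition, not from the commit) of shift propagation through an intermediate logical node; names are hypothetical:
  define i32 @field(i32 %x) {
    %shl = shl i32 %x, 8
    %and = and i32 %shl, 65280          ; intermediate logical node
    %shr = lshr i32 %and, 8
    ret i32 %shr
  }
  ; the whole chain reduces to: and i32 %x, 255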
Chris Lattner
866b888095
Teach the truncation optimization that an entire chain of
...
computation can be truncated if it is fed by a sext/zext whose type
doesn't have to be exactly equal to the truncation result type.
llvm-svn: 112285
2010-08-27 20:32:06 +00:00
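An illustrative IR sketch (editorial addition, not from the commit): the chain is fed by a zext from i8, which need not match the i16 truncation result type, yet the arithmetic can still be evaluated in i16:
  define i16 @chain(i8 %x) {
    %ext = zext i8 %x to i32
    %add = add i32 %ext, 100
    %t = trunc i32 %add to i16
    ret i16 %t
  }
  ; becomes, approximately:
  ;   %ext = zext i8 %x to i16
  ;   %add = add i16 %ext, 100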
Chris Lattner
69a9143584
Add an instcombine to clean up a common pattern produced
...
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening; now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
llvm-svn: 112278
2010-08-27 18:31:05 +00:00
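An illustrative sketch (editorial addition, not from the commit) of the cleanup on the zext/lshr/trunc pattern shown above, with hypothetical names:
  define i8 @ex(i16 %x) {
    %z = zext i16 %x to i32
    %s = lshr i32 %z, 8
    %t = trunc i32 %s to i8
    ret i8 %t
  }
  ; becomes, approximately:
  ;   %s = lshr i16 %x, 8
  ;   %t = trunc i16 %s to i8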
Chris Lattner
d5d68438c1
optimize "integer extraction out of the middle of a vector" as produced
...
by SRoA. This is part of rdar://7892780, but needs another xform to
expose it.
llvm-svn: 112232
2010-08-26 22:14:59 +00:00
Chris Lattner
19a5dc488b
Optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
...
is a vector into a vector element extraction. This allows clang to
compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
movd %eax, %xmm0
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movd %xmm1, %rax
movd %eax, %xmm1
addss %xmm2, %xmm1
shrq $32, %rax
movd %eax, %xmm0
addss %xmm1, %xmm0
ret
... eliminating half of the horribleness.
llvm-svn: 112227
2010-08-26 21:55:42 +00:00
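An illustrative IR sketch (editorial addition, not from the commit), assuming little-endian layout as on x86-64; names are hypothetical:
  define float @lowelt(<2 x float> %v) {
    %i = bitcast <2 x float> %v to i64
    %t = trunc i64 %i to i32
    %f = bitcast i32 %t to float
    ret float %f
  }
  ; becomes: extractelement <2 x float> %v, i32 0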
Owen Anderson
0217b8e1d9
Tweak per Chris' comments.
...
llvm-svn: 108736
2010-07-19 19:23:32 +00:00
Owen Anderson
8159755faa
Reimplement r108639 in InstCombine rather than DAGCombine.
...
llvm-svn: 108687
2010-07-19 08:09:34 +00:00
Dan Gohman
082d391a46
Fix instcombine's handling of alloca to accept non-i32 types.
...
llvm-svn: 104935
2010-05-28 04:33:04 +00:00
Dan Gohman
a08b225866
Fix a missing newline in debug output.
...
llvm-svn: 104644
2010-05-25 21:50:35 +00:00
Chris Lattner
0b442d35da
Teach instcombine to transform a bitcast/(zext|trunc)/bitcast sequence
...
with a vector input and output into a shuffle vector. This sort of
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.
This fixes rdar://7896024
llvm-svn: 103354
2010-05-08 21:50:26 +00:00
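An illustrative IR sketch (editorial addition, not from the commit), assuming little-endian layout; names are hypothetical:
  define <4 x i32> @widen(<2 x i32> %v) {
    %i = bitcast <2 x i32> %v to i64
    %z = zext i64 %i to i128
    %o = bitcast i128 %z to <4 x i32>
    ret <4 x i32> %o
  }
  ; becomes a shuffle with a zero vector:
  ;   %o = shufflevector <2 x i32> %v, <2 x i32> zeroinitializer,
  ;                      <4 x i32> <i32 0, i32 1, i32 2, i32 3>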
Dan Gohman
8f2ebaf7ec
Say bitcast instead of bitconvert.
...
llvm-svn: 100720
2010-04-07 23:22:42 +00:00
Duncan Sands
1b33dd3c83
There are two ways of checking for a given type, for example isa<PointerType>(T)
...
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
llvm-svn: 96344
2010-02-16 11:11:14 +00:00
Duncan Sands
2acaf3609c
Uniformize the names of type predicates: rather than having isFloatTy and
...
isInteger, we now have isFloatTy and isIntegerTy. Requested by Chris!
llvm-svn: 96223
2010-02-15 16:12:20 +00:00
Chris Lattner
a59eb7c09c
Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect
...
what it does. Enhance it to return false for vector sign extensions
from vector comparisons, which is the idiom used to get a splatted
vector from a vector comparison.
Doing this breaks vector-casts.ll, so add some compensating
transformations to handle the important case they cover without
depending on this canonicalization.
This fixes rdar://7434900, a serious pessimization of vector compares.
llvm-svn: 95855
2010-02-11 06:26:33 +00:00
Dan Gohman
95e0161d1b
LangRef.html says that inttoptr and ptrtoint always use zero-extension
...
when the cast is extending.
llvm-svn: 95046
2010-02-02 01:44:02 +00:00
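An illustrative IR sketch (editorial addition, not from the commit), assuming 32-bit pointers; names are hypothetical:
  define i64 @p2i(i8* %p) {
    %i = ptrtoint i8* %p to i64         ; extending cast: zero-extends
    ret i64 %i
  }
  ; per LangRef, equivalent to:
  ;   %n = ptrtoint i8* %p to i32
  ;   %i = zext i32 %n to i64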
Chris Lattner
91fafbd4e8
Change the canonical form of "cond ? -1 : 0" to be
...
"sext cond" instead of a select. This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on PPC.
llvm-svn: 94339
2010-01-24 00:09:49 +00:00
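An illustrative IR sketch (editorial addition, not from the commit) of the new canonical form; names are hypothetical:
  define i32 @allones(i1 %cond) {
    %sel = select i1 %cond, i32 -1, i32 0
    ret i32 %sel
  }
  ; canonicalizes to: %sext = sext i1 %cond to i32
  ; (matching the existing zext policy: cond ? 1 : 0 -> zext i1 %cond)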