disabled by default. This divides the existing load PRE code into 2 phases:
first it checks that it is safe to move the load to each of the predecessors
where it is unavailable, and then if it is safe, the code is changed to move
the load. Radar 7571861.
llvm-svn: 95007
of objc message send was getting marked arm_apcscc, but the prototype
isn't. This is fine at runtime because objcmsgsend is implemented in
assembly. Only turn a mismatched caller and callee into 'unreachable'
if the callee is a definition.
llvm-svn: 94986
case, instcombine can't zap the invoke for fear of changing the CFG.
However, we have to do something to prevent the next iteration of
instcombine from inserting another store -> undef before the invoke
thereby getting into infinite iteration between dead store elim and
store insertion.
Just zap the callee to null, which will prevent the next iteration
from doing anything.
llvm-svn: 94985
This bug was exposed by my inliner cost changes in r94615, and caused failures
of lencod on most architectures when building with LTO.
This patch fixes lencod and 464.h264ref on x86-64 (and likely others).
llvm-svn: 94858
create a testcase where this matters. The select+load transformation only
occurs when isSafeToLoadUnconditionally is true, and in those situations,
instcombine also changes the underlying objects to be aligned. This seems
like a good idea regardless, and I've verified that it doesn't pessimize
the subsequent realignment.
llvm-svn: 94850
(via APInt &RHSKnownZero = KnownZero, etc) seems dangerous and confusing to me: it
is easy not to notice this, and then wonder why KnownZero/RHSKnownZero changed
underneath you when you modified RHSKnownZero/KnownZero etc. So get rid of this.
No intended functionality change (tested with "make check" + llvm-gcc bootstrap).
llvm-svn: 94802
This was already being done in SSAUpdater::GetValueAtEndOfBlock so I've
just changed SSAUpdater to check for existing PHIs in both places.
llvm-svn: 94690
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.
Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods. The bindings to
other languages still use the ModuleProvider concept. It would probably be
worth some time to update them to follow the C++ more closely, but I don't
intend to do it.
Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.
llvm-svn: 94686
parameter with a default value, instead of just hardcoding it in the
implementation. The limit of MaxLookup = 6 was introduced in r69151 to fix
a performance problem with O(n^2) behavior in instcombine, but the scalarrepl
pass is relying on getUnderlyingObject to go all the way back to an AllocaInst.
Making the limit part of the method signature makes it clear that by default
the result is limited and should help avoid similar problems in the future.
This fixes pr6126.
llvm-svn: 94433
"sext cond" instead of a select. This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.
llvm-svn: 94339
for arbitrary terminators in predecessors, don't assume
it is a conditional or uncond branch. The testcase shows
an example where they can happen with switches.
llvm-svn: 94323
handle the case when we can infer an input to the xor
from all inputs that agree, instead of going into an
infinite loop. Another part of PR6199
llvm-svn: 94321
missing ones are libsupport, libsystem and libvmcore. libvmcore is
currently blocked on bugpoint, which uses EH. Once it stops using
EH, we can switch it off.
This #if 0's out 3 unit tests, because gtest requires RTTI information.
Suggestions welcome on how to fix this.
llvm-svn: 94164
loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.
llvm-svn: 94110
ValueMapper.cpp ends up calling an out of line
__ZNK4llvm12PATypeHolder3getEv, which is a template and llvm-config
determines arbitrarily to use the one in libipo. This sucks, but
keeping the #include is a reasonable workaround.
llvm-svn: 94103
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
llvm-svn: 94061
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.
llvm-svn: 93935
are the same. I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value. Radar 7552893.
llvm-svn: 93848
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.
llvm-svn: 93787
added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).
This is causing LLVM to perform incorrect xforms for code like:
void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl){
double mh, ml;
double c = 134217729.0;
double up, u1, u2, vp, v1, v2;
up = xh*c;
u1 = (xh - up) + up;
u2 = xh - u1;
vp = yh*c;
v1 = (yh - vp) + vp;
v2 = yh - v1;
mh = xh*yh;
ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
ml += xh*yl + xl*yh;
*rhi = mh + ml;
*rlo = (mh - (*rhi)) + ml;
}
The last line was optimized away, but rl is intended to be the difference
between the infinitely precise result of mh + ml and after it has been rounded
to double precision.
llvm-svn: 93369
in JT.
2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go. This allows us to
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.
llvm-svn: 93253
condition is a xor with a phi node. This eliminates nonsense
like this from 176.gcc in several places:
LBB166_84:
testl %eax, %eax
- setne %al
- xorb %cl, %al
- notb %al
- testb $1, %al
- je LBB166_85
+ je LBB166_69
+ jmp LBB166_85
This is rdar://7391699
llvm-svn: 93221
trunc has multiple uses. Codegen is not able to coalesce the subreg case
correctly and so this leads to higher register pressure and spilling (see PR5997).
This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.
llvm-svn: 93200
BitsToClear case. This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216)
and test3 (random extample extracted from a spec benchmark).
clang now compiles the code in PR4216 into:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
orl $32768, %edi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
instead of:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
shrl $8, %edi
orl $128, %edi
shlq $8, %rdi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
which is still not great, but is progress.
llvm-svn: 93145
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant. This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.
llvm-svn: 93144
the zext dest type. This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl %edi, %eax
andl $-25350, %eax
ret
This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.
llvm-svn: 93127