An obfuscated splat is where the frontend poorly generates code for a splat
using several different shuffles to create the splat, i.e.,
%A = load <4 x float>* %in_ptr, align 16
%B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef>
%C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef>
%D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166061 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strcmp and strncmp optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165915 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strchr and strrchr optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165875 91177308-0d34-0410-b5e6-96231b3b80d8
This patch migrates the strcat and strncat optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165874 91177308-0d34-0410-b5e6-96231b3b80d8
This optimization is really just replacing allocas wholesale with
globals, there is no scalarization.
The underlying motivation for this patch is to simplify the SROA pass
and focus it on splitting and promoting allocas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162271 91177308-0d34-0410-b5e6-96231b3b80d8
The previous fix only checked for simple cycles, use a set to catch longer
cycles too.
Drop the broken check from the ObjectSizeOffsetEvaluator. The BoundsChecking
pass doesn't have to deal with invalid IR like InstCombine does.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162120 91177308-0d34-0410-b5e6-96231b3b80d8
- memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is
0-sized memcpy
- as 0-sized memcpy/memset is already removed before SimplifyMemTransfer
and SimplifyMemSet in visitCallInst, replace 0 checking with
assertions.
- replace getZExtValue() with getLimitedValue() according to
Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161923 91177308-0d34-0410-b5e6-96231b3b80d8
An unsigned value converted to floating-point will always be greater than
a negative constant. Unfortunately InstCombine reversed the check so that
unsigned values were being optimized to always be greater than all positive
floating-point constants. <rdar://problem/12029145>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161452 91177308-0d34-0410-b5e6-96231b3b80d8
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160874 91177308-0d34-0410-b5e6-96231b3b80d8
of an array element (rather than at the beginning of the element) and extended
into the next element, then the load from the second element was being handled
wrong due to incorrect updating of the notion of which byte to load next. This
fixes PR13442. Thanks to Chris Smowton for reporting the problem, analyzing it
and providing a fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160711 91177308-0d34-0410-b5e6-96231b3b80d8
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, -1
%and = and i64 %sub, %shr
ret i64 %and
to:
%shr = lshr i64 %key, 3
%0 = load i64* %val, align 8
%sub = add i64 %0, 2305843009213693951
%and = and i64 %sub, %shr
ret i64 %and
The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.
rdar://11793464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160101 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159952 91177308-0d34-0410-b5e6-96231b3b80d8
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer excapes or has a load.
The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159876 91177308-0d34-0410-b5e6-96231b3b80d8
another mechanical change accomplished though the power of terrible Perl
scripts.
I have manually switched some "s to 's to make escaping simpler.
While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.
Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159547 91177308-0d34-0410-b5e6-96231b3b80d8
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.
If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.
Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.
Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159525 91177308-0d34-0410-b5e6-96231b3b80d8
// C - zext(bool) -> bool ? C - 1 : C
if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
if (ZI->getSrcTy()->isIntegerTy(1))
return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);
This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
return (x - y) + 1;
}
=>
movzbl %dil, %eax
movzbl %sil, %ecx
shll $31, %ecx
sarl $31, %ecx
leal 1(%rax,%rcx), %eax
ret
Without the rule, llvm now generates:
movzbl %sil, %ecx
movzbl %dil, %eax
incl %eax
subl %ecx, %eax
ret
It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).
The transformation was done as part of Eli's r75531. He has given the ok to
remove it.
rdar://11748024
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159230 91177308-0d34-0410-b5e6-96231b3b80d8