Commit Graph

17102 Commits

Author SHA1 Message Date
Chandler Carruth
8638d35784 When rewriting the pointer operand to a load or store which has
alignment guarantees attached, re-compute the alignment so that we
consider offsets which impact alignment.

llvm-svn: 164690
2012-09-26 10:45:28 +00:00
Chandler Carruth
0254cf6d85 Teach all of the loads, stores, memsets and memcpys created by the
rewriter in SROA to carry a proper alignment. This involves
interrogating various sources of alignment, etc. This is a more complete
and principled fix to PR13920 as well as related bugs pointed out by Eli
in review and by inspection in the area.

Also by inspection fix the integer and vector promotion paths to create
aligned loads and stores. I still need to work up test cases for
these... Sorry for the delay, they were found purely by inspection.

llvm-svn: 164689
2012-09-26 10:27:46 +00:00
NAKAMURA Takumi
f8c0be6df4 ARM/atomicrmw_minmax.ll: Fix RUN line.
llvm-svn: 164687
2012-09-26 10:12:20 +00:00
Benjamin Kramer
50774721dc Fix tests that didn't test anything.
llvm-svn: 164686
2012-09-26 09:51:39 +00:00
James Molloy
220547e625 Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM.
llvm-svn: 164685
2012-09-26 09:48:32 +00:00
Hans Wennborg
d3f44548bf SimplifyCFG: Make the switch-to-lookup table transformation store the
tables in bitmaps when they fit in a target-legal register.

This saves some space, and it also allows for building tables that would
otherwise be deemed too sparse.

One interesting case that this hits is example 7 from
http://blog.regehr.org/archives/320. We currently generate good code
for this when lowering the switch to the selection DAG: we build a
bitmask to decide whether to jump to one block or the other. My patch
will result in the same bitmask, but it removes the need for the jump,
as the return value can just be retrieved from the mask.

llvm-svn: 164684
2012-09-26 09:44:49 +00:00
NAKAMURA Takumi
af3b653c3e llvm/test/CodeGen/X86/mulx*.ll: Fix copypasto.
llvm-svn: 164681
2012-09-26 09:24:12 +00:00
Michael Liao
4b9634175f Add SARX/SHRX/SHLX code generation support
llvm-svn: 164675
2012-09-26 08:26:25 +00:00
Michael Liao
1a80ec900d Add RORX code generation support
llvm-svn: 164674
2012-09-26 08:24:51 +00:00
Michael Liao
6d6afbf548 Add MULX code generation support
llvm-svn: 164673
2012-09-26 08:22:37 +00:00
Duncan Sands
3e1f8917ba Teach the 'lint' sanity checking pass to detect simple buffer overflows.
llvm-svn: 164671
2012-09-26 07:45:36 +00:00
Chandler Carruth
146e90b6dd Revert the business end of r164636 and try again. I'll come in again. ;]
This should really, really fix PR13916. For real this time. The
underlying bug is... a bit more subtle than I had imagined.

The setup is a code pattern that leads to an @llvm.memcpy call with two
equal pointers to an alloca in the source and dest. Now, not any pattern
will do. The alloca needs to be formed just so, and both pointers should
be wrapped in different bitcasts etc. When this precise pattern hits,
a funny sequence of events transpires. First, we correctly detect the
potential for overlap, and correctly optimize the memcpy. The first
time. However, we do simplify the set of users of the alloca, and that
causes us to run the alloca back through the SROA pass in case there are
knock-on simplifications. At this point, a curious thing has happened.
If we happen to have an i8 alloca, we have direct i8 pointer values. So
we don't bother creating a cast, we rewrite the arguments to the memcpy
to dircetly refer to the alloca.

Now, in an unrelated area of the pass, we have clever logic which
ensures that when visiting each User of a particular pointer derived
from an alloca, we only visit that User once, and directly inspect all
of its operands which refer to that particular pointer value. However,
the mechanism used to detect memcpy's with the potential to overlap
relied upon getting visited once per *Use*, not once per *User*. This is
always true *unless* the same exact value is both source and dest. It
turns out that almost nothing actually produces that pattern though.

We can hand craft test cases that more directly test this behavior of
course, and those are included. Also, note that there is a significant
missed optimization here -- we prove in many cases that there is
a non-volatile memcpy call with identical source and dest addresses. We
shouldn't prevent splitting the alloca in that case, and in fact we
should just remove such memcpy calls eagerly. I'll address that in
a subsequent commit.

llvm-svn: 164669
2012-09-26 07:41:40 +00:00
Bill Wendling
28f1f0139e Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>

llvm-svn: 164662
2012-09-26 06:16:18 +00:00
Bill Wendling
9a7fa167f9 Generate an error message instead of asserting or segfaulting when we have a
scalar-to-vector conversion that we cannot handle. For instance, when an invalid
constraint is used in an inline asm statement.
<rdar://problem/12284092>

llvm-svn: 164657
2012-09-26 04:04:19 +00:00
Nick Lewycky
f44573ac9b Don't drop the alignment on a memcpy intrinsic when producing a store. This is
only a missed optimization opportunity if the store is over-aligned, but a
miscompile if the store's new type has a higher natural alignment than the
memcpy did. Fixes PR13920!

llvm-svn: 164641
2012-09-25 22:46:21 +00:00
Nick Lewycky
0c83ee116c Don't try to promote the same alloca twice. Fixes PR13916!
Chandler, it's not obvious that it's okay that this alloca gets into the list
twice to begin with. Please review and see whether this is the fix you really
want, but I wanted to get a fix checked in quickly.

llvm-svn: 164634
2012-09-25 21:15:50 +00:00
Nick Lewycky
fc5f72e758 Make this test check the transforms it's actually doing. Also add a test that it
doesn't transform the trivially unsafe case.

llvm-svn: 164617
2012-09-25 18:17:38 +00:00
Michael Liao
b50e89ddce Add missing i64 max/min/umax/umin on 32-bit target
- Turn on atomic6432.ll and add specific test case as well

llvm-svn: 164616
2012-09-25 18:08:13 +00:00
Jim Grosbach
97e019d375 ARM: Darwin BL/BLX relocations to out-of-range symbols.
When a BL/BLX references a symbol in the same translation unit that is
out of range, use an external relocation. The linker will use this to
generate a branch island rather than a direct reference, allowing the
relocation to resolve correctly.

rdar://12359919

llvm-svn: 164615
2012-09-25 18:07:17 +00:00
Chandler Carruth
67698842df Fix a case where SROA did not correctly detect dead PHI or selects due
to chains or cycles between PHIs and/or selects. Also add a couple of
really nice test cases reduced from Kostya's reports in PR13905 and
PR13906. Both are fixed by this patch.

llvm-svn: 164596
2012-09-25 10:03:40 +00:00
Duncan Sands
45a436ffcb Change the way the lint sanity checking pass detects misaligned memory accesses.
Previously it was only be able to detect problems if the pointer was a numerical
value (eg inttoptr i32 1 to i32*), but not if it was an alloca or globa.  The
reason was the use of ComputeMaskedBits: imagine you have "alloca i8, align 2",
and ask ComputeMaskedBits what it knows about the bits of the alloca pointer.
It can tell you that the bottom bit is known zero (due to align 2) but it can't
tell you that bit 1 is known one.  That's because the address could be an even
multiple of 2 rather than an odd multiple, eg it might be a multiple of 4.  Thus
trying to use KnownOne is ineffective in the case of an alloca as it will never
have any bits set.  Instead look explicitly for constant offsets from allocas
and globals.

llvm-svn: 164595
2012-09-25 10:00:49 +00:00
Evan Cheng
a5b2ee52c9 Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511
llvm-svn: 164588
2012-09-25 05:32:34 +00:00
Nick Lewycky
7b751873f1 Don't forget that strcpy and friends return a pointer to the destination, so
it's not a dead store if that pointer is used. Whoops!

llvm-svn: 164583
2012-09-25 01:55:59 +00:00
Jim Grosbach
d5ba471995 ARM: 'add Rd, pc, #imm' is an alias for 'adr Rd, #imm'.
rdar://9795790

llvm-svn: 164577
2012-09-25 00:08:13 +00:00
Jim Grosbach
484960af64 Mark jump tables in code sections with DataRegion directives.
Even out-of-line jump tables can be in the code section, so mark them
as data-regions for those targets which support the directives.

rdar://12362871&12362974

llvm-svn: 164571
2012-09-24 23:06:27 +00:00
Nick Lewycky
74f862b349 Teach DSE that strcpy, strncpy, strcat and strncat are all stores which may be
dead.

llvm-svn: 164561
2012-09-24 22:09:10 +00:00
Roman Divacky
87f9c41b1c Specify MachinePointerInfo as refering to the argument value and offset of the
store when handling byval arguments. Thus preventing reordering of the store
with load with post-RA scheduler.

llvm-svn: 164553
2012-09-24 20:47:19 +00:00
Richard Osborne
53dd900512 Add missing : in CHECK line.
llvm-svn: 164540
2012-09-24 17:22:43 +00:00
Richard Osborne
9be8a6db61 Add missing check for presence of target data.
This avoids a crash in visitAllocaInst when target data isn't available.

llvm-svn: 164539
2012-09-24 17:10:03 +00:00
Chandler Carruth
d6a41eacf0 Address one of the original FIXMEs for the new SROA pass by implementing
integer promotion analogous to vector promotion. When there is an
integer alloca being accessed both as its integer type and as a narrower
integer type, promote the narrower access to "insert" and "extract" the
smaller integer from the larger one, and make the integer alloca
a candidate for promotion.

In the new formulation, we don't care about target legal integer or use
thresholds to control things. Instead, we only perform this promotion to
an integer type which the frontend has already emitted a load or store
for. This bounds the scope and prevents optimization passes from
coalescing larger and larger entities into a single integer.

llvm-svn: 164479
2012-09-24 00:34:20 +00:00
Anton Korobeynikov
0457dce9c9 Emit dtors into proper section while compiling in vcpp-compatible mode.
Patch by Kai!

llvm-svn: 164476
2012-09-23 15:53:47 +00:00
Chandler Carruth
9b73630229 Switch to a signed representation for the dynamic offsets while walking
across the uses of the alloca. It's entirely possible for negative
numbers to come up here, and in some rare cases simply doing the 2's
complement arithmetic isn't the correct decision. Notably, we can't zext
the index of the GEP. The definition of GEP is that these offsets are
sign extended or truncated to the size of the pointer, and then wrapping
2's complement arithmetic used.

This patch fixes an issue that comes up with *no* input from the
buildbots or bootstrap afaict. The only place where it manifested,
disturbingly, is Clang's own regression test suite. A reduced and
targeted collection of tests are added to cope with this. Note that I've
tried to pin down the potential cases of overflow, but may have missed
some cases. I've tried to add a few cases to test this, but its hard
because LLVM has quite limited support for >64bit constructs.

llvm-svn: 164475
2012-09-23 11:43:14 +00:00
Nick Lewycky
ac620030a2 Don't do actual work inside an assert statement. Fixes PR11760!
llvm-svn: 164474
2012-09-23 03:58:21 +00:00
Michael Liao
4b489dd743 Revise test to avoid using of 'grep'
llvm-svn: 164472
2012-09-23 02:41:47 +00:00
Michael Liao
240d90ec9b Enhance test case of atomic16 to verify inst encoding fixed in r164453.
llvm-svn: 164465
2012-09-22 21:07:59 +00:00
Tim Northover
7cab153d37 Fix edge cases of ARM shift operands in arith instructions.
As before with load instructions, oddities like "asr #32", "rrx" could
be printed incorrectly.

Patch by Chris Lidbury.

llvm-svn: 164456
2012-09-22 11:18:19 +00:00
Tim Northover
1c60305666 Fix the handling of edge cases in ARM shifted operands.
This patch fixes load/store instructions to handle less common cases
like "asr #32", "rrx" properly throughout the MC layer.

Patch by Chris Lidbury.

llvm-svn: 164455
2012-09-22 11:18:12 +00:00
Chandler Carruth
438cc8b234 Fix a case where the new SROA pass failed to zap dead operands to
selects with a constant condition. This resulted in the operands
remaining live through the SROA rewriter. Most of the time, this just
caused some dead allocas to persist and get zapped by later passes, but
in one case found by Joerg, it caused a crash when we tried to *promote*
the alloca despite it having this dead use. We already have the
mechanisms in place to handle this, just wire select up to them.

llvm-svn: 164427
2012-09-21 23:36:40 +00:00
Benjamin Kramer
baec630d4c LoopIdiom: Give up when the loop is not in canonical form.
We rely on it when doing the transforms. This can happen when there is an
indirectbr in  the loop.

Fixes PR13892.

llvm-svn: 164383
2012-09-21 17:27:23 +00:00
Chad Rosier
a58913fc00 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-aligned i32 loads/stores.
rdar://12304911

llvm-svn: 164381
2012-09-21 16:58:35 +00:00
Benjamin Kramer
3083ebd4de InstCombine: Make sure we use the pre-zext type when creating a constant of a value that is zext'd.
Fixes PR13250.

llvm-svn: 164377
2012-09-21 16:26:41 +00:00
Benjamin Kramer
fd67118ab5 BitcodeReader: Correctly insert blockaddress constant referring to a already parsed function.
We inserted a placeholder that was never replaced because the function was
already visited. Assert that all placeholders have been resolved when tearing
down the bitcode reader.

Fixes PR13895.

llvm-svn: 164369
2012-09-21 14:34:31 +00:00
Alexey Samsonov
ad52a8844e Fix SymbolRef::getAddress implementation for ELF. The 'value' field in symbol table entry should be treated differently for relocatable and relocated files. This patch fixes symbol addresses printed by llvm-nm for executables and shared objects.
llvm-svn: 164365
2012-09-21 07:08:08 +00:00
NAKAMURA Takumi
8c966675e5 llvm/test/CodeGen/X86/pr5145.ll: Tweak expressions to match for darwin target.
.LBB0_1: # Linux
LBB0_1:  # Darwin

llvm-svn: 164362
2012-09-21 05:19:19 +00:00
Michael Liao
2197b133f8 Add missing i8 max/min/umax/umin support
- Fix PR5145 and turn on test 8-bit atomic ops

llvm-svn: 164358
2012-09-21 03:18:52 +00:00
NAKAMURA Takumi
a7a5a7cd7d llvm/test/CodeGen/ARM/fast-isel.ll: Fix possible typos, s/@unaligned_i16_store/@unaligned_i16_load/g.
I guess this had apparently passed in +Asserts possibly due to verborsity.

llvm-svn: 164350
2012-09-21 01:15:05 +00:00
Chad Rosier
d80fb0b13d Testcase does not need to be this strict.
llvm-svn: 164347
2012-09-21 00:47:08 +00:00
Chad Rosier
c15c6508e0 Add newline.
llvm-svn: 164346
2012-09-21 00:43:18 +00:00
Chad Rosier
8a1b0217f6 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-halfword-aligned i16 loads/stores.
rdar://12304911

llvm-svn: 164345
2012-09-21 00:41:42 +00:00
Jim Grosbach
135898ebe3 ARM: Use a dedicated intrinsic for vector bitwise select.
The expression based expansion too often results in IR level optimizations
splitting the intermediate values into separate basic blocks, preventing
the formation of the VBSL instruction as the code author intended. In
particular, LICM would often hoist part of the computation out of a loop.

rdar://11011471

llvm-svn: 164340
2012-09-21 00:18:20 +00:00