Commit Graph

15029 Commits

Author SHA1 Message Date
Hal Finkel
77cfe064a7 add basic PPC register-pressure feedback; adjust the vaarg test to match the new register-allocation pattern
llvm-svn: 145065
2011-11-22 16:21:04 +00:00
Chandler Carruth
59f0abf50e Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad accessor, and to assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062
2011-11-22 13:13:16 +00:00
Rafael Espindola
05f88add77 Add triple to the test.
llvm-svn: 145057
2011-11-22 06:36:25 +00:00
Rafael Espindola
b45f6ea527 If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
2011-11-22 06:27:18 +00:00
Nick Lewycky
566ea855fd Fix crasher in GVN due to my recent capture tracking changes.
llvm-svn: 145047
2011-11-21 19:42:56 +00:00
Craig Topper
866214a486 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
llvm-svn: 145028
2011-11-21 08:26:50 +00:00
Craig Topper
05db995317 Test case for r145026
llvm-svn: 145027
2011-11-21 06:58:09 +00:00
Craig Topper
62ae335144 Make LowerSIGN_EXTEND_INREG split 256-bit vectors when AVX1 is enabled and use AVX2 shifts when AVX2 is enabled.
llvm-svn: 145022
2011-11-21 01:12:36 +00:00
NAKAMURA Takumi
0738f386b4 test/CodeGen/X86/block-placement.ll: Relax expressions for Win32.
llvm-svn: 145011
2011-11-20 12:49:45 +00:00
Chandler Carruth
aac8e5082a The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
Benjamin Kramer
1967d4e20b XFAIL this test until I figure out what indvars is doing here (or find someone who does)
llvm-svn: 145008
2011-11-20 11:10:03 +00:00
Chandler Carruth
fa7964c81f Add some comments to the latest test case I added here to document what
is actually being tested. Also add some FileCheck goodness to much more
carefully ensure that the result is the desired result. Before this test
would only have failed through an assert failure if the underlying fix
were reverted.

Also, add some weight metadata and a comment explaining exactly what is
going on to a trick section of the test case. Originally, we were
getting very unlucky and trying to form a block chain that isn't
actually profitable. I'm working on a fix to avoid forming these
unprofitable chains, and that would also have masked any failure from
this test case. The easy solution is to add some metadata that makes it
*really* profitable to form the bad chain here.

llvm-svn: 145006
2011-11-20 09:30:40 +00:00
Craig Topper
e878c775cf Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine.
llvm-svn: 145005
2011-11-20 00:12:05 +00:00
Craig Topper
6ed413c495 Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled.
llvm-svn: 145004
2011-11-19 22:34:59 +00:00
Chandler Carruth
f24d3f8fc7 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00
Craig Topper
44b06b0096 Test cases for SSSE3/AVX integer horizontal add/sub.
llvm-svn: 144990
2011-11-19 09:03:33 +00:00
Craig Topper
536f9d9434 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
llvm-svn: 144987
2011-11-19 07:07:26 +00:00
Andrew Trick
fe5f7fc3b8 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Nadav Rotem
08f8a75c2c Add AVX2 vpbroadcast support
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Kostya Serebryany
3a83736893 [asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler
llvm-svn: 144962
2011-11-18 01:41:06 +00:00
Devang Patel
a0973b0c53 DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or ad DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Andrew Trick
7dc21d8c0e Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Kostya Serebryany
3b8d362511 fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
2011-11-17 23:14:59 +00:00
Chad Rosier
2673f8862f When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
Eli Friedman
d02d82d355 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Daniel Dunbar
4affd889a2 build/make/test: Get rid of unused BUGPOINT_TOPTS variable.
llvm-svn: 144864
2011-11-16 23:56:03 +00:00
Eli Friedman
51adc2ea5a Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Jim Grosbach
1b837af2bd Remove obsolete test.
The PLD encoding is checked via the .s file now.

llvm-svn: 144853
2011-11-16 22:50:38 +00:00
Jim Grosbach
fe5f0cfa29 Generalize the fixup info for ARM mode.
We don't (yet) have the granularity in the fixups to be specific about which
bitranges are affected. That's a future cleanup, but we're not there yet.

llvm-svn: 144852
2011-11-16 22:48:37 +00:00
Jim Grosbach
8fae277866 Update test for r144842.
llvm-svn: 144851
2011-11-16 22:46:27 +00:00
Evan Cheng
5bae2333cb Another missing X86ISD::MOVLPD pattern. rdar://10450317
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Evan Cheng
5242b6aaa1 Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Nick Lewycky
29efc8f15d Fix typo in test.
llvm-svn: 144774
2011-11-16 03:56:38 +00:00
Nick Lewycky
ff690249a9 Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Eli Friedman
c2a46f1a90 Fix testcase.
llvm-svn: 144769
2011-11-16 03:03:52 +00:00
Eli Friedman
1f3d774ba4 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Kostya Serebryany
4105068ea9 AddressSanitizer, first commit (compiler module only)
llvm-svn: 144758
2011-11-16 01:35:23 +00:00
Andrew Trick
fe618116fc Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Jim Grosbach
044acb8bee ARM assembly parsing for register range syntax for VLD/VST register lists.
For example,
vld1.f64 {d2-d5}, [r2,:128]!

Should be equivalent to:
vld1.f64 {d2,d3,d4,d5}, [r2,:128]!

It's not documented syntax in the ARM ARM, but it is consistent with what's
accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to
support.

rdar://10451128

llvm-svn: 144727
2011-11-15 23:19:15 +00:00
Nadav Rotem
63be4a26a9 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
llvm-svn: 144720
2011-11-15 22:50:37 +00:00
NAKAMURA Takumi
f99c0f0fcd test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64.
llvm-svn: 144714
2011-11-15 22:30:37 +00:00
Jim Grosbach
b8ebc386df ARM assembly parsing two operand forms for shift instructions.
llvm-svn: 144713
2011-11-15 22:27:54 +00:00
Pete Cooper
8441c08e0b Added custom lowering for load->dec->store sequence in x86 when the EFLAGS registers is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Jim Grosbach
4d0ad5a4e0 ARM alternate size suffices for VTRN instructions.
rdar://10435076

llvm-svn: 144694
2011-11-15 20:49:46 +00:00
Jim Grosbach
8987b277cb ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns.
Yet more of rdar://10435076.

llvm-svn: 144691
2011-11-15 20:29:42 +00:00
Jim Grosbach
f0690cd90c ARM assembly parsing for two-operand form of 'mul' instruction.
rdar://10449856.

llvm-svn: 144689
2011-11-15 20:14:51 +00:00
Jim Grosbach
8b1d4c989c ARM assembly parsing for two-operand form of 'mul' instruction.
Ongoing rdar://10435114.

llvm-svn: 144688
2011-11-15 20:02:06 +00:00
Jim Grosbach
e4933acaa7 Testcase for r144684.
llvm-svn: 144685
2011-11-15 19:56:17 +00:00
Owen Anderson
35f049f1fb Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32.
llvm-svn: 144683
2011-11-15 19:55:00 +00:00
Jim Grosbach
df951fa128 Thumb2 assembly parsing for mul.w in IT block fix.
When the 3rd operand is not a low-register, and the first two operands are
the same low register, the parser was incorrectly trying to use the 16-bit
instruction encoding.

rdar://10449281

llvm-svn: 144679
2011-11-15 19:29:45 +00:00