Commit Graph

45302 Commits

Author SHA1 Message Date
Nick Lewycky
41fa5796ca There are no vectors of pointer or arrays, so we don't need to check vector
elements for type equivalence.

llvm-svn: 124284
2011-01-26 08:50:18 +00:00
Duncan Sands
803522ec6f APInt has a method for determining whether a number is a power of 2
which is more efficient than countPopulation - use it.

llvm-svn: 124283
2011-01-26 08:44:16 +00:00
Nick Lewycky
fc7a74c9a0 Fix memory corruption. If one of the SCEV creation functions calls another but
doesn't return immediately after then the insert position in UniqueSCEVs will
be out of date. No test because this is a memory corruption issue. Fixes PR9051!

llvm-svn: 124282
2011-01-26 08:40:22 +00:00
Eric Christopher
cb32adbd3f Separate out the constant bonus from the size reduction metrics. Rework
a few loops accordingly. Should be no functional change.

This is a step for more accurate cost/benefit analysis of devirt/inlining
bonuses.

llvm-svn: 124275
2011-01-26 02:58:39 +00:00
Bill Wendling
a4abd1fc4b Add needed braces.
llvm-svn: 124273
2011-01-26 02:06:22 +00:00
NAKAMURA Takumi
8ace7260cc Target/X86: Tweak win64's tailcall.
llvm-svn: 124272
2011-01-26 02:04:09 +00:00
NAKAMURA Takumi
066378440a Fix whitespace.
llvm-svn: 124270
2011-01-26 02:03:37 +00:00
NAKAMURA Takumi
a3d094f248 lib/Target/X86/X86RegisterInfo.cpp: Fix whitespace.
llvm-svn: 124268
2011-01-26 01:28:06 +00:00
NAKAMURA Takumi
1557668f7b lib/Target/X86/X86RegisterInfo.cpp: Fix a typo in comment.
llvm-svn: 124267
2011-01-26 01:27:58 +00:00
Eric Christopher
45e584b1b7 Coding style formatting changes.
llvm-svn: 124260
2011-01-26 01:09:59 +00:00
Jakob Stoklund Olesen
5c0fcc03af Rename member variables to follow the rest of LLVM.
No functional change.

llvm-svn: 124257
2011-01-26 00:50:53 +00:00
Devang Patel
134e5b7679 Provide an interface to transfer SDDbgValue from one SDNode to another.
llvm-svn: 124245
2011-01-25 23:27:42 +00:00
Bill Wendling
69916a2e10 Revert 124230. It was causing test failures.
llvm-svn: 124233
2011-01-25 21:48:36 +00:00
Bill Wendling
80dd6f7494 The floating point value is encoded in its binary form as an Imm. Convert it
appropriately so that it prints out the decimal representation.

llvm-svn: 124230
2011-01-25 21:27:46 +00:00
Bill Wendling
195d8c3988 Add support for parsing a Real value. It stores the Real value as its binary
encoding. It's up to the individual back-ends to convert it to their preferred
representation when printing.

llvm-svn: 124229
2011-01-25 21:26:41 +00:00
Rafael Espindola
29e8317caa Move unnamed_addr after the function arguments on Sabre's request.
llvm-svn: 124209
2011-01-25 19:09:56 +00:00
Devang Patel
fce915414e Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124203
2011-01-25 18:09:58 +00:00
Devang Patel
e1d739cd64 This assertion is too restrictive, it does not apply for dangling dbg value nodes (nodes where dbg.value intrinsic preceds use of the value).
llvm-svn: 124202
2011-01-25 18:09:33 +00:00
Duncan Sands
017a3d76f7 In which I discover that zero+zero is zero, d'oh!
llvm-svn: 124188
2011-01-25 15:14:15 +00:00
Duncan Sands
4d8a541ae2 See if this fixes llvm-gcc bootstrap.
llvm-svn: 124184
2011-01-25 12:15:09 +00:00
Duncan Sands
92b081bd42 According to my auto-simplifier the most common missed simplifications in
optimized code are:
  (non-negative number)+(power-of-two) != 0 -> true
and
  (x | 1) != 0 -> true
Instcombine knows about the second one of course, but only does it if X|1
has only one use.  These fire thousands of times in the testsuite.

llvm-svn: 124183
2011-01-25 09:38:29 +00:00
Nick Lewycky
b20b284b35 Teach mergefunc how to emit aliases safely again -- but keep it turned it off
for now. It's controlled by the HasGlobalAliases variable which is not attached
to any flag yet.

llvm-svn: 124182
2011-01-25 08:56:50 +00:00
Eric Christopher
8c76a85e48 Reorganize this so that the early exit and special cases come early
rather than interspersed. No functional change.

llvm-svn: 124168
2011-01-25 01:34:31 +00:00
Evan Cheng
8e47a6e196 Don't merge restore with tail call instruction.
llvm-svn: 124167
2011-01-25 01:28:33 +00:00
Anton Korobeynikov
964b274578 Provide correct registers for EH stuff on ARM
llvm-svn: 124151
2011-01-24 22:38:45 +00:00
Anton Korobeynikov
febd3ec17f Support printing exception section into the current one. This is the case when LSDASection is blank
llvm-svn: 124150
2011-01-24 22:38:40 +00:00
Devang Patel
431a9b9c2f Speculatively revert r124138.
llvm-svn: 124142
2011-01-24 20:04:37 +00:00
Devang Patel
5ccc4e884c Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124138
2011-01-24 19:24:37 +00:00
Andrew Trick
977497dbf1 Temporarily workaround JM/lencod miscompile (SIGSEGV).
rdar://problem/8893967

llvm-svn: 124137
2011-01-24 19:08:15 +00:00
Dan Gohman
db0dc19c04 Give GetUnderlyingObject a TargetData, to keep it in sync
with BasicAA's DecomposeGEPExpression, which recently began
using a TargetData. This fixes PR8968, though the testcase
is awkward to reduce.

Also, update several off GetUnderlyingObject's users
which happen to have a TargetData handy to pass it in.

llvm-svn: 124134
2011-01-24 18:53:32 +00:00
Chris Lattner
1a73dcfbdd fix PR8928 by clearing a stale map, patch by Jakub Staszak!
llvm-svn: 124132
2011-01-24 18:36:51 +00:00
Rafael Espindola
ead58a5259 Handle strings in section names the same way as gas:
* If the name is a single string, we remove the quotes
* If the name starts without a quote, we include any quotes in the name

llvm-svn: 124127
2011-01-24 18:02:54 +00:00
Dan Gohman
308677c046 Add a comment.
llvm-svn: 124126
2011-01-24 17:54:18 +00:00
Daniel Dunbar
652bc3b0a3 Support/CommandLine: Fix LookupNearestOption to also search extra option names.
llvm-svn: 124124
2011-01-24 17:27:17 +00:00
Chris Lattner
9ba0a83f2b fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey!
llvm-svn: 124102
2011-01-24 03:42:46 +00:00
Chris Lattner
f33b65080c fix PR9017, a bug where we'd assert when promoting in unreachable
code.

llvm-svn: 124100
2011-01-24 03:29:07 +00:00
Chris Lattner
077cdfcadb fix PR9015, a crash linking recursive metadata.
llvm-svn: 124099
2011-01-24 03:18:24 +00:00
Chris Lattner
c025bec9e1 this isn't a memset, we do convert dest[i] to one though :)
llvm-svn: 124097
2011-01-24 02:32:00 +00:00
Chris Lattner
1f9ed3b437 with recent work, we now optimize this into:
define i32 @foo(i32 %x) nounwind readnone ssp {
entry:
  %tobool = icmp eq i32 %x, 0
  %tmp5 = select i1 %tobool, i32 2, i32 1
  ret i32 %tmp5
}

llvm-svn: 124091
2011-01-24 01:12:18 +00:00
Chris Lattner
3609e5afb4 enhance SRoA to promote allocas that are used by PHI nodes. This often
occurs because instcombine sinks loads and inserts phis.  This kicks in 
on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in
spec2006 in some app that uses std::deque.

This resolves the last of rdar://7339113.

llvm-svn: 124090
2011-01-24 01:07:11 +00:00
Chris Lattner
50288f8e54 Enhance SRoA to promote allocas that are used by selects in some
common cases.  This triggers a surprising number of times in SPEC2K6
because min/max idioms end up doing this.  For example, code from the
STL ends up looking like this to SRoA:

  %202 = load i64* %__old_size, align 8, !tbaa !3
  %203 = load i64* %__old_size, align 8, !tbaa !3
  %204 = load i64* %__n, align 8, !tbaa !3
  %205 = icmp ult i64 %203, %204
  %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size
  %206 = load i64* %storemerge.i, align 8, !tbaa !3

We can now promote both the __n and the __old_size allocas.

This addresses another chunk of rdar://7339113, poor codegen on
stringswitch.

llvm-svn: 124088
2011-01-23 22:04:55 +00:00
Chris Lattner
0277fe2039 teach Value::isDereferenceablePointer that byval arguments are always
dereferencable, noticed by inspection.

llvm-svn: 124085
2011-01-23 21:15:29 +00:00
Anders Carlsson
ba7114ef4c Add a memset loop that LoopIdiomRecognize doesn't recognize.
llvm-svn: 124082
2011-01-23 20:31:00 +00:00
Nick Lewycky
13a2b8281f Simplify some code with no functionality change. Make the test a lot more
robust against smarter optimizations, using the power of FileCheck.

llvm-svn: 124081
2011-01-23 20:06:05 +00:00
Rafael Espindola
47bc703c55 Initialize MCNoExecStack.
llvm-svn: 124079
2011-01-23 18:50:12 +00:00
Rafael Espindola
547873da60 Add support for the --noexecstack option.
llvm-svn: 124077
2011-01-23 17:55:27 +00:00
Ted Kremenek
880c19c032 Null initialize a few variables flagged by
clang's -Wuninitialized-experimental warning.
While these don't look like real bugs, clang's
-Wuninitialized-experimental analysis is stricter
than GCC's, and these fixes have the benefit
of being general nice cleanups.

llvm-svn: 124073
2011-01-23 17:05:06 +00:00
Rafael Espindola
14333be1af Add support for lowercase variants.
llvm-svn: 124071
2011-01-23 16:11:25 +00:00
Chris Lattner
2ef605b499 Enhance SRoA to be more aggressive about scalarization of aggregate allocas
that have PHI or select uses of their element pointers.  This can often happen
when instcombine sinks two loads into a successor, inserting a phi or select.

With this patch, we can scalarize the alloca, but the pinned elements are not
yet promoted.  This is still a win for large aggregates where only one element
is used.  This fixes rdar://8904039 and part of rdar://7339113 (poor codegen
on stringswitch).

llvm-svn: 124070
2011-01-23 08:27:54 +00:00
Cameron Zwarich
3295a3b4db Convert two std::vectors to SmallVectors for a 3.4% speedup running -scalarrepl
on test-suite + SPEC2000 & SPEC2006.

llvm-svn: 124068
2011-01-23 08:03:04 +00:00
Chris Lattner
2e57722d9d have AllocaInfo store the alloca being inspected, simplifying callers.
No functionality change.

llvm-svn: 124067
2011-01-23 07:29:29 +00:00
Chris Lattner
2064a3e213 Rearrange some code a bit. Change MarkUnsafe to
handle the "Transformation preventing inst" printing, 
so that -scalarrepl -debug will always print the rejected
instruction.  No functionality change.

llvm-svn: 124066
2011-01-23 07:05:44 +00:00
Chris Lattner
ba66871643 remove an old hack that avoided creating MMX datatypes. The
X86 backend has been fixed.

llvm-svn: 124064
2011-01-23 06:40:33 +00:00
Nick Lewycky
2503c9f9c8 Use value ranges to fold ext(trunc) in SCEV when possible.
llvm-svn: 124062
2011-01-23 06:20:19 +00:00
Rafael Espindola
aefd549139 Delay the creation of eh_frame so that the user can change the defaults.
Add support for SHT_X86_64_UNWIND.

llvm-svn: 124059
2011-01-23 05:43:40 +00:00
Rafael Espindola
492ad6ca06 Remove more duplicated code.
llvm-svn: 124056
2011-01-23 04:43:11 +00:00
Rafael Espindola
59c1246cee Remove duplicated code.
llvm-svn: 124054
2011-01-23 04:28:49 +00:00
Nick Lewycky
4440e5815b Have SCEV turn sext(x) into zext(x) when x is s>= 0. This applies many times in
"make check" alone.

llvm-svn: 124046
2011-01-22 22:06:21 +00:00
Eric Christopher
ee9652eceb Add a FIXME explaining the move to a single indirect call bonus per function
that we can change from indirect to direct.

llvm-svn: 124045
2011-01-22 21:56:53 +00:00
Eric Christopher
5da99702ec Only apply the devirtualization bonus once instead of per-call site in the
target function.

Fixes part of rdar://8546196

llvm-svn: 124044
2011-01-22 21:17:33 +00:00
Venkatraman Govindaraju
66369057ae Pass sret arguments through the stack instead of through registers in Sparc backend. It makes the code generated more compliant with the sparc32 ABI.
llvm-svn: 124030
2011-01-22 13:05:16 +00:00
Venkatraman Govindaraju
3c37c9914f Added ICC, FCC as uses of movcc instruction to generate correct code when -mattr=v9 is used.
llvm-svn: 124027
2011-01-22 11:36:24 +00:00
Dan Gohman
a9724ada65 Actually check memcpy lengths, instead of just commenting about
how they should be checked.

llvm-svn: 123999
2011-01-21 22:07:57 +00:00
Venkatraman Govindaraju
c20beed917 Sparc backend:
Rename FLUSH to FLUSHW.
 Output "ta 3" instead of a "flushw" instruction if v8 instruction set is used.

llvm-svn: 123997
2011-01-21 22:00:00 +00:00
Owen Anderson
6e3425c7e0 Just because we have determined that an (fcmp | fcmp) is true for A < B,
A == B, and A > B, does not mean we can fold it to true.  We still need to
check for A ? B (A unordered B).

llvm-svn: 123993
2011-01-21 19:39:42 +00:00
Evan Cheng
0dfe28a9b5 Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.

llvm-svn: 123991
2011-01-21 18:55:51 +00:00
Renato Golin
cf89d692dc Clang was not parsing target triples involving EABI and was generating wrong IR (wrong PCS) and passing the wrong information down llc via the target-triple printed in IR. I've fixed this by adding the parsing of EABI into LLVM's Triple class and using it to choose the correct PCS in Clang's Tools. A Clang patch is on its way to use this infrastructure.
llvm-svn: 123990
2011-01-21 18:25:47 +00:00
Oscar Fuentes
da69ba084d Handles libffi on the CMake build.
Patch by arrowdodger!

llvm-svn: 123976
2011-01-21 15:42:54 +00:00
Bruno Cardoso Lopes
2f96371a7a Fix the encoding of QADD/SUB, QDADD/SUB. While qadd16, qadd8 use "rd, rn, rm",
qadd and qdadd uses "rd, rm, rn", the same applies to the 'sub' variants. This
is described in ARM manuals and matches the encoding used by the gnu assembler.

llvm-svn: 123975
2011-01-21 14:07:40 +00:00
Venkatraman Govindaraju
6a083c355b Implement support for byval arguments in Sparc backend.
llvm-svn: 123974
2011-01-21 14:00:01 +00:00
Nick Lewycky
5de10fb201 SCCP doesn't actually preserve the CFG. It will delete and insert terminator
instructions.

llvm-svn: 123973
2011-01-21 08:38:09 +00:00
Andrew Trick
e0bccb5f87 Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).

llvm-svn: 123971
2011-01-21 06:19:05 +00:00
Andrew Trick
7155e98904 Convert -enable-sched-cycles and -enable-sched-hazard to -disable
flags. They are still not enable in this revision.

Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.

Generalized unit tests to work with sched-cycles.

llvm-svn: 123969
2011-01-21 05:51:33 +00:00
Chris Lattner
f225708ef1 fix PR9013, an infinite loop in instcombine.
llvm-svn: 123968
2011-01-21 05:29:50 +00:00
Chris Lattner
9a4cefc8ee update obsolete comment.
llvm-svn: 123965
2011-01-21 05:08:26 +00:00
Nick Lewycky
c4300debc2 Don't try to pull vector bitcasts that change the number of elements through
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.

llvm-svn: 123963
2011-01-21 02:30:43 +00:00
Michael J. Spencer
e4fe76fa22 Object: Fix type punned pointer issues by making DataRefImpl a union and using intptr_t.
llvm-svn: 123962
2011-01-21 02:27:02 +00:00
Nick Lewycky
50f86f3414 Add a constant folding of casts from zero to zero. Fixes PR9011!
While here, I'd like to complain about how vector is not an aggregate type
according to llvm::Type::isAggregateType(), but they're listed under aggregate
types in the LangRef and zero vectors are stored as ConstantAggregateZero.

llvm-svn: 123956
2011-01-21 01:12:09 +00:00
Evan Cheng
52fe62c996 Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative
value, the "add pc" must be CSE'ed at the same time. We could follow the same
approach as T2 by adding pseudo instructions that combine the ldr + "add pc".
But the better approach is to use movw + movt (which I will enable soon), so
I'll leave this as a TODO.

llvm-svn: 123949
2011-01-20 23:55:07 +00:00
Tobias Grosser
ea8985cc25 Implement requiredTransitive
The PassManager did not implement the transitivity of requiredTransitive. This
was unnoticed since 2006.

llvm-svn: 123942
2011-01-20 21:03:22 +00:00
Bruno Cardoso Lopes
6aeb2e320f Fix the encoding and parsing of clrex instruction
llvm-svn: 123936
2011-01-20 19:18:32 +00:00
Bruno Cardoso Lopes
c0f87c11d6 Change instruction names for consistency
llvm-svn: 123930
2011-01-20 18:36:07 +00:00
Bruno Cardoso Lopes
5f06c0aa3b Add cdp/cdp2 instructions for thumb/thumb2
llvm-svn: 123929
2011-01-20 18:32:09 +00:00
Bruno Cardoso Lopes
3584c02d83 - Use a more appropriate name for Owen's ARM Parser isMCR hack since the same operands can be present
in cdp/cdp2 instructions. Also increase the hack with cdp/cdp2 instructions.
- Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2 yet) and add testcases for t
hem.

llvm-svn: 123927
2011-01-20 18:06:58 +00:00
Jakob Stoklund Olesen
60743dd45f SplitKit requires that all defs are in place before calling useIntv().
The value mapping gets confused about which original values have multiple new
definitions so they may need phi insertions.

This could probably be simplified by letting enterIntvBefore() take a live range
to be added following the instruction. As long as the range stays inside the
same basic block, value mapping shouldn't be a problem.

llvm-svn: 123926
2011-01-20 17:45:23 +00:00
Jakob Stoklund Olesen
360b0921ac Add LiveIntervalMap::dumpCache() to print out the cache used by the ssa update algorithm.
llvm-svn: 123925
2011-01-20 17:45:20 +00:00
Bruno Cardoso Lopes
75712e8a7a Add mcr*2 and mr*c2 support to thumb2 targets
llvm-svn: 123919
2011-01-20 16:58:48 +00:00
Bruno Cardoso Lopes
f377d1721e Add mcr* and mr*c support to thumb targets
llvm-svn: 123917
2011-01-20 16:35:57 +00:00
Kalle Raiskila
070fb5e54d Allow sign-extending of i8 and i16 to i128 on SPU.
llvm-svn: 123912
2011-01-20 15:49:06 +00:00
Duncan Sands
1faa8712c9 At -O123 the early-cse pass is run before instcombine has run. According to my
auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0.
This patch adds this transform and some related logic to InstructionSimplify
and removes some of the logic from instcombine (unfortunately not all because
there are several situations in which instcombine can improve things by making
new instructions, whereas instsimplify is not allowed to do this).  At -O2 this
often results in more than 15% more simplifications by early-cse, and results in
hundreds of lines of bitcode being eliminated from the testsuite.  I did see some
small negative effects in the testsuite, for example a few additional instructions
in three programs.  One program, 483.xalancbmk, got an additional 35 instructions,
which seems to be due to a function getting an additional instruction and then
being inlined all over the place.

llvm-svn: 123911
2011-01-20 13:21:55 +00:00
Bruno Cardoso Lopes
0312bb222e Refactor mcr* and mr*c instructions into classes with the same encoding. No functionality change.
llvm-svn: 123910
2011-01-20 13:17:59 +00:00
Eric Christopher
d17b5b7988 My editor's indent went crazy. Fix.
llvm-svn: 123909
2011-01-20 08:56:34 +00:00
Eric Christopher
f7579ff174 Expand invalid return values for umulo and smulo. Handle these similarly
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.

Fixes the 64-bit part of rdar://8622122 and rdar://8774702.

llvm-svn: 123908
2011-01-20 08:54:28 +00:00
Evan Cheng
d9fdc9771e Correct itinerary entry for t2MOV_pic_ga_add_pc.
llvm-svn: 123907
2011-01-20 08:43:03 +00:00
Evan Cheng
6dc21c7358 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.

llvm-svn: 123905
2011-01-20 08:34:58 +00:00
Michael J. Spencer
d54bad63f7 Object: Add ELF support.
llvm-svn: 123896
2011-01-20 06:38:47 +00:00
Michael J. Spencer
d4d4ab7712 Object: Add COFF Support.
llvm-svn: 123895
2011-01-20 06:38:34 +00:00
Andrew Trick
bf079d8831 Selection DAG scheduler register pressure heuristic fixes.
Added a check for already live regs before claiming HighRegPressure.
Fixed a few cases of checking the wrong number of successors.
Added some tracing until these heuristics are better understood.

llvm-svn: 123892
2011-01-20 06:21:59 +00:00
Jakob Stoklund Olesen
ea33059ff5 Check that a live range exists before shortening it. This fixes PR8989.
The live range may have been deleted earlier because of rematerialization.

llvm-svn: 123891
2011-01-20 06:20:02 +00:00
Jakob Stoklund Olesen
bb94da29b2 Add hidden -verify-coalescing to run the machine code verifier before and after
register coalescing.

llvm-svn: 123890
2011-01-20 06:20:00 +00:00
Venkatraman Govindaraju
5280b2876f Sparc backend: Implements a delay slot filler that attempt to fill delay slots
with useful instructions.

llvm-svn: 123884
2011-01-20 05:08:26 +00:00
Cameron Zwarich
bf70725ae3 Update a comment.
llvm-svn: 123879
2011-01-20 03:58:43 +00:00
Jakob Stoklund Olesen
c387993232 Fix bug found by new clang warning.
llvm-svn: 123872
2011-01-20 02:43:19 +00:00
Eric Christopher
58f8058502 Use only one API at a time.
llvm-svn: 123866
2011-01-20 01:29:23 +00:00
Eric Christopher
1b0e5debb4 If we can, lower the multiply part of a umulo/smulo call to a libcall
with an invalid type then split the result and perform the overflow check
normally.

Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.

llvm-svn: 123864
2011-01-20 00:29:24 +00:00
Devang Patel
729c5e59af Fix debug info for merged global.
llvm-svn: 123862
2011-01-20 00:02:16 +00:00
Jakob Stoklund Olesen
69294ae8d7 Divert Hopfield network debug output. It is very noisy.
llvm-svn: 123859
2011-01-19 23:14:59 +00:00
Jakob Stoklund Olesen
c47bd85657 Don't accidentally leave small gaps in the live ranges when leaving the active
interval after an instruction. The leaveIntvAfter() method only adds liveness
from the instruction's boundary index to the inserted copy.

Ideally, SplitKit should be smarter about this, perhaps by combining useIntv()
and leaveIntvAfter() into one method that guarantees continuity.

llvm-svn: 123858
2011-01-19 23:14:56 +00:00
Jim Grosbach
868d2877dc Make sure to propogate the error code when we fail to parse a modifier.
llvm-svn: 123857
2011-01-19 23:06:07 +00:00
Devang Patel
574e10fa1e Fix register address expression. Patch by Ken Dyck.
llvm-svn: 123856
2011-01-19 23:04:47 +00:00
Jakob Stoklund Olesen
77738dd84e Implement RAGreedy::splitAroundRegion and remove loop splitting.
Region splitting includes loop splitting as a subset, and it is more generic.
The splitting heuristics for variables that are live in more than one block are
now:

1. Try to create a region that covers multiple basic blocks.
2. Try to create a new live range for each block with multiple uses.
3. Spill.

Steps 2 and 3 are similar to what the standard spiller is doing.

llvm-svn: 123853
2011-01-19 22:11:48 +00:00
Nick Lewycky
51c13384f5 Similarly, analyze truncate through multiply.
llvm-svn: 123842
2011-01-19 18:56:00 +00:00
Nick Lewycky
9867e58096 Add a missed SCEV fold that is required to continue analyzing the IR produced
by indvars through the scev expander.

trunc(add x, y) --> add(trunc x, y). Currently SCEV largely folds the other way
which is probably wrong, but preserved to minimize churn. Instcombine doesn't
do this fold either, demonstrating a missed optz'n opportunity on code doing
add+trunc+add.

llvm-svn: 123838
2011-01-19 16:59:46 +00:00
Bruno Cardoso Lopes
0f7a30b1cb Fix the encoding of mrrc and mcrr family of instructions. Also add testcases for mcr and mrc
llvm-svn: 123837
2011-01-19 16:56:52 +00:00
Rafael Espindola
ce499efe1d Add unnamed_addr when we can show that address of a global is not used.
llvm-svn: 123834
2011-01-19 16:32:21 +00:00
Nick Lewycky
5a538b62ca Add a missing SCEV simplification sext(zext x) --> zext x.
llvm-svn: 123832
2011-01-19 15:56:12 +00:00
Daniel Dunbar
dd1dd698f0 ARM/ISel: Factor out isScaledConstantInRange() helper.
llvm-svn: 123823
2011-01-19 15:12:16 +00:00
Andrew Trick
305613b87a For ARM subtargets with useNEONForSinglePrecisionFP, double count uses
of the floating point types less than 64-bits. It's somewhat of a temporary
hack but forces more accurate modeling of register pressure and results
in fewer spills.

llvm-svn: 123811
2011-01-19 02:35:27 +00:00
Andrew Trick
cf6999ed86 whitespace
llvm-svn: 123810
2011-01-19 02:26:13 +00:00
Evan Cheng
7e2b414953 Don't forget to emit the load from indirect symbol when using movw + movt to materialize GA indirect symbols.
llvm-svn: 123809
2011-01-19 02:16:49 +00:00
Bruno Cardoso Lopes
e0f8fee637 Create two new generic classes to represent the following VMRS/VMSR variations:
vmrs  reg, fpexc
vmrs  reg, fpsid
vmsr  fpexc, reg
vmsr  fpsid, reg

llvm-svn: 123783
2011-01-18 21:58:20 +00:00
Bruno Cardoso Lopes
82c6fe3dfe Fix MRS encoding for arm and thumb.
llvm-svn: 123778
2011-01-18 21:31:35 +00:00
Bruno Cardoso Lopes
6e4c5af01e Fix the encoding of t2ISB by using the right class and also parse it correctly
llvm-svn: 123776
2011-01-18 21:17:09 +00:00
Dan Gohman
df668227fb Teach BasicAA to return PartialAlias in cases where both pointers
are pointing to the same object, one pointer is accessing the entire
object, and the other is access has a non-zero size. This prevents
TBAA from kicking in and saying NoAlias in such cases.

llvm-svn: 123775
2011-01-18 21:16:06 +00:00
Jakob Stoklund Olesen
c0ff5356d4 Add RAGreedy methods for splitting live ranges around regions.
Analyze the live range's behavior entering and leaving basic blocks. Compute an
interference pattern for each allocation candidate, and use SpillPlacement to
find an optimal region where that register can be live.

This code is still not enabled.

llvm-svn: 123774
2011-01-18 21:13:27 +00:00
Bruno Cardoso Lopes
c1e21b06b9 Follow the current hack set and enable the correct parsing of bkpt while in thumb mode.
llvm-svn: 123772
2011-01-18 20:55:11 +00:00
Chris Lattner
4832a9d32c fix rdar://8878965, a regression I introduced with the recent
llvm.objectsize changes.

llvm-svn: 123771
2011-01-18 20:53:04 +00:00
Bruno Cardoso Lopes
94247155c4 Add support for parsing and encoding ARM's official syntax for the BFI instruction
llvm-svn: 123770
2011-01-18 20:45:56 +00:00
Jim Grosbach
6de3a4f76f Add a FIXME.
llvm-svn: 123769
2011-01-18 19:59:19 +00:00
Bruno Cardoso Lopes
13cefe452d Ensure Mips::GP is properly reloaded after a function call. Patch by Sasa Stankovic
llvm-svn: 123768
2011-01-18 19:50:18 +00:00
Bruno Cardoso Lopes
2d509b8b40 Negative zero is not legal on mips. Patch by Sasa Stankovic
llvm-svn: 123766
2011-01-18 19:41:41 +00:00
Bruno Cardoso Lopes
542853bcd7 Handle (i32,i32) => f64 in a cleaner way. Patch by Sasa Stankovic
llvm-svn: 123763
2011-01-18 19:38:25 +00:00
Bruno Cardoso Lopes
6c5db0236a Add support for mips32 madd and msub instructions. Patch by Akira Hatanaka
llvm-svn: 123760
2011-01-18 19:29:17 +00:00
Duncan Sands
732cb58b61 For completeness, generalize the (X + Y) - Y -> X transform and add X - (X + 1) -> -1.
These were not recommended by my auto-simplifier since they don't fire often enough.
However they do fire from time to time, for example they remove one subtraction from
the final bitcode for 483.xalancbmk.

llvm-svn: 123755
2011-01-18 11:50:19 +00:00
Duncan Sands
2abe6f500f Simplify (X<<1)-X into X. According to my auto-simplier this is the most common missed
simplification in fully optimized code.  It occurs sporadically in the testsuite, and
many times in 403.gcc: the final bitcode has 131 fewer subtractions after this change.
The reason that the multiplies are not eliminated is the same reason that instcombine
did not catch this: they are used by other instructions (instcombine catches this with
a more general transform which in general is only profitable if the operands have only
one use).

llvm-svn: 123754
2011-01-18 09:24:58 +00:00
Chris Lattner
08e1bf567f add a note
llvm-svn: 123752
2011-01-18 07:47:48 +00:00
Venkatraman Govindaraju
ecf49c6279 SPARC backend: Modified LowerCall and LowerFormalArguments so that they use CallingConv assignments.
llvm-svn: 123749
2011-01-18 06:09:55 +00:00
Cameron Zwarich
ea98dd25a5 Remove an unnecessary #include.
llvm-svn: 123748
2011-01-18 06:07:18 +00:00
Cameron Zwarich
c8083524f8 Move DominanceFrontier from VMCore to Analysis.
llvm-svn: 123747
2011-01-18 06:06:27 +00:00
Daniel Dunbar
12dd48769d McARM: Use accessors where appropriate.
llvm-svn: 123746
2011-01-18 05:55:27 +00:00
Daniel Dunbar
51fef8d445 McARM: Fill in ASMOperand::dump() for memory operands.
llvm-svn: 123745
2011-01-18 05:55:21 +00:00
Daniel Dunbar
f966e16cb0 McARM: Make ARMOperand use a union where appropriate.
llvm-svn: 123744
2011-01-18 05:55:15 +00:00
Cameron Zwarich
caca9a63e6 There is no point in verifying an analysis that is never updated.
llvm-svn: 123743
2011-01-18 05:44:04 +00:00
Daniel Dunbar
0cff3f953b McARM: Unify ParseMemory() successfull return.
llvm-svn: 123740
2011-01-18 05:34:24 +00:00
Daniel Dunbar
3cb5e8b0cb McARM: Early exit on failure (NEFC).
llvm-svn: 123739
2011-01-18 05:34:17 +00:00
Daniel Dunbar
9ea6873c89 McARM: Always keep an offset expression, if used (instead of assuming == 0 if used but not present), and simplify logic.
Also, clean up various non-sensicalisms in isMemModeRegThumb() and isMemModeImmThumb().

llvm-svn: 123738
2011-01-18 05:34:11 +00:00
Daniel Dunbar
8d7ed1f6a8 McARM: Add a variety of asserts on the sanity of memory operands.
llvm-svn: 123737
2011-01-18 05:34:05 +00:00
Daniel Dunbar
aa5e17f3a7 McARM: Use a consistent marker for not-set OffsetRegNum.
llvm-svn: 123736
2011-01-18 05:33:57 +00:00
Cameron Zwarich
4ccaa5dccc Convert a std::map to a DenseMap for another 1.7% speedup on -scalarrepl.
llvm-svn: 123732
2011-01-18 04:50:38 +00:00
Cameron Zwarich
c35c06d0d4 Make a std::vector a SmallVector<*, 32> like the other vectors in the same
function. This seems to be about a 1.5% speedup of -scalarrepl on test-suite
with SPEC2000 and SPEC2006.

llvm-svn: 123731
2011-01-18 04:41:32 +00:00
Rafael Espindola
1cab858a5c Reduce indentation and remove commented out code.
llvm-svn: 123729
2011-01-18 04:36:06 +00:00
Cameron Zwarich
62a9d4d454 Remove some now-unused DominanceFrontier methods.
llvm-svn: 123726
2011-01-18 04:21:57 +00:00
Cameron Zwarich
c93f994801 Remove code for updating dominance frontiers and some outdated references to
dominance and post-dominance frontiers.

llvm-svn: 123725
2011-01-18 04:11:31 +00:00
Cameron Zwarich
e39e476305 Remove outdated references to dominance frontiers.
llvm-svn: 123724
2011-01-18 03:53:26 +00:00
Daniel Dunbar
ba39b2fdc1 McARM: Start marking T2 address operands as such, for the benefit of the parser.
llvm-svn: 123722
2011-01-18 03:06:03 +00:00
Daniel Dunbar
d4e20df45a Support/CommandLine: Add "Did you mean" print for mismatched operands.
llvm-svn: 123717
2011-01-18 01:59:24 +00:00
Eric Christopher
e8aa8b114f The stub routine that we're calling uses test and so clobbers
the flags.

llvm-svn: 123712
2011-01-18 01:37:20 +00:00
Chris Lattner
047388c197 minor change to rafael's recent patches: if something is
constant but requires a unique address, we can still put it in a
readonly section, just not a mergable one.

llvm-svn: 123711
2011-01-18 01:23:44 +00:00
Jeffrey Yasskin
5f5e1f5ef1 Remove unused variables found by gcc-4.6's -Wunused-but-set-variable.
llvm-svn: 123707
2011-01-18 00:51:23 +00:00
Stuart Hastings
f5f8318eb6 Remove checking that prevented overlapping CALLSEQ_START/CALLSEQ_END
ranges, add legalizer support for nested calls.  Necessary for ARM
byval support.  Radar 7662569.

llvm-svn: 123704
2011-01-18 00:09:27 +00:00
NAKAMURA Takumi
04e2fe7fd0 Windows/PathV2.inc: For CryptAcquireContext(), CRYPT_VERIFYCONTEXT may be specified for easy use.
llvm-svn: 123687
2011-01-17 22:41:34 +00:00
NAKAMURA Takumi
bfd623d7e5 Windows/PathV2.inc: MoveFileEx() can behave like Posix's mv(1) to specify MOVEFILE_COPY_ALLOWED | MOVEFILE_REPLACE_EXISTING.
llvm-svn: 123686
2011-01-17 22:41:25 +00:00
NAKAMURA Takumi
6a74aa16c3 lib/Support/Windows/Signals.inc: "Showstopper" dialogs may be suppressed with SetErrorMode() on Windows 7.
llvm-svn: 123685
2011-01-17 22:41:15 +00:00
Owen Anderson
b315a60d0f Remove dead code, that I apparently wrote a while back. We seem to be doing well enough
without whatever this was trying to do.  When/if someone has the time to do some empirical
evaluations, it might be worth it to figure out what this code was trying to do and see if
it's worth resurrecting/fixing.

llvm-svn: 123684
2011-01-17 22:39:54 +00:00
Douglas Gregor
49f7d8c38c Add a missing <cctype> include, from Joerg Sonnenberger!
llvm-svn: 123670
2011-01-17 19:17:01 +00:00
Benjamin Kramer
869dc645f1 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00
Cameron Zwarich
0a1975802e Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662
2011-01-17 17:38:41 +00:00
Michael J. Spencer
d076942763 Archive: Fix temp path names.
llvm-svn: 123660
2011-01-17 16:43:30 +00:00
Michael J. Spencer
30124fea1f Support/raw_ostream: Fix uninitalized variable in raw_fd_ostream constructor.
llvm-svn: 123643
2011-01-17 15:53:12 +00:00
Jay Foad
599c83ed66 Remove useless Tag enumeration.
llvm-svn: 123623
2011-01-17 15:18:06 +00:00
Kalle Raiskila
7401b2a1db Split up RotateShift itinerary in SPU.
'rotq*' and 'shlq*' instructions go to the odd pipeline,
wheras the inter-vector equivalents 'rot*', 'shl*' go 
to the even.

llvm-svn: 123622
2011-01-17 13:33:19 +00:00
Benjamin Kramer
e9488ed8eb Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0.
This shaves off 4 popcounts from the hacked 186.crafty source.

This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.

llvm-svn: 123621
2011-01-17 12:04:57 +00:00
Kalle Raiskila
8eaf0e83d5 Don't crash SPU BE with memory accesses with big alignmnet.
llvm-svn: 123620
2011-01-17 11:59:20 +00:00
Evan Cheng
53ec6fc591 Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.
movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0

It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.

llvm-svn: 123619
2011-01-17 08:03:18 +00:00
Cameron Zwarich
77e381e302 Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
llvm-svn: 123618
2011-01-17 07:26:51 +00:00
Cameron Zwarich
e35642342b Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609
2011-01-17 01:08:59 +00:00
Michael J. Spencer
861d5122e2 UnRevert "Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.""
llvm-svn: 123605
2011-01-16 23:39:59 +00:00
Michael J. Spencer
4e4cb59457 Fix rename.
llvm-svn: 123604
2011-01-16 22:18:41 +00:00
Anton Korobeynikov
2a9d9ef36f Provide instruction sizes for ARMv5 variants of MUL instructions.
This fixes PR8987

llvm-svn: 123598
2011-01-16 21:28:33 +00:00
Anders Carlsson
d0103ebf92 Update README.txt to remove the DAE enhancement.
llvm-svn: 123597
2011-01-16 21:26:15 +00:00
Anders Carlsson
c9781e5764 Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead.
llvm-svn: 123596
2011-01-16 21:25:33 +00:00
Michael J. Spencer
7663111de0 UnRevert "Revert the archive part of "Support/PathV2: Add identify_magic.""
This reverts commit dd103021a889a986a181ce36ed7b0e8dc9b645e1.

llvm-svn: 123595
2011-01-16 21:13:51 +00:00
Michael J. Spencer
29fdfe200e Revert the archive part of "Support/PathV2: Add identify_magic."
llvm-svn: 123593
2011-01-16 19:56:42 +00:00
Chris Lattner
2f902bc3b1 tidy up a comment, as suggested by duncan
llvm-svn: 123590
2011-01-16 17:46:19 +00:00
Rafael Espindola
7933fffe38 Only put unnamed_addr constants in mergeable sections. Fixes PR8297.
llvm-svn: 123585
2011-01-16 17:19:34 +00:00
Rafael Espindola
41852873f7 Don't merge two constants if we care about the address of both.
This fixes the original testcase in PR8927. It also causes a clang
binary built with a patched clang to increase in size by 0.21%.

We can probably get some of the size back by writing a pass that
detects that a global never has its pointer compared and adds
unnamed_addr to it (maybe extend global opt). It is also possible that
there are some other cases clang could add unnamed_addr to.

I will investigate extending globalopt next.

llvm-svn: 123584
2011-01-16 17:05:09 +00:00
Jay Foad
aa0d059b67 Simplify the construction and destruction of Uses. Simplify
User::dropHungOffUses().

llvm-svn: 123580
2011-01-16 15:30:52 +00:00
Chris Lattner
dde85de90f fix PR8514, a bug where the "heroic" transformation of shift/and
into and/shift would cause nodes to move around and a dangling pointer
to happen.  The code tried to avoid this with a HandleSDNode, but 
got the details wrong.

llvm-svn: 123578
2011-01-16 08:48:11 +00:00
Jay Foad
030a48213f Move the implementation of the User class into a new source file,
User.cpp.

llvm-svn: 123575
2011-01-16 08:10:57 +00:00
Chris Lattner
a4454efc85 fix PR8932, a case where arg promotion could infinitely promote.
llvm-svn: 123574
2011-01-16 08:09:24 +00:00
Chris Lattner
f3b214e298 simplify a little
llvm-svn: 123573
2011-01-16 07:11:21 +00:00
Chris Lattner
91f1b21cf1 add some commentary
llvm-svn: 123572
2011-01-16 06:39:44 +00:00
Chris Lattner
29f339f87c if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571
2011-01-16 06:18:28 +00:00
Chris Lattner
b43fce09a9 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.

llvm-svn: 123570
2011-01-16 05:58:24 +00:00
Chris Lattner
1a125a870f remove a dead check, this was needed before we had an explicit veto on uses of phis.
llvm-svn: 123569
2011-01-16 05:37:55 +00:00
Chris Lattner
2067fb2a93 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Evan Cheng
144b435a15 Spill R4 if it's going to be used to restore SP from FP.
llvm-svn: 123567
2011-01-16 05:14:33 +00:00
Chris Lattner
84d8f40fbb remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
2011-01-16 05:14:26 +00:00
Chris Lattner
c639cb2c82 more cleanups: use the IR builder.
llvm-svn: 123565
2011-01-16 05:08:00 +00:00
Chris Lattner
9af2484c39 tidy up code.
llvm-svn: 123564
2011-01-16 04:37:29 +00:00
Owen Anderson
6e0fa67f91 Improve the safety of my globalopt enhancement by ensuring that the bitcast
of the stored value to the new store type is always.  Also, add a testcase.

llvm-svn: 123563
2011-01-16 04:33:33 +00:00
Chris Lattner
aba06ce448 fix PR8983, a broken assertion.
llvm-svn: 123562
2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju
fe346f6cba Implement AnalyzeBranch in Sparc Backend.
llvm-svn: 123561
2011-01-16 03:15:11 +00:00
Chris Lattner
24ea7f696e fix PR8981, a crash trying to form a conditional inc with a floating point compare.
llvm-svn: 123560
2011-01-16 02:56:53 +00:00
Chris Lattner
c4d1d86d3e reapply my fix for PR8961 with a tweak to properly handle
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
2011-01-16 02:27:38 +00:00
Chris Lattner
e3d0c7819e simplify this code, it is still broken but will follow up on llvm-commits.
llvm-svn: 123558
2011-01-16 02:05:10 +00:00
Michael J. Spencer
76f1706025 Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1."
llvm-svn: 123557
2011-01-16 01:43:22 +00:00
Chandler Carruth
a3261fcca5 Simplify a README.txt entry significantly to expose the core issue.
llvm-svn: 123556
2011-01-16 01:40:23 +00:00
Chris Lattner
75599bb566 remove the partial specialization pass. It is unmaintained and has bugs.
llvm-svn: 123554
2011-01-16 00:27:10 +00:00
Michael J. Spencer
303c304f0d Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123551
2011-01-15 21:43:37 +00:00
Benjamin Kramer
2e7ead5bb5 Add an assert so we don't silently miscompile ctpop for bit widths > 128.
llvm-svn: 123549
2011-01-15 21:19:37 +00:00
Michael J. Spencer
e1defa51ae Support/PathV2: Add identify_magic.
llvm-svn: 123548
2011-01-15 20:39:36 +00:00
Benjamin Kramer
b48a048de6 Reimplement CTPOP legalization with the "best" algorithm from
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.

llvm-svn: 123547
2011-01-15 20:30:30 +00:00
Michael J. Spencer
86a5515979 Support/PathV2: Implement has_magic in terms of get_magic.
llvm-svn: 123545
2011-01-15 18:52:41 +00:00
Michael J. Spencer
78fc0cacd0 Support/PathV2: Implement get_magic.
llvm-svn: 123544
2011-01-15 18:52:33 +00:00
Nick Lewycky
7e71443cf2 Add missing whitespace.
llvm-svn: 123543
2011-01-15 18:42:52 +00:00
Nick Lewycky
1d57e867a4 Make constmerge a two-pass algorithm so that it won't miss merging
opporuntities. Fixes PR8978.

llvm-svn: 123541
2011-01-15 18:14:21 +00:00
Benjamin Kramer
91f0608676 Try to unbreak selfhost.
llvm-svn: 123537
2011-01-15 11:25:34 +00:00
Nick Lewycky
9293c403d8 Add a cache that protects mergefunc's internals from more surprises in DenseSet.
Also, replace tabs with spaces. Yes, it's 2011.

llvm-svn: 123535
2011-01-15 10:16:23 +00:00
Nick Lewycky
708df45c84 Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this saves
half a million non-local queries, each of which would otherwise have triggered a
linear scan over a basic block.

Also fix a fixme for memory intrinsics which dereference pointers. With this,
we prove that a pointer is non-null because it was dereferenced by an intrinsic
112 times in llvm-test.

llvm-svn: 123533
2011-01-15 09:16:12 +00:00
Rafael Espindola
3b43f22391 Allow unnamed_addr on declarations.
llvm-svn: 123529
2011-01-15 08:15:00 +00:00
Chris Lattner
55c2150f36 temporarily revert r123526. While working on a follow-on patch I
realize that ConstantFoldTerminator doesn't preserve dominfo.

llvm-svn: 123527
2011-01-15 07:51:19 +00:00
Chris Lattner
68a47147ba fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.

llvm-svn: 123526
2011-01-15 07:36:13 +00:00
Chris Lattner
2a7c042c37 simplify code, no functionality change.
llvm-svn: 123525
2011-01-15 07:29:01 +00:00
Chris Lattner
d4eaf6eba8 Now that instruction optzns can update the iterator as they go, we can
have objectsize folding recursively simplify away their result when it
folds.  It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.

llvm-svn: 123524
2011-01-15 07:25:29 +00:00
Chris Lattner
939e77a0df make the current instruction iterator an ivar, allowing xforms that
potentially invalidate it (like inline asm lowering) to be sunk into
their proper place, cleaning up a ton of code.

llvm-svn: 123523
2011-01-15 07:14:54 +00:00
Chris Lattner
74ed5d30ca implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Chris Lattner
0868c29c36 one more instcombine variant that is needed to work with future changes,
no functionality change currently.

llvm-svn: 123517
2011-01-15 05:50:18 +00:00
Chris Lattner
360fedf20a fix typo
llvm-svn: 123516
2011-01-15 05:42:47 +00:00
Chris Lattner
ca796e7838 Catch ~x < cst just like ~x < ~y, we currently handle this through
means that are about to disappear.

llvm-svn: 123515
2011-01-15 05:41:33 +00:00
Chris Lattner
06849c1228 reduce indentation
llvm-svn: 123514
2011-01-15 05:40:29 +00:00
Eric Christopher
d675e0b362 80-col.
llvm-svn: 123505
2011-01-15 00:25:09 +00:00
Chris Lattner
e6d5b3c4ce Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
Bob Wilson
e6b8ba1ae4 Fix a comment.
llvm-svn: 123497
2011-01-15 00:09:18 +00:00
Eric Christopher
b00cef51d8 Fix 80-cols.
llvm-svn: 123494
2011-01-14 23:50:53 +00:00
Ted Kremenek
c9d2425c5a Update CMake build.
llvm-svn: 123491
2011-01-14 22:58:11 +00:00
Ted Kremenek
4b09cdedb2 'HiReg' is written but never read. Nuke its
declaration and its assignments.

Found by clang static analyzer.

llvm-svn: 123486
2011-01-14 22:34:13 +00:00
Owen Anderson
63902f2c99 Fix a false-positive warning.
llvm-svn: 123480
2011-01-14 22:31:13 +00:00
Dan Gohman
a4f2631ea9 Delete an assignment to ThisBB which isn't needed, and tidy up some
comments.

llvm-svn: 123479
2011-01-14 22:26:16 +00:00
Owen Anderson
51dcd56f96 Enhance GlobalOpt to be able evaluate initializers that involve stores through
bitcasts, at least in simple cases.  This fixes clang's CodeGenCXX/virtual-base-dtor.cpp

llvm-svn: 123477
2011-01-14 22:19:20 +00:00
Anton Korobeynikov
1f9df99db1 Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff.
llvm-svn: 123476
2011-01-14 21:58:08 +00:00
Anton Korobeynikov
6b2f110a3d Cleanup
llvm-svn: 123475
2011-01-14 21:57:58 +00:00
Anton Korobeynikov
ef11a77938 Add CFI directives-based frame information emission. Not hooked yet.
llvm-svn: 123474
2011-01-14 21:57:53 +00:00
Anton Korobeynikov
e53322ef91 Split stuff as a preparation for CFI directives-based frame information emission
llvm-svn: 123473
2011-01-14 21:57:45 +00:00
Anton Korobeynikov
aeeec825cd Use common style for .cfi directives
llvm-svn: 123472
2011-01-14 21:57:39 +00:00
Andrew Trick
a0e69757d1 Support for precise scheduling of the instruction selection DAG,
disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
  if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
  latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
  for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
  prioritize nodes by their depth rather than height. As long as it
  doesn't stall, height is irrelevant. Depth represents the critical
  path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
  adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need
to be templates.

llvm-svn: 123468
2011-01-14 21:11:41 +00:00
Chris Lattner
1ce35a0362 switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.
llvm-svn: 123457
2011-01-14 19:50:47 +00:00
Chris Lattner
2cf8e75d34 Add a new LoadAndStorePromoter class, which implements the general
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.

llvm-svn: 123456
2011-01-14 19:36:13 +00:00
Duncan Sands
dc51b0ee48 Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common
simplification present in fully optimized code (I think instcombine fails to
transform some of these when "X-Y" has more than one use).  Fires here and
there all over the test-suite, for example it eliminates 8 subtractions in
the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc.

llvm-svn: 123442
2011-01-14 15:26:10 +00:00
Duncan Sands
4757061c47 Factorize common code out of the InstructionSimplify shift logic. Add in
threading of shifts over selects and phis while there.  This fires here and
there in the testsuite, to not much effect.  For example when compiling spirit
it fires 5 times, during early-cse, resulting in 6 more cse simplifications,
and 3 more terminators being folded by jump threading, but the final bitcode
doesn't change in any interesting way: other optimizations would have caught
the opportunity anyway, only later.

llvm-svn: 123441
2011-01-14 14:44:12 +00:00