Commit Graph

73760 Commits

Author SHA1 Message Date
Zoran Jovanovic
59e16813d2 [mips][microMIPS] Implement ADDU16 and SUBU16 instructions
Differential Revision: http://reviews.llvm.org/D5118


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 08:44:58 +00:00
Zoran Jovanovic
a245b68293 [mips][microMIPS] Implement AND16, NOT16, OR16 and XOR16 instructions
Differential Revision: http://reviews.llvm.org/D5117


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220275 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 08:32:40 +00:00
Zoran Jovanovic
fa20aa343b [mips][microMIPS] Implement microMIPS 16-bit instructions registers
Differential Revision: http://reviews.llvm.org/D5116


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220273 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 08:23:11 +00:00
Rafael Espindola
45968c54e9 Fix a bit of confusion about .set and produce more readable assembly.
Every target we support has support for assembly that looks like

a = b - c
.long a

What is special about MachO is that the above combination suppresses the
production of a relocation.

With this change we avoid producing the intermediary labels when they don't
add any value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 01:17:30 +00:00
Paul Robinson
f8c9d3d3c2 Do not attribute static allocas to the call site's DebugLoc.
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.

Patch by Wolfgang Pieb!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 01:00:55 +00:00
David Blaikie
82805eb232 PR21202: Memory leak in Windows RWMutexImpl when using SRWLOCK
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220251 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 00:34:39 +00:00
Rafael Espindola
33966cf988 Make AsmPrinter::EmitLabelOffsetDifference a static helper and simplify.
It had exactly one caller in a position where we know hasSetDirective is true.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220250 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 00:25:49 +00:00
Lang Hames
acaf8f5618 [MCJIT] Temporarily revert r220245 - it broke several bots.
(See e.g. http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/17653)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220249 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 00:24:02 +00:00
Philip Reames
9be9473394 Introduce enum values for previously defined metadata types. (NFC)
Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well.  Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison.  This change adds enum value for three existing metadata types:
+    MD_nontemporal = 9, // "nontemporal"
+    MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access"
+    MD_nonnull = 11 // "nonnull"

I went through an updated various uses as well.  I made no attempt to get all uses; I focused on the ones which were easily grepable and easily to translate.  For example, there were several items in LoopInfo.cpp I chose not to update.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 00:13:20 +00:00
Philip Reames
c0c4f1b78e Extend the verifier to validate range metadata on calls and invokes.
Range metadata applies to loads, call, and invokes.  We were validating that metadata applied to loads was correct according to the LangRef, but we were not validating metadata applied to calls or invokes.  This change extracts the checking functionality to a common location, reuses it for all valid locations, and adds a simple test to ensure a misused range on a call gets reported.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220246 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 23:52:07 +00:00
Lang Hames
32aaaeaa05 [MCJIT] Make MCJIT honor symbol visibility settings when populating the global
symbol table.

Patch by Anthony Pesch. Thanks Anthony!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220245 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 23:39:54 +00:00
Quentin Colombet
a37862e2de [X86] Fix a bug in the lowering of the mask of VSELECT.
X86 code to lower VSELECT messed a bit with the bits set in the mask of VSELECT
when it knows it can be lowered into BLEND. Indeed, only the high bits need to be
set for those and it optimizes those accordingly.
However, when the mask is a compile time constant, the lowering will be handled
by the generic optimizer and those modifications will generate bad code in the
generic optimizer.

This patch fixes that by preventing the optimization if the VSELECT will be
handled by the generic optimizer.

<rdar://problem/18675020>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 23:13:30 +00:00
Philip Reames
90f3f15da5 Introduce a 'nonnull' metadata on Load instructions.
The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns.  Long term, it would be nice to combine these into a single construct.   The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull.

Reviewed by: Hal Finkel
Differential Revision:  http://reviews.llvm.org/D5220




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220240 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 22:40:55 +00:00
Simon Pilgrim
0d1978b813 [X86] Memory folding for commutative instructions (updated)
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.

Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt.

Added additional regression test case provided by Joerg Sonnenberger.

Differential Revision: http://reviews.llvm.org/D5818



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220239 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 22:14:22 +00:00
Tim Northover
4e399f4500 ARM: rework Thumb1 frame index rewriting
The previous code had a few problems, motivating the choices here.

1. It could create instructions clobbering CPSR, but the incoming MachineInstr
   didn't reflect this. A potential source of corruption. This is why the patch
   has a new PseudoInst for before lowering.
2. Similarly, there was some code to handle the incoming instruction not being
   ARMCC::AL, but this would have caused massive problems if it was actually
   invoked when a complex offset needing more than one instruction was requested.
3. It wasn't designed to handle unaligned pointers (or offsets). These should
   probably be minimised anyway, but the code needs to deal with them properly
   regardless.
4. It had some rather dubious ad-hoc code to avoid calling
   emitThumbRegPlusImmediate, a function which should be designed to do precisely
   this job.

We seem to cover the common cases correctly now, and hopefully can enhance
emitThumbRegPlusImmediate to handle any extra optimisations we need to add in
future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 21:28:41 +00:00
Alexey Samsonov
9170808b2a Be more specific about return type of MachOUniversalBinary::getObjectForArch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 20:30:57 +00:00
Alexey Samsonov
10051f0f62 Constify input argument of RelocVisitor and DWARFContext constructors. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220228 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 20:28:51 +00:00
Robert Khasanov
10646db916 Moved out IIT_V64 from common values section.
Thanks Juergen Ributzka for notice.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220224 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 19:25:05 +00:00
Steven Wu
3ea39890d1 Fix Intrinsic::getType not working with vararg
VarArg Intrinsic functions are encoded with "void" type as the last
argument. Now Intrinsic::getType can correctly return all the intrinsic
function type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220205 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 15:47:24 +00:00
Oliver Stannard
e7c9c44387 [Thumb2] RFE, SRS and "SUBS pc, lr" are undefined on v7M
These instructions are related to the v7[AR] exception model, and are
not defined on v7M.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220204 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 15:37:35 +00:00
Sid Manning
958df22c9f Remove unnecessary else.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 13:08:19 +00:00
Oliver Stannard
19d010b851 [ARM] Do not select SMULW[BT] or SMLAW[BT]
The current instruction selection patterns for SMULW[BT] and SMLAW[BT]
are incorrect. These instructions multiply a 32-bit and a 16-bit value
(both signed) and return the top 32 bits of the 48-bit result. This
preserves the 16 bits of overflow, whereas the patterns they currently
match truncate the result to 16 bits then sign extend.

To select these instructions, we would need to match an ISD::SMUL_LOHI,
a sign extend, two shifts and an or. There is no way to match SMUL_LOHI
in an instruction pattern as it defines multiple values, so this would
have to be done in C++. I have raised
http://llvm.org/bugs/show_bug.cgi?id=21297 to cover allowing correct
selection of these instructions.

This fixes http://llvm.org/bugs/show_bug.cgi?id=19396



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220196 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 11:30:35 +00:00
Oliver Stannard
508c39393a [Thumb] Fix crash in Thumb1RegisterInfo::rewriteFrameIndex
This function can, for some offsets from the SP, split one instruction
into two. Since it re-uses the original instruction as the first
instruction of the result, we need ensure its result register is not
marked as dead before we use it in the second instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 11:00:18 +00:00
Chandler Carruth
34b45cdb95 Switch the default DataLayout to be little endian, and make the variable
be BigEndian so the default can continue to be zero-initialized.

This is one of the prerequisites to making DataLayout a constant and
always available part of every module.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 10:41:29 +00:00
Chandler Carruth
3ac929c473 Fix a miscompile introduced in r220178.
The original code had an implicit assumption that if the test for
allocas or globals was reached, the two pointers were not equal. With my
changes to make the pointer analysis more powerful here, I also had to
guard against circumstances where the results weren't useful. That in
turn violated the assumption and gave rise to a circumstance in which we
could have a store with both the queried pointer and stored pointer
rooted at *the same* alloca. Clearly, we cannot ignore such a store.
There are other things we might do in this code to better handle the
case of both pointers ending up at the same alloca or global, but it
seems best to at least make the test explicit in what it intends to
check.

I've added tests for both the alloca and global case here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 10:03:01 +00:00
David Majnemer
7798534e77 IR: Replace DataLayout::RoundUpAlignment with RoundUpToAlignment
No functional change intended, just cleaning up some code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 06:13:33 +00:00
Chandler Carruth
080dfb5bda Fix a somewhat subtle pair of issues with JumpThreading I introduced in
r220178. First, the creation routine doesn't insert prior to the
terminator of the basic block provided, but really at the end of the
basic block. Instead, get the terminator and insert before that. The
next issue was that we need to ensure multiple PHI node entries for
a single predecessor re-use the same cast instruction rather than
creating new ones.

All of the logic here was without tests previously. I've reduced and
added a test case from the test suite that crashed without both of these
fixes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 05:34:36 +00:00
Chandler Carruth
35c4e071be Teach the load analysis driving core instcombine logic and other bits of
logic to look through pointer casts, making them trivially stronger in
the face of loads and stores with intervening pointer casts.

I've included a few test cases that demonstrate the kind of folding
instcombine can do without pointer casts and then variations which
obfuscate the logic through bitcasts. Without this patch, the variations
all fail to optimize fully.

This is more important now than it has been in the past as I've started
moving the load canonicialization to more closely follow the value type
requirements rather than the pointer type requirements and thus this
needs to be prepared for more pointer casts. When I made the same change
to stores several test cases regressed without logic along these lines
so I wanted to systematically improve matters first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220178 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 00:24:14 +00:00
Chandler Carruth
63276ccdbd Do a better and more complete job of preserving metadata when combining
loads.

This handles many more cases than just the AA metadata, some of them
suggested by Hal in his review of the AA metadata handling patch. I've
tried to test this behavior where tractable to do so.

I'll point out that I have specifically *not* included a test for
debuginfo because it was going to require 2 or 3 times as much work to
craft some input which would survive the "helpful" stripping of debug
info metadata that doesn't match the desired schema. This is another
good example of why the current state of write-ability for our debug
info metadata is unacceptable. I spent over 30 minutes trying to conjure
some test case that would survive, even copying from other debug info
tests, but it always failed to survive with no explanation of why or how
I might fix it. =[

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220165 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 10:46:46 +00:00
Chandler Carruth
4d2a706176 Move previously dead code to handle computing the known bits of an alias
up to where it actually works as intended. The problem is that
a GlobalAlias isa GlobalValue and so the prior block handled all of the
cases.

This allows us to constant fold based on the actual constant expression
in the global alias. As an example, see the last function in the newly
added test case which explicitly aligns an unaligned pointer using
constant expression math. Without this change, we fail to see that and
fold an alignment test to zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220164 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 09:06:56 +00:00
David Majnemer
0fd4e2e5a1 InstCombine: (sub (or A B) (xor A B)) --> (and A B)
The following implements the transformation:
(sub (or A B) (xor A B)) --> (and A B).

Patch by Ankur Garg!

Differential Revision: http://reviews.llvm.org/D5719

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220163 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:32:32 +00:00
David Majnemer
242aeb9d84 InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1
The following implements the optimization for sequences of the form:
icmp eq/ne (shl Const2, A), Const1

Such sequences can be transformed to:
icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2))

This handles only the equality operators for now. Other operators need
to be handled.

Patch by Ankur Garg!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220162 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:23:08 +00:00
Chandler Carruth
908d4514f6 Fix a long-standing miscompile in the load analysis that was uncovered
by my refactoring of this code.

The method isSafeToLoadUnconditionally assumes that the load will
proceed with the preferred type alignment. Given that, it has to ensure
that the alloca or global is at least that aligned. It has always done
this historically when a datalayout is present, but has never checked it
when the datalayout is absent. When I refactored the code in r220156,
I exposed this path when datalayout was present and that turned the
latent bug into a patent bug.

This fixes the issue by just removing the special case which allows
folding things without datalayout. This isn't worth the complexity of
trying to tease apart when it is or isn't safe without actually knowing
the preferred alignment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:17:50 +00:00
Chandler Carruth
f84407be32 Switch how the datalayout availability test is handled in this code to
make much more sense and in theory be more correct.

If you trace the code alllll the way back to when it was first
introduced, the comments make it slightly more clear what was going on
here. At that time, the only way Base != V was if DL (then TD) was
non-null. As a consequence, if DL *was* null, that meant we were loading
directly from the alloca or global found above the test. After
refactoring, this has become at least terribly subtle and potentially
incorrect. There are many forms of pointer manipulation that can be
traversed without DataLayout, and some of them would in fact change the
size of object being loaded vs. allocated.

Rather than this subtlety, I've hoisted the actual 'return true' bits
into the code which actually found an alloca or global and based them on
the loaded pointer being that alloca or global. This is both more clear
and safer. I've also added comments about exactly why this set of
predicates is used.

I've also corrected a misleading comment about globals -- if overridden
they may not just have a different size, they may be null and completely
unsafe to load from!

Hopefully this confuses the next reader a bit less. I don't have any
test cases or anything, the patch is motivated strictly to improve the
readability of the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220156 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 00:42:16 +00:00
Bob Wilson
efed41c621 Use triple predicate functions instead of checking values directly. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220155 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 00:39:30 +00:00
Chandler Carruth
652627d301 Rename 'TD' to 'DL' in this function as the argument is now a DataLayout
argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220151 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 23:47:22 +00:00
Chandler Carruth
dacb8a615d Fix the other comment to use modern doxygen style and be a bit more
direct. Notably, comment on the fact that the loaded type is significant
in that it determines how wide of an access must be safe.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220150 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 23:46:17 +00:00
Chandler Carruth
28502a895a More formatting cleanup brought to you by clang-format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220149 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 23:41:25 +00:00
Chandler Carruth
e99ca835bc Clean up doxygen syntax and reword comments to flow better, have a brief
section, and not have unfinished sentence fragments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220147 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 23:31:55 +00:00
Chandler Carruth
01dc911c73 Clean up the formatting and trailing whitespace of a routine before
editting it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220146 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 23:19:03 +00:00
Lang Hames
33ea6f23fc [PBQP] Replace the interference-constraints algorithm with a faster version
loosely based on linear scan.

On x86-64 this is good for a ~2% drop in compile time on the nightly test suite.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220143 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 17:26:07 +00:00
Chandler Carruth
797e9b812e Preserve AA metadata when combining (cast (load (...))) -> (load (cast
(...))).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 11:00:12 +00:00
Chandler Carruth
9b2d091a9c [InstCombine] Do an about-face on how LLVM canonicalizes (cast (load
...)) and (load (cast ...)): canonicalize toward the former.

Historically, we've tried to load using the type of the *pointer*, and
tried to match that type as closely as possible removing as many pointer
casts as we could and trading them for bitcasts of the loaded value.
This is deeply and fundamentally wrong.

Repeat after me: memory does not have a type! This was a hard lesson for
me to learn working on SROA.

There is only one thing that should actually drive the type used for
a pointer, and that is the type which we need to use to load from that
pointer. Matching up pointer types to the loaded value types is very
useful because it minimizes the physical size of the IR required for
no-op casts. Similarly, the only thing that should drive the type used
for a loaded value is *how that value is used*! Again, this minimizes
casts. And in fact, the *only* thing motivating types in any part of
LLVM's IR are the types used by the operations in the IR. We should
match them as closely as possible.

I've ended up removing some tests here as they were testing bugs or
behavior that is no longer present. Mostly though, this is just cleanup
to let the tests continue to function as intended.

The only fallout I've found so far from this change was SROA and I have
fixed it to not be impeded by the different type of load. If you find
more places where this change causes optimizations not to fire, those
too are likely bugs where we are assuming that the type of pointers is
"significant" for optimization purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 06:36:22 +00:00
Nick Kledzik
50ede1623d [llvm-objdump] Fix mach-o binding decompression error
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 01:21:02 +00:00
Chandler Carruth
2402e6315d [SROA] Change how SROA does vector-based promotion of allocas to handle
cases where the alloca type, the load types, and the store types used
all disagree.

Previously, the only way that vector-based promotion occured was if the
alloca type was a vector type. This was one of the *very* few remaining
uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns
out it was a bad idea.

The alloca type can change very easily based on the mixture of types
loaded and stored to that alloca. We shouldn't be relying on it as
a signal for very much. Instead, the source of truth should be loads and
stores. We should canonicalize the loads and stores as much as possible
and then rely on them exclusively in SROA.

When looking and loads and stores, we may find many different candidate
vector types. This change will let SROA try all of them to find a vector
type which is a viable way to promote the entire alloca to a vector
register.

With this change, it becomes possible to do better canonicalization and
optimization of loads and stores without breaking SROA in random ways,
and that should allow fixing a core source of performance loss in hot
numerical loops such as those in Eigen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220116 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 00:44:02 +00:00
Aaron Watry
4e00650f58 R600/SI: Add global atomicrmw xchg
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220110 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:33:03 +00:00
Aaron Watry
2107be5bc7 R600/SI: Add global atomicrmw xor
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:33:01 +00:00
Aaron Watry
e81b68b86c R600/SI: Add global atomicrmw or
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220108 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:59 +00:00
Aaron Watry
1883b51d2e R600/SI: Add global atomicrmw min/umin
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220107 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:57 +00:00
Aaron Watry
387e397ecd R600/SI: Add global atomicrmw max/umax
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:56 +00:00
Aaron Watry
beac0c1403 R600/SI: Add global atomicrmw and
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220105 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:54 +00:00
Aaron Watry
892bb7df98 R600/SI: Add global atomicrmw sub
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:52 +00:00
Evgeniy Stepanov
c83c81a62e [msan] Fix handling of byval arguments with large alignment.
MSan param-tls slots are 8-byte aligned. This change clips
alignment of memcpy into param-tls to 8.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220101 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:29:44 +00:00
Pete Cooper
185a992dcf Check for dynamic alloca's when selecting lifetime intrinsics.
TL;DR: Indexing maps with [] creates missing entries.

The long version:

When selecting lifetime intrinsics, we index the *static* alloca map with the AllocaInst we find for that lifetime.  Trouble is, we don't first check to see if this is a dynamic alloca.

On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime.  0 was used for the frame index of the stack protector, which given that it now has a lifetime, is coloured, and merged with other stack slots.

PEI would later trigger an assert because it expects the stack protector to not be dead.

This fix ensures that we only get frame indices for static allocas, ie, those in the map.  Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken.

rdar://problem/18672951

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220099 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 22:59:33 +00:00
Rafael Espindola
ec51f45338 Revert "TRE: make TRE a bit more aggressive"
This reverts commit r219899.

This also updates byval-tail-call.ll to make it clear what was breaking.
Adding r219899 again will cause the load/store to disappear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220093 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:25:48 +00:00
Bill Schmidt
b56bf6b112 [PowerPC] Change assert to better form
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220092 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:19:59 +00:00
Matt Arsenault
24463c7df7 R600/SI: Remove redundant setting of instruction bits
These are all set on the instruction base classes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220091 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:13:11 +00:00
Bill Schmidt
d2dcbd00f7 [PowerPC] Change liveness testing in VSX FMA mutation pass
With VSX enabled, LLVM crashes when compiling
test/CodeGen/PowerPC/fma.ll.  I traced this to the liveness test
that's revised in this patch. The interval test is designed to only
work for virtual registers, but in this case the AddendSrcReg is
physical. Since there is already a walk of the MIs between the
AddendMI and the FMA, I added a check for def/kill of the AddendSrcReg
in that loop.  At Hal Finkel's request, I converted the liveness test
to an assert restricted to virtual registers.

I've changed the fma.ll test to have VSX and non-VSX variants so we
can test both kinds of multiply-adds.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:02:44 +00:00
Matt Arsenault
b6591042cd Fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220068 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:02:31 +00:00
Matt Arsenault
0e974f694b R600/SI: Also check for FPImm literal constants
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220067 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:50 +00:00
Matt Arsenault
7d8f1710a3 R600/SI: Allow commuting with source modifiers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:48 +00:00
Matt Arsenault
b4fe2b433e R600/SI: Simplify code with hasModifiersSet
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220065 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:45 +00:00
Matt Arsenault
415789c57e R600/SI: Fix general commuting breaking src mods
The generic code trying to use findCommutedOpIndices won't
understand that it needs to swap the modifier operands also,
so it should fail if they are set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220064 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:43 +00:00
Matt Arsenault
bf5be3f989 R600/SI: Cleanup code with ChangeToFPImmediate
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220063 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:41 +00:00
Matt Arsenault
84895bd2e6 R600/SI: Allow comuting fp immediates
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220062 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:39 +00:00
Matt Arsenault
7eeaefa0c8 R600/SI: Use early return instead of checking condition twice
Any commutable instruction will have at least src1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:37 +00:00
Matt Arsenault
aa796b99bb R600/SI: Use complex pattern for MUBUF load patterns.
This eliminates a use of the SI_ADDR64_RSRC pseudo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220057 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:43:00 +00:00
Matt Arsenault
46b53c9e4b R600/SI: Remove SI_BUFFER_RSRC pseudo
Just use REG_SEQUENCE directly, so there are fewer
instructions to need to deal with later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220056 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:42:56 +00:00
Juergen Ributzka
32ef68718d [Stackmaps] Enable invoking the patchpoint intrinsic.
Patch by Kevin Modzelewski
Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits, reames

Differential Revision: http://reviews.llvm.org/D5634

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220055 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:39:00 +00:00
Andrea Di Biagio
5512b50db5 [X86] Fix missed selection of non-temporal store of zero vector.
When the input to a store instruction was a zero vector, the backend
always selected a normal vector store regardless of the non-temporal
hint. This is fixed by this patch.

This fixes PR19370.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:27:06 +00:00
James Molloy
7023b85187 [AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.
We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.

While we're at it, avoid writing to std::vector::end()...

Bug found with random testing and a lot of coffee.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220051 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:06:31 +00:00
Bill Schmidt
b76f5ba103 [PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation
Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64
types, but does not yet use lxvw4x and stxvw4x for 4x32 types.  This
patch adds that support.

As with lxvd2x/stxvd2x, this involves straightforward overriding of
the patterns normally recognized for lvx/stvx, with preference given
to the VSX patterns when VSX is enabled.

In addition, the logic for permitting misaligned memory accesses is
modified so that v4r32 and v4i32 are treated the same as v2f64 and
v2i64 when VSX is enabled.  Finally, the DAG generation for unaligned
loads is changed to just use a normal LOAD (which will become lxvw4x)
on P8 and later hardware, where unaligned loads are preferred over
lvsl/lvx/lvx/vperm.

A number of tests now generate the VSX loads/stores instead of
lvx/stvx, so this patch adds VSX variants to those tests.  I've also
added <4 x float> tests to the vsx.ll test case, and created a
vsx-p8.ll test case to be used for testing code generation for the
P8Vector feature.  For now, that simply tests the unaligned load/store
behavior.

This has been tested along with a temporary patch to enable the VSX
and P8Vector features, with no new regressions encountered with or
without the temporary patch applied.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220047 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 15:13:38 +00:00
Jan Vesely
7eee3b07b5 Mips: Only set divrem i64 to custom on 64bit
Reviewed-by: Daniel Sanders <daniel.sanders@imgtec.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:45:28 +00:00
Jan Vesely
cef793e8c7 SelectionDAG: Add sext_inreg optimizations
v2: use dyn_cast
    fixup comments
v3: use cast

Reviewed-by: Matt Arsenault <arsenm2@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:45:25 +00:00
Vasileios Kalintiris
eaf8f5efe9 [mips] Add support for COP1's Branch-On-Cond-Likely instructions
Summary: Depends on D5782

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:08:28 +00:00
Vasileios Kalintiris
0f22fe9b56 [mips] Add support for COP0's Branch-On-Cond-Likely instructions
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220036 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 12:38:35 +00:00
Hal Finkel
9d85eff56a [DSE] Remove no-data-layout-only type-based overlap checking
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.

Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 11:56:00 +00:00
Rafael Espindola
d8ee23f34c Add back commits r219835 and a fixed version of r219829.
The only difference from r219829 is using

getOrCreateSectionSymbol(*ELFSec)

instead of

GetOrCreateSymbol(ELFSec->getSectionName())

in ELFObjectWriter which causes us to use the correct section symbol even if
we have multiple sections with the same name.

Original messages:

r219829:
Correctly handle references to section symbols.

When processing assembly like

.long .text

we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.

This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.

The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.

r219835:
Allow forward references to section symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220021 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:48:58 +00:00
Akira Hatanaka
1cdebe50c1 ARM: Fix a bug which was causing convergence failure in constant-island pass.
The bug is in ARMConstantIslands::createNewWater where the upper bound of the
new water split point is computed:

// This could point off the end of the block if we've already got constant
// pool entries following this block; only the last one is in the water list.
// Back past any possible branches (allow for a conditional and a maximally
// long unconditional).
if (BaseInsertOffset + 8 >= UserBBI.postOffset()) {
  BaseInsertOffset = UserBBI.postOffset() - UPad - 8;
  DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset));
}

The split point is supposed to be somewhere between the machine instruction that
loads from the constant pool entry and the end of the basic block, before branch
instructions. The code above is fine if the basic block is large enough and
there are a sufficient number of instructions following the machine instruction.
However, if the machine instruction is near the end of the basic block,
BaseInsertOffset can point to the machine instruction or another instruction
that precedes it, and this can lead to convergence failure.

This commit fixes this bug by ensuring BaseInsertOffset is larger than the
offset of the instruction following the constant-loading instruction.

rdar://problem/18581150


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220015 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:31:47 +00:00
Rafael Espindola
70a1be3f76 Revert commit r219835 and r219829.
Revert "Correctly handle references to section symbols."
Revert "Allow forward references to section symbols."

Rui found a regression I am debugging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:06:02 +00:00
Peter Zotov
cb76f395d7 [LLVM-C] Add LLVMInstructionClone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220007 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:02:34 +00:00
Matt Arsenault
a383742439 R600/SI: Simplify debug printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 00:36:20 +00:00
Matt Arsenault
61bf4bf2c3 R600/SI: Remove another VALU pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219988 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 23:33:37 +00:00
Peter Collingbourne
86b3d8eb43 Introduce LLVMParseCommandLineOptions C API function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219975 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 22:47:52 +00:00
Juergen Ributzka
933d703d7c Reduce code duplication between patchpoint and non-patchpoint lowering. NFC.
This is in preparation for another patch that makes patchpoints invokable.

Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219967 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:26:35 +00:00
Chandler Carruth
68ca48cd90 [SROA] Switch the common variable name for the 'AllocaSlices' class to
'AS'.

Using 'S' as this was a terrible idea. Arguably, 'AS' is not much
better, but it at least follows the idea of using initialisms and
removes active confusion about the AllocaSlices variable and a Slice
variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219963 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:11:55 +00:00
Chandler Carruth
c62c42b1e4 [SROA] More range-based cleanups to SROA, these brought to you by
clang-modernize.

I did have to clean up the variable types and whitespace a bit because
the use of auto made the code much less readable here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219962 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:05:14 +00:00
Chandler Carruth
5269b24da1 [SROA] Switch a couple of overly complex iterator accessors to just be
ArrayRef accessors.

I think this even came up in review that this was over-engineered, and
indeed it was. Time to un-build it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219958 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:42:08 +00:00
Robin Morisset
d310963833 Erase fence insertion from SelectionDAGBuilder.cpp (NFC)
Summary:
Backends can use setInsertFencesForAtomic to signal to the middle-end that
montonic is the only memory ordering they can accept for
stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger
ordering to fences + monotonic accesses is currently living in
SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it
for several reasons:
- There is lots of redundancy to avoid: extremely similar logic already
  exists in AtomicExpand.
- The current code in SelectionDAGBuilder does not use any target-hooks, it
  does the same transformation for every backend that requires it
- As a result it is plain *unsound*, as it was apparently designed for ARM.
  It happens to mostly work for the other targets because they are extremely
  conservative, but Power for example had to switch to AtomicExpand to be
  able to use lwsync safely (see r218331).
- Because it produces IR-level fences, it cannot be made sound ! This is noted
  in the C++11 standard (section 29.3, page 1140):
```
Fences cannot, in general, be used to restore sequential consistency for atomic
operations with weaker ordering semantics.
```
It can also be seen by the following example (called IRIW in the litterature):
```
atomic<int> x = y = 0;
int r1, r2, r3, r4;
Thread 0:
  x.store(1);
Thread 1:
  y.store(1);
Thread 2:
  r1 = x.load();
  r2 = y.load();
Thread 3:
  r3 = y.load();
  r4 = x.load();
```
r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst.
But if they are lowered to monotonic accesses, no amount of fences can prevent it..

This patch does three things (I could cut it into parts, but then some of them
would not be tested/testable, please tell me if you would prefer that):
- it provides a default implementation for emitLeadingFence/emitTrailingFence in
terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder.
As we saw above, this is unsound, but the best that can be done without knowing
the targets well (and there is a comment warning about this risk).
- it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default
implementation (that exactly replicates the logic of SelectionDAGBuilder, so no
functional change)
- it finally erase this logic from SelectionDAGBuilder as it is dead-code.

Ideally, each target would define its own override for emitLeading/TrailingFence
using target-specific fences, but I do not know the Sparc/Mips/XCore memory model
well enough to do this, and they appear to be dealing fine with the ARM-inspired
default expansion for now (probably because they are overly conservative, as
Power was). If anyone wants to compile fences more agressively on these
platforms, the long comment should make it clear why he should first override
emitLeading/TrailingFence.

Test Plan: make check-all, no functional change

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219957 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:34:57 +00:00
Matt Arsenault
ceb4f4907d R600/SI: Remove unnecessary VALU patterns
These haven't been necessary since allowing
selecting SALU instructions in non-entry blocks
was enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219956 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:31:50 +00:00
Chandler Carruth
c2320545bc [SROA] Start more deeply moving SROA to use ranges rather than just
iterators.

There are a ton of places where it essentially wants ranges
rather than just iterators. This is just the first step that adds the
core slice range typedefs and uses them in a couple of places. I still
have to explicitly construct them because they've not been punched
throughout the entire set of code. More range-based cleanups incoming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219955 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:24:07 +00:00
Matt Arsenault
0134a9bed3 R600: Fix nonsensical implementation of computeKnownBits for BFE
This was resulting in invalid simplifications of sdiv

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219953 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:07:40 +00:00
Rafael Espindola
2f8f1d34e3 Delete -std-compile-opts.
These days -std-compile-opts was just a silly alias for -O3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:00:02 +00:00
Bjorn Steinbrink
6eaa62af77 Allow call-slop optzn for destinations with a suitable dereferenceable attribute
Summary:
Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 19:43:08 +00:00
Sanjay Patel
d8214db086 fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().

In the simplest case, this:

   y = sqrt(x * x);

becomes this:

   y = fabs(x);

This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.

Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.

Differential Revision: http://reviews.llvm.org/D5787



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 18:48:17 +00:00
Juergen Ributzka
c40dab2069 [AArch64] Fix miscompile of sdiv-by-power-of-2.
When the constant divisor was larger than 32bits, then the optimized code
generated for the AArch64 backend would emit the wrong code, because the shift
was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would
loose the upper 32bits.

This fixes rdar://problem/18678801.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 16:41:15 +00:00
Vasileios Kalintiris
3b72ec5083 [mips] Account for endianess when expanding BuildPairF64/ExtractElementF64 nodes.
Summary:
In order to support big endian targets for the BuildPairF64 nodes we
just need to swap the low/high pair registers. Additionally, for the
ExtractElementF64 nodes we have to calculate the correct stack offset
with respect to the node's register/operand that we want to extract.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219931 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 15:41:51 +00:00
Vasileios Kalintiris
02065a65cd [mips] Marked the DI/EI instruction aliases as MIPS32r2
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 15:23:52 +00:00
Vasileios Kalintiris
c3ab7837e8 Test commit access: remove extra new line at the end of file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219925 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 14:37:00 +00:00
Akira Hatanaka
4eb03123df Reapply r219832 - InstCombine: Narrow switch instructions using known bits.
The code committed in r219832 asserted when it attempted to shrink a switch
statement whose type was larger than 64-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 06:00:46 +00:00