Commit Graph

73760 Commits

Author SHA1 Message Date
Aaron Watry
beac0c1403 R600/SI: Add global atomicrmw and
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220105 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:54 +00:00
Aaron Watry
892bb7df98 R600/SI: Add global atomicrmw sub
v2: Add separate offset/no-offset tests

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:32:52 +00:00
Evgeniy Stepanov
c83c81a62e [msan] Fix handling of byval arguments with large alignment.
MSan param-tls slots are 8-byte aligned. This change clips
alignment of memcpy into param-tls to 8.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220101 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 23:29:44 +00:00
Pete Cooper
185a992dcf Check for dynamic alloca's when selecting lifetime intrinsics.
TL;DR: Indexing maps with [] creates missing entries.

The long version:

When selecting lifetime intrinsics, we index the *static* alloca map with the AllocaInst we find for that lifetime.  Trouble is, we don't first check to see if this is a dynamic alloca.

On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime.  0 was used for the frame index of the stack protector, which given that it now has a lifetime, is coloured, and merged with other stack slots.

PEI would later trigger an assert because it expects the stack protector to not be dead.

This fix ensures that we only get frame indices for static allocas, ie, those in the map.  Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken.

rdar://problem/18672951

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220099 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 22:59:33 +00:00
Rafael Espindola
ec51f45338 Revert "TRE: make TRE a bit more aggressive"
This reverts commit r219899.

This also updates byval-tail-call.ll to make it clear what was breaking.
Adding r219899 again will cause the load/store to disappear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220093 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:25:48 +00:00
Bill Schmidt
b56bf6b112 [PowerPC] Change assert to better form
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220092 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:19:59 +00:00
Matt Arsenault
24463c7df7 R600/SI: Remove redundant setting of instruction bits
These are all set on the instruction base classes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220091 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:13:11 +00:00
Bill Schmidt
d2dcbd00f7 [PowerPC] Change liveness testing in VSX FMA mutation pass
With VSX enabled, LLVM crashes when compiling
test/CodeGen/PowerPC/fma.ll.  I traced this to the liveness test
that's revised in this patch. The interval test is designed to only
work for virtual registers, but in this case the AddendSrcReg is
physical. Since there is already a walk of the MIs between the
AddendMI and the FMA, I added a check for def/kill of the AddendSrcReg
in that loop.  At Hal Finkel's request, I converted the liveness test
to an assert restricted to virtual registers.

I've changed the fma.ll test to have VSX and non-VSX variants so we
can test both kinds of multiply-adds.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:02:44 +00:00
Matt Arsenault
b6591042cd Fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220068 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:02:31 +00:00
Matt Arsenault
0e974f694b R600/SI: Also check for FPImm literal constants
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220067 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:50 +00:00
Matt Arsenault
7d8f1710a3 R600/SI: Allow commuting with source modifiers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:48 +00:00
Matt Arsenault
b4fe2b433e R600/SI: Simplify code with hasModifiersSet
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220065 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:45 +00:00
Matt Arsenault
415789c57e R600/SI: Fix general commuting breaking src mods
The generic code trying to use findCommutedOpIndices won't
understand that it needs to swap the modifier operands also,
so it should fail if they are set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220064 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:43 +00:00
Matt Arsenault
bf5be3f989 R600/SI: Cleanup code with ChangeToFPImmediate
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220063 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:41 +00:00
Matt Arsenault
84895bd2e6 R600/SI: Allow comuting fp immediates
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220062 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:39 +00:00
Matt Arsenault
7eeaefa0c8 R600/SI: Use early return instead of checking condition twice
Any commutable instruction will have at least src1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 18:00:37 +00:00
Matt Arsenault
aa796b99bb R600/SI: Use complex pattern for MUBUF load patterns.
This eliminates a use of the SI_ADDR64_RSRC pseudo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220057 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:43:00 +00:00
Matt Arsenault
46b53c9e4b R600/SI: Remove SI_BUFFER_RSRC pseudo
Just use REG_SEQUENCE directly, so there are fewer
instructions to need to deal with later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220056 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:42:56 +00:00
Juergen Ributzka
32ef68718d [Stackmaps] Enable invoking the patchpoint intrinsic.
Patch by Kevin Modzelewski
Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits, reames

Differential Revision: http://reviews.llvm.org/D5634

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220055 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:39:00 +00:00
Andrea Di Biagio
5512b50db5 [X86] Fix missed selection of non-temporal store of zero vector.
When the input to a store instruction was a zero vector, the backend
always selected a normal vector store regardless of the non-temporal
hint. This is fixed by this patch.

This fixes PR19370.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:27:06 +00:00
James Molloy
7023b85187 [AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.
We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.

While we're at it, avoid writing to std::vector::end()...

Bug found with random testing and a lot of coffee.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220051 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:06:31 +00:00
Bill Schmidt
b76f5ba103 [PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation
Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64
types, but does not yet use lxvw4x and stxvw4x for 4x32 types.  This
patch adds that support.

As with lxvd2x/stxvd2x, this involves straightforward overriding of
the patterns normally recognized for lvx/stvx, with preference given
to the VSX patterns when VSX is enabled.

In addition, the logic for permitting misaligned memory accesses is
modified so that v4r32 and v4i32 are treated the same as v2f64 and
v2i64 when VSX is enabled.  Finally, the DAG generation for unaligned
loads is changed to just use a normal LOAD (which will become lxvw4x)
on P8 and later hardware, where unaligned loads are preferred over
lvsl/lvx/lvx/vperm.

A number of tests now generate the VSX loads/stores instead of
lvx/stvx, so this patch adds VSX variants to those tests.  I've also
added <4 x float> tests to the vsx.ll test case, and created a
vsx-p8.ll test case to be used for testing code generation for the
P8Vector feature.  For now, that simply tests the unaligned load/store
behavior.

This has been tested along with a temporary patch to enable the VSX
and P8Vector features, with no new regressions encountered with or
without the temporary patch applied.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220047 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 15:13:38 +00:00
Jan Vesely
7eee3b07b5 Mips: Only set divrem i64 to custom on 64bit
Reviewed-by: Daniel Sanders <daniel.sanders@imgtec.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:45:28 +00:00
Jan Vesely
cef793e8c7 SelectionDAG: Add sext_inreg optimizations
v2: use dyn_cast
    fixup comments
v3: use cast

Reviewed-by: Matt Arsenault <arsenm2@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:45:25 +00:00
Vasileios Kalintiris
eaf8f5efe9 [mips] Add support for COP1's Branch-On-Cond-Likely instructions
Summary: Depends on D5782

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 14:08:28 +00:00
Vasileios Kalintiris
0f22fe9b56 [mips] Add support for COP0's Branch-On-Cond-Likely instructions
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220036 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 12:38:35 +00:00
Hal Finkel
9d85eff56a [DSE] Remove no-data-layout-only type-based overlap checking
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.

Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 11:56:00 +00:00
Rafael Espindola
d8ee23f34c Add back commits r219835 and a fixed version of r219829.
The only difference from r219829 is using

getOrCreateSectionSymbol(*ELFSec)

instead of

GetOrCreateSymbol(ELFSec->getSectionName())

in ELFObjectWriter which causes us to use the correct section symbol even if
we have multiple sections with the same name.

Original messages:

r219829:
Correctly handle references to section symbols.

When processing assembly like

.long .text

we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.

This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.

The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.

r219835:
Allow forward references to section symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220021 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:48:58 +00:00
Akira Hatanaka
1cdebe50c1 ARM: Fix a bug which was causing convergence failure in constant-island pass.
The bug is in ARMConstantIslands::createNewWater where the upper bound of the
new water split point is computed:

// This could point off the end of the block if we've already got constant
// pool entries following this block; only the last one is in the water list.
// Back past any possible branches (allow for a conditional and a maximally
// long unconditional).
if (BaseInsertOffset + 8 >= UserBBI.postOffset()) {
  BaseInsertOffset = UserBBI.postOffset() - UPad - 8;
  DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset));
}

The split point is supposed to be somewhere between the machine instruction that
loads from the constant pool entry and the end of the basic block, before branch
instructions. The code above is fine if the basic block is large enough and
there are a sufficient number of instructions following the machine instruction.
However, if the machine instruction is near the end of the basic block,
BaseInsertOffset can point to the machine instruction or another instruction
that precedes it, and this can lead to convergence failure.

This commit fixes this bug by ensuring BaseInsertOffset is larger than the
offset of the instruction following the constant-loading instruction.

rdar://problem/18581150


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220015 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:31:47 +00:00
Rafael Espindola
70a1be3f76 Revert commit r219835 and r219829.
Revert "Correctly handle references to section symbols."
Revert "Allow forward references to section symbols."

Rui found a regression I am debugging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:06:02 +00:00
Peter Zotov
cb76f395d7 [LLVM-C] Add LLVMInstructionClone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220007 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 01:02:34 +00:00
Matt Arsenault
a383742439 R600/SI: Simplify debug printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 00:36:20 +00:00
Matt Arsenault
61bf4bf2c3 R600/SI: Remove another VALU pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219988 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 23:33:37 +00:00
Peter Collingbourne
86b3d8eb43 Introduce LLVMParseCommandLineOptions C API function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219975 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 22:47:52 +00:00
Juergen Ributzka
933d703d7c Reduce code duplication between patchpoint and non-patchpoint lowering. NFC.
This is in preparation for another patch that makes patchpoints invokable.

Reviewers: atrick, ributzka
Reviewed By: ributzka
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219967 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:26:35 +00:00
Chandler Carruth
68ca48cd90 [SROA] Switch the common variable name for the 'AllocaSlices' class to
'AS'.

Using 'S' as this was a terrible idea. Arguably, 'AS' is not much
better, but it at least follows the idea of using initialisms and
removes active confusion about the AllocaSlices variable and a Slice
variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219963 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:11:55 +00:00
Chandler Carruth
c62c42b1e4 [SROA] More range-based cleanups to SROA, these brought to you by
clang-modernize.

I did have to clean up the variable types and whitespace a bit because
the use of auto made the code much less readable here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219962 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 21:05:14 +00:00
Chandler Carruth
5269b24da1 [SROA] Switch a couple of overly complex iterator accessors to just be
ArrayRef accessors.

I think this even came up in review that this was over-engineered, and
indeed it was. Time to un-build it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219958 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:42:08 +00:00
Robin Morisset
d310963833 Erase fence insertion from SelectionDAGBuilder.cpp (NFC)
Summary:
Backends can use setInsertFencesForAtomic to signal to the middle-end that
montonic is the only memory ordering they can accept for
stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger
ordering to fences + monotonic accesses is currently living in
SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it
for several reasons:
- There is lots of redundancy to avoid: extremely similar logic already
  exists in AtomicExpand.
- The current code in SelectionDAGBuilder does not use any target-hooks, it
  does the same transformation for every backend that requires it
- As a result it is plain *unsound*, as it was apparently designed for ARM.
  It happens to mostly work for the other targets because they are extremely
  conservative, but Power for example had to switch to AtomicExpand to be
  able to use lwsync safely (see r218331).
- Because it produces IR-level fences, it cannot be made sound ! This is noted
  in the C++11 standard (section 29.3, page 1140):
```
Fences cannot, in general, be used to restore sequential consistency for atomic
operations with weaker ordering semantics.
```
It can also be seen by the following example (called IRIW in the litterature):
```
atomic<int> x = y = 0;
int r1, r2, r3, r4;
Thread 0:
  x.store(1);
Thread 1:
  y.store(1);
Thread 2:
  r1 = x.load();
  r2 = y.load();
Thread 3:
  r3 = y.load();
  r4 = x.load();
```
r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst.
But if they are lowered to monotonic accesses, no amount of fences can prevent it..

This patch does three things (I could cut it into parts, but then some of them
would not be tested/testable, please tell me if you would prefer that):
- it provides a default implementation for emitLeadingFence/emitTrailingFence in
terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder.
As we saw above, this is unsound, but the best that can be done without knowing
the targets well (and there is a comment warning about this risk).
- it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default
implementation (that exactly replicates the logic of SelectionDAGBuilder, so no
functional change)
- it finally erase this logic from SelectionDAGBuilder as it is dead-code.

Ideally, each target would define its own override for emitLeading/TrailingFence
using target-specific fences, but I do not know the Sparc/Mips/XCore memory model
well enough to do this, and they appear to be dealing fine with the ARM-inspired
default expansion for now (probably because they are overly conservative, as
Power was). If anyone wants to compile fences more agressively on these
platforms, the long comment should make it clear why he should first override
emitLeading/TrailingFence.

Test Plan: make check-all, no functional change

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219957 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:34:57 +00:00
Matt Arsenault
ceb4f4907d R600/SI: Remove unnecessary VALU patterns
These haven't been necessary since allowing
selecting SALU instructions in non-entry blocks
was enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219956 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:31:50 +00:00
Chandler Carruth
c2320545bc [SROA] Start more deeply moving SROA to use ranges rather than just
iterators.

There are a ton of places where it essentially wants ranges
rather than just iterators. This is just the first step that adds the
core slice range typedefs and uses them in a couple of places. I still
have to explicitly construct them because they've not been punched
throughout the entire set of code. More range-based cleanups incoming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219955 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:24:07 +00:00
Matt Arsenault
0134a9bed3 R600: Fix nonsensical implementation of computeKnownBits for BFE
This was resulting in invalid simplifications of sdiv

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219953 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:07:40 +00:00
Rafael Espindola
2f8f1d34e3 Delete -std-compile-opts.
These days -std-compile-opts was just a silly alias for -O3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:00:02 +00:00
Bjorn Steinbrink
6eaa62af77 Allow call-slop optzn for destinations with a suitable dereferenceable attribute
Summary:
Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 19:43:08 +00:00
Sanjay Patel
d8214db086 fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().

In the simplest case, this:

   y = sqrt(x * x);

becomes this:

   y = fabs(x);

This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.

Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.

Differential Revision: http://reviews.llvm.org/D5787



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 18:48:17 +00:00
Juergen Ributzka
c40dab2069 [AArch64] Fix miscompile of sdiv-by-power-of-2.
When the constant divisor was larger than 32bits, then the optimized code
generated for the AArch64 backend would emit the wrong code, because the shift
was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would
loose the upper 32bits.

This fixes rdar://problem/18678801.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 16:41:15 +00:00
Vasileios Kalintiris
3b72ec5083 [mips] Account for endianess when expanding BuildPairF64/ExtractElementF64 nodes.
Summary:
In order to support big endian targets for the BuildPairF64 nodes we
just need to swap the low/high pair registers. Additionally, for the
ExtractElementF64 nodes we have to calculate the correct stack offset
with respect to the node's register/operand that we want to extract.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219931 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 15:41:51 +00:00
Vasileios Kalintiris
02065a65cd [mips] Marked the DI/EI instruction aliases as MIPS32r2
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 15:23:52 +00:00
Vasileios Kalintiris
c3ab7837e8 Test commit access: remove extra new line at the end of file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219925 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 14:37:00 +00:00
Akira Hatanaka
4eb03123df Reapply r219832 - InstCombine: Narrow switch instructions using known bits.
The code committed in r219832 asserted when it attempted to shrink a switch
statement whose type was larger than 64-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 06:00:46 +00:00
Saleem Abdulrasool
ebe6584c32 TRE: make TRE a bit more aggressive
Make tail recursion elimination a bit more aggressive.  This allows us to get
tail recursion on functions that are just branches to a different function.  The
fact that the function takes a byval argument does not restrict it from being
optimised into just a tail call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 03:27:30 +00:00
Akira Hatanaka
608d59f535 Revert r219832.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 01:17:02 +00:00
Hal Finkel
61c65b2884 [LVI] Add some additional comments about caching and context instructions
Philip Reames and I had a long conversation about this, mostly because it is
not obvious why the current logic is correct. Hopefully, these comments will
prevent such confusion in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219882 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 00:40:05 +00:00
Matt Arsenault
231a8d6cfb R600: Remove dead function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 00:08:09 +00:00
Sanjoy Das
a0b0184b33 Revert "r219834 - Teach ScalarEvolution to sharpen range information"
This change breaks the asan buildbots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:46:04 +00:00
Hal Finkel
43141a0764 Preserve non-byval pointer alignment attributes using @llvm.assume when inlining
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219876 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:44:41 +00:00
Hal Finkel
76ce614af7 Add CreateAlignmentAssumption to IRBuilder
Clang CodeGen had a utility function for creating pointer alignment assumptions
using the @llvm.assume intrinsic. This functionality will also be needed by the
inliner (to preserve function-argument alignment attributes when inlining), so
this moves the utility function into IRBuilder where it can be used both by
Clang CodeGen and also other LLVM-level code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219875 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:44:22 +00:00
Adam Nemet
fb9d61a8d6 [AVX512] Add DQ subvector inserts
In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4
respectively.  These are matched by "Alt" Pat<>'s (Alt stands for alternative
VTs).

Since DQ has native support for these intructions, I peeled off the non-"Alt"
part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are
derived from this multiclass.  The "Alt" Pat<>'s are disabled with DQ.

Fixes <rdar://problem/18426089>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:42:17 +00:00
Adam Nemet
ccebe7258e [AVX512] Two new attributes in X86VectorVTInfo for subvector insert
The new attributes are NumElts and the CD8TupleForm.  This prepares the code
to enable x8 and x2 inserts.

NFC, no change in X86.td.expanded except for the new attributes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:42:09 +00:00
Adam Nemet
80b9e006aa [AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size
It's the W bit that selects between 32 or 64 elt type and not the opcode.  The
opcode selects between the width of the insert (128 or 256).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:42:04 +00:00
Matt Arsenault
37e484b5ef R600: Remove unnecessary part of computeKnownBitsForTargetNode
Zero-width BFEs are combined away already, so there's no point in
handling them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:37:49 +00:00
Matt Arsenault
bb402de0c9 Move variable down to use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:37:42 +00:00
Alexander Potapenko
4976a53fb7 Add MachOObjectFile::getUuid()
This CL introduces MachOObjectFile::getUuid(). This function returns an ArrayRef to the object file's UUID, or an empty ArrayRef if the object file doesn't contain an LC_UUID load command.
The new function is gonna be used by llvm-symbolizer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219866 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:35:45 +00:00
Chris Bieneman
aaf36b40cc Fixing the build failure due to compiler warnings and unnecessary disambiguation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219861 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:11:35 +00:00
Chris Bieneman
c14fb89680 Defining a new API for debug options that doesn't rely on static global cl::opts.
Summary:
This is based on the discussions from the LLVMDev thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075886.html

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5389

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219854 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 21:54:35 +00:00
Tom Stellard
d3fc10a525 R600/SI: Fix bug where immediates were being used in DS addr operands
The SelectDS1Addr1Offset complex pattern always tries to store constant
lds pointers in the offset operand and store a zero value in the addr operand.
Since the addr operand does not accept immediates, the zero value
needs to first be copied to a register.

This newly created zero value will not go through normal instruction
selection, so we need to manually insert a V_MOV_B32_e32 in the complex
pattern.

This bug was hidden by the fact that if there was another zero value
in the DAG that had not been selected yet, then the CSE done by the DAG
would use the unselected node for the addr operand rather than the one
that was just created.  This would lead to the zero value being selected
and the DAG automatically inserting a V_MOV_B32_e32 instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219848 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 21:08:59 +00:00
Eric Christopher
55751ba9e2 Avoid caching the MachineFunction, we don't use it outside of
runOnMachineFunction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 21:06:25 +00:00
Sid Manning
0a72d8bb20 Wrong attribute. LLVM_ATTRIBUTE_UNUSED not LLVM_ATTRIBUTE_USED
This original fix for the build break was correct.  LLVM_ATTRIBUTE_USED
removes the warning message because it keeps the function in the object
file.  LLVM_ATTRIBUTE_UNUSED indicates that it may or may not be used
depending on build settings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 20:41:17 +00:00
Duncan P. N. Exon Smith
9ca230f11c IR: Move NumOperands from User to Value, NFC
Store `User::NumOperands` (and `MDNode::NumOperands`) in `Value`.

On 64-bit host architectures, this reduces `sizeof(User)` and all
subclasses by 8, and has no effect on `sizeof(Value)` (or, incidentally,
on `sizeof(MDNode)`).

On 32-bit host architectures, this increases `sizeof(Value)` by 4.
However, it has no effect on `sizeof(User)` and `sizeof(MDNode)`, so the
only concrete subclasses of `Value` that actually see the increase are
`BasicBlock`, `Argument`, `InlineAsm`, and `MDString`.  Moreover, I'll
be shocked and confused if this causes a tangible memory regression.

This has no functionality change (other than memory footprint).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 20:39:05 +00:00
Duncan P. N. Exon Smith
40dd9d68d7 IR: Cleanup comments for Value, User, and MDNode
A follow-up commit will modify the memory-layout of `Value`, `User`, and
`MDNode`.  First fix the comments to be doxygen-friendly (and to follow
the coding standards).

  - Use "\brief" instead of "repeatedName -".
  - Add a brief intro where it was missing.
  - Remove duplicated comments from source files (and a couple of
    noisy/trivial comments altogether).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 20:28:31 +00:00
Sid Manning
a169d59437 Wrong attribute. LLVM_ATTRIBUTE_USED not LLVM_ATTRIBUTE_UNUSED
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:32:52 +00:00
Rafael Espindola
fc6e0f6f87 Allow forward references to section symbols.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219835 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:30:18 +00:00
Sanjoy Das
40edbf130e Teach ScalarEvolution to sharpen range information.
If x is known to have the range [a, b) in a loop predicated by (icmp
ne x, a), its range can be sharpened to [a + 1, b).  Get
ScalarEvolution and hence IndVars to exploit this fact.
    
This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.

phabricator: http://reviews.llvm.org/D5639



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219834 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:25:28 +00:00
Sid Manning
f0f7ec31d4 Add LLVM_ATTRIBUTE_UNUSED to function currently just used in an assert
Fixes break when -Wunused-function is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219833 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:24:14 +00:00
Akira Hatanaka
38537634e2 InstCombine: Narrow switch instructions using known bits.
Truncate the operands of a switch instruction to a narrower type if the upper
bits are known to be all ones or zeros.

rdar://problem/17720004


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:05:50 +00:00
Juergen Ributzka
7440a83e60 Reapply "[FastISel][AArch64] Add custom lowering for GEPs."
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficientily.

The original commit had a bug in the add emit logic, which has been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219831 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:07 +00:00
Juergen Ributzka
32c5fde3f1 [FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.
Simplify add with immediate emission by factoring it out into a helper function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:02 +00:00
Rafael Espindola
ad04f5db82 Correctly handle references to section symbols.
When processing assembly like

.long .text

we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.

This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.

The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:55:30 +00:00
Sid Manning
1338612c55 Enable the instruction printer in HexagonMCTargetDesc
This adds the MCInstPrinter to the LLVMHexagonDesc library and removes
the dependency LLVMHexagonAsmPrinter had on LLVMHexagonDesc. This is
a prerequisite needed by the disassembler.

Phabricator Revision: http://reviews.llvm.org/D5734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:27:40 +00:00
Matt Arsenault
8b3a9205b7 R600/SI: Also try to use 0 base for misaligned 8-byte DS loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:06:43 +00:00
Matt Arsenault
7fdd553b66 R600: Fix miscompiles when BFE has multiple uses
SimplifyDemandedBits would break the other uses of the operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219819 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:58:34 +00:00
Sanjay Patel
ddcfe81459 correct const-ness with auto and dyn_cast
1. Use const with autos.
2. Don't bother with explicit const in cast ops because they do it automagically.

Thanks, David B. / Aaron B. / Reid K.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:45:13 +00:00
Hal Finkel
6c15862fd3 [SLPVectorize] Basic ephemeral-value awareness
The SLP vectorizer should not vectorize ephemeral values. These are used to
express information to the optimizer, and vectorizing them does not lead to
faster code (because the ephemeral values are dropped prior to code generation,
vectorized or not), and obscures the information the instructions are
attempting to communicate (the logic that interprets the arguments to
@llvm.assume generically does not understand vectorized conditions).

Also, uses by ephemeral values are free (because they, and the necessary
extractelement instructions, will be dropped prior to code generation).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:35:01 +00:00
Hal Finkel
9819bcf7f1 Treat the WorkSet used to find ephemeral values as double-ended
We need to make sure that we visit all operands of an instruction before moving
deeper in the operand graph. We had been pushing operands onto the back of the work
set, and popping them off the back as well, meaning that we might visit an
instruction before visiting all of its uses that sit in between it and the call
to @llvm.assume.

To provide an explicit example, given the following:
  %q0 = extractelement <4 x float> %rd, i32 0
  %q1 = extractelement <4 x float> %rd, i32 1
  %q2 = extractelement <4 x float> %rd, i32 2
  %q3 = extractelement <4 x float> %rd, i32 3
  %q4 = fadd float %q0, %q1
  %q5 = fadd float %q2, %q3
  %q6 = fadd float %q4, %q5
  %qi = fcmp olt float %q6, %q5
  call void @llvm.assume(i1 %qi)

%q5 is used by both %qi and %q6. When we visit %qi, it will be marked as
ephemeral, and we'll queue %q6 and %q5. %q6 will be marked as ephemeral and
we'll queue %q4 and %q5. Under the old system, we'd then visit %q4, which
would become ephemeral, %q1 and then %q0, which would become ephemeral as
well, and now we have a problem. We'd visit %rd, but it would not be marked as
ephemeral because we've not yet visited %q2 and %q3 (because we've not yet
visited %q5).

This will be covered by a test case in a follow-up commit that enables
ephemeral-value awareness in the SLP vectorizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:34:48 +00:00
Derek Schuff
279b5504a3 [MC] Make bundle alignment mode setting idempotent and support nested bundles
Summary:
Currently an error is thrown if bundle alignment mode is set more than once
per module (either via the API or the .bundle_align_mode directive). This
change allows setting it multiple times as long as the alignment doesn't
change.

Also nested bundle_lock groups are currently not allowed. This change allows
them, with the effect that the group stays open until all nests are exited,
and if any of the bundle_lock directives has the align_to_end flag, the
group becomes align_to_end.

These changes make the bundle aligment simpler to use in the compiler, and
also better match the corresponding support in GNU as.

Reviewers: jvoung, eliben

Differential Revision: http://reviews.llvm.org/D5801

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219811 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:10:04 +00:00
Duncan P. N. Exon Smith
ffc65d2bfe DI: Make comments "brief"-er, NFC
Follow-up to r219801.  Post-commit review pointed out that all comments
require a `\brief` description [1], so I converted many and recrafted a
few to be briefer or to include a brief intro.  (If I'm going to clean
them up, I should do it right!)

[1]: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:01:28 +00:00
Sanjay Patel
5af49c83c3 Use 'auto' for easier reading; no functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:21:37 +00:00
Duncan P. N. Exon Smith
03631a8ad5 DI: Cleanup comments, NFC
A number of comment cleanups:

  - Remove duplicated function and class names from comments.

  - Remove duplicated comments from source file (some of which were
    out-of-sync).

  - Move any unduplicated comments from source file to header.

  - Remove some noisy comments entirely (e.g., a comment for
    `DIDescriptor::print()` saying "print descriptor" just gets in the
    way of reading the code).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:15:15 +00:00
Rafael Espindola
90ce9f70e2 Simplify handling of --noexecstack by using getNonexecutableStackSection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:12:52 +00:00
Duncan P. N. Exon Smith
5c2d60d357 DI: Use a DenseMap instead of named metadata, NFC
Remove a strange round-trip through named metadata to assign preserved
local variables to their subprograms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219798 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:11:41 +00:00
Rafael Espindola
b510f8d08c Move getNonexecutableStackSection up to the base ELF class.
The .note.GNU-stack section is not SystemZ/X86 specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219796 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 15:44:16 +00:00
Matt Arsenault
18ed4acf21 R600: Use existing variable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219778 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 05:07:00 +00:00
Matt Arsenault
8a55ca3c41 R600: Remove outdated comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 05:06:57 +00:00
Juergen Ributzka
0081070cfd Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 04:55:48 +00:00
Jingyue Wu
75d77cd179 [MachineSink] Use the real post dominator tree
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in
isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo.
The old heuristics caused instructions to sink unnecessarily, and might create
register pressure.

This is the second try of the fix. The first one (D4814) caused a performance
regression due to failing to sink instructions out of loops (PR21115). This
patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower
one regardless of whether the target block post-dominates the source.

Thanks Alexey Volkov for reporting PR21115! 

Test Plan:
Added a NVPTX codegen test to verify that our change prevents the backend from
over-sinking. It also shows the unnecessary register pressure caused by
over-sinking.

Added an X86 test to verify we can sink instructions out of loops regardless of
the dominance relationship. This test is reduced from Alexey's test in PR21115.

Updated an affected test in X86.

Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime
performance. Results are attached separately in the review thread.

Reviewers: Jiangning, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219773 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 03:27:43 +00:00
Tim Northover
d3458577a9 ARM: drop check for triple that's no longer used.
Early attempts to support AAPCS bare metal MachO targets based the decision on
the CPU being compiled for. This was not a particularly great idea and we've
got a better option now, but this check remained.

No functional change for any target we care about.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 01:05:01 +00:00
Eric Christopher
c6d2db4db1 Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 00:09:07 +00:00
Eric Christopher
2ff93bfec6 No need to cache this unused variable.
Patch by Ehsan Akhgari.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219749 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:58:51 +00:00
Gerolf Hoflehner
cd27f3fb33 [AArch64] Wrong CC access in CSINC-conditional branch sequence
This is a follow up to commit r219742. It removes the CCInMI variable
and accesses the CC in CSCINC directly. In the case of a conditional
branch accessing the CC with CCInMI was wrong.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:55:00 +00:00
Gerolf Hoflehner
2bddd7cf65 [AAarch64] Optimize CSINC-branch sequence
Peephole optimization that generates a single conditional branch
for csinc-branch sequences like in the examples below. This is
possible when the csinc sets or clears a register based on a condition
code and the branch checks that register. Also the condition
code may not be modified between the csinc and the original branch.

Examples:

1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44
   to b.<invCC>

2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44
   to b.<CC>


rdar://problem/18506500



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:07:53 +00:00
Hal Finkel
75277b9f70 [LoopVectorize] Ignore @llvm.assume for cost estimates and legality
A few minor changes to prevent @llvm.assume from interfering with loop
vectorization. First, treat @llvm.assume like the lifetime intrinsics, which
are scalarized (but don't otherwise interfere with the legality checking).
Second, ignore the cost of ephemeral instructions in the loop (these will go
away anyway during CodeGen).

Alignment assumptions and other uses of @llvm.assume can often end up inside of
loops that should be vectorized (this is not uncommon for assumptions generated
by __attribute__((align_value(n))), for example).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219741 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:59:49 +00:00
Simon Pilgrim
84a3feea38 [X86][SSE] pslldq/psrldq shuffle mask decodes
Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions.

Differential Revision: http://reviews.llvm.org/D5598


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219738 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:31:34 +00:00
Tim Northover
7419c9c0c0 ARM: remove ARM/Thumb distinction for preferred alignment.
Thumb1 has legitimate reasons for preferring 32-bit alignment of types
i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be
a multiple of 4. However, this is a trade-off betweem code size and RAM usage;
the DataLayout string is not the best place to represent it even if desired.

So this patch removes the extra Thumb requirements, hopefully making ARM and
Thumb completely compatible in this respect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:12:17 +00:00
Tim Northover
32d728fbb9 ARM: allow misaligned local variables in Thumb1 mode.
There's no hard requirement on LLVM to align local variable to 32-bits, so the
Thumb1 frame handling needs to be able to deal with variables that are only
naturally aligned without falling over.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:12:14 +00:00
Juergen Ributzka
569c5b62af [FastISel][AArch64] Add custom lowering for GEPs.
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficientily.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 21:41:23 +00:00
Hans Wennborg
76806748d4 [x86 asm] allow fwait alias in both At&t and Intel modes (PR21208)
Differential Revision: http://reviews.llvm.org/D5741

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 21:41:17 +00:00
Tim Northover
eddeac0b8c ARM: set preferred aggregate alignment to 32 universally.
Before, ARM and Thumb mode code had different preferred alignments, which could
lead to some rather unexpected results. There's justification for reducing it
from the default 64-bits (wasted space), but I don't think there is for going
below 32-bits.

There's no actual ABI change here, just to reassure people.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:57:26 +00:00
Hal Finkel
2a77e6bdd1 [CFL-AA] CFL-AA should not assert on an va_arg instruction
The CFL-AA implementation was missing a visit* routine for va_arg instructions,
causing it to assert when run on a function that had one. For now, handle these
in a conservative way.

Fixes PR20954.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219718 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:51:26 +00:00
Sanjay Patel
3f349b2ba8 Optimize away fabs() calls when input is squared (known positive).
Eliminate library calls and intrinsic calls to fabs when the input 
is a squared value.

Note that no unsafe-math / fast-math assumptions are needed for
this optimization.

Differential Revision: http://reviews.llvm.org/D5777



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:43:11 +00:00
Juergen Ributzka
40017084f7 [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.
Sign-/zero-extend folding depended on the load and the integer extend to be
both selected by FastISel. This cannot always be garantueed and SelectionDAG
might interfer. This commit adds additonal checks to load and integer extend
lowering to catch this.

Related to rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:36:02 +00:00
David Majnemer
505187a9bd InstCombine: Don't miscompile X % ((Pow2 << A) >>u B)
We assumed that A must be greater than B because the right hand side of
a remainder operator must be nonzero.

However, it is possible for A to be less than B if Pow2 is a power of
two greater than 1.

Take for example:
i32 %A = 0
i32 %B = 31
i32 Pow2 = 2147483648

((Pow2 << 0) >>u 31) is non-zero but A is less than B.

This fixes PR21274.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219713 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:28:40 +00:00
Jan Vesely
d6315ea5a5 Reapply "R600: Add new intrinsic to read work dimensions"
This effectively reverts revert 219707. After fixing the test to work with
new function name format and renamed intrinsic.

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:05:26 +00:00
Hal Finkel
f0f98417ca Revert "r216914 - Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'"
Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted
because of buildbot failures, and credit goes to Ulrich Weigand for isolating
the underlying issue (which can be confirmed by Valgrind, which does helpfully
light up like the fourth of July). Uli explained the problem with the original
patch as:

  It seems the problem is calling multiplySignificand with an addend of category
  fcZero; that is not expected by this routine.  Note that for fcZero, the
  significand parts are simply uninitialized, but the code in (or rather, called
  from) multiplySignificand will unconditionally access them -- in effect using
  uninitialized contents.

This version avoids using a category == fcZero addend within
multiplySignificand, which avoids this problem (the Valgrind output is also now
clean).

Original commit message:

[APFloat] Fixed a bug in method 'fusedMultiplyAdd'.

When folding a fused multiply-add builtin call, make sure that we propagate the
correct result in the case where the addend is zero, and the two other operands
are finite non-zero.

Example:
  define double @test() {
    %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0)
    ret double %1
  }

Before this patch, the instruction simplifier wrongly folded the builtin call
in function @test to constant 'double 7.0'.
With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and
propagates the expected result (i.e. 56.0).

Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra
test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence
of NaN/Inf operands.

This fixes PR20832.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219708 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 19:23:07 +00:00
Rafael Espindola
e8e8db7ff6 Revert "R600: Add new intrinsic to read work dimensions"
This reverts commit r219705.

CodeGen/R600/work-item-intrinsics.ll was failing on linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:58:04 +00:00
Rafael Espindola
d1494d5ff3 Remove unused member variable.
Fixes pr20904.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:53:16 +00:00
Jan Vesely
6a529850f3 R600: Add new intrinsic to read work dimensions
v2: Add SI lowering
    Add test

v3: Place work dimensions after the kernel arguments.
v4: Calculate offset while lowering arguments
v5: rebase
v6: change prefix to AMDGPU

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:52:07 +00:00
Jan Vesely
787e3ca6a4 R600: FMA is VecALU only instruction
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:52:04 +00:00
Reed Kotler
c180402b48 Finish getting Mips fast-isel to match up with AArch64 fast-isel
Summary:
In order to facilitate use of common code, checking by reviewers of other fast-isel ports, and hopefully to eventually move most of Mips and other fast-isel ports into target independent code, I've tried to get the two implementations to line up.

There is no functional code change. Just methods moved in the file to be in the same order as in AArch64.

Test Plan: No functional change.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D5692

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219703 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:27:58 +00:00
David Blaikie
d661fde971 DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.
Let me tell you a tale...

Originally committed in r211723 after discovering a nasty case of weird
scoping due to inlining, this was reverted in r211724 after it fired in
ASan/compiler-rt.

(minor diversion where I accidentally committed/reverted again in
r211871/r211873)

After further testing and fixing bugs in ArgumentPromotion (r211872) and
Inlining (r212065) it was recommitted in r212085. Reverted in r212089
after the sanitizer buildbots still showed problems.

Fixed another bug in ArgumentPromotion (r212128) found by this
assertion.

Recommitted in r212205, reverted in r212226 after it crashed some more
on sanitizer buildbots.

Fix clang some more in r212761.

Recommitted in r212776, reverted in r212793. ASan failures.
Recommitted in r213391, reverted in r213432, trying to reproduce flakey
ASan build failure.

Fixed bugs in r213805 (ArgPromo + DebugInfo), r213952
(LiveDebugVariables strips dbg_value intrinsics in functions not
described by debug info).

Recommitted in r214761, reverted in r214999, flakey failure on Windows
buildbot.

Fixed DeadArgElimination + DebugInfo bug in r219210.

Recommitted in r219215, reverted in r219512, failure on ObjC++ atomic
properties in the test-suite on Darwin.

Fixed ObjC++ atomic properties issue in Clang in r219690.

[This commit is provided 'as is' with no hope that this is the last time
I commit this change either expressed or implied]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219702 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 18:22:52 +00:00
Rafael Espindola
e41812973d Remove method that is identical to the base class one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:38:38 +00:00
Matt Arsenault
704b06ce61 R600/SI: Use DS offsets for constant addresses
Use 0 as the base address for a constant address, so if
we have a constant address we can save moves and form
read2/write2s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:21:19 +00:00
David Blaikie
8730dc16e9 Revert "Fix stuff... again."
Accidental commit.

This reverts commit r219693.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219695 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:13:09 +00:00
David Blaikie
2b0657e28f Revert some parts of r196288 that were confusing and untested.
If we figure out why they should be here, let's add some testing of some
kind so we can better demonstrate why it's needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219694 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:12:02 +00:00
David Blaikie
8a61781323 Fix stuff... again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 17:11:59 +00:00
Hal Finkel
2993617e41 [LVI] Check for @llvm.assume dominating the edge branch
When LazyValueInfo uses @llvm.assume intrinsics to provide edge-value
constraints, we should check for intrinsics that dominate the edge's branch,
not just any potential context instructions. An assumption that dominates the
edge's branch represents a truth on that edge. This is specifically useful, for
example, if multiple predecessors assume a pointer to be nonnull, allowing us
to simplify a later null comparison.

The test case, and an initial patch, were provided by Philip Reames. Thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 16:04:49 +00:00
NAKAMURA Takumi
65e4aa4656 Revert r219638, (r219640 and r219676), "Removing the static destructor from ManagedStatic.cpp by controlling the allocation and de-allocation of the mutex."
It caused hang-up on msc17 builder, probably deadlock.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219687 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 15:58:16 +00:00
Robert Khasanov
ad5d223cb5 [AVX512] Extended avx512_binop_rm to DQ/VL subsets.
Added encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219686 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 15:13:56 +00:00
Robert Khasanov
33a95f24bb [AVX512] Extended avx512_binop_rm to BW/VL subsets.
Added encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219685 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 14:36:19 +00:00
Bradley Smith
5051f6033d [AArch64] Fix crash with empty/pseudo-only blocks in A53 erratum (835769) workaround
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219684 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 14:02:41 +00:00
Eric Christopher
0ba4483d01 Grab the subtarget info off of the MachineFunction rather than
indirecting through the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219674 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 08:44:19 +00:00
Eric Christopher
ff9182749e Use the triple to figure out if this is a darwin target, not
the subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 08:25:26 +00:00
Eric Christopher
ded375f282 Remove unnecessary TargetMachine.h includes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219672 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:22:08 +00:00
Eric Christopher
1dd55ba94e Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219671 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:22:00 +00:00
Eric Christopher
eb271ec9d5 Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219670 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:17:23 +00:00
Eric Christopher
188c856c38 Instead of the TargetMachine cache the MachineFunction
and TargetRegisterInfo in the peephole optimizer. This
makes it easier to grab subtarget dependent variables off
of the MachineFunction rather than the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:17:20 +00:00
Eric Christopher
d879de1ece Access subtarget specific variables off of the MachineFunction's
cached subtarget and not the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219668 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 07:00:33 +00:00
Eric Christopher
cd694819ad Access the subtarget off of the MachineFunction via the DAG
scheduler or via the SelectionDAG if available. Otherwise
grab the subtarget off of the MachineFunction by going up
the parent chain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219666 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:56:25 +00:00
Hao Liu
75ad488c41 [AArch64]Select wide immediate offset into [Base+XReg] addressing mode
e.g Currently we'll generate following instructions if the immediate is too wide:
    MOV  X0, WideImmediate
    ADD  X1, BaseReg, X0
    LDR  X2, [X1, 0]

    Using [Base+XReg] addressing mode can save one ADD as following:
    MOV  X0, WideImmediate
    LDR  X2, [BaseReg, X0]

    Differential Revision: http://reviews.llvm.org/D5477


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:50:36 +00:00
Eric Christopher
b15974e996 Remove the use and member variable of the TargetMachine from
MachineLICM as we can get the same data off of the MachineFunction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219663 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:26:57 +00:00
Eric Christopher
c026db75e7 Have MachineInstrBundle use the MachineFunction for subtarget
access rather than the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219662 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:26:55 +00:00
Eric Christopher
3788687f31 Access the subtarget off of the MachineFunction rather than
through the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219661 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:26:53 +00:00
Marcello Maggioni
db9fed93fa Switch to select optimization for two-case switches
This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block
and a test to check that this condition is handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:58:26 +00:00
Eric Christopher
9accb10855 Include map into the A15SDOptimizer rather than pick it up
transitively from the DFAPacketizer via TargetInstrInfo.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:13:51 +00:00
Eric Christopher
8ff8c16f58 Remove the TargetMachine from DFAPacketizer since it was only
being used to grab subtarget specific things that we can grab
from the MachineFunction anyhow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219650 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:03:16 +00:00
Sanjay Patel
e0a0018345 fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219645 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 00:33:23 +00:00
Chandler Carruth
ab1f4ef9a2 Add some optional passes around the vectorizer to both better prepare
the IR going into it and to clean up the IR produced by the vectorizers.

Note that these are *off by default* right now while folks collect data
on whether the performance tradeoff is reasonable.

In a build of the 'opt' binary, I see about 2% compile time regression
due to this change on average. This is in my mind essentially the worst
expected case: very little of the opt binary is going to *benefit* from
these extra passes.

I've seen several benchmarks improve in performance my small amounts due
to running these passes, and there are certain (rare) cases where these
passes make a huge difference by either enabling the vectorizer at all
or by hoisting runtime checks out of the outer loop. My primary
motivation is to prevent people from seeing runtime check overhead in
benchmarks where the existing passes and optimizers would be able to
eliminate that.

I've chosen the sequence of passes based on the kinds of things that
seem likely to be relevant for the code at each stage: rotaing loops for
the vectorizer, finding correlated values, loop invariants, and
unswitching opportunities from any runtime checks, and cleaning up
commonalities exposed by the SLP vectorizer.

I'll be pinging existing threads where some of these issues have come up
and will start new threads to get folks to benchmark and collect data on
whether this is the right tradeoff or we should do something else.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 00:31:29 +00:00
Peter Collingbourne
75202eb0c6 Introduce LLVMWriteBitcodeToMemoryBuffer C API function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 00:30:59 +00:00
David Majnemer
af6be11a60 InstCombine: Fix miscompile in X % -Y -> X % Y transform
We assumed that negation operations of the form (0 - %Z) resulted in a
negative number.  This isn't true if %Z was originally negative.
Substituting the negative number into the remainder operation may result
in undefined behavior because the dividend might be INT_MIN.

This fixes PR21256.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219639 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 22:37:51 +00:00
Chris Bieneman
3a143ce2e7 Removing the static destructor from ManagedStatic.cpp by controlling the allocation and de-allocation of the mutex.
This patch adds a new llvm_call_once function which is used by the ManagedStatic implementation to safely initialize a global to avoid static construction and destruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 22:37:25 +00:00
Eric Christopher
5db6cf4884 Migrate another set of getSubtargetImpl away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 21:57:44 +00:00