Commit Graph

15245 Commits

Author SHA1 Message Date
Dan Gohman
98efbf1e86 [WebAssembly] Remove .import printing.
For now, LLVM doesn't know about wasm module imports, so it shouldn't
emit .import directives.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255602 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 02:20:44 +00:00
JF Bastien
5e556cda42 WebAssembly: test global array indexing
This case was tested in the linker from code, but not from globals indexing into other globals. The linker currently barfs on this, ncbray volunteered to fix it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255601 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 02:02:51 +00:00
Dan Gohman
ae06f491f7 [WebAssembly] Add type prefixes to call instructions
Add return type information to call and call_indirect instructions. This
allows them to be disambiguated without knowledge of the callee.

Differential Revision: http://reviews.llvm.org/D15484


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255565 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 22:56:51 +00:00
Dan Gohman
4fe3f079fb [WebAssembly] Implement a new algorithm for placing BLOCK markers
Implement a new BLOCK scope placement algorithm which better handles
early-return blocks and early exists from nested scopes.

Differential Revision: http://reviews.llvm.org/D15368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255564 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 22:51:54 +00:00
Chih-Hung Hsieh
3a1999a311 [X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
  Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
  When target has MMX registers, configure MVT::f128 in FR128RegClass,
  with TypeSoftenFloat action, and custom actions for some opcodes.
  Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
  Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
  Fix infinite loop when f128 type can have
  VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.

Differential Revision: http://reviews.llvm.org/D11438



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255558 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 22:08:36 +00:00
David Majnemer
868145efb0 [IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function.  This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.

Depends on D15478.

Differential Revision: http://reviews.llvm.org/D15479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 18:34:23 +00:00
Paul Robinson
16cba6923a FastISel needs to remove dead code when it bails out.
When FastISel fails to translate an instruction it hands off code
generation to SelectionDAG. Before it does so, it may have generated
local value instructions to feed phi nodes in successor blocks. These
instructions will then be generated again by SelectionDAG, causing
duplication and less efficient code, including extra spill
instructions.

Patch by Wolfgang Pieb!

Differential Revision: http://reviews.llvm.org/D11768


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255520 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 18:33:18 +00:00
Petar Jovanovic
db0b22110a [Power PC] llvm soft float support for ppc32
This is the second in a set of patches for soft float support for ppc32,
it enables soft float operations.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D13700


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255516 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 17:57:33 +00:00
Matt Arsenault
1451e94ee0 AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255512 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 17:25:38 +00:00
Matt Arsenault
4f0027139e AMDGPU: Fix splitting vector loads with existing offsets
If the original MMO had an offset, it was dropped.
Also use the correct alignment after adding the new offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255508 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 16:59:40 +00:00
Saleem Abdulrasool
d11fee0082 ARM: only emit EABI attributes on EABI targets
EABI attributes should only be emitted on EABI targets.  This prevents the
emission of the optimization goals EABI attribute on Windows ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255448 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-13 05:27:45 +00:00
Simon Pilgrim
3f006c6658 [X86][AVX512] Added support for VMOVQ shuffle comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255442 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 21:46:23 +00:00
Manuel Jacob
f3296cdf51 Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0.
Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256.  This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view.  On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.

This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls.  If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.

The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86).  NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction.  All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator).  X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.

I call this a "partial fix" because there are still cases that can't be
lowered.  A fatal error is reported in these cases.

Reviewers: arsenm, theraven, compnerd, hfinkel

Subscribers: hfinkel, llvm-commits, alex

Differential Revision: http://reviews.llvm.org/D7241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255441 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 21:33:31 +00:00
Simon Pilgrim
d7f7b7e355 [X86][AVX] Tests tidyup
Cleanup/regenerate some tests for some upcoming patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 12:52:52 +00:00
David Majnemer
8cec2f2816 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255422 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 05:38:55 +00:00
Chen Li
f99bcae76f [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.

Reviewers: craig.topper, RKSimon

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 01:04:15 +00:00
Hal Finkel
9bc6874751 Fix test/CodeGen/PowerPC/ppc-shrink-wrapping.ll after r255398
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255414 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 00:42:05 +00:00
Hal Finkel
8f2bcca5c7 [PowerPC] Add Branch Hints for Highly-Biased Branches
This branch adds hints for highly biased branches on the PPC architecture. Even
in absence of profiling information, LLVM will mark code reaching unreachable
terminators and other exceptional control flow constructs as highly unlikely to
be reached.

Patch by Tom Jablin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 00:32:00 +00:00
Chen Li
3d25c881e3 Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
because it broke buildbot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255395 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 00:08:37 +00:00
Derek Schuff
e8d6789f06 [WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination
Summary:
Use the SP32 physical register as the base for FrameIndex
lowering. Update it and the __stack_pointer global var in the prolog and
epilog. Extend the mapping of virtual registers to wasm locals to
include the physical registers.

Rather than modify the target-independent PrologEpilogInserter (which
asserts that there are no virtual registers left) include a
slightly-modified copy for Wasm that does not have this assertion and
only clears the virtual registers if scavenging was needed (which of
course it isn't for wasm).

Differential Revision: http://reviews.llvm.org/D15344

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255392 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 23:49:46 +00:00
Chen Li
3c55c7bc30 [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.

Reviewers: craig.topper, RKSimon

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255391 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 23:39:32 +00:00
Matt Arsenault
be6eaee35a SelectionDAG: Match min/max if the scalar operation is legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255388 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 23:16:47 +00:00
Hal Finkel
15c5be1ee5 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 23:11:52 +00:00
Matthias Braun
f43272c76e CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.

This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).

Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.

This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033

Differential Revision: http://reviews.llvm.org/D15320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255362 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 19:42:09 +00:00
Kyle Butt
9a3d083ea5 [PPC]: Peephole optimize small accesss to aligned globals.
Access to aligned globals gives us a chance to peephole optimize nonzero
offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't
overflow the available displacement. For example:
        addis 3, 2, b4v@toc@ha
        addi 4, 3, b4v@toc@l
        lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
        lbz 6, 1(4)         ; optimizer
        lbz 7, 2(4)
        lbz 8, 3(4)
If b4v is 4-byte aligned, we can skip using register 4 because we know
that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
        addis 3, 2, b4v@toc@ha
        lbz 4, b4v@toc@l(3)
        lbz 5, b4v@toc@l+1(3)
        lbz 6, b4v@toc@l+2(3)
        lbz 7, b4v@toc@l+3(3)
Saving a register and an addition.
Larger alignments allow larger structures/arrays to be optimized.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255319 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 00:47:36 +00:00
Eric Christopher
54f1a90d45 Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,
x)) combines for ppc_fp128, since signbit computation is more
complicated.

Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html

Patch by Tim Shen!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 22:09:06 +00:00
Kyle Butt
e7ad314f80 PPC: Teach FMA mutate to respect register classes.
This was causing bad code gen and assembly that won't assemble, as
mixed altivec and vsx code would end up with a vsx high register
assigned to an altivec instruction, which won't work. Constraining the
classes allows the optimization to proceed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255299 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 21:28:40 +00:00
Simon Pilgrim
45d4194e91 [DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension
PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255289 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 19:47:06 +00:00
Pirama Arumuga Nainar
7845ed025c Fix fptosi, fptoui from f16 vectors to i8, i16 vectors
Summary:
Convert f16 vectors to corresponding f32 vectors before doing the
conversion to int.

Add tests for v4f16, v8f16.

Reviewers: ab, jmolloy

Subscribers: llvm-commits, srhines

Differential Revision: http://reviews.llvm.org/D14936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 17:16:49 +00:00
Dan Gohman
0abf891ab8 [WebAssembly] Tighten up several CHECK tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255255 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 14:52:34 +00:00
Nemanja Ivanovic
1c23594d2c Bitcasts between FP and INT values using direct moves
This patch corresponds to review:
http://reviews.llvm.org/D15286

LLVM IR frequently contains bitcast operations between floating point and
integer values of the same width. Doing this through memory operations is
quite expensive on PPC. This patch allows the use of direct register moves
between FPRs and GPRs for lowering bitcasts.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255246 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 13:35:28 +00:00
Dan Gohman
54fd4d4360 [WebAssembly] Implement mixed-type ISD::FCOPYSIGN.
ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner
uses this. Add some def : Pat rules to expand this out into an explicit
conversion and a normal copysign operation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255220 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 04:55:31 +00:00
Dan Gohman
cc39a6efb8 [WebAssembly] Implement fma.
It is lowered to a libcall for now, but this is expected to change in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255219 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 04:52:33 +00:00
Tom Stellard
7d2a810fef AMDGPU/SI: Emit constant arrays in the .text section
Summary:
This allows us to remove the END_OF_TEXT_LABEL hack we had been using
and simplifies the fixups used to compute the address of constant
arrays.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255204 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 02:13:01 +00:00
Tom Stellard
6052acda66 AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints
Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs.

Reviewers: arsenm, echristo

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255203 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 02:12:53 +00:00
Dan Gohman
b2d324bc23 [WebAssembly] Fix legalization of f32->f64 EXTLOAD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255202 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 02:07:53 +00:00
Dan Gohman
3384127652 [WebAssembly] Also legalize sign_extend_inreg of i32->i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255191 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 01:00:19 +00:00
Dan Gohman
029e84caf4 PeepholeOptimizer: Ignore dead implicit defs
Target-specific instructions may have uninteresting physreg clobbers,
for target-specific reasons. The peephole pass doesn't need to concern
itself with such defs, as long as they're implicit and marked as dead.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255182 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 00:37:51 +00:00
Dan Gohman
a796bb8e0a [WebAssembly] Fix legalization of shift operators with illegal types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255181 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 00:26:26 +00:00
Dan Gohman
492f1085a4 [WebAssembly] Implement anyext.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255179 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 00:17:35 +00:00
Quentin Colombet
d110e2b4a3 [X86] Enable shrink-wrapping by default, but keep it disabled for stack frames
without a frame pointer when unwind may happen.
This is a workaround for a bug in the way we emit the CFI directives for
frameless unwind information. See PR25614.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255175 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 23:08:18 +00:00
Dan Gohman
4c8fe28374 [WebAssembly] Reintroduce ARGUMENT moving logic
Reinteroduce the code for moving ARGUMENTS back to the top of the basic block.
While the ARGUMENTS physical register prevents sinking and scheduling from
moving them, it does not appear to be sufficient to prevent SelectionDAG from
moving them down in the initial schedule. This patch introduces a patch that
moves them back to the top immediately after SelectionDAG runs.

This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is
one alternative, though the review has not been favorable, and proposed
alternatives are longer-term and have other downsides.

This fixes the main outstanding -verify-machineinstrs failures, so it adds
-verify-machineinstrs to several tests.

Differential Revision: http://reviews.llvm.org/D15377


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255125 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 16:23:59 +00:00
Tim Northover
6d8e50b6e2 ARM: don't use a deleted node as the BaseReg in complex pattern.
We mutated the DAG, which invalidated the node we were trying to use
as a base register. Sometimes we got away with it, but other times the
node really did get deleted before it was finished with.

Should fix PR25733

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 15:54:50 +00:00
Robert Lougher
c8a1727001 Fix cycle in selection DAG introduced by extractelement legalization
During selection DAG legalization, extractelement is replaced with a load
instruction.  To do this, a temporary store to the stack is used unless an
existing store is found that can be re-used.
    
If re-using a store, the chain going out of the store must be replaced by
the one going out of the new load (this ensures that any stores that must
take place after the store happens after the load, else the value might
be overwritten before it is loaded).
    
The problem is, if the extractelement index is dependent on the store
replacing the chain will introduce a cycle in the selection DAG (the load
uses the index, and by replacing the chain we will make the index dependent
on the load).
    
To fix this, if the index is dependent on the store, the store is skipped.
This is conservative as we may end up creating an unnecessary extra store
to the stack.  However, the situation is not expected to occur very often.

Differential Revision: http://reviews.llvm.org/D15330


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255114 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 14:34:10 +00:00
Ahmed Bougacha
65422bf239 [AArch64][ARM] Don't base interleaved op legality on type alloc size.
Otherwise, we think that most types that look like they'd fit in a
legal vector type are legal (so, basically, *any* vector type with a
size between 33 and 128 bits, I think, since we use pow2 alignment;
e.g., v2i25, v3f32, ...).

DataLayout::getTypeAllocSize rounds up based on alignment.
When checking for target intrinsic legality, that's not what we want:
if rounding makes a difference, the type isn't legal, and the
target intrinsics shouldn't be used, as they are always assumed legal.

One could make the argument that alloc size is ultimately the most
relevant here, since we're dealing with LD/ST intrinsics. That's only
true if we did legalize them though; that's a problem for another day.

Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits.
Type::getSizeInBits can't be used because that'd gratuitously break
pointer vector support.

Some of these uses are currently fine, because we only hit them when
the type is already known legal (e.g., r114454). Update them for
consistency. It's faster to avoid the rounding anyway!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255089 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 01:19:50 +00:00
Vyacheslav Klochkov
a23ddb7891 X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes.
Reviewer: Simon Pilgrim.
Differential Revision: http://reviews.llvm.org/D15317


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255080 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 00:12:13 +00:00
Pirama Arumuga Nainar
46994bcf9c Define selection for v4f16, v8f16 scalar_to_vector
Summary:
This fixes failure when trying to select
    insertelement <4 x half> undef, half %a, i64 0
which gets transformed to a scalar_to_vector node.

The accompanying v4 and v8 tests fail instruction selection without this
patch.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255072 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 23:07:06 +00:00
Simon Pilgrim
029bc0f70a [X86][AVX] Fold loads + splats into broadcast instructions
On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector.

This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element.

Fix for PR23022

Differential Revision: http://reviews.llvm.org/D15310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 22:17:11 +00:00
Simon Pilgrim
9d46b32e88 [X86][SSE4A] Added fast-isel intrinsics tests
As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse4a-builtins.c


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255053 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 21:43:41 +00:00
Simon Pilgrim
9a17b124c5 [X86][SSSE3] Added fast-isel intrinsics tests
As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/ssse3-builtins.c


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255052 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-08 21:32:08 +00:00