101 Commits

Author SHA1 Message Date
Charlie Turner
1e9e8c6572 [AArch64] Add UABDL patterns for log2 shuffle.
Summary:
This matches the sum-of-absdiff patterns emitted by the vectoriser using log2 shuffles.

Relies on D14207 to be able to match the `extract_subvector(..., 0)`

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-09 13:10:52 +00:00
Charlie Turner
e6e427c6b3 [AArch64] Handle extract_subvector(..., 0) in ISel.
Summary:
Lowering this pattern early to an `EXTRACT_SUBREG` was making it impossible to match larger patterns in tblgen that use `extract_subvector(..., 0)` as part of the their input pattern.

It seems like there will exist somewhere a better way of specifying this pattern over all relevant register value types, but I didn't manage to find it.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252464 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-09 12:45:11 +00:00
Alexandros Lamprineas
82e78e2e9d [MC layer][AArch64] llvm-mc accepts 4-bit immediate values for
"msr pan, #imm", while only 1-bit immediate values should be valid.
Changed encoding and decoding for msr pstate instructions.

Differential Revision: http://reviews.llvm.org/D13011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249313 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-05 13:42:31 +00:00
Stephen Canon
eb1a6c5669 Don't raise inexact when lowering ceil, floor, round, trunc.
The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions *did* raise inexact, and the llvm lowerings followed that conventions. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior.  This also lets us fold some logic into TD patterns, which is nice.

Differential Revision: http://reviews.llvm.org/D12969


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248266 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:43:17 +00:00
Ahmed Bougacha
d636e64cbc [AArch64] Support selecting STNP.
We could go through the load/store optimizer and match STNP where
we would have matched a nontemporal-annotated STP, but that's not
reliable enough, as an opportunistic optimization.
Insetad, we can guarantee emitting STNP, by matching them at ISel.
Since there are no single-input nontemporal stores, we have to
resort to some high-bits-extracting trickery to generate an STNP
from a plain store.

Also, we need to support another, LDP/STP-specific addressing mode,
base + signed scaled 7-bit immediate offset.
For now, only match the base. Let's make it smart separately.

Part of PR24086.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247231 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 01:42:28 +00:00
Ahmed Bougacha
37d12daa3a [AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0.
This matches the ARM behavior. In both cases, the register is part
of the optional Performance Monitors extension, so, add the feature,
and enable it for the A-class processors we support.

Differential Revision: http://reviews.llvm.org/D12425



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246555 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-01 16:23:45 +00:00
Silviu Baranga
8577e95874 [AArch64] Unify the integer min/max vector selection patterns with the intrinsic ones
Summary:
This change lowers the aarch64 integer vector min/max intrinsic nodes to
generic min/max nodes and replaces the intrinsic selection patterns with
the generic ones.

There should already be testing in place for this, so no further tests
were added.

Reviewers: jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246030 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 11:11:14 +00:00
Ahmed Bougacha
e9da6d5d72 [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.
We canonicalize V64 vectors to V128 through insert_subvector: the other
FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't,
so we'd fail to match fmls and generate fneg+fmla instead.
The vector equivalents are already tested and functional.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245107 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-14 22:06:05 +00:00
James Molloy
3b4cd46ad5 [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:

  - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
  - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244595 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-11 12:06:37 +00:00
James Molloy
42fd74f775 [AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNAN
NFCI. This just removes custom ISDNodes that are no longer needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244594 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-11 12:06:33 +00:00
Ahmed Bougacha
3e59e1aa47 [AArch64] Rename FP formats to be more consistent. NFC.
Some are named "FP", others "SD", others still "FP*SD".
Rename all this to just use "FP", which, except for conversions
(which don't use this format naming scheme), implies "SD" anyway.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243936 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-04 01:38:08 +00:00
Geoff Berry
0dd663b598 [AArch64] Favor extended reg patterns for sub
Summary:
Favor the extended reg patterns over the shifted reg patterns that match
only the operand shift and not the full sign/zero extend and shift.

Reviewers: jmolloy, t.p.northover

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243753 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-31 15:55:54 +00:00
Tim Northover
3614662adb AArch64: use 32-bit MOV rather than UBFX to truncate registers.
It's potentially more efficient on Cyclone, and from the optimization guides &
schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd
expect a MOV to be about the most efficient instruction with its semantics,
even though the official "UXTW" alias is really a UBFX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243576 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-29 21:34:32 +00:00
Chad Rosier
fccc20d417 [AArch64] Change EON pattern to match more often.
Phabricator: http://reviews.llvm.org/D11359
Patch by Geoff Berry <gberry@codeaurora.org>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242694 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-20 18:42:27 +00:00
James Molloy
126ab2389e [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
No functional change, but it preps codegen for the future when SABSDIFF
will start getting generated in anger.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242545 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-17 17:10:45 +00:00
Matthias Braun
66b12088cb AArch64: Implement conditional compare sequence matching.
This is a new iteration of the reverted r238793 /
http://reviews.llvm.org/D8232 which wrongly assumed that any and/or
trees can be represented by conditional compare sequences, however there
are some restrictions to that. This version fixes this and adds comments
that explain exactly what types of and/or trees can actually be
implemented as conditional compare sequences.

Related to http://llvm.org/PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D10579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242436 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-16 20:02:37 +00:00
Tim Northover
93398438ff AArch64: add rev64 alias for 64-bit rev instruction.
It could be useful to assembly programmers and makes the permitted variants a
little more uniform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 17:07:29 +00:00
Arnold Schwaighofer
39fe55270a Add more nvcasts
Tim Northover has told me that they can occur when the compiler cleverly
constructs constants - as demonstrated in the test case.

rdar://21703486

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241641 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-07 23:13:18 +00:00
Arnold Schwaighofer
f869ca86f1 Add a pattern for a nvcast from v2f64 -> v4f32
Since the NvCast is generated by the selection process the concerns about
endianess and bit reversal don't apply.

rdar://21703486

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241611 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-07 18:31:55 +00:00
Arnaud A. de Grandmaison
bdaa375556 [AArch64] Implement add/adds/sub/subs/cmp/cmn with negative immediate aliases
This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn
with a negative immediate operand and convert them as shown:

  add  Rd, Rn, -imm -> sub  Rd, Rn, imm
  sub  Rd, Rn, -imm -> add  Rd, Rn, imm
  adds Rd, Rn, -imm -> subs Rd, Rn, imm
  subs Rd, Rn, -imm -> adds Rd, Rn, imm
  cmp  Rn, -imm     -> cmn  Rn, imm
  cmn  Rn, -imm     -> cmp  Rn, imm

Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers (gas). They are documented in the "ARMv8 Instruction Set
Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc
a programmer-friendly assembler !

This also fixes PR20978: "Assembly handling of adding negative numbers
not as smart as gas".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241166 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-01 15:05:58 +00:00
Matthias Braun
e460807bcd Revert "AArch64: Use CMP;CCMP sequences for and/or/setcc trees."
The patch triggers a miscompile on SPEC 2006 403.gcc with the (ref)
200.i and scilab.i inputs. I opened PR23866 to track analysis of this.

This reverts commit r238793.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239880 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 04:02:32 +00:00
Tim Northover
33d75a269c AArch64: fix typo in SMIN far atomics and add tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238858 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 18:37:20 +00:00
Vladimir Sukharev
1e7e9b1881 [AArch64] Add v8.1a atomic instructions
Patch by: Tom Coxon

Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8501


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238818 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 10:58:41 +00:00
Matthias Braun
fe1391f07d AArch64: Use CMP;CCMP sequences for and/or/setcc trees.
Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.

PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D8232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238793 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 22:31:17 +00:00
James Molloy
d63e0fc2d9 Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them.
The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for
most NEON datatypes - the big exclusion being v2i64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-15 16:15:57 +00:00
Sergey Dmitrouk
1f7a90d793 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235989 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 14:05:47 +00:00
Daniel Jasper
515cc265c9 Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk
716c5d8a30 [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235977 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 11:56:37 +00:00
Pirama Arumuga Nainar
dab5145cb3 [AArch64] Add nvcast patterns for v4f16 and v8f16
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants.  This patch adds nvcast rules with v4f16 and v8f16 values.

AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.

Reviewers: jmolloy, srhines, ab

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:32:25 +00:00
Vladimir Sukharev
39c4ba63f2 [AArch64] Add v8.1a "Limited Ordering Regions" extension
Reviewers: 	t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8499

Patch by: Tom Coxon


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:30:43 +00:00
Bradley Smith
d87c77c0e8 [AArch64] Allow non-standard INS/DUP encodings
The ARMv8 ARMARM states that for these instructions in A64 state:

  "Unspecified bits in "imm5" are ignored but should be set to zero by an assembler.", (imm4 for INS).

Make the disassembler accept any encoding with these ignored bits set to 1.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 15:07:26 +00:00
Vladimir Sukharev
65f303fd0c [AArch64] Rename v8.1a from "extension" to "architecture"
v8.1a is renamed to architecture, accordingly to approaches in ARM backend.

Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 14:49:29 +00:00
Vladimir Sukharev
e99524cf52 [AArch64] Add v8.1a "Rounding Double Multiply Add/Subtract" extension
Reviewers: t.p.northover, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233693 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 13:15:48 +00:00
Vladimir Sukharev
27d12f3e6e [AArch64, ARM] Add v8.1a architecture and generic cpu
New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development

Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8505


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:05:54 +00:00
Chad Rosier
c1813d8fe1 [AArch64] Enable rematerialization of float 0 values.
Patch by Geoff Berry<gberry@codeaurora.org>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232967 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 17:19:34 +00:00
Ahmed Bougacha
4a3cd42601 [AArch64] Avoid going through GPRs for across-vector instructions.
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
  (v4i32 (uaddv ...))
is the same as
  (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
  (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
           (i32 (int_aarch64_neon_uaddv ...)), ssub)

In a combine, we transform all such across-vector-lanes intrinsics to:

  (i32 (extract_vector_elt (uaddv ...), 0))

This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs.  Consider:

    uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
        return vmulq_n_u32(a, vaddvq_u32(b));
    }

We now generate:
    addv.4s  s1, v1
    mul.4s   v0, v0, v1[0]
instead of the previous:
    addv.4s  s1, v1
    fmov     w8, s1
    dup.4s   v1, w8
    mul.4s   v0, v1, v0

rdar://20044838


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:45:38 +00:00
Ahmed Bougacha
4cd59eb629 [AArch64] Remove integer INSvi*lane patterns. NFCI.
Most are redundant, and they never seem to fire.

The V128 integer patterns already exist in the INS multiclass.
The duplicates only fire when the vector index type isn't i64,
because they accept "imm" instead of an explicit "i64", as the
instruction definition patterns do.

TLI::getVectorIdxTy is i64 on AArch64, so this should never happen.
Also, one of them had a typo: for i64, INSvi32lane was used.
I noticed because I mistakenly used an explicit i32 as the idx type,
and got ins.s for an i64 vector_insert.

The V64 patterns also don't seem to ever fire, as V64 vector
extract/insert are legalized to V128.

The equivalent float patterns are unique and useful, so keep them.

No functional change intended;  none exhibited on the LIT and LNT tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231838 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:37:19 +00:00
Kristof Beyls
78c4ef5120 Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet.
As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers
ld.bfd and ld.gold currently only support a subset of the whole range of AArch64
ELF TLS relocations. Furthermore, they assume that some of the code sequences to
access thread-local variables are produced in a very specific sequence.
When the sequence is not as the linker expects, it can silently mis-relaxe/mis-optimize
the instructions.
Even if that wouldn't be the case, it's good to produce the exact sequence,
as that ensures that linkers can perform optimizing relaxations.

This patch:

* implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang
  would grow an -mtls-size option to allow support for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold
  linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation)
  is added to enable generation of local dynamic code sequence, but is off by default.
* makes sure that the exact expected code sequence for local dynamic and general dynamic
  accesses is produced, by making use of a new pseudo instruction. The patch also removes
  two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo
  SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231227 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 09:12:08 +00:00
Ahmed Bougacha
0f1a21bcb8 [AArch64] Prefer DUP/MOV ("CPY") to INS for vector_extract.
This avoids a partial false dependency on the previous content of
the upper lanes of the destination vector register.

Differential Revision: http://reviews.llvm.org/D7307


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227820 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-02 17:55:57 +00:00
Karthik Bhat
f2b3638c3d Revert r225165 and r225169
Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case
where the original subtraction could have caused an overflow.
Reverting the same.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 06:34:34 +00:00
Ahmed Bougacha
3c9fb6e1ad [AArch64] Improve codegen of store lane instructions by avoiding GPR usage.
We used to generate code similar to:

  umov.b        w8, v0[2]
  strb  w8, [x0, x1]

because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:

  add   x8, x0, x1
  st1.b { v0 }[2], [x8]

This patch increases the ST1* AddedComplexity to achieve that.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 17:10:26 +00:00
Ahmed Bougacha
c52cd839b9 [AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister.
For 0-lane stores, we used to generate code similar to:

  fmov w8, s0
  str w8, [x0, x1, lsl #2]

instead of:

  str s0, [x0, x1, lsl #2]

To correct that: for store lane 0 patterns, directly match to STR <subreg>0.

Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 17:02:28 +00:00
Karthik Bhat
050064d32c Select lower fsub,fabs pattern to fabd on AArch64
This patch lowers patterns such as-
  fsub   v0.4s, v0.4s, v1.4s
  fabs   v0.4s, v0.4s
to
  fabd  v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6791



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225169 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:57:59 +00:00
Karthik Bhat
e239724d12 Select lower sub,abs pattern to sabd on AArch64
This patch lowers patterns such as-
  sub	v0.4s, v0.4s, v1.4s
  abs	v0.4s, v0.4s
to
  sabd	v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6781



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:11:07 +00:00
Karthik Bhat
0c2590a266 Lower multiply-negate operation to mneg on AArch64
This patch pattern matches code such as-
neg	 w8, w8
mul	 w8, w9, w8
to
mneg	 w8, w8, w9

Review: http://reviews.llvm.org/D6754



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 13:38:58 +00:00
Juergen Ributzka
446f01b1d5 [AArch64] MachO large code-model: Materialize FP constants in code.
In the large code model we have to first get the address of the GOT entry, load
the address of the constant, and then load the constant itself.

To avoid these loads and the GOT entry alltogether this commit changes the way
how FP constants are materialized in the large code model. The constats are now
materialized in a GPR and then bitconverted/moved into the FPR.

Reviewed by Tim Northover

Fixes rdar://problem/16572564.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223941 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 19:43:32 +00:00
Craig Topper
c0dae440e6 Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-26 00:46:26 +00:00
Benjamin Kramer
4844826c15 AArch64: Pattern match integer vector abs like we do on ARM.
This kind of pattern is emitted by the loop vectorizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221289 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-04 20:10:06 +00:00
Chad Rosier
2929da99a9 [AArch64] Generate vector signed/unsigned mul and mla/mls long.
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 02:31:24 +00:00
Asiri Rathnayake
e9bbacd0a8 Add missing natual vector cast.
Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218751 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 09:59:45 +00:00