Commit Graph

44601 Commits

Author SHA1 Message Date
Matt Arsenault
ec61af4bcc AMDGPU: Fix crash on immediate operand
We can have a v_mac with an immediate src0.
We can still fold if it's an inline immediate,
otherwise it already uses the constant bus.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-21 00:45:59 +00:00
Craig Topper
8d6f84c7d5 [X86] Replace a condition that can never be true with an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313848 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-21 00:18:48 +00:00
Eugene Zelenko
20d1cb14e8 [ARM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313823 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:35:51 +00:00
Artem Belevich
34fb94caca [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.
Differential Revision: https://reviews.llvm.org/D38090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313820 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:23:07 +00:00
Simon Atanasyan
ffd407ea16 [mips] Fix calculation of a branch instruction offset to escape left shift of negative value
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313815 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:01:30 +00:00
Matt Arsenault
820b8a54fb AMDGPU: Start selecting v_mad_mixhi_f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:01:24 +00:00
Saleem Abdulrasool
9d697e1c3c X86: treat SwiftCC as Win64_CC on Win64
The Swift CC is identical to Win64 CC with the exception of swift error
being passed in r12 which is a CSR.  However, since this calling
convention is only used in swift -> swift code, it does not impact
interoperability and can be treated entirely as Win64 CC.  We would
previously incorrectly lower the frame setup as we did not treat the
frame as conforming to Win64 specifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313813 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:00:40 +00:00
Matt Arsenault
4739f7353d AMDGPU: Add tied operands to v_mad_mix{lo|hi}_f16
These write to the low and high half of the destination
register and leave the other 16-bits unchanged. This is true
for most 16-bit instructions on gfx9, but we don't use that
now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313812 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:53:49 +00:00
Eric Christopher
5a4ca9ad9c Remove the default subtarget from the new Nios2 port. It's unused and deprecated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313808 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:32:23 +00:00
Matt Arsenault
7287fcb5d5 AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:28:39 +00:00
Matt Arsenault
ae40a10420 AMDGPU: Fix encoding of op_sel for mad_mix* opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313797 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 19:09:28 +00:00
Saleem Abdulrasool
a689afa09d CodeGen: support SwiftError SwiftCC on Windows x64
Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 18:40:59 +00:00
Simon Pilgrim
e0d66261a1 [X86][SSE] Remove unnecessary NonceMasks from combineX86ShufflesRecursively calls (NFCI)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313743 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 09:36:11 +00:00
Andrew V. Tischenko
1f42b92202 'into' instruction should not be decoded as a valid instr in 64-bit mode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313735 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 08:17:17 +00:00
Craig Topper
abd71f69bc [X86] Remove isel checks for immediate size on floating point compare and xop compare instructions. NFCI
If these checks fail we end up not selecting an instruction at all. So we are already relying on the immediate being checked upstream of isel. So doing the check in isel is just bloat to the isel table. Interestingly, we didn't check on the AVX512 version of the instructions anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313724 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:38:41 +00:00
Stanislav Mekhanoshin
b5a9104224 [AMDGPU] Fixed memory leak with inliner replaced
Delete inliner before replacing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313723 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:34:28 +00:00
Matt Arsenault
e232c83060 AMDGPU: Move r600 only code into r600 only td file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313719 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:11:25 +00:00
Stanislav Mekhanoshin
2e5d75b42d [AMDGPU] Fix regression in test clang/test/CodeGen/backend-unsupported-error.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:10:15 +00:00
Matt Arsenault
a942315e5f AMDGPU: Match load d16 hi instructions
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.

We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin
fbf0e1603c [AMDGPU] Port of HSAIL inliner
Differential Revision: https://reviews.llvm.org/D36849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 04:25:58 +00:00
Matt Arsenault
6a28475ea4 AMDGPU: Cleanup load/store PatFrags
Try to use a consistent naming scheme.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313713 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 03:43:35 +00:00
Matt Arsenault
8e11a03a95 AMDGPU: Match store d16_hi instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 03:20:09 +00:00
Jonathan Roelofs
ee28e9a4fa [ARM] Relax 'cpsie'/'cpsid' flag parsing.
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.

https://reviews.llvm.org/D37953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 21:23:19 +00:00
Vadzim Dambrouski
cb6f3f43df [MSP430] Align functions on 2-byte boundary instead of 4.
Summary:
There is no benefit in having the 4-byte alignment, and removing this
restriction can save a lot of space for some applications.

Reviewers: asl, awygle

Reviewed By: awygle

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 21:05:20 +00:00
Stanislav Mekhanoshin
60873e298c [AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.

Differential Revision: https://reviews.llvm.org/D38014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313670 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 20:54:38 +00:00
Ulrich Weigand
5994e167a7 [SystemZ] Fix truncstore + bswap codegen bug
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.

This transformation is correct for regular stores, but not for
truncating stores.  The routine neglected to check for that case.

Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313669 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 20:50:05 +00:00
Craig Topper
1bfa1fd4c5 [X86] Convert X86ISD::SELECT to ISD::VSELECT just before instruction selection to avoid duplicate patterns
Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns.

Differential Revision: https://reviews.llvm.org/D38022

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313644 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 17:19:45 +00:00
Tony Jiang
a2daaca0d6 [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.
Two blocks prior to the join each perform an li and the the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.

Differential Revision: https://reviews.llvm.org/D36734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313639 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 16:14:37 +00:00
Tony Jiang
9488976a0f [Power9] Add missing Power9 instructions.
The following 8 instructions are implemented in this patch.
addpcis(subpcis, lnia), darn, maddhd, maddhdu, maddld, setb

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313636 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 15:22:36 +00:00
Daniel Sanders
68b21d6108 [globalisel] Add a G_BSWAP instruction and support bswap using it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 14:25:15 +00:00
Nikolai Bozhenov
be4aef480e [Nios2] Subtarget, basic infrastructure for frame, instructions and registers
This is the second minimal patch keeping Nios2 target buildable.
I'm adding subtarget here and other stuff for frame lowering, instruction,
register information methods. I do not add any test cases, as still there
are missing parts like DAG selector and assembly printing. I plan to include
them into the next patch.

Patch by Andrei Grischenko <andrei.l.grischenko@intel.com>

Differential Revision: https://reviews.llvm.org/D37256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313626 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 11:54:29 +00:00
Jina Nahias
eef725fc85 [x86] Lowering Mask Set1 intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D37669

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313625 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 11:03:06 +00:00
Roger Ferrer Ibanez
081fd494d0 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313618 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 09:05:39 +00:00
Matt Arsenault
4b385be048 AMDGPU: Run internalize symbols at -O0
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 07:40:11 +00:00
Gadi Haber
bbbe81ad6a [X86][Skylake] Adding the scheduling information for the SkylakeClient target
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.

Please expect some performance fluctuations due to code alignment effects.

Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313613 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 06:19:27 +00:00
Craig Topper
33bc70e64d [X86] Remove some unnecessary patterns for truncate with X86ISD::SELECT and undef preserved source.
We canonicalize undef preserved sources to zero during intrinsic lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313612 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 05:30:24 +00:00
Craig Topper
4c92030df7 [X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 04:39:55 +00:00
Yonghong Song
2865ab6996 bpf: add inline-asm support
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313593 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 23:29:36 +00:00
Sanjay Patel
a3209ae52e [DAGCombiner] fold assertzexts separated by trunc
If we have an AssertZext of a truncated value that has already been AssertZext'ed, 
we can assert on the wider source op to improve the zext-y knowledge:
 assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN

This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.

Differential Revision: https://reviews.llvm.org/D37017



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 22:05:35 +00:00
Konstantin Zhuravlyov
fe0a82a17c AMDGPU: Start selecting s_xnor_{b32, b64}
Differential Revision: https://reviews.llvm.org/D37981


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 21:22:45 +00:00
Craig Topper
1ae3ba04b8 [X86] Make sure we still emit zext for GR32 to GR64 when the source of the zext is AssertZext
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.

This fixes PR28540.

Differential Revision: https://reviews.llvm.org/D37729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313563 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 20:49:13 +00:00
Craig Topper
3541751e37 [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit
This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bits targets.

Differential Revision: https://reviews.llvm.org/D37890

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313558 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 19:21:21 +00:00
Craig Topper
c72506420a [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bit_hi
I'm pretty sure that InstrEmitter::EmitSubregNode will take care of this itself by calling ConstrainForSubReg which in turn calls TRI->getSubClassWithSubReg.

I think Jakob Stoklund Olesen alluded to this in his commit message for r141207 which added the code to EmitSubregNode.

Differential Revision: https://reviews.llvm.org/D37843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313557 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 19:21:19 +00:00
Evandro Menezes
2e76246c41 [AArch64] Adjust the cost model for Exynos M1 and M2
Refine the model of FP loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313555 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 19:00:38 +00:00
Evandro Menezes
125471dac2 [AArch64] Adjust the cost model for Exynos M1 and M2
Refine the model of loads and stores using the register offset addressing
modes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 19:00:36 +00:00
Evandro Menezes
0469ffd5a1 [AArch64] Adjust the cost model for Exynos M1 and M2
Fix formatting in the predicate function AArch64InstrInfo::isExynosShiftLeftFast().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313553 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 19:00:31 +00:00
Simon Pilgrim
d7c4504964 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare results.
As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313547 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 17:58:31 +00:00
Craig Topper
09f2a0775a [X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is enabled
The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there.

If we have VPERMQ/PD we should prefer those since they only have a single input.

Differential Revision: https://reviews.llvm.org/D37947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313542 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 16:39:49 +00:00
Sam Parker
df2a024df2 [AArch64] Add V8_2aOps feature to Cortex-A55 and 75
Add the missing hardware features the ProcA55 and ProcA75 feature.
These are already enabled via the target parser, but I had missed
them in the backend.

Differential Revision: https://reviews.llvm.org/D37974


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313535 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 14:46:14 +00:00
Sam Parker
7863128a7d [ARM] Implement isTruncateFree
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.

Differential Revision: https://reviews.llvm.org/D37516


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313533 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 14:28:51 +00:00