Commit Graph

17142 Commits

Author SHA1 Message Date
Matt Arsenault
115244a728 AMDGPU: Fix kernel argument alignment impacting stack size
Don't use AllocateStack because kernel arguments have nothing
to do with the stack. The ensureMaxAlignment call was still
changing the stack alignment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273080 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-18 05:15:53 +00:00
Simon Pilgrim
e2e7d46a44 [X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics
Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273077 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-18 02:38:26 +00:00
Matt Arsenault
863cff46f2 AMDGPU: Temporarily select trap to s_endpgm
This should select to s_trap, but that requires
additional work to set up and enable the trap handler.
For now emit s_endpgm so bugpoint stops getting stuck
on the unsupported call to abort.

Emit a warning that this will only terminate the wave and
not really trap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273062 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 22:27:03 +00:00
Matt Arsenault
310a3752c0 AMDGPU: Remove llvm.SI.tid intrinsic
Mesa doesn't emit this for llvm >= 3.8 anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273050 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 21:18:41 +00:00
Marcin Koscielnicki
e2e9cee0d7 [SelectionDAG] Don't treat library calls specially if marked with nobuiltin.
To be used by D19781.

Differential Revision: http://reviews.llvm.org/D19801

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273039 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 20:24:07 +00:00
Michael Kuperstein
987a5dc3a7 [X86] Add missing AVX512 anyext patterns.
Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and
i32 patterns.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273038 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 20:21:17 +00:00
Tim Northover
d476afda38 ARM: take account of possible bundle when erasing an instruction.
Fortunately this appears to be the only ARM-specific pass that runs while
bundles might be in play, so no other cases need modifying.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273029 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 18:40:46 +00:00
James Y Knight
8d30502e60 Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass.
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.

This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.

Tests added for the IR transform and SPARCv9.

Differential Revision: http://reviews.llvm.org/D21029

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273025 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 18:11:48 +00:00
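
The mask-and-shift idea described in 8d30502e60 can be sketched as below. This is only an illustrative model, not the AtomicExpandPass code: memory is a Python list of 32-bit words, byte order is assumed little-endian, and the helper names `cmpxchg32` and `cmpxchg8` are hypothetical.

```python
def cmpxchg32(mem, word_index, expected, new):
    """Model of a native 4-byte compare-and-swap; returns the old word."""
    old = mem[word_index]
    if old == expected:
        mem[word_index] = new
    return old


def cmpxchg8(mem, byte_addr, expected, new):
    """Emulate a 1-byte cmpxchg on top of cmpxchg32 (little-endian assumed).

    Returns (old_byte, success).
    """
    word_index = byte_addr // 4          # containing aligned word
    shift = (byte_addr % 4) * 8          # bit position of the byte in that word
    mask = 0xFF << shift
    while True:
        old_word = mem[word_index]
        old_byte = (old_word & mask) >> shift
        if old_byte != expected:
            return old_byte, False       # the byte itself did not match
        # Splice the new byte into the word, leaving the other bytes untouched.
        new_word = (old_word & ~mask) | ((new & 0xFF) << shift)
        if cmpxchg32(mem, word_index, old_word, new_word) == old_word:
            return old_byte, True
        # A neighbouring byte changed in between; reload and retry.
```

The retry loop matters because the full-word compare can fail when an unrelated byte in the same word changes, even though the target byte still holds the expected value.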
Rafael Espindola
bef5612dfd Change RelaxELFRelocations for llc.
As a developer tool it makes sense for it to use the new relocations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 17:43:41 +00:00
Simon Pilgrim
649d92ad2f [X86][SSE4A] Remove the GCCBuiltins from the movntsd/movntss intrinsic defs so we can emit native IR from clang.
Clang-side sibling commit to follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273002 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 14:27:38 +00:00
Ranjeet Singh
e5b666f4cf [ARM] Add support for mrrc/mrrc2 intrinsics.
Reapplying patch as it was reverted when it was first
committed because of an assertion failure when the
mrrc2 intrinsic was called in ARM mode. The failure
was happening because the instruction was being built
in ARMISelDAGToDAG.cpp and the tablegen description for
mrrc2 instruction doesn't allow you to use a predicate.

The ARM architecture manuals do say that mrrc2 in ARM
mode can be predicated with AL in assembly but this has
no effect on the encoding of the instruction as the top
4 bits will always be 1111 not 1110 which is the encoding
for the condition AL.

Differential Revision: http://reviews.llvm.org/D21408


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272982 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 00:52:41 +00:00
Sanjay Patel
a240c3eb0e [x86] autoupgrade and remove AVX2 integer min/max intrinsics
This will (hopefully very temporarily) break clang.
The clang side of this should be the next commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272932 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 18:44:20 +00:00
Rafael Espindola
1f4afa2405 dos2unix this test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272928 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 18:21:11 +00:00
Sanjay Patel
255484c723 remove old FileCheck lines that are no longer used
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272921 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 17:04:16 +00:00
Sanjay Patel
f449b0e944 [DAG] Remove redundant FMUL in Newton-Raphson SQRT code
When calculating a square root using Newton-Raphson with two constants,
a naive implementation is to use five multiplications (four muls to calculate
reciprocal square root and another one to calculate the square root itself).
However, after some reassociation and CSE the same result can be obtained
with only four multiplications. Unfortunately, there's no reliable way to do
such a reassociation in the back-end. So, the patch modifies NR code itself
so that it directly builds optimal code for SQRT and doesn't rely on any
further reassociation.

Patch by Nikolai Bozhenov!

Differential Revision: http://reviews.llvm.org/D21127



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272920 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 16:58:54 +00:00
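
The multiplication count discussed in f449b0e944 can be illustrated with a small sketch. It assumes the two-constant refinement step has the form E' = (E * -0.5) * (A*E*E + -3.0); the function names and the `MulCounter` helper are hypothetical and only count multiplications, they are not taken from the patch.

```python
import math


class MulCounter:
    """Counts every multiplication performed through mul()."""

    def __init__(self):
        self.count = 0

    def mul(self, x, y):
        self.count += 1
        return x * y


def sqrt_naive(a, est, c):
    """Refine the rsqrt estimate (4 muls), then multiply by a (1 more mul)."""
    ae = c.mul(a, est)
    aee = c.mul(ae, est)
    half = c.mul(est, -0.5)
    rsqrt = c.mul(half, aee + -3.0)
    return c.mul(a, rsqrt)


def sqrt_fused(a, est, c):
    """Reuse S = a*est, which already approximates sqrt(a): 4 muls in total."""
    s = c.mul(a, est)
    se = c.mul(s, est)
    half = c.mul(s, -0.5)
    return c.mul(half, se + -3.0)
```

Both functions compute the same value up to floating-point reassociation; the second form simply starts from `a * est` so that no final multiplication by `a` is needed.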
Rafael Espindola
b72793375f Don't print (PLT) on arm.
The R_ARM_PLT32 relocation is deprecated and is not produced by MC.

This means that the code being deleted is dead from the .o point of
view and was making the .s more confusing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272909 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 16:09:53 +00:00
Sanjay Patel
2c8c45d695 [x86] autoupgrade and remove SSE2/SSE41 integer min/max intrinsics
Follow-up to:
http://reviews.llvm.org/rL272806
http://reviews.llvm.org/rL272807


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272907 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 15:48:30 +00:00
Daniel Sanders
efa6f4843e [mips][mips16] Fix machine verifier errors about incorrect register classes on load/stores.
Summary:
[ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore
lower frameindex nodes such that there is a copy to a CPU16Regs register. This
is now done consistently using a separate addressing mode that does not
permit frameindex nodes.

As part of this I've had to remove an optimization that reduced the number of
instructions needed to work around the lack of sp-relative addresses on [ls][bh]
and [ls][bh]u. This optimization used one of the eight CPU16Regs registers as
a copy of the stack pointer and its implementation was the root cause of many
of the register vs register class mismatches.

lw/sw can use sp-relative addresses but we ought to ensure that we use the
correct version of lw/sw internally for things like IAS. This is not currently
the case and this change does not fix this. However, this change does clean it
up sufficiently well to fix the machine verifier failures.

Also removed irrelevant functions from stchar.ll.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272882 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 10:20:59 +00:00
Daniel Sanders
0b8fc77698 [llvm-objdump] Support detection of feature bits from the object and implement this for Mips.
Summary:
The Mips implementation only covers the feature bits described by the ELF
e_flags so far. Mips stores additional feature bits such as MSA in the
.MIPS.abiflags section.

Also fixed a small bug this revealed where microMIPS wouldn't add the
EF_MIPS_MICROMIPS flag when using -filetype=obj.

Reviewers: echristo, rafael

Subscribers: rafael, mehdi_amini, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272880 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 09:17:03 +00:00
Hrvoje Varga
98d31c1b79 [mips][micromips] Implement DCLO, DCLZ, DROTR, DROTR32 and DROTRV instructions
Differential Revision: http://reviews.llvm.org/D16917


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272876 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 07:06:25 +00:00
Tim Northover
bde073f537 AArch64: allow MOV (imm) alias to be printed
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272865 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 01:42:25 +00:00
Matt Arsenault
11e5e3bbe1 AMDGPU: Disable scheduling in some slow tests
Disabling the pre-RA scheduler on large-work-group-registers
causes it to be ~50% slower.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272860 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 00:56:47 +00:00
Sanjay Patel
079ac1edc0 [x86, SSE] update packed FP compare tests for direct translation from builtin to IR
The clang side of this was r272840:
http://reviews.llvm.org/rL272840

A follow-up step would be to auto-upgrade and remove these LLVM intrinsics completely.

Differential Revision: http://reviews.llvm.org/D21269



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272841 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 21:22:15 +00:00
Sanjay Patel
efa2e76536 [x86] delete unnecessary function declarations
Missed this in r272806, r272807.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272834 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 20:51:47 +00:00
Tim Northover
4d6b849855 AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints.
Of course the assembly was right but because the opcode was MOVZWi it was
encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and
would go horribly wrong on a CPU.

No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272831 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 20:33:36 +00:00
Sanjay Patel
2df18c8dc0 [x86] add folds for x86 vector compare nodes (PR27924)
Ideally, we can get rid of most x86 LLVM intrinsics by transforming them to IR (and some of that happened 
with http://reviews.llvm.org/rL272807), but it doesn't cost much to have some simple folds in the backend
too while we're working on that and as a backstop.

This fixes:
https://llvm.org/bugs/show_bug.cgi?id=27924

Differential Revision: http://reviews.llvm.org/D21356



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272828 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 20:26:58 +00:00
Kevin B. Smith
42abc92144 [X86]: Updated r272801 to promote 16 bit compares with immediate operand
to 32 bits. This is in response to a comment by Eli Friedman.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272814 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 18:18:05 +00:00
Sanjay Patel
3539217c1c [x86, SSE] remove the GCCBuiltins from the integer min/max intrinsics
This allows us to emit native IR in Clang (next commit).
Also, update the intrinsic tests to show that codegen already knows how to handle
the IR that Clang will soon produce.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272806 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 17:17:27 +00:00
Kevin B. Smith
746c3b7ff4 [X86]: Quit promoting 8 and 16 bit compares to 32 bit.
Differential Revision: http://reviews.llvm.org/D21144


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272801 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 16:37:46 +00:00
Kevin B. Smith
8800861c19 [X86]: Improve Liveness checking for X86FixupBWInsts.cpp
Differential Revision: http://reviews.llvm.org/D21085


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272797 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 16:03:06 +00:00
Ranjeet Singh
a94e734a24 Reverting r272778 because there's an assertion
failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 14:23:29 +00:00
Simon Dardis
b5361d7101 [mips] Missing test case
Add missing testcase from r272666.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272784 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 13:49:58 +00:00
Ranjeet Singh
c0f8f419a5 [ARM] Add support for mrrc/mrrc2 intrinsics.
Differential Revision: http://reviews.llvm.org/D21178



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272778 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 11:32:24 +00:00
Daniel Sanders
5ffd0983e4 [mips] Removed invalid test from o32_cc.ll
MIPS32R1 cannot implement a 64-bit FPU because this was introduced in MIPS32R2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 09:47:27 +00:00
Daniel Sanders
ee9790cc7e [mips][msa] Fix register/register-class mismatches in emitINSERT_DF_VIDX().
Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 08:43:23 +00:00
Zlatko Buljan
1f61965a92 [mips][microMIPS] Add CodeGen support for AND*, OR16, OR*, XOR*, NOT16 and NOR instructions
Differential Revision: http://reviews.llvm.org/D16719


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 07:46:24 +00:00
Igor Breger
b8387d0ad3 [AVX512] Fix BLENDM lowering patterns. Operands should be swapped to match SELECT behavior.
Use BLENDM instead of masked move instruction.

Differential Revision: http://reviews.llvm.org/D21001

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272763 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 07:30:38 +00:00
Nicolai Haehnle
682fc3e780 AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.

Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.

An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272761 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 07:13:05 +00:00
Sanjoy Das
e9f1be7f56 Don't force SP-relative addressing for statepoints
Summary:
...  when the offset is not statically known.

Prioritize addresses relative to the stack pointer in the stackmap, but
fall back gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.

Patch by Oscar Blumberg!

Reviewers: sanjoy

Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm

Differential Revision: http://reviews.llvm.org/D21259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272756 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 05:35:14 +00:00
David Majnemer
0c4f69f653 Remove the ScalarReplAggregates pass
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM.  LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.

Differential Revision: http://reviews.llvm.org/D21316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272737 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 00:19:09 +00:00
Matt Arsenault
6af03e5068 AMDGPU: Run pointer optimization passes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272736 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 00:11:01 +00:00
Xinliang David Li
bfbb095051 Fix a test case to match its intention
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272733 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 23:05:46 +00:00
Dehao Chen
97615522d0 Set machine block placement hot prob threshold for both static and runtime profile.
Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be laid out optimally.

Reviewers: djasper, davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272729 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 22:27:17 +00:00
Sanjay Patel
8291779372 [x86] add current codegen tests for PR27924
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272714 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 21:25:46 +00:00
Peter Collingbourne
63b34cdf34 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272709 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 21:01:22 +00:00
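
The three conditions listed in 63b34cdf34 can be read as a single predicate over every instance of a global. The sketch below is a toy model under that reading; the `GlobalInstance` record and `may_omit_from_symbol_table` are hypothetical names, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class GlobalInstance:
    """One module's view of a global (hypothetical, simplified record)."""
    local_unnamed_addr: bool
    linkage: str
    is_constant_or_function: bool


def may_omit_from_symbol_table(instances):
    """True only if all three conditions hold for every instance."""
    return bool(instances) and all(
        g.local_unnamed_addr
        and g.linkage == "linkonce_odr"
        and g.is_constant_or_function
        for g in instances
    )
```

A single instance that lacks the attribute is enough to force the global into the symbol table, which is why the check ranges over all instances rather than one module.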
Wei Mi
b7b4ba37de [X86] Reduce the width of multiplication when its operands are extended from i8 or i16
For <N x i32> type mul, pmuludq will be used for targets without SSE41, which
often introduces many extra pack and unpack instructions in vectorized loop
body because pmuludq generates <N/2 x i64> type value. However when the operands
of <N x i32> mul are extended from smaller size values like i8 and i16, the type
of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which
generates better code. For targets with SSE41, pmulld is supported so no
shrinking is needed.

Differential Revision: http://reviews.llvm.org/D20931



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272694 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 18:53:20 +00:00
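
The arithmetic behind the pmullw + pmulhw shrinking in b7b4ba37de can be checked per lane with a short sketch. It models a single signed 16-bit lane with Python integers; the function names mirror the instruction mnemonics but are otherwise hypothetical, and the lane interleaving that real code needs to rebuild the i32 vector is not modelled.

```python
def pmullw(a, b):
    """Low 16 bits of the 16x16 product for one lane."""
    return (a * b) & 0xFFFF


def pmulhw(a, b):
    """High 16 bits of the signed 16x16 product for one lane."""
    return ((a * b) >> 16) & 0xFFFF


def mul_i32_from_i16(a, b):
    """Rebuild the 32-bit product (as an unsigned bit pattern) from both halves."""
    return (pmulhw(a, b) << 16) | pmullw(a, b)
```

Because both operands fit in 16 bits after sign extension, the two halves together carry the full 32-bit product, so no 64-bit pmuludq result is needed.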
Nirav Dave
adf7e0e7c6 Fix BSS global handling in AsmPrinter
Change EmitGlobalVariable to check final assembler section is in BSS
before using .lcomm/.comm directive. This prevents globals from being
put into .bss erroneously when -data-sections is used.

This fixes PR26570.

Reviewers: echristo, rafael

Subscribers: llvm-commits, mehdi_amini

Differential Revision: http://reviews.llvm.org/D21146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272674 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 15:09:30 +00:00
Simon Dardis
659228b9db [mips] Optimize stack pointer adjustments.
Instead of always using addu to adjust the stack pointer when the
size is out of the range of an addiu instruction, use subu so that
a smaller constant can be generated.

This can give savings of ~3 instructions whenever a function has a
stack frame whose size is out of range of an addiu instruction.

This change may break some naive stack unwinders.

Partially resolves PR/26291.

Thanks to David Chisnall for reporting the issue.

Reviewers: dsanders, vkalintiris

Differential Review: http://reviews.llvm.org/D21321


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272666 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 13:39:43 +00:00
James Molloy
a523293cd9 [Thumb] Fix off-by-one error in r272007
We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256.

Found by Oliver Stannard and csmith!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272665 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 13:33:07 +00:00
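
The bound in a523293cd9 follows from both the MOV and the ADD taking an 8-bit unsigned immediate (0 to 255) in Thumb. A minimal check, with the hypothetical helper name `reachable_mov_add_immediates`:

```python
def reachable_mov_add_immediates():
    """All constants of the form mov_imm + add_imm with 8-bit immediates."""
    imm8 = range(256)  # 0..255 inclusive
    return {mov + add for mov in imm8 for add in imm8}
```

The largest reachable value is 255 + 255 = 510; reaching 511 would need an add of 256, which does not fit in the immediate field.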
Simon Dardis
55037ea9b0 [mips][atomics] Fix atomic instruction descriptions and uses.
PR27458 highlights that the MIPS backend does not have well formed
MIR for atomic operations (among other errors).

This patch expands and corrects the LL/SC descriptions and uses
for MIPS(64).

Reviewers: dsanders, vkalintiris

Differential Review: http://reviews.llvm.org/D19719



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272655 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 11:29:28 +00:00
Simon Pilgrim
ece1ebdf12 [X86][SSE4A] Added patterns for nontemporal stores of scalar float/doubles using MOVNTSD/MOVNTSS
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272651 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 09:43:38 +00:00
Simon Dardis
217c907cac [mips] MIPS32/64 itineraries
Itineraries for some pre MIPSR6 and EVA instructions. Some pseudo expanded
instructions are marked as having no scheduling info.

Reviewers: dsanders, vkalintiris

Differential Review: http://reviews.llvm.org/D20418


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272648 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 09:35:29 +00:00
Daniel Sanders
bd52e9f225 [mips][dsp] Fix use without def on DSPCtrl registers read by rddsp intrinsic.
Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21063

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272647 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 09:29:46 +00:00
Daniel Sanders
9cea6726d3 [mips][msa] copyPhysReg() should not set RegState::Define on result of CTCMSA.
Summary:
The machine verifier reports 'Explicit operand marked as def' when it is
manually specified even though it agrees with the operand info.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21065

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272646 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 09:11:33 +00:00
Diana Picus
7845b7dd45 [SelectionDAG] Remove exit-on-error flag from test (PR27765)
The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer, when trying to expand a physical register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.

We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.

Fixes PR27765.

Differential Revision: http://reviews.llvm.org/D21061

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272644 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 07:30:20 +00:00
Igor Breger
7c5456b142 re-generate the tests using the update_llc_test_checks.py script
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272643 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 07:05:10 +00:00
Craig Topper
62458bf56e [AVX512] Use MOVZX32 instead of MOVZ16 for loading single v8/v4/v2/v1 masks when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272626 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 03:13:00 +00:00
Craig Topper
51ad7064a4 [AVX512] Add patterns for zero-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272625 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 03:12:54 +00:00
Craig Topper
2dba2a4d42 [AVX512] Add tests for zero extending masks that show an unnecessary movzx instruction. A followup patch will remove that instruction, but adding the tests first to make that more obvious.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272624 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 03:12:48 +00:00
Sanjoy Das
60271907e7 Move previously added test case to the right location
In rL272580 I accidentally added a test case to test/CodeGen when
test/Transforms/DeadStoreElimination/ is a better place for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272581 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 20:12:07 +00:00
Sanjoy Das
ae9ac9ce84 Fix AAResults::callCapturesBefore for operand bundles
Summary:
AAResults::callCapturesBefore would previously ignore operand
bundles. It was possible for a later instruction to miss its memory
dependency on a call site that would only access the pointer through a
bundle.

Patch by Oscar Blumberg!

Reviewers: sanjoy

Differential Revision: http://reviews.llvm.org/D21286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272580 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 19:55:04 +00:00
Simon Pilgrim
933aa2e6be [X86][SSE] Added extract to scalar nontemporal store tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272577 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 19:08:28 +00:00
David Majnemer
204d45582a [X86] Remove llvm.x86.bit.scan.{forward,reverse}.32
The need for these intrinsics has been obviated by r272564 which
reimplements their functionality using generic IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272566 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 17:33:13 +00:00
Marek Olsak
760c36c5ae AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing
Summary:
Mesa and other users must set this to enable coalescing:
- STRIDE = 0
- SWIZZLE_ENABLE = 1

This makes one particular compute shader 8x faster.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D21136

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 16:05:57 +00:00
Ulrich Weigand
603d680eb5 [SystemZ] Enable index register memory constraints for inline ASM
This enables use of the 'R' and 'T' memory constraints for inline ASM
operands on SystemZ, which allow an index register as well as an
immediate displacement. This patch includes corresponding documentation
and test case updates.

As with the last patch of this kind, I moved the 'm' constraint to the
most general case, which is now 'T' (base + 20-bit signed displacement +
index register).

Author: colpell
Differential Revision: http://reviews.llvm.org/D21239



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272547 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 14:24:05 +00:00
Ranjeet Singh
1dd5b28858 [ARM] Reverting r272544 because clang patch needs
to go in as soon as llvm patch has gone in because
tests will start breaking in Clang.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272546 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 10:58:24 +00:00
Ranjeet Singh
84bf8bc6d0 [ARM] Add mrrc/mrrc2 co-processor intrinsics
MRRC/MRRC2 instruction writes to two registers. The
intrinsic definition returns a single uint64_t to
represent the write, this is a compact way of
representing a write to two 32 bit registers,
the alternative might have been to return a
struct of 2 uint32_t's but this isn't as nice.

Differential Revision: 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272544 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 10:43:50 +00:00
Strahinja Petrovic
7417e35311 This patch fixes handling long double type when it is
constant in soft float mode on PowerPC 32 architecture.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272543 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 10:29:29 +00:00
Simon Pilgrim
dc051d0c8c [X86][SSE4A] Renamed tests to correspond with the instruction being tested
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272542 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 10:14:42 +00:00
Craig Topper
dbd262941c [AVX512] Remove masked pshufd, pshuflw, and pshufhw intrinsics and autoupgrade them to selects and shufflevector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272527 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 02:36:48 +00:00
Sanjay Patel
9a476793c5 [x86, SSE] change patterns for CMPP to float types to allow matching with SSE1 (PR28044)
This patch is intended to solve:
https://llvm.org/bugs/show_bug.cgi?id=28044

By changing the definition of X86ISD::CMPP to use float types, we allow it to be created 
and pass legalization for an SSE1-only target where v4i32 is not legal.

The motivational trail for this change includes:
https://llvm.org/bugs/show_bug.cgi?id=28001

and eventually makes this trigger:
http://reviews.llvm.org/D21190

I.e., after this step, we should be free to have Clang generate FP compare IR instead of x86
intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM 
sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86
intrinsics, a big pile of generic optimizations can trigger.

Differential Revision: http://reviews.llvm.org/D21235


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272511 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 15:03:25 +00:00
Craig Topper
b2cfb64e72 [X86] Remove sse2 pshufd/pshuflw/pshufhw intrinsics and upgrade them to shufflevector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272510 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 14:11:32 +00:00
Simon Pilgrim
e4f64b2456 [X86][BMI] Added fast-isel tests for BMI1 intrinsics
A lot of the codegen is pretty awful for these as they are mostly implemented as generic bit twiddling ops 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272508 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 09:56:05 +00:00
Craig Topper
0771fcddeb [X86] Move tests for llvm.x86.avx.vpermil.* intrinsics to a -upgrade test since they are autoupgraded to shufflevector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272494 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 01:41:06 +00:00
Simon Pilgrim
58140e4f69 [X86] Updated test checks script to generalise LCPI symbol refs
The script now replaces '.LCPI888_8' style asm symbols with the {{\.LCPI.*}} regex pattern - this helps stop hardcoded symbols in 32-bit x86 tests changing with every edit of the file
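A minimal sketch of that kind of substitution, assuming the symbols always have the form '.LCPI<number>_<number>'. The regular expression and the helper name are illustrative and are not taken from the script itself.

```python
import re

# Assumed shape of a constant-pool label: ".LCPI", digits, "_", digits.
LCPI_RE = re.compile(r"\.LCPI[0-9]+_[0-9]+")


def generalise_lcpi(asm_line):
    """Replace each hardcoded constant-pool label with a FileCheck regex block."""
    # A callable replacement avoids backslash-escape handling in the template.
    return LCPI_RE.sub(lambda match: r"{{\.LCPI.*}}", asm_line)


print(generalise_lcpi("movaps .LCPI888_8, %xmm0"))
```

The output line can then be used as a CHECK line that keeps matching after the label numbering changes.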

Refreshed some tests to demonstrate the new check

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272488 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 20:39:21 +00:00
Simon Pilgrim
fd46fc3322 [X86][SSSE3] Added PSHUFB LUT implementation of BITREVERSE
PSHUFB can speed up BITREVERSE of byte vectors by performing LUT on the low/high nibbles separately and ORing the results. Wider integer vector types are already BSWAP'd beforehand so also make use of this approach.
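The nibble-LUT idea can be modelled in scalar Python as follows. This is only a sketch of the technique, not the lowering code: the table lookup stands in for PSHUFB, and the helper names are hypothetical.

```python
# 16-entry table: NIBBLE_REV[n] is the 4-bit value n with its bits reversed.
NIBBLE_REV = [int(f"{n:04b}"[::-1], 2) for n in range(16)]


def bitreverse_bytes(data):
    """Reverse the bits of every byte using two nibble lookups and an OR."""
    out = bytearray()
    for b in data:
        lo = b & 0x0F  # low nibble: its reversal becomes the high nibble
        hi = b >> 4    # high nibble: its reversal becomes the low nibble
        out.append((NIBBLE_REV[lo] << 4) | NIBBLE_REV[hi])
    return bytes(out)


def bitreverse32(x):
    """Wider elements: byte-swap first, then bit-reverse each byte."""
    swapped = x.to_bytes(4, "little")[::-1]
    return int.from_bytes(bitreverse_bytes(swapped), "little")
```

In the vector form, each of the two lookups would be a single PSHUFB over all bytes at once.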

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272477 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 15:44:13 +00:00
Craig Topper
1385fc37d8 [AVX512] Re-generate v8i64 shuffle test now that we use pshufd for some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272474 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 13:57:08 +00:00
Craig Topper
64162b5008 [AVX512] Lower v8i64 and v16i32 to pshufd when possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272473 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 13:43:21 +00:00
Simon Pilgrim
56a9634f03 [X86][SSE] Added PSLLDQ/PSRLDQ as a target shuffle type
Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining
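As a rough model of why the byte-vector view matters, a 128-bit byte shift can be written as a 16-entry shuffle mask in which shifted-in positions are zero. The sketch below assumes a sentinel value ZERO for those positions; the names are hypothetical and do not mirror the actual decode helpers.

```python
ZERO = -1  # hypothetical sentinel meaning "this result byte is zero"


def pslldq_mask(imm):
    """Byte shift left: result byte i reads source byte i - imm, else zero."""
    imm = min(imm, 16)
    return [ZERO] * imm + list(range(16 - imm))


def psrldq_mask(imm):
    """Byte shift right: result byte i reads source byte i + imm, else zero."""
    imm = min(imm, 16)
    return list(range(imm, 16)) + [ZERO] * imm


def apply_mask(src, mask):
    """Apply a shuffle mask to a 16-byte vector."""
    return bytes(0 if m == ZERO else src[m] for m in mask)
```

Once a shift is expressed as a mask like this, it can be composed with the masks of neighbouring shuffles index by index.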

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272471 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 13:38:28 +00:00
Simon Pilgrim
5221b09672 [X86][AVX2] Added PSLLDQ/PSRLDQ shuffle combining tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272469 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 13:18:21 +00:00
Craig Topper
ff7edbe38c [AVX512] Add support for lowering v32i16 shuffles with repeated lanes. This allows us to create 512-bit PSHUFLW/PSHUFHW.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272450 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 03:27:42 +00:00
Quentin Colombet
5b7abef816 [IRTranslator] Support the translation of or.
Now or instructions get translated into G_OR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272433 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 20:50:35 +00:00
Sanjay Patel
e208b3790d [x86] enable bitcasted fabs/fneg transforms
The vector cases don't change because we already have folds in X86ISelLowering
to look through and remove bitcasts.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272427 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 20:33:50 +00:00
Zhan Jun Liau
23b8e3e7cc [SystemZ] Support Compare and Traps
Support and generate Compare and Traps like CRT, CIT, etc.

Support Trap as legal DAG opcodes and generate "j .+2" for them by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.

Differential Revision: http://reviews.llvm.org/D21155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272419 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 19:58:10 +00:00
Tom Stellard
4ee3d0cb4d AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations
Summary:
We need to set the fixup type to FK_Data_4 for the
SCRATCH_RSRC_DWORD[01] symbols, since these require absolute
relocations, and fixup_si_rodata is for relative relocations.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272417 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 19:26:38 +00:00
Mehdi Amini
af43daa44e Move CodeGen test from Generic to X86 specific directory
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272416 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 19:14:01 +00:00
Mehdi Amini
13c1e2501a Interprocedural Register Allocation (IPRA): add a Transformation Pass
Adds a MachineFunctionPass that scans the body to find calls and
updates the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D21180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272414 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 18:37:21 +00:00
Sanjay Patel
2e8d26714f [x86] add test for PR28044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272411 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 18:05:55 +00:00
Mehdi Amini
32b9ed845a Interprocedural Register Allocation (IPRA) Analysis
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.

When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.

The analysis is split into two parts. RegUsageInfoCollector is the
MachineFunction pass that runs post-RA and collects the list of
clobbered registers to produce a register mask.

An immutable pass, RegisterUsageInfo, stores the RegMasks produced by
RegUsageInfoCollector and keeps them available. A future transformation
pass will use this information to update every call site after
instruction selection.
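
The bottom-up ordering can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not the pass itself: registers are plain strings, the call graph is assumed to be acyclic, and every name in it is hypothetical.

```python
ALL_REGS = frozenset({"r0", "r1", "r2", "r3"})  # hypothetical register file


def clobber_sets(call_graph, own_clobbers):
    """Visit callees before callers and return {function: clobbered registers}.

    Assumes the call graph has no cycles (no recursion).
    """
    result = {}

    def visit(fn):
        if fn not in result:
            clobbered = set(own_clobbers[fn])
            for callee in call_graph.get(fn, ()):
                clobbered |= visit(callee)  # a call clobbers what the callee clobbers
            result[fn] = frozenset(clobbered)
        return result[fn]

    for fn in own_clobbers:
        visit(fn)
    return result


def preserved_mask(clobbered):
    """Registers a caller may keep live across a call to this function."""
    return ALL_REGS - clobbered
```

Processing callees first is what lets a caller see a precise mask for each call instead of the default calling-convention mask.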

Patch by Vivek Pandya <vivekvpandya@gmail.com>

Differential Revision: http://reviews.llvm.org/D20769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272403 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 16:19:46 +00:00
Sanjay Patel
2f8d84f8f9 [x86] fix test attributes and autogenerate checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272398 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 15:30:52 +00:00
Sanjay Patel
8858190401 [x86] add missing tests for fcmp ueq/one
Somehow, the codegen logic for these sequences has gone completely untested
until now (note the 2 compare instructions generated per test).

There's also an *Intel* AVX optimization opportunity exposed in these cases
and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP
predicates were added, so a single comparison should always be sufficient,
and operand commutation should never be necessary.
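
For reference, the two predicates can be modelled as below. This is a sketch of the IR-level semantics only, in Python, with hypothetical helper names. The original eight SSE compare predicates have no single encoding for "unordered or equal" or for "ordered and not equal", while the extended AVX predicate set includes EQ_UQ and NEQ_OQ.

```python
import math


def unordered(a, b):
    """True when at least one operand is NaN."""
    return math.isnan(a) or math.isnan(b)


def fcmp_ueq(a, b):
    """ueq: unordered OR equal (two tests combined with OR)."""
    return unordered(a, b) or a == b


def fcmp_one(a, b):
    """one: ordered AND not equal (two tests combined with AND)."""
    return (not unordered(a, b)) and a != b
```

Each function combines two simpler tests, which mirrors why two compares are needed when no single predicate is available.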



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272397 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 15:17:54 +00:00
Sanjay Patel
7564bb471e [x86] regenerate checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272396 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 14:48:50 +00:00
Simon Pilgrim
d6d608f9e8 [X86][SSE] Added target shuffle combine tests for byte shift/rotates (PSLLDQ/PSRLDQ/PALIGNR)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272392 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 13:03:22 +00:00
Simon Pilgrim
0d0f696346 [X86][AVX512] Added VPSLLDQ/VPSRLDQ memory fold tests
Memory operand is new for AVX512 (SSE/AVX2 didn't support it).

Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations).

Regenerated VPALIGNR test now that the shuffle comments work

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272383 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 09:56:20 +00:00
Craig Topper
e041e2676c [AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272371 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 05:12:40 +00:00
Craig Topper
0e4bc86c7c [AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. Previously we were printing the mask operands as the register names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272367 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 04:48:05 +00:00
Quentin Colombet
47535581c8 [LiveRangeEdit] Add a test case for r272314.
The test case is not great, especially because it is still cumbersome to
run the regalloc pass with run-pass. (A number of initializers still need
to be properly implemented.)

Related to llvm.org/PR27983

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272360 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 01:57:48 +00:00
Quentin Colombet
aef5f640f7 [llc] Add support for several run-pass options.
Previously we could run only one machine pass with the run-pass option.
With this patch, we can now specify several passes with several run-pass
options (or just one option with a comma-separated list of passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.
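
The two accepted spellings amount to the same ordered list of passes. A small Python sketch of that flattening follows; the helper and the pass names are placeholders and do not reflect how llc parses the option internally.

```python
def collect_passes(option_values):
    """Flatten repeated option values and comma-separated lists, keeping order."""
    passes = []
    for value in option_values:
        passes.extend(name for name in value.split(",") if name)
    return passes


# Two occurrences of the option, the first one carrying a comma-separated list.
print(collect_passes(["pass-a,pass-b", "pass-c"]))
```

The order is preserved so that the pipeline is built in the order the passes were written.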

Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272356 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 00:52:10 +00:00
Matt Arsenault
dbaa4b4486 AMDGPU: v_cndmask_b32 does not def vcc
Fixes verifier errors after SIShrinkInstructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272351 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 00:18:41 +00:00
Tom Stellard
60f588f570 AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.

Reviewers: scchan, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 00:01:04 +00:00