Commit Graph

16144 Commits

Author SHA1 Message Date
Nirav Dave
54cc8d76c8 Add support for no-jump-tables
Add a function soft attribute that controls the generation of jump tables in
CodeGen, as an initial step towards clang support for gcc's no-jump-tables option.

Reviewers: hans, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18321

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264756 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 17:46:23 +00:00
Manman Ren
d9e9e2b717 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264754 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 17:37:21 +00:00
Sanjay Patel
65a7ad238f [x86] add tests to show current memset codegen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264748 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 17:09:27 +00:00
Sanjay Patel
4fdd8ba990 regenerate checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264738 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 16:11:29 +00:00
Junmo Park
7c78d07dc3 fix CHECK_NOT -> CHECK-NOT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264706 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 07:53:07 +00:00
Elena Demikhovsky
c4a86129f8 AVX-512: fixed a bug in fp_to_uint pattern on KNL
Fixed fp_to_uint instruction selection on KNL.
One pattern was missing for <4 x double> to <4 x i32>

Differential Revision: http://reviews.llvm.org/D18512



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264701 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 06:33:41 +00:00
Hal Finkel
f0740433d3 [PowerPC] Refactor popcnt[dw] target features
Instead of using two feature bits, one to indicate the availability of the
popcnt[dw] instructions, and another to indicate whether or not they're fast,
use a single enum. This allows more consistent control via target attribute
strings, and via Clang's command line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264690 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 01:36:01 +00:00
Kyle Butt
b76dcf45e6 [Codegen] Decrease minimum jump table density.
Minimum density for both optsize and non-optsize functions is now set by options:
-sparse-jump-table-density (default 10) for non-optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at Google
at the cost of a small code size increase. For code compiled with -Os,
the old behavior continues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264689 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-29 00:23:41 +00:00
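The density rule from b76dcf45e6 can be sketched as follows. This is a minimal illustration, not the LLVM implementation: the function names are hypothetical, and the formula (number of cases as a percentage of the covered value range) is an assumption about how "density" is measured. Only the two defaults, 10 and 40, come from the commit message.

```python
# Defaults quoted in the commit message.
SPARSE_JUMP_TABLE_DENSITY = 10  # non-optsize functions
DENSE_JUMP_TABLE_DENSITY = 40   # optsize functions


def min_density_percent(opt_for_size: bool) -> int:
    """Pick the minimum density threshold for the current function."""
    return DENSE_JUMP_TABLE_DENSITY if opt_for_size else SPARSE_JUMP_TABLE_DENSITY


def is_dense_enough(case_values, opt_for_size: bool) -> bool:
    """Assumed rule: cases * 100 >= range * min_density (integer math only)."""
    distinct = set(case_values)
    value_range = max(distinct) - min(distinct) + 1
    return len(distinct) * 100 >= value_range * min_density_percent(opt_for_size)


# Four cases spread over a range of 31 values: about 13% dense.
cases = [0, 10, 20, 30]
print(is_dense_enough(cases, opt_for_size=False))
print(is_dense_enough(cases, opt_for_size=True))
```

Under these assumptions the same switch gets a jump table in a normal function but not in an optsize one, which matches the trade-off the message describes.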
Sanjay Patel
206cd3a64e fix checks: *_DAG -> *-DAG
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264676 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 22:11:06 +00:00
Sanjay Patel
52488aeab7 fix CHECK_NEXT -> CHECK-NEXT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264674 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 22:03:07 +00:00
Sanjay Patel
f584710ea2 fix CHECK_DAG -> CHECK-DAG
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264673 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 22:00:38 +00:00
Sanjay Patel
04d6f99c12 fix CHECK_NEXT -> CHECK-NEXT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264672 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:58:27 +00:00
Sanjay Patel
daf1ffde5a fix CHECK_LABEL -> CHECK-LABEL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264671 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:56:48 +00:00
Sanjay Patel
76ae15b1ec trailing whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264670 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:52:53 +00:00
Simon Pilgrim
574e4b288d [X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op for all their scalar elements.
If all of a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTORs and just apply the bit operation to the vectors.

The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent.

It's not in our interest to start making a general-purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least.

Differential Revision: http://reviews.llvm.org/D18492

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264666 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:33:52 +00:00
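The transform in 574e4b288d relies on bitwise operations being independent per lane. The sketch below models a 4 x i32 vector as one packed Python integer (an illustrative assumption, as are all function names) and checks that "do the scalar ops, then build the vector" equals "build two vectors, then do one vector op".

```python
import operator

LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def build_vector(lanes):
    """Pack scalar lanes into a single wide integer, lane 0 in the low bits."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & LANE_MASK) << (index * LANE_BITS)
    return value


def scalar_ops_then_build(variables, constants, op):
    """Original form: one scalar bit op per element, then one BUILD_VECTOR."""
    return build_vector([op(v, c) for v, c in zip(variables, constants)])


def build_then_vector_op(variables, constants, op):
    """Rewritten form: a variable vector, a constant vector, one vector bit op."""
    return op(build_vector(variables), build_vector(constants))


variables = [0x12345678, 0xFFFFFFFF, 0x0, 0xDEADBEEF]
constants = [0x0000FFFF, 0x0F0F0F0F, 0x1, 0xFF00FF00]
for op in (operator.and_, operator.or_, operator.xor):
    assert scalar_ops_then_build(variables, constants, op) == \
        build_then_vector_op(variables, constants, op)
```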
Sanjay Patel
0969a1d57f fix CHECK_NEXT -> CHECK-NEXT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264661 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:14:24 +00:00
Matthias Braun
c1e6bf9a70 MIRParser: Add %subreg.xxx syntax for subregister index operands
Differential Revision: http://reviews.llvm.org/D18279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 18:18:46 +00:00
Matthias Braun
1598569f70 CodeGen: Correct specification of PHI nodes
They do have a def machine operand.

Fixing the definition is necessary for an upcoming patch.

Differential Revision: http://reviews.llvm.org/D18384

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264607 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 18:18:41 +00:00
Haicheng Wu
dd7070ecb9 [AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize
Mimic what x86 does when optimizing sdiv/udiv for minsize.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264606 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 18:17:07 +00:00
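For context on dd7070ecb9, the "shifts + mul sequence" replaces a division by a constant with a multiply by a precomputed reciprocal followed by a shift. The sketch below shows the well-known constants for unsigned 32-bit division by 3. It illustrates the general technique only and is not the code any backend emits; the names are hypothetical.

```python
MAGIC_DIV3 = 0xAAAAAAAB  # ceil(2**33 / 3)
SHIFT_DIV3 = 33


def udiv3_mul_shift(x: int) -> int:
    """Unsigned 32-bit x / 3 computed as a 64-bit multiply and a right shift."""
    if not 0 <= x < 2**32:
        raise ValueError("x must fit in 32 unsigned bits")
    return (x * MAGIC_DIV3) >> SHIFT_DIV3


for sample in (0, 1, 2, 3, 4, 100, 2**31, 2**32 - 1):
    assert udiv3_mul_shift(sample) == sample // 3
```

The sequence is usually faster than a hardware divide but takes more instruction bytes, which is presumably why the commit skips it under minsize.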
Hal Finkel
8541eae72c [PowerPC] On the A2, popcnt[dw] are very slow
The A2 cores support the popcntw/popcntd instructions, but they're microcoded,
and slower than our default software emulation. Specifically, popcnt[dw] take
approximately 74 cycles, whereas our software emulation takes only 24-28
cycles.

I've added a new target feature to indicate a slow popcnt[dw], instead of just
removing the existing target feature from the a2/a2q processor models, because:
  1. This allows us to return more accurate information via the TTI interface
     (I recognize that this currently makes no practical difference)
  2. Is hopefully easier to understand (it allows the core's features to match
     its manual while still having the desired effect).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264600 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 17:52:08 +00:00
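As background for 8541eae72c, a software population count is typically a short sequence of shifts, masks, and adds. The sketch below is the classic bit-parallel version for 32-bit values. It is an assumption that the "default software emulation" resembles this, since the commit message does not show the sequence.

```python
def popcount32(x: int) -> int:
    """Count set bits in a 32-bit value with the classic SWAR sequence."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                  # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Multiply sums all four bytes into the top byte; mask emulates 32-bit wrap.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


for sample in (0, 1, 0xFF, 0x80000000, 0xDEADBEEF, 0xFFFFFFFF):
    assert popcount32(sample) == bin(sample).count("1")
```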
Derek Schuff
fadd113c9b Introduce MachineFunctionProperties and the AllVRegsAllocated property
MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the declarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264593 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 17:05:30 +00:00
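The mechanism described in fadd113c9b can be modeled in a few lines. This is a simplified sketch, not the LLVM C++ API: the class layout, attribute names, and the two example passes are hypothetical, and only the idea (passes declare required, set, and cleared properties, and the base class verifies and applies them) comes from the commit message.

```python
ALL_VREGS_ALLOCATED = "AllVRegsAllocated"


class MachineFunction:
    def __init__(self):
        self.properties = set()  # properties that currently hold


class MachineFunctionPass:
    # Declarations a concrete pass may override.
    required_properties = frozenset()
    set_properties = frozenset()
    cleared_properties = frozenset()

    def run_on_machine_function(self, mf):
        raise NotImplementedError

    def run(self, mf):
        # The base class verifies requirements before running the pass...
        missing = self.required_properties - mf.properties
        if missing:
            raise RuntimeError(f"missing required properties: {sorted(missing)}")
        self.run_on_machine_function(mf)
        # ...and applies the declared set/clear effects afterwards.
        mf.properties |= self.set_properties
        mf.properties -= self.cleared_properties


class RegisterAllocator(MachineFunctionPass):
    set_properties = frozenset({ALL_VREGS_ALLOCATED})

    def run_on_machine_function(self, mf):
        pass  # allocation itself is out of scope for this sketch


class PostRAPass(MachineFunctionPass):
    required_properties = frozenset({ALL_VREGS_ALLOCATED})

    def run_on_machine_function(self, mf):
        pass


mf = MachineFunction()
RegisterAllocator().run(mf)
PostRAPass().run(mf)  # succeeds only because the allocator ran first
```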
Tom Stellard
4a79dec8e2 AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions
Summary:
This helps prevent load clustering from drastically increasing register
pressure by trying to cluster 4 SMRDx8 loads together.  The limit of 16
bytes was chosen, because it seems like that was the original intent
of setting the limit to 4 instructions, but more analysis could show
that a different limit is better.

This fix yields small decreases in register usage with shader-db, but
also helps avoid a large increase in register usage when lane mask
tracking is enabled in the machine scheduler, because lane mask tracking
enables more opportunities for load clustering.

shader-db stats:

2379 shaders in 477 tests
Totals:
SGPRS: 49744 -> 48600 (-2.30 %)
VGPRS: 34120 -> 34076 (-0.13 %)
Code Size: 1282888 -> 1283184 (0.02 %) bytes
LDS: 28 -> 28 (0.00 %) blocks
Scratch: 495616 -> 492544 (-0.62 %) bytes per wave
Max Waves: 6843 -> 6853 (0.15 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264589 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 16:10:13 +00:00
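The change in 4a79dec8e2 swaps an instruction-count limit for a byte-count limit. The sketch below contrasts the two heuristics. The function names, the exact comparison, and the assumption that an SMRDx8 load moves 8 dwords (32 bytes) are illustrative, not taken from the patch.

```python
MAX_CLUSTERED_LOADS = 4    # old limit, in instructions
MAX_CLUSTERED_BYTES = 16   # new limit, in bytes


def should_cluster_old(num_loads: int) -> bool:
    """Old heuristic: cluster up to a fixed number of load instructions."""
    return num_loads <= MAX_CLUSTERED_LOADS


def should_cluster_new(num_loads: int, bytes_per_load: int) -> bool:
    """New heuristic: cluster while the total loaded size stays within the cap."""
    return num_loads * bytes_per_load <= MAX_CLUSTERED_BYTES


# Four single-dword loads are clustered under both rules.
print(should_cluster_old(4), should_cluster_new(4, 4))
# Four 8-dword loads pass the old rule but not the new one.
print(should_cluster_old(4), should_cluster_new(4, 32))
```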
Krzysztof Parzyszek
3fb5c4321a [Hexagon] Improve handling of unaligned vector loads and stores
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264584 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 15:43:03 +00:00
Krzysztof Parzyszek
621888ea62 [Hexagon] Only use restore functions for single register at -Oz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264581 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 14:52:21 +00:00
Jacques Pienaar
cf0b01d7ec [lanai] Add Lanai backend.
Add the Lanai backend to lib/Target.

General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).

Differential Revision: http://reviews.llvm.org/D17011



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264578 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 13:09:54 +00:00
Elena Demikhovsky
ebc3cb93f0 AVX-512: Fixed ICMP instruction selection for i1 operands
ICMP instruction selection fails on SKX and KNL for i1 operand.
I use XOR to resolve:
(A == B) is equivalent to (A xor B) == 0

Differential Revision: http://reviews.llvm.org/D18511



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264566 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 07:47:58 +00:00
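The identity used in ebc3cb93f0, (A == B) is equivalent to (A xor B) == 0, can be checked exhaustively for i1 operands. The helper name below is hypothetical.

```python
def icmp_eq_i1_via_xor(a: int, b: int) -> bool:
    """Equality of two 1-bit values expressed as an XOR followed by a zero test."""
    return ((a ^ b) & 1) == 0


# Exhaustive truth table for 1-bit operands.
for a in (0, 1):
    for b in (0, 1):
        assert icmp_eq_i1_via_xor(a, b) == (a == b)
```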
Hal Finkel
68e74bdd60 [PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based loop legality
Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc
function calls (fmax[f], etc.) generally map to function calls after lowering.
For some vector types with QPX at least, however, we can legally lower these,
and we don't need to prohibit CTR-based loops on their account.

It turned out, however, that the logic that checked the opcodes associated with
intrinsics was broken (it would set the Opcode variable, but that variable was
later checked only if set for some otherwise-external function call).

This fixes the latter problem and adds the FMAX/MINNUM mappings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264532 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-27 05:40:56 +00:00
Simon Pilgrim
def1cb0575 [X86][AVX] Enabled SMUL_LOHI/UMUL_LOHI v8i32 vectors on AVX1 targets
Correct splitting of v8i32 vectors into v4i32 vectors to prevent scalarization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264517 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 18:32:13 +00:00
Simon Pilgrim
3711520788 [X86][AVX] Enabled MULHS/MULHU v16i16 vectors on AVX1 targets
Correct splitting of v16i16 vectors into v8i16 vectors to prevent scalarization

Differential Revision: http://reviews.llvm.org/D18307

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264512 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 15:44:55 +00:00
Simon Pilgrim
1ecfaf6f50 [X86][SSE] Add MULHS/MULHU custom lowering for i8 vectors
Currently this is mainly to prevent scalarization of integer division by constants.

Differential Revision: http://reviews.llvm.org/D18307

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264511 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 15:27:20 +00:00
Simon Pilgrim
e1019f460b [X86][SSE] Added v64i8 vector integer multiply tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264510 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 09:50:06 +00:00
Simon Pilgrim
fac6a88e2b [X86][AVX512BW] AVX512BW can sign-extend v32i8 to v32i16 for simpler v32i8 multiplies.
Only pre-AVX512BW targets need to split v32i8 vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264509 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 09:44:27 +00:00
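The approach in fac6a88e2b works because the low 8 bits of a product do not depend on the upper bits of the widened operands. The sketch below checks this for a single lane. The helper names are hypothetical and the code models one i8 element, not a v32i8 vector.

```python
def sext8_to_16(x: int) -> int:
    """Sign-extend an 8-bit value to 16 bits (returned as an unsigned bit pattern)."""
    x &= 0xFF
    return (x | 0xFF00) if x & 0x80 else x


def mul_i8_direct(a: int, b: int) -> int:
    """Reference i8 multiply: the product truncated to 8 bits."""
    return ((a & 0xFF) * (b & 0xFF)) & 0xFF


def mul_i8_via_i16(a: int, b: int) -> int:
    """Widen both operands to i16, multiply in 16 bits, truncate back to i8."""
    wide = (sext8_to_16(a) * sext8_to_16(b)) & 0xFFFF
    return wide & 0xFF


# Exhaustive check over all pairs of 8-bit inputs.
assert all(
    mul_i8_direct(a, b) == mul_i8_via_i16(a, b)
    for a in range(256)
    for b in range(256)
)
```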
David Majnemer
4fda227053 [PowerPC] Disable the CTR optimization in the presence of {min,max}num
The minnum and maxnum intrinsics get lowered to libcalls which
invalidates the CTR optimization.

This fixes PR27083.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264508 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 09:42:31 +00:00
Simon Pilgrim
e2ec01fe07 [X86][SSE] Refreshed vector integer multiply tests
Add all 256-bit vector tests.
Added AVX512F/AVX512BW test targets.
Renamed tests to something more meaningful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264507 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-26 09:35:48 +00:00
David Majnemer
4b3682d24a [X86] Emit a proper ADJCALLSTACKDOWN in EmitLoweredTLSAddr
We forgot to add the second machine operand to our ADJCALLSTACKDOWN,
resulting in crashes in PEI.

This fixes PR27071.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264465 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 21:49:11 +00:00
Jun Bum Lim
386c67f7fc [MachineCopyPropagation] Expose more dead copies across instructions with regmasks
When encountering instructions with regmasks, instead of cleaning up all the
elements in MaybeDeadCopies map, remove only the instructions erased. By keeping
more instructions in MaybeDeadCopies, this change will expose more dead copies
across instructions with regmasks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264462 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 21:15:35 +00:00
Nirav Dave
a9f320779b Prevent construction of cycle in DAG store merge
When merging stores in DAGCombiner, add check to ensure that no
dependencies exist that would cause the construction of a cycle in our
DAG.  This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.

This test is only done in cases where alias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.

Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight

Subscribers: llvm-commits, tberghammer, danalbert, srhines

Differential Revision: http://reviews.llvm.org/D18336

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264461 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 21:06:30 +00:00
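The hazard described in a9f320779b can be reproduced on a toy dependency graph. In the sketch below each node maps to the set of nodes it depends on. The node names, the merge helper, and the cycle check are all hypothetical illustrations, not DAGCombiner code.

```python
def merge_nodes(deps, merged, new_name):
    """Return a copy of the graph with every node in `merged` replaced by `new_name`."""
    result = {}
    for node, operands in deps.items():
        key = new_name if node in merged else node
        renamed = {new_name if op in merged else op for op in operands}
        result.setdefault(key, set()).update(renamed)
    # Dependencies between the merged stores themselves disappear.
    result.get(new_name, set()).discard(new_name)
    return result


def has_cycle(deps):
    """Depth-first search that reports a back edge to a node still in progress."""
    in_progress, done = set(), set()

    def visit(node):
        in_progress.add(node)
        for operand in deps.get(node, ()):
            if operand in in_progress:
                return True
            if operand not in done and visit(operand):
                return True
        in_progress.discard(node)
        done.add(node)
        return False

    return any(node not in done and visit(node) for node in list(deps))


# store2 stores a loaded value; that load is chained after store1.
graph = {
    "store1": {"entry"},
    "load": {"store1"},           # chain dependence on store1
    "store2": {"load", "entry"},  # data dependence on the load
}
print(has_cycle(graph))
print(has_cycle(merge_nodes(graph, {"store1", "store2"}, "merged_store")))
```

After the merge, the combined store needs the load's value while the load must still come after the combined store, so a check of this kind has to reject the merge.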
Saleem Abdulrasool
bac79c5a0a ARM: maintain BB ordering when expanding WIN__DBZCHK
It is possible to have a fallthrough MBB prior to MBB placement.  The original
addition of the BB would result in reordering the BB as not preceding the
successor.  Because of the fallthrough nature of the BB, we could end up
executing incorrect code or even a constant pool island!  Insert the spliced BB
into the same location to avoid that.

Thanks to Tim Northover for invaluable hints and Fiora for the discussion on
what may have been occurring!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264454 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 19:48:06 +00:00
Hans Wennborg
3b7626fb55 [X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsize
64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes,
respectively, whereas and/or with 8-bit immediate is only three bytes.

Since these instructions imply an additional memory read (which the CPU could
elide, but we don't think it does), restrict these patterns to minsize functions.

Differential Revision: http://reviews.llvm.org/D18374

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264440 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 18:11:31 +00:00
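The trick in 3b7626fb55 depends on two identities: x AND 0 is always 0, and x OR -1 is always all ones, regardless of the old memory contents. The sketch below checks both and tabulates the size savings using only the byte counts quoted in the commit message. The helper names are hypothetical.

```python
# Encoding sizes in bytes, as stated in the commit message.
MOV_IMM_BYTES = {64: 7, 32: 6, 16: 5}
AND_OR_IMM8_BYTES = 3


def store_zero_via_and(old_value: int, bits: int) -> int:
    """Result of `and $0` on a memory operand: always zero."""
    return (old_value & 0) & ((1 << bits) - 1)


def store_minus_one_via_or(old_value: int, bits: int) -> int:
    """Result of `or $-1` on a memory operand: always the all-ones bit pattern."""
    return (old_value | -1) & ((1 << bits) - 1)


def bytes_saved(bits: int) -> int:
    """Code size saved per store by using and/or instead of a move-immediate."""
    return MOV_IMM_BYTES[bits] - AND_OR_IMM8_BYTES


for width in (16, 32, 64):
    for old in (0, 1, 0x1234, (1 << width) - 1):
        assert store_zero_via_and(old, width) == 0
        assert store_minus_one_via_or(old, width) == (1 << width) - 1
    print(width, bytes_saved(width))
```

Both forms read the old value before writing, which is the extra memory read the message cites as the reason to restrict the patterns to minsize functions.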
Hans Wennborg
6a62eecdb6 X86: Use push-pop for materializing 8-bit immediates for minsize (take 2)
This is the same as r255936, with added logic for avoiding clobbering of the
red zone (PR26023).

Differential Revision: http://reviews.llvm.org/D18246

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264375 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 01:10:56 +00:00
Saleem Abdulrasool
c737b6fc30 ARM: fix optimised division on WoA
We did not have an explicit branch to the continuation BB.  When the check was
hoisted, this could permit control flow to fall through into the division
trap.  Add the explicit branch to the continuation basic block to ensure that
code execution is correct.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264370 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 00:34:11 +00:00
Manman Ren
7121713797 CXX TLS: collect return blocks after SelectAllBasicBlocks.
It is incorrect to get the corresponding MBB for a ReturnInst before
SelectAllBasicBlocks since SelectAllBasicBlocks can change the
correspondence between a ReturnInst and the MBB it is in.

PR27062


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264358 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 23:21:29 +00:00
Sanjoy Das
abd2448893 Lower varargs correctly in deopt bundle lowering
Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because
populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264354 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 22:37:52 +00:00
Matthias Braun
afb111acf7 LiveInterval: Fix Distribute() failing on liveranges with unused VNInfos
This fixes http://llvm.org/PR26991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 21:41:38 +00:00
Eric Christopher
ee2d71dfcd Finish the incomplete 'd' inline asm constraint support for PPC by
making sure we give it a register and mark it as a register constraint.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 21:04:52 +00:00
Eric Christopher
aa3cfdd253 Reorder check lines, comments in test and remove unnecessary IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 21:04:47 +00:00
Sanjoy Das
2f8286bd50 Match call and target calling conventions in test
Fixes an issue in rL264329.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264337 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 20:51:24 +00:00
Sanjoy Das
8ba5ebc098 Add lowering support for llvm.experimental.deoptimize
Summary:
Only adds support for "naked" calls to llvm.experimental.deoptimize.
Support for round-tripping through RewriteStatepointsForGC will come
as a separate patch (should be simpler than this one).

Reviewers: reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264329 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 20:23:29 +00:00
Krzysztof Parzyszek
d4af74e531 [Hexagon] Add support for run-time stack overflow checking
Patch by Sundeep Kushwaha.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264328 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 20:20:07 +00:00
Krzysztof Parzyszek
deffc78c77 [Hexagon] Generate PIC-specific versions of save/restore routines
In PIC mode, the registers R14, R15 and R28 are reserved for use by
the PLT handling code. This causes all functions to clobber these
registers. While this is not new for regular function calls, it does
also apply to save/restore functions, which do not follow the standard
ABI conventions with respect to the volatile/non-volatile registers.

Patch by Jyotsna Verma.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264324 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-24 19:18:48 +00:00