Commit Graph

20130 Commits

Author SHA1 Message Date
Tim Northover
e3a136c5b5 GlobalISel: check for CImm rather than Imm on G_CONSTANTs.
All G_CONSTANTS created by the MachineIRBuilder have an operand of type CImm
(i.e. a ConstantInt), so that's what the selector needs to look for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296176 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 21:21:38 +00:00
Sanjay Patel
f7f0f08973 [ARM] auto-generate complete checks; NFC
The affected test may change with a patch I'm looking at for DAGCombiner,
so I want to make sure it's not a regression.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296175 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 21:19:09 +00:00
Dan Gohman
3d2ef31552 [WebAssembly] Handle f16 in fast-isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296172 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 21:05:35 +00:00
Michael Kuperstein
98ee128c8e [CGP] Split some critical edges coming out of indirect branches
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.

This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about ~100
defs of registers containing constants, which we in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about a 100 constants to stack. The end result is that a
clang-compiled python interpreter can be about ~2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.

Differential Revision: https://reviews.llvm.org/D29916


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296149 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 18:41:32 +00:00
Nemanja Ivanovic
ab05c2009d [PowerPC] Use subfic instruction for subtract from immediate
Provide a 64-bit pattern to use SUBFIC for subtracting from a 16-bit immediate.
The corresponding pattern already exists for 32-bit integers.

Committing on behalf of Hiroshi Inoue.

Differential Revision: https://reviews.llvm.org/D29387


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296144 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 18:16:06 +00:00
Nemanja Ivanovic
92606b3cc6 [PowerPC] Use rldicr instruction for AND with an immediate if possible
Emit clrrdi (extended mnemonic for rldicr) for AND-ing with masks that
clear bits from the right hand size.

Committing on behalf of Hiroshi Inoue.

Differential Revision: https://reviews.llvm.org/D29388


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296143 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 18:03:16 +00:00
Sanjay Patel
9a9478ccb0 [DAGCombiner] add missing folds for scalar select of {-1,0,1}
The motivation for filling out these select-of-constants cases goes back to D24480, 
where we discussed removing an IR fold from add(zext) --> select. And that goes back to:
https://reviews.llvm.org/rL75531
https://reviews.llvm.org/rL159230

The idea is that we should always canonicalize patterns like this to a select-of-constants 
in IR because that's the smallest IR and the best for value tracking. Note that we currently 
do the opposite in some cases (like the cases in *this* patch). Ie, the proposed folds in 
this patch already exist in InstCombine today:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151

As this patch shows, most targets generate better machine code for simple ext/add/not ops 
rather than a select of constants. So the follow-up steps to make this less of a patchwork 
of special-case folds and missing IR canonicalization:

1. Have DAGCombiner convert any select of constants into ext/add/not ops.
2  Have InstCombine canonicalize in the other direction (create more selects).

Differential Revision: https://reviews.llvm.org/D30180


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296137 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 17:17:33 +00:00
Simon Dardis
f64a815fb7 Recommit "[mips] Fix atomic compare and swap at O0."
This time with the missing files.

Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the store can cause the
atomic sequence to fail.

This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.

This resolves PR/32020.

Thanks to James Cowgill for reporting the issue!

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D30257



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296134 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 16:32:18 +00:00
Simon Dardis
3a8812c0db Revert "[mips] Fix atomic compare and swap at O0."
This reverts r296132. I forgot to include the tests.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296133 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 16:30:27 +00:00
Simon Dardis
2468c7d5ea [mips] Fix atomic compare and swap at O0.
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the store can cause the
atomic sequence to fail.

This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.

This resolves PR/32020.

Thanks to James Cowgill for reporting the issue!

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D30257



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296132 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 16:27:45 +00:00
Daniel Sanders
bf21af7b42 [globalisel] Decouple src pattern operands from dst pattern operands.
Summary:
This isn't testable for AArch64 by itself so this patch also adds
support for constant immediates in the pattern and physical
register uses in the result.

The new IntOperandMatcher matches the constant in patterns such as
'(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold
immediates into an instruction so this is the first rule that will match
across multiple BB's.

The Renderer hierarchy is responsible for adding operands to the result
instruction. Renderers can copy operands (CopyRenderer) or add physical
registers (in particular %wzr and %xzr) to the result instruction
in any order (OperandMatchers now import the operand names from
SelectionDAG to allow renderers to access any operand). This allows us to
emit the result instruction for:
  %1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0
  %1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0
although the latter is untested since the matcher/importer has not been
taught about commutativity yet.

Added BuildMIAction which can build new instructions and mutate them where
possible. W.r.t the mutation aspect, MatchActions are now told the name of
an instruction they can recycle and BuildMIAction will emit mutation code
when the renderers are appropriate. They are appropriate when all operands
are rendered using CopyRenderer and the indices are the same as the matcher.
This currently assumes that all operands have at least one matcher.

Finally, this change also fixes a crash in
AArch64InstructionSelector::select() caused by an immediate operand
passing isImm() rather than isCImm(). This was uncovered by the other
changes and was detected by existing tests.

Depends on D29711

Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar

Reviewed By: rovka

Subscribers: aemerson, dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D29712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296131 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 15:43:30 +00:00
Diana Picus
27abbacf66 [ARM] GlobalISel: Select G_STORE
Same as selecting G_LOAD.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296122 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 14:01:27 +00:00
Diana Picus
dba394d354 Minor test fix
The test was using a size of 8 for loading/storing pointers. It should be 4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296120 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 13:27:55 +00:00
Diana Picus
11601c0bd3 [ARM] GlobalISel: Add reg bank mappings for stores
Same as the ones for loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296115 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 13:07:25 +00:00
Diana Picus
29289da775 [ARM] GlobalISel: Legalize stores
Allow the same types that we allow for loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296108 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 11:28:24 +00:00
Diana Picus
269c5fd3e9 Revert "[ARM] GlobalISel: Legalize stores"
This reverts commit r296103 because the test broke on one of the bots. Sorry!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296104 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 10:35:39 +00:00
Diana Picus
03012a011d [ARM] GlobalISel: Legalize stores
Allow the same types that we allow for loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296103 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 10:19:23 +00:00
Craig Topper
474a418aa7 [AVX-512] Remove lzcnt intrinsics and autoupgrade them to generic ctlz intrinsics with select.
Clang has been emitting cltz intrinsics for a while now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296091 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 05:35:04 +00:00
Craig Topper
cc2f430f22 [AVX-512] Move lzcnt and conflict intrinsic tests to avx512cd intrinsic test file since that's their feature.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296090 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 05:34:59 +00:00
Craig Topper
e35238cd60 [AVX-512] Use update_llc_test_checks.py to generate a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296089 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 05:34:57 +00:00
Petr Hosek
193856a80c [Fuchsia] Use thread-pointer ABI slots for stack-protector and safe-stack
The Fuchsia ABI defines slots from the thread pointer where the
stack-guard value for stack-protector, and the unsafe stack pointer
for safe-stack, are stored. This parallels the Android ABI support.

Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D30237

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296081 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 03:10:10 +00:00
Eli Friedman
f884f5ab95 Add some testcases for bitfields with illegal widths.
clang will generate IR like this for input using packed bitfields;
very simple semantically, but it's a bit tricky to actually
generate good code.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296080 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 03:04:11 +00:00
Eli Friedman
c39d5e685a Fix old testcase for dead store to match the original intent.
The x86 backend has a special case for load+xor+store, which isn't really
what this is trying to test.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296077 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 02:58:49 +00:00
Adam Nemet
c1a4dd37ad [LazyMachineBFI] Add testcase
This is based on Justin's testcase and checking whether BFI is not populated
in case hotness is off.

This is a patch meant on top of Justin's patch to enable Machine opt-remarks
in the
AsmPrinter (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170130/426595.html)

Differential Revision: https://reviews.llvm.org/D29837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296065 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 01:22:55 +00:00
Michael Kuperstein
969577f54d Revert r269060 to pacify bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296064 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 01:22:19 +00:00
Justin Bogner
3c86e87de3 OptDiag: Add test for r296053
Forgot to commit this with the change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296061 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 01:13:09 +00:00
Michael Kuperstein
8981fc9888 [CGP] Split some critical edges coming out of indirect branches
Splitting critical edges when one of the source edges is an indirectbr
is hard in general (because it requires changing the memory the indirectbr
reads). But if a block only has a single indirectbr predecessor (which is
the common case), we can simulate splitting that edge by splitting
the destination block, and retargeting the *direct* branches.

This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame()
ends up using an indirect branch with ~100 successors, and passing a constant to
each of those. Since MachineSink can't break indirect critical edges on demand
(and doing this in MIR doesn't look feasible), this causes us to emit about ~100
defs of registers containing constants, which we in the predecessor block, where
only one of those constants is used in each successor. So, at each computed goto,
we needlessly spill about a 100 constants to stack. The end result is that a
clang-compiled python interpreter can be about ~2.5x slower on a simple python
reduction loop than a gcc-compiled interpreter.

Differential Revision: https://reviews.llvm.org/D29916


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296060 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 00:56:21 +00:00
Artem Belevich
6bc216ccf6 [NVPTX] Added support for .f16x2 instructions.
This patch enables support for .f16x2 operations.

Added new register type Float16x2.
Added support for .f16x2 instructions.
Added handling of vectorized loads/stores of v2f16 values.

Differential Revision: https://reviews.llvm.org/D30057
Differential Revision: https://reviews.llvm.org/D30310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296032 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 22:38:24 +00:00
Tim Northover
a328146a75 ARM: make sure FastISel bails on f64 operations for Cortex-M4.
FastISel wasn't checking the isFPOnlySP subtarget feature before emitting
double-precision operations, so it got completely invalid CodeGen for doubles
on Cortex-M4F.

The normal ISel testing wasn't spectacular either so I added a second RUN line
to improve that while I was in the area.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296031 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 22:35:00 +00:00
Krzysztof Parzyszek
9bb4f10172 [Hexagon] Handle saturations in Hexagon bit tracker
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296026 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 22:11:52 +00:00
Evgeniy Stepanov
23c98ecd0f Disable TLS for stack protector on Android API<17.
The TLS slot did not exist back then.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296014 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 21:06:35 +00:00
Ahmed Bougacha
65d76e1285 [GlobalISel] Emit opt remarks on isel fallbacks.
Having more fine-grained information on the specific construct that
caused us to fallback is valuable for large-scale data collection.

We still have the fallback warning, that's also used for FastISel.
We still need to remove the fallback warning, and teach FastISel to also
emit remarks (it currently has a combination of the warning, stats, and
debug prints: the remarks could unify all three).

The abort-on-fallback path could also be better handled using remarks:
one could imagine a "-Rpass-error", analoguous to "-Werror", which would
promote missed/failed remarks to errors.  It's not clear whether that
would be useful for other remarks though, so we're not there yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296013 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 21:05:42 +00:00
Stanislav Mekhanoshin
0bf4d71d50 Correct register pressure calculation in presence of subregs
If a subreg is used in an instruction it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
improper register pressure calculation by examining operand's lane mask.

Differential Revision: https://reviews.llvm.org/D29835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296009 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 20:19:44 +00:00
Krzysztof Parzyszek
ab761176c9 [Hexagon] Avoid IMPLICIT_DEFs as new-value producers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295997 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 17:47:34 +00:00
Jan Vesely
dae323db22 AMDGPU/SI: Fix trunc i16 pattern
Hit on ASICs that support 16bit instructions.

Differential Revision: https://reviews.llvm.org/D30281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295990 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 16:12:21 +00:00
Krzysztof Parzyszek
69d8d82e3c [Hexagon] Patterns for CTPOP, BSWAP and BITREVERSE
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295981 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 15:02:09 +00:00
Diana Picus
31d09e83a5 [ARM] GlobalISel: Lower call returns
Introduce a common ValueHandler for call returns and formal arguments, and
inherit two different versions for handling the differences (at the moment the
only difference is the way physical registers are marked as used).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295973 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 14:18:41 +00:00
Diana Picus
5e98318841 [ARM] GlobalISel: Lower call parameters in regs
Add support for lowering calls with parameters than can fit into regs.  Use the
same ValueHandler that we used for function returns, but rename it to match its
new, extended purpose.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295971 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 13:25:43 +00:00
Ayman Musa
6f30b9797e [X86][AVX] Disable VCVTSS2SD & VCVTSD2SS memory folding and fix the register class of their first input when creating node in fast-isel.
(Quick fix to buildbot failure after rL295940 commit).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295970 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 13:15:44 +00:00
Kristof Beyls
93418eeb1d Fix assertion failure in ARMConstantIslandPass.
The ARMConstantIslandPass didn't have support for handling accesses to
constant island objects through ARM::t2LDRBpci instructions. This adds
support for that.

This fixes PR31997.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295964 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 12:24:55 +00:00
Ayman Musa
70ad23eba8 [X86][AVX512] Change VCVTSS2SD and VCVTSD2SS node types to keep consistency between VEX/EVEX versions.
AVX versions of the converts work on f32/f64 types, while AVX512 version work on vectors.

Differential Revision: https://reviews.llvm.org/D29988



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 07:24:21 +00:00
Matt Arsenault
32a81bbff2 AMDGPU: Add another BFE pattern
This is the pattern that falls out of the instruction's
definition if offset == 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295912 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 00:23:43 +00:00
Matt Arsenault
cd39b42cab AMDGPU: Use clamp with f64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295908 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:53:37 +00:00
Matt Arsenault
e184e01dd7 AMDGPU: Fold FP clamp as modifier bit
The manual is unclear on the details of this. It's not
clear to me if denormals are not allowed with clamp,
or if that is only omod. Not allowing denorms for
fp16 or fp64 isn't useful so I also question if that
is really a restriction. Same with whether this is valid
without IEEE mode enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:27:53 +00:00
Wei Ding
1cfed01e02 AMDGPU : Update TrapCode based on Trap Handler ABI.
Differential Revision: http://reviews.llvm.org/D30232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:22:19 +00:00
Matt Arsenault
c2d34b5027 AMDGPU: Add replacement bfe intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295899 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:04:58 +00:00
Dylan McKay
ec26388916 [AVR] Disable integrated assembler for a few tests
Fixes the build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295895 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:41:13 +00:00
Krzysztof Parzyszek
e9d7ca1b92 [Hexagon] Implement @llvm.readcyclecounter()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295892 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:28:47 +00:00
Matt Arsenault
206dfa3c0d AMDGPU: Don't add emergency stack slot if all spills are SGPR->VGPR
This should avoid reporting any stack needs to be allocated in the
case where no stack is truly used. An unused stack slot is still
left around in other cases where there are real stack objects
but no spilling occurs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:23:32 +00:00
Krzysztof Parzyszek
7af390a681 [Hexagon] Add intrinsics for masked vector stores
Patch by Harsha Jagasia.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295879 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:23:09 +00:00