10 Commits

Author SHA1 Message Date
Sam Kolton
3481b0868e [AMDGPU] Resubmit SDWA peephole: enable by default
Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299654 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 15:03:28 +00:00
Ivan Krasin
f836c7dbde Revert r299536. [AMDGPU] SDWA peephole: enable by default.
Reason: breaks multiple bots:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173

Original Review URL: https://reviews.llvm.org/D31671



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299583 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 19:58:12 +00:00
Sam Kolton
4523389543 [AMDGPU] SDWA peephole: enable by default
Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299536 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 12:00:45 +00:00
Matt Arsenault
d706d030af AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 21:39:51 +00:00
Matt Arsenault
0c52bece01 AMDGPU: Fix unnecessary ands when packing f16 vectors
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 19:04:26 +00:00
Matt Arsenault
430953ebc8 DAG: Constant fold fp16_to_fp/fp16_to_fp
This fixes emitting conversions of constants on targets
without legal f16 that need to use these for legalization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293499 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 16:57:41 +00:00
Matt Arsenault
d019e8638a Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292982 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-24 22:02:15 +00:00
Matt Arsenault
e2b3286a26 AMDGPU: Invert cmp + select with constant
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.

This seems to usually be better but causes some code size regressions
in some tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290372 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 21:40:08 +00:00
Matt Arsenault
8d631491b3 AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289306 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-10 00:39:12 +00:00
Konstantin Zhuravlyov
7f4d4fdf3f [AMDGPU] Handle f16 select{_cc}
- Select `select` to `v_cndmask_b32`
- Expand `select_cc`
- Refactor patterns

Differential Revision: https://reviews.llvm.org/D26714


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 03:16:26 +00:00