Commit Graph

1438 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin
f304f044ed [AMDGPU] Prevent spills before exec mask is restored
Inline spiller can decide to move a spill as early as possible in the basic block.
It will skip phis and label, but we also need to make sure it skips instructions
in the basic block prologue which restore exec mask.

Added isPositionLike callback in TargetInstrInfo to detect instructions which
shall be skipped in addition to common phis, labels etc.

Differential Revision: https://reviews.llvm.org/D27997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 00:44:31 +00:00
Stanislav Mekhanoshin
b8fa7c40ea [AMDGPU] Add exec copy to LiveIntervals in SILowerControlFlow::emitElse
This instruction is missing from LiveIntervals.
I'm not aware of any problems because of this though.

Differential Revision: https://reviews.llvm.org/D28879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292521 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 21:26:22 +00:00
Matt Arsenault
261f60f486 AMDGPU: Disable some fneg combines unless nsz
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.

fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292473 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 06:35:27 +00:00
Matt Arsenault
cfe56d7c95 AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.

This should be applied to the release branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292472 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 06:04:12 +00:00
Stanislav Mekhanoshin
d78f00a4d1 [AMDGPU] Do not allow register coalescer to create big superregs
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.

With more super-regs we would need to allocate adjacent registers
and constraint regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.

Differential Revision: https://reviews.llvm.org/D28782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292413 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-18 17:30:05 +00:00
Sam Kolton
9298246497 [AMDGPU] Assembler: fix v_mac_f16 immediates
Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292224 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-17 15:26:02 +00:00
Matt Arsenault
62b3258a7c AMDGPU: Add replacement export intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292205 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-17 07:26:53 +00:00
Matt Arsenault
d4ac29ff29 AMDGPU: Remove dead pattern
This is the unsafe conversion pattern, but not guarded by
an unsafe math check. It is also already done in LegalizeDAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292173 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-17 00:10:43 +00:00
Jan Vesely
6d821c2f7c ADMGPU/EG,CM: Implement _noret global atomics
_RTN versions will be a lot more complicated

Differential Revision: https://reviews.llvm.org/D28067

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292162 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-16 21:20:13 +00:00
Konstantin Zhuravlyov
999a6572f3 [AMDGPU] Implement f16 fcopysign and fcopysign(f32, f64)
Differential Revision: https://reviews.llvm.org/D28496


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291954 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 19:49:25 +00:00
Benjamin Kramer
1fb85c6675 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 14:39:03 +00:00
Diana Picus
8a47810cd6 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 09:58:52 +00:00
Matt Arsenault
cd002582ba AMDGPU: Skip fneg/select combine if it can fold into other
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291792 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:58:15 +00:00
Matt Arsenault
9db1ec3d4d AMDGPU: Fold free fneg into sin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291790 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:48:09 +00:00
Matt Arsenault
49dd8fcb21 AMDGPU: Fold fneg into fmul_legacy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:26:30 +00:00
Matt Arsenault
bd870734a5 AMDGPU: Fold fneg into rcp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:35 +00:00
Matt Arsenault
cca494fd03 AMDGPU: Fold fneg into fp_round
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:33 +00:00
Matt Arsenault
e652041f69 AMDGPU: Fold fneg into fp_extend
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:28 +00:00
Matt Arsenault
3517370c4d AMDGPU: Fix sub_oneuse being marked commutative
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291748 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 07:17:28 +00:00
Matt Arsenault
94bf68d551 AMDGPU: Fold fneg into fma or fmad
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:32:16 +00:00
Matt Arsenault
ef33822be5 AMDGPU: Fold fneg into fmul
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:23:20 +00:00
Matt Arsenault
bcf34bbbdd AMDGPU: Fold fneg into fadd
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:09:34 +00:00
Matt Arsenault
8694e2f853 AMDGPU: Pull fneg/fabs out of a select
Allows better source modifier usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 23:57:38 +00:00
Matt Arsenault
f1e95d3604 AMDGPU: Fix shrinking of addc/subb.
To shrink to VOP2 the input carry must also be VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:58:12 +00:00
Matt Arsenault
fac51240d9 AMDGPU: Fix sext_inreg for i1 in i16
This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:35:22 +00:00
Matt Arsenault
8c7e9845cf AMDGPU: Fix breaking VOP3 v_add_i32s
This was shrinking the instruction even though the carry output
register was a virtual register, not known VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:35:17 +00:00
Matt Arsenault
c6b1aed80d AMDGPU: Fix folding immediates into mac src2
Whether it is legal or not needs to check for the instruction
it will be replaced with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:00:02 +00:00
Sam Kolton
e8d33c0464 [AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands
Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291668 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 11:46:30 +00:00
Mohammed Agabaria
9c6b24cc3a [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. 
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291657 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 08:23:37 +00:00
Jan Vesely
53dcfdf89b AMDGPU/EG,CM: Add fp16 conversion instructions
Differential Revision: https://reviews.llvm.org/D28164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291622 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 00:12:39 +00:00
Matt Arsenault
1639229587 AMDGPU: Constant fold when immediate is materialized
In future commits these patterns will appear after moveToVALU changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291615 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 23:32:04 +00:00
Matt Arsenault
da59cd0847 AMDGPU: Add tests for HasMultipleConditionRegisters
This was enabled without many specific tests or the comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 19:08:15 +00:00
Matt Arsenault
f7c0d4013c AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.

Fixes code quality regressions vs. SI in min.ll test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291461 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-09 18:52:39 +00:00
Matt Arsenault
155581a09a Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291460 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-09 18:44:11 +00:00
Jan Vesely
0835374acb AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes
This will make transition to SCRATCH_MEMORY easier

Differential Revision: https://reviews.llvm.org/D24746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 21:00:46 +00:00
Konstantin Zhuravlyov
f6f52a315e [AMDGPU] Remove extra semicolon. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291246 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 17:23:21 +00:00
Konstantin Zhuravlyov
9060577664 [AMDGPU] Do not emit .AMDGPU.config section for amdhsa
Differential Revision: https://reviews.llvm.org/D27732


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291245 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 17:02:10 +00:00
Evgeniy Stepanov
795e15e398 Revert "Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")"
Summary: This reverts commit r291144. It breaks build bots.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/3270, http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/2058

lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:1638:12: error: could not convert ‘(const unsigned int*)(& Variants)’ from ‘const unsigned int*’ to ‘llvm::ArrayRef<unsigned int>’
     return Variants;

Reviewers: eugenis, tstellarAMD

Patch by Alex Shlyapnikov.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D28372

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291168 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-05 19:51:13 +00:00
Matt Arsenault
5bd3cb572f Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")
Arrays are supposed to be static const

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291144 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-05 17:36:11 +00:00
Richard Smith
866c5c1860 Revert r291025 ("AMDGPU: Remove unneccessary intermediate vector")
This caused buildbot failures due to returning ArrayRefs referencing local
(temporary) objects.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291067 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-05 03:13:10 +00:00
Matt Arsenault
cc6adc86e9 AMDGPU: Remove unneccessary intermediate vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291025 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 22:54:10 +00:00
Jan Vesely
bf64cb107c AMDGPU/SI: Implement sendmsghalt intrinsic
v2: expose using amdgcn prefix

Differential Revision: https://reviews.llvm.org/D23511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290977 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 18:06:55 +00:00
Artem Tamazov
097cd5f5b3 [AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive
Among other stuff, this allows to use predefined .option.machine_version_major
/minor/stepping symbols in the directive.

Relevant test expanded at once (also file renamed for clarity).

Differential Revision: https://reviews.llvm.org/D28140

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290710 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-29 15:41:52 +00:00
Artem Tamazov
d8dc65b207 [AMDGPU][llvm-mc] Predefined symbols to access register counts (.kernel.{v|s}gpr_count)
The feature allows for conditional assembly, filling the entries
of .amd_kernel_code_t etc.

Symbols are defined with value 0 at the beginning of each kernel scope.
After each register usage, the respective symbol is set to:
	value = max( value, ( register index + 1 ) )
Thus, at the end of scope the value represents a count of used registers.

Kernel scopes begin at .amdgpu_hsa_kernel directive, end at the
next .amdgpu_hsa_kernel (or EOF, whichever comes first). There is also
dummy scope that lies from the beginning of source file til the
first .amdgpu_hsa_kernel.

Test added.

Differential Revision: https://reviews.llvm.org/D27859

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-27 16:00:11 +00:00
Sam Kolton
79df598ffb [AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions
Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290599 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-27 10:06:42 +00:00
Jan Vesely
0547dafe38 AMDGPU: split ret/noret patterns for global atomics
Differential Revision: https://reviews.llvm.org/D27989

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290435 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-23 15:34:51 +00:00
Chandler Carruth
3abadf484b Enable '-Wstring-conversion' and fix some bad asserts that it helped
find.

Notable is the assert in NewGVN which had no effect because of the bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290400 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-23 01:38:06 +00:00
Matt Arsenault
e2b3286a26 AMDGPU: Invert cmp + select with constant
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.

This seems to usually be better but causes some code size regressions
in some tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290372 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 21:40:08 +00:00
Matt Arsenault
ad47821c65 AMDGPU: Use i16 for i16 shift amount
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290351 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 16:36:25 +00:00
Matt Arsenault
b1e16f2d3a AMDGPU: Fix missing 16-bit cmpx instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290349 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 16:27:14 +00:00