Commit Graph

168 Commits

Author SHA1 Message Date
Matt Arsenault
2e07d5cebb GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365658 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-10 16:31:19 +00:00
Tom Stellard
b97830df46 AMDGPU/GlobalISel: Add support for wide loads >= 256-bits
Summary:
This adds support for the most commonly used wide load types:
<8xi32>, <16xi32>, <4xi64>, and <8xi64>

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57399

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365586 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-10 00:22:41 +00:00
Matt Arsenault
13b473d248 GlobalISel: Implement lower for G_FCOPYSIGN
In SelectionDAG AMDGPU treated these as legal, but this was mostly
because the bitcasts required for FP types were painful. Theoretically
the bitpattern should eventually match to bfi, so don't bother trying
to get the patterns to import.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365583 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-09 23:34:29 +00:00
Matt Arsenault
9b44e7d33a AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTOR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365575 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-09 22:48:04 +00:00
Matt Arsenault
0d10177cd3 AMDGPU/GlobalISel: Legalize more concat_vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365488 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-09 14:17:31 +00:00
Matt Arsenault
a568e61708 AMDGPU/GlobalISel: Make s16 G_ICMP legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365486 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-09 14:10:43 +00:00
Matt Arsenault
fd0d82251d AMDGPU/GlobalISel: Select G_MERGE_VALUES
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365482 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-09 14:02:20 +00:00
Matt Arsenault
d51fdec4e0 AMDGPU/GlobalISel: Handle more input argument intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364836 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:50:50 +00:00
Matt Arsenault
51744f169f AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364835 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:49:01 +00:00
Matt Arsenault
c087c304f7 AMDGPU/GlobalISel: Legalize workgroup ID intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364834 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:47:22 +00:00
Matt Arsenault
fa7c34bbaf AMDGPU/GlobalISel: Legalize workitem ID intrinsics
Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.

Test is copied directly from the existing test, with 2 additions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364833 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:45:36 +00:00
Matt Arsenault
643e4ad7f2 AMDGPU/GlobalISel: Custom lower control flow intrinsics
Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364832 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:40:23 +00:00
Matt Arsenault
db3759c68a AMDGPU/GlobalISel: Legalize s16 add/sub/mul
If this is scalar, promote to s32. Use a new observer class to assign
the register bank of newly created registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364827 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 18:18:55 +00:00
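
The promotion described in the s16 add/sub/mul commit above amounts to widening the operands, doing the arithmetic at 32 bits, and truncating the result. A minimal sketch of that idea on host integers follows; the function name is hypothetical and zero-extension is an assumption chosen for the widening step, since the low 16 bits of the sum do not depend on how the high bits are filled.

```cpp
#include <cstdint>

// Hypothetical illustration of "promote to s32": widen, add, truncate.
std::uint16_t add_s16_via_s32(std::uint16_t a, std::uint16_t b) {
    const std::uint32_t wide_a = a;           // extend to 32 bits
    const std::uint32_t wide_b = b;
    const std::uint32_t sum = wide_a + wide_b; // 32-bit add
    return static_cast<std::uint16_t>(sum);    // truncate back to 16 bits
}
```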
Matt Arsenault
65d89fc110 AMDGPU/GlobalISel: Legalize s16 fcmp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364817 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 17:35:53 +00:00
Matt Arsenault
faac328d81 AMDGPU/GlobalISel: Make s16 select legal
This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364787 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 15:42:47 +00:00
Matt Arsenault
30321492f6 AMDGPU/GlobalISel: Convert to using Register
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364616 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-28 01:16:46 +00:00
Matt Arsenault
a3af6bb71d GlobalISel: Remove unsigned variant of SrcOp
Force using Register.

One downside is the generated register enums require explicit
conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364194 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 16:16:12 +00:00
Matt Arsenault
a2b05bc24d CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughout codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364191 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 15:50:29 +00:00
Matt Arsenault
d5a79b9727 AMDGPU: Consolidate some getGeneration checks
This is incomplete, and ideally these would all be removed, but it's
better to localize them to the subtarget first with comments about
what they're for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363902 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 23:54:58 +00:00
Matt Arsenault
3dd7804824 AMDGPU/GlobalISel: Legality for integer min/max
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361519 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 17:58:48 +00:00
Matt Arsenault
489e4500c0 AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361082 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 23:05:18 +00:00
Matt Arsenault
6d4e78cfc3 GlobalISel: Implement lower for S64->S32 [SU]ITOFP
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.

This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361081 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 23:05:13 +00:00
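
The structure described in the [SU]ITOFP commit above, a signed conversion layered on an unsigned one, can be sketched as follows. This is an illustration rather than the committed code: the names are hypothetical, and `u64_to_f32` simply uses the host conversion as a stand-in for the bit-operation sequence the message refers to.

```cpp
#include <cstdint>

// Stand-in for the unsigned s64 -> f32 conversion (assumption: the real
// lowering expands this step into bit operations).
float u64_to_f32(std::uint64_t u) { return static_cast<float>(u); }

// Signed conversion in terms of the unsigned one.
float s64_to_f32(std::int64_t x) {
    // All ones if x is negative, zero otherwise.
    const std::uint64_t sign = x < 0 ? ~std::uint64_t{0} : std::uint64_t{0};
    // Two's-complement absolute value in unsigned arithmetic, so the most
    // negative value does not overflow: (x + sign) ^ sign.
    const std::uint64_t mag = (static_cast<std::uint64_t>(x) + sign) ^ sign;
    const float r = u64_to_f32(mag);
    return sign ? -r : r;  // reapply the sign
}
```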
Matt Arsenault
628e4bfaaf AMDGPU: Fix unused variable warnings in release builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361030 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 12:59:27 +00:00
Matt Arsenault
9b7f345dfd AMDGPU/GlobalISel: Legalize G_FCEIL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361028 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 12:20:05 +00:00
Matt Arsenault
f9d63f2d76 AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361027 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 12:20:01 +00:00
Matt Arsenault
fa86511b4c AMDGPU/GlobalISel: Legalize G_FRINT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361026 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 12:19:57 +00:00
Matt Arsenault
2e85df364d AMDGPU/GlobalISel: Legalize G_FCOPYSIGN
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361025 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 12:19:52 +00:00
Matt Arsenault
4d1518d22a AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sources
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358894 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-22 15:22:46 +00:00
Amara Emerson
040f61e117 [GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.

This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.

I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.

Compile time:
Program                                        base   cse    diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test     9.04   9.12  0.8%
test-suite...Mark/mafft/pairlocalalign.test     2.68   2.66 -0.7%
test-suite...-typeset/consumer-typeset.test     5.53   5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test         5.30   5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test        25.82  25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test     6.92   6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test    34.24  34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test           6.25   6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test     1.66   1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test         13.61  13.60 -0.0%
Geomean difference                                          -0.2%

Code size:
Program                                        base     cse      diff
test-suite...-typeset/consumer-typeset.test    1315632  1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test    1313892  1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test        1439504  1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test    2936980  2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test        3478276  3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test    8082868  8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test         3870380  3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test          1434904  1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test    764528   764528   0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test    782092   782092   0.0%
Geomean difference                                              -0.9%

Differential Revision: https://reviews.llvm.org/D60580

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358369 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-15 05:04:20 +00:00
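
The effect of CSE on constants described in the commit above, where building a constant returns the vreg of an existing one, can be modeled with a small lookup table. Everything in this sketch is hypothetical (`VReg`, `ConstantBuilder`, `emitted`) and is not LLVM API; it also ignores details a real implementation has to consider, such as the constant's type and where the existing definition lives.

```cpp
#include <cstdint>
#include <unordered_map>

using VReg = unsigned;  // hypothetical virtual register id

// Hypothetical toy builder: reuses the register of an existing constant
// instead of emitting a new definition.
class ConstantBuilder {
public:
    VReg buildConstant(std::int64_t value) {
        auto it = known_.find(value);
        if (it != known_.end())
            return it->second;      // CSE hit: no new instruction
        const VReg reg = next_++;   // CSE miss: "emit" a new constant
        known_.emplace(value, reg);
        ++emitted_;
        return reg;
    }

    unsigned emitted() const { return emitted_; }

private:
    std::unordered_map<std::int64_t, VReg> known_;
    VReg next_ = 0;
    unsigned emitted_ = 0;
};
```

Requesting the same value twice yields the same register and only one emitted definition, which is the kind of saving the code size figures above reflect.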
Matt Arsenault
6c7dd5967a AMDGPU/GlobalISel: Fix non-power-of-2 select
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357762 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 14:03:04 +00:00
Matt Arsenault
2d429b1091 GlobalISel: Implement fewerElementsVector for phi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355048 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 00:16:32 +00:00
Matt Arsenault
0d2ad48b33 GlobalISel: Implement moreElementsVector for phi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355047 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 00:01:05 +00:00
Matt Arsenault
453c7ee1c9 AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354825 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-25 21:32:48 +00:00
Matt Arsenault
da91f4c26a AMDGPU/GlobalISel: Clamp max implicit_def elements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354818 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-25 20:46:06 +00:00
Matt Arsenault
f237196b1b AMDGPU/GlobalISel: Make phis legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354592 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-21 15:48:13 +00:00
Matt Arsenault
4c14549b42 AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354587 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-21 15:22:20 +00:00
Matt Arsenault
1b59f4c380 GlobalISel: Fix fewerElementsVector for ctlz with different result type
Also complete the set of related operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354480 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-20 16:42:52 +00:00
Matt Arsenault
7e1a65dad5 GlobalISel: Implement moreElementsVector for g_insert results
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354477 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-20 16:11:22 +00:00
Matt Arsenault
379689ce0c GlobalISel: Implement moreElementsVector for select
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354354 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-19 17:03:09 +00:00
Matt Arsenault
406dc2a0d5 GlobalISel: Implement moreElementsVector for G_EXTRACT source
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354348 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-19 16:44:22 +00:00
Matt Arsenault
47f8b7cd25 GlobalISel: Implement moreElementsVector for bit ops
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354345 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-19 16:30:19 +00:00
Matt Arsenault
b1b624d08a GlobalISel: Implement widenScalar for g_extract scalar results
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354293 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-18 22:39:27 +00:00
Matt Arsenault
6f55bb14e6 GlobalISel: Add alignment to LegalityQuery MMOs
This allows targets to specify the minimum alignment required for the
load/store.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354071 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-14 22:41:09 +00:00
Matt Arsenault
217f03f322 AMDGPU/GlobalISel: Fix RegBankSelect for GEP.
This is basically a pointer typed add, so shouldn't be any different.
This was assuming everything was an SGPR, which is not true.

Also clean up legality for GEP. I don't seem to be seeing the problem
that the comment on the hack marking s64 as a legal pointer type mentions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354067 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-14 22:24:28 +00:00
Matt Arsenault
bb8d9165a4 AMDGPU/GlobalISel: Only make f16 constants legal on f16 targets
We could deal with it, but there's no real point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353845 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-12 14:54:55 +00:00
Matt Arsenault
f3f4691605 GlobalISel: Implement moreElementsVector for implicit_def
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353754 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-11 22:00:39 +00:00
Matt Arsenault
c0665d4bcd GlobalISel: Add G_FCANONICALIZE instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353719 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-11 17:05:20 +00:00
Matt Arsenault
7772407272 AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2
clampScalar doesn't do anything for non-power-of-2 in range.
There should probably be a combination rule to reduce the number
of matching rules.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353526 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 15:06:24 +00:00
Matt Arsenault
b997233527 AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353522 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 14:46:27 +00:00
Matt Arsenault
4473e73049 AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353516 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 14:16:11 +00:00