RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

Fork 0

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-20 12:26:02 +00:00

Commit Graph

Author	SHA1	Message	Date
Carl Ritson	dfa4156319	[AMDGPU] Fix DPP operand order in atomic optimizer Summary: Ensure order of operands in DPP atomic optimizer final WWM step is appropriate for sub instructions. Change-Id: I631d050e1c00a3b4bc7c11a90437064403c4cf30 Reviewers: sheredom, tpr Reviewed By: sheredom Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58900 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355394 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-05 12:21:44 +00:00
Neil Henning	4fee3cff4d	[AMDGPU] Fix DPP sequence in atomic optimizer. This commit fixes the DPP sequence in the atomic optimizer (which was previously missing the row_shr:3 step), and works around a read_register exec bug by using a ballot instead. Differential Revision: https://reviews.llvm.org/D57737 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353703 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-11 14:44:14 +00:00
Neil Henning	b461f4de29	[AMDGPU] Add an AMDGPU specific atomic optimizer. This commit adds a new IR level pass to the AMDGPU backend to perform atomic optimizations. It works by: - Running through a function and finding atomicrmw add/sub or uses of the atomic buffer intrinsics for add/sub. - If all arguments except the value to be added/subtracted are uniform, record the value to be optimized. - Run through the atomic operations we can optimize and, depending on whether the value is uniform/divergent use wavefront wide operations (DPP in the divergent case) to calculate the total amount to be atomically added/subtracted. - Then let only a single lane of each wavefront perform the atomic operation, reducing the total number of atomic operations in flight. - Lastly we recombine the result from the single lane to each lane of the wavefront, and calculate our individual lanes offset into the final result. Differential Revision: https://reviews.llvm.org/D51969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343973 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-08 15:49:19 +00:00

Author

SHA1

Message

Date

Carl Ritson

dfa4156319

[AMDGPU] Fix DPP operand order in atomic optimizer

Summary:
Ensure order of operands in DPP atomic optimizer final WWM step is appropriate for sub instructions.

Change-Id: I631d050e1c00a3b4bc7c11a90437064403c4cf30

Reviewers: sheredom, tpr

Reviewed By: sheredom

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355394 91177308-0d34-0410-b5e6-96231b3b80d8

2019-03-05 12:21:44 +00:00

Neil Henning

4fee3cff4d

[AMDGPU] Fix DPP sequence in atomic optimizer.

This commit fixes the DPP sequence in the atomic optimizer (which was
previously missing the row_shr:3 step), and works around a read_register
exec bug by using a ballot instead.

Differential Revision: https://reviews.llvm.org/D57737

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353703 91177308-0d34-0410-b5e6-96231b3b80d8

2019-02-11 14:44:14 +00:00

Neil Henning

b461f4de29

[AMDGPU] Add an AMDGPU specific atomic optimizer.

This commit adds a new IR level pass to the AMDGPU backend to perform
atomic optimizations. It works by:

- Running through a function and finding atomicrmw add/sub or uses of
  the atomic buffer intrinsics for add/sub.
- If all arguments except the value to be added/subtracted are uniform,
  record the value to be optimized.
- Run through the atomic operations we can optimize and, depending on
  whether the value is uniform/divergent use wavefront wide operations
  (DPP in the divergent case) to calculate the total amount to be
  atomically added/subtracted.
- Then let only a single lane of each wavefront perform the atomic
  operation, reducing the total number of atomic operations in flight.
- Lastly we recombine the result from the single lane to each lane of
  the wavefront, and calculate our individual lanes offset into the
  final result.

Differential Revision: https://reviews.llvm.org/D51969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343973 91177308-0d34-0410-b5e6-96231b3b80d8

2018-10-08 15:49:19 +00:00

3 Commits