Commit Graph

8017 Commits

Author SHA1 Message Date
Henrik Rydgård
cafce7365b Vulkan: Fix frame ordering issue with postprocessing shaders
Requested an init command buffer outside the frame, which is dangerous
and caused validation problems with command pool resets.

Would like to assert on insideFrame in GetInitCmd, but we use it from
some init code where it does work correctly. Might clean that up at some
point.
2022-10-21 12:52:21 +02:00
Henrik Rydgård
a830f18054
Merge pull request #16265 from unknownbrackets/lighting-nonormal
GPU: Respect world matrix and reverse flag w/o normals
2022-10-21 09:16:53 +02:00
Unknown W. Brackets
d23293ee72 GPU: Respect matrix and reverse flag w/o normals.
See frame dump in #14223, which requires world matrix be applied.
2022-10-20 23:15:25 -07:00
Unknown W. Brackets
6945604492 softgpu: Multiply prev normal by world matrix.
Even when it's not in the vertex data, we still must multiply by the world
matrix.  Fixes some lighting issues in Nayuta no Kiseki.
2022-10-20 23:14:54 -07:00
Unknown W. Brackets
d6a59be012 softgpu: Respect negate normal flag without norm.
Per tests, Z is still negated even when using the previous normal value.
2022-10-20 23:09:48 -07:00
fp64
2c814c568a
Some more codestyle cleanup 2022-10-20 06:59:35 -04:00
fp64
9a01db5f42
Change wrap_mode to clamp for bicubic upscaler
It was set to wrap, even though the comment claimed otherwise. Previous implementation had clamp, as do (I think) other upscaling modes (Hybrid, etc.).

Also make upscaler codestyle a little more consistent with the rest.
2022-10-20 06:53:49 -04:00
Henrik Rydgård
90d395a10d Remove "attachment" parameter from BindFramebufferAsTexture everywhere.
Not actually useful since our framebuffer objects don't support multiple
color images, and probably won't ever need to.
2022-10-20 10:15:19 +02:00
Henrik Rydgård
8cd602a9c6
Merge pull request #16257 from unknownbrackets/error-cleanup
Kernel: Fix reported StopThread error
2022-10-19 08:06:30 +02:00
Unknown W. Brackets
a42064eb48 Vulkan: Correct some enum switch warnings.
Nice to log debug annotations anyway.
2022-10-18 21:52:38 -07:00
Lubos
eed75889e4 OpenXR - Ensure scene analyze is called the same way as before 2022-10-18 12:23:23 +02:00
Henrik Rydgård
269eb55c17 Build/warning fix 2022-10-18 10:48:16 +02:00
Henrik Rydgård
aa51bfd1ef Use GPU "use" flags to replace IsVRBuild in the renderer. It remains elsewhere. 2022-10-17 19:57:11 +02:00
Henrik Rydgård
7c5fc3ccb5 Reorder the GPU USE flags a bit 2022-10-17 19:55:11 +02:00
Lubos
1a6180583e OpenXR - Reduce uniform calls 2022-10-17 19:00:38 +02:00
Henrik Rydgård
123311b0c7
Merge pull request #16241 from unknownbrackets/softgpu-rect
softgpu: Correct linear interp for uneven positions
2022-10-17 09:29:31 +02:00
Henrik Rydgård
30aa07b156 Two more renames to make things read better 2022-10-17 08:47:05 +02:00
Henrik Rydgård
9b8a5d1db3 Rename GPU_SUPPORTS_ to GPU_USE_ 2022-10-17 08:47:03 +02:00
Henrik Rydgård
daca0b2109 Rename gstate_c.Supports to gstate_c.Use 2022-10-17 08:46:37 +02:00
Unknown W. Brackets
162b27e136 GPU: Replace logic ops with blend for simple cases.
So that alpha/stencil are handled correctly.
2022-10-16 22:24:43 -07:00
Unknown W. Brackets
7eb7bd5141 softgpu: Correct linear interp for uneven positions.
Can't round to the pixel when calculating the S/T deltas.
This fixes issues in Wipeout (#16131) and Call of Duty bloom.
2022-10-16 18:57:55 -07:00
Unknown W. Brackets
9d6de98ed9 softgpu: Correct drawing outside TL of rectangle.
If the start coordinate was something like 51.75, we were incorrectly
drawing to 51.  This can be seen in the Metal Slug intro (#15755.)
2022-10-16 18:46:38 -07:00
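The rounding issue described in the entry above can be sketched as follows. This is only an illustration, not the softgpu code: it assumes coordinates are stored as fixed point with 16 subpixel steps per pixel, and that the intended first pixel is the first whole pixel at or after the start coordinate. The names `SUBPIXELS`, `to_fixed`, `first_pixel_truncating` and `first_pixel_rounding_up` are hypothetical.

```python
SUBPIXELS = 16  # assumption: 4 fractional bits per screen coordinate


def to_fixed(coord: float) -> int:
    """Convert a pixel coordinate to fixed point (hypothetical helper)."""
    return int(round(coord * SUBPIXELS))


def first_pixel_truncating(start_fixed: int) -> int:
    """Drop the fraction: 51.75 becomes 51, the behaviour reported as wrong."""
    return start_fixed // SUBPIXELS


def first_pixel_rounding_up(start_fixed: int) -> int:
    """Take the first whole pixel at or after the start coordinate."""
    return (start_fixed + SUBPIXELS - 1) // SUBPIXELS


start = to_fixed(51.75)
truncated = first_pixel_truncating(start)
rounded_up = first_pixel_rounding_up(start)
```

With a start of 51.75 the truncating version selects pixel 51, while the rounding-up version selects 52; a start that is already on a pixel boundary is unchanged by either.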
Unknown W. Brackets
1931fa2f5f softgpu: Make triangle fan rect detection generic. 2022-10-16 16:01:09 -07:00
Unknown W. Brackets
d33986a5ad softgpu: Cull a triangle with all negative w.
Per tests, this seems to happen but only when all points are below zero.
2022-10-15 11:49:28 -07:00
Unknown W. Brackets
4f5e821ce2 softgpu: Fix crash on screenshot w/o display. 2022-10-15 11:49:05 -07:00
Henrik Rydgård
a605366254 Add ShaderId utility function to be used for some sanity checking. 2022-10-13 22:39:47 +02:00
Henrik Rydgård
4d1da5859c Add simple way to add debug annotation in the middle of the command stream. Vulkan-only. 2022-10-13 22:39:44 +02:00
Unknown W. Brackets
2f9392083a GPU: Respect stencil state in shader blend. 2022-10-11 22:26:31 -07:00
Henrik Rydgård
e0e29a1556
Merge pull request #16197 from hrydgard/more-uniform-optimization
More uniform optimization, fixes
2022-10-12 01:00:27 +02:00
Henrik Rydgård
d30d8bf35c Removes the option, autodetect instead - only enable if the GPU doesn't support bitwise ops. 2022-10-11 16:09:57 +02:00
Henrik Rydgård
901f698a10
Merge pull request #16201 from unknownbrackets/gedebugger
GE Debugger: Normalize framebuffer texture preview
2022-10-11 11:33:19 +02:00
Henrik Rydgård
804220afb1
Merge pull request #16198 from unknownbrackets/gles-stencil
Readback stencil buffer for debugger on GLES
2022-10-11 10:55:38 +02:00
Henrik Rydgård
baa9451cb7 Warning fixes 2022-10-11 09:55:53 +02:00
Unknown W. Brackets
416265431b GE Debugger: Display if tex is framebuf.
Rather than guessing based on size, let's show explicitly.
2022-10-10 22:35:42 -07:00
Unknown W. Brackets
48c39543af GE Debugger: Normalize framebuffer texture preview.
Previously, we would show the render-to-texture with its original
dimensions.  While useful, this skewed the preview coordinates and was
sometimes confusing.  Additionally, depth texturing didn't preview.

This pads and subsets the texture preview so it's the right size.
2022-10-10 21:54:24 -07:00
Unknown W. Brackets
fc9f200224 GE Debugger: Centralize current fb tex fetch code. 2022-10-10 21:50:53 -07:00
Unknown W. Brackets
999055791d D3D9: Remove block transfer code overrides.
We can just use Draw now.  Keep depth, though, since it applies scale.
2022-10-10 21:48:38 -07:00
Unknown W. Brackets
c89cf1cde7 D3D9: Implement CopyFramebufferToMemorySync().
This works like other backends, including D3D11.  This allows us to get
rid of the old implementation and reuse more code.
2022-10-10 21:28:14 -07:00
Unknown W. Brackets
fc68cd9457 GLES: Add debug readback of stencil data.
This allows the existing gpu.buffer.renderStencil to snapshot the state.
2022-10-10 17:09:14 -07:00
Unknown W. Brackets
c03d327ddd GLES: Refactor depth pipeline create.
So we can reuse for stencil as well.
2022-10-10 16:55:30 -07:00
Unknown W. Brackets
59cc7a8000 GPU: Rename stencil write pipeline. 2022-10-10 16:54:29 -07:00
Henrik Rydgård
089ac9a50e Comment about VR uniforms 2022-10-10 18:02:19 +02:00
Henrik Rydgård
ee46f8992e Don't use fragmentShaderInt32Support as a replacement for checking for bitwiseOps 2022-10-10 18:02:19 +02:00
Henrik Rydgård
aec22491fe Don't expand alphaColorRef to 128 bytes on backends where we don't need to. 2022-10-10 18:02:01 +02:00
Henrik Rydgård
d56bdcb81e
Merge pull request #16196 from hrydgard/improved-render-stats
Improved stats in the Vulkan GPU profiler
2022-10-10 15:40:17 +02:00
Henrik Rydgård
32699da6df Vulkan (trivial): Fix numDraws stat when merging render passes. Shorten a name. 2022-10-10 10:06:30 +02:00
Unknown W. Brackets
1dc35b3ac4 GLES: Simplify, enable debug depth readback. 2022-10-10 01:04:57 -07:00
Unknown W. Brackets
f8908c691b GLES: Use Draw for depth readback shader.
Was not working before, since the program was not being used by Draw2D.
2022-10-10 00:54:29 -07:00
Unknown W. Brackets
93346d6e2c GLES: Refactor depth shader download.
This makes it similar to the Draw interface.
2022-10-10 00:54:29 -07:00
Unknown W. Brackets
c2da29392c GLES: Depth download cleanup. 2022-10-10 00:54:29 -07:00
Unknown W. Brackets
bad4a93d3c D3D11: Correct depth readback. 2022-10-09 15:50:50 -07:00
Unknown W. Brackets
55d5dc3834 GPU: Rename readback and buffer write operations.
Avoid download/upload and pack, which don't have clear directions.
2022-10-09 13:49:41 -07:00
Unknown W. Brackets
d83f736b1f D3D9: Correct depth readback. 2022-10-09 13:21:04 -07:00
Henrik Rydgård
0c4935f336 Depal from dynamic CLUT: When detecting bounds, be more conservative.
Followup to #16188.

Further fixes the lens flare.

It confused me before that there are two sections of the track on
Sunset Drive where the sun is visible, but only on the second is the
lens flare sprite actually shown, which is rather weird.

Verified that exactly the same thing happens on hardware, so it's not
an emulation problem! Rather seems like a glitch in the game itself.
2022-10-09 20:57:06 +02:00
Henrik Rydgård
9422b05ee3 Fix depal bounds with dynamic CLUT. Fixes lens flare glitches in Ridge Racer
With this wrong, we ended up drawing pixels that came from a DONT_CARE
init of the depal temp buffer, which was a pile of garbage on Android
and blank on PC.

Now, we seem to end up not drawing anything because the depal operation
results in transparent black into whatever is actually intended, but at
least the screen isn't full of glitches when the sun is visible on Adreno.

See issue #16083
2022-10-09 20:27:45 +02:00
Henrik Rydgård
28bc45451c
Merge pull request #16184 from unknownbrackets/depth-download
GPU: Hook Gods Eater Burst avatar read
2022-10-09 16:37:42 +02:00
Henrik Rydgård
49de375bff
Merge pull request #16183 from unknownbrackets/depth-usage
GPU: Ignore depth when masked and ALWAYS
2022-10-09 10:41:40 +02:00
Unknown W. Brackets
ad3220f857 GLES: Hook up depth download.
Currently, only used by one hook.
2022-10-09 01:08:04 -07:00
Unknown W. Brackets
bc84d6345b Vulkan: Disable geometry shaders for Mali <= 18.
These drivers apparently have some weird behavior.
2022-10-09 00:57:10 -07:00
Unknown W. Brackets
d0eb14ec02 GPU: Correct sizing account on block transfer. 2022-10-09 00:54:59 -07:00
Unknown W. Brackets
e7e7528fbc GPU: Consider depth buffers in block transfer.
Right now, only with an explicit flag (not yet used.)
2022-10-09 00:50:45 -07:00
Unknown W. Brackets
b2ce4d2c3f GPU: Refuse to set fb_address == z_address.
We don't do it when creating framebufs either, so don't update to
matching values.
2022-10-08 17:50:18 -07:00
Unknown W. Brackets
7d331f1928 GPU: Ignore depth when masked and ALWAYS.
Seen in NFS Pro Street, for example.  Shouldn't be interpreted as depth
usage.
2022-10-08 17:49:25 -07:00
Unknown W. Brackets
157ffed57f D3D9: Add simple rendered CLUT handling.
I think there's still a deeper half-pixel offset issue, but this fixes
Brave Story.
2022-10-08 15:36:36 -07:00
Henrik Rydgård
bf25f4b283 Shader uniforms (VK/D3D11): Fix issue where we could overwrite the fourth component padding. 2022-10-06 10:52:58 +02:00
Unknown W. Brackets
3aa863ec41 GPU: Clip against neg Z even w/o cull support.
This should fix rendering issues on Apple devices.
2022-10-06 00:34:02 -07:00
Henrik Rydgård
87d00f79da
Merge pull request #16165 from unknownbrackets/geo-shader
Vulkan: Clip clamped depth in geometry shader
2022-10-06 09:18:08 +02:00
Henrik Rydgård
3da1b46104
Merge pull request #16166 from unknownbrackets/hwtess
GPU: Verify generated shader buffer length
2022-10-06 08:24:46 +02:00
Unknown W. Brackets
aee2ad46a2 GPU: Verify generated shader buffer length.
Hardware tessellation + uberlighting + clamp was exceeding the buffer,
causing memory corruption.  Let's try to catch it, but also increase
buffers to be safe.
2022-10-05 21:41:09 -07:00
Unknown W. Brackets
bc3d3cf9fb GPU: Optimize clip distances needed.
We only need to write one clip distance to clip clamped depth, since we
don't clamp when it needs clipping on both sides.
2022-10-05 21:17:17 -07:00
Unknown W. Brackets
14bf9d1923 Vulkan: Correct clamped Z clip when clipping neg Z.
In the geometry shader, if used, we need to output the clip distance from
the clamped Z clip or it gets lost.
2022-10-05 20:48:38 -07:00
Unknown W. Brackets
8663541403 Vulkan: Avoid max_vertices=12 if unnecessary. 2022-10-05 20:11:10 -07:00
Unknown W. Brackets
3e5c09d432 Vulkan: Clip clamped depth in geometry shader.
This corrects deformed geometry on Mali devices which don't support
user-space clipping but do support depth clamp.
2022-10-05 19:41:59 -07:00
Henrik Rydgård
d6bd08cae7
Merge pull request #16162 from unknownbrackets/geo-shader
Implement negative Z clipping in geometry shader
2022-10-06 01:00:41 +02:00
Unknown W. Brackets
5d88e50201 Vulkan: Generate indices in clipping. 2022-10-04 23:04:25 -07:00
Unknown W. Brackets
f24edbe8a8 Compat: Remove DisableRangeCulling.
This hack was used because culling previously incorrectly handled Z, which
was fixed in #14833.
2022-10-04 22:19:40 -07:00
Unknown W. Brackets
8025def8d2 Vulkan: Clip to neg z in the geometry shader.
This is only used when clip distance is unsupported, such as on Mali.
2022-10-04 22:10:24 -07:00
Henrik Rydgård
362391b9d8 Fix Kurohyou again. See #9576 2022-10-04 20:56:41 +02:00
Henrik Rydgård
b333695cd1
Merge pull request #16160 from unknownbrackets/vram-mirrors
GPU: Use flags to fix triggered upload/download
2022-10-04 08:45:06 +02:00
Unknown W. Brackets
9ac4523fd2 GPU: Skip matching a framebuf for RAM. 2022-10-03 20:22:27 -07:00
Unknown W. Brackets
a1efed31b9 GPU: Use flags to fix triggered upload/download.
No longer using mirror hacks.
2022-10-03 20:17:25 -07:00
Henrik Rydgård
1469a32a9d Vertex decoder: Add fallback for non-SSE4.1
See #16157
2022-10-03 19:06:17 +02:00
Henrik Rydgård
973d0435c1 Fix another crash with non-buffered rendering 2022-10-03 19:02:16 +02:00
Henrik Rydgård
ed3cd1dc26
Merge pull request #16150 from unknownbrackets/vram-mirrors
GPU: Mask away unused bits in framebuf/zbuf ptr, cleanup
2022-10-03 11:56:24 +02:00
Herman Semenov
29b87e0c0b
Merge branch 'master' into master 2022-10-03 07:49:13 +00:00
Unknown W. Brackets
0be891c7ff softgpu: Minor opt, ignore unused z_stride. 2022-10-02 21:31:07 -07:00
Unknown W. Brackets
58a4376998 GPU: Normalize framebuf addresses.
In VRAM, always store without mirror.  In RAM, always store without
cache/kernel bits.
2022-10-02 21:28:53 -07:00
Unknown W. Brackets
73040ebb8f GE Debugger: Ignore mirrors for target in record. 2022-10-02 20:48:28 -07:00
Unknown W. Brackets
4a17ab8070 GE Debugger: Correct mask in target breakpoints. 2022-10-02 20:47:12 -07:00
Unknown W. Brackets
b9b59f7806 GPU: Mask away unused bits in framebuf/zbuf ptr.
Lower 4 bits are ignored during rendering, and mirrors (even the 8
bit at the top) are ignored.
2022-10-02 20:44:35 -07:00
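A minimal sketch of the kind of masking the entry above describes, under stated assumptions: VRAM is taken to be 2 MiB starting at 0x04000000 with mirrors in the address bits directly above it, and the constants and the `normalize_vram_ptr` helper are hypothetical, not the emulator's actual values or API.

```python
VRAM_BASE = 0x04000000    # assumption: base address of VRAM
VRAM_SIZE = 0x00200000    # assumption: 2 MiB of VRAM
ALIGN_MASK = 0xFFFFFFF0   # lower 4 bits are ignored during rendering


def normalize_vram_ptr(addr: int) -> int:
    """Fold a mirrored, unaligned VRAM pointer onto its canonical address."""
    offset = addr & (VRAM_SIZE - 1)  # discard the mirror bits above the VRAM size
    offset &= ALIGN_MASK             # discard the ignored low 4 bits
    return VRAM_BASE | offset


canonical = normalize_vram_ptr(0x0420001F)
```

Two pointers that differ only in mirror bits or in the low 4 bits then compare equal after normalization, which is what makes them usable as lookup keys.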
Unknown W. Brackets
4df7a8f357 Vulkan: Cleanup unused geometry shader vars.
Without clipping, these aren't used (but could be in the future with
manual clipping.)
2022-10-02 07:43:35 -07:00
Unknown W. Brackets
2832edcc37 Vulkan: Allow configuring geometry shaders on/off. 2022-10-02 07:42:22 -07:00
Unknown W. Brackets
8df956b036 Vulkan: Block geometry shaders on older Mali.
They're too slow to be usable.
2022-10-02 07:42:22 -07:00
Unknown W. Brackets
36eb0d9ad5 Vulkan: Use geo clip distance only where supported.
It might be supported without cull or GS.  Otherwise we may need to clip
the triangles manually.
2022-10-02 07:42:22 -07:00
Unknown W. Brackets
2ce0cda333 Vulkan: Enable geo shader for culling.
The compat setting was really for some previously buggy cases that
couldn't work without cull.
2022-10-02 07:42:22 -07:00
Unknown W. Brackets
bfaa304461 Vulkan: Correct geometry shader culling. 2022-10-02 07:42:17 -07:00
Henrik Rydgård
ac248338be Vulkan: Cull in geoshader, hack to on for now. 2022-10-02 07:42:17 -07:00
Henrik Rydgård
cdee10fe86 Vulkan: Basic geoshader code generation. 2022-10-02 07:42:17 -07:00
Unknown W. Brackets
fbdb278168 Vulkan: Update shader cache format for geo shaders. 2022-10-02 07:42:16 -07:00
Unknown W. Brackets
d16caa71af Vulkan: Add geometry shader ID tracking.
We're still not generating them, yet.  But this tracks the objects and
IDs through the pipeline.
2022-10-02 07:42:16 -07:00
Unknown W. Brackets
38e16324f0 Vulkan: Clean up shader module tag. 2022-10-02 07:42:16 -07:00
Unknown W. Brackets
878a049f60 GPU: Add dirtying for geo shader state.
Not yet used, but dirtied at the right times.
2022-10-02 07:42:16 -07:00
Henrik Rydgård
b36bfc37d5
Merge pull request #16139 from hrydgard/tighten-up-format-checks
Tighten up some color format checks with displays and copies
2022-10-02 15:39:51 +02:00
Henrik Rydgård
10b2263673
Merge pull request #16143 from unknownbrackets/edram-trans
Report, save, and frame dump the Edram translation value
2022-10-02 09:25:45 +02:00
Unknown W. Brackets
fcc877a0f3 GE Debugger: Fix memcpy/memset recording.
Uhh, oops.  I'm surprised I didn't notice these were broken for so long.
2022-10-01 23:48:23 -07:00
Unknown W. Brackets
978fd9fc60 GE Debugger: Record the Edram translation value. 2022-10-01 23:48:06 -07:00
Unknown W. Brackets
24999e792a Ge: Report and save Edram translation value.
See #16126 for some details on its usage and effects.
2022-10-01 23:18:42 -07:00
Unknown W. Brackets
80cccd7abb Build: Fix debug build on Windows 32-bit. 2022-10-01 17:07:27 -07:00
Henrik Rydgård
ab08db6fca Tighten up some color format checks with displays and copies
Now that we allow multiple color format buffers to overlap, and don't
just take one and change its format during copy for example, we could
use some additional checking.

Additionally, do a simple heuristic to reject "obviously" wrong
copies to framebuffers.

Fixes #15959, should also help #16124
2022-10-02 00:10:19 +02:00
Henrik Rydgård
151db69a32
Merge pull request #16138 from unknownbrackets/geo-shader-2
Basic groundwork for geometry shaders
2022-10-01 22:23:48 +02:00
Unknown W. Brackets
87171cef98 GPU: Add geometry path for shader writer.
Not yet used.
2022-10-01 12:45:43 -07:00
Unknown W. Brackets
59a489f883 Draw: Add COLOR1 semantic. 2022-10-01 12:14:46 -07:00
Henrik Rydgård
9ec41436d1 ES2 crash fix: Don't draw depth if lacking fragment shader depth write. 2022-10-01 19:28:52 +02:00
lainon
3cdf72b68b Better readability and optimization insertion into container by replacing 'insert' -> 'emplace', 'push_back' -> 'emplace_back' 2022-09-30 12:35:28 +03:00
lainon
c953bf7fc7 Fixed bug and memleaks 2022-09-30 12:32:49 +03:00
lainon
b304551747 Code readability, vec reserve() and remove excess c_str() 2022-09-30 12:31:32 +03:00
lainon
fec708489a Correct cleaning string and remove unused vars 2022-09-30 12:26:30 +03:00
Unknown W. Brackets
77696573f4 GE Debugger: Correct rounded coords in vertex list.
Were previously rounding to pixel, not subpixel.  Also, show out of range
values for clarity on clamping/culling.
2022-09-30 00:19:21 -07:00
Unknown W. Brackets
6468e0f03e softjit: Fix dst blend shift.
Example: src * dst.a + dst * one, still requires a shift back.
2022-09-29 22:31:50 -07:00
Unknown W. Brackets
dc90a5a851 softgpu: Avoid projecting textures in common case.
Several games appear to intentionally set the matrix flat.
2022-09-29 22:31:49 -07:00
Unknown W. Brackets
7cf05d0a46 GPU: Fix missed dirtying when fast loading tgen. 2022-09-29 22:31:07 -07:00
Unknown W. Brackets
904fb38003 GPU: Restore matrices with dirtying.
Without this, it's possible we might not notice or apply a change
whether in uniforms or etc.
2022-09-29 22:31:02 -07:00
Henrik Rydgård
bd759790b0 Update the Vulkan debug names when reassigning depth buffers. 2022-09-28 14:09:40 +02:00
Henrik Rydgård
de51d067f2 If a framebuffer starts using a different depth buffer than before, re-point.
Fixes depth artifacts in Silent Hill: Origins. See issue #16126
2022-09-28 13:41:41 +02:00
Henrik Rydgård
30c7b45ac8
Merge pull request #16123 from unknownbrackets/gpu-matrix
softgpu: Correct matrix value update wrapping
2022-09-28 09:39:27 +02:00
Unknown W. Brackets
6b20c0318d softgpu: Correct matrix value update wrapping.
The values read back when saving a context or getting matrix data are set
differently than the actual values used for rendering.

This implements the wrapping and bleeding between matrices within softgpu,
but leaves hardware rendering to only use the rendering registers for
speed.
2022-09-27 22:29:55 -07:00
Unknown W. Brackets
95d2083f04 Ge: Move matrix reading into GPU.
Let's keep managing its state / registers internal.
2022-09-27 22:23:02 -07:00
Unknown W. Brackets
38818f9f6e GLES: Fix colortest/logicop uint/int conversion.
Shown well in #16119.
2022-09-27 19:24:54 -07:00
Henrik Rydgård
ca5c69d3dd Vulkan: Better debug names for RENDER passes. 2022-09-27 23:41:09 +02:00
Henrik Rydgård
e538f5a441 Better bit scrambling when computing draw call IDs for vertex cache.
Fixes #13324
2022-09-27 10:09:52 +02:00
Unknown W. Brackets
23af9be9f4 softgpu: Handle rectangle texture projection. 2022-09-26 18:44:39 -07:00
Unknown W. Brackets
faa6c2d461 softgpu: Implement triangle texture projection. 2022-09-26 18:12:20 -07:00
Unknown W. Brackets
6282f8b05f softgpu: Expand texture coords to include q.
We'll need this to correctly project.
2022-09-26 17:13:14 -07:00
Unknown W. Brackets
8376176b2f softgpu: Split clippos out of rasterization vert.
We don't use it, except w, at all in rasterization, so no need to keep it
in the bin queue.
2022-09-26 16:50:40 -07:00
Unknown W. Brackets
97ae4ae712 GPU: Correct flat normal projection mapping. 2022-09-26 15:11:11 -07:00
Unknown W. Brackets
34a8056017 GPU: Correct normalized zero normal proj map.
Unlike lighting, this does not use 0, 0, 1.
2022-09-26 15:11:11 -07:00
Unknown W. Brackets
b3c0f177e2 softgpu: Save last tc/normal in vertex reading.
Matches PSP behavior, reusing last set values.
2022-09-26 15:11:11 -07:00
Unknown W. Brackets
59f11df98b
Merge pull request #16116 from hrydgard/color-test-fix
Fix color test.
2022-09-26 14:18:12 -07:00
Henrik Rydgård
9b46adb985 Fix color test.
Fixes the new color test bug reported in #13324, though doesn't fix that
issue (didn't confirm it still is one).
2022-09-26 22:51:46 +02:00
Henrik Rydgård
1c0d66aef7 Add compatibility flag for loading pixels on framebuffer create using nearest filtering
Solves the last problem with the speedometers - so we can finally say: Fixes #8509

Render-to-CLUT for speedometers renders on top of an image that just comes from the
underlying memory, so it's been drawn to the framebuffer with DrawPixels. That adds
filtering so at higher resolutions, there's some blurring of the CLUT, causing
artifacts.  We can solve this two ways: either we force on lower-resolution-for-effects
for Ridge Racer games, or we use nearest filtering when doing DrawPixels of the
memory under a framebuffer. For best result, we do the latter.

(The speedometers look even better with nearest filtering, but that's a more
general issue of UI looking better that way).
2022-09-26 20:47:55 +02:00
Unknown W. Brackets
4329aaa31c GPU: Apply color test mask as a uint.
This is simpler and allows us to unify paths better.
2022-09-26 06:57:41 -07:00
Unknown W. Brackets
a19a057e8c GPU: Consistently use uvec3 for colortest. 2022-09-26 06:57:41 -07:00
Henrik Rydgård
d9f74d2fb7 ivec->uvec, comment fix 2022-09-26 13:05:25 +02:00
Henrik Rydgård
fc30b04430 ShaderUniforms: cleanup, put every "4-float" on a line for clarity 2022-09-26 13:05:25 +02:00
Henrik Rydgård
cfa427c37a Shuffle constants around, squeezing them into gaps. Saves another 16 bytes. 2022-09-26 13:05:24 +02:00
Henrik Rydgård
f4b71e2dc7 Fragment shader uniforms: Pack color mask in 32 bits instead of expand to 128 bits.
Allows us to save 16 bytes from the main uniform buffer, since there's
free 32-bit spaces here and there to use.
2022-09-26 13:04:56 +02:00
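The packing idea in the entry above can be illustrated like this. It is a sketch only: the channel order (red in the low byte) and the helper names `pack_color_mask` and `unpack_color_mask` are assumptions, and the real uniform layout is not shown in this log.

```python
import struct


def pack_color_mask(r: int, g: int, b: int, a: int) -> int:
    """Pack four 8-bit channel masks into one 32-bit value (assumed order: R low, A high)."""
    return (r & 0xFF) | ((g & 0xFF) << 8) | ((b & 0xFF) << 16) | ((a & 0xFF) << 24)


def unpack_color_mask(packed: int) -> tuple:
    """Recover the four 8-bit channel masks, as a shader would with shifts and ANDs."""
    return (packed & 0xFF, (packed >> 8) & 0xFF, (packed >> 16) & 0xFF, (packed >> 24) & 0xFF)


expanded_size = struct.calcsize("<4I")  # one 32-bit slot per channel: 16 bytes
packed_size = struct.calcsize("<I")     # a single 32-bit slot: 4 bytes
packed = pack_color_mask(0xFF, 0x00, 0x0F, 0x80)
```

The packed form needs a single 32-bit slot, so it can sit in an existing gap of the uniform buffer instead of occupying a 16-byte slot of its own.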
Henrik Rydgård
07ca9e4656 Fold the "materialUpdate" flag into the light ubershader part.
This reduces the number of vertex shaders and thus pipelines by quite a
bit more in a few games, like Tekken and GoW, continuing the fight
against shader compile stutter.

The perf impact should be minimal if not positive due to less pipeline
changes.

GLES fixes

Make the vertex input declarations match (always declare fog input).  Fixes D3D11 validation

Tess fix
2022-09-26 12:06:16 +02:00
Henrik Rydgård
ad1021ea4b Add some recent flags to FragmentShaderDesc 2022-09-26 12:06:16 +02:00
Henrik Rydgård
76f03d30bf Remove suspicious dirty flag 2022-09-26 11:21:40 +02:00
Henrik Rydgård
196f8e3461 Prepare for dynamic mat update 2022-09-26 11:21:40 +02:00
Henrik Rydgård
94e439280e
Merge pull request #16111 from hrydgard/always-compute-fog-in-vs
Always do the vertex shader part of the fog computation.
2022-09-26 11:20:06 +02:00
Henrik Rydgård
9d1355e137 Always do the vertex shader part of the fog computation.
In #16104, we drastically reduced the number of shader variants for
games that use flexible lighting setups. I looked at a few games and it
seems that a lot of games have the same shaders with fog on/off, while
fog is super cheap to compute. So let's just always do it, reducing
vertex shader variants further (though the amount of pipelines will probably
remain the same, since we still specialize the fragment shader).

Might also be worth adding a dynamic bool for the fragment shader, but
if so, doing it separately.
2022-09-26 09:30:54 +02:00
Unknown W. Brackets
c80f325912 GPU: Fix SSE4 Vec3f normalize.
Was sometimes adding in garbage data, which could create NANs.
2022-09-26 00:24:12 -07:00
Henrik Rydgård
f30252f8d5 Oops! Also, testfix 2022-09-25 23:35:08 +02:00
Henrik Rydgård
96f054f098 Fix light ubershader for D3D11 and OpenGL, GLES unsigned/signed stuff 2022-09-25 23:35:08 +02:00
Henrik Rydgård
7adba20fac Experiment: Generate "Ubershaders" that can handle all lighting configurations
This drastically reduces the shader compile stutter that happens when a lot of new
light setups are created, like on the first punch in Tekken 6.

There's more stuff that might benefit from being made dynamic like this.
These branches are very cheap on modern GPUs since they're branching on
a uniform variable, so no divergence.

Only tested on Vulkan. I think we'll need to keep the old path too for
gpus like Mali-450...
2022-09-25 23:35:01 +02:00
Henrik Rydgård
b1afeeaf43
Merge pull request #16100 from unknownbrackets/d3d9-debugger
D3D9: Allow INTZ depth buffers more correctly
2022-09-25 17:37:56 +02:00
Unknown W. Brackets
e6db0bef2d
Merge pull request #16099 from hrydgard/vulkan-dont-always-alloc-depth
Vulkan: Avoid allocating depth images for stuff like temp copies, depal buffers etc.
2022-09-25 08:05:50 -07:00
Henrik Rydgård
a26a353c25
Merge pull request #16102 from unknownbrackets/softgpu-bin-tweaks
softgpu: Avoid waiting for a thread to drain
2022-09-25 10:01:44 +02:00
Henrik Rydgård
70c5ca62e6 Remove debug log. Add some new debug log though, unrelated to this PR, for fb clut + fb texture. Plus a couple asserts. 2022-09-25 09:56:39 +02:00
Henrik Rydgård
8a4147c042
Merge pull request #16101 from unknownbrackets/softgpu-fixes
softgpu: Avoid fast path in another wrong case
2022-09-25 09:52:29 +02:00
Unknown W. Brackets
24560eef5c softgpu: Avoid waiting for a thread to drain.
If we can, we want to keep the thread queues healthy, but not full.
Reduce the amount we push on a typical drain to avoid the Wait().
2022-09-24 20:01:00 -07:00
Unknown W. Brackets
1aa6841759 softgpu: Increase queued prims.
We made them smaller, so we can queue more of them in the same space.
Helps a little bit.
2022-09-24 20:01:00 -07:00
Unknown W. Brackets
444781c7b0 softgpu: Fix triangle strip with partial rects.
Seen in Wild Arms XF shop menu.
2022-09-24 18:55:45 -07:00
Unknown W. Brackets
c47d7eab38 softgpu: Simplify 5551 blending fast path.
Since it only supports multiply and add, let's just stick with that.
2022-09-24 18:55:45 -07:00
Unknown W. Brackets
1eeb4f0bcf softgpu: Refactor out 5551 fast path checks.
They were duplicated, and better to organize them according to state.
2022-09-24 18:55:45 -07:00
Unknown W. Brackets
f30b1d048d softgpu: Avoid fast path in another wrong case.
Seen in Kurohyo.  Missed that the alpha blend check essentially means only
standard blending can work.
2022-09-24 17:53:09 -07:00
Unknown W. Brackets
81e8336985 D3D9: Allow INTZ depth buffers more correctly.
The FBO check was wrong and just always failed.
2022-09-24 15:17:18 -07:00
Henrik Rydgård
08d2cb4486 Bump the shader cache version 2022-09-24 22:40:42 +02:00
Henrik Rydgård
9f3dfe7ebe Vulkan: Don't compile pipeline variants that don't make sense given their flags.
Ran into this with cache files from previous version of my change.

Also bumping the shader cache ID again to avoid this in other ways, but
good to be robust here.
2022-09-24 22:39:22 +02:00
Henrik Rydgård
c3b4caa30b
Merge pull request #15984 from lvonasek/compat_openxr_gta
OpenXR - Sky fix for GTA games
2022-09-24 17:16:28 +02:00
Unknown W. Brackets
c76d31dfa8 GPU: Cleanup unused CheckAlpha() funcs. 2022-09-24 02:00:03 -07:00
Unknown W. Brackets
6e6535c263 softjit: Skip reading dst pixel where blended out.
Sometimes used by blends used purely to multiply the source color by
something, usually prep for bloom.
2022-09-24 02:00:03 -07:00
Unknown W. Brackets
a4c3718431 softgpu: Optimize rectangle sampling/blending.
Sometimes the vertex color or alpha can allow us to optimize away some
multiplication.
2022-09-24 02:00:03 -07:00
Unknown W. Brackets
794a5c07ad softgpu: Ignore a needless color test case.
This happens in Ridge Racer, and we can entirely skip the color test.
2022-09-24 02:00:03 -07:00
Unknown W. Brackets
7ff5434968 GE Debugger: Tag frame dump replay VRAM writes.
Just for debugging, it's helpful especially paired with softgpu tagging.
2022-09-23 21:20:14 -07:00
Unknown W. Brackets
c3c5450b8f GE Debugger: Fix small tex/clut recopying.
If it's less than 256 bytes, we can't mark the entire VRAM area copied.
This still helps frame dumps avoid excessively slow VRAM recopying
situations, but fixes issues like missing trees in #12738.
2022-09-23 21:18:39 -07:00
Unknown W. Brackets
b56bd0d0fc
Merge pull request #16090 from hrydgard/more-vulkan-cleanup-work
Simplify synchronization in VulkanRenderManager
2022-09-23 17:24:34 -07:00
Henrik Rydgård
d743bfac93
Merge pull request #16085 from unknownbrackets/softgpu-vert
softgpu: Cache reused indexed verts
2022-09-24 00:00:26 +02:00
Henrik Rydgård
1259283c2e More tweaks, fix crash on exit (double-join thread) 2022-09-23 22:10:29 +02:00
Lubos
adffbb2ea7 Merge branch 'master' into compat_openxr_gta 2022-09-23 14:16:58 +02:00
Henrik Rydgård
7884e4ccb3 Another uninitialized variable (VAI minihash/hash) 2022-09-23 12:33:16 +02:00
Henrik Rydgård
ac7ca963db Make valgrind happy 2022-09-23 12:24:43 +02:00
Henrik Rydgård
bb6919ebcb
Merge pull request #16087 from unknownbrackets/depth-upload
GPU: Upload depth only on first usage
2022-09-23 09:07:33 +02:00
Unknown W. Brackets
93c909a88e GPU: Upload depth only on first usage.
Fixes various glitches in Kingdom Hearts, etc.
2022-09-23 00:04:14 -07:00
Unknown W. Brackets
66b6dfd0a5 softgpu: Fix self-render detect in Ridge Racer.
When we flush we mark all pending writes zero, but we rely on this being
set to detect self-render.

TRANSFORM_ALL was wrong as well, sometimes clearing BINNER_RANGE.
2022-09-22 20:36:15 -07:00
Unknown W. Brackets
88b3b26ed3 softgpu: Cache reused indexed verts.
This happens a lot for spline/bezier, so can significantly speed up curve
heavy scenes.  Isn't necessarily that common otherwise, though.
2022-09-22 18:27:59 -07:00
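The caching described in the entry above can be sketched as a simple memoization by index. This is an illustrative version, not the softgpu implementation; `transform_indexed` and its parameters are hypothetical names.

```python
def transform_indexed(vertices, indices, transform):
    """Transform each referenced vertex once, reusing the result for repeated indices."""
    cache = {}
    out = []
    for idx in indices:
        if idx not in cache:
            cache[idx] = transform(vertices[idx])
        out.append(cache[idx])
    return out


calls = []


def counting_transform(v):
    """Stand-in for an expensive vertex transform; records each call."""
    calls.append(v)
    return (v[0] * 2.0, v[1] * 2.0)


verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Two triangles sharing an edge, as a tessellated patch would produce.
result = transform_indexed(verts, [0, 1, 2, 2, 1, 3], counting_transform)
```

Six indices are emitted but only four transforms run, since vertices 1 and 2 are shared. The saving grows with how often indices repeat, which fits the note that curve-heavy scenes benefit most.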
Unknown W. Brackets
067fac6817 softgpu: Skip matrix multiply for fog factor calc.
We can just use a dot product instead, and always skip viewpos.
2022-09-22 18:19:53 -07:00
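One way to read the entry above, assuming the fog factor depends only on view-space depth: instead of transforming the whole position and keeping one component, take the dot product with the matching matrix row. The row-major 3x4 layout and the helper names below are assumptions made for the sketch.

```python
def transform_point(m, p):
    """Full affine transform: m is 3 rows of [x, y, z, translation]."""
    return [row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3] for row in m]


def view_z_only(m, p):
    """Compute only the z component, using just the third row of the matrix."""
    row = m[2]
    return row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3]


world_view = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, -1.0],
    [0.5, 0.0, 1.0, -10.0],
]
pos = [4.0, 3.0, 2.0]

full = transform_point(world_view, pos)
z = view_z_only(world_view, pos)
```

Both paths give the same depth, but the second does a third of the multiplies and never materializes the full view-space position.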
Unknown W. Brackets
84a3f6de71 softgpu: Remove unnecessary state param.
Oops, meant to remove this when refactoring imm prims.
2022-09-22 18:18:49 -07:00
Henrik Rydgård
a6d6e0a3cc Texture/Framebuffer match: Ignore stride if texHeight == 1. Fixes Ridge Racer lens flares. 2022-09-22 22:11:16 +02:00
Henrik Rydgård
078fa9beb2 Fix corruption of Ridge Racer speedometers with AutoMaxQuality enabled.
See #8509
2022-09-22 15:27:17 +02:00
Henrik Rydgård
c3cbb68452
Merge pull request #16072 from hrydgard/depth-free-renderpass
Vulkan: Don't have renderpasses store/load depth buffers when we don't use them
2022-09-22 11:05:25 +02:00
Henrik Rydgård
c108db0e71
Merge pull request #16081 from hrydgard/zbuffer-upload-heuristic
Fix green flashes with Burnout Dominator lens flare
2022-09-22 11:02:27 +02:00
Henrik Rydgård
a31c5c8239 Cleanup logic 2022-09-22 10:48:45 +02:00
Henrik Rydgård
8e30a7ccfc Vulkan: Don't have renderpasses store/load depth buffers when we don't use them 2022-09-22 10:06:05 +02:00
Henrik Rydgård
e9bcefb052
Merge pull request #16080 from unknownbrackets/softgpu-spline
softgpu: Avoid unnecessary flushing for curves
2022-09-22 10:05:23 +02:00
Henrik Rydgård
bd196f7a50 Preserve depth buffer on framebuffer resize, if has been used. 2022-09-22 09:59:49 +02:00
Henrik Rydgård
188ab67d6a More lenient heuristic for uploading depth buffers. Still behind compat flag. See #11100 2022-09-22 09:29:33 +02:00
Henrik Rydgård
287e025978 Minor cleanups around dirtying of render state 2022-09-22 09:12:58 +02:00
Unknown W. Brackets
fc39f042ae softgpu: Avoid unnecessary flushing for curves.
We don't need to flush all drawing between curves in softgpu, let them
queue up.
2022-09-22 00:08:38 -07:00