Commit Graph

9145 Commits

Author SHA1 Message Date
Henrik Rydgård
4e2a1bf81c NEON: vcvtq can scale directly, no need for a mul by const. 2023-12-09 16:48:59 +01:00
Henrik Rydgård
99548be8a3 NEON culling: Use mla operations to shave off some more cycles. ARM32 compat. 2023-12-09 16:36:01 +01:00
Henrik Rydgård
6a7ef83f4b NEON-optimize the culling 2023-12-09 15:55:51 +01:00
Henrik Rydgård
5b44e25150 SSE-optimize the frustum culling 2023-12-09 15:55:51 +01:00
Henrik Rydgård
62c936babf Flip the cull plane data around to avoid transforming each vertex multiple times. 2023-12-09 15:55:51 +01:00
Henrik Rydgård
a043962447 World space planes 2023-12-09 15:55:51 +01:00
Henrik Rydgård
dbf796bb66 Fastcull: SSE/NEON-optimize 16-bit position conversion 2023-12-09 15:55:51 +01:00
Henrik Rydgård
89d8ef87ec Use a less accurate but faster frustum cull for the general draws. 2023-12-09 15:55:51 +01:00
Henrik Rydgård
0905b6a5ad Frustum-cull small draws
Some games do a poor job of culling stuff, and some transparent
sprites can be very expensive if they cause a copy.
Skipping them if outside the viewport makes sense in that case.

One example is the flame sprites in #17797.

Additionally, we should be able to cull through-mode draws easily; this
one doesn't even try.
2023-12-09 15:55:51 +01:00
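The culling idea in the commit above can be illustrated with a minimal sketch. This is not the project's implementation: the names `Vec3`, `Plane`, `CanCullDraw` and `UnitCubePlanes` are hypothetical, and the plane convention (a point is inside when the signed distance is non-negative) is an assumption. The test is conservative: a draw is skipped only when every vertex lies outside one common plane.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Assumed convention: a*x + b*y + c*z + d >= 0 means "inside" this plane.
struct Plane { float a, b, c, d; };

inline float SignedDistance(const Plane &p, const Vec3 &v) {
    return p.a * v.x + p.b * v.y + p.c * v.z + p.d;
}

// Returns true only if all vertices are outside the same plane.
// Draws that straddle planes are kept, so visible geometry is never culled.
bool CanCullDraw(const std::vector<Vec3> &verts, const std::array<Plane, 6> &planes) {
    assert(!verts.empty());
    for (const Plane &plane : planes) {
        bool allOutside = true;
        for (const Vec3 &v : verts) {
            if (SignedDistance(plane, v) >= 0.0f) {
                allOutside = false;
                break;
            }
        }
        if (allOutside)
            return true;
    }
    return false;
}

// Hypothetical test volume: the cube [-1, 1]^3 expressed as six planes.
std::array<Plane, 6> UnitCubePlanes() {
    return {{
        { 1.0f,  0.0f,  0.0f, 1.0f},  // x >= -1
        {-1.0f,  0.0f,  0.0f, 1.0f},  // x <=  1
        { 0.0f,  1.0f,  0.0f, 1.0f},  // y >= -1
        { 0.0f, -1.0f,  0.0f, 1.0f},  // y <=  1
        { 0.0f,  0.0f,  1.0f, 1.0f},  // z >= -1
        { 0.0f,  0.0f, -1.0f, 1.0f},  // z <=  1
    }};
}
```

Checking each plane against all vertices, rather than each vertex against all planes, keeps the test cheap for small draws and lets it exit as soon as one fully separating plane is found.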
oltolm
6e609342d6 fix ASAN error in Vec2<float>::Length() 2023-12-08 19:50:47 +01:00
Henrik Rydgård
8dab823936
Merge pull request #18484 from hrydgard/mlb-fix
Fix frozen pitch meters in MLB series games - we were not hashing enough texture data
2023-12-07 12:12:35 +01:00
Henrik Rydgård
877324c978 Add comment about swizzling to the texture replacer hash 2023-12-07 11:01:51 +01:00
Henrik Rydgård
443a882041 Fix the size calculation when hashing small swizzled textures 2023-12-07 10:45:31 +01:00
Henrik Rydgård
36a2174ac0 Vulkan: Add indicator of swizzle mode to texture debug names 2023-12-07 10:35:04 +01:00
Henrik Rydgård
7bf8023dce Don't do the texture size check for the built-in font texture.
Fixes #18483
Fixes #18479
2023-12-07 09:01:24 +01:00
Henrik Rydgård
b90b6268ec
Merge pull request #18478 from hrydgard/block-transfer-to-depth
Handle block transfers from RAM to depth buffers.
2023-12-06 10:12:31 +01:00
Henrik Rydgård
26a51191b3 Cleaner solution to previous commit 2023-12-06 09:54:03 +01:00
Henrik Rydgård
8588b11a6a Rename MayIntersectFramebuffer to MayInteresectFramebufferColor 2023-12-06 09:42:44 +01:00
Henrik Rydgård
88f2657bb1 Allow block transfers from RAM to depth buffers.
Reuses the existing compat flag BlockTransferDepth.

I do aim to remove that compat flag in the future. It's probably not
even necessary here; it's just that general depth block transfers were
already gated on it.

Fixes #17878
2023-12-06 00:29:09 +01:00
Henrik Rydgård
e2480b9fa0 D3D9: Apply a half-pixel offset in 2D shader draws. Makes Tantalus games less broken (but still broken) 2023-12-05 14:09:14 +01:00
Henrik Rydgård
64a810f9bf VertexDecoder: Minor optimization for CPUs not supporting SSE4. 2023-12-05 01:23:09 +01:00
Henrik Rydgård
6b8ec972fb A couple of warning fixes (real issues though small) 2023-12-05 01:12:42 +01:00
Henrik Rydgård
5373b8c5b3 Fix double-free problem in "low-memory" texture fallback (Vulkan) 2023-12-04 19:47:20 +01:00
Henrik Rydgård
fb8ad0c33a Very minor cleanup in display list processing 2023-12-04 18:56:06 +01:00
Henrik Rydgård
fd73522efc Changing UV scale/offset requires us to stop "extending" prims
This is because we currently can't change these scales mid-decode, so we
need to break up the collection there. Note that this still won't cause
a full flush, just that the new extra-efficient tristrip merging can't go
through these commands.
2023-12-03 19:21:55 +01:00
Henrik Rydgård
7f67a10543 Texture replacement: Prioritize ini file lines over files in the "root".
This reverts back to the old behavior, as mentioned in #18465
2023-12-03 00:18:39 +01:00
Henrik Rydgård
d584162e06
Merge pull request #18462 from hrydgard/framebuffer-listing-overlay
Framebuffer listing overlay
2023-12-02 18:51:33 +01:00
Henrik Rydgård
6d977b4a12 Remove unnecessary struct FramebufferInfo 2023-12-02 13:56:18 +01:00
Henrik Rydgård
4ef54169af Add a compat.ini setting to allow delayed GPU readbacks, for experimentation. 2023-12-02 11:34:59 +01:00
Henrik Rydgård
b636356f36 copy: Reverse the order of the y and seq heuristics 2023-12-01 20:40:12 +01:00
Henrik Rydgård
cef17589d2 Move the oversize copy detection to a better location (fewer false positives) 2023-12-01 00:30:06 +01:00
Henrik Rydgård
d9365a6df1 FramebufferCopy: New framebuffer candidate sorting, similar to block transfer detection.
The previous attempt was simply flawed.
2023-12-01 00:10:16 +01:00
Henrik Rydgård
7920e86098 Add heuristic, fixing video flicker in Naruto UNH 2 caused by copy to wrong target. 2023-11-30 22:19:52 +01:00
Henrik Rydgård
737ec3e90b NEON buildfix 2023-11-28 18:40:10 +01:00
Henrik Rydgård
4ec2d76bc9 NEON-optimize matrix transposes 2023-11-27 23:57:26 +01:00
Henrik Rydgård
45aae7b9da ARM32: Backport a lot of previously 64-bit-only NEON optimizations to ARM32. 2023-11-27 23:51:10 +01:00
Henrik Rydgård
dae758e5f4 Fix some bugs and mistakes found by Nemoumbra through static analysis 2023-11-26 13:43:11 +01:00
Henrik Rydgård
aec0606ba4 Optimize the bounding box code for more vertex formats 2023-11-26 13:40:37 +01:00
Henrik Rydgård
cb9c6dc661
Merge pull request #18418 from hrydgard/simplify-input-layout
thin3d/backends: Remove code that pretended that we supported multiple vertex streams
2023-11-13 12:51:09 +01:00
Henrik Rydgård
d891aaf9cd Remove code that pretended that we supported multiple vertex streams
Don't really see that we'll have much use for this feature, so simplify
it away. Only single vertex stream data is now supported by the thin3d
API.
2023-11-13 01:15:28 +01:00
Henrik Rydgård
77825484a0 If available, use 16-bit texture formats for MakePixelTexture when appropriate.
Optimization for God of War on low-end platforms. Avoids calling a color
conversion function that's currently only SIMD-optimized on x86, so will
also benefit ARM a little bit.
2023-11-12 15:58:03 +01:00
Henrik Rydgård
49f5da370a Simplify the logic in MakePixelTexture a bit 2023-11-12 11:19:45 +01:00
Henrik Rydgård
cc6f9a73ca Oops, fix for previous commit. And minor optimization. 2023-11-12 01:32:02 +01:00
Henrik Rydgård
632fa1c9d6 Cache and hash data for DrawPixels.
We already had a cache to reuse texture objects so just
opportunistically reuse them when easy to do so.
2023-11-11 19:58:12 +01:00
Henrik Rydgård
dd032dc533 Delete two unused structs 2023-11-11 10:55:54 +01:00
Henrik Rydgård
4f2f1c4392 Tilt: Fix some edge cases leading to division by zero and similar. 2023-11-09 19:14:31 +01:00
Henrik Rydgård
48a1348352 Move a var for clarity 2023-11-01 21:30:04 -06:00
Henrik Rydgård
ee6ffac28e Ignore triangle strips with less than 3 vertices.
Should fix the new issue reported in #18273
2023-11-01 21:28:37 -06:00
Henrik Rydgård
618589abd8
Merge pull request #18362 from unknownbrackets/softgpu-zmirror
softgpu: Point depthbuf at the first VRAM mirror
2023-10-15 22:53:00 +02:00
Unknown W. Brackets
f7f05072fe softgpu: Point depthbuf at the first VRAM mirror. 2023-10-15 10:33:05 -07:00
Henrik Rydgård
f842f16fbe Inline "DecodeVertexToPushPool" for ease of change. 2023-10-12 11:58:49 +02:00
Henrik Rydgård
12a98baf59 Cleanups, make the various SubmitPrim implementations more similar 2023-10-12 11:58:48 +02:00
Henrik Rydgård
f769b2c8a3 Remove unused functionality from descpool 2023-10-11 12:29:57 +02:00
Henrik Rydgård
0ad2827e14 Vulkan: Fix synchronization when shutting the GPU down in-game. 2023-10-11 12:27:39 +02:00
Henrik Rydgård
183d49329a Allow writing directly into the packed descriptor buffer, saving a memcpy. 2023-10-11 11:02:17 +02:00
Henrik Rydgård
2ac14f555d Remove VulkanPushBuffer (keeping our newer replacement VulkanPushPool) 2023-10-11 09:06:24 +02:00
Henrik Rydgård
e4ea4831e9 Delete the vertex cache option from the code. 2023-10-10 15:43:43 +02:00
Henrik Rydgård
078018a943 Move the clockwise calculation out of DrawEngineCommon 2023-10-10 13:16:34 +02:00
Henrik Rydgård
82606b6eb2 Move the clockwise calculation out of the AddPrim loop 2023-10-10 13:00:57 +02:00
Henrik Rydgård
3d949b080d Prepare VulkanDescSetPool for block allocation 2023-10-10 09:14:10 +02:00
Henrik Rydgård
9c1c09ff5c Remove commented out code 2023-10-10 09:02:35 +02:00
Henrik Rydgård
ba4d1668ce Don't forget to update descCount in tess mode 2023-10-10 09:02:35 +02:00
Henrik Rydgård
af47ad035d Also use the new descriptor mechanism for in-game 2023-10-10 09:00:29 +02:00
Henrik Rydgård
35fcec1e4b Another small fix, helps Toca series games. 2023-10-10 02:13:25 +02:00
Henrik Rydgård
24409f6f94 Additional check fix 2023-10-09 21:15:17 +02:00
Henrik Rydgård
f6ba4ee4de Only support extending triangle-based draw calls. Fixes Crazy Taxi. 2023-10-09 21:14:00 +02:00
Henrik Rydgård
10bc6b4cd8 Safety check that doesn't fix Crazy Taxi 2023-10-09 21:10:53 +02:00
Henrik Rydgård
ced821169e Bump shader cache versions 2023-10-09 19:39:25 +02:00
Henrik Rydgård
bb38210cfb We somehow lost the usage_ counter increment in VulkanDescSetPool, fix that 2023-10-09 17:01:35 +02:00
Henrik Rydgård
a8b8580756 Don't forget to check the stall address, even in the optimized primitive loop 2023-10-09 14:08:11 +02:00
Henrik Rydgård
7fd7015987 Fix bug in vertex cache using uninitialized data 2023-10-09 14:03:41 +02:00
Henrik Rydgård
a780d02c07 Minor reordering 2023-10-09 11:54:15 +02:00
Henrik Rydgård
316bc03ac9 Move the destroy function for VKRPipelineLayout to VulkanRenderManager 2023-10-09 11:54:13 +02:00
Henrik Rydgård
ae58fe3828 In GL and Vulkan soft-skin, we might not be fully done decoding when we reach flush. Take that into account. 2023-10-08 16:51:58 +02:00
Henrik Rydgård
c73e2351de Add checks for unused topology values when loading pipeline caches. 2023-10-08 13:39:04 +02:00
Henrik Rydgård
28ed12aa93 Simplify descriptor pool creation 2023-10-08 12:39:19 +02:00
Henrik Rydgård
b82a34539d Same as last commit, but in DrawEngineVulkan. 2023-10-08 12:39:19 +02:00
Henrik Rydgård
dbe395dd00 Add a wrapper around VKRPipelineLayout / descsetlayout 2023-10-08 12:39:18 +02:00
Henrik Rydgård
34fbbf2c2a Split out the descriptorset pool from VulkanMemory.cpp/h 2023-10-08 11:45:00 +02:00
Henrik Rydgård
c7a3e7bc32 Remove a redundant variable 2023-10-06 16:32:59 +02:00
Henrik Rydgård
cd35252400 DrawEngine: Convert strip sequences in a tight loop 2023-10-06 16:25:13 +02:00
Henrik Rydgård
10ccbfd68c Unify the clearing of variables after a draw call 2023-10-06 15:39:59 +02:00
Henrik Rydgård
d4703e9534 Decoded position format is always the same 2023-10-06 15:39:58 +02:00
Henrik Rydgård
69b43ab734 Extend the Test Drive color ramp smoother to detect up to 3 ramps in a texture.
Note that we also offset the lookup slightly to miss the wrap-around
points. The existing 31 scale factor instead of 32, together with that
half-texel, are enough to avoid that problem.

Fixes #18300
2023-10-03 23:30:18 +02:00
Henrik Rydgård
226d25721a Add a block transfer GPU stat, remove a redundant one 2023-10-03 13:15:55 +02:00
Henrik Rydgård
d07c3c5148 Fix main-thread stalls due to decimate during replacement texture loading 2023-10-03 12:17:43 +02:00
Henrik Rydgård
af7efe4b5d Fix. Need to flush soft-skinned vertices when changing vertex format. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
200575b2bc Allow the new optimization through redundant VADDR instructions, very common 2023-10-03 11:01:37 +02:00
Henrik Rydgård
3aa0f5b543 A bit more 2023-10-03 11:01:37 +02:00
Henrik Rydgård
4d95250052 Optimize further 2023-10-03 11:01:37 +02:00
Henrik Rydgård
0260aebc26 Implement fast-path for merging non-indexed draws quickly. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
e63bb0459c Add a new stat, so we can see per game if the optimization has an effect 2023-10-03 11:01:37 +02:00
Henrik Rydgård
1c49d5718c Add an offset field that we'll need later 2023-10-03 11:01:37 +02:00
Henrik Rydgård
92ffef2626 Remove some state from IndexGenerator, fix bugs. Mostly works except vertex cache. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
9b411af1f5 It's running. 2023-10-03 11:01:37 +02:00
Henrik Rydgård
bd760b9115
Merge pull request #18217 from hrydgard/gles-simplify-disk-cache
Simplify disk-cache-load on GLES as well, for the same reasons as #18216
2023-10-03 10:39:27 +02:00
Henrik Rydgård
76ad3dec4d Revert unclear optimization 2023-10-01 16:43:33 +02:00
Henrik Rydgård
bd931f9cbe Additional minor cleanups 2023-10-01 14:31:46 +02:00
Henrik Rydgård
3cef04f885 Fix incorrect flushing behavior in the prim sequencer, small optimization 2023-10-01 14:23:34 +02:00
Henrik Rydgård
a2fe906534 Micro-optimization: Don't need to check drawcalls for 0. Extract shared expression. Yes I checked assembly. 2023-10-01 14:10:19 +02:00
Henrik Rydgård
52ad0d0335 Minor cleanup in Prim() 2023-10-01 13:57:41 +02:00
Unknown W. Brackets
e79e0e21ad arm64jit: Skip unnecessary const load w/4 weights. 2023-09-30 15:41:56 -07:00
Henrik Rydgård
fb4a1fb7dd Simplify disk-cache-load on GLES as well, for the same reasons as #18216 2023-09-30 13:45:13 +02:00
Henrik Rydgård
70edf4f234
Merge pull request #18233 from unknownbrackets/meminfo-defer
Use a thread for meminfo and defer tag lookup for copies
2023-09-29 11:37:47 +02:00
Henrik Rydgård
cf48532ef5
Merge pull request #18219 from hrydgard/get-index-bounds-autovec
Make GetIndexBounds friendlier to autovectorization. Works on x86 at least.
2023-09-29 11:31:34 +02:00
Henrik Rydgård
b8fa3a2071
Merge pull request #18125 from unknownbrackets/arm64-vertexjit
arm64jit: Optimize weight loading a bit
2023-09-29 09:52:56 +02:00
Henrik Rydgård
db421165c0
Merge pull request #18172 from hrydgard/more-lenient-clear-detection
Make clear detection a bit more lenient
2023-09-29 09:52:08 +02:00
Henrik Rydgård
abbd1c83bd Revert "Merge pull request #18184 from hrydgard/expand-lines-mem-fix"
This reverts commit 65b995ac6c, reversing
changes made to 01c3c3638f.
2023-09-27 20:04:37 +02:00
Henrik Rydgård
48d3efc473 Bump shader cache versions again, just because. 2023-09-27 17:38:15 +02:00
Henrik Rydgård
038bc7fc49 Fix issue uploading narrow textures in OpenGL.
We had some stride adjustment that is not needed - and we're not passing
the stride along, so it can't do the "right thing".

Fixes #18254
2023-09-27 16:43:06 +02:00
Henrik Rydgård
f2cfbe1bcf Vulkan: Add the same shutdown logic to stop async shader compiles to DeviceLost 2023-09-26 01:28:59 +02:00
Henrik Rydgård
db245e1b34 Fix old texture leak in GLES hardware tessellation 2023-09-26 00:38:11 +02:00
Henrik Rydgård
01035f48a4 Fix for crash when changing backends in-game 2023-09-26 00:13:53 +02:00
Unknown W. Brackets
810d8c0890 Debugger: Use dedicated func to notify mem copy. 2023-09-24 19:07:36 -07:00
Henrik Rydgård
45bc4d8750 Make GetIndexBounds friendlier to autovectorization. Works on x86 at least. 2023-09-24 12:15:04 +02:00
Henrik Rydgård
6e303e8f1d Vulkan: Simplify GetShaders and DirtyLastShader, making them internally consistent. 2023-09-24 11:55:15 +02:00
Henrik Rydgård
d31ba393af Don't load the shader cache on a separate thread - all it does is already async 2023-09-24 10:53:23 +02:00
Henrik Rydgård
964f606a9c Fix some issues around geometry shaders - like, loading them from shader cache while disabled 2023-09-24 01:29:38 +02:00
Henrik Rydgård
dbd3045f87 Join the shader cache load thread on exit 2023-09-24 01:07:08 +02:00
Henrik Rydgård
9a515c851f Vulkan: Extend the cacheLock usage in GetShaders (was unsafe, though mildly) 2023-09-24 00:58:45 +02:00
Unknown W. Brackets
b610e2f314 GPU: Handle invalid blendeq more accurately. 2023-09-23 13:08:25 -07:00
Henrik Rydgård
6a8f65b566 Some assert paranoia, remove unused "failed_" variable 2023-09-23 10:09:32 +02:00
Henrik Rydgård
81f47caf2f Clarify the primitive expansion, add reporting 2023-09-22 10:27:02 +02:00
Henrik Rydgård
602407fcf2 Warning and comment fixes, logic precedence fixes in PPGeDraw 2023-09-21 16:41:42 +02:00
Henrik Rydgård
1aab1c4b09 Be a bit smarter when loading the shader cache, avoid duplicating work 2023-09-21 10:44:04 +02:00
Henrik Rydgård
2e171b22ec Vulkan: Remove an assert that didn't give much actionable information. Replace with reporting. 2023-09-20 22:50:38 +02:00
Henrik Rydgård
65b995ac6c
Merge pull request #18184 from hrydgard/expand-lines-mem-fix
Add memory bounds-check when expanding points, rects and lines to triangles
2023-09-20 20:39:16 +02:00
Henrik Rydgård
966144fa64 Bounds check writing to the index buffer when expanding lines/rects/points 2023-09-20 19:26:36 +02:00
Henrik Rydgård
3f2ef508c9 Make it easier to reason about space in the inds buffer by moving an offset instead of the pointer. 2023-09-20 19:23:24 +02:00
Henrik Rydgård
3783afd855 Fix a really bad race condition during game shutdown. 2023-09-20 18:47:32 +02:00
Henrik Rydgård
5c94b41dde Vulkan: If a createimageview failed, don't leak the image. Probably very rare. 2023-09-20 18:47:32 +02:00
Henrik Rydgård
e6a864ee04 Make clear detection a bit more lenient. Allows using clears in Assassin's Creed and likely more. 2023-09-18 23:57:20 +02:00
Henrik Rydgård
0bfd166200 Try to prevent a weird shutdown race condition that I'm not sure can happen - but crash logs show it 2023-09-18 16:45:07 +02:00
Henrik Rydgård
f4b0cddda3 ShaderId: Safer way to check for backend. 2023-09-18 16:25:00 +02:00
Henrik Rydgård
946d4b6251 Avoid causing shader gen failures due to bad blend eq values 2023-09-18 16:12:27 +02:00
Henrik Rydgård
b7d28cd10a Remove redundant fail state. Bail from shader cache load if a fragment shader fails to generate, too. 2023-09-18 14:38:22 +02:00
Henrik Rydgård
c2bf09744a SoftGPU: Fix refactoring mistake where we could return an uninitialized value. Oops. 2023-09-15 10:01:28 +02:00
Henrik Rydgård
6600b7ab08 Improved logging 2023-09-12 17:15:26 +02:00
Henrik Rydgård
447b28d277 Vulkan DrawEngine: Reset bound secondary texture on clear.
Fixes a validation error hit in Beats
2023-09-12 17:15:26 +02:00
Henrik Rydgård
be65cf0fc2 Assert improvements 2023-09-12 17:15:26 +02:00
Henrik Rydgård
844f1de041 Revert "Merge pull request #18008 from hrydgard/naruto-video-flicker-heuristic"
This reverts commit 985af4b03d, reversing
changes made to 64d04782ea.
2023-09-12 12:19:37 +02:00
Henrik Rydgård
97404354ef More asserts and checks in pipeline manager 2023-09-11 17:38:17 +02:00
Henrik Rydgård
052747aa30 Add reporting of GLSL shader gen errors 2023-09-11 15:37:35 +02:00
Henrik Rydgård
d335393d4e GLSL shader cache: Improve robustness against null shaders. See #18116 2023-09-11 12:07:18 +02:00
Henrik Rydgård
10f93875c6 Fix the semantics of DenseHashMap to be consistent even when inserting nulls 2023-09-11 12:07:18 +02:00
Unknown W. Brackets
3c7b05c3e8 PPGe: Use texture windows for atlas text.
This makes it so that software rendering, which correctly applies clamp/wrap
limits at 512x512, still has readable text.  Other textures may still be
wrong.
2023-09-10 23:54:55 -07:00
Unknown W. Brackets
5c4e08fe19 arm64jit: Use FMLA for TC prescale. 2023-09-10 23:04:15 -07:00
Unknown W. Brackets
646e3b269d arm64jit: Skip vertexjit prolog/epilog if possible. 2023-09-10 23:04:15 -07:00
Unknown W. Brackets
a8493c0e19 arm64jit: Optimize weight loading a bit. 2023-09-10 23:04:15 -07:00
Unknown W. Brackets
f1f3e6fba2 arm64jit: Optimize vertex full alpha tracking. 2023-09-10 13:08:33 -07:00
Henrik Rydgård
162b363063 Bump shader cache version, just because. 2023-09-09 15:13:52 +02:00
Henrik Rydgård
ce4ee78157
Merge pull request #18099 from unknownbrackets/include-guards
Build: Add some missing include guards.
2023-09-08 08:33:53 +02:00
Unknown W. Brackets
cec9dbbdf7 Build: Add some missing include guards. 2023-09-07 17:14:58 -07:00
Henrik Rydgård
05d8752a64 Vulkan: Correct the calculation for max possible mip levels 2023-09-07 15:20:16 +02:00
Henrik Rydgård
6799e8a67b Add a little reminder that saving new textures is on, if it is. 2023-09-07 13:57:52 +02:00
Henrik Rydgård
f70d233511 Vulkan: Fix ordering issue in tex loading - decided on color swizzle too early 2023-09-06 22:48:11 +02:00
Henrik Rydgård
9a7579f8fa Typo fix, fixes #18038 2023-09-06 17:10:17 +02:00
Henrik Rydgård
0aa67e5276 Add some texture loading safety checks
I hit a spurious, non-reproducible debug assert in Archer Maclean's Mercury.
Just want to rule out some bad code paths.
2023-09-06 15:38:47 +02:00
Unknown W. Brackets
e4cf2c3a13 arm64jit: Correct vertexjit bug on invalid case. 2023-09-04 23:42:59 -07:00
Henrik Rydgård
3e6788defe
Merge pull request #18022 from hrydgard/screen-scaler-ingame-fix
Android: Fix changing display resolution scale in-game
2023-08-31 08:45:39 +02:00
Henrik Rydgård
62c9041060
Merge pull request #18011 from hrydgard/collapse-degenerate-volume-textures
Detect the simplest Tactics Ogre case (US/EU) early
2023-08-31 08:45:11 +02:00
Henrik Rydgård
92e600a1c0
Merge pull request #18013 from hrydgard/remove-bad-heuristic
Replace a too-simple heuristic with a compat flag, fixing Castlevania flicker.
2023-08-31 08:18:53 +02:00
Henrik Rydgård
4b89fab91c NativeInitGraphics: Update core parameter pixel width/height (since we lose resized flag) 2023-08-30 23:42:13 +02:00
Henrik Rydgård
7e0e9a6d2b Shrink TexCacheEntry by 4 bytes and clean up a naming issue
Was a little confused by the "/2". It's not really useful to cache this
value anyway.
2023-08-30 16:37:38 +02:00
Henrik Rydgård
131163bf4c Replace a too-simple heuristic with a compat flag.
Should fix the flicker in Castlevania.

Fixes #17517

The heuristic worked for Rainbow Six but broke Castlevania, so I'd rather
use a compat flag instead of breaking a different game until we can find a
more reliable heuristic for Rainbow Six.
2023-08-30 10:59:57 +02:00
Henrik Rydgård
0d0c11a1ed Remove unnecessary check, add comment. 2023-08-30 10:29:14 +02:00
Henrik Rydgård
30a165b1dd Detect the simplest Tactics Ogre case (US/EU) early
Removes the need for the compat.ini flag for these versions, since we
can just treat the texture exactly as a regular 2D texture.
2023-08-30 10:27:19 +02:00
Henrik Rydgård
a5117249bd Add a debug assert during texture loading. 2023-08-29 23:15:30 +02:00
Henrik Rydgård
42b0ccd07d Revert some unnecessary log changes from #18001 2023-08-29 23:13:45 +02:00
Henrik Rydgård
985af4b03d
Merge pull request #18008 from hrydgard/naruto-video-flicker-heuristic
Add heuristic for memory->framebuffer copies, fixing video flicker in Naruto UNH 2
2023-08-29 13:21:40 +02:00
Henrik Rydgård
c563d4e57d NotifyFramebufferCopy: Pick the target framebuffer by scoring. 2023-08-29 12:53:18 +02:00
Henrik Rydgård
e3bdf1a70b Add heuristic, fixing video flicker in Naruto UNH 2 caused by copy to wrong target. 2023-08-29 11:46:24 +02:00
Henrik Rydgård
af1a1c5182 Improve logging of bad filenames 2023-08-29 10:45:00 +02:00
Henrik Rydgård
de679e2761 Generalize the odd/even mip level check 2023-08-29 10:44:18 +02:00
Henrik Rydgård
0cdfaffb48 Enable the FakeMipmapChange flag for US/EU Tactics Ogre, fixing replacement problem.
For correct lookups, without our texture replacement actually supporting
volume textures, we need to use this mechanism here too.

The game actually uses two mipmaps, but they're identical and point to
the same memory, so we treat them as a regular 2D texture instead for
purposes of both texturing and replacement. This is presumably legacy
from the initial Japanese version that needs to use multiple texture
layers. Similarly, it does it in pairs.

This does not actually fully fix texture replacement for the Japanese
version, unfortunately. For that we need more proper support for these
weird textures in the texture replacement code - when I refactored it
before for more natural handling of regular mipmapping, this kinda got
lost.
2023-08-28 20:58:57 +02:00
Henrik Rydgård
412c4547cd textures.ini loader logging improvement 2023-08-28 16:34:58 +02:00
Henrik Rydgård
5c42aa07fc Minor log improvement 2023-08-28 16:11:47 +02:00
Henrik Rydgård
a529a9c408 Improve the logging in the CLUT load path 2023-08-28 14:39:24 +02:00
Henrik Rydgård
d2d8688e47 Add "Create frame dump" to the in-game developer menu (that can be enabled in dev settings)
Makes it possible to create one without connecting the websocket
debugger, even on non-Windows platforms.
2023-08-24 14:41:35 +02:00
Henrik Rydgård
16d073c4ad Add compat flag to not load CLUTs from old framebuffers 2023-08-24 10:30:37 +02:00
Unknown W. Brackets
622c69dbb9 x86jit: Expose option to select new IR based jit. 2023-08-20 22:28:54 -07:00
Henrik Rydgård
714558853c Enable anisotropic filtering for replacement textures with mipmaps 2023-08-18 15:21:07 +02:00
Henrik Rydgård
44d602ca7d Move InitSysDirectories to where it belongs and rename it. Plus warning fixes. 2023-08-18 13:03:32 +02:00
Henrik Rydgård
13cfd9c3d6 Add Mesa as a known GPU driver "vendor". 2023-08-17 22:06:03 +02:00
Henrik Rydgård
89ff606ccb D3D9 fix. Make a check more break-point-able. 2023-08-17 20:46:43 +02:00
Henrik Rydgård
8a6e288fcc Add checkboxes in developer tools to allow disabling ubershaders.
Might be helpful to diagnose performance problems on user devices.

Additionally, moves the texture replacement controls to the top. They
should probably be moved somewhere else entirely...

See #17918
2023-08-17 20:16:04 +02:00
Henrik Rydgård
ff6e118fff Get rid of a lot of ifdefs around presentation mode. Instead, set things dynamically. 2023-08-14 11:02:29 +02:00
Unknown W. Brackets
f03cd0b2ad
Merge pull request #17899 from unknownbrackets/riscv-minor
Minor RISC-V cleanups, frame profiler fix
2023-08-13 11:19:42 -07:00
Unknown W. Brackets
41cddce167 TexCache: Encourage vectorization.
This gets clang to vectorize on RISC-V V, although it looks suboptimal
(probably faster than not using vector, though.)  Also improves other
platforms, but our specializations seem better.
2023-08-13 10:21:04 -07:00
Henrik Rydgård
d82ecf1d3e IniFile: Store sections in unique_ptrs, instead of directly.
This fixes an issue when you create two sections consecutively and
retain pointers to them, and then modify them, such as happens in the
postshader ini initialization. Previously, one of the section pointers
could get invalidated since the section vector got resized. Now, the
pointed-to sections don't move around in memory, only the list of them
does.
2023-08-13 13:41:43 +02:00
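The pointer-stability argument in the commit above can be sketched as follows. The class `IniFileSketch` and its `GetOrCreateSection` method are hypothetical stand-ins, not the project's actual `IniFile` interface. The point is that growing a `std::vector<std::unique_ptr<Section>>` only moves the smart pointers, while the `Section` objects they own stay at the same address.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Section {
    explicit Section(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Hypothetical stand-in for an ini container that hands out section pointers.
class IniFileSketch {
public:
    Section *GetOrCreateSection(const std::string &name) {
        for (auto &section : sections_) {
            if (section->name == name)
                return section.get();
        }
        // Growing the vector may reallocate its storage, but that only moves
        // the unique_ptrs. The heap-allocated Section objects do not move.
        sections_.push_back(std::make_unique<Section>(name));
        return sections_.back().get();
    }

    size_t Count() const { return sections_.size(); }

private:
    std::vector<std::unique_ptr<Section>> sections_;
};
```

With a plain `std::vector<Section>`, the first returned pointer could dangle as soon as a later insertion triggered a reallocation, which matches the failure described in the commit message.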
Henrik Rydgård
ebfd76d742 Add back the self-render check that kept Ridge Racer working.
This hack was removed in #17838
2023-08-08 15:42:52 +02:00
Henrik Rydgård
74a471d7a5
Merge pull request #17846 from hrydgard/debug-overlay-everywhere
Debug overlay everywhere
2023-08-03 20:55:35 +02:00
Henrik Rydgård
be63ce3a4a Minor refactor allowing getting the GPU profile string outside games 2023-08-03 16:31:20 +02:00
Henrik Rydgård
a32249a3cf Move DebugOverlay rendering to the overlay screen, allowing drawing it on top of the menu 2023-08-03 16:19:18 +02:00
Henrik Rydgård
5ed4b532b7 Micro-optimize SubmitPrim, remove outdated mitigation 2023-08-02 19:14:32 +02:00
Henrik Rydgård
fc6879674e Refactor overlays into an enum 2023-08-02 13:03:04 +02:00
Henrik Rydgård
1475fcb065 Fix a comment 2023-08-01 00:28:54 +02:00
Henrik Rydgård
3861e97a94 Experiment with the collapsible header thingy. Slightly increase the font size of headers. 2023-07-31 11:48:50 +02:00
Henrik Rydgård
f0fd9e85aa Try dirtying CULL_PLANES in Execute_BoundingBox in SoftGPU 2023-07-30 18:35:18 +02:00
Henrik Rydgård
fd656c629d More dirtying 2023-07-30 17:45:19 +02:00
Henrik Rydgård
061131ec8a Cache planes used for BBOX culling
This isn't a huge performance boost for the games that use BBOX (like
Tekken), but it'll be more valuable if we start using soft culling more
widely automatically, see #17808
2023-07-30 14:42:22 +02:00
Henrik Rydgård
6da6de8201 Re-enable framebuffer fetch for blend where available.
Accidentally disabled this in #17575

Helps #17797 but only on OpenGL on mobile. There's more to improve
there.

caps_.framebufferFetchSupported is now always set to false in Vulkan.
2023-07-30 11:13:42 +02:00
Unknown W. Brackets
8f404a1961 softgpu: Fix minor typo. 2023-07-25 19:42:36 -07:00
Henrik Rydgård
bee2400230
Merge pull request #17769 from unknownbrackets/vertexjit-debug
Add compilation-enabled vertexjit compare tool
2023-07-24 09:39:52 +02:00
Unknown W. Brackets
b041e712de riscv: Fix signed position bug in vertexjit. 2023-07-23 17:57:08 -07:00
Unknown W. Brackets
5cbad1982b riscv: Correct 565 morph mistake.
Observed in Valkyria Chronicles 3.
2023-07-23 17:57:08 -07:00
Henrik Rydgård
da2b31e2cc
Merge pull request #17771 from unknownbrackets/riscv-vertexjit
Fix some silly mistakes in the RISC-V vertexjit
2023-07-23 23:57:04 +02:00
Henrik Rydgård
89d5f55893
Merge pull request #17768 from unknownbrackets/vertex-uvscale
GPU: Correct UV scale for non-jit vertices
2023-07-23 23:52:20 +02:00
Unknown W. Brackets
f1c90a6014 riscv: Fix skinning decode, morph and not.
Was transposed and using the wrong matrix when morphing.
2023-07-23 14:35:37 -07:00
Unknown W. Brackets
1790964ffe riscv: Fix vertexjit skinning, oops. 2023-07-23 14:35:37 -07:00
Unknown W. Brackets
311c78f26b GPU: Make the vertexjit diff smarter. 2023-07-23 14:28:45 -07:00
Unknown W. Brackets
b6f11d6dae GPU: Add a little tool to debug vertexjit.
Although it's too exacting right now, it still helps.
2023-07-23 14:28:45 -07:00
Unknown W. Brackets
312dcfc1c5 GPU: Correct UV scale for non-jit. 2023-07-23 14:25:43 -07:00
Henrik Rydgård
c1a290b41f ReplacedTexture: Bugfix D3D workaround log check 2023-07-23 22:06:06 +02:00
Henrik Rydgård
ace217008a In D3D11, force block compressed textures to have dimensions divisible by 4
Fixes #17745 (crash when loading certain texture packs in D3D11)

This is an old unfortunate limitation. Only applies to the top mip
level, which makes it obvious that it's kinda unnecessary for the
hardware and indeed, Vulkan and OpenGL don't have this limitation.
2023-07-20 19:44:00 +02:00
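The commit message does not say how the divisible-by-4 rule is enforced, so the following is only a sketch of the arithmetic involved, with hypothetical helper names. Block-compressed formats store 4x4 texel blocks, so a dimension can be checked against, or rounded up to, the next multiple of 4:

```cpp
#include <cassert>

// Hypothetical helper: rounds a positive texture dimension up to the next
// multiple of 4, the block size used by block-compressed formats.
inline int RoundUpToBlockSize(int size) {
    assert(size > 0);
    return (size + 3) & ~3;
}

// Hypothetical helper: true if both top-level dimensions are already
// divisible by 4.
inline bool IsBlockAligned(int width, int height) {
    return (width % 4) == 0 && (height % 4) == 0;
}
```

For example, a 10x6 texture fails `IsBlockAligned` and would round up to 12x8.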
Henrik Rydgård
6b574e497f
Merge pull request #17730 from unknownbrackets/gedebugger-steptex
GE Debugger: Make step tex jump to first prim
2023-07-16 21:02:06 +02:00
Unknown W. Brackets
3b03c1ca85 GE Debugger: Make step tex jump to first prim. 2023-07-16 11:34:51 -07:00
Unknown W. Brackets
d6a5e84db5 softgpu: Fix worldpos skipping.
Oops, was reversed.  We need worldpos for non-directional lights.
2023-07-16 10:59:44 -07:00
Unknown W. Brackets
47c29e0874 softgpu: Disable lights if all else disabled.
Tiny gain, but seeing it happen so might as well.
2023-07-16 10:31:58 -07:00
Unknown W. Brackets
d5b4c98f96 softgpu: Reduce some non-SIMD lighting math.
Small perf improvement for vertex/lighting heavy (i.e. 3D) scenes.
2023-07-16 10:31:44 -07:00
Henrik Rydgård
b4419a9146 Remove the old screen resolution popup thing 2023-07-16 17:05:26 +02:00
Henrik Rydgård
952e125c7e Break out rendering of "notices" from OnScreenDisplay. They can now also be used as views.
Use it for the new message in ControlMappingScreen, when you try to map
a combo when that's disabled. It'll have more uses.
2023-07-07 15:23:19 +02:00
fp64
b0f71e08f4 Simplify projective texcoord calculation
As mentioned in https://github.com/hrydgard/ppsspp/issues/17613#issuecomment-1613583152 .
2023-07-03 10:59:09 -04:00
Henrik Rydgård
fc797ec55f
Merge pull request #17656 from lvonasek/compat_openxr_fixes
OpenXR - Game compatibility fixes
2023-07-02 21:12:21 +02:00
Lubos
6e10f20f8b OpenXR - Tony Hawk mirroring hack better 2023-07-02 20:29:59 +02:00
Lubos
843b169fa3 OpenXR - Digimon Adventure rendering fixed 2023-07-02 15:05:29 +02:00
Unknown W. Brackets
9c08e27a0c
Merge pull request #17648 from fp64/div-less
Replace some signed division in SoftGPU
2023-07-01 12:28:52 -07:00
fp64
cd9f01c4df Remove SSE4 path from Vec4<int>::operator* 2023-06-30 22:07:26 -04:00
Henrik Rydgård
eb21a2e6c9 Break out the OSD data holder from Common/System/System.h, into OSD.cpp/h 2023-06-30 17:15:49 +02:00
fp64
f133739cd0 Replace some signed division in SoftGPU
This also adds a few bitwise operations to Vec4<int> and further
SIMDifies it.
Also, fixes unrelated warning.
2023-06-29 16:43:21 -04:00
Unknown W. Brackets
dfe113e846
Merge pull request #17634 from fp64/macro-x86-loadu
Streamline x86 SSE workaround
2023-06-27 23:01:41 -07:00
Henrik Rydgård
e4229886b7
Merge pull request #17636 from lvonasek/review_openxr
OpenXR - Major review
2023-06-27 20:07:42 +02:00
Lubos
880168ee3c OpenXR - Fix render glitches caused by wrong mirroring 2023-06-27 18:54:38 +02:00
M4xw
99ce3125df [Softgpu] Fix AArch64 oversight 2023-06-27 17:20:11 +02:00
fp64
436b49c4f2 Streamline x86 SSE workaround
Seems clearer than using #ifdefs at each site. Also, the rationale
is clearly spelled out, one 'Go to definition' away from any instance.
2023-06-27 00:30:01 -04:00
Unknown W. Brackets
fedb92b0e9 softgpu: Ensure early depth test uses SIMD. 2023-06-25 10:18:21 -07:00
Henrik Rydgård
08d578dce9
Merge pull request #17618 from unknownbrackets/softgpu-opt-cast
Optimize casts in softgpu
2023-06-25 07:55:30 +02:00
Henrik Rydgård
ec92675c5e
Merge pull request #17619 from unknownbrackets/softgpu-opt-z
softgpu: Improve Z interpolation SIMD
2023-06-25 07:55:03 +02:00
Unknown W. Brackets
d42642edd2 softgpu: Improve Z interpolation SIMD. 2023-06-24 22:17:11 -07:00
Unknown W. Brackets
15b66ba6c0 softgpu: Make SIMD on x86_32 a bit safer. 2023-06-24 14:49:23 -07:00
Unknown W. Brackets
ae9d34370e softgpu: Move wsum_recip out of the triangle loop.
Seems like a small benefit, but not seeing any issues from this.
Noticed by fp64.
2023-06-24 12:38:05 -07:00
Unknown W. Brackets
795de9b164 softgpu: Use SIMD for more Vec4 casts.
A number of these were falling back to some pretty terrible code.
Thanks to fp64 for noticing.
2023-06-24 12:36:44 -07:00
Unknown W. Brackets
76990aec70
Merge pull request #17609 from fp64/optimize-softgpu-tex-linear
softgpu: Optimize (bi-)linear texture filtering
2023-06-21 23:39:15 -07:00
fp64
159faaa2ec softgpu: Optimize (bi-)linear texture filtering
Seeing as SampleLinearLevel is near the top in the profiler,
optimize actual bilinear filtering using SSE2. Solid win in the
synthetic benchmark (https://godbolt.org/z/fqh3xvbGx, also doubles
as correctness check), no visible difference in actual PPSSPP.
Note: profiler suggests that hot part of SampleLinearLevel is
elsewhere.
2023-06-21 20:02:34 +03:00
Henrik Rydgård
7cc8c6cea4 OSD: Add semantics, move the OSD state to common (while keeping the renderer in the UI). 2023-06-20 14:40:46 +02:00
Unknown W. Brackets
efd8565ffe
Merge pull request #17592 from fp64/anymask-movemask
Use _mm_movemask_ps for AnyMask
2023-06-17 09:48:09 -07:00
fp64
ab85c46161 Use _mm_movemask_ps for AnyMask
Probably very minor speed improvement, but it's rather neat.
2023-06-17 01:05:02 -04:00
Henrik Rydgård
5b4fa06b00 Revert Dot33 on 32-bit x86 only. See #17584 2023-06-16 23:43:33 +02:00
Henrik Rydgård
6d4e5a0f3e
Merge pull request #17584 from fp64/sse2-dot33
Convert Dot33 to SSE2
2023-06-15 20:08:23 +02:00
Henrik Rydgård
def09bf575 Update the uvscale uniform a bit more conservatively on framebuffer changes
Plus fixes a few minor oversights

Fixes #17581 and possibly #17522
2023-06-15 11:57:30 +02:00