57 Commits

Author SHA1 Message Date
Henrik Rydgård
c2bf09744a SoftGPU: Fix refactoring mistake where we could return an uninitialized value. Oops. 2023-09-15 10:01:28 +02:00
Henrik Rydgård
10f93875c6 Fix the semantics of DenseHashMap to be consistent even when inserting nulls 2023-09-11 12:07:18 +02:00
Unknown W. Brackets
cd3fc26190 samplerjit: Prevent thread local stale cache read.
If the generation count happens to match, would still get a stale pointer
and crash.  Let's just make the generation count static so it always
increases.
2023-02-22 21:15:03 -08:00
Unknown W. Brackets
587a322207 softgpu: Use NEON SIMD for alpha blending, etc. 2023-01-07 19:06:34 -08:00
Unknown W. Brackets
d9522a7ac5 softgpu: Avoid clear hazard for last cached funcs. 2022-12-06 21:23:56 -08:00
Unknown W. Brackets
eda3ce556e softgpu: Avoid atomic structs.
Apparently we don't link libatomic and rather than fighting that, I'll
just use thread local values.
2022-12-06 20:35:07 -08:00
Unknown W. Brackets
400f6abf9a softgpu: Optimize lookup of last jit func.
This is common (for example, maybe a pixel state is updated but sampler is
not), and reduces time spent in ComputeRasterizerState() quite a bit in
Darkstalkers, where jits are available (i.e. Intel currently.)
2022-12-06 19:16:19 -08:00
Unknown W. Brackets
87fb9eef37 softgpu: Remove std::function usage.
Wanted to avoid coupling these, but don't like the std::function
construct/destructs showing in profiles...
2022-12-06 19:15:57 -08:00
Unknown W. Brackets
38eb0a7a82 softgpu: Check for queued compile.
Rarely, we could have queued compiling the same one, which would crash on
a double insert.
2022-12-03 12:15:58 -08:00
Unknown W. Brackets
778a0487cb softjit: Switch to DenseHashMap. 2022-12-02 20:59:13 -08:00
Unknown W. Brackets
4d06400548 softgpu: Fix compile hazard while running.
This prevents any clearing of cache while other threads may be using
previously cached funcs, and avoids wx exclusive hazards.
2022-11-20 12:04:02 -08:00
Unknown W. Brackets
ce51942508 softgpu: Correct WX-exclusive platform hazards.
Should mainly affect BSD at this point.
2022-11-20 10:55:35 -08:00
Unknown W. Brackets
c47d7eab38 softgpu: Simply 5551 blending fast path.
Since it only supports multiply and add, let's just stick with that.
2022-09-24 18:55:45 -07:00
Unknown W. Brackets
6e6535c263 softjit: Skip reading dst pixel where blended out.
Sometimes used by blends used purely to multiply the source color by
something, usually prep for bloom.
2022-09-24 02:00:03 -07:00
Unknown W. Brackets
e72309745e softjit: Implement accurate fog color blending. 2022-09-11 08:50:07 -07:00
Unknown W. Brackets
b90fc7137f softgpu: Correct accuracy of fog calculation.
This matches values from a PSP exactly, with the help of immediate mode
vertex values (since this directly allows specifying the fog factor
without any floating point math.)
2022-09-11 08:24:40 -07:00
Unknown W. Brackets
15d5fa48f7 softgpu: Check depth test early on simple stencil.
If we don't need to write stencil on sfail/zfail, we can do the depthtest
early, which allows us to more often skip texture sampling.

This gives a good improvement in Chains of Olympus.
2022-09-10 21:24:19 -07:00
Unknown W. Brackets
2479d52202 Global: Reduce includes of common headers.
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
6c723c0517 softjit: Fix src blend factor handling.
This was causing us to skip a shift, oops.
2022-01-24 00:05:00 -08:00
Unknown W. Brackets
c2985bca31 softjit: Centralize some common funcs from sampler.
No need to duplicate this code.
2022-01-19 00:03:59 -08:00
Unknown W. Brackets
ac2b96cec0 softjit: Switch to constant pool.
This is simpler without RIP access checks, and tends to be fast.
2022-01-17 19:50:37 -08:00
Unknown W. Brackets
fc292b127b softgpu: Correct dither matrix lookup.
Oops, need to wrap x/y, of course...
2022-01-15 23:51:21 -08:00
Unknown W. Brackets
f4f7ea2736 softgpu: Cache colortest params in draw pix state. 2022-01-15 13:03:11 -08:00
Unknown W. Brackets
aa9d751248 softgpu: Cache alpha/stencil test masks in state. 2022-01-15 13:03:11 -08:00
Unknown W. Brackets
acad2640dd softgpu: Cache logicOp in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
c0d548846f softgpu: Use cached write mask in draw pixel. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
f1ce2e7715 softgpu: Cache minz/maxz in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
0b3f096c01 softgpu: Cache strides in draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
e9f3720e20 softgpu: Cache fog color draw pixel state. 2022-01-15 13:03:10 -08:00
Unknown W. Brackets
9ec7d65c49 softgpu: Use func IDs instead of gstate more. 2022-01-10 22:12:35 -08:00
Unknown W. Brackets
961cfcd75c softjit: Add describes here too.
Helpful to aggregate when there are multiple rasterizers.
2022-01-03 06:45:10 -08:00
Unknown W. Brackets
e93c709f5c sofjit: Correctly poison memory.
Noticed this wasn't breakpoints when reviewing some assembly output.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
355bad666c softjit: Optimize common case bloom blending.
Bloom often uses fixed ONE + ONE, which is a lot less work for us.  And
bloom often runs over and over again on pixels, so saving work is good.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
2d8fdd8cf4 Math3D: Allow construction from NEON vectors.
This makes it match SSE and easier to keep things generic.  Will impact
alignment of non-packed Vec2/Vec3.
2021-11-28 08:24:53 -08:00
Unknown W. Brackets
96a7554053 sofjit: Move common types to reg cache header.
This makes it easier to use vectors elsewhere.
2021-11-28 08:03:15 -08:00
Unknown W. Brackets
3d5bced296 softjit: Rename reg cache so it can be reused.
Intentionally just the name changes in this commit.
2021-11-28 08:03:15 -08:00
Unknown W. Brackets
4703b6cb56 softjit: Cleanup, add other arch types to regcache. 2021-11-28 08:03:15 -08:00
Unknown W. Brackets
c1882fa1c0 softjit: Disallow use of register after unlock. 2021-11-28 08:03:14 -08:00
Unknown W. Brackets
2f039abd13 softjit: Simplify regcache usage as purpose only.
Dealing with types was annoying, and this helps validate the right
register is released.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
e1ed49a3e4 softjit: Ensure all regs are released. 2021-11-28 08:03:14 -08:00
Unknown W. Brackets
d53e13b862 softjit: Manage args in the register cache. 2021-11-28 08:03:13 -08:00
Unknown W. Brackets
1cb48a7bd2 softjit: Reduce jit pool size a bit. 2021-11-26 10:30:00 -08:00
Unknown W. Brackets
c62457bb33 softjit: Optimize common blend inverse alpha case. 2021-11-26 09:30:48 -08:00
Unknown W. Brackets
a07017dbb0 softjit: Prefer easier to refill regs. 2021-11-26 09:30:47 -08:00
Unknown W. Brackets
7f167c3660 softjit: Implement min/max/absdiff blending.
Alpha not yet implemented.
2021-11-26 09:30:47 -08:00
Unknown W. Brackets
2b4b4ae064 softjit: Add config setting to enable/disable.
Also use it for samplerjit.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
0e63b357b3 softjit: Add dithering. 2021-11-26 08:21:13 -08:00
Unknown W. Brackets
2423285831 softjit: Add helpers to get framebuf offsets. 2021-11-26 08:21:12 -08:00
Unknown W. Brackets
f8819308ff softjit: Add levels of register locking.
Locking also in helpers, so need to nest locks.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
9fed7ea732 softjit: Add register cache for softjit. 2021-11-26 08:21:11 -08:00