Henrik Rydgård
d3f0af7458
Merge pull request #15273 from unknownbrackets/softjit-bloom
...
Optimize software renderer handling of common bloom operations
2022-01-02 18:11:07 +01:00
Henrik Rydgård
c07ca2d89d
Merge pull request #15272 from unknownbrackets/softgpu-meminfo
...
softgpu: Add code for tracking GPU writes
2022-01-02 18:09:16 +01:00
Henrik Rydgård
c7062d7063
Merge pull request #15271 from unknownbrackets/samplerjit-color16
...
samplerjit: Decode colors in parallel
2022-01-02 17:55:46 +01:00
Unknown W. Brackets
a259761262
samplerjit: Use nearest func in fast path too.
...
This uses the more optimal tex funcs.
2022-01-02 08:48:16 -08:00
Unknown W. Brackets
ba17f538d6
softjit: Avoid const temp registers.
...
Was trying to make sure register allocation was okay in the worst case.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
e93c709f5c
softjit: Correctly poison memory.
...
Noticed this wasn't filled with breakpoints when reviewing some assembly output.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
745c35f320
softjit: Small bloom optimization.
...
Another common case, src*dst + dst*0. Can skip the add.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
355bad666c
softjit: Optimize common case bloom blending.
...
Bloom often uses fixed ONE + ONE, which is a lot less work for us. And
bloom often runs over and over again on pixels, so saving work is good.
2022-01-02 08:47:04 -08:00
Henrik Rydgård
6fb5d82fe0
Merge pull request #15264 from unknownbrackets/samplerjit-vec
...
A couple more smaller samplerjit optimizations
2022-01-02 17:32:54 +01:00
Unknown W. Brackets
496545e55c
softgpu: Add code for tracking GPU writes.
...
Unfortunately, it has a pretty noticeable speed impact, even at the basic
"assume everything's written" level. Compiled off by default, but at
least it's there.
Doesn't account for tests (i.e. alpha test skipping write) so still not
perfectly accurate.
2022-01-02 08:28:30 -08:00
Unknown W. Brackets
0eec4e7e4d
samplerjit: Decode colors in parallel.
...
Not used in a ton of games, but a decent improvement where it is used.
2022-01-02 08:27:55 -08:00
Henrik Rydgård
cb1f26122d
Merge pull request #15269 from unknownbrackets/softgpu-opt
...
softgpu: Reduce interpolation if not needed
2022-01-02 09:47:19 +01:00
Henrik Rydgård
da38c027b5
Merge pull request #15268 from unknownbrackets/samplerjit-nearest
...
Implement nearest in samplerjit, like linear
2022-01-02 09:46:29 +01:00
Unknown W. Brackets
025ac99f2f
softgpu: Reduce interpolation if not needed.
...
About 3% gain in some areas.
2022-01-01 18:34:04 -08:00
Unknown W. Brackets
7060035303
samplerjit: Implement nearest in jit.
...
This uses the tex func and similar within jit.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
91c9343e87
samplerjit: Refactor and reuse constant pool.
...
It's just here to be rip accessible, the fixed values can be output just
once.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
40240be91c
samplerjit: Update nearest args, temp disable jit.
...
This temporarily disables jit for nearest, but refactors to use the new
arg structure. It now matches linear.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
5f84de7de7
softjit: Small optimizations.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
06e954fe2a
samplerjit: Create a separate fetch func.
...
This allows nearest to become more similar to linear, where it applies the
texture function.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
3bc6009158
samplerjit: Refactor sampler ID calculation.
...
Make it the same as pixel func IDs.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
d41e42d247
softgpu: Correct off-by-one scissor mask.
...
Fixes Brave Story in the software renderer. Was overwriting display list
data in the stride gap.
2022-01-01 16:42:36 -08:00
Unknown W. Brackets
b35ca3d472
softgpu: Cleanup min/max tri range handling.
...
The previous code looked like it had off-by-one errors. This is simpler.
2022-01-01 16:42:36 -08:00
Unknown W. Brackets
e82fd3bd33
GPU: Avoid spline crashes on bad data.
...
If we get 0 prims, we can generate confusing index bounds and go out of
bounds. Similarly, if we get a crazy number of control points and fail to
allocate, we can crash.
2022-01-01 16:40:59 -08:00
Unknown W. Brackets
12405709f0
softgpu: Skip processing scissored triangles.
...
If only one side was scissored (common), we might even put it on a thread,
which ended up as a lot of overhead. Gives 3-4% improvement in some
places.
2022-01-01 16:40:34 -08:00
Unknown W. Brackets
6aec68aa5c
samplerjit: Correct wrong bufw at mip levels.
...
Oops, was always using the base bufw.
2022-01-01 16:40:02 -08:00
Unknown W. Brackets
dbb015f427
samplerjit: Oops, fix Linux mipmap handling.
2022-01-01 16:40:02 -08:00
Unknown W. Brackets
8c31f1bb38
softjit: Fix regcache error when clearing.
...
Happens for non-through clears.
2022-01-01 16:40:01 -08:00
Unknown W. Brackets
8ea67b571b
samplerjit: Tiny dependency optimizations.
...
This had a small but measurable impact (~0.3%).
2021-12-31 08:11:57 -08:00
Unknown W. Brackets
fc3688d273
samplerjit: Small AVX optimization to modulate.
...
Only gives about 0.5% but it's still something.
2021-12-31 08:10:04 -08:00
Henrik Rydgård
244b0a86f6
Merge pull request #15262 from unknownbrackets/samplerjit-vec
...
samplerjit: Use SSSE3/SSE4 in linear filtering
2021-12-31 09:29:59 +01:00
Unknown W. Brackets
33e9841a4a
softgpu: Skip zero size triangles.
...
These were drawing before, incorrectly, which caused artifacts.
Noticeable in Blade Dancer.
2021-12-31 00:20:12 -08:00
Unknown W. Brackets
1addf84e90
samplerjit: Use SSSE3/SSE4 in linear filtering.
2021-12-30 23:22:56 -08:00
Unknown W. Brackets
147b81d6f7
x64jit: Add AVX/AVX2 encodings.
...
Also fix the FMA double ones, which were passing W wrongly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
4bd94a4e5e
samplerjit: Pass funcs as an argument.
...
Computing the ID shows up in some profiles, so we want to avoid computing
it per thread/invocation.
2021-12-29 07:11:53 -08:00
Unknown W. Brackets
28cfbe0e5a
samplerjit: Add an alternate profiling method.
...
This is more useful to group common operations together for profiling.
2021-12-29 07:11:39 -08:00
Unknown W. Brackets
3aedea89eb
samplerjit: Correct level lookup offset.
2021-12-29 07:09:36 -08:00
Unknown W. Brackets
bf06342f9d
samplerjit: Minor SSE4 optimizations.
...
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
631706a8ba
samplerjit: Set stackArgPos_ early.
...
Unfortunately, this has to match the value set lower...
2021-12-28 20:21:21 -08:00
Unknown W. Brackets
74eb450e76
samplerjit: Move texture function into jit.
...
Could do this also for nearest, might end up with a third set of functions
there for a direct sample lookup (for debug funcs.)
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7
samplerjit: Lookup both mip tex values.
2021-12-28 16:22:54 -08:00
Unknown W. Brackets
6b55d328e5
samplerjit: Use regcache for linear filtering.
...
This makes it easier to reuse for mipmap filtering.
2021-12-28 15:37:25 -08:00
Unknown W. Brackets
cdf14c8579
samplerjit: Calculate mip level U/V/offsets.
...
Not actually doing the sampling for the second mip level in the single jit
pass yet, but close.
2021-12-28 14:12:58 -08:00
Unknown W. Brackets
a4558a5736
samplerjit: Take texptr/bufw as arrays.
...
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
4864850b3b
samplerjit: Handle mipmap width/height in S/T calc.
2021-12-28 11:29:29 -08:00
Unknown W. Brackets
a84accf713
samplerjit: Move S/T calculation into jit.
...
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
476dfdf731
samplerjit: Add more bits for S/T, skip multiply.
...
For now, we're not using those other bits yet.
2021-12-27 18:24:37 -08:00
Unknown W. Brackets
9cc0883d53
softgpu: Correct non-SSE T clamp.
2021-12-27 15:31:37 -08:00
Unknown W. Brackets
39d5b1c221
softgpu: Reduce mipmap fraction to 4 bits.
...
For CONST (and SLOPE with flat w), this produces accurate values.
SLOPE is still wrong in its handling of w, and AUTO seems to calculate
using a different and less accurate ramp. But they both produce values
with 16 steps, in any case.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
d6b6ef4cb1
softgpu: Correct nearest filtering too.
...
Turns out to have the same behavior as linear, when it comes to the
subpixel offset.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
1dfaea9062
softgpu: Remove no longer possible report.
...
Also, it's known how this behaves, now.
2021-12-27 11:37:33 -08:00
Unknown W. Brackets
75f105f84b
softgpu: Make linear filtering more accurate.
...
This matches tests for various u/v offsets and x/y subpixel offsets.
Mipmaps are probably still wrong.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3cd19b02ac
samplerjit: Handle unswizzled offsets too.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
820361f34b
samplerjit: Calculate texel byte offset as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
4d6a2f3919
samplerjit: Blend linear using integers.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
6f4e735757
samplerjit: Accumulate results in an XMM.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
b00a66e34c
samplerjit: Pass u/v coords as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
ce3e29a649
softjit: Fix a function arg template warning.
...
We're just ignoring it because it's a false positive in this case.
2021-12-11 10:45:27 -08:00
Unknown W. Brackets
0d4ec5ca20
softjit: Fix an enum type comparison error.
...
Same values, though, so didn't matter.
2021-12-11 10:45:27 -08:00
Henrik Rydgård
818f33d979
Merge pull request #15225 from unknownbrackets/softjit-cond-fix
...
softjit: Throw away regs allocated in conditionals
2021-12-11 09:30:43 +01:00
Unknown W. Brackets
5593b8ff64
softjit: Skip a common case CMP.
2021-12-11 00:06:45 -08:00
Unknown W. Brackets
d35ef352c3
softjit: Throw away regs allocated in conditionals.
...
If this happens, the register no longer has a deterministic value.
2021-12-11 00:06:14 -08:00
Unknown W. Brackets
b3cd135000
samplerjit: Fix DXT1/DXT5 register releasing.
...
Oops, broke this while refactoring.
2021-12-09 08:17:29 -08:00
Unknown W. Brackets
3180e6c043
softgpu: Correct alpha on add + invalid texfuncs.
2021-12-05 16:28:37 -08:00
Unknown W. Brackets
325a1f75aa
softgpu: Match texenv blend texfunc accurately.
2021-12-05 16:09:26 -08:00
Unknown W. Brackets
0b6e7c421f
softgpu: Make decal tex func more accurate.
...
Tested for all values of A * B + 0 * (255 - B), as well as A * 127 + B *
(255 - 127), and matches accurately. Spot checked other values, but not
exhaustively.
2021-12-05 13:34:19 -08:00
Unknown W. Brackets
154bb53744
softgpu: Correct accuracy on fast path modulate.
2021-12-05 13:10:18 -08:00
Unknown W. Brackets
73460f7461
softgpu: Correct accuracy of MODULATE texfunc.
...
This matches hardware tests for every value of A * B.
Interesting that it's a different formula than alpha blend.
2021-12-05 12:06:52 -08:00
Unknown W. Brackets
891fa8c613
softgpu: Template away uncommon mip usage.
...
Improves general case about 10%.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
48e9404419
softgpu: Remove useless switch by UV gen mode.
...
They're all handled earlier now, and the switch is on a value & 3, so the
default wasn't even possible.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
ff94974df9
softgpu: Avoid texlevel check when maxlevel is 0.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
823c4adb15
softgpu: Keep arguments in vectors for sampling.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
d7c25b3e7c
samplerjit: Refactor nearest using reg cache.
2021-12-04 13:04:53 -08:00
Unknown W. Brackets
4aa5bee14c
softjit: Make it an error to unlock a temp.
...
Also fix some register usage in logic ops.
2021-12-01 21:50:02 -08:00
Unknown W. Brackets
75a918f96f
softjit: Get rid of pointless AGE00 tests.
2021-12-01 21:44:10 -08:00
Unknown W. Brackets
f47fb7e14e
softjit: Normalize some stencil test patterns.
2021-12-01 21:43:52 -08:00
Unknown W. Brackets
ba69e39256
softjit: Avoid tests for greater than 0.
...
They take more instructions, and can be somewhat common.
2021-12-01 21:40:10 -08:00
Unknown W. Brackets
aec41b34d6
softjit: Reduce ditherMatrix to 8-bit.
...
Oops, not sure why I made it 16 bit.
2021-12-01 21:39:29 -08:00
Unknown W. Brackets
1c5615624a
softjit: Oops, correct allocation typo.
...
Decided to leave these for paired operations.
2021-12-01 21:37:55 -08:00
Unknown W. Brackets
bfe82e417d
softjit: Fix locked stencil reg.
2021-11-28 20:26:01 -08:00
Unknown W. Brackets
99c213f244
softjit: Centralize argument register allocation.
2021-11-28 15:53:24 -08:00
Unknown W. Brackets
7aea6d2ab0
softjit: Fix fog typo causing locking bug.
2021-11-28 12:26:23 -08:00
Unknown W. Brackets
9653c33d9c
softjit: Fix PixelFuncID arg on non-Windows x64.
...
Oops, this is of course not put on the stack, it's in R8.
2021-11-28 08:54:36 -08:00
Unknown W. Brackets
2d8fdd8cf4
Math3D: Allow construction from NEON vectors.
...
This makes it match SSE and easier to keep things generic. Will impact
alignment of non-packed Vec2/Vec3.
2021-11-28 08:24:53 -08:00
Unknown W. Brackets
96a7554053
softjit: Move common types to reg cache header.
...
This makes it easier to use vectors elsewhere.
2021-11-28 08:03:15 -08:00
Unknown W. Brackets
3d5bced296
softjit: Rename reg cache so it can be reused.
...
Intentionally just the name changes in this commit.
2021-11-28 08:03:15 -08:00
Unknown W. Brackets
4703b6cb56
softjit: Cleanup, add other arch types to regcache.
2021-11-28 08:03:15 -08:00
Unknown W. Brackets
c1882fa1c0
softjit: Disallow use of register after unlock.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
2f039abd13
softjit: Simplify regcache usage as purpose only.
...
Dealing with types was annoying, and this helps validate the right
register is released.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
722c04c5e2
samplerjit: Allow disabling linear too, oops.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
cc099c73f1
softjit: Decide stack offset on compile.
...
This makes it easier to compile different entries or push regs.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
e1ed49a3e4
softjit: Ensure all regs are released.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
d53e13b862
softjit: Manage args in the register cache.
2021-11-28 08:03:13 -08:00
Unknown W. Brackets
6fbcf67093
softjit: Fix disabled cache.
2021-11-27 11:32:47 -08:00
Unknown W. Brackets
1cb48a7bd2
softjit: Reduce jit pool size a bit.
2021-11-26 10:30:00 -08:00
Unknown W. Brackets
1f9dc3a568
softjit: Precalculate write mask and dither.
...
This is slightly abusing PixelFuncID, but the intent is to provide some
memory that's easily accessible from the jit func, but still associated
with that calculation (i.e. not global.)
2021-11-26 10:12:54 -08:00
Unknown W. Brackets
4e6a5ce760
softjit: Log any failed compiles.
2021-11-26 09:30:49 -08:00
Unknown W. Brackets
446eec0dff
softjit: Keep color 16-bit when useful.
...
Reuse it expanded where we can, in case of dither+fog+blend, etc.
2021-11-26 09:30:48 -08:00
Unknown W. Brackets
c62457bb33
softjit: Optimize common blend inverse alpha case.
2021-11-26 09:30:48 -08:00
Unknown W. Brackets
1fa4e6ba2c
softjit: Add alpha blending factors.
2021-11-26 09:30:48 -08:00
Unknown W. Brackets
bc8d5ad372
softjit: Cache zero vector to avoid recreating.
2021-11-26 09:30:48 -08:00
Unknown W. Brackets
a07017dbb0
softjit: Prefer easier to refill regs.
2021-11-26 09:30:47 -08:00
Unknown W. Brackets
932481d3cd
softjit: Minor tweak to reg order for XCHG.
...
It's easier to use it in these places, but seems it stalls longer on the
dest reg.
2021-11-26 09:30:47 -08:00
Unknown W. Brackets
7f167c3660
softjit: Implement min/max/absdiff blending.
...
Alpha not yet implemented.
2021-11-26 09:30:47 -08:00
Unknown W. Brackets
771d459025
softjit: Use SSE4.1 for fog and dither a bit.
2021-11-26 08:42:17 -08:00
Unknown W. Brackets
cf888257ab
softjit: Fix dithering bug.
2021-11-26 08:21:15 -08:00
Unknown W. Brackets
3f3e0ea8cf
softjit: Optimize typical alpha/depth test.
...
Messed with SSE4 then realized there's no point, just use SHR.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
6644c4225c
softjit: Apply logic ops.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
961273fcf5
softjit: Apply color write mask.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
a49a189962
softjit: Refactor color conv to dedicated funcs.
...
Will use this for masking too.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
2b4b4ae064
softjit: Add config setting to enable/disable.
...
Also use it for samplerjit.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
edb21b57bb
softjit: Initial color write.
...
At this point, it's used in some areas in some games.
Alpha blending is the main unimplemented path, then logic/masking.
2021-11-26 08:21:13 -08:00
Unknown W. Brackets
0e63b357b3
softjit: Add dithering.
2021-11-26 08:21:13 -08:00
Unknown W. Brackets
bd99448863
softjit: Keep x and y args for dither.
...
But let's still special case the 512 path, since it's so common.
2021-11-26 08:21:13 -08:00
Unknown W. Brackets
5ee4bdbe05
softjit: Depth and stencil testing.
2021-11-26 08:21:13 -08:00
Unknown W. Brackets
f3f32cebeb
softjit: Optimize some imm sizes.
2021-11-26 08:21:13 -08:00
Unknown W. Brackets
2423285831
softjit: Add helpers to get framebuf offsets.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
f8819308ff
softjit: Add levels of register locking.
...
Locking also in helpers, so need to nest locks.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
1e00a3b842
softjit: Add color test.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
14d322956a
softjit: Add alpha test.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
d9f7b9cca2
softjit: Initial depthrange, fog.
...
Not really tested, just filling out parts.
2021-11-26 08:21:12 -08:00
Unknown W. Brackets
9fed7ea732
softjit: Add register cache for softjit.
2021-11-26 08:21:11 -08:00
Unknown W. Brackets
91787e63d9
softjit: Switch to the __vectorcall convention.
2021-11-26 08:21:11 -08:00
Unknown W. Brackets
ae3299ea04
softjit: Add stubbed DrawPixel for x64.
2021-11-26 08:21:11 -08:00
Unknown W. Brackets
ce5ae95854
softgpu: Correct alpha blend subtract on negative.
...
Oops, we need to subtract signed, but then clamp to unsigned.
2021-11-25 22:06:48 -08:00
Unknown W. Brackets
dad85b97f1
softgpu: Use KEEP for any invalid stencil ops.
...
This just keeps the ID more consistent.
2021-11-25 21:02:20 -08:00
Unknown W. Brackets
d4bf7ea392
softgpu: Disable alpha blend for invalid equations.
2021-11-25 19:23:41 -08:00
Unknown W. Brackets
35444b3051
softgpu: Accurately alpha blend.
2021-11-25 19:23:41 -08:00
Unknown W. Brackets
2acf7f4edf
softgpu: Use 0 alpha for 565 alpha blending.
...
We were previously blending as 0xFF.
2021-11-25 19:23:40 -08:00
Unknown W. Brackets
2ef7dd6b03
softgpu: Correct tagging of vertexjit.
2021-11-25 19:21:56 -08:00
Unknown W. Brackets
73de8db996
softgpu: Fix stencil DECR on 5551.
2021-11-25 19:21:56 -08:00
Unknown W. Brackets
53c6a3933d
softgpu: Use ALWAYS for alpha/depth test in clear.
2021-11-25 19:21:55 -08:00
Unknown W. Brackets
876c8cd368
softgpu: Fix PixelFuncID size.
...
Oops, can't use unions in bitfields. Also improve typesafety.
2021-11-21 09:40:13 -08:00
Unknown W. Brackets
28bc91bd79
softgpu: Add func to tersely name pixel funcs.
2021-11-21 08:23:32 -08:00
Unknown W. Brackets
f8bc6e5b9e
softgpu: Template draw pixel on fb format.
...
This introduces a small 5-10% perf improvement.
2021-11-21 08:23:32 -08:00
Unknown W. Brackets
09dc38080a
softgpu: Move draw pixel code to separate file.
...
This separates things better anyway. No major perf impact.
2021-11-21 08:23:32 -08:00
Henrik Rydgård
824805ec1e
Merge pull request #15154 from unknownbrackets/softjit
...
Use a pixel func ID in software rendering
2021-11-21 10:50:06 +01:00
Unknown W. Brackets
e2f0713cc2
softgpu: Clamp and round fog by mantissa bits.
...
This matches hardware calculated fog values much better.
2021-11-20 20:54:52 -08:00
Unknown W. Brackets
9abf2a4725
softgpu: Confirm mask doesn't hit stencil REPLACE.
2021-11-20 18:53:51 -08:00
Unknown W. Brackets
aa3786ed21
softgpu: Force off alpha blend if uselessly on.
...
This is a simple optimization to skip work that games sometimes waste.
2021-11-20 15:27:04 -08:00
Unknown W. Brackets
26378f9c89
softgpu: Specialize sprite based on pixel func ID.
2021-11-20 15:27:04 -08:00
Unknown W. Brackets
f7a31c992d
softgpu: Use pixel func ID to draw pixels.
...
This just reduces reliance on gstate directly, and should help keep things
consistent.
2021-11-20 15:27:04 -08:00
Unknown W. Brackets
953200c995
softgpu: Add func to calculate pixel func ID.
...
This normalizes some things, and eventually can be used for a jit key.
2021-11-20 15:27:04 -08:00
Unknown W. Brackets
b6bdd69572
softgpu: Clear by dividing out subpixel first.
2021-11-15 06:26:11 -08:00
Unknown W. Brackets
f802c3bc6d
softgpu: Add some comments and cleanup.
2021-11-15 06:09:12 -08:00
Unknown W. Brackets
babd63c644
softgpu: Tune thread minimums better.
...
Darkstalkers seems more sensitive to these than many other games; this
improves performance further.
2021-11-14 18:44:30 -08:00
Unknown W. Brackets
66f635cba0
softgpu: Use threads to apply clears.
2021-11-14 18:31:46 -08:00
Unknown W. Brackets
2ab7499d8d
softgpu: Combine sliced rectangles.
...
This mostly affects clears, and reduces overhead. Only about 2%
improvement, but it's a small change.
2021-11-14 18:31:46 -08:00
Unknown W. Brackets
0281e2f017
softgpu: Split out rectangle path for combining.
2021-11-14 18:31:46 -08:00
Unknown W. Brackets
9545e3b0e2
softgpu: Fixup range cull for fans and fast path.
2021-11-14 18:31:45 -08:00
Unknown W. Brackets
fb6fadbbb7
softgpu: Fast path rectangles as fans.
...
Some games, such as Legend of Heroes III, use fans instead of strips.
2021-11-14 18:31:45 -08:00
Unknown W. Brackets
09a9927b82
softgpu: Use range loops for sprite fast path.
2021-11-14 18:31:45 -08:00
Unknown W. Brackets
55cde6bd6a
softgpu: Check flat z in fast path.
2021-11-14 12:27:39 -08:00
Unknown W. Brackets
361c8f966c
softgpu: Fast path triangles without textures.
...
The fast path may still be useful in this case.
2021-11-14 12:27:39 -08:00
Unknown W. Brackets
5bb6245b1f
softgpu: Fix leaked range flag on cull.
...
Fixes some backgrounds in Final Fantasy 4, probably others.
2021-11-14 08:43:52 -08:00
Unknown W. Brackets
f66e243727
softgpu: Correct scissor for pixel centers.
2021-11-07 11:19:41 -08:00
Unknown W. Brackets
8db2d37e64
softgpu: Fix depth cull in softgpu.
...
Was improperly skipping cull for positive Z.
2021-11-05 21:38:13 -07:00
Unknown W. Brackets
fe440d40e5
softgpu: Clip full weighted Z without truncating.
...
In case wsum_recip is nan or similar, we want to make sure we still
properly clip to minz/maxz.
2021-11-05 21:36:38 -07:00
Unknown W. Brackets
f03fa2b0b8
softgpu: Improve accuracy of line drawing.
...
Needs higher precision to change pixel at the right time.
This makes the lines in Persona 1 look right, see #3871 .
2021-11-04 00:11:09 -07:00
Unknown W. Brackets
b1009f70f9
softgpu: Allow end coordinate at bounds.
...
Oops, was excluding some valid usage that wouldn't wrap.
2021-09-30 06:33:25 -07:00
Unknown W. Brackets
953916a842
softgpu: Avoid fast path for clamp/wrap cases.
...
It doesn't clamp or wrap, and those are uncommon for the fast path.
Fixes #14951 .
2021-09-29 19:19:21 -07:00
Unknown W. Brackets
08816a544d
softgpu: Implement DXT5 in samplerx86.
2021-09-12 17:17:09 -07:00
Unknown W. Brackets
c4de5bfb9f
softgpu: Implement DXT3 in samplerx86.
2021-09-12 14:53:55 -07:00
Unknown W. Brackets
ee9d19430f
softgpu: Implement DXT1 decoding in samplerx86.
2021-09-12 13:57:28 -07:00
Unknown W. Brackets
a0eeb52444
softgpu: Decode DXT texels directly.
...
This improves performance a lot compared to decoding the whole block.
Eventually we may implement a cache, but threading makes that complex to
make properly fast.
2021-09-12 09:37:34 -07:00
Unknown W. Brackets
121c56e6db
softgpu: Clip only on -Z, cull if entirely outside.
...
This is important for several issues, like #12058 or #12060 , where
something is drawn entirely outside valid Z, and should be culled.
2021-09-09 20:13:42 -07:00
Unknown W. Brackets
0b73c1ce83
softgpu: Correct guardband cull behavior.
...
Culling is based on whether clipping happens, not whether clamping
happens. This is important for issues like #12348 .
2021-09-09 20:05:41 -07:00
Unknown W. Brackets
b5ba469826
softgpu: Prevent pixel gaps when drawing sprites.
...
If you end a sprite at 255.9, it draws the pixel at 255. This uses the
same logic to handle that as in the triangle path.
2021-09-06 22:05:39 -07:00
Unknown W. Brackets
7addc18a6b
softgpu: Avoid overflow infinite loop.
...
For certain large values, it would overflow and continue looping
endlessly.
2021-09-05 23:24:08 -07:00
Henrik Rydgård
3be5c7bd9a
Make the minimum items per thread explicit. Found some bugs, optional arguments are evil.
2021-06-12 21:21:28 +02:00
Henrik Rydgård
73871b9b7e
Implement new thread manager, port stuff to it.
2021-06-12 13:03:53 +02:00
Unknown W. Brackets
3304814fd6
GPU: Minor cleanup duplicate header/conditions.
2021-05-08 09:12:22 -07:00
Unknown W. Brackets
de46b0998a
GPU: Correctly initialize HW tessellation support.
...
Oops, shouldn't call a virtual in a constructor.
2021-05-08 09:10:23 -07:00
Unknown W. Brackets
8a8328c431
Common: Move ColorConv to a more appropriate place.
2021-05-01 11:20:05 -07:00
Unknown W. Brackets
ee749804fc
Debugger: Note GPU block transfer src as well.
2021-04-03 18:11:44 -07:00
Unknown W. Brackets
4178f09e57
Build: More consistently avoid _M_ defines.
...
We use PPSSPP_ARCH in several places already, this makes it more complete.
2021-03-02 21:49:21 -08:00
Unknown W. Brackets
d9aecffd72
Build: Remove old ARM define.
2021-03-02 21:26:03 -08:00
Henrik Rydgård
0facd4d4a6
Merge pull request #14230 from unknownbrackets/texreplace
...
Support texture replacement filtering overrides
2021-02-28 18:09:38 +01:00
Unknown W. Brackets
2f63f9999d
GPU: Normalize 0 to 1 always in software lighting.
...
See #14167 . This seems to be consistent.
2021-02-27 23:51:45 -08:00
Unknown W. Brackets
fb3ad1df4b
Replacement: Read in texture filtering overrides.
...
If you're replacing, you can know more information about linear safety for
tests.
2021-02-27 17:16:16 -08:00
Henrik Rydgård
2f3bc2d373
Merge pull request #14056 from unknownbrackets/debugger-mem
...
Track memory allocations and writes for debug info
2021-02-21 10:18:11 +01:00
aliaspider
9a3e5879bb
Global: Correct many endian types and casts.
2021-02-18 22:25:24 -08:00
Unknown W. Brackets
f7740edc6d
Debugger: Add more metadata for memory usage.
2021-02-15 15:01:21 -08:00
Unknown W. Brackets
f32f89dd90
Global: Remove some unused variables.
2021-02-15 11:59:45 -08:00
Unknown W. Brackets
5e3579a780
SoftGPU: Fix sprite provoking vertex in fast path.
...
It was right everywhere else.
2021-01-16 20:13:16 -08:00
Henrik Rydgård
3f01cbb98c
Initialize/Deinitialize the shader translation system once globally.
...
Fixes #13839 .
2021-01-04 23:51:34 +01:00
Unknown W. Brackets
e1050fe855
UWP: Don't try compiling samplerjit.
2021-01-02 09:54:35 -08:00
Unknown W. Brackets
ed65bc2327
SoftGPU: Allow rendering with no backend at all.
2021-01-02 09:25:41 -08:00
Unknown W. Brackets
6a2b3f8f78
SoftGPU: Update PPGe draw context.
...
Oops, this was missing.
2021-01-02 09:23:25 -08:00
Henrik Rydgård
32c9728c0c
Some cleanups in GL feature and shader language detection.
...
Gets rid of many wrong or bad checks for IsCoreContext.
2020-12-14 19:46:11 +01:00
Henrik Rydgård
766dbc5a9f
Move ShaderTranslation.cpp/h to Common/GPU.
2020-11-09 11:18:43 +01:00
Henrik Rydgård
03e8eac6ef
Merge the two ShaderLanguage enums.
2020-11-04 09:40:11 +01:00
Henrik Rydgård
b7d674411e
Test parsing of generated OpenGL shaders too (by using glslang).
2020-10-31 18:32:43 +01:00
Henrik Rydgård
886a8b1ac6
Remove Timer.cpp/h. Move various collections into Common/Data/Collections.
2020-10-05 21:05:23 +02:00
Henrik Rydgård
0e3a84b4a8
Move most GPU things to Common.
...
It works after the move, on Windows and Android at least.
Deletes the D3DX9 shader compiler loader, which was not used.
2020-10-04 23:39:02 +02:00
Henrik Rydgård
b7edf75437
Move Display.cpp/h to Common.
2020-10-04 11:42:16 +02:00
Henrik Rydgård
821817e6d4
Move the profiler to Common
2020-10-04 11:42:16 +02:00
Henrik Rydgård
9e41fafd0d
Move math and some file and data conversion files out from native to Common.
...
Buildfixing
Move some file util files
Buildfix
Move KeyMap.cpp/h to Core where they belong better.
libretro buildfix attempt
Move ini_file
More buildfixes
2020-10-04 09:12:46 +02:00
Henrik Rydgård
3162f30158
Merge base/basictypes.h into Common/Common.h (mostly).
2020-09-29 15:51:51 +02:00
Henrik Rydgård
1b3413945b
Some header include cleanup
2020-09-16 09:20:41 +02:00
Henrik Rydgård
cea35007ae
Always use a linear filter for video, unless forcing NEAREST filtering.
2020-09-13 16:40:37 +02:00
Henrik Rydgård
60a6bf6d43
Optimize the DarkStalkers software rendering path a little more.
2020-09-12 16:10:17 +02:00
Unknown W. Brackets
5fae2171cc
softgpu: Correct cull handling for sprites.
2020-09-08 16:29:45 -07:00
Unknown W. Brackets
3055deeba6
GPU: Fix some case warnings.
...
Better to avoid the warnings.
2020-08-19 21:18:44 -07:00
Henrik Rydgård
2e06386cf6
Software renderer clipper: Don't clip on the sides. Fixes #4845 and should fix #7124 for the SW renderer.
2020-08-16 21:38:07 +02:00
Henrik Rydgård
c5e0b799d9
Remove category from _assert_msg_ functions. We don't filter these by category anyway.
...
Fixes the inconsistency where _assert_ didn't take a category but
_assert_msg_ did.
2020-07-19 20:33:25 +02:00
Henrik Rydgård
defa8aa480
DarkStalkers: Handle the "normal" screen stretch too, not just "wide", to avoid a surprising performance drop.
2020-05-24 16:53:44 +02:00
Henrik Rydgård
fabe987c8f
Add a name tag for all render steps (GL/Vulkan). Helps with debugging and should be cheap enough (a single pointer per "step").
2020-05-21 11:24:05 +02:00
Henrik Rydgård
7a6489ebb4
Merge pull request #12905 from unknownbrackets/postshader
...
Allow chained post-processing shaders
2020-05-17 16:09:05 +02:00
Unknown W. Brackets
7910b4029a
arm64jit: Track writable and non-writable pointers.
...
Switch uses different memory regions. We can handle this, might as well
cleanup some const abuse.
2020-05-17 00:15:12 -07:00
Unknown W. Brackets
b79ecc159f
GPU: Update postshader uniforms for each.
2020-05-16 12:04:36 -07:00
Henrik Rydgård
864d138cd9
Fix DarkStalkers after the just-merged refactoring.
2020-05-14 23:28:37 +02:00
Unknown W. Brackets
7024a2877d
GPU: Take A off RGB565 conversion funcs.
2020-05-13 18:17:58 -07:00
Unknown W. Brackets
03e3a935da
GPU: Cleanup presentation flipping a bit.
2020-05-13 18:11:25 -07:00
Unknown W. Brackets
a41fbb9225
softgpu: Fix postshader on 5551.
...
This also fixes rendering on Windows 7 Direct3D 11.
2020-05-13 18:10:09 -07:00
Unknown W. Brackets
762b656ea2
GPU: Use a texture directly for MakePixelTexture.
...
This makes it easier to do things with it.
2020-05-13 18:10:09 -07:00
Unknown W. Brackets
2653e50200
softgpu: Avoid RB swizzle when using a postshader.
...
So that it can post-process correctly.
2020-05-13 18:10:09 -07:00
Unknown W. Brackets
3aa8287b74
softgpu: Enable postshader support.
2020-05-13 18:10:09 -07:00
Unknown W. Brackets
cb94487a16
GPU: Move post shader handling to new class.
...
Currently, Vulkan is not working properly and direct (RAM -> output) is
not hooked up. But in general, it works.
2020-05-13 18:10:06 -07:00
Unknown W. Brackets
57bd88fc33
softgpu: Allow display rotation.
2020-05-13 18:07:25 -07:00
Unknown W. Brackets
a03e368566
GPU: Move cardboard/etc. to PresentationCommon.
...
Now this works on softgpu as well.
Some hacks for backend differences...
2020-05-13 18:07:25 -07:00
Unknown W. Brackets
d39b0bdca2
GPU: Split FramebufferCommon into two classes.
...
Only some things moved over so far.
FramebufferCommon does too much, we want to share it with softgpu without
all the buffer management stuff.
2020-05-13 18:07:22 -07:00
Unknown W. Brackets
1b9440611a
softgpu: Fix texture overlap.
...
Mainly happened when we had wide textures and split them up between GPUs.
2020-05-13 17:53:00 -07:00
Unknown W. Brackets
ac60e2ecd4
GPU: Track HW tess at start of frame too.
...
This also makes it so we don't force the setting off when you change
backends, and just ignore it if unsupported.
2020-04-04 11:52:32 -07:00
Unknown W. Brackets
4a0109d273
GPU: Treat negative light exp same as 0.
...
Based on #12507 and some tests, seems like negative exponents are also
fixed to a 1.0f result.
2020-03-22 22:28:05 -07:00
Unknown W. Brackets
f1dfb25427
softgpu: Correct clear/solid rect BR corner.
...
The scissor is inclusive, not exclusive.
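
A minimal sketch of what an inclusive scissor means for a rect fill. This is not the actual softgpu code; `ScissorRect` and `FillScissor` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scissor rectangle: both corners are inclusive.
struct ScissorRect {
    int x1, y1;  // top-left corner, inclusive
    int x2, y2;  // bottom-right corner, inclusive
};

// Fills the scissor area of a buffer that is `width` pixels wide and
// returns the number of pixels written.
int FillScissor(std::vector<uint32_t> &buf, int width, const ScissorRect &sc, uint32_t color) {
    int written = 0;
    // Note <= rather than <: the bottom-right row and column belong to the area.
    for (int y = sc.y1; y <= sc.y2; ++y) {
        for (int x = sc.x1; x <= sc.x2; ++x) {
            buf[y * width + x] = color;
            ++written;
        }
    }
    return written;
}
```

With an exclusive comparison (`<`), the last row and column would be left untouched, which is the kind of bottom-right corner error described above.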
2020-03-09 18:57:55 -07:00
Unknown W. Brackets
cebcfb1bbd
GPU: Use old frame when presenting a skip.
...
If we flip using a skipped frame, we may show an even older frame causing
weird flickering.
2020-03-01 13:55:28 -08:00
Unknown W. Brackets
072041a63d
SoftGPU: Convert from 16-bit if unsupported.
...
Should help #12455, but not actually tested on an affected device.
2019-12-24 11:08:44 -08:00
Henrik Rydgård
54823a87cc
Oops
2019-10-28 13:13:52 +01:00
Henrik Rydgård
970adfbcc9
Isolate most of the softgpu specialization code to RasterizerRectangle.
...
See comments.
2019-10-28 09:33:30 +01:00
Henrik Rydgård
1966c8fe75
Fix a backwards check
2019-10-27 20:55:32 +01:00
Henrik Rydgård
6c8186d046
Remove unused textureswizzle support (we use shaders instead). Universally support presenting 5551 format directly.
2019-10-27 20:55:32 +01:00
Henrik Rydgård
86c781e434
Hack around most of the problems with the save/load dialog. Software stretch gets enabled in non-wide mode, so wallpapers work at a cost of speed.
2019-10-27 20:55:32 +01:00
Henrik Rydgård
102a70b4a5
Scissor fix
2019-10-27 20:55:32 +01:00
Henrik Rydgård
a84f4a0caa
Even more speed.
2019-10-27 20:55:32 +01:00
Henrik Rydgård
eb53609cb0
More speed
2019-10-27 20:55:32 +01:00
Henrik Rydgård
bbbd7f8acc
Buildfix
2019-10-27 20:55:32 +01:00
Henrik Rydgård
714f83f614
Further specialization.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
290e9971a7
More specialization work.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
4f7c23fe79
DarkStalkers: Fix display on OpenGL ES.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
796539ad7f
DarkStalkers: Fix display in the D3D backends. Still broken in OpenGL.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
9099441973
Darkstalkers: Gross hack to avoid the game's own stretch, and present the raw buffer instead for a sharper image.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
2dd7a9aa12
More Darkstalkers work
2019-10-27 20:54:36 +01:00
Henrik Rydgård
c7f6724f7e
Detect sprite drawing (1:1 texture mapping), run a simpler function without the triangle state tracking.
...
This will allow further simplification and specialization.
2019-10-27 20:54:36 +01:00
Henrik Rydgård
510229b68b
SoftGPU: Detect through-mode rectangles from triangle strips
2019-10-27 20:54:36 +01:00
Henrik Rydgård
58568632e8
Software renderer: Use hardware color conversion on Vulkan in 5551 16-bit mode
2019-10-27 20:54:36 +01:00
Henrik Rydgård
3a0804a7dd
Start slowly migrating from macros
2019-10-27 20:54:36 +01:00
Henrik Rydgård
ae286aef86
Vulkan+SoftwareRenderer: Fix screen rotation on Android.
...
(Missed this because software rendering is normally disabled on Android)
2019-10-22 22:08:21 +02:00
Unknown W. Brackets
5871ab0538
UI: Stop caching the draw context in coreParam.
...
This is possibly getting outdated in some paths of graphics reinit, and
then causing crashes. Let's just always get it from the graphicsContext.
2019-09-28 21:58:15 -07:00
Unknown W. Brackets
7412e13767
SoftGPU: Implement dithering.
...
Note: it applies even in 8888, so it can be used as a slight brightness
adjustment.
2019-05-26 09:52:34 -07:00
Unknown W. Brackets
0b48c6d066
SoftGPU: Apply color doubling only to RGB.
...
Broken in #11379 - accidentally applied to the alpha value. See #11901
for an example where this caused issues with blending.
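
A rough sketch of the intended behavior, not the actual softgpu code. `Color4` and `DoubleRGB` are hypothetical names, and clamping to 255 at this point is an assumption made to keep the example self-contained.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 8-bit-per-channel color held in ints.
struct Color4 {
    int r, g, b, a;
};

// Doubles only the RGB channels; alpha passes through unchanged.
// Clamping to 255 here is an assumption for illustration.
Color4 DoubleRGB(const Color4 &c) {
    return Color4{
        std::min(c.r * 2, 255),
        std::min(c.g * 2, 255),
        std::min(c.b * 2, 255),
        c.a,  // alpha is deliberately not doubled
    };
}
```

Doubling alpha as well would change the blend factor of every affected pixel, which is consistent with the blending issues mentioned above.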
2019-03-16 19:40:33 -07:00