Герман Семенов
122b63b9a8
GPU: using if constexpr
C++17 optimization
2023-04-02 16:36:37 +02:00
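As a rough illustration of the kind of change 122b63b9a8 describes, the sketch below uses `if constexpr` so that only one branch is instantiated per template argument. `ApplyTexelMode` and its parameters are hypothetical names for this example and are not taken from the actual commit.

```cpp
#include <cstdint>

// Hypothetical helper (not from the commit): picks clamp or wrap handling
// at compile time, so the unused branch is never instantiated.
template <bool kClamp>
inline int ApplyTexelMode(int coord, int size) {
    if constexpr (kClamp) {
        // Clamp the coordinate into [0, size - 1].
        if (coord < 0)
            return 0;
        if (coord >= size)
            return size - 1;
        return coord;
    } else {
        // Assumes size is a power of two, so masking wraps the coordinate.
        return coord & (size - 1);
    }
}
```

Compared with a runtime `if`, the discarded branch does not have to compile for every template argument and leaves no runtime check behind.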
Unknown W. Brackets
cd3fc26190
samplerjit: Prevent thread local stale cache read.
...
If the generation count happens to match, would still get a stale pointer
and crash. Let's just make the generation count static so it always
increases.
2023-02-22 21:15:03 -08:00
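One plausible reading of the fix in cd3fc26190, as a minimal sketch: each thread remembers its last lookup together with a generation number, and the generation is a static counter that only ever increases, so a stale entry cannot match by coincidence. `DemoSamplerCache`, `SamplerFunc`, and the map-based storage are assumptions made for illustration. The sketch also assumes a single live cache and leaves synchronization of the map itself out of scope.

```cpp
#include <atomic>
#include <cstdint>
#include <unordered_map>

// Hypothetical function type standing in for a compiled sampler.
using SamplerFunc = int (*)(int);

class DemoSamplerCache {
public:
    // Creating a cache also bumps the shared generation, so a new cache
    // never matches entries remembered from an older one.
    DemoSamplerCache() { ++generation_; }

    void Insert(uint32_t key, SamplerFunc fn) { funcs_[key] = fn; }

    void Clear() {
        funcs_.clear();
        ++generation_;  // Invalidates every thread's remembered lookup.
    }

    SamplerFunc Lookup(uint32_t key) const {
        struct Last {
            uint32_t gen = 0;
            uint32_t key = 0;
            SamplerFunc fn = nullptr;
        };
        static thread_local Last last;

        const uint32_t gen = generation_.load();
        // Fast path: same key, same generation, and a real pointer.
        if (last.fn != nullptr && last.gen == gen && last.key == key)
            return last.fn;

        auto it = funcs_.find(key);
        SamplerFunc fn = it != funcs_.end() ? it->second : nullptr;
        last = Last{gen, key, fn};
        return fn;
    }

private:
    // Static, so it keeps increasing across Clear() calls and instances.
    static inline std::atomic<uint32_t> generation_{0};
    std::unordered_map<uint32_t, SamplerFunc> funcs_;
};
```

With a per-instance counter that restarts, a thread's remembered generation could equal the current one after a clear, which is the hazard the commit message describes.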
Unknown W. Brackets
62fe03dcb4
softgpu: Use NEON for texture blending.
2023-01-07 19:06:35 -08:00
Unknown W. Brackets
49f6c461ad
Reporting: Fix some header includes.
...
Particularly in Common, avoid including Core/Reporting.h.
2022-12-27 14:58:20 -08:00
Unknown W. Brackets
d9522a7ac5
softgpu: Avoid clear hazard for last cached funcs.
2022-12-06 21:23:56 -08:00
Unknown W. Brackets
eda3ce556e
softgpu: Avoid atomic structs.
...
Apparently we don't link libatomic and rather than fighting that, I'll
just use thread local values.
2022-12-06 20:35:07 -08:00
Unknown W. Brackets
400f6abf9a
softgpu: Optimize lookup of last jit func.
...
This is common (for example, maybe a pixel state is updated but sampler is
not), and reduces time spent in ComputeRasterizerState() quite a bit in
Darkstalkers, where jits are available (i.e. Intel currently.)
2022-12-06 19:16:19 -08:00
Unknown W. Brackets
87fb9eef37
softgpu: Remove std::function usage.
...
Wanted to avoid coupling these, but don't like the std::function
construct/destructs showing in profiles...
2022-12-06 19:15:57 -08:00
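To illustrate the trade-off in 87fb9eef37, here is a hedged sketch of replacing a `std::function` member with a plain function pointer plus an explicit state pointer. The names `TexelOp`, `DemoSamplerEntry`, `RunEntry`, and `AddBias` are hypothetical and only serve this example.

```cpp
#include <cstdint>

// A plain function pointer is trivially copyable: copying or destroying it
// runs no constructor or destructor, unlike std::function.
using TexelOp = uint32_t (*)(uint32_t texel, const void *state);

struct DemoSamplerEntry {
    TexelOp op;         // What to run.
    const void *state;  // Explicit context instead of a captured closure.
};

inline uint32_t RunEntry(const DemoSamplerEntry &entry, uint32_t texel) {
    return entry.op(texel, entry.state);
}

// Example operation: adds a bias value read from the state pointer.
inline uint32_t AddBias(uint32_t texel, const void *state) {
    return texel + *static_cast<const uint32_t *>(state);
}
```

The cost is tighter coupling: callers must agree on what `state` points to, which is the coupling the commit message says it had wanted to avoid.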
Unknown W. Brackets
38eb0a7a82
softgpu: Check for queued compile.
...
Rarely, we could have queued compiling the same one, which would crash on
a double insert.
2022-12-03 12:15:58 -08:00
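The double-insert hazard in 38eb0a7a82 can be sketched as follows, assuming a pending set that is checked before anything is queued. `DemoCompileQueue` and its members are hypothetical; the real code may track queued work quite differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <unordered_set>

class DemoCompileQueue {
public:
    // Returns true only if the id was newly queued. An id that is already
    // compiled or already pending is rejected, so it is never inserted twice.
    bool Queue(uint32_t id) {
        std::lock_guard<std::mutex> guard(lock_);
        if (compiled_.count(id) != 0)
            return false;
        return pending_.insert(id).second;
    }

    // "Compiles" everything pending and returns how many entries were added.
    std::size_t Flush() {
        std::lock_guard<std::mutex> guard(lock_);
        std::size_t added = 0;
        for (uint32_t id : pending_) {
            // emplace() refuses duplicates, but the pending check above means
            // a duplicate should never reach this point.
            if (compiled_.emplace(id, static_cast<int>(id)).second)
                ++added;
        }
        pending_.clear();
        return added;
    }

    bool IsCompiled(uint32_t id) {
        std::lock_guard<std::mutex> guard(lock_);
        return compiled_.count(id) != 0;
    }

private:
    std::mutex lock_;
    std::unordered_set<uint32_t> pending_;
    std::unordered_map<uint32_t, int> compiled_;  // Placeholder for real code.
};
```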
Unknown W. Brackets
778a0487cb
softjit: Switch to DenseHashMap.
2022-12-02 20:59:13 -08:00
Unknown W. Brackets
4d06400548
softgpu: Fix compile hazard while running.
...
This prevents any clearing of cache while other threads may be using
previously cached funcs, and avoids wx exclusive hazards.
2022-11-20 12:04:02 -08:00
Unknown W. Brackets
ce51942508
softgpu: Correct WX-exclusive platform hazards.
...
Should mainly affect BSD at this point.
2022-11-20 10:55:35 -08:00
Unknown W. Brackets
79b1d1d35f
softgpu: Better approximate slope mip level mode (#16276)
...
* samplerjit: Remove unused x/y parameters.
Still need to tune the accuracy of filtering, but those were not the
right way.
* softgpu: Better approximate slope mip level mode.
This isn't exactly right, but it's closer.
* softgpu: Calculate auto from largest difference.
Direction shouldn't matter.
2022-10-23 10:15:43 +02:00
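The "largest difference" idea from 79b1d1d35f can be illustrated with a small sketch: take the larger magnitude of the two coordinate deltas, ignore its sign, and derive a level from it. The log2 mapping and the `AutoMipLevel` name are assumptions for this example. The commit itself notes the real calculation is only an approximation, and its exact formula is not shown here.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: picks a mip level from per-pixel texel deltas.
// Only the magnitude matters, so direction (sign) is discarded.
inline int AutoMipLevel(float ds, float dt, int maxLevel) {
    const float largest = std::max(std::fabs(ds), std::fabs(dt));
    if (largest <= 1.0f)
        return 0;  // At most one texel per pixel: use the base level.
    const int level = static_cast<int>(std::log2(largest));
    return std::min(level, maxLevel);
}
```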
Unknown W. Brackets
167213c746
softgpu: Cache texture bufws at 16 bit.
...
Reducing the size of state a bit.
2022-09-12 21:57:00 -07:00
Unknown W. Brackets
90e009edb9
softgpu: Clamp/wrap textures at 512 pixels.
...
A texture larger than 512 is "valid", but simply wraps/clamps at 512.
Importantly, the texture coords are still calculated at the specified
size, which can be up to 32768.
2022-09-10 20:23:09 -07:00
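The behavior described in 90e009edb9 can be sketched like this: the coordinate is scaled by the texture's declared size, but the wrap or clamp limit is capped at 512. `DemoTexCoord` is a hypothetical helper, and the truncating float-to-int conversion is a simplification for the example.

```cpp
#include <algorithm>

// Hypothetical helper: s is a normalized coordinate, texSize is the declared
// power-of-two size (which may exceed 512).
inline int DemoTexCoord(float s, int texSize, bool clamp) {
    // The coordinate is still computed at the declared size...
    const int coord = static_cast<int>(s * static_cast<float>(texSize));
    // ...but wrapping or clamping happens at 512 at most.
    const int limit = std::min(texSize, 512);
    if (clamp)
        return std::clamp(coord, 0, limit - 1);
    return coord & (limit - 1);  // limit is a power of two.
}
```

For a declared size of 1024 and s = 0.75, the coordinate is 768, which wraps to 256 or clamps to 511.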
Unknown W. Brackets
a88c9a0680
softgpu: Remove incorrect offsetting for X/Y.
2022-02-20 09:13:20 -08:00
Unknown W. Brackets
2479d52202
Global: Reduce includes of common headers.
...
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
d200ef40de
samplerjit: Compile sampler funcs together.
...
We can't have the cache clear between nearest/linear, because then we'll
call a bunch of int3's.
2022-01-29 20:28:20 -08:00
Unknown W. Brackets
99d6d569f0
samplerjit: Reduce transfers in nearest texel calc.
...
This benefits a few games, mostly where there's lots of UI or similar.
2022-01-24 21:28:04 -08:00
Unknown W. Brackets
c1e657ed47
samplerjit: Better vectorize UV linear calc.
...
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
c2985bca31
softjit: Centralize some common funcs from sampler.
...
No need to duplicate this code.
2022-01-19 00:03:59 -08:00
Unknown W. Brackets
d6fa301ab1
softgpu: Track CLUTs as states for binning.
...
This way we can have multiple CLUTs in process at once, which helps.
2022-01-16 08:14:09 -08:00
Unknown W. Brackets
edb79d968f
softgpu: Cache CLUT params in sampler state.
...
And now there's no more gstate for pixel drawing or sampling. Just a
little left in rasterization.
2022-01-15 18:09:09 -08:00
Unknown W. Brackets
c0e85e6170
softgpu: Move texenv color into sampler state.
2022-01-15 17:52:40 -08:00
Unknown W. Brackets
ad3635c82a
softgpu: Move tex size to cached state.
2022-01-15 17:22:43 -08:00
Unknown W. Brackets
bf2e060735
softgpu: Move c++ tex func to sampler.
...
It's not used anywhere else now.
2022-01-15 15:28:07 -08:00
Unknown W. Brackets
a228b2ab6c
softgpu: Use cached sampler state outside jit.
2022-01-15 15:26:26 -08:00
Henrik Rydgård
d3f0af7458
Merge pull request #15273 from unknownbrackets/softjit-bloom
...
Optimize software renderer handling of common bloom operations
2022-01-02 18:11:07 +01:00
Unknown W. Brackets
a259761262
samplerjit: Use nearest func in fast path too.
...
This uses the more optimal tex funcs.
2022-01-02 08:48:16 -08:00
Unknown W. Brackets
e93c709f5c
sofjit: Correctly poison memory.
...
Noticed this wasn't breakpoints when reviewing some assembly output.
2022-01-02 08:47:04 -08:00
Unknown W. Brackets
0eec4e7e4d
samplerjit: Decode colors in parallel.
...
Not used in a ton of games, but a decent improvement where it is used.
2022-01-02 08:27:55 -08:00
Unknown W. Brackets
91c9343e87
samplerjit: Refactor and reuse constant pool.
...
It's just here to be rip accessible, the fixed values can be output just
once.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
40240be91c
samplerjit: Update nearest args, temp disable jit.
...
This temporarily disables jit for nearest, but refactors to use the new
arg structure. It now matches linear.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
06e954fe2a
samplerjit: Create a separate fetch func.
...
This allows nearest to become more similar to linear, where it applies the
texture function.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
3bc6009158
samplerjit: Refactor sampler ID calculation.
...
Make it the same as pixel func IDs.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
28cfbe0e5a
samplerjit: Add an alternate profiling method.
...
This is more useful to group common operations together for profiling.
2021-12-29 07:11:39 -08:00
Unknown W. Brackets
74eb450e76
samplerjit: Move texture function into jit.
...
Could do this also for nearest, might end up with a third set of functions
there for a direct sample lookup (for debug funcs.)
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7
samplerjit: Lookup both mip tex values.
2021-12-28 16:22:54 -08:00
Unknown W. Brackets
a4558a5736
samplerjit: Take texptr/bufw as arrays.
...
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
a84accf713
samplerjit: Move S/T calculation into jit.
...
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
476dfdf731
samplerjit: Add more bits for S/T, skip multiply.
...
For now, we're not using those other bits yet.
2021-12-27 18:24:37 -08:00
Unknown W. Brackets
75f105f84b
softgpu: Make linear filtering more accurate.
...
This matches tests for various u/v offsets and x/y subpixel offsets.
Mipmaps are probably still wrong.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
b00a66e34c
samplerjit: Pass u/v coords as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
ff94974df9
softgpu: Avoid texlevel check when maxlevel is 0.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
823c4adb15
softgpu: Keep arguments in vectors for sampling.
2021-12-04 15:45:06 -08:00
Unknown W. Brackets
722c04c5e2
samplerjit: Allow disabling linear too, oops.
2021-11-28 08:03:14 -08:00
Unknown W. Brackets
2b4b4ae064
softjit: Add config setting to enable/disable.
...
Also use it for samplerjit.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
876c8cd368
softgpu: Fix PixelFuncID size.
...
Oops, can't use unions in bitfields. Also improve typesafety.
2021-11-21 09:40:13 -08:00
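As a hedged sketch of the approach hinted at in 876c8cd368: keep the ID as plain integer bitfields, expose typed accessors for type safety, and pin the size with a `static_assert`. `DemoPixelID` and `DemoAlphaFunc` are hypothetical and much smaller than the real `PixelFuncID`. The size check relies on the bitfield layout used by mainstream compilers.

```cpp
#include <cstdint>

enum class DemoAlphaFunc : uint8_t { Never = 0, Always = 1, Equal = 2 };

struct DemoPixelID {
    // Plain integer bitfields packed into one 32-bit word.
    uint32_t alphaFunc : 3;
    uint32_t hasStencil : 1;
    uint32_t reserved : 28;

    // Typed accessor instead of exposing the raw bits to callers.
    DemoAlphaFunc AlphaFunc() const {
        return static_cast<DemoAlphaFunc>(alphaFunc);
    }
};

// Catches accidental growth of the ID at compile time.
static_assert(sizeof(DemoPixelID) == sizeof(uint32_t),
              "DemoPixelID should pack into exactly 32 bits");
```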
Unknown W. Brackets
a0eeb52444
softgpu: Decode DXT texels directly.
...
This improves performance a lot compared to decoding the whole block.
Eventually we may implement a cache, but threading makes that complex to
make properly fast.
2021-09-12 09:37:34 -07:00
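To show what decoding a texel directly can look like for a8c... style work such as a0eeb52444, here is a minimal DXT1 sketch that reads one texel's 2-bit index instead of expanding all 16 texels of the block. The block layout (`DemoDXT1Block`), the RGB565 channel order with red in the low bits, and the bit-replication expansion are assumptions for illustration and may not match the emulated hardware exactly.

```cpp
#include <cstdint>

// Assumed layout: one byte of 2-bit indices per row, then two RGB565 colors.
struct DemoDXT1Block {
    uint8_t lines[4];
    uint16_t color1;
    uint16_t color2;
};

// Expands RGB565 (red assumed in the low bits) to 8 bits per channel.
inline void Expand565(uint16_t c, int rgb[3]) {
    const int r = c & 0x1F, g = (c >> 5) & 0x3F, b = (c >> 11) & 0x1F;
    rgb[0] = (r << 3) | (r >> 2);
    rgb[1] = (g << 2) | (g >> 4);
    rgb[2] = (b << 3) | (b >> 2);
}

// Decodes only the texel at (x, y) within the 4x4 block, as 0xAABBGGRR.
inline uint32_t DecodeDXT1Texel(const DemoDXT1Block &block, int x, int y) {
    int c1[3], c2[3], out[3];
    Expand565(block.color1, c1);
    Expand565(block.color2, c2);

    const int idx = (block.lines[y & 3] >> ((x & 3) * 2)) & 3;
    // color1 > color2 selects four-color mode; otherwise index 3 is transparent.
    const bool fourColor = block.color1 > block.color2;

    for (int i = 0; i < 3; ++i) {
        switch (idx) {
        case 0: out[i] = c1[i]; break;
        case 1: out[i] = c2[i]; break;
        case 2: out[i] = fourColor ? (2 * c1[i] + c2[i]) / 3 : (c1[i] + c2[i]) / 2; break;
        default: out[i] = fourColor ? (c1[i] + 2 * c2[i]) / 3 : 0; break;
        }
    }
    const uint32_t alpha = (idx == 3 && !fourColor) ? 0u : 255u;
    return static_cast<uint32_t>(out[0]) | (static_cast<uint32_t>(out[1]) << 8) |
           (static_cast<uint32_t>(out[2]) << 16) | (alpha << 24);
}
```

Per-texel decoding repeats the color expansion on every call, which is the cost a block cache would remove. The commit message notes that threading makes such a cache hard to get fast.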
Unknown W. Brackets
8a8328c431
Common: Move ColorConv to a more appropriate place.
2021-05-01 11:20:05 -07:00