Unknown W. Brackets
2482b2a1e0
Use aligned NEON loads in texhash.
2014-03-22 21:58:50 -07:00
Unknown W. Brackets
b44d10a91e
Move texture unswizzling to decoder, use NEON.
2014-03-22 21:35:16 -07:00
Henrik Rydgard
8b92dcea47
Transform: Compute the "DCID" (draw call ID) incrementally instead of an extra pass.
2014-03-23 01:51:51 +01:00
Henrik Rydgard
8bf015fe16
texcache: SSE optimized version of the most common case of Unswizzle
...
(didn't put this in fast_math because it's pretty specific to PSP)
2014-03-23 01:50:50 +01:00
Henrik Rydgård
dac51b9c1b
Merge pull request #5693 from unknownbrackets/jit-minor
...
x86 jit and vertex jit changes, ARM emitter changes
2014-03-23 00:22:04 +01:00
Henrik Rydgård
5d44b09cb2
Merge pull request #5692 from hrydgard/fast_math
...
Use the new fast-math from native for 4x4 matrix mul
2014-03-23 00:17:19 +01:00
Unknown W. Brackets
632eec38e8
vertexjit: Use SSE4.1 where available on x86.
...
Just because we can.
2014-03-22 16:11:16 -07:00
Unknown W. Brackets
5d04f123b9
vertexjit: A couple more tweaks to morph on x86.
2014-03-22 15:56:30 -07:00
Unknown W. Brackets
12c2683fb8
vertexjit: Cut a few more instrs from x86 morph.
2014-03-22 15:56:30 -07:00
Unknown W. Brackets
0da5caf11a
vertexjit: Cut a few instrs from morph on x86.
2014-03-22 15:56:30 -07:00
Unknown W. Brackets
162f229294
vertexjit: Support the color morphs on x86.
2014-03-22 15:56:29 -07:00
Henrik Rydgard
63aeb31e07
Attempt workaround for fog issue #5384
2014-03-22 23:49:14 +01:00
Henrik Rydgård
8dfadf7b8e
ArmEmitter: Add VMOV_neon and a Size parameter to VFMA for consistency.
2014-03-22 16:31:16 +01:00
Henrik Rydgard
bc121242b3
Use fast_math matrix multiplication for culling and sw transform
2014-03-22 14:40:09 +01:00
Henrik Rydgård
98da5144ef
Merge pull request #5612 from raven02/patch-27
...
Shade mapping fix
2014-03-22 14:37:22 +01:00
Unknown W. Brackets
66f501b981
Avoid an invalid enum on GLES2 texture creation.
...
My device logs an error, which I'm guessing has perf impact.
2014-03-22 09:34:22 +01:00
Henrik Rydgard
f4db725400
Remove redundant call to ReplaceAlphaWithStencil
2014-03-22 09:28:45 +01:00
Henrik Rydgard
ba5d88e9d6
Fix bug in FastLoadBoneMatrix where the wrong uniform could be dirtied
2014-03-22 09:27:43 +01:00
Henrik Rydgard
0b673719c2
Crashfix for software renderer in 32-bit (SSE misalignment)
2014-03-22 00:12:21 +01:00
Unknown W. Brackets
a8a299c2e3
Fix ToRGB/ToRGBA possible accuracy loss.
...
It was always like this, but not used as much before. Shifts are fast and
it needs to sum anyway, so there should not be any benefit to multiplying as
floats, and doing so will probably lose accuracy.
2014-03-18 22:56:27 -07:00
Unknown W. Brackets
678237aa6c
Improve SSE usage in software transform.
...
It's actually already pretty decent (unlike the softgpu), but there were a
few places it could use a bit of help. Speeds up things with hardware
transform off, or areas that need to use software transform.
2014-03-17 23:05:48 -07:00
Unknown W. Brackets
416df17088
Inline From/ToRGB(A) to avoid losing SSE.
...
Otherwise it has to store it, which I'd like to avoid.
2014-03-17 23:03:04 -07:00
Unknown W. Brackets
1ce6bf399a
Buildfix for 32-bit x86, arg.
2014-03-17 21:52:45 -07:00
Unknown W. Brackets
833c93bd98
Dumb mistake, forgot the divide.
...
Probably caused the blending issues.
2014-03-17 12:53:49 -07:00
Unknown W. Brackets
6630e45eff
Just add a packed version of Vec3f.
...
This way we can have it aligned to memory where needed. I think it'd be
better to avoid this if possible so that we can actually vectorize
spline/etc. code.
Fixes #5673.
2014-03-17 06:59:40 -07:00
Unknown W. Brackets
38d0bac1df
Optimize some 4444/8888 color conversions.
...
Small performance boost in softgpu.
2014-03-17 01:21:52 -07:00
Unknown W. Brackets
6de2129f98
softgpu: Don't re-pack 8888 colors.
...
It's like a bad joke, but MSVC was not optimizing this out.
2014-03-16 23:03:07 -07:00
Unknown W. Brackets
10456a09ac
Oops, forgot to multiply in float ToRGBA().
...
Not actually used...
2014-03-16 21:12:23 -07:00
Unknown W. Brackets
627027307c
softgpu: Use SSE in ToRGB()/FromRGB() etc.
2014-03-16 19:21:35 -07:00
Unknown W. Brackets
07ca96e226
softgpu: Use SSE in alpha blending.
2014-03-16 18:57:11 -07:00
Unknown W. Brackets
601ff10f1e
softgpu: Use SSE in tex modulation.
...
Could do others, this seems the most common. Gives a few more percent.
2014-03-16 18:28:06 -07:00
Unknown W. Brackets
47728528d7
softgpu: Use SSE in Vec?::Length().
...
Minor perf boost but if I do everything in Vec things get slower.
2014-03-16 17:56:34 -07:00
Unknown W. Brackets
6ef0aa123f
softgpu: Use SSE for the secondary color.
...
It's easy to speed up this code since it's so hot.
2014-03-16 16:21:12 -07:00
Unknown W. Brackets
7f3e158a0f
softgpu: Get all tex samples at the same time.
...
Kills a bunch of overhead, improving speed more.
2014-03-16 15:51:47 -07:00
Unknown W. Brackets
d9e29a2edf
softgpu: Optimize alpha blending handling.
...
This alone makes it a good bit faster.
2014-03-16 15:22:31 -07:00
Unknown W. Brackets
f21649e563
softgpu: Minor simplification for alpha blend.
2014-03-16 15:09:42 -07:00
Unknown W. Brackets
1ab7325d4a
softgpu: Use a full Vec4 for the prim color.
...
Simpler, and slightly faster.
2014-03-16 15:04:41 -07:00
Unknown W. Brackets
c3530a6674
softgpu: Don't multithread small triangles.
...
It ends up being slower with all the overhead, of course.
2014-03-16 14:49:49 -07:00
Unknown W. Brackets
b33d0c4046
softgpu: Use SSE for texture sampling.
2014-03-16 14:33:42 -07:00
Unknown W. Brackets
b357b00ace
softgpu: Use SSE for through texture coords.
2014-03-16 14:30:20 -07:00
Unknown W. Brackets
dd140b73bb
softgpu: Use SSE for gouraud shading.
2014-03-16 14:29:22 -07:00
Unknown W. Brackets
743854afc8
Fix off-by-one on fast matrix loads.
...
May matter mostly if there's a stall right at the end of the matrix.
2014-03-15 15:23:55 -07:00
Henrik Rydgård
78ce9b3f3c
Spline patches: Ignore too-small patch_div_s/t. May help #5663
2014-03-15 21:29:48 +01:00
Unknown W. Brackets
a843cbd580
Shrink the very common sceKernelThread.h include.
2014-03-15 11:44:02 -07:00
Unknown W. Brackets
996fa39684
Reduce some unnecessary includes in Core/.
2014-03-15 10:41:07 -07:00
Henrik Rydgard
b4d99b1981
Revert "Avoid caching when HW T&L with morph enabled."
...
This reverts commit 557eae7ca9b6130a4645e346049940116031b109.
2014-03-15 10:46:04 +01:00
raven02
557eae7ca9
Avoid caching when HW T&L with morph enabled.
2014-03-14 21:04:32 +08:00
Henrik Rydgard
4df49a72ab
Add yet another hack setting to work around the 3rd Birthday problem.
...
Hopefully temporary...
2014-03-13 19:00:35 +01:00
Henrik Rydgard
2eb6a4e2f2
Fix a warning, rename some parameters, etc.
2014-03-08 10:40:43 +01:00
raven02
1b831ce022
SW T&L
2014-03-07 21:41:40 +08:00