2402 Commits

Author SHA1 Message Date
Unknown W. Brackets
2482b2a1e0 Use aligned NEON loads in texhash. 2014-03-22 21:58:50 -07:00
Unknown W. Brackets
b44d10a91e Move texture unswizzling to decoder, use NEON. 2014-03-22 21:35:16 -07:00
Henrik Rydgard
8b92dcea47 Transform: Compute the "DCID" (draw call ID) incrementally instead of an extra pass. 2014-03-23 01:51:51 +01:00
Henrik Rydgard
8bf015fe16 texcache: SSE optimized version of the most common case of Unswizzle
(didn't put this in fast_math because it's pretty specific to PSP)
2014-03-23 01:50:50 +01:00
Henrik Rydgård
dac51b9c1b Merge pull request #5693 from unknownbrackets/jit-minor
x86 jit and vertex jit changes, ARM emitter changes
2014-03-23 00:22:04 +01:00
Henrik Rydgård
5d44b09cb2 Merge pull request #5692 from hrydgard/fast_math
Use the new fast-math from native for 4x4 matrix mul
2014-03-23 00:17:19 +01:00
Unknown W. Brackets
632eec38e8 vertexjit: Use SSE4.1 where available on x86.
Just because we can.
2014-03-22 16:11:16 -07:00
Unknown W. Brackets
5d04f123b9 vertexjit: A couple more tweaks to morph on x86. 2014-03-22 15:56:30 -07:00
Unknown W. Brackets
12c2683fb8 vertexjit: Cut a few more instrs from x86 morph. 2014-03-22 15:56:30 -07:00
Unknown W. Brackets
0da5caf11a vertexjit: Cut a few instrs from morph on x86. 2014-03-22 15:56:30 -07:00
Unknown W. Brackets
162f229294 vertexjit: Support the color morphs on x86. 2014-03-22 15:56:29 -07:00
Henrik Rydgard
63aeb31e07 Attempt workaround for fog issue #5384 2014-03-22 23:49:14 +01:00
Henrik Rydgård
8dfadf7b8e ArmEmitter: Add VMOV_neon and a Size parameter to VFMA for consistency. 2014-03-22 16:31:16 +01:00
Henrik Rydgard
bc121242b3 Use fast_math matrix multiplication for culling and sw transform 2014-03-22 14:40:09 +01:00
Henrik Rydgård
98da5144ef Merge pull request #5612 from raven02/patch-27
Shade mapping fix
2014-03-22 14:37:22 +01:00
Unknown W. Brackets
66f501b981 Avoid an invalid enum on GLES2 texture creation.
My device logs an error, which I'm guessing has perf impact.
2014-03-22 09:34:22 +01:00
Henrik Rydgard
f4db725400 Remove redundant call to ReplaceAlphaWithStencil 2014-03-22 09:28:45 +01:00
Henrik Rydgard
ba5d88e9d6 Fix bug in FastLoadBoneMatrix where the wrong uniform could be dirtied 2014-03-22 09:27:43 +01:00
Henrik Rydgard
0b673719c2 Crashfix for software renderer in 32-bit (SSE misalignment) 2014-03-22 00:12:21 +01:00
Unknown W. Brackets
a8a299c2e3 Fix ToRGB/ToRGBA possible accuracy loss.
It was always like this, but not used as much before.  Shifts are fast and
it needs to sum anyway, there should not be any benefit to multiplying as
floats, and it will probably lose accuracy.
2014-03-18 22:56:27 -07:00
Unknown W. Brackets
678237aa6c Improve SSE usage in software transform.
It's actually already pretty decent (unlike the softgpu), but there were a
few places it could use a bit of help.  Speeds up things with hardware
transform off, or areas that need to use software transform.
2014-03-17 23:05:48 -07:00
Unknown W. Brackets
416df17088 Inline From/ToRGB(A) to avoid losing SSE.
Otherwise it has to store it, which I'd like to avoid.
2014-03-17 23:03:04 -07:00
Unknown W. Brackets
1ce6bf399a Buildfix for 32-bit x86, arg. 2014-03-17 21:52:45 -07:00
Unknown W. Brackets
833c93bd98 Dumb mistake, forgot the divide.
Probably caused the blending issues.
2014-03-17 12:53:49 -07:00
Unknown W. Brackets
6630e45eff Just add a packed version of Vec3f.
This way we can have it aligned to memory where needed.  I think it'd be
better to avoid this if possible so that we can actually vectorize
spline/etc. code.

Fixes #5673.
2014-03-17 06:59:40 -07:00
Unknown W. Brackets
38d0bac1df Optimize some 4444/8888 color conversions.
Small performance boost in softgpu.
2014-03-17 01:21:52 -07:00
Unknown W. Brackets
6de2129f98 softgpu: Don't re-pack 8888 colors.
It's like a bad joke, but MSVC was not optimizing this out.
2014-03-16 23:03:07 -07:00
Unknown W. Brackets
10456a09ac Oops, forgot to multiply in float ToRGBA().
Not actually used...
2014-03-16 21:12:23 -07:00
Unknown W. Brackets
627027307c softgpu: Use SSE in ToRGB()/FromRGB() etc. 2014-03-16 19:21:35 -07:00
Unknown W. Brackets
07ca96e226 softgpu: Use SSE in alpha blending. 2014-03-16 18:57:11 -07:00
Unknown W. Brackets
601ff10f1e softgpu: Use SSE in tex modulation.
Could do others, this seems the most common.  Gives a few more percent.
2014-03-16 18:28:06 -07:00
Unknown W. Brackets
47728528d7 softgpu: Use SSE in Vec?::Length().
Minor perf boost but if I do everything in Vec things get slower.
2014-03-16 17:56:34 -07:00
Unknown W. Brackets
6ef0aa123f softgpu: Use SSE for the secondary color.
It's easy to speed up this code since it's so hot.
2014-03-16 16:21:12 -07:00
Unknown W. Brackets
7f3e158a0f softgpu: Get all tex samples at the same time.
Kills a bunch of overhead, improving speed more.
2014-03-16 15:51:47 -07:00
Unknown W. Brackets
d9e29a2edf softgpu: Optimize alpha blending handling.
This alone makes it a good bit faster.
2014-03-16 15:22:31 -07:00
Unknown W. Brackets
f21649e563 softgpu: Minor simplification for alpha blend. 2014-03-16 15:09:42 -07:00
Unknown W. Brackets
1ab7325d4a softgpu: Use a full Vec4 for the prim color.
Simpler, and slightly faster.
2014-03-16 15:04:41 -07:00
Unknown W. Brackets
c3530a6674 softgpu: Don't multithread small triangles.
It ends up being slower with all the overhead, of course.
2014-03-16 14:49:49 -07:00
Unknown W. Brackets
b33d0c4046 softgpu: Use SSE for texture sampling. 2014-03-16 14:33:42 -07:00
Unknown W. Brackets
b357b00ace softgpu: Use SSE for through texture coords. 2014-03-16 14:30:20 -07:00
Unknown W. Brackets
dd140b73bb softgpu: Use SSE for gouraud shading. 2014-03-16 14:29:22 -07:00
Unknown W. Brackets
743854afc8 Fix off-by-one on fast matrix loads.
May matter mostly if there's a stall right at the end of the matrix.
2014-03-15 15:23:55 -07:00
Henrik Rydgård
78ce9b3f3c Spline patches: Ignore too-small patch_div_s/t. May help #5663 2014-03-15 21:29:48 +01:00
Unknown W. Brackets
a843cbd580 Shrink the very common sceKernelThread.h include. 2014-03-15 11:44:02 -07:00
Unknown W. Brackets
996fa39684 Reduce some unnecessary includes in Core/. 2014-03-15 10:41:07 -07:00
Henrik Rydgard
b4d99b1981 Revert "Avoid caching when HW T&L with morph enabled."
This reverts commit 557eae7ca9b6130a4645e346049940116031b109.
2014-03-15 10:46:04 +01:00
raven02
557eae7ca9 Avoid caching when HW T&L with morph enabled. 2014-03-14 21:04:32 +08:00
Henrik Rydgard
4df49a72ab Add yet another hack setting to work around the 3rd Birthday problem.
Hopefully temporary...
2014-03-13 19:00:35 +01:00
Henrik Rydgard
2eb6a4e2f2 Fix a warning, rename some parameters, etc. 2014-03-08 10:40:43 +01:00
raven02
1b831ce022 SW T&L 2014-03-07 21:41:40 +08:00