Commit Graph

1741 Commits

Author SHA1 Message Date
raven02
c0e8893560 Another attempt at sizing framebuffer based on fmt 2013-11-07 10:56:41 +08:00
Henrik Rydgard
64bdb5e21d Tiny optimization (32-bit) in GLES_GPU::FastRunLoop 2013-11-07 01:34:43 +01:00
Unknown W. Brackets
34398b7d0c Avoid literal loads in the arm vertexjit. 2013-11-06 08:45:00 -08:00
Unknown W. Brackets
78400fd460 Avoid some dereferencing in gpu FastRunLoop. 2013-11-06 07:50:16 -08:00
Henrik Rydgård
51995a3d43 Vtx dec: After generating ARM, remember to flush the icache.
Will hopefully fix the random crashes in #4461.
2013-11-06 16:14:40 +01:00
Henrik Rydgård
e3f6f25390 Buildfix for non-Windows non-ARM 2013-11-06 13:54:26 +01:00
Henrik Rydgård
ea9da85bdb Missed one possible unaligned access 2013-11-06 13:14:49 +01:00
Henrik Rydgård
b3fdfc01c8 ARM vtx dec: Avoid all unaligned accesses entirely.
Seeing so much contradictory information on the support and performance
of these.
2013-11-06 12:17:41 +01:00
Henrik Rydgård
1e158fa652 ARM vtx dec: Preserving our FP scratch register appears to improve
stability.

Also added some logging.
2013-11-06 11:47:26 +01:00
Henrik Rydgård
b19d41f9a8 Now that LDRH works, use it where appropriate 2013-11-06 10:51:21 +01:00
Henrik Rydgard
0eb3d79de9 x86 VertexDecoder jit: Fix typo in 16-bit weight decoder. May fix crashes. 2013-11-05 21:06:43 +01:00
Henrik Rydgård
a03e5c6de0 Merge pull request #4460 from hrydgard/vertex-decoder-jit
Vertex decoder JIT
2013-11-05 07:30:58 -08:00
Henrik Rydgård
7bf8a4dc5e We need to use a signed VCVT to float in PosS*Through 2013-11-05 12:17:18 +01:00
Unknown W. Brackets
c7edf73cdb Small optimizations to the vertexjit. 2013-11-05 00:32:08 -08:00
Unknown W. Brackets
f6662054bd Fix arm emitter bug in LDRH and friends. 2013-11-05 00:32:08 -08:00
Unknown W. Brackets
e435b81281 Optimize IndexGenerator::AddPrim() funcs for MSVC.
Reduces profile from ~5.4% to ~1.6% (with vertex cache off) in
Senjou no Valkyria 3.  Similar to the TranslatePrim() funcs.
2013-11-04 22:49:28 -08:00
Henrik Rydgård
9113c584f1 Merge pull request #4399 from raven02/patch-11
Attempt to go back to the earliest logic of finding a matching framebuffer
2013-11-04 04:43:34 -08:00
Henrik Rydgård
5886ccffdc Merge pull request #4441 from unknownbrackets/vertex-decoder-jit
Force 5 byte jumps to avoid jump target issues (vertex jit, x86)
2013-11-04 01:25:40 -08:00
Unknown W. Brackets
06194ac261 Add 5551 conversion to the arm vertexjit. 2013-11-04 00:47:05 -08:00
Unknown W. Brackets
16dcf807a8 Add 565 conversion to arm vertexjit. 2013-11-03 21:58:26 -08:00
Unknown W. Brackets
ab17d659cf Implement 4444 conversion in arm vertexjit.
Seems to help Dissidia a bit.
2013-11-03 21:58:26 -08:00
Unknown W. Brackets
bfda36efff Don't subtract nrmoff in arm vertexjit. 2013-11-03 21:57:55 -08:00
Unknown W. Brackets
a1fa65f631 Stupid typos, broke 4444 and 565. 2013-11-03 18:43:24 -08:00
Unknown W. Brackets
d5337edf1f Force 5 byte jumps to avoid jump target issues.
Some with 16-bit colors were too far.
2013-11-03 17:17:04 -08:00
Unknown W. Brackets
aece4fd580 16-bit colors in vertex jit for x86. 2013-11-03 15:04:47 -08:00
Henrik Rydgard
f0fd7679ce Preliminary ARM vertex decoder JIT. Has a weird issue in PosS16.
Other minor changes and fixes.
2013-11-03 20:15:42 +01:00
Unknown W. Brackets
64e977db08 Improve the non-NEON tex hash path on ARM.
This generates better looking disassembly, though a small change.
2013-11-03 07:43:10 -08:00
Henrik Rydgard
810b1a061f Vertex decoder JIT for x86 and x64. Handles the most common vertex formats. 2013-11-03 15:27:12 +01:00
Unknown W. Brackets
e33b7fa1a4 iOS buildfix. 2013-11-03 01:08:48 -08:00
Sacha
5613b86864 Use NEON texture decoder on Blackberry and iOS. Use ARMV7 defines. 2013-11-03 15:59:10 +10:00
Unknown W. Brackets
ed1204a10f Android armv6/etc. buildfix. 2013-11-02 10:14:25 -07:00
Henrik Rydgård
e12894a420 Merge pull request #4407 from unknownbrackets/texcache
Add a NEON version of the tex hash.
2013-11-02 07:41:39 -07:00
Henrik Rydgard
58860158df Turn off the UNPACK optimization when texture scaling is on. Fixes #4408 2013-11-02 15:23:35 +01:00
Henrik Rydgård
4d3e57d6eb Move normal reversion into the vertex shader instead of the decoder. 2013-11-02 11:05:31 +01:00
Unknown W. Brackets
4d47ccd5df Add a NEON version of the tex hash.
Should be used only for NEON devices.  Currently only compiled on Android.
2013-11-02 02:09:54 -07:00
Henrik Rydgård
1347c3b019 Merge pull request #4387 from hrydgard/unpack_subimage
Use GL_EXT_unpack_subimage to speed up non-pow-2 texture loads when available
2013-11-01 12:02:50 -07:00
Henrik Rydgard
a3a1395fc0 No need to set PACK parameters when we UNPACK 2013-11-01 19:56:06 +01:00
Henrik Rydgard
1fb7cdfcd2 Remove redundant call to ConvertColors, skip a copy when possible 2013-11-01 19:38:53 +01:00
Henrik Rydgard
5de7bb2e2d Use GL_EXT_unpack_subimage to speed up non-pow-2 texture loads when available 2013-11-01 19:38:52 +01:00
raven02
5b228d5fe4 Attempt to go back to the very original logic for finding a matching framebuffer 2013-11-01 22:00:34 +08:00
Unknown W. Brackets
1ee83935cb Correct texcache alignment check.
Oops.
2013-10-31 23:37:05 -07:00
Unknown W. Brackets
dc8902dbff Cut down on color conversion instructions (x86.) 2013-10-31 23:29:18 -07:00
Unknown W. Brackets
96256f43e9 Make sure clut memory is aligned.
For SSE color conversion, etc.
2013-10-31 23:29:17 -07:00
Unknown W. Brackets
f42cd11ddb Speed up color conversion using SSE.
Probably not the very most optimal implementation, but faster and SSE2.
2013-10-31 23:29:17 -07:00
Unknown W. Brackets
82761de992 Improve the texture cache hash on x86.
It's a bit less weak now, but still not strong.  Still fairly fast, it
seems like.
2013-10-31 23:29:16 -07:00
Henrik Rydgard
b9c908ba3f Update post-processing shaders to work again after removing u_viewproj.
Initialize some uninitialized variables.
2013-10-31 00:07:55 +01:00
Henrik Rydgård
70e4214e2e Add comment with a link to an important github thread 2013-10-30 23:08:03 +01:00
Henrik Rydgård
07a868910e Add a temporary hack option that may help debugging the wipeout glow.
It reduces the glow problem by a lot but is obviously incorrect.
2013-10-30 22:47:36 +01:00
Henrik Rydgård
357b133ff6 No need to dirty the PROJ_THROUGH when only the proj matrix is being
changed.
2013-10-30 22:47:36 +01:00
Henrik Rydgård
7e27bd9dc9 Framebuffer draws: Get rid of the u_viewproj uniform matrix. 2013-10-30 22:47:36 +01:00