Henrik Rydgård
d04415ac5d
Merge pull request #4479 from unknownbrackets/perf
...
Small optimizations (armjit) for imms
2013-11-08 11:45:16 -08:00
Unknown W. Brackets
385f9a457e
Use GetPointerUnchecked in hot, pre-checked areas.
2013-11-08 11:39:24 -08:00
Henrik Rydgard
0e542b6ecc
vtxdec: small omission
2013-11-08 20:38:49 +01:00
Henrik Rydgard
22d6b36005
Vertexdecoder "double": fix for x86 and very minor optimization for arm.
2013-11-08 20:03:28 +01:00
Henrik Rydgard
6b45c321b6
vtxdecjit: turn off excessive logging
2013-11-08 18:49:17 +01:00
Henrik Rydgard
381b6d0f05
VertexDecoder JIT: Add the last missing ones except morph, I think.
2013-11-08 12:43:46 +01:00
Henrik Rydgard
502cbc170a
Revert "Another attempt to sizing framebuffer based on fmt"
...
This reverts commit c0e8893560.
2013-11-07 15:29:11 +01:00
Henrik Rydgård
c213c44050
Merge pull request #4468 from hrydgard/vtxdec-prescale
...
Add support for prescaled UV in vertex decoder JIT
2013-11-07 02:51:14 -08:00
Henrik Rydgard
c4e3dd14fd
Add commented-out code to save XMM4/XMM5.
...
According to all calling convention manuals I can find, we don't really
need to preserve them. If they become problematic as mentioned, we can
activate this.
2013-11-07 09:54:58 +01:00
raven02
c0e8893560
Another attempt to sizing framebuffer based on fmt
2013-11-07 10:56:41 +08:00
Henrik Rydgard
64bdb5e21d
Tiny optimization (32-bit) in GLES_GPU::FastRunLoop
2013-11-07 01:34:43 +01:00
Henrik Rydgard
b203da05e9
Prescale UV in vtx-dec-jit: Fix bugs, add ARM support
2013-11-07 01:24:53 +01:00
Henrik Rydgård
367bcf6d4f
Prescale in the vertex dec jit. Needs debugging.
2013-11-07 01:24:53 +01:00
Unknown W. Brackets
34398b7d0c
Avoid literal loads in the arm vertexjit.
2013-11-06 08:45:00 -08:00
Unknown W. Brackets
78400fd460
Avoid some dereferencing in gpu FastRunLoop.
2013-11-06 07:50:16 -08:00
Henrik Rydgård
51995a3d43
Vtx dec: After generating ARM, remember to flush the icache.
...
Will hopefully fix the random crashes in #4461.
2013-11-06 16:14:40 +01:00
Henrik Rydgård
e3f6f25390
Buildfix for non-Windows non-ARM
2013-11-06 13:54:26 +01:00
Henrik Rydgård
ea9da85bdb
Missed one possible unaligned access
2013-11-06 13:14:49 +01:00
Henrik Rydgård
b3fdfc01c8
ARM vtx dec: Avoid all unaligned accesses entirely.
...
Seeing so much contradictory information on the support and performance
of these.
2013-11-06 12:17:41 +01:00
Henrik Rydgård
1e158fa652
ARM vtx dec: Preserving our FP scratch register appears to improve
...
stability.
Also added some logging.
2013-11-06 11:47:26 +01:00
Henrik Rydgård
b19d41f9a8
Now that LDRH works, use it where appropriate
2013-11-06 10:51:21 +01:00
Henrik Rydgard
0eb3d79de9
x86 VertexDecoder jit: Fix typo in 16-bit weight decoder. May fix crashes.
2013-11-05 21:06:43 +01:00
Henrik Rydgård
a03e5c6de0
Merge pull request #4460 from hrydgard/vertex-decoder-jit
...
Vertex decoder JIT
2013-11-05 07:30:58 -08:00
Henrik Rydgård
7bf8a4dc5e
We need to use a signed VCVT to float in PosS*Through
2013-11-05 12:17:18 +01:00
Unknown W. Brackets
c7edf73cdb
Small optimizations to the vertexjit.
2013-11-05 00:32:08 -08:00
Unknown W. Brackets
f6662054bd
Fix arm emitter bug in LDRH and friends.
2013-11-05 00:32:08 -08:00
Henrik Rydgård
9113c584f1
Merge pull request #4399 from raven02/patch-11
...
Attempt to go back to the earliest logic of finding a matching framebuffer
2013-11-04 04:43:34 -08:00
Henrik Rydgård
5886ccffdc
Merge pull request #4441 from unknownbrackets/vertex-decoder-jit
...
Force 5 byte jumps to avoid jump target issues (vertex jit, x86)
2013-11-04 01:25:40 -08:00
Unknown W. Brackets
06194ac261
Add 5551 conversion to the arm vertexjit.
2013-11-04 00:47:05 -08:00
Unknown W. Brackets
16dcf807a8
Add 565 conversion to arm vertexjit.
2013-11-03 21:58:26 -08:00
Unknown W. Brackets
ab17d659cf
Implement 4444 conversion in arm vertexjit.
...
Seems to help Dissidia a bit.
2013-11-03 21:58:26 -08:00
Unknown W. Brackets
bfda36efff
Don't subtract nrmoff in arm vertexjit.
2013-11-03 21:57:55 -08:00
Unknown W. Brackets
a1fa65f631
Stupid typos, broke 4444 and 565.
2013-11-03 18:43:24 -08:00
Unknown W. Brackets
d5337edf1f
Force 5 byte jumps to avoid jump target issues.
...
Some with 16-bit colors were too far.
2013-11-03 17:17:04 -08:00
Unknown W. Brackets
aece4fd580
16-bit colors in vertex jit for x86.
2013-11-03 15:04:47 -08:00
Henrik Rydgard
f0fd7679ce
Preliminary ARM vertex decoder JIT. Has a weird issue in PosS16.
...
Other minor changes and fixes.
2013-11-03 20:15:42 +01:00
Henrik Rydgard
810b1a061f
Vertex decoder JIT for x86 and x64. Handles the most common vertex formats.
2013-11-03 15:27:12 +01:00
Unknown W. Brackets
e33b7fa1a4
iOS buildfix.
2013-11-03 01:08:48 -08:00
Henrik Rydgård
e12894a420
Merge pull request #4407 from unknownbrackets/texcache
...
Add a NEON version of the tex hash.
2013-11-02 07:41:39 -07:00
Henrik Rydgard
58860158df
Turn off the UNPACK optimization when texture scaling is on. Fixes #4408
2013-11-02 15:23:35 +01:00
Henrik Rydgård
4d3e57d6eb
Move normal reversion into the vertex shader instead of the decoder.
2013-11-02 11:05:31 +01:00
Unknown W. Brackets
4d47ccd5df
Add a NEON version of the tex hash.
...
Should be used only for NEON devices. Currently only compiled on Android.
2013-11-02 02:09:54 -07:00
Henrik Rydgård
1347c3b019
Merge pull request #4387 from hrydgard/unpack_subimage
...
Use GL_EXT_unpack_subimage to speed up non-pow-2 texture loads when available
2013-11-01 12:02:50 -07:00
Henrik Rydgard
a3a1395fc0
No need to set PACK parameters when we UNPACK
2013-11-01 19:56:06 +01:00
Henrik Rydgard
1fb7cdfcd2
Remove redundant call to ConvertColors, skip a copy when possible
2013-11-01 19:38:53 +01:00
Henrik Rydgard
5de7bb2e2d
Use GL_EXT_unpack_subimage to speed up non-pow-2 texture loads when available
2013-11-01 19:38:52 +01:00
raven02
5b228d5fe4
Attempt to go back to the very original finding a matching framebuffer logic
2013-11-01 22:00:34 +08:00
Unknown W. Brackets
1ee83935cb
Correct texcache alignment check.
...
Oops.
2013-10-31 23:37:05 -07:00
Unknown W. Brackets
dc8902dbff
Cut down on color conversion instructions (x86.)
2013-10-31 23:29:18 -07:00
Unknown W. Brackets
96256f43e9
Make sure clut memory is aligned.
...
For SSE color conversion, etc.
2013-10-31 23:29:17 -07:00