Unknown W. Brackets
|
e5dabaabe2
|
x86jit: Optimize simd->non for 1-lane a little.
|
2014-11-26 09:20:50 -08:00 |
|
Unknown W. Brackets
|
5d0c32d1e6
|
x86jit: Assume non-simd regs are dirty.
|
2014-11-26 09:19:50 -08:00 |
|
Unknown W. Brackets
|
a4b9122943
|
x86jit: Use NS instead of NBE for checked entries.
This may cause us to more correctly bail on linked blocks in some cases.
|
2014-11-23 11:05:49 -08:00 |
|
Unknown W. Brackets
|
fe525a52f9
|
Update native (shutdown crash) + comment.
|
2014-11-23 11:04:07 -08:00 |
|
Unknown W. Brackets
|
473f388088
|
Disable the simd stuff for now.
Won't have time to look at this for a bit...
|
2014-11-20 14:07:56 -08:00 |
|
Henrik Rydgård
|
6a49337a0c
|
Merge pull request #7096 from unknownbrackets/jit-simd
x86jit: Add basic support for mapping SIMD
|
2014-11-18 18:25:39 +01:00 |
|
Unknown W. Brackets
|
ab7dd0df25
|
x86jit: Add an option to enable/disable vfpu simd.
|
2014-11-17 20:37:27 -08:00 |
|
Henrik Rydgard
|
53b5d331b4
|
Assorted minor optimizations
|
2014-11-17 21:21:44 +01:00 |
|
Unknown W. Brackets
|
921b39ebf5
|
x86jit: Optimize a 2-reg simd load.
|
2014-11-16 15:05:17 -08:00 |
|
Unknown W. Brackets
|
e68eb0a292
|
x86jit: Load sequential regs in one shot.
|
2014-11-16 15:05:17 -08:00 |
|
Unknown W. Brackets
|
ed501302a2
|
x86jit: Add a check to see if we can map simd.
|
2014-11-16 15:05:16 -08:00 |
|
Unknown W. Brackets
|
27148d3712
|
x86jit: Add some helpers to check state.
|
2014-11-16 13:33:16 -08:00 |
|
Unknown W. Brackets
|
de566be2ce
|
x86jit: Split out the logic for loading simd regs.
|
2014-11-16 13:33:15 -08:00 |
|
Unknown W. Brackets
|
5347431c20
|
x86jit: Initial simd for VecDo3(). Broken.
I'm not sure why/where it's broken...
|
2014-11-16 13:33:15 -08:00 |
|
Unknown W. Brackets
|
aad505e7b3
|
x86jit: Add a TryMapDirtyInInVS() for 3-op.
|
2014-11-16 13:33:14 -08:00 |
|
Unknown W. Brackets
|
88a753eff3
|
x86jit: Add an invariant contract to the fpu cache.
This should help catch things better in debug mode.
|
2014-11-16 13:33:14 -08:00 |
|
Unknown W. Brackets
|
39afeb490f
|
x86jit: Add some type safety.
|
2014-11-16 13:33:13 -08:00 |
|
Unknown W. Brackets
|
4335bf3346
|
x86jit: Add basic mapping of SIMD regs.
Not tested yet, just sketched out. All very suboptimal.
|
2014-11-16 13:33:13 -08:00 |
|
Unknown W. Brackets
|
9429359b47
|
x86jit: Add fallbacks when moving from VS -> V.
|
2014-11-16 13:33:12 -08:00 |
|
Unknown W. Brackets
|
2862367927
|
x86jit: Add force-non-simd to all current ops.
Unless they already use MapRegs, because that will automatically handle
it.
|
2014-11-16 13:33:12 -08:00 |
|
Unknown W. Brackets
|
4cf0913692
|
x86jit: Sketch some initial SIMD apis.
|
2014-11-16 13:33:07 -08:00 |
|
Henrik Rydgard
|
bfcd3690b6
|
x86 jit: Fix+enable quaternion product, optimize "sw zero, *"
|
2014-11-16 18:37:38 +01:00 |
|
Henrik Rydgard
|
28ca8d4818
|
x86 jit: Use LEA to emulate addu but only when it can save a few bytes
|
2014-11-16 17:39:47 +01:00 |
|
Henrik Rydgard
|
1c78e29c79
|
x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
Might have missed a few places.
|
2014-11-16 17:38:26 +01:00 |
|
Henrik Rydgard
|
8b90f881b8
|
x86 jit: A tiny optimization and a tiny bugfix
|
2014-11-16 16:46:35 +01:00 |
|
Unknown W. Brackets
|
096b41cceb
|
x86jit: Interleave reg usage in vcmp.
|
2014-11-10 23:22:04 -08:00 |
|
Unknown W. Brackets
|
0e1aa35e84
|
x86jit: Just do the ES/NS compare once.
|
2014-11-10 23:04:38 -08:00 |
|
Unknown W. Brackets
|
2758e8fa3c
|
x86jit: Optimize vcmp for single and simd.
|
2014-11-10 23:04:37 -08:00 |
|
Unknown W. Brackets
|
86e3739a3e
|
x86jit: Optimize some cases of ins/ext.
They happen but are minor.
|
2014-11-09 09:22:29 -08:00 |
|
Unknown W. Brackets
|
e05263af32
|
x86jit: Allow EBX sign extension for 32-bit.
|
2014-11-09 09:07:52 -08:00 |
|
Unknown W. Brackets
|
8dbd3c3b9c
|
x86jit: Don't lie about ZERO when it's not an imm.
|
2014-11-09 08:27:02 -08:00 |
|
Unknown W. Brackets
|
d0a2ced2f9
|
x86jit: Flip cc in slt* to avoid reg loads.
Unfortunately, this zero thing is now concerning me...
|
2014-11-09 08:15:39 -08:00 |
|
Unknown W. Brackets
|
59f491eddb
|
x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency. It also avoids
filling registers that are not likely to be used again.
Fixed a small mistake.
|
2014-11-09 07:23:44 -08:00 |
|
Henrik Rydgard
|
18495a452d
|
Rename an enum
|
2014-11-09 14:55:23 +01:00 |
|
Henrik Rydgard
|
a19d0b648a
|
x86 jit: Add a simple speedhack (ignore masking stack pointers) but disable due to low impact.
|
2014-11-09 14:54:39 +01:00 |
|
Henrik Rydgard
|
a528921f3c
|
x86 JIT: EBX was free in 32-bit mode, let's use it in the regcache.
|
2014-11-09 12:55:17 +01:00 |
|
Henrik Rydgard
|
5888b3bdc4
|
Revert "x86jit: Micro optimize slt* a bit."
This reverts commit ee66596b8d.
Broke a lot of games, probably some small bug.
Conflicts:
Core/MIPS/x86/CompALU.cpp
|
2014-11-09 12:07:21 +01:00 |
|
Unknown W. Brackets
|
313d9e95c7
|
Clarify a comment.
|
2014-11-09 01:05:03 -08:00 |
|
Unknown W. Brackets
|
ee66596b8d
|
x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency. It also avoids
filling registers that are not likely to be used again.
|
2014-11-08 22:54:03 -08:00 |
|
Unknown W. Brackets
|
27d8108bb2
|
x86jit: Optimize loads of 0 into fp regs.
|
2014-11-08 18:41:16 -08:00 |
|
Unknown W. Brackets
|
7d8858687e
|
x86jit: Avoid speculative loads in mtc1/mfc1.
|
2014-11-08 18:35:15 -08:00 |
|
Unknown W. Brackets
|
57caa95273
|
x86jit: Implement round.w.s and friends.
They are not terribly fast, though, updating MXCSR.
|
2014-11-08 17:59:38 -08:00 |
|
Unknown W. Brackets
|
3908e0f445
|
x86jit: Small optimization for add.s f1, f2, f2.
Doubles the speed of that particular case. Biggest difference is not
loading fd for no reason.
|
2014-11-08 17:32:53 -08:00 |
|
Unknown W. Brackets
|
f9893c29ce
|
x86jit: Very small optimization to c.nge.s.
|
2014-11-08 17:01:02 -08:00 |
|
Unknown W. Brackets
|
78dfe43776
|
x86jit: Optimize neg.s and abs.s a tiny bit.
Same reg is probably a common case, improves micro benchmark.
|
2014-11-08 16:50:41 -08:00 |
|
Unknown W. Brackets
|
bed0d0b059
|
x86jit: Improve cvt.w.s when fd is loaded or fs.
We have no need to store it.
|
2014-11-08 16:40:54 -08:00 |
|
Unknown W. Brackets
|
1917d946ea
|
x86jit: Micro optimize cvt.s.w a bit.
This implementation is about 5x faster for micro benchmarks. Little
impact to overall perf in games I tested, though.
|
2014-11-08 13:30:38 -08:00 |
|
Unknown W. Brackets
|
671dee85c7
|
x86jit: Micro optimize vi2f a little bit.
This didn't help overall perf much but micro benchmarks are better.
|
2014-11-08 13:07:01 -08:00 |
|
Unknown W. Brackets
|
c29b126357
|
x86jit: Oops, can't have an imm here.
|
2014-11-08 12:41:48 -08:00 |
|
Unknown W. Brackets
|
c0be19edb6
|
x86jit: Simplify vavg a bit.
|
2014-11-08 12:40:04 -08:00 |
|