Commit Graph

505 Commits

Author SHA1 Message Date
Unknown W. Brackets
e5dabaabe2 x86jit: Optimize simd->non for 1-lane a little. 2014-11-26 09:20:50 -08:00
Unknown W. Brackets
5d0c32d1e6 x86jit: Assume non-simd regs are dirty. 2014-11-26 09:19:50 -08:00
Unknown W. Brackets
a4b9122943 x86jit: Use NS instead of NBE for checked entries.
This may cause us to more correctly bail on linked blocks in some cases.
2014-11-23 11:05:49 -08:00
Unknown W. Brackets
fe525a52f9 Update native (shutdown crash) + comment. 2014-11-23 11:04:07 -08:00
Unknown W. Brackets
473f388088 Disable the simd stuff for now.
Won't have time to look at this for a bit...
2014-11-20 14:07:56 -08:00
Henrik Rydgård
6a49337a0c Merge pull request #7096 from unknownbrackets/jit-simd
x86jit: Add basic support for mapping SIMD
2014-11-18 18:25:39 +01:00
Unknown W. Brackets
ab7dd0df25 x86jit: Add an option to enable/disable vfpu simd. 2014-11-17 20:37:27 -08:00
Henrik Rydgard
53b5d331b4 Assorted minor optimizations 2014-11-17 21:21:44 +01:00
Unknown W. Brackets
921b39ebf5 x86jit: Optimize a 2-reg simd load. 2014-11-16 15:05:17 -08:00
Unknown W. Brackets
e68eb0a292 x86jit: Load sequential regs in one shot. 2014-11-16 15:05:17 -08:00
Unknown W. Brackets
ed501302a2 x86jit: Add a check to see if we can map simd. 2014-11-16 15:05:16 -08:00
Unknown W. Brackets
27148d3712 x86jit: Add some helpers to check state. 2014-11-16 13:33:16 -08:00
Unknown W. Brackets
de566be2ce x86jit: Split out the logic for loading simd regs. 2014-11-16 13:33:15 -08:00
Unknown W. Brackets
5347431c20 x86jit: Initial simd for VecDo3(). Broken.
I'm not sure why/where it's broken...
2014-11-16 13:33:15 -08:00
Unknown W. Brackets
aad505e7b3 x86jit: Add a TryMapDirtyInInVS() for 3-op. 2014-11-16 13:33:14 -08:00
Unknown W. Brackets
88a753eff3 x86jit: Add an invariant contract to the fpu cache.
This should help catch things better in debug mode.
2014-11-16 13:33:14 -08:00
Unknown W. Brackets
39afeb490f x86jit: Add some typesafety. 2014-11-16 13:33:13 -08:00
Unknown W. Brackets
4335bf3346 x86jit: Add basic mapping of SIMD regs.
Not tested yet, just sketched out.  All very suboptimal.
2014-11-16 13:33:13 -08:00
Unknown W. Brackets
9429359b47 x86jit: Add fallbacks when moving from VS -> V. 2014-11-16 13:33:12 -08:00
Unknown W. Brackets
2862367927 x86jit: Add force-non-simd to all current ops.
Unless they already use MapRegs, because that will automatically handle
it.
2014-11-16 13:33:12 -08:00
Unknown W. Brackets
4cf0913692 x86jit: Sketch some initial SIMD apis. 2014-11-16 13:33:07 -08:00
Henrik Rydgard
bfcd3690b6 x86 jit: Fix+enable quaternion product, optimize "sw zero, *" 2014-11-16 18:37:38 +01:00
Henrik Rydgard
28ca8d4818 x86 jit: Use LEA to emulate addu but only when it can save a few bytes 2014-11-16 17:39:47 +01:00
Henrik Rydgard
1c78e29c79 x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
Might have missed a few places.
2014-11-16 17:38:26 +01:00
Henrik Rydgard
8b90f881b8 x86 jit: A tiny optimization and a tiny bugfix 2014-11-16 16:46:35 +01:00
Unknown W. Brackets
096b41cceb x86jit: Interleave reg usage in vcmp. 2014-11-10 23:22:04 -08:00
Unknown W. Brackets
0e1aa35e84 x86jit: Just do the ES/NS compare once. 2014-11-10 23:04:38 -08:00
Unknown W. Brackets
2758e8fa3c x86jit: Optimize vcmp for single and simd. 2014-11-10 23:04:37 -08:00
Unknown W. Brackets
86e3739a3e x86jit: Optimize some cases of ins/ext.
They happen but are minor.
2014-11-09 09:22:29 -08:00
Unknown W. Brackets
e05263af32 x86jit: Allow EBX sign extension for 32-bit. 2014-11-09 09:07:52 -08:00
Unknown W. Brackets
8dbd3c3b9c x86jit: Don't lie about ZERO when it's not an imm. 2014-11-09 08:27:02 -08:00
Unknown W. Brackets
d0a2ced2f9 x86jit: Flip cc in slt* to avoid reg loads.
Unfortunately, this zero thing is now concerning me...
2014-11-09 08:15:39 -08:00
Unknown W. Brackets
59f491eddb x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency.  It also avoids
filling registers that are not likely to be used again.

Fixed a small mistake.
2014-11-09 07:23:44 -08:00
Henrik Rydgard
18495a452d Rename an enum 2014-11-09 14:55:23 +01:00
Henrik Rydgard
a19d0b648a x86 jit: Add a simple speedhack (ignore masking stack pointers) but disable due to low impact. 2014-11-09 14:54:39 +01:00
Henrik Rydgard
a528921f3c x86 JIT: EBX was free in 32-bit mode, let's use it in the regcache. 2014-11-09 12:55:17 +01:00
Henrik Rydgard
5888b3bdc4 Revert "x86jit: Micro optimize slt* a bit."
This reverts commit ee66596b8d.

Broke a lot of games, probably some small bug.

Conflicts:
	Core/MIPS/x86/CompALU.cpp
2014-11-09 12:07:21 +01:00
Unknown W. Brackets
313d9e95c7 Clarify a comment. 2014-11-09 01:05:03 -08:00
Unknown W. Brackets
ee66596b8d x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency.  It also avoids
filling registers that are not likely to be used again.
2014-11-08 22:54:03 -08:00
Unknown W. Brackets
27d8108bb2 x86jit: Optimize loads of 0 into fp regs. 2014-11-08 18:41:16 -08:00
Unknown W. Brackets
7d8858687e x86jit: Avoid speculative loads in mtc1/mfc1. 2014-11-08 18:35:15 -08:00
Unknown W. Brackets
57caa95273 x86jit: Implement round.w.s and friends.
They are not terribly fast, though, updating MXCSR.
2014-11-08 17:59:38 -08:00
Unknown W. Brackets
3908e0f445 x86jit: Small optimization for add.s f1, f2, f2.
Doubles the speed of that particular case.  Biggest difference is not
loading fd for no reason.
2014-11-08 17:32:53 -08:00
Unknown W. Brackets
f9893c29ce x86jit: Very small optimization to c.nge.s. 2014-11-08 17:01:02 -08:00
Unknown W. Brackets
78dfe43776 x86jit: Optimize neg.s and abs.s a tiny bit.
Same reg is probably a common case, improves micro benchmark.
2014-11-08 16:50:41 -08:00
Unknown W. Brackets
bed0d0b059 x86jit: Improve cvt.w.s when fd is loaded or fs.
We have no need to store it.
2014-11-08 16:40:54 -08:00
Unknown W. Brackets
1917d946ea x86jit: Micro optimize cvt.s.w a bit.
This implementation is about 5x faster for micro benchmarks.  Little
impact to overall perf in games I tested, though.
2014-11-08 13:30:38 -08:00
Unknown W. Brackets
671dee85c7 x86jit: Micro optimize vi2f a little bit.
This didn't help overall perf much but micro benchmarks are better.
2014-11-08 13:07:01 -08:00
Unknown W. Brackets
c29b126357 x86jit: Oops, can't have an imm here. 2014-11-08 12:41:48 -08:00
Unknown W. Brackets
c0be19edb6 x86jit: Simplify vavg a bit. 2014-11-08 12:40:04 -08:00