Henrik Rydgard
5888b3bdc4
Revert "x86jit: Micro optimize slt* a bit."
This reverts commit ee66596b8d.
Broke a lot of games, probably some small bug.
Conflicts:
Core/MIPS/x86/CompALU.cpp
2014-11-09 12:07:21 +01:00
Unknown W. Brackets
313d9e95c7
Clarify a comment.
2014-11-09 01:05:03 -08:00
Unknown W. Brackets
ee66596b8d
x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency. It also avoids
filling registers that are not likely to be used again.
2014-11-08 22:54:03 -08:00
Unknown W. Brackets
27d8108bb2
x86jit: Optimize loads of 0 into fp regs.
2014-11-08 18:41:16 -08:00
Unknown W. Brackets
7d8858687e
x86jit: Avoid speculative loads in mtc1/mfc1.
2014-11-08 18:35:15 -08:00
Unknown W. Brackets
57caa95273
x86jit: Implement round.w.s and friends.
They are not terribly fast, though, updating MXCSR.
2014-11-08 17:59:38 -08:00
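A rough illustration of why the round.w.s family in 57caa95273 is "not terribly fast": every conversion has to switch the host rounding mode and then put it back. On x86 SSE that mode lives in MXCSR. The sketch below uses the portable `<cfenv>` interface as a stand-in for the MXCSR update. `ConvertWithMode` is a hypothetical helper for illustration, not the code the JIT emits.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not from the commit): convert a float to int32 under
// an explicit rounding mode, then restore the previous mode.
// Assumes the rounded value fits in int32_t.
inline int32_t ConvertWithMode(float value, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    // volatile keeps the compiler from folding the conversion or moving it
    // across the rounding-mode changes.
    volatile float in = value;
    volatile int32_t out = static_cast<int32_t>(std::lrint(in));
    std::fesetround(saved);
    return out;
}
```

Under this sketch, round.w.s would correspond to `FE_TONEAREST`, and the "friends" are presumably trunc.w.s (`FE_TOWARDZERO`), ceil.w.s (`FE_UPWARD`) and floor.w.s (`FE_DOWNWARD`). The two mode switches per call are the cost the commit message refers to.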
Unknown W. Brackets
3908e0f445
x86jit: Small optimization for add.s f1, f2, f2.
Doubles the speed of that particular case. Biggest difference is not
loading fd for no reason.
2014-11-08 17:32:53 -08:00
Unknown W. Brackets
f9893c29ce
x86jit: Very small optimization to c.nge.s.
2014-11-08 17:01:02 -08:00
Unknown W. Brackets
78dfe43776
x86jit: Optimize neg.s and abs.s a tiny bit.
Same reg is probably a common case, improves micro benchmark.
2014-11-08 16:50:41 -08:00
Unknown W. Brackets
bed0d0b059
x86jit: Improve cvt.w.s when fd is loaded or fs.
We have no need to store it.
2014-11-08 16:40:54 -08:00
Unknown W. Brackets
1917d946ea
x86jit: Micro optimize cvt.s.w a bit.
This implementation is about 5x faster for micro benchmarks. Little
impact to overall perf in games I tested, though.
2014-11-08 13:30:38 -08:00
Unknown W. Brackets
671dee85c7
x86jit: Micro optimize vi2f a little bit.
This didn't help overall perf much but micro benchmarks are better.
2014-11-08 13:07:01 -08:00
Unknown W. Brackets
c29b126357
x86jit: Oops, can't have an imm here.
2014-11-08 12:41:48 -08:00
Unknown W. Brackets
c0be19edb6
x86jit: Simplify vavg a bit.
2014-11-08 12:40:04 -08:00
Unknown W. Brackets
761e269e5f
x86jit: Avoid some regcache pollution.
2014-11-08 12:38:08 -08:00
Unknown W. Brackets
bc7497857a
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a
x86jit: Implement vi2x instructions.
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
ddc90ee550
x86jit: Implement vfad and vavg.
2014-11-08 12:13:25 -08:00
Unknown W. Brackets
5ae43defd9
Oops, these should be signed.
2014-11-08 09:39:17 -08:00
Unknown W. Brackets
316e923b40
x86jit: Implement other forms of vx2i.
Gains 3.2% performance in Grand Knights History.
2014-11-08 00:39:40 -08:00
Unknown W. Brackets
097a483d77
x86jit: Micro optimize vs2i a bit.
2014-11-06 22:45:54 -08:00
Unknown W. Brackets
3061e89250
Fix copy/paste mistake.
2014-11-04 01:41:17 -08:00
Unknown W. Brackets
0d36d4e082
Add a helper to reduce duplicate code.
This is not performance critical. I wonder if compilers can inline
closures?
2014-11-03 23:50:23 -08:00
Unknown W. Brackets
16ca2b0155
x86jit: Fix trig vv2ops on 32-bit, arg.
2014-11-03 23:43:18 -08:00
Unknown W. Brackets
3e95763a3f
x86jit: Implement other rounding modes in vf2i.
3% improvement in Grand Knights History. I know other games use these
too.
2014-11-03 23:27:05 -08:00
Unknown W. Brackets
717cf25f0d
x86jit: Use our sincos funcs for VV2Op as well.
Small (0.7%) speedup in Gods Eater Burst. There's probably SSE
approximations we could use instead, but those will also need at least xmm
reg flushing/thunking.
At least this avoids flushing gprs, etc. The sin and cos ops are fairly
common.
2014-11-03 22:13:38 -08:00
Unknown W. Brackets
5bb9d32eaa
jit: Fix partial invalidation of larger blocks.
Fixes #7031.
2014-10-27 19:04:19 -07:00
Unknown W. Brackets
100afc07a2
x86jit: Fix andLink cases of imm blezl, etc.
2014-10-24 08:57:56 -07:00
Unknown W. Brackets
b53f13480a
x86jit: Centralize continuing logic.
2014-10-12 19:01:04 -07:00
Unknown W. Brackets
d98adf27d6
x86jit: Add proxy blocks for continuing.
2014-10-12 17:15:31 -07:00
Unknown W. Brackets
01f9521dc5
jit: Invalidate blocks even if they end unevenly.
This allows blocks to start and end wherever they need, which should be
good for replacements and for continuing.
2014-10-12 17:13:04 -07:00
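The invalidation rule in 01f9521dc5 amounts to an interval-overlap test that makes no assumption about where a block starts or ends. A minimal sketch, assuming a hypothetical `BlockRange` record holding a half-open guest address range (the real block cache structures are not shown in this log):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical block record: covers guest addresses [start, end).
struct BlockRange {
    uint32_t start;
    uint32_t end;
};

// True if any byte of [addr, addr + length) lies inside the block,
// whatever the alignment of the block's start or end.
inline bool Overlaps(const BlockRange &block, uint32_t addr, uint32_t length) {
    // Widen to 64 bits so addr + length cannot wrap around.
    const uint64_t invalidEnd = static_cast<uint64_t>(addr) + length;
    return length != 0 && addr < block.end && invalidEnd > block.start;
}
```

A block is dropped whenever `Overlaps` returns true for the written range, so a write touching only the last instruction of a long block still invalidates it.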
Unknown W. Brackets
90821b761d
x86jit: Pad linked exits with breakpoints.
So that we don't get garbage, and so we see if we end up there.
2014-10-12 16:00:58 -07:00
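The padding in 90821b761d can be pictured as filling the unused tail of an exit stub with the one-byte x86 breakpoint opcode, INT3 (0xCC). The sketch below assumes a fixed-size slot per exit and a byte vector as the code buffer; both `PadWithBreakpoints` and the slot size are hypothetical, not taken from the emitter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t kInt3 = 0xCC;  // x86 one-byte breakpoint opcode

// Hypothetical helper: pad an emitted exit stub out to slotSize bytes so the
// leftover space holds breakpoints rather than stale bytes.
inline void PadWithBreakpoints(std::vector<uint8_t> &code, size_t slotSize) {
    if (code.size() < slotSize)
        code.resize(slotSize, kInt3);
}
```

If execution ever falls into the padding, the breakpoint traps immediately instead of running garbage, which matches the stated goal of seeing "if we end up there".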
Unknown W. Brackets
5fd402222b
x86jit: Use the shorter MDisp() offset for andLink.
2014-10-12 15:18:22 -07:00
Unknown W. Brackets
0f32103615
x86jit: Consistently use mips_.
2014-10-12 15:16:09 -07:00
Henrik Rydgård
afbe50d3b9
Merge pull request #6998 from unknownbrackets/jit-minor2
x86jit: Preload sp and similar regs used often
2014-10-13 00:00:28 +02:00
Unknown W. Brackets
e3a04aa2d2
x86jit: Preload sp and similar regs used often.
This can help us avoid using a temporary.
Very tiny performance improvement.
2014-10-12 14:53:56 -07:00
Unknown W. Brackets
6fae78cd3f
x86jit: Fix a bug in branch continuing.
When we predict it won't take a likely delay slot, we'd lose our register
allocation state.
2014-10-12 12:51:47 -07:00
Unknown W. Brackets
2f598e8f38
jit: Statically jump for fixed branches.
This handles both loops (first step is known) and static branches (some
code uses them instead of jumps, and we disassemble that to "b".)
Not likely to be a big improvement, but might help if the branch predictor
was wrong.
This is as opposed to continuing, which would build a larger jit block.
2014-10-12 12:51:47 -07:00
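The decision described in 2f598e8f38 can be sketched for a single opcode, beq. MIPS assemblers write `b` for a beq whose two source registers are the same (typically `$zero`), and a loop's first iteration is decidable when both operands hold constants the JIT already tracks. `Operand`, `BranchPlan` and `PlanBeq` are hypothetical names for illustration, and the real code presumably handles the other branch opcodes too.

```cpp
#include <cassert>
#include <cstdint>

enum class BranchPlan { Conditional, AlwaysTaken, NeverTaken };

// Hypothetical compile-time view of a source register.
struct Operand {
    int reg;         // MIPS register index
    bool isKnown;    // true if a constant value is tracked for it
    uint32_t value;  // meaningful only when isKnown is true
};

// Decide whether "beq rs, rt" needs a runtime compare at all.
inline BranchPlan PlanBeq(const Operand &rs, const Operand &rt) {
    // Same register on both sides: always equal (this is what "b" encodes).
    if (rs.reg == rt.reg)
        return BranchPlan::AlwaysTaken;
    // Both values known at compile time: resolve the compare now.
    if (rs.isKnown && rt.isKnown)
        return rs.value == rt.value ? BranchPlan::AlwaysTaken
                                    : BranchPlan::NeverTaken;
    return BranchPlan::Conditional;
}
```

For `AlwaysTaken` and `NeverTaken` the JIT can emit a plain jump to the known target and end the block there, rather than continuing into a larger block.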
Unknown W. Brackets
9228ac72da
jit: Reorganize imm branch logic a bit.
2014-10-12 12:51:46 -07:00
Unknown W. Brackets
4d30288601
x86jit: Fix force flush to zero.
2014-10-12 12:51:46 -07:00
Unknown W. Brackets
928e2adfc9
jit: Avoid applying/restoring the rounding mode.
If the game never sets it, we can skip around syscalls, interpreter,
replacements, etc.
2014-10-12 12:51:45 -07:00
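The idea in 928e2adfc9 is a lazy guard: until the game writes a rounding mode, the apply/restore pair around syscalls, interpreter fallbacks and replacements has nothing to do. A minimal sketch, assuming a hypothetical `GuestFpu` state with a "has ever been set" flag and using `<cfenv>` in place of the host-specific register writes; the function names echo the apply/restore wording of the rename commit but are not the project's actual signatures.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical guest FPU state for illustration.
struct GuestFpu {
    bool hasSetRounding = false;     // flips once the game writes a mode
    int roundingMode = FE_TONEAREST; // last mode the game asked for
};

inline void SetGuestRounding(GuestFpu &fpu, int mode) {
    fpu.hasSetRounding = true;
    fpu.roundingMode = mode;
}

// Returns true if the host mode was actually touched.
inline bool ApplyRoundingMode(const GuestFpu &fpu) {
    if (!fpu.hasSetRounding)
        return false;  // game never set it: skip the host update
    std::fesetround(fpu.roundingMode);
    return true;
}

inline bool RestoreRoundingMode(const GuestFpu &fpu) {
    if (!fpu.hasSetRounding)
        return false;  // nothing was applied, so nothing to undo
    std::fesetround(FE_TONEAREST);
    return true;
}
```

Games that never touch the rounding mode then pay only for a flag check on each transition out of and back into jitted code.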
Unknown W. Brackets
8d0dca71fe
jit: Rename the rounding mode funcs to clarify.
They apply/restore the value, set/clear is confusing.
2014-10-12 11:35:20 -07:00
Henrik Rydgard
8177b4c43b
Avoid an ifdef using PTRBITS
2014-10-12 19:35:55 +02:00
Henrik Rydgård
eab010a0c0
x86 JIT: Sacrifice a register for a pointer to the MIPS context. Shrinks emitted x86 code considerably.
Nice in 64-bit, but might be a bit too much in 32-bit though... Needs testing.
2014-10-12 19:35:55 +02:00
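One plausible reading of why eab010a0c0 shrinks the emitted code: an operand of the form [reg + offset] can encode a small offset in a single displacement byte, whereas an absolute address always needs four. The sketch below only models the displacement field of an x86 ModRM memory operand; `DispFieldBytes` is a hypothetical helper, not part of the emitter.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of the displacement field for a [base + offset] operand.
// baseNeedsDisp is true for EBP/RBP and R13, which have no zero-displacement
// encoding and so always carry at least one displacement byte.
inline int DispFieldBytes(int32_t offset, bool baseNeedsDisp) {
    if (offset == 0 && !baseNeedsDisp)
        return 0;  // mod=00: no displacement
    if (offset >= -128 && offset <= 127)
        return 1;  // mod=01: 8-bit signed displacement
    return 4;      // mod=10: 32-bit displacement
}
```

With the context pointer held in a register, any field within -128..127 bytes of it costs one displacement byte instead of a four-byte absolute address, which adds up across every register load and store the JIT emits.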
Henrik Rydgård
f99c2cd010
x86 Jit: Generate nicer code for some cases of addiu
2014-10-12 17:47:53 +02:00
Unknown W. Brackets
4210ba44eb
Clean up a few more ImmPtr() cases.
2014-09-21 08:34:27 -07:00
Unknown W. Brackets
52b6f1095e
armjit: Fix rounding mode, allow non flush-to-zero.
Default: force flush to zero (for RunFast mode.) But now it's an ini
option so we can more easily compare armjit differences.
2014-09-11 07:58:51 -07:00
Andrew Church
3033dc5138
Revert to unconditional ClearRoundingMode() when setting FCR31.
2014-09-04 11:36:56 +09:00
Andrew Church
128122af39
Fix broken rounding mode handling.
2014-09-04 11:30:11 +09:00
Andrew Church
726cb851b9
Don't unconditionally ClearRoundingMode() before setting it.
2014-09-04 09:28:56 +09:00