Unknown W. Brackets
de566be2ce
x86jit: Split out the logic for loading simd regs.
2014-11-16 13:33:15 -08:00
Unknown W. Brackets
5347431c20
x86jit: Initial simd for VecDo3(). Broken.
...
I'm not sure why/where it's broken...
2014-11-16 13:33:15 -08:00
Unknown W. Brackets
aad505e7b3
x86jit: Add a TryMapDirtyInInVS() for 3-op.
2014-11-16 13:33:14 -08:00
Unknown W. Brackets
88a753eff3
x86jit: Add an invariant contract to the fpu cache.
...
This should help catch things better in debug mode.
2014-11-16 13:33:14 -08:00
Unknown W. Brackets
39afeb490f
x86jit: Add some typesafety.
2014-11-16 13:33:13 -08:00
Unknown W. Brackets
4335bf3346
x86jit: Add basic mapping of SIMD regs.
...
Not tested yet, just sketched out. All very suboptimal.
2014-11-16 13:33:13 -08:00
Unknown W. Brackets
9429359b47
x86jit: Add fallbacks when moving from VS -> V.
2014-11-16 13:33:12 -08:00
Unknown W. Brackets
2862367927
x86jit: Add force-non-simd to all current ops.
...
Unless they already use MapRegs, because that will automatically handle
it.
2014-11-16 13:33:12 -08:00
Unknown W. Brackets
4cf0913692
x86jit: Sketch some initial SIMD apis.
2014-11-16 13:33:07 -08:00
Henrik Rydgard
e43c7af32c
ARM Jit: Implement quaternion multiplication
2014-11-16 19:12:00 +01:00
Henrik Rydgard
bfcd3690b6
x86 jit: Fix+enable quaternion product, optimize "sw zero, *"
2014-11-16 18:37:38 +01:00
Henrik Rydgard
28ca8d4818
x86 jit: Use LEA to emulate addu but only when it can save a few bytes
2014-11-16 17:39:47 +01:00
Henrik Rydgard
1c78e29c79
x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
...
Might have missed a few places.
2014-11-16 17:38:26 +01:00
Henrik Rydgard
8b90f881b8
x86 jit: A tiny optimization and a tiny bugfix
2014-11-16 16:46:35 +01:00
xSacha
57e4088216
Introduce fake vertex decoder JIT as well.
...
Compiles and links on CI20 but gets unknown crash in GL driver.
2014-11-13 17:10:29 +10:00
Sacha
c421617c84
Fix Qt build by building Arm disassembler for all platforms.
2014-11-13 00:55:00 +10:00
Sacha
a0086f6412
Introduce a Fake JIT for generic builds.
2014-11-13 00:09:51 +10:00
Kingcom
479c8646a2
Change vpfxs/r/t disassembly syntax
2014-11-12 00:09:57 +01:00
Unknown W. Brackets
096b41cceb
x86jit: Interleave reg usage in vcmp.
2014-11-10 23:22:04 -08:00
Unknown W. Brackets
0e1aa35e84
x86jit: Just do the ES/NS compare once.
2014-11-10 23:04:38 -08:00
Unknown W. Brackets
2758e8fa3c
x86jit: Optimize vcmp for single and simd.
2014-11-10 23:04:37 -08:00
Unknown W. Brackets
94e29da6c4
Fix a typo in the mips assembler.
...
Oops, this should be a unique value of course.
2014-11-10 09:13:11 -08:00
Unknown W. Brackets
01c2b88911
Avoid signed ints, seems to cause clang errors.
2014-11-09 16:49:24 -08:00
Henrik Rydgård
7fbe8ba898
Merge pull request #7076 from unknownbrackets/debugger
...
Add VFPU instructions to the mips asm tables
2014-11-10 00:57:08 +01:00
Unknown W. Brackets
370fb86379
Add VFPU instructions to the mips asm tables.
2014-11-09 15:14:07 -08:00
Unknown W. Brackets
86e3739a3e
x86jit: Optimize some cases of ins/ext.
...
They happen but are minor.
2014-11-09 09:22:29 -08:00
Unknown W. Brackets
e05263af32
x86jit: Allow EBX sign extension for 32-bit.
2014-11-09 09:07:52 -08:00
Unknown W. Brackets
8dbd3c3b9c
x86jit: Don't lie about ZERO when it's not an imm.
2014-11-09 08:27:02 -08:00
Unknown W. Brackets
d0a2ced2f9
x86jit: Flip cc in slt* to avoid reg loads.
...
Unfortunately, this zero thing is now concerning me...
2014-11-09 08:15:39 -08:00
Unknown W. Brackets
59f491eddb
x86jit: Micro optimize slt* a bit.
...
This improves their performance and hopefully latency. It also avoids
filling registers that are not likely to be used again.
Fixed a small mistake.
2014-11-09 07:23:44 -08:00
Henrik Rydgard
18495a452d
Rename an enum
2014-11-09 14:55:23 +01:00
Henrik Rydgard
a19d0b648a
x86 jit: Add a simple speedhack (ignore masking stack pointers) but disable due to low impact.
2014-11-09 14:54:39 +01:00
Henrik Rydgard
a528921f3c
x86 JIT: EBX was free in 32-bit mode, let's use it in the regcache.
2014-11-09 12:55:17 +01:00
Henrik Rydgard
db853d8513
Collapse sequences of "int3" (padding after block linking) in x86 disassembly.
2014-11-09 12:10:37 +01:00
Henrik Rydgard
5888b3bdc4
Revert "x86jit: Micro optimize slt* a bit."
...
This reverts commit ee66596b8d.
Broke a lot of games, probably some small bug.
Conflicts:
Core/MIPS/x86/CompALU.cpp
2014-11-09 12:07:21 +01:00
Henrik Rydgard
5bcdecc26b
unittest: Have the JIT harness print disassembly, to make it easy to inspect results.
2014-11-09 12:03:04 +01:00
Unknown W. Brackets
313d9e95c7
Clarify a comment.
2014-11-09 01:05:03 -08:00
Unknown W. Brackets
ee66596b8d
x86jit: Micro optimize slt* a bit.
...
This improves their performance and hopefully latency. It also avoids
filling registers that are not likely to be used again.
2014-11-08 22:54:03 -08:00
Unknown W. Brackets
27d8108bb2
x86jit: Optimize loads of 0 into fp regs.
2014-11-08 18:41:16 -08:00
Unknown W. Brackets
7d8858687e
x86jit: Avoid speculative loads in mtc1/mfc1.
2014-11-08 18:35:15 -08:00
Unknown W. Brackets
57caa95273
x86jit: Implement round.w.s and friends.
...
They are not terribly fast, though, updating MXCSR.
2014-11-08 17:59:38 -08:00
Unknown W. Brackets
3908e0f445
x86jit: Small optimization for add.s f1, f2, f2.
...
Doubles the speed of that particular case. Biggest difference is not
loading fd for no reason.
2014-11-08 17:32:53 -08:00
Unknown W. Brackets
f9893c29ce
x86jit: Very small optimization to c.nge.s.
2014-11-08 17:01:02 -08:00
Unknown W. Brackets
78dfe43776
x86jit: Optimize neg.s and abs.s a tiny bit.
...
Same reg is probably a common case, improves micro benchmark.
2014-11-08 16:50:41 -08:00
Unknown W. Brackets
bed0d0b059
x86jit: Improve cvt.w.s when fd is loaded or fs.
...
We have no need to store it.
2014-11-08 16:40:54 -08:00
Unknown W. Brackets
1917d946ea
x86jit: Micro optimize cvt.s.w a bit.
...
This implementation is about 5x faster for micro benchmarks. Little
impact to overall perf in games I tested, though.
2014-11-08 13:30:38 -08:00
Unknown W. Brackets
671dee85c7
x86jit: Micro optimize vi2f a little bit.
...
This didn't help overall perf much but micro benchmarks are better.
2014-11-08 13:07:01 -08:00
Unknown W. Brackets
c29b126357
x86jit: Oops, can't have an imm here.
2014-11-08 12:41:48 -08:00
Unknown W. Brackets
c0be19edb6
x86jit: Simplify vavg a bit.
2014-11-08 12:40:04 -08:00
Unknown W. Brackets
761e269e5f
x86jit: Avoid some regcache pollution.
2014-11-08 12:38:08 -08:00
Unknown W. Brackets
bc7497857a
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
...
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a
x86jit: Implement vi2x instructions.
...
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
ddc90ee550
x86jit: Implement vfad and vavg.
2014-11-08 12:13:25 -08:00
Unknown W. Brackets
5ae43defd9
Oops, these should be signed.
2014-11-08 09:39:17 -08:00
Unknown W. Brackets
316e923b40
x86jit: Implement other forms of vx2i.
...
Gains 3.2% performance in Grand Knights History.
2014-11-08 00:39:40 -08:00
Unknown W. Brackets
097a483d77
x86jit: Micro optimize vs2i a bit.
2014-11-06 22:45:54 -08:00
Unknown W. Brackets
7819b97c9a
iOS buildfix.
2014-11-04 08:32:43 -08:00
TwistedUmbrella
6797044476
Correct a namespace typo
2014-11-04 05:10:22 -05:00
Unknown W. Brackets
3061e89250
Fix copy/paste mistake.
2014-11-04 01:41:17 -08:00
Unknown W. Brackets
0d36d4e082
Add a helper to reduce duplicate code.
...
This is not performance critical. I wonder if compilers can inline
closures?
2014-11-03 23:50:23 -08:00
Unknown W. Brackets
16ca2b0155
x86jit: Fix trig vv2ops on 32-bit, arg.
2014-11-03 23:43:18 -08:00
Unknown W. Brackets
3e95763a3f
x86jit: Implement other rounding modes in vf2i.
...
3% improvement in Grand Knights History. I know other games use these
too.
2014-11-03 23:27:05 -08:00
Unknown W. Brackets
717cf25f0d
x86jit: Use our sincos funcs for VV2Op as well.
...
Small (0.7%) speedup in Gods Eater Burst. There's probably SSE
approximations we could use instead, but those will also need at least xmm
reg flushing/thunking.
At least this avoids flushing gprs, etc. The sin and cos ops are fairly
common.
2014-11-03 22:13:38 -08:00
Unknown W. Brackets
014445655d
Actually clear the hash->func map on forget.
...
Better not to have dangling pointers.
2014-11-03 13:49:45 -08:00
Unknown W. Brackets
ad6b176e11
Naturally, modern C++ would not build on Symbian.
2014-11-03 08:56:45 -08:00
Unknown W. Brackets
61c21340fb
Warning fixes.
2014-11-03 08:34:34 -08:00
Unknown W. Brackets
9d86d3ca9b
Use std::unordered_multimaps in a few places.
2014-11-03 08:31:52 -08:00
Unknown W. Brackets
67a7205bdd
Switch to a multimap for the hash->function map.
2014-11-03 07:59:56 -08:00
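A rough sketch of why a multimap suits a hash->function map: more than one function can share a hash, and equal_range() returns all of them in one lookup. The alias HashToFunc, the helper LookupAll, and the use of names as values are illustrative assumptions, not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical map type: function hash -> function name.
// Several functions may share one hash, hence a multimap.
using HashToFunc = std::unordered_multimap<uint64_t, std::string>;

// Return every name registered under the given hash (possibly none).
inline std::vector<std::string> LookupAll(const HashToFunc &map, uint64_t hash) {
    std::vector<std::string> names;
    auto range = map.equal_range(hash);
    for (auto it = range.first; it != range.second; ++it)
        names.push_back(it->second);
    return names;
}
```

Calling clear() on the map when functions are forgotten drops every entry at once, so no stale entries remain.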
Unknown W. Brackets
56322bdad4
Improve performance of ForgetFunctions().
...
Mostly matters during shutdown, but also module unload.
2014-11-02 17:32:04 -08:00
Unknown W. Brackets
258b7c9a7c
jit: Use the end just to be safe.
...
In case clearing near the end of a block.
2014-10-27 19:05:52 -07:00
Unknown W. Brackets
5bb9d32eaa
jit: Fix partial invalidation of larger blocks.
...
Fixes #7031.
2014-10-27 19:04:19 -07:00
Unknown W. Brackets
100afc07a2
x86jit: Fix andLink cases of imm blezl, etc.
2014-10-24 08:57:56 -07:00
Unknown W. Brackets
0c1dcfeacf
Avoid comparing invalidated iterators.
2014-10-22 00:50:39 -07:00
Unknown W. Brackets
65ecc9a464
jit: Use an exclusive end in the block map.
...
Simpler, was not consistent before, oops.
2014-10-21 11:52:19 -07:00
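To illustrate the exclusive-end convention, here is a minimal sketch of an overlap test on half-open ranges [start, end). BlockRange, Overlaps, and FindInvalidated are hypothetical names for illustration, not the jit's actual types, and the sketch assumes addr + length does not wrap around.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical block record for illustration only.
struct BlockRange {
    uint32_t start;  // address of the first byte of the block
    uint32_t end;    // one past the last byte (exclusive)
    int id;
};

// Two half-open ranges [aStart, aEnd) and [bStart, bEnd) overlap
// exactly when each one starts before the other one ends.
inline bool Overlaps(uint32_t aStart, uint32_t aEnd, uint32_t bStart, uint32_t bEnd) {
    return aStart < bEnd && bStart < aEnd;
}

// Collect the ids of all blocks touched by an invalidation of
// [addr, addr + length). Assumes addr + length does not overflow.
inline std::vector<int> FindInvalidated(const std::vector<BlockRange> &blocks,
                                        uint32_t addr, uint32_t length) {
    const uint32_t end = addr + length;
    assert(end >= addr);
    std::vector<int> hit;
    for (const BlockRange &b : blocks) {
        if (Overlaps(b.start, b.end, addr, end))
            hit.push_back(b.id);
    }
    return hit;
}
```

With an exclusive end, two adjacent blocks share a boundary value without overlapping, so no +1/-1 adjustment is needed at either side of the comparison.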
BlackDog
f7e8ca486c
add lwc1 lwc2 swc1 swc2 opcodes
2014-10-19 20:42:12 +02:00
Unknown W. Brackets
ef6d583542
x86jit: Oops, don't pad INT3s in prelinked blocks.
...
Fixes #7007.
2014-10-15 22:07:56 -07:00
Unknown W. Brackets
b53f13480a
x86jit: Centralize continuing logic.
2014-10-12 19:01:04 -07:00
Unknown W. Brackets
040a6d1745
jit: Improve performance of clearing jit.
2014-10-12 19:00:03 -07:00
Unknown W. Brackets
e6373aaed9
jit: Remove from the block map more carefully.
2014-10-12 17:47:07 -07:00
Unknown W. Brackets
2e81a38892
jit: Fix a possible infinite loop in invalidation.
2014-10-12 17:46:54 -07:00
Unknown W. Brackets
4853a1b7a0
jit: Optimize proxy block lookup from address.
...
It was really slow before with enough proxy blocks.
2014-10-12 17:35:23 -07:00
Unknown W. Brackets
1064f580e4
armjit: Add proxy blocks for continuing.
2014-10-12 17:20:26 -07:00
Unknown W. Brackets
d98adf27d6
x86jit: Add proxy blocks for continuing.
2014-10-12 17:15:31 -07:00
Unknown W. Brackets
01f9521dc5
jit: Invalidate blocks even if they end unevenly.
...
This allows blocks to start and end wherever they need, which should be
good for replacements and for continuing.
2014-10-12 17:13:04 -07:00
Unknown W. Brackets
90821b761d
x86jit: Pad linked exits with breakpoints.
...
So that we don't get garbage, and so we see if we end up there.
2014-10-12 16:00:58 -07:00
Unknown W. Brackets
5fd402222b
x86jit: Use the shorter MDisp() offset for andLink.
2014-10-12 15:18:22 -07:00
Unknown W. Brackets
0f32103615
x86jit: Consistently use mips_.
2014-10-12 15:16:09 -07:00
Henrik Rydgård
afbe50d3b9
Merge pull request #6998 from unknownbrackets/jit-minor2
...
x86jit: Preload sp and similar regs used often
2014-10-13 00:00:28 +02:00
Unknown W. Brackets
e3a04aa2d2
x86jit: Preload sp and similar regs used often.
...
This can help us avoid using a temporary.
Very tiny performance improvement.
2014-10-12 14:53:56 -07:00
Unknown W. Brackets
6fae78cd3f
x86jit: Fix a bug in branch continuing.
...
When we predict it won't take a likely delay slot, we'd lose our register
allocation state.
2014-10-12 12:51:47 -07:00
Unknown W. Brackets
2f598e8f38
jit: Statically jump for fixed branches.
...
This handles both loops (first step is known) and static branches (some
code uses them instead of jumps, and we disassemble that to "b".)
Not likely to be a big improvement, but might help if the branch predictor
was wrong.
This is as opposed to continuing, which would build a larger jit block.
2014-10-12 12:51:47 -07:00
Unknown W. Brackets
9228ac72da
jit: Reorganize imm branch logic a bit.
2014-10-12 12:51:46 -07:00
Unknown W. Brackets
4d30288601
x86jit: Fix force flush to zero.
2014-10-12 12:51:46 -07:00
Unknown W. Brackets
928e2adfc9
jit: Avoid applying/restoring the rounding mode.
...
If the game never sets it, we can skip around syscalls, interpreter,
replacements, etc.
2014-10-12 12:51:45 -07:00
Unknown W. Brackets
8d0dca71fe
jit: Rename the rounding mode funcs to clarify.
...
They apply/restore the value, set/clear is confusing.
2014-10-12 11:35:20 -07:00
Henrik Rydgård
6cb2c9c97d
Merge pull request #6989 from hrydgard/x86-emitter-merge
...
Merge from Dolphin's x86-64 emitter
2014-10-12 19:52:59 +02:00
Henrik Rydgård
3b1476c8ec
MIPSTables: Annotate fp and hi/lo in/out more accurately than just "other"
...
Some typo fixes
2014-10-12 19:46:50 +02:00
Henrik Rydgard
8177b4c43b
Avoid an ifdef using PTRBITS
2014-10-12 19:35:55 +02:00
Henrik Rydgård
eab010a0c0
x86 JIT: Sacrifice a register for a pointer to the MIPS context. Shrinks emitted x86 code considerably.
...
Nice in 64-bit, but might be a bit too much in 32-bit though... Needs testing.
2014-10-12 19:35:55 +02:00
Henrik Rydgård
f99c2cd010
x86 Jit: Generate nicer code for some cases of addiu
2014-10-12 17:47:53 +02:00
Henrik Rydgård
91966824bb
minor cleanup: No point in having special functions for ReadFCR/WriteFCR, they're smaller than many other ops..
2014-10-11 15:57:36 +02:00
Henrik Rydgård
2feae8d98e
Merge pull request #6978 from daniel229/replace_danganronpa
...
Replace frame download in danganronpa 1&2
2014-10-07 00:46:19 +02:00
Henrik Rydgård
c9a21ab44d
Add T2 and T3 to our register enum for clarity
2014-10-05 14:20:30 +02:00
daniel229
5ff098efb9
Another replace frame download in danganronpa 1
2014-10-05 13:46:47 +08:00
daniel229
ef1484da65
Replace frame download in danganronpa 1
2014-10-05 13:44:39 +08:00
daniel229
a7cf3aeafc
Another replace frame download in danganronpa 2
2014-10-05 13:42:03 +08:00
daniel229
d7927009d0
Replace frame download in danganronpa 2
2014-10-05 13:39:15 +08:00
daniel229
aad301a97a
Replace download frame in Boku no Natsuyasumi 2 and 4
2014-09-27 14:00:37 +08:00
daniel229
4de7e89330
Replace download frame in Sora no kiseki SC, and a comment for Sora no kiseki 3rd
2014-09-26 17:13:01 +08:00
daniel229
ea9a0182e4
Replace download frame in Sora no kiseki FC
2014-09-26 16:55:37 +08:00
Henrik Rydgård
9aaaf3a835
Merge pull request #6922 from daniel229/a_few_func_replacement
...
A few functions replace in some games
2014-09-21 23:13:02 +02:00
Unknown W. Brackets
4210ba44eb
Clean up a few more ImmPtr() cases.
2014-09-21 08:34:27 -07:00
daniel229
d452d02aba
Update replace function for Toaru Majutsu to Kagaku no Ensemble.
2014-09-21 10:58:56 +08:00
daniel229
1e2abba2b1
Replace function in Toradora! Portable
2014-09-18 15:47:52 +08:00
daniel229
efa35294ce
Replace function in Hentai Ouji To Warawanai Neko
2014-09-18 15:45:58 +08:00
daniel229
4d6e0f6d31
Another function replace in Kirameki School Life
2014-09-18 15:41:39 +08:00
daniel229
a657da91c3
Replace download frame in Toaru Majutsu to Kagaku no Ensemble
2014-09-18 15:37:12 +08:00
daniel229
48b774d143
Replace download frame in Rezel Cross
2014-09-18 15:29:59 +08:00
daniel229
2dd3ab6c20
Replace frame download in suikoden1&2
2014-09-15 00:19:22 +08:00
Henrik Rydgard
f84ebf6bff
sprintf->snprintf, fix some too short buffers
2014-09-14 00:14:11 +02:00
Unknown W. Brackets
a892779f89
Merge pull request #6880 from daniel229/func_replace_sakurasou
...
Replace frame download in Sakurasou No Pet Na Kanojo
Conflicts:
Core/HLE/ReplaceTables.cpp
2014-09-12 22:28:27 -07:00
Unknown W. Brackets
52b6f1095e
armjit: Fix rounding mode, allow non flush-to-zero.
...
Default: force flush to zero (for RunFast mode.) But now it's an ini
option so we can more easily compare armjit differences.
2014-09-11 07:58:51 -07:00
daniel229
7c1d4234ab
Replace frame download in Sakurasou No Pet Na Kanojo
2014-09-11 15:32:07 +08:00
daniel229
22fa431f89
Replace frame download in Ore no Imouto ga Konnani Kawaii Wake ga Nai
2014-09-11 14:34:17 +08:00
daniel229
81ec625d0a
Update the hook for Kirameki School Life SP
2014-09-11 12:01:58 +08:00
daniel229
e8efab6d21
Change function name
2014-09-11 00:47:06 +08:00
daniel229
7f6f52a904
Fixes saveicons in Kirameki School Life SP.
2014-09-11 00:18:59 +08:00
daniel229
b1d9461779
Replace frame download in Narisokonai Eiyuutan
2014-09-09 14:33:15 +08:00
Unknown W. Brackets
420eb1bed3
Replace frame download in SD Gundam G Generation.
2014-09-08 22:44:27 -07:00
Unknown W. Brackets
ee6960ff7a
Hook the saveicon creation func in Growlanser IV.
2014-09-08 19:10:46 -07:00
daniel229
202f987e9b
Replace function for Zero no Kiseki and Ao no Kiseki
2014-09-05 00:52:04 +08:00
Andrew Church
33264a6b8f
Hook Brandish frame capture for menu fadeout and save screenshots.
2014-09-04 17:36:56 +09:00
Andrew Church
3033dc5138
Revert to unconditional ClearRoundingMode() when setting FCR31.
2014-09-04 11:36:56 +09:00
Andrew Church
128122af39
Fix broken rounding mode handling.
2014-09-04 11:30:11 +09:00
Andrew Church
726cb851b9
Don't unconditionally ClearRoundingMode() before setting it.
2014-09-04 09:28:56 +09:00
Andrew Church
5816685668
Handle the FS (flush-to-zero) bit in FCR31 for x86 JIT.
2014-09-04 01:50:24 +09:00
Unknown W. Brackets
4a1514730f
x86jit/ppcjit: Correct some bad sltiu compares.
2014-09-02 08:04:22 -07:00
Unknown W. Brackets
c9df66a450
Initialize the VFPU revision from a PSP-3000 value.
2014-09-01 23:16:50 -07:00
Unknown W. Brackets
4459b8f483
jit: Actually jit vmtfc/vmfvc.
...
Since we have them and they are easy.
2014-09-01 23:13:39 -07:00
Unknown W. Brackets
fd1b01b573
Fix the vrndi.s output range.
...
Was previously outputting only valid positive float values, but should use
a much wider range of a u32.
Might've affected randomness in some games.
2014-09-01 22:33:01 -07:00
Unknown W. Brackets
5f6f6827b5
jit: Update rounding mode immediately on ctc1.
2014-08-30 23:48:27 -07:00
Unknown W. Brackets
e8cdbcc33f
x86jit: Fix some flags/EAX trashing in rounding.
...
Fixes #6810.
2014-08-30 16:46:43 -07:00
Unknown W. Brackets
8daff0a25e
armjit: Fix some downcount issues with rounding.
...
When setting the rounding mode we need to be super careful about not
destroying flags or R0 if they are needed.
2014-08-30 16:43:13 -07:00
Unknown W. Brackets
820a8e8f2b
armjit: Don't reset downcount on fpu instructions.
...
It's maintained always, oops.
2014-08-30 16:30:13 -07:00
Unknown W. Brackets
d4ec7d8019
Add another memcpy variant.
...
Fixes #4324 (Marvel Ultimate Alliance 2 videos), thanks daniel229.
2014-08-23 08:45:25 -07:00
Henrik Rydgård
5d836bfa5a
Merge pull request #6765 from hrydgard/thin3d
...
Switch UI drawing from GL to Thin3D. This activates the D3D9 path as well.
2014-08-23 10:52:21 +02:00
Henrik Rydgård
b7da82eebb
Merge pull request #6762 from unknownbrackets/fpu-rounding
...
Handle fpu rounding mode at least in jits
2014-08-23 10:43:22 +02:00
Unknown W. Brackets
e9b5e6f277
armjit: Maintain rounding mode throughout jit.
2014-08-22 19:57:50 -07:00
Henrik Rydgard
808f05da89
(Partially) slip thin3d underneath DrawBuffer.
2014-08-22 20:54:53 +02:00
Unknown W. Brackets
925557ed47
x86jit: Maintain the rounding mode always.
...
This should be less often than doing it per block that uses fpu, unless
the game doesn't use fpu much at all.
2014-08-22 09:53:00 -07:00
Henrik Rydgård
f8a4236d58
Merge pull request #6679 from unknownbrackets/gpu-blend
...
Handle doubled alpha blending using premultiplication
2014-08-22 10:06:24 +02:00
Unknown W. Brackets
1fcbb7bbd4
armjit: Respect the rounding mode for mul/etc.
2014-08-22 00:32:01 -07:00
Unknown W. Brackets
ab13b36484
x86jit: Implement cvt.w.s.
...
Not really used that often, anyway, but easy enough and good for testing
that we set the rounding mode correctly.
2014-08-22 00:01:06 -07:00
Unknown W. Brackets
dc91dc1ce8
x86jit: Support fpu rounding modes for mul, etc.
...
Fixes Gods Eater Burst loading PSP savedata, but can no longer load old
savedata.
2014-08-21 23:59:55 -07:00
Sacha
97e93f48fd
Clean up LitPool code and re-enable flushing in AsmJit
2014-08-20 18:29:37 +10:00
Unknown W. Brackets
d52fdafa3c
Note the location of a memset variant.
2014-08-18 23:20:44 -07:00
Unknown W. Brackets
9d3cf346c3
Clarify GetSureBranchTarget() for fpu branches.
...
They also have CONDTYPE_ flags. Looks like this was just getting lucky
that rs can't equal rt, but the code looks confusing when you're looking
at it from an fpu/vfpu perspective.
2014-08-18 07:46:48 -07:00
Unknown W. Brackets
78296d15c6
Don't recurse when disasming an emuhack.
...
Although, should this happen? Apparently does in Peace Walker.
2014-08-17 18:43:59 -07:00
Henrik Rydgård
60eaefa6ad
Merge pull request #6680 from unknownbrackets/replace-funcs
...
Disable most replacements and use checked mem access in them
2014-08-04 23:44:20 +02:00
Unknown W. Brackets
ac94dbcc69
Show the replaced instruction in disassembly.
...
Useful while debugging.
2014-08-03 19:23:29 -07:00
Unknown W. Brackets
abd1f4e58a
Disassemble bne/etc. using rs, rt order.
...
The order makes more logical sense from game disassembly, and matches the
assembler. Fixes #6632.
2014-08-03 19:22:18 -07:00
Unknown W. Brackets
245a2a3be0
Don't zero out downcount in replacements.
...
It doesn't write out js.downcountAmount in any of these cases, so zeroing
it is wrong.
2014-08-03 13:22:30 -07:00
Unknown W. Brackets
d060a06fa6
Disable a bunch of function replacements.
...
These are just for speed, let's turn them off. Using a flag because:
* I think there's still some issue with savestates, not sure.
* We might swap this flag to a separate option.
2014-08-03 13:15:41 -07:00
Henrik Rydgard
82421f4dcf
x86 jit: Further fix for nor, thanks unknown
...
See #6638
2014-07-27 22:26:35 +02:00
Henrik Rydgard
903ddbc513
x86 JIT: Fix bug where NOR would not get computed correctly in corner case
...
(CompTriArith can end up not actually mapping rd to a register when taking
a shortcut)
May fix the JIT issue mentioned by CPkmn and located by daniel229 as an aside in #6638
2014-07-27 21:41:41 +02:00
Sacha
6ce3765b12
Sailfish: More compatibility with SailFish OS. It also needs stddef where Maemo does.
...
Set packaging by default for iOS with b.sh.
2014-07-24 23:20:09 +10:00
Unknown W. Brackets
6aa9b8aa36
Limit stack walk distance a bit.
...
It was spending 0.5s in debug scanning all of memory for an entry (due to
some fp/sp tricks that aren't well detected yet.) Let's just assume a 1MB
func doesn't need to be walked properly.
2014-07-20 21:52:55 -07:00
Unknown W. Brackets
b03460c169
Improve function range detection.
...
This improves a pattern like this:
    j endOfLoop;
    li v0, 0;
startOfLoop:
    addiu v0, v0, 1
endOfLoop:
    bne v0, a0, startOfLoop;
    nop
    jr ra
    nop
Where it jumps to the end of the loop, which only jumps back to the top of
the loop. This might misdetect a few cases of tail recursion, but only
when the funcs are right next to each other.
Also, stops scanning at a jr ra, which was causing funcs to be incorrectly
long in cases.
2014-07-20 14:42:20 -07:00
Henrik Rydgård
0e5679c833
Revert "Detect Peace Walker's anti-cheat hash func"
2014-07-20 23:39:11 +02:00
Henrik Rydgård
06f058de54
Merge pull request #6506 from unknownbrackets/replace-funcs
...
Detect Peace Walker's anti-cheat hash func
2014-07-18 09:34:46 +02:00
Unknown W. Brackets
a59d8b5c1f
Override the codehashing func used in Peace Walker.
...
This makes the demo work fine even with jit enabled. May help the full
game when fighting a certain boss.
2014-07-18 00:23:26 -07:00
Unknown W. Brackets
1fd6214945
Improve function range detection.
...
This improves a pattern like this:
    j endOfLoop;
    li v0, 0;
startOfLoop:
    addiu v0, v0, 1
endOfLoop:
    bne v0, a0, startOfLoop;
    nop
    jr ra
    nop
Where it jumps to the end of the loop, which only jumps back to the top of
the loop. This might misdetect a few cases of tail recursion, but only
when the funcs are right next to each other.
Also, stops scanning at a jr ra, which was causing funcs to be incorrectly
long in cases.
2014-07-18 00:22:19 -07:00
Sacha
6957808b97
ArmJit: Optimisation when comparing float against 0.0f
2014-07-17 05:12:43 +10:00
Sacha
d4c983d9e1
Android: ARMv5 fix
2014-07-17 02:34:22 +10:00
Unknown W. Brackets
292a9ea567
Clear module text and bss on unload.
...
Text is set to break instructions, data/bss to -1. Matches results on a
PSP.
2014-07-13 22:00:32 -07:00
Unknown W. Brackets
81096f6bd0
Hook Dissidia's avi record func.
...
This makes it so replays can be recorded. Though you could probably just
record separately anyway.
2014-07-13 08:36:34 -07:00
Unknown W. Brackets
09b9d2ad81
Keep track of ranges that have emuhack ops.
...
So that we can invalidate them smarter.
2014-07-05 16:25:16 -07:00
Unknown W. Brackets
d2e7dfcc51
Minor logging improvement.
2014-07-05 13:19:53 -07:00
Unknown W. Brackets
4b9229a5ba
x86jit: Flush the PC before r/w in debug.
...
This way we get better log output.
2014-07-05 12:57:44 -07:00
Unknown W. Brackets
c0e6f26bb5
Fix startDefaultPrefix tripping.
...
We want the regs already initialized when we set this up.
2014-06-30 08:10:14 -07:00
Henrik Rydgård
bfffe33438
Merge pull request #6469 from unknownbrackets/logging
...
Enforce semicolons at the end of log lines
2014-06-30 11:44:02 +02:00
Unknown W. Brackets
433f4eb00a
Use the ARM rounding mode flag for conversions.
...
It's at least much simpler. Not sure if faster. Handles NAN correctly.
2014-06-29 20:36:00 -07:00
Unknown W. Brackets
f339f7d539
armjit: Handle NAN correctly in float conversion.
2014-06-29 20:05:59 -07:00
Unknown W. Brackets
c168db5943
armjit: Fix really bad typo in cvt.w.s.
2014-06-29 19:43:17 -07:00
Unknown W. Brackets
0078faef8b
Fix some log semicolons that might affect logic.
...
But, these should all be right.
2014-06-29 19:09:38 -07:00
Unknown W. Brackets
5db79dcf11
Fix some missing semicolons on log statements.
2014-06-29 19:09:37 -07:00
Unknown W. Brackets
252100aee5
Remove outdated comment (real cause found/fixed.)
2014-06-28 16:06:10 -07:00
Unknown W. Brackets
f008bebab4
armjit: Fix floor/ceil/cvt.w.s rounding.
...
Unfortunately, correctly rounding is probably slower.
2014-06-28 00:38:57 -07:00
Unknown W. Brackets
f544a87b2f
jit: Initialize startDefaultPrefix when switching.
2014-06-28 00:38:56 -07:00
Unknown W. Brackets
27870aa593
x86jit: Map HI/LO as registers.
...
Not actually ever cached, but now it's all consistent.
2014-06-28 00:38:56 -07:00
Unknown W. Brackets
bc3d789c8a
x86jit: Cache the vfpu compare flags in a reg.
...
Again, to match armjit.
2014-06-28 00:38:55 -07:00
Unknown W. Brackets
acad2e1763
x86jit: Cache fpcond in a register.
...
Mostly to match armjit.
2014-06-28 00:38:55 -07:00
Unknown W. Brackets
0da972c548
Hook the FF1 battle effect func.
...
So that we can download the framebuffer. At least, it seems like that's
what this function is doing.
2014-06-26 01:38:22 -07:00
Unknown W. Brackets
24d8a34a0b
Properly respect resolveReplacements.
...
And use the same opcode reading func in armjit as x86jit.
Fixes Star Ocean on Android.
2014-06-23 08:20:38 -07:00
Unknown W. Brackets
ec94498342
When scanning or relocating, check replacements.
...
Just to make sure we don't wrongly detect the length or unresolve a var
wrong etc.
2014-06-23 08:18:56 -07:00
Unknown W. Brackets
c1e293fe7c
Fix a warning on 32-bit that might be bad...
2014-06-22 22:17:48 -07:00
Unknown W. Brackets
8851fc1685
Remove savedIdRegister/MIPS_CALL_ID.
...
We've never trusted it anyway, simpler without dealing with this stuff.
2014-06-22 11:29:47 -07:00
Unknown W. Brackets
d7e2c2c1d2
armjit: Oops, correctly handle plus/minus vmin/max.
2014-06-21 07:45:47 -07:00
Unknown W. Brackets
95f5d9397c
Correct overflow in trunc.w.s for interpreter.
...
Reported in #4786.
2014-06-21 00:53:33 -07:00
Unknown W. Brackets
62daf6d7c8
armjit: Fix vmin/vmax to follow the PSP's rules.
...
Also the interpreter. Fixes #6107.
2014-06-20 23:55:33 -07:00
Unknown W. Brackets
e87e1606c5
Fix func replacements and delay slots on arm.
...
Fixes #6303.
2014-06-19 08:02:49 -07:00
Unknown W. Brackets
9efbc2694b
Add an invalidate all method to the jit.
2014-06-19 01:13:06 -07:00
Unknown W. Brackets
561d0e5ef9
Check more ops for changing memory in debugger.
2014-06-19 00:48:33 -07:00
Henrik Rydgard
ee1d16cb1d
Use sincosf where available (linux)
2014-06-15 12:06:02 +02:00
Henrik Rydgard
e6f55bfef0
Fix silly mistake in vfpu_sincos. Add unittest.
2014-06-15 11:51:30 +02:00
Henrik Rydgard
0879d76503
VFPU: Ensure that sin(4*x) returns 0.0 (and cos 1) for all x. Fixes #2921
2014-06-15 11:03:00 +02:00
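One way to get exact results at whole turns is to reduce the angle before converting it to radians. The sketch below assumes the VFPU angle is measured in quarter turns (so 4.0 is a full circle) and reads "sin(4*x)" as the sine of a whole number of turns. QuarterTurnSinCos is a hypothetical name and this is not necessarily how the commit implements it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper. The angle is assumed to be in quarter turns:
// 1.0 is 90 degrees and 4.0 is a full circle.
inline void QuarterTurnSinCos(float angle, float &sine, float &cosine) {
    // Fold the angle into [0, 4) first. For an exact multiple of 4 this
    // leaves exactly 0, so no rounding error from pi reaches sin/cos.
    angle -= std::floor(angle * 0.25f) * 4.0f;
    const float radians = angle * 1.57079632679489661923f;  // pi / 2
    sine = std::sin(radians);
    cosine = std::cos(radians);
}
```

Without the reduction step, multiplying a large angle by an approximation of pi/2 leaves a small residue, and the sine of a whole turn comes out as a tiny nonzero value.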
Sacha
55221b5c7c
Sin/cos fix for hardfp builds.
2014-06-12 23:10:22 +10:00
Unknown W. Brackets
0550b9372a
Skip nop padding between functions.
...
Fixes graphical artifacts in Final Fantasy Tactics, recognizing memset and
memcpy.
2014-06-09 00:16:03 -07:00
Unknown W. Brackets
a926b19c03
Hook Tales of Phantasia X save pictures.
...
Maybe other games will use the same func even. This makes save pictures
work correctly the first time.
2014-06-08 16:38:43 -07:00
Unknown W. Brackets
a7b9ce205b
Enable function replacements by default.
...
So things like Star Ocean work, and game memcpy()'s to GPU work.
This will make game start on mobile a bit slower, though. And there could
still be bugs so leaving as an option, but seems pretty stable. Didn't
realize it wasn't enabled by default.
2014-06-07 00:13:45 -07:00
Unknown W. Brackets
f6d4be1d49
Hook Star Ocean's function to upload stencil data.
...
Hurray, it seems to work properly.
2014-05-31 21:48:11 -07:00
Henrik Rydgård
fd19b8d271
Merge pull request #6197 from unknownbrackets/replace-funcs
...
Function replacement hooks and some GLES compat replacements
2014-05-31 20:30:30 +02:00
Unknown W. Brackets
5ccc227462
armjit: Minor const optimization in Comp_VV2Op.
2014-05-31 11:12:36 -07:00
Unknown W. Brackets
df289e46a9
armjit: Use sat0/1 method from prefixes in vsat.
2014-05-31 11:12:35 -07:00
Unknown W. Brackets
b2dc92b942
Add a hook for Hexyz Force's "monoclome" thread.
...
This fixes missing graphics in some areas in GLES, due to direct framebuf
access.
2014-05-31 10:03:02 -07:00
Unknown W. Brackets
5dd8ebe2b4
Add a hook for Gods Eater Burst's swizzled copy.
2014-05-31 10:03:01 -07:00
Unknown W. Brackets
d09be5a4bc
Update PC before going into a replacement func.
...
This way we can report the PC properly on errors, and the replacement func
can even look at PC.
2014-05-31 10:03:01 -07:00
Unknown W. Brackets
f489694515
Add the option to hook, rather than replace, funcs.
...
This can be useful for debugging or developing translations/game hacks,
and also gives us options when dealing with GLES incompatibilities.
2014-05-31 10:03:00 -07:00
Unknown W. Brackets
72eb15f282
Speed up debug build hashfunc lookup.
2014-05-31 10:03:00 -07:00
Henrik Rydgård
585050de27
Fix crash in block cache
2014-05-27 22:02:20 +02:00
Unknown W. Brackets
b73c575418
Support swizzled framebuffer downloads.
...
Used in God Eater 2 when showing the load save screen.
2014-05-27 01:17:09 -07:00
Unknown W. Brackets
a70b5abfb9
Allow jit to be enabled/disabled at runtime.
2014-05-27 00:02:51 -07:00
Unknown W. Brackets
8afd1f028c
Add a couple more memcpy() variants.
2014-05-26 11:20:34 -07:00
Unknown W. Brackets
288d867588
Fix a type comparison warning.
2014-05-21 08:00:31 -07:00
Unknown W. Brackets
5b24e0107f
x86jit: Correct vsat0/vsat1 handling.
2014-05-16 01:04:58 -07:00
Unknown W. Brackets
ccc5458a84
x86jit: Correctly handle NaNs in vpfxd sat clamps.
...
May be a small performance hit, but we've seen games with bugs because of
NaNs recently, so better to be safe.
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
69b0b622be
armjit: Fix D-prefix sat clamp NAN handling.
...
They should leave NAN alone.
2014-05-16 01:04:57 -07:00
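As a plain C++ illustration of a saturating clamp that leaves NaN alone (not the emitted ARM or x86 code): the sketch assumes sat0 clamps to [0, 1] and sat1 to [-1, 1]. ClampLeavingNaN, Sat0, and Sat1 are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Clamp v to [lo, hi], but pass NaN through untouched.
inline float ClampLeavingNaN(float v, float lo, float hi) {
    assert(lo <= hi);
    if (std::isnan(v))
        return v;  // NaN in, NaN out
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

// Assumed ranges: sat0 is [0, 1], sat1 is [-1, 1].
inline float Sat0(float v) { return ClampLeavingNaN(v, 0.0f, 1.0f); }
inline float Sat1(float v) { return ClampLeavingNaN(v, -1.0f, 1.0f); }
```

A clamp built from min/max instructions can return one of the bounds for a NaN input depending on operand order, which is the kind of mismatch an explicit NaN check avoids.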
Unknown W. Brackets
bc32f0e0b2
armjit: Correct disabled vslt NaN handling.
...
Can possibly enable?
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
a3ad238a44
Add notes for proper NaN handling in vmin/vmax.
2014-05-16 01:04:56 -07:00
Unknown W. Brackets
6ccae8f5a7
x86jit: Use a faster safemem fallback.
...
Really helps performance in games that use uncached addresses a lot,
without really impacting performance of most games which don't.
Of course, fastmem is faster.
2014-05-06 08:05:12 -07:00
The Dax
086d97516d
Fix a couple ARM VFPU flags.
...
Unknown's explanation:
LO means Lower (unsigned), but for floats, it means "Less than".
LT means Lower (signed), but for floats it means "Less than OR unordered".
ARM docs at http://infocenter.arm.com/help/topic/com.arm.doc.dui0068b/Chdhcfbc.html explain the following:
LE means Signed less than or equal, and for floats it means "Less than or equal, or unordered".
LS means Unsigned lower or same, but for floats, it means "Less than or equal".
2014-05-04 22:37:41 -04:00
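The flag meanings quoted in the commit above can be checked against the NZCV patterns that an ARM floating-point compare produces. The sketch below models those patterns and the four condition codes; the type and function names are made up for illustration and do not come from the jit.

```cpp
// NZCV flags as seen by condition codes after a VFP compare
// (VCMP followed by transferring the FPSCR flags to the APSR).
struct Flags { bool n, z, c, v; };

enum class FloatCmp { Less, Equal, Greater, Unordered };

// Flag patterns for each floating-point comparison outcome.
inline Flags FlagsFor(FloatCmp r) {
    switch (r) {
    case FloatCmp::Less:      return {true,  false, false, false};
    case FloatCmp::Equal:     return {false, true,  true,  false};
    case FloatCmp::Greater:   return {false, false, true,  false};
    case FloatCmp::Unordered: return {false, false, true,  true};
    }
    return {false, false, false, false};
}

// Condition code predicates, defined on the flags exactly as for integers.
inline bool CondLO(Flags f) { return !f.c; }              // float: less than
inline bool CondLT(Flags f) { return f.n != f.v; }        // float: less than, or unordered
inline bool CondLE(Flags f) { return f.z || f.n != f.v; } // float: less/equal, or unordered
inline bool CondLS(Flags f) { return !f.c || f.z; }       // float: less than or equal
```

The unordered case (a NaN operand) sets C and V, which is why LT and LE accept it while LO and LS do not.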
Unknown W. Brackets
95dcadb6ae
Ignore when a proxied block points to erased mem.
...
Happens for example when a new module is loaded, sometimes.
2014-05-04 01:25:19 -07:00
Unknown W. Brackets
c3a6092e26
Upgrade symbolmaps with module address info.
...
This fixes some issues with jit replacement only if you had a map lying
around.
2014-05-04 01:24:18 -07:00
Unknown W. Brackets
4d665b5e7a
Fix replacement funcs in the interpreter.
2014-04-28 08:01:13 -07:00
Unknown W. Brackets
97c18e7f0e
Comment out a few unsafe replacement funcs.
2014-04-22 08:07:10 -07:00
Unknown W. Brackets
7326c6e716
Fix a race condition on shutdown with hashmap.
...
Also, always need to init the blocks, they are not zero initialized.
2014-04-20 21:44:10 -07:00
Henrik Rydgard
f35168e0e0
Hardcode a bunch of function hashes so we can replace them.
...
Without needing an external file.
2014-04-18 19:00:08 +02:00
Unknown W. Brackets
dc39e75fc1
Oops, forgot about proxy blocks for replace jal.
...
Also fix a crash when they are used.
2014-04-17 01:03:46 -07:00
Unknown W. Brackets
ddd6e3024d
Skip jals when replacing funcs.
...
Improves performance in God Eater (when replacements enabled.)
2014-04-16 23:57:52 -07:00
Unknown W. Brackets
dde2f3ade6
Re-replace functions after loading a savestate.
...
Might need to clear before saving too... anyway, this makes testing a bit
easier for certain areas.
Also, correctly decrease downcount on x86.
2014-04-12 15:49:20 -07:00
Unknown W. Brackets
76e61e10a9
Fix hashmap crashes with games that load modules.
...
This should properly unload and reload the functions as necessary.
2014-04-12 01:16:32 -07:00
Unknown W. Brackets
3001866d18
Skip flushing FPU/VFPU regs if none were allocated.
...
They're not used as often, so this usually saves time. About 1% during
tests.
2014-03-30 00:42:25 -07:00
Unknown W. Brackets
a4327702f1
Reduce some includes under GPU/.
2014-03-29 16:51:38 -07:00
Henrik Rydgard
58237d976f
Fix performance issue in BlockCache due to an instance of std::vector in every block:
...
Avoid creating the vector when not necessary.
This was especially noticeable in debug mode.
2014-03-29 22:26:51 +01:00
Henrik Rydgård
717c1cd34e
Merge pull request #5748 from unknownbrackets/armjit-minor
...
armjit: Allow R1 in regalloc, use LR as temp
2014-03-29 04:09:58 -04:00
Unknown W. Brackets
600842d9a2
armjit: Use prefixes on vscl's T arg.
...
Makes it pass one more thing in the prefixes test, but not sure exactly
how it operates. Better to have it the same as x86 and int anyway.
2014-03-29 01:00:29 -07:00
Unknown W. Brackets
5a89c17cf0
armjit: Allow R1 in regalloc, use LR as temp.
...
LR should be safe, although it may make stack traces not work within jit,
they don't really tend to work anyway.
2014-03-28 18:38:38 -07:00
Unknown W. Brackets
58c5179d8e
Push and pop the callee saved NEON registers.
2014-03-25 22:34:42 -07:00
Unknown W. Brackets
2f5c6a5660
Fix VLDM/VSTM encoding for double/quad regs.
...
Duh, forgot to check Vd. Fixes #5723.
2014-03-25 22:08:20 -07:00
Unknown W. Brackets
246eaeb209
x86jit: Avoid mem temp for float cmp/loads.
2014-03-22 15:56:28 -07:00