1486 Commits

Author SHA1 Message Date
Unknown W. Brackets
de566be2ce x86jit: Split out the logic for loading simd regs. 2014-11-16 13:33:15 -08:00
Unknown W. Brackets
5347431c20 x86jit: Initial simd for VecDo3(). Broken.
I'm not sure why/where it's broken...
2014-11-16 13:33:15 -08:00
Unknown W. Brackets
aad505e7b3 x86jit: Add a TryMapDirtyInInVS() for 3-op. 2014-11-16 13:33:14 -08:00
Unknown W. Brackets
88a753eff3 x86jit: Add an invariant contract to the fpu cache.
This should help catch things better in debug mode.
2014-11-16 13:33:14 -08:00
Unknown W. Brackets
39afeb490f x86jit: Add some typesafety. 2014-11-16 13:33:13 -08:00
Unknown W. Brackets
4335bf3346 x86jit: Add basic mapping of SIMD regs.
Not tested yet, just sketched out.  All very suboptimal.
2014-11-16 13:33:13 -08:00
Unknown W. Brackets
9429359b47 x86jit: Add fallbacks when moving from VS -> V. 2014-11-16 13:33:12 -08:00
Unknown W. Brackets
2862367927 x86jit: Add force-non-simd to all current ops.
Unless they already use MapRegs, because that will automatically handle
it.
2014-11-16 13:33:12 -08:00
Unknown W. Brackets
4cf0913692 x86jit: Sketch some initial SIMD apis. 2014-11-16 13:33:07 -08:00
Henrik Rydgard
e43c7af32c ARM Jit: Implement quaternion multiplication 2014-11-16 19:12:00 +01:00
Henrik Rydgard
bfcd3690b6 x86 jit: Fix+enable quaternion product, optimize "sw zero, *" 2014-11-16 18:37:38 +01:00
Henrik Rydgard
28ca8d4818 x86 jit: Use LEA to emulate addu but only when it can save a few bytes 2014-11-16 17:39:47 +01:00
Henrik Rydgard
1c78e29c79 x86 jit: For clarity, use TEMPREG where it doesn't matter that it's EAX.
Might have missed a few places.
2014-11-16 17:38:26 +01:00
Henrik Rydgard
8b90f881b8 x86 jit: A tiny optimization and a tiny bugfix 2014-11-16 16:46:35 +01:00
xSacha
57e4088216 Introduce fake vertex decoder JIT as well.
Compiles and links on CI20 but gets unknown crash in GL driver.
2014-11-13 17:10:29 +10:00
Sacha
c421617c84 Fix Qt build by building Arm disassembler for all platforms. 2014-11-13 00:55:00 +10:00
Sacha
a0086f6412 Introduce a Fake JIT for generic builds. 2014-11-13 00:09:51 +10:00
Kingcom
479c8646a2 Change vpfxs/r/t disassembly syntax 2014-11-12 00:09:57 +01:00
Unknown W. Brackets
096b41cceb x86jit: Interleave reg usage in vcmp. 2014-11-10 23:22:04 -08:00
Unknown W. Brackets
0e1aa35e84 x86jit: Just do the ES/NS compare once. 2014-11-10 23:04:38 -08:00
Unknown W. Brackets
2758e8fa3c x86jit: Optimize vcmp for single and simd. 2014-11-10 23:04:37 -08:00
Unknown W. Brackets
94e29da6c4 Fix a typo in the mips assembler.
Oops, this should be a unique value of course.
2014-11-10 09:13:11 -08:00
Unknown W. Brackets
01c2b88911 Avoid signed ints, seems to cause clang errors. 2014-11-09 16:49:24 -08:00
Henrik Rydgård
7fbe8ba898 Merge pull request #7076 from unknownbrackets/debugger
Add VFPU instructions to the mips asm tables
2014-11-10 00:57:08 +01:00
Unknown W. Brackets
370fb86379 Add VFPU instructions to the mips asm tables. 2014-11-09 15:14:07 -08:00
Unknown W. Brackets
86e3739a3e x86jit: Optimize some cases of ins/ext.
They happen but are minor.
2014-11-09 09:22:29 -08:00
Unknown W. Brackets
e05263af32 x86jit: Allow EBX sign extension for 32-bit. 2014-11-09 09:07:52 -08:00
Unknown W. Brackets
8dbd3c3b9c x86jit: Don't lie about ZERO when it's not an imm. 2014-11-09 08:27:02 -08:00
Unknown W. Brackets
d0a2ced2f9 x86jit: Flip cc in slt* to avoid reg loads.
Unfortunately, this zero thing is now concerning me...
2014-11-09 08:15:39 -08:00
Unknown W. Brackets
59f491eddb x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency.  It also avoids
filling registers that are not likely to be used again.

Fixed a small mistake.
2014-11-09 07:23:44 -08:00
Henrik Rydgard
18495a452d Rename an enum 2014-11-09 14:55:23 +01:00
Henrik Rydgard
a19d0b648a x86 jit: Add a simple speedhack (ignore masking stack pointers) but disable due to low impact. 2014-11-09 14:54:39 +01:00
Henrik Rydgard
a528921f3c x86 JIT: EBX was free in 32-bit mode, let's use it in the regcache. 2014-11-09 12:55:17 +01:00
Henrik Rydgard
db853d8513 Collapse sequences of "int3" (padding after block linking) in x86 disassembly. 2014-11-09 12:10:37 +01:00
Henrik Rydgard
5888b3bdc4 Revert "x86jit: Micro optimize slt* a bit."
This reverts commit ee66596b8d.

Broke a lot of games, probably some small bug.

Conflicts:
	Core/MIPS/x86/CompALU.cpp
2014-11-09 12:07:21 +01:00
Henrik Rydgard
5bcdecc26b unittest: Have the JIT harness print disassembly, to make it easy to inspect results. 2014-11-09 12:03:04 +01:00
Unknown W. Brackets
313d9e95c7 Clarify a comment. 2014-11-09 01:05:03 -08:00
Unknown W. Brackets
ee66596b8d x86jit: Micro optimize slt* a bit.
This improves their performance and hopefully latency.  It also avoids
filling registers that are not likely to be used again.
2014-11-08 22:54:03 -08:00
Unknown W. Brackets
27d8108bb2 x86jit: Optimize loads of 0 into fp regs. 2014-11-08 18:41:16 -08:00
Unknown W. Brackets
7d8858687e x86jit: Avoid speculative loads in mtc1/mfc1. 2014-11-08 18:35:15 -08:00
Unknown W. Brackets
57caa95273 x86jit: Implement round.w.s and friends.
They are not terribly fast, though, updating MXCSR.
2014-11-08 17:59:38 -08:00
Unknown W. Brackets
3908e0f445 x86jit: Small optimization for add.s f1, f2, f2.
Doubles the speed of that particular case.  Biggest difference is not
loading fd for no reason.
2014-11-08 17:32:53 -08:00
Unknown W. Brackets
f9893c29ce x86jit: Very small optimization to c.nge.s. 2014-11-08 17:01:02 -08:00
Unknown W. Brackets
78dfe43776 x86jit: Optimize neg.s and abs.s a tiny bit.
Same reg is probably a common case, improves micro benchmark.
2014-11-08 16:50:41 -08:00
Unknown W. Brackets
bed0d0b059 x86jit: Improve cvt.w.s when fd is loaded or fs.
We have no need to store it.
2014-11-08 16:40:54 -08:00
Unknown W. Brackets
1917d946ea x86jit: Micro optimize cvt.s.w a bit.
This implementation is about 5x faster for micro benchmarks.  Little
impact to overall perf in games I tested, though.
2014-11-08 13:30:38 -08:00
Unknown W. Brackets
671dee85c7 x86jit: Micro optimize vi2f a little bit.
This didn't help overall perf much but micro benchmarks are better.
2014-11-08 13:07:01 -08:00
Unknown W. Brackets
c29b126357 x86jit: Oops, can't have an imm here. 2014-11-08 12:41:48 -08:00
Unknown W. Brackets
c0be19edb6 x86jit: Simplify vavg a bit. 2014-11-08 12:40:04 -08:00
Unknown W. Brackets
761e269e5f x86jit: Avoid some regcache pollution. 2014-11-08 12:38:08 -08:00
Unknown W. Brackets
bc7497857a x86jit: Micro optimize vi2x a bit with ssse3/sse4.
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a x86jit: Implement vi2x instructions.
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)

AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
ddc90ee550 x86jit: Implement vfad and vavg. 2014-11-08 12:13:25 -08:00
Unknown W. Brackets
5ae43defd9 Oops, these should be signed. 2014-11-08 09:39:17 -08:00
Unknown W. Brackets
316e923b40 x86jit: Implement other forms of vx2i.
Gains 3.2% performance in Grand Knights History.
2014-11-08 00:39:40 -08:00
Unknown W. Brackets
097a483d77 x86jit: Micro optimize vs2i a bit. 2014-11-06 22:45:54 -08:00
Unknown W. Brackets
7819b97c9a iOS buildfix. 2014-11-04 08:32:43 -08:00
TwistedUmbrella
6797044476 Correct a namespace typo 2014-11-04 05:10:22 -05:00
Unknown W. Brackets
3061e89250 Fix copy/paste mistake. 2014-11-04 01:41:17 -08:00
Unknown W. Brackets
0d36d4e082 Add a helper to reduce duplicate code.
This is not performance critical.  I wonder if compilers can inline
closures?
2014-11-03 23:50:23 -08:00
Unknown W. Brackets
16ca2b0155 x86jit: Fix trig vv2ops on 32-bit, arg. 2014-11-03 23:43:18 -08:00
Unknown W. Brackets
3e95763a3f x86jit: Implement other rounding modes in vf2i.
3% improvement in Grand Knights History.  I know other games use these
too.
2014-11-03 23:27:05 -08:00
Unknown W. Brackets
717cf25f0d x86jit: Use our sincos funcs for VV2Op as well.
Small (0.7%) speedup in Gods Eater Burst.  There's probably SSE
approximations we could use instead, but those will also need at least xmm
reg flushing/thunking.

At least this avoids flushing gprs, etc.  The sin and cos ops are fairly
common.
2014-11-03 22:13:38 -08:00
Unknown W. Brackets
014445655d Actually clear the hash->func map on forget.
Better not to have dangling pointers.
2014-11-03 13:49:45 -08:00
Unknown W. Brackets
ad6b176e11 Naturally, modern C++ would not build on Symbian. 2014-11-03 08:56:45 -08:00
Unknown W. Brackets
61c21340fb Warning fixes. 2014-11-03 08:34:34 -08:00
Unknown W. Brackets
9d86d3ca9b Use std::unordered_multimaps in a few places. 2014-11-03 08:31:52 -08:00
Unknown W. Brackets
67a7205bdd Switch to a multimap for the hash->function map. 2014-11-03 07:59:56 -08:00
Unknown W. Brackets
56322bdad4 Improve performance of ForgetFunctions().
Mostly matters during shutdown, but also module unload.
2014-11-02 17:32:04 -08:00
Unknown W. Brackets
258b7c9a7c jit: Use the end just to be safe.
In case clearing near the end of a block.
2014-10-27 19:05:52 -07:00
Unknown W. Brackets
5bb9d32eaa jit: Fix partial invalidation of larger blocks.
Fixes #7031.
2014-10-27 19:04:19 -07:00
Unknown W. Brackets
100afc07a2 x86jit: Fix andLink cases of imm blezl, etc. 2014-10-24 08:57:56 -07:00
Unknown W. Brackets
0c1dcfeacf Avoid comparing invalidated iterators. 2014-10-22 00:50:39 -07:00
Unknown W. Brackets
65ecc9a464 jit: Use an exclusive end in the block map.
Simpler, was not consistent before, oops.
2014-10-21 11:52:19 -07:00
BlackDog
f7e8ca486c add lwc1 lwc2 swc1 swc2 opcodes 2014-10-19 20:42:12 +02:00
Unknown W. Brackets
ef6d583542 x86jit: Oops, don't pad INT3s in prelinked blocks.
Fixes #7007.
2014-10-15 22:07:56 -07:00
Unknown W. Brackets
b53f13480a x86jit: Centralize continuing logic. 2014-10-12 19:01:04 -07:00
Unknown W. Brackets
040a6d1745 jit: Improve performance of clearing jit. 2014-10-12 19:00:03 -07:00
Unknown W. Brackets
e6373aaed9 jit: Remove from the block map more carefully. 2014-10-12 17:47:07 -07:00
Unknown W. Brackets
2e81a38892 jit: Fix a possible infinite loop in invalidation. 2014-10-12 17:46:54 -07:00
Unknown W. Brackets
4853a1b7a0 jit: Optimize proxy block lookup from address.
It was really slow before with enough proxy blocks.
2014-10-12 17:35:23 -07:00
Unknown W. Brackets
1064f580e4 armjit: Add proxy blocks for continuing. 2014-10-12 17:20:26 -07:00
Unknown W. Brackets
d98adf27d6 x86jit: Add proxy blocks for continuing. 2014-10-12 17:15:31 -07:00
Unknown W. Brackets
01f9521dc5 jit: Invalidate blocks even if they end unevenly.
This allows blocks to start and end wherever they need, which should be
good for replacements and for continuing.
2014-10-12 17:13:04 -07:00
Unknown W. Brackets
90821b761d x86jit: Pad linked exits with breakpoints.
So that we don't get garbage, and so we see if we end up there.
2014-10-12 16:00:58 -07:00
Unknown W. Brackets
5fd402222b x86jit: Use the shorter MDisp() offset for andLink. 2014-10-12 15:18:22 -07:00
Unknown W. Brackets
0f32103615 x86jit: Consistently use mips_. 2014-10-12 15:16:09 -07:00
Henrik Rydgård
afbe50d3b9 Merge pull request #6998 from unknownbrackets/jit-minor2
x86jit: Preload sp and similar regs used often
2014-10-13 00:00:28 +02:00
Unknown W. Brackets
e3a04aa2d2 x86jit: Preload sp and similar regs used often.
This can help us avoid using a temporary.

Very tiny performance improvement.
2014-10-12 14:53:56 -07:00
Unknown W. Brackets
6fae78cd3f x86jit: Fix a bug in branch continuing.
When we predict it won't take a likely delay slot, we'd lose our register
allocation state.
2014-10-12 12:51:47 -07:00
Unknown W. Brackets
2f598e8f38 jit: Statically jump for fixed branches.
This handles both loops (first step is known) and static branches (some
code uses them instead of jumps, and we disassemble that to "b".)

Not likely to be a big improvement, but might help if the branch predictor
was wrong.

This is as opposed to continuing, which would build a larger jit block.
2014-10-12 12:51:47 -07:00
Unknown W. Brackets
9228ac72da jit: Reorganize imm branch logic a bit. 2014-10-12 12:51:46 -07:00
Unknown W. Brackets
4d30288601 x86jit: Fix force flush to zero. 2014-10-12 12:51:46 -07:00
Unknown W. Brackets
928e2adfc9 jit: Avoid applying/restoring the rounding mode.
If the game never sets it, we can skip around syscalls, interpreter,
replacements, etc.
2014-10-12 12:51:45 -07:00
Unknown W. Brackets
8d0dca71fe jit: Rename the rounding mode funcs to clarify.
They apply/restore the value, set/clear is confusing.
2014-10-12 11:35:20 -07:00
Henrik Rydgård
6cb2c9c97d Merge pull request #6989 from hrydgard/x86-emitter-merge
Merge from Dolphin's x86-64 emitter
2014-10-12 19:52:59 +02:00
Henrik Rydgård
3b1476c8ec MIPSTables: Annotate fp and hi/lo in/out more accurately than just "other"
Some typo fixes
2014-10-12 19:46:50 +02:00
Henrik Rydgard
8177b4c43b Avoid an ifdef using PTRBITS 2014-10-12 19:35:55 +02:00
Henrik Rydgård
eab010a0c0 x86 JIT: Sacrifice a register for a pointer to the MIPS context. Shrinks emitted x86 code considerably.
Nice in 64-bit, but might be a bit too much in 32-bit though... Needs testing.
2014-10-12 19:35:55 +02:00
Henrik Rydgård
f99c2cd010 x86 Jit: Generate nicer code for some cases of addiu 2014-10-12 17:47:53 +02:00
Henrik Rydgård
91966824bb minor cleanup: No point in having special functions for ReadFCR/WriteFCR, they're smaller than many other ops.. 2014-10-11 15:57:36 +02:00
Henrik Rydgård
2feae8d98e Merge pull request #6978 from daniel229/replace_danganronpa
Replace frame download in danganronpa 1&2
2014-10-07 00:46:19 +02:00
Henrik Rydgård
c9a21ab44d Add T2 and T3 to our register enum for clarity 2014-10-05 14:20:30 +02:00
daniel229
5ff098efb9 Another replace frame download in danganronpa 1 2014-10-05 13:46:47 +08:00
daniel229
ef1484da65 Replace frame download in danganronpa 1 2014-10-05 13:44:39 +08:00
daniel229
a7cf3aeafc Another replace frame download in danganronpa 2 2014-10-05 13:42:03 +08:00
daniel229
d7927009d0 Replace frame download in danganronpa 2 2014-10-05 13:39:15 +08:00
daniel229
aad301a97a Replace download frame in Boku no Natsuyasumi 2 and 4 2014-09-27 14:00:37 +08:00
daniel229
4de7e89330 Replace download frame in Sora no kiseki SC, and a comment for Sora no kiseki 3rd 2014-09-26 17:13:01 +08:00
daniel229
ea9a0182e4 Replace download frame in Sora no kiseki FC 2014-09-26 16:55:37 +08:00
Henrik Rydgård
9aaaf3a835 Merge pull request #6922 from daniel229/a_few_func_replacement
A few functions replace in some games
2014-09-21 23:13:02 +02:00
Unknown W. Brackets
4210ba44eb Clean up a few more ImmPtr() cases. 2014-09-21 08:34:27 -07:00
daniel229
d452d02aba Update replace function for Toaru Majutsu to Kagaku no Ensemble. 2014-09-21 10:58:56 +08:00
daniel229
1e2abba2b1 Replace function in Toradora! Portable 2014-09-18 15:47:52 +08:00
daniel229
efa35294ce Replace function in Hentai Ouji To Warawanai Neko 2014-09-18 15:45:58 +08:00
daniel229
4d6e0f6d31 Another function replace in Kirameki School Life 2014-09-18 15:41:39 +08:00
daniel229
a657da91c3 Replace download frame in Toaru Majutsu to Kagaku no Ensemble 2014-09-18 15:37:12 +08:00
daniel229
48b774d143 Replace download frame in Rezel Cross 2014-09-18 15:29:59 +08:00
daniel229
2dd3ab6c20 Replace frame download in suikoden1&2 2014-09-15 00:19:22 +08:00
Henrik Rydgard
f84ebf6bff sprintf->snprintf, fix some too short buffers 2014-09-14 00:14:11 +02:00
Unknown W. Brackets
a892779f89 Merge pull request #6880 from daniel229/func_replace_sakurasou
Replace frame download in Sakurasou No Pet Na Kanojo

Conflicts:
	Core/HLE/ReplaceTables.cpp
2014-09-12 22:28:27 -07:00
Unknown W. Brackets
52b6f1095e armjit: Fix rounding mode, allow non flush-to-zero.
Default: force flush to zero (for RunFast mode.)  But now it's an ini
option so we can more easily compare armjit differences.
2014-09-11 07:58:51 -07:00
daniel229
7c1d4234ab Replace frame download in Sakurasou No Pet Na Kanojo 2014-09-11 15:32:07 +08:00
daniel229
22fa431f89 Replace frame download in Ore no Imouto ga Konnani Kawaii Wake ga Nai 2014-09-11 14:34:17 +08:00
daniel229
81ec625d0a Update the hook for Kirameki School Life SP 2014-09-11 12:01:58 +08:00
daniel229
e8efab6d21 Change function name 2014-09-11 00:47:06 +08:00
daniel229
7f6f52a904 Fixes saveicons in Kirameki School Life SP. 2014-09-11 00:18:59 +08:00
daniel229
b1d9461779 Replace frame download in Narisokonai Eiyuutan 2014-09-09 14:33:15 +08:00
Unknown W. Brackets
420eb1bed3 Replace frame download in SD Gundam G Generation. 2014-09-08 22:44:27 -07:00
Unknown W. Brackets
ee6960ff7a Hook the saveicon creation func in Growlanser IV. 2014-09-08 19:10:46 -07:00
daniel229
202f987e9b Replace function for Zero no Kiseki and Ao no Kiseki 2014-09-05 00:52:04 +08:00
Andrew Church
33264a6b8f Hook Brandish frame capture for menu fadeout and save screenshots. 2014-09-04 17:36:56 +09:00
Andrew Church
3033dc5138 Revert to unconditional ClearRoundingMode() when setting FCR31. 2014-09-04 11:36:56 +09:00
Andrew Church
128122af39 Fix broken rounding mode handling. 2014-09-04 11:30:11 +09:00
Andrew Church
726cb851b9 Don't unconditionally ClearRoundingMode() before setting it. 2014-09-04 09:28:56 +09:00
Andrew Church
5816685668 Handle the FS (flush-to-zero) bit in FCR31 for x86 JIT. 2014-09-04 01:50:24 +09:00
Unknown W. Brackets
4a1514730f x86jit/ppcjit: Correct some bad sltiu compares. 2014-09-02 08:04:22 -07:00
Unknown W. Brackets
c9df66a450 Initialize the VFPU revision from a PSP-3000 value. 2014-09-01 23:16:50 -07:00
Unknown W. Brackets
4459b8f483 jit: Actually jit vmtvc/vmfvc.
Since we have them and they are easy.
2014-09-01 23:13:39 -07:00
Unknown W. Brackets
fd1b01b573 Fix the vrndi.s output range.
Was previously outputting only valid positive float values, but should use
a much wider range of a u32.

Might've affected randomness in some games.
2014-09-01 22:33:01 -07:00
Unknown W. Brackets
5f6f6827b5 jit: Update rounding mode immediately on ctc1. 2014-08-30 23:48:27 -07:00
Unknown W. Brackets
e8cdbcc33f x86jit: Fix some flags/EAX trashing in rounding.
Fixes #6810.
2014-08-30 16:46:43 -07:00
Unknown W. Brackets
8daff0a25e armjit: Fix some downcount issues with rounding.
When setting the rounding mode we need to be super careful about not
destroying flags or R0 if they are needed.
2014-08-30 16:43:13 -07:00
Unknown W. Brackets
820a8e8f2b armjit: Don't reset downcount on fpu instructions.
It's maintained always, oops.
2014-08-30 16:30:13 -07:00
Unknown W. Brackets
d4ec7d8019 Add another memcpy variant.
Fixes #4324 (Marvel Ultimate Alliance 2 videos), thanks daniel229.
2014-08-23 08:45:25 -07:00
Henrik Rydgård
5d836bfa5a Merge pull request #6765 from hrydgard/thin3d
Switch UI drawing from GL to Thin3D. This activates the D3D9 path as well.
2014-08-23 10:52:21 +02:00
Henrik Rydgård
b7da82eebb Merge pull request #6762 from unknownbrackets/fpu-rounding
Handle fpu rounding mode at least in jits
2014-08-23 10:43:22 +02:00
Unknown W. Brackets
e9b5e6f277 armjit: Maintain rounding mode throughout jit. 2014-08-22 19:57:50 -07:00
Henrik Rydgard
808f05da89 (Partially) slip thin3d underneath DrawBuffer. 2014-08-22 20:54:53 +02:00
Unknown W. Brackets
925557ed47 x86jit: Maintain the rounding mode always.
This should be less often than doing it per block that uses fpu, unless
the game doesn't use fpu much at all.
2014-08-22 09:53:00 -07:00
Henrik Rydgård
f8a4236d58 Merge pull request #6679 from unknownbrackets/gpu-blend
Handle doubled alpha blending using premultiplication
2014-08-22 10:06:24 +02:00
Unknown W. Brackets
1fcbb7bbd4 armjit: Respect the rounding mode for mul/etc. 2014-08-22 00:32:01 -07:00
Unknown W. Brackets
ab13b36484 x86jit: Implement cvt.w.s.
Not really used that often, anyway, but easy enough and good for testing
that we set the rounding mode correctly.
2014-08-22 00:01:06 -07:00
Unknown W. Brackets
dc91dc1ce8 x86jit: Support fpu rounding modes for mul, etc.
Fixes Gods Eater Burst loading PSP savedata, but can no longer load old
savedata.
2014-08-21 23:59:55 -07:00
Sacha
97e93f48fd Clean up LitPool code and re-enable flushing in AsmJit 2014-08-20 18:29:37 +10:00
Unknown W. Brackets
d52fdafa3c Note the location of a memset variant. 2014-08-18 23:20:44 -07:00
Unknown W. Brackets
9d3cf346c3 Clarify GetSureBranchTarget() for fpu branches.
They also have CONDTYPE_ flags.  Looks like this was just getting lucky
that rs can't equal rt, but the code looks confusing when you're looking
at it from an fpu/vfpu perspective.
2014-08-18 07:46:48 -07:00
Unknown W. Brackets
78296d15c6 Don't recurse when disasming an emuhack.
Although, should this happen?  Apparently does in Peace Walker.
2014-08-17 18:43:59 -07:00
Henrik Rydgård
60eaefa6ad Merge pull request #6680 from unknownbrackets/replace-funcs
Disable most replacements and use checked mem access in them
2014-08-04 23:44:20 +02:00
Unknown W. Brackets
ac94dbcc69 Show the replaced instruction in disassembly.
Useful while debugging.
2014-08-03 19:23:29 -07:00
Unknown W. Brackets
abd1f4e58a Disassemble bne/etc. using rs, rt order.
The order makes more logical sense from game disassembly, and matches the
assembler.  Fixes #6632.
2014-08-03 19:22:18 -07:00
Unknown W. Brackets
245a2a3be0 Don't zero out downcount in replacements.
It doesn't write out js.downcountAmount in any of these cases, so zeroing
it is wrong.
2014-08-03 13:22:30 -07:00
Unknown W. Brackets
d060a06fa6 Disable a bunch of function replacements.
These are just for speed, let's turn them off.  Using a flag because:
 * I think there's still some issue with savestates, not sure.
 * We might swap this flag to a separate option.
2014-08-03 13:15:41 -07:00
Henrik Rydgard
82421f4dcf x86 jit: Further fix for nor, thanks unknown
See #6638
2014-07-27 22:26:35 +02:00
Henrik Rydgard
903ddbc513 x86 JIT: Fix bug where NOR would not get computed correctly in corner case
(CompTriArith can end up not actually mapping rd to a register when taking
a shortcut)

May fix the JIT issue mentioned by CPkmn and located by daniel229 as an aside in #6638
2014-07-27 21:41:41 +02:00
Sacha
6ce3765b12 Sailfish: More compatibility with SailFish OS. It also needs stddef where Maemo does.
Set packaging by default for iOS with b.sh.
2014-07-24 23:20:09 +10:00
Unknown W. Brackets
6aa9b8aa36 Limit stack walk distance a bit.
It was spending 0.5s in debug scanning all of memory for an entry (due to
some fp/sp tricks that aren't well detected yet.)  Let's just assume a 1MB
func doesn't need to be walked properly.
2014-07-20 21:52:55 -07:00
Unknown W. Brackets
b03460c169 Improve function range detection.
This improves a pattern like this:

  j endOfLoop;
  li v0, 0;
startOfLoop:
  addiu v0, v0, 1
endOfLoop:
  bne v0, a0, startOfLoop;
  nop
  jr ra
  nop

Where it jumps to the end of the loop, which only jumps back to the top of
the loop.  This might misdetect a few cases of tail recursion, but only
when the funcs are right next to each other.

Also, stops scanning at a jr ra, which was causing funcs to be incorrectly
long in cases.
2014-07-20 14:42:20 -07:00
Henrik Rydgård
0e5679c833 Revert "Detect Peace Walker's anti-cheat hash func" 2014-07-20 23:39:11 +02:00
Henrik Rydgård
06f058de54 Merge pull request #6506 from unknownbrackets/replace-funcs
Detect Peace Walker's anti-cheat hash func
2014-07-18 09:34:46 +02:00
Unknown W. Brackets
a59d8b5c1f Override the codehashing func used in Peace Walker.
This makes the demo work fine even with jit enabled.  May help the full
game when fighting a certain boss.
2014-07-18 00:23:26 -07:00
Unknown W. Brackets
1fd6214945 Improve function range detection.
This improves a pattern like this:

  j endOfLoop;
  li v0, 0;
startOfLoop:
  addiu v0, v0, 1
endOfLoop:
  bne v0, a0, startOfLoop;
  nop
  jr ra
  nop

Where it jumps to the end of the loop, which only jumps back to the top of
the loop.  This might misdetect a few cases of tail recursion, but only
when the funcs are right next to each other.

Also, stops scanning at a jr ra, which was causing funcs to be incorrectly
long in cases.
2014-07-18 00:22:19 -07:00
Sacha
6957808b97 ArmJit: Optimisation when comparing float against 0.0f 2014-07-17 05:12:43 +10:00
Sacha
d4c983d9e1 Android: ARMv5 fix 2014-07-17 02:34:22 +10:00
Unknown W. Brackets
292a9ea567 Clear module text and bss on unload.
Text is set to break instructions, data/bss to -1.  Matches results on a
PSP.
2014-07-13 22:00:32 -07:00
Unknown W. Brackets
81096f6bd0 Hook Dissidia's avi record func.
This makes it so replays can be recorded.  Though you could probably just
record separately anyway.
2014-07-13 08:36:34 -07:00
Unknown W. Brackets
09b9d2ad81 Keep track of ranges that have emuhack ops.
So that we can invalidate them smarter.
2014-07-05 16:25:16 -07:00
Unknown W. Brackets
d2e7dfcc51 Minor logging improvement. 2014-07-05 13:19:53 -07:00
Unknown W. Brackets
4b9229a5ba x86jit: Flush the PC before r/w in debug.
This way we get better log output.
2014-07-05 12:57:44 -07:00
Unknown W. Brackets
c0e6f26bb5 Fix startDefaultPrefix tripping.
We want the regs already initialized when we set this up.
2014-06-30 08:10:14 -07:00
Henrik Rydgård
bfffe33438 Merge pull request #6469 from unknownbrackets/logging
Enforce semicolons at the end of log lines
2014-06-30 11:44:02 +02:00
Unknown W. Brackets
433f4eb00a Use the ARM rounding mode flag for conversions.
It's at least much simpler.  Not sure if faster.  Handles NAN correctly.
2014-06-29 20:36:00 -07:00
Unknown W. Brackets
f339f7d539 armjit: Handle NAN correctly in float conversion. 2014-06-29 20:05:59 -07:00
Unknown W. Brackets
c168db5943 armjit: Fix really bad typo in cvt.w.s. 2014-06-29 19:43:17 -07:00
Unknown W. Brackets
0078faef8b Fix some log semicolons that might affect logic.
But, these should all be right.
2014-06-29 19:09:38 -07:00
Unknown W. Brackets
5db79dcf11 Fix some missing semicolons on log statements. 2014-06-29 19:09:37 -07:00
Unknown W. Brackets
252100aee5 Remove outdated comment (real cause found/fixed.) 2014-06-28 16:06:10 -07:00
Unknown W. Brackets
f008bebab4 armjit: Fix floor/ceil/cvt.w.s rounding.
Unfortunately, correctly rounding is probably slower.
2014-06-28 00:38:57 -07:00
Unknown W. Brackets
f544a87b2f jit: Initialize startDefaultPrefix when switching. 2014-06-28 00:38:56 -07:00
Unknown W. Brackets
27870aa593 x86jit: Map HI/LO as registers.
Not actually ever cached, but now it's all consistent.
2014-06-28 00:38:56 -07:00
Unknown W. Brackets
bc3d789c8a x86jit: Cache the vfpu compare flags in a reg.
Again, to match armjit.
2014-06-28 00:38:55 -07:00
Unknown W. Brackets
acad2e1763 x86jit: Cache fpcond in a register.
Mostly to match armjit.
2014-06-28 00:38:55 -07:00
Unknown W. Brackets
0da972c548 Hook the FF1 battle effect func.
So that we can download the framebuffer.  At least, it seems like that's
what this function is doing.
2014-06-26 01:38:22 -07:00
Unknown W. Brackets
24d8a34a0b Properly respect resolveReplacements.
And use the same opcode reading func in armjit as x86jit.
Fixes Star Ocean on Android.
2014-06-23 08:20:38 -07:00
Unknown W. Brackets
ec94498342 When scanning or relocating, check replacements.
Just to make sure we don't wrongly detect the length or unresolve a var
wrong etc.
2014-06-23 08:18:56 -07:00
Unknown W. Brackets
c1e293fe7c Fix a warning on 32-bit that might be bad... 2014-06-22 22:17:48 -07:00
Unknown W. Brackets
8851fc1685 Remove savedIdRegister/MIPS_CALL_ID.
We've never trusted it anyway, simpler without dealing with this stuff.
2014-06-22 11:29:47 -07:00
Unknown W. Brackets
d7e2c2c1d2 armjit: Oops, correctly handle plus/minus vmin/max. 2014-06-21 07:45:47 -07:00
Unknown W. Brackets
95f5d9397c Correct overflow in trunc.w.s for interpreter.
Reported in #4786.
2014-06-21 00:53:33 -07:00
Unknown W. Brackets
62daf6d7c8 armjit: Fix vmin/vmax to follow the PSP's rules.
Also the interpreter.  Fixes #6107.
2014-06-20 23:55:33 -07:00
Unknown W. Brackets
e87e1606c5 Fix func replacements and delay slots on arm.
Fixes #6303.
2014-06-19 08:02:49 -07:00
Unknown W. Brackets
9efbc2694b Add an invalidate all method to the jit. 2014-06-19 01:13:06 -07:00
Unknown W. Brackets
561d0e5ef9 Check more ops for changing memory in debugger. 2014-06-19 00:48:33 -07:00
Henrik Rydgard
ee1d16cb1d Use sincosf where available (linux) 2014-06-15 12:06:02 +02:00
Henrik Rydgard
e6f55bfef0 Fix silly mistake in vfpu_sincos. Add unittest. 2014-06-15 11:51:30 +02:00
Henrik Rydgard
0879d76503 VFPU: Ensure that sin(4*x) returns 0.0 (and cos 1) for all x. Fixes #2921 2014-06-15 11:03:00 +02:00
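The idea behind the commit above can be sketched as follows. This is only an illustration, not the project's code: it assumes the VFPU angle is given in quarter turns (1.0 means 90 degrees) and that "4*x" refers to whole multiples of a full turn. The names `QuarterTurnSin`, `QuarterTurnCos`, and `CheckFullTurns` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, not the emulator's actual functions.
// Assumption: the angle is measured in quarter turns, so 4.0 is one full turn.
constexpr float kHalfPi = 1.57079632679489661923f;

inline float QuarterTurnSin(float angle) {
    // fmod is exact, so any multiple of 4 reduces to zero *before* the
    // lossy multiplication by pi/2, and sin(0) is exactly 0.
    const float reduced = std::fmod(angle, 4.0f);
    return std::sin(reduced * kHalfPi);
}

inline float QuarterTurnCos(float angle) {
    // Same reduction; cos(0) is exactly 1.
    const float reduced = std::fmod(angle, 4.0f);
    return std::cos(reduced * kHalfPi);
}

// Small self-check for a couple of full-turn inputs.
inline void CheckFullTurns() {
    assert(QuarterTurnSin(4.0f) == 0.0f);
    assert(QuarterTurnCos(8.0f) == 1.0f);
}
```

Reducing the angle before scaling is the design point: multiplying a large multiple of 4 by an inexact pi/2 first would leave a small residue, so the result would be close to zero but not exactly zero.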
Sacha
55221b5c7c Sin/cos fix for hardfp builds. 2014-06-12 23:10:22 +10:00
Unknown W. Brackets
0550b9372a Skip nop padding between functions.
Fixes graphical artifacts in Final Fantasy Tactics, recognizing memset and
memcpy.
2014-06-09 00:16:03 -07:00
Unknown W. Brackets
a926b19c03 Hook Tales of Phantasia X save pictures.
Maybe other games will use the same func even.  This makes save pictures
work correctly the first time.
2014-06-08 16:38:43 -07:00
Unknown W. Brackets
a7b9ce205b Enable function replacements by default.
So things like Star Ocean work, and game memcpy()'s to GPU work.

This will make game start on mobile a bit slower, though.  And there could
still be bugs so leaving as an option, but seems pretty stable.  Didn't
realize it wasn't enabled by default.
2014-06-07 00:13:45 -07:00
Unknown W. Brackets
f6d4be1d49 Hook Star Ocean's function to upload stencil data.
Hurray, it seems to work properly.
2014-05-31 21:48:11 -07:00
Henrik Rydgård
fd19b8d271 Merge pull request #6197 from unknownbrackets/replace-funcs
Function replacement hooks and some GLES compat replacements
2014-05-31 20:30:30 +02:00
Unknown W. Brackets
5ccc227462 armjit: Minor const optimization in Comp_VV2Op. 2014-05-31 11:12:36 -07:00
Unknown W. Brackets
df289e46a9 armjit: Use sat0/1 method from prefixes in vsat. 2014-05-31 11:12:35 -07:00
Unknown W. Brackets
b2dc92b942 Add a hook for Hexyz Force's "monoclome" thread.
This fixes missing graphics in some areas in GLES, due to direct framebuf
access.
2014-05-31 10:03:02 -07:00
Unknown W. Brackets
5dd8ebe2b4 Add a hook for Gods Eater Burst's swizzled copy. 2014-05-31 10:03:01 -07:00
Unknown W. Brackets
d09be5a4bc Update PC before going into a replacement func.
This way we can report the PC properly on errors, and the replacement func
can even look at PC.
2014-05-31 10:03:01 -07:00
Unknown W. Brackets
f489694515 Add the option to hook, rather than replace, funcs.
This can be useful for debugging or developing translations/game hacks,
and also gives us options when dealing with GLES incompatibilities.
2014-05-31 10:03:00 -07:00
Unknown W. Brackets
72eb15f282 Speed up debug build hashfunc lookup. 2014-05-31 10:03:00 -07:00
Henrik Rydgård
585050de27 Fix crash in block cache 2014-05-27 22:02:20 +02:00
Unknown W. Brackets
b73c575418 Support swizzled framebuffer downloads.
Used in God Eater 2 when showing the load save screen.
2014-05-27 01:17:09 -07:00
Unknown W. Brackets
a70b5abfb9 Allow jit to be enabled/disabled at runtime. 2014-05-27 00:02:51 -07:00
Unknown W. Brackets
8afd1f028c Add a couple more memcpy() variants. 2014-05-26 11:20:34 -07:00
Unknown W. Brackets
288d867588 Fix a type comparison warning. 2014-05-21 08:00:31 -07:00
Unknown W. Brackets
5b24e0107f x86jit: Correct vsat0/vsat1 handling. 2014-05-16 01:04:58 -07:00
Unknown W. Brackets
ccc5458a84 x86jit: Correctly handle NaNs in vpfxd sat clamps.
May be a small performance hit, but we've seen games with bugs because of
NaNs recently, so better to be safe.
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
69b0b622be armjit: Fix D-prefix sat clamp NAN handling.
They should leave NAN alone.
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
bc32f0e0b2 armjit: Correct disabled vslt NaN handling.
Can possibly enable?
2014-05-16 01:04:57 -07:00
Unknown W. Brackets
a3ad238a44 Add notes for proper NaN handling in vmin/vmax. 2014-05-16 01:04:56 -07:00
Unknown W. Brackets
6ccae8f5a7 x86jit: Use a faster safemem fallback.
Really helps performance in games that use uncached addresses a lot,
without really impacting performance of most games which don't.

Of course, fastmem is faster.
2014-05-06 08:05:12 -07:00
The Dax
086d97516d Fix a couple ARM VFPU flags.
Unknown's explanation:
LO means Lower (unsigned), but for floats, it means "Less than".
LT means Lower (signed), but for floats it means "Less than OR unordered".
ARM docs at http://infocenter.arm.com/help/topic/com.arm.doc.dui0068b/Chdhcfbc.html explain the following:
LE means Signed less than or equal, and for floats it means "Less than or equal, or unordered".
LS means Unsigned lower or same, but for floats, it means "Less than or equal"
2014-05-04 22:37:41 -04:00
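The condition-code meanings quoted in the commit above can be checked with a small model. This is a sketch, not code from the repository: `Flags`, `VfpCompare`, the `Cond*` helpers, and `CheckUnordered` are hypothetical names. It assumes the standard NZCV results of an ARM VFP compare, where an unordered result (either operand is NaN) sets C and V and clears N and Z.

```cpp
#include <cassert>
#include <cmath>

// NZCV flags as they look after a VFP compare has been copied to the APSR.
struct Flags {
    bool n, z, c, v;
};

// Hypothetical model of the flag results of comparing a with b.
inline Flags VfpCompare(float a, float b) {
    if (std::isnan(a) || std::isnan(b))
        return {false, false, true, true};   // unordered
    if (a == b)
        return {false, true, true, false};   // equal
    if (a < b)
        return {true, false, false, false};  // less than
    return {false, false, true, false};      // greater than
}

// Integer meaning in the name, float meaning in the comment.
inline bool CondLO(Flags f) { return !f.c; }               // less than
inline bool CondLT(Flags f) { return f.n != f.v; }         // less than, or unordered
inline bool CondLS(Flags f) { return !f.c || f.z; }        // less than or equal
inline bool CondLE(Flags f) { return f.z || f.n != f.v; }  // less than or equal, or unordered

// With a NaN operand, only the "or unordered" conditions pass.
inline void CheckUnordered() {
    const Flags f = VfpCompare(std::nanf(""), 1.0f);
    assert(!CondLO(f) && !CondLS(f));
    assert(CondLT(f) && CondLE(f));
}
```

So a jit that must treat NaN as "not less than" would pick LO or LS, while LT or LE would also take the branch on NaN.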
Unknown W. Brackets
95dcadb6ae Ignore when a proxied block points to erased mem.
Happens for example when a new module is loaded, sometimes.
2014-05-04 01:25:19 -07:00
Unknown W. Brackets
c3a6092e26 Upgrade symbolmaps with module address info.
This fixes some issues with jit replacement only if you had a map lying
around.
2014-05-04 01:24:18 -07:00
Unknown W. Brackets
4d665b5e7a Fix replacement funcs in the interpreter. 2014-04-28 08:01:13 -07:00
Unknown W. Brackets
97c18e7f0e Comment out a few unsafe replacement funcs. 2014-04-22 08:07:10 -07:00
Unknown W. Brackets
7326c6e716 Fix a race condition on shutdown with hashmap.
Also, always need to init the blocks, they are not zero initialized.
2014-04-20 21:44:10 -07:00
Henrik Rydgard
f35168e0e0 Hardcode a bunch of function hashes so we can replace them.
Without needing an external file.
2014-04-18 19:00:08 +02:00
Unknown W. Brackets
dc39e75fc1 Oops, forgot about proxy blocks for replace jal.
Also fix a crash when they are used.
2014-04-17 01:03:46 -07:00
Unknown W. Brackets
ddd6e3024d Skip jals when replacing funcs.
Improves performance in God Eater (when replacements enabled.)
2014-04-16 23:57:52 -07:00
Unknown W. Brackets
dde2f3ade6 Re-replace functions after loading a savestate.
Might need to clear before saving too... anyway, this makes testing a bit
easier for certain areas.

Also, correctly decrease downcount on x86.
2014-04-12 15:49:20 -07:00
Unknown W. Brackets
76e61e10a9 Fix hashmap crashes with games that load modules.
This should properly unload and reload the functions as necessary.
2014-04-12 01:16:32 -07:00
Unknown W. Brackets
3001866d18 Skip flushing FPU/VFPU regs if none were allocated.
They're not used as often, so this usually saves time.  About 1% during
tests.
2014-03-30 00:42:25 -07:00
Unknown W. Brackets
a4327702f1 Reduce some includes under GPU/. 2014-03-29 16:51:38 -07:00
Henrik Rydgard
58237d976f Fix performance issue in BlockCache due to an instance of std::vector in every block:
Avoid creating the vector when not necessary.

This was especially noticeable in debug mode.
2014-03-29 22:26:51 +01:00
Henrik Rydgård
717c1cd34e Merge pull request #5748 from unknownbrackets/armjit-minor
armjit: Allow R1 in regalloc, use LR as temp
2014-03-29 04:09:58 -04:00
Unknown W. Brackets
600842d9a2 armjit: Use prefixes on vscl's T arg.
Makes it pass one more thing in the prefixes test, but not sure exactly
how it operates.  Better to have it the same as x86 and int anyway.
2014-03-29 01:00:29 -07:00
Unknown W. Brackets
5a89c17cf0 armjit: Allow R1 in regalloc, use LR as temp.
LR should be safe, although it may make stack traces not work within jit,
they don't really tend to work anyway.
2014-03-28 18:38:38 -07:00
Unknown W. Brackets
58c5179d8e Push and pop the callee saved NEON registers. 2014-03-25 22:34:42 -07:00
Unknown W. Brackets
2f5c6a5660 Fix VLDM/VSTM encoding for double/quad regs.
Duh, forgot to check Vd.  Fixes #5723.
2014-03-25 22:08:20 -07:00
Unknown W. Brackets
246eaeb209 x86jit: Avoid mem temp for float cmp/loads. 2014-03-22 15:56:28 -07:00