851 Commits

Author SHA1 Message Date
Unknown W. Brackets
a3a061a69f armjit: Optimize a division by a power of two.
These really happen.
2013-11-09 08:43:53 -08:00
Unknown W. Brackets
1776c85882 armjit: Implement a software divide for divu.
It's not actually that much code.
2013-11-09 08:43:52 -08:00
Unknown W. Brackets
b2a240d105 armjit: Implement msub/msubu. 2013-11-09 08:43:52 -08:00
Unknown W. Brackets
3aa8706ae7 armjit: Optimize lwl/lwr against an imm address. 2013-11-09 08:43:48 -08:00
Unknown W. Brackets
4026944b02 armjit: Handle lwl/lwr (not pretty, though.) 2013-11-09 08:42:30 -08:00
Henrik Rydgård
e90f7f360d Merge pull request #4480 from unknownbrackets/perf
Flush regs using STMIA if possible, plus imm adjustments (armjit)
2013-11-09 08:41:25 -08:00
Henrik Rydgard
06ce01ea04 Remove erroneous comment. 2013-11-09 17:34:52 +01:00
Unknown W. Brackets
54168b173e armjit: Clean up some magic numbers. 2013-11-09 08:25:08 -08:00
Unknown W. Brackets
6038d96b46 armjit: Flush regs using STMIA where possible. 2013-11-09 08:25:07 -08:00
Unknown W. Brackets
e686ff59bf armjit: Allocate regs in preferred slots.
This may allow better flushing.  Not sure if these are the best regs,
but if they aren't it shouldn't really hurt.
2013-11-09 08:25:07 -08:00
Unknown W. Brackets
cb3bb73148 armjit: Improve GPR typesafety. 2013-11-09 08:24:15 -08:00
Unknown W. Brackets
945b8bf5c5 armjit: optimize reverse subtract, avoid temp imms.
If we have a non-op2 imm, get rid of it asap.  If we have an op2-friendly
imm, keep it.
2013-11-09 08:18:43 -08:00
Unknown W. Brackets
415f22ecac armjit: Preserve imms on min/max as well. 2013-11-09 08:18:43 -08:00
Henrik Rydgard
502f772856 Add experimental mode to cache pointers in the arm jit.
Turned off for now as it needs more work but seems quite promising already.
2013-11-09 17:15:30 +01:00
Henrik Rydgard
58c39a38ee ARM regcache: Add mechanism to keep registers converted to pointers around 2013-11-09 16:57:29 +01:00
Henrik Rydgard
5ad04a23f4 x86 jit: Rename BindToRegister to MapReg 2013-11-09 15:23:31 +01:00
Henrik Rydgard
d26692ef92 Fix bug from a couple of commits ago in ARMJit 2013-11-09 15:22:39 +01:00
Henrik Rydgard
316d23d4cc Optimize mfv/mtv/mfc1/mtc1 on x86 too 2013-11-09 14:06:45 +01:00
Henrik Rydgard
04451623b9 This variant didn't seem to make much difference either (see prev commit) 2013-11-09 13:06:10 +01:00
Henrik Rydgard
15bc5a8db7 Add small ARM perf experiment. Did not help on ARMv7 so turned it off.
xsacha might want to try it on ARMv6.
2013-11-09 12:57:07 +01:00
Unknown W. Brackets
5d46a82f43 armjit: Use a MOV for add/or with 0.
Might skip the ALU, so might be faster.
2013-11-08 11:41:57 -08:00
Unknown W. Brackets
b8e126e7ce armjit: Preserve imms in slt/sltu as possible. 2013-11-08 11:41:57 -08:00
Unknown W. Brackets
8393d4aaae armjit: Preserve immediates more in nor. 2013-11-08 11:41:56 -08:00
Unknown W. Brackets
d7e42b26a3 armjit: Avoid flushing imm on add t0, imm, imm. 2013-11-08 11:41:56 -08:00
Unknown W. Brackets
a435c9dd13 armjit: Optimize movz/movn with immediates. 2013-11-08 11:41:55 -08:00
Unknown W. Brackets
376918c408 armjit: Reverse add t0, N, t1 to preserve imm. 2013-11-08 11:41:55 -08:00
Unknown W. Brackets
02dd250354 armjit: Optimize out a few immediate logic cases. 2013-11-08 11:39:24 -08:00
Henrik Rydgard
58db79672f Fix vmtvc on ARM, fixing issues with our prefix check. Add some logging.
Also improve vcmp on ARM.
2013-11-08 19:59:11 +01:00
Henrik Rydgard
309f904c0c Extract JitState into its own header (arm/x86) 2013-11-08 18:51:52 +01:00
Henrik Rydgard
f57f8170d3 ARMjit: Optimize mfv, mtv 2013-11-08 12:43:48 +01:00
Henrik Rydgard
dff0c431aa ARMjit: Optimize mfc1, mtc1 2013-11-08 12:43:48 +01:00
Henrik Rydgard
5a95e267fb Add an optimization to discard registers at the end of functions when possible.
Works in some games but crashes many so hiding it for now. Do not add UI.
2013-11-08 12:43:48 +01:00
Sacha
803148b8ca ARMv6: Fix offsets > 4096 for litpool. More aggressive check.
Somehow Scooby Doo gets to offsets of ~4200 unless I drop the threshold down to ~3200. Not sure why the offset can jump by so much in one instruction.
Makes Scooby Doo playable now instead of showing a blue screen in the main game. Likely affects other games.
2013-11-08 16:07:05 +10:00
Henrik Rydgard
c0d7c5e958 vsgn x86 bugfix 2013-11-07 21:07:07 +01:00
Henrik Rydgard
6eb7f94065 Implement vsgn in x86/x64 and ARM jit 2013-11-07 15:29:13 +01:00
Henrik Rydgard
32c95af820 ARM: Some zero-register fixes 2013-11-07 15:29:13 +01:00
Henrik Rydgard
91393093bc Re-enable the "nice delay slot" optimization on ARM 2013-11-07 15:29:12 +01:00
Henrik Rydgård
dbaac03afb Merge pull request #4462 from Kingcom/FollowOp
Extend follow functionality of disassembly
2013-11-06 12:15:57 -08:00
Henrik Rydgård
1e158fa652 ARM vtx dec: Preserving our FP scratch register appears to improve stability.
Also added some logging.
2013-11-06 11:47:26 +01:00
Henrik Rydgård
9be3f8fc0a Use ANDI2R instead of a BIC with a too large parameter 2013-11-06 10:50:30 +01:00
Sacha
81d3df0841 ARMJIT: Minor optimisations for armv6 and armv7. 2013-11-06 15:28:26 +10:00
Kingcom
aae08173f1 Extend follow functionality of disassembly 2013-11-05 21:20:21 +01:00
Sacha
61e6054920 Revert change to WSBH as we don't have a swap16 that takes/returns u32. 2013-11-06 01:38:02 +10:00
Henrik Rydgard
6483c0c2cd Two minor armjit optimizations 2013-11-05 16:25:01 +01:00
Sacha
a5011e3ff0 Improve swap usage in MIPS. ARMv6 can use REV/REV16. Interpreter can use existing swap functions. 2013-11-06 01:20:35 +10:00
Unknown W. Brackets
732ae13ebb Fast path CallSyscall where possible.
It seems we're spending a decent amount of time there, which isn't
entirely unexpected.  We can eliminate some things easily.
2013-11-04 07:59:37 -08:00
Unknown W. Brackets
baa82e0a9d Keep syscalls the same in the interpreter.
Rather than having different bugs.
2013-11-04 07:59:36 -08:00
Unknown W. Brackets
5efc7fd581 Fix typo. 2013-11-03 07:36:53 -08:00
Henrik Rydgard
c4e02ab41d Revert "Fix Comp_VRot on x86 Linux/Mac/etc."
Seems broken, doesn't build on Windows.

This reverts commit d41acebb3de06710fd71a757f598e7c95037166c.
2013-11-03 15:24:57 +01:00
Henrik Rydgård
f414856ebe Merge pull request #4428 from Kingcom/StepOver
Fix step over handling of fpu branches and jalr
2013-11-03 03:09:04 -08:00