150 Commits

Author SHA1 Message Date
Unknown W. Brackets
2f5c6a5660 Fix VLDM/VSTM encoding for double/quad regs.
Duh, forgot to check Vd.  Fixes #5723.
2014-03-25 22:08:20 -07:00
Unknown W. Brackets
b589d3b170 vertexjit: Fix a silly mistake in weights > 4.
Darn switch, took me way too long to notice this.
2014-03-23 19:02:40 -07:00
Unknown W. Brackets
717e6db3a7 Fix VDUP so that it actually works. 2014-03-23 17:51:32 -07:00
Unknown W. Brackets
f74b765ff3 Fix VLDM/VSTM encoding for D/Q regs.
Now it is actually using ASIMD/NEON.
2014-03-23 16:26:13 -07:00
Unknown W. Brackets
ff2d5bb17e Add a float->GP reg, fix VDUP for I_16. 2014-03-23 16:25:56 -07:00
Unknown W. Brackets
8056440ba1 Implement NEON register VMOVs. 2014-03-23 16:25:52 -07:00
Unknown W. Brackets
2b586b2a7a Support other constant VMOVs on NEON.
Float is especially useful.
2014-03-23 16:12:51 -07:00
Unknown W. Brackets
3e1cd5c161 Add a NEON VMOV imm encoding to the emitter. 2014-03-23 16:12:46 -07:00
Unknown W. Brackets
06db03ac9e Add some asserts to VLDM/VSTM. 2014-03-22 16:12:35 -07:00
Unknown W. Brackets
60bbf4af3b Fix VLD1/VST1 n=4. 2014-03-22 16:12:08 -07:00
Unknown W. Brackets
f3d38ee269 Fix VMOV for Dregs and VSHL reg order. 2014-03-22 16:12:00 -07:00
Unknown W. Brackets
0da9c1851c vertexjit: Add VQMOV(U)N and fix VMOVN size.
It will be too confusing if it's specified as the destination, unlike
VMOVL.  Plus the assembler syntax uses the source size.
2014-03-22 16:11:36 -07:00
Henrik Rydgård
8dfadf7b8e ArmEmitter: Add VMOV_neon and a Size parameter to VFMA for consistency. 2014-03-22 16:31:16 +01:00
Unknown W. Brackets
e783627947 armjit: Use our I2R funcs on reg/reg math too.
When one is a known immediate.  This should catch more cases, like:

ori v0, $0, 0xFFFF
and v1, v1, v0
2014-03-14 19:15:43 -07:00
Unknown W. Brackets
836787d19a Optimize ANDI2R() to use UBFX if possible.
This way, & 0x7FF or & 0xFFFF etc. are all fast when on ARMv7.
2014-03-14 19:15:42 -07:00
Unknown W. Brackets
3a07924ad9 Add Try arm emitter I2R funcs.
This way we can use them without giving up the regcache's immediate
optimizations.
2014-03-14 19:15:42 -07:00
Unknown W. Brackets
48fa22b7cf B/BL were swapped in the arm emitter.
Oops...
2014-03-14 19:15:41 -07:00
Unknown W. Brackets
76305130ee Add a couple missing (unused) ARM instructions. 2014-03-14 19:15:39 -07:00
Henrik Rydgard
ab9cd99d0f Major ARM disassembler improvements, will make debugging the JIT easier 2014-03-12 18:09:28 +01:00
Henrik Rydgard
adadf11890 An attempt to combine FPU regcache writebacks with VSTMIA. Disabled due to bugs. 2014-03-11 11:03:51 +01:00
Henrik Rydgard
2eb6a4e2f2 Fix a warning, rename some parameters, etc. 2014-03-08 10:40:43 +01:00
Sacha
ad31cd1b7c Clean up ArmEmitter (cross-merge from Dolphin, minus the bad bits) 2014-03-07 15:47:34 +10:00
Sacha
30a6a5d10f ARMJIT: Implement VLDM/VSTM load/store combinations and use in armjit. Also add them to disassembler. 2014-03-07 02:56:34 +10:00
Unknown W. Brackets
2655a4cba6 Include some now-missing things for Linux. 2013-12-30 21:15:00 -08:00
Henrik Rydgård
e5e17fbc6e More include cleanup. Hoping for very slightly faster compile times.. 2013-12-30 10:49:05 +01:00
Unknown W. Brackets
e6b2d00a2f Avoid reserved identifiers like _SP, etc.
R_SP is not that bad.
2013-12-29 14:25:34 -08:00
Henrik Rydgard
8956fb2932 Minor optimization in ADDI2R 2013-11-30 15:52:59 +01:00
Unknown W. Brackets
dffa35ef2f When ins is used with a zero argument, don't OR.
Seems it's used effectively to mask out bits with rs=zero.  Makes sense...
2013-11-29 09:17:12 -08:00
Henrik Rydgard
aaab7e32d2 ARM emitter: Fix VDUP 2013-11-24 19:30:25 +01:00
Henrik Rydgard
030e6460cc ARM: NEON-optimize software skinning 2013-11-24 18:03:42 +01:00
Henrik Rydgard
dfea160491 ARM: Use PLD (cache preload) in vertex decoder loop. 2013-11-24 15:08:47 +01:00
Henrik Rydgard
f650b23c90 ARM: Add NEON widening and narrowing moves, and float/int convert.
Experiment a little in the vertex decoder.
2013-11-24 13:30:28 +01:00
Henrik Rydgard
16509ba3e9 ARMEmitter: Make the helper functions private. 2013-11-23 13:43:22 +01:00
Henrik Rydgard
cda4e9cbf3 ARM emitter: Complete VLD1/VST1 for lanes and to-all-lanes. 2013-11-23 13:36:26 +01:00
Henrik Rydgard
e0eb152fb9 VLD1/VST1: Change argument ordering again. 2013-11-23 11:05:19 +01:00
Henrik Rydgard
b64f44c3fc ARM emitter: Implement VMLA and VMUL by scalar, VLD1/VST1 multiple 2013-11-23 01:51:35 +01:00
Unknown W. Brackets
c50ab6d6aa armjit: Fix divu when divisor is a constant 1.
Fixes #4539 and #4520.
2013-11-19 13:24:15 -08:00
Unknown W. Brackets
f165a15eff Fix a -unsigned warning.
Looks ugly, but (u32)-(s32)val is what we really want here.

Also make a __FUNCTION__ redeclaration warning go away.
2013-11-15 08:18:34 -08:00
Sacha
e3bdb3e09b Disable LitPool as it is causing crashes with Vertex Decoder JIT. Performance seems to be almost unaffected since the IMM changes. 2013-11-15 14:12:00 +10:00
Sacha
20e8a81268 Switch to compile-time ARMV7 define. 2013-11-15 11:20:39 +10:00
Henrik Rydgård
ddf5b695ac Update ArmEmitter with Sonic1's new NEON emitters. Thanks! 2013-11-13 11:34:47 +01:00
Unknown W. Brackets
1a98691c57 armjit: Fix ANDI2R() clearing low bits incorrectly. 2013-11-11 19:07:16 -08:00
Unknown W. Brackets
ee492099b5 Avoid a literal in ORI2R where possible. 2013-11-10 14:38:08 -08:00
Unknown W. Brackets
83fe874dcc armjit: Use multiple BICs in ANDI2R if possible.
Rather than a temporary.
2013-11-09 08:42:31 -08:00
Unknown W. Brackets
6038d96b46 armjit: Flush regs using STMIA where possible. 2013-11-09 08:25:07 -08:00
Henrik Rydgard
1bf83efe9e ARM optimization in ADDI2R: Dual adds instead of MOVI2R, ADD when possible 2013-11-08 12:43:47 +01:00
Unknown W. Brackets
f6662054bd Fix arm emitter bug in LDRH and friends. 2013-11-05 00:32:08 -08:00
Unknown W. Brackets
7a8671f8a2 Add a TSTI2R helper for readability mainly. 2013-11-03 21:58:26 -08:00
Unknown W. Brackets
5de7181b36 Add other forms of LDM/STM to the emitter. 2013-11-03 21:31:05 -08:00
Unknown W. Brackets
95c8ee5089 Missing stddef library (Linux buildfix.) 2013-10-27 15:52:40 +00:00
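
Note on 836787d19a (ANDI2R() using UBFX): the log does not show the emitter change itself, so the following is only a minimal sketch of the underlying idea, not the code from that commit. An AND with a mask such as 0x7FF or 0xFFFF keeps a contiguous run of low bits, which is what UBFX (unsigned bit field extract) does with lsb = 0 and a given width. The helper names IsLowBitMask and LowBitMaskWidth are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not from the repository): true when mask is a
// non-empty run of ones starting at bit 0, e.g. 0x1, 0x7FF, 0xFFFF.
// For such a mask, mask + 1 is a power of two (or wraps to 0 for
// 0xFFFFFFFF), so it shares no bits with mask.
constexpr bool IsLowBitMask(uint32_t mask) {
    return mask != 0 && (mask & (mask + 1)) == 0;
}

// Hypothetical helper: position of the highest set bit plus one.
// For a mask accepted by IsLowBitMask() this equals the number of low
// bits kept, i.e. the width a UBFX with lsb = 0 would need.
constexpr int LowBitMaskWidth(uint32_t mask) {
    return mask == 0 ? 0 : 1 + LowBitMaskWidth(mask >> 1);
}

// The two masks named in the commit message both qualify.
static_assert(IsLowBitMask(0x7FF), "0x7FF is a low-bit mask");
static_assert(IsLowBitMask(0xFFFF), "0xFFFF is a low-bit mask");
// A mask that does not start at bit 0 would need a different approach.
static_assert(!IsLowBitMask(0xFF00), "0xFF00 does not start at bit 0");
```

Under these assumptions, an emitter could check IsLowBitMask(imm) first and fall back to its existing AND/BIC paths when the check fails. How the real ANDI2R() orders those cases is not visible in this log.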