Ryan Houdek
c37a12e806
Merge pull request #3490 from Sonicadvance1/disable_assert
...
Disable assert in release
2024-03-11 15:48:18 -07:00
Ryan Houdek
54403e2146
Disable assert in release
...
The arguments and conditional don't get optimized out in release builds
with the inline function call, unlike with the define.
This was showing up an annoying amount of the time when testing.
2024-03-10 22:01:50 -07:00
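A minimal sketch of the difference described in this commit, assuming a release-mode assert written once as an inline function and once as a define. All names here are hypothetical and are not FEX's actual assert helpers. The argument has a visible side effect so that its evaluation can be counted; without such proof that the argument is side-effect free, the compiler has to keep it for the function form.

```cpp
#include <cassert>

// Hypothetical names for illustration; these are not FEX's real helpers.
static int Evaluations = 0;

// Has a visible side effect so we can observe whether it ran.
static bool ExpensiveCheck() {
  ++Evaluations;
  return true;
}

// Release-mode assert as an inline function: the body is empty, but the
// caller still has to evaluate the argument expression.
static inline void ReleaseAssertFn(bool /*Condition*/) {}

// Release-mode assert as a define: the preprocessor drops the argument
// tokens, so the expression is never evaluated.
#define RELEASE_ASSERT_MACRO(Condition) do { } while (0)

// Returns how many times ExpensiveCheck() ran for the function form.
static int EvaluationsWithFunction() {
  Evaluations = 0;
  ReleaseAssertFn(ExpensiveCheck());
  return Evaluations;
}

// Returns how many times ExpensiveCheck() ran for the macro form.
static int EvaluationsWithMacro() {
  Evaluations = 0;
  RELEASE_ASSERT_MACRO(ExpensiveCheck());
  return Evaluations;
}

// Small self-check tying the two forms together.
static void CheckAssertForms() {
  assert(EvaluationsWithFunction() == 1);
  assert(EvaluationsWithMacro() == 0);
}
```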
Paulo Matos
a86f2d3e2c
Improve 32bit constant usage in memory addressing
...
Folds reg+const memory address into addressing mode,
if the constant is within 16KB.
Update instcountci files.
Add test 32Bit_ASM/FEX_bugs/SubAddrBug.asm
2024-03-05 14:01:32 +00:00
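One plausible reading of the 16KB limit: A64 unsigned-offset loads and stores encode a 12-bit immediate scaled by the access size, so a 4-byte access can reach offsets up to 4095 * 4 = 16380 bytes. The range check below is a sketch under that assumption. The helper name is hypothetical and the code is not taken from the FEX source.

```cpp
#include <cstdint>

// Hypothetical helper, not FEX's actual code.
// Returns true if Offset can be encoded as an unsigned 12-bit immediate
// scaled by AccessSize. AccessSize must be a non-zero power of two
// (1, 2, 4, 8 or 16 bytes).
constexpr bool FitsScaledUImm12(uint64_t Offset, uint32_t AccessSize) {
  return (Offset % AccessSize) == 0 && (Offset / AccessSize) <= 4095;
}

// 4-byte accesses: the largest encodable offset is just under 16KB.
static_assert(FitsScaledUImm12(16380, 4), "last encodable 4-byte offset");
static_assert(!FitsScaledUImm12(16384, 4), "16KB itself is out of range");
static_assert(!FitsScaledUImm12(6, 4), "unaligned offsets do not fit");
```

Offsets that fail the check would still need the constant materialized into a register before the access.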
Alyssa Rosenzweig
11880459a5
OpcodeDispatcher: use SETF for DEC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-01 19:40:53 -04:00
Alyssa Rosenzweig
0ef0bb2c97
OpcodeDispatcher: use SETF for INC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-01 19:40:53 -04:00
Alyssa Rosenzweig
72edee7c6f
IR: add SETF8/SETF16 ir ops
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-01 19:40:53 -04:00
Ryan Houdek
009ae55ff0
Merge pull request #3475 from alyssarosenzweig/opt/lock-dec
...
Optimize lock dec
2024-02-29 08:44:24 -08:00
Ryan Houdek
98572b9e23
Merge pull request #3473 from Sonicadvance1/remove_mov_swap
...
Arm64: Stop moving source in atomic swap
2024-02-29 08:44:16 -08:00
Alyssa Rosenzweig
fed5e6d546
OpcodeDispatcher: use fetchadd for atomic DEC
...
Avoids a NEG.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-29 09:28:21 -04:00
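A sketch of the idea, assuming the earlier lowering was an atomic subtract that had to negate its operand first (Arm64 LSE provides an atomic add, LDADD, but no atomic subtract). With a constant operand, the decrement can be expressed directly as a fetch-add of -1. The helper is hypothetical and uses std::atomic rather than FEX's IR.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not FEX's IR or JIT code.
// Atomically decrements Mem and returns the value it held beforehand.
static uint32_t AtomicDecViaFetchAdd(std::atomic<uint32_t>& Mem) {
  // In two's complement, adding 0xFFFFFFFF equals subtracting 1, so no
  // runtime negate is needed before the atomic add.
  constexpr uint32_t MinusOne = ~uint32_t{0};
  return Mem.fetch_add(MinusOne);
}

// Small self-check of the decrement.
static void CheckAtomicDec() {
  std::atomic<uint32_t> Counter{5};
  assert(AtomicDecViaFetchAdd(Counter) == 5);
  assert(Counter.load() == 4);
}
```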
Alyssa Rosenzweig
f27e2246e2
Merge pull request #3468 from alyssarosenzweig/opt/miscs
...
Misc little opts
2024-02-29 09:18:17 -04:00
Ryan Houdek
eaf83aa6b4
Fix reserving range check
...
Fixes an issue where TestHarnessRunner was managing to reserve the space
below the stack again, resulting in stack growth breaking. This would typically
only show up when using the vixl simulator under gdb for some reason.
This is likely the last bandage on this code before it gets completely
rewritten to be more readable.
2024-02-29 04:02:05 -08:00
Ryan Houdek
c318947695
Arm64: Stop moving source in atomic swap
...
ldswpal doesn't overwrite the source register and only reads the bits
required for the sized operation.
Not sure exactly why we were doing a copy here.
Removing it improves Skyrim's hottest code block, as seen in #3472
2024-02-29 03:07:05 -08:00
Alyssa Rosenzweig
811487ad98
OpcodeDispatcher: use real branch for INT
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
4f4e38ace2
OpcodeDispatcher: use real branch for rep cmps
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
edd6becc56
OpcodeDispatcher: use real branch for rep scas
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
e47a94cae7
OpcodeDispatcher: skip mask with shld
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:18:29 -04:00
Alyssa Rosenzweig
cc82dba1ca
OpcodeDispatcher: use mvn for AF with constants
...
This reduces pointless constant usage. For now, it's no net change to
instcountci, but it should make it easier to get wins later. Hopefully.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:18:28 -04:00
Alyssa Rosenzweig
8232669b22
OpcodeDispatcher: simplify
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:00:13 -04:00
Alyssa Rosenzweig
28936073c4
OpcodeDispatcher: allow upper garbage on NEG
...
like SUB.
due to RA silliness, this is a loss for inst count but a win for cycles.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 09:59:57 -04:00
Alyssa Rosenzweig
ef2559d911
OpcodeDispatcher: allow garbage with SCAS
...
it's just feeding SUB flags which allow it
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 09:16:08 -04:00
Ryan Houdek
f346f89678
OpcodeDispatcher: Don't use AddShift with no shift
...
This accidentally removed optimizations elsewhere that were only checking
for Add.
2024-02-27 19:56:12 -08:00
Ryan Houdek
139367d248
Merge pull request #3463 from Sonicadvance1/update_xxhash
...
Update xxhash to v0.8.2
2024-02-27 16:39:38 -08:00
Alyssa Rosenzweig
49e798ab2b
Merge pull request #3461 from alyssarosenzweig/opt/sbc
...
Optimize SBC
2024-02-27 11:29:45 -04:00
Ryan Houdek
78a362581d
Update xxhash to v0.8.2
...
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
...
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
...
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
12cc980603
OpcodeDispatcher: shuffle adc flag order
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
f3d55dd721
OpcodeDispatcher: shuffle SBC flag order
...
avoids clobbering nzcv
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
91cef6b76f
OpcodeDispatcher: use native ADC even for 8/16-bit
...
we mask off the upper bits, and they agree in the lower bits.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
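The claim that the lower bits agree can be checked with a small model: a full-width add with carry on registers whose upper bits hold garbage, masked to 8 bits, gives the same value as an 8-bit add with carry on clean inputs. This sketch covers the result value only; flag computation is outside its scope. The function names are hypothetical.

```cpp
#include <cstdint>

// Reference 8-bit add-with-carry on clean 8-bit inputs.
constexpr uint8_t Adc8Reference(uint8_t A, uint8_t B, bool CF) {
  return static_cast<uint8_t>(A + B + (CF ? 1 : 0));
}

// Same result from a 32-bit add-with-carry on inputs whose upper bits
// may hold garbage, followed by masking off everything above bit 7.
constexpr uint8_t Adc8ViaNative(uint32_t A, uint32_t B, bool CF) {
  return static_cast<uint8_t>((A + B + (CF ? 1u : 0u)) & 0xFFu);
}

// Carries only propagate upward, so garbage in the upper bits never
// reaches the low byte.
static_assert(Adc8ViaNative(0xDEADBE80u, 0x123456FFu, true) ==
              Adc8Reference(0x80, 0xFF, true), "low bytes agree");
static_assert(Adc8ViaNative(0xFFFFFF01u, 0x00000002u, false) == 0x03,
              "upper garbage is masked off");
```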
Alyssa Rosenzweig
270cbf39b5
OpcodeDispatcher: specialize SALC
...
this gets rid of the awkward non-flag SBB case, which streamlines SBB, while
getting better codegen for the demon opcode (-:
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:33:41 -04:00
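For context, SALC sets AL to 0xFF when CF is set and to 0x00 otherwise, leaving the flags alone, which is the same value as `sbb al, al` without the flag update. The model below shows why a dedicated path can replace the generic non-flag SBB case. The function names are hypothetical and this is not FEX's dispatcher code.

```cpp
#include <cstdint>

// SALC as a specialized operation: AL = CF ? 0xFF : 0x00, flags untouched.
constexpr uint8_t Salc(bool CF) {
  return CF ? 0xFF : 0x00;
}

// The generic form it replaces: an 8-bit subtract-with-borrow that does
// not write flags.
constexpr uint8_t SbbNoFlags8(uint8_t Dst, uint8_t Src, bool CF) {
  return static_cast<uint8_t>(Dst - Src - (CF ? 1 : 0));
}

// `sbb al, al` yields 0 - CF regardless of what AL held before.
static_assert(SbbNoFlags8(0x42, 0x42, true) == Salc(true), "CF set");
static_assert(SbbNoFlags8(0x42, 0x42, false) == Salc(false), "CF clear");
```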
Alyssa Rosenzweig
2e0be0a5e7
OpcodeDispatcher: allow more upper garbage with adc
...
missed this last series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:42:13 -04:00
Alyssa Rosenzweig
d60c089697
OpcodeDispatcher: allow upper garbage with sbb
...
for the usual reasons
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
e76ebeab58
OpcodeDispatcher: use 1-op "src + CF"
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
333271d490
OpcodeDispatcher: fuse sbb when flags calculated
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
a750870abf
OpcodeDispatcher: use fused sbcs calculations
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
15db72ef60
IR: add Sbb, SbbWithFlags ops
...
For fusing sbc+sbcs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
5f16f357af
Merge pull request #3456 from alyssarosenzweig/opt/adc
...
Optimize ADC
2024-02-26 09:39:37 -04:00
Ryan Houdek
4f028b8614
Capture a 64-bit process trying to jump to 32-bit syscall handler
...
Fixes #591
Adds a simple unittest
2024-02-26 05:37:29 -08:00
Alyssa Rosenzweig
7deb4976a3
RedundantFlagCalculationElimination: fix missing NEG case
...
can be predicated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 17:37:16 -04:00
Alyssa Rosenzweig
1e153e0c81
OpcodeDispatcher: allow garbage with adcs
...
for the usual reasons
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:02 -04:00
Alyssa Rosenzweig
6994fc3a01
IR,OpcodeDispatcher,JIT: fuse adcs flags
...
The usual tricks, also requires introducing a bare adc op to optimize adcs to,
but we wanted that anyway!
Also support a zero source, so we can calculate "foo + CF" in one instruction to
optimize the "lock adc" cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:49:32 -04:00
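A minimal model of what a fused add-with-carry produces: the sum and the flags come out of one operation, and passing zero as the second source gives "foo + CF". It assumes Arm64-style NZCV semantics for addition. The struct and function names are hypothetical and do not mirror FEX's IR definitions.

```cpp
#include <cstdint>

// Result value plus NZCV flags from a single add-with-carry.
struct AdcResult {
  uint32_t Value;
  bool N, Z, C, V;
};

// Hypothetical model of a fused 32-bit ADC that also produces flags.
constexpr AdcResult AdcWithFlags(uint32_t A, uint32_t B, bool CarryIn) {
  const uint64_t Wide = uint64_t{A} + uint64_t{B} + (CarryIn ? 1u : 0u);
  const uint32_t Value = static_cast<uint32_t>(Wide);
  return AdcResult{
    Value,
    (Value >> 31) != 0,                     // N: sign bit of the result
    Value == 0,                             // Z: result is zero
    (Wide >> 32) != 0,                      // C: unsigned carry out
    ((~(A ^ B) & (A ^ Value)) >> 31) != 0,  // V: signed overflow
  };
}

// "foo + CF" with a zero second source, as a single operation.
constexpr AdcResult AddCarryOnly(uint32_t Foo, bool CarryIn) {
  return AdcWithFlags(Foo, 0, CarryIn);
}

static_assert(AddCarryOnly(0xFFFFFFFFu, true).Value == 0, "wraps to zero");
static_assert(AddCarryOnly(0xFFFFFFFFu, true).C, "carry out is set");
static_assert(AddCarryOnly(0x7FFFFFFFu, true).V, "signed overflow is set");
```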
Ryan Houdek
d703f3ccee
Fixes zero register flag generation
...
Fixes 140976d322
Adds a unit test to ensure it keeps working.
2024-02-24 16:32:25 -08:00
Alyssa Rosenzweig
80e632db8a
OpcodeDispatcher: garbage collect
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
852e3c4e93
OpcodeDispatcher: fuse XADD
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
e86547bbcb
OpcodeDispatcher: fuse INC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
883cca2e8f
OpcodeDispatcher: use AddWithFlags
...
give it the same treatment we just gave sub.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
8540332520
IR: add AddWithFlags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
df5bdefb8a
OpcodeDispatcher: merge secondary ALU with primary ALU
...
It's the same, stop copypasting. This gets our flag and arithmetic opts (current
and future) applied to secondary ALU too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
3d1fb7701c
OpcodeDispatcher: optimize sub
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
140976d322
OpcodeDispatcher: prep primary ALU for better flags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00