8111 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
319cf4bf3d Arm64: Preserve flags in ExitFunction
Mostly harmless (except for fusion on cortexes), prepares us for nzcv work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
d75c0f2c50 IR: Add missing flag clobbers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
a586d3823d IR: VFCMPEQ does not clobber flag
Oversight.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
5522c6db9c OpcodeDispatcher: Use jump wrappers
Mostly automated replacement + renaming for build fixing.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
367e1658ad OpcodeDispatcher: Add jump wrappers
These should always be used in the dispatcher rather than the raw jumps they
translate to, as they ensure that flags are flushed. Eliminates a class of bugs
that will become a lot easier to hit with the new nzcv work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
bbad06f81a ArchHelpers: Add cfinv()
FlagM goodness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:35 -04:00
Ryan Houdek
39c5ab1c81
Merge pull request #3225 from neobrain/fix_32bit_funcptrs
Thunks: Fix function pointer support on 32-bit
2023-10-25 07:07:18 -07:00
Ryan Houdek
5bf790324c
Merge pull request #3224 from neobrain/feature_thunk_strict_everywhere
Thunks: Annotate pointer parameters throughout all thunked libraries
2023-10-25 07:00:42 -07:00
Tony Wasserka
adcdb32d49 Thunks: Fix function pointer support on 32-bit 2023-10-25 14:37:22 +02:00
Tony Wasserka
f264578f12 Thunks: Unconditionally enable strict processing mode 2023-10-25 12:39:57 +02:00
Tony Wasserka
7149da387a Thunks: Annotate pointer parameters throughout all thunked libraries 2023-10-25 12:39:57 +02:00
Alyssa Rosenzweig
0f3d14e7c0
Merge pull request #3223 from Sonicadvance1/testharness_page_size
TestHarnessRunner: Don't hardcode stack allocation to 4096 bytes
2023-10-24 10:52:35 -04:00
Ryan Houdek
aad5080224 TestHarnessRunner: Don't hardcode stack allocation to 4096 bytes
Just allocate a single page that we query at runtime.
2023-10-24 07:36:13 -07:00
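As a rough illustration of the change described above, a runner can ask the host for its page size and map exactly one page for the stack. This is a minimal sketch assuming a POSIX host. `QueryPageSize` and `AllocateStackPage` are hypothetical names for illustration, not the TestHarnessRunner code.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: ask the host for its page size instead of assuming 4096.
static std::size_t QueryPageSize() {
  const long Size = sysconf(_SC_PAGESIZE);
  assert(Size > 0 && "sysconf(_SC_PAGESIZE) failed");
  return static_cast<std::size_t>(Size);
}

// Hypothetical helper: map one anonymous read/write page to serve as a stack.
// Returns nullptr on failure; on success *OutSize holds the mapping size.
static void* AllocateStackPage(std::size_t* OutSize) {
  const std::size_t Size = QueryPageSize();
  void* Ptr = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (Ptr == MAP_FAILED) {
    return nullptr;
  }
  *OutSize = Size;
  return Ptr;
}
```

On a 16K host the same call maps 16384 bytes rather than 4096, which is the point of querying at runtime.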
Ryan Houdek
c77a3d673c
Merge pull request #3222 from Sonicadvance1/nzcv_opt_bug
unittests: Adds test for bug from #3162
2023-10-23 18:19:16 -07:00
Ryan Houdek
0ff2e6e1e3 unittests: Adds test for bug from #3162
PR #3162 has a bug around flags calculation and REP LODS{B,W,D,Q}.
This test currently passes on main but fails on #3162.

The bug only occurs in 32-bit mode, not 64-bit, with the same test. Should
help diagnose the bugs in #3162.
2023-10-23 16:55:56 -07:00
Ryan Houdek
8538f5bac4
Merge pull request #3221 from Sonicadvance1/instcountci_missing_insts
InstCountCI: Adds two missing variants of movd/movq
2023-10-23 15:26:58 -07:00
Ryan Houdek
a305baf6e5 InstCountCI: Fixes some mislabeled instructions
These are optimal according to our standards.
2023-10-23 15:15:03 -07:00
Ryan Houdek
807619aa02 InstCountCI: Adds two missing variants of movd/movq
We only had the move to memory destination version, ensure we test to
GPR as well.
2023-10-23 15:01:18 -07:00
Ryan Houdek
6db2125b41
Merge pull request #3220 from Sonicadvance1/override_flagm
InstCountCI: Support disabling flagm extensions
2023-10-23 14:45:45 -07:00
Ryan Houdek
9f6d80fe5d InstCountCI: Duplicate tests that change behaviour based on flagm
Necessary for #3162 to have consistent behaviour in CI
2023-10-23 14:03:19 -07:00
Ryan Houdek
423ce12001 InstCountCI: Disable flagm and Flagm2 on tests
Most of these will get duplicated in the next commit
2023-10-23 14:02:50 -07:00
Ryan Houdek
4edd72fc33 InstCountCI: Support disabling flagm extensions
This is necessary so #3162 can give consistent results
2023-10-23 14:02:24 -07:00
Ryan Houdek
978f607dd9
Merge pull request #3177 from neobrain/feature_thunk_pointer_annotations
Thunks: Add new pointer annotations to assist data layout analysis
2023-10-23 13:00:27 -07:00
Ryan Houdek
9ba78c9771
Merge pull request #3219 from Sonicadvance1/enable_vixlsim_instcountci
github: Enables Vixl simulator on x86 host for instcountci
2023-10-23 12:49:33 -07:00
Ryan Houdek
e2144345c0 github: Enables Vixl simulator on x86 host for instcountci
Otherwise features get filled out weirdly.
2023-10-23 12:42:40 -07:00
Ryan Houdek
63e4c3682d
Merge pull request #3218 from Sonicadvance1/opt_df
OpcodeDispatcher: Optimize DF pointer offset calculation
2023-10-23 11:41:22 -07:00
Ryan Houdek
2956e84ead InstCountCI: Update for constant prop improvement 2023-10-23 10:39:29 -07:00
Ryan Houdek
4466c50c2b ConstProp: Optimize SubShift and Add with negative
When SubShift (LSL) occurs with both sources constant then optimize away
the calculation.

Additionally, if an add is found to have one immediate constant where the
negation of the constant fits into the ImmAddSub range, then negate the
constant and change the add into a sub.

This optimizes the cases when direction flag is known upfront in an
instruction.
2023-10-23 10:36:33 -07:00
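The two rewrites in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ConstProp pass itself: `IsImmAddSub` follows the AArch64 ADD/SUB (immediate) rule of a 12-bit unsigned value optionally shifted left by 12, while `FoldSubShiftLSL`, `RewriteAdd`, `AddForm` and `AddRewrite` are hypothetical names. Only 64-bit operations are modelled.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 ADD/SUB (immediate): 12-bit unsigned value, optionally shifted left by 12.
constexpr bool IsImmAddSub(uint64_t Imm) {
  return (Imm & ~0xFFFULL) == 0 || (Imm & ~0xFFF000ULL) == 0;
}

// Hypothetical fold: SubShift (LSL) with two constant sources becomes one constant.
constexpr uint64_t FoldSubShiftLSL(uint64_t Src1, uint64_t Src2, unsigned Shift) {
  return Src1 - (Src2 << Shift);
}

// Hypothetical result type describing how an add with a constant could be emitted.
enum class AddForm { AddImm, SubImm, AddReg };
struct AddRewrite {
  AddForm Form;
  uint64_t Imm;
};

// If the constant does not encode directly but its negation does,
// turn "add x, C" into "sub x, -C".
constexpr AddRewrite RewriteAdd(uint64_t Constant) {
  if (IsImmAddSub(Constant)) {
    return {AddForm::AddImm, Constant};
  }
  const uint64_t Negated = 0 - Constant; // two's complement negation
  if (IsImmAddSub(Negated)) {
    return {AddForm::SubImm, Negated};
  }
  return {AddForm::AddReg, Constant}; // constant must be materialized in a register
}
```

For example, adding the 64-bit constant -8 does not fit the immediate form, but subtracting 8 does, so the add becomes a sub with immediate 8.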
Ryan Houdek
dd5ca1d349 InstCountCI: Update for DF pointer optimization. 2023-10-23 10:14:51 -07:00
Ryan Houdek
95c756b466 OpcodeDispatcher: Optimize DF pointer offset calculation
Previously this moved two constants, did a compare and a csel. Four
instructions in total. It also corrupts NZCV which we want to use for
other things.

This new codegen emits one constant and one subtract instruction, two
instructions total and doesn't touch NZCV.

More optimal!
2023-10-23 09:27:41 -07:00
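One way to get from a select to a single subtract is sketched below. It assumes the direction flag is held as 0 or 1 and that the element size is a power of two, so the offset can be written as Size - (DF << (log2(Size) + 1)). That formula is an assumption made for illustration, chosen because it matches the "one constant and one subtract" description and the subtract-with-shifted-register op. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Old shape: pick +Size or -Size with a compare and select.
constexpr int64_t OffsetWithSelect(uint64_t DF, unsigned SizeLog2) {
  const int64_t Size = int64_t{1} << SizeLog2;
  return DF != 0 ? -Size : Size;
}

// New shape (assumed): one constant (Size) and one subtract of DF shifted left.
// DF == 0 gives Size; DF == 1 gives Size - 2 * Size == -Size.
constexpr int64_t OffsetWithSubShift(uint64_t DF, unsigned SizeLog2) {
  const uint64_t Size = uint64_t{1} << SizeLog2;
  return static_cast<int64_t>(Size - (DF << (SizeLog2 + 1)));
}
```

Because no compare is involved, this form leaves NZCV untouched, which is the property the commit message calls out.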
Ryan Houdek
99465faf63 IR: Implements support for subtract with shifted register
Will be used soon.
2023-10-23 09:27:41 -07:00
Ryan Houdek
e018917f76
Merge pull request #3216 from alyssarosenzweig/opt/nzcv-infra
Prep commits for NZCV modelling
2023-10-23 07:39:47 -07:00
Ryan Houdek
a261d9909e
Merge pull request #3214 from Sonicadvance1/fix_bug
FEXCore: Fixes bug in vector `ZextAndMaskingElimination` pass
2023-10-23 07:29:43 -07:00
Alyssa Rosenzweig
6de8bc6848 IR: Annotate instructions with implicit flag clobber
Audit the code base and mark any instruction that implicitly clobbers flags so
it can get special handling in the dispatcher to spill NZCV ahead of emitting.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:47 -04:00
Alyssa Rosenzweig
d87155e4ee IR: Add infrastructure for modelling flag clobbers
Lots of instructions clobber NZCV inadvertently but are not intended to write to
the host flags from the IR point-of-view. As an example, Abs logically has no
side effects but physically clobbers NZCV due to its cmp/csneg impl on non-CSSC
hw. Add infrastructure to model this in the IR so we can deal with it when we
start using NZCV for things.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:47 -04:00
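The idea of modelling implicit clobbers can be sketched as a per-op table that the emitter consults before emitting. This is a simplified illustration, not FEX's IR definitions: `Op`, `OpInfo`, `OpTable` and `Dispatcher` are hypothetical, and Abs is marked as a clobber unconditionally even though the message above ties it to non-CSSC hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical two-op IR for illustration.
enum class Op : std::size_t { Add, Abs };

struct OpInfo {
  const char* Name;
  bool ImplicitFlagClobber; // physically clobbers NZCV without logically writing flags
};

constexpr OpInfo OpTable[] = {
  {"Add", false},
  {"Abs", true}, // cmp/csneg implementation overwrites NZCV
};

struct Dispatcher {
  bool NZCVHoldsGuestFlags = false; // true while guest flags live only in host NZCV
  std::vector<std::string> Emitted;

  void Emit(Op O) {
    const OpInfo& Info = OpTable[static_cast<std::size_t>(O)];
    // Save the flags before an op that would silently destroy them.
    if (Info.ImplicitFlagClobber && NZCVHoldsGuestFlags) {
      Emitted.push_back("SpillNZCV");
      NZCVHoldsGuestFlags = false;
    }
    Emitted.push_back(Info.Name);
  }
};
```

With flags live, emitting Add then Abs inserts a single spill in front of the Abs; a second Abs needs no further spill because the flags are already saved.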
Alyssa Rosenzweig
7484cacaf9 InstCountCI: Update for VInsertElement change
Only SVE256 codepath affected.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:47 -04:00
Alyssa Rosenzweig
42259974c4 Arm64: Preserve NZCV in VInsertElement
So we don't need to mark VInsertElement as implicit clobber in the common case.
Only affects sve256 which doesn't exist yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:34 -04:00
Alyssa Rosenzweig
e455996dbd OpcodeDispatcher: Remove silly shift branching
The flag generation code does this internally and more efficiently.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:16:38 -04:00
Alyssa Rosenzweig
b5dd1d05e9 Dispatcher: Preserve NZCV
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:16:38 -04:00
Alyssa Rosenzweig
bbaf70da15 Dispatcher: Yeet pointless subs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:16:29 -04:00
Ryan Houdek
65e8d094ef
Merge pull request #3215 from alyssarosenzweig/hack/16k-units
FEXLoader: Query runtime page size
2023-10-23 06:43:43 -07:00
Alyssa Rosenzweig
4c801d594a FEXLoader: Query runtime page size
This lets most of the ASM tests run on 16K Linux hosts which is good because I
have a Mac and I'm bad at computer.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 09:35:22 -04:00
Ryan Houdek
d4403edea9 OpcodeDispatcher: Updates COMIS to eliminate scalar moves
This was one of the few things that managed to hit the previously
removed optimization. Just fix the OpcodeDispatcher instead.
2023-10-21 21:33:07 -07:00
Ryan Houdek
06ef012fb2 FEXCore: Fixes bug in vector ZextAndMaskingElimination pass
With the previous RCLSE pass optimization that fixes store->load
forwarding, this pass started optimizing harder.

This hit a bug with this vmov removal that previously didn't get hit.
In particular this would eliminate vmov IR operations even if they were
zero extending a vector.
Since we have dramatically cleaned up the amount of vmov IR operations
we are generating, remove this optimization entirely. In the games I
tested, the only game that hit this "optimization" was Ender Lilies and
it started generating broken code for the single block of instructions
that did.

Adds a unit test for this case just in case it comes back in the future
for some reason.

Fixes an issue where Ender Lilies would flash the screen to black every
time an enemy hit the player character.
2023-10-21 21:21:14 -07:00
Mai
8f8f37684a
Merge pull request #3213 from Sonicadvance1/fix_repres
JITArm64: Fixes bug in rpres scalar operations
2023-10-22 05:05:14 +02:00
Ryan Houdek
7140b8d901 InstCountCI: Update for RPRES fix 2023-10-21 15:29:11 -07:00
Ryan Houdek
d5beba9423 JITArm64: Fixes bug in rpres scalar operations
Noticed this during code investigation, these two operations were
swapped.

Would have caused issues if anything supported RPRES today.
2023-10-21 15:24:43 -07:00
Ryan Houdek
2e694412f4
Merge pull request #3211 from lioncash/ext
VectorOps: Handle SVE VExtr a little better
2023-10-19 16:04:02 +02:00
Lioncache
d84577c36c VectorOps: Handle SVE VExtr a little better
If the source registers don't alias the destination, then we can
safely move the lower bits over to it without using a temporary.
2023-10-19 15:11:23 +02:00
Ryan Houdek
cf9c2aa72c
Merge pull request #3206 from Sonicadvance1/fix_syscall
Linux: Fixes issue with *at syscalls with absolute paths not working
2023-10-19 15:05:34 +02:00