Paulo Matos
a1378f94ce
X87 Code Refactoring and Optimization Pass
2024-07-22 08:44:45 +02:00
Ryan Houdek
77ec950ff2
Merge pull request #3885 from alyssarosenzweig/opt/zero-flag
...
Optimize zero x87 flags
2024-07-21 13:06:25 -07:00
Alyssa Rosenzweig
592d6cc43f
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:50:10 -04:00
Alyssa Rosenzweig
610caf8529
ConstProp: treat StoreContext as zeroable
...
todo: FPR equivalent.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:09 -04:00
Alyssa Rosenzweig
d20b46e46f
IR: drop LoadFlag/StoreFlag ops
...
pointless, we can just load/store the context now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:09 -04:00
Alyssa Rosenzweig
4094aa1b9a
DeadStoreElimination: drop flag handling
...
now that we do everything via NZCV, this is mostly vestigial. DF/x87 flags are
sufficiently rare to be "don't care"s here, and we don't even have multiblock
enabled yet!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:08 -04:00
Ryan Houdek
f8c6baae97
Merge pull request #3883 from Sonicadvance1/implement_daz
...
Arm64: Implements support for DAZ using AFP.FIZ
2024-07-21 10:03:34 -07:00
Ryan Houdek
5c9bb6594c
Merge pull request #3884 from Sonicadvance1/remove_vex_telem
...
Telemetry: Remove VEX flag
2024-07-20 19:44:28 -07:00
Ryan Houdek
56df57e980
InstcountCI: Update
2024-07-20 17:26:27 -07:00
Ryan Houdek
f7b4d25803
Telemetry: Remove VEX flag
...
This is no longer necessary and it also no longer provides us any useful
information. Since we expose the AVX CPUID flag, basically everything
uses VEX encoding now, so it is basically always set.
2024-07-20 17:24:00 -07:00
Ryan Houdek
4fffe68f81
InstcountCI: Update
2024-07-20 15:57:01 -07:00
Ryan Houdek
95b15d788b
Arm64: Fix filling static registers
...
Some locations could end up with SRA registers that only spilled one
register.
Allow passing in temporaries from the call site.
Fixes rpid and syscalls asserting.
2024-07-20 15:57:01 -07:00
Ryan Houdek
ae9312bdab
unittests: Implements a DAZ test
...
Specifically does a vector add with and without DAZ enabled and ensures
the value is different when the source values contain a denormal.
2024-07-20 15:34:54 -07:00
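A minimal sketch of what the DAZ test in ae9312bdab checks, modelled in portable C++ rather than with the real vector instructions. The helper names (SmallestDenormal, ApplyDAZ, AddWithDAZ) are hypothetical and not taken from the FEX tree, and the sketch assumes the host itself does not flush denormals.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: the smallest positive denormal (subnormal) float.
inline float SmallestDenormal() {
  return std::numeric_limits<float>::denorm_min();
}

// Hypothetical model of DAZ: a denormal source operand is treated as a
// zero with the same sign; every other value passes through unchanged.
inline float ApplyDAZ(float Value) {
  return std::fpclassify(Value) == FP_SUBNORMAL ? std::copysign(0.0f, Value) : Value;
}

// Adds two values, optionally applying the DAZ model to both inputs first.
inline float AddWithDAZ(float A, float B, bool DAZEnabled) {
  if (DAZEnabled) {
    A = ApplyDAZ(A);
    B = ApplyDAZ(B);
  }
  return A + B;
}
```

With a denormal input, the sum is nonzero without DAZ and zero with it, which is the difference the unit test relies on.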
Ryan Houdek
b78da2e5ad
Arm64: Implements support for DAZ using AFP.FIZ
...
When AFP is supported then we can actually support DAZ. This might also
fix the audio corruption in Animal Well but I can't test it until Steam
is running on Oryon. Requires a bit of plumbing for MXCSR which we were
hacking around before but now we actually want to store the value.
Fixes #3856
2024-07-20 15:34:54 -07:00
Ryan Houdek
54fc8cb0bd
TestHarnessRunner: Support querying AFP for features
...
Also fixes desync of flags
2024-07-20 15:34:54 -07:00
Ryan Houdek
228009c283
Merge pull request #3881 from bylaws/race
...
FixedSizePooledAllocation: Fix a race when unclaiming disowned buffers
2024-07-20 12:35:35 -07:00
Billy Laws
1f878ce4cd
FixedSizePooledAllocation: Fix a race when unclaiming disowned buffers
...
A disowned buffer could be unclaimed or claimed by a different thread in
the time between the !IsFree check and locking the allocation mutex.
Fix this and prevent such errors in the future by always checking
ownership with the allocator locked before attempting to unclaim
buffers.
2024-07-20 00:12:52 +00:00
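The pattern described in 1f878ce4cd (check ownership only while the allocator is locked) can be sketched roughly as follows. This is a simplified illustration, not the actual FixedSizePooledAllocation code: the Pool class, the State enum, and all method names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical buffer states; the real allocator tracks more than this.
enum class State { Free, Owned, Disowned };

class Pool {
public:
  explicit Pool(std::size_t Count) : States(Count, State::Free) {}

  // Claim a buffer for use. Succeeds unless another owner holds it.
  bool Claim(std::size_t Index) {
    std::lock_guard<std::mutex> Lock{Mutex};
    if (States[Index] == State::Owned) {
      return false;
    }
    States[Index] = State::Owned;
    return true;
  }

  // The current owner gives the buffer up without freeing it.
  void Disown(std::size_t Index) {
    std::lock_guard<std::mutex> Lock{Mutex};
    assert(States[Index] == State::Owned);
    States[Index] = State::Disowned;
  }

  // The state check happens under the lock, so no other thread can claim
  // or unclaim the buffer between the check and the state change.
  bool UnclaimIfDisowned(std::size_t Index) {
    std::lock_guard<std::mutex> Lock{Mutex};
    if (States[Index] != State::Disowned) {
      return false;
    }
    States[Index] = State::Free;
    return true;
  }

private:
  std::mutex Mutex;
  std::vector<State> States;
};
```

The racy shape would test the state first and take the mutex afterwards; folding the test into the locked region closes that window.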
Alyssa Rosenzweig
e4b7a65a49
Merge pull request #3880 from pmatos/InstCountMemcpy
...
Add x87 memcpy instcountci tests
2024-07-19 08:53:23 -04:00
Paulo Matos
c77a707dbe
Add x87 memcpy instcountci tests
2024-07-19 09:09:34 +02:00
Ryan Houdek
f81fc4e4f0
Merge pull request #3866 from Sonicadvance1/ArgumentLoader_atexit_remove
...
ArgumentLoader: Removes static fextl::vector usage
2024-07-18 13:12:03 -07:00
Ryan Houdek
d385e496d3
Merge pull request #3879 from Sonicadvance1/cpuid_leafs
...
EmulatedFiles: Adds a few leaf CPUID flags
2024-07-18 13:10:21 -07:00
Mai
f8c4c543e3
Merge pull request #3871 from Sonicadvance1/improve_vpshufd_vpermilps
...
AVX128: Improve VPERMILPS/PD and VPSHUFD
2024-07-18 15:58:47 -04:00
Ryan Houdek
d2f903ae55
EmulatedFiles: Adds a few leaf CPUID flags
...
We support leaf functions now, so add the few that were calling for it.
We will be gaining support for the xsave ones relatively soon, so it's
good to have them supported.
Also deletes a couple of cdt/cqm things that aren't exposed and we won't
be supporting.
2024-07-18 07:02:49 -07:00
Ryan Houdek
d1249ec5cf
Merge pull request #3878 from neobrain/refactor_fix_format_oops
...
EmulatedFiles: Fix bad formatting
2024-07-18 06:44:44 -07:00
Tony Wasserka
bf9a6d763c
EmulatedFiles: Fix bad formatting
2024-07-18 15:06:57 +02:00
Ryan Houdek
0b829d2c46
unittests: Adds a test for full pshufd imm coverage
2024-07-18 04:13:03 -07:00
Ryan Houdek
bddb533fa0
InstcountCI: Add some more of the cases
2024-07-18 04:13:03 -07:00
Ryan Houdek
1c35eeffeb
Vector: Optimize PSHUFD with brute force search
...
With a brute force search of methods between 1-3 instructions we cover a
lot more cases more optimally.
There are definitely still more cases (and probably some that can be reduced
from 3 instructions to 2), but covering 44 cases is a pretty good margin
already.
2024-07-18 04:10:58 -07:00
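For context on 1c35eeffeb, any shorter instruction sequence found by such a search has to reproduce the architectural PSHUFD result for its immediate. A small reference model of that result is sketched below; the Vec4 alias and the PShufD function name are hypothetical, and this is not the search itself nor FEX's implementation.

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Reference semantics of PSHUFD: bits [2*i+1 : 2*i] of the immediate
// select which source element is written to destination element i.
inline Vec4 PShufD(const Vec4& Src, std::uint8_t Imm) {
  Vec4 Result{};
  for (unsigned i = 0; i < 4; ++i) {
    Result[i] = Src[(Imm >> (2 * i)) & 3];
  }
  return Result;
}
```

For example, an immediate of 0xE4 is the identity, 0x00 broadcasts element 0, and 0x1B reverses the four elements.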
Ryan Houdek
c7254e31ed
InstcountCI: Update for VPERM/VPSHUFD improvements
2024-07-18 04:10:58 -07:00
Ryan Houdek
b0bd8a62a2
AVX128: Improve VPERMILPS/PD and VPSHUFD
...
VPSHUFD and VPERMILPS are aliases of each other.
Reuses the implementation path from the PSHUFD implementation which has
a few swizzles and then a table lookup.
VPERMILPD is a very simple swizzle per 128-bit lane.
Fixes #3797
Fixes #3784
2024-07-18 04:10:58 -07:00
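The per-lane swizzle that b0bd8a62a2 mentions for VPERMILPD can be modelled as below for the 256-bit immediate form. This is a reference sketch of the instruction's semantics only; the Vec4d alias and the VPermilPD function name are hypothetical, and it says nothing about how FEX emits the Arm64 code.

```cpp
#include <array>
#include <cstdint>

using Vec4d = std::array<double, 4>;

// Reference semantics of 256-bit VPERMILPD with an immediate: within each
// 128-bit lane, one immediate bit per destination element picks the low or
// high double of that same lane. Bits 0-1 control lane 0, bits 2-3 lane 1.
inline Vec4d VPermilPD(const Vec4d& Src, std::uint8_t Imm) {
  Vec4d Result{};
  for (unsigned Lane = 0; Lane < 2; ++Lane) {
    for (unsigned Elem = 0; Elem < 2; ++Elem) {
      const unsigned Select = (Imm >> (Lane * 2 + Elem)) & 1;
      Result[Lane * 2 + Elem] = Src[Lane * 2 + Select];
    }
  }
  return Result;
}
```

An immediate of 0b1010 leaves the vector unchanged, while 0b0101 swaps the two doubles inside each lane.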
Ryan Houdek
17a55fbb39
Merge pull request #3876 from alyssarosenzweig/json/x87
...
Autogenerate LoweredX87() query, misc json_ir_generator cleanup in the area
2024-07-18 04:10:21 -07:00
Alyssa Rosenzweig
dedec83881
json_ir_generator: autoderive array names
...
these are purely internal.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:32:51 -04:00
Alyssa Rosenzweig
9fd5c73633
json_ir_generator: generate IsLoweredX87 helper
...
X87 pass will use this query.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:46 -04:00
Alyssa Rosenzweig
bdb890a8b0
json_ir_generator: rename X87 -> LoweredX87
...
to reflect its actual meaning
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:14 -04:00
Alyssa Rosenzweig
5043d09771
json_ir_generator: use textwrap.dedent, f-string
...
for size
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:28:59 -04:00
Ryan Houdek
da51169ba9
Merge pull request #3875 from alyssarosenzweig/ir/gethostflag
...
IR: garbage collect premature F80Cmp optimizations
2024-07-17 03:05:48 -07:00
Ryan Houdek
f72cee480f
Merge pull request #3874 from alyssarosenzweig/opt/reconstructftw
...
X87: save uop in ReconstructFTW
2024-07-17 03:05:37 -07:00
Alyssa Rosenzweig
7546160811
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
e7d5a01c5f
IR: remove F80Cmp flags
...
nothing is optimizing around this, it's just adding pointless complexity. if we
want to actually optimize F80Cmp, the right way would be to lift the
implementation into the OpcodeDispatcher or JIT. it wouldn't be terribly
difficult. This kludge doesn't get us closer there.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
0c3a8d0bc8
IR: remove GetHostFlag
...
it doesn't get host flags, it's just an extra Bfe used in x87. pointless and
confusing!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:44:34 -04:00
Alyssa Rosenzweig
19e58cac62
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:28 -04:00
Alyssa Rosenzweig
c4ba7eee87
X87: save uop in ReconstructFTW
...
noticed while reviewing Paulo's work
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:09 -04:00
Ryan Houdek
09c4a5594a
Merge pull request #3870 from Sonicadvance1/enable_more_tests
...
github: Vixl simulator enable more asm tests
2024-07-16 07:23:22 -07:00
Alyssa Rosenzweig
d204155661
Merge pull request #3872 from pmatos/X87AutoMarking
...
X87 Stack Ops Auto-marking
2024-07-16 09:45:42 -04:00
Tony Wasserka
924b8c10a9
Merge pull request #3873 from pmatos/UnusedFunction
...
Remove unused function MmapOverride
2024-07-16 12:30:04 +02:00
Paulo Matos
9017cd14c8
Remove unused function MmapOverride
2024-07-16 11:16:07 +02:00
Paulo Matos
8d89adef2e
Add IR stack operations
...
These IR operations deal implicitly with the x87 stack and are removed
by the x87 stack optimization pass.
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
6615b55c12
json_ir_generator: call RecordX87Use when generating ops
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
66865dd177
json_ir_generator: alias X87 to !JITDispatch
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
1e709d1150
OpcodeDispatcher: add RecordX87 helper
...
calls will be generated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00