Commit Graph

10035 Commits

Author SHA1 Message Date
Paulo Matos
a1378f94ce X87 Code Refactoring and Optimization Pass 2024-07-22 08:44:45 +02:00
Ryan Houdek
77ec950ff2
Merge pull request #3885 from alyssarosenzweig/opt/zero-flag
Optimize zero x87 flags
2024-07-21 13:06:25 -07:00
Alyssa Rosenzweig
592d6cc43f InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:50:10 -04:00
Alyssa Rosenzweig
610caf8529 ConstProp: treat StoreContext as zeroable
todo: FPR equivalent.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:09 -04:00
Alyssa Rosenzweig
d20b46e46f IR: drop LoadFlag/StoreFlag ops
pointless, we can just load/store the context now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:09 -04:00
Alyssa Rosenzweig
4094aa1b9a DeadStoreElimination: drop flag handling
now that we do everything via NZCV, this is mostly vestigial. DF/x87 flags are
sufficiently rare to be "don't care"s here, and we don't even have multiblock
enabled yet!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-21 15:49:08 -04:00
Ryan Houdek
f8c6baae97
Merge pull request #3883 from Sonicadvance1/implement_daz
Arm64: Implements support for DAZ using AFP.FIZ
2024-07-21 10:03:34 -07:00
Ryan Houdek
5c9bb6594c
Merge pull request #3884 from Sonicadvance1/remove_vex_telem
Telemetry: Remove VEX flag
2024-07-20 19:44:28 -07:00
Ryan Houdek
56df57e980
InstcountCI: Update 2024-07-20 17:26:27 -07:00
Ryan Houdek
f7b4d25803
Telemetry: Remove VEX flag
This is no longer necessary and it also no longer provides us any useful
information. Since we expose the AVX CPUID flag, basically everything
uses VEX encoding now, so it is basically always set.
2024-07-20 17:24:00 -07:00
Ryan Houdek
4fffe68f81
InstcountCI: Update 2024-07-20 15:57:01 -07:00
Ryan Houdek
95b15d788b
Arm64: Fix filling static registers
Some locations could end up with SRA registers that only spilled one
register.
Allow passing in temporaries from the call site.
Fixes rpid and syscalls asserting.
2024-07-20 15:57:01 -07:00
Ryan Houdek
ae9312bdab
unittests: Implements a DAZ test
Specifically does a vector add with and without DAZ enabled and ensures
the value is different when the source values contain a denormal.
2024-07-20 15:34:54 -07:00
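A minimal sketch of what the DAZ test checks, assuming a pure software model rather than the real MXCSR control bit. `ApplyDAZ` and `VectorAdd` are hypothetical names for illustration and are not taken from the unit test. With DAZ on, a denormal input is treated as a zero of the same sign before the add, so the two results differ whenever a source holds a denormal.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

using Vec4f = std::array<float, 4>;

// Software model of DAZ (assumption: the real test toggles the hardware bit).
// A denormal input becomes a zero with the same sign; other values pass through.
inline float ApplyDAZ(float Value) {
  return std::fpclassify(Value) == FP_SUBNORMAL ? std::copysign(0.0f, Value) : Value;
}

// Hypothetical helper: element-wise add, optionally treating denormal inputs as zero.
inline Vec4f VectorAdd(const Vec4f& A, const Vec4f& B, bool DAZ) {
  Vec4f Result{};
  for (std::size_t i = 0; i < Result.size(); ++i) {
    const float Lhs = DAZ ? ApplyDAZ(A[i]) : A[i];
    const float Rhs = DAZ ? ApplyDAZ(B[i]) : B[i];
    Result[i] = Lhs + Rhs;
  }
  return Result;
}
```

Adding two denormals without DAZ gives a nonzero (still tiny) sum, while the DAZ variant gives exactly zero, which is the difference the test looks for.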
Ryan Houdek
b78da2e5ad
Arm64: Implements support for DAZ using AFP.FIZ
When AFP is supported then we can actually support DAZ. This might also
fix the audio corruption in Animal Well but I can't test it until Steam
is running on Oryon. Requires a bit of plumbing for MXCSR which we were
hacking around before but now we actually want to store the value.

Fixes #3856
2024-07-20 15:34:54 -07:00
Ryan Houdek
54fc8cb0bd
TestHarnessRunner: Support querying AFP for features
Also fixes desync of flags
2024-07-20 15:34:54 -07:00
Ryan Houdek
228009c283
Merge pull request #3881 from bylaws/race
FixedSizePooledAllocation: Fix a race when unclaiming disowned buffers
2024-07-20 12:35:35 -07:00
Billy Laws
1f878ce4cd FixedSizePooledAllocation: Fix a race when unclaiming disowned buffers
A disowned buffer could be unclaimed or claimed by a different thread in
the time between the !IsFree check and locking the allocation mutex.
Fix this and prevent such errors in the future by always checking
ownership with the allocator locked before attempting to unclaim
buffers.
2024-07-20 00:12:52 +00:00
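The pattern described in this commit message can be sketched as follows. This is not the FixedSizePooledAllocation code: `Pool`, `Buffer`, and every member name are hypothetical stand-ins, and a single mutex is assumed. The point is only that the "is this buffer disowned?" check happens while the allocator lock is held, so no other thread can claim or unclaim the buffer between the check and the state change.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical buffer record; a default-constructed thread id means "disowned".
struct Buffer {
  std::thread::id Owner{};
  bool Claimed = false;
};

class Pool {
public:
  explicit Pool(std::size_t Count) : Buffers(Count) {}

  // Claim the first free buffer for the calling thread, or return nullptr.
  Buffer* Claim() {
    std::lock_guard<std::mutex> Lock{Mutex};
    for (auto& Buf : Buffers) {
      if (!Buf.Claimed) {
        Buf.Claimed = true;
        Buf.Owner = std::this_thread::get_id();
        return &Buf;
      }
    }
    return nullptr;
  }

  // Drop ownership but keep the buffer claimed until it is explicitly unclaimed.
  void Disown(Buffer* Buf) {
    std::lock_guard<std::mutex> Lock{Mutex};
    Buf->Owner = std::thread::id{};
  }

  // The ownership check and the unclaim happen under the same lock,
  // so there is no window for another thread to change the state in between.
  bool UnclaimIfDisowned(Buffer* Buf) {
    std::lock_guard<std::mutex> Lock{Mutex};
    if (!Buf->Claimed || Buf->Owner != std::thread::id{}) {
      return false;
    }
    Buf->Claimed = false;
    return true;
  }

private:
  std::mutex Mutex;
  std::vector<Buffer> Buffers;
};
```

The racy shape this replaces would test the buffer state first and take the lock afterwards; in that window another thread could already have claimed or unclaimed the same buffer.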
Alyssa Rosenzweig
e4b7a65a49
Merge pull request #3880 from pmatos/InstCountMemcpy
Add x87 memcpy instcountci tests
2024-07-19 08:53:23 -04:00
Paulo Matos
c77a707dbe Add x87 memcpy instcountci tests 2024-07-19 09:09:34 +02:00
Ryan Houdek
f81fc4e4f0
Merge pull request #3866 from Sonicadvance1/ArgumentLoader_atexit_remove
ArgumentLoader: Removes static fextl::vector usage
2024-07-18 13:12:03 -07:00
Ryan Houdek
d385e496d3
Merge pull request #3879 from Sonicadvance1/cpuid_leafs
EmulatedFiles: Adds a few leaf CPUID flags
2024-07-18 13:10:21 -07:00
Mai
f8c4c543e3
Merge pull request #3871 from Sonicadvance1/improve_vpshufd_vpermilps
AVX128: Improve VPERMILPS/PD and VPSHUFD
2024-07-18 15:58:47 -04:00
Ryan Houdek
d2f903ae55
EmulatedFiles: Adds a few leaf CPUID flags
We support leaf functions now, so add the few that were calling for it.
We will be gaining support for the xsave ones relatively soon, so it's
good to have them supported.

Also deletes a couple of cdt/cqm things that aren't exposed and we won't
be supporting.
2024-07-18 07:02:49 -07:00
Ryan Houdek
d1249ec5cf
Merge pull request #3878 from neobrain/refactor_fix_format_oops
EmulatedFiles: Fix bad formatting
2024-07-18 06:44:44 -07:00
Tony Wasserka
bf9a6d763c EmulatedFiles: Fix bad formatting 2024-07-18 15:06:57 +02:00
Ryan Houdek
0b829d2c46
unittests: Adds a test for full pshufd imm coverage 2024-07-18 04:13:03 -07:00
Ryan Houdek
bddb533fa0
InstcountCI: Add some more of the cases 2024-07-18 04:13:03 -07:00
Ryan Houdek
1c35eeffeb
Vector: Optimize PSHUFD with brute force search
With a brute force search of methods between 1-3 instructions we cover a
lot more cases more optimally.

There are definitely still more cases (and probably some that can be reduced
from 3 instructions to 2), but covering 44 cases is a pretty good margin
already.
2024-07-18 04:10:58 -07:00
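For context, a small reference model of what a PSHUFD immediate encodes. This is a sketch of the x86 instruction semantics only, not of the brute-force search or of the Arm64 sequences it found. `Pshufd` and `IsBroadcast` are hypothetical helper names; the broadcast check is just one example of an immediate shape that is easy to recognise, and the commit does not say which shapes it special-cases.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;

// Reference PSHUFD: bits [2*i+1 : 2*i] of the immediate pick the source
// element that lands in destination element i.
inline Vec4 Pshufd(const Vec4& Src, uint8_t Imm) {
  Vec4 Dst{};
  for (int i = 0; i < 4; ++i) {
    Dst[i] = Src[(Imm >> (2 * i)) & 3];
  }
  return Dst;
}

// Illustrative classifier: true when all four 2-bit selectors are equal,
// i.e. the shuffle copies one source element into every lane.
inline bool IsBroadcast(uint8_t Imm) {
  return Imm == static_cast<uint8_t>((Imm & 3) * 0x55);
}
```

With this model, immediate `0xE4` is the identity shuffle and `0x1B` reverses the four elements. There are 256 possible immediates in total, which is what makes an exhaustive search over short instruction sequences practical.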
Ryan Houdek
c7254e31ed
InstcountCI: Update for VPERM/VPSHUFD improvements 2024-07-18 04:10:58 -07:00
Ryan Houdek
b0bd8a62a2
AVX128: Improve VPERMILPS/PD and VPSHUFD
VPSHUFD and VPERMILPS are aliases of each other.

Reuses the implementation path from the PSHUFD implementation which has
a few swizzles and then a table lookup.

VPERMILPD is a very simple swizzle per 128-bit lane.

Fixes #3797
Fixes #3784
2024-07-18 04:10:58 -07:00
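The "simple swizzle per 128-bit lane" for VPERMILPD can be modelled as below. This is a sketch of the immediate form of the instruction on a 256-bit vector, written as plain scalar code; `Vpermilpd256` is a hypothetical name and this is not the AVX128 implementation from the commit. Each destination element takes one bit of the immediate and uses it to choose the low or high double within its own 128-bit lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec4d = std::array<double, 4>;

// Reference VPERMILPD (256-bit, immediate form): bit i of the immediate
// selects the low (0) or high (1) double of the 128-bit lane holding element i.
inline Vec4d Vpermilpd256(const Vec4d& Src, uint8_t Imm) {
  Vec4d Dst{};
  for (int i = 0; i < 4; ++i) {
    const int LaneBase = i & ~1;        // 0 for elements 0-1, 2 for elements 2-3
    const int Select = (Imm >> i) & 1;  // 0 = low double, 1 = high double
    Dst[i] = Src[LaneBase + Select];
  }
  return Dst;
}
```

Because elements never cross between the two 128-bit lanes, each half can be handled independently.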
Ryan Houdek
17a55fbb39
Merge pull request #3876 from alyssarosenzweig/json/x87
Autogenerate LoweredX87() query, misc json_ir_generator cleanup in the area
2024-07-18 04:10:21 -07:00
Alyssa Rosenzweig
dedec83881 json_ir_generator: autoderive array names
these are purely internal.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:32:51 -04:00
Alyssa Rosenzweig
9fd5c73633 json_ir_generator: generate IsLoweredX87 helper
X87 pass will use this query.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:46 -04:00
Alyssa Rosenzweig
bdb890a8b0 json_ir_generator: rename X87 -> LoweredX87
to reflect its actual meaning

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:14 -04:00
Alyssa Rosenzweig
5043d09771 json_ir_generator: use textwrap.dedent, f-string
for size

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:28:59 -04:00
Ryan Houdek
da51169ba9
Merge pull request #3875 from alyssarosenzweig/ir/gethostflag
IR: garbage collect premature F80Cmp optimizations
2024-07-17 03:05:48 -07:00
Ryan Houdek
f72cee480f
Merge pull request #3874 from alyssarosenzweig/opt/reconstructftw
X87: save uop in ReconstructFTW
2024-07-17 03:05:37 -07:00
Alyssa Rosenzweig
7546160811 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
e7d5a01c5f IR: remove F80Cmp flags
nothing is optimizing around this, it's just adding pointless complexity. if we
want to actually optimize F80Cmp, the right way would be to lift the
implementation into the OpcodeDispatcher or JIT. it wouldn't be terribly
difficult. This kludge doesn't get us closer there.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
0c3a8d0bc8 IR: remove GetHostFlag
it doesn't get host flags, it's just an extra Bfe used in x87. pointless and
confusing!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:44:34 -04:00
Alyssa Rosenzweig
19e58cac62 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:28 -04:00
Alyssa Rosenzweig
c4ba7eee87 X87: save uop in ReconstructFTW
noticed while reviewing Paulo's work

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:09 -04:00
Ryan Houdek
09c4a5594a
Merge pull request #3870 from Sonicadvance1/enable_more_tests
github: Vixl simulator enable more asm tests
2024-07-16 07:23:22 -07:00
Alyssa Rosenzweig
d204155661
Merge pull request #3872 from pmatos/X87AutoMarking
X87 Stack Ops Auto-marking
2024-07-16 09:45:42 -04:00
Tony Wasserka
924b8c10a9
Merge pull request #3873 from pmatos/UnusedFunction
Remove unused function MmapOverride
2024-07-16 12:30:04 +02:00
Paulo Matos
9017cd14c8 Remove unused function MmapOverride 2024-07-16 11:16:07 +02:00
Paulo Matos
8d89adef2e Add IR stack operations
These IR operations deal implicitly with the x87 stack and are removed
by the x87 stack optimization pass.
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
6615b55c12 json_ir_generator: call RecordX87Use when generating ops
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
66865dd177 json_ir_generator: alias X87 to !JITDispatch
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
1e709d1150 OpcodeDispatcher: add RecordX87 helper
calls will be generated.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00