9409 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
abfd974d70 OpcodeDispatcher: select hardware addressing modes
Now that we have a framework to do this in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:50 -04:00
Alyssa Rosenzweig
97966930e9 OpcodeDispatcher/x87f64: fuse addr calc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
a52a2e3ae4 OpcodeDispatcher/x87: fuse addr
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
c49b30f105 OpcodeDispatcher/Vector: fuse addr calc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
b0b4ad2083 OpcodeDispatcher: fuse xlat address
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
ee4bee4fef OpcodeDispatcher: fuse BT address
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
c3a0f5a2f6 OpcodeDispatcher: fuse sgdt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
0413a6bf68 OpcodeDispatcher: improve bmi2 shift
allow upper garbage, use simpler clean.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
7bd036d1ae OpcodeDispatcher: refactor address modes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:32 -04:00
Alyssa Rosenzweig
112c49a348 ConstProp: fix inlining shifted imm to mem instructions
hit by sse4_1-pmaxuw.c.gcc-target-test-64.jit.gcc-target-64

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:42:48 -04:00
Alyssa Rosenzweig
80878ae611 ConstProp: rework mem immediate inlining
deduplicate all the things.

functional change:
hit by sse4_1-pmaxuw.c.gcc-target-test-64.jit.gcc-target-64

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:42:48 -04:00
Alyssa Rosenzweig
85a69be5b6 ConstProp: drop address fusion
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:38:03 -04:00
Alyssa Rosenzweig
8b5ca303e3 JIT: add asserts for invalid TSO load/store
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 12:12:36 -04:00
Ryan Houdek
20d5a26a72
Merge pull request #3674 from alyssarosenzweig/opt/logical-flags
Optimize logical flags
2024-05-30 12:11:21 -07:00
Alyssa Rosenzweig
9346116485 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-30 14:42:29 -04:00
Alyssa Rosenzweig
bb8336fcad OpcodeDispatcher: optimize logical flags
fuse the PF write in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-30 14:42:22 -04:00
Ryan Houdek
ee96d60983
Merge pull request #3673 from alyssarosenzweig/ra/tied
Track tied sources in the IR
2024-05-30 10:55:15 -07:00
Alyssa Rosenzweig
6052b335dc
Merge pull request #3666 from Sonicadvance1/fix_initial_darwinia
FileManagement: Fix fstatat/statx with self and NOFOLLOW
2024-05-29 23:11:24 -04:00
Ryan Houdek
ab0a6bbe9f
Merge pull request #3669 from Sonicadvance1/fix_addshift_operation
ConstProp fixes for Darwinia
2024-05-29 19:43:13 -07:00
Ryan Houdek
9dd6d8ed94
Merge pull request #3639 from Sonicadvance1/cleanupFD
FEXLoader: Cleanup FD extraction from environment variables
2024-05-29 19:18:59 -07:00
Ryan Houdek
3b5d0e3e27
FEXLoader: Cleanup FD extraction from environment variables
In preparation for seccomp execve inheritance where we need to extract
another FD from a different environment variable.

- Small function to extract the FD and also unset the environment
  variable in the same place.
   - Keeping the fetch and unset together instead of spreading to
     another location in the source.
- Extract the FD upfront instead of passing the string_view around,
  since we are unsetting the environment variable at the same place.

Future seccomp inheritance will get the FD just after the FEXFD
   - `int FEXSeccompFD {GetFEXFDFromEnv("FEX_SECCOMPFD")};`
2024-05-29 18:47:28 -07:00
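Note on 3b5d0e3e27: the "fetch and unset together" helper described above could look roughly like the sketch below. Only the name `GetFEXFDFromEnv` and its single string argument come from the commit message; the return type, the `-1` error value, and the parsing strategy are assumptions made for illustration, not FEX's actual implementation.

```cpp
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <string_view>
#include <system_error>

// Hypothetical sketch: parse a file descriptor number out of an environment
// variable and unset that variable in the same place.
// Returns -1 (an assumed convention) if the variable is missing or malformed.
int GetFEXFDFromEnv(const char* Env) {
  assert(Env != nullptr);

  const char* Value = std::getenv(Env);
  if (Value == nullptr) {
    return -1;
  }

  // Parse before unsetting: Value points into the environment block.
  const std::string_view Str {Value};
  const char* End = Str.data() + Str.size();
  int FD = -1;
  const auto Result = std::from_chars(Str.data(), End, FD, 10);
  const bool Valid = Result.ec == std::errc {} && Result.ptr == End && FD >= 0;

  // Unset here so the variable is not inherited by child processes (POSIX).
  unsetenv(Env);

  return Valid ? FD : -1;
}
```

Returning the integer up front, rather than passing a `string_view` around, avoids holding a view into environment storage that has just been unset.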
Ryan Houdek
37e13cf073
FileManagement: Fix fstatat with self and NOFOLLOW
When asked not to follow the symlink, FEX needs to return data about the
symlink itself rather than about the target executable. Otherwise, games
that sanity check the result can break.

This is what happened with Darwinia in #3662.

We return the FEXInterpreter symlink information in this case since it
doesn't return any information that is relevant to leaking emulator
state. We only replace the result once the application asks to follow
through to the symlink target.

Also adds a unit test to ensure we don't break it.
2024-05-29 18:41:24 -07:00
Alyssa Rosenzweig
9b1b9c26cc
Merge pull request #3664 from Sonicadvance1/change_timestamp
FEXLogging: Changes representation of timestamp
2024-05-29 16:44:42 -04:00
Ryan Houdek
f7f3024b92
unittests/ASM: Adds SIB transpose scale register test
With a bit of pointer math it will choose the incorrect address if the
base and offset registers were transposed.
2024-05-29 11:41:20 -07:00
Alyssa Rosenzweig
11ec71a4ce InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Alyssa Rosenzweig
665491adf8 OpcodeDispatcher: drop weird !flagm special case
now that bfi is coalesced, this is a win.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Alyssa Rosenzweig
55391ccbc0 RegisterAllocationPass: try to coalesce tied sources
we'll do better in the future but this is already a win.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Alyssa Rosenzweig
7790d7a0b7 IR: track tied sources
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Ryan Houdek
f3d8c2cbac
unittests/ASM: Adds unittest for bug encountered in Darwinia
2024-05-29 07:29:49 -07:00
Ryan Houdek
1226069b4c
InstCountCI: Update for fixes
Only prefetch is hit currently, since ConstProp is limited to optimizing
the ADD IROp at the moment.
2024-05-29 04:42:40 -07:00
Ryan Houdek
80687c8d2d
ConstProp: Limits which addressing modes can be used for vector loadstores
This was causing us to generate invalid code in Darwinia, resulting in a
crash. With assertions enabled this would be picked up in the emitter.

Only implement AddShift optimizations for now because I don't want to do
the remaining optimizations in a bug fix PR.

Fixes Darwinia.
2024-05-29 04:42:11 -07:00
Ryan Houdek
920fe60492
ConstProp: Fix bug with transposed elements from AddShift op
We were accidentally swapping which source was the base and which was
the one getting shifted. This pattern wasn't very common, so it usually
didn't matter.

Fixes one crash in Darwinia.
2024-05-29 04:32:51 -07:00
Ryan Houdek
8c6ce2cb3b
Passes/ConstProp: Have MemExtendedAddressing return a struct rather than a tuple
Makes it less confusing about which variable is the base versus the
offset.

NFC
2024-05-29 04:32:14 -07:00
Ryan Houdek
61f30d004c
IR: Document AddShift behaviour
Just to clarify that Src2 is the shifted operation.
2024-05-29 04:29:09 -07:00
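Note on 61f30d004c and 920fe60492: a small model of the operand order may help show why transposing the sources produces a wrong address. The sketch assumes a logical left shift and 64-bit operands. `AddShift` here is an illustrative free function and `ExtendedAddress` a hypothetical struct, not the FEX IR definitions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: Src2 is the operand that gets shifted.
// Assumes a logical left shift; other shift types are not modelled.
uint64_t AddShift(uint64_t Src1, uint64_t Src2, unsigned Shift) {
  assert(Shift < 64);
  return Src1 + (Src2 << Shift);
}

// Hypothetical struct in the spirit of "return a struct rather than a tuple":
// named fields make it harder to confuse the base with the shifted index.
struct ExtendedAddress {
  uint64_t Base;
  uint64_t Index;
  unsigned Shift;
};

uint64_t Resolve(const ExtendedAddress& Addr) {
  return AddShift(Addr.Base, Addr.Index, Addr.Shift);
}
```

With a base of `0x1000`, an index of `2`, and a shift of `3`, the correct order gives `0x1010`, while the transposed order gives `0x8002`.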
Ryan Houdek
95919a1ddf
InstcountCI: Add addressing limit tests for base + offset<<shift
These need to be tested.
2024-05-29 04:28:24 -07:00
Ryan Houdek
35ec54f920
Merge pull request #3667 from alyssarosenzweig/opt/pcmp
Optimize PCMPESTRI flags a bit
2024-05-28 22:06:37 -07:00
Alyssa Rosenzweig
32e8a56093 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-28 09:32:14 -04:00
Alyssa Rosenzweig
136f1d0a0b OpcodeDispatcher: drop pcmpestri zext
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-28 09:32:14 -04:00
Alyssa Rosenzweig
0c042d1e85 VectorFallbacks: optimize PCMP*STRI flags
Return an NZCV.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-28 09:19:43 -04:00
Alyssa Rosenzweig
ad13442be4
Merge pull request #3665 from Sonicadvance1/sse42_instcountci
InstCountCI: Adds SSE4.2 operations
2024-05-28 08:57:45 -04:00
Ryan Houdek
d6b9252760
InstCountCI: Adds SSE4.2 operations
Doesn't handle all 127 combinations of the control immediate for all
four instructions, but it does supply the control immediate that A Hat
in Time abuses heavily.

The SSE2 implementation of the function in vcruntime140 is likely faster
than our current implementation, but we should be able to get something
comparable. Not bad considering this is a required extension and this is
the first game we found that abuses the instruction heavily.
2024-05-28 00:55:32 -07:00
Ryan Houdek
22222ebaf5
FEXLogging: Changes representation of timestamp
This was a bit confusing to read and I had always expected to change
this at some point.

Previous:
```
[INFO][1579518391560577][1601857.1601857] clone: Unsupported flags w/o CLONE_THREAD (Shared Resources), 4100
```

Now:
```
[INFO][1590468.992593376][1629501.1629501] clone: Unsupported flags w/o CLONE_THREAD (Shared Resources), 4100
```
2024-05-27 23:36:58 -07:00
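Note on 22222ebaf5: judging only from the two sample lines, the old field looks like a raw nanosecond count and the new one like `seconds.nanoseconds` with nine fractional digits. That reading is an assumption, and `FormatTimestamp` below is a hypothetical helper showing one way to produce the new layout, not the code in FEXLogging.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a nanosecond count as "seconds.nanoseconds",
// zero-padding the fractional part to nine digits.
std::string FormatTimestamp(uint64_t Nanoseconds) {
  constexpr uint64_t NanosPerSecond = 1'000'000'000ULL;

  char Buffer[32];
  const int Written = std::snprintf(Buffer, sizeof(Buffer), "%" PRIu64 ".%09" PRIu64,
                                    Nanoseconds / NanosPerSecond,
                                    Nanoseconds % NanosPerSecond);
  // At most 11 integer digits + '.' + 9 fractional digits fit in the buffer.
  assert(Written > 0 && static_cast<std::size_t>(Written) < sizeof(Buffer));

  return std::string(Buffer, static_cast<std::size_t>(Written));
}
```

Under this assumption, a count of `1590468992593376` would be rendered as `1590468.992593376`, matching the shape of the "Now" sample.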
Alyssa Rosenzweig
734258e23b
Merge pull request #3661 from Sonicadvance1/remove_warnings2
Removes warnings
2024-05-25 11:52:37 -04:00
Ryan Houdek
74916b3757
RAPass: Remove warnings
2024-05-24 18:41:30 -07:00
Ryan Houdek
c5359264a3
VixlUtils: Remove warnings
2024-05-24 18:41:19 -07:00
Ryan Houdek
9d0ff7929e
Merge pull request #3660 from alyssarosenzweig/opt/smash-less
Delete a big chunk of IR/Passes/*
2024-05-24 16:23:48 -07:00
Alyssa Rosenzweig
d3eed27d17 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:45:32 -04:00
Alyssa Rosenzweig
bc1669b163 DeadStoreElimination: eliminate map
use a vec. block indices will be dense in the new IR. This is memory intensive
but seems faster in practice.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
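Note on bc1669b163: the map-to-vec trade-off can be sketched as below. When block indices are dense, a vector indexed by block ID replaces hashing with a direct array access, at the cost of allocating one slot per block up front. `BlockInfo` and `BlockInfoTable` are hypothetical names with assumed fields; they do not reflect the actual pass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-block state; the real pass tracks different data.
struct BlockInfo {
  uint64_t Written = 0; // assumed bitmask of registers written in the block
  uint64_t Read = 0;    // assumed bitmask of registers read in the block
};

// Dense table: one slot per block, indexed directly by block ID.
// This allocates a slot for every block, even ones that are never touched.
class BlockInfoTable {
public:
  explicit BlockInfoTable(std::size_t BlockCount)
    : Infos(BlockCount) {}

  BlockInfo& operator[](uint32_t BlockID) {
    assert(BlockID < Infos.size());
    return Infos[BlockID];
  }

  std::size_t Size() const {
    return Infos.size();
  }

private:
  std::vector<BlockInfo> Infos;
};
```

A lookup such as `Table[BlockID].Written |= Mask` then involves no hashing or node allocation, which is one plausible reason the vector version profiles faster despite the larger footprint.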
Alyssa Rosenzweig
83e417b2c6 DeadStoreElimination: combine GPR/FPR handling
slight speed up per profile.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
cb00d9171f IR: merge general DCE with flag DCE
Flag DCE needs to do general DCE anyway to converge in one pass. So we can move
the special syscall/atomic logic over to flag DCE and then drop the second DCE
pass altogether. Now local dead code of both is eliminated in a single pass.

Flag DCE is carefully written to converge in a single iteration which makes this
scheme work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00