Ryan Houdek
0139498072
SpinWaitLock: Removes unused variable in spin-loop fallback
...
Tmp was no longer being used, forgot to remove it.
2024-02-05 07:22:52 -08:00
Ryan Houdek
472a701e2b
Merge pull request #3403 from Sonicadvance1/fix_spinlock_contended_lock
...
SpinLockWait: Fixes unexpected lock success
2024-02-05 06:51:42 -08:00
Ryan Houdek
cce6011205
SpinLockWait: Fixes unexpected lock success
...
With a contended unique lock, we forgot to reset the `Expected` value to
zero. This was causing a contended mutex to incorrectly succeed.
Noticed this when converting some pthread mutexes over to spinloops to
remove strace noise.
The reference wfe_mutex library I wrote didn't have this problem since
the implementation is slightly different.
2024-02-03 01:10:57 -08:00
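The failure mode described in the commit above can be sketched with a plain compare-exchange loop. This is a minimal illustration, not FEX's actual SpinWaitLock code: the `SpinLock` type, its `Word` member, and `BuggyRetrySucceeds` are hypothetical names, and the 0 = unlocked / 1 = locked encoding is an assumption. The point is that a failed compare-exchange overwrites `Expected` with the observed value, so it has to be reset to zero before retrying.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical minimal spinlock: 0 = unlocked, 1 = locked (assumed encoding).
struct SpinLock {
  std::atomic<uint32_t> Word{0};

  // Single acquisition attempt.
  bool try_lock() {
    uint32_t Expected = 0;
    return Word.compare_exchange_strong(Expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }

  void lock() {
    uint32_t Expected = 0;
    while (!Word.compare_exchange_weak(Expected, 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
      // A failed compare-exchange stores the observed value in Expected.
      // Reset it so the next attempt compares against "unlocked" again.
      Expected = 0;
    }
  }

  void unlock() { Word.store(0, std::memory_order_release); }
};

// Shows the bug on a lock that is already held: without the reset, the
// second attempt compares 1 against 1 and reports success.
bool BuggyRetrySucceeds() {
  std::atomic<uint32_t> Word{1}; // lock is held by someone else
  uint32_t Expected = 0;
  Word.compare_exchange_strong(Expected, 1);        // fails, Expected becomes 1
  return Word.compare_exchange_strong(Expected, 1); // incorrectly "succeeds"
}
```

`BuggyRetrySucceeds()` returns true even though the lock was never released, which matches the "contended mutex incorrectly succeeds" symptom.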
Ryan Houdek
c437129ed8
Revert "Revert "FEXLoader: Moves thread management to the frontend""
...
This reverts commit 5358af7794d9568398f7b84fe09b4c8198448f2c.
2024-02-03 00:57:36 -08:00
Alyssa Rosenzweig
8d3f0b6f02
OpcodeDispatcher: reassociate and sink W in sha1
...
We only need each part of W extracted in the corresponding round, so sink the
extract into the round to reduce pressure.
Further, W and E are added and then never used again. So, by reassociating we
can do the add upfront, killing W and E at the start and further reducing
pressure.
Eliminates spilling in sha1rnds4.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
60f7b9bcc4
OpcodeDispatcher: optimize sha1's 2/3 expr
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
a487557173
OpcodeDispatcher: extract BitwiseAtLeastTwo
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
394b4888bb
OpcodeDispatcher: reassociate and remat C0, G0
...
costs 2 moves and eliminates the rest of our spilling
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
142cbdd852
OpcodeDispatcher: expand, reassociate, and interleave sha256 calc
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
2f9102f78d
OpcodeDispatcher: expand & interleave sha256 calc
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
c9824d04cb
OpcodeDispatcher: sink sha256 extracts
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
9c2a569539
OpcodeDispatcher: reexpress Major in sha256
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
515aa4ce3e
OpcodeDispatcher: fuse eor+ror in sha256
...
This reduces instructions a ton.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
f616beb992
OpcodeDispatcher: CSE sha
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
0dcf1e12b8
OpcodeDispatcher: copyprop sha logic
...
prepare for cleverness
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
2cbf544ef5
OpcodeDispatcher: expand sha logic
...
no functional change, just preparing for cleverness.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Ryan Houdek
0eed73beeb
HostFeatures: Supports runtime disabling of preserve_all
...
This is used for instcountci to ensure instruction counts don't change
depending on whether the compiler supports this feature. Always runtime disable
when running in instcountci.
CMake option from #3394 can still be useful so leaving that in place.
2024-02-02 08:59:04 -08:00
Mai
6993f4fd8d
Merge pull request #3400 from Sonicadvance1/revert_runtime_longmode_switch
...
Revert #3303
2024-02-02 11:53:44 -05:00
Paulo Matos
4623544f69
Improve XCHG operations
...
Marking loads as allowing upper garbage simplifies some operations.
Update InstCountCI as well.
2024-02-02 08:16:13 +00:00
Ryan Houdek
ccf1402fe6
Revert "FEXCore: Accurately store segment descriptors"
...
This reverts commit 8648fb148556459b277dcd7e53a0fc092b626875.
2024-02-01 18:14:30 -08:00
Ryan Houdek
da0e1b515a
Revert "OpcodeDispatcher: Initial support for runtime long-mode switch"
...
This reverts commit 9e5d7aa5fe65461b0067ea72034e23cb1dc44285.
2024-02-01 18:14:24 -08:00
Ryan Houdek
cec1814a09
Merge pull request #3384 from pmatos/CDQOp-Opt
...
Optimize CDQOp
2024-01-31 17:51:23 -08:00
Mai
4d49ac7c3d
Merge pull request #3387 from alyssarosenzweig/opt/rotates
...
Optimize rotates
2024-01-31 18:20:40 -05:00
Alyssa Rosenzweig
6d13d9fb56
Merge pull request #3395 from pmatos/StaticAnalysis
...
Code cleanup - mainly dead store removal; NFC
2024-01-31 17:24:48 -04:00
Mai
ae7dc250db
Merge pull request #3386 from alyssarosenzweig/opt/shift
...
Optimize shifts a bit
2024-01-31 14:11:58 -05:00
Paulo Matos
e4560ed0c8
Code cleanup - mainly dead store removal; NFC
...
scan-build found a few dead stores that can be easily cleaned-up
2024-01-31 08:35:55 +00:00
Alyssa Rosenzweig
f3eee8f305
OpcodeDispatcher: optimize bextr's length sanitize
...
reordering the operations saves an immediate move.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
f66085f4a7
OpcodeDispatcher: optimize bextr's (1 << x) - 1
...
little algebraic trick I cribbed from llvm
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
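The commit above does not say which algebraic trick it uses. One identity LLVM is known to apply to this pattern is `(1 << x) - 1 == ~(-1 << x)`, which replaces the constant 1 and the subtract with a shift of all-ones followed by a NOT that can often be folded into a later and-not. The sketch below assumes that is the trick in question. The function names are hypothetical, and the shift amount is assumed to be below the operand width.

```cpp
#include <cstdint>

// Straightforward form: shift a 1 into place, then subtract.
uint64_t MaskSubtract(uint32_t X) {
  return (uint64_t{1} << X) - 1;
}

// Equivalent form: shift all-ones left, then invert.
uint64_t MaskInvert(uint32_t X) {
  return ~(~uint64_t{0} << X);
}

// Checks the identity for every shift amount that is valid for 64 bits.
bool MasksAgree() {
  for (uint32_t X = 0; X < 64; ++X) {
    if (MaskSubtract(X) != MaskInvert(X)) {
      return false;
    }
  }
  return true;
}
```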
Alyssa Rosenzweig
c9461d9997
OpcodeDispatcher: optimize BEXTR flag setting
...
use native test.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
d5eb99fac8
OpcodeDispatcher: optimize popcount flags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
f3175848b1
OpcodeDispatcher: use lzcnt flag gen for tzcnt
...
as far as flags go, they're identical: set ZF for zero output, set CF for output
= DestSize, undef the rest. merge the impls, so we get the optimized lzcnt impl
for tzcnt.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
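The shared flag rule from the commit above can be modelled in a few lines. This is a reference-style sketch for 32-bit operands, not the IR the dispatcher emits. `CountResult`, `WithFlags`, `Lzcnt32`, and `Tzcnt32` are hypothetical names, and the flags the commit calls undefined are simply not modelled.

```cpp
#include <cstdint>

struct CountResult {
  uint32_t Value; // number of leading or trailing zero bits
  bool ZF;        // set when the output is zero
  bool CF;        // set when the output equals the operand size
};

// One flag rule serves both instructions.
static CountResult WithFlags(uint32_t Count, uint32_t Bits) {
  return {Count, Count == 0, Count == Bits};
}

CountResult Lzcnt32(uint32_t Src) {
  uint32_t Count = 0;
  // Walk from the top bit down until a set bit is found.
  for (uint32_t Mask = 0x80000000u; Mask != 0 && !(Src & Mask); Mask >>= 1) {
    ++Count;
  }
  return WithFlags(Count, 32);
}

CountResult Tzcnt32(uint32_t Src) {
  uint32_t Count = 0;
  // Walk from the bottom bit up until a set bit is found.
  for (uint32_t Mask = 1; Mask != 0 && !(Src & Mask); Mask <<= 1) {
    ++Count;
  }
  return WithFlags(Count, 32);
}
```

Because only the count differs between the two instructions, any optimized flag sequence written for one applies unchanged to the other.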
Alyssa Rosenzweig
3dd597a591
OpcodeDispatcher: optimize lzcnt
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
e8e05252f0
OpcodeDispatcher: optimize BLSI
...
and explain why the suss thing we did before was actually right all along.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
93cef53ec0
OpcodeDispatcher: optimize blsr flags
...
reorder to avoid nzcv clobber
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
3a19133267
OpcodeDispatcher: fix inverted BLSR carry
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
9b309b2102
OpcodeDispatcher: optimize blsmsk flags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
fe88b904c9
OpcodeDispatcher: fix missing SF set with blsmsk
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
2e63c6d547
OpcodeDispatcher: fix inverted CF with blsmsk
...
CF set if SRC = 0
per https://www.felixcloutier.com/x86/blsmsk
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
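The BLSMSK fixes in the commits above can be checked against a small reference model. This sketch covers 32-bit operands only. `BlsmskResult` and `Blsmsk32` are hypothetical names, and only CF and SF are modelled because those are the flags the commits mention.

```cpp
#include <cstdint>

struct BlsmskResult {
  uint32_t Value; // mask up to and including the lowest set bit
  bool CF;        // set if SRC = 0
  bool SF;        // sign bit of the result
};

BlsmskResult Blsmsk32(uint32_t Src) {
  // (SRC - 1) XOR SRC sets every bit up to the lowest set bit of SRC.
  // For SRC = 0 the subtraction wraps around and the result is all ones.
  const uint32_t Value = (Src - 1) ^ Src;
  return {Value, Src == 0, (Value >> 31) != 0};
}
```

A zero source is the only input that sets CF, and it also yields an all-ones result, so SF is set in that case as well.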
Alyssa Rosenzweig
0bc9e1a409
OpcodeDispatcher: clobber OF with shift immediate
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
338f12845d
OpcodeDispatcher: save a constant in shld
...
one weird trick
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
b3ae81f75f
OpcodeDispatcher: allow garbage on shld shift
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
c1a1c37980
OpcodeDispatcher: mark ideas to improve SHLD
...
a bit tricky right now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
fb6f850bb4
OpcodeDispatcher: remove rcl sub
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
b6d8749525
OpcodeDispatcher: remove select from rcl
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
d3f1397325
OpcodeDispatcher: eliminate constants in RCR
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
0a164428fa
OpcodeDispatcher: eliminate select in RCR
...
the nzcv clobber I actually came for
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
7496175100
OpcodeDispatcher: optimize 32-bit rcl/rcr
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
0616a9cef1
OpcodeDispatcher: eliminate move in rcr 1-bit
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
97f8775354
OpcodeDispatcher: optimize <32-bit rcr op1
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
c92099aa98
OpcodeDispatcher: fuse orlshl in rcr 1-bit
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00