962 Commits

Author SHA1 Message Date
Ryan Houdek
0139498072 SpinWaitLock: Removes unused variable in spin-loop fallback
Tmp was no longer being used, forgot to remove it.
2024-02-05 07:22:52 -08:00
Ryan Houdek
472a701e2b
Merge pull request #3403 from Sonicadvance1/fix_spinlock_contended_lock
SpinLockWait: Fixes unexpected lock success
2024-02-05 06:51:42 -08:00
Ryan Houdek
cce6011205 SpinLockWait: Fixes unexpected lock success
With a contended unique lock, we forgot to reset the `Expected` value to
zero. This was causing a contended mutex to incorrectly succeed.

Noticed this when converting some pthread mutexes over to spinloops to
remove strace noise.

The reference wfe_mutex library I wrote didn't have this problem since
the implementation is slightly different.
2024-02-03 01:10:57 -08:00
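Note on the fix above: a minimal sketch of the failure mode the message describes, assuming a lock word where 0 means unlocked and 1 means locked. `SketchSpinLock` and its members are hypothetical names for illustration and are not FEX's SpinWaitLock code. With `std::atomic`, a failed compare-exchange overwrites `Expected` with the value it observed, so a retry that does not reset `Expected` to zero compares 1 against 1 and reports success while the lock is still held.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical lock word: 0 = unlocked, 1 = locked.
struct SketchSpinLock {
  std::atomic<uint32_t> Word{0};

  // Single attempt. Succeeds only if the word moves from 0 to 1.
  bool TryLock() {
    uint32_t Expected = 0;
    return Word.compare_exchange_strong(Expected, 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }

  // Bounded retry loop so the sketch can be exercised without threads.
  bool LockWithRetries(int MaxAttempts) {
    uint32_t Expected = 0;
    for (int Attempt = 0; Attempt < MaxAttempts; ++Attempt) {
      if (Word.compare_exchange_strong(Expected, 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
        return true;
      }
      // The failed exchange stored the observed value (1) in Expected.
      // Reset it, otherwise the next attempt compares 1 == 1 and
      // "acquires" a lock that another owner still holds.
      Expected = 0;
    }
    return false;
  }

  void Unlock() { Word.store(0, std::memory_order_release); }
};
```

Removing the `Expected = 0;` line makes `LockWithRetries` return true on its second attempt against a held lock, which matches the "contended mutex incorrectly succeeds" symptom.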
Ryan Houdek
c437129ed8 Revert "Revert "FEXLoader: Moves thread management to the frontend""
This reverts commit 5358af7794d9568398f7b84fe09b4c8198448f2c.
2024-02-03 00:57:36 -08:00
Alyssa Rosenzweig
8d3f0b6f02 OpcodeDispatcher: reassociate and sink W in sha1
We only need each part of W extracted in the corresponding round, so sink the
extract into the round to reduce pressure.

Further, W and E are added and then never used again. So, by reassociating we
can do the add upfront, killing W and E at the start and further reducing
pressure.

Eliminates spilling in sha1rnds4.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
60f7b9bcc4 OpcodeDispatcher: optimize sha1's 2/3 expr
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
a487557173 OpcodeDispatcher: extract BitwiseAtLeastTwo
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
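Note on the "2/3 expr" and BitwiseAtLeastTwo commits above: a bitwise "at least two of three" is the majority function used in the SHA rounds. The sketch below compares the textbook three-AND form with a shorter equivalent. The function names are hypothetical scalar stand-ins, and the exact expression the helper in the tree emits is an assumption here.

```cpp
#include <cstdint>

// Textbook SHA majority: a result bit is set when at least two inputs have it set.
constexpr uint32_t MajReference(uint32_t A, uint32_t B, uint32_t C) {
  return (A & B) ^ (A & C) ^ (B & C);
}

// Equivalent form with one fewer operation: two ANDs and two ORs.
constexpr uint32_t AtLeastTwo(uint32_t A, uint32_t B, uint32_t C) {
  return (A & (B | C)) | (B & C);
}

// Worked example: bits 3, 2 and 1 are each set in exactly two inputs.
static_assert(MajReference(0xC, 0xA, 0x6) == 0xE, "reference form");
static_assert(AtLeastTwo(0xC, 0xA, 0x6) == 0xE, "shorter form");
```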
Alyssa Rosenzweig
394b4888bb OpcodeDispatcher: reassociate and remat C0, G0
costs 2 moves and eliminates the rest of our spilling

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
142cbdd852 OpcodeDispatcher: expand, reassociate, and interleave sha256 calc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
2f9102f78d OpcodeDispatcher: expand & interleave sha256 calc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
c9824d04cb OpcodeDispatcher: sink sha256 extracts
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
9c2a569539 OpcodeDispatcher: reexpress Major in sha256
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
515aa4ce3e OpcodeDispatcher: fuse eor+ror in sha256
This reduces instructions a ton.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
f616beb992 OpcodeDispatcher: CSE sha
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
0dcf1e12b8 OpcodeDispatcher: copyprop sha logic
prepare for cleverness

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig
2cbf544ef5 OpcodeDispatcher: expand sha logic
no functional change, just preparing for cleverness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-02 13:03:07 -04:00
Ryan Houdek
0eed73beeb HostFeatures: Supports runtime disabling of preserve_all
This is used for instcountci to ensure instruction counts don't change
when a compiler supports this feature or not. Always runtime disable
when running in instcountci.

CMake option from #3394 can still be useful so leaving that in place.
2024-02-02 08:59:04 -08:00
Mai
6993f4fd8d
Merge pull request #3400 from Sonicadvance1/revert_runtime_longmode_switch
Revert #3303
2024-02-02 11:53:44 -05:00
Paulo Matos
4623544f69 Improve XCHG operations
Marking loads as allowing upper garbage simplifies some operations.
Update InstCountCI as well.
2024-02-02 08:16:13 +00:00
Ryan Houdek
ccf1402fe6 Revert "FEXCore: Accurately store segment descriptors"
This reverts commit 8648fb148556459b277dcd7e53a0fc092b626875.
2024-02-01 18:14:30 -08:00
Ryan Houdek
da0e1b515a Revert "OpcodeDispatcher: Initial support for runtime long-mode switch"
This reverts commit 9e5d7aa5fe65461b0067ea72034e23cb1dc44285.
2024-02-01 18:14:24 -08:00
Ryan Houdek
cec1814a09
Merge pull request #3384 from pmatos/CDQOp-Opt
Optimize CDQOp
2024-01-31 17:51:23 -08:00
Mai
4d49ac7c3d
Merge pull request #3387 from alyssarosenzweig/opt/rotates
Optimize rotates
2024-01-31 18:20:40 -05:00
Alyssa Rosenzweig
6d13d9fb56
Merge pull request #3395 from pmatos/StaticAnalysis
Code cleanup - mainly dead store removal; NFC
2024-01-31 17:24:48 -04:00
Mai
ae7dc250db
Merge pull request #3386 from alyssarosenzweig/opt/shift
Optimize shifts a bit
2024-01-31 14:11:58 -05:00
Paulo Matos
e4560ed0c8 Code cleanup - mainly dead store removal; NFC
scan-build found a few dead stores that can be easily cleaned-up
2024-01-31 08:35:55 +00:00
Alyssa Rosenzweig
f3eee8f305 OpcodeDispatcher: optimize bextr's length sanitize
reordering the operations saves an immediate move.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
f66085f4a7 OpcodeDispatcher: optimize bextr's (1 << x) - 1
little algebraic trick I cribbed from llvm

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
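Note on the `(1 << x) - 1` commit above: the message does not say which identity was borrowed. One well-known rewrite is `(1 << x) - 1 == ~(~0 << x)`, which lets the inversion fold into a following and-not instead of needing a separate subtract. The sketch below assumes that is the identity meant and checks it for shift amounts below 64. Both function names are hypothetical.

```cpp
#include <cstdint>

// Direct form: shift a one into place, then subtract. Valid for X < 64.
constexpr uint64_t MaskDirect(unsigned X) {
  return (uint64_t{1} << X) - 1;
}

// Rewritten form: shift all-ones left, then invert. Valid for X < 64.
constexpr uint64_t MaskInverted(unsigned X) {
  return ~(~uint64_t{0} << X);
}

static_assert(MaskDirect(8) == 0xFF, "low 8 bits");
static_assert(MaskInverted(8) == 0xFF, "low 8 bits");
static_assert(MaskInverted(0) == 0, "empty mask");
```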
Alyssa Rosenzweig
c9461d9997 OpcodeDispatcher: optimize BEXTR flag setting
use native test.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
d5eb99fac8 OpcodeDispatcher: optimize popcount flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
f3175848b1 OpcodeDispatcher: use lzcnt flag gen for tzcnt
as far as flags go, they're identical: set ZF for zero output, set CF for output
= DestSize, undef the rest. merge the impls, so we get the optimized lzcnt impl
for tzcnt.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
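Note on the lzcnt/tzcnt merge above: the flag rule in the message depends only on the count that was produced, not on the direction of the count. A minimal sketch of that rule for 32-bit operands follows. The helper names and the portable bit loops are illustrative assumptions and are not the emitted IR.

```cpp
#include <cstdint>

struct CountFlags {
  bool ZF; // set when the count result is zero
  bool CF; // set when the count result equals the operand size
};

// Shared flag rule: identical for leading-zero and trailing-zero counts.
constexpr CountFlags FlagsFromCount(unsigned Count, unsigned DestSize) {
  return CountFlags{Count == 0, Count == DestSize};
}

// Portable 32-bit leading-zero count. Returns 32 for a zero input.
inline unsigned Lzcnt32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1) {
    ++N;
  }
  return N;
}

// Portable 32-bit trailing-zero count. Returns 32 for a zero input.
inline unsigned Tzcnt32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 1; Bit != 0 && (V & Bit) == 0; Bit <<= 1) {
    ++N;
  }
  return N;
}
```

Because `FlagsFromCount` takes only the result and the operand size, either count can feed it, which is the property the merge relies on.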
Alyssa Rosenzweig
3dd597a591 OpcodeDispatcher: optimize lzcnt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
e8e05252f0 OpcodeDispatcher: optimize BLSI
and explain why the suss thing we did before was actually right all along.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
93cef53ec0 OpcodeDispatcher: optimize blsr flags
reorder to avoid nzcv clobber

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
3a19133267 OpcodeDispatcher: fix inverted BLSR carry
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
9b309b2102 OpcodeDispatcher: optimize blsmsk flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
fe88b904c9 OpcodeDispatcher: fix missing SF set with blsmsk
2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig
2e63c6d547 OpcodeDispatcher: fix inverted CF with blsmsk
CF set if SRC = 0

per https://www.felixcloutier.com/x86/blsmsk

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:28:06 -04:00
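Note on the two blsmsk flag fixes above: a scalar reference model of 32-bit BLSMSK as documented in the x86 instruction reference, where the result is `(SRC - 1) XOR SRC`, CF is set when SRC is zero, SF copies the top bit of the result, and ZF is cleared. The struct and function names are hypothetical. This is a behavioural sketch, not the dispatcher's code.

```cpp
#include <cstdint>

struct BlsmskResult {
  uint32_t Value; // mask up to and including the lowest set bit of the source
  bool CF;        // set when the source is zero
  bool SF;        // most significant bit of the result
  bool ZF;        // always cleared, since the result always has bit 0 set
};

constexpr BlsmskResult Blsmsk32(uint32_t Src) {
  return BlsmskResult{
    (Src - 1) ^ Src,
    Src == 0,
    (((Src - 1) ^ Src) >> 31) != 0,
    false,
  };
}

// A zero source wraps to all-ones: CF and SF are both set.
static_assert(Blsmsk32(0).Value == 0xFFFFFFFFu, "zero source");
static_assert(Blsmsk32(0).CF && Blsmsk32(0).SF, "zero source flags");
// Lowest set bit at position 3 gives a 4-bit mask with CF and SF clear.
static_assert(Blsmsk32(8).Value == 0xFu, "mask through bit 3");
static_assert(!Blsmsk32(8).CF && !Blsmsk32(8).SF, "non-zero source flags");
```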
Alyssa Rosenzweig
0bc9e1a409 OpcodeDispatcher: clobber OF with shift immediate
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
338f12845d OpcodeDispatcher: save a constant in shld
one weird trick

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
b3ae81f75f OpcodeDispatcher: allow garbage on shld shift
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
c1a1c37980 OpcodeDispatcher: mark ideas to improve SHLD
a bit tricky right now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig
fb6f850bb4 OpcodeDispatcher: remove rcl sub
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
b6d8749525 OpcodeDispatcher: remove select from rcl
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
d3f1397325 OpcodeDispatcher: eliminate constants in RCR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
0a164428fa OpcodeDispatcher: eliminate select in RCR
the nzcv clobber I actually came for

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
7496175100 OpcodeDispatcher: optimize 32-bit rcl/rcr
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
0616a9cef1 OpcodeDispatcher: eliminate move in rcr 1-bit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
97f8775354 OpcodeDispatcher: optimize <32-bit rcr op1
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig
c92099aa98 OpcodeDispatcher: fuse orlshl in rcr 1-bit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-01-30 22:22:57 -04:00