Commit Graph

209 Commits

Author SHA1 Message Date
Ryan Houdek
db6c8852fc IR: Removes implicit sized or 2023-08-28 19:06:05 -07:00
Ryan Houdek
65dc6f3e90 IR: Removes implicit sized lshr 2023-08-28 18:16:56 -07:00
Ryan Houdek
60c4438780 IR: Removes implicit sized lshl 2023-08-28 17:50:41 -07:00
Ryan Houdek
898ce1ce8f IR: Removes implicit sized UMulH 2023-08-28 17:20:55 -07:00
Ryan Houdek
aa8dfd6af1 IR: Removes implicit sized UMul 2023-08-28 17:20:55 -07:00
Ryan Houdek
6a6d808b0d IR: Removes implicit sized MulH 2023-08-28 17:20:55 -07:00
Ryan Houdek
fac5b2ac72 IR: Removes implicit sized Mul 2023-08-28 17:20:55 -07:00
Ryan Houdek
b9e4a1423f IR: Removes sext IR helper
You hold no power here IR operation.
2023-08-28 17:03:38 -07:00
Ryan Houdek
c0bb6a053f IR: Removes implicit sized {Create,Extract}ElementPair 2023-08-28 16:50:00 -07:00
Ryan Houdek
a36427d01e IR: Removes implicit sized CAS 2023-08-28 07:12:31 -07:00
Ryan Houdek
594baff705 IR: Removes implicit sized CASPair 2023-08-28 07:12:31 -07:00
Ryan Houdek
e9f2bc037f IR: Removes non-opsize AtomicAdd/Sub/And/Or
These were unused
2023-08-28 07:12:31 -07:00
Ryan Houdek
6dcfd6eb73 IR: Removes non-opsize AtomicXor 2023-08-28 07:12:31 -07:00
Ryan Houdek
4f8a63459c IR: Removes non-opsize AtomicSwap 2023-08-28 07:12:31 -07:00
Ryan Houdek
ab230bf527 IR: Removes non-opsize AtomicFetchAdd 2023-08-28 07:12:31 -07:00
Ryan Houdek
8ba9613972 IR: Removes non-opsize AtomicFetchSub 2023-08-28 07:12:31 -07:00
Ryan Houdek
9370e30af4 IR: Removes non-opsize AtomicFetchAnd 2023-08-28 07:12:31 -07:00
Ryan Houdek
25ce57ef92 IR: Removes non-opsize AtomicFetchOr 2023-08-28 07:12:31 -07:00
Ryan Houdek
436e0f6f86 IR: Removes non-opsize AtomicFetchXor 2023-08-28 07:12:31 -07:00
Ryan Houdek
f44c6f394f IR: Removes non-opsize AtomicFetchNeg 2023-08-28 07:12:31 -07:00
Ryan Houdek
48669b7006 IR: Removes implicit sized orlshl/orlshr 2023-08-28 05:04:32 -07:00
Ryan Houdek
16481c0e55 IR: Removes implicit sized rev 2023-08-28 05:04:32 -07:00
Ryan Houdek
b00310674a IR: Removes implicit sized not 2023-08-28 05:04:32 -07:00
Ryan Houdek
5768444ce9 IR: Removes implicit sized abs 2023-08-28 05:04:32 -07:00
Ryan Houdek
1f1473eb74 IR: Removes implicit sized neg 2023-08-28 05:04:32 -07:00
Ryan Houdek
cccd7001cb IR: Removes implicit sized ashr 2023-08-28 05:04:32 -07:00
Ryan Houdek
ce8392d5ae IR: Removes implicit sized ror 2023-08-28 05:02:01 -07:00
Ryan Houdek
386cf36cfd IR: Removes implicit sized sbfe
This one is a bit weird since currently it /always/ assumes a 64-bit
operating size.

We'll likely need to revisit this.
2023-08-28 05:02:01 -07:00
Ryan Houdek
0ddc23a5c9 IR: Removes implicit sized Popcount 2023-08-28 05:02:01 -07:00
Ryan Houdek
b95648a4ab IR: Removes implicit sized FindMSB 2023-08-28 05:02:01 -07:00
Ryan Houdek
bf18672999 IR: Removes implicit sized FindLSB 2023-08-28 05:02:01 -07:00
Ryan Houdek
f405c1be69 IR: Removes inverted EntrypointOffset/InlineEntrypointOffset 2023-08-28 05:02:01 -07:00
Ryan Houdek
f3cd115fa7 IR: Removes implicit sized CountLeadingZeroes 2023-08-28 05:02:00 -07:00
Ryan Houdek
a14720e130 IR: Removes implicit sized FindTrailingZeroes 2023-08-28 05:02:00 -07:00
Ryan Houdek
86ed909de8 IR: Removes implicit sized DIV/REM 2023-08-28 05:02:00 -07:00
Ryan Houdek
ac7e75c06b IR: Removes implicit sized UDIV/UREM 2023-08-28 05:02:00 -07:00
Ryan Houdek
ec3e7ceeb5 IR: Removes implicit sized EXTR 2023-08-28 05:02:00 -07:00
Ryan Houdek
c8c8ddbd4f IR: Removes implicit sized PDEP/PEXT 2023-08-28 05:02:00 -07:00
Ryan Houdek
35013bda37 IR: Adds helper to convert between an integer size and IR::OpSize
This is a no-op and will get optimized away in release builds.
2023-08-28 05:02:00 -07:00
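A minimal sketch of what such a conversion helper could look like. It assumes that each `OpSize` enumerator's value equals its size in bytes. The enumerator and function names here are illustrative, not taken from the commit. Under that assumption the conversion is a plain cast, and the only extra work is a debug-only assert, which would fit the "optimized away in release builds" remark.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for IR::OpSize. Assumption: each enumerator's value
// is the operation size in bytes.
enum class OpSize : uint8_t {
  i8Bit = 1,
  i16Bit = 2,
  i32Bit = 4,
  i64Bit = 8,
};

// Converts a byte count to an OpSize. The assert only exists in debug builds;
// with NDEBUG defined the function reduces to a cast.
inline OpSize SizeToOpSize(uint8_t Size) {
  assert(Size == 1 || Size == 2 || Size == 4 || Size == 8);
  return static_cast<OpSize>(Size);
}

// Converts an OpSize back to its size in bytes.
inline uint8_t OpSizeToSize(OpSize Size) {
  return static_cast<uint8_t>(Size);
}
```

A caller holding a raw byte count could then write `SizeToOpSize(4)` wherever an explicitly sized operation expects an `OpSize`.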
Ryan Houdek
ea6d068cc5 IR: Removes implicit sized LDIV/LREM 2023-08-28 05:02:00 -07:00
Ryan Houdek
5013473ec0 IR: Removes implicit sized LUDIV/LUREM 2023-08-28 05:02:00 -07:00
Ryan Houdek
1d7c280367
Merge pull request #3012 from Sonicadvance1/optimize_movmskps
OpcodeDispatcher: Optimizes SSE movmaskps
2023-08-27 21:29:04 -07:00
Ryan Houdek
514a8223d9 OpcodeDispatcher: Optimizes SSE movmaskps
This now improves the instruction implementation from 17 instructions
down to 5 or 6, depending on whether the host supports SVE.

I would say this is now optimal.
2023-08-27 21:07:20 -07:00
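For context, a scalar reference of the x86 behaviour being emulated: movmskps gathers the sign bit of each of the four packed single-precision lanes into the low four bits of a general-purpose register. This sketch only models the semantics. It is not the 5 or 6 instruction Arm64 sequence from the commit, and the function name is illustrative.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar model of movmskps: bit i of the result is the sign bit of lane i.
inline uint32_t MovMskPS(const std::array<float, 4>& Src) {
  uint32_t Mask = 0;
  for (std::size_t i = 0; i < Src.size(); ++i) {
    uint32_t Bits;
    std::memcpy(&Bits, &Src[i], sizeof(Bits)); // reinterpret the float's bits
    Mask |= (Bits >> 31) << i;                 // move the sign bit into bit i
  }
  return Mask;
}
```

Note that negative zero counts as "signed" here, since only the top bit is inspected.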
Ryan Houdek
e4bb0df486 IR: Convert all Move+Atomic+ALU ops from implicit to explicit size
The number of times the implicit size calculation in GPR operations has
bitten us is immeasurable, and it was a mistake from the start of the project.
The vector based operations never had this problem since they were
explicitly sized for a long time now.

This converts the base IR operations to be explicitly sized, but adds
implicit sized helpers for the moment while we work on removing implicit
usage from the OpcodeDispatcher.

Should be NFC at this moment but it is a big enough change that I want
it in before the "real" work starts.
2023-08-27 01:35:08 -07:00
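A small sketch of the class of bug that implicit sizing invites. It assumes, purely for illustration, that an implicit size is inferred as the larger of the two operand sizes. The commit message does not state the actual inference rule, and all names below are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical IR value: raw bits plus the size (in bytes) it was created with.
struct Value {
  uint64_t Bits;
  uint8_t Size;
};

// Bit mask covering the low Size bytes.
inline uint64_t MaskFor(uint8_t Size) {
  return Size >= 8 ? ~0ULL : ((1ULL << (Size * 8)) - 1);
}

// Implicit sizing (assumed rule): the operation size is inferred from operands.
inline Value AddImplicit(Value A, Value B) {
  uint8_t Size = std::max(A.Size, B.Size);
  return {(A.Bits + B.Bits) & MaskFor(Size), Size};
}

// Explicit sizing: the caller states the operation size up front.
inline Value AddExplicit(uint8_t Size, Value A, Value B) {
  return {(A.Bits + B.Bits) & MaskFor(Size), Size};
}
```

With `A = {0xFFFFFFFF, 8}` and `B = {1, 4}`, the implicit form silently performs a 64-bit add and yields `0x100000000`, while `AddExplicit(4, A, B)` wraps to `0` as a 32-bit guest add would.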
Ryan Houdek
a76c2c57b0 OpcodeDispatcher: Optimize PSHUF{LW, HW, D}!
This is way more optimal!
2023-08-25 12:59:40 -07:00
Ryan Houdek
7f63d87295 IR: Adds support for new LoadNamedVectorIndexedConstant IR 2023-08-25 12:59:40 -07:00
Ryan Houdek
c5d147322f HostFeatures: Adds support for FCMA 2023-08-24 15:00:41 -07:00
Ryan Houdek
f300196d90 OpcodeDispatcher: Optimize AddSubP{S,D}
Use a named constant for loading the sign inversion, then EOR the second
source and just FAdd it all.
In a vacuum it isn't a significant improvement, but as soon as more than
one instruction is in a block it will eventually get optimized with
named constant caching and be a significant win.

Thanks to @rygorous for the idea!
2023-08-23 20:32:51 -07:00
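The trick described above can be modelled in scalar code. x86 addsubps subtracts in the even lanes and adds in the odd lanes, so flipping the sign bit of the second source's even lanes turns the whole operation into one plain vector add. The constant layout and names below are assumptions made for this sketch, not FEX's actual definitions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Vec4 = std::array<float, 4>;

// Assumed sign-inversion constant: sign bit set in the even lanes only.
static const std::array<uint32_t, 4> AddSubSignMask = {{0x80000000u, 0u, 0x80000000u, 0u}};

// Scalar model of addsubps implemented as XOR-then-add.
inline Vec4 AddSubPS(const Vec4& A, const Vec4& B) {
  Vec4 Result{};
  for (std::size_t i = 0; i < 4; ++i) {
    uint32_t Bits;
    std::memcpy(&Bits, &B[i], sizeof(Bits));
    Bits ^= AddSubSignMask[i];           // the EOR step: negate even lanes of B
    float Flipped;
    std::memcpy(&Flipped, &Bits, sizeof(Flipped));
    Result[i] = A[i] + Flipped;          // the FAdd step: one add for every lane
  }
  return Result;
}
```

For `A = {1, 2, 3, 4}` and `B = {10, 20, 30, 40}` this yields `{-9, 22, -27, 44}`, matching subtract-in-even, add-in-odd.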
Ryan Houdek
bbf9cb9d52 IR: Implements new VRev32 and LoadNamedVectorConstant ops
VRev32 matches Arm64 semantics directly.
LoadNamedVectorConstant allows FEX to quickly load "named constants".
This will allow us to have specific hardcoded vector constant values
that we can load with a ldr(State)+ldr(Value) and will be more abused in
the future.
This also allows us to do a very simple optimization in the future where
we can optimize away redundant loads of these constants if they are used
multiple times in the same block. (Not implemented here).
2023-08-22 16:29:06 -07:00
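One reading of the "ldr(State)+ldr(Value)" pattern is sketched below: the CPU state holds a pointer to a table of constants, so fetching a named constant costs one load for the table pointer and one for the value. The struct layout, enumerator names, and table contents are all hypothetical.

```cpp
#include <array>
#include <cstdint>

using Vec128 = std::array<uint32_t, 4>;

// Hypothetical constant names; the real list is not given in the commit message.
enum NamedVectorConstant : uint8_t {
  NAMED_VECTOR_SIGN_EVEN_LANES = 0,
  NAMED_VECTOR_ALL_ONES,
  NAMED_VECTOR_COUNT,
};

// Table of hardcoded vector constants, indexed by name.
static const Vec128 NamedVectorConstants[NAMED_VECTOR_COUNT] = {
  {{0x80000000u, 0u, 0x80000000u, 0u}},
  {{0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu}},
};

// Hypothetical slice of the CPU state: just the pointer to the table.
struct CPUState {
  const Vec128* NamedConstants = NamedVectorConstants;
};

inline Vec128 LoadNamedVectorConstant(const CPUState& State, NamedVectorConstant Name) {
  const Vec128* Table = State.NamedConstants; // first load: table pointer from state
  return Table[Name];                         // second load: the constant itself
}
```

Because the result depends only on the name, a later pass could reuse an earlier load of the same name within a block, which is the redundant-load optimization the commit message mentions as future work.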
Ryan Houdek
c5f5a03c68 OpcodeDispatcher: Optimize PSIGN
This dramatically improves the performance of the PSIGN instructions.
2023-08-20 13:38:23 -07:00
Ryan Houdek
a523858f66
Merge pull request #2923 from Sonicadvance1/nonnull_legacy_segment_telemetry
FEXCore: Adds telemetry around legacy segment register setting
2023-08-20 10:27:56 -07:00
Mai
6960fca256
Merge pull request #2929 from Sonicadvance1/signaldelegator_getconfig
SignalDelegator: Allow getting the internal configuration
2023-08-19 23:12:06 -04:00
Ryan Houdek
ea5c67da80 SignalDelegator: Allow getting the internal configuration
Not used by FEX today but will be used by the WINE integration.
2023-08-18 11:56:52 -07:00
Ryan Houdek
fc84f6b345
Merge pull request #2927 from bylaws/interrupt
FEXCore: Allow for interrupting the JIT on block entry
2023-08-18 06:14:24 -07:00
Billy Laws
de63fd05d0 FEXCore: Allow for interrupting the JIT on block entry
This takes a similar approach to deferred signal handling and allows any given
thread to be interrupted while running JIT code by protecting the appropriate
page as RO. When the thread then enters a new block, it will try to access
that page and segfault. This is safer than just sending a signal to the thread
as that could stop in a place where JIT context couldn't be recovered correctly.
2023-08-18 05:58:51 -07:00
Billy Laws
bbfd15f801 LogMan: Commonise log level to string conversion 2023-08-18 04:36:31 -07:00
Billy Laws
0954c7eb9f AllocatorHooks: Add C++17 aligned new/delete functions 2023-08-18 04:32:16 -07:00
Ryan Houdek
d19e2507e5 FEXCore: Adds telemetry around legacy segment register setting
Due to Intel dropping support for legacy segment registers[1] there is a
concern that this will break legacy 32-bit software that is doing some
magic segment register handling.

Adds some simple telemetry for 32-bit applications: when they encounter
an instruction that sets or uses a segment register, the JIT will do a
/relatively/ quick four instruction check to see if it is not a null
segment.

It's not enough to just check if the segment index is 0 or not, 32-bit
Linux software starts with non-zero segment register indexes but the LDT
for each segment index is a null-descriptor.

Once the segment address is loaded, the IR operation will do a quick
check against zero and if it /isn't/ zero then set the telemetry value.

As a very minor optimization, segment registers only get checked once
per block to ensure overhead stays low.

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/envisioning-future-simplified-architecture.html
   - 3.6 - Restricted Subset of Segmentation
      - `Bases are supported for FS, GS, GDT, IDT, LDT, and TSS
        registers; the base for CS, DS, ES, and SS is ignored for 32-bit
        mode, same as 64-bit mode (treated as zero).`
   - 4.2.17 - MOV to Segment Register
      - Will fault if SS is written (Breaking anything that writes to
        SS).
      - Will not fault if CS, DS, ES are written (Thus it sets the
        segment but gets ignored due to 3.6).
2023-08-17 17:00:41 -07:00
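The two ideas in this commit message can be sketched in a few lines: the test is made against the loaded segment base address rather than the selector index, and a per-block mask ensures each segment register gets at most one check per block. Every name here is hypothetical, and the real check is emitted as JIT code rather than run as C++.

```cpp
#include <cstdint>

// Hypothetical segment register identifiers.
enum Segment : uint8_t { ES, CS, SS, DS, FS, GS, SegmentCount };

// Translation-time state for one block. A fresh instance is assumed per block.
struct BlockTranslator {
  uint8_t CheckedMask = 0; // segments that already received a check in this block
  int ChecksEmitted = 0;

  // Returns true if a check still needs to be emitted for this segment use.
  bool NeedsCheck(Segment Seg) {
    const uint8_t Bit = static_cast<uint8_t>(1u << Seg);
    if (CheckedMask & Bit) {
      return false; // already checked once in this block
    }
    CheckedMask |= Bit;
    ++ChecksEmitted;
    return true;
  }
};

// Run-time side of the check: the descriptor's base address is compared
// against zero, since a non-zero selector index can still refer to a null
// descriptor.
inline void RecordSegmentUse(uint32_t SegmentBase, bool& TelemetryFlag) {
  if (SegmentBase != 0) {
    TelemetryFlag = true;
  }
}
```

Two uses of DS and one of FS in the same block would emit two checks in total under this model, not three.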
Alyssa Rosenzweig
af21b8f3c7 Move External/FEXCore/ to FEXCore/
It is not an external component, and it makes paths needlessly long.
Ryan seemed amenable to this when we discussed on IRC earlier.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-08-17 16:32:16 -04:00