Ryan Houdek
b0bd8a62a2
AVX128: Improve VPERMILPS/PD and VPSHUFD
...
VPSHUFD and VPERMILPS are aliases of each other.
Reuses the implementation path from the PSHUFD implementation which has
a few swizzles and then a table lookup.
VPERMILPD is a very simple swizzle per 128-bit lane.
Fixes #3797
Fixes #3784
2024-07-18 04:10:58 -07:00
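For reference, the architectural behaviour of these shuffles within one 128-bit lane can be sketched as below. This models only the immediate forms of the instructions and is not the FEX implementation (which, per the commit above, uses swizzles and a table lookup). The function names Shuffle32x4 and PermilPD128 are made up for illustration.

```cpp
#include <array>
#include <cstdint>

// Reference semantics only; names are hypothetical and not taken from FEX.

// PSHUFD / VPSHUFD / VPERMILPS (imm8 form), one 128-bit lane:
// each 2-bit field of the immediate selects one of the four 32-bit
// source elements for the corresponding destination element.
std::array<uint32_t, 4> Shuffle32x4(const std::array<uint32_t, 4>& Src, uint8_t Imm) {
  std::array<uint32_t, 4> Result{};
  for (unsigned i = 0; i < 4; ++i) {
    Result[i] = Src[(Imm >> (i * 2)) & 0b11];
  }
  return Result;
}

// VPERMILPD (imm8 form), one 128-bit lane:
// one immediate bit per destination element selects the low or high
// 64-bit source element of the same lane. For the 256-bit form, the
// upper lane uses the next two immediate bits.
std::array<uint64_t, 2> PermilPD128(const std::array<uint64_t, 2>& Src, uint8_t ImmBits) {
  return {Src[ImmBits & 1], Src[(ImmBits >> 1) & 1]};
}
```

With an immediate of 0x1B, Shuffle32x4 reverses the four elements; with immediate bits 0b01, PermilPD128 swaps the two 64-bit elements.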
Alyssa Rosenzweig
dedec83881
json_ir_generator: autoderive array names
...
these are purely internal.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:32:51 -04:00
Alyssa Rosenzweig
9fd5c73633
json_ir_generator: generate IsLoweredX87 helper
...
X87 pass will use this query.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:46 -04:00
Alyssa Rosenzweig
bdb890a8b0
json_ir_generator: rename X87 -> LoweredX87
...
to reflect its actual meaning
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:14 -04:00
Alyssa Rosenzweig
5043d09771
json_ir_generator: use textwrap.dedent, f-string
...
for size
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:28:59 -04:00
Ryan Houdek
da51169ba9
Merge pull request #3875 from alyssarosenzweig/ir/gethostflag
...
IR: garbage collect premature F80Cmp optimizations
2024-07-17 03:05:48 -07:00
Ryan Houdek
f72cee480f
Merge pull request #3874 from alyssarosenzweig/opt/reconstructftw
...
X87: save uop in ReconstructFTW
2024-07-17 03:05:37 -07:00
Alyssa Rosenzweig
e7d5a01c5f
IR: remove F80Cmp flags
...
nothing is optimizing around this, it's just adding pointless complexity. if we
want to actually optimize F80Cmp, the right way would be to lift the
implementation into the OpcodeDispatcher or JIT. it wouldn't be terribly
difficult. This kludge doesn't get us closer there.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
0c3a8d0bc8
IR: remove GetHostFlag
...
it doesn't get host flags, it's just an extra Bfe used in x87. pointless and
confusing!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:44:34 -04:00
Alyssa Rosenzweig
c4ba7eee87
X87: save uop in ReconstructFTW
...
noticed while reviewing Paulo's work
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:09 -04:00
Paulo Matos
8d89adef2e
Add IR stack operations
...
These IR operations deal implicitly with the x87 stack and are removed
by the x87 stack optimization pass.
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
6615b55c12
json_ir_generator: call RecordX87Use when generating ops
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
66865dd177
json_ir_generator: alias X87 to !JITDispatch
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
1e709d1150
OpcodeDispatcher: add RecordX87 helper
...
calls will be generated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
476ee0cd7d
IR: track whether x87 is used in header
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Ryan Houdek
b8e864ffdf
Merge pull request #3865 from Sonicadvance1/telemetry_atexit
...
Telemetry: Change how visibility of telemetry values work
2024-07-15 09:53:37 -07:00
Ryan Houdek
d79b7fcc49
Merge pull request #3808 from alyssarosenzweig/rclse/3
...
Try to delete RCLSE again
2024-07-12 20:38:06 -07:00
Ryan Houdek
b9a6caea8d
Merge pull request #3844 from Sonicadvance1/fix_vmovq
...
AVX128: Fixes vmovq loading too much data
2024-07-12 17:07:32 -07:00
Ryan Houdek
97a68cb643
Telemetry: Change how visibility of telemetry values work
...
Removes global initializer for telemetry values since their address is
visible and PIC relative code loading handles the address fetching for
us.
2024-07-12 03:18:23 -07:00
Ryan Houdek
870e395ac4
Merge pull request #3862 from Sonicadvance1/remove_atexit_logman
...
LogManager: Removes fextl::vector usage
2024-07-12 02:05:02 -07:00
Ryan Houdek
04592f82f5
Merge pull request #3861 from Sonicadvance1/remove_atexit_vdso
...
VDSO: Stop using a vector for a static
2024-07-12 02:04:25 -07:00
Ryan Houdek
5ef0db994d
VDSO: Stop using a vector for a static
...
This causes a global initializer that registers an atexit handler.
Be smarter: use a std::array and pass its data around using a span
instead.
Removes the global initializer and removes the atexit installation.
2024-07-11 23:53:57 -07:00
Ryan Houdek
b523407a3e
LogManager: Removes fextl::vector usage
...
We never use more than one logging method at a time so this was
overengineered for what it is doing.
Instead, only allow one handler each for regular messages and throw
messages, each of which is just a pointer.
Removes a global initializer and an atexit handler being installed.
2024-07-11 22:51:56 -07:00
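The single-handler idea from the commit above could look roughly like this. It is a sketch assuming C++17 inline variables; the SketchLog namespace and every name in it are hypothetical and do not mirror the real LogManager interface.

```cpp
namespace SketchLog {
// Hypothetical handler signature, for illustration only.
using MsgHandler = void (*)(const char* Message);

// A raw function pointer is constant-initialized to nullptr and has a
// trivial destructor, so it needs neither a global initializer nor an
// atexit handler.
inline MsgHandler Handler = nullptr;

inline void InstallHandler(MsgHandler NewHandler) {
  Handler = NewHandler;
}

// Forwards the message to the single installed handler, if any.
inline void Msg(const char* Message) {
  if (Handler != nullptr) {
    Handler(Message);
  }
}

// Example handler that only counts messages.
inline int MessagesSeen = 0;
inline void CountingHandler(const char*) {
  ++MessagesSeen;
}
} // namespace SketchLog
```

Installing a second handler simply replaces the first, which matches the "one logging method at a time" assumption stated in the commit.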
Ryan Houdek
8021dc10a1
OpcodeDispatcher: Force noinline for the function call in the Bind helper
...
Clang was inlining a few of the functions it was calling. So force it
never to inline, since this is supposed to be a little shim trampoline
only.
2024-07-11 19:00:42 -07:00
Ryan Houdek
7e8d734e43
AVX256: Initial fixes just to get my unittest working
...
This is the initial split to decouple AVX256 composed operations from
their MMX/SSE counterparts. This is to work around the subtle
differences with AVX/SSE zext/insert behaviour.
2024-07-11 18:43:31 -07:00
Ryan Houdek
3c7318d7c8
AVX128: Fixes vmovq loading too much data
...
This was doing a 128-bit load from memory and then a 64-bit zero extend.
That looked like a spurious move, but it was trying to match the
behaviour of vmovq, which needs the zero extend.
Also adds a unit test to ensure that we aren't loading too much data by
loading right up against a page boundary.
Fixes #3787
2024-07-11 18:34:05 -07:00
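The memory-load form of vmovq reads 64 bits and zeroes the rest of the destination register. A minimal model of that behaviour, with a hypothetical function name and the 128-bit register represented as two 64-bit halves, might be:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Models the low 128 bits of the destination as {low, high}.
// Hypothetical helper, not FEX code.
std::array<uint64_t, 2> LoadVMOVQ(const void* Mem) {
  uint64_t Low = 0;
  // Read exactly 8 bytes. Reading 16 could touch the following page,
  // which is what a test placed against a page boundary would catch.
  std::memcpy(&Low, Mem, sizeof(Low));
  // The upper 64 bits are zero-extended rather than loaded.
  return {Low, 0};
}
```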
Ryan Houdek
fc0b233046
Merge pull request #3859 from neobrain/refactor_opdispatch_templates
...
OpcodeDispatcher: Replace hand-written wrapper templates with a generic utility
2024-07-11 18:18:23 -07:00
Mai
e25918d846
Merge pull request #3858 from Sonicadvance1/implement_nt_load
...
Implement support for SSE4.1/AVX NT loads
2024-07-11 14:22:41 -04:00
Alyssa Rosenzweig
3a334c4585
Reapply "IR: drop RCLSE"
...
This reverts commit 78aee4d96e.
2024-07-11 13:21:14 -04:00
Alyssa Rosenzweig
8dae4bcd44
OpcodeDispatcher: drop stale comment
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-11 13:21:14 -04:00
Alyssa Rosenzweig
294f10fdd0
OpcodeDispatcher: reg cache mmx
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-11 13:21:14 -04:00
Tony Wasserka
b9829ed316
OpcodeDispatcher: Replace even more hand-written wrapper templates
2024-07-11 16:19:15 +02:00
Tony Wasserka
4ccec17676
OpcodeDispatcher: Replace more hand-written wrapper templates
2024-07-11 16:19:15 +02:00
Tony Wasserka
f45082043b
OpcodeDispatcher: Replace hand-written wrapper templates with a generic utility
2024-07-11 16:19:14 +02:00
Tony Wasserka
3222f13dde
Fix comment formatting
2024-07-11 16:19:14 +02:00
Mai
b282620a48
Merge pull request #3857 from Sonicadvance1/sve_bitperm
...
Arm64: Implement support for SVE bitperm
2024-07-11 05:05:41 -04:00
Ryan Houdek
e24b01b6cb
Arm64: Implement support for SVE bitperm
2024-07-11 01:46:35 -07:00
Tony Wasserka
9a8694c2f3
Merge pull request #3853 from neobrain/refactor_warn_fixes
...
Fix all the warnings
2024-07-11 10:12:41 +02:00
Tony Wasserka
070a9148aa
Merge pull request #3852 from neobrain/refactor_opdispatch_codesize
...
OpcodeDispatcher: Avoid template monomorphization to reduce FEXLoader binary size
2024-07-11 09:58:49 +02:00
Tony Wasserka
f19fe3b6f3
Fix warning about an expression with side effects being passed to __builtin_assume
...
LOGMAN_THROW_AA_FMT has no benefit over LOGMAN_THROW_A_FMT here, so just use
the latter.
2024-07-11 09:54:31 +02:00
Tony Wasserka
8d2b15665d
Fix unused-variable warnings
2024-07-11 09:54:30 +02:00
Tony Wasserka
5dc4ab062d
Fix invalid-offsetof warnings due to InternalThreadState not being standard layout
...
See https://github.com/llvm/llvm-project/issues/53021 for more information
about unique_ptr turning non-standard-layout.
2024-07-11 09:54:30 +02:00
Ryan Houdek
548fd9daf8
OpcodeDispatcher: Implement support for SSE4.1 NT load
2024-07-10 23:07:37 -07:00
Ryan Houdek
f831f5a0e1
AVX128: Implement support for NT Load
2024-07-10 23:07:14 -07:00
Ryan Houdek
4c21aa2604
Arm64: Implement support for NT Loads with ASIMD fallback
2024-07-10 23:06:46 -07:00
Ryan Houdek
c9efb75714
CodeEmitter: Implement support for SVE NT loads
2024-07-10 23:06:19 -07:00
Ryan Houdek
3554d5c2f7
HostFeatures: Check for SVE bit permute extension
2024-07-10 21:45:07 -07:00
Mai
5fe405e1fb
Merge pull request #3855 from neobrain/fix_aotir_uniqueptr
...
AOTIR: Change std::unique_ptr to fextl::unique_ptr
2024-07-10 17:04:12 -04:00
Tony Wasserka
56bb3744a5
AOTIR: Change std::unique_ptr to fextl::unique_ptr
2024-07-10 19:34:24 +02:00
Tony Wasserka
470b435afd
fextl: Properly handle nullptr arguments in fextl::default_delete
...
This reflects behavior of std::default_delete.
2024-07-10 19:17:50 +02:00
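For context, std::default_delete calls delete on its argument, and deleting a null pointer is a no-op. A custom deleter that destroys the object and frees the memory by hand has to check for null explicitly to get the same behaviour. The sketch below assumes such a deleter backed by malloc/free; SketchDelete and SketchMakeUnique are hypothetical names, and the real fextl::default_delete may differ.

```cpp
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// Hypothetical allocator-backed deleter, for illustration only.
template <typename T>
struct SketchDelete {
  void operator()(T* Ptr) const {
    // Match std::default_delete: a null argument is a no-op.
    if (Ptr == nullptr) {
      return;
    }
    std::destroy_at(Ptr); // run the destructor (C++17)
    std::free(Ptr);       // release the storage obtained from malloc
  }
};

// Allocates with malloc and constructs in place. Kept simple: it assumes
// T needs no more than malloc's default alignment and that T's
// constructor does not throw.
template <typename T, typename... Args>
std::unique_ptr<T, SketchDelete<T>> SketchMakeUnique(Args&&... args) {
  void* Mem = std::malloc(sizeof(T));
  if (Mem == nullptr) {
    throw std::bad_alloc{};
  }
  return std::unique_ptr<T, SketchDelete<T>>(new (Mem) T(std::forward<Args>(args)...));
}
```

std::unique_ptr never invokes its deleter on a null pointer, so the check matters mainly when the deleter is called directly.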