1691 Commits

Author SHA1 Message Date
Ryan Houdek
b0bd8a62a2
AVX128: Improve VPERMILPS/PD and VPSHUFD
VPSHUFD and VPERMILPS are aliases of each other.

Reuses the implementation path from the PSHUFD implementation which has
a few swizzles and then a table lookup.

VPERMILPD is a very simple swizzle per 128-bit lane.

Fixes #3797
Fixes #3784
2024-07-18 04:10:58 -07:00
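For reference, the per-lane swizzle that VPERMILPD performs can be modeled in a few lines of C++. This is only a sketch of the architectural behaviour of the immediate form on a 256-bit vector. It is not the code from this commit, and the function name PermilPDImm is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reference model, not FEX code: VPERMILPD ymm, ymm, imm8.
// Each destination element picks one of the two doubles that live in the
// same 128-bit lane of the source, based on one bit of the immediate.
std::array<double, 4> PermilPDImm(const std::array<double, 4>& src, std::uint8_t imm) {
  std::array<double, 4> dst{};
  for (std::size_t i = 0; i < dst.size(); ++i) {
    const std::size_t lane_base = (i / 2) * 2;   // first element of this 128-bit lane
    const std::size_t select = (imm >> i) & 1u;  // 0 = low double, 1 = high double
    dst[i] = src[lane_base + select];
  }
  return dst;
}
```

With an immediate of 0b0101 the two doubles in each lane trade places, and elements never cross a lane boundary, which is why the operation reduces to a simple swizzle per lane.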
Alyssa Rosenzweig
dedec83881 json_ir_generator: autoderive array names
these are purely internal.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:32:51 -04:00
Alyssa Rosenzweig
9fd5c73633 json_ir_generator: generate IsLoweredX87 helper
X87 pass will use this query.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:46 -04:00
Alyssa Rosenzweig
bdb890a8b0 json_ir_generator: rename X87 -> LoweredX87
to reflect its actual meaning

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:31:14 -04:00
Alyssa Rosenzweig
5043d09771 json_ir_generator: use textwrap.dedent, f-string
for size

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-17 15:28:59 -04:00
Ryan Houdek
da51169ba9
Merge pull request #3875 from alyssarosenzweig/ir/gethostflag
IR: garbage collect premature F80Cmp optimizations
2024-07-17 03:05:48 -07:00
Ryan Houdek
f72cee480f
Merge pull request #3874 from alyssarosenzweig/opt/reconstructftw
X87: save uop in ReconstructFTW
2024-07-17 03:05:37 -07:00
Alyssa Rosenzweig
e7d5a01c5f IR: remove F80Cmp flags
nothing is optimizing around this, it's just adding pointless complexity. if we
want to actually optimize F80Cmp, the right way would be to lift the
implementation into the OpcodeDispatcher or JIT. it wouldn't be terribly
difficult. This kludge doesn't get us closer there.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:53:58 -04:00
Alyssa Rosenzweig
0c3a8d0bc8 IR: remove GetHostFlag
it doesn't get host flags, it's just an extra Bfe used in x87. pointless and
confusing!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 14:44:34 -04:00
Alyssa Rosenzweig
c4ba7eee87 X87: save uop in ReconstructFTW
noticed while reviewing Paulo's work

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 13:54:09 -04:00
Paulo Matos
8d89adef2e Add IR stack operations
These IR operations deal implicitly with the x87 stack and are removed
by the x87 stack optimization pass.
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
6615b55c12 json_ir_generator: call RecordX87Use when generating ops
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
66865dd177 json_ir_generator: alias X87 to !JITDispatch
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
1e709d1150 OpcodeDispatcher: add RecordX87 helper
calls will be generated.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Alyssa Rosenzweig
476ee0cd7d IR: track whether x87 is used in header
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-16 09:07:35 +02:00
Ryan Houdek
b8e864ffdf
Merge pull request #3865 from Sonicadvance1/telemetry_atexit
Telemetry: Change how visibility of telemetry values work
2024-07-15 09:53:37 -07:00
Ryan Houdek
d79b7fcc49
Merge pull request #3808 from alyssarosenzweig/rclse/3
Try to delete RCLSE again
2024-07-12 20:38:06 -07:00
Ryan Houdek
b9a6caea8d
Merge pull request #3844 from Sonicadvance1/fix_vmovq
AVX128: Fixes vmovq loading too much data
2024-07-12 17:07:32 -07:00
Ryan Houdek
97a68cb643
Telemetry: Change how visibility of telemetry values work
Removes global initializer for telemetry values since their address is
visible and PIC relative code loading handles the address fetching for
us.
2024-07-12 03:18:23 -07:00
Ryan Houdek
870e395ac4
Merge pull request #3862 from Sonicadvance1/remove_atexit_logman
LogManager: Removes fextl::vector usage
2024-07-12 02:05:02 -07:00
Ryan Houdek
04592f82f5
Merge pull request #3861 from Sonicadvance1/remove_atexit_vdso
VDSO: Stop using a vector for a static
2024-07-12 02:04:25 -07:00
Ryan Houdek
5ef0db994d
VDSO: Stop using a vector for a static
This causes a global initializer that registers an atexit handler.

Be smarter, use an std::array and pass its data around using a span
instead.

Removes the global initializer and removes the atexit installation
2024-07-11 23:53:57 -07:00
Ryan Houdek
b523407a3e
LogManager: Removes fextl::vector usage
We never use more than one logging method at a time so this was
overengineered for what it is doing.

Instead, only allow one handler each for messages and throw messages,
each of which is just a pointer.

Removes a global initializer and an atexit handler being installed
2024-07-11 22:51:56 -07:00
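The idea of replacing a container of handlers with a single pointer per message kind can be sketched as follows. This is an illustrative assumption, not the LogManager code: the namespace and the names MsgHandler, ThrowHandler, InstallHandlers, Msg and Throw are all hypothetical.

```cpp
namespace sketch {

// Hypothetical handler signatures, one for messages and one for throw messages.
using MsgHandler = void (*)(const char* message);
using ThrowHandler = void (*)(const char* message);

// Plain pointers are constant-initialized and trivially destructible, so they
// need neither a global initializer nor an atexit-registered destructor.
static MsgHandler CurrentMsgHandler = nullptr;
static ThrowHandler CurrentThrowHandler = nullptr;

void InstallHandlers(MsgHandler msg, ThrowHandler thrower) {
  CurrentMsgHandler = msg;
  CurrentThrowHandler = thrower;
}

void Msg(const char* message) {
  if (CurrentMsgHandler != nullptr) {
    CurrentMsgHandler(message);
  }
}

void Throw(const char* message) {
  if (CurrentThrowHandler != nullptr) {
    CurrentThrowHandler(message);
  }
}

} // namespace sketch
```

Installing a new handler simply overwrites the previous one, which matches the observation that only one logging method is in use at a time.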
Ryan Houdek
8021dc10a1
OpcodeDispatcher: Force noinline for the function call in the Bind helper
Clang was inlining a few of the functions it was calling. So force it
never to inline since this is supposed to be a little shim trampoline
only.
2024-07-11 19:00:42 -07:00
Ryan Houdek
7e8d734e43
AVX256: Initial fixes just to get my unittest working
This is the initial split to decouple AVX256 composed operations from
their MMX/SSE counterparts. This is to work around the subtle
differences with AVX/SSE zext/insert behaviour.
2024-07-11 18:43:31 -07:00
Ryan Houdek
3c7318d7c8
AVX128: Fixes vmovq loading too much data
This was doing a 128-bit load from memory and then a 64-bit zero extend
which looked like a spurious move but it was trying to match the
behaviour of vmovq where it needed the zero extend.

Also adds a unit test to ensure that we aren't loading too much data by
loading right up against a page boundary.

Fixes #3787
2024-07-11 18:34:05 -07:00
Ryan Houdek
fc0b233046
Merge pull request #3859 from neobrain/refactor_opdispatch_templates
OpcodeDispatcher: Replace hand-written wrapper templates with a generic utility
2024-07-11 18:18:23 -07:00
Mai
e25918d846
Merge pull request #3858 from Sonicadvance1/implement_nt_load
Implement support for SSE4.1/AVX NT loads
2024-07-11 14:22:41 -04:00
Alyssa Rosenzweig
3a334c4585 Reapply "IR: drop RCLSE"
This reverts commit 78aee4d96e.
2024-07-11 13:21:14 -04:00
Alyssa Rosenzweig
8dae4bcd44 OpcodeDispatcher: drop stale comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-11 13:21:14 -04:00
Alyssa Rosenzweig
294f10fdd0 OpcodeDispatcher: reg cache mmx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-11 13:21:14 -04:00
Tony Wasserka
b9829ed316 OpcodeDispatcher: Replace even more hand-written wrapper templates
2024-07-11 16:19:15 +02:00
Tony Wasserka
4ccec17676 OpcodeDispatcher: Replace more hand-written wrapper templates
2024-07-11 16:19:15 +02:00
Tony Wasserka
f45082043b OpcodeDispatcher: Replace hand-written wrapper templates with a generic utility
2024-07-11 16:19:14 +02:00
Tony Wasserka
3222f13dde Fix comment formatting
2024-07-11 16:19:14 +02:00
Mai
b282620a48
Merge pull request #3857 from Sonicadvance1/sve_bitperm
Arm64: Implement support for SVE bitperm
2024-07-11 05:05:41 -04:00
Ryan Houdek
e24b01b6cb
Arm64: Implement support for SVE bitperm
2024-07-11 01:46:35 -07:00
Tony Wasserka
9a8694c2f3
Merge pull request #3853 from neobrain/refactor_warn_fixes
Fix all the warnings
2024-07-11 10:12:41 +02:00
Tony Wasserka
070a9148aa
Merge pull request #3852 from neobrain/refactor_opdispatch_codesize
OpcodeDispatcher: Avoid template monomorphization to reduce FEXLoader binary size
2024-07-11 09:58:49 +02:00
Tony Wasserka
f19fe3b6f3 Fix warning about an expression with side effects being passed to __builtin_assume
LOGMAN_THROW_AA_FMT has no benefit over LOGMAN_THROW_A_FMT here, so just use
the latter.
2024-07-11 09:54:31 +02:00
Tony Wasserka
8d2b15665d Fix unused-variable warnings
2024-07-11 09:54:30 +02:00
Tony Wasserka
5dc4ab062d Fix invalid-offsetof warnings due to InternalThreadState not being standard layout
See https://github.com/llvm/llvm-project/issues/53021 for more information
about unique_ptr turning non-standard-layout.
2024-07-11 09:54:30 +02:00
Ryan Houdek
548fd9daf8
OpcodeDispatcher: Implement support for SSE4.1 NT load
2024-07-10 23:07:37 -07:00
Ryan Houdek
f831f5a0e1
AVX128: Implement support for NT Load
2024-07-10 23:07:14 -07:00
Ryan Houdek
4c21aa2604
Arm64: Implement support for NT Loads with ASIMD fallback
2024-07-10 23:06:46 -07:00
Ryan Houdek
c9efb75714
CodeEmitter: Implement support for SVE NT loads
2024-07-10 23:06:19 -07:00
Ryan Houdek
3554d5c2f7
HostFeatures: Check for SVE bit permute extension
2024-07-10 21:45:07 -07:00
Mai
5fe405e1fb
Merge pull request #3855 from neobrain/fix_aotir_uniqueptr
AOTIR: Change std::unique_ptr to fextl::unique_ptr
2024-07-10 17:04:12 -04:00
Tony Wasserka
56bb3744a5 AOTIR: Change std::unique_ptr to fextl::unique_ptr
2024-07-10 19:34:24 +02:00
Tony Wasserka
470b435afd fextl: Properly handle nullptr arguments in fextl::default_delete
This reflects behavior of std::default_delete.
2024-07-10 19:17:50 +02:00
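As a rough illustration of why a custom deleter needs an explicit null check: std::default_delete ends in a delete expression, which is a no-op for a null pointer, but a deleter that runs the destructor by hand before releasing memory has no such protection. The sketch below assumes a malloc/free backed allocator. NullSafeDelete, MakeRaw and Tracker are hypothetical names, not the fextl implementation.

```cpp
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical deleter for objects placed in malloc'ed storage.
template <typename T>
struct NullSafeDelete {
  void operator()(T* ptr) const {
    if (ptr == nullptr) {
      return;  // mirror `delete nullptr`: do nothing
    }
    ptr->~T();       // would dereference null without the check above
    std::free(ptr);
  }
};

// Hypothetical helper that constructs a T in malloc'ed storage.
template <typename T, typename... Args>
T* MakeRaw(Args&&... args) {
  void* storage = std::malloc(sizeof(T));
  if (storage == nullptr) {
    return nullptr;
  }
  return new (storage) T(std::forward<Args>(args)...);
}

// Small type that counts destructor calls so the behaviour can be observed.
struct Tracker {
  explicit Tracker(int* counter) : counter(counter) {}
  ~Tracker() { ++*counter; }
  int* counter;
};
```

Invoking NullSafeDelete<Tracker>{} on a null pointer leaves the counter untouched, while invoking it on a pointer returned by MakeRaw runs the destructor exactly once and frees the storage.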