4074 Commits

Author SHA1 Message Date
Paulo Matos
88b01a0ca9 Ignore python files for clang-format 2024-06-19 14:23:27 +02:00
Lioncache
afa7de969e Externals: Update vixl submodule
Updates vixl to track the latest upstream changes that fix erroneous
non-zeroing behavior for 256-bit vectors
2024-06-14 15:57:03 -04:00
Paulo Matos
bcc136c3b9 Fix exec path where file needs to be ignored
Ignored files were not being checked. The clang-format.py wrapper
and code-format-helper were not aligned.
2024-05-13 11:41:44 +02:00
Paulo Matos
e1efde9605 CI workflow to check clang-format usage on pull requests
Adapted from LLVM version of pr-code-format.yml.
Copies a few scripts from LLVM to External/.
Runs self-hosted on X64.

Assumes clang-format 16.0.6 for formatting.
2024-04-15 14:05:48 +02:00
Billy Laws
7ff4cd9108 Update jemalloc submodule 2024-04-09 23:43:49 +00:00
Ryan Houdek
ce7acd9b71 External/drm-headers: Update to v6.8 2024-03-10 15:45:34 -07:00
Tony Wasserka
6edba49784 Update Catch2 to v3.5.3 2024-03-05 12:15:29 +01:00
Tony Wasserka
31e976a5bc Library Forwarding: Update Vulkan definitions to v1.3.278 2024-02-29 19:08:38 +01:00
Ryan Houdek
78a362581d Update xxhash to v0.8.2
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Ryan Houdek
ccc699444d Update vixl
Removes a commit from our fork.
2024-02-26 23:16:31 -08:00
Ryan Houdek
c333aac4f9 Merge pull request #3354 from Sonicadvance1/uprev_kernel_2
Linux uprev to v6.6
2024-01-03 14:25:13 -08:00
Ryan Houdek
f02bd5c9a6 Linux/drm: Update to v6.6 2023-12-25 08:28:45 -08:00
Ryan Houdek
8b24f7fc26 Externals: Update xbyak to v7.02 and switch away from fork
The last few patches we need have been upstreamed so we shouldn't need
our downstream fork anymore.
2023-12-21 01:52:05 -08:00
Lioncache
4ccc40f697 Externals: Update fmt from 10.1.0 to 10.1.1
Notably this bugfix version also introduces support for formatting
std::atomic types and std::atomic_flag.

Also, of course, this keeps our tracked external up to date.
2023-10-13 13:31:49 -04:00
Ryan Houdek
0092ea7c0b External: Remove a spurious license
This doesn't exist anymore
2023-10-06 09:37:17 -07:00
Tony Wasserka
04592af609 Thunks: Update Vulkan thunk to v1.3.261.1 2023-09-26 12:14:58 +02:00
Lioncache
989fe22e2d Externals: Update Catch2 to v2.13.10
Updates it to the latest v2 branch tag
2023-08-24 18:04:59 -04:00
Lioncache
9035a29906 Externals: Update fmt to 10.1.0
Updates fmt to the latest version.
2023-08-24 17:01:40 -04:00
Alyssa Rosenzweig
af21b8f3c7 Move External/FEXCore/ to FEXCore/
It is not an external component, and it makes paths needlessly long.
Ryan seemed amenable to this when we discussed on IRC earlier.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-08-17 16:32:16 -04:00
Lioncache
e14e71aaff IR: Allow 128-bit broadcasts in VBroadcastFromMem
Now all vbroadcast implementations go down the more optimal path.

For non-SVE 128-bit cases where we only have 128-bit wide registers,
we behave like ld1rqb and just act as a normal 128-bit load for
interface convenience.
2023-08-17 14:20:12 -04:00
Ryan Houdek
a9dea29f03 Merge pull request #2917 from lioncash/quad
ARMEmitter: Handle SVE load and broadcast quadword groups
2023-08-17 10:59:27 -07:00
Lioncache
f97df2a40f ARMEmitter: Handle SVE load and broadcast quadword (scalar plus scalar) group 2023-08-17 13:32:20 -04:00
Lioncache
6f9bc1e2fe ARMEmitter: Handle SVE load and broadcast quadword (scalar plus imm) category 2023-08-17 13:32:17 -04:00
Ryan Houdek
49b8b7cd2c Merge pull request #2916 from Sonicadvance1/128bit_predicate
Arm64Emitter: Ensure that 128-bit predicate is generated with SVE
2023-08-17 09:56:56 -07:00
Ryan Houdek
059c022255 Arm64Emitter: Ensure that 128-bit predicate is generated with SVE
When running on a 128-bit SVE system this predicate wasn't set up.
Since we never had any predicate usage before, this wasn't an
issue. Now that #2914 is using the 128-bit predicate we need to make
sure that we are generating it.
2023-08-17 09:37:55 -07:00
Lioncache
25708be807 IR: Add TSO handling to VBroadcastFromMem 2023-08-17 12:30:57 -04:00
Lioncache
c8e3ca481f OpcodeDispatcher: Remove explicit zero-extending in VBROADCASTOp
Since the implementations zero the upper lanes when appropriate, we can
remove the unnecessary explicit move.
2023-08-17 12:17:24 -04:00
Lioncache
879bc5176e IR: Add VBroadcastFromMem opcode
Allows the implementations of the vbroadcast instructions to perform the
load and broadcast in one operation as opposed to doing the load and then
broadcast separately.

Notably, the broadcasting loads can also be used on systems that have
128-bit SVE support, not only 256-bit.

On non-SVE systems, we use the equivalent AdvSIMD instructions.
2023-08-17 12:17:24 -04:00
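As a rough illustration of what the commit above describes, the sketch below models memory as a byte string and shows that a combined load-and-broadcast produces the same vector as a separate load followed by a broadcast. The function names and the byte-level model are assumptions made for this example. They are not FEX's IR or emitter API.

```python
def load_element(memory: bytes, address: int, element_size: int) -> bytes:
    """Load one element of element_size bytes from the modelled memory."""
    if element_size <= 0:
        raise ValueError("element size must be positive")
    element = memory[address:address + element_size]
    if len(element) != element_size:
        raise ValueError("load runs past the end of memory")
    return element


def broadcast(element: bytes, vector_size: int) -> bytes:
    """Replicate an element across every lane of a vector_size-byte register."""
    if vector_size % len(element) != 0:
        raise ValueError("vector size must be a multiple of the element size")
    return element * (vector_size // len(element))


def vbroadcast_from_mem(memory: bytes, address: int,
                        element_size: int, vector_size: int) -> bytes:
    """Hypothetical fused form: one operation instead of load + broadcast."""
    return broadcast(load_element(memory, address, element_size), vector_size)
```

In this model, a 16-byte element broadcast into a 16-byte vector is just a plain 128-bit load, which matches the behaviour described in commit e14e71aaff for the non-SVE 128-bit case.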
Ryan Houdek
6d562f8b3b Merge pull request #2911 from Sonicadvance1/stop_abusing_orr
Arm64: Stop abusing orr in LoadConstant
2023-08-17 09:13:36 -07:00
Ryan Houdek
1029bb1fae Merge pull request #2910 from Sonicadvance1/minor_bfi_opt
Arm64: Optimize non-optimal BFI move case
2023-08-17 09:12:37 -07:00
Ryan Houdek
1343c14db0 Merge pull request #2909 from Sonicadvance1/optimize_clzero_clear
Arm64: Optimize CacheLine{Clear,Clean}
2023-08-17 09:10:45 -07:00
Lioncache
77c64285cb OpcodeDispatcher: Remove unused variable in AVXVectorUnaryOpImpl
Forgot to remove this when getting rid of the unnecessary
explicit zero-extending behavior
2023-08-17 11:26:41 -04:00
Ryan Houdek
fe37c89109 FEXCore/Config: Stop making temporary string copies
For config values that were string objects we were unnecessarily creating
copies each time the string was accessed.

Convert the () operator over to returning a reference.
2023-08-16 21:35:13 -07:00
Ryan Houdek
23fd79a3b3 Arm64: Stop abusing orr in LoadConstant
The current implementation uses orr excessively. This causes FEX to miss
hardware optimization opportunities where some CPU cores will zero-cycle
move constants that fit into the 16 bits of movz/movk.

First, evaluate up front whether the number of 16-bit segments is > 1. In
those cases we should check if it is a bitfield that can be moved in one
instruction with orr.

After that point we will use movz for 16-bit constant moves.

Additionally this optimizes the case where a constant of zero is loaded
to be a `mov <reg>, zr` which gets renamed in most hardware.
2023-08-16 19:35:15 -07:00
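The selection order described in the commit above can be sketched roughly as follows. This is an illustrative model, not the FEX implementation. The function names, the textual instruction output, and the default register `x0` are assumptions for the example. It only handles 64-bit constants and ignores other encodings such as movn. The helper checks whether a value is an AArch64 logical (bitmask) immediate, meaning a rotated run of ones replicated across 2-, 4-, 8-, 16-, 32- or 64-bit elements.

```python
MASK64 = (1 << 64) - 1


def _ror(value: int, shift: int, size: int) -> int:
    """Rotate a size-bit value right by shift bits."""
    shift %= size
    return ((value >> shift) | (value << (size - shift))) & ((1 << size) - 1)


def is_logical_immediate(value: int) -> bool:
    """True if value is encodable as a 64-bit AArch64 bitmask immediate."""
    value &= MASK64
    if value == 0 or value == MASK64:
        return False  # all-zeros and all-ones are not encodable
    for size in (2, 4, 8, 16, 32, 64):
        element = value & ((1 << size) - 1)
        replicated = 0
        for i in range(64 // size):
            replicated |= element << (i * size)
        if replicated != value:
            continue
        # The element must be a rotation of a contiguous run of ones.
        for shift in range(size):
            rotated = _ror(element, shift, size)
            if rotated != 0 and (rotated & (rotated + 1)) == 0:
                return True
    return False


def plan_load_constant(value: int, reg: str = "x0") -> list:
    """Return a list of assembly strings that would materialize value."""
    value &= MASK64
    if value == 0:
        return [f"mov {reg}, xzr"]  # zero case: move from the zero register
    segments = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    nonzero = [(i, seg) for i, seg in enumerate(segments) if seg]
    # Only consider orr when more than one 16-bit segment is populated.
    if len(nonzero) > 1 and is_logical_immediate(value):
        return [f"orr {reg}, xzr, #{value:#x}"]
    instructions = []
    for n, (i, seg) in enumerate(nonzero):
        op = "movz" if n == 0 else "movk"
        suffix = f", lsl #{16 * i}" if i else ""
        instructions.append(f"{op} {reg}, #{seg:#x}{suffix}")
    return instructions
```

With this ordering, a constant such as `0xFFFF0000` is emitted as a single movz even though it is also a valid bitmask immediate, which is the case the commit message says some cores can handle as a zero-cycle move.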
Ryan Houdek
a3b40c37c2 Arm64: Optimize non-optimal BFI move case
Commonly we are doing a BFI into a 32-bit register, which is hitting the
ubfx (lsr alias) path.

In the case of a 32-bit destination we can also do a regular move, which
will take advantage of the CPU's rename functionality and give a minor speed
boost.
2023-08-16 14:35:41 -07:00
Ryan Houdek
4522a766e0 Arm64: Optimize CacheLine{Clear,Clean}
When the cacheline size matches the expected x86 cacheline size then we
can remove the spurious move + add.
2023-08-16 14:20:22 -07:00
Ryan Houdek
fc12958095 FEXCore/IR: Fixes bug in IRDumper without specification
Didn't notice this in the previous PR. When DUMPIR=stderr is set without any
selection of where to place it in PASSMANAGERDUMPIR, it was supposed to
put the dumper at the end of the passes.

We need to make sure that it is placed at the end of the passes rather
than at the current `it`.
2023-08-16 13:51:03 -07:00
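The placement issue described above can be pictured with a small model of a pass list. This is only a sketch: the pass names, the `"IRDumper"` label, and the function signature are placeholders invented for the example and do not reflect FEX's pass manager code.

```python
def place_ir_dumper(passes, current_index, placement_selected):
    """Return a new pass list with a hypothetical "IRDumper" entry added.

    placement_selected models whether a position was chosen explicitly.
    Without one, the dumper belongs at the end of the list rather than
    at whatever position the insertion cursor currently points to.
    """
    result = list(passes)
    if placement_selected:
        result.insert(current_index, "IRDumper")
    else:
        result.append("IRDumper")
    return result
```

For example, with three passes and the cursor at index 1, leaving the placement unselected yields the dumper as the fourth entry, after every other pass has run.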
Mai
df3d4efc80 Merge pull request #2904 from Sonicadvance1/instcountci_only_arm
GitHub: Only enable InstCountCI on an ARM platform
2023-08-15 17:33:10 -04:00
Ryan Houdek
1441cb76b9 HostFeatures: Adds support for overriding ARMv8.1 LSE atomics
Always enable it on the InstCountCI.
2023-08-15 14:12:27 -07:00
Lioncache
17956eac5f OpcodeDispatcher: Eliminate unnecessary moves in AVXVectorUnaryOpImpl
We no longer need to do any manual zero-extending here, since this
will occur automatically on hardware with SVE when 128-bit AdvSIMD
is used.
2023-08-15 15:43:34 -04:00
Lioncache
2708374d95 Arm64/VectorOps: Remove redundant move in VFRSqrt SVE path
We can perform the SQRT first and then broadcast 1.0 into the destination
since all the intermediary work is done, meaning we don't have to worry
about Dst and Vector aliasing one another.
2023-08-15 15:22:21 -04:00
Lioncache
6acce60855 ARMEmitter: Handle SVE load and broadcast element group
These can be used to improve vbroadcast implementations from
doing a mem load+dup in the non-GPR case into just directly
loading into the destination.
2023-08-15 13:47:12 -04:00
Lioncache
81115f64f6 ARMEmitter: Handle SVE Store Multiple Structures (scalar plus scalar) 2023-08-15 10:18:37 -04:00
Lioncache
0176efa3bb ARMEmitter: Handle SVE Load Multiple Structures (scalar plus scalar) group 2023-08-15 10:01:15 -04:00
Ryan Houdek
398e76be89 X86Tables: Fixes typo 2023-08-14 16:04:05 -07:00
Ryan Houdek
f248e7f3e7 Config: If DumpIR is enabled, default enable a passmanager option
If DumpIR is enabled but the PassManagerDumpIR option isn't enabled then
this currently does nothing.

As a convenience, enable dumping the final optimized IR if an option
hasn't been specified.
2023-08-14 12:29:56 -07:00
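A minimal sketch of the defaulting behaviour described above, assuming the two options are plain strings. The specific values used here (`"no"`, `"off"`, `"afteropt"`) and the function name are assumptions for illustration and may not match FEX's actual option values.

```python
def resolve_dump_options(dump_ir: str, pass_manager_dump_ir: str):
    """Apply the convenience default for the pass manager dump option.

    If IR dumping is requested but no pass manager stage was selected,
    fall back to dumping the final optimized IR.
    """
    if dump_ir != "no" and pass_manager_dump_ir == "off":
        pass_manager_dump_ir = "afteropt"  # assumed name for "after optimization"
    return dump_ir, pass_manager_dump_ir
```

An explicit stage selection is left untouched, so the default only applies when the user has not chosen one.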
Ryan Houdek
e51606c669 Config: Fixes mixup in PassManagerDumpIR
The opt and pass options were inverted in PassManager.
Renames the enum to make this more clear.
2023-08-14 12:28:37 -07:00
Ryan Houdek
648d8aeb65 Config: Adds missing server option to DumpIR description
This was accepted but I failed to describe it when added.
2023-08-14 12:22:35 -07:00
Ryan Houdek
112c463655 Config: Ensure OutputLog to server doesn't try to expand path
"server" isn't a path; this was missed when it was added.
2023-08-14 12:20:58 -07:00
Alyssa Rosenzweig
7ecbbd6c04 ConstProp: Fix set-but-not-used mask variable
I think this was the intended logic?

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-08-14 11:59:59 -04:00