6480 Commits

Author SHA1 Message Date
Triang3l
22eb8747d3 [GPU/Kernel] Fix space-prefixed hexadecimal number printing 2022-01-29 14:02:55 +03:00
Triang3l
fe3f0f26e4 [UI] Image post-processing and full presentation/window rework
[GPU] Add FXAA post-processing
[UI] Add FidelityFX FSR and CAS post-processing
[UI] Add blue noise dithering from 10bpc to 8bpc
[GPU] Apply the DC PWL gamma ramp closer to the spec, supporting fully white color
[UI] Allow the GPU CP thread to present on the host directly, bypassing the UI thread OS paint event
[UI] Allow variable refresh rate (or tearing)
[UI] Present the newest frame (restart) on DXGI
[UI] Replace GraphicsContext with a far more advanced Presenter with more coherent surface connection and UI overlay state management
[UI] Connect presentation to windows via the Surface class, not native window handles
[Vulkan] Switch to simpler Vulkan setup with no instance/device separation due to interdependencies and to pass fewer objects around
[Vulkan] Lower the minimum required Vulkan version to 1.0
[UI/GPU] Various cleanup, mainly ComPtr usage
[UI] Support per-monitor DPI awareness v2 on Windows
[UI] DPI-scale Dear ImGui
[UI] Replace the remaining non-detachable window delegates with unified window event and input listeners
[UI] Allow listeners to safely destroy or close the window, and to register/unregister listeners without use-after-free and the ABA problem
[UI] Explicit Z ordering of input listeners and UI overlays, top-down for input, bottom-up for drawing
[UI] Add explicit window lifecycle phases
[UI] Replace Window virtual functions with explicit desired state, its application, actual state, its feedback
[UI] GTK: Apply the initial size to the drawing area
[UI] Limit internal UI frame rate to that of the monitor
[UI] Hide the cursor using a timer instead of polling due to no repeated UI thread paints with GPU CP thread presentation, and only within the window
2022-01-29 13:22:03 +03:00
Pseudo-Kernel
372bdd3ec9
[APU] XMA: Fix audio loop handling.
Handles audio loop if loop_start < loop_end.
Need to handle additional cases like loop_start > loop_end.
2022-01-29 02:49:00 -06:00
TranzRail
1d51b574ec [Kernel] Add PVR opcode (includes cvars support) 2022-01-29 02:44:55 -06:00
Wunkolo
24205ee860 [x64] Fix VECTOR_SH{L,R,A}_V128(Int8) masking
[AltiVec](https://www.nxp.com/docs/en/reference-manual/ALTIVECPEM.pdf)
doc says that it just uses the lower `log2(n)` bits of the shift-amount
rather than the whole element-sized value. So there is no need to handle
an overflow. Also adjusts 64-bit literals to utilize the explicit
`UINT64_C` type.
2022-01-29 02:39:34 -06:00
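The masking rule described in this commit can be illustrated with a minimal scalar sketch. It is not the emitter code from the commit; the function names are hypothetical and only show that, for 8-bit lanes, the low `log2(8) = 3` bits of the shift amount are used, so an out-of-range amount never needs separate overflow handling.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers (hypothetical names, not from the Xenia source).
// For 8-bit elements only the low 3 bits of the shift amount are used.
constexpr uint8_t ShiftLeftInt8(uint8_t value, uint8_t shift) {
  return static_cast<uint8_t>(value << (shift & 0x7));
}

constexpr uint8_t ShiftRightInt8(uint8_t value, uint8_t shift) {
  return static_cast<uint8_t>(value >> (shift & 0x7));
}

// Right-shifting a negative signed value is arithmetic on mainstream
// compilers (and guaranteed to be since C++20).
constexpr int8_t ShiftRightArithmeticInt8(int8_t value, uint8_t shift) {
  return static_cast<int8_t>(value >> (shift & 0x7));
}
```

With these helpers, a shift amount of 9 behaves like a shift by 1, and a shift amount of 8 behaves like a shift by 0.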
Wunkolo
f8350b5536 [x64] Add VECTOR_SH{R,L}_I8_SAME_CONSTANT unit test
This is to target the new GFNI-based optimization for the Int8 case.
2022-01-29 02:39:34 -06:00
Wunkolo
bd9a290b30 [x64] Add GFNI-based optimization for VECTOR_SH{R,L}_V128(Int8)
In the `Int8` case of `VECTOR_SH{R,L}_V128`, when all the values are the
same, then a single-instruction `gf2p8affineqb` can be emitted that does
an int8-based shift, utilizing GF(2^8) arithmetic.

More info here:
https://wunkolo.github.io/post/2020/11/gf2p8affineqb-int8-shifting/

Also fixes the iteration type used when detecting whether all of the SIMD
lanes hold the same value (it was iterating `u16` and not `u8`).
2022-01-29 02:39:34 -06:00
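How a single `gf2p8affineqb` can act as a per-byte shift can be sketched with a scalar model. This is an illustration under stated assumptions, not the code from the commit: `AffineByte` models one byte lane with an immediate of zero, following the instruction's documented definition (result bit `i` is the parity of matrix byte `7 - i` ANDed with the input byte), and the matrix-building helpers are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one byte lane of gf2p8affineqb with imm8 = 0:
// result bit i = parity(matrix.byte[7 - i] & x).
inline uint8_t AffineByte(uint64_t matrix, uint8_t x) {
  uint8_t result = 0;
  for (int i = 0; i < 8; ++i) {
    const uint8_t row = static_cast<uint8_t>(matrix >> (8 * (7 - i)));
    unsigned masked = row & x;
    // Fold the masked bits down to a single parity bit.
    masked ^= masked >> 4;
    masked ^= masked >> 2;
    masked ^= masked >> 1;
    result |= static_cast<uint8_t>((masked & 1u) << i);
  }
  return result;
}

// Hypothetical helper: make output bit `out_bit` copy input bit `in_bit`.
inline uint64_t WithRow(uint64_t matrix, int out_bit, int in_bit) {
  return matrix | ((uint64_t{1} << in_bit) << (8 * (7 - out_bit)));
}

// Logical shift left: output bit i takes input bit i - shift.
inline uint64_t ShiftLeftMatrix(int shift) {
  uint64_t m = 0;
  for (int i = shift; i < 8; ++i) m = WithRow(m, i, i - shift);
  return m;
}

// Logical shift right: output bit i takes input bit i + shift.
inline uint64_t ShiftRightMatrix(int shift) {
  uint64_t m = 0;
  for (int i = 0; i + shift < 8; ++i) m = WithRow(m, i, i + shift);
  return m;
}

// Arithmetic shift right: bits shifted in from the top copy the sign bit.
inline uint64_t ShiftRightArithmeticMatrix(int shift) {
  uint64_t m = 0;
  for (int i = 0; i < 8; ++i) {
    m = WithRow(m, i, i + shift < 8 ? i + shift : 7);
  }
  return m;
}
```

Because the matrix depends only on the shift amount, it can be computed once when every lane shifts by the same constant, which is the case this optimization targets.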
Caroline Joy Bell
7418011ab5 Reflect the closure of #1333
#1333 has been closed, so the README should reflect that.
This line can be replaced with another suggestion from the good-first-issue tag.

I suggest contributing to #1526 in README.md
2022-01-28 23:01:50 +03:00
Margen67
b76e0a1bd8 Update to new issue forms 2022-01-28 22:53:41 +03:00
Joel Linn
dbbf401205 [Base] Align test memory 2022-01-25 12:55:10 -06:00
Joel Linn
464795eece [CI] No more GCC Release builds
GNU ld is awfully slow, even more so with LTO
2022-01-25 12:55:10 -06:00
Joel Linn
16be84cdb0 [CI] Add valgrind step to drone 2022-01-25 12:55:10 -06:00
Rick Gibbed
c481b0483c
Add links to release-builds-windows. 2022-01-25 08:05:51 -06:00
Margen67
08a2a5b7ac [AppVeyor] Upgrade to VS2022 2022-01-24 10:34:03 -06:00
Rick Gibbed
e49916ea0a [XAM] Improvements to profile r/w setting exports
[XAM] Improvements to XamUserReadProfileSettingsEx/
XamUserWriteProfileSettings.

- Unify X_USER_READ_PROFILE_SETTING and X_USER_WRITE_PROFILE_SETTING
  into X_USER_PROFILE_SETTING.
- Clean up Setting serialization to use X_USER_PROFILE_SETTING_DATA
  instead of manual buffer copying.
- Fix XamUserReadProfileSettingsEx case where user_index is non-zero
  and xuids are being used.
- Skip unset settings in XamUserWriteProfileSettings_entry.
2022-01-24 07:29:57 -06:00
Rick Gibbed
b3bb6687db
[AppVeyor] Fix extended commit message. 2022-01-24 06:54:11 -06:00
Rick Gibbed
cb83479c53
[AppVeyor] Fix deploy tag/description. 2022-01-24 06:05:45 -06:00
Rick Gibbed
02bdea7b8b
[AppVeyor] Better deploy tag/description. 2022-01-24 05:09:07 -06:00
Rick Gibbed
bf20aa5f8e
[AppVeyor] Deploy artifacts to GitHub Releases.
[AppVeyor] Deploy artifacts to GitHub Releases, take four.

I guess before_deploy is broken?
2022-01-24 03:52:38 -06:00
Rick Gibbed
a7f3b11076
[AppVeyor] Deploy artifacts to GitHub Releases.
[AppVeyor] Deploy artifacts to GitHub Releases, take three.
2022-01-24 02:54:05 -06:00
Rick Gibbed
258581ee07
[AppVeyor] Deploy artifacts to GitHub Releases
[AppVeyor] Deploy artifacts to GitHub Releases, take two.
2022-01-24 02:15:23 -06:00
Rick Gibbed
fcd4f69e0f
[AppVeyor] Deploy artifacts to GitHub Releases 2022-01-24 01:40:57 -06:00
Margen67
564a6d6238 [App] Disable stuff that crashes the emulator 2022-01-23 11:57:40 -06:00
Wunkolo
f7c14a089d [x64] Add host-extension detection preprocessor
Rather than having a huge list of if-statements that all do the same
thing, this preprocessor macro allows a more concise pattern for detecting
whether the emit-flag is enabled, along with the corresponding Xbyak flag
that must be checked before the feature-flag may be emitted.

Also moved the AVX-check to the beginning to early-out rather than do a
bunch of wasted work only to find out at the end that the host doesn't even
support AVX.
2022-01-23 05:04:56 -06:00
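The pattern this commit describes might look roughly like the sketch below. Every name here is an assumption made for illustration: `EmitFlag`, `HostCpu`, `DetectFeatures`, and `TEST_EMIT_FEATURE` are hypothetical, `HostCpu` stands in for the Xbyak CPU query, and returning `false` on a missing AVX is simply this sketch's way of showing the early-out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical emit-flags that a JIT backend could consult.
enum EmitFlag : uint32_t {
  kEmitAVX2 = 1u << 0,
  kEmitBMI2 = 1u << 1,
  kEmitGFNI = 1u << 2,
};

// Hypothetical stand-in for the host CPU feature query.
struct HostCpu {
  bool avx;
  bool avx2;
  bool bmi2;
  bool gfni;
};

// Fills `feature_flags` with every flag that is both enabled in
// `enabled_mask` and supported by the host. Returns false right away
// if the host lacks AVX, before any other work is done.
inline bool DetectFeatures(const HostCpu& cpu, uint32_t enabled_mask,
                           uint32_t& feature_flags) {
  feature_flags = 0;
  if (!cpu.avx) {
    return false;  // Early-out: nothing else matters without AVX.
  }

// One line per feature instead of one hand-written if-statement each.
#define TEST_EMIT_FEATURE(emit, host_has)      \
  if ((enabled_mask & (emit)) == (emit)) {     \
    feature_flags |= (host_has) ? (emit) : 0u; \
  }

  TEST_EMIT_FEATURE(kEmitAVX2, cpu.avx2);
  TEST_EMIT_FEATURE(kEmitBMI2, cpu.bmi2);
  TEST_EMIT_FEATURE(kEmitGFNI, cpu.gfni);

#undef TEST_EMIT_FEATURE
  return true;
}
```

Adding a feature then means adding one macro line, and clearing a bit in the mask keeps a feature off even on hosts that support it.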
Joel Linn
e4ae1d8b2f [Base] Fix copy_and_swap_16_in_32_aligned 2022-01-22 16:18:54 +03:00
Joel Linn
0316d1a054 [Base] Tests for copy_and_swap_16_in_32_aligned 2022-01-22 16:18:54 +03:00
Joel Linn
4a288dc6bd [Base, aarch64] Add copy_and_swap NEON impls 2022-01-22 16:18:54 +03:00
Joel Linn
bfaad055a2 [Base] Add easier to debug copy_and_swap tests 2022-01-22 16:18:54 +03:00
Rick Gibbed
617b17e25b
[WinKey] Fix RThumbDown being mapped to RThumbLeft 2022-01-14 16:06:40 -06:00
Wunkolo
a9a365aa32 [x64] Add GFNI-based optimization for VECTOR_SHA_V128(Int8)
In the `Int8` case of `VECTOR_SHA_V128`, when all the values are the same, then a single-instruction `gf2p8affineqb` can be emitted that does an int8-based arithmetic-shift, utilizing GF(2^8) arithmetic.

More info here:
https://wunkolo.github.io/post/2020/11/gf2p8affineqb-int8-shifting/

As of now (Dec 2021): Tremont (Lakefield), Jasper Lake, Ice Lake, Tiger Lake, and Rocket Lake support GFNI.
2022-01-13 15:32:55 -06:00
Wunkolo
fba23e3e75 [x64] Add kX64EmitGFNI emitter feature-flag
This determines support for the `gf2p8affineqb` instruction. Even though `GFNI` is typically found with AVX512-enabled chips, it _is_ possible for a chip to have `GFNI` but not support `AVX` or `AVX2` of any sort. Examples of this are Tremont (Lakefield) chips as well as Jasper Lake.

13df339fe7/GenuineIntel/GenuineIntel00806A1_Lakefield_LC_InstLatX64.txt (L1297-L1299)

13df339fe7/GenuineIntel/GenuineIntel00906C0_JasperLake_InstLatX64.txt (L1252-L1254)
2022-01-13 15:32:55 -06:00
Wunkolo
5d1b53cd6f [x64] Add VECTOR_SHA_I8_SAME_CONSTANT unit test
This is to target the new GFNI-based optimization for the Int8 case.
2022-01-13 15:32:55 -06:00
Stefan Schmidt
31c9f026c5 [UI] Force use of Xwayland when running on Wayland 2022-01-12 17:37:54 +03:00
Enrico Pozzobon
5e31429128 [WinKey] Rebindable keyboard controls. 2022-01-11 12:38:13 -06:00
gibbed
5384e0e174 [Base] Fix MICROPROFILE_PRINTF. 2022-01-11 06:09:26 -06:00
gibbed
f4d60f3fc4 [XAM] Fix xeXMsgStartIORequestEx result check. 2022-01-11 06:09:06 -06:00
Wunkolo
233ed107fe [CPU] Remove use_haswell_instructions in favor of x64_extension_mask
Rather than having a single bool to conditionally detect Haswell-level
instruction features, the granularity is increased with a new
`x64_extension_mask` where individual features within the x64 backend
can be turned on or off in a bit-mask manner. Since we have an ARM
backend on the horizon, I've added this to the new `x64`
configuration-group rather than `CPU`. This new pattern will hopefully
allow for testing to be more targeted to certain processor features and
allows the user to determine if they want certain features to be enabled
or disabled (such as avoiding BMI2 on certain AMD processors due to
pdep/pext being incredibly slow). The default configuration is to detect
and utilize all available features.
2022-01-11 03:57:32 -06:00
Wunkolo
37aa3d129c [x64] Explicitly handle AND_NOT dest == src1
This addresses a JIT-issue in the case that the `src1` and `dest`
register are both the same. This issue only happens in the "generic"
x86 path but not in the BMI1-accelerated path.

Thanks Rick for the extensive debugging help.

When `src1` and `dest` were the same, then the `andc` instruction at
`82099A08` in title `584108FF` might emit the following assembly:
```
.text:82099A08                 andc      r11, r10, r11
  |
  | Jitted
  |
  V
00000000A0011B15  mov         rbx,r10
00000000A0011B18  not         rbx
00000000A0011B1B  and         rbx,rbx
```

This was due to the src1 operand and the destination register being the
same, which used to call the "else" case in the x64 emitter when it
needs to be handled explicitly due to register aliasing/allocation.

Addresses issue #1945
2022-01-10 15:48:49 -06:00
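The aliasing hazard described in this commit can be reproduced with a small register-file simulation. This is a sketch, not the emitter code from the commit: `RegisterFile`, `AndNotNaive`, and `AndNotAliasSafe` are hypothetical names, and the alias-safe variant shows just one possible way to handle `dest == src1` explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical register file used only for this simulation.
using RegisterFile = std::array<uint64_t, 16>;

// Naive lowering of dest = src1 & ~src2, mirroring the jitted sequence:
//   mov dest, src2 ; not dest ; and dest, src1
inline void AndNotNaive(RegisterFile& regs, int dest, int src1, int src2) {
  regs[dest] = regs[src2];
  regs[dest] = ~regs[dest];
  // If dest == src1, src1 was already overwritten above, so this
  // computes ~src2 & ~src2 instead of src1 & ~src2.
  regs[dest] &= regs[src1];
}

// Alias-safe lowering: when dest and src1 are the same register, build
// ~src2 in a scratch value so that src1 is still intact when it is read.
inline void AndNotAliasSafe(RegisterFile& regs, int dest, int src1, int src2) {
  if (dest == src1) {
    const uint64_t scratch = ~regs[src2];
    regs[dest] &= scratch;
  } else {
    AndNotNaive(regs, dest, src1, src2);
  }
}
```

With register 1 holding `0xF0` and register 2 holding `0x3C`, the alias-safe version leaves `0xC0` in register 1 when it is both `dest` and `src1`, while the naive version leaves the complement of `0x3C`.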
gibbed
975eadf17e [Kernel] Assert export function return/arg types. 2022-01-09 14:16:37 -06:00
gibbed
12ec728989 [Kernel] Use tables for export groups. 2022-01-09 14:16:37 -06:00
gibbed
3ad0a7dab2 [Kernel] Suffix export functions with _entry. 2022-01-09 12:17:03 -06:00
Rick Gibbed
ce1a84375b
Remove FUNDING.yml.
File has been moved to organization-wide repository.

https://github.com/xenia-project/.github/FUNDING.yml
2022-01-09 12:06:34 -06:00
Triang3l
14b69fdb00 [GPU] vfetch_full fetching nothing still must calculate the address 2022-01-09 16:26:05 +03:00
Triang3l
d6188c5d7e [GPU] Reuse base+index*stride in vfetch_mini instead of reloading the index GPR
The wheel shader in 4D530910 does vfetch_full to r0 with the index from r0.x, and then vfetch_mini.
Thanks @Gliniak for the finding :3
Also small formatting cleanup in commented-out code.
2022-01-09 14:58:38 +03:00
gibbed
600c14b3f0 [xboxkrnl] Implement ExTryToAcquireRWLShared.
[xboxkrnl] Implement ExTryToAcquireReadWriteLockShared.
2022-01-07 10:22:48 -06:00
gibbed
1f9c434b5e [xboxkrnl] Implement ExAcquireRWLShared.
[xboxkrnl] Implement ExAcquireReadWriteLockShared.
2022-01-07 10:22:48 -06:00
gibbed
3162a6435c [xboxkrnl] Implement ExTryToAcquireRWLExclusive.
[xboxkrnl] Implement ExTryToAcquireReadWriteLockExclusive.
2022-01-07 10:22:48 -06:00
gibbed
e795337071 [xboxkrnl] ExReleaseReadWriteLock fixes.
[xboxkrnl] ExReleaseReadWriteLock fixes:
- Don't unnecessarily double-load lock members.
- Reset readers entry count when lock count becomes negative.
- Properly decrease writers waiting count when writer event fired.
2022-01-07 10:22:48 -06:00
gibbed
b4f35635c5 [xboxkrnl] ExAcquireReadWriteLockExclusive fixes.
[xboxkrnl] ExAcquireReadWriteLockExclusive fixes:
- Don't unnecessarily double-load lock count.
- Don't release spin lock before we're done with the lock.
2022-01-07 10:22:48 -06:00
gibbed
fa774f1d86 [xboxkrnl] Fix up XexGetProcedureAddress logging.
[xboxkrnl] Fix up XexGetProcedureAddress failure logging.
2022-01-07 09:35:43 -06:00