Commit Graph

86 Commits

Author SHA1 Message Date
Unknown W. Brackets
d9f6bae1ff x64jit: Initial reg transfer. 2023-09-24 16:28:29 -07:00
Unknown W. Brackets
61a99b4bac x86jit: Implement trig/reciprocals. 2023-08-27 23:24:30 -07:00
Henrik Rydgård
c4e44d66b0 x86/x64: Nop-align the main loop of vertex decoder loops 2023-06-12 20:39:39 +02:00
Unknown W. Brackets
813bfded92 x86jit: Correct vh2f NAN handling (#16275)
* x86jit: Correct vh2f NAN handling.

Allows another test to pass.

* x86jit: Reuse MAccessibleDisp().
2022-10-23 10:09:29 +02:00
Unknown W. Brackets
2479d52202 Global: Reduce includes of common headers.
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
c1e657ed47 samplerjit: Better vectorize UV linear calc.
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
8573c34f85 x86jit: Check CALL dist for safe memory funcs. 2022-01-22 00:14:15 -08:00
Unknown W. Brackets
0ba2d05da5 samplerjit: Simplify AVX shift-copies.
These have been the most common and the fallback is safe.  Let's just add
a helper.
2022-01-17 15:15:36 -08:00
Unknown W. Brackets
ce6ea8da11 samplerjit: Apply gather lookup to all CLUT4. 2022-01-02 17:19:18 -08:00
Unknown W. Brackets
22f770c828 samplerjit: Use VPGATHERDD for simple CLUT4 loads.
Planning to expand this to more paths.
2022-01-02 17:19:17 -08:00
Unknown W. Brackets
1addf84e90 samplerjit: Use SSSE3/SSE4 in linear filtering. 2021-12-30 23:22:56 -08:00
Unknown W. Brackets
7aa9664d20 x64jit: Add AVX2-only instructions. 2021-12-29 19:46:26 -08:00
Unknown W. Brackets
7508fcc22d x64jit: Add AVX-only instructions. 2021-12-29 19:46:26 -08:00
Unknown W. Brackets
147b81d6f7 x64jit: Add AVX/AVX2 encodings.
Also fix the FMA double ones, which were passing W wrongly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
bf06342f9d samplerjit: Minor SSE4 optimizations.
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
820361f34b samplerjit: Calculate texel byte offset as vector. 2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3f3e0ea8cf softjit: Optimize typical alpha/depth test.
Messed with SSE4 then realized there's no point, just use SHR.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
4178f09e57 Build: More consistently avoid _M_ defines.
We use PPSSPP_ARCH in several places already, this makes it more complete.
2021-03-02 21:49:21 -08:00
Gleb Mazovetskiy
7305ba9d9b x64Emitter: Fix unaligned store UBSAN errors
This compiles to the same assembly as before even without optimizations and avoids UB.

https://godbolt.org/z/4G5edM

While the UB here is benign, this improves signal-to-noise ratio of UBSAN errors.

Fixes #14005
2021-01-30 12:26:01 +00:00
Henrik Rydgård
989e353482 Common.h shouldn't include Log.h.
Buildfixes

More buildfixes. Move JSON code to common.
2020-10-04 11:42:14 +02:00
Henrik Rydgård
c5e0b799d9 Remove category from _assert_msg_ functions. We don't filter these by category anyway.
Fixes the inconsistency where _assert_ didn't take a category but
_assert_msg_ did.
2020-07-19 20:33:25 +02:00
Unknown W. Brackets
7910b4029a arm64jit: Track writable and non-writable pointers.
Switch uses different memory regions.  We can handle this, might as well
cleanup some const abuse.
2020-05-17 00:15:12 -07:00
Henrik Rydgård
b4a44c5e02 Another buildfix, sigh. Also extend the safe region a little bit to the thing from a couple commits ago. 2017-12-13 22:28:30 +01:00
Henrik Rydgård
d2fe5abb84 Add a tiny bit of safety margin to the RipAccessible check. Should be enough for 128-bit SSE data. 2017-12-13 22:00:59 +01:00
Henrik Rydgård
8d0498303a Fix a PIC compliance bug in the VFPU. Comment other cases properly (for easy searching). 2017-08-29 11:45:12 +02:00
Henrik Rydgård
567937fa4d x64: Enable non-RIP addressing for FPU registers 2017-07-07 11:33:07 +02:00
Henrik Rydgård
0645677fea Access FPU temps through CTXREG 2017-07-07 11:33:06 +02:00
Henrik Rydgård
7c3b37c561 More RIP elimination 2017-07-07 11:33:05 +02:00
Henrik Rydgård
d82f90f1b2 More RIP removal 2017-07-07 11:33:05 +02:00
Henrik Rydgård
80b82ecd81 Buildfix attempt 2017-07-07 11:33:02 +02:00
Unknown W. Brackets
cb3db559bd SoftGPU: Jit the linear sampling too.
For now, just reducing overhead.  Could be smarter.
2017-05-30 22:57:46 -07:00
Henrik Rydgård
0ec1e5e3b2 Don't erase and rewrite the dispatcher when the cache is cleared. Fixes #9708 2017-05-26 15:48:03 +02:00
Henrik Rydgard
a1ec735f6c ARM64Emitter: Implement instructions to move data to/from SP 2017-01-26 14:23:42 +01:00
Florent Castelli
8c3552de74 cmake: Detect features at compile time
Instead of relying on manually passed down flags from CMake,
we now have ppsspp_config.h file to create the platform defines for us.
This improves support for multiplatform builds (such as iOS).
2016-10-19 12:31:19 +02:00
Henrik Rydgard
ffe4c266ef Add CodeBlockCommon base class to remove further arch-specificity in JitBlockCache
Remove unused ArmThunk.
2016-05-01 11:40:00 +02:00
Unknown W. Brackets
ef1dc583a2 Fix various minor warnings. 2016-03-20 14:17:51 -07:00
aroulin
8a09dedf94 x64Emitter: add RCPPS and RCPSS SSE instructions 2015-08-23 16:43:07 +02:00
Henrik Rydgard
604abe933e Update submodules, add x64Emitter bugfix from Dolphin (plus a few new instrs), misc 2015-01-11 00:12:32 +01:00
Henrik Rydgård
6bf2c02908 x86 jit: Allow storing all imms directly without bouncing to a register, not just zero. 2014-12-23 22:25:53 +01:00
Henrik Rydgard
4ec30d98e1 Port the x86 and ARM emitters over to use the generic CodeBlock class 2014-12-15 22:32:55 +01:00
Henrik Rydgard
2bce7bc460 X64Emitter: Merge some AVX stuff from Dolphin 2014-12-07 23:09:38 +01:00
Henrik Rydgard
66d74981b5 Merge ARM emitter updates from the NEON branch 2014-11-29 10:49:22 +01:00
Henrik Rydgard
344f71b092 x86 jit: Commit commented-out haddps-based vdot.q as reminder not to use haddps... 2014-11-28 00:19:11 +01:00
Henrik Rydgard
5033babb10 x86 Jit: SIMD-ify vdot 2014-11-26 23:47:18 +01:00
Henrik Rydgard
804de50711 x86 jit: SIMD-ify VFPU register file writebacks where possible 2014-11-26 01:33:05 +01:00
chinhodado
4bac356df6 Use const 2014-11-14 16:13:06 -05:00
Henrik Rydgard
784bf82b58 Improve AVX check in CPUDetect. Warning fix.
Keeping the ifdef for zenfone - still doesn't work without it
2014-11-11 23:48:58 +01:00
Unknown W. Brackets
bc7497857a x86jit: Micro optimize vi2x a bit with ssse3/sse4.
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a x86jit: Implement vi2x instructions.
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)

AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
844c7e73d3 x86jit: Add SSE 4.1 rounding ops to emitter. 2014-11-03 23:18:09 -08:00