Unknown W. Brackets
813bfded92
x86jit: Correct vh2f NAN handling (#16275)
...
* x86jit: Correct vh2f NAN handling.
Allows another test to pass.
* x86jit: Reuse MAccessibleDisp().
2022-10-23 10:09:29 +02:00
Unknown W. Brackets
2479d52202
Global: Reduce includes of common headers.
...
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
c1e657ed47
samplerjit: Better vectorize UV linear calc.
...
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
8573c34f85
x86jit: Check CALL dist for safe memory funcs.
2022-01-22 00:14:15 -08:00
Unknown W. Brackets
0ba2d05da5
samplerjit: Simplify AVX shift-copies.
...
These have been the most common and the fallback is safe. Let's just add
a helper.
2022-01-17 15:15:36 -08:00
Unknown W. Brackets
ce6ea8da11
samplerjit: Apply gather lookup to all CLUT4.
2022-01-02 17:19:18 -08:00
Unknown W. Brackets
22f770c828
samplerjit: Use VPGATHERDD for simple CLUT4 loads.
...
Planning to expand this to more paths.
2022-01-02 17:19:17 -08:00
Unknown W. Brackets
1addf84e90
samplerjit: Use SSSE3/SSE4 in linear filtering.
2021-12-30 23:22:56 -08:00
Unknown W. Brackets
7aa9664d20
x64jit: Add AVX2-only instructions.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
7508fcc22d
x64jit: Add AVX-only instructions.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
147b81d6f7
x64jit: Add AVX/AVX2 encodings.
...
Also fix the FMA double ones, which were passing W wrongly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
bf06342f9d
samplerjit: Minor SSE4 optimizations.
...
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
820361f34b
samplerjit: Calculate texel byte offset as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3f3e0ea8cf
softjit: Optimize typical alpha/depth test.
...
Messed with SSE4 then realized there's no point, just use SHR.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
4178f09e57
Build: More consistently avoid _M_ defines.
...
We use PPSSPP_ARCH in several places already, this makes it more complete.
2021-03-02 21:49:21 -08:00
Gleb Mazovetskiy
7305ba9d9b
x64Emitter: Fix unaligned store UBSAN errors
...
This compiles to the same assembly as before even without optimizations and avoids UB.
https://godbolt.org/z/4G5edM
While the UB here is benign, this improves signal-to-noise ratio of UBSAN errors.
Fixes #14005
2021-01-30 12:26:01 +00:00
Henrik Rydgård
989e353482
Common.h shouldn't include Log.h.
...
Buildfixes
More buildfixes. Move JSON code to common.
2020-10-04 11:42:14 +02:00
Henrik Rydgård
c5e0b799d9
Remove category from _assert_msg_ functions. We don't filter these by category anyway.
...
Fixes the inconsistency where _assert_ didn't take a category but
_assert_msg_ did.
2020-07-19 20:33:25 +02:00
Unknown W. Brackets
7910b4029a
arm64jit: Track writable and non-writable pointers.
...
Switch uses different memory regions. We can handle this, might as well
clean up some const abuse.
2020-05-17 00:15:12 -07:00
Henrik Rydgård
b4a44c5e02
Another buildfix, sigh. Also extend the safe region a little bit to the thing from a couple commits ago.
2017-12-13 22:28:30 +01:00
Henrik Rydgård
d2fe5abb84
Add a tiny bit of safety margin to the RipAccessible check. Should be enough for 128-bit SSE data.
2017-12-13 22:00:59 +01:00
Henrik Rydgård
8d0498303a
Fix a PIC compliance bug in the VFPU. Comment other cases properly (for easy searching).
2017-08-29 11:45:12 +02:00
Henrik Rydgård
567937fa4d
x64: Enable non-RIP addressing for FPU registers
2017-07-07 11:33:07 +02:00
Henrik Rydgård
0645677fea
Access FPU temps through CTXREG
2017-07-07 11:33:06 +02:00
Henrik Rydgård
7c3b37c561
More RIP elimination
2017-07-07 11:33:05 +02:00
Henrik Rydgård
d82f90f1b2
More RIP removal
2017-07-07 11:33:05 +02:00
Henrik Rydgård
80b82ecd81
Buildfix attempt
2017-07-07 11:33:02 +02:00
Unknown W. Brackets
cb3db559bd
SoftGPU: Jit the linear sampling too.
...
For now, just reducing overhead. Could be smarter.
2017-05-30 22:57:46 -07:00
Henrik Rydgård
0ec1e5e3b2
Don't erase and rewrite the dispatcher when the cache is cleared. Fixes #9708
2017-05-26 15:48:03 +02:00
Henrik Rydgard
a1ec735f6c
ARM64Emitter: Implement instructions to move data to/from SP
2017-01-26 14:23:42 +01:00
Florent Castelli
8c3552de74
cmake: Detect features at compile time
...
Instead of relying on manually passed down flags from CMake,
we now have ppsspp_config.h file to create the platform defines for us.
This improves support for multiplatform builds (such as iOS).
2016-10-19 12:31:19 +02:00
Henrik Rydgard
ffe4c266ef
Add CodeBlockCommon base class to remove further arch-specificity in JitBlockCache
...
Remove unused ArmThunk.
2016-05-01 11:40:00 +02:00
Unknown W. Brackets
ef1dc583a2
Fix various minor warnings.
2016-03-20 14:17:51 -07:00
aroulin
8a09dedf94
x64Emitter: add RCPPS and RCPSS SSE instructions
2015-08-23 16:43:07 +02:00
Henrik Rydgard
604abe933e
Update submodules, add x64Emitter bugfix from Dolphin (plus a few new instrs), misc
2015-01-11 00:12:32 +01:00
Henrik Rydgård
6bf2c02908
x86 jit: Allow storing all imms directly without bouncing to a register, not just zero.
2014-12-23 22:25:53 +01:00
Henrik Rydgard
4ec30d98e1
Port the x86 and ARM emitters over to use the generic CodeBlock class
2014-12-15 22:32:55 +01:00
Henrik Rydgard
2bce7bc460
X64Emitter: Merge some AVX stuff from Dolphin
2014-12-07 23:09:38 +01:00
Henrik Rydgard
66d74981b5
Merge ARM emitter updates from the NEON branch
2014-11-29 10:49:22 +01:00
Henrik Rydgard
344f71b092
x86 jit: Commit commented-out haddps-based vdot.q as reminder not to use haddps...
2014-11-28 00:19:11 +01:00
Henrik Rydgard
5033babb10
x86 Jit: SIMD-ify vdot
2014-11-26 23:47:18 +01:00
Henrik Rydgard
804de50711
x86 jit: SIMD-ify VFPU register file writebacks where possible
2014-11-26 01:33:05 +01:00
chinhodado
4bac356df6
Use const
2014-11-14 16:13:06 -05:00
Henrik Rydgard
784bf82b58
Improve AVX check in CPUDetect. Warning fix.
...
Keeping the ifdef for zenfone - still doesn't work without it
2014-11-11 23:48:58 +01:00
Unknown W. Brackets
bc7497857a
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
...
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a
x86jit: Implement vi2x instructions.
...
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
844c7e73d3
x86jit: Add SSE 4.1 rounding ops to emitter.
2014-11-03 23:18:09 -08:00
Henrik Rydgård
6cb2c9c97d
Merge pull request #6989 from hrydgard/x86-emitter-merge
...
Merge from Dolphin's x86-64 emitter
2014-10-12 19:52:59 +02:00
Henrik Rydgård
7bde976069
Merge x64 emitter from a newer Dolphin version.
...
This one can generate slightly smaller code by exploiting some EAX-only
encoding and various other short forms, and adds support for many newer
CPU instructions.
2014-10-12 19:46:58 +02:00
Henrik Rydgård
281ab5f9cb
Sync x64 emitter to Dolphin's.
2014-10-12 19:45:26 +02:00