Unknown W. Brackets
a201d3f561
samplerjit: Fix non-AVX three-op shift.
Oops, was still shifting the source register.
2022-02-15 20:12:45 -08:00
Unknown W. Brackets
16dca4f69b
x86jit: Use BMI2 for variable shifts.
We don't actually regalloc ECX, but this still saves a copy, and on modern
CPUs these seem to be pretty fast.
2022-01-31 19:38:17 -08:00
Unknown W. Brackets
c1e657ed47
samplerjit: Better vectorize UV linear calc.
Gives about 1-2% when mips are used.
2022-01-24 20:42:07 -08:00
Unknown W. Brackets
8573c34f85
x86jit: Check CALL dist for safe memory funcs.
2022-01-22 00:14:15 -08:00
Unknown W. Brackets
0ba2d05da5
samplerjit: Simplify AVX shift-copies.
These have been the most common and the fallback is safe. Let's just add
a helper.
2022-01-17 15:15:36 -08:00
Unknown W. Brackets
ce6ea8da11
samplerjit: Apply gather lookup to all CLUT4.
2022-01-02 17:19:18 -08:00
Unknown W. Brackets
22f770c828
samplerjit: Use VPGATHERDD for simple CLUT4 loads.
Planning to expand this to more paths.
2022-01-02 17:19:17 -08:00
Unknown W. Brackets
1addf84e90
samplerjit: Use SSSE3/SSE4 in linear filtering.
2021-12-30 23:22:56 -08:00
Unknown W. Brackets
7aa9664d20
x64jit: Add AVX2-only instructions.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
7508fcc22d
x64jit: Add AVX-only instructions.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
147b81d6f7
x64jit: Add AVX/AVX2 encodings.
Also fix the FMA double ones, which were passing W wrongly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
bf06342f9d
samplerjit: Minor SSE4 optimizations.
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
820361f34b
samplerjit: Calculate texel byte offset as vector.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3f3e0ea8cf
softjit: Optimize typical alpha/depth test.
Messed with SSE4 then realized there's no point, just use SHR.
2021-11-26 08:21:14 -08:00
Unknown W. Brackets
4178f09e57
Build: More consistently avoid _M_ defines.
We use PPSSPP_ARCH in several places already, this makes it more complete.
2021-03-02 21:49:21 -08:00
Gleb Mazovetskiy
7305ba9d9b
x64Emitter: Fix unaligned store UBSAN errors
This compiles to the same assembly as before even without optimizations and avoids UB.
https://godbolt.org/z/4G5edM
While the UB here is benign, this improves signal-to-noise ratio of UBSAN errors.
Fixes #14005
2021-01-30 12:26:01 +00:00
Henrik Rydgård
e8a9845d93
First step of cleaning up Log.h. Plus a few other bits and bobs.
2020-08-16 14:48:54 +02:00
Henrik Rydgård
0829543987
Third part of getting rid of PanicAlert
2020-07-19 20:34:02 +02:00
Henrik Rydgård
47a3bf1dd7
Step 2 of removing PanicAlert
2020-07-19 20:34:02 +02:00
Henrik Rydgård
c5e0b799d9
Remove category from _assert_msg_ functions. We don't filter these by category anyway.
Fixes the inconsistency where _assert_ didn't take a category but
_assert_msg_ did.
2020-07-19 20:33:25 +02:00
Unknown W. Brackets
7910b4029a
arm64jit: Track writable and non-writable pointers.
Switch uses different memory regions. We can handle this, might as well
clean up some const abuse.
2020-05-17 00:15:12 -07:00
Henrik Rydgård
381c4ca4b2
X64: Fix bug in one case of the MOVQ emitter: the REX byte should come after the 0x66 prefix
2017-07-07 11:33:07 +02:00
Henrik Rydgård
0645677fea
Access FPU temps through CTXREG
2017-07-07 11:33:06 +02:00
Unknown W. Brackets
cb3db559bd
SoftGPU: Jit the linear sampling too.
For now, just reducing overhead. Could be smarter.
2017-05-30 22:57:46 -07:00
Henrik Rydgård
0ec1e5e3b2
Don't erase and rewrite the dispatcher when the cache is cleared. Fixes #9708
2017-05-26 15:48:03 +02:00
Henrik Rydgard
323eb72b7c
Write-protect the dispatcher on all platforms.
2016-08-28 13:35:27 +02:00
Henrik Rydgard
ffe4c266ef
Add CodeBlockCommon base class to remove further arch-specificity in JitBlockCache
Remove unused ArmThunk.
2016-05-01 11:40:00 +02:00
Henrik Rydgard
88f25fd50e
x86-64: Fix L bit in VEX instruction emitter. Ported fix from Citra.
Currently unused in the emulator, though.
2016-02-28 13:07:24 +01:00
aroulin
8a09dedf94
x64Emitter: add RCPPS and RCPSS SSE instructions
2015-08-23 16:43:07 +02:00
Henrik Rydgard
604abe933e
Update submodules, add x64Emitter bugfix from Dolphin (plus a few new instrs), misc
2015-01-11 00:12:32 +01:00
Henrik Rydgard
4ec30d98e1
Port the x86 and ARM emitters over to use the generic CodeBlock class
2014-12-15 22:32:55 +01:00
Henrik Rydgard
2bce7bc460
X64Emitter: Merge some AVX stuff from Dolphin
2014-12-07 23:09:38 +01:00
Henrik Rydgard
5290ffd929
Minor cleanup in vtfm. Re-enable vrot combination. Optimize vfad/vavg when dpps is available.
Also fixes bug in emitter of dpps.
2014-12-03 22:44:32 +01:00
Henrik Rydgard
344f71b092
x86 jit: Commit commented-out haddps-based vdot.q as reminder not to use haddps...
2014-11-28 00:19:11 +01:00
Henrik Rydgard
5033babb10
x86 Jit: SIMD-ify vdot
2014-11-26 23:47:18 +01:00
Henrik Rydgard
28ca8d4818
x86 jit: Use LEA to emulate addu but only when it can save a few bytes
2014-11-16 17:39:47 +01:00
Unknown W. Brackets
bc7497857a
x86jit: Micro optimize vi2x a bit with ssse3/sse4.
Both are small wins.
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
0e646f748a
x86jit: Implement vi2x instructions.
Also, my opcodes were wrong in the test (shifted the pair bit the wrong
way, oops.)
AFAICT, there's no reason PSRAD/etc. were not encoding REX...
2014-11-08 12:13:26 -08:00
Unknown W. Brackets
d7bdded6f8
x86jit: fix rip addressing on PEXTRW/PINSRW.
I think this is right anyway, not 100% sure.
2014-11-03 23:18:32 -08:00
Unknown W. Brackets
844c7e73d3
x86jit: Add SSE 4.1 rounding ops to emitter.
2014-11-03 23:18:09 -08:00
Henrik Rydgård
7bde976069
Merge x64 emitter from a newer Dolphin version.
This one can generate slightly smaller code by exploiting some EAX-only
encoding and various other short forms, and adds support for many newer
CPU instructions.
2014-10-12 19:46:58 +02:00
Henrik Rydgård
281ab5f9cb
Sync x64 emitter to Dolphin's.
2014-10-12 19:45:26 +02:00
Unknown W. Brackets
e1a57abcb4
Fix mixed newline style.
2014-09-20 08:30:37 -07:00
Henrik Rydgard
62054b1e7b
Fix PINSRW/PEXTRW emitters.
Fixes crash introduced in 5276487611
(apparently we haven't used PINSRW before)
2014-09-20 11:46:05 +02:00
Henrik Rydgard
215abfb951
Some cleanup in /Common
2014-09-06 10:47:25 +02:00
Henrik Rydgard
d3dce422a8
X64emitter: merge from dolphin
2014-07-20 00:21:28 +02:00
Henrik Rydgard
221216b5b2
Bugfix in x64 emitter, thanks magumagu
2014-03-27 22:25:30 +01:00
Unknown W. Brackets
632eec38e8
vertexjit: Use SSE4.1 where available on x86.
Just because we can.
2014-03-22 16:11:16 -07:00
Unknown W. Brackets
162f229294
vertexjit: Support the color morphs on x86.
2014-03-22 15:56:29 -07:00
Unknown W. Brackets
f14361c3b8
Add a bunch more missing cstring includes.
2013-12-30 21:37:19 -08:00