82 Commits

Author SHA1 Message Date
Unknown W. Brackets
0aba5ff3c1 TexCache: Correct alpha mask checks for SSE2.
Should have been shifts by byte (4/8), but let's just switch to shuffles
anyway.  These were always shifting in zeros and failing.
2022-12-03 12:38:01 -08:00
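
Note: the shuffle approach mentioned in this commit can be sketched roughly as follows. This is an illustrative reduction, not the project's actual code; `AndAll4` is a hypothetical name. `_mm_srli_si128` takes a byte count and fills the vacated bytes with zeros, so an oversized count turns an AND reduction into zero, while `_mm_shuffle_epi32` only rearranges existing lanes. A scalar fallback is included so the sketch builds on non-x86 targets.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_USE_SSE2 1
#else
#define SKETCH_USE_SSE2 0
#endif

// Hypothetical helper: returns v[0] & v[1] & v[2] & v[3].
inline uint32_t AndAll4(const uint32_t *v) {
#if SKETCH_USE_SSE2
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v));
    // Swap the two 64-bit halves and AND: each lane now holds v[i] & v[i ^ 2].
    x = _mm_and_si128(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    // Swap neighbouring 32-bit lanes and AND: every lane holds the full result.
    x = _mm_and_si128(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(2, 3, 0, 1)));
    return static_cast<uint32_t>(_mm_cvtsi128_si32(x));
#else
    // Scalar fallback for targets without SSE2.
    return v[0] & v[1] & v[2] & v[3];
#endif
}
```

Because shuffles never introduce zero bits, the reduced value can be compared against an alpha mask directly.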
Unknown W. Brackets
cf030c3bce Global: Cleanup some unreferenced warnings. 2022-08-13 12:43:14 -07:00
Henrik Rydgård
35e7affa3e Simplify alphasum checking for DXT textures, and fix a regression
Got some weird blackness in the sky in Gran Turismo. This fixes that.
2022-04-25 00:54:47 +02:00
Henrik Rydgård
900ff64cf1 Buildfix 2022-04-15 13:39:01 +02:00
Henrik Rydgård
3efce3ceca Try a clang pragma to avoid overeager auto-vectorization 2022-04-15 13:26:54 +02:00
Henrik Rydgård
9e60b82c54 Buildfixing, correct NEON type usage 2022-04-15 13:19:03 +02:00
Henrik Rydgård
185b93058e SIMD-optimize CheckMask16 / CopyAndSumMask16 2022-04-15 12:40:10 +02:00
Henrik Rydgård
c4dfbf4f1a Delete a lot of specialized alpha checking code.
This was now only used to check alpha in CLUTs, and the generic functions will not actually be any slower.
2022-04-15 12:34:50 +02:00
Henrik Rydgård
a5ee1884c1 Address feedback 2022-04-15 01:08:14 +02:00
Henrik Rydgård
9f7e0978a9 AND together colors while decoding, and then check against fullAlphaMask. 2022-04-15 00:56:25 +02:00
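
Note: a minimal scalar sketch of the idea in this commit, assuming 32-bit texels. Only `fullAlphaMask` comes from the commit message; the function and enum names are hypothetical. The AND is accumulated while texels are copied, and a single comparison against the mask happens afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for the sketch.
enum class AlphaStatus { Full, NotFull };

// Copies texels from src to dst and ANDs them together along the way.
// Returns the accumulated AND (all ones for an empty range).
inline uint32_t CopyAndAccumulate32(uint32_t *dst, const uint32_t *src, size_t count) {
    uint32_t andBits = 0xFFFFFFFFu;
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
        andBits &= src[i];
    }
    return andBits;
}

// The alpha bits survive the AND only if every texel had them all set.
inline AlphaStatus CheckFullAlpha(uint32_t andBits, uint32_t fullAlphaMask) {
    return (andBits & fullAlphaMask) == fullAlphaMask ? AlphaStatus::Full
                                                      : AlphaStatus::NotFull;
}
```

For RGBA8888 with alpha in the top byte, the mask in this sketch would be 0xFF000000; other formats would pass a different mask.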
Henrik Rydgård
584e94f01e ARM32: Remove a lot of non-NEON fallback paths 2022-04-13 11:44:55 +02:00
Henrik Rydgård
f54ed3757c Always use the stable quick tex hash. Doesn't actually make a difference except on new CPU archs. 2022-04-13 11:18:18 +02:00
Henrik Rydgård
e6fe31365a Remove more function defines 2022-04-13 10:02:16 +02:00
Henrik Rydgård
a68ddd0a8d Merge separate NEON functions into the normal functions.
We no longer support non-NEON ARM.

It's nice also to have the NEON and SSE implementations "close" to each
other, easier to port optimizations back and forth etc.
2022-04-12 23:43:21 +02:00
Unknown W. Brackets
2479d52202 Global: Reduce includes of common headers.
In many places, string, map, or Common.h were included but not needed.
2022-01-30 16:35:33 -08:00
Unknown W. Brackets
8a00c2d233 GPU: Allow gcc/clang/icc runtime SSE4 usage.
All our builds before were only using SSE4 in jit...
2022-01-08 17:09:09 -08:00
Unknown W. Brackets
6762903087 TexCache: Correct confusing red/blue var names.
This decodes to RGBA (R least significant), so it's confusing to refer to
it as BGRA.  It's actually the 565 colors in the DXT data that are BGR.
2021-09-12 17:21:45 -07:00
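
Note: the channel order described in this commit can be illustrated with a small sketch. It assumes the 565 value stores blue in its lowest 5 bits and red in its highest 5 bits, and that the output is RGBA8888 with R in the least significant byte. The function name is hypothetical, and filling the low bits by replicating the high bits is a common convention assumed here, not something the commit states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: expands a 565 color (B in bits 0-4, G in bits 5-10,
// R in bits 11-15) to RGBA8888 with R least significant and opaque alpha.
inline uint32_t Expand565ToRGBA8888(uint16_t c) {
    const uint32_t b5 = c & 0x1F;
    const uint32_t g6 = (c >> 5) & 0x3F;
    const uint32_t r5 = (c >> 11) & 0x1F;
    // Assumption: replicate high bits into the low bits to reach 8 bits.
    const uint32_t r = (r5 << 3) | (r5 >> 2);
    const uint32_t g = (g6 << 2) | (g6 >> 4);
    const uint32_t b = (b5 << 3) | (b5 >> 2);
    return r | (g << 8) | (b << 16) | 0xFF000000u;
}
```

Naming the extracted fields after the channel they actually hold is the point of the rename: the input is BGR-ordered while the output is RGBA-ordered.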
Unknown W. Brackets
a0eeb52444 softgpu: Decode DXT texels directly.
This improves performance a lot compared to decoding the whole block.
Eventually we may implement a cache, but threading makes that complex to
make properly fast.
2021-09-12 09:37:34 -07:00
Unknown W. Brackets
1ee5352d3e TexCache: Correct DXT5 alpha calculation.
This matches PSP alpha values from an exhaustive test.
2021-09-12 09:35:53 -07:00
Unknown W. Brackets
8a8328c431 Common: Move ColorConv to a more appropriate place. 2021-05-01 11:20:05 -07:00
Unknown W. Brackets
13ec384dbe Build: Explicitly include ppsspp_config.h.
This adds it to all files that use it.  Not all our builds include the
file.
2021-03-02 21:04:03 -08:00
Unknown W. Brackets
30625225b0 GPU: Remove neon xxhash implementation.
It's typically around the same speed now with modern compilers, and much
slower than XXH3.
2020-08-27 20:31:09 -07:00
Rémi Verschelde
e479bf7f7b TextureDecoder: Fix misuse of NEON on all armv7
`ppsspp_config.h` properly defines `PPSSPP_ARCH(ARM_NEON)` already for
arm64v8 and armv7+NEON, so we use that instead of using NEON instructions
on all armv7.
2020-06-27 17:29:24 +02:00
Unknown W. Brackets
4a8839c99d GPU: Avoid divide by zero in garbage displaylist. 2020-03-19 20:56:24 -07:00
Unknown W. Brackets
1199008641 TexCache: Align bufw properly even for VRAM.
Fixes minimap arrows in Manhunt 2 (see #9615.)
2019-03-24 19:21:08 -07:00
Unknown W. Brackets
bd294f658f TexCache: Round DXT5 alpha up.
This isn't quite right, but it seems better than rounding down.
Experimented with a lower round up value, but none were right - the
weighting must be more complex.
2018-11-04 09:36:39 -08:00
Unknown W. Brackets
df200fc3d2 TexCache: In DXT3, don't swizzle alpha.
Hardware doesn't seem to.
2018-11-04 09:36:39 -08:00
Unknown W. Brackets
c31e01771e TexCache: Respect color order in DXT3/5.
Hardware is still doing DXT1 style colors in this scenario.
2018-11-04 09:36:39 -08:00
Unknown W. Brackets
11ab4e8634 TexCache: Mix DXT colors using 2/3 not 3/8.
Hardware draws using 2/3.  Adding this way matches rounding, too.
2018-11-04 09:36:39 -08:00
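
Note: one way to read "adding this way" is to sum one endpoint twice plus the other and divide by three. The sketch below assumes that reading, with truncating integer division; the helper names are hypothetical and the exact rounding used by the project is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: weights `a` by 2/3 and `b` by 1/3 for one 8-bit channel.
// Assumption: the sum is divided with truncating integer division.
inline uint8_t MixTwoThirds(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((a + a + b) / 3);
}

// Applies the mix to the R, G and B bytes of two RGBA8888 values
// (R least significant). Alpha is left as zero in this sketch.
inline uint32_t MixTwoThirdsRGB(uint32_t c1, uint32_t c2) {
    uint32_t out = 0;
    for (int shift = 0; shift < 24; shift += 8) {
        const uint8_t a = static_cast<uint8_t>(c1 >> shift);
        const uint8_t b = static_cast<uint8_t>(c2 >> shift);
        out |= static_cast<uint32_t>(MixTwoThirds(a, b)) << shift;
    }
    return out;
}
```

The two interpolated palette entries of a four-color block would then be `MixTwoThirdsRGB(c1, c2)` and `MixTwoThirdsRGB(c2, c1)`.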
Unknown W. Brackets
35a1d8a1ef TexCache: Decode DXT1 zero alpha as black.
Hardware tests show this is how it decodes, which is more like standard
DXT1 decoding.
2018-11-04 08:09:56 -08:00
Unknown W. Brackets
38eb9d12d0 TexCache: Don't swizzle DXT1 colors.
Hardware tests show that this shouldn't happen.  May be important for
color tests, etc.
2018-11-04 08:09:13 -08:00
Unknown W. Brackets
97773d3dd5 TexCache: Fix texture alignment in GLES.
We must align to 4 bytes, and we aren't always aligned to 16 anymore, so
we must check when dealing with swizzle.
2018-09-08 19:00:30 -07:00
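
Note: the kind of check this commit describes can be sketched as a pointer alignment test performed before choosing a 16-byte SIMD path. The helper names are hypothetical, and the alignment argument is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is aligned to `alignment` bytes.
// Assumption: alignment is a nonzero power of two.
inline bool IsAligned(const void *p, size_t alignment) {
    return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: rounds n up to the next multiple of `alignment`
// (again assumed to be a nonzero power of two).
inline size_t AlignUp(size_t n, size_t alignment) {
    return (n + alignment - 1) & ~(alignment - 1);
}
```

A caller would take the aligned-load path only when `IsAligned(ptr, 16)` holds, and fall back to unaligned loads or scalar code otherwise.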
Unknown W. Brackets
f65edc20a3 TexCache: Optimize DXT5 alpha lerp.
This makes the overall DXT5 decode about 8% faster.
2018-09-02 11:41:27 -07:00
Unknown W. Brackets
3f35221f3b TexCache: Avoid masking out alpha for DXT3/DXT5.
A little faster.  Also refactor colors a bit to be more readable.
2018-09-02 09:53:31 -07:00
Unknown W. Brackets
8ae2b1e6fb TexCache: Optimize DXT3/DXT5 decode to single pass.
This is significantly faster on Vulkan, and in other situations where
we're decoding directly to uncached memory.
2018-09-02 09:30:46 -07:00
Unknown W. Brackets
715a7b7318 Global: Silence some unused declaration warnings.
These things aren't used on Android.
2017-12-03 19:22:03 -08:00
Unknown W. Brackets
65e71f57c7 TexCache: Add NEON alpha checks for Vulkan. 2017-11-12 16:45:05 -08:00
Unknown W. Brackets
f087b87b0c TexCache: Simplify CheckAlpha funcs and SIMD.
Only check for full alpha now, which is simpler.
2017-11-12 16:41:19 -08:00
Unknown W. Brackets
9fbcc01afa TexCache: Remove simple 0/1 alpha check.
No practical optimizations have come of this, so it's a waste of time.
Slows down Vulkan too.
2017-11-12 16:17:46 -08:00
Henrik Rydgard
b0bd7e3c6f Minor changes for compatibility with VS2017 2017-03-12 17:33:00 +01:00
Henrik Rydgard
b1971d266b Protect Unswizzle from bad alignment of the destination. Might help #9134 2017-02-23 23:03:01 +01:00
Henrik Rydgard
b0cdcfca3c D3D11: Proper fix for DXT5 crash. May also help #9134. 2017-02-18 02:41:17 +01:00
Henrik Rydgård
e47138a5f3 Warning fixes 2017-01-17 20:26:48 +07:00
Henrik Rydgard
ea5e9f8c35 Fix ARM64 Android build 2016-11-03 22:15:50 +01:00
Jools Wills
afe8e2bfb4 Fix building on rpi - #9104
Check for PPSSPP_ARCH(ARM_NEON) for neon code
Fix up rpi armv6/armv6 toolchain to work around issue with CMAKE_*_FLAGS not being set.
2016-11-01 02:45:30 +00:00
Florent Castelli
8c3552de74 cmake: Detect features at compile time
Instead of relying on manually passed down flags from CMake,
we now have ppsspp_config.h file to create the platform defines for us.
This improves support for multiplatform builds (such as iOS).
2016-10-19 12:31:19 +02:00
Unknown W. Brackets
f039259a1a Use a same-everywhere quick hash for now. 2016-05-01 00:30:43 -07:00
Unknown W. Brackets
3593a7963e Cleanup and clarify texture swizzling funcs. 2016-03-26 21:55:32 -07:00
Unknown W. Brackets
6b3260df9a Correct SSE alpha check for 4444 textures.
Oops, can't use cmplt here.
2016-01-12 00:20:36 -08:00
Unknown W. Brackets
7bfe100b0f Fix some unused variable warnings.
The CheckAlpha one looks like it will matter.
2015-11-25 16:11:53 -08:00