Commit Graph

63 Commits

Author SHA1 Message Date
Unknown W. Brackets
8a00c2d233 GPU: Allow gcc/clang/icc runtime SSE4 usage.
All our builds before were only using SSE4 in jit...
2022-01-08 17:09:09 -08:00
Unknown W. Brackets
43f71884ee softgpu: Clarify internal matrix multiply usage. 2022-01-07 17:53:24 -08:00
Unknown W. Brackets
e7d66f2029 softgpu: Reuse SSE/NEON matrix code. 2022-01-06 21:19:47 -08:00
Unknown W. Brackets
079b67e7ed softgpu: Use common SIMD matrix multiplies. 2022-01-06 21:19:47 -08:00
Unknown W. Brackets
2d8fdd8cf4 Math3D: Allow construction from NEON vectors.
This makes it match SSE and easier to keep things generic.  Will impact
alignment of non-packed Vec2/Vec3.
2021-11-28 08:24:53 -08:00
Unknown W. Brackets
fb6fadbbb7 softgpu: Fast path rectangles as fans.
Some games, such as Legend of Heroes III, use fans instead of strips.
2021-11-14 18:31:45 -08:00
Henrik Rydgård
a498f164ee vmulq_laneq_f32 not supported on ARM32 2021-10-31 16:32:45 +01:00
Henrik Rydgård
fdacf751ce NEON/SSE-optimize some matrix multiplications used by software transform
Will hopefully reclaim any potential speed loss from the recent
refactor.
2021-10-31 13:36:34 +01:00
Unknown W. Brackets
2f63f9999d GPU: Normalize 0 to 1 always in software lighting.
See #14167.  This seems to be consistent.
2021-02-27 23:51:45 -08:00
Henrik Rydgård
9e41fafd0d Move math and some file and data conversion files out from native to Common.
Buildfixing

Move some file util files

Buildfix

Move KeyMap.cpp/h to Core where they belong better.

libretro buildfix attempt

Move ini_file

More buildfixes
2020-10-04 09:12:46 +02:00
Henrik Rydgård
510229b68b SoftGPU: Detect through-mode rectangles from triangle strips 2019-10-27 20:54:36 +01:00
xebra
62aaf6336a Math3D: The hand-written SIMD optimization in vec2<float> was faulty and caused a severe slowdown.
The compiler's own optimization is fast enough, so remove it.
2018-10-07 23:54:17 +09:00
xebra
d0682d7829 [spline/bezier]Move SIMD optimization of vector operations to Math3D.h.
Needs rebuild to avoid a dialog confirmation on Visual Studio.
2018-10-07 23:53:43 +09:00
xebra
62ad5fe546 Fix namespace Vec2f. 2018-10-07 23:53:41 +09:00
Henrik Rydgård
45cfda4aa0 Small refactoring in VertexDecoderCommon 2018-03-05 00:03:47 +01:00
Henrik Rydgård
7bb427e6f1 Buildfix 2017-08-31 17:24:34 +02:00
Henrik Rydgård
6a1fa728d8 Remove Globals.h 2017-08-31 17:15:22 +02:00
Henrik Rydgård
91783a3281 SIMD-optimize some data conv routines used in uniform updates. 2017-08-20 11:43:35 +02:00
Unknown W. Brackets
4fb7e43af8 SoftGPU: Grab 4 S/T coords in non-through too. 2017-04-23 11:11:16 -07:00
Unknown W. Brackets
3142462ac6 SoftGPU: Rasterize triangles in chunks of 4 pixels.
Not very optimal yet.
2017-04-23 10:37:11 -07:00
Unknown W. Brackets
5ee062c681 Try to optimize bezier color sampling. 2015-04-18 12:47:21 -07:00
Unknown W. Brackets
f070d6f5ed Use SSE when generating spline normals. 2015-02-25 19:22:48 -08:00
Unknown W. Brackets
90605520a1 Add conversions between Vec3f and Vec3Packedf. 2015-02-22 13:16:07 -08:00
Unknown W. Brackets
ef73487fca Fix Vec4::SetZero() not clearing all lanes. 2014-12-13 10:35:16 -08:00
Unknown W. Brackets
9f7dbec050 Missing include for Linux/etc. 2014-10-31 09:51:17 -07:00
Unknown W. Brackets
eee3ac79f4 Always clamp in ToRGB[A]?().
Before we only clamped with SSE, better to be consistent.  This may also
be slightly faster.
2014-10-31 09:07:54 -07:00
Henrik Rydgard
6304d60b40 Convert 4x4 to 4x3 matrices where possible (except bones) 2014-09-18 23:08:46 +02:00
Henrik Rydgard
bf7a4f9097 D3D: Use fixed constant registers for vertex shaders too. 2014-09-10 13:43:35 +02:00
Tony Wasserka
d09b9fa6a1 Math3D: Change the vector swizzlers to return const objects.
Otherwise, people might be tempted to do things like "some_vec4.xyz() = some_vec3", which compiles fine but does not do the expected thing because xyz() does not return references.
2014-08-17 18:39:02 +02:00
Unknown W. Brackets
56b83af1f0 Don't use aligned loads in non-inlined funcs.
I want things to stay in registers, but that's not realistic for
arguments.  Force inline the others.  May help #5699.
2014-03-23 12:09:17 -07:00
Henrik Rydgard
bc121242b3 Use fast_math matrix multiplication for culling and sw transform 2014-03-22 14:40:09 +01:00
Unknown W. Brackets
a8a299c2e3 Fix ToRGB/ToRGBA possible accuracy loss.
It was always like this, but not used as much before.  Shifts are fast and
it needs to sum anyway, there should not be any benefit to multiplying as
floats, and it will probably lose accuracy.
2014-03-18 22:56:27 -07:00
Unknown W. Brackets
416df17088 Inline From/ToRGB(A) to avoid losing SSE.
Otherwise it has to store it, which I'd like to avoid.
2014-03-17 23:03:04 -07:00
Unknown W. Brackets
6630e45eff Just add a packed version of Vec3f.
This way we can have it aligned to memory where needed.  I think it'd be
better to avoid this if possible so that we can actually vectorize
spline/etc. code.

Fixes #5673.
2014-03-17 06:59:40 -07:00
Unknown W. Brackets
dd140b73bb softgpu: Use SSE for gouraud shading. 2014-03-16 14:29:22 -07:00
Unknown W. Brackets
473fb866e6 softgpu: Implement vertex preview.
And move ConvertMatrix4x3To4x4() into a common place since there were
differing implementations, which was only confusing.
2013-12-29 13:45:10 -08:00
Unknown W. Brackets
2f0c8c2877 softgpu: Attempt to implement GE_PROJMAP_UV.
Looks okay, not sure if it's fully correct.
2013-12-15 11:59:22 -08:00
Unknown W. Brackets
dfadb67ea1 Avoid some operator overloads.
Causing ambiguity.
2013-11-17 14:42:58 -08:00
Unknown W. Brackets
a3bd2f1365 Fix Vec3ByMatrix44() and use it for matrix math. 2013-11-17 14:10:57 -08:00
Unknown W. Brackets
b541c81ba3 Clean up Mat3x3 etc. constness. 2013-11-17 13:27:51 -08:00
Henrik Rydgard
8a69543ec4 BBOX: Transform the planes by the matrix so we don't need to transform the box 2013-11-14 11:44:13 +01:00
Unknown W. Brackets
eae6e87620 Simplify lighting clamping in softgpu. 2013-10-05 13:05:32 -07:00
Henrik Rydgard
bd8cb4b02d Start work on implementing bbox, add a comment with some thoughts. 2013-09-24 14:14:05 +02:00
neobrain
2228ff1cd0 GPU/Math3D: Add a 4x4 matrix class. 2013-07-29 22:49:19 +02:00
neobrain
9f73789c22 GPU/Math3D: Add a 3x3 matrix class. 2013-07-29 22:26:42 +02:00
neobrain
d3e33c527e GPU/Math3D: Replace VecX::Lerp methods with more general and clearer Lerp and LerpInt template functions. 2013-07-29 22:26:42 +02:00
neobrain
f080abc9e9 GPU/Math3D: Add methods for casting vectors. 2013-07-29 22:26:42 +02:00
neobrain
4ab080d083 GPU/Math3D: Support converting Vec3 and Vec4 objects to u32 color values. 2013-07-29 22:26:42 +02:00
neobrain
97f4318ce5 GPU/Math3D: Rename VecXRef::Mul to the star operator. 2013-07-29 22:26:42 +02:00
neobrain
878550ec68 GPU/Math3D: Add component swizzlers for Vec2, Vec3 and Vec4. 2013-07-29 22:26:42 +02:00
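
Note on d09b9fa6a1 (swizzlers returning const objects): the pitfall described in that commit message can be sketched as follows. This is a minimal illustration, not the actual Math3D.h code; the `Vec3` and `Vec4` structs, the `xyz_mutable()` method, and the `assign_through_temporary()` helper are hypothetical names invented for this example. It assumes only what the message states: `xyz()` returns a copy rather than a reference, so assigning to its result cannot modify the original vector.

```cpp
#include <cassert>

// Hypothetical minimal types for illustration; not the real Math3D.h classes.
struct Vec3 {
    float x, y, z;
};

struct Vec4 {
    float x, y, z, w;

    // Swizzler returning a const copy. Because the temporary is const,
    // `v.xyz() = other` is rejected by the compiler.
    const Vec3 xyz() const { return Vec3{x, y, z}; }

    // Non-const variant (hypothetical) showing the behavior before the
    // change: assignment to the result compiles but only modifies a temporary.
    Vec3 xyz_mutable() const { return Vec3{x, y, z}; }
};

// Performs the misleading assignment and returns v.x afterwards.
float assign_through_temporary() {
    Vec4 v{1.0f, 2.0f, 3.0f, 4.0f};
    Vec3 other{9.0f, 9.0f, 9.0f};
    v.xyz_mutable() = other;  // compiles, but v itself is left unchanged
    // v.xyz() = other;       // does not compile: the returned object is const
    return v.x;
}
```

Calling `assign_through_temporary()` returns the original x component, since the assignment only touched a temporary. Returning `const Vec3` turns this silent no-op into a compile error, which appears to be the motivation stated in the commit message.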