Ryan Houdek
e2457943f5
AVX128: Implement support for vperm{q,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
a05644172a
AVX128: Implement support for vdp{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
cc168ce0fb
VectorOps: Restructure DPPOpImpl. This will get reused by AVX128
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
76bd22d279
OpcodeDispatcher: rm gratuitous lambda
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
18574f3cf1
OpcodeDispatcher: extract VPERMILRegOpImpl
...
for avx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
665215ab47
OpcodeDispatcher: extract PTestOpImpl
...
for avx128
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:52:48 -04:00
Alyssa Rosenzweig
6009f36403
OpcodeDispatcher: extract VPERMDIndices
...
and rename things accordingly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:46:17 -04:00
Alyssa Rosenzweig
2580efda0d
OpcodeDispatcher: tweak VTestOp signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:46:17 -04:00
Ryan Houdek
3a310b8815
Merge pull request #3756 from Sonicadvance1/fix_vmovhlps
...
Fix VMOVLHPS instruction
2024-06-24 19:14:56 -07:00
Ryan Houdek
7ff96227c0
Merge pull request #3755 from Sonicadvance1/fix_avx128_vmovntdqa
...
AVX128: Fix vmovntdqa failing to zero upper 128-bits
2024-06-24 19:14:48 -07:00
Ryan Houdek
6d3745b8f1
AVX128: Fixes VMOVLHPS instruction
...
We didn't have unit tests for this
2024-06-24 17:22:51 -07:00
Ryan Houdek
d0f0b975be
SVE256: Fixes VMOVLHPS instruction
...
We didn't have unit tests for this
2024-06-24 17:22:47 -07:00
Ryan Houdek
ff2e6ed59f
X86Tables: Fixes instruction encoding for VMOVLP{S,D}
...
These can have both register and memory modrm encoding
2024-06-24 17:22:43 -07:00
Ryan Houdek
f0d9c8c10a
AVX128: Fix vmovntdqa failing to zero upper 128-bits
2024-06-24 16:32:09 -07:00
Ryan Houdek
b47e981932
AVX128: Fixes SSE4.2 string compare instructions
2024-06-24 15:54:06 -07:00
Ryan Houdek
dc44eb4caf
Merge pull request #3749 from Sonicadvance1/contigous_mask_optimization_removal
...
Arm64: Remove contiguous masked element optimization
2024-06-24 15:22:23 -07:00
Ryan Houdek
dfda6733f0
Merge pull request #3750 from Sonicadvance1/pshuf_bug
...
OpcodeDispatcher: Fixes bug in pshuf{lw,hw}
2024-06-24 15:22:07 -07:00
Alyssa Rosenzweig
21c6986dc7
OpcodeDispatcher: tweak HSUBPOpImpl
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 18:21:08 -04:00
Alyssa Rosenzweig
635720fe12
OpcodeDispatcher: tweak PHADDSOpImpl signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 18:21:08 -04:00
Ryan Houdek
702ecf7637
AVX128: Implement support for round{ss,sd}
2024-06-24 15:19:08 -04:00
Ryan Houdek
0595f1e044
AVX128: Implement support for vround{ps,pd}
2024-06-24 15:19:08 -04:00
Ryan Houdek
cebb032bd3
AVX128: Implement support for vphminposuw
...
Reuses the non-AVX implementation since it only operates on 128-bits.
2024-06-24 15:19:08 -04:00
Ryan Houdek
8e32763ada
AVX128: Implements support for AVX string ops
...
Reuses the implementation from the SSE4.2 implementation, just
explicitly zeroes the hardcoded YMM0's upper 128-bits.
2024-06-24 15:19:08 -04:00
Ryan Houdek
7532337231
AVX128: Implements support for vector AES instructions
2024-06-24 15:19:08 -04:00
Ryan Houdek
4a66d4570e
AVX128: Implement support for a trinary operation with a passed in vector
...
Will be used for AES operations
2024-06-24 15:19:08 -04:00
Alyssa Rosenzweig
6f5e99d47d
OpcodeDispatcher: factor out TranslateRoundType
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 15:19:08 -04:00
Alyssa Rosenzweig
9ee9f5bddd
OpcodeDispatcher: tweak VectorRoundImpl signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 15:14:55 -04:00
Ryan Houdek
d29139d88a
AVX128: Implement support for vextract{i,f}128
2024-06-24 14:27:19 -04:00
Ryan Houdek
317575ba99
AVX128: Implement support for cvtdq2{ps,pd}
2024-06-24 14:27:19 -04:00
Ryan Houdek
d4f2638a2e
AVX128: Implement support for cvt{t,}pd2dq
2024-06-24 14:27:19 -04:00
Ryan Houdek
b67d9be227
AVX128: Implement support for vcvt{pd2ps,ps2pd}
...
Fairly complex set of instructions due to the edge cases.
2024-06-24 14:27:19 -04:00
Ryan Houdek
d52add8fad
AVX128: Implement support for vcvt{ss2sd,sd2ss}
2024-06-24 14:27:19 -04:00
Ryan Houdek
aa9159d25c
AVX128: Implement support for vpmulh{u,}w
2024-06-24 14:27:19 -04:00
Ryan Houdek
94c777259e
AVX128: Implements support for vpmulhrsw
2024-06-24 14:27:19 -04:00
Ryan Houdek
c9f8fa5662
AVX128: Implement support for vpmul{u,}dq
2024-06-24 14:27:19 -04:00
Ryan Houdek
64ee6b119e
AVX128: Implement support for vaddsubp{s,d}
2024-06-24 14:27:19 -04:00
Ryan Houdek
d2ec9a8936
AVX128: Implement support for vpsubsw
2024-06-24 14:27:19 -04:00
Ryan Houdek
2a927453f7
AVX128: Implement support for vphsub{w,d}
2024-06-24 14:27:19 -04:00
Ryan Houdek
c19d489c9a
AVX128: Implement support for vinsertps
...
This one actually reuses the core base implementation which is nice.
2024-06-24 14:27:19 -04:00
Ryan Houdek
6012eb051b
AVX128: Implement support for vinsert{f128,i128}
2024-06-24 14:27:19 -04:00
Alyssa Rosenzweig
3974746473
OpcodeDispatcher: tweak PHSUBOpImpl
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
e1bcdcf387
OpcodeDispatcher: tweak PHSUBSOpImpl signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
fd5fbddae9
OpcodeDispatcher: tweak PMULLOpImpl for avx128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
8ff72beddb
OpcodeDispatcher: tweak PMULHRSWOpImpl signature for avx128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
cba5f7877b
OpcodeDispatcher: tweak ADDSUBPOpImpl signature for AVX128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
9d7e9fd9fc
OpcodeDispatcher: add AVX128_Zext helper
...
should let us clean up a lot.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Ryan Houdek
082a0baff3
JIT: Implement missing Vector_FToF2
2024-06-24 14:14:23 -04:00
Ryan Houdek
3a4914315b
Arm64: Remove contiguous masked element optimization
...
This was a premature optimization and currently breaks. Just remove it
for now.
2024-06-24 07:49:00 -07:00
Ryan Houdek
448b5a338a
ARM64: Adds new FMA vector instructions
2024-06-24 07:48:05 -07:00
Ryan Houdek
4c9890d7f8
OpcodeDispatcher: Fixes bug in pshuf{lw,hw}
...
This optimization was incorrect. Updates unittests to ensure it keeps
working.
2024-06-24 07:43:48 -07:00
Alyssa Rosenzweig
be8ff9ccb9
Merge pull request #3740 from Sonicadvance1/avx_12
...
AVX128: More various instructions
2024-06-24 09:28:40 -04:00
Ryan Houdek
9c531d97b0
AVX128: Implements the various vector shift instructions
...
These are very closely related to each other so it makes sense to
implement the roughly three different families in one commit.
2024-06-24 09:20:19 -04:00
Ryan Houdek
6edf4619d4
Merge pull request #3742 from Sonicadvance1/export_avx_reg_helpers
...
FEXCore: Implement AVX reconstruction helpers
2024-06-24 05:57:44 -07:00
Ryan Houdek
8f769ce5a3
Merge pull request #3743 from alyssarosenzweig/cleanup/literal
...
X86Tables: add Literal() helper
2024-06-23 13:45:42 -07:00
Ryan Houdek
d52a1da501
FEXCore: Implement support for fetching/setting YMM registers
...
Because we have two views of the YMM registers depending on if the host
supports SVE256 or not, add helper functions to fetch them correctly.
We fetch them in the way that Linux desires them in signal handlers. If
we want to return the converged view directly, that is easy to add
support for, but it's unnecessary for now.
2024-06-21 17:13:56 -04:00
Ryan Houdek
abdcaa7c86
AVX128: Implement support for vpinsr{b,w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
ad122cf463
AVX128: Implement support for vpmovmskb
2024-06-21 15:53:52 -04:00
Ryan Houdek
b58a57d225
AVX128: Implement support for vmovmskp{s,d}
2024-06-21 15:53:52 -04:00
Ryan Houdek
28d679de98
AVX128: Implement support for vpmov{s,z}{b,w,d}{w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
d1dd055e6a
AVX128: Implement support for vpextr{b,w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
3045578da4
AVX128: Implement vmov{d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
9566dda73e
AVX128: Implement support for vcmps{s,d}
2024-06-21 15:53:52 -04:00
Ryan Houdek
a0ced2b685
AVX128: Implement support for vcmpp{s,d}
2024-06-21 15:50:26 -04:00
Ryan Houdek
df232f567b
AVX128: Implement support for v{add,sub,mul,fmin,fmax,fdiv,sqrt,rsqrt,rcp}s{s,d}
2024-06-21 15:50:05 -04:00
Ryan Houdek
2a6d6a9d13
AVX128: Implement support for v{u,}comis{s,d}
2024-06-21 15:50:05 -04:00
Alyssa Rosenzweig
cd03932bd1
OpcodeDispatcher: tweak InsertScalarFCMPOpImpl signature
...
so AVX128 can reuse it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-21 15:50:05 -04:00
Ryan Houdek
2e5fa1ef1b
Merge pull request #3739 from Sonicadvance1/avx_11
...
Frontend: Expose AVX W flag
2024-06-21 12:15:14 -07:00
Alyssa Rosenzweig
0c6c4cd532
OpcodeDispatcher: make FCMP more compact
...
I told Ryan to change this for AVX, but it needs to be changed in the original
to match!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-21 15:02:10 -04:00
Alyssa Rosenzweig
25f8a87429
OpcodeDispatcher: use Literal() helper
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-21 14:58:49 -04:00
Alyssa Rosenzweig
edf1a7970d
X86Tables: add Literal() helper
...
Any time we get the value of Literal, we want to assert that it's actually a
literal. We've been open coding this pattern sporadically throughout the
opcodedispatcher. Let's add an ergonomic helper to fetch the value of literal,
asserting that the value is indeed literal.
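A minimal sketch of the pattern described above, assuming a simplified operand type. `Operand`, `Type`, and `IsLiteral()` are hypothetical stand-ins for illustration, not the actual X86Tables definitions:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a decoded x86 operand.
struct Operand {
  enum class Type { Register, Memory, Literal };
  Type Kind;
  uint64_t Value;

  bool IsLiteral() const { return Kind == Type::Literal; }

  // Fetch the literal's value, asserting that the operand really is a literal.
  uint64_t Literal() const {
    assert(IsLiteral() && "operand is not a literal");
    return Value;
  }
};
```

Callers then write `Op.Literal()` instead of open coding the assert next to every read of the value.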
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-21 14:46:46 -04:00
Ryan Houdek
fac9972bad
Merge pull request #3741 from alyssarosenzweig/cleanup/comiss
...
OpcodeDispatcher: refactor Comiss helper
2024-06-21 11:43:05 -07:00
Alyssa Rosenzweig
9ecb960f3a
OpcodeDispatcher: refactor Comiss helper
...
AVX128 will use this, it's not SSE-specific.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-21 14:23:20 -04:00
Ryan Houdek
3d26e23891
Merge pull request #3737 from Sonicadvance1/avx_10
...
Arm64: Implement support for emulated masked vector loadstores
2024-06-21 11:04:01 -07:00
Ryan Houdek
7bbbd95775
Merge pull request #3736 from Sonicadvance1/avx_9
...
AVX128: Some pun pickles, moves and conversions
2024-06-21 10:55:19 -07:00
Ryan Houdek
bb308899b9
Frontend: Expose AVX W flag
...
Previously we could always tell the size of the operation depending on
how this affects the operating size of the instruction. Converting
64-bit down to 32-bit as an example.
AVX gather instructions are the first instruction class that can't infer
this information. The element load size is determined by the W flag but
the operating size of 128-bit or 256-bit is determined by other means.
Expose this flag so we can determine this difference. The FMA
instructions are going to need this flag as well.
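A rough sketch of why the two sizes have to be tracked separately, assuming the usual VEX encoding where W selects the element width and L selects the vector length. `VexInfo` and both helpers are hypothetical names, not the Frontend's actual API:

```cpp
#include <cstdint>

// Hypothetical view of the two VEX bits that matter here.
struct VexInfo {
  bool W; // element width selector (e.g. 32-bit vs 64-bit gather elements)
  bool L; // vector length selector (128-bit vs 256-bit)
};

// Element load size comes from W...
constexpr uint32_t ElementSizeBytes(VexInfo V) {
  return V.W ? 8 : 4;
}

// ...while the operating size comes from L, independently of W.
constexpr uint32_t VectorSizeBytes(VexInfo V) {
  return V.L ? 32 : 16;
}
```

With only the combined operating size available, a 256-bit gather of 32-bit elements and a 256-bit gather of 64-bit elements would be indistinguishable.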
2024-06-21 10:54:42 -07:00
Ryan Houdek
e95c8d703c
Arm64: Implement support for emulated masked vector loadstores
...
In order to support `vmaskmov{ps,pd}` without SVE128 this is required.
It's pretty gnarly but they aren't often used so that's fine from a
compatibility perspective.
Example SVE128 implementation:
```json
"vmaskmovps ymm0, ymm1, [rax]": {
"ExpectedInstructionCount": 9,
"Comment": [
"Map 2 0b01 0x2c 256-bit"
],
"ExpectedArm64ASM": [
"ldr q2, [x28, #32 ]",
"mrs x20, nzcv",
"cmplt p0.s, p6/z, z17.s, #0 ",
"ld1w {z16.s}, p0/z, [x4]",
"add x21, x4, #0x10 (16)",
"cmplt p0.s, p6/z, z2.s, #0 ",
"ld1w {z2.s}, p0/z, [x21]",
"str q2, [x28, #16 ]",
"msr nzcv, x20"
]
},
```
Example ASIMD implementation:
```json
"vmaskmovps ymm0, ymm1, [rax]": {
"ExpectedInstructionCount": 37,
"Comment": [
"Map 2 0b01 0x2c 256-bit"
],
"ExpectedArm64ASM": [
"ldr q2, [x28, #32 ]",
"mrs x20, nzcv",
"movi v0.2d, #0x0",
"mov x1, x4",
"mov x0, v17.d[0]",
"tbz x0, #63 , #+0x8",
"ld1 {v0.s}[0], [x1]",
"add x1, x1, #0x4 (4)",
"tbz w0, #31 , #+0x8",
"ld1 {v0.s}[1], [x1]",
"add x1, x1, #0x4 (4)",
"mov x0, v17.d[1]",
"tbz x0, #63 , #+0x8",
"ld1 {v0.s}[2], [x1]",
"add x1, x1, #0x4 (4)",
"tbz w0, #31 , #+0x8",
"ld1 {v0.s}[3], [x1]",
"mov v16.16b, v0.16b",
"add x21, x4, #0x10 (16)",
"movi v0.2d, #0x0",
"mov x1, x21",
"mov x0, v2.d[0]",
"tbz x0, #63 , #+0x8",
"ld1 {v0.s}[0], [x1]",
"add x1, x1, #0x4 (4)",
"tbz w0, #31 , #+0x8",
"ld1 {v0.s}[1], [x1]",
"add x1, x1, #0x4 (4)",
"mov x0, v2.d[1]",
"tbz x0, #63 , #+0x8",
"ld1 {v0.s}[2], [x1]",
"add x1, x1, #0x4 (4)",
"tbz w0, #31 , #+0x8",
"ld1 {v0.s}[3], [x1]",
"mov v2.16b, v0.16b",
"str q2, [x28, #16 ]",
"msr nzcv, x20"
]
},
```
There's a small improvement still available: the ASIMD implementation
doesn't need to save and restore nzcv, but I'll leave that for a future
change.
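For readers who don't want to trace the ASIMD listing, here is a scalar C++ sketch of what one 128-bit half is doing. `MaskedLoad32x4` is a hypothetical name, and this models the behaviour only, not the JIT's code generation:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Emulated masked load of four 32-bit elements (one 128-bit half).
// An element is loaded only when the sign bit of its mask element is set;
// masked-off elements are never read from memory and stay zero.
std::array<uint32_t, 4> MaskedLoad32x4(const std::array<uint32_t, 4>& Mask,
                                       const uint8_t* Mem) {
  std::array<uint32_t, 4> Result{}; // corresponds to `movi v0.2d, #0x0`
  for (size_t i = 0; i < Result.size(); ++i) {
    if (Mask[i] >> 31) { // corresponds to the `tbz` that skips the `ld1`
      std::memcpy(&Result[i], Mem + i * sizeof(uint32_t), sizeof(uint32_t));
    }
  }
  return Result;
}
```

The 256-bit case in the listing is this sequence run twice, once per half, with the second half reading from the base address plus 16.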
2024-06-21 08:21:32 -07:00
Ryan Houdek
903d6a742e
CPUBackend: Removes SupportsSaturatingRoundingShifts option
...
This has always been true ever since we removed the x86 JIT and
Interpreter. This was left over and adding more code for no reason.
2024-06-21 08:11:22 -07:00
Ryan Houdek
424218e327
AVX128: Implement support for vpsign{b,w,d}
2024-06-21 08:11:22 -07:00
Ryan Houdek
17dc03d414
AVX128: Implement support for vpack{s,u}{wb,dw}
2024-06-21 08:11:21 -07:00
Ryan Houdek
baf699c6e1
AVX128: Implements support for vandnps and vpandn
...
This can't use the previous binary operator handler since the register
sources need to be swapped.
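A small sketch of the operand swap, assuming the underlying and-not primitive has BIC-style semantics (`a & ~b`). Both function names are hypothetical and use a 64-bit scalar in place of a vector:

```cpp
#include <cstdint>

// Hypothetical and-not primitive with BIC-style semantics: a & ~b.
constexpr uint64_t AndNot(uint64_t a, uint64_t b) {
  return a & ~b;
}

// x86 vandnps/vpandn invert the *first* source: ~src1 & src2.
// Expressed with the primitive above, the sources must be swapped.
constexpr uint64_t X86Andn(uint64_t Src1, uint64_t Src2) {
  return AndNot(Src2, Src1);
}

static_assert(X86Andn(0xF0F0, 0xFFFF) == 0x0F0F, "first source is inverted");
```

A generic binary-op handler that passes sources through in order would compute `src1 & ~src2` instead, which is why a dedicated handler is needed.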
2024-06-21 08:11:21 -07:00
Ryan Houdek
1431af1ff5
AVX128: Implements support for vcvt{t,}s{s,d}2si
2024-06-21 08:11:21 -07:00
Ryan Houdek
775a41b903
AVX128: Implement support for vcvtsi2s{s,d}
2024-06-21 08:11:21 -07:00
Ryan Houdek
e614340c0c
CPUID: Update labeling on some reserved bits
...
These aren't reserved and I was confused that they were missing.
2024-06-21 05:34:44 -07:00
Ryan Houdek
3c293b9aed
Arm64: Loosen restrictions on V{Load,Store}VectorMasked to allow 128-bit operation
2024-06-21 04:26:09 -07:00
Ryan Houdek
283c2861c9
AVX128: Implement support for vlddqu
2024-06-21 00:56:36 -07:00
Ryan Houdek
757dc95116
AVX128: Implement support for the punpckh instructions
2024-06-21 00:56:32 -07:00
Ryan Houdek
6192250b8a
AVX128: Implement support for the punpckl instructions
2024-06-21 00:56:28 -07:00
Ryan Houdek
f489135b1d
Merge pull request #3734 from Sonicadvance1/avx_8
...
AVX128: Move moves!
2024-06-21 00:53:41 -07:00
Ryan Houdek
3f232e631e
Merge pull request #3730 from Sonicadvance1/avx_4
...
Vector: Helper refactorings
2024-06-21 00:31:14 -07:00
Ryan Houdek
6e3643c3ef
Merge pull request #3714 from pmatos/FSTstiTagSet
...
Set tag properly in X87 FST(reg)
2024-06-21 00:27:24 -07:00
Ryan Houdek
c28824f94d
AVX128: Implements support for vbroadcast*
2024-06-20 09:43:10 -07:00
Ryan Houdek
664d766b45
AVX128: Implement support for vmovshdup
2024-06-20 09:43:10 -07:00
Ryan Houdek
fce694ed92
AVX128: Implement support for vmovsldup
2024-06-20 09:43:10 -07:00
Ryan Houdek
96aafb4f07
AVX128: Implement support for vmovddup
...
This instruction is a little weird.
When accessing memory, the 128-bit operating size of the instruction
only loads 64-bits.
Meanwhile the 256-bit operating size of the instruction fetches a full
256-bits.
Theoretically the hardware could get away with two 64-bit loads or a
wacky 24-byte load, but it looks like to simplify hardware they just
spec'd it that the 256-bit version will always load the full range.
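The difference in memory access size can be sketched in scalar C++ as follows. The function names are hypothetical, and vectors are modelled as arrays of 64-bit elements:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// 128-bit form: only 8 bytes are read, then duplicated into both lanes.
std::array<uint64_t, 2> MovDDup128(const uint8_t* Mem) {
  uint64_t Low;
  std::memcpy(&Low, Mem, sizeof(Low));
  return {Low, Low};
}

// 256-bit form: the full 32 bytes are read, even though only the
// even-indexed 64-bit elements end up in the result.
std::array<uint64_t, 4> MovDDup256(const uint8_t* Mem) {
  std::array<uint64_t, 4> Src;
  std::memcpy(Src.data(), Mem, sizeof(Src));
  return {Src[0], Src[0], Src[2], Src[2]};
}
```

The practical consequence for an emulator is that the 256-bit form can fault on bytes whose values never reach the destination, so the load width has to match.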
2024-06-20 09:43:10 -07:00
Ryan Houdek
dbaf95a8f3
AVX128: Implement support for vmovhps/d
2024-06-20 06:53:21 -07:00
Ryan Houdek
e67df96ad9
AVX128: Implement support for movlps/d
2024-06-20 06:53:17 -07:00
Ryan Houdek
56de94578d
AVX128: Implement support for vmovq
2024-06-20 06:53:13 -07:00
Ryan Houdek
06fc2f5ef0
AVX128: Implement support for non-temporal moves.
2024-06-20 06:53:09 -07:00
Ryan Houdek
b3ba315cbd
AVX128: Implements unary/binary lambda helper
2024-06-20 06:53:05 -07:00
Ryan Houdek
e5a531e683
Vector: Refactor MPSADBWOpImpl so AVX128 can use it.
2024-06-20 06:43:57 -07:00
Ryan Houdek
e2de57bd04
Vector: Refactor PSADBWOpImpl so AVX128 can use it.
2024-06-20 06:43:57 -07:00
Ryan Houdek
4eebca93e3
Vector: Refactor PSHUFBOpImpl. This will be reused for AVX128
2024-06-20 06:33:27 -07:00
Ryan Houdek
3919ec9692
Vector: Expose VBLENDOpImpl in the OpcodeDispatcher. It will be reused by AVX128
2024-06-20 06:33:21 -07:00
Ryan Houdek
02aeb0ac1a
Vector: Restructure PMADDWDOpImpl. It's going to get reused for AVX128
2024-06-20 06:33:15 -07:00
Ryan Houdek
206544ad09
Vector: Reconfigure PMADDUBSWOpImpl, it's going to get reused for AVX128
2024-06-20 06:33:08 -07:00
Ryan Houdek
3854cd2b2f
Vector: Restructure SHUFOpImpl. AVX128 is going to reuse it.
2024-06-20 06:32:58 -07:00
Ryan Houdek
acbd920c9a
OpcodeDispatcher: Adds initial groundwork for decomposed AVX operations
...
Only installs the tables if SVE256 isn't supported yet AVX is explicitly
enabled with HostFeatures, to protect accidental enablement early.
- Only implements 85 instructions starting out
- Basic vector moves
- Basic vector unary operations
- Basic vector binary operations
- VZeroUpper/VZeroAll
The bulk of the implementation is currently the handling for loading and
storing the halves of the registers from the context or from memory.
This means the load/store helpers must always return a pair unless only
requesting the bottom half of the register, which occurs with 128-bit
AVX operations. The store side then needs to consume the named zero
register if it occurs, since those cases will zero the upper bits.
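A simplified model of the decomposition described above, assuming a 256-bit register is carried as a pair of 128-bit halves. `RegPair`, `Add128`, and `AVX128_VAdd` are hypothetical names for illustration, not the OpcodeDispatcher's real helpers:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec128 = std::array<uint32_t, 4>;

// A 256-bit guest register viewed as two independent 128-bit halves.
struct RegPair {
  Vec128 Low;
  Vec128 High;
};

// Stand-in for any 128-bit binary vector operation.
Vec128 Add128(const Vec128& A, const Vec128& B) {
  Vec128 R{};
  for (size_t i = 0; i < R.size(); ++i) {
    R[i] = A[i] + B[i];
  }
  return R;
}

// 256-bit ops run the 128-bit op on each half. 128-bit VEX ops only
// touch the low half, and the store writes zero to the upper half.
RegPair AVX128_VAdd(const RegPair& A, const RegPair& B, bool Is256Bit) {
  RegPair Result{}; // High starts out zeroed
  Result.Low = Add128(A.Low, B.Low);
  if (Is256Bit) {
    Result.High = Add128(A.High, B.High);
  }
  return Result;
}
```

In this model the zeroed `High` plays the role of the named zero register that the store side consumes.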
This implementation approach has a few benefits.
- I can pound this out extremely quickly
- SSE implementations are unaffected and don't need to deal with the
insert behaviour of SVE256.
- We still keep the SVE256 implementation for the inevitable future when
hardware vendors actually do implement it (Give it 8 years or
something).
- We can actually unit test this path in CI once it is complete.
- We can partially optimize some paths with SVE128 (Gathers) and support
a full ASIMD path if necessary.
One downside is that I can't enable this in CI yet because it can't pass
all unittests, but that's a non-issue since it is going to be in heavy
flux as I'm hammering out the implementation. It'll get switched on at
the end when it's passing all 1265 AVX unittests. Currently at 1001 on
this.
2024-06-20 08:44:14 -04:00
Alyssa Rosenzweig
db0bdd48e5
Merge pull request #3729 from alyssarosenzweig/refactor/address-modes
...
OpcodeDispatcher: Refactor address modes
2024-06-20 08:18:33 -04:00
Ryan Houdek
da21ee3cda
Merge pull request #3692 from pmatos/AFP_RPRES_fix
...
Fixes AFP.NEP handling on scalar insertions
2024-06-19 19:23:49 -07:00
Ryan Houdek
d2baef2b36
Merge pull request #3727 from Sonicadvance1/vaes
...
VAES support
2024-06-19 19:22:56 -07:00
Ryan Houdek
df96bc83cc
Merge pull request #3726 from Sonicadvance1/oryon_errata
...
HostFeatures: Work around Qualcomm Oryon RNG errata
2024-06-19 19:21:14 -07:00
Alyssa Rosenzweig
ec03831a21
OpcodeDispatcher: plumb A.NonTSO deeper
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-19 08:52:07 -04:00
Alyssa Rosenzweig
9ca821316a
OpcodeDispatcher: factor out DecodeAddress
...
this is the common guts of the load/store routines.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-19 08:52:07 -04:00
Alyssa Rosenzweig
025a060337
OpcodeDispatcher: extract IsNonTSOReg
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-19 08:52:07 -04:00
Alyssa Rosenzweig
371d6f0730
OpcodeDispatcher: extract IsOperandMem
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-19 08:52:07 -04:00
Ryan Houdek
643bc10d52
CPUID: Expose VAES if supported
2024-06-19 05:51:47 -07:00
Ryan Houdek
542ed8b6ad
Implement support for querying AES256 support
...
This is a different feature flag than regular AES as the default AES+AVX
only operates on 128-bit wide vectors.
With the newer `VAES` extension this is expanded to 256-bit.
2024-06-19 05:51:47 -07:00
Paulo Matos
2483329ef6
Fixes AFP.NEP handling on scalar insertions
...
Fixes #3690
When doing scalar insertions, upper bits come from different arguments
depending on the operation. These are listed in the ARM spec under the
NEP bit documentation.
2024-06-19 10:02:54 +02:00
Paulo Matos
359221b379
Set tag properly in X87 FST(reg)
2024-06-19 10:02:05 +02:00
Paulo Matos
f9b38a1de7
FXCH should set C1 to zero
2024-06-19 08:57:48 +02:00
Ryan Houdek
67e1ac0442
Merge pull request #3725 from alyssarosenzweig/ir/vbic
...
IR: rename _VBic -> _VAndn
2024-06-18 16:34:26 -07:00
Ryan Houdek
c57e9e008f
Merge pull request #3723 from alyssarosenzweig/fexcore/zero-helper
...
OpcodeDispatcher: refactor zero vector loads
2024-06-18 16:34:15 -07:00
Ryan Houdek
b34c23fe3d
HostFeatures: Work around Qualcomm Oryon RNG errata
...
The Oryon is the first CPU we know of that implemented support for the
RNG extension. It also has an errata where reading the RNDRRS register
never returns success. X86's RDSEED guarantees forward progress with
enough retries.
When an x86 processor messed this up at one point, some Linux systems
would infinite loop (presumably when something in boot was filling an
entropy pool). This required a microcode change to fix that processor.
The rdseed unittest infinite loops on this platform if RNG was exposed.
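To illustrate the failure mode, here is a sketch of a bounded retry loop around a fallible seed source. `ReadSeedWithRetries` and the `ReadSeed` callable are hypothetical and are not part of FEX or the unittest:

```cpp
#include <cstdint>
#include <optional>

// Retry a fallible seed source a bounded number of times.
// `ReadSeed` is a hypothetical callable returning std::nullopt on failure.
template <typename F>
std::optional<uint64_t> ReadSeedWithRetries(F&& ReadSeed, unsigned MaxRetries) {
  for (unsigned i = 0; i < MaxRetries; ++i) {
    if (auto Seed = ReadSeed()) {
      return Seed;
    }
  }
  return std::nullopt; // the source never reported success
}
```

Code that retries without a bound, relying on the forward-progress guarantee, spins forever when the source never reports success, which matches the hang described above.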
2024-06-18 16:29:53 -07:00
Alyssa Rosenzweig
01da5972fc
IR: rename _VBic -> _VAndn
...
to be consistent with the scalar _Andn opcode, which is specifically named _Andn
and not _Bic.
noticed while reviewing AVX patches
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-18 14:00:01 -04:00
Ryan Houdek
bf812aae8f
CoreState: Adds avx_high structure for tracking decoupled AVX halves.
...
Needed something in between the `InlineJITBlockHeader` and `avx_high` in
order to match alignment requirements of 16-byte for avx_high. Chose the
`DeferredSignalRefCount` because we hit it quite frequently and it is
basically the only 64-bit variable that we end up touching
significantly.
In the future the CPUState object is going to need to change its view of
the object depending on if the device supports SVE256 or not, but we
don't need to frontload the work right now. It'll become significantly
easier to support that path once the RCLSE pass gets deleted.
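A layout sketch of the alignment argument, assuming placeholder sizes. The real `InlineJITBlockHeader` size and `avx_high` element type are not given here, so the field types below are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified layout. Sizes are placeholders.
struct CoreStateSketch {
  uint64_t InlineJITBlockHeader;   // placeholder 8-byte header
  uint64_t DeferredSignalRefCount; // fills the gap up to a 16-byte boundary
  uint8_t avx_high[16][16];        // upper 128 bits of 16 YMM registers
};

// Without the 64-bit member in between, avx_high would land at offset 8.
static_assert(offsetof(CoreStateSketch, avx_high) % 16 == 0,
              "avx_high must be 16-byte aligned within the struct");
```

Putting a frequently accessed 64-bit field in the gap means the space that would otherwise be padding does useful work.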
2024-06-18 12:00:45 -04:00
Ryan Houdek
9a71443005
CoreState: Adds a gregs offset check
...
This is required to be less than the maximum range for LDP and STP in
the Arm64 Dispatcher otherwise it breaks. Necessary to ensure this when
reorganizing the CoreState.
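A sketch of what such a check can look like, assuming the 64-bit LDP/STP form with a signed 7-bit immediate scaled by 8 (so offsets from -512 to 504). `FitsLdpStpOffset64` and `StateSketch` are hypothetical; the actual check in CoreState may be phrased differently:

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit LDP/STP encode a signed 7-bit immediate scaled by 8,
// so the largest positive offset is 63 * 8 = 504.
constexpr size_t LDP_STP_MAX_OFFSET_64 = 504;

constexpr bool FitsLdpStpOffset64(size_t Offset) {
  return Offset % 8 == 0 && Offset <= LDP_STP_MAX_OFFSET_64;
}

// Hypothetical state layout used only to demonstrate the check.
struct StateSketch {
  uint64_t rip;
  uint64_t gregs[16];
};

// Every pair of registers must be reachable from the state pointer.
static_assert(
  FitsLdpStpOffset64(offsetof(StateSketch, gregs) + 14 * sizeof(uint64_t)),
  "gregs must stay within LDP/STP immediate range");
```

A compile-time check like this turns a layout change that would break the dispatcher into a build failure.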
2024-06-18 12:00:45 -04:00
Ryan Houdek
ee165249bc
Dispatcher: Fix ARM64EC
...
We don't have CI for this and it was missed.
2024-06-18 12:00:45 -04:00
Alyssa Rosenzweig
af8cfb79e5
OpcodeDispatcher: refactor zero vector loads
...
AVX128 is going to slam this, so make it more ergonomic.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-18 11:44:46 -04:00
Ryan Houdek
13ebfb1a49
Merge pull request #3711 from Sonicadvance1/avx128_2
...
FEXCore: Disentangle the SVE256 feature from AVX
2024-06-17 17:35:15 -07:00
Ryan Houdek
f863b30951
Merge pull request #3716 from alyssarosenzweig/ir-dump/unrecoverable
...
json_ir_generator: don't print unrecoverable temps
2024-06-17 17:25:27 -07:00
Ryan Houdek
1ce27a5e6b
FEXCore: Disentangle the SVE256 feature from AVX
...
In quite a few locations we are mixing the case that SVE256 == AVX or
that AVX means the guest register size is 256-bit.
While this is true today, this entanglement is going to change very
quickly and cause confusion in follow-up PRs.
Now we have SVE128, SVE256, and SVE2 HostFeatures to disambiguate the
different features which mean different things.
This PR keeps the alias that `SupportsAVX` = `SupportsSVE256 && SupportsSVE2`
but that alias is going to very quickly change its definition.
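The alias as stated can be sketched like this. The struct is a hypothetical stand-in for the real HostFeatures class, keeping only the flags named above:

```cpp
// Hypothetical stand-in for HostFeatures with only the relevant flags.
struct HostFeaturesSketch {
  bool SupportsSVE128;
  bool SupportsSVE256;
  bool SupportsSVE2;

  // Current alias from this PR: AVX is exposed only when both hold.
  // The commit message notes this definition is expected to change.
  constexpr bool SupportsAVX() const {
    return SupportsSVE256 && SupportsSVE2;
  }
};
```

Keeping the derived flag in one place means call sites that really mean "256-bit host vectors" can test `SupportsSVE256` directly rather than going through AVX.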
2024-06-17 17:20:32 -07:00
Ryan Houdek
933d622860
Merge pull request #3710 from Sonicadvance1/avx128_1
...
CoreState: Move `InlineJITBlockHeader` to the start of the struct
2024-06-17 17:17:56 -07:00
Alyssa Rosenzweig
29390b439a
json_ir_generator: don't print unrecoverable temps
...
this makes the print more noisy for no benefit, don't do it.
before:
%9(GPRFixed16) i32 = Add OpSize:Tmp:Size, %6(GPRFixed0) i64, %17(Invalid)
%10(GPR0) i64 = Bfi OpSize:Tmp:Size, #0x10, #0x0, %6(GPRFixed0) i64, %9(GPRFixed16) i32
(%11 i64) StoreRegister %6(GPRFixed0) i64, #0x11, GPR, u8:Tmp:Size
(%12 i64) StoreRegister %9(GPRFixed16) i32, #0x10, GPR, u8:Tmp:Size
(%13 i64) StoreRegister %10(GPR0) i64, #0x0, GPR, u8:Tmp:Size
after:
%9(GPRFixed16) i32 = Add %6(GPRFixed0) i64, %17(Invalid)
%10(GPR0) i64 = Bfi #0x10, #0x0, %6(GPRFixed0) i64, %9(GPRFixed16) i32
(%11 i64) StoreRegister %6(GPRFixed0) i64, #0x11, GPR
(%12 i64) StoreRegister %9(GPRFixed16) i32, #0x10, GPR
(%13 i64) StoreRegister %10(GPR0) i64, #0x0, GPR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:58:56 -04:00
Alyssa Rosenzweig
799c17eb90
Arm64Emitter: drop out of date comment
...
I fixed this when we landed the new RA
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:58:08 -04:00
Alyssa Rosenzweig
5fb84866e0
json_ir_generator: rework argument printing
...
for next commit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:40:29 -04:00
Alyssa Rosenzweig
4965344ef5
Merge pull request #3705 from alyssarosenzweig/pre-rclse
...
Clean ups from my RCLSE branch
2024-06-17 14:22:01 -04:00
Alyssa Rosenzweig
46ca53ad0d
Merge pull request #3704 from alyssarosenzweig/ra/spill-better
...
RA: prioritize remat over spilling
2024-06-17 09:01:50 -04:00
Alyssa Rosenzweig
61ff1b3584
Merge pull request #3712 from alyssarosenzweig/jit/silly-assert
...
JIT: delete silly assert
2024-06-17 08:59:00 -04:00
Alyssa Rosenzweig
7c0c5de4bd
JIT: delete silly assert
...
noticed in the area.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 08:51:22 -04:00
Ryan Houdek
a9bacc1b6b
CoreState: Move InlineJITBlockHeader to the start of the struct
...
This currently doesn't do much but soon this will be very important to
ensure the data prefetcher of Cortex keeps the cachelines following this
variable in L1.
2024-06-17 02:59:56 -07:00
Alyssa Rosenzweig
9443b18076
RegisterAllocationPass: optimize spill loop
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-16 08:15:15 -04:00
Alyssa Rosenzweig
4bd84eb523
OpcodeDispatcher: extract PF/AF invalidate helpers
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
e2073dcd30
OpcodeDispatcher: extract safe Thunk
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
fd72669c7e
OpcodeDispatcher: extract safe Break
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
81c144697b
OpcodeDispatcher: extract safe ExitFunction
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
aecf180dfe
OpcodeDispatcher: extract FlushRegisterCache
...
The "end the clause" signal. for now just flushes flags.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
10fa4a4f20
OpcodeDispatcher: remove never-gonna-be-done todo
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
534732564b
OpcodeDispatcher: drop pointless thunks for packss
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:46 -04:00
Alyssa Rosenzweig
6a314bc9cd
RegisterAllocationPass: prioritize remat over spilling
...
No instcountci changes yet, since nothing currently spills in instcountci. This
mitigates spilling later seen with #3703 , and should help for certain
pathological blocks even without those changes (maybe we should try to get some
of those blocks in instcountci?).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:21:57 -04:00
Ryan Houdek
1d1ed012d8
FEXCore: Fixes Call with 32-bit displacement and address size override
...
FEX had a bug with this instruction where it was incorrectly using both
the address size override and operand size override to truncate the
immediate offset. This isn't how the instruction should behave as it
should actually ignore the address size override.
This now puts it correctly inline with how the jump instruction works
and adds a unit test to ensure it doesn't break again.
This fixes a crash from the Arch rootfs from the glibc dynamic linker
being compiled in a way where a call instruction was getting aligned
using this prefix (Since the compiler knew it does nothing).
2024-06-14 14:00:35 -07:00
Lioncache
d133fa6dc1
ASIMD Tests: Remove erroneous disassembly tests
...
The vixl disassembler has gotten more strict about certain instruction types, so these tests
aren't really needed.
Alternatively, we could mark them as unallocated, but we can opt to remove them here.
2024-06-14 16:12:21 -04:00
Ryan Houdek
184c9d21bb
Revert "OpcodeDispatcher: optimize logical flags"
...
This reverts commit bb8336fcad.
2024-06-13 19:28:16 -07:00
Alyssa Rosenzweig
a8bf3859ea
ConstProp: rm pointless constant folding
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
aa7dcffcea
ConstProp: drop const pool heuristic
...
slightly worse for compile time, slightly better output, honestly I'll take the
win because this is easier to reason about.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
be1a5cea8e
ConstProp: drop addressgen const pool stuff
...
I don't get the point, it should be handled by a combination of existing
passes/techniques just fine. no instcountci changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
402ea84aa0
RedundantFlagCalculationElimination: cleanup DCE
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
19a7b06b91
ConstProp: swallow up LongDivideElimination
...
as usual.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
96bd643e5b
ConstProp: always inline constants
...
x86/interpreter leftover, I think.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
6b9293979c
ConstProp: swallow up InlineCallOptimization
...
No reason to have a separate pass for this, merging should be a bit faster since
it eliminates an IR walk.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
7d5cee4384
InlineCallOptimization: rm x86 leftover
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
32f5a28433
IR: use Ref instead of OrderedNode
...
find-and-replace across the tree, excluding IR.h itself.
also excluded IRValidation because its treatment of blocks blows up and will be
reformed in the new IR anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-03 12:19:34 -04:00
Alyssa Rosenzweig
ce30179ed1
IR: add Ref typedef
...
To put new IR lipstick on the old IR pig.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-03 12:19:34 -04:00
Alyssa Rosenzweig
a515b707f3
Merge pull request #3679 from Sonicadvance1/memory_model_emulation_programmer_documentation
...
FEXCore/docs: Adds programmer documentation about memory model emulation
2024-06-03 09:24:37 -04:00
Alyssa Rosenzweig
951fee361f
OpcodeDispatcher: optimize shld
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 14:44:24 -04:00
Alyssa Rosenzweig
abfd974d70
OpcodeDispatcher: select hardware addressing modes
...
Now that we have a framework to do this in.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:50 -04:00
Alyssa Rosenzweig
97966930e9
OpcodeDispatcher/x87f64: fuse addr calc
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
a52a2e3ae4
OpcodeDispatcher/x87: fuse addr
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
c49b30f105
OpcodeDispatcher/Vector: fuse addr calc
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
b0b4ad2083
OpcodeDispatcher: fuse xlat address
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
ee4bee4fef
OpcodeDispatcher: fuse BT address
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
c3a0f5a2f6
OpcodeDispatcher: fuse sgdt
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
0413a6bf68
OpcodeDispatcher: improve bmi2 shift
...
allow upper garbage, use simpler clean.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00
Alyssa Rosenzweig
7bd036d1ae
OpcodeDispatcher: refactor address modes
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:32 -04:00
Alyssa Rosenzweig
112c49a348
ConstProp: fix inlining shifted imm to mem instructions
...
hit by sse4_1-pmaxuw.c.gcc-target-test-64.jit.gcc-target-64
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:42:48 -04:00
Alyssa Rosenzweig
80878ae611
ConstProp: rework mem immediate inlining
...
deduplicate all the things.
functional change:
hit by sse4_1-pmaxuw.c.gcc-target-test-64.jit.gcc-target-64
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:42:48 -04:00
Alyssa Rosenzweig
85a69be5b6
ConstProp: drop address fusion
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 17:38:03 -04:00
Ryan Houdek
8dbfd1635a
FEXCore/docs: Adds programmer documentation about memory model emulation
...
I keep needing to look these up to remember the limitations. Add a doc
file so I can more easily point to the information.
2024-05-31 10:36:48 -07:00
Alyssa Rosenzweig
8b5ca303e3
JIT: add asserts for invalid TSO load/store
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-31 12:12:36 -04:00
Alyssa Rosenzweig
bb8336fcad
OpcodeDispatcher: optimize logical flags
...
fuse the PF write in.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-30 14:42:22 -04:00
Ryan Houdek
ee96d60983
Merge pull request #3673 from alyssarosenzweig/ra/tied
...
Track tied sources in the IR
2024-05-30 10:55:15 -07:00
Ryan Houdek
ab0a6bbe9f
Merge pull request #3669 from Sonicadvance1/fix_addshift_operation
...
ConstProp fixes for Darwinia
2024-05-29 19:43:13 -07:00
Alyssa Rosenzweig
665491adf8
OpcodeDispatcher: drop weird !flagm special case
...
now that bfi is coalesced, this is a win.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Alyssa Rosenzweig
55391ccbc0
RegisterAllocationPass: try to coalesce tied sources
...
we'll do better in the future but this is already a win.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Alyssa Rosenzweig
7790d7a0b7
IR: track tied sources
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-29 12:32:07 -04:00
Ryan Houdek
80687c8d2d
ConstProp: Limits which addressing modes can be used for vector loadstores
...
This was causing us to generate invalid code in Darwinia, resulting in a
crash. With assertions enabled this would be picked up in the emitter.
Only implement AddShift optimizations for now because I don't want to do
the remaining optimizations in a bug fix PR.
Fixes Darwinia.
2024-05-29 04:42:11 -07:00
Ryan Houdek
920fe60492
ConstProp: Fix bug with transposed elements from AddShift op
...
We were accidentally swapping which source was the base and which was
the one getting shifted. This wasn't super common so it usually didn't
matter.
Fixes one crash in Darwinia.
2024-05-29 04:32:51 -07:00
Ryan Houdek
8c6ce2cb3b
Passes/ConstProp: Have MemExtendedAddressing return a struct rather than a tuple
...
Makes it less confusing about which variable is the base versus the
offset.
NFC
2024-05-29 04:32:14 -07:00
Ryan Houdek
61f30d004c
IR: Document AddShift behaviour
...
Just to clarify that Src2 is the shifted operand.
2024-05-29 04:29:09 -07:00
Alyssa Rosenzweig
136f1d0a0b
OpcodeDispatcher: drop pcmpestri zext
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-28 09:32:14 -04:00
Alyssa Rosenzweig
0c042d1e85
VectorFallbacks: optimize PCMP*STRI flags
...
Return an NZCV.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-28 09:19:43 -04:00
Alyssa Rosenzweig
734258e23b
Merge pull request #3661 from Sonicadvance1/remove_warnings2
...
Removes warnings
2024-05-25 11:52:37 -04:00
Ryan Houdek
74916b3757
RAPass: Remove warnings
2024-05-24 18:41:30 -07:00
Alyssa Rosenzweig
bc1669b163
DeadStoreElimination: eliminate map
...
use a vec. block indices will be dense in the new IR. This is memory intensive
but seems faster in practice.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
83e417b2c6
DeadStoreElimination: combine GPR/FPR handling
...
slight speed up per profile.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
cb00d9171f
IR: merge general DCE with flag DCE
...
Flag DCE needs to do general DCE anyway to converge in one pass. So we can move
the special syscall/atomic logic over to flag DCE and then drop the second DCE
pass altogether. Now local dead code of both is eliminated in a single pass.
Flag DCE is carefully written to converge in a single iteration which makes this
scheme work.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
cf77f2ae5d
RedundantFlagCalculationElimination: fix convergence issue
...
If both the destination and the flags are dead for an AddWithFlags, we need to
eliminate it in one pass. If we only replace without eliminating, we would need a
second DCE pass to eliminate. We want DCE to finish in one pass, so fix this.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
273d086a7b
ConstProp: merge const pooling passes
...
walk the IR less.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
94d9cf54bc
ConstProp: don't push/pop cursor
...
pointless
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
3089e0e6de
ConstProp: merge masking opts with const folding
...
Single pass over the IR now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
Alyssa Rosenzweig
3c088fb414
ConstProp: remove masking elimination opts
...
This has been dead code since 2020. Drop it so we can focus on what *does* work
and what does matter.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00