This does duplicate the _Constant(1) but it doesn't matter because it
gets inlined into the eor anyway. There is no functional change here.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We store garbage in the upper bits. That's ok, but it means we need to
mask on read for correct behaviour.
Closes #2767
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We can fold the Not into the And. This requires flipping the arguments
to Andn, but we do not flip the order of the assignments since that
requires an extra register in a test I'm looking at.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
WIN32 already has a define called `GetObject`, which will cause our
symbol to have an A appended to it and break linking.
Just rename it to `GetTelemetryValue`.
Noticed during introspection that we were generating zero constants
redundantly. Bunch of single cycle hits or zero-register renames.
Every time a `SetRFLAG` helper was called, it was /always/ doing a BFE
on everything passed in to extract the lowest bit. In nearly all cases
the data getting passed in is already only the lowest bit.
Instead, stop the helper from doing this BFE, and ensure the
OpcodeDispatcher does BFE in the couple of cases it still needs to do.
As I was skimming through all these to ensure BFE isn't necessary, I did
notice that some of the BCD instructions are wrong or questionable. So I
left a comment on those so we can come back to it.
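For reference, the BFE the helper was always performing is just a bitfield extract of the lowest bit; a minimal sketch of the semantics (the function name is hypothetical, not FEX's actual IR helper):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a bitfield extract: pull `width` bits starting
// at bit `lsb` out of `value`.
uint64_t Bfe(uint64_t value, unsigned width, unsigned lsb) {
  uint64_t mask = (width >= 64) ? ~0ull : ((1ull << width) - 1);
  return (value >> lsb) & mask;
}
// The old SetRFLAG helper effectively did Bfe(value, 1, 0) -- i.e.
// `value & 1` -- on every input, which is redundant when the input is
// already only a single bit.
```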
These address calculations were failing to understand that they can be
optimized. When TSO emulation is disabled these were fine, but with TSO
we were eating one more instruction.
Before:
```
add x20, x12, #0x4 (4)
dmb ish
ldr s16, [x20]
dmb ish
```
After:
```
dmb ish
ldr s16, [x12, #4]
dmb ish
```
Also left a note that once LRCPC3 is supported in hardware that we can do a similar optimization there.
When this instruction returns the index into the ecx register, this is
defined as a 32-bit result. This means it actually gets zero-extended to
the full 64-bit GPR size on 64-bit processes.
Previously FEX was doing a 32-bit insert which leaves garbage data in
the upper 32-bits of the RCX register.
Adds a unit test to ensure the result is zero extended.
Fixes running Java games under FEX now that SSE4.2 is exposed.
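The x86-64 rule at play: a write to a 32-bit GPR zero-extends into the full 64-bit register. A minimal sketch of the difference between the buggy insert and the correct behavior (function names are illustrative, not FEX's actual IR ops):

```cpp
#include <cassert>
#include <cstdint>

// Buggy: a 32-bit insert preserves whatever garbage sits in the upper
// 32 bits of RCX.
uint64_t Insert32(uint64_t rcx, uint32_t result) {
  return (rcx & 0xFFFFFFFF00000000ull) | result;
}

// Correct: a 32-bit destination write zero-extends to 64 bits.
uint64_t ZeroExtend32(uint64_t /*rcx*/, uint32_t result) {
  return static_cast<uint64_t>(result);
}
```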
ARM64 BFI doesn't allow you to encode two source registers here to match
our SSA semantics. Also since we don't support RA constraints to ensure
that these match, just do the optimal case in the backend.
Leave a comment for future RA constraint excavators to make this more
optimal.
When a fork occurs FEX needs to be incredibly careful as any thread
(that isn't forking) that holds a lock will vanish when the fork occurs.
At this point if the newly forked process tries to use these mutexes
then the process hangs indefinitely.
The three major mutexes that need to be held during a fork:
- Code Invalidation mutex
- This is the highest priority and causes us to hang frequently.
- This is highly likely to occur when one thread is loading shared
libraries and another thread is forking.
- Happens frequently with Wine and steam.
- VMA tracking mutex
- This one happens when one thread is allocating memory while a fork
occurs.
- This closely relates to the code invalidation mutex, just happens at
the syscall layer instead of the FEXCore layer.
- Happens as frequently as the code invalidation mutex.
- Allocation mutex
- This mutex is used for FEX's 64-bit Allocator, this happens when FEX
is allocating memory on one thread and a fork occurs.
- Fairly infrequent because jemalloc doesn't allocate VMA regions that
often.
While this likely doesn't hit all of the FEX mutexes, this hits the ones
that are burning fires and are happening frequently.
- FEXCore: Adds forkable mutex/locks
Necessary since we have a few locations in FEX that need to be locked
before and after a fork.
When a fork occurs the locks must be locked prior to the fork. Then
afterwards they either need to unlock or be set to default
initialization state.
- Parent
- Does an unlock
- Child
- Sets the lock to default initialization state
- This is because pthreads does TID-based ownership checking on
unique locks and refcount-based waiting for shared locks.
- No way to "unlock" after fork in this case other than default
initializing.
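The scheme above can be sketched with `pthread_atfork`: lock in the prepare handler, unlock in the parent, and reinitialize in the child (since pthreads tracks the owning TID, the child cannot simply unlock). This is a minimal illustration, not FEX's actual implementation; all names are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical forkable mutex: the child can't unlock a mutex locked by
// a pre-fork thread, so it reinitializes to default state instead.
class ForkableMutex {
public:
  void lock()   { pthread_mutex_lock(&Mutex); }
  void unlock() { pthread_mutex_unlock(&Mutex); }
  // Child side after fork: reset to default-initialized state.
  void ReinitAfterFork() { pthread_mutex_init(&Mutex, nullptr); }
private:
  pthread_mutex_t Mutex = PTHREAD_MUTEX_INITIALIZER;
};

ForkableMutex CodeInvalidationMutex;

void InstallForkHandlers() {
  pthread_atfork(
    []() { CodeInvalidationMutex.lock(); },            // prepare: before fork
    []() { CodeInvalidationMutex.unlock(); },          // parent: unlock
    []() { CodeInvalidationMutex.ReinitAfterFork(); }  // child: reinit
  );
}

// Demo: after installing handlers, a forked child can take the lock
// without hanging.
int ForkDemo() {
  InstallForkHandlers();
  pid_t pid = fork();
  if (pid == 0) {
    CodeInvalidationMutex.lock();
    CodeInvalidationMutex.unlock();
    _exit(0);
  }
  int status = 0;
  waitpid(pid, &status, 0);
  return (WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```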
This has been around since the initial commit. Bad idea that wasn't ever
thought through. Something about remapping guest virtual and host
virtual memory which will never be a thing.
Currently RegisterClassType and FenceType are passed into logs, which
fmt 10.0.0 is more strict about. Adds the formatters that were missing
so that compilation can succeed without needing to change all log sites.
We can handle this in the dispatcher itself, so that we don't need to pass along
the register size as a member of the opcode. This gets rid of some unnecessary duplication
of functionality in the backends and makes it so potential backends don't need to deal
with this.
Previously, the bits that we support in the MXCSR weren't being saved,
which means that some opcode patterns may fail to restore the rounding mode
properly.
e.g. FXSAVE, followed by FNINIT, followed by FXRSTOR wouldn't restore the
rounding mode properly
This fixes that.
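The MXCSR rounding control (RC) field lives in bits 13-14 (per the Intel SDM); if those bits aren't captured on save, a later restore can't bring the rounding mode back. A sketch of the save/restore arithmetic (helper names hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// MXCSR rounding control (RC) field: bits 13-14 per the Intel SDM.
constexpr uint32_t MXCSR_RC_SHIFT = 13;
constexpr uint32_t MXCSR_RC_MASK  = 0b11u << MXCSR_RC_SHIFT;

uint32_t GetRoundingMode(uint32_t mxcsr) {
  return (mxcsr & MXCSR_RC_MASK) >> MXCSR_RC_SHIFT;
}

// Restore only the RC bits into an existing MXCSR value.
uint32_t SetRoundingMode(uint32_t mxcsr, uint32_t rc) {
  return (mxcsr & ~MXCSR_RC_MASK) | ((rc & 0b11u) << MXCSR_RC_SHIFT);
}
```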
FEX's current implementation of RIP reconstruction is limited to the
entrypoint that a single block has. This will cause the RIP to be
incorrect past the first instruction in that block.
While this is fine for a decent number of games, especially since fault
handling isn't super common, it doesn't work for all situations.
When testing Ultimate Chicken Horse, we found out that changing the
block size to 1 worked around an early crash in the game's startup.
This game is likely relying on Mono/Unity's AOT compilation step, which
does some more robust faulting than the runtime JIT, needing the RIP to
be correct since it does some sort of checking for where the code came
from.
This fixes Ultimate Chicken Horse specifically, but will likely fix
other games that are built the same way.
When executing a 32-bit application we were failing to allocate a single
GPR pair. This meant we only had 7 pairs when we could have had 8.
This was because r30 was ending up in the middle of the allocation
arrays so we couldn't safely create a sequential pair of registers.
Organize the register allocation arrays to be unique for each bitness
being executed and then access them through spans instead.
Also works around a bug where the RA validation doesn't understand when pair
indexes don't correlate directly to GPR indexes. So while the previous
PR fixed the RA pass, it didn't fix the RA validation pass.
Noticed this when the pr57018 32-bit gcc test was run with the #2700 PR
which improved the RA allocation a bit.
When FEX was updated to reclaim 64-bit registers in #2494, I had
mistakenly messed up pair register class conflicts.
The problem is that FEX has r30 stuck in the middle of the RA which
causes the paired registers to need to offset their index half way.
This meant that the conflict index was incorrect, so pair allocation
was broken on 32-bit applications ever since that PR.
Keep the intersection indexes in their own array so they can be
correctly indexed at runtime.
Thanks to @asahilina for finding out that Osmos started crashing a few
months ago; I finally just got around to bisecting what the problem
was.
This now fixes Osmos from crashing, although the motes are still
invisible in the 32-bit application. Not sure what other havoc this has
been causing.
So, uh, this was a little silly to track down. Having the upper limit
as unsigned was a mistake, since this would cause negative valid lengths
to convert into an unsigned value within the first two flag comparison
cases.
A -1 valid length can occur if one of the strings starts with a null character
in a vector's first element. (It will be zero and we then subtract it to
make the length zero-based).
Fixes this edge-case up and expands a test to check for this in the future.
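The bug class, reduced to a minimal sketch (names hypothetical): comparing a possibly negative signed length against an unsigned limit promotes -1 to a huge unsigned value, so every index looks valid.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape: valid_length == -1 wraps to 0xFFFFFFFF when the limit is
// held as unsigned, so every index appears in range.
bool IndexValidBuggy(int32_t index, int32_t valid_length) {
  uint32_t limit = static_cast<uint32_t>(valid_length);  // -1 -> 0xFFFFFFFF
  return static_cast<uint32_t>(index) < limit;
}

// Fixed shape: a signed compare rejects everything for a -1 length.
bool IndexValidFixed(int32_t index, int32_t valid_length) {
  return index < valid_length;
}
```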
Allows us to generate a header at compile time for OS specific features.
Should fix compiling on Android since they have a different function
declaration for `malloc_usable_size` compared to Linux.
We spent a bit of effort removing 8 bits from this header to get it
down to three bytes. This ended up in PRs #2319 and #2320.
There was no explicit need to go down to three bytes; the other two
arguments we were removing were just better served as lookups instead
of adding IR overhead for each operation.
This introduced alignment issues that were brought up in #2472.
Apparently the Android NDK's clang will pad nested structs like this,
maybe to match alignment? Regardless, we should just make it 32-bit.
This fixes Android execution of FEXCore.
This fixes #2472.
Pros:
- Initialization now turns into a single str because it's 32-bit
- We have 8 more bits of space that we can abuse in the IR op now
- If we need more bits, 64-bit and 128-bit are easy bumps in the
  future
Cons:
- Each IR operation takes at minimum 25% more space in the intrusive
allocators
- Not really that big of a deal since we are talking 3 bytes versus
4.
FEXCore has no need to understand how to load these layers, which
requires JSON parsing.
Move these to the frontend which is already doing the configuration
layer setup and initialization tasks anyway.
Means FEXCore itself no longer needs to link to tiny-json which can be
left to the frontend.
Regular LoadStoreTSO operations have gained support for LRCPC and LRCPC2
which changes the semantics of the operation by letting it support
immediate offsets.
The paranoid version of these operations didn't support the immediate
offsets yet which was causing incorrect memory loadstores.
Bring over the new semantics from the regular LoadStoreTSO but without
any nop padding.
`eor <reg>, <reg>, #-1` can't be encoded as an instruction. Instead use
mvn which does the same thing.
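The rewrite relies on the bitwise identity behind it: XOR with all-ones is bitwise NOT, which is exactly what AArch64 MVN computes. A quick check:

```cpp
#include <cassert>
#include <cstdint>

// `eor dst, src, #-1` (not encodable as an AArch64 immediate) is
// equivalent to `mvn dst, src`: XOR with all-ones is bitwise NOT.
uint64_t EorAllOnes(uint64_t x) { return x ^ ~0ull; }
uint64_t Mvn(uint64_t x)        { return ~x; }
```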
Removes a single instruction from each OF calculation for ADC and ADD.
Also no reason to use a switch statement for the source size, just use
_Bfe and calculate the offset based on operation size.
SBB caught in the crossfire to ensure it also isn't using a switch
statement.
This is part of FEXCore since it pulls in InternalThreadData, but is
related to the FHU signal mutex class.
Necessary to allow deferring signals in C++ code rather than right in
the JIT.
When a signal handler is not installed and is a terminal failure, make
sure to save telemetry before faulting.
We know when an application is going down in this case so we can make
sure to have the telemetry data saved.
Adds a telemetry signal mask data point as well to know which signal
took it down.
These two extensions rely on AVX being supported to be used, primarily
because they are VEX encoded.
GTA5 is using these flags to determine if it should enable its AVX
support.
Some code in FEX's Arm64 emitter was making an assumption that once
SpillStaticRegs was called that it was safe to still use the SRA
register state.
This wasn't actually true since FEX was using one SRA register to
optimize FPR stores, assuming that the SRA registers were safe to use
since they were just saved and no longer necessary.
Correct this assumption hell by forcing users of the function to provide
the temporary register directly. In all cases the users have a temporary
available that it can use.
Probably fixes some very weird edge case bugs.
This returns the `XFEATURE_ENABLED_MASK` register which reports what
features are enabled on the CPU.
This behaves similarly to CPUID where it uses an index register in ecx.
This is a prerequisite to enabling XSAVE/XRSTOR and AVX since
applications will expect this to exist.
xsetbv is a privileged instruction and doesn't need to be implemented.
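A minimal sketch of how xgetbv emulation can look: like CPUID, ecx selects the register; index 0 returns XCR0 (`XFEATURE_ENABLED_MASK`) split across edx:eax. Bit positions follow the Intel SDM; the function and the chosen feature set are illustrative, not FEX's actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// XCR0 feature bits (Intel SDM): bit 0 = x87, bit 1 = SSE, bit 2 = AVX.
constexpr uint64_t XCR0_X87 = 1ull << 0;
constexpr uint64_t XCR0_SSE = 1ull << 1;
constexpr uint64_t XCR0_AVX = 1ull << 2;

// Hypothetical emulation: ecx selects the XCR; the result goes in edx:eax.
bool EmulateXGetBV(uint32_t ecx, uint32_t& eax, uint32_t& edx) {
  if (ecx != 0) return false;  // Only XCR0 is handled here.
  uint64_t xcr0 = XCR0_X87 | XCR0_SSE;  // AVX not yet exposed.
  eax = static_cast<uint32_t>(xcr0);
  edx = static_cast<uint32_t>(xcr0 >> 32);
  return true;
}
```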
I forgot that x11 was part of the custom ABI of thunks. #2672 had broken
thunks on ARM64. I thought I had tested a game with them enabled but
apparently I tested the wrong game.
Not a full revert since we can still ldr with a literal, but we also
still need to adr x11 and nop pad. At least removes the data dependency
on x11 from the ldr.
Currently WINE's longjump doesn't work, so instead set a flag so that
if HLT is attempted, we just exit the JIT.
This will get our unittests executing at least.
InferFromOS doesn't work under WINE.
InferFromIDRegisters doesn't work under Windows but it will under Wine.
Since we don't support Windows, just use InferFromIDRegisters.
No need to use adr for getting the PC relative literal, we can use LDR
(literal) to load the PC relative address directly.
Reduces trampoline instructions from 3 to 2, and also reduces
trampoline size from 24 bytes to 16 bytes.
Wine syscalls need to end the code block at the point of the syscall.
This is because syscalls may update RIP which means the JIT loop needs
to immediately restart.
Additionally since they can update CPU state, make wine syscalls not
return a result and instead refill the register state from the CPU
state. This will mean the syscall handler will need to update their
result register (RAX?) before returning.
Disabling SRA has been broken for quite a while. Disabling this was
instrumental in figuring out the VC redistributable crash.
Ensure it works by reintroducing non-SRA load/store register handlers,
and by supporting runtime selectable dispatch pointers for the JIT.
Side-bonus, moves the {LOAD,STORE}MEMTSO ops over to this dispatch as
well to make it consistent and probably slightly quicker.
From https://github.com/AsahiLinux/linux/commits/bits/220-tso
This fails gracefully in the case the upstream kernel doesn't support
this feature, so can go in early.
This feature allows FEX to use hardware's TSO emulation capability to
reduce emulation overhead from our atomic/lrcpc implementation.
In the case that the TSO emulation feature is enabled in FEX, we will
check if the hardware supports this feature and then enable it.
If the hardware feature is supported it will then use regular memory
accesses with the expectation that these are x86-TSO in strength.
The only hardware that anyone cares about that supports this is Apple's
M class SoCs. Theoretically NVIDIA Denver/Carmel supports sequential
consistency, which isn't quite the same thing. I haven't cared to check
if multithreaded SC has as strong of guarantees. But also since
Carmel/Denver hardware is fairly rare, it's hard to care about for our
use case.
This can be done in an OS-agnostic fashion. FEXCore knows the details
of its JIT, so this should be done in FEXCore itself.
The frontend is only necessary to inform FEXCore where the fault occurred
and provide the array of GPRs for accessing and modifying the signal
state.
This is necessary for supporting both Linux and Wine signal contexts
with their unaligned access handlers.
We don't have a sane way to query cpu index under wine. We could
technically still use the syscall since we know that we are still
executing under Linux, but that seems a bit terrible.
Disable for now until something can be worked out. Not like it is used
heavily anyway.
This will be used with the TestHarnessRunner in the future to map
specific memory regions.
This is only used as a hint rather than exact placement with failure on
inability to map. This also hits the fun quirk of 64k allocation
granularity which developers need to be careful about.
Related to #2659 but not necessary directly.
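The granularity quirk, sketched: Windows places mappings on 64 KiB allocation granularity (per Windows documentation), so a placement hint needs aligning down to that boundary. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Windows allocation granularity is 64 KiB; mapping hints must be
// aligned to it or placement behaves unexpectedly.
constexpr uint64_t AllocationGranularity = 64 * 1024;

uint64_t AlignHintDown(uint64_t hint) {
  return hint & ~(AllocationGranularity - 1);
}
```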
Currently x30(LR) is unused in our RA. In all locations that call out to
code, we are already preserving LR and bringing it back after the fact.
This was just a missed opportunity since we aren't doing any call-ret
stack manipulations that would facilitate LR needing to stick around.
Since x18 is a reserved platform register on win32, we can replace its
usage with r19, and then replace r19 usage with x30 and everything just
works happily. Now x18 is the unused register instead of x30 and we can
come back in the future to gain one more register for RA on Linux
platforms.
All code paths to this are already guaranteed to own the lock.
The rest of the codepaths haven't been vetted to determine whether they
actually need a recursive_mutex, but it seems likely that this can be
converted to a regular mutex with some more work.
All variants of the PCMPXSTRX instructions will take their arguments in
the same manner, so we don't need to specify them for each handler.
We can also rename the function to PCMPXSTRXOpImpl, since this will
be extended to handle the masking variants of the string instructions.
This is a very OS specific operation and it living in FEXCore doesn't
make much sense. This still requires some strong collaboration between
FEXCore and the frontend but it is now split between the locations.
There's still a bit more cleanup work that can be done after this is
merged, but we need to get this burning fire out of the way.
This is necessary for llvm-mingw, this requires all previous PRs to be
merged first.
After this is merged, most of the llvm-mingw work is complete, just some
minor cleanups.
To be merged first:
- #2602
- #2604
- #2605
- #2607
- #2610
- #2615
- #2619
- #2621
- #2622
- #2624
- #2625
- #2626
- #2627
- #2628
- #2629
We can reuse the same helper we have for handling VMASKMOVPD and VMASKMOVPS,
though we need to move some handling around to account for the fact that
VPMASKMOVD and VPMASKMOVQ 'hijack' the REX.W bit to signify the element
size of the operation.
This was only used for the unit test fuzzing framework, which has been
removed and was unused for pretty much its entire lifespan.
These can now be internal only.
Adds in the handling of destination type size differences with AVX.
Also fixes cases where the SSE operations would load 128-bit vectors
from memory, rather than only loading 64-bit vectors with VCVTPS2PD.
In order to implement the SSE4.2 string instructions in a reasonable
manner, we can make use of a fallback implementation for the time
being.
This implementation just returns the intermediate result and leaves it
up to the function making use of it to derive the final result from said
intermediate result. This is fine, considering we have the immediate
control byte that tells us exactly what is desired as far as output
formats go.
Given that the result of this IR op will never take up more than
16 bits, we store the flags we need to set in the upper 16 bits of the
result to avoid needing to implement multiple return values in the JIT.
Also, since the IR op just returns the intermediate result, this can be
used to implement all of the explicit string instructions with a single IR op.
The implementation is pretty heavily documented to help make heads or
tails of these monster instructions.
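The packing scheme described above can be sketched as follows: intermediate result in the low 16 bits, flags in the upper 16, so a single IR result carries both. The exact field layout here is illustrative, not FEX's actual encoding.

```cpp
#include <cassert>
#include <cstdint>

// Pack the 16-bit intermediate result and the flags into one 32-bit
// value, avoiding multiple return values in the JIT.
uint32_t PackResult(uint16_t intermediate, uint16_t flags) {
  return (static_cast<uint32_t>(flags) << 16) | intermediate;
}
uint16_t UnpackIntermediate(uint32_t packed) { return packed & 0xFFFF; }
uint16_t UnpackFlags(uint32_t packed)        { return packed >> 16; }
```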
This will use the correct `__cpuid` define, either in cpuid.h or
self-defined depending on environment.
Otherwise we would need to define our own cpuid helpers to match the
difference between mingw and linux.
This lets all the path generation for the config to be in the frontend.
This then informs FEXCore where things should live.
This is for llvm-mingw. While paths aren't quite generated correctly,
this gets the code closer to compiling.
This is not an attempt to clean up the various issues with the pthread
logic; it just moves the pthread-specific logic out of FEXCore into
FEXLoader.
FEXCore needs to know how to create threads in an agnostic way, which
is why we obfuscate the details with this interface.
Initially this was implemented with the pthread handlers in FEXCore and
expected eventually for those to get moved to the frontend. This is the
time when it has been moved.
This is the first step towards compiling with llvm-mingw.
Still a long way to go.
We still need to hook glibc for thunks to work with
`IsHostHeapAllocation`.
So now we link in two jemalloc allocators in different namespaces.
As usual we have multiple heap allocators that we need to be careful about.
1. jemalloc with `je_` namespace.
- This is FEX's regular heap allocator and what gets used for all the
fextl objects.
- This allocator is the one that the FEX mmap/munmap hooks hook in to
- This mmap hooking gives this allocator the full 48-bit VA even in
32-bit space.
2. jemalloc with `glibc_je_` namespace.
- This is the allocator that overrides the glibc allocator routines
- This is the allocator that thunks will use.
- This is what `IsHostHeapAllocation` will check for.
3. Guest glibc allocator
- We don't touch this one. But it is distinct from the host side
allocators.
- The guest side of thunks will use this heap allocator.
4. Host glibc allocator
- #2 replaces this one unless explicitly disabled.
- Always expected to override the allocator, so this configuration
isn't expected.
Already tested this with Dota Underlords to ensure this works with
thunks.
The dispatcher was saving AVX state even though FEX doesn't support it
currently. This is due to it checking for the config option rather than
the HostFeatures option.
The `EnableAVX` config option is supposed to be used to inform FEXCore
if we want AVX disabled or not when the host supports the feature. In
this case it is universally enabled because we haven't encountered any
games that have issues with AVX state being saved with signals. (We know
they exist, we just don't have configurations for them).
The HostFeatures option `SupportsAVX` is the option that is supposed to
be getting used for determining if the runtime AVX feature is enabled.
This also had an issue though that this was **also** always enabled if
running on an x86 host with AVX, or an ARM host with SVE2-256bit.
It was then disabled if the config option was disabled; but since
FEX-Emu doesn't support AVX fully yet, we need to ensure it isn't
enabled.
But this only solves half the problem. In order for our CI to test AVX
features before fully supporting AVX, it needs to be able to enable AVX
so that the CPU state is correctly saved.
So we need to change the default configuration option to be false, and
have CI enable it for the tests that matter before AVX is fully
implemented.
Every time we are calling a function in `FEXCore::Allocator::` this is a
pointer indirection. Which means on x86 it is always a `call [rdi]` and
on AArch64 it is a `ldr x17, [x0]; blr x17;`.
Instead of doing this, use inline functions in the header that call the
correct allocation function directly. This function gets inlined and is
no longer an indirect call.
When compiling with jemalloc, we forward declare the jemalloc function
definitions so we don't have to pull in the entire jemalloc interface in
to the public header definitions.
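The shape of the change, sketched (names hypothetical, not FEX's actual allocator API): an indirect call through a function pointer replaced by an inline header function that calls its target directly and can be inlined.

```cpp
#include <cassert>
#include <cstdlib>

// Before: every call site pays a pointer indirection
// (`call [rdi]` on x86, `ldr x17, [x0]; blr x17` on AArch64).
namespace Indirect {
  static void* (*malloc_ptr)(std::size_t) = ::malloc;
  inline void* Malloc(std::size_t size) { return malloc_ptr(size); }
}

// After: the inline function calls the target directly and the
// compiler can inline it, eliminating the indirect call.
namespace Direct {
  inline void* Malloc(std::size_t size) { return ::malloc(size); }
}
```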
`std::stoul` and `std::stoull` take a std::string, which meant
converting the string_view to a std::string first. glibc fault testing
caught this since not much uses this path.
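For the record, `std::from_chars` parses directly from a character range, so a `std::string_view` never needs converting to `std::string`; a sketch of the allocation-free pattern:

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>

// Parse an unsigned integer from a string_view without allocating,
// unlike std::stoul/std::stoull which require a std::string.
uint64_t ParseU64(std::string_view sv, uint64_t fallback = 0) {
  uint64_t value = 0;
  auto result = std::from_chars(sv.data(), sv.data() + sv.size(), value);
  return result.ec == std::errc{} ? value : fallback;
}
```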
These will be added to the documentation.
fwrite allocates some backing memory for buffering outputs which FEX
can't track.
Switch to using `fileno` to get the fd from the FILE and write directly.
This will need to be changed for llvm-mingw support but that will come after
this.
This will be added to the documentation that we can't use fwrite.
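The replacement pattern, sketched: take the fd out of the FILE with `fileno` and `write(2)` directly, bypassing stdio's buffer allocation entirely. The wrapper name is hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Write a string through the FILE's underlying fd directly; no stdio
// buffering, so no backing allocation that FEX can't track.
ssize_t WriteUnbuffered(FILE* file, const char* str) {
  return write(fileno(file), str, strlen(str));
}
```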
This is done by consuming a single page at the end of the current sbrk
memory region. Then consuming any remaining bytes that could have
potentially ended up in it.
This ensures that glibc won't be able to return 64-bit pointers to
32-bit thunks once the remaining work is in place.
The LOAD_LIB and EXPORTS macros behave slightly differently in this regard:
* Use LOAD_LIB(libwayland-client) in Guest.cpp (library name with dash)
* Use EXPORTS(libwayland_client) in Host.cpp (library name with underscore)
The previous log in the frontend was super useful when an instruction
decoding wasn't supported.
Now that most of AVX is covered, a game will crash on SIGILL (and
usually catch it) and close without any indication.
Now if the instruction is decoded but it is invalid for the
configuration, still output a message as a good indicator that the game
is using instructions that the host doesn't support.
Will let us still pick up on games crashing due to lack of SVE very
easily.
This is an undocumented but supported instruction. It behaves just like
an `sbb al, al` but doesn't set flags and is one byte shorter.
The end result is that al is set to 0xFF or 0 depending on if CF is set
or not.
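SALC's behavior reduces to a one-liner: AL becomes 0xFF if CF is set, 0 otherwise, with flags left untouched (unlike `sbb al, al`, which sets flags).

```cpp
#include <cassert>
#include <cstdint>

// SALC: AL = CF ? 0xFF : 0x00, without modifying any flags.
uint8_t Salc(bool cf) {
  return cf ? 0xFF : 0x00;
}
```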
And with that, we support all of the AVX1-only instructions.
The remaining instructions for full AVX1 support is now just the SSE4.2
string instructions.
Will be used to implement the load variants of VMASKMOVP{D, S} and
VPMASKMOV{D, Q}
Particularly useful, since with SVE this behavior can be collapsed into
two instructions (CMPGT followed by the relevant LD1 load instruction)
These conditionals were accidentally inverted and were treating 32-bit
elements as 64-bit ones, which was unintended.
Also add missing tests to ensure this doesn't slip through in the
future.