llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-12-02 18:58:15 +00:00

Author	SHA1	Message	Date
Luke Lau	aca7586ac9	[RISCV] Fix M1 shuffle on wrong SrcVec in lowerShuffleViaVRegSplitting This fixes a miscompile from #79072 where we were taking the wrong SrcVec to do the M1 shuffle. E.g. if the SrcVecIdx was 2 and we had 2 VRegsPerSrc, we ended up taking it from V1 instead of V2.	2024-01-31 22:49:19 -08:00
Luke Lau	5605312fc5	[RISCV] Add test to showcase miscompile from #79072	2024-01-31 22:49:19 -08:00
Sander de Smalen	e502141a42	[AArch64][SME] Fix inlining bug introduced in #78703 (#79994 ) Calling a `__arm_locally_streaming` function from a function that is not a streaming-SVE function would lead to incorrect inlining. The issue didn't surface because the tests were not testing what they were supposed to test. (cherry picked from commit 3abf55a68caefd45042c27b73a658c638afbbb8b)	2024-01-31 22:29:42 -08:00
Sander de Smalen	27139bceb2	[SME] Stop RA from coalescing COPY instructions that transcend beyond smstart/smstop. (#78294 ) This patch introduces a 'COALESCER_BARRIER' which is a pseudo node that expands to a 'nop', but which stops the register allocator from coalescing a COPY node when its use/def crosses a SMSTART or SMSTOP instruction. For example: %0:fpr64 = COPY killed $d0 undef %2.dsub:zpr = COPY %0 // <- Do not coalesce this COPY ADJCALLSTACKDOWN 0, 0 MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0 $d0 = COPY killed %0 BL @use_f64, csr_aarch64_aapcs If the COPY were coalesced, that would lead to: $d0 = COPY killed %0 being replaced by: $d0 = COPY killed %2.dsub which means the whole ZPR reg would be live up to the call, causing the MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register: str q0, [sp] // 16-byte Folded Spill smstop sm ldr z0, [sp] // 16-byte Folded Reload bl use_f64 which would be incorrect for two reasons: 1. The program may load more data than it has allocated. 2. If there are other SVE objects on the stack, the compiler might use the 'mul vl' addressing modes to access the spill location. By disabling the coalescing, we get the desired results: str d0, [sp, #8] // 8-byte Folded Spill smstop sm ldr d0, [sp, #8] // 8-byte Folded Reload bl use_f64 (cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)	2024-01-31 15:41:15 +00:00
Sam James	400a02bb28	[sanitizer] Handle Gentoo's libstdc++ path On Gentoo, libc++ is indeed in /usr/include/c++/, but libstdc++ is at e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14. Use '/include/g++' as it should be unique enough. Note that the omission of a trailing slash is intentional to match g++-. See https://github.com/llvm/llvm-project/pull/78534#issuecomment-1904145839. Reviewed by: mgorny Closes: https://github.com/llvm/llvm-project/pull/79264 Signed-off-by: Sam James <sam@gentoo.org> (cherry picked from commit e8f882f83acf30d9b4da8846bd26314139660430)	2024-01-30 15:47:38 -08:00
Alex Bradbury	fe2fca3b8e	Backport [RISCV] Graduate Zicond to non-experimental (#79811 ) (#80018 ) The Zicond extension was ratified in the last few months, with no changes that affect the LLVM implementation. Although there's surely more tuning that could be done about when to select Zicond or not, there are no known correctness issues. Therefore, we should mark support as non-experimental. (cherry-picked from commit d833b9d677c9dd0a35a211e2fdfada21ea9a464b)	2024-01-30 15:31:38 -08:00
Yi Wu	a2d4a4c0b2	Apply kind code check on exitstat and cmdstat (#78286 ) When testing on gcc, both exitstat and cmdstat must be a kind=4 integer, e.g. DefaultInt. This patch changes the input arg requirement from `AnyInt` to `TypePattern{IntType, KindCode::greaterOrEqualToKind, n}`. The standard states in 16.9.73 - EXITSTAT (optional) shall be a scalar of type integer with a decimal exponent range of at least nine. - CMDSTAT (optional) shall be a scalar of type integer with a decimal exponent range of at least four. ```fortran program bug implicit none integer(kind = 2) :: exitstatvar integer(kind = 4) :: cmdstatvar character(len=256) :: msg character(len=:), allocatable :: command command='echo hello' call execute_command_line(command, exitstat=exitstatvar, cmdstat=cmdstatvar) end program ``` When testing the above program with exitstatvar kind<4, an error would occur: ``` $ ../build-release/bin/flang-new test.f90 error: Semantic errors in test.f90 ./test.f90:8:47: error: Actual argument for 'exitstat=' has bad type or kind 'INTEGER(2)' call execute_command_line(command, exitstat=exitstatvar) ``` When testing the above program with cmdstatvar kind<2, an error would occur: ``` $ ../build-release/bin/flang-new test.f90 error: Semantic errors in test.f90 ./test.f90:8:47: error: Actual argument for 'cmdstat=' has bad type or kind 'INTEGER(1)' call execute_command_line(command, cmdstat=cmdstatvar) ``` Test file for this semantics has been added to `flang/test/Semantic` Fixes: https://github.com/llvm/llvm-project/issues/77990 (cherry picked from commit 14a15103cc9dbdb3e95c04627e0b96b5e3aa4944)	2024-01-30 16:45:35 +00:00
David Green	bab01aead7	Revert "[AArch64] merge index address with large offset into base address" This reverts commit `32878c2065` due to #79756 and #76202. (cherry picked from commit 915c3d9e5a2d1314afe64cd6116a3b6c9809ec90)	2024-01-29 15:17:53 -08:00
Andrei Golubev	0680e84a3f	[mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() (#79582 ) Fold expressions on Clang are limited to 256 elements. This causes compilation errors in cases when the amount of elements added exceeds this limit. Side-step the issue by restoring the original trick that would use the std::initializer_list. For the record, in our downstream Clang 16 gives: mlir/include/mlir/IR/Dialect.h:269:23: fatal error: instantiating fold expression with 688 arguments exceeded expression nesting limit of 256 (addType<Args>(), ...); Partially reverts `26d811b3ec`. Co-authored-by: Nikita Kudriavtsev <nikita.kudriavtsev@intel.com> (cherry picked from commit e3a38a75ddc6ff00301ec19a0e2488d00f2cc297)	2024-01-29 15:10:47 -08:00
David Sherwood	bdaf16d59f	[LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034 ) When we generate runtime memory checks for an inner loop it's possible that these checks are invariant in the outer loop and so will get hoisted out. In such cases, the effective cost of the checks should reduce to reflect the outer loop trip count. This fixes a 25% performance regression introduced by commit `49b0e6dcc2` when building the SPEC2017 x264 benchmark with PGO, where we decided the inner loop trip count wasn't high enough to warrant the (incorrect) high cost of the runtime checks. Also, when runtime memory checks consist entirely of diff checks these are likely to be outer loop invariant. (cherry picked from commit 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c)	2024-01-29 15:05:18 -08:00
Jay Foad	824a3e5dec	[AMDGPU] Do not bother adding reserved registers to liveins (#79436 ) Tweak the implementation of llvm.amdgcn.wave.id to not add TTMP8 to the function liveins.	2024-01-29 15:00:47 -08:00
Jay Foad	4c8cf4a1c2	[AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325 ) This is only valid on targets with architected SGPRs.	2024-01-29 15:00:47 -08:00
Alexander Kornienko	b73cd5ec71	Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768 )" This reverts commit `924701311a`. Causes compilation errors on valid code, see https://github.com/llvm/llvm-project/pull/77768#issuecomment-1908062472. (cherry picked from commit 6e4930c67508a90bdfd756f6e45417b5253cd741)	2024-01-29 14:57:23 -08:00
Andrei Golubev	3df71e5a3f	[mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562 ) GEPArg can only be constructed from int32_t and mlir::Value. Explicitly cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing conversion warnings on MSVC. Some recent examples of such are: ``` mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'size_t' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'unsigned int' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] ``` Co-authored-by: Nikita Kudriavtsev <nikita.kudriavtsev@intel.com> (cherry picked from commit 89cd345667a5f8f4c37c621fd8abe8d84e85c050)	2024-01-29 10:29:59 -08:00
Michał Górny	2c3214135f	[llvm] [cmake] Include httplib in LLVMConfig.cmake (#79305 ) Include LLVM_ENABLE_HTTPLIB along with httplib package finding in LLVMConfig.cmake, as this dependency is needed by LLVMDebuginfod that is now used by LLDB. Without it, building LLDB standalone fails with: ``` CMake Error at /usr/lib/llvm/19/lib64/cmake/llvm/LLVMExports.cmake:90 (set_target_properties): The link interface of target "LLVMDebuginfod" contains: httplib::httplib but the target was not found. Possible reasons include: * There is a typo in the target name. * A find_package call is missing for an IMPORTED target. * An ALIAS target is missing. Call Stack (most recent call first): /usr/lib/llvm/19/lib64/cmake/llvm/LLVMConfig.cmake:357 (include) cmake/modules/LLDBStandalone.cmake:9 (find_package) CMakeLists.txt:34 (include) ``` (cherry picked from commit 3c9f34c12450345c6eb524e47cf79664271e4260)	2024-01-29 10:25:40 -08:00
Tom Stellard	b79e6a3c8a	[workflows] Fix argument passing in abi-dump jobs (#79658 ) (#79836 ) This was broken by `859e6aa100`, which added quotes around the EXTRA_ARGS variable.	2024-01-29 10:15:50 -08:00
Jay Foad	27654471cc	[AMDGPU] Move architected SGPR implementation into isel (#79120 ) (cherry picked from commit 70fc9703788e8965813c5b677a85cb84b66671b6)	2024-01-27 23:18:25 -08:00
Tom Stellard	ddbdd7b267	workflows: Merge LLVM tests together into a single job (#78877 ) (#79710 ) This is possible now that the free GitHub runners for Windows and Linux have more disk space: https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/ I also had to switch from macOS-11 to macOS-13 in order to prevent the job from timing out. macOS-13 runners have 4 vCPUs and the macOS-11 runners only have 3.	2024-01-27 23:14:59 -08:00
Paschalis Mpeis	fa0a72b584	[LTO] Fix Veclib flags correctly pass to LTO flags (#78749 ) Flags `-fveclib=name` were not passed to LTO flags. This patch fixes that by converting the `-fveclib` flags to the corresponding names for opt's `-vector-library=name` flag. For example, `-fveclib=SLEEF` becomes `-vector-library=sleefgnuabi` and is passed through the `-plugin-opt` flag. (cherry picked from commit 03cf0e9354e7e56ff794e9efb682ed2971bc91ec)	2024-01-27 15:52:23 -08:00
erichkeane	16bfe1e89f	Fix comparison of Structural Values Fixes a regression from #78041 as reported in the review. The original patch failed to compare the canonical type, which this adds. A slightly modified test of the original report is added. (cherry picked from commit e3ee3762304aa81e4a240500844bfdd003401b36)	2024-01-27 10:31:53 -08:00
Adhemerval Zanella	2cf04c020f	[X86] Do not end 'note.gnu.property' section with -fcf-protection (#79360 ) The glibc now adds the required minimum ISA level for libc-nonshared.a (linked on all programs) and this is done with an inline asm along with .note.gnu.property and .pushsection/.popsection. However, the x86 backend always ends the 'note.gnu.property' section when building with -fcf-protection, leading to assert failure: llvm/llvm-project-git/llvm/lib/MC/MCStreamer.cpp:1251: virtual void llvm::MCStreamer::switchSection(llvm::MCSection, const llvm::MCExpr): Assertion `!Section->hasEnded() && "Section already ended"' failed. [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/isa-level.c;h=3f1b269848a52f994275bab6f60dded3ded6b144;hb=HEAD (cherry picked from commit a58c62fa824fd24d20fa2366e0ec8f241cb321fe)	2024-01-27 10:19:11 -08:00
Sean Fertile	15aeb35c53	[LTO] Fix fat-lto output for -c -emit-llvm. (#79404 ) Fix and add a test case for combining '-ffat-lto-objects -c -emit-llvm' options and fix a spelling mistake in same test. (cherry picked from commit f1b1611148fa533fe198fec3fa4ef8139224dc80)	2024-01-27 06:51:08 -08:00
Mariusz Sikora	2d759eff89	[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (#78414 ) …bf8 instructions Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16 instructions that were supported on GFX940 (MI300): - V_CVT_F32_FP8 - V_CVT_F32_BF8 - V_CVT_PK_F32_FP8 - V_CVT_PK_F32_BF8 - V_CVT_PK_FP8_F32 - V_CVT_PK_BF8_F32 - V_CVT_SR_FP8_F32 - V_CVT_SR_BF8_F32 --------- Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com> Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com> (cherry picked from commit cfddb59be2124f7ec615f48a2d0395c6fdb1bb56)	2024-01-27 06:45:32 -08:00
Fangrui Song	e2521eaa1a	[ELF] Implement R_RISCV_TLSDESC for RISC-V Support R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL. LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20 location, which requires special handling. We save the value of HI20 to be reused. Two interleaved TLSDESC code sequences, which compilers do not generate, are unsupported. For -no-pie/-pie links, TLSDESC to initial-exec or local-exec optimizations are eligible. Implement the relevant hooks (R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions are converted to NOP while the latter two are converted to a GOT load or a lui+addi. The first two instructions, which would be converted to NOP, are removed instead in the presence of relaxation. Relaxation is eligible as long as the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX, regardless of whether the following instructions have a R_RISCV_RELAX. In addition, for the TLSDESC to LE optimization (`lui a0,<hi20>; addi a0,a0,<lo12>`), `lui` can be removed (i.e. use the short form) if hi20 is 0. ``` // TLSDESC to LE/IE optimization .Ltlsdesc_hi2: auipc a4, %tlsdesc_hi(c) # if relax: remove; otherwise, NOP load a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP addi a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2) # if LE && !hi20 {if relax: remove; otherwise, NOP} jalr t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2) add a0, a0, tp ``` The implementation carefully ensures that an instruction unrelated to the current TLSDESC code sequence, if immediately follows a removable instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not converted to NOP. * `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582). * `riscv64-tlsdesc-relax.s` tests linker relaxation. * `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900). 
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373 Reviewed By: ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/79239 (cherry picked from commit 1117fdd7c16873eb389e988c6a39ad922bae0fd0)	2024-01-26 21:34:49 -08:00
Fangrui Song	3d02473ac5	[ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC (cherry picked from commit 849951f8759171cb6c74d3ccbcf154506fc1f0ae)	2024-01-26 21:34:49 -08:00
Fangrui Song	e9d99e5183	[ELF] Clean up R_RISCV_RELAX code. NFC (cherry picked from commit ccb99f221422b8de5e1ae04d3427f15878f7cd93)	2024-01-26 21:34:48 -08:00
Wang Pengcheng	d9e26c223b	[RISCV] Use TableGen-based macro fusion (#72224 ) We convert the existing macro fusions to TableGen. Because `Fusion` depends on `Instruction` definitions, which are defined below `RISCVFeatures.td`, we recommend that users add fusion features when defining a new processor. (cherry picked from commit 3fdb431b636975f2062b1931158aa4dfce6a3ff1)	2024-01-26 21:28:01 -08:00
Wang Pengcheng	7cfa0c1b7c	[TableGen] Add predicates for immediates comparison (#76004 ) These predicates can be used to represent `<`, `<=`, `>`, `>=`. And a predicate for `in range` is added. (cherry picked from commit 664a0faac464708fc061d12e5cd492fcbfea979a)	2024-01-26 21:28:00 -08:00
Wang Pengcheng	62877e375c	[TableGen] Use MapVector to remove non-determinism This fixes found non-determinism when `LLVM_REVERSE_ITERATION` option is `ON`. Fixes #79420. Reviewers: ilovepi, MaskRay Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/79411 (cherry picked from commit 41fe98a6e7e5cdcab4a4e9e0d09339231f480c01)	2024-01-26 21:24:37 -08:00
Mirko Brkušanin	ed48280f8e	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795 ) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>	2024-01-26 20:01:08 -08:00
Fangrui Song	aa4cb0e313	[Driver,CodeGen] Support -mtls-dialect= (#79256 ) GCC supports -mtls-dialect= for several architectures to select TLSDESC. This patch supports the following values * x86: "gnu". "gnu2" (TLSDESC) is not supported yet. * RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915) AArch64 toolchains seem to support TLSDESC from the beginning, and the general dynamic model has poor support. Nobody seems to use the option -mtls-dialect= at all, so we don't bother with it. There also seems very little interest in AArch32's TLSDESC support. TLSDESC does not change IR, but affects object file generation. Without a backend option the option is a no-op for in-process ThinLTO. There seems no motivation to have fine-grained control mixing trad/desc for TLS, so we just pass -mllvm, and don't bother with a modules flag metadata or function attribute. Co-authored-by: Paul Kirth <paulkirth@google.com> (cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20)	2024-01-26 19:51:03 -08:00
dyung	147c623a86	Change check for embedded llvm version number to a regex to make test more flexible. (#79528 ) (#79642 ) This test started to fail when LLVM created the release/18.x branch and the main branch subsequently had the version number increased from 18 to 19. I investigated this failure (it was blocking our internal automation) and discovered that the CHECK statement on line 27 seemed to have the compiler version number (1800) encoded in octal that it was checking for. I don't know if this is something that explicitly needs to be checked, so I am leaving it in, but it should be more flexible so the test doesn't fail anytime the version number is changed. To accomplish that, I changed the check for the 4-digit version number to be a regex. I originally updated this test for the 18->19 transition in a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK line more flexible so it doesn't need to be continually updated. (cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874)	2024-01-26 19:27:28 -08:00
Fangrui Song	0991d3c7b5	[ELF] Don't resolve relocations referencing SHN_ABS to tombstone in non-SHF_ALLOC sections (#79238 ) A SHN_ABS symbol has never been considered for InputSection::relocateNonAlloc. Before #74686, the code made it work in the absence of `-z dead-reloc-in-nonalloc=`. There is now a report about such SHN_ABS uses (https://github.com/llvm/llvm-project/pull/74686#issuecomment-1904101711) and I think it makes sense for non-SHF_ALLOC to support SHN_ABS, like SHF_ALLOC sections do. ``` // clang -g __attribute__((weak)) int symbol; int *foo() { return &symbol; } 0x00000023: DW_TAG_variable [2] (0x0000000c) ... DW_AT_location [DW_FORM_exprloc] (DW_OP_addrx 0x0) ``` .debug_addr references `symbol`, which can be redefined by a symbol assignment or --defsym to become a SHN_ABS symbol. The problem is that `!sym.getOutputSection()` cannot discern SHN_ABS from a symbol whose section has been discarded. Since commit `1981b1b6b9`, a symbol relative to a discarded section is changed to `Undefined`, so the `SHN_ABS` check becomes trivial. We currently apply tombstone for a relocation referencing `SharedSymbol`. This patch does not change the behavior. (cherry picked from commit 8abf8d124ae346016c56209de7f57b85671d4367)	2024-01-25 17:27:37 -08:00
Shengchen Kan	453ff4b733	[X86][CodeGen] Fix crash when commute operands of Instruction for code size (#79245 ) Reported in `134fcc6278` Incorrect opcode is used b/c there is a `[[fallthrough]]` at line 2386. (cherry picked from commit 33ecef9812e2c9bfadef035b8e34a949acae2abc)	2024-01-25 17:24:02 -08:00
Weining Lu	c9e73cdd9a	[test] Update dwarf-loongarch-relocs.ll Address buildbot failures: http://45.33.8.238/macm1/77360/step_11.txt http://45.33.8.238/linux/128902/step_12.txt (cherry picked from commit baba7e4175b6ca21e83b1cf8229f29dbba02e979)	2024-01-25 17:14:34 -08:00
Louis Dionne	3173faaff1	[🍒][ci] Fix the base branch we use to determine changes (#79503 ) (#79506 ) We should diff against the base branch, not always against `main`. This allows the BuildKite pre-commit CI to work properly when we target other branches, such as `release/18.x`. (cherry picked from commit 3b762891826192ded07286852d326f9c9060f52e)	2024-01-25 13:57:41 -08:00
Tom Stellard	6abd792a67	[workflows] Fix version-check.yml to work with the new minor release bump (cherry picked from commit d5e69147b9d261bd53b4dd027f17131677be8613)	2024-01-25 13:33:17 -08:00
Tom Stellard	2268346374	Use rc version suffix	2024-01-23 20:27:37 -08:00
Tom Stellard	f0c79331a0	Bump version to 18.1.0	2024-01-23 20:19:45 -08:00
Wang Pengcheng	93248729cf	[RISCV][MC] Split tests for A into Zaamo and Zalrsc parts So that we don't duplicate tests in later patch. Reviewers: topperc, dtcxzyw, asb Reviewed By: asb Pull Request: https://github.com/llvm/llvm-project/pull/79111	2024-01-24 10:49:14 +08:00
Michael Maitland	63f742c15f	[RISCV] Add sifive-p670 processor (#79015 ) This is an OOO core that has a vector unit. For more information see https://www.sifive.com/cores/performance-p650-670. Scheduler model and other tuning will come in separate patches.	2024-01-23 21:45:24 -05:00
paperchalice	7bda0ce15a	[llc] Remove C backend support (#79237 ) C backend is removed in 3.1.	2024-01-24 10:40:11 +08:00
Chuanqi Xu	f0c3870388	[Modules] [HeaderSearch] Don't reenter headers if it is pragma once (#76119 ) Close https://github.com/llvm/llvm-project/issues/73023 The direct issue of https://github.com/llvm/llvm-project/issues/73023 is that we entered a header which is marked as pragma once since the compiler thinks it is OK if there is a controlling macro. It doesn't make sense. I feel like it should be sufficient to skip it after we see the '#pragma once'. From the context, it looks like the workaround is primarily for ObjectiveC. So we might need reviewers from OC.	2024-01-24 10:22:35 +08:00
Nico Weber	ecde13b1a8	[gn build] port `7e50f006f7`	2024-01-23 21:14:36 -05:00
Craig Topper	3dea0aa8f4	[LSR] Fix incorrect comment. NFC (#79207 )	2024-01-23 17:57:34 -08:00
Christudasan Devadasan	230c13d59d	[AMDGPU] Pick available high VGPR for CSR SGPR spilling (#78669 ) CSR SGPR spilling currently uses the earliest available physical VGPRs. This imposes high register pressure when trying to allocate large VGPR tuples within the default register budget. This patch changes the spilling strategy by picking the VGPRs in reverse order, highest available VGPR first, and later, after regalloc, shifting them back to the lowest available range. With that, the initial VGPRs remain available for allocation, and finding a large number of contiguous registers becomes more likely.	2024-01-24 07:08:43 +05:30
paperchalice	7e50f006f7	[NewPM][CodeGen][llc] Add NPM support (#70922 ) Add new pass manager support to `llc`. Users can use `--passes=pass1,pass2...` to run mir passes, and use `--enable-new-pm` to run default codegen pipeline. This patch is taken from [D83612](https://reviews.llvm.org/D83612), the original author is @yuanfang-chen. --------- Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>	2024-01-24 09:27:25 +08:00
Fangrui Song	c663c8b883	[ELF,test] Improve dead-reloc-in-nonalloc.s Test an absolute relocation referencing a DSO symbol, relocating a non-SHF_ALLOC section. Also test --gc-sections.	2024-01-23 17:23:52 -08:00
Jeffrey Byrnes	f709fbb1bb	[SROA] Only try additional vector type candidates when needed (#77678 ) `f9c2a341b9` causes regressions when we have a slice with integer vector type that is the same size as the partition, and a ptr load/store slice that is not the size of the element type. Ref `vector-promotion.ll:ptrLoadStoreTys`. Before the patch, we would only consider `<4 x i32>` as a candidate type for vector promotion, and would find that it is a viable type for all the slices. After the patch, we now add `<2 x ptr>` as a candidate type due to slice with user `store ptr %val0, ptr %obj, align 8` -- and flag that we `HaveVecPtrTy`. The pre-existing behavior of this flag results in removing the viable `<4 x i32>` and keeping only the unviable `<2 x ptr>`, which results in a failure to promote. The end result is failing to promote an alloca that was previously promoted -- this does not appear to be the intent of that patch, which has the goal of increasing promotions by providing more promotion opportunities. This PR preserves this behavior via a simple reorganization of the implementation: try first the slice types with same size as the partition, then, if there is no promotable type, try the `LoadStoreTys.`	2024-01-23 17:22:49 -08:00
Jinyang He	c51ab483e6	[LoongArch] Insert nops and emit align reloc when handle alignment directive (#72962 ) Following RISCV, we fix up the alignment if linker relaxation changes code size and breaks alignment. Insert enough Nops and emit an R_LARCH_ALIGN relocation so that the linker can satisfy the alignment by removing Nops. It does so only in sections with the SHF_EXECINSTR flag. In LoongArch psABI v2.30, R_LARCH_ALIGN requires a symbol index. The lowest 8 bits of the addend represent the alignment and the other bits of the addend represent the maximum number of bytes to emit.	2024-01-24 09:17:49 +08:00
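
The addend layout described in the last entry above (low 8 bits for the alignment, remaining bits for the maximum number of bytes to emit) can be sketched as a small bit-packing helper. This is only an illustration of the layout as worded in that commit message, not LLVM or linker code: the function names are hypothetical, and the alignment is treated as an opaque 8-bit field because the message does not say how it is encoded.

```python
# Hypothetical helpers illustrating the R_LARCH_ALIGN addend layout as
# described in the commit message: low 8 bits = alignment field,
# remaining high bits = maximum number of bytes to emit.
# Assumption: the alignment field is treated as an opaque 8-bit value.

ALIGN_BITS = 8
ALIGN_MASK = (1 << ALIGN_BITS) - 1  # 0xFF


def pack_align_addend(alignment_field: int, max_bytes: int) -> int:
    """Combine the two fields into a single addend value."""
    if not 0 <= alignment_field <= ALIGN_MASK:
        raise ValueError("alignment field must fit in 8 bits")
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")
    return (max_bytes << ALIGN_BITS) | alignment_field


def unpack_align_addend(addend: int) -> tuple[int, int]:
    """Split an addend back into (alignment_field, max_bytes)."""
    return addend & ALIGN_MASK, addend >> ALIGN_BITS


if __name__ == "__main__":
    addend = pack_align_addend(4, 12)
    print(hex(addend))                  # 0xc04
    print(unpack_align_addend(addend))  # (4, 12)
```

Packing and unpacking are exact inverses for any in-range pair, which is the property a consumer of the relocation would rely on when reading the two fields back.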


487439 Commits