RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-03-03 08:07:51 +00:00

Author	SHA1	Message	Date
Matt Arsenault	e14caa73dc	GlobalISel: Use the original flags when lowering fneg to fsub This was ignoring the flag on fneg, and using the source instruction's flags. Also fixes tests missing from r358702. Note the expansion itself isn't correct without nnan, but that should be fixed separately. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363637 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:48:43 +00:00
Peter Collingbourne	644422a4d3	hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag. This saves roughly 32 bytes of instructions per function with stack objects and causes us to preserve enough information that we can recover the original tags of all stack variables. Now that stack tags are deterministic, we no longer need to pass -hwasan-generate-tags-with-calls during check-hwasan. This also means that the new stack tag generation mechanism is exercised by check-hwasan. Differential Revision: https://reviews.llvm.org/D63360 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363636 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:39:51 +00:00
Peter Collingbourne	f159a1182e	hwasan: Add a tag_offset DWARF attribute to instrumented stack variables. The goal is to improve hwasan's error reporting for stack use-after-return by recording enough information to allow the specific variable that was accessed to be identified based on the pointer's tag. Currently we record the PC and lower bits of SP for each stack frame we create (which will eventually be enough to derive the base tag used by the stack frame) but that's not enough to determine the specific tag for each variable, which is the stack frame's base tag XOR a value (the "tag offset") that is unique for each variable in a function. In IR, the tag offset is most naturally represented as part of a location expression on the llvm.dbg.declare instruction. However, the presence of the tag offset in the variable's actual location expression is likely to confuse debuggers which won't know about tag offsets, and moreover the tag offset is not required for a debugger to determine the location of the variable on the stack, so at the DWARF level it is represented as an attribute so that it will be ignored by debuggers that don't know about it. Differential Revision: https://reviews.llvm.org/D63119 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363635 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:39:41 +00:00
Peter Collingbourne	bca91c8cab	gn build: Merge r363626. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363634 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:39:31 +00:00
Amara Emerson	f765312a6f	[GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block. Inter-block localization is the same as what currently happens, except now it only runs on the entry block because that's where the problematic constants with long live ranges come from. The second phase is a new intra-block localization phase which attempts to re-sink the already localized instructions further right before one of the multiple uses. One additional change is to also localize G_GLOBAL_VALUE as they're constants too. However, on some targets like arm64 it takes multiple instructions to materialize the value, so some additional heuristics with a TTI hook have been introduced attempt to prevent code size regressions when localizing these. Overall, these changes improve CTMark code size on arm64 by 1.2%. Full code size results: Program baseline new diff ------------------------------------------------------------------------------ test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6% test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6% test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4% test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2% test-suite :: CTMark/lencod/lencod.test 1340592 1324200 -1.2% test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9% test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0% test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0% Geomean difference -1.2% Differential Revision: https://reviews.llvm.org/D63303 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363632 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:20:29 +00:00
Michael Berg	95009014f8	Propagate fmf in IRTranslate for fneg Summary: This case is related to D63405 in that we need to be propagating FMF on negates. Reviewers: volkan, spatel, arsenm Reviewed By: arsenm Subscribers: wdng, javed.absar Differential Revision: https://reviews.llvm.org/D63458 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363631 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:19:40 +00:00
Craig Topper	8d008c0fb8	Use VR128X instead of FR32X/FR64X for the register class in VMOVSSZmrk/VMOVSDZmrk. Removes COPY_TO_REGCLASS from some patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363630 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:08:29 +00:00
Craig Topper	300b39cdac	[X86] Make an assert in LowerSCALAR_TO_VECTOR stricter to make it clear what types are allowed here. NFC Make it clear that only integer type with i32 or smaller elements shoudl get to this part of the code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363629 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 23:08:09 +00:00
Stanislav Mekhanoshin	37e080d871	[AMDGPU] Use custom inserter for gfx10 VOP2b This is part of the approved D63204 pending parent revision. This small change is in fact a part of the VOP2b legalization which does not technically belong to wave32 support, so extracted separately. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363625 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 22:37:37 +00:00
Stanislav Mekhanoshin	4b0f16838a	[AMDGPU] gfx1010 subvector test. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363623 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 21:55:06 +00:00
Volkan Keles	a313274b56	[test][AArch64] Relax the check line for G_BRJT in legalizer-info-validation.mir Replace the specific number with a pattern to relax the test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363621 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 21:25:25 +00:00
Philip Reames	10c34d1b8d	Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges This patch really contains two pieces: Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi. Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses. Differential Revision: https://reviews.llvm.org/D63224 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363619 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 21:06:17 +00:00
Richard Smith	b31d176367	Add convenience utility for replacing a range within a container with a different range, in preparation for use in Clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363617 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 21:01:09 +00:00
Daniel Sanders	bb2252a33d	[globalisel] Fix iterator invalidation in the extload combines Summary: Change the way we deal with iterator invalidation in the extload combines as it was still possible to neglect to visit a use. Even worse, it happened in the in-tree test cases and the checks weren't good enough to detect it. We now take a cheap copy of the use list before iterating over it. This prevents iterator invalidation from occurring and has the nice side effect of making the existing schedule-for-erase/schedule-for-insert mechanism moot. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363616 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 20:56:31 +00:00
Stanislav Mekhanoshin	dc6cf0fe9d	[AMDGPU] Propagate function attributes thru bitcasts AMDGPUPropagateAttributes will not work on function bitcatsts, so move AMDGPUFixFunctionBitcasts before it. Differential Revision: https://reviews.llvm.org/D63455 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363614 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 20:42:48 +00:00
Philip Reames	d773cb7e1c	Fix a bug w/inbounds invalidation in LFTR (recommit) Recommit r363289 with a bug fix for crash identified in pr42279. Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case. Test case previously committed in r363601. Original commit comment follows: This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363613 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 20:32:22 +00:00
Peter Collingbourne	e6bda13f36	gn build: Merge r363483. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363610 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 20:03:11 +00:00
Peter Collingbourne	d07c0627e3	gn build: Merge r363584. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363609 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 19:59:16 +00:00
Nicolai Haehnle	3c07918c35	AMDGPU/GFX10: Don't generate s_code_end padding in the asm-printer Summary: The purpose of the padding is to guard against stale code being fetched into the instruction cache by the lowest level prefetching. We're generating relocatable ELF here, and so the padding should arguably be added by the linker. This is in fact what Mesa does. This also fixes multi-part shaders for Mesa. Change-Id: I6bfede58f20e9f337762ccf39ef9e0e263e69e82 Reviewers: arsenm, rampitec, t-tye Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363602 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 19:28:43 +00:00
Philip Reames	185d55ed9f	Reduced test case for pr42279 in advance of the relevant re-commit + fix git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363601 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 19:27:45 +00:00
Nicolai Haehnle	17d3a9f1c1	AMDGPU: Explicitly define a triple for some tests Summary: This is related to the changes to the groupstaticsize intrinsic in D61494 which would otherwise make the related tests in these files fail or much less useful. Note that for some reason, SOPK generation is less effective in the amdhsa OS, which is why I chose PAL. I haven't investigated this deeper. Change-Id: I6bb99569338f7a433c28b4c9eb1e3e036b00d166 Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63392 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363600 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 19:25:57 +00:00
Joseph Tremoulet	12c73face9	[EarlyCSE] Fix hashing of self-compares Summary: Update compare normalization in SimpleValue hashing to break ties (when the same value is being compared to itself) by switching to the swapped predicate if it has a lower numerical value. This brings the hashing in line with isEqual, which already recognizes the self-compares with swapped predicates as equal. Fixes PR 42280. Reviewers: spatel, efriedma, nikic, fhahn, uabelho Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63349 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363598 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 19:11:28 +00:00
Alina Sbirlea	6327268c4c	[MemorySSA] Don't use template when the clone is a simplified instruction. Summary: LoopRotate doesn't create a faithful clone of an instruction, it may simplify it beforehand. Hence the clone of an instruction that has a MemoryDef associated may not be a definition, but a use or not a memory alternig instruction. Don't rely on the template when the clone may be simplified. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63355 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363597 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:58:40 +00:00
Jessica Paquette	798f83b658	[GlobalISel][AArch64] Fold G_SUB into G_ICMP when it's safe to do so Basically porting over the behaviour in AArch64ISelLowering to GISel. See emitComparison for reference. When we have something like this: ``` lhs = G_SUB 0, y ... G_ICMP lhs, rhs ``` We can fold away the G_SUB and produce a cmn instead, given that we produce the same value in NZCV. Add a test showing that the transformation works, and also showing that we don't perform the transformation when it's unsafe. Also factor out the CSet emission into emitCSetForICMP. Differential Revision: https://reviews.llvm.org/D63163 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363596 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:40:06 +00:00
Craig Topper	6750ca7ad7	[X86] Add TB_NO_REVERSE to some memory folding table entries where the register form requires 64-bit mode, but the memory form does not. We don't know if its safe to unfold if we're in 32-bit mode. This is simlar to what was done to some load opcodes in r363523. I think its pretty unlikely we will try to unfold these anyway so I don't think this is testable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363595 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:38:07 +00:00
Valery Pykhtin	97e04d1cf3	LiveInterval.h: add LiveRange::findIndexesLiveAt function - return a list of SlotIndexes the LiveRange live at. Differential revision: https://reviews.llvm.org/D62411 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363593 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:23:39 +00:00
Simon Pilgrim	22af01a414	[X86][SSE] Scalarize under-aligned XMM vector nt-stores (PR42026) If a XMM non-temporal store has less than natural alignment, scalarize the vector - with SSE4A we can stay on the vector and use MOVNTSD(f64), else we must move to GPRs and use MOVNTI(i32/i64). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363592 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:20:04 +00:00
Matt Arsenault	fcf015b682	AMDGPU: Make getreg intrinsic inaccessiblememonly git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363591 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:17:25 +00:00
Alina Sbirlea	e0ff6cd963	[MemorySSA] Add all MemoryPhis before filling their values. Summary: Add all MemoryPhis in IDF before filling in their incomign values. Otherwise, a new Phi can be added that needs to become the incoming value of another Phi. Test fails the verification in verifyPrevDefInPhis. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63353 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363590 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 18:16:53 +00:00
Stanislav Mekhanoshin	95dd0d84d2	[AMDGPU] gfx1010 wavefrontsize intrinsic folding Differential Revision: https://reviews.llvm.org/D63206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363588 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:57:50 +00:00
Matt Arsenault	55ce7b4ea9	AMDGPU: Fold readlane/readfirstlane calls git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363587 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:52:35 +00:00
Stanislav Mekhanoshin	f1a5ef5a39	[AMDGPU] Pass to propagate ABI attributes from kernels to the functions The pass works in two modes: Mode 1: Just set attributes starting from kernels. This can work at the very beginning of opt and llc pipeline, but cannot clone functions because it must be a function pass. Mode 2: Actually clone functions for new attributes. This can only work after all function passes in the opt pipeline because it has to be a module pass. Differential Revision: https://reviews.llvm.org/D63208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363586 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:47:28 +00:00
Nico Weber	c0de78b220	gn build: Merge r363541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363583 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:45:12 +00:00
Simon Pilgrim	1d6c2fbdd1	[X86][AVX] Split under-aligned vector nt-stores. If a YMM/ZMM non-temporal store has less than natural alignment, split the vector - either they will be satisfactorily aligned or will continue to be split until they are XMMs - at which point the legalizer will scalarize it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363582 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:22:38 +00:00
Warren Ristow	31868b92df	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type DataType, unsigned Alignment) const; bool isLegalNTLoad(Type DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363581 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:20:08 +00:00
Matt Arsenault	bb2597ae2f	GlobalISel: Ignore callsite attributes when picking intrinsic type A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363580 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:01:35 +00:00
Matt Arsenault	6f81a49f5c	GlobalISel: Verify intrinsics I keep using the wrong instruction when manually writing tests. This really needs to check the number of operands, but I don't see an easy way to do that right now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363579 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:01:32 +00:00
Matt Arsenault	33482eea4c	AMDGPU/GlobalISel: Account for multiple defs when finding intrinsic ID git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363578 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 17:01:27 +00:00
Stanislav Mekhanoshin	c33d0e6650	[AMDGPU] gfx1010 wave32 metadata Differential Revision: https://reviews.llvm.org/D63207 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363577 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 16:48:56 +00:00
Tom Stellard	2d928b87de	AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60640 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363576 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 16:27:43 +00:00
Francis Visoiu Mistrih	1cb383ef62	[Remarks] Extend -fsave-optimization-record to specify the format Use -fsave-optimization-record=<format> to specify a different format than the default, which is YAML. For now, only YAML is supported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363573 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 16:06:00 +00:00
Simon Pilgrim	3ea8c2ac6a	[X86] combineLoad - begun making the load split code more generic. NFCI. This is currently only used for ymm->xmm splitting but we shouldn't hardcode the offsets/alignment. This is necessary for an upcoming patch to split under-aligned non-temporal vector loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363570 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 15:54:36 +00:00
Whitney Tsang	bdd7b78551	PHINode: introduce setIncomingValueForBlock() function, and use it. Summary: There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue() but no function to replace incoming value for a specified BasicBlock* predecessor. Clearly, there are a lot of places that could use that functionality. Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn Reviewed By: Meinersbur, fhahn Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63338 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363566 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 14:38:56 +00:00
Simon Pilgrim	dbb1cec2e7	[X86][SSE] Add tests for underaligned nt loads Test both 'unaligned' (which we should just use regular unaligned loads) and 'subvector aligned' (which we should split) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363565 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 14:38:17 +00:00
Simon Pilgrim	bfe7eb96c9	[X86][SSE] Prevent misaligned non-temporal vector load/store combines For loads, pre-SSE41 we can't perform NT loads at all, and after that we can only perform vector aligned loads, so if the alignment is less than for a xmm we'll just end up using the regular unaligned vector loads anyway. First step towards fixing PR42026 - the next step for stores will be to use SSE4A movntsd where possible and to avoid the stack spill on SSE2 targets. Differential Revision: https://reviews.llvm.org/D63246 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363564 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 14:26:10 +00:00
Matt Arsenault	6b4d361457	InferAddressSpaces: Fix cloning original addrspacecast If an addrspacecast needed to be inserted again, this was creating a clone of the original cast for each user. Just use the original, which also saves losing the value name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363562 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 14:13:29 +00:00
Matt Arsenault	6f9792a238	AMDGPU: Ignore subtarget for InferAddressSpaces Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363561 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 14:13:24 +00:00
Matt Arsenault	14dc077c20	AMDGPU: Mark exp/exp.compr as inaccessiblememonly Should also be marked writeonly, but I think that would require splitting the version with done set to a separate intrinsic Test change is only from renumbering the attribute group numbers, which for some reason the generated check lines consider. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363560 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 13:52:24 +00:00
Matt Arsenault	baa4b8244b	AMDGPU/GlobalISel: Fix default mapping for non-register operands Tests will be in future commits when new intrinsics are handled here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363559 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 13:52:19 +00:00
Matt Arsenault	b6ebfc02b3	AMDGPU: Cleanup custom PseudoSourceValue definitions Use separate enums for each kind, avoid repeating overloads, and add missing classof implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363558 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 13:52:15 +00:00

1 2 3 4 5 ...

180627 Commits