Vector immediate-setting instructions like XXLXORz/XXLXORspz/XXLXORdpz
should behave like LI8. We should set the corresponding flags on them to
allow rematerialization and other optimizations in LICM, RA, scheduling,
etc.
Differential Revision: https://reviews.llvm.org/D58645
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355948 91177308-0d34-0410-b5e6-96231b3b80d8
Expand MULO with a constant power-of-two operand into a shift. The
overflow is checked with (x << shift) >> shift == x, where the right
shift is logical for umulo and arithmetic for smulo (with an exception
for multiplications by signed_min).
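As a worked illustration of the check (plain C++, not the DAG expansion
itself):

  #include <cstdint>

  // umulo of x by 8 (shift == 3): the product is x << 3, and overflow
  // occurred iff shifting back down does not recover x. The shift back
  // is logical here because this is the unsigned case; smulo would use
  // an arithmetic shift instead.
  bool umulo_by_8(uint64_t x, uint64_t &product) {
    const unsigned shift = 3;
    product = x << shift;
    return (product >> shift) != x; // true means overflow
  }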
Differential Revision: https://reviews.llvm.org/D59041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355937 91177308-0d34-0410-b5e6-96231b3b80d8
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.
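As a minimal sketch (an illustrative helper, not the patch itself), the
flag can be attached to an address add in SelectionDAG like so:

  #include "llvm/CodeGen/SelectionDAG.h"
  using namespace llvm;

  // build Base + Offset with the no-unsigned-wrap flag set, so targets
  // may assume the address computation cannot wrap
  static SDValue buildNoWrapAdd(SelectionDAG &DAG, const SDLoc &DL,
                                EVT PtrVT, SDValue Base, SDValue Offset) {
    SDNodeFlags Flags;
    Flags.setNoUnsignedWrap(true);
    return DAG.getNode(ISD::ADD, DL, PtrVT, Base, Offset, Flags);
  }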
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355926 91177308-0d34-0410-b5e6-96231b3b80d8
These are closely modeled on similar tests for the ilp32 ABI. Like those
tests, we group together tests that should be common across the lp64,
lp64+lp64f, and lp64+lp64f+lp64d ABIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355899 91177308-0d34-0410-b5e6-96231b3b80d8
After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without
running into any problematic intrinsics.
Also add a fix for lane copies, which don't support index 0.
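A rough sketch of the index-0 workaround (the helper and subregister
choice are illustrative, not copied from the patch): since the lane-copy
instructions cannot encode lane 0, extracting element 0 falls back to a
plain subregister copy.

  #include "llvm/CodeGen/MachineInstrBuilder.h"
  #include "llvm/CodeGen/TargetInstrInfo.h"
  using namespace llvm;

  // lane copies don't accept index 0, so an extract of element 0 becomes
  // an ordinary COPY from the low subregister
  static void emitExtractLane0(MachineBasicBlock &MBB,
                               MachineBasicBlock::iterator InsertPt,
                               const DebugLoc &DL,
                               const TargetInstrInfo &TII, unsigned DstReg,
                               unsigned SrcReg, unsigned SubRegIdx) {
    BuildMI(MBB, InsertPt, DL, TII.get(TargetOpcode::COPY), DstReg)
        .addReg(SrcReg, 0, SubRegIdx); // e.g. AArch64::ssub for 32 bits
  }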
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355871 91177308-0d34-0410-b5e6-96231b3b80d8
AtomicCmpSwapWithSuccess is legalised into an AtomicCmpSwap plus a comparison.
This requires an extension of the value which, by default, is a
zero-extension. The AtomicCmpSwap is later lowered into a PseudoCmpXchg32,
which is then expanded in RISCVExpandPseudoInsts.cpp, where the lr.w
instruction performs a sign-extension.
This mismatch of extensions causes the comparison to fail when the compared
value is negative. This change overrides TargetLowering::getExtendForAtomicOps
for RISC-V so it does a sign-extension instead.
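The override amounts to the following (a sketch; the default
implementation in TargetLowering returns ISD::ZERO_EXTEND):

  // In RISCVTargetLowering: lr.w sign-extends its result, so ask the
  // generic code to sign-extend values used by atomic operations.
  ISD::NodeType getExtendForAtomicOps() const override {
    return ISD::SIGN_EXTEND;
  }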
Differential Revision: https://reviews.llvm.org/D58829
Patch by Ferran Pallarès Roca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355869 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.
Implement basic legalizations (PromoteIntRes, PromoteIntOp,
ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes.
There are more legalizations missing (especially float legalizations),
but there's no way to test them right now, so I'm not adding them.
This also includes a few more changes to make this work somewhat
reasonably:
* Add support for expanding VECREDUCE in SDAG. Usually
experimental.vector.reduce is expanded prior to codegen, but if the
target does have native vector reduce, it may of course still be
necessary to expand due to legalization issues. This uses a shuffle
reduction if possible, followed by a naive scalar reduction (a sketch
follows below).
* Allow the result type of integer VECREDUCE to be larger than the
vector element type. For example, we need to be able to reduce a v8i8
into a (nominally) i32 result type on AArch64.
* Use the vector operand type rather than the scalar result type to
determine the action, so we can control exactly which vector types are
supported. Also change the legalize vector op code to handle
operations that only have vector operands, but no vector results, as
is the case for VECREDUCE.
* Default VECREDUCE to Expand. On AArch64 (the only target using
VECREDUCE), explicitly specify for which vector types the reductions are
supported.
This does not handle anything related to VECREDUCE_STRICT_*.
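For reference, a minimal sketch of the halving (shuffle) reduction
mentioned in the first bullet, assuming a power-of-two element count; the
real expansion must also handle the scalar fallback and widened result
types:

  #include "llvm/CodeGen/SelectionDAG.h"
  #include <tuple>
  using namespace llvm;

  // repeatedly split the vector in half and combine the halves with the
  // base operation (e.g. ISD::ADD for VECREDUCE_ADD), then extract the
  // single remaining element
  static SDValue expandReduceByHalving(SelectionDAG &DAG, const SDLoc &DL,
                                       unsigned BaseOpc, SDValue Vec) {
    while (Vec.getValueType().getVectorNumElements() > 1) {
      SDValue Lo, Hi;
      std::tie(Lo, Hi) = DAG.SplitVector(Vec, DL);
      Vec = DAG.getNode(BaseOpc, DL, Lo.getValueType(), Lo, Hi);
    }
    return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL,
                       Vec.getValueType().getVectorElementType(), Vec,
                       DAG.getIntPtrConstant(0, DL));
  }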
Differential Revision: https://reviews.llvm.org/D58015
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355860 91177308-0d34-0410-b5e6-96231b3b80d8
As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile
time building opencollada"), this patch makes sure that no phys reg is hinted
more than once from getRegAllocationHints().
This handles the case where many virtual registers are assigned to the
same physreg. The previous compile-time fix (r343686) in weightCalcHelper()
only made sure that physical/virtual registers are passed no more than
once to addRegAllocationHint().
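Schematically, the fix deduplicates the hints before they are returned
(container and variable names here are illustrative):

  #include "llvm/ADT/SmallSet.h"
  #include "llvm/ADT/SmallVector.h"
  using namespace llvm;

  // push each candidate physreg hint at most once, even when many
  // virtual registers were assigned to the same physreg
  static void addUniqueHints(const SmallVectorImpl<unsigned> &Candidates,
                             SmallVectorImpl<unsigned> &Hints) {
    SmallSet<unsigned, 8> Seen;
    for (unsigned PhysReg : Candidates)
      if (Seen.insert(PhysReg).second)
        Hints.push_back(PhysReg);
  }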
Review: Dimitry Andric, Quentin Colombet
https://reviews.llvm.org/D59201
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355854 91177308-0d34-0410-b5e6-96231b3b80d8
Narrow Scalar G_MUL for MIPS32.
Revisit the NarrowScalar implementation in LegalizerHelper.
Introduce a new helper function, multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
The generated instructions use only the types NarrowTy and i1.
The destination can be the same size as the source or twice its size.
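To illustrate the idea behind multiplyRegisters (plain C++, not the
LegalizerHelper code): a 64-bit multiply where each value is held in two
32-bit "registers" and the result is the same size as the sources:

  #include <cstdint>

  // (aHi*2^32 + aLo) * (bHi*2^32 + bLo) mod 2^64
  //   = aLo*bLo + (aLo*bHi + aHi*bLo) * 2^32   (mod 2^64)
  static uint64_t mulParts(uint32_t aLo, uint32_t aHi,
                           uint32_t bLo, uint32_t bHi) {
    uint64_t LoLo = (uint64_t)aLo * bLo; // full low partial product
    uint32_t ResLo = (uint32_t)LoLo;     // low destination register
    uint32_t Carry = (uint32_t)(LoLo >> 32);
    // only the low halves of the cross terms land in the high register
    uint32_t ResHi = Carry + aLo * bHi + aHi * bLo;
    return ((uint64_t)ResHi << 32) | ResLo;
  }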
Differential Revision: https://reviews.llvm.org/D58824
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355814 91177308-0d34-0410-b5e6-96231b3b80d8
Many of our tests were not using valid rounding mode immediates. Clang verifies this in the frontend when it creates the intrinsics from builtins, but the backend would still lower invalid immediates.
With this change we will now leave them as intrinsics if the immediate is invalid. This will cause an isel selection failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355789 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the store+load is folded and both operands of the umulo
end up being constants. To avoid this getting folded away entirely,
make sure at least one operand is non-constant.
Also remove some allocas which don't seem relevant to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355776 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds proper handling of -target-abi, as accepted by llvm-mc and
llc. Lowering (codegen) for the hard-float ABIs will follow in a subsequent
patch. However, this patch does add MC layer support for the hard float and
RVE ABIs (emission of the appropriate ELF flags
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#-file-header).
ABI parsing must be shared between codegen and the MC layer, so we add
computeTargetABI to RISCVUtils. A warning will be printed if an invalid or
unrecognized ABI is given.
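A rough sketch of the string-to-ABI part of that shared parsing
(enumerator names approximated; the real helper also consults the target
features and emits the warning):

  #include "llvm/ADT/StringSwitch.h"
  using namespace llvm;

  enum ABI { ABI_ILP32, ABI_ILP32F, ABI_ILP32D, ABI_ILP32E,
             ABI_LP64, ABI_LP64F, ABI_LP64D, ABI_Unknown };

  // map a -target-abi string to an ABI; unknown strings yield
  // ABI_Unknown, which the caller turns into a warning plus a default
  static ABI parseABIName(StringRef Name) {
    return StringSwitch<ABI>(Name)
        .Case("ilp32", ABI_ILP32)
        .Case("ilp32f", ABI_ILP32F)
        .Case("ilp32d", ABI_ILP32D)
        .Case("ilp32e", ABI_ILP32E)
        .Case("lp64", ABI_LP64)
        .Case("lp64f", ABI_LP64F)
        .Case("lp64d", ABI_LP64D)
        .Default(ABI_Unknown);
  }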
Differential Revision: https://reviews.llvm.org/D59023
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355771 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Uses the named operands tablegen feature to look up the indices of
offset, address, and p2align operands for all load and store
instructions. This replaces brittle, incorrect logic for identifying
loads and stores when eliminating frame indices, which previously
crashed on bulk-memory ops. It also cleans up the SetP2Alignment pass.
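Schematically, the generated lookup replaces the hand-rolled opcode lists
(the operand name and helper below are assumed from the description, and
the generated WebAssembly operand-table headers are required):

  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  // patch a load/store's offset operand found via the tablegen-generated
  // named-operand table; getNamedOperandIdx returns -1 when the
  // instruction has no such operand
  static void setLoadStoreOffset(MachineInstr &MI, int64_t NewOffset) {
    int OffIdx = WebAssembly::getNamedOperandIdx(MI.getOpcode(),
                                                 WebAssembly::OpName::off);
    if (OffIdx != -1)
      MI.getOperand(OffIdx).setImm(NewOffset);
  }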
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59007
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355770 91177308-0d34-0410-b5e6-96231b3b80d8
This saves needing to call getInt32 ourselves, making the code a little
shorter. The test changes are because insert/extract use getInt64
internally; this shouldn't be a functional issue. This cleanup is because
I plan to write similar code for expandload/compressstore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355767 91177308-0d34-0410-b5e6-96231b3b80d8
An extension of D58282 noted in PR39665:
https://bugs.llvm.org/show_bug.cgi?id=39665
This doesn't answer the request to use movmsk, but that's an
independent problem. We need this and probably still need
scalarization of FP selects because we can't do that as a
target-independent transform (although it seems likely that
targets besides x86 should have this transform).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355741 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change alters the instrumentation to allow users to view the registers at the point at which the tag mismatch occurred. Most of the heavy lifting is done in the runtime library, where we save the registers to the stack and emit unwind information. This allows us to reduce the overhead, as very little additional work needs to be done in each __hwasan_check instance.
In this implementation, the fast path of __hwasan_check is unmodified. An additional 4 instructions (16B) are emitted in the slow path of every __hwasan_check instance. This may increase binary size somewhat, but as most of the work is done in the runtime library, it's manageable.
The failure trace now contains a list of registers at the point at which the failure occurred, in a format similar to that of Android's tombstones. It currently has the following format:
Registers where the failure occurred (pc 0x0055555561b4):
x0 0000000000000014 x1 0000007ffffff6c0 x2 1100007ffffff6d0 x3 12000056ffffe025
x4 0000007fff800000 x5 0000000000000014 x6 0000007fff800000 x7 0000000000000001
x8 12000056ffffe020 x9 0200007700000000 x10 0200007700000000 x11 0000000000000000
x12 0000007fffffdde0 x13 0000000000000000 x14 02b65b01f7a97490 x15 0000000000000000
x16 0000007fb77376b8 x17 0000000000000012 x18 0000007fb7ed6000 x19 0000005555556078
x20 0000007ffffff768 x21 0000007ffffff778 x22 0000000000000001 x23 0000000000000000
x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000
x28 0000000000000000 x29 0000007ffffff6f0 x30 00000055555561b4
...and is printed after the dump of memory tags around the buggy address.
Every register is saved exactly as it was at the point where the tag mismatch occurs, with the exception of x16/x17. These registers are used in the tag mismatch calculation as scratch registers during __hwasan_check, and cannot be saved without affecting the fast path. As these registers are designated as scratch registers for linking, there should be no important information in them that could aid in debugging.
Reviewers: pcc, eugenis
Reviewed By: pcc, eugenis
Subscribers: srhines, kubamracek, mgorny, javed.absar, krytarowski, kristof.beyls, hiraditya, jdoerfert, llvm-commits, #sanitizers
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D58857
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355738 91177308-0d34-0410-b5e6-96231b3b80d8
When matching half of the build_vector to a load, there could still be
a hidden dependency on the other half of the build_vector the pattern
wouldn't detect. If there was an additional chain dependency on the
other value, a cycle could be introduced.
I don't think a tablegen pattern is capable of matching the necessary
conditions, so move this into PreprocessISelDAG. Check isPredecessorOf
for the other value to avoid a cycle. This has a warning that it's
expensive, so this should probably be moved into an MI pass eventually
that will have more freedom to reorder instructions to help match
this. That is currently complicated by the lack of a computeKnownBits
type mechanism for the selected function.
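The core of the check, roughly (variable names illustrative): before
folding, verify the combined node could not end up depending on itself
through the other build_vector operand.

  #include "llvm/CodeGen/SelectionDAG.h"
  using namespace llvm;

  // only fold when no cycle can form, i.e. when the load is not already
  // a predecessor of the other build_vector element; isPredecessorOf
  // walks the DAG, which is why the commit calls it expensive
  static bool canFoldWithoutCycle(SDValue Load, SDValue OtherElt) {
    return !Load.getNode()->isPredecessorOf(OtherElt.getNode());
  }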
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355731 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids breaking possible value dependencies when sorting loads by
offset.
AMDGPU has some load instructions that write into the high or low bits
of the destination register, and have a tied input for the other input
bits. These can easily have the same base pointer but be a swizzle, so
the high-address load needs to come first. This was inserting glue
forcing the opposite ordering, producing a cycle the InstrEmitter would
assert on. It may be expensive to look for the dependency between the
other loads, so just skip any case where this could happen.
Fixes bug 40936 by reverting r351379, which added a hacky attempt to
fix this by adding chains in this case, which I think was just working
around broken glue before the InstrEmitter. The core of the patch is
re-implementing the fix for that problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355728 91177308-0d34-0410-b5e6-96231b3b80d8
This was checking the wrong operands for the base register and the
offsets. The indexes are shifted by the number of output registers
from the machine instruction definition, and the chain is moved to the
end.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355722 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Right now, when we encounter a string equality check,
e.g. `if (memcmp(a, b, s) == 0)`, we try to expand it to a comparison if
`s` is a small compile-time constant, and otherwise fall back on calling
`memcmp()`.
This is sub-optimal because memcmp has to compute much more than
equality.
This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
that support `bcmp`.
`bcmp` can be made much more efficient than `memcmp` because equality
compare is trivially parallel while lexicographic ordering has a chain
dependency.
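At the source level, the rewrite is simply the following (a C++
illustration of the semantics, not the transform code):

  #include <cstring>   // memcmp
  #include <strings.h> // bcmp (POSIX)

  // before: memcmp computes a full lexicographic ordering, then the call
  // site throws away everything except "equal or not"
  bool eqMemcmp(const void *a, const void *b, size_t s) {
    return memcmp(a, b, s) == 0;
  }

  // after: bcmp only answers equal/not-equal, so an implementation may
  // compare chunks in any order and stop at the first difference
  bool eqBcmp(const void *a, const void *b, size_t s) {
    return bcmp(a, b, s) == 0;
  }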
Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits
Differential Revision: https://reviews.llvm.org/D56593
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355672 91177308-0d34-0410-b5e6-96231b3b80d8
We were just checking the pointer size and the type's primitive size, but
this caused unintended things like vectors of half being accepted by
masked load/store.
For FP we now explicitly check for only double and float.
For pointers we now let any pointer through, trusting that only 32-bit
and 64-bit pointers will be used to generate assembly.
We only check the bitwidth after checking that the type is an integer.
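A sketch of the tightened check (the integer widths shown are
illustrative assumptions, not copied from the patch):

  #include "llvm/IR/DerivedTypes.h"
  using namespace llvm;

  // element types accepted for masked load/store after this change
  static bool isSupportedMaskedElementType(Type *Ty) {
    if (Ty->isPointerTy()) // any pointer; codegen sees 32/64-bit only
      return true;
    if (Ty->isFloatTy() || Ty->isDoubleTy()) // FP: float and double only
      return true;
    // integers: only now do we look at the bitwidth
    if (auto *ITy = dyn_cast<IntegerType>(Ty))
      return ITy->getBitWidth() == 32 || ITy->getBitWidth() == 64;
    return false;
  }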
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355667 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
ShadowCallStack on x86_64 suffered from the same racy security issues as
Return Flow Guard and had performance overhead as high as 13% depending
on the benchmark. x86_64 ShadowCallStack was always an experimental
feature and never shipped a runtime required to support it; as such,
there are no expected downstream users.
Reviewers: pcc
Reviewed By: pcc
Subscribers: mgorny, javed.absar, hiraditya, jdoerfert, cfe-commits, #sanitizers, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D59034
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355624 91177308-0d34-0410-b5e6-96231b3b80d8
In some loops, we end up generating loop induction variables that look like:
{(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
As opposed to the simpler:
{(zext i16 (%i0 * %i1) to i32),+,-1}
i.e. we count up from -limit to 0, not the simpler counting down from
limit to 0. This is because the scores, as LSR calculates them, are the
same, and the second candidate is filtered in place of the first. We end
up with a redundant SUB from 0 in the code.
This patch tries to make the calculation of the setup cost a little more
thorough, recursing into the SCEV members to better approximate the setup
required. The cost function for comparing LSR costs is:
  return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
So this will only alter results if none of the other variables turn out to be
different.
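The recursion, in rough form (a simplification of the heuristic, not the
patch's exact code):

  #include "llvm/Analysis/ScalarEvolutionExpressions.h"
  using namespace llvm;

  // approximate how many instructions are needed outside the loop to
  // materialize a register for this SCEV: constants are free, an addrec
  // costs whatever its start value costs, and other expressions cost one
  // plus the cost of their operands
  static unsigned approxSetupCost(const SCEV *Reg) {
    if (isa<SCEVConstant>(Reg))
      return 0;
    if (const auto *AR = dyn_cast<SCEVAddRecExpr>(Reg))
      return approxSetupCost(AR->getStart());
    if (const auto *N = dyn_cast<SCEVNAryExpr>(Reg)) {
      unsigned Cost = 1;
      for (const SCEV *Op : N->operands())
        Cost += approxSetupCost(Op);
      return Cost;
    }
    return 1;
  }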
Differential Revision: https://reviews.llvm.org/D58770
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355597 91177308-0d34-0410-b5e6-96231b3b80d8
Unsigned mul high for MIPS32 is selected into two pseudo instructions,
PseudoMULTu and PseudoMFHI, which use the accumulator register class ACC64
for some of their operands. Registers in this class have the appropriate
hi and lo registers as subregisters: $lo0 and $hi0 are subregisters of
$ac0, etc.
The mul instruction implicit-defs $lo0 and $hi0 according to
MipsInstrInfo.td. In functions where both mul and PseudoMULTu are present,
the fast register allocator will "run out of registers during register
allocation", because calcSpillCost for $ac0 returns spillImpossible: its
subregisters $lo0 and $hi0 are reserved by the mul instruction above. The
solution is to mark the implicit-defs of $lo0 and $hi0 as dead on the mul
instruction.
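A sketch of marking the defs dead on an already-built MachineInstr (the
actual patch may instead set this where the instruction is created):

  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  // mark the implicit hi/lo defs as dead so calcSpillCost no longer sees
  // $ac0's subregisters as blocked (pass e.g. Mips::LO0 and Mips::HI0)
  static void markHiLoDead(MachineInstr &MulMI, unsigned LoReg,
                           unsigned HiReg) {
    if (MachineOperand *Lo = MulMI.findRegisterDefOperand(LoReg))
      Lo->setIsDead();
    if (MachineOperand *Hi = MulMI.findRegisterDefOperand(HiReg))
      Hi->setIsDead();
  }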
Differential Revision: https://reviews.llvm.org/D58715
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355594 91177308-0d34-0410-b5e6-96231b3b80d8