archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Craig Topper	9799e6facc	[X86] Don't call validateInstruction from MatchAndEmitInstruction when MatchingInlineAsm is set. The MCInst won't be populated. Without this we can't parse gather instructions in ms inline asm blocks. The validateInstruction function was introduced in r316700 to check gather constraints. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317713 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 19:38:48 +00:00
Dan Gohman	40486d7dcf	[WebAssembly] Call signExtend to get sign extended register Patch by Jatin Bhateja! Differential Revision: https://reviews.llvm.org/D39529 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317710 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 19:24:21 +00:00
Dan Gohman	8e0d3d4cd7	[WebAssembly] Revise the strategy for inline asm. Previously, an "r" constraint would mean the compiler provides a value on WebAssembly's operand stack. This was tricky to use properly, particularly since it isn't possible to declare a new local from within an inline asm string. With this patch, "r" provides the value in a WebAssembly local, and the local index is provided to the inline asm string. This requires inline asm to use get_local and set_local to read the register. This does potentially result in larger code size, however inline asm should hopefully be quite rare in WebAssembly. This also means that the "m" constraint can no longer be supported, as WebAssembly has nothing like a "memory operand" that includes an implicit get_local. This fixes PR34599 for the wasm32-unknown-unknown-wasm target (though not for the ELF target). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317707 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 19:18:08 +00:00
Alex Bradbury	da781c7295	[RISCV] Initial support for function calls Note that this is just enough for simple function call examples to generate working code. Support for varargs etc follows in future patches. Differential Revision: https://reviews.llvm.org/D29936 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317691 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 13:41:21 +00:00
Alex Bradbury	eacca308e4	[RISCV] Codegen for conditional branches A good portion of this patch is the extra functions that needed to be implemented to support the test case. e.g. storeRegToStackSlot, loadRegFromStackSlot, eliminateFrameIndex. Setting ISD::BR_CC to Expand may appear non-obvious on an architecture with branch+cmp instructions. However, I found it much easier to deal with matching the expanded form. I had to change simm13_lsb0 and simm21_lsb0 to inherit from the Operand<OtherVT> class rather than Operand<i32> in order to keep tablegen happy. This isn't a big deal, but it does seem a shame to lose the uniformity across immediate types when there's not an obvious benefit (I'm hoping a tablegen expert will educate me on what I'm missing here!). Differential Revision: https://reviews.llvm.org/D29935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317690 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 13:31:40 +00:00
Alex Bradbury	6c9938cf11	[RISCV] Codegen support for memory operations on global addresses Differential Revision: https://reviews.llvm.org/D39103 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317688 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 13:24:21 +00:00
Alex Bradbury	21ae2e7a56	[RISCV] Codegen support for memory operations This required the implementation of RISCVTargetInstrInfo::copyPhysReg. Support for lowering global addresses follow in the next patch. Differential Revision: https://reviews.llvm.org/D29934 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317685 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 12:20:01 +00:00
Alex Bradbury	c5abad3f59	[RISCV] Codegen support for materializing constants Differential Revision: https://reviews.llvm.org/D39101 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317684 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 12:02:22 +00:00
Simon Dardis	d036c9c439	[mips] Guard indirect and tailcall pseudo instructions correctly. Previously these pseudo instructions were not guarded by ISA, so their select was dependant on the ordering of the entries in the DAG matcher. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39723 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317681 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 11:13:44 +00:00
Alex Bradbury	4d211caa37	[NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0 rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if it has to guess properties from patterns. Unfortunately, guessInstructionProperties=0 can't be used with current upstream LLVM as instructions in the TargetOpcode namespace are always included and sometimes have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch provides the simplest possible fix to this problem, setting default values for these fields in the TargetOpcode scope. There is no intended functional change, as the explicitly set properties should match what was previously inferred. A number of the instructions had hasSideEffects=1 inferred unintentionally. This patch makes it explicit, while future patches (such as D37097) correct the property. Differential Revision: https://reviews.llvm.org/D37065 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317674 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 09:26:06 +00:00
Craig Topper	d550a31777	[X86] Add patterns to fold EVEX store with EVEX encoded vcvtps2ph instructions. Remove bad pattern that had vf432 vcvtps2ph storing 128-bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317662 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 04:00:31 +00:00
Craig Topper	c30df5f3d3	[X86] Allow legacy vcvtps2ph intrinsics to select EVEX encoded instructions. Rely on EVEX->VEX to convert back. Missed store folding opportunities will be fixed in a subsequent commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317661 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 04:00:30 +00:00
David Blaikie	48319238e4	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317647 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 01:01:31 +00:00
Matt Arsenault	a932c3f118	AMDGPU: Set correct sched model on v_mad_u64_u32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317645 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 00:48:25 +00:00
Sriraman Tallam	ea30756f68	Attribute nonlazybind should not affect calls to functions with hidden visibility. Differential Revision: https://reviews.llvm.org/D39625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317639 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 00:01:05 +00:00
Justin Lebar	d8660fa5dc	[NVPTX] Implement __nvvm_atom_add_gen_d builtin. Summary: This just seems to have been an oversight. We already supported the f64 atomic add with an explicit scope (e.g. "cta"), but not the scopeless version. Reviewers: tra Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39638 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317623 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 22:10:54 +00:00
Graham Yiu	5363e7a31e	Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases. Differential Revision: https://reviews.llvm.org/D34630 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317613 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 20:55:43 +00:00
Krzysztof Parzyszek	5933b2471c	[Hexagon] Make a test more flexible in HexagonLoopIdiomRecognition An "or" that sets the sign-bit can be replaced with a "xor", if the sign-bit was known to be clear before. With some changes to instruction combining, the simple sign-bit check was failing. Replace it with a more flexible one to catch more cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317592 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 17:05:54 +00:00
Florian Hahn	9983524662	[AArch64][SVE] Asm: Add support for (ADD\|SUB)_ZZZ Patch [5/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39091 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 16:58:13 +00:00
Florian Hahn	6cf02b953e	[AArch64][SVE] Asm: Add SVE (Z) Register definitions and parsing support Patch [3/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. To summarise, this patch adds: * SVE register definitions * Methods to parse SVE register operands * Methods to print SVE register operands * RegKind SVEDataVector to distinguish it from other data types like scalar register or Neon vector. * k_SVEDataRegister and SVEDataRegOp to describe SVE registers (which will be extended by further patches with e.g. ElementWidth and the shift-extend type). Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39089 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317590 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 16:45:48 +00:00
Florian Hahn	861d2963c7	[AArch64][SVE] Asm: Set SVE as unsupported feature for existing scheduler models. Patch [4/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. We add SVE as unsupported feature for CPUs that don't have SVE to prevent errors from scheduler models saying it lacks information for these instructions. Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39090 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317582 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 15:03:11 +00:00
Petar Jovanovic	8cec6c4916	Reland "Correct dwarf unwind information in function epilogue for X86" Reland r317100 with minor fix regarding ComputeCommonTailLength function in BranchFolding.cpp. Skipping top CFI instructions block needs to executed on several more return points in ComputeCommonTailLength(). Original r317100 message: "Correct dwarf unwind information in function epilogue for X86" This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317579 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 14:40:27 +00:00
Alexey Bataev	25ff19d87a	[SLP] Fix PR35047: Fix default cost model for cast op in X86. Summary: The cost calculation for default case on X86 target does not always follow correct wayt because of missing 4-th argument in `BaseT::getCastInstrCost()` call. Added this missing parameter. Reviewers: hfinkel, mkuper, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39687 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317576 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 14:23:44 +00:00
Florian Hahn	c5e08cf67c	[AArch64][SVE] Asm: Replace 'IsVector' by 'RegKind' in AArch64AsmParser (NFC) Patch [2/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. This change is a non functional change that adds RegKind as an alternative to 'isVector' to prepare it for newer types (SVE data vectors and predicate vectors) that will be added in next patches (where the SVE data vector is added as part of this patch set) Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39088 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317569 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 13:07:50 +00:00
Kristof Beyls	b79469ca2f	[GlobalISel] Enable legalizing non-power-of-2 sized types. This changes the interface of how targets describe how to legalize, see the below description. 1. Interface for targets to describe how to legalize. In GlobalISel, the API in the LegalizerInfo class is the main interface for targets to specify which types are legal for which operations, and what to do to turn illegal type/operation combinations into legal ones. For each operation the type sizes that can be legalized without having to change the size of the type are specified with a call to setAction. This isn't different to how GlobalISel worked before. For example, for a target that supports 32 and 64 bit adds natively: for (auto Ty : {s32, s64}) setAction({G_ADD, 0, s32}, Legal); or for a target that needs a library call for a 32 bit division: setAction({G_SDIV, s32}, Libcall); The main conceptual change to the LegalizerInfo API, is in specifying how to legalize the type sizes for which a change of size is needed. For example, in the above example, how to specify how all types from i1 to i8388607 (apart from s32 and s64 which are legal) need to be legalized and expressed in terms of operations on the available legal sizes (again, i32 and i64 in this case). Before, the implementation only allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0, s128}, NarrowScalar). A worse limitation was that if you'd wanted to specify how to legalize all the sized types as allowed by the LLVM-IR LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times and probably would need a lot of memory to store all of these specifications. Instead, the legalization actions that need to change the size of the type are specified now using a "SizeChangeStrategy". For example: setLegalizeScalarToDifferentSizeStrategy( G_ADD, 0, widenToLargerAndNarrowToLargest); This example indicates that for type sizes for which there is a larger size that can be legalized towards, do it by Widening the size. For example, G_ADD on s17 will be legalized by first doing WidenScalar to make it s32, after which it's legal. The "NarrowToLargest" indicates what to do if there is no larger size that can be legalized towards. E.g. G_ADD on s92 will be legalized by doing NarrowScalar to s64. Another example, taken from the ARM backend is: for (unsigned Op : {G_SDIV, G_UDIV}) { setLegalizeScalarToDifferentSizeStrategy(Op, 0, widenToLargerTypesUnsupportedOtherwise); if (ST.hasDivideInARMMode()) setAction({Op, s32}, Legal); else setAction({Op, s32}, Libcall); } For this example, G_SDIV on s8, on a target without a divide instruction, would be legalized by first doing action (WidenScalar, s32), followed by (Libcall, s32). The same principle is also followed for when the number of vector lanes on vector data types need to be changed, e.g.: setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal); setLegalizeVectorElementToDifferentSizeStrategy( G_ADD, 0, widenToLargerTypesUnsupportedOtherwise); As currently implemented here, vector types are legalized by first making the vector element size legal, followed by then making the number of lanes legal. The strategy to follow in the first step is set by a call to setLegalizeVectorElementToDifferentSizeStrategy, see example above. The strategy followed in the second step "moreToWiderTypesAndLessToWidest" (see code for its definition), indicating that vectors are widened to more elements so they map to natively supported vector widths, or when there isn't a legal wider vector, split the vector to map it to the widest vector supported. Therefore, for the above specification, some example legalizations are: * getAction({G_ADD, LLT::vector(3, 3)}) returns {WidenScalar, LLT::vector(3, 8)} * getAction({G_ADD, LLT::vector(3, 8)}) then returns {MoreElements, LLT::vector(8, 8)} * getAction({G_ADD, LLT::vector(20, 8)}) returns {FewerElements, LLT::vector(16, 8)} 2. Key implementation aspects. How to legalize a specific (operation, type index, size) tuple is represented by mapping intervals of integers representing a range of size types to an action to take, e.g.: setScalarAction({G_ADD, LLT:scalar(1)}, {{1, WidenScalar}, // bit sizes [ 1, 31[ {32, Legal}, // bit sizes [32, 33[ {33, WidenScalar}, // bit sizes [33, 64[ {64, Legal}, // bit sizes [64, 65[ {65, NarrowScalar} // bit sizes [65, +inf[ }); Please note that most of the code to do the actual lowering of non-power-of-2 sized types is currently missing, this is just trying to make it possible for targets to specify what is legal, and how non-legal types should be legalized. Probably quite a bit of further work is needed in the actual legalizing and the other passes in GlobalISel to support non-power-of-2 sized types. I hope the documentation in LegalizerInfo.h and the examples provided in the various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well enough how this is meant to be used. This drops the need for LLT::{half,double}...Size(). Differential Revision: https://reviews.llvm.org/D30529 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317560 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 10:34:34 +00:00
Bjorn Steinbrink	c1c411e7a8	[X86] Don't clobber reserved registers with stack adjustments Summary: Calls using invoke in funclet based functions are assumed to clobber all registers, which causes the stack adjustment using pops to consider all registers not defined by the call to be undefined, which can unfortunately include the base pointer, if one is needed. To prevent this (and possibly other hazards), skip reserved registers when looking for candidate registers. This fixes issue #45034 in the Rust compiler. Reviewers: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317551 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 08:50:21 +00:00
Craig Topper	c305f3d45a	[X86] Add patterns to fold a 64-bit load into the EVEX vcvtph2ps instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317548 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 07:13:07 +00:00
Craig Topper	7f4581842b	[X86] Add patterns for folding a v16i8 with the VEX vcvtph2ps intrinsics. Disable the peephole pass to prove that the pattern is working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317547 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 07:13:06 +00:00
Craig Topper	0c6a9e1d8e	[X86] Add support for using EVEX instructions for the legacy vcvtph2ps intrinsics. Looks like there's some missed load folding opportunities for i64 loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317544 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 07:13:03 +00:00
Craig Topper	f14ad9a908	[X86] Use IMPLICIT_DEF in VEX/EVEX vcvtss2sd/vcvtsd2ss patterns instead of a COPY_TO_REGCLASS. ExeDepsFix pass should take care of making the registers match. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317542 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 04:44:22 +00:00
Craig Topper	0b9dcde0fa	[X86] Remove 'Requires' from instructions with no patterns. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317541 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 04:44:21 +00:00
Matt Arsenault	db6cc311a0	AMDGPU: Remove redundant combine This combine was already done in two places. The generic combiner already has done this since r217610, for adds (with a single use). This one was added in r303641, and added support for handling or as well. r313251 later added support to the generic combine for or. It also turns out the isOrEquivalentToAdd check is not necessary for this combine. Additionally, we already reproduce this combine in yet another place in the backend, although in that version multiple uses of the add are still folded if it will allow a fold into the addressing mode. That version needs to be improved to understand ors though, as well as the correct legal offsets for private. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317526 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-07 00:06:32 +00:00
Craig Topper	acf8758c20	[X86] Make FeatureAVX512 imply FeatureF16C. The EVEX to VEX pass is already assuming this is true under AVX512VL. We had special patterns to use zmm instructions if VLX and F16C weren't available. Instead just make AVX512 imply F16C to make the EVEX to VEX behavior explicitly legal and remove the extra patterns. All known CPUs with AVX512 have F16C so this should safe for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317521 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 22:49:04 +00:00
Craig Topper	490bc3940d	[X86] Make FeatureAVX512 imply FeatureFMA. Previously our VEX patterns were checking Subtarget.hasFMA() which checked FMA \|\| AVX512. So we were behaving as if AVX512 implied it anyway. Which means we'd allow VEX encoded 128/256 FMA when AVX512F was enabled but AVX512VL is off. Regardless of the FMA flag. EVEX to VEX also transforms scalar EVEX FMA instructions to their VEX versions even without the FMA flag. Similarly for 128/256 under AVX512VL. So this makes AVX512 imply FeatureFMA to make our current behavior explicit. All known CPUs that support AVX512 have VEX FMA instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317520 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 22:49:01 +00:00
Graham Yiu	e005ea7d87	Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317508 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 21:04:19 +00:00
Graham Yiu	20a90ab746	Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word. Differential Revision: https://reviews.llvm.org/D34497 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317503 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 20:18:30 +00:00
Guozhi Wei	416cdcce39	[PPC] Use xxbrd to speed up bswap64 Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64. rotldi 5, 3, 16 rotldi 4, 3, 8 rotldi 9, 3, 24 rotldi 10, 3, 32 rotldi 11, 3, 48 rotldi 12, 3, 56 rldimi 4, 5, 8, 48 rldimi 4, 9, 16, 40 rldimi 4, 10, 24, 32 rldimi 4, 11, 40, 16 rldimi 4, 12, 48, 8 rldimi 4, 3, 56, 0 But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to: mtvsrdd 34, 3, 3 xxbrd 34, 34 mfvsrld 3, 34 Differential Revision: https://reviews.llvm.org/D39510 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317499 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 19:09:38 +00:00
Matt Arsenault	bd04b64cd1	AMDGPU: Select v_mad_u64_u32 and v_mad_i64_i32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317492 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 17:04:37 +00:00
Sanjay Patel	00e900afdb	[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html and again more recently: http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html ...this is a step in cleaning up our fast-math-flags implementation in IR to better match the capabilities of both clang's user-visible flags and the backend's flags for SDNode. As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'. We're also adding a bit to allow approximations for library functions called 'ApproxFunc' (this was initially proposed as 'libm' or similar). ...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day. We keep the 'fast' keyword. I thought about removing that, but seeing IR like this: %f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2 ...made me think we want to keep the shortcut synonym. Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. This statement: "Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR." ( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility ) ...provides the flexibility we want to make this change without requiring a new IR version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'. Note: an inter-dependent clang commit to use the new API name should closely follow commit. Differential Revision: https://reviews.llvm.org/D39304 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317488 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 16:27:15 +00:00
Simon Pilgrim	ae8a0007cb	[X86][SSE] Merge combineExtractVectorElt_SSE into combineExtractVectorElt. NFCI. We still early-out for X86ISD::PEXTRW/X86ISD::PEXTRB so no actual change in behaviour, but it'll make it easier to add support in a future patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317485 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 15:28:25 +00:00
Simon Pilgrim	6e1c5e0143	[X86][SSE] Combine EXTRACT_VECTOR_ELT with combineExtractWithShuffle before XFormVExtractWithShuffleIntoLoad combineExtractWithShuffle can handle more complex shuffles/bitcasts than we can with the equivalent code in XFormVExtractWithShuffleIntoLoad. Mainly a compile time improvement now (combineExtractWithShuffle combines will have always failed late on inside XFormVExtractWithShuffleIntoLoad), and will let us merge combineExtractVectorElt_SSE in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317481 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 14:34:19 +00:00
Yaxun Liu	ec1f0cc416	[AMDGPU] Change alloca addr space of r600 to 5 for amdgiz environment Differential Revision: https://reviews.llvm.org/D39657 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317479 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 14:32:33 +00:00
Jonas Paulsson	044ea898dd	[SystemZ] implement hasDivRemOp() SystemZ can do division and remainder in a single instruction for scalar integer types, which are now reflected by returning true in this hook for those cases. Review: Ulrich Weigand git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317477 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 13:10:31 +00:00
Yaxun Liu	13a223e6e6	[AMDGPU] Fix assertion due to assuming pointer in default addr space is 32 bit The backend assumes pointer in default addr space is 32 bit, which is not true for the new addr space mapping and causes assertion for unresolved functions. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39643 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 13:01:33 +00:00
Simon Dardis	dfaa4d2c2b	[mips] Add movep for microMIPS32R6 and fix microMIPS32r3 version Previously, the 'movep' instruction was defined for microMIPS32r3 and shared that definition with microMIPS32R6. 'movep' was re-encoded for microMIPS32r6, so this patch provides the correct encoding. Secondly, correct the encoding of the 'rs' and 'rt' operands which have an instruction specific encoding for the registers those operands accept. Finally, correct the decoding of the 'dst_regs' operand which was extracting the relevant field from the instruction, but was actually extracting the field from the alreadly extracted field. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39495 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317475 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 12:59:53 +00:00
Mohammed Agabaria	84066df4d9	[LV][X86] update the cost of interleaving mem. access of floats Recommit: This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8. fixed the location of the lit test it works with make check-all. Differential Revision: https://reviews.llvm.org/D39403 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317471 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 10:56:20 +00:00
Simon Dardis	194c54be45	[mips] Fix PR35140 Mark all symbols involved with TLS relocations as being TLS symbols. This resolves PR35140. Thanks to Alex Crichton for reporting the issue! Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39591 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317470 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 10:50:04 +00:00
Uriel Korach	d8ea26422c	[X86][AVX512] Improve lowering of AVX512 test intrinsics Added TESTM and TESTNM to the list of instructions that already zeroing unused upper bits and does not need the redundant shift left and shift right instructions afterwards. Added a pattern for TESTM and TESTNM in iselLowering, so now icmp(neq,and(X,Y), 0) goes folds into TESTM and icmp(eq,and(X,Y), 0) goes folds into TESTNM This commit is a preparation for lowering the test and testn X86 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317465 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 09:22:38 +00:00
Uriel Korach	16b230fc73	[X86] Replace duplicate function call with variable. NFC Change from: if (N->getOperand(0).getValueType() == MVT::v8i32 \|\| N->getOperand(0).getValueType() == MVT::v8f32) to: EVT OpVT = N->getOperand(0).getValueType(); if (OpVT == MVT::v8i32 \|\| OpVT == MVT::v8f32) Change-Id: I5a105f8710b73a828e6cfcd55fac2eae6153ce25 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317464 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 08:32:45 +00:00
Zvi Rackover	807235c9ea	X86 ISel: Basic support for variable-index vector permutations Summary: Try to lower a BUILD_VECTOR composed of extract-extract chains that can be reasoned to be a permutation of a vector by indices in a non-constant vector. We saw this pattern created by ISPC, which resolts to creating it due to the requirement that shufflevector's mask operand be a constant vector. I didn't check this but we could possibly use this pattern for lowering the X86 permute C-instrinsics instead of llvm.x86 instrinsics. This change can be followed by more improvements: 1. Handle vectors with undef elements. 2. Utilize pshufb and zero-mask-blending to support more effiecient construction of vectors with constant-0 elements. 3. Use smaller-element vectors of same width, and "interpolate" the indices, when no native operation available. Reviewers: RKSimon, craig.topper Reviewed By: RKSimon Subscribers: chandlerc, DavidKreitzer Differential Revision: https://reviews.llvm.org/D39126 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317463 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 08:25:46 +00:00

1 2 3 4 5 ...

45261 Commits