archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Sam Tebbs	a71ac42554	[ARM] Add support for the MVE long shift instructions MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers. The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl. test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions. Differential Revision: https://reviews.llvm.org/D63430 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364654 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 15:43:31 +00:00
David Green	2dd6a9d6de	[ARM] Mark math routines as non-legal for MVE This adds handling and tests for a number of floating point math routines, which have no MVE instructions. Differential Revision: https://reviews.llvm.org/D63725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364641 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 11:17:38 +00:00
David Green	0272917d5a	[ARM] Widening loads and narrowing stores MVE has instructions to widen as it loads, and narrow as it stores. This adds the required patterns and legalisation to make them work including specifying that they are legal, patterns to select them and test changes. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364636 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 09:47:55 +00:00
David Green	c2c9f7413e	[ARM] MVE loads and stores This fills in the gaps for basic MVE loads and stores, allowing unaligned access and adding far too many tests. These will become important as narrowing/expanding and pre/post inc are added. Big endian might still not be handled very well, because we have not yet added bitcasts (and I'm not sure how we want it to work yet). I've included the alignment code anyway which maps with our current patterns. We plan to return to that later. Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D63838 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364633 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 08:41:40 +00:00
David Green	67091fc7e1	[ARM] Mark div and rem as expand for MVE We don't have vector operations for these, so they need to be expanded for both integer and float. Differential Revision: https://reviews.llvm.org/D63595 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364631 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 08:18:55 +00:00
David Green	3b9e91d904	[ARM] MVE vector shuffles This patch adds necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like upcoming patches). This mostly uses the same code from Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific. Most code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63567 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364626 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 07:08:42 +00:00
Sam Tebbs	731fa4b917	[ARM] Fix formatting issue in ARMISelLowering.cpp Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test commit after receiving write access. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364560 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-27 16:28:28 +00:00
Eli Friedman	f5ce44c687	[ARM] Don't reserve R12 on Thumb1 as an emergency spill slot. The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1. (Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.) A rough outline of the changes in the patch: 1. Gets rid of ThumbRegisterInfo::saveScavengerRegister. 2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1. 3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1. 4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot. 5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible. 6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer. 7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward. 8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging. Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1. Differential Revision: https://reviews.llvm.org/D63677 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364490 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-26 23:46:51 +00:00
Fangrui Song	39f43cfa55	[ARM] Fix -Wimplicit-fallthrough after D60709/r364331 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364376 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-26 02:34:10 +00:00
Simon Tatham	6b3b0e03f0	[ARM] Support inline assembler constraints for MVE. "To" selects an odd-numbered GPR, and "Te" an even one. There are some 8.1-M instructions that have one too few bits in their register fields and require registers of particular parity, without necessarily using a consecutive even/odd pair. Also, the constraint letter "t" should select an MVE q-register, when MVE is present. This didn't need any source changes, but some extra tests have been added. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60709 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364331 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 16:49:32 +00:00
Simon Tatham	0be2584eea	[ARM] Code-generation infrastructure for MVE. This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and if you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364329 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 16:48:46 +00:00
Fangrui Song	8a62fe5307	[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364312 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 13:28:44 +00:00
Simon Tatham	796e019154	[ARM] Explicit lowering of half <-> double conversions. If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f32 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364294 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 11:24:50 +00:00
Simon Pilgrim	9b95d3f22f	[TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363179 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-12 17:14:03 +00:00
David Green	0f4acbbf40	[ARM] Adjust isLegalT1AddressImmediate for non-legal types Types such as float and i64's do not have legal loads in Thumb1, but will still be loaded with a LDR (or potentially multiple LDR's). As such we can treat the cost of addressing mode calculations the same as an i32 and get some optimisation benefits. Differential Revision: https://reviews.llvm.org/D62968 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362874 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-08 10:32:53 +00:00
David Green	f62bbb6909	[ARM] Add MVE addressing to isLegalT2AddressImmediate Now with MVE being added, we can add the vector addressing mode costs for it. These are generally imm7 multiplied by the size of the type being loaded / stored. Differential Revision: https://reviews.llvm.org/D62967 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362873 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-08 10:18:23 +00:00
David Green	3ca35d81c1	[ARM] Add fp16 addressing to isLegalT2AddressImmediate The fp16 version of VLDR takes a imm8 multiplied by 2. This updates the costs to account for those, and adds extra testing. It is dependant upon hasFPRegs16 as this is what the load/store instructions require. Differential Revision: https://reviews.llvm.org/D62966 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362872 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-08 10:09:02 +00:00
Diogo N. Sampaio	7edd97b87c	[ARM][FIX] Ran out of registers due tail recursion Summary: - pr42062 When compiling for MinSize, ARMTargetLowering::LowerCall decides to indirect multiple calls to a same function. However, it disconsiders the limitation that thumb1 indirect calls require the callee to be in a register from r0 to r3 (llvm limiation). If all those registers are used by arguments, the compiler dies with "error: run out of registers during register allocation". This patch tells the function IsEligibleForTailCallOptimization if we intend to perform indirect calls, as to avoid tail call optimization. Reviewers: dmgreen, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62683 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362366 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-03 08:58:05 +00:00
Simon Pilgrim	bf4c5f4724	[ARM] LowerVECTOR_SHUFFLE - fix uninitialized variable warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362094 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-30 14:01:24 +00:00
Simon Tatham	2edda5220d	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361845 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-28 16:13:20 +00:00
Alexander Timofeev	d224ecc383	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361741 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-26 20:33:26 +00:00
David Green	b459251126	[ARM] Select a number of fp16 rounding functions This add patterns for fp16 round and ceil etc. Same as the float and double patterns. Differential Revision: https://reviews.llvm.org/D62326 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361718 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-26 11:13:00 +00:00
David Green	b089430c53	[ARM] Promote various fp16 math intrinsics Promote a number of fp16 math intrinsics to float, so that the relevant float math routines can be used. Copysign is expanded so as to be handled in-place. Differential Revision: https://reviews.llvm.org/D62325 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361717 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-26 10:59:21 +00:00
David Green	6e367e39d6	[ARM] Promote fp16 frem Promote fp16 frem operations on ARM to floats so they call fmodf. Differential Revision: https://reviews.llvm.org/D62321 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361713 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-26 10:30:22 +00:00
Peter Collingbourne	d7a83f9517	Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361688 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-25 01:52:38 +00:00
Alexander Timofeev	6a29119c95	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361644 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-24 15:32:18 +00:00
David Green	769d167a6f	[ARM] Don't use the Machine Scheduler for cortex-m at minsize The new cortex-m schedule in rL360768 helps performance, but can increase the amount of high-registers used. This, on average, ends up increasing the codesize by a fair amount (because less instructions are converted from T2 to T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use the existing DAG scheduler with the RegPressure scheduling preference (at least until the issues around T2 vs T1 instructions can be improved). I have also made sure that the Sched::RegPressure dag scheduler is always chosen for MinSize. The test shows one case where we increase the number of registers used. Differential Revision: https://reviews.llvm.org/D61882 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360769 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-15 12:58:02 +00:00
Eli Friedman	69b042c01d	[ARM] Glue register copies to tail calls. This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360099 91177308-0d34-0410-b5e6-96231b3b80d8	2019-05-06 23:21:59 +00:00
Sjoerd Meijer	67c2691c8d	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359537 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-30 08:38:12 +00:00
David Green	cc532a593e	[ARM] Rewrite isLegalT2AddressImmediate This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seems to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358845 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-21 09:54:29 +00:00
Simon Pilgrim	29d0764b94	[TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358526 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-16 20:57:28 +00:00
Evandro Menezes	d71ea05ab7	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357731 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-04 22:40:06 +00:00
Eli Friedman	95e6c7dcce	[ARM] Optimize expressions like "return x != 0;" for Thumb1. There's an existing optimization for x != C, but somehow it was missing a special case for 0. While I'm here, also cleaned up the code/comments a bit: the second value produced by the MERGE_VALUES was actually dead, since a CMOV only produces one result. Differential Revision: https://reviews.llvm.org/D59616 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357437 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-02 00:01:23 +00:00
Eli Friedman	fee8cc83e8	[ARM] Add missing memory operands to a bunch of instructions. This should hopefully lead to minor improvements in code generation, and more accurate spill/reload comments in assembly. Also fix isLoadFromStackSlotPostFE/isStoreToStackSlotPostFE so they don't lead to misleading assembly comments for merged memory operands; this is technically orthogonal, but in practice the relevant memory operand lists don't show up without this change. Differential Revision: https://reviews.llvm.org/D59713 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356963 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-25 22:42:30 +00:00
Simon Pilgrim	6325bddd1e	[Thumb] Fix infinite loop in ABS expansion (PR41160) Don't expand ISD::ABS node if its legal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356661 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-21 12:41:18 +00:00
Simon Pilgrim	eea77acda7	Fix unused variable warning. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356474 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-19 16:49:59 +00:00
Simon Pilgrim	5425f5c42b	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356468 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-19 16:24:55 +00:00
Adhemerval Zanella	0ce3660e40	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356389 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-18 18:40:07 +00:00
Tim Renouf	a4f465e4c7	[ARM] Fixed an assumption of power-of-2 vector MVT I am about to introduce some non-power-of-2 width vector MVTs. This commit fixes a power-of-2 assumption that my forthcoming change would otherwise break, as shown by test/CodeGen/ARM/vcvt_combine.ll and vdiv_combine.ll. Differential Revision: https://reviews.llvm.org/D58927 Change-Id: I56a282e365d3874ab0621e5bdef98a612f702317 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356341 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-17 20:48:54 +00:00
Florian Hahn	3cd4111205	[ARM] Sink zext/sext operands for add and sub to enable vsubl generation. This uses the infrastructure added in rL353152 to sink zext and sexts to sub/add users, to enable vsubl/vaddl generation when NEON is available. See https://bugs.llvm.org/show_bug.cgi?id=40025. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D58063 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355460 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-06 00:10:03 +00:00
Oliver Stannard	3a22eb400f	[ARM] Fix select_cc lowering for fp16 When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355385 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-05 10:42:34 +00:00
Oliver Stannard	c064077099	[ARM] Consider undefined-on-NaN conditions in checkVSELConstraints This function was not checking for the condition code variants which are undefined if either input is NaN, so we were missing selection of the VSEL instruction in some cases when using -fno-honor-nans or -ffast-math. Differential revision: https://reviews.llvm.org/D58812 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355199 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-01 13:58:25 +00:00
Kadir Cetinkaya	0ad049beea	[Target][ARM] Add a usage for SrcSz to unbreak build-bots without assertions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355101 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-28 15:55:11 +00:00
Bjorn Pettersson	85de1fd399	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355099 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-28 15:45:29 +00:00
Sam Parker	fcb98b7928	[ARM] Add OptMinSize to ARMSubtarget In many places in the backend, we like to know whether we're optimising for code size and this is performed by checking the current machine function attributes. A subtarget is created on a per-function basis, so it's possible to know when we're compiling for code size on construction so record this in the new object. Differential Revision: https://reviews.llvm.org/D57812 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353501 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-08 07:57:42 +00:00
James Y Knight	3bab951f0f	[opaque pointer types] Pass value type to GetElementPtr creation. This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352913 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-01 20:44:47 +00:00
Reid Kleckner	2f8d69acb0	[ARM] Deduplicate table generated CC analysis code Create ARMCallingConv.cpp and emit code for calling convention analysis from there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352431 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-28 21:28:43 +00:00
Matt Arsenault	52f6127659	Reapply "IR: Add fp operations to atomicrmw" This reapplies commits r351778 and r351782 with RISCV test fixes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351850 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-22 18:18:02 +00:00
Chandler Carruth	73f9a1d617	Revert r351778: IR: Add fp operations to atomicrmw This broke the RISCV build, and even with that fixed, one of the RISCV tests behaves surprisingly differently with asserts than without, leaving there no clear test pattern to use. Generally it seems bad for hte IR to differ substantially due to asserts (as in, an alloca is used with asserts that isn't needed without!) and nothing I did simply would fix it so I'm reverting back to green. This also required reverting the RISCV build fix in r351782. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351796 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-22 10:29:58 +00:00
Matt Arsenault	020dc8d94b	IR: Add fp operations to atomicrmw Add just fadd/fsub for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351778 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-22 03:32:36 +00:00

1 2 3 4 5 ...

1672 Commits