archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Nirav Dave	3bbf394145	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting with compiler time improvements Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297695 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-14 00:34:14 +00:00
Tim Northover	d0188c3d44	Revert "GlobalISel: move vector extract/insert inside generic opcode region." I was writing against an earlier branch and Volkan had already fixed this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297668 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 21:25:10 +00:00
Tim Northover	6ac2f2bb7f	GlobalISel: move vector extract/insert inside generic opcode region. Otherwise they won't be legalized or selected, causing instruction selection to fail horribly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297666 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 21:18:59 +00:00
Volkan Keles	7fdb926c14	[GlobalISel] Update PRE_ISEL_GENERIC_OPCODE_END marker git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297663 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 20:31:45 +00:00
Jessica Paquette	af024ba867	[Outliner] Add tail call support This commit adds tail call support to the MachineOutliner pass. This allows the outliner to insert jumps rather than calls in areas where tail calling is possible. Outlined tail calls include the return or terminator of the basic block being outlined from. Tail call support allows the outliner to take returns and terminators into consideration while finding candidates to outline. It also allows the outliner to save more instructions. For example, in the X86-64 outliner, a tail called outlined function saves one instruction since no return has to be inserted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297653 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 18:39:33 +00:00
Craig Topper	27a7060a36	[SelectionDAG] Enhance SDTCisSameNumEltsAs to work with scalar types and use it on extend/trunc/round operations. Currently we don't enforce that ISD::ANY_EXTEND, ZERO_EXTEND, SIGN_EXTEND, TRUNC, FP_ROUND, FP_EXTEND have the same number of elements(including scalar) between their input and output. Though we have them documented as such. Up until a few months ago x86 created nodes that violated this rule. That's all been fixed now, and we should enforce the rule going forward. In order to do this we need to allow SDTCisSameNumEltsAs to support scalar types and not enforce being a vector. If one type is scalar we will force the other type to also be scalar. Differential Revision: https://reviews.llvm.org/D30878 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297648 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 17:37:14 +00:00
Volkan Keles	27b5bfd0d0	[GlobalISel] Translate insertelement and extractelement Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297495 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-10 19:08:28 +00:00
Eli Friedman	bfa11453a9	Refactor alias check from MISched into common helper. NFC. Differential Revision: https://reviews.llvm.org/D30598 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297421 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-09 23:33:36 +00:00
Volkan Keles	0e1e54e6f2	[GlobalISel] Translate floating-point negation Reviewers: qcolombet, javed.absar, aditya_nandakumar, dsanders, t.p.northover, ab Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30671 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297171 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 18:03:28 +00:00
Tim Northover	2c87ca8a0e	GlobalISel: restrict G_EXTRACT instruction to just one operand. A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297100 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-06 23:50:28 +00:00
Volkan Keles	6f84539d5d	[GlobalISel] Fix G_FPEXT’s description. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297088 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-06 22:47:19 +00:00
Jessica Paquette	d43adee378	[Outliner] Fixed Asan bot failure in r296418 Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297081 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-06 21:31:18 +00:00
Simon Pilgrim	ca6750e3d5	[X86][SSE] Lower 128-bit vectors to SIGN/ZERO_EXTEND_VECTOR_IN_REG ops As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT. We're missing a couple of shuffle combines that will be added in a future patch for review. Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases. Differential Revision: https://reviews.llvm.org/D30549 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296985 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-05 09:57:20 +00:00
Sanjay Patel	59071e49a9	[DAGCombiner] allow transforming (select Cond, C +/- 1, C) to (add(ext Cond), C) select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook. This is part of the ongoing process to obsolete D24480. The motivation is to canonicalize to select IR in InstCombine whenever possible, so we need to have a way to undo that easily in codegen. PowerPC is an obvious winner for this kind of transform because it has fast and complete bit-twiddling abilities but generally lousy conditional execution perf (although this might have changed in recent implementations). x86 also sees some wins, but the effect is limited because these transforms already mostly exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86 changes just shows that that code is a mess of special-case holes. We may be able to remove some of that logic now. My guess is that other targets will want to enable this hook for most cases. The likely follow-ups would be to add value type and/or the constants themselves as parameters for the hook. As the tests in select_const.ll show, we can transform any select-of-constants to math/logic, but the general transform for any 2 constants needs one more instruction (multiply or 'and'). ARM is one target that I think may not want this for most cases. I see infinite loops there because it wants to use selects to enable conditionally executed instructions. Differential Revision: https://reviews.llvm.org/D30537 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296977 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-04 19:18:09 +00:00
Tim Northover	ce6504cc93	GlobalISel: constrain G_INSERT to inserting just one value per instruction. It's much easier to reason about single-value inserts and no-one was actually using the variadic variants before. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296923 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 23:05:47 +00:00
Tim Northover	f4d294cc96	GlobalISel: add merge/unmerge nodes for legalization. These are simplified variants of the current G_SEQUENCE and G_EXTRACT, which assume the individual parts will be contiguous, homogeneous, and occupy the entirity of the larger register. This makes reasoning about them much easer since you only have to look at the first register being merged and the result to know what the instruction is doing. I intend to gradually replace all uses of the more complicated sequence/extract with these (or single-element insert/extracts), and then remove the older variants. For now we start with legalization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296921 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 22:46:09 +00:00
Krzysztof Parzyszek	88a7ff46b1	Make TargetInstrInfo::isPredicable take a const reference, NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296901 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 18:30:54 +00:00
Chandler Carruth	f970832c3b	[SDAG] Revert r296476 (and r296486, r296668, r296690). This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296862 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 10:02:25 +00:00
Reid Kleckner	4c3428b604	Elide argument copies during instruction selection Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters. This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it. Supersedes D28388 Fixes PR26328 Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29668 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296683 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-01 21:42:00 +00:00
Nirav Dave	bfdb3f2a5a	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 14:24:15 +00:00
Matthias Braun	73ddbb7dff	Revert "Add MIR-level outlining pass" Revert Machine Outliner for now, as it breaks the asan bot. This reverts commit r296418. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296426 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 02:24:30 +00:00
Matthias Braun	c043a889f1	Add MIR-level outlining pass This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable. This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test. The outliner is run like so: clang -mno-red-zone -mllvm -enable-machine-outliner file.c Patch by Jessica Paquette<jpaquette@apple.com>! rdar://29166825 Differential Revision: https://reviews.llvm.org/D26872 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296418 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 00:33:32 +00:00
Nirav Dave	b89cc7e5e3	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r296252 until 256-bit operations are more efficiently generated in X86. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296279 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-26 01:27:32 +00:00
Nirav Dave	32147cef64	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296252 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-25 11:43:58 +00:00
Stanislav Mekhanoshin	fef0dbe59c	Revert "Correct register pressure calculation in presence of subregs" This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296182 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-24 21:56:16 +00:00
Stanislav Mekhanoshin	0bf4d71d50	Correct register pressure calculation in presence of subregs If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296009 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 20:19:44 +00:00
Matt Arsenault	5736385164	TargetOptions: Fix not accounting for NoSignedZerosFPMath in == git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295928 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 03:16:44 +00:00
Geoff Berry	fc170d8f5d	[CodeGenPrepare] Sink and duplicate more 'and' instructions. Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64. The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact the OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used. One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set to avoid an infinite loop of hoisting and sinking the same 'and'. This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382). Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: aemerson, mcrosier, sebpop, llvm-commits Differential Revision: https://reviews.llvm.org/D28813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295746 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-21 18:53:14 +00:00
Hans Wennborg	b6ae6ad928	[X86] Re-enable conditional tail calls and fix PR31257. This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295262 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-16 00:04:05 +00:00
Tim Northover	432026394b	GlobalISel: support translating va_arg Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295254 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-15 23:22:33 +00:00
Simon Pilgrim	b529f91924	Fix spelling mistake - paramater -> parameter. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295182 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-15 15:11:36 +00:00
Tim Northover	173c7d4750	GlobalISel: introduce G_PTR_MASK to simplify alloca handling. This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295103 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-14 20:56:18 +00:00
Reid Kleckner	cfee44a41c	[CodeGen] Use bitfields instead of manual masks in ArgFlagsTy, NFC This revealed that we actually have 8 more unused flag bits, and byval size doesn't need to be a bitfield at all. This came up during code review here: https://reviews.llvm.org/D29668#inline-258469 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294989 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-13 21:33:26 +00:00
Tim Northover	d8f030286c	GlobalISel: translate @llvm.pow intrinsic to G_FPOW. It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it so we'd just as well enable that flexibility from the start. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294530 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 23:23:32 +00:00
Tim Northover	063022b81a	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294519 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 21:22:15 +00:00
Tim Northover	de0a86879f	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294492 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 17:57:20 +00:00
Matt Arsenault	a647924c48	TargetLowering: Remove AddrSpace parameter from GetAddrModeArguments It doesn't make any sense to pass in to what is supposed to be parsing the call, and this can be inferred from the pointer output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294412 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 07:09:03 +00:00
Hans Wennborg	34a6e0d36a	[X86] Disable conditional tail calls (PR31257) They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294348 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-07 20:37:45 +00:00
Sanjoy Das	a0be0b1390	[ImplicitNullCheck] Extend Implicit Null Check scope by using stores Summary: This change allows usage of store instruction for implicit null check. Memory Aliasing Analisys is not used and change conservatively supposes that any store and load may access the same memory. As a result re-ordering of store-store, store-load and load-store is prohibited. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: atrick, llvm-commits Differential Revision: https://reviews.llvm.org/D29400 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294338 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-07 19:19:49 +00:00
Nirav Dave	529986a15d	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293915 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-02 18:24:55 +00:00
Nirav Dave	99b0642f83	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293893 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-02 14:39:42 +00:00
Dehao Chen	fe46230d58	Change debug-info-for-profiling from a TargetOption to a function attribute. Summary: LTO requires the debug-info-for-profiling to be a function attribute. Reviewers: echristo, mehdi_amini, dblaikie, probinson, aprantl Reviewed By: mehdi_amini, dblaikie, aprantl Subscribers: aprantl, probinson, ahatanak, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29203 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293833 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-01 22:45:09 +00:00
Evandro Menezes	20de1ea345	[CodeGen] Move MacroFusion to the target This patch moves the class for scheduling adjacent instructions, MacroFusion, to the target. In AArch64, it also expands the fusion to all instructions pairs in a scheduling block, beyond just among the predecessors of the branch at the end. Differential revision: https://reviews.llvm.org/D28489 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293737 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-01 02:54:34 +00:00
Nirav Dave	53d52e9e2b	[X86] Implement -mfentry Summary: Insert calls to __fentry__ at function entry. Reviewers: hfinkel, craig.topper Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D28000 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293648 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-31 17:00:27 +00:00
Kristof Beyls	e9939d4d42	[GlobalISel] Add support for indirectbr Differential Revision: https://reviews.llvm.org/D28079 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293470 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-30 09:13:18 +00:00
David Majnemer	7bf48fa3f7	[Target] Add NoSignedZerosFPMath to the TargetOptions constructor Most flags were already initialized by the TargetOptions constructor but we missed out on one. Also, simplify the constructor by using field initializers when possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293406 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-29 01:27:08 +00:00
Stanislav Mekhanoshin	be4948cead	Replace addEarlyAsPossiblePasses callback with adjustPassManager This change introduces adjustPassManager target callback giving a target an opportunity to tweak PassManagerBuilder before pass managers are populated. This generalizes and replaces addEarlyAsPossiblePasses target callback. In particular that can be used to add custom passes to extension points other than EP_EarlyAsPossible. Differential Revision: https://reviews.llvm.org/D28336 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293189 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 16:49:08 +00:00
Nirav Dave	d9031ef908	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293184 which is failing in LTO builds git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293188 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 16:46:13 +00:00
Nirav Dave	dbb7a65598	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293184 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 16:02:24 +00:00
Krzysztof Parzyszek	d40c764784	Add iterator_range<regclass_iterator> to {Target,MC}RegisterInfo, NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293077 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-25 19:29:04 +00:00

1 2 3 4 5 ...

3431 Commits