RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-02-05 18:17:00 +00:00

Author	SHA1	Message	Date
Jingyue Wu	c4aff77ec0	[SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Summary: This patch implements my promised optimization to reunites certain sexts from operands after we extract the constant offset. See the header comment of reuniteExts for its motivation. One key building block that enables this optimization is Bjarke's poison value analysis (D11212). That helps to prove "a +nsw b" can't overflow. Reviewers: broune Subscribers: jholewinski, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12016 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245003 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-14 02:02:05 +00:00
Alex Lorenz	5d09c2f25d	MIR Serialization: Change MIR syntax - use custom syntax for MBBs. This commit modifies the way the machine basic blocks are serialized - now the machine basic blocks are serialized using a custom syntax instead of relying on YAML primitives. Instead of using YAML mappings to represent the individual machine basic blocks in a machine function's body, the new syntax uses a single YAML block scalar which contains all of the machine basic blocks and instructions for that function. This is an example of a function's body that uses the old syntax: body: - id: 0 name: entry instructions: - '%eax = MOV32r0 implicit-def %eflags' - 'RETQ %eax' ... The same body is now written like this: body: \| bb.0.entry: %eax = MOV32r0 implicit-def %eflags RETQ %eax ... This syntax change is motivated by the fact that the bundled machine instructions didn't map that well to the old syntax which was using a single YAML sequence to store all of the machine instructions in a block. The bundled machine instructions internally use flags like BundledPred and BundledSucc to determine the bundles, and serializing them as MI flags using the old syntax would have had a negative impact on the readability and the ease of editing for MIR files. The new syntax allows me to serialize the bundled machine instructions using a block construct without relying on the internal flags, for example: BUNDLE implicit-def dead %itstate, implicit-def %s1 ... { t2IT 1, 24, implicit-def %itstate %s1 = VMOVS killed %s0, 1, killed %cpsr, implicit killed %itstate } This commit also converts the MIR testcases to the new syntax. I developed a script that can convert from the old syntax to the new one. I will post the script on the llvm-commits mailing list in the thread for this commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244982 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 23:10:16 +00:00
Ahmed Bougacha	7409d94d2d	[AArch64] Provide "too few operands" diags on short-form NEON also. We used to just say "invalid type suffix for instruction", which is misleading. This is because we fallback to the long-form matcher if the short-form matcher failed, losing the error information on the way. Save it, so that we can provide a little better diagnostics when the long-form matcher thinks a suffix is the cause of the error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244955 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 21:09:13 +00:00
Alex Lorenz	1b93706717	MIR Parser: Don't allow negative alignments for memory operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244953 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 20:55:01 +00:00
Davide Italiano	d0a074c824	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244947 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 20:34:26 +00:00
Simon Pilgrim	d85f3b303d	[X86][SSE] Tests for SMAX/SMIN/UMAX/UMIN vector instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244944 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 20:31:03 +00:00
Jingyue Wu	a49e11047c	[SeparateConstOffsetFromGEP] strengthen the inbounds attribute We used to be over-conservative about preserving inbounds. Actually, the second GEP (which applies the constant offset) can inherit the inbounds attribute of the original GEP, because the resultant pointer is equivalent to that of the original GEP. For example, x = GEP inbounds a, i+5 => y = GEP a, i // inbounds removed x = GEP inbounds y, 5 // inbounds preserved git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244937 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 18:48:49 +00:00
Nemanja Ivanovic	31f6eee816	Scalar to vector conversions using direct moves This patch corresponds to review: http://reviews.llvm.org/D11471 It improves the code generated for converting a scalar to a vector value. With direct moves from GPRs to VSRs, we no longer require expensive stack operations for this. Subsequent patches will handle the reverse case and more general operations between vectors and their scalar elements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244921 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 17:40:44 +00:00
Igor Laevsky	ea56ef761a	Emit argmemonly attribute for intrinsics. Differential Revision: http://reviews.llvm.org/D11352 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244920 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 17:40:04 +00:00
James Molloy	a215ac72ef	[ARM] Rejig vmax tests a bit They rely on global fast-math options, but soon ISel will rely only on fast-math flags on the instructions themselves. Rip the fast checks out into their own file so we can mark their instructions as fast. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244914 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 17:28:16 +00:00
James Molloy	8eafed2468	[AArch64] Small rejig of fmax tests, NFCI. These tests relied on -enable-no-nans-fp-math, whereas soon they'll take their no-nans hint from the FCMP instruction itself, so split the no-nans stuff out into its own test. Also do a slight rejig of instruction order. The old FMIN/MAX backend matching had to deal with looking through casts, which it never did particularly well. Now, instcombine will recognize such patterns and canonicalize the cast outside the select. So modify the test inputs to assume that instcombine has already run. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244913 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 17:28:10 +00:00
Erik Eckstein	22af77d94f	[DeadStoreElimination] remove a redundant store even if the load is in a different block. DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location. So far this worked only if the store is in the same block as the load. Now we can also handle stores which are in a different block than the load. Example: define i32 @test(i1, i32) { entry: %l2 = load i32, i32 %1, align 4 br i1 %0, label %bb1, label %bb2 bb1: br label %bb3 bb2: ; This store is redundant store i32 %l2, i32* %1, align 4 br label %bb3 bb3: ret i32 0 } Differential Revision: http://reviews.llvm.org/D11854 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244901 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 15:36:11 +00:00
Petar Jovanovic	319eb435fd	[mips][mcjit] Calculate correct addend for HI16 and PCHI16 reloc Previously, for O32 ABI we did not calculate correct addend for R_MIPS_HI16 and R_MIPS_PCHI16 relocations. This patch fixes that. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D11186 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244897 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 15:12:49 +00:00
Joseph Tremoulet	ccc0cf3d8d	[WinEHPrepare] Update demotion logic Summary: Update the demotion logic in WinEHPrepare to avoid creating new cleanups by walking predecessors as necessary to insert stores for EH-pad PHIs. Also avoid creating stores for EH-pad PHIs that have no uses. The store/load placement is still pretty naive. Likely future improvements (at least for optimized compiles) include: - Share loads for related uses as possible - Coalesce non-interfering use/def-related PHIs - Store at definition point rather than each PHI pred for non-interfering lifetimes. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11955 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244894 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 14:30:10 +00:00
Ulrich Weigand	69585fd51b	[SystemZ] Support large LLVM IR struct return values Recent mesa/llvmpipe crashes on SystemZ due to a failed assertion when attempting to compile a routine with a return type of { <4 x float>, <4 x float>, <4 x float>, <4 x float> } on a system without vector instruction support. This is because after legalizing the vector type, we get a return value consisting of 16 floats, which cannot all be returned in registers. Usually, what should happen in this case is that the target's CanLowerReturn routine rejects the return type, in which case SelectionDAG falls back to implementing a structure return in memory via implicit reference. However, the SystemZ target never actually implemented any CanLowerReturn routine, and thus would accept any struct return type. This patch fixes the crash by implementing CanLowerReturn. As a side effect, this also handles fp128 return values, fixing a todo that was noted in SystemZCallingConv.td. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244889 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 13:37:06 +00:00
Charlie Turner	99985c9fac	[InstCombinePHI] Partial simplification of identity operations. Consider this code: BB: %i = phi i32 [ 0, %if.then ], [ %c, %if.else ] %add = add nsw i32 %i, %b ... In this common case the add can be moved to the %if.else basic block, because adding zero is an identity operation. If we go though %if.then branch it's always a win, because add is not executed; if not, the number of instructions stays the same. This pattern applies also to other instructions like sub, shl, shr, ashr \| 0, mul, sdiv, div \| 1. Patch by Jakub Kuderski! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244887 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 12:38:58 +00:00
John Brawn	4d88daed01	[ARM] Reorganise and simplify thumb-1 load/store selection Other than PC-relative loads/store the patterns that match the various load/store addressing modes have the same complexity, so the order that they are matched is the order that they appear in the .td file. Rearrange the instruction definitions in ARMInstrThumb.td, and make use of AddedComplexity for PC-relative loads, so that the instruction matching order is the order that results in the simplest selection logic. This also makes register-offset load/store be selected when it should, as previously it was only selected for too-large immediate offsets. Differential Revision: http://reviews.llvm.org/D11800 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244882 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 10:48:22 +00:00
Simon Pilgrim	335fc61873	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244872 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 07:39:03 +00:00
Ahmed Bougacha	5689f67ef7	[CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating. Now that we can properly promote mismatched FCOPYSIGNs (r244858), we can mark the FP_ROUND on the result as truncating, to expose folding. FCOPYSIGN doesn't change anything but the sign bit, so (fp_round (fcopysign (fpext a), b)) is equivalent to (modulo the sign bit): (fp_round (fpext a)) which is a no-op. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244862 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 01:32:30 +00:00
Ahmed Bougacha	c5b90eb284	[AArch64] Cleanup vector-fcopysign.ll test. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244861 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 01:20:38 +00:00
Ahmed Bougacha	2e22177214	[AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN. We can lower them using our cool tricks if we fpext/fptrunc the second input, like we do for f32/f64. Follow-up to r243924, r243926, and r244858. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244860 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 01:13:56 +00:00
Ahmed Bougacha	39d772ac64	[CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand. We don't care about its type, and there's even a combine that'll fold away the FP_EXTEND if we let it run. However, until it does, we'll have something broken like: (f32 (fp_extend (f64 v))) Scalar f16 follow-up to r243924. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244858 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-13 01:09:43 +00:00
Philip Reames	bf09064f46	[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints To be clear: this is an optimization not a correctness change. CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll. One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust. I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially. Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly. Differential Revision: http://reviews.llvm.org/D11819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244821 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 22:11:45 +00:00
Alex Lorenz	79faadca28	MIR Parser: Allow the MI IR references to reference global values. This commit fixes a bug where MI parser couldn't resolve the named IR references that referenced named global values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244817 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 21:27:16 +00:00
Alex Lorenz	c338a581fd	MIR Serialization: Serialize the fixed stack pseudo source values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244816 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 21:23:17 +00:00
Alex Lorenz	710eecab5d	MIR Serialization: Serialize the jump table pseudo source values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244813 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 21:11:08 +00:00
Alex Lorenz	6b8e62f3f5	MIR Serialization: Serialize the GOT pseudo source values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244809 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 21:00:22 +00:00
Philip Reames	5119cb9b04	[RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :) This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244808 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 21:00:20 +00:00
Alex Lorenz	3f0c495bbb	MIR Serialization: Serialize the stack pseudo source values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244806 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 20:44:16 +00:00
Alex Lorenz	ad20340006	MIR Serialization: Serialize the constant pool pseudo source values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244803 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 20:33:26 +00:00
JF Bastien	5f53aadd9c	WebAssembly: floating-point comparisons Summary: D11924 implemented part of the floating-point comparisons, this patch implements the rest: * Tell ISelLowering that all booleans are either 0 or 1. * Expand the eq/ne/lt/le/gt/ge floating-point comparisons to the canonical ones (similar to what Mips32r6InstrInfo.td does). * Add tests for ord/uno. * Add tests for ueq/one/ult/ule/ugt/uge. * Fix existing comparison tests to remove the (res & 1) code, which setBooleanContents stops from generating. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11970 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244779 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 17:53:29 +00:00
Simon Pilgrim	c29cf29e62	Cleaned up test. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244765 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 17:00:50 +00:00
John Brawn	1843d3de5d	Redo "Make global aliases have symbol size equal to their type" r242520 was reverted in r244313 as the expected behaviour of the alias attribute in C is that the alias has the same size as the aliasee. However we can re-introduce adding the size on the alias when the aliasee does not, from a source code or object perspective, exist as a discrete entity. This happens when the aliasee is not a symbol, or when that symbol is private. Differential Revision: http://reviews.llvm.org/D11943 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244752 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 15:05:39 +00:00
John Brawn	1a5ed7be24	[GlobalMerge] Only emit aliases for internal linkage variables for non-Mach-O On Mach-O emitting aliases for the variables that make up a MergedGlobals variable can cause problems when linking with dead stripping enabled so don't do that, except for external variables where we must emit an alias. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244748 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 13:36:48 +00:00
Zoran Jovanovic	f7d96a06fb	[mips][microMIPS] Create microMIPS64r6 subtarget and implement DALIGN, DAUI, DAHI, DATI, DEXT, DEXTM and DEXTU instructions Differential Revision: http://reviews.llvm.org/D10923 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244744 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 12:45:16 +00:00
Michael Kuperstein	2fa1b0f1fc	[X86] Disable mul -> shl + lea combine when compiling for minsize Differential Revision: http://reviews.llvm.org/D11904 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244740 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 11:27:26 +00:00
Davide Italiano	6a7d7ba152	[MC] Convert the last test using macho-dump under X86/ to llvm-readobj. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244732 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 10:36:16 +00:00
Michael Kuperstein	426921ffc7	[X86] Allow x86 call frame optimization to fold more loads into pushes This abstracts away the test for "when can we fold across a MachineInstruction" into the the MI interface, and changes call-frame optimization use the same test the peephole optimizer users. Differential Revision: http://reviews.llvm.org/D11945 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244729 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 10:14:58 +00:00
Matt Arsenault	892803aa81	AMDGPU: Fix assert on dbg_value instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244728 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 09:04:44 +00:00
Simon Pilgrim	49c9300c28	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244723 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 08:08:56 +00:00
Sanjay Patel	f5a6019248	[x86] enable machine combiner reassociations for 256-bit vector FP mul/add git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244705 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 00:29:10 +00:00
Adam Nemet	212ab28e02	[LoopDist] Add test for missing coverage Add a testcase to ensure that if we can't find bounds for a necessary memcheck we don't distribute. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244703 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 00:21:59 +00:00
Adam Nemet	cf1d982d01	[LAA] Fix typo in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244690 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 23:03:09 +00:00
Mark Heffernan	67278b1774	Use 32-bit divides instead of 64-bit divides where possible. For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason is that many index computations are carried out in 64-bits but never actually exceed the capacity of a 32-bit word. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244684 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 22:16:34 +00:00
Paul Robinson	bbf32bd275	Make DW_AT_[MIPS_]linkage_name optional, and off by default for SCE. Mangled "linkage" names can be huge, and if the debugger (or other tools) have no use for them, the size savings can be very impressive (on the order of 40%). Add one test for controlling behavior, and modify a number of tests to either stop using linkage names, or make llc emit them (so these tests will still run when the default triple is for PS4). Differential Revision: http://reviews.llvm.org/D11374 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244678 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 21:36:45 +00:00
Sanjoy Das	162f046004	Fix PR24354. `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244676 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 21:33:55 +00:00
JF Bastien	06a1f0e6d8	WebAssembly: implement comparison. Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11924 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244665 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 21:02:46 +00:00
Sanjay Patel	fb4d106de1	[x86] enable machine combiner reassociations for 128-bit vector single/double multiplies git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 20:19:23 +00:00
Sanjay Patel	dec9fb9abd	fix minsize detection: minsize attribute implies optimizing for size Also, add a test for optsize because this was not part of any existing regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244651 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 19:39:36 +00:00
Jingyue Wu	d1ff8a7f6e	SelectionDAG: Prefer to combine multiplication with less uses for fma Summary: For example: s6 = s0s5; s2 = s6s6 + s6; ... s4 = s6*s3; We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)). This only happens when Aggressive is true, otherwise hasOneUse() check already prevents from folding the multiplication with more uses. Test Plan: test/CodeGen/NVPTX/fma-assoc.ll Patch by Xuetian Weng Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm Subscribers: arsenm, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11855 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244649 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-11 19:21:46 +00:00

1 2 3 4 5 ...

31429 Commits