RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-14 09:26:22 +00:00

Author	SHA1	Message	Date
Hans Wennborg	fa13347f23	Merging r295990: ------------------------------------------------------------------------ r295990 \| jvesely \| 2017-02-23 08:12:21 -0800 (Thu, 23 Feb 2017) \| 5 lines AMDGPU/SI: Fix trunc i16 pattern Hit on ASICs that support 16bit instructions. Differential Revision: https://reviews.llvm.org/D30281 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@296158 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-24 18:56:41 +00:00
Hans Wennborg	8de5c2160f	Merging r295762: ------------------------------------------------------------------------ r295762 \| eugenis \| 2017-02-21 12:17:34 -0800 (Tue, 21 Feb 2017) \| 3 lines Fix PR31896. Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset). ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@296002 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 18:39:15 +00:00
Hans Wennborg	d8ff05ed89	Backport r293433, ARM: support `-mlong-calls` with AEABI TLS on ELF Support lowering AEABI TLS access (__aeabi_read_tp) with long calls. This requires adjusting the call sequence to use an indirect call to get full addressability. Resolves PR31769! By Saleem Abdulrasool! git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295910 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 00:14:14 +00:00
Hans Wennborg	eb6d6dd6f6	Merging r295512: ------------------------------------------------------------------------ r295512 \| matze \| 2017-02-17 15:15:03 -0800 (Fri, 17 Feb 2017) \| 8 lines AArch64LoadStoreOptimizer: Correctly clear kill flags When promoting the Load of a Store-Load pair to a COPY all kill flags between the store and the load need to be cleared. rdar://30402435 Differential Revision: https://reviews.llvm.org/D30110 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295744 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-21 18:46:57 +00:00
Hans Wennborg	dbca326a90	Merging r294527: ------------------------------------------------------------------------ r294527 \| arnolds \| 2017-02-08 14:30:47 -0800 (Wed, 08 Feb 2017) \| 14 lines [ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register than the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implicit def of x8 rdar://30425845 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295135 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-15 01:06:12 +00:00
Hans Wennborg	407aa2606f	Merging r294551: ------------------------------------------------------------------------ r294551 \| arnolds \| 2017-02-08 17:52:17 -0800 (Wed, 08 Feb 2017) \| 10 lines SwiftCC: swifterror register cannot be as the base register Functions that have a dynamic alloca require a base register which is defined to be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be the same register. Use a different callee save register for swifterror instead: X21 on AArch64 R8 on ARM rdar://30433803 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295079 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-14 17:39:39 +00:00
Hans Wennborg	366ce55054	Merging r294348: ------------------------------------------------------------------------ r294348 \| hans \| 2017-02-07 12:37:45 -0800 (Tue, 07 Feb 2017) \| 6 lines [X86] Disable conditional tail calls (PR31257) They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@294476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 16:50:40 +00:00
Hans Wennborg	524e53d43f	Merging r294203: ------------------------------------------------------------------------ r294203 \| john.brawn \| 2017-02-06 10:07:20 -0800 (Mon, 06 Feb 2017) \| 9 lines [AArch64] Fix incorrect MachinePointerInfo in splitStoreSplat When splitting up one store into several in splitStoreSplat we have to make sure we get the MachinePointerInfo right, otherwise alias analysis thinks they all store to the same location. This can then cause invalid scheduling later on. Differential Revision: https://reviews.llvm.org/D29446 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@294242 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-06 21:27:55 +00:00
Hans Wennborg	7fbc479dcc	Merging r292117: ------------------------------------------------------------------------ r292117 \| sdardis \| 2017-01-16 05:55:58 -0800 (Mon, 16 Jan 2017) \| 14 lines [mips] Correct c.cond.fmt instruction definition. Permit explicit $fcc<X> operand in c.cond.fmt instruction. Add c.cond.fmt to the MIPS to microMIPS instruction mapping table. Check that $fcc1 - $fcc7 are unusable for MIPS-I to MIPS-III for c.cond.fmt, bc1t, bc1f. Reviewers: seanbruno, zoran.jovanovic, vkalintiris Differential Revision: https://reviews.llvm.org/D24510 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293665 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-31 18:23:49 +00:00
Hans Wennborg	e2e5c5103c	Merging r292624: ------------------------------------------------------------------------ r292624 \| petarj \| 2017-01-20 09:53:30 -0800 (Fri, 20 Jan 2017) \| 9 lines [mips] Fix debug information for __thread variable This patch fixes debug information for __thread variable on Mips using .dtprelword and .dtpreldword directives. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D28770 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293664 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-31 18:21:40 +00:00
Hans Wennborg	9065e3db7a	Merging r293417: ------------------------------------------------------------------------ r293417 \| jhibbits \| 2017-01-28 20:55:57 -0800 (Sat, 28 Jan 2017) \| 16 lines Add some Book-E instructions to the asm parser and printer. Summary: Adds the following instructions: * mfpmr * mtpmr * icblc * icblq * icbtls Fix the scheduling for mtspr on e5500, which uses CFX0, instead of SFX0/SFX1 as on e500mc. Addresses PR 31538. Differential Revision: https://reviews.llvm.org/D29002 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293651 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-31 17:23:10 +00:00
Evandro Menezes	a9053251e8	[AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128' In order to follow the pattern of the existing 'slow-misaligned-128store' option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293332 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 20:57:35 +00:00
Matt Arsenault	974695d103	Merging r293310: ------------------------------------------------------------------------ r293310 \| arsenm \| 2017-01-27 09:42:26 -0800 (Fri, 27 Jan 2017) \| 8 lines AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands Accomplishes what r292982 was supposed to, which ended up only really making the necessary test changes. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293329 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 20:21:31 +00:00
Matt Arsenault	e536396218	Merging r292982: ------------------------------------------------------------------------ r292982 \| arsenm \| 2017-01-24 14:02:15 -0800 (Tue, 24 Jan 2017) \| 8 lines Enable FeatureFlatForGlobal on Volcanic Islands This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293326 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 19:26:48 +00:00
Matt Arsenault	cb3fa250a4	Merging r292473: ------------------------------------------------------------------------ r292473 \| arsenm \| 2017-01-18 22:35:27 -0800 (Wed, 18 Jan 2017) \| 9 lines AMDGPU: Disable some fneg combines unless nsz For -(x + y) -> (-x) + (-y), if x == -y, this would change the result from -0.0 to 0.0. Since the fma/fmad combine is an extension of this problem it also applies there. fmul should be fine, and I don't think any of the unary operators or conversions should be a problem either. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293319 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 18:39:19 +00:00
Matt Arsenault	802910f034	Merging r292472: ------------------------------------------------------------------------ r292472 \| arsenm \| 2017-01-18 22:04:12 -0800 (Wed, 18 Jan 2017) \| 5 lines AMDGPU: Remove modifiers from v_div_scale_* They seem to produce nonsense results when used. This should be applied to the release branch. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293317 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 18:31:33 +00:00
Hans Wennborg	27ec2b89fc	Merging r293259: ------------------------------------------------------------------------ r293259 \| compnerd \| 2017-01-26 19:41:53 -0800 (Thu, 26 Jan 2017) \| 11 lines ARM: fix vectorized division on WoA The Windows on ARM target uses custom division for normal division as the backend needs to insert division-by-zero checks. However, it is designed to only handle non-vectorized division. ARM has custom lowering for vectorized division as that can avoid loading registers with the values and invoke a division routine for each one, preferring to lower using NEON instructions. Fall back to the custom lowering for the NEON instructions if we encounter a vectorized division. Resolves PR31778! ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293306 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 17:13:59 +00:00
Hans Wennborg	8a028a206e	Merging r292712 and r292713: ------------------------------------------------------------------------ r292712 \| ctopper \| 2017-01-20 22:59:35 -0800 (Fri, 20 Jan 2017) \| 1 line [X86] Add test cases that show bad commuting being allowed to create a phsub operation. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r292713 \| ctopper \| 2017-01-20 22:59:38 -0800 (Fri, 20 Jan 2017) \| 3 lines [X86] Don't allow commuting to form phsub operations. Fixes PR31714. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293299 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 16:37:00 +00:00
Hans Wennborg	6144eee70a	Merging r292516: ------------------------------------------------------------------------ r292516 \| rserge \| 2017-01-19 12:24:23 -0800 (Thu, 19 Jan 2017) \| 14 lines [XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293295 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 16:17:56 +00:00
Tom Stellard	72f82e455a	Merging r293000: ------------------------------------------------------------------------ r293000 \| thomas.stellard \| 2017-01-24 17:25:13 -0800 (Tue, 24 Jan 2017) \| 15 lines AMDGPU add support for spilling to a user sgpr pointed buffers Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293240 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-27 00:45:06 +00:00
Hans Wennborg	4fba04fd96	Merging r292651: ------------------------------------------------------------------------ r292651 \| jvesely \| 2017-01-20 13:24:26 -0800 (Fri, 20 Jan 2017) \| 8 lines AMDGPU/R600: Serialize vector trunc stores to private AS Add DUMMY_CHAIN SDNode to denote stores of interest Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=28915 Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=30411 Differential Revision: https://reviews.llvm.org/D27964 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293118 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 00:26:36 +00:00
Hans Wennborg	dc54ec4e5b	Merging r292444: ------------------------------------------------------------------------ r292444 \| mkuper \| 2017-01-18 15:05:58 -0800 (Wed, 18 Jan 2017) \| 7 lines Revert r291670 because it introduces a crash. r291670 doesn't crash on the original testcase from PR31589, but it crashes on a slightly more complex one. PR31589 has the new reproducer. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293070 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-25 16:57:43 +00:00
Hans Wennborg	52cd70f859	Merging r291909: ------------------------------------------------------------------------ r291909 \| compnerd \| 2017-01-13 08:25:33 -0800 (Fri, 13 Jan 2017) \| 9 lines ARM: match GCC's behaviour for builtins GCC changes the CC between the user-code and the builtins based on the value of `-target` rather than `-mfloat-abi`. When a HF target is used, the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant is used. In all cases, the AEABI functions use the AAPCS CC. Adjust the calling convention based on the target. Resolves PR30543! ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292951 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-24 16:58:58 +00:00
Hans Wennborg	3ff1f39e4d	Merging r292758: ------------------------------------------------------------------------ r292758 \| spatel \| 2017-01-22 09:06:12 -0800 (Sun, 22 Jan 2017) \| 4 lines [x86] avoid crashing with illegal vector type (PR31672) https://llvm.org/bugs/show_bug.cgi?id=31672 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292832 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-23 21:33:34 +00:00
Matthias Braun	b3eb8fe616	Cherry pick r292625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292820 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-23 19:26:12 +00:00
Joerg Sonnenberger	58af97afa2	Merging r292244: ------------------------------------------------------------------------ r292244 \| joerg \| 2017-01-17 20:29:15 +0100 (Tue, 17 Jan 2017) \| 2 lines Remove an overeager assert from r288844. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292453 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-19 00:19:28 +00:00
Hans Wennborg	eb0fbb2a84	Merging r292242: ------------------------------------------------------------------------ r292242 \| bwilson \| 2017-01-17 11:18:57 -0800 (Tue, 17 Jan 2017) \| 5 lines Revert r291640 change to fold X86 comparison with atomic_load_add. Even with the fix from r291630, this still causes problems. I get widespread assertion failures in the Swift runtime's WeakRefCount::increment() function. I sent a reduced testcase in reply to the commit. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292243 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 19:29:13 +00:00
Nikolai Bozhenov	7b4bd48edb	[X86] Replace AND+IMM64 with SRL/SHL Emit SHRQ/SHLQ instead of ANDQ with a 64 bit constant mask if the result is unused and the mask has only higher/lower bits set. For example, with this patch LLVM emits shrq $41, %rdi je instead of movabsq $0xFFFFFE0000000000, %rcx testq %rcx, %rdi je This reduces number of instructions, code size and register pressure. The transformation is applied only for cases where the mask cannot be encoded as an immediate value within TESTQ instruction. Differential Revision: https://reviews.llvm.org/D28198 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291806 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 19:54:27 +00:00
Nikolai Bozhenov	724695062b	[X86] Tune bypassing of slow division for Intel CPUs 64-bit integer division in Intel CPUs is extremely slow, much slower than 32-bit division. On the other hand, 8-bit and 16-bit divisions aren't any faster. The only important exception is Atom where DIV8 is fastest. Because of that, the patch 1) Enables bypassing of 64-bit division for Atom, Silvermont and all big cores. 2) Modifies 64-bit bypassing to use 32-bit division instead of 16-bit one. This doesn't make the shorter division slower but increases chances of taking it. Moreover, it's much more likely to prove at compile-time that a value fits 32 bits and doesn't require a run-time check (e.g. zext i32 to i64). Differential Revision: https://reviews.llvm.org/D28196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291800 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 19:34:15 +00:00
Matt Arsenault	cd002582ba	AMDGPU: Skip fneg/select combine if it can fold into other git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291792 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:58:15 +00:00
Matt Arsenault	9db1ec3d4d	AMDGPU: Fold free fneg into sin git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291790 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:48:09 +00:00
Saleem Abdulrasool	13f3144204	ARM: slightly more table driven libcall setup Switch some additional library call setup to be table driven. This makes it more immediately obvious what the library call looks like. This is important for ARM since the calling conventions for the builtins change based on the target/libcall name. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291789 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:46:11 +00:00
Matt Arsenault	49dd8fcb21	AMDGPU: Fold fneg into fmul_legacy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291784 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 18:26:30 +00:00
Matt Arsenault	bd870734a5	AMDGPU: Fold fneg into rcp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:35 +00:00
Matt Arsenault	cca494fd03	AMDGPU: Fold fneg into fp_round git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:33 +00:00
Matt Arsenault	e652041f69	AMDGPU: Fold fneg into fp_extend git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 17:46:28 +00:00
Daniel Sanders	819f312880	[globalisel] Move as much RegisterBank initialization to the constructor as possible Summary: The register bank is now entirely initialized in the constructor. However, we still have the hardcoded number of register classes which will be dealt with in the TableGen patch (D27338) since we do not have access to this information to resolve this at this stage. The number of register classes is known to the TRI and to TableGen but the RegisterBank constructor is too early for the former and too late for the latter. This will be fixed when the data is tablegen-erated. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27809 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291770 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 16:11:23 +00:00
Daniel Sanders	6e712a22a6	[globalisel] Initialize RegisterBanks with static data. Summary: Refactor the RegisterBank initialization to use static data. This requires GlobalISel implementations to rewrite calls to createRegisterBank() and addRegBankCoverage() into a call to setRegBankData(). Out of tree targets can use diff 4 of D27807 (https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump the register classes and other data that needs to be provided to setRegBankData(). This is the method that was used to generate the static data in this patch. Tablegen-eration of this static data will follow after some refactoring. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27807 Differential Revision: https://reviews.llvm.org/D27808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291768 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 15:32:10 +00:00
Matt Arsenault	3517370c4d	AMDGPU: Fix sub_oneuse being marked commutative git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291748 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 07:17:28 +00:00
Craig Topper	4a4c1fcaaa	[AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64 with VLX, but no DQ or BW support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291747 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 06:49:12 +00:00
Craig Topper	49cfd1ffd8	[AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64 when avx512vl is available, but not avx512dq. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291746 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 06:49:08 +00:00
Elad Cohen	f160ecc799	[X86][AVX512] Fix PR31515 - Do not flip vselect condition if it's not a vXi1 mask r289653 added a case where `vselect <cond> <vector1> <all-zeros>` is transformed to: `vselect xor(cond, DAG.getConstant(1, DL, CondVT) <all-zeros> <vector1>` This was not aimed to catch cases where Cond is not a vXi1 mask but it does. Moreover, when Cond type is VxiN (N > 1) then xor(cond, DAG.getConstant(1, DL, CondVT) != NOT(cond). This patch changes the above to xor with allones, and avoids entering the case for non-mask Conds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291745 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 06:49:03 +00:00
Matt Arsenault	94bf68d551	AMDGPU: Fold fneg into fma or fmad Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:32:16 +00:00
Matt Arsenault	ef33822be5	AMDGPU: Fold fneg into fmul Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:23:20 +00:00
Matt Arsenault	bcf34bbbdd	AMDGPU: Fold fneg into fadd Patch mostly by Fiona Glaser git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-12 00:09:34 +00:00
Matt Arsenault	8694e2f853	AMDGPU: Pull fneg/fabs out of a select Allows better source modifier usage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 23:57:38 +00:00
Peter Collingbourne	7371eca308	X86: Remove dead code. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291721 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 23:00:28 +00:00
Matt Arsenault	f1e95d3604	AMDGPU: Fix shrinking of addc/subb. To shrink to VOP2 the input carry must also be VCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:58:12 +00:00
Matt Arsenault	fac51240d9	AMDGPU: Fix sext_inreg for i1 in i16 This produces worse code when i16 is legal, mostly due to combines getting confused by conversions inserted for uniform 16-bit operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:35:22 +00:00
Matt Arsenault	8c7e9845cf	AMDGPU: Fix breaking VOP3 v_add_i32s This was shrinking the instruction even though the carry output register was a virtual register, not known VCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 22:35:17 +00:00
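
The r292473 entry above ("AMDGPU: Disable some fneg combines unless nsz") argues that rewriting -(x + y) as (-x) + (-y) changes the result from -0.0 to 0.0 when x == -y. The sketch below reproduces that arithmetic only; it is not the LLVM combine. It assumes Python floats are IEEE-754 doubles, and the function names are hypothetical, chosen for illustration.

```python
import math


def neg_of_sum(x: float, y: float) -> float:
    """Original form: negate after adding, i.e. -(x + y)."""
    return -(x + y)


def sum_of_negs(x: float, y: float) -> float:
    """Rewritten form: negate each operand first, i.e. (-x) + (-y)."""
    return (-x) + (-y)


def is_negative_zero(v: float) -> bool:
    """True only for -0.0; copysign exposes the sign bit that == ignores."""
    return v == 0.0 and math.copysign(1.0, v) < 0.0


if __name__ == "__main__":
    x, y = 1.0, -1.0  # the x == -y case named in the commit message
    before = neg_of_sum(x, y)
    after = sum_of_negs(x, y)
    print("-(x + y) is negative zero:", is_negative_zero(before))
    print("(-x) + (-y) is negative zero:", is_negative_zero(after))
```

The two forms compare equal with ==, so the difference only shows up when the sign of zero is observable, for example through copysign or a later division. That is why the commit restricts the combine to cases where signed zeros may be ignored (nsz).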

41435 Commits