RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-02-15 10:39:47 +00:00

Author	SHA1	Message	Date
Sanjay Patel	e4e5cf5a66	make reciprocal estimate code generation more flexible by adding command-line options (3rd try) The first try (r238051) to land this was reverted due to ExecutionEngine build failure; that was hopefully addressed by r238788. The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure; that was hopefully addressed by r238953. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-04 01:32:35 +00:00
Tom Stellard	8c009e3943	R600: Re-enable sub-reg liveness The bug in the R600 backend that this uncovered has been fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238999 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-04 01:20:04 +00:00
Matt Arsenault	b743320bb3	R600/SI: Fix tests with triples in them Only set the triple from the command line options. Some of these were still testing SI features and using the old r600-- triple. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238958 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 20:04:05 +00:00
Colin LeMahieu	0a999e4d3a	[Hexagon] Test doesn't work on all platforms. At any rate the uninitialized variable issue was fixed. Removing re-registering ASM backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238949 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 18:00:45 +00:00
Colin LeMahieu	4009e89061	[Hexagon] Reapply 238772 OSABI was not correctly set, added empty_elf test to make sure it is. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238947 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 17:34:16 +00:00
Matthias Braun	e942914d29	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is not reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 Recommiting after the revert in r238821, the buildbot still failed with the patch removed so there seems to be another reason for the breakage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238935 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 16:30:24 +00:00
Asaf Badouh	ce375dc63a	re-apply 238809 AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. CR: http://reviews.llvm.org/D9991 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238923 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 13:41:48 +00:00
Elena Demikhovsky	10eb2dd9df	AVX-512: VSHUFPD instruction selection - code improvements git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238918 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 11:21:01 +00:00
Rafael Espindola	a0bcb4184b	Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)" This reverts commit r238842. It broke -DBUILD_SHARED_LIBS=ON build. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238900 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-03 05:32:44 +00:00
Sanjoy Das	8fadf8f4d3	[SelectionDAG] Fix PR23603. Summary: LLVM's MI level notion of invariant_load is different from LLVM's IR level notion of invariant_load with respect to dereferenceability. The IR notion of invariant_load only guarantees that all non-faulting invariant loads result in the same value. The MI notion of invariant load guarantees that the load can be legally moved to any location within its containing function. The MI notion of invariant_load is stronger than the IR notion of invariant_load -- an MI invariant_load is an IR invariant_load + a guarantee that the location being loaded from is dereferenceable throughout the function's lifetime. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10075 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238881 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 22:33:30 +00:00
Daniel Sanders	92a1dad6d1	[mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only. Summary: Following on from r209907 which made personality encodings indirect, do the same for TType encodings. This fixes the case where a try/catch block needs to generate references to, for example, std::exception in the .gcc_except_table. Previous attempts at committing this broke the buildbots due to bugs in IAS. These bugs have now been fixed so trying again. Reviewers: petarj Reviewed By: petarj Subscribers: srhines, joerg, tberghammer, llvm-commits Differential Revision: http://reviews.llvm.org/D9669 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238863 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 20:32:50 +00:00
Sanjay Patel	871beb8dd7	make reciprocal estimate code generation more flexible by adding command-line options (2nd try) The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238842 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 15:28:15 +00:00
Elena Demikhovsky	ccbc17f896	AVX-512: Shorten implementation of lowerV16X32VectorShuffle() using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines. Added matching of VALIGN instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238830 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 13:43:18 +00:00
Vasileios Kalintiris	330e5f16d1	[mips] Add support for dynamic stack realignment. Summary: With this change we are able to realign the stack dynamically, whenever it contains objects with alignment requirements that are larger than the alignment specified from the given ABI. We have to use the $fp register as the frame pointer when we perform dynamic stack realignment. In complex stack frames, with variably-sized objects, we reserve additionally the callee-saved register $s7 as the base pointer in order to reference locals. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8633 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238829 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 13:14:46 +00:00
Renato Golin	6b35bec8ef	Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs" This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot. Since self-hosting issues with Clang are hard to investigate, I'm taking the liberty to revert now, so we can investigate it offline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238821 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 11:47:30 +00:00
Asaf Badouh	aa9e1c528b	revert 238809 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238810 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 07:45:19 +00:00
Asaf Badouh	82fa06895e	AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238809 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-02 07:18:14 +00:00
Matthias Braun	d421582e90	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is not reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238795 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 23:27:08 +00:00
Matthias Braun	fe1391f07d	AArch64: Use CMP;CCMP sequences for and/or/setcc trees. Previously CCMP/FCCMP instructions were only used by the AArch64ConditionalCompares pass for control flow. This patch uses them for SELECT like instructions as well by matching patterns in ISelLowering. PR20927, rdar://18326194 Differential Revision: http://reviews.llvm.org/D8232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238793 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 22:31:17 +00:00
Matthias Braun	fa2b7e5cb4	LiveRangeEdit: Fix liveranges not shrinking on subrange kill. If a dead instruction we may not only have a last-use in the main live range but also in a subregister range if subregisters are tracked. We need to partially rebuild live ranges in both cases. The testcase only broke when subregister liveness was enabled. I commited it in the current form because there is currently no flag to enable/disable subregister liveness. This fixes PR23720. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238785 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 21:26:26 +00:00
Rafael Espindola	9be6a55ba0	Revert "[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath." This reverts commit r238748. It broke the msan bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4372/steps/check-llvm%20msan/logs/stdio git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238772 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 19:20:47 +00:00
Vasileios Kalintiris	a509ef9a17	[mips][FastISel] Implement bswap. Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for misp32 r1/r2 . Based on a patch by Reed Kotler. Test Plan: bswap1.ll test-suite Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7219 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238760 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 16:40:45 +00:00
Vasileios Kalintiris	0cc6b87583	[mips][FastISel] Implement intrinsics memset, memcopy & memmove. Summary: Implement the intrinsics memset, memcopy and memmove in MIPS FastISel. Make some needed infrastructure fixes so that this can work. Based on a patch by Reed Kotler. Test Plan: memtest1.ll The patch passes test-suite for mips32 r1/r2 and at O0/O2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238759 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 16:36:01 +00:00
Vasileios Kalintiris	d4b311fb43	[mips][FastISel] Implement srem/urem and sdiv/udiv instructions. Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel. Based on a patch by Reed Kotler. Test Plan: srem1.ll div1.ll test-suite at O0/O2 for mips32 r1/r2 Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238757 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 16:17:37 +00:00
Vasileios Kalintiris	e490b7733a	[mips][FastISel] Implement the select statement for MIPS FastISel. Summary: Implement the LLVM IR select statement for MIPS FastISelsel. Based on a patch by Reed Kotler. Test Plan: "Make check" test included now. Passes test-suite at O2/O0 mips32 r1/r2. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6774 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238756 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 15:56:40 +00:00
Vasileios Kalintiris	30b5412d92	[mips][FastISel] Clobber HI0/LO0 registers in MUL instructions. Summary: The contents of the HI/LO registers are unpredictable after the execution of the MUL instruction. In addition to implicitly defining these registers in the MUL instruction definition, we have to mark those registers as dead too. Without this the fast register allocator is running out of registers when the MUL instruction is followed by another one that tries to allocate the AC0 register. Based on a patch by Reed Kotler. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D9825 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 15:48:09 +00:00
Colin LeMahieu	8a6a249f6b	[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238748 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 14:51:26 +00:00
Elena Demikhovsky	bbd7cab2b9	AVX-512: Optimized vector shuffle for v16f32 and v16i32 types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238743 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 13:26:18 +00:00
Luke Cheeseman	68f83e59f7	Removing commited assembly file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238742 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 13:18:53 +00:00
Luke Cheeseman	7d97fc4164	Re-commit of r238201 with fix for building with shared libraries. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238739 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 12:02:47 +00:00
Elena Demikhovsky	aa62d8a6b2	AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types. I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238735 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 09:49:53 +00:00
Elena Demikhovsky	9f63519857	AVX-512: Fixed a bug in compress and expand intrinsics. By Igor Breger (igor.breger@intel.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238724 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-01 06:30:13 +00:00
Tim Northover	876dd978b8	ARM: recommit r237590: allow jump tables to be placed as constant islands. The original version didn't properly account for the base register being modified before the final jump, so caused miscompilations in Chromium and LLVM. I've fixed this and tested with an LLVM self-host (I don't have the means to build & test Chromium). The general idea remains the same: in pathological cases jump tables can be too far away from the instructions referencing them (like other constants) so they need to be movable. Should fix PR23627. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238680 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-31 19:22:07 +00:00
Chandler Carruth	fa68750e54	[x86] Unify the horizontal adding used for popcount lowering taking the best approach of each. For vNi16, we use SHL + ADD + SRL pattern that seem easily the best. For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases there is a huge improvement with this in IACA's estimated throughput -- over 2x higher throughput!!!! -- but the measurements are too good to be true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks slightly faster, but I'm not sure I believe any of the measurements at this point. Both are the exact same uops though. Hard to be confident of anything past that. If anyone wants to collect very detailed (Agner-level) timings with the result of this patch, or with the i32 case replaced with SHL + ADD + SHl + ADD + SRL, I'd be very interested. Note that you'll need to test it on both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as I saw unique behavior in each of these buckets with IACA all of which should be checked against measured performance. But this patch is still a useful improvement by dropping duplicate work and getting the much nicer PSADBW lowering for v2i64. I'd still like to rephrase this in terms of generic horizontal sum. It's a bit lame to have a special case of that just for popcount. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238652 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-30 10:35:03 +00:00
Chandler Carruth	60dbe0fd0d	[x86] Update the order of instructions after I switched to a bitcast helper that skips creating a cast when it isn't necessary. It's really somewhat concerning that this was caused by the the presence of a no-op bitcast, but... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238642 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-30 06:02:37 +00:00
Chandler Carruth	d8018eeac9	[x86] Restore the bitcasts I removed when refactoring this to avoid shifting vectors of bytes as x86 doesn't have direct support for that. This removes a bunch of redundant masking in the generated code for SSE2 and SSE3. In order to avoid the really significant code size growth this would have triggered, I also factored the completely repeatative logic for shifting and masking into two lambdas which in turn makes all of this much easier to read IMO. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238637 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-30 04:05:11 +00:00
Chandler Carruth	828f5b807c	[x86] Implement a faster vector population count based on the PSHUFB in-register LUT technique. Summary: A description of this technique can be found here: http://wm.ite.pl/articles/sse-popcount.html The core of the idea is to use an in-register lookup table and the PSHUFB instruction to compute the population count for the low and high nibbles of each byte, and then to use horizontal sums to aggregate these into vector population counts with wider element types. On x86 there is an instruction that will directly compute the horizontal sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily. Various tricks are used to get vNi32 and vNi16 from the vNi8 that the LUT computes. The base implemantion of this, and most of the work, was done by Bruno in a follow up to D6531. See Bruno's detailed post there for lots of timing information about these changes. I have extended Bruno's patch in the following ways: 0) I committed the new tests with baseline sequences so this shows a diff, and regenerated the tests using the update scripts. 1) Bruno had noticed and mentioned in IRC a redundant mask that I removed. 2) I introduced a particular optimization for the i32 vector cases where we use PSHL + PSADBW to compute the the low i32 popcounts, and PSHUFD + PSADBW to compute doubled high i32 popcounts. This takes advantage of the fact that to line up the high i32 popcounts we have to shift them anyways, and we can shift them by one fewer bit to effectively divide the count by two. While the PSHUFD based horizontal add is no faster, it doesn't require registers or load traffic the way a mask would, and provides more ILP as it happens on different ports with high throughput. 3) I did some code cleanups throughout to simplify the implementation logic. 4) I refactored it to continue to use the parallel bitmath lowering when SSSE3 is not available to preserve the performance of that version on SSE2 targets where it is still much better than scalarizing as we'll still do a bitmath implementation of popcount even in scalar code there. With #1 and #2 above, I analyzed the result in IACA for sandybridge, ivybridge, and haswell. In every case I measured, the throughput is the same or better using the LUT lowering, even v2i64 and v4i64, and even compared with using the native popcnt instruction! The latency of the LUT lowering is often higher than the latency of the scalarized popcnt instruction sequence, but I think those latency measurements are deeply misleading. Keeping the operation fully in the vector unit and having many chances for increased throughput seems much more likely to win. With this, we can lower every integer vector popcount implementation using the LUT strategy if we have SSSE3 or better (and thus have PSHUFB). I've updated the operation lowering to reflect this. This also fixes an issue where we were scalarizing horribly some AVX lowerings. Finally, there are some remaining cleanups. There is duplication between the two techniques in how they perform the horizontal sum once the byte population count is computed. I'm going to factor and merge those two in a separate follow-up commit. Differential Revision: http://reviews.llvm.org/D10084 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238636 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-30 03:20:59 +00:00
Chandler Carruth	43d1e87d73	[x86] Restructure the parallel bitmath lowering of popcount into a separate routine, generalize it to work for all the integer vector sizes, and do general code cleanups. This dramatically improves lowerings of byte and short element vector popcount, but more importantly it will make the introduction of the LUT-approach much cleaner. The biggest cleanup I've done is to just force the legalizer to do the bitcasting we need. We run these iteratively now and it makes the code much simpler IMO. Other changes were minor, and mostly naming and splitting things up in a way that makes it more clear what is going on. The other significant change is to use a different final horizontal sum approach. This is the same number of instructions as the old method, but shifts left instead of right so that we can clear everything but the final sum with a single shift right. This seems likely better than a mask which will usually have to read the mask from memory. It is certaily fewer u-ops. Also, this will be temporary. This and the LUT approach share the need of horizontal adds to finish the computation, and we have more clever approaches than this one that I'll switch over to. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238635 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-30 03:20:55 +00:00
Reid Kleckner	bfa311df8c	[WinEH] Adjust the 32-bit SEH prologue to better match reality It turns out that _except_handler3 and _except_handler4 really use the same stack allocation layout, at least today. They just make different choices about encoding the LSDA. This is in preparation for lowering the llvm.eh.exceptioninfo(). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238627 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 22:57:46 +00:00
Reid Kleckner	f0e3e4cd84	Disable FP elimination in funcs using 32-bit MSVC EH personalities The value in 'ebp' acts as an implicit argument to the outlined handlers, and is recovered with frameaddress(1). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238619 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 21:58:11 +00:00
Matthias Braun	3bd732d1ee	MachineCopyPropagation: Remove the copies instead of using KILL instructions. For some history here see the commit messages of r199797 and r169060. The original intent was to fix cases like: %EAX<def> = COPY %ECX<kill>, %RAX<imp-def> %RCX<def> = COPY %RAX<kill> where simply removing the copies would have RCX undefined as in terms of machine operands only the ECX part of it is defined. The machine verifier would complain about this so 169060 changed such COPY instructions into KILL instructions so some super-register imp-defs would be preserved. In r199797 it was finally decided to always do this regardless of super-register defs. But this is wrong, consider: R1 = COPY R0 ... R0 = COPY R1 getting changed to: R1 = KILL R0 ... R0 = KILL R1 It now looks like R0 dies at the first KILL and won't be alive until the second KILL, while in reality R0 is alive and must not change in this part of the program. As this only happens after register allocation there is not much code still performing liveness queries so the issue was not noticed. In fact I didn't manage to create a testcase for this, without unrelated changes I am working on at the moment. The fix is simple: As of r223896 the MachineVerifier allows reads from partially defined registers, so the whole transforming COPY->KILL thing is not necessary anymore. This patch also changes a similar (but more benign case as the def and src are the same register) case in the VirtRegRewriter. Differential Revision: http://reviews.llvm.org/D10117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238588 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 18:19:25 +00:00
Nemanja Ivanovic	8493722975	Add support for VSX FMA single-precision instructions to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9941 It adds the various FMA instructions introduced in the version 2.07 of the ISA along with the testing for them. These are operations on single precision scalar values in VSX registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238578 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 17:13:25 +00:00
Alex Lorenz	83d291ae8b	MIR Serialization: use correct line and column numbers for LLVM IR errors. This commit translates the line and column numbers for LLVM IR errors from the numbers in the YAML block scalar to the numbers in the MIR file so that the MIRParser users can report LLVM IR errors with the correct line and column numbers. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D10108 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238576 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 17:05:41 +00:00
Reid Kleckner	16e4a624c4	[WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86 Small (really small!) C++ exception handling examples work on 32-bit x86 now. This change disables the use of .seh_* directives in WinException when CFI is not in use. It also uses absolute symbol references in the tables instead of imagerel32 relocations. Also fixes a cache invalidation bug in MMI personality classification. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238575 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 17:00:57 +00:00
Jingyue Wu	ef056a9111	[NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast Summary: This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast from longer chains consisting of GEPs and BitCasts. For example, it can now optimize %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]* %1 = gep [10 x float]* %0, i64 0, i64 %i %2 = bitcast float* %1 to i32* %3 = load i32* %2 ; emits ld.u32 to %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)* %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32 Test Plan: @ld_int_from_global_float in access-non-generic.ll Reviewers: broune, eliben, jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10074 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238574 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 17:00:27 +00:00
Colin LeMahieu	0ace3c01f7	[Hexagon] Disassembling, printing, and emitting instructions a whole-bundle at a time which is the semantic unit for Hexagon. Fixing tests to use the new format. Disabling tests in the direct object emission path for a followup patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238556 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 14:44:13 +00:00
Quentin Colombet	7e31fe7e20	Add a test for the MachineCopyPropagation change landed in r238518. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238537 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-29 01:40:00 +00:00
Chandler Carruth	b78a66659b	[x86] Move the vector popcount tests into non-ISA files, and instead organize them by the width of vector. This makes it a lot easier to see that we're covering all of the vector types but not doing so excessively. This also adds tests across the spectrum of SSE versions in addition to the AVX versions. If you're really tired of seeing the massive sprawl of scalarized code for this, don't worry, I'm just about to land Bruno's patch that dramatically improve the situation for SSSE3 and newer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238520 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-28 22:46:48 +00:00
Alex Lorenz	3682046086	MIR Serialization: print and parse machine function names. This commit introduces a serializable structure called 'llvm::yaml::MachineFunction' that stores the machine function's name. This structure will mirror the machine function's state in the future. This commit prints machine functions as YAML documents containing a YAML mapping that stores the state of a machine function. This commit also parses the YAML documents that contain the machine functions. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D9841 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238519 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-28 22:41:12 +00:00
David Majnemer	47b5a3cbea	Add testcase for r238503. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238515 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-28 22:12:27 +00:00

1 2 3 4 5 ...

13650 Commits