RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-01-10 22:46:25 +00:00

Author	SHA1	Message	Date
Matt Arsenault	84895bd2e6	R600/SI: Allow comuting fp immediates git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220062 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 18:00:39 +00:00
Peter Collingbourne	e7b03ee85c	We also need to catch OSError here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:46:46 +00:00
Matt Arsenault	46b53c9e4b	R600/SI: Remove SI_BUFFER_RSRC pseudo Just use REG_SEQUENCE directly, so there are fewer instructions to need to deal with later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:42:56 +00:00
Juergen Ributzka	32ef68718d	[Stackmaps] Enable invoking the patchpoint intrinsic. Patch by Kevin Modzelewski Reviewers: atrick, ributzka Reviewed By: ributzka Subscribers: llvm-commits, reames Differential Revision: http://reviews.llvm.org/D5634 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220055 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:39:00 +00:00
Andrea Di Biagio	5512b50db5	[X86] Fix missed selection of non-temporal store of zero vector. When the input to a store instruction was a zero vector, the backend always selected a normal vector store regardless of the non-temporal hint. This is fixed by this patch. This fixes PR19370. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:27:06 +00:00
James Molloy	7023b85187	[AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering. We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same. While we're at it, avoid writing to std::vector::end()... Bug found with random testing and a lot of coffee. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220051 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 17:06:31 +00:00
Rafael Espindola	ad8eef5a90	Don't crash if find_executable return None. This was crashing when trying to run the tests on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220048 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 16:07:43 +00:00
Bill Schmidt	b76f5ba103	[PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64 types, but does not yet use lxvw4x and stxvw4x for 4x32 types. This patch adds that support. As with lxvd2x/stxvd2x, this involves straightforward overriding of the patterns normally recognized for lvx/stvx, with preference given to the VSX patterns when VSX is enabled. In addition, the logic for permitting misaligned memory accesses is modified so that v4r32 and v4i32 are treated the same as v2f64 and v2i64 when VSX is enabled. Finally, the DAG generation for unaligned loads is changed to just use a normal LOAD (which will become lxvw4x) on P8 and later hardware, where unaligned loads are preferred over lvsl/lvx/lvx/vperm. A number of tests now generate the VSX loads/stores instead of lvx/stvx, so this patch adds VSX variants to those tests. I've also added <4 x float> tests to the vsx.ll test case, and created a vsx-p8.ll test case to be used for testing code generation for the P8Vector feature. For now, that simply tests the unaligned load/store behavior. This has been tested along with a temporary patch to enable the VSX and P8Vector features, with no new regressions encountered with or without the temporary patch applied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220047 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 15:13:38 +00:00
Jan Vesely	9c1d1fa266	R600: Add EG to FMA test Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220045 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:45:27 +00:00
Jan Vesely	cef793e8c7	SelectionDAG: Add sext_inreg optimizations v2: use dyn_cast fixup comments v3: use cast Reviewed-by: Matt Arsenault <arsenm2@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220044 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:45:25 +00:00
Vasileios Kalintiris	eaf8f5efe9	[mips] Add support for COP1's Branch-On-Cond-Likely instructions Summary: Depends on D5782 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5802 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220042 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 14:08:28 +00:00
Vasileios Kalintiris	0f22fe9b56	[mips] Add support for COP0's Branch-On-Cond-Likely instructions Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5782 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220036 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 12:38:35 +00:00
Hal Finkel	9d85eff56a	[DSE] Remove no-data-layout-only type-based overlap checking DSE's overlap checking contained special logic, used only when no DataLayout was available, which inferred a complete overwrite when the pointee types were equal. This logic seems fine for regular loads/stores, but does not work for memcpy and friends. Instead of fixing this, I'm just removing it. Philosophically, transformations should not contain enhanced behavior used only when data layout is lacking (data layout should be strictly additive), and maintaining these rarely-tested code paths seems not worthwhile at this stage. Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case (slightly reduced from that provided by Aliaksei) replaces the original contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few other tests have been updated to have a data layout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 11:56:00 +00:00
Rafael Espindola	d8ee23f34c	Add back commits r219835 and a fixed version of r219829. The only difference from r219829 is using getOrCreateSectionSymbol(*ELFSec) instead of GetOrCreateSymbol(ELFSec->getSectionName()) in ELFObjectWriter which causes us to use the correct section symbol even if we have multiple sections with the same name. Original messages: r219829: Correctly handle references to section symbols. When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. r219835: Allow forward references to section symbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220021 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:48:58 +00:00
Bill Schmidt	101025c33d	[PPC] Adjust some PowerPC tests to account for presence/absence of VSX Patch by Bill Seurer; committed on his behalf. These test cases generate slightly different code sequences when VSX is activated and thus fail. The update turns off VSX explicitly for the existing checks and then adds a second set of checks for most of them that test the VSX instruction output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220019 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:41:22 +00:00
Rafael Espindola	410bde5171	Add a test that would have found the bug in r219829. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220016 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:34:23 +00:00
Akira Hatanaka	1cdebe50c1	ARM: Fix a bug which was causing convergence failure in constant-island pass. The bug is in ARMConstantIslands::createNewWater where the upper bound of the new water split point is computed: // This could point off the end of the block if we've already got constant // pool entries following this block; only the last one is in the water list. // Back past any possible branches (allow for a conditional and a maximally // long unconditional). if (BaseInsertOffset + 8 >= UserBBI.postOffset()) { BaseInsertOffset = UserBBI.postOffset() - UPad - 8; DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset)); } The split point is supposed to be somewhere between the machine instruction that loads from the constant pool entry and the end of the basic block, before branch instructions. The code above is fine if the basic block is large enough and there are a sufficient number of instructions following the machine instruction. However, if the machine instruction is near the end of the basic block, BaseInsertOffset can point to the machine instruction or another instruction that precedes it, and this can lead to convergence failure. This commit fixes this bug by ensuring BaseInsertOffset is larger than the offset of the instruction following the constant-loading instruction. rdar://problem/18581150 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:31:47 +00:00
Rafael Espindola	70a1be3f76	Revert commit r219835 and r219829. Revert "Correctly handle references to section symbols." Revert "Allow forward references to section symbols." Rui found a regression I am debugging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220010 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:06:02 +00:00
Peter Zotov	46b94aa80e	[OCaml] Add Llvm.instr_clone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220008 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 01:02:40 +00:00
Alexander Potapenko	0fea775e5c	[llvm-symbolizer] Introduce the -dsym-hint option. llvm-symbolizer will consult one of the .dSYM paths passed via -dsym-hint if it fails to find the .dSYM bundle at the default location. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220004 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-17 00:50:19 +00:00
Peter Collingbourne	770f3af232	Add our own copy of the find_executable function to cope with installations that do not have the distutils.spawn package. Should hopefully fix the aarch64 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219991 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 23:43:20 +00:00
Peter Collingbourne	798ace2e58	Initial version of Go bindings. This code is based on the existing LLVM Go bindings project hosted at: https://github.com/go-llvm/llvm Note that all contributors to the gollvm project have agreed to relicense their changes under the LLVM license and submit them to the LLVM project. Differential Revision: http://reviews.llvm.org/D5684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219976 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 22:48:02 +00:00
Robin Morisset	d310963833	Erase fence insertion from SelectionDAGBuilder.cpp (NFC) Summary: Backends can use setInsertFencesForAtomic to signal to the middle-end that montonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses is currently living in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons: - There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand. - The current code in SelectionDAGBuilder does not use any target-hooks, it does the same transformation for every backend that requires it - As a result it is plain unsound, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331). - Because it produces IR-level fences, it cannot be made sound ! This is noted in the C++11 standard (section 29.3, page 1140): ``` Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering semantics. ``` It can also be seen by the following example (called IRIW in the litterature): ``` atomic<int> x = y = 0; int r1, r2, r3, r4; Thread 0: x.store(1); Thread 1: y.store(1); Thread 2: r1 = x.load(); r2 = y.load(); Thread 3: r3 = y.load(); r4 = x.load(); ``` r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it.. This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that): - it provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder. As we saw above, this is unsound, but the best that can be done without knowing the targets well (and there is a comment warning about this risk). - it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (that exactly replicates the logic of SelectionDAGBuilder, so no functional change) - it finally erase this logic from SelectionDAGBuilder as it is dead-code. Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory model well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more agressively on these platforms, the long comment should make it clear why he should first override emitLeading/TrailingFence. Test Plan: make check-all, no functional change Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5474 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219957 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:34:57 +00:00
Matt Arsenault	0134a9bed3	R600: Fix nonsensical implementation of computeKnownBits for BFE This was resulting in invalid simplifications of sdiv git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219953 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:07:40 +00:00
Rafael Espindola	2f8f1d34e3	Delete -std-compile-opts. These days -std-compile-opts was just a silly alias for -O3. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219951 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 20:00:02 +00:00
Bjorn Steinbrink	6eaa62af77	Allow call-slop optzn for destinations with a suitable dereferenceable attribute Summary: Currently, call slot optimization requires that if the destination is an argument, the argument has the sret attribute. This is to ensure that the memory access won't trap. In addition to sret, we can also allow the optimization to happen for arguments that have the new dereferenceable attribute, which gives the same guarantee. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5832 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 19:43:08 +00:00
Sanjay Patel	d8214db086	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y) If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 18:48:17 +00:00
Juergen Ributzka	c40dab2069	[AArch64] Fix miscompile of sdiv-by-power-of-2. When the constant divisor was larger than 32bits, then the optimized code generated for the AArch64 backend would emit the wrong code, because the shift was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would loose the upper 32bits. This fixes rdar://problem/18678801. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219934 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 16:41:15 +00:00
Vasileios Kalintiris	3b72ec5083	[mips] Account for endianess when expanding BuildPairF64/ExtractElementF64 nodes. Summary: In order to support big endian targets for the BuildPairF64 nodes we just need to swap the low/high pair registers. Additionally, for the ExtractElementF64 nodes we have to calculate the correct stack offset with respect to the node's register/operand that we want to extract. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5753 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219931 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 15:41:51 +00:00
Vasileios Kalintiris	02065a65cd	[mips] Marked the DI/EI instruction aliases as MIPS32r2 Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5751 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219927 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 15:23:52 +00:00
Akira Hatanaka	4eb03123df	Reapply r219832 - InstCombine: Narrow switch instructions using known bits. The code committed in r219832 asserted when it attempted to shrink a switch statement whose type was larger than 64-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 06:00:46 +00:00
Saleem Abdulrasool	ebe6584c32	TRE: make TRE a bit more aggressive Make tail recursion elimination a bit more aggressive. This allows us to get tail recursion on functions that are just branches to a different function. The fact that the function takes a byval argument does not restrict it from being optimised into just a tail call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219899 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 03:27:30 +00:00
Akira Hatanaka	608d59f535	Revert r219832. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219884 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-16 01:17:02 +00:00
Sanjoy Das	a0b0184b33	Revert "r219834 - Teach ScalarEvolution to sharpen range information" This change breaks the asan buildbots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219878 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:46:04 +00:00
Hal Finkel	43141a0764	Preserve non-byval pointer alignment attributes using @llvm.assume when inlining For pointer-typed function arguments, enhanced alignment can be asserted using the 'align' attribute. When inlining, if this enhanced alignment information is not otherwise available, preserve it using @llvm.assume-based alignment assumptions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219876 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:44:41 +00:00
Adam Nemet	fb9d61a8d6	[AVX512] Add DQ subvector inserts In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4 respectively. These are matched by "Alt" Pat<>'s (Alt stands for alternative VTs). Since DQ has native support for these intructions, I peeled off the non-"Alt" part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are derived from this multiclass. The "Alt" Pat<>'s are disabled with DQ. Fixes <rdar://problem/18426089> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219874 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:17 +00:00
Adam Nemet	ec7f30662e	[AVX512] Add SKX testing to avx512-insert-extract.ll This is in preparation to adding DQ subvector inserts to this testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219873 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:14 +00:00
Adam Nemet	f9e3a3afa1	[AVX512] Fix test to produce a defined value We're inserting into a 8 wide vector, so the index should be < 8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219872 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 23:42:11 +00:00
Tom Stellard	d3fc10a525	R600/SI: Fix bug where immediates were being used in DS addr operands The SelectDS1Addr1Offset complex pattern always tries to store constant lds pointers in the offset operand and store a zero value in the addr operand. Since the addr operand does not accept immediates, the zero value needs to first be copied to a register. This newly created zero value will not go through normal instruction selection, so we need to manually insert a V_MOV_B32_e32 in the complex pattern. This bug was hidden by the fact that if there was another zero value in the DAG that had not been selected yet, then the CSE done by the DAG would use the unselected node for the addr operand rather than the one that was just created. This would lead to the zero value being selected and the DAG automatically inserting a V_MOV_B32_e32 instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219848 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 21:08:59 +00:00
Rafael Espindola	fc6e0f6f87	Allow forward references to section symbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219835 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:30:18 +00:00
Sanjoy Das	40edbf130e	Teach ScalarEvolution to sharpen range information. If x is known to have the range [a, b) in a loop predicated by (icmp ne x, a), its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. phabricator: http://reviews.llvm.org/D5639 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219834 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:25:28 +00:00
Akira Hatanaka	38537634e2	InstCombine: Narrow switch instructions using known bits. Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219832 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 19:05:50 +00:00
Juergen Ributzka	7440a83e60	Reapply "[FastISel][AArch64] Add custom lowering for GEPs." This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficientily. The original commit had a bug in the add emit logic, which has been fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219831 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:58:07 +00:00
Rafael Espindola	ad04f5db82	Correctly handle references to section symbols. When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219829 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:55:30 +00:00
Matt Arsenault	8b3a9205b7	R600/SI: Also try to use 0 base for misaligned 8-byte DS loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219823 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 18:06:43 +00:00
Matt Arsenault	7fdd553b66	R600: Fix miscompiles when BFE has multiple uses SimplifyDemandedBits would break the other uses of the operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219819 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:58:34 +00:00
Hal Finkel	6c15862fd3	[SLPVectorize] Basic ephemeral-value awareness The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219816 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:35:01 +00:00
Derek Schuff	279b5504a3	[MC] Make bundle alignment mode setting idempotent and support nested bundles Summary: Currently an error is thrown if bundle alignment mode is set more than once per module (either via the API or the .bundle_align_mode directive). This change allows setting it multiple times as long as the alignment doesn't change. Also nested bundle_lock groups are currently not allowed. This change allows them, with the effect that the group stays open until all nests are exited, and if any of the bundle_lock directives has the align_to_end flag, the group becomes align_to_end. These changes make the bundle aligment simpler to use in the compiler, and also better match the corresponding support in GNU as. Reviewers: jvoung, eliben Differential Revision: http://reviews.llvm.org/D5801 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 17:10:04 +00:00
Juergen Ributzka	0081070cfd	Revert "[FastISel][AArch64] Add custom lowering for GEPs." This breaks our internal build bots. Reverting it to get the bots green again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219776 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 04:55:48 +00:00
Jingyue Wu	75d77cd179	[MachineSink] Use the real post dominator tree Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source. Thanks Alexey Volkov for reporting PR21115! Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking. Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115. Updated an affected test in X86. Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread. Reviewers: Jiangning, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5633 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 03:27:43 +00:00

1 2 3 4 5 ...

26605 Commits