RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-02-05 03:46:27 +00:00

Author	SHA1	Message	Date
Saleem Abdulrasool	c4c2318e72	CodeGen: ensure that libcalls are always AAPCS CC The original commit was too aggressive about marking LibCalls as AAPCS. The libcalls contain libc/libm/libunwind calls which are not AAPCS, but C. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280833 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 17:56:09 +00:00
Hans Wennborg	4a83266436	X86: Fold tail calls into conditional branches where possible (PR26302) When branching to a block that immediately tail calls, it is possible to fold the call directly into the branch if the call is direct and there is no stack adjustment, saving one byte. Example: define void @f(i32 %x, i32 %y) { entry: %p = icmp eq i32 %x, %y br i1 %p, label %bb1, label %bb2 bb1: tail call void @foo() ret void bb2: tail call void @bar() ret void } before: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne .LBB0_2 jmp foo .LBB0_2: jmp bar after: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne bar .LBB0_1: jmp foo I don't expect any significant size savings from this (on a Clang bootstrap I saw 288 bytes), but it does make the code a little tighter. This patch only does 32-bit, but 64-bit would work similarly. Differential Revision: https://reviews.llvm.org/D24108 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280832 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 17:52:14 +00:00
Davide Italiano	fab897dd11	[lib/LTO] Add a way to run a custom pipeline Differential Revision: https://reviews.llvm.org/D24095 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280830 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 17:46:16 +00:00
Yaxun Liu	6874fa846b	AMDGPU: Add hidden kernel arguments to runtime metadata OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden argument should be included in the runtime metadata. Also updated kernel argument kind metadata. Differential Revision: https://reviews.llvm.org/D23424 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280829 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 17:44:00 +00:00
Reid Kleckner	eadc14fcd6	[codeview] Add new directives to record inlined call site line info Summary: Previously we were trying to represent this with the "contains" list of the .cv_inline_linetable directive, which was not enough information. Now we directly represent the chain of inlined call sites, so we know what location to emit when we encounter a .cv_loc directive of an inner inlined call site while emitting the line table of an outer function or inlined call site. Fixes PR29146. Also fixes PR29147, where we would crash when .cv_loc directives crossed sections. Now we write down the section of the first .cv_loc directive, and emit an error if any other .cv_loc directive for that function is in a different section. Also fixes issues with discontiguous inlined source locations, like in this example: volatile int unlikely_cond = 0; extern void __declspec(noreturn) abort(); __forceinline void f() { if (!unlikely_cond) abort(); } int main() { unlikely_cond = 0; f(); unlikely_cond = 0; } Previously our tables gave bad location information for the 'abort' call, and the debugger wouldn't snow the inlined stack frame for 'f'. It is important to emit good line tables for this code pattern, because it comes up whenever an asan bug occurs in an inlined function. The __asan_report* stubs are generally placed after the normal function epilogue, leading to discontiguous regions of inlined code. Reviewers: majnemer, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24014 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280822 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 16:15:31 +00:00
Justin Lebar	600740bf18	[LSV] Use the original loads' names for the extractelement instructions. Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D23748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280818 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 15:49:48 +00:00
Simon Pilgrim	0639c00041	Fix typo in test - it should be masking bits0-15 not bit16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280816 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 15:19:07 +00:00
Andrea Di Biagio	959abb8a67	Regenerate vector bitcast folding tests using update_test_checks.py. Two tests have been merged together, regenerated and then moved to a more appropriate directory. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280814 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 14:50:07 +00:00
Simon Pilgrim	56463ddaa2	[X86][SSE] Added or combine tests for known bits of vectors Part of the yak shaving for D24253 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280813 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 14:49:50 +00:00
Simon Pilgrim	b362b6eae0	[X86][SSE] Added and+or+zext combine tests for known bits of vectors Part of the yak shaving for D24253 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280810 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 14:00:52 +00:00
Simon Pilgrim	9313182d93	[X86][SSE] Added and+or combine tests currently failing with vectors (and (or x, C), D) -> D if (C & D) == D Part of the yak shaving for D24253 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280809 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 13:40:03 +00:00
Pablo Barrio	ba3ea4d3a9	[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation. Reviewers: scott-0, jmolloy, compnerd, rengolin Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D24133 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280808 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 12:49:15 +00:00
Andrea Di Biagio	2fdf2bfd1f	[InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic. This fixes a similar issue to the one already fixed by r280804 (revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280807 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 12:47:53 +00:00
Andrea Di Biagio	fd11478c26	[InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElemet may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens for example if a constexpr value is passed in input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280804 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 12:03:03 +00:00
Vasileios Kalintiris	c7d75117d4	[mips] Disable the TImode shift libcalls for 32-bit targets. Summary: The o32 ABI doesn't not support the TImode helpers. For the time being, disable just the shift libcalls as they break recursive builds on MIPS. Reviewers: sdardis Subscribers: llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D24259 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280798 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 10:01:18 +00:00
James Molloy	80fedcb5bf	[SimplifyCFG] Update workaround for PR30188 to also include loads I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores. Fixes PR30188 (again). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280792 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 08:40:20 +00:00
James Molloy	40fa86fff8	[SimplifyCFG] Check PHI uses more accurately PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG. Fixes PR30292. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280790 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 08:15:54 +00:00
Hal Finkel	21c68fac74	[PowerPC] Fix address-offset folding for plain addi When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280789 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 07:36:11 +00:00
Elena Demikhovsky	78405002cf	AVX512F: FMA intrinsic + FNEG - sequence optimization The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set. FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))). It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead. I added pattern match for integer XOR. Differential Revision: https://reviews.llvm.org/D24221 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280785 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 06:54:28 +00:00
Craig Topper	83d6e111bc	[AVX-512] Add support for commuting masked instructions in findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280781 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 04:46:11 +00:00
Saleem Abdulrasool	4ccc33a9ab	Revert "CodeGen: ensure that libcalls are always AAPCS CC" This reverts SVN r280683. Revert until I figure out why this is breaking lli tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280778 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-07 03:17:19 +00:00
Zachary Turner	8eddc0f657	Re-add "Make FieldList records print as a YAML sequence" This was originally submitted in r280549, and reverted in r280577 due to breaking one MSVC buildbot. The issue is that MSVC 2013 doesn't synthesize move constructors. So even though i was writing std::move(A) it was copying it, leading to a bogus ArrayRef. The solution here is to simply remove the std::vector<> from the type, since it is unused and unnecessary. This way the ArrayRef continues to point into the original memory backing the CVType. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 23:45:47 +00:00
Hal Finkel	8d0ebe9402	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike) I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem is greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) is keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280767 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 23:02:23 +00:00
Ying Yi	84f34c0155	[llvm-cov] Add the project summary to the text coverage report for each source file. This patch is a spin-off from https://reviews.llvm.org/D23922. It extends the text view to preserve the same feature as the html view. Differential Revision: https://reviews.llvm.org/D24241 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280756 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 21:41:38 +00:00
Konstantin Zhuravlyov	c87867f700	[AMDGPU] Wave and register controls - Add missing test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280749 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 20:29:10 +00:00
Konstantin Zhuravlyov	1f99c41083	[AMDGPU] Wave and register controls - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 20:22:28 +00:00
Wei Ding	9493fa986d	AMDGPU : Add XNACK feature to GPUs that support it. Differential Revision: http://reviews.llvm.org/D24276 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280742 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 19:55:17 +00:00
Ying Yi	f73cd14f4a	[llvm-cov] Add the "Go to first unexecuted line" feature. This patch provides easy navigation to find the zero count lines, especially useful when the source file is very large. Differential Revision: https://reviews.llvm.org/D23277 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280739 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 19:31:18 +00:00
Rafael Espindola	ec44cfdaf0	Add an c++ itanium demangler to llvm. This adds a copy of the demangler in libcxxabi. The code also has no dependencies on anything else in LLVM. To enforce that I added it as another library. That way a BUILD_SHARED_LIBS will fail if anyone adds an use of StringRef for example. The no llvm dependency combined with the fact that this has to build on linux, OS X and Windows required a few changes to the code. In particular: No constexpr. No alignas On OS X at least this library has only one global symbol: __ZN4llvm16itanium_demangleEPKcPcPmPi My current plan is: Commit something like this Change lld to use it Change lldb to use it as the fallback Add a few #ifdefs so that exactly the same file can be used in libcxxabi to export abi::__cxa_demangle. Once the fast demangler in lldb can handle any names this implementation can be replaced with it and we will have the one true demangler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280732 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 19:16:48 +00:00
Krzysztof Parzyszek	a68463ecad	[RDF] Ignore undef use operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280717 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 17:03:13 +00:00
Simon Pilgrim	854572da4c	[SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector. This has come up in certain several x86 shuffle lowering cases where we are crossing 128-bit lanes. Differential Revision: https://reviews.llvm.org/D24254 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280715 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 16:42:05 +00:00
Adam Nemet	6050b768b6	[JumpThreading] Only write back branch-weight MDs for blocks that originally had PGO info Currently the pass updates branch weights in the IR if the function has any PGO info (entry frequency is set). However we could still have regions of the CFG that does not have branch weights collected (e.g. a cold region). In this case we'd use static estimates. Since static estimates for branches are determined independently, they are inconsistent. Updating them can "randomly" inflate block frequencies. I've run into this in a completely cold loop of h264ref from SPEC. -Rpass-with-hotness showed the loop to be completely cold during inlining (before JT) but completely hot during vectorization (after JT). The new testcase demonstrate the problem. We check array elements against 1, 2 and 3 in a loop. The check against 3 is the loop-exiting check. The block names should be self-explanatory. In this example, jump threading incorrectly updates the weight of the loop-exiting branch to 0, drastically inflating the frequency of the loop (in the range of billions). There is no run-time profile info for edges inside the loop, so branch probabilities are estimated. These are the resulting branch and block frequencies for the loop body: check_1 (16) (8) / \| eq_1 \| (8) \ \| check_2 (16) (8) / \| eq_2 \| (8) \ \| check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) First we thread eq_1 -> check_2 to check_3. Frequencies are updated to remove the frequency of eq_1 from check_2 and then from the false edge leaving check_2. Changed frequencies are highlighted with * : check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| \ eq_2 \| (0) \ \ \| ` --- check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new back edges. Frequencies are updated to remove the frequency of eq_1 and eq_3 from check_3 and then the false edge leaving check_3 (changed frequencies are highlighted with ): check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| /-- eq_2~ \| (0) (back edge) \| check_3 (0) (0) / \| (loop exit) \| (0*) \| (back edge) As a result, the loop exit edge ends up with 0 frequency which in turn makes the loop header to have maximum frequency. There are a few potential problems here: 1. The profile data seems odd. There is a single profile sample of the loop being entered. On the other hand, there are no weights inside the loop. 2. Based on static estimation we shouldn't set edges to "extreme" values, i.e. extremely likely or unlikely. 3. We shouldn't create profile metadata that is calculated from static estimation. I am not sure what policy is but it seems to make sense to treat profile metadata as something that is known to originate from profiling. Estimated probabilities should only be reflected in BPI/BFI. Any one of these would probably fix the immediate problem. I went for 3 because I think it's a good policy to have and added a FIXME about 2. Differential Revision: https://reviews.llvm.org/D24118 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280713 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 16:08:33 +00:00
Simon Dardis	7e4471c842	[mips] Tighten FastISel restrictions LLVM PR/29052 highlighted that FastISel for MIPS attempted to lower arguments assuming that it was using the paired 32bit registers to perform operations for f64. This mode of operation is not supported for MIPSR6. This patch resolves the reported issue by adding additional checks for unsupported floating point unit configuration. Thanks to mike.k for reporting this issue! Reviewers: seanbruno, vkalintiris Differential Review: https://reviews.llvm.org/D23795 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280706 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 12:36:24 +00:00
Krzysztof Parzyszek	6ee91ac76e	[PPC] Claim stack frame before storing into it, if no red zone is present Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of it there is no guarantee that this part of the stack will not be modified by any interrupt. To avoid this, make sure to claim the stack frame first before storing into it. This fixes https://llvm.org/bugs/show_bug.cgi?id=26519. Differential Revision: https://reviews.llvm.org/D24093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280705 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 12:30:00 +00:00
Craig Topper	899add2fea	[AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast. We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280696 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 06:56:59 +00:00
Craig Topper	def05b4ab5	[AVX-512] Add a test case to show that we don't select masked vpermi2ps when the index operand comes from a bitcast. It doesn't work because we're looking for a bitcast from the v4i32 index operand to v4f32 for the passthru part of the DAG. But since the index is bitcasted from v2i64 and bitcasts fold, we actually have a bitcast from v2i64 to v4f32 in the passthru part of the DAG. Taken from optimized output from clang's test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280695 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 05:45:27 +00:00
Saleem Abdulrasool	02c3a66ad9	ARM: workaround bundled operation predication This is a Windows ARM specific issue. If the code path in the if conversion ends up using a relocation which will form a IMAGE_REL_ARM_MOV32T, we end up with a bundle to ensure that the mov.w/mov.t pair is not split up. This is normally fine, however, if the branch is also predicated, then we end up trying to predicate the bundle. For now, report a bundle as being unpredicatable. Although this is false, this would trigger a failure case previously anyways, so this is no worse. That is, there should not be any code which would previously have been if converted and predicated which would not be now. Under certain circumstances, it may be possible to "predicate the bundle". This would require scanning all bundle instructions, and ensure that the bundle contains only predicatable instructions, and converting the bundle into an IT block sequence. If the bundle is larger than the maximal IT block length (4 instructions), it would require materializing multiple IT blocks from the single bundle. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280689 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 04:00:12 +00:00
Craig Topper	a9a5240720	[AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280684 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 00:31:10 +00:00
Saleem Abdulrasool	7471df8d42	CodeGen: ensure that libcalls are always AAPCS CC All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM AAPCS VFP CC hosts. Tweak the default initialisation to ARM AAPCS CC rather than C CC for ARM/thumb targets. The changes to the tests are necessary to ensure that the calling convention for the lowered library calls are honoured. Furthermore, these adjustments cause certain branch invocations to change to branch-and-link since the returned value needs to be moved across registers (d0 -> r0, r1). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280683 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 00:28:43 +00:00
Craig Topper	610e45c3d2	[AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double. Still need to fix the register classes to allow the extended range of registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280682 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:58:40 +00:00
Craig Topper	d844741822	[X86] Update fast-isel store test to have more 256 and 512-bit test cases. Add command lines for AVX and AVX512 feature sets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280681 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:58:37 +00:00
Craig Topper	e4bd2be5c6	[X86] Update fast-isel vector load test to have more 256 and 512-bit test cases. Add a command line for SKX features too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280680 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:58:32 +00:00
Sanjay Patel	ea437ed14a	fix FileCheck variables for test added with r280677 The script (utils/update_test_checks.py) seems to have problems with variable names that start with the same string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280679 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:49:32 +00:00
Gor Nishanov	5b6d16cd6c	[Coroutines] Part12: Handle alloca address-taken Summary: Move early uses of spilled variables after CoroBegin. For example, if a parameter had address taken, we may end up with the code like: define @f(i32 %n) { %n.addr = alloca i32 store %n, %n.addr ... call @coro.begin This patch fixes the problem by moving uses of spilled variables after CoroBegin. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24234 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280678 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:45:45 +00:00
Sanjay Patel	47e6904fe1	[InstCombine] don't assert that division-by-constant has been folded (PR30281) This is effectively a revert of: https://reviews.llvm.org/rL280115 And this should fix https://llvm.org/bugs/show_bug.cgi?id=30281: git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280677 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 23:38:22 +00:00
Sanjay Patel	6704841232	[InstCombine] revert r280637 because it causes test failures on an ARM bot http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280676 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 22:36:32 +00:00
Simon Pilgrim	bf482f99dd	[X86][SSE] Add test cases for PR29078 'Failure to recognise i64 sitofp/uitofp conversions that can be performed as i32' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280671 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 18:11:17 +00:00
Simon Pilgrim	70794a5363	[X86][SSE] Add test cases for PR29079 'Failure to recognise uitofp conversions that can be performed as sitofp' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280670 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 18:04:38 +00:00
Simon Pilgrim	70de308288	[X86][SSE] Regenerate odd shuffle tests with common prefixes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280661 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 14:15:38 +00:00
Oliver Stannard	bbe0f05199	[SimplifyCFG] Add test for sinking inline asm in if/else This test code previously caused a failure in the module verifier, because SimplifyCFG created this invalid instruction, which tries to take the address of inline asm: %.sink = select i1 %1, i64 ()* asm "mov $0, #1", "=r", i64 ()* asm %"mov $0, #2", "=r" This has been fixed recently, presumably by James Molloy's patches that re-wrote and changed parts of SimplifyCFG, so this patch just adds a regression test for it. Differential Revision: https://reviews.llvm.org/D24231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280660 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-05 13:49:26 +00:00

1 2 3 4 5 ...

39359 Commits