RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-04-15 20:40:30 +00:00

Author	SHA1	Message	Date
Igor Breger	1e48871118	[AVX512] Fix cvtusi2sd instruction Opcode, it should be 0x7B instead of 0x2A. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272122 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 07:48:23 +00:00
Quentin Colombet	d292b5eaf2	[AArch64][RegisterBankInfo] Use the generic implementation of copyCost. Long term we may want to give high cost at FPR to/from GPR copies. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272086 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 01:24:00 +00:00
Quentin Colombet	edb4f7c12a	[RegisterBankInfo] Add a size argument for the cost of copy. The cost of a copy may be different based on how many bits we have to copy around. E.g., a 8-bit copy may be different than a 32-bit copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272084 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 01:11:03 +00:00
Nicolai Haehnle	2ac1fa00c9	AMDGPU: Add amdgpu-ps-wqm-outputs function attributes Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272063 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 21:37:17 +00:00
Eric Christopher	c2a7f10882	Revert "Differential Revision: http://reviews.llvm.org/D20557 " Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272056 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 20:27:12 +00:00
Etienne Bergeron	70cf01c276	[stack-protection] Add support for MSVC buffer security check Summary: This patch is adding support for the MSVC buffer security check implementation The buffer security check is turned on with the '/GS' compiler switch. * https://msdn.microsoft.com/en-us/library/8dbf701c.aspx * To be added to clang here: http://reviews.llvm.org/D20347 Some overview of buffer security check feature and implementation: * https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx * http://www.ksyash.com/2011/01/buffer-overflow-protection-3/ * http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html For the following example: ``` int example(int offset, int index) { char buffer[10]; memset(buffer, 0xCC, index); return buffer[index]; } ``` The MSVC compiler is adding these instructions to perform stack integrity check: ``` push ebp mov ebp,esp sub esp,50h [1] mov eax,dword ptr [__security_cookie (01068024h)] [2] xor eax,ebp [3] mov dword ptr [ebp-4],eax push ebx push esi push edi mov eax,dword ptr [index] push eax push 0CCh lea ecx,[buffer] push ecx call _memset (010610B9h) add esp,0Ch mov eax,dword ptr [index] movsx eax,byte ptr buffer[eax] pop edi pop esi pop ebx [4] mov ecx,dword ptr [ebp-4] [5] xor ecx,ebp [6] call @__security_check_cookie@4 (01061276h) mov esp,ebp pop ebp ret ``` The instrumentation above is: * [1] is loading the global security canary, * [3] is storing the local computed ([2]) canary to the guard slot, * [4] is loading the guard slot and ([5]) re-compute the global canary, * [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling. Overview of the current stack-protection implementation: * lib/CodeGen/StackProtector.cpp * There is a default stack-protection implementation applied on intermediate representation. * The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie. 
* An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast). * Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and error handling. * Guard manipulation and comparison are added directly to the intermediate representation. * lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp * lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp * There is an implementation that adds instrumentation during instruction selection (for better handling of sibling calls). * see long comment above 'class StackProtectorDescriptor' declaration. * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr). * 'getSDagStackGuard' returns the appropriate stack guard (security cookie) * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'. * include/llvm/Target/TargetLowering.h * Contains function to retrieve the default Guard 'Value'; should be overridden by each target to select which implementation is used and provide Guard 'Value'. * lib/Target/X86/X86ISelLowering.cpp * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm. Function-based Instrumentation: * The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instruction. * To support function-based instrumentation, this patch is * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h), * If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue. 
* modifying (SelectionDAGISel.cpp) to avoid producing basic blocks used for inline instrumentation, * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp), * if FastISEL (not SelectionDAG), using the fallback which relies on the same function-based implementation over intermediate representation (StackProtector.cpp). Modifications * adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp) * adding support for function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h) Results * IR generated instrumentation: ``` clang-cl /GS test.cc /Od /c -mllvm -print-isel-input ``` ``` * Final LLVM Code input to ISel * ; Function Attrs: nounwind sspstrong define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 { entry: %StackGuardSlot = alloca i8* <<<-- Allocated guard slot %0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot) %index.addr = alloca i32, align 4 %offset.addr = alloca i32, align 4 %buffer = alloca [10 x i8], align 1 store i32 %index, i32* %index.addr, align 4 store i32 %offset, i32* %offset.addr, align 4 %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0 %1 = load i32, i32* %index.addr, align 4 call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false) %2 = load i32, i32* %index.addr, align 4 %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2 %3 = load i8, i8* %arrayidx, align 1 %conv = sext i8 %3 to i32 %4 = load volatile i8, i8* %StackGuardSlot <<<-- Loading Guard slot call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check ret i32 %conv } ``` * SelectionDAG generated instrumentation: ``` clang-cl /GS test.cc /O1 /c /FA ``` ``` "?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z" # BB#0: # %entry pushl %esi subl $16, %esp movl ___security_cookie, %eax <<<-- Loading Stack Guard 
value movl 28(%esp), %esi movl %eax, 12(%esp) <<<-- Store to Guard slot leal 2(%esp), %eax pushl %esi pushl $204 pushl %eax calll _memset addl $12, %esp movsbl 2(%esp,%esi), %esi movl 12(%esp), %ecx <<<-- Loading Guard slot calll @__security_check_cookie@4 <<<-- Epilogue function-based check movl %esi, %eax addl $16, %esp popl %esi retl ``` Reviewers: kcc, pcc, eugenis, rnk Subscribers: majnemer, llvm-commits, hans, thakis, rnk Differential Revision: http://reviews.llvm.org/D20346 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272053 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 20:15:35 +00:00
Krzysztof Parzyszek	6fc4b2ad52	Revert r272045 since GCC doesn't know how to compile it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272048 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 19:25:28 +00:00
Krzysztof Parzyszek	01da260d54	[Hexagon] Modify HexagonExpandCondsets to handle subregisters Also, switch to using functions from LiveIntervalAnalysis to update live intervals, instead of performing the updates manually. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272045 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 19:06:23 +00:00
Wei Ding	e2d1122183	Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 19:04:44 +00:00
Geoff Berry	f323692d97	Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values. Originally reviewed here: http://reviews.llvm.org/D17463 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272023 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 16:48:43 +00:00
Oliver Stannard	f88b5f4e1e	[ARM] Accept conditional versions of BXNS and BLXNS These instructions end in "S" but are not flag-setting, so they need including in the list of special cases in the assembly parser. Differential Revision: http://reviews.llvm.org/D21077 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272015 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 14:58:48 +00:00
Simon Pilgrim	2f401854a1	[X86][SSE] Add general lowering of nontemporal vector loads (fixed bad merge) Currently the only way to use the (V)MOVNTDQA nontemporal vector load instructions is through the int_x86_sse41_movntdqa style builtins. This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch. We currently still fold nontemporal loads into suitable instructions; we should probably look at removing this (and nontemporal stores as well) or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction. There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets. Differential Review: http://reviews.llvm.org/D20965 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272011 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 13:47:23 +00:00
Simon Pilgrim	6ad3e358e9	[X86][SSE] Add general lowering of nontemporal vector loads Currently the only way to use the (V)MOVNTDQA nontemporal vector load instructions is through the int_x86_sse41_movntdqa style builtins. This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch. We currently still fold nontemporal loads into suitable instructions; we should probably look at removing this (and nontemporal stores as well) or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction. There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets. Differential Review: http://reviews.llvm.org/D20965 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272010 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 13:34:24 +00:00
James Molloy	6e988c5976	[Thumb-1] Add optimized constant materialization for integers [256..512) We can materialize these integers using a MOV; ADDi8 pair. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272007 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 13:10:14 +00:00
Igor Breger	7e0019d8f7	[AVX512] Fix load opcode for fast isel. Differential Revision: http://reviews.llvm.org/D21067 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272006 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 13:08:45 +00:00
Ulrich Weigand	d7ad443387	[PowerPC] Support multiple return values with fast isel Using an LLVM IR aggregate return value type containing three or more integer values causes an abort in the fast isel pass. This patch adds two more registers to RetCC_PPC64_ELF_FIS to allow returning up to four integers with fast isel, just the same as is currently supported with regular isel (RetCC_PPC). This is needed for Swift and (possibly) other non-clang frontends. Fixes PR26190. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272005 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 12:48:22 +00:00
Simon Pilgrim	c1e27cc453	[X86][SSE] Improved blend+zero target shuffle combining to use combined shuffle mask directly We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened. This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272003 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 12:20:14 +00:00
James Molloy	87f50aafbc	[ARM] Shrink post-indexed LDR and STR to LDM/STM A Thumb-2 post-indexed LDR instruction such as: ldr.w r0, [r1], #4 Can be rewritten as: ldm.n r1!, {r0} LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272002 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 12:13:34 +00:00
James Molloy	d5127f4273	[ARM] Transform LDMs into writeback form to save code size If we have an LDM that uses only low registers and doesn't write to its base register: ldm.w r0, {r1, r2, r3} And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding: ldm.n r0!, {r1, r2, r3} Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272000 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 11:47:24 +00:00
Peter Smith	884efc1a85	[ARM] Incorrect relocation type for Thumb2 B<cond>.w The Thumb2 conditional branch B<cond>.W has a different encoding (T3) to the unconditional branch B.W (T4) as it needs to record <cond>. As the encoding is different the B<cond>.W is given a different relocation type. ELF for the ARM Architecture 4.6.1.6 (Table-13) states that R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the MC layer is using the R_ARM_THM_JUMP24 from B.W. This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the existing test that checks for R_ARM_THM_JUMP24 to expect R_ARM_THM_JUMP19. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271997 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 10:34:33 +00:00
Craig Topper	55ae04708d	[AVX512] Allow avx2 and sse41 nontemporal load intrinsics to select EVEX encoded instructions when VLX is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271988 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 07:27:57 +00:00
Craig Topper	a2dda815a0	[AVX512] Remove unnecessary mayLoad, mayStore, hasSideEffects flags from instructions that have patterns that imply them. Add the same set of flags to instructions that don't have patterns to imply them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271987 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 07:27:54 +00:00
Craig Topper	9c727fb929	[AVX512] Add NoVLX to a couple patterns that have VLX equivalents. Ordering of the patterns in the .td file protects this, but it's better to be explicit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271986 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 07:27:51 +00:00
Saleem Abdulrasool	48ff9d62da	ARM: correct TLS access on WoA TLS access requires an offset from the TLS index. The index itself is the section-relative distance of the symbol. For ARM, the relevant relocation (IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may not be an immediate and must be lowered into a constant pool. This offset will not be base relocated. We were previously emitting the actual address of the symbol which would be base relocated and would therefore be the value offset by the ImageBase + TLS Offset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271974 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 03:15:07 +00:00
Saleem Abdulrasool	7fe5e9209b	ARM: clang-format a couple of switches, add comments clang-format a couple of switches in preparation for a future change. Add some enumeration comments git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271973 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 03:15:01 +00:00
Saleem Abdulrasool	4da96fca18	ARM: normalise space in the patterns Just adjust the whitespace for the selection patterns. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271972 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 03:14:57 +00:00
Matt Arsenault	f4135c634c	AMDGPU: Add function for getting instruction size git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271936 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 20:10:33 +00:00
Matt Arsenault	bc1b8d5b49	AMDGPU: Fix constantexpr addrspacecasts If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271935 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 20:03:31 +00:00
Artem Tamazov	7049ac906c	[AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271900 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 15:23:43 +00:00
Igor Breger	5238dbe213	[KNL] Fix UMULO lowering. Differential Revision: http://reviews.llvm.org/D21013 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271891 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 12:24:52 +00:00
Benjamin Kramer	ad53d4da9e	Remove dead function with incredibly broken assert. Found by clang-tidy's misc-assert-side-effect. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271887 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 12:10:42 +00:00
Filipe Cabecinhas	67708909d0	[NFC] Silence gcc warning (-Wsign-compare) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271882 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 10:49:56 +00:00
Craig Topper	856b53e006	[AVX512] Add PALIGNR shuffle lowering for v32i16 and v16i32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271870 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 05:39:10 +00:00
Simon Pilgrim	7705b591df	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS raw mask decoding for target shuffle combines git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271834 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-05 15:21:30 +00:00
Simon Pilgrim	2d63358b82	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS as a target shuffle type git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271831 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-05 15:01:45 +00:00
Simon Pilgrim	b8b77a8df5	[X86][XOP] Tidied up DecodeVPERMIL2PMask to more closely match DecodeVPERMILPMask. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271830 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-05 14:33:43 +00:00
Craig Topper	3789c0fe24	[AVX512] Add support for lowering PALIGNR for v64i8. Could do this for other types too, but this is what's needed to replace the intrinsic with native IR in clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271828 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-05 06:29:12 +00:00
Craig Topper	dd17bc5daa	[AVX512] Fix PANDN combining for v4i32/v8i32 when VLX is enabled. v4i32/v8i32 ANDs aren't promoted to v2i64/v4i64 when VLX is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271826 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-05 05:35:11 +00:00
Simon Pilgrim	b6dac61a73	[X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decoding git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271809 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 21:44:28 +00:00
Craig Topper	c21f33738d	[X86] Add the VR128L/H and VR256L/H to the list of vector register classes for inline asm constraints. Also fix the comment on the function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271802 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 20:15:08 +00:00
Saleem Abdulrasool	9195dbacc0	X86: enable TLS on Windows itanium Windows itanium is nearly identical to windows-msvc (MS ABI for C, itanium for C++). Enable the TLS support for the target similar to the MSVC model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271797 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 18:27:22 +00:00
Simon Pilgrim	ebbbdf51f2	[X86][AVX2] Fix v16i16 SHL lowering (PR27730) The AVX2 v16i16 shift lowering works by unpacking to 2 x v8i32, performing the shift and then truncating the result. The unpacking is used to place the values in the upper 16-bits so that we can correctly sign-extend for SRA shifts. Unfortunately we weren't ensuring that the lower 16-bits were zero to ensure that SHL correctly shifts in zero bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271796 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 16:45:33 +00:00
Craig Topper	915ac2bcce	[X86] Use smaller types to shrink the intrinsic lowering tables by about 12K. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271776 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 04:32:17 +00:00
Craig Topper	32e67c3f6b	[X86] Use X86ISD::ABS for lowering pabs SSSE3/AVX intrinsics to match AVX512. Should allow those intrinsics to use the EVEX encoded instructions and get the extra registers when available. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271775 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 04:32:15 +00:00
Chad Rosier	2ef8a598eb	[AArch64] Spot SBFX-compatible code expressed with sign_extend. This is very similar to r271677, but for extracts from i32 with the SIGN_EXTEND acting on an arithmetic shift. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271717 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 20:05:49 +00:00
Derek Schuff	561fb73b85	[WebAssembly] Emit type signatures for declared functions Under emscripten, C code can take the address of a function implemented in JavaScript (which is exposed via an import in wasm). Because imports do not have a linear memory address in wasm, we need to generate a thunk to be the target of the indirect call; it calls the import directly. To make this possible, LLVM needs to emit the type signatures for these functions, because they may not be called directly or referred to other than where the address is taken. This uses a new .s directive (.functype) which specifies the signature. Differential Revision: http://reviews.llvm.org/D20891 Re-apply r271599 but instead of bailing with an error when a declared function has multiple returns, replace it with a pointer argument. Also add the test case I forgot to 'git add' last time around. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271703 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 18:34:36 +00:00
Sjoerd Meijer	81cccc948a	Code size optimisation: do not inline memcpy if this expansion results in more instructions than the library call. Differential Revision: http://reviews.llvm.org/D20958 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271678 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 15:38:55 +00:00
Chad Rosier	ce31d93762	[AArch64] Spot SBFX-compatible code expressed with sign_extend_inreg. We were assuming all SBFX-like operations would have the shl/asr form, but often when the field being extracted is an i8 or i16, we end up with a SIGN_EXTEND_INREG acting on a shift instead. This is a port of r213754 from ARM to AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271677 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 15:00:09 +00:00
Artem Tamazov	a03214d227	[test/AMDGPU] Square-braced-syntax for registers: add macro test/example. Test added as per discussion in http://reviews.llvm.org/D20588. The macro is just a demonstration, useless in practice. Coding style fixes. Differential Revision: http://reviews.llvm.org/D20797 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271675 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 14:41:17 +00:00
Sjoerd Meijer	2fca6568ee	RAS extensions are part of ARMv8.2-A. This change enables them by introducing a new instruction to ARM and AArch64 targets and several system registers. Patch by: Roger Ferrer Ibanez and Oliver Stannard Differential Revision: http://reviews.llvm.org/D20282 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271670 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 14:03:27 +00:00
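
The buffer security check walked through in the r272053 message above (steps [1] to [6]) can be modeled in a few lines. The Python sketch below is illustrative only and is not LLVM or MSVC code: the `Frame` class, the cookie constant, and the frame-pointer value are hypothetical stand-ins chosen for the example. It stores the cookie XORed with the frame pointer in a guard slot placed directly after a 10-byte buffer, then recovers and validates the cookie before returning, roughly as the commit describes for the function-based check.

```python
SECURITY_COOKIE = 0xBB40E64E  # hypothetical stand-in for __security_cookie
MASK32 = 0xFFFFFFFF


class StackCorruption(Exception):
    """Raised where the real __security_check_cookie would abort."""


def security_check_cookie(value):
    # [6] validate the recomputed cookie against the global one
    if value != SECURITY_COOKIE:
        raise StackCorruption("guard slot was overwritten")


class Frame:
    """Toy frame: a 10-byte buffer followed directly by a 4-byte guard slot."""

    BUF_SIZE = 10

    def __init__(self, frame_pointer):
        self.ebp = frame_pointer & MASK32
        self.memory = bytearray(self.BUF_SIZE + 4)
        # [1]-[3] load the global cookie, XOR it with ebp, store it in the slot
        guard = (SECURITY_COOKIE ^ self.ebp) & MASK32
        self.memory[self.BUF_SIZE:] = guard.to_bytes(4, "little")

    def memset(self, byte, count):
        # Unchecked write, like memset(buffer, 0xCC, index) in the commit's
        # example; a count above BUF_SIZE spills into the guard slot.
        # The toy model clamps at the end of the frame.
        for i in range(min(count, len(self.memory))):
            self.memory[i] = byte

    def epilogue(self):
        # [4]-[5] reload the guard slot and XOR with ebp to recover the cookie
        guard = int.from_bytes(self.memory[self.BUF_SIZE:], "little")
        security_check_cookie((guard ^ self.ebp) & MASK32)


def example(index, frame_pointer=0x0019FF40):
    frame = Frame(frame_pointer)
    frame.memset(0xCC, index)
    frame.epilogue()  # raises StackCorruption if the guard slot changed
    return True
```

Calling `example(10)` fills the buffer exactly and passes the check, while `example(12)` overwrites two bytes of the guard slot and raises `StackCorruption`. XORing with the frame pointer makes the stored guard differ per frame, so a leaked slot value from one frame does not directly reveal the global cookie.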

38399 Commits