RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-22 21:35:57 +00:00

Author	SHA1	Message	Date
Amjad Aboud	a90eb62d9e	[X86] Generate VZEROUPPER for Skylake-avx512. VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be. Differential Revision: https://reviews.llvm.org/D29874 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296859 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 09:03:24 +00:00
Craig Topper	02e45aadbe	[X86] Use SHLD with both inputs from the same register to implement rotate on Sandy Bridge and later Intel CPUs Summary: Sandy Bridge and later CPUs have better throughput using a SHLD to implement rotate versus the normal rotate instructions. Additionally it saves one uop and avoids a partial flag update dependency. This patch implements this change on any Sandy Bridge or later processor without BMI2 instructions. With BMI2 we will use RORX as we currently do. Reviewers: zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30181 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295697 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-21 06:39:13 +00:00
Craig Topper	b4bf68e4e8	[X86] Remove the HLE feature flag. We only implemented it for one of the 3 HLE instructions and that instruction is also under the RTM flag. Clang only implements the RTM flag from its command line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294562 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-09 06:51:02 +00:00
Craig Topper	4b9bffa31e	[X86] Clzero intrinsic and its addition under znver1 This patch does the following. 1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero 2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1) 3. Adds the clzero feature under znver1 architecture. 4. The custom inserter is added in Lowering. 5. A testcase is added to check the intrinsic. 6. The clzero instruction is added to assembler test. Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me. Differential revision: https://reviews.llvm.org/D29385 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294558 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-09 04:27:34 +00:00
Eugene Zelenko	5fcc26dee3	[X86] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293949 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-02 22:55:55 +00:00
Peter Collingbourne	1b5750c5ea	X86: Produce @ABS8 symbol modifiers for absolute symbols in range [0,128). Differential Revision: https://reviews.llvm.org/D28689 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293844 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-02 00:32:03 +00:00
Joerg Sonnenberger	a7862537ba	Remove an overeager assert from r288844. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292244 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 19:29:15 +00:00
Peter Collingbourne	b29976c54b	IR, X86: Understand !absolute_symbol metadata on global variables. Summary: Attaching !absolute_symbol to a global variable does two things: 1) Marks it as an absolute symbol reference. 2) Specifies the value range of that symbol's address. Teach the X86 backend to allow absolute symbols to appear in place of immediates by extending the relocImm and mov64imm32 matchers. Start using relocImm in more places where it is legal. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html Differential Revision: https://reviews.llvm.org/D25878 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289087 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-08 19:01:00 +00:00
Zvi Rackover	42694a3a7e	[X86] Prefer reduced width multiplication over pmulld on Silvermont Summary: Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld. On Silvermont [source: Optimization Reference Manual]: PMULLD has a throughput of 1/11 [instruction/cycles]. PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles]. Fixes pr31202. Analysis of this issue was done by Fahana Aleen. Reviewers: wmi, delena, mkuper Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D27203 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288844 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-06 19:35:20 +00:00
Zvi Rackover	1ccb8e7cd4	[X86][GlobalISel] Add minimal call lowering support to the IRTranslator Summary: Add basic functionality to support call lowering for X86. Currently only supports functions which return void and take zero arguments. Inspired by commit 286573. Reviewers: ab, qcolombet, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26593 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286935 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-15 06:34:33 +00:00
Pierre Gousseau	4594395329	[X86] Take advantage of the lzcnt instruction on btver2 architectures when ORing comparisons to zero. This change adds transformations such as: zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0)))) To: srl(or(ctlz(x), ctlz(y)), log2(bitsize(x)) This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar. Differential Revision: https://reviews.llvm.org/D23446 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284248 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 16:41:38 +00:00
Nikolai Bozhenov	17c4ba4fe4	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277725 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 12:47:28 +00:00
Rafael Espindola	809018e56e	Delete unused includes. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274225 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-30 12:19:16 +00:00
Rafael Espindola	99b487713f	Drop support for creating $stubs. They are created by ld64 since OS X 10.5. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274130 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-29 14:59:50 +00:00
Rafael Espindola	d980ed0d00	Move shouldAssumeDSOLocal to Target. Should fix the shared library build. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273958 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 23:15:57 +00:00
Rafael Espindola	ae47c1d46e	Simplify PICStyles. The main difference is that StubDynamicNoPIC is gone. The dynamic-no-pic mode as the name implies is simply not pic. It is just conservative about what it assumes to be dso local. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273222 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 23:41:56 +00:00
Davide Italiano	af8f40da2b	[X86Subtarget] Use isPositionIndependent(). NFC. Differential Revision: http://reviews.llvm.org/D21480 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273071 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-18 00:03:20 +00:00
Rafael Espindola	22f5417147	Use shouldAssumeDSOLocal on AArch64. This reduces code duplication and now AArch64 also handles PIE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270844 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-26 12:42:55 +00:00
Rafael Espindola	6edb5180dd	Fix shouldAssumeDSOLocal for private linkage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270746 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 19:55:16 +00:00
David Majnemer	ee47fd7219	[X86] Reduce memory allocations in X86TargetMachine::getSubtargetImpl We performed a number of memory allocations each time getTTI was called, remove them by using SmallString. No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270246 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-20 18:16:06 +00:00
Rafael Espindola	dddfc0bb56	Refactor X86 symbol access classification. This refactors the logic in X86 to avoid code duplication. It also splits it in two steps: it first decides if a symbol is local to the DSO and then uses that information to decide how to access it. The first part is implemented by shouldAssumeDSOLocal. It is not in any way specific to X86. In a followup patch I intend to move it to somewhere common and reused it in other backends. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270209 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-20 12:20:10 +00:00
Rafael Espindola	c2d48e6efd	Record a TargetMachine instead of a Reloc::Model. Addresses r270095's code review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270147 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-19 22:07:57 +00:00
Rafael Espindola	187ca85319	Remember the relocation model. NFC. This avoids passing a TargetMachine in a few places. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270095 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-19 18:49:29 +00:00
Rafael Espindola	0323a2e3a3	Style fixes. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270093 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-19 18:34:20 +00:00
Ashutosh Nema	1db659ede4	Add new flag and intrinsic support for MWAITX and MONITORX instructions Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be monitored. MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". These opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper, RKSimon Subscribers: RKSimon, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19795 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269911 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 11:59:12 +00:00
Rafael Espindola	6a70b9b746	Simplify handling of hidden stub. Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269776 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-17 16:01:32 +00:00
Marcin Koscielnicki	d11ea2f4ad	[X86] Extend some Linux special cases to cover kFreeBSD. Both Linux and kFreeBSD use glibc, so follow similiar code paths. Add isTargetGlibc to check for this, and use it instead of isTargetLinux in a few places. Fixes PR22248 for kFreeBSD. Differential Revision: http://reviews.llvm.org/D19104 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268624 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-05 11:35:51 +00:00
Sriraman Tallam	7eaa51e95d	Differential Revision: http://reviews.llvm.org/D19733 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268106 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-29 21:19:16 +00:00
Sriraman Tallam	2057e74717	Differential Revision: http://reviews.llvm.org/D19040 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267229 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-22 21:41:58 +00:00
Asaf Badouh	ec050f449d	[X86] enable PIE for functions Call locally defined function directly for PIE/fPIE Differential Revision: http://reviews.llvm.org/D19226 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266863 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-20 08:32:57 +00:00
Sriraman Tallam	0fd07efa78	Test commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265976 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-11 18:40:50 +00:00
Andrey Turetskiy	2228d07bc0	[X86] Introduction of FeatureX87. Add FeatureX87 in X86 backend to be able to define CPUs which doesn't have x87. Differential Revision: http://reviews.llvm.org/D13979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264148 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-23 11:13:54 +00:00
Yunzhong Gao	6784b3ca2d	Disable the vzeroupper insertion pass on PS4. Differential Revision: http://reviews.llvm.org/D16837 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260764 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 23:37:57 +00:00
Elena Demikhovsky	f4a0173582	Added Skylake client to X86 targets and features Changes in X86.td: I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X .. I added Skylake client processor and defined it's features FeatureADX was missing on KNL Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others Differential Revision: http://reviews.llvm.org/D16357 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258659 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-24 10:41:28 +00:00
Michael Zuckerman	d26cdd00ea	[AVX512] adding AVXVBMI feature flag The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions. More about the instruction can be found in: hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258012 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-17 13:42:12 +00:00
Asaf Badouh	22bb8f5964	[x86] adding PKU feature flag the feature flag is essential for RDPKRU and WRPKRU instruction more about the instruction can be found in the SDM rev 56, vol 2 from http://www.intel.com/sdm Differential Revision: http://reviews.llvm.org/D15491 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255644 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-15 13:35:29 +00:00
Hans Wennborg	35cba4cf6a	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254793 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-04 23:00:33 +00:00
Eric Christopher	d1b0795570	Add MMX to the 3dnow enum and propagate changes around. This makes it somewhat more consistent with how the feature is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253122 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-14 03:04:00 +00:00
Craig Topper	0f31d547eb	[X86] Add fxsr feature flag for fxsave/fxrestore instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250497 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-16 06:03:09 +00:00
Amjad Aboud	8e03ab46f2	[X86] Add XSAVE intrinsic family Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250029 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-12 11:47:46 +00:00
Eric Christopher	47f0e3f434	Move the MMX subtarget feature out of the SSE set of features and into its own variable. This is needed so that we can explicitly turn off MMX without turning off SSE and also so that we can diagnose feature set incompatibilities that involve MMX without SSE. Rationale: // sse3 __m128d test_mm_addsub_pd(__m128d A, __m128d B) { return _mm_addsub_pd(A, B); } // mmx void shift(__m64 a, __m64 b, int c) { _mm_slli_pi16(a, c); _mm_slli_pi32(a, c); _mm_slli_si64(a, c); _mm_srli_pi16(a, c); _mm_srli_pi32(a, c); _mm_srli_si64(a, c); _mm_srai_pi16(a, c); _mm_srai_pi32(a, c); } clang -msse3 -mno-mmx file.c -c For this code we should be able to explicitly turn off MMX without affecting the compilation of the SSE3 function and then diagnose and error on compiling the MMX function. This matches the existing gcc behavior and follows the spirit of the SSE/MMX separation in llvm where we can (and do) turn off MMX code generation except in the presence of intrinsics. Updated a couple of tests, but primarily tested with a couple of tests for turning on only mmx and only sse. This is paired with a patch to clang to take advantage of this behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249731 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-08 20:10:06 +00:00
Daniel Sanders	47b167dd84	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247702 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 16:17:27 +00:00
Daniel Sanders	9781f90c7e	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 14:08:28 +00:00
Daniel Sanders	a6aa0c3bcc	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247686 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 13:46:21 +00:00
Daniel Sanders	7b82808e13	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 13:17:40 +00:00
Sanjay Patel	ac515c4087	rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI) This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246585 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-01 20:51:51 +00:00
Sanjay Patel	af7c00f213	make fast unaligned memory accesses implicit with SSE4.2 or SSE4a This is a follow-on from the discussion in http://reviews.llvm.org/D12154. This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has generally fast unaligned memory ops. A motivating use case for this change is a clang invocation that doesn't explicitly set the CPU, but does target a feature that we know only exists on a CPU that supports fast unaligned memops. For example: $ clang -O1 foo.c -mavx This resolves a difference in lowering noted in PR24449: https://llvm.org/bugs/show_bug.cgi?id=24449 Before this patch, we used different store types depending on whether the example can be lowered as a memset or not. Differential Revision: http://reviews.llvm.org/D12288 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245950 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-25 16:29:21 +00:00
Sanjay Patel	2071d7abd9	[x86] invert logic for attribute 'FeatureFastUAMem' This is a 'no functional change intended' patch. It removes one FIXME, but adds several more. Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however, you can see that we're not consistent about this. Changing the name of the attribute makes it clearer to see the logic holes. Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute to new chips; fast unaligned accesses have been standard for several generations of CPUs now. Differential Revision: http://reviews.llvm.org/D12154 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245729 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-21 20:17:26 +00:00
Sanjay Patel	22ac7abe6c	don't repeaat function names in comments; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245058 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-14 15:11:42 +00:00
Duncan P. N. Exon Smith	5733fd14d4	MC: Remove MCSubtargetInfo::InitCPUSched() Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`. We were only calling the former after explicitly calling the latter with the same CPU; it's confusing to have both methods exposed. Besides a minor (surely unmeasurable) speedup in ARM and X86 from avoiding running the logic twice, no functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241956 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-10 22:33:01 +00:00

1 2 3 4 5 ...

379 Commits