llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-01-16 13:08:42 +00:00

Author	SHA1	Message	Date
Johannes Doerfert	707a406078	Add bounded loop assumption So far we ignored the unbounded parts of the iteration domain, however we need to assume they do not occur at all to remain sound if they do. llvm-svn: 248126	2015-09-20 16:38:19 +00:00
Johannes Doerfert	f2cc86edae	Simplify domain generation We now add loop carried information during the second traversal of the region instead of in an intermediate step in-between. This makes the generation simpler, removes code and should even be faster. llvm-svn: 248125	2015-09-20 16:15:32 +00:00
Johannes Doerfert	0c1123a831	[FIX] Repair test case that was unprofitable llvm-svn: 248124	2015-09-20 16:14:41 +00:00
Sanjay Patel	bab5d6c636	add test file ahead of any functional changes for PR22428 llvm-svn: 248123	2015-09-20 15:58:00 +00:00
Simon Pilgrim	c6a553241c	[X86][SSE] Intrinsics builtins test refresh. NFCI llvm-svn: 248122	2015-09-20 15:41:35 +00:00
Igor Breger	b7e1f9d680	AVX512: Implemented encoding and intrinsics for vcmpss/sd. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12593 llvm-svn: 248121	2015-09-20 15:15:10 +00:00
Johannes Doerfert	06c57b594c	Allow loops with multiple back edges In order to allow multiple back edges we: - compute the conditions under which each back edge is taken - build the union over all these conditions, thus the condition that any back edge is taken - apply the same logic to the union we applied to a single back edge llvm-svn: 248120	2015-09-20 15:00:20 +00:00
Johannes Doerfert	7175bdfbe4	Add loop trip count based heuristic for SCoP detection As we currently do not perform any optimizations that target (or are even aware of) small trip counts we will skip them when we count the loops in a region. llvm-svn: 248119	2015-09-20 14:56:54 +00:00
Johannes Doerfert	b276bde162	[NFC] Remove obsolete diagnostic for unstructured control flow llvm-svn: 248118	2015-09-20 14:55:50 +00:00
Asaf Badouh	2744d21fb8	[X86][AVX512] extend support in Scalar conversion add scalar FP to Int conversion with truncation intrinsics add scalar conversion FP32 from/to FP64 intrinsics add rounding mode and SAE mode encoding for these intrinsics Differential Revision: http://reviews.llvm.org/D12665 llvm-svn: 248117	2015-09-20 14:31:19 +00:00
Igor Breger	4c4cd789c9	AVX512: vsqrtss/sd encoding and intrinsics implementation. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12102 llvm-svn: 248116	2015-09-20 09:13:41 +00:00
Asaf Badouh	572bbceecc	[X86][AVX512DQ] Add fpclass instruction Differential Revision: http://reviews.llvm.org/D12931 llvm-svn: 248115	2015-09-20 08:46:07 +00:00
Michael Kuperstein	58e86bc893	[X86] Fix sitofp and uitofp instruction matching failures with long double and avx512 The operation action for i32 and i64 cannot be set to legal, as long double needs custom lowering. Patch by: mitch.l.bodart@intel.com Differential Revision: http://reviews.llvm.org/D12372 llvm-svn: 248114	2015-09-20 08:12:17 +00:00
Igor Breger	1d55f20bee	AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4 Added tests for intrinsics. Differential Revision: http://reviews.llvm.org/D12525 llvm-svn: 248113	2015-09-20 07:18:53 +00:00
Sanjoy Das	9119bf4c0b	[IndVars] Don't repeat function names in comment; NFC. Only changes comments. llvm-svn: 248112	2015-09-20 06:58:03 +00:00
Igor Breger	0ede3cbb5c	AVX512: Implement instructions encoding, lowering and intrinsics vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4 Added tests for encoding, lowering and intrinsics. Differential Revision: http://reviews.llvm.org/D11893 llvm-svn: 248111	2015-09-20 06:52:42 +00:00
Saleem Abdulrasool	4966f58ac2	ARM: cleanup formatting clang-format a line which was poorly formatted. NFC. llvm-svn: 248110	2015-09-20 03:19:09 +00:00
Rui Ueyama	997b357ac1	COFF: Run InputFile::parse() in background using std::async(). Previously, InputFile::parse() was run in batch. We construct a list of all input files and call parse() on each file using parallel_for_each. That means we cannot start parsing files until we get a complete list of input files, although InputFile::parse() is safe to call from anywhere. This patch makes it asynchronous. As soon as we add a file to the symbol table, we now start parsing the file using std::async(). This change shortens self-hosting time (650 ms) by 28 ms. It's about 4% improvement. llvm-svn: 248109	2015-09-20 03:11:16 +00:00
Saleem Abdulrasool	af99cd4174	EH: fix register usage for SjLj When using SjLj EH, do not use __builtin_eh_return_regno, map directly to the ID. This would work on some targets, particularly those where the non-SjLj EH personality used the same register mapping (0 -> 0, 1 -> 1). However, this is not guaranteed. Avoiding the use of the builtin enables the use of libc++ with SjLj EH on all targets. llvm-svn: 248108	2015-09-20 02:08:31 +00:00
Sanjoy Das	428db150d1	[IndVars] Fix a bug in r248045. Because -indvars widens induction variables through arithmetic, `NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV` manages information for all transitive uses of an IV being widened, including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse` which manages information for a specific def-use edge in the transitive use list of an induction variable. This change also adds a test case that demonstrates the problem with r248045. llvm-svn: 248107	2015-09-20 01:52:18 +00:00
Rui Ueyama	f49712a853	COFF: Fix race condition. NextID is updated inside parallel_for_each, so it needs mutual exclusion. llvm-svn: 248106	2015-09-20 01:44:44 +00:00
Rui Ueyama	3cfd2bff1e	Remove dead code. llvm-svn: 248105	2015-09-20 01:19:36 +00:00
Rui Ueyama	1cce300843	COFF: Change Symbol::Body type from atomic pointer to regular pointer. I made the field an atomic pointer in hope that we would be able to parallelize the symbol resolver soon, but that's not going to happen soon. This patch reverts that change for the sake of readability. llvm-svn: 248104	2015-09-20 00:00:05 +00:00
Rui Ueyama	63bbe84b27	COFF: Make Chunk::writeTo() const. NFC. This should improve code readability especially because this function is called inside parallel_for_each. llvm-svn: 248103	2015-09-19 23:28:57 +00:00
Rui Ueyama	ebb0ebff4b	COFF: Fix thread-safety bug. LTOModule doesn't seem to be thread-safe, so guard that with mutex. llvm-svn: 248102	2015-09-19 23:14:51 +00:00
Adrian Prantl	1e63b2bdc3	Further simplify the interface of PCHContainerGenerator by dropping the const qualifier on the CI. NFC llvm-svn: 248101	2015-09-19 21:42:52 +00:00
Nico Weber	fb80f961df	Convert two loops to range-based loops. No behavior change. llvm-svn: 248100	2015-09-19 21:36:51 +00:00
Rui Ueyama	a5f0f758d3	COFF: Move markLive() from Writer.cpp to its own file. Conceptually, garbage collection is not part of Writer, so move the function out of the file. llvm-svn: 248099	2015-09-19 21:36:28 +00:00
Rui Ueyama	0652c59506	COFF: Actually parallelize InputFile::parse(). This is a follow-up patch to r248078. llvm-svn: 248098	2015-09-19 21:33:26 +00:00
Davide Italiano	e210ee56f2	Fixup r248096, commit the correct test. llvm-svn: 248097	2015-09-19 20:52:47 +00:00
Davide Italiano	a539f63ae1	[obj2yaml] Fix "time of check to time of use" bug. Add a test. llvm-svn: 248096	2015-09-19 20:49:34 +00:00
Saleem Abdulrasool	06f6f995a1	Driver: alter the getARMFloatABI signature This changes getARMFloatABI to use the ToolChain and Args instead of Driver, Args, Triple. Although this pushes the Triple calculation/parsing into the function itself, it enables the use of the function for a future change. The reason to sink the triple calculation here is to avoid threading the Triple through multiple layers in a future change. llvm-svn: 248095	2015-09-19 20:40:16 +00:00
Saleem Abdulrasool	ce63ce947e	Driver: tweak ARM target feature calculation Rather than re-calculating the effective triple, thread the already calculated value down into AddARMTargetArgs. This avoids both recreating the triple, as well as re-parsing the triple as it was already done in the previous frame. llvm-svn: 248094	2015-09-19 18:19:44 +00:00
Simon Pilgrim	27f81776ad	[X86][AVX2] Use general sext IR for vpmovsx stack folding tests llvm-svn: 248093	2015-09-19 17:04:18 +00:00
Simon Pilgrim	12919f7e49	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR 128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds. This patch removes the builtins and reimplements the _mm_cvtepi_epi intrinsics using __builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension). Differential Revision: http://reviews.llvm.org/D12835 llvm-svn: 248092	2015-09-19 15:12:38 +00:00
Simon Pilgrim	d0448ee59f	[X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)) Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)) Differential Revision: http://reviews.llvm.org/D12663 llvm-svn: 248091	2015-09-19 13:22:57 +00:00
Alexander Kornienko	39861b3eb5	[clang-tidy] Fix example comments. Patch by don hinton! Differential revision: http://reviews.llvm.org/D12967 llvm-svn: 248090	2015-09-19 13:01:57 +00:00
Simon Pilgrim	996725eb17	[InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI. Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680. llvm-svn: 248089	2015-09-19 11:41:53 +00:00
NAKAMURA Takumi	5881d349f9	[CMake] Update LLVM_TEST_DEPENDS not to use macho-dump. It has been unused since r247235. llvm-svn: 248088	2015-09-19 07:19:30 +00:00
Matt Arsenault	1fafdc82d6	AMDGPU: Remove dead code getCFGStructurizerRegClass is not used for SI, so move it into R600 specific stuff. llvm-svn: 248087	2015-09-19 06:41:10 +00:00
Bob Wilson	8823b84fae	NFC: Fix indentation and add braces to clarify nesting of else-statement. llvm-svn: 248086	2015-09-19 06:20:59 +00:00
Serge Pavlov	c4e04a2964	[Modules] More descriptive diagnostics for misplaced import directive If an import directive was put into the wrong context, the error message was obscure, complaining about misbalanced braces. To get more descriptive messages, annotation tokens related to modules are processed where they must not be seen. Differential Revision: http://reviews.llvm.org/D11844 llvm-svn: 248085	2015-09-19 05:32:57 +00:00
Maksim Panchenko	0510cd5161	[PrologEpilogInserter] Minor refactoring. Differential Revision: http://reviews.llvm.org/D12924 llvm-svn: 248084	2015-09-19 04:42:15 +00:00
Saleem Abdulrasool	d5556e34e4	Driver: avoid unnecessary string ops Use an enumeration for the Floating Point ABIs supported on MIPS. This is replicating the ARM change to avoid string based tracking of the floating point ABI. NFC. llvm-svn: 248083	2015-09-19 04:33:38 +00:00
Maksim Panchenko	07b754daf8	Test commit. Fix comment. NFC. llvm-svn: 248082	2015-09-19 04:01:19 +00:00
Rui Ueyama	27e9e6540c	Remove unused #includes. llvm-svn: 248081	2015-09-19 02:28:32 +00:00
NAKAMURA Takumi	e677e2f545	clang-tools-extra: Appease PR24881. [-Wdocumentation] \returns doesn't accept \li, but \parblock \li. llvm-svn: 248080	2015-09-19 02:21:28 +00:00
Richard Smith	664798c034	Add test that we correctly allow some non-letter unicode characters in identifiers, and extend existing test to also cover C++. llvm-svn: 248079	2015-09-19 02:14:12 +00:00
Rui Ueyama	f4d05d7a80	COFF: Parallelize InputFile::parse(). InputFile::parse() can be called in parallel with other calls of the same function. By doing that, time to self-link improves from 741 ms to 654 ms or 12% faster. This is probably the last low hanging fruit in terms of parallelism. Input file parsing and symbol table insertion takes 450 ms in total. If we want to optimize further, we probably have to parallelize symbol table insertion using concurrent hashmap or something. That's doable, but that's not easy, especially if you want to keep the exact same semantics and linking order. I'm not going to do that at least soon. Anyway, compared to r248019 (the change before the first attempt for parallelism), we achieved 36% performance improvement from 1022 ms to 654 ms. MSVC linker takes 3.3 seconds to link the same program. MSVC's ICF feature is very slow for some reason, but even if we disable the feature, it still takes about 1.2 seconds. Our number is probably good enough. llvm-svn: 248078	2015-09-19 01:48:26 +00:00
Adrian Prantl	2f957ac092	Further simplify CGDebugInfo::getOrCreateModuleRef(). DIBuilder ignores DICompileUnits that are passed in as scopes anyway. llvm-svn: 248077	2015-09-19 00:59:22 +00:00
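
The two CTTZ patterns quoted in the r248091 message, cttz(x) = ctpop((x & -x) - 1) and cttz_undef(x) = (width - 1) - ctlz(x & -x), can be checked on scalars. The following is only a minimal sketch of those identities for 32-bit values, not the vectorized X86 lowering from the commit; the helper names (popcount32, ctlz32, cttz_via_ctpop, cttz_undef_via_ctlz, self_check) are hypothetical and do not come from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: portable population count (number of set bits).
unsigned popcount32(std::uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper: count leading zero bits; yields 32 for x == 0.
unsigned ctlz32(std::uint32_t x) {
    unsigned n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

// cttz(x) = ctpop((x & -x) - 1).
// x & -x isolates the lowest set bit; subtracting 1 turns it into a mask of
// exactly the trailing-zero positions. For x == 0 the mask is all ones,
// so the result is the full width (32).
unsigned cttz_via_ctpop(std::uint32_t x) {
    std::uint32_t lowest = x & (0u - x);
    return popcount32(lowest - 1u);
}

// cttz_undef(x) = (width - 1) - ctlz(x & -x).
// Only meaningful for x != 0, matching the "zero is undefined" variant.
unsigned cttz_undef_via_ctlz(std::uint32_t x) {
    return 31u - ctlz32(x & (0u - x));
}

// Compares both formulas against a naive bit-by-bit count.
void self_check() {
    for (std::uint32_t x = 1; x < 4096; ++x) {
        unsigned naive = 0;
        while (((x >> naive) & 1u) == 0) ++naive;
        assert(cttz_via_ctpop(x) == naive);
        assert(cttz_undef_via_ctlz(x) == naive);
    }
}
```

Calling self_check() compares both formulas against a naive loop for small inputs. The zero case is the only place the two variants differ: the ctpop form is well defined there, while the ctlz form is left undefined.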


210835 Commits