llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-15 07:39:31 +00:00

Author	SHA1	Message	Date
Andrea Di Biagio	e962698410	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234	2014-01-27 18:45:30 +00:00
Reid Kleckner	c863fd7d4a	Silence MSVC warning on 'uint16_t |= bool' with a cast. This isn't C4800, it's C4805. MSVC says this is unsafe, but it generates correct code. llvm-svn: 200229	2014-01-27 17:47:11 +00:00
NAKAMURA Takumi	b340d5771b	[CMake] Put *_exports into "Misc" folder. llvm-svn: 200228	2014-01-27 17:39:38 +00:00
David Majnemer	68ba6d5a5b	MC: Add support for .cfi_startproc simple. This commit allows LLVM MC to process .cfi_startproc directives when they are followed by an additional `simple' identifier. The identifier signals that the target-specific CFI instructions normally emitted at the start should be elided. This fixes PR16587. Differential Revision: http://llvm-reviews.chandlerc.com/D2624 llvm-svn: 200227	2014-01-27 17:20:25 +00:00
Tobias Grosser	81ff43601d	Do not reference llvm-gcc from bugpoint. Reiterating: llvm-gcc has been dead for a long time. llvm-svn: 200220	2014-01-27 13:44:58 +00:00
Chandler Carruth	f70ef7ae29	[vectorize] Initial version of respecting PGO in the vectorizer: treat cold loops as if they were being optimized for size. Nothing fancy here. Simple test case included. The nice thing is that we can now incrementally build on top of this to drive other heuristics. All of the infrastructure work is done to get the profile information into this layer. The remaining work necessary to make this a fully general purpose loop unroller for very hot loops is to make it a fully general purpose loop unroller. Things I know of but am not going to have time to benchmark and fix in the immediate future: 1) Don't disable the entire pass when the target is lacking vector registers. This really doesn't make any sense any more. 2) Teach the unroller at least and the vectorizer potentially to handle non-if-converted loops. This is trivial for the unroller but hard for the vectorizer. 3) Compute the relative hotness of the loop and thread that down to the various places that make cost tradeoffs (very likely only the unroller makes sense here, and then only when dealing with loops that are small enough for unrolling to not completely blow out the LSD). I'm still dubious how useful hotness information will be. So far, my experiments show that if we can get the correct logic for determining when unrolling actually helps performance, the code size impact is completely unimportant and we can unroll in all cases. But at least we'll no longer burn code size on cold code. One somewhat unrelated idea that I've had forever but not had time to implement: mark all functions which are only reachable via the global constructors rigging in the module as optsize. This would also decrease the impact of any more aggressive heuristics here on code size. llvm-svn: 200219	2014-01-27 13:11:50 +00:00
Benjamin Kramer	65df2371a8	ConstantHoisting: We can't insert instructions directly in front of a PHI node. Insert before the terminating instruction of the dominating block instead. llvm-svn: 200218	2014-01-27 13:11:43 +00:00
Benjamin Kramer	ce3ca2ba83	XCore: Fix typo in function name. llvm-svn: 200216	2014-01-27 11:50:13 +00:00
Chandler Carruth	88d92716dd	[vectorizer] Add an override for the target instruction cost and use it to stabilize a test that really is trying to test generic behavior and not a specific target's behavior. llvm-svn: 200215	2014-01-27 11:41:50 +00:00
Chandler Carruth	eb82628ff7	[vectorizer] Simplify code to use existing helpers on the Function object and fewer pointless variables. Also, add a clarifying comment and a FIXME because the code which disables all vectorization if we can't use implicit floating point instructions just makes no sense at all. llvm-svn: 200214	2014-01-27 11:27:37 +00:00
Chandler Carruth	d1ecfe35ae	[vectorizer] Teach the loop vectorizer's unroller to only unroll by powers of two. This is essentially always the correct thing given the impact on alignment, scaling factors that can be used in addressing modes, etc. Also, fix the management of the unroll vs. small loop cost to more accurately model things with this world. Enhance a test case to actually exercise more of the unroll machinery if using synthetic constants rather than a specific target model. Before this change, with the added flags this test will unroll 3 times instead of either 2 or 4 (the two sensible answers). While I don't expect this to make a huge difference, if there are lots of loops sitting right on the edge of hitting the 'small unroll' factor, they might change behavior. However, I've benchmarked moving the small loop cost up and down in many various ways and by a huge factor (2x) without seeing more than 0.2% code size growth. Small adjustments such as the series that led up here have led to about 1% improvement on some benchmarks, but it is very close to the noise floor so I mostly checked that nothing regressed. Let me know if you see bad behavior on other targets but I don't expect this to be a sufficiently dramatic change to trigger anything. llvm-svn: 200213	2014-01-27 11:12:24 +00:00
Chandler Carruth	bdbe34a1a1	[vectorizer] Add some flags which are useful for conducting experiments with the unrolling behavior in the loop vectorizer. No functionality changed at this point. These are a bit hack-y, but talking with Hal, there doesn't seem to be a cleaner way to easily experiment with different thresholds here and he was also interested in them so I wanted to commit them. Suggestions for improvement are very welcome here. llvm-svn: 200212	2014-01-27 11:12:19 +00:00
Chandler Carruth	dd6cf9494b	[vectorizer] Fix a trivial oversight where we always requested the number of vector registers rather than toggling between vector and scalar register number based on VF. I don't have a test case as I spotted this by inspection and on X86 it only makes a difference if your target is lacking SSE and thus has no vector registers. If someone wants to add a test case for this for ARM or somewhere else where this is more significant, that would be awesome. Also made the variable name a bit more sensible while I'm here. llvm-svn: 200211	2014-01-27 11:12:14 +00:00
Nick Lewycky	30a25a7139	Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else! llvm-svn: 200210	2014-01-27 10:47:44 +00:00
Nick Lewycky	baf1d18cf0	Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any number of contiguous 1 bits in a row, with any number of leading and trailing 0 bits. Unfortunately, this in turn led to some lower quality SCEVs due to some different paths through expression simplification, so add getUDivExactExpr and use it. This fixes all instances of the problems that I found, but we can make that function smarter as necessary. Merge test "xor-and.ll" into "and-xor.ll" since I needed to update it anyways. Test 'nsw-offset.ll' analyzes a little deeper, %n now gets a scev in terms of %no instead of a SCEVUnknown. llvm-svn: 200203	2014-01-27 10:04:03 +00:00
Stepan Dyatkovskiy	4a30c89d70	Additional fix for 200201: due to dependence on bitwidth test was moved to X86 directory. llvm-svn: 200202	2014-01-27 09:43:10 +00:00
Stepan Dyatkovskiy	9fc6926a85	Fix for PR18102. The issue stems from DAGCombiner::MergeConsequtiveStores, more precisely from how it sorts the mem-op sequence. Consider how MergeConsequtiveStores handles the following example: 'store i8 1, a[0]'; 'store i8 2, a[1]'; 'store i8 3, a[1]' (a[1] again); 'return' (the DAG starts here). 1. The method collects all 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes the first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is that we can't determine here which 'store' instruction will be second after sorting ('store 2' or 'store 3'). It happens that 'store 3' ends up second and 'store 2' third. So after merging we get the following result: 'store i16 (1 | 3 << 8), base' (base is a[0], but bit-cast to i16) followed by 'store i8 2, a[1]'. So we have actually swapped 'store 3' and 'store 2' and got the wrong contents in a[1]. Fix: in the sort routine, also take the mem-op sequence number into account. llvm-svn: 200201	2014-01-27 09:18:31 +00:00
Chandler Carruth	a89deb11ba	[vectorizer] Clean up the handling of unvectorized loop unrolling in the LoopVectorize pass. The logic here doesn't make much sense. We only unrolled if the unvectorized loop was a reduction loop with a single basic block and small loop body. The reduction part in particular doesn't make much sense. Instead, if we just fall through to the vectorized unroll logic it makes more sense of unrolling if there is a vectorized reduction that could be hacked on by the SLP vectorizer or if the loop is small. This is mostly a cleanup and nothing in the test suite really exercises this, but I did run benchmarks across this change and saw no really significant changes. llvm-svn: 200198	2014-01-27 08:17:58 +00:00
Michel Danzer	65a5397c22	R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructions. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196	2014-01-27 07:20:51 +00:00
Michel Danzer	36dd8ac577	R600/SI: Add intrinsic for S_SENDMSG instruction. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195	2014-01-27 07:20:44 +00:00
Alp Toker	b29938426f	Roll back the ConstStringRef change for now. There are a couple of interesting things here that we want to check over (particularly the expecting asserts in StringRef) and get right for general use in ADT, so hold back on this one. For clang we have a workable templated solution to use in the meantime. This reverts commit r200187. llvm-svn: 200194	2014-01-27 05:24:39 +00:00
Rafael Espindola	d8044a2e39	Print .mask and .fmask with the target streamer. Testing this also found the missing '\n' after .frame that this patch also fixes. llvm-svn: 200192	2014-01-27 04:33:11 +00:00
Rui Ueyama	f81e57de36	Rename IMAGE_DLL_CHARACTERISTICS_HIGH_ENTROPY_VA. editbin.exe and link.exe both accept the /highentropyva option to set this bit, so doing s/VIRTUAL_ADDRESS/VA/ should make sense. llvm-svn: 200191	2014-01-27 04:22:24 +00:00
Alp Toker	06df538b11	Move true/false StringRef helper to StringExtras. StringRef is a low-level data wrapper that shouldn't know about language strings like 'true' and 'false', whereas StringExtras is just the place for higher-level utilities. llvm-svn: 200188	2014-01-27 04:07:36 +00:00
Alp Toker	43e8630002	StringRef: Extend constexpr capabilities and introduce ConstStringRef. (1) Add llvm_expect(), an asserting macro that can be evaluated as a constexpr expression as well as a runtime assert or compiler hint in release builds. This technique can be used to construct functions that are both unevaluated and compiled depending on usage. (2) Update StringRef using llvm_expect() to preserve runtime assertions while extending the same checks to static asserts in C++11 builds that support the feature. (3) Introduce ConstStringRef, a strong subclass of StringRef that references compile-time constant strings. It's convertible to, but not from, ordinary StringRef and thus can be used to add compile-time safety to various interfaces in LLVM and clang that only accept fixed inputs such as diagnostic format strings that tend to get misused. llvm-svn: 200187	2014-01-27 04:07:17 +00:00
Rafael Espindola	3cc89c008e	Print .frame via the target streamer. llvm-svn: 200186	2014-01-27 03:53:56 +00:00
Kevin Qin	436aae7633	[AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR. Replace r199791. llvm-svn: 200180	2014-01-27 02:53:54 +00:00
Kevin Qin	d83dee8270	Revert r199791. It's an old version that has some bugs. I'll commit the latest patch soon. llvm-svn: 200179	2014-01-27 02:53:41 +00:00
Rafael Espindola	529d00682f	Use SwitchSection in MipsAsmPrinter::EmitStartOfAsmFile. llvm-svn: 200178	2014-01-27 01:33:33 +00:00
Rafael Espindola	ba41c3a33d	Remove dead code. llvm-svn: 200174	2014-01-27 00:47:51 +00:00
Rafael Espindola	c1b8f12d60	Add back spaces I missed in the conversion to emitRawComments. Sorry about that. llvm-svn: 200171	2014-01-27 00:19:41 +00:00
Rafael Espindola	7ee6a0aa63	Use emitRawComment instead of EmitRawText. llvm-svn: 200170	2014-01-27 00:16:00 +00:00
Rafael Espindola	ff7cc35a84	Add missing file. llvm-svn: 200169	2014-01-27 00:08:17 +00:00
Rafael Espindola	26ffd68198	Add a XCoreTargetStreamer and port over the simple uses of EmitRawText. llvm-svn: 200167	2014-01-26 23:57:05 +00:00
Saleem Abdulrasool	91bd110de4	MC: fix test locations/name. Placed the MC variant diagnostics in the wrong directory accidentally. Move them into their respective architecture-specific directories. llvm-svn: 200161	2014-01-26 22:55:02 +00:00
Saleem Abdulrasool	bad8eceeeb	ARM: improve diagnostics for .word directive. If a complex expression was passed to the .word directive and the first part of the directive failed to parse, a secondary diagnostic would be produced that would clutter the error diagnostics. Improve the diagnostics by consuming the remainder of the statement. llvm-svn: 200160	2014-01-26 22:29:50 +00:00
Saleem Abdulrasool	254ffbe584	AsmParser: improve diagnostics for invalid variants. An emitted diagnostic for an invalid relocation variant would place the caret on the token following the relocation variant indicator or at the end of the line if there was no following token. This change corrects the placement of the caret to point to the token. llvm-svn: 200159	2014-01-26 22:29:43 +00:00
Saleem Abdulrasool	31387685d9	MC: whitespace. Fix indentation, remove unnecessary line. NFC. llvm-svn: 200158	2014-01-26 22:29:36 +00:00
Alp Toker	cfd11a14d9	Avoid C++ comment in C sources. lib/Target/X86/Disassembler/X86DisassemblerDecoder.c:1361:7: error: C++ style comments are not allowed in ISO C90 llvm-svn: 200153	2014-01-26 18:44:34 +00:00
Evan Cheng	c4852e298b	Follow up of r200095. Code clean up. llvm-svn: 200152	2014-01-26 18:30:13 +00:00
NAKAMURA Takumi	e6d62c55d1	[CMake] tablegen(): Use -I <dir> according to the list by include_directories(). For now, local_tds and global_tds are integrated to dependent_tds. llvm-svn: 200150	2014-01-26 12:41:38 +00:00
NAKAMURA Takumi	3d3cc2031e	[CMake] Functionalize tblgen(). llvm-svn: 200149	2014-01-26 12:41:33 +00:00
Jakob Stoklund Olesen	8dcf39e4f3	Clean up the Legal/Expand logic for SPARC popc. llvm-svn: 200141	2014-01-26 08:12:34 +00:00
Rafael Espindola	39bfe463a9	Implement the missing bits corresponding to .mips_hack_elf_flags. These were: (1) noreorder handling on the target object streamer and asm parser; (2) setting the initial flag bits based on the enabled features; (3) setting the ELF header flag for micromips. It is really depressing I am the one doing this instead of someone at mips actually taking the time to understand the infrastructure. llvm-svn: 200138	2014-01-26 06:57:13 +00:00
Rafael Espindola	bfdd58b802	Pass a MCSubtargetInfo down to the TargetStreamer creation. With this the target streamers will be able to know the target features that are in use. llvm-svn: 200135	2014-01-26 06:38:58 +00:00
NAKAMURA Takumi	937fce3653	[CMake] configure_lit_site_cfg: ${SHLIBDIR} should point the build tree. llvm-svn: 200134	2014-01-26 06:18:56 +00:00
Jakob Stoklund Olesen	6804208a8e	Only generate the popc instruction for SPARC CPUs that implement it. The popc instruction is defined in the SPARCv9 instruction set architecture, but it was emulated on CPUs older than Niagara 2. llvm-svn: 200131	2014-01-26 06:09:59 +00:00
Jakob Stoklund Olesen	0e9b704ac9	Fix swapped CASA operands. Found by SingleSource/UnitTests/AtomicOps.c llvm-svn: 200130	2014-01-26 06:09:54 +00:00
Rafael Espindola	806f778fa0	Construct the MCStreamer before constructing the MCTargetStreamer. This has a few advantages: (1) Only targets that use a MCTargetStreamer have to worry about it. (2) There is never a MCTargetStreamer without a MCStreamer, so we can use a reference. (3) A MCTargetStreamer can talk to the MCStreamer in its constructor. llvm-svn: 200129	2014-01-26 06:06:37 +00:00
Venkatraman Govindaraju	4cbcf2fd77	[Sparc] Add support for parsing DW_CFA_GNU_window_save. llvm-svn: 200127	2014-01-26 05:13:44 +00:00
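
The ordering problem described in the PR18102 fix (9fc6926a85, llvm-svn 200201) can be reproduced with a small toy model. The sketch below is not DAGCombiner code: the `Store` record, its field names, and both helper functions are hypothetical, and it assumes that any store left unmerged executes after the merged wide store, which matches the outcome the commit message describes. It only illustrates why sorting by offset alone is ambiguous when two stores hit the same address, and how adding the sequence number as a tie-breaker removes the ambiguity.

```python
from typing import Callable, Dict, List, NamedTuple


class Store(NamedTuple):
    seq: int     # position in program order (hypothetical field)
    offset: int  # byte offset from the base pointer
    value: int   # i8 value written


def run_in_program_order(stores: List[Store]) -> Dict[int, int]:
    """Reference semantics: later stores overwrite earlier ones."""
    memory: Dict[int, int] = {}
    for s in sorted(stores, key=lambda s: s.seq):
        memory[s.offset] = s.value
    return memory


def merge_and_run(collected: List[Store], key: Callable) -> Dict[int, int]:
    """Sort with `key`, merge the first run of consecutive offsets into one
    wide store, then apply the leftover stores after it (toy assumption)."""
    ordered = sorted(collected, key=key)
    merged = [ordered[0]]
    rest = ordered[1:]
    while rest and rest[0].offset == merged[-1].offset + 1:
        merged.append(rest.pop(0))
    memory: Dict[int, int] = {}
    for s in merged:  # the single merged store
        memory[s.offset] = s.value
    for s in sorted(rest, key=lambda s: s.seq):  # stores left unmerged
        memory[s.offset] = s.value
    return memory


# store i8 1, a[0]; store i8 2, a[1]; store i8 3, a[1]
stores = [Store(0, 0, 1), Store(1, 1, 2), Store(2, 1, 3)]
# Assume the stores are collected walking backwards from the end of the DAG.
collected = list(reversed(stores))

expected = run_in_program_order(stores)
buggy = merge_and_run(collected, key=lambda s: s.offset)
fixed = merge_and_run(collected, key=lambda s: (s.offset, s.seq))
print(expected, buggy, fixed)
```

With offset as the only key, 'store 3' sorts ahead of 'store 2' and gets merged, so 'store 2' ends up overwriting a[1]; with the sequence number as a secondary key, the result matches program order.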

99943 Commits