llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-30 06:40:53 +00:00

Author	SHA1	Message	Date
Rafael Espindola	8113c62abe	Replace .mips_hack_stocg with ".set micromips" and ".set nomicromips". This matches what gnu as does and implementing this is easier than arguing about it. llvm-svn: 199181	2014-01-14 04:25:13 +00:00
Mark Seaborn	76b70ff14a	Fix llc to not reuse spill slots in functions that invoke setjmp() We need to ensure that StackSlotColoring.cpp does not reuse stack spill slots in functions that call "returns_twice" functions such as setjmp(), otherwise this can lead to miscompiled code, because a stack slot would be clobbered when it's still live. This was already handled correctly for functions that call setjmp() (though this wasn't covered by a test), but not for functions that invoke setjmp(). We fix this by changing callsFunctionThatReturnsTwice() to check for invoke instructions. This fixes PR18244. llvm-svn: 199180	2014-01-14 04:20:01 +00:00
Cameron McInally	551c2b6ed9	Clean up RUN command for Assembler/getInt.ll. llvm-svn: 199158	2014-01-13 22:37:35 +00:00
Cameron McInally	0d6aa675fa	Fix uninitialized warning in llvm/lib/IR/DataLayout.cpp. llvm-svn: 199147	2014-01-13 22:04:55 +00:00
Juergen Ributzka	52e4b4d675	[DAG] Teach DAG to also reassociate vector operations This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135	2014-01-13 20:51:35 +00:00
Weiming Zhao	04e2261e54	Fix PR 18369: [Thumbv8] asserts due to inconsistent CPSR liveness of IT blocks The issue is caused when Post-RA scheduler reorders a bundle instruction (IT block). However, it only flips the CPSR liveness of the bundle instruction, leaves the instructions inside the bundle unchanged, which causes inconsistency and crashes Thumb2SizeReduction.cpp::ReduceMBB(). llvm-svn: 199127	2014-01-13 18:47:54 +00:00
Andrea Di Biagio	c159ef589c	[AArch64] Fix assertion failure caused by an invalid comparison between APInt values. APInt only knows how to compare values with the same BitWidth and asserts in all other cases. With this fix, function PerformORCombine does not use the APInt equality operator if the APInt values returned by 'isConstantSplat' differ in BitWidth. In that case they are different and no comparison is needed. llvm-svn: 199119	2014-01-13 16:51:00 +00:00
Richard Sandiford	c891615944	[SystemZ] Flesh out stackrestore test (frame-11.ll) ...so that it does something vaguely sensible. llvm-svn: 199117	2014-01-13 15:44:44 +00:00
Richard Sandiford	a5ce476be6	[SystemZ] Add "volatile" to a dead store in variable-loc.ll llvm-svn: 199116	2014-01-13 15:42:16 +00:00
Richard Sandiford	70c8cbd696	[SystemZ] Improve risbg-01.ll test The old mask in f24 wasn't well chosen because the lshr would always be zero. CodeGen didn't detect this but InstCombine would. The new mask ensures that both shifts are needed. f26 is specifically testing for a wrap-around mask. The AND can be applied to just the shift left, either before or after the shift. Again, CodeGen kept it in the original form but InstCombine would mask after the shift instead. The exact choice of NILF isn't important for the test so I just dropped it and kept the rotate. llvm-svn: 199115	2014-01-13 15:40:25 +00:00
Richard Sandiford	add53b9fe0	[SystemZ] Optimize (sext (ashr (shl ...), ...)) ...into (ashr (shl (anyext X), ...), ...), which requires one fewer instruction. The (anyext X) can sometimes be simplified too. I didn't do this in DAGCombiner because widening shifts isn't a win on all targets. llvm-svn: 199114	2014-01-13 15:17:53 +00:00
Tim Northover	e33f96dc43	ARM: add test for r199108. Oops. rdar://problem/15800156 llvm-svn: 199109	2014-01-13 14:20:25 +00:00
David Woodhouse	a7b8d3d331	[x86] Fix retq/retl handling in 64-bit mode This finishes the job started in r198756, and creates separate opcodes for 64-bit vs. 32-bit versions of the rest of the RET instructions too. LRETL/LRETQ are interesting... I can't see any justification for their existence in the SDM. There should be no 'LRETL' in 64-bit mode, and no need for a REX.W prefix for LRETQ. But this is what GAS does, and my Sandybridge CPU and an Opteron 6376 concur when tested as follows: asm __volatile__("pushq $0x1234\nmovq $0x33,%rax\nsalq $32,%rax\norq $1f,%rax\npushq %rax\nlretl $8\n1:"); asm __volatile__("pushq $1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); asm __volatile__("pushq $0x33\npushq $1f\nlretq\n1:"); asm __volatile__("pushq $0x1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); cf. PR8592 and commit r118903, which added LRETQ. I only added LRETIQ to match it. I don't quite understand how the Intel syntax parsing for ret instructions is working, despite r154468 allegedly fixing it. Aren't the explicitly sized 'retw', 'retd' and 'retq' supposed to work? I have at least made the 'lretq' work with (and indeed require) the 'q'. llvm-svn: 199106	2014-01-13 14:05:59 +00:00
Elena Demikhovsky	e635ade802	AVX-512: Embedded Rounding Control - encoding and printing Changed intrinsics for vrcp14/vrcp28 vrsqrt14/vrsqrt28 - aligned with GCC. llvm-svn: 199102	2014-01-13 12:55:03 +00:00
Chandler Carruth	d090eb21c2	[PM] Wire up support for writing bitcode with new PM. This moves the old pass creation functionality to its own header and updates the callers of that routine. Then it adds a new PM supporting bitcode writer to the header file, and wires that up in the opt tool. A test is added that round-trips code into bitcode and back out using the new pass manager. llvm-svn: 199078	2014-01-13 07:38:24 +00:00
NAKAMURA Takumi	2ecc1b5bd9	llvm/test/ExecutionEngine/MCJIT/load-object-a.ll: Put together rm(1) and mkdir(1) at the top. llvm-svn: 199077	2014-01-13 05:55:10 +00:00
Chandler Carruth	64f26e1076	[PM] Wire up support for printing assembly output from the opt command. This lets us round-trip IR in the expected manner with the opt tool. llvm-svn: 199075	2014-01-13 05:16:45 +00:00
Kevin Qin	5aa184711d	[AArch64 NEON] Add missing patterns for bitcast from or to v1f64 llvm-svn: 199070	2014-01-13 01:58:38 +00:00
Kevin Qin	9b14d101ea	[AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector This patch covered 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have chance to be emitted instead of lots of INS. llvm-svn: 199069	2014-01-13 01:56:29 +00:00
Saleem Abdulrasool	b90512a41c	correct target directive handling error handling The target specific parser should return `false' if the target AsmParser handles the directive, and `true' if the generic parser should handle the directive. Many of the target specific directive handlers would `return Error' which does not follow these semantics. This change simply changes the target specific routines to conform to the semantics of ParseDirective correctly. Conformance to the semantics improves diagnostics emitted for the invalid directives. X86 is taken as a sample to ensure that multiple diagnostics are not presented for a single error. llvm-svn: 199068	2014-01-13 01:15:39 +00:00
Jakob Stoklund Olesen	026885712c	Handle bundled terminators in isBlockOnlyReachableByFallthrough. Targets like SPARC and MIPS have delay slots and normally bundle the delay slot instruction with the corresponding terminator. Teach isBlockOnlyReachableByFallthrough to find any MBB operands on bundled terminators so SPARC doesn't need to specialize this function. llvm-svn: 199061	2014-01-12 19:24:08 +00:00
Nico Rieck	ddf7787f91	Make test independent of scheduling llvm-svn: 199055	2014-01-12 15:57:38 +00:00
NAKAMURA Takumi	75b448c67e	llvm/test/CodeGen/X86/shl_undef.ll: Tweak to satisfy r199050. Use intel syntax, or "shl" might hit "pushl". llvm-svn: 199051	2014-01-12 14:41:41 +00:00
Nico Rieck	884ee061f2	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Chandler Carruth	2fbea03f0f	[PM] Add module and function printing passes for the new pass manager. This implements the legacy passes in terms of the new ones. It adds basic testing using explicit runs of the passes. Next up will be wiring the basic output mechanism of opt up when the new pass manager is engaged unless bitcode writing is requested. llvm-svn: 199049	2014-01-12 12:15:39 +00:00
Chandler Carruth	a3ab8c27ea	[PM] Fix a bunch of bugs I spotted by inspection when working on this code. Copious tests added to cover these cases. llvm-svn: 199039	2014-01-12 10:02:02 +00:00
Chandler Carruth	4f430207ee	[PM] Add support for parsing function passes and function pass manager nests to the opt commandline support. This also showcases the implicit-initial-manager support which will be most useful for testing. There are several bugs that I spotted by inspection here that I'll fix with test cases in subsequent commits. llvm-svn: 199038	2014-01-12 09:34:22 +00:00
Saleem Abdulrasool	c74b8ca814	ARM IAS: fix diagnostics of improper qualification An improper qualifier would result in a superfluous error due to the parser not consuming the remainder of the statement. Simply consume the remainder of the statement to avoid the error. llvm-svn: 199035	2014-01-12 05:25:44 +00:00
Venkatraman Govindaraju	816c9a7dd9	[Sparc] Add support for parsing floating point instructions. llvm-svn: 199033	2014-01-12 04:48:54 +00:00
Saleem Abdulrasool	1a40ebc0d6	ARM: change implicit immediate forms of {ld,st}r{,b}t to pseudo-instructions The implicit immediate 0 forms are assembly aliases, not distinct instruction encodings. Fix the initial implementation introduced in r198914 to an alias to avoid two separate instruction definitions for the same encoding. An InstAlias is insufficient in this case due to the need to add an additional operand for the implicit zero. By using the AsmPseudoInst, fall back to the C++ code to transform the instruction to the equivalent _POST_IMM form, inserting the additional implicit immediate 0. llvm-svn: 199032	2014-01-12 04:36:01 +00:00
Jakob Stoklund Olesen	3dd52f1fbd	The SPARCv9 ABI returns a float in %f0. This is different from the argument passing convention which puts the first float argument in %f1. With this patch, all returned floats are treated as if the 'inreg' flag were set. This means multiple float return values get packed in %f0, %f1, %f2, ... Note that when returning a struct in registers, clang will set the 'inreg' flag on the return value, so that behavior is unchanged. This also happens when returning a float _Complex. llvm-svn: 199028	2014-01-12 04:13:17 +00:00
Joerg Sonnenberger	8a42fa06ce	Typo llvm-svn: 199027	2014-01-12 03:38:30 +00:00
Joerg Sonnenberger	d23fb01819	Add missing mul aliases for armv4 support. Add checks that armv4 can assemble the various mul instructions. llvm-svn: 199026	2014-01-12 03:35:18 +00:00
Hans Wennborg	f5c5f6e123	Switch-to-lookup tables: Don't require a result for the default case when the lookup table doesn't have any holes. This means we can build a lookup table for switches like this: switch (x) { case 0: return 1; case 1: return 2; case 2: return 3; case 3: return 4; default: exit(1); } The default case doesn't yield a constant result here, but that doesn't matter, since a default result is only necessary for filling holes in the lookup table, and this table doesn't have any holes. This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB off the resulting clang binary. llvm-svn: 199025	2014-01-12 00:44:41 +00:00
Venkatraman Govindaraju	406e85c8e3	[Sparc] Add missing processor types: v7 and niagara llvm-svn: 199024	2014-01-11 23:56:13 +00:00
Saleem Abdulrasool	2c235839b6	ARM IAS: support emitting constant values in target expressions A 32-bit immediate value can be formed from a constant expression and loaded into a register. Add support to emit this into an object file. Because this value is a constant, a relocation must not be produced for it. llvm-svn: 199023	2014-01-11 23:03:48 +00:00
Benjamin Kramer	002aed9cb3	Fix broken CHECK lines. llvm-svn: 199016	2014-01-11 21:06:00 +00:00
Venkatraman Govindaraju	db6b1cbac8	[Sparc] Bundle instruction with delay slot and its filler. Now, we can use -verify-machineinstrs with SPARC backend. llvm-svn: 199014	2014-01-11 19:38:03 +00:00
Chandler Carruth	e9252766d0	[PM] Actually nest pass managers correctly when parsing the pass pipeline string. Add tests that cover this now that we have execution dumping in the pass managers. llvm-svn: 199005	2014-01-11 12:06:47 +00:00
NAKAMURA Takumi	ee1c766de9	llvm/test/Transforms/SampleProfile/syntax.ll: Eliminate locale-sensitive message check. llvm-svn: 199000	2014-01-11 09:23:52 +00:00
NAKAMURA Takumi	a62eb4ba36	llvm/test/CodeGen/X86/anyregcc.ll: Add explicit -mtriple=x86_64-unknown-unknown. XMM(s) are really spilling for targeting Win64. llvm-svn: 198999	2014-01-11 09:23:44 +00:00
Chandler Carruth	c5861be643	[PM] Add (very skeletal) support to opt for running the new pass manager. I cannot emphasize enough that this is a WIP. =] I expect it to change a great deal as things stabilize, but I think it's really important to get some functionality here so that the infrastructure can be tested more traditionally from the commandline. The current design is looking something like this: ./bin/opt -passes='module(pass_a,pass_b,function(pass_c,pass_d))' So rather than custom-parsed flags, there is a single flag with a string argument that is parsed into the pass pipeline structure. This makes it really easy to have nice structural properties that are very explicit. There is one obvious and important shortcut. You can start off the pipeline with a pass, and the minimal context of pass managers will be built around the entire specified pipeline. This makes the common case for tests super easy: ./bin/opt -passes=instcombine,sroa,gvn But this won't introduce any of the complexity of the fully inferred old system -- we only ever do this for the entire argument, and we only look at the first pass. If the other passes don't fit in the pass manager selected it is a hard error. The other interesting aspect here is that I'm not relying on any registration facilities. Such facilities may be unavoidable for supporting plugins, but I have alternative ideas for plugins that I'd like to try first. My plan is essentially to build everything without registration until we hit an absolute requirement. Instead of registration of pass names, there will be a library dedicated to parsing pass names and the pass pipeline strings described above. Currently, this is directly embedded into opt for simplicity as it is very early, but I plan to eventually pull this into a library that opt, bugpoint, and even Clang can depend on. It should end up as a good home for things like the existing PassManagerBuilder as well. 
There are a bunch of FIXMEs in the code for the parts of this that are just stubbed out to make the patch more incremental. A quick list of what's coming up directly after this: - Support for function passes and building the structured nesting. - Support for printing the pass structure, and FileCheck tests of all of this code. - The .def-file based pass name parsing. - IR printing passes and the corresponding tests. Some obvious things that I'm not going to do right now, but am definitely planning on as the pass manager work gets a bit further: - Pull the parsing into a library, including the builders. - Thread the rest of the target stuff into the new pass manager. - Wire support for the new pass manager up to llc. - Plugin support. Some things that I'd like to have, but are significantly lower on my priority list. I'll get to these eventually, but they may also be places where others want to contribute: - Adding nice error reporting for broken pass pipeline descriptions. - Typo-correction for pass names. llvm-svn: 198998	2014-01-11 08:16:35 +00:00
Juergen Ributzka	3673ce83a4	[anyregcc] Fix callee-save mask for anyregcc Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and select the proper mask depending on the target cpu we compile for. llvm-svn: 198985	2014-01-11 01:00:27 +00:00
Diego Novillo	f47aa4d47f	Extend and simplify the sample profile input file. 1- Use the line_iterator class to read profile files. 2- Allow comments in profile file. Lines starting with '#' are completely ignored while reading the profile. 3- Add parsing support for discriminators and indirect call samples. Our external profiler can emit more profile information that we are currently not handling. This patch does not add new functionality to support this information, but it allows profile files to provide it. I will add actual support later on (for at least one of these features, I need support for DWARF discriminators in Clang). A sample line may contain the following additional information: Discriminator. This is used if the sampled program was compiled with DWARF discriminator support (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This is currently only emitted by GCC and we just ignore it. Potential call targets and samples. If present, this line contains a call instruction. This models both direct and indirect calls. Each called target is listed together with the number of samples. For example, 130: 7 foo:3 bar:2 baz:7 The above means that at relative line offset 130 there is a call instruction that calls one of foo(), bar() and baz(). With baz() being the relatively more frequent call target. Differential Revision: http://llvm-reviews.chandlerc.com/D2355 4- Simplify format of profile input file. This implements earlier suggestions to simplify the format of the sample profile file. The symbol table is not necessary and function profiles do not need to know the number of samples in advance. Differential Revision: http://llvm-reviews.chandlerc.com/D2419 llvm-svn: 198973	2014-01-10 23:23:51 +00:00
Diego Novillo	9e8454b3fe	Propagation of profile samples through the CFG. This adds a propagation heuristic to convert instruction samples into branch weights. It implements a similar heuristic to the one implemented by Dehao Chen on GCC. The propagation proceeds in 3 phases: 1- Assignment of block weights. All the basic blocks in the function are initially assigned the same weight as their most frequently executed instruction. 2- Creation of equivalence classes. Since samples may be missing from blocks, we can fill in the gaps by setting the weights of all the blocks in the same equivalence class to the same weight. To compute the concept of equivalence, we use dominance and loop information. Two blocks B1 and B2 are in the same equivalence class if B1 dominates B2, B2 post-dominates B1 and both are in the same loop. 3- Propagation of block weights into edges. This uses a simple propagation heuristic. The following rules are applied to every block B in the CFG: - If B has a single predecessor/successor, then the weight of that edge is the weight of the block. - If all the edges are known except one, and the weight of the block is already known, the weight of the unknown edge will be the weight of the block minus the sum of all the known edges. If the sum of all the known edges is larger than B's weight, we set the unknown edge weight to zero. - If there is a self-referential edge, and the weight of the block is known, the weight for that edge is set to the weight of the block minus the weight of the other incoming edges to that block (if known). Since this propagation is not guaranteed to finalize for every CFG, we only allow it to proceed for a limited number of iterations (controlled by -sample-profile-max-propagate-iterations). It currently uses the same GCC default of 100. Before propagation starts, the pass builds (for each block) a list of unique predecessors and successors. This is necessary to handle identical edges in multiway branches. 
Since we visit all blocks and all edges of the CFG, it is cleaner to build these lists once at the start of the pass. Finally, the patch fixes the computation of relative line locations. The profiler emits lines relative to the function header. To discover it, we traverse the compilation unit looking for the subprogram corresponding to the function. The line number of that subprogram is the line where the function begins. That becomes line zero for all the relative locations. llvm-svn: 198972	2014-01-10 23:23:46 +00:00
Arnold Schwaighofer	702d83d3d8	LoopVectorizer: Handle strided memory accesses by versioning for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Stride1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 llvm-svn: 198950	2014-01-10 18:20:32 +00:00
Artyom Skrobov	cbb9547cdc	Amending test/MC/ARM/thumb2-mclass.s to match its apparent original purpose (to test the ARMv6M/ARMv7M commonality), and creating a new test case for the differences between ARMv6M and ARMv7M llvm-svn: 198946	2014-01-10 16:49:49 +00:00
Artyom Skrobov	759f6384e9	Must not produce Tag_CPU_arch_profile for pre-ARMv7 cores (e.g. cortex-m0) llvm-svn: 198945	2014-01-10 16:42:55 +00:00
Saleem Abdulrasool	f544263238	ARM: fix regression caused by r198914 The disassembler would no longer be able to disambiguate between the two variants (explicit immediate #0 vs implicit, omitted #0) for the ldrt, strt, ldrbt, strbt mnemonics as both versions indicated the disassembler routine. llvm-svn: 198944	2014-01-10 16:22:47 +00:00
Kristof Beyls	082ab7548c	Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux. llvm-svn: 198937	2014-01-10 13:41:49 +00:00
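
Commits c5861be643 and e9252766d0 describe a single -passes string that is parsed into a nested pipeline structure. The following is a minimal Python sketch of that idea, not the code in opt: the function name parse_pipeline and the (name, children) tuple layout are hypothetical, and the implicit wrapping of a bare leading pass in pass managers is not modeled.

```python
def parse_pipeline(text):
    """Parse a pipeline string into a list of (name, children) tuples.

    children is None for a plain pass and a nested list for a manager
    written as name(...). Hypothetical helper for illustration only.
    """
    pos = 0

    def parse_list():
        nonlocal pos
        items = []
        while True:
            start = pos
            # A name runs until the next structural character.
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            name = text[start:pos]
            if not name:
                raise ValueError(f"expected a pass name at offset {start}")
            if pos < len(text) and text[pos] == "(":
                pos += 1  # consume "("
                children = parse_list()
                if pos >= len(text) or text[pos] != ")":
                    raise ValueError(f"missing ')' after {name!r}")
                pos += 1  # consume ")"
                items.append((name, children))
            else:
                items.append((name, None))
            if pos < len(text) and text[pos] == ",":
                pos += 1  # consume "," and parse the next sibling
                continue
            return items

    result = parse_list()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return result


if __name__ == "__main__":
    print(parse_pipeline("module(pass_a,pass_b,function(pass_c,pass_d))"))
    print(parse_pipeline("instcombine,sroa,gvn"))
```

A recursive-descent parser keeps the nesting explicit, which matches the stated goal of making the structure of the pipeline visible in the string itself; malformed input raises ValueError rather than being silently repaired.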
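
Commit f47aa4d47f gives "130: 7 foo:3 bar:2 baz:7" as an example of a sample line with call targets and says lines starting with '#' are ignored. A rough Python sketch of reading one such line follows. The function name parse_sample_line and the returned tuple are hypothetical, skipping blank lines is an assumption, and discriminators are left out because the message does not show their syntax.

```python
def parse_sample_line(line):
    """Return (offset, samples, targets) for one body line of a profile.

    Returns None for comment lines (starting with '#') and, as an
    assumption of this sketch, for blank lines. Hypothetical helper.
    """
    if line.startswith("#") or not line.strip():
        return None
    # Split "130: 7 foo:3 ..." at the first colon into offset and the rest.
    offset_text, sep, rest = line.strip().partition(":")
    fields = rest.split()
    if not sep or not fields:
        raise ValueError(f"malformed sample line: {line!r}")
    offset = int(offset_text)
    samples = int(fields[0])
    targets = {}
    for field in fields[1:]:
        # Each call target is written as name:count.
        name, _, count = field.rpartition(":")
        if not name:
            raise ValueError(f"malformed call target: {field!r}")
        targets[name] = int(count)
    return offset, samples, targets


if __name__ == "__main__":
    print(parse_sample_line("130: 7 foo:3 bar:2 baz:7"))
    print(parse_sample_line("# a comment"))
```

For the example line this yields relative line offset 130, 7 samples, and the three candidate call targets with their sample counts.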
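
Commit 9e8454b3fe states a rule for edge weights: when all edges of a block but one are known and the block weight is known, the unknown edge gets the block weight minus the sum of the known edges, clamped at zero. A small Python sketch of just that rule is below. The name infer_unknown_edge and the dict-based representation are hypothetical; the other rules and the iteration limit are not modeled.

```python
def infer_unknown_edge(block_weight, edge_weights):
    """Apply the 'all edges known except one' rule to a single block.

    edge_weights maps an edge label to its weight, or to None when the
    weight is still unknown. Returns (edge, weight) for the one unknown
    edge, or None if the rule does not apply. Hypothetical helper.
    """
    unknown = [edge for edge, w in edge_weights.items() if w is None]
    if len(unknown) != 1:
        return None  # zero or several unknown edges: rule does not apply
    known_sum = sum(w for w in edge_weights.values() if w is not None)
    # Clamp at zero when the known edges already exceed the block weight.
    return unknown[0], max(block_weight - known_sum, 0)


if __name__ == "__main__":
    print(infer_unknown_edge(10, {"then": 3, "else": None}))
    print(infer_unknown_edge(5, {"then": 8, "else": None}))
```

The clamp mirrors the message's handling of inconsistent samples, where the known edges can sum to more than the block's own weight.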


22323 Commits