archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Benjamin Kramer	e4eb1b495f	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Matt Arsenault	7b69102edb	Add address space argument to allowsUnalignedMemoryAccess. On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887	2014-02-05 23:15:53 +00:00
Alp Toker	1c4b33e8e5	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Richard Sandiford	add53b9fe0	[SystemZ] Optimize (sext (ashr (shl ...), ...)) ...into (ashr (shl (anyext X), ...), ...), which requires one fewer instruction. The (anyext X) can sometimes be simplified too. I didn't do this in DAGCombiner because widening shifts isn't a win on all targets. llvm-svn: 199114	2014-01-13 15:17:53 +00:00
Chandler Carruth	87f14b4eec	Re-sort all of the includes with ./utils/sort_includes.py so that subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685	2014-01-07 11:48:04 +00:00
Richard Sandiford	3cb00264d7	Fix typo. llvm-svn: 197986	2013-12-24 15:22:39 +00:00
Richard Sandiford	99ae48f5bb	[SystemZ] Use interlocked-access 1 instructions for CodeGen ...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND EXCLUSIVE OR. LOAD AND ADD LOGICAL isn't really separately useful for LLVM. I'll look at adding reusing the CC results in new year. llvm-svn: 197985	2013-12-24 15:18:04 +00:00
Richard Sandiford	8daaabe4c3	[SystemZ] Optimize comparisons with truncated extended loads If the extension of a loaded value is compared against zero and used in other arithmetic, InstCombine will change the comparison to use the unextended load. It's also possible that the comparison could be against the unextended load from the outset. In DAG form this becomes a truncation of an extending load. We want to strip the truncation if possible so that we can use load-and-test instructions. llvm-svn: 197804	2013-12-20 11:56:02 +00:00
Richard Sandiford	5a37a68afd	[SystemZ] Optimize X [!=]= Y in cases where X - Y or Y - X is also computed In those cases it's better to compare the result of the subtraction against zero. llvm-svn: 197239	2013-12-13 15:50:30 +00:00
Richard Sandiford	7ae30a86de	[SystemZ] Make more use of TMHH This originally came about after noticing that InstCombine turns some of the TMHH (icmp (and...), ...) tests into plain comparisons. Since there is no instruction to compare with a 64-bit immediate, TMHH is generally better than an ordered comparison for the cases that it can handle. llvm-svn: 197238	2013-12-13 15:46:55 +00:00
Richard Sandiford	50a6f85c7a	[SystemZ] Extend integer absolute selection This patch makes more use of LPGFR and LNGFR. It builds on top of the LTGFR selection from r197234. Most of the tests are motivated by what InstCombine would produce. llvm-svn: 197236	2013-12-13 15:35:00 +00:00
Richard Sandiford	21b1981cca	[SystemZ] Add a structure to represent a selected comparison ...in an attempt to rein back the increasingly complex selection code. A knock-on effect is that ICmpType is exposed from the outset, which slightly simplifies adjustSubwordCmp. The code is no piece of art even after this change, but at least it should be slightly better. No behavioral change intended. llvm-svn: 197235	2013-12-13 15:28:45 +00:00
Richard Sandiford	3cb2e57bb5	[SystemZ] Make more use of LTGFR InstCombine turns (sext (trunc)) into (ashr (shl)), then converts any comparison of the ashr against zero into a comparison of the shl against zero. This makes sense in itself, but we want to undo it for z, since the sign- extension instruction has a CC-setting form. I've included tests for both the original and InstCombined variants, but the former already worked. The patch fixes the latter. llvm-svn: 197234	2013-12-13 15:07:39 +00:00
Richard Sandiford	533c977c68	[SystemZ] Optimize fcmp X, 0 in cases where X is also negated In such cases it's often better to test the result of the negation instead, since the negation also sets CC. llvm-svn: 197032	2013-12-11 11:45:08 +00:00
Richard Sandiford	1b5b817c73	Add TargetLowering::prepareVolatileOrAtomicLoad One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196906	2013-12-10 10:49:34 +00:00
Richard Sandiford	bc88711db8	Add TargetLowering::prepareVolatileOrAtomicLoad One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196905	2013-12-10 10:36:34 +00:00
Richard Sandiford	f6544bf49a	[SystemZ] Extend the use of C(L)GFR instcombine prefers to put extended operands first, so this patch handles that case for C(L)GFR. llvm-svn: 196579	2013-12-06 09:56:50 +00:00
Richard Sandiford	9be3fcae69	[SystemZ] Optimize selects between 0 and -1 Since z has no setcc instruction as such, the choice of setBooleanContents is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent, so we produced a branch-free form when selecting between 0 and 1, but not when selecting between 0 and -1. This patch handles the latter case too. At some point I'd like to measure whether it's better to use conditional moves for constant selects on z196, but that's future work. llvm-svn: 196578	2013-12-06 09:53:09 +00:00
Richard Sandiford	f84d58699b	[SystemZ] Fix choice of known-zero mask in insertion optimization The backend converts 64-bit ORs into subreg moves if the upper 32 bits of one operand and the low 32 bits of the other are known to be zero. It then tries to peel away redundant ANDs from the upper 32 bits. Since AND masks are canonicalized to exclude known-zero bits, the test ORs the mask and the known-zero bits together before checking for redundancy. The problem was that it was using the wrong node when checking for known-zero bits, so could drop ANDs that were still needed. llvm-svn: 196267	2013-12-03 11:01:54 +00:00
Richard Sandiford	593286d79d	[SystemZ] Handle vectors in getSetCCResultType I don't have a standalone testcase for this, but it should allow r193676 to be reapplied. llvm-svn: 194148	2013-11-06 12:16:02 +00:00
Richard Sandiford	15044afbed	[SystemZ] Improve handling of SETCC We previously used the default expansion to SELECT_CC, which in turn would expand to "LHI; BRC; LHI". In most cases it's better to use an IPM-based sequence instead. llvm-svn: 192784	2013-10-16 11:10:55 +00:00
Will Dietz	2703980be5	Add missing #include's to cctype when using isdigit/alpha/etc. llvm-svn: 192519	2013-10-12 00:55:57 +00:00
Richard Sandiford	e2f5332463	[SystemZ] Extend pseudo conditional 8- and 16-bit stores to high words As the comment says, we always want to use STOC for 32-bit stores. llvm-svn: 191767	2013-10-01 14:33:55 +00:00
Richard Sandiford	24e987020f	[SystemZ] Optimize 32-bit FPR<->GPR moves for z196 and above Floats are stored in the high 32 bits of an FPR, and the only GPR<->FPR transfers are full-register transfers. This patch optimizes GPR<->FPR float transfers when the high word of a GPR is directly accessible. llvm-svn: 191764	2013-10-01 14:31:11 +00:00
Richard Sandiford	7125240faa	[SystemZ] Allow integer AND involving high words llvm-svn: 191762	2013-10-01 14:20:41 +00:00
Richard Sandiford	9d3cacb101	[SystemZ] Allow integer XOR involving high words llvm-svn: 191759	2013-10-01 14:08:44 +00:00
Richard Sandiford	d2a449d3de	[SystemZ] Allow integer OR involving high words llvm-svn: 191755	2013-10-01 13:22:41 +00:00
Richard Sandiford	497097c027	[SystemZ] Allow selects with a high-word destination llvm-svn: 191751	2013-10-01 13:10:16 +00:00
Richard Sandiford	c2e496f7ba	[SystemZ] Use upper words of GR64s for codegen This just adds the basics necessary for allocating the upper words to virtual registers (move, load and store). The move support is parameterised in a way that makes it easy to handle zero extensions, but the associated zero-extend patterns are added by a later patch. The easiest way of testing this seemed to be add a new "h" register constraint for high words. I don't expect the constraint to be useful in real inline asms, but it should work, so I didn't try to hide it behind an option. llvm-svn: 191739	2013-10-01 11:26:28 +00:00
Richard Sandiford	a4097cc6b1	[SystemZ] Rename subregs and add subreg_h32 Use subreg_hNN and subreg_lNN for the high and low NN bits of a register. List the low registers first, so that subreg_l32 also means the low 32 bits of a 128-bit register. Floats are stored in the upper 32 bits of a 64-bit register, so they should use subreg_h32 rather than subreg_l32. No behavioral change intended. llvm-svn: 191659	2013-09-30 10:28:35 +00:00
Richard Sandiford	27ff6cb147	[SystemZ] Rename 32-bit GPR registers I'm about to add support for high-word operations, so it seemed better for the low-word registers to have names like R0L rather than R0W. No behavioral change intended. llvm-svn: 191655	2013-09-30 08:48:38 +00:00
Richard Sandiford	cae9d29151	[SystemZ] Improve handling of PC-relative addresses The backend previously folded offsets into PC-relative addresses whereever possible. That's the right thing to do when the address can be used directly in a PC-relative memory reference (using things like LRL). But if we have a register-based memory reference and need to load the PC-relative address separately, it's better to use an anchor point that could be shared with other accesses to the same area of the variable. Fixes a FIXME. llvm-svn: 191524	2013-09-27 15:14:04 +00:00
Richard Sandiford	43a9297729	[SystemZ] Define the GR64 low-word logic instructions as pseudo aliases. Another patch to avoid duplication of encoding information. Things like NILF, NILL and NILH are used as both 32-bit and 64-bit instructions. Here the 64-bit versions are defined as aliases of the 32-bit ones. llvm-svn: 191369	2013-09-25 11:11:53 +00:00
Richard Sandiford	6f1fae507a	[SystemZ] Use subregs for 64-bit truncating stores Another patch to reduce the duplication of encoding information. Rather than define separate patterns for truncating 64-bit stores, use the 32-bit stores with a subreg. No behavioral changed intended. llvm-svn: 191365	2013-09-25 10:29:47 +00:00
Richard Sandiford	c0e0e27a84	[SystemZ] Use getTarget{Insert,Extract}Subreg rather than getMachineNode Just a clean-up, no behavioral change intended. llvm-svn: 190673	2013-09-13 09:12:44 +00:00
Richard Sandiford	30374b51cb	[SystemZ] Try to fold shifts into TMxx E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4". llvm-svn: 190672	2013-09-13 09:09:50 +00:00
Richard Sandiford	bfcf129b8e	[SystemZ] Add TM and TMY The main complication here is that TM and TMY (the memory forms) set CC differently from the register forms. When the tested bits contain some 0s and some 1s, the register forms set CC to 1 or 2 based on the value the uppermost bit. The memory forms instead set CC to 1 regardless of the uppermost bit. Until now, I've tried to make it so that a branch never tests for an impossible CC value. E.g. NR only sets CC to 0 or 1, so branches on the result will only test for 0 or 1. Originally I'd tried to do the same thing for TM and TMY by using custom matching code in ISelDAGToDAG. That ended up being very ugly though, and would have meant duplicating some of the chain checks that the common isel code does. I've therefore gone for the simpler alternative of adding an extra operand to the TM DAG opcode to say whether a memory form would be OK. This means that the inverse of a "TM;JE" is "TM;JNE" rather than the more precise "TM;JNLE", just like the inverse of "TMLL;JE" is "TMLL;JNE". I suppose that's arguably less confusing though... llvm-svn: 190400	2013-09-10 10:20:32 +00:00
Richard Sandiford	8d6edc5218	[SystemZ] Tweak integer comparison code The architecture has many comparison instructions, including some that extend one of the operands. The signed comparison instructions use sign extensions and the unsigned comparison instructions use zero extensions. In cases where we had a free choice between signed or unsigned comparisons, we were trying to decide at lowering time which would best fit the available instructions, taking things like extension type into account. The code to do that was getting increasingly hairy and was also making some bad decisions. E.g. when comparing the result of two LLCs, it is better to use CR rather than CLR, since CR can be fused with a branch while CLR can't. This patch removes the lowering code and instead adds an operand to integer comparisons to say whether signed comparison is required, whether unsigned comparison is required, or whether either is OK. We can then leave the choice of instruction up to the normal isel code. llvm-svn: 190138	2013-09-06 11:51:39 +00:00
Richard Sandiford	399318ba38	[SystemZ] Add NC, OC and XC For now these are just used to handle scalar ANDs, ORs and XORs in which all operands are memory. llvm-svn: 190041	2013-09-05 10:36:45 +00:00
Richard Sandiford	2543e2b36c	[SystemZ] Add support for TMHH, TMHL, TMLH and TMLL For now this just handles simple comparisons of an ANDed value with zero. The CC value provides enough information to do any comparison for a 2-bit mask, and some nonzero comparisons with more populated masks, but that's all future work. llvm-svn: 189819	2013-09-03 15:38:35 +00:00
Richard Sandiford	9fc2e5cdff	[SystemZ] Add support for TMHH, TMHL, TMLH and TMLL For now just handles simple comparisons of an ANDed value with zero. The CC value provides enough information to do any comparison for a 2-bit mask, and some nonzero comparisons with more populated masks, but that's all future work. llvm-svn: 189469	2013-08-28 10:31:43 +00:00
Richard Sandiford	96af6a5cf1	[SystemZ] Extend memcmp support to all constant lengths This uses the infrastructure added for memcpy and memmove in r189331. llvm-svn: 189458	2013-08-28 09:01:51 +00:00
Richard Sandiford	b585b1e48e	[SystemZ] Extend memcpy and memset support to all constant lengths Lengths up to a certain threshold (currently 6 * 256) use a series of MVCs. Lengths above that threshold use a loop to handle X*256 bytes followed by a single MVC to handle the excess (if any). This loop will also be needed in future when support for variable lengths is added. Because the same tablegen classes are used to define MVC and CLC, the patch also has the side-effect of defining a pseudo loop instruction for CLC. That instruction isn't used yet (and wouldn't be handled correctly if it were). I'm planning to use it soon though. llvm-svn: 189331	2013-08-27 09:54:29 +00:00
Richard Sandiford	9867b44c59	[SystemZ] Add basic prefetch support Just the instructions and intrinsics for now. llvm-svn: 189100	2013-08-23 11:36:42 +00:00
Richard Sandiford	152d2f09a8	[SystemZ] Try reversing comparisons whose first operand is in memory This allows us to make more use of the many compare reg,mem instructions. llvm-svn: 189099	2013-08-23 11:27:19 +00:00
Richard Sandiford	1dc05c13d2	[SystemZ] Define remainig *MUL_LOHI patterns The initial port used MLG(R) for i64 UMUL_LOHI but left the other three combinations as not-legal-or-custom. Although 32x32->{32,32} multiplications exist, they're not as quick as doing a normal 64-bit multiplication, so it didn't seem like i32 SMUL_LOHI and UMUL_LOHI would be useful. There's also no direct instruction for i64 SMUL_LOHI, so it needs to be implemented in terms of UMUL_LOHI. However, not defining these patterns means that we don't convert division by a constant into multiplication, so this patch fills in the other cases. The new i64 SMUL_LOHI sequence is simpler than the one that we used previously for 64x64->128 multiplication, so int-mul-08.ll now tests the full sequence. llvm-svn: 188898	2013-08-21 09:34:56 +00:00
Richard Sandiford	e6e07910e3	[SystemZ] Use FI[EDX]BRA for codegen llvm-svn: 188895	2013-08-21 09:04:20 +00:00
Richard Sandiford	add1a68f21	[SystemZ] Use SRST to optimize memchr SystemZTargetLowering::emitStringWrapper() previously loaded the character into R0 before the loop and made R0 live on entry. I'd forgotten that allocatable registers weren't allowed to be live across blocks at this stage, and it confused LiveVariables enough to cause a miscompilation of f3 in memchr-02.ll. This patch instead loads R0 in the loop and leaves LICM to hoist it after RA. This is actually what I'd tried originally, but I went for the manual optimisation after noticing that R0 often wasn't being hoisted. This bug forced me to go back and look at why, now fixed as r188774. We should also try to optimize null checks so that they test the CC result of the SRST directly. The select between null and the SRST GPR result could then usually be deleted as dead. llvm-svn: 188779	2013-08-20 09:38:48 +00:00
Richard Sandiford	841d24aa5a	[SystemZ] Add support for sibling calls This first cut is pretty conservative. The final argument register (R6) is call-saved, so we would need to make sure that the R6 argument to a sibling call is the same as the R6 argument to the calling function, which seems worth keeping as a separate patch. Saying that integer truncations are free means that we no longer use the extending instructions LGF and LLGF for spills in int-conv-09.ll and int-conv-10.ll. Instead we treat the registers as 64 bits wide and truncate them to 32-bits where necessary. I think it's unlikely we'd use LGF and LLGF for spills in other situations for the same reason, so I'm removing the tests rather than replacing them. The associated code is generic and applies to many more instructions than just LGF and LLGF, so there is no corresponding code removal. llvm-svn: 188669	2013-08-19 12:42:31 +00:00
Richard Sandiford	06a13f49c8	[SystemZ] Use SRST to implement strlen and strnlen It would also make sense to use it for memchr; I'm working on that now. llvm-svn: 188547	2013-08-16 11:41:43 +00:00

1 2 3 4

177 Commits