archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Zvi Rackover	32d2ff0d0f	X86 Tests: Add a case for combining sdiv by a splatted pow2 negative. NFC. Noticed test was missing while working on D42479. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329356 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 21:57:20 +00:00
Craig Topper	efb0b0966d	[X86] Separate CDQ and CDQE in the scheduler model. According to Agner's data, CDQE is closer to CWDE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329354 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 21:56:19 +00:00
Craig Topper	d5783cbc20	[X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329351 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 21:40:32 +00:00
Craig Topper	f828e9316d	[X86] Add LEAVE instruction to the scheduler models using the same data as LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329347 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 21:16:26 +00:00
Simon Pilgrim	62cc26416d	[X86][SSE] Add floating point add/mul fast-math vector.reduce tests Strict versions aren't working at all (PR36732) and the accumulators aren't supported (PR36734) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329344 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 21:01:21 +00:00
Simon Pilgrim	23bf3d86ba	[X86][SSE] Add floating point min/max vector.reduce tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329343 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 20:54:55 +00:00
Konstantin Zhuravlyov	ae3b2037b4	AMDGPU/Metadata: Always report a fixed number of hidden arguments Currently it is 6. If the "feature" was not used, report dummy hidden argument. Otherwise it does not match the kernarg size reported in the kernel header. Differential Revision: https://reviews.llvm.org/D45129 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329341 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 20:46:04 +00:00
Craig Topper	b49f64bf05	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329339 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 20:04:06 +00:00
Craig Topper	8f85f224b9	[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. Mostly vector load, store, and move instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329330 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 18:38:45 +00:00
Simon Pilgrim	f614f36c69	[X86][SSE] Add integer add/mul vector.reduce tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329321 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 17:37:35 +00:00
Simon Pilgrim	f096c30c1e	[X86][SSE] Add integer and/or/xor vector.reduce tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329320 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 17:29:51 +00:00
Simon Pilgrim	34841016c3	[X86][SSE] Add integer min/max vector.reduce tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329319 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 17:25:40 +00:00
Sam Clegg	13c09234d6	[WebAssembly] Allow for the creation of user-defined custom sections This patch adds a way for users to create their own custom sections to be added to wasm files. At the LLVM IR layer, they are defined through the "wasm.custom_sections" named metadata. The expected use case for this is bindings generators such as wasm-bindgen. Patch by Dan Gohman Differential Revision: https://reviews.llvm.org/D45297 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329315 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 17:01:39 +00:00
Tim Northover	1444c19a6b	ARM: Do not spill CSR to stack on entry to noreturn functions A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329287 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 14:26:06 +00:00
Sam Parker	b326c6e97a	[DAGCombine] Revert r329160 Again, broke the big endian stage 2 builders. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329283 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 13:46:17 +00:00
Simon Dardis	60da41d8b5	[mips] Regenerate test before posting patch for constant multiplication (NFC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329268 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 10:30:17 +00:00
Craig Topper	19636dfbd8	[X86] Revert r329251-329254 It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329256 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 05:19:36 +00:00
Craig Topper	67c388f5a1	[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. Mostly vector load, store, and move instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329254 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 04:42:03 +00:00
Craig Topper	d4b8b33a93	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329252 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 04:42:01 +00:00
Craig Topper	224b702752	[X86] Auto-generate complete checks. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329251 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 04:41:59 +00:00
Puyan Lotfi	a7f9b6aaad	[MIR-Canon] Improving performance by switching to named vregs. No more skipping thousands of vregs. Much faster running time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329246 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 00:27:15 +00:00
Puyan Lotfi	29bc6472de	[MIR-Canon] Adding support for multi-def -> user distance reduction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329243 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 00:08:15 +00:00
Peter Collingbourne	70bca66be3	AArch64: Implement support for the shadowcallstack attribute. The implementation of shadow call stack on aarch64 is quite different to the implementation on x86_64. Instead of reserving a segment register for the shadow call stack, we reserve the platform register, x18. Any function that spills lr to sp also spills it to the shadow call stack, a pointer to which is stored in x18. Differential Revision: https://reviews.llvm.org/D45239 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329236 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 21:55:44 +00:00
Craig Topper	84038c4e88	[X86] Separate BSWAP32r and BSWAP64r scheduling data in SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a nonexistent BSWAP16r. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329211 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 17:54:19 +00:00
Lei Huang	200eeca319	[Power9]Legalize and emit code for quad-precision fma instructions Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329206 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 16:43:50 +00:00
Nicolai Haehnle	83bfebdaca	AMDGPU: Dimension-aware image intrinsics Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329166 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 10:58:54 +00:00
Nicolai Haehnle	126cd7e831	AMDGPU: Fix copying i1 value out of loop with non-uniform exit Summary: When an i1-value is defined inside of a loop and used outside of it, we cannot simply use the SGPR bitmask from the loop's last iteration. There are also useful and correct cases of an i1-value being copied between basic blocks, e.g. when a condition is computed outside of a loop and used inside it. The concept of dominators is not sufficient to capture what is going on, so I propose the notion of "lane-dominators". Fixes a bug encountered in Nier: Automata. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743 Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40547 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329164 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 10:57:58 +00:00
John Brawn	107f6100d7	[AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y) Differential Revision: https://reviews.llvm.org/D44573 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329163 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 10:12:53 +00:00
Sam Parker	9b4235c03c	[DAGCombine] Improve ReduceLoadWidth for SRL Recommitting rL321259. Previously this caused an issue with PPCBE but I didn't receive a reproducer and didn't have the time to follow up. If the issue appears again, please provide a reproducer so I can fix it. Original commit message: If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329160 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 09:26:56 +00:00
Vlad Tsyrklevich	484fd96051	Add the ShadowCallStack pass Summary: The ShadowCallStack pass instruments functions marked with the shadowcallstack attribute. The instrumented prolog saves the return address to [gs:offset] where offset is stored and updated in [gs:0]. The instrumented epilog loads/updates the return address from [gs:0] and checks that it matches the return address on the stack before returning. Reviewers: pcc, vitalybuka Reviewed By: pcc Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44802 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329139 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 01:21:16 +00:00
Jessica Paquette	ddbccbd6f6	[MachineOutliner] Test for X86FI->getUsesRedZone() as well as Attribute::NoRedZone This commit is similar to r329120, but uses the existing getUsesRedZone() function in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a function truly uses a redzone instead of just the noredzone attribute on a function. Thus, after this commit, it's possible to outline from x86 without using -mno-red-zone and still get outlining results. This also adds a new test for the new redzone behaviour. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329134 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 23:32:41 +00:00
Farhana Aleen	a59291c1f6	[AMDGPU] performMinMaxCombine should not optimize patterns of vectors to min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329131 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 23:00:30 +00:00
Jessica Paquette	5d2f4dba66	[MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfo This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329120 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:56:10 +00:00
Farhana Aleen	d82ffe5dae	Revert "MSG" This reverts commit `9a0ce889d1`. This was committed by mistake. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329119 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:51:45 +00:00
Jessica Paquette	4831763daa	[MachineOutliner][NFC] Make outlined functions have internal linkage The linkage type on outlined functions was private before. This meant that if you set a breakpoint in an outlined function, the debugger wouldn't be able to give a sane name to the outlined function. This commit changes the linkage type to internal and updates any tests that relied on the prefixes on the names of outlined functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329116 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:36:00 +00:00
Farhana Aleen	9a0ce889d1	MSG git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329114 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:20:39 +00:00
Sanjay Patel	8c6709f8ef	[x86] add tests for convert-FP-to-integer with constants; NFC We don't constant fold any of these, but we could...but if we do, we must produce the right answer. Unlike the IR fptosi instruction or its DAG node counterpart ISD::FP_TO_SINT, these are not undef for an out-of-range input. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329100 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 18:34:56 +00:00
Krzysztof Parzyszek	22950265a5	[Hexagon] Remove unneeded attributes from lit test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329078 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 16:05:20 +00:00
Chandler Carruth	7e78daafdd	[x86] Fix a pretty obvious think-o with my asm scrubbing. You have to in fact use regular expression syntax to use regular expressions. Should restore the bots. Sorry for the noise on this test. Thanks to Philip for spotting the bug! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329057 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 10:28:56 +00:00
Chandler Carruth	18ceb931fd	[x86] Clean up and enhance a test around eflags copying. This adds the basic test cases from all the EFLAGS bugs in more direct forms. It also switches to generated check lines, and includes both 32-bit and 64-bit variations. No functionality changing here, just setting things up to have a nice clean asm diff in my EFLAGS patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329056 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 10:04:37 +00:00
Chandler Carruth	6f46178ea5	[x86] Extend my goofy SP offset scrubbing for llc test cases to actually do explicit scrubbing of the offsets of stack spills and reloads. You can always turn this off in order to test specific stack slot usage. We were already hiding most of this, but the new logic hides it more generically. Notably, we should effectively hide stack slot churn in functions that have a frame pointer now, and should also hide it when changing a function from stack pointer to frame pointer. That transition already changes enough to be clearly noticed in the test case diff, showing every spill and reload is really noisy without benefit. See the test case I ran this on as a classic example. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329055 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 09:57:05 +00:00
Yonghong Song	86ab905c1d	bpf: fix incorrect SELECT_CC lowering Commit `37962a331c` ("bpf: Improve expanding logic in LowerSELECT_CC") intended to improve code quality for certain jmp conditions. The commit, however, has a couple of issues: (1). In code, just swap is not enough, ConditionalCode CC should also be swapped, otherwise incorrect code will be generated. (2). The ConditionalCode swap should be subject to getHasJmpExt(). If getHasJmpExt() is False, certain conditional codes will not be supported and swap may generate incorrect code. The original goal for this patch is to optimize jmp operations which do not have JmpExt turned on. If JmpExt is on, better code could be generated. For example, the test select_ri.ll is introduced to demonstrate the optimization. The same result can be achieved with -mcpu=v2 flag. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329043 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 03:56:37 +00:00
Chandler Carruth	bab9d7badc	[x86] Tidy up test case, generate check lines with script. NFC. Just adds basic block labels and tidies up where comments go in the test case and then generates fresh CHECK lines with the script. This way, the check lines are much easier to maintain. They were already close to this but not quite there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329040 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 02:19:05 +00:00
Rafael Espindola	aa2d256782	Align stubs for external and common global variables to pointer size. This patch fixes PR36885: clang++ generates unaligned stub symbol holding a pointer. Patch by Rahul Chaudhry! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329030 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 23:20:30 +00:00
Lama Saba	3c03a2ac26	[X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346 If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs in cases where a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchitecture is that a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles. This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies. Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM. Differential revision: https://reviews.llvm.org/D41330 Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328973 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 13:48:28 +00:00
Craig Topper	1ad5730bb9	[X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model. Data taken from Table 16-17 in the Intel Optimization Manual. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328962 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 06:34:16 +00:00
Craig Topper	1936d3b892	[X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide. Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328961 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 05:54:34 +00:00
Craig Topper	d9e6efc2fb	[X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328960 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 05:33:28 +00:00
Craig Topper	0036663aec	[X86] Fix the SchedRW for AVX512 shift instructions. It was being inadvertently defaulted to an FADD scheduler class. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328959 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 03:15:02 +00:00
Craig Topper	8b17c46f75	[X86] Add an itinerary to BTR64rr. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328956 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 01:12:34 +00:00

24874 Commits