llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-04 11:17:31 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	ffdb7cfe3b	[X86] SimplifyDemandedVectorEltsForTargetNode - Move SUBV_BROADCAST narrowing handling. NFCI. Move the narrowing of SUBV_BROADCAST to where we handle all the other opcodes. llvm-svn: 366660	2019-07-21 19:04:44 +00:00
Nemanja Ivanovic	f160dc21c1	[PowerPC][NFC] Regenerate test using script This test case ended up as a hybrid of generated checks and manually inserted checks. Regenerate using script to make it consistent. llvm-svn: 366659	2019-07-21 18:42:29 +00:00
Craig Topper	4e73afb68e	[InstCombine] Update comment I missed in r366649. NFC llvm-svn: 366658	2019-07-21 16:15:03 +00:00
Simon Pilgrim	d36aa2fa3c	[SmallBitVector] Fix bug in find_next_unset for small types with indices >=32 We were creating a bitmask from a shift of unsigned instead of uintptr_t, meaning we couldn't create masks for indices above 31. Noticed due to an MSVC analyzer warning. llvm-svn: 366657	2019-07-21 16:06:26 +00:00
Aditya Nandakumar	1d6b95ca2a	[GISel]: Attach missing range metadata while translating G_LOADs https://reviews.llvm.org/D65048 Attach range information to G_LOAD when only defining one register. reviewed by: arsenm llvm-svn: 366656	2019-07-21 14:07:54 +00:00
David Green	59e56be139	[ARM] Move MVE VPT block tests into the Thumb2 directory. NFC llvm-svn: 366655	2019-07-21 13:09:19 +00:00
Roman Lebedev	6a6c707ef7	[NFC][InstCombine] Add a few extra srem-by-power-of-two tests - extra uses llvm-svn: 366652	2019-07-21 09:05:49 +00:00
Craig Topper	1985c47dd6	[InstCombine] Remove insertRangeTest code that handles the equality case. For equality, the function called getTrue/getFalse with the VT of the comparison input. But getTrue/getFalse need the boolean VT. So if this code ever executed, it would assert. I believe these cases are removed by InstSimplify so we don't get here. So this patch just fixes up an assert to exclude the equality possibility and removes the broken code. llvm-svn: 366649	2019-07-21 06:43:38 +00:00
Craig Topper	9b03e30e86	[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI AddOne/SubOne create new Constant objects. That seems heavy for comparing ConstantInts which wrap APInts. Just do the math on the APInts and compare them. llvm-svn: 366648	2019-07-21 05:26:05 +00:00
Nico Weber	f25a994803	gn build: Merge r366622 llvm-svn: 366646	2019-07-21 00:03:55 +00:00
Roman Lebedev	f6ef7f16ab	[NFC][InstCombine] Autogenerate a few tests llvm-svn: 366643	2019-07-20 21:34:00 +00:00
Roman Lebedev	a874990efe	[NFC][InstCombine] Add srem-by-signbit tests - still can fold to bittest https://rise4fun.com/Alive/IIeS llvm-svn: 366642	2019-07-20 21:33:50 +00:00
Roman Lebedev	ed828c1134	[NFC][Codegen][X86][AArch64] Add "(x s% C) == 0" tests Much like with `urem`, the same optimization (albeit with a slightly different algorithm) applies for the signed case, too. I'm simply copying the test coverage from the `urem` case for now; I believe it should be (close to?) sufficient. llvm-svn: 366640	2019-07-20 19:25:44 +00:00
Roman Lebedev	609e3ba7ea	[Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements Summary: Four things here: 1. Generalize the fold to handle non-splat divisors. Reasonably trivial. 2. Unban power-of-two divisors. I don't see any reason why they should be illegal. * There is no ban in Hacker's Delight * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we were ensuring that `D0` is not `1`, which made sense. 3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations: * We know that * `(X u% 1) == 0` can be constant-folded to `1`, * `(X u% 1) != 0` can be constant-folded to `0`, * Also, we know that * `X u<= -1` can be constant-folded to `1`, * `X u> -1` can be constant-folded to `0`, * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p * We know we will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)` * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is an all-ones constant; and both `P` and `K` are anything other than `undef`. * The fold will indeed produce `Q = all-ones`. 4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where the divisor was `1`. Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00 Reviewed By: RKSimon Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63963 llvm-svn: 366637	2019-07-20 16:33:15 +00:00
Simon Pilgrim	5c2a6a3c00	[X86][SSE] Use PSADBW to improve vXi8 sum reduction (PR42674) As detailed on PR42674, we can reduce a vXi8 down until we have the final <8 x i8>, and then use PSADBW with zero, to sum those values. We then extract the bottom i8, discarding any overflow from the upper bits of the i16 result. llvm-svn: 366636	2019-07-20 15:20:11 +00:00
Florian Hahn	455b8d429f	[Local] Zap blockaddress without users in ConstantFoldTerminator. If the blockaddress is not destroyed, the destination block will still be marked as having its address taken, limiting further transformations. I think there are other places where the dead blockaddress constants are kept around; I'll look into that as a follow-up. Reviewers: craig.topper, brzycki, davide Reviewed By: brzycki, davide Differential Revision: https://reviews.llvm.org/D64936 llvm-svn: 366633	2019-07-20 12:25:47 +00:00
Jessica Paquette	a789c33e9a	[GlobalISel][AArch64] Contract trivial same-size cross-bank copies into G_STOREs Sometimes, you can end up with cross-bank copies between same-sized GPRs and FPRs, which feed into G_STOREs. When these copies feed only into stores, they aren't necessary; we can just store using the original register bank. This provides some minor code size savings for some floating point SPEC benchmarks. (Around 0.2% for 453.povray and 450.soplex) This issue doesn't seem to show up due to regbankselect or anything similar. So, this patch introduces an early select function, `contractCrossBankCopyIntoStore` which performs the contraction when possible. The selector then continues normally and selects the correct store opcode, eliminating needless copies along the way. Differential Revision: https://reviews.llvm.org/D65024 llvm-svn: 366625	2019-07-20 01:55:35 +00:00
Guanzhong Chen	5ecf5af2b4	[WebAssembly] Compute and export TLS block alignment Summary: Add immutable WASM global `__tls_align` which stores the alignment requirements of the TLS segment. Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang. The expected usage has now changed to: __wasm_init_tls(memalign(__builtin_wasm_tls_align(), __builtin_wasm_tls_size())); Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton Reviewed By: tlively Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65028 llvm-svn: 366624	2019-07-19 23:34:16 +00:00
Daniel Sanders	60c2b0f7af	Re-commit: r366610 and r366612: Expand pseudo-components before embedding in llvm-config There were two main problems: * The 'nativecodegen' pseudo-component was unconditionally adding ${native_tgt}CodeGen even though it conditionally added ${native_tgt}Info and ${native_tgt}Desc. This has been fixed by making ${native_tgt}CodeGen conditional too * The 'all' pseudo-component was causing library names like LLVMLLVMDemangle as the expansion was to a library name and not a component. There doesn't seem to be a list of available components anywhere so this has been fixed by moving the expansion of 'all' back where it was before. This manifested in different ways on different builders but it was the same root cause llvm-svn: 366622	2019-07-19 22:46:47 +00:00
Matt Arsenault	0e9abc2fbd	AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces llvm-svn: 366621	2019-07-19 22:28:44 +00:00
Stanislav Mekhanoshin	1ace6d5780	[AMDGPU] Autogenerate register sequences in tuples Differential Revision: https://reviews.llvm.org/D65007 llvm-svn: 366619	2019-07-19 21:43:42 +00:00
Stanislav Mekhanoshin	4f62c7dc7b	[AMDGPU] Fixed occupancy calculation for gfx10 Differential Revision: https://reviews.llvm.org/D65010 llvm-svn: 366616	2019-07-19 21:29:51 +00:00
Daniel Sanders	47b2ce4000	Revert r366610 and r366612: Expand pseudo-components before embedding in llvm-config Some targets are missing LLVMDemangle, one is adding the LLVM prefix twice, and two are hitting the very error this patch fixes for my target. Reverting while I work through the reports. llvm-svn: 366615	2019-07-19 21:11:05 +00:00
Craig Topper	a0fef0a13e	[InstCombine] Fix copy/paste mistake in the test cases I added for PR42691. NFC llvm-svn: 366614	2019-07-19 21:09:21 +00:00
Matt Arsenault	e337bb98bb	AMDGPU: Avoid custom predicates for stores with glue llvm-svn: 366613	2019-07-19 21:01:30 +00:00
Daniel Sanders	eb9d60e13b	Fix a latent bug discovered by r366610: nativecodegen includes X86CodeGen when X86 is not compiled I believe this to have been a latent bug as the same expansion checks for the existence of ${native_tgt}Info and ${native_tgt}Desc and only adds them if they were compiled but unconditionally adds ${native_tgt}CodeGen. This should fix llvm-clang-x86_64-win-fast which builds ARM only on an X86 host and similar builders. llvm-svn: 366612	2019-07-19 20:58:11 +00:00
Craig Topper	192e68f782	[InstCombine] Add test cases for PR42691. NFC llvm-svn: 366611	2019-07-19 20:48:52 +00:00
Daniel Sanders	70b0573626	Expand pseudo-components before embedding in llvm-config Summary: If you use pseudo-targets like AllTargetsCodeGens in LLVM_DYLIB_COMPONENTS then a test will fail because `./bin/llvm-config --shared-mode` can't handle these targets. We can fix this by expanding them before embedding the string into llvm-config Reviewers: bogner Reviewed By: bogner Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65011 llvm-svn: 366610	2019-07-19 20:38:05 +00:00
Matt Arsenault	6e1ebba17d	AMDGPU: Redefine setcc condition PatLeafs Avoid using custom code predicates. llvm-svn: 366609	2019-07-19 20:24:40 +00:00
Matt Arsenault	2724c8cef5	AMDGPU: Don't rely on m0 being -1 for GWS offsets This only works if the high bits of m0 are also 0, so m0 would have to be set to 0xffff. llvm-svn: 366608	2019-07-19 20:01:24 +00:00
Matt Arsenault	b81a355be5	AMDGPU: Force s_waitcnt after GWS instructions This is apparently required to be the immediately following instruction, so force it into a bundle with a waitcnt. llvm-svn: 366607	2019-07-19 19:47:30 +00:00
Matt Arsenault	f3cd8c40d1	LiveIntervals: Fix handleMove asserting on BUNDLE The top-level BUNDLE instruction should behave as an ordinary instruction. It is supposed to have all relevant registers as implicit operands. Moving it should work as any other instruction. I believe the assert intended to avoid moving instructions inside bundles. llvm-svn: 366605	2019-07-19 19:32:00 +00:00
Louis Dionne	747dd4b220	Revert "[libc++] Integrate the PSTL into libc++" This reverts r366593, which caused unforeseen breakage on the build bots. I'm reverting until the problems have been figured out and fixed. llvm-svn: 366603	2019-07-19 18:52:46 +00:00
Michael Liao	5f9a32dcb3	[AMDGPU] Add test case on crashing of `si-lower-sgpr-spills` pass Reviewers: arsenm Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64273 llvm-svn: 366602	2019-07-19 18:50:53 +00:00
Nick Desaulniers	c56d9ef695	Revert "Use the MachineBasicBlock symbol for a callbr target" This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea. Two regressions were immediately reported: - https://github.com/ClangBuiltLinux/linux/issues/614 - https://github.com/ClangBuiltLinux/linux/issues/615 Reported-by: nathanchance llvm-svn: 366600	2019-07-19 18:18:02 +00:00
Matt Morehouse	2bd8f0a2eb	[RISCV] Disable tests failing on buildbots. r366399 enabled a couple tests that are failing on a few buildbots. llvm-svn: 366599	2019-07-19 18:05:12 +00:00
Stanislav Mekhanoshin	0c9d743a5c	[AMDGPU] Allow register tuples to set asm names This change reverts most of the previous register name generation. The real problem is that RegisterTuple does not generate asm names. Added optional operand to RegisterTuple. This way we can simplify register name access and dramatically reduce the size of static tables for the backend. Differential Revision: https://reviews.llvm.org/D64967 llvm-svn: 366598	2019-07-19 18:05:01 +00:00
Matt Arsenault	c815dbdcb1	AMDGPU/GlobalISel: Fix MMO flags for kernel argument loads The DAG lowering sets dereferenceable and invariant, not nontemporal. llvm-svn: 366597	2019-07-19 17:52:56 +00:00
Matt Arsenault	fbb71b0b9e	GlobalISel: Add GINodeEquiv for fcopysign I don't need this at the moment, but it should be here. llvm-svn: 366596	2019-07-19 17:32:19 +00:00
Shoaib Meenai	5746665a65	[llvm-lipo] Remove trailing whitespace. NFC llvm-svn: 366595	2019-07-19 17:19:57 +00:00
Louis Dionne	5a48459ac2	[libc++] Integrate the PSTL into libc++ Summary: This commit allows specifying LIBCXX_ENABLE_PARALLEL_ALGORITHMS when configuring libc++ in CMake. When that option is enabled, libc++ will assume that the PSTL can be found somewhere on the CMake module path, and it will provide the C++17 parallel algorithms based on the PSTL (that is assumed to be available). The commit also adds support for running the PSTL tests as part of the libc++ test suite. Reviewers: rodgert, EricWF Subscribers: mgorny, christof, jkorous, dexonsmith, libcxx-commits, mclow.lists, EricWF Tags: #libc Differential Revision: https://reviews.llvm.org/D60480 llvm-svn: 366593	2019-07-19 17:02:42 +00:00
Matt Arsenault	59f385cd0c	AMDGPU: Add some function return test cases llvm-svn: 366591	2019-07-19 16:45:48 +00:00
Simon Pilgrim	caf2aa9106	[AMDGPU] Regenerate test file for upcoming patch. NFCI. llvm-svn: 366589	2019-07-19 15:43:56 +00:00
Matt Arsenault	2217349a57	AMDGPU: Attempt to fix bot error Manually remove file name from check line, since it somehow ends up being different on an msvc bot. llvm-svn: 366586	2019-07-19 14:56:24 +00:00
Matt Arsenault	43bcdb17a1	AMDGPU/GlobalISel: Selection for fminnum/fmaxnum v2f16 case doesn't work yet because the VOP3P complex patterns haven't been ported yet. llvm-svn: 366585	2019-07-19 14:42:40 +00:00
Matt Arsenault	e76c843d30	AMDGPU/GlobalISel: Support arguments with multiple registers Handles structs used directly in argument lists. llvm-svn: 366584	2019-07-19 14:29:30 +00:00
Matt Arsenault	3b8fc1b930	AMDGPU/GlobalISel: Rewrite lowerFormalArguments This should now handle everything except structs passed as multiple registers. I think most of the packing logic should be handled by handleAssignments, but I'm unclear on what the contract is for multiple registers. This is copying how x86 handles this. This does change the behavior of the test_sgpr_alignment0 amdgpu_vs test. I don't think shader arguments should try to follow the alignment, and registers need to be repacked. I also don't think it matters, since I think the pointers are packed to the beginning of the argument list anyway. llvm-svn: 366582	2019-07-19 14:15:18 +00:00
Matt Arsenault	a09c49dee2	AMDGPU: Decompose all values to 32-bit pieces for calling conventions This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit. This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel. llvm-svn: 366578	2019-07-19 13:57:44 +00:00
Nico Weber	ab9357c357	gn build: Set +x on symlink_or_copy.py llvm-svn: 366576	2019-07-19 13:40:54 +00:00
Matt Arsenault	1ad1497bb5	DAG: Handle dbg_value for arguments split into multiple subregs This was handled previously for arguments split due to not fitting in an MVT. This was dropping the register for argument registers split due to TLI::getRegisterTypeForCallingConv. llvm-svn: 366574	2019-07-19 13:36:46 +00:00
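
The mask problem described in d36aa2fa3c (r366657) can be illustrated with a small standalone sketch. This is not the SmallBitVector source: `Word`, `findNextUnset`, and its parameters are hypothetical names, and `std::uint64_t` stands in for `uintptr_t` on a 64-bit host. The point is only that the "bits at or below the previous index" mask has to be built in the wide type, because a shift of a 32-bit `unsigned` cannot produce a mask for indices above 31.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for uintptr_t on a 64-bit host (assumption for this sketch).
using Word = std::uint64_t;

// Hypothetical helper: index of the first clear bit strictly after `prev`
// among the low `size` bits of `bits`, or -1 if there is none.
inline int findNextUnset(Word bits, unsigned prev, unsigned size) {
  assert(size <= 64 && "sketch only models a single 64-bit word");
  unsigned start = prev + 1;
  if (start >= size)
    return -1;

  // Build the mask in the wide type. Writing (1u << start) - 1 instead
  // would shift a 32-bit unsigned and break for start >= 32.
  Word mask = (Word(1) << start) - 1;

  // Treat every bit below `start` as set, then count trailing ones.
  Word filled = bits | mask;
  unsigned idx = 0;
  while (idx < size && ((filled >> idx) & 1))
    ++idx;
  return idx < size ? static_cast<int>(idx) : -1;
}
```

With bits 3 and 35 clear in an otherwise full 40-bit pattern, a query starting from index 32 skips bit 3 (it is masked in) and reports bit 35, which is the case a 32-bit mask would get wrong.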


182202 Commits
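
The `(setule (rotr (mul N, P), K), Q)` shape quoted in 609e3ba7ea (r366637) can be tried out on scalars. The following is a minimal sketch for 32-bit values, not the SelectionDAG implementation. It assumes the usual Hacker's Delight-style construction: `D = D0 * 2^K` with `D0` odd, `P` the multiplicative inverse of `D0` modulo `2^32`, and `Q = floor((2^32 - 1) / D)`. The names `URemFold`, `makeURemFold`, `inverseMod2_32`, `rotr32`, and `isMultipleOf` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Constants for the test "x u% D == 0" rewritten as rotr(x * P, K) u<= Q.
struct URemFold {
  std::uint32_t P; // inverse of the odd part of D modulo 2^32
  unsigned K;      // number of trailing zero bits of D
  std::uint32_t Q; // floor((2^32 - 1) / D)
};

// Inverse of an odd value modulo 2^32 via Newton iteration.
inline std::uint32_t inverseMod2_32(std::uint32_t d) {
  std::uint32_t x = d;          // d * d == 1 (mod 8), so 3 bits are correct
  for (int i = 0; i < 4; ++i)   // correct bits: 3 -> 6 -> 12 -> 24 -> 48
    x *= 2u - d * x;
  return x;
}

inline URemFold makeURemFold(std::uint32_t d) {
  assert(d != 0 && "division by zero is not modelled");
  std::uint32_t d0 = d;
  unsigned k = 0;
  while ((d0 & 1u) == 0) {      // strip the power-of-two factor
    d0 >>= 1;
    ++k;
  }
  // Q divides by the full divisor D, not by its odd part D0.
  return URemFold{inverseMod2_32(d0), k, UINT32_MAX / d};
}

inline std::uint32_t rotr32(std::uint32_t v, unsigned k) {
  k &= 31u;
  return k == 0 ? v : (v >> k) | (v << (32u - k));
}

// True when x is a multiple of the divisor the constants were built for.
inline bool isMultipleOf(std::uint32_t x, const URemFold &f) {
  return rotr32(x * f.P, f.K) <= f.Q;
}
```

Under these assumptions a divisor of `1` yields `P = 1`, `K = 0`, and `Q = all-ones`, so the comparison is always true, and a power-of-two divisor reduces to checking that the low `K` bits of `x` are zero. Both match the commit's argument that these divisors need not be banned.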