RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-16 10:26:23 +00:00

Author	SHA1	Message	Date
Roman Lebedev	5a5bcb7952	[SimplifyCFG] mergeConditionalStoreToAddress(): try to pacify MSAN MSAN bot complains that there is use-of-uninitialized-value of this FreeStores later in IsWorthwhile(). Perhaps FreeStores needs to be stored in a vector? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372262 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 21:04:39 +00:00
Roman Lebedev	ebc98fa7a0	[InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251) Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) With this, i see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67412 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372257 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 20:10:07 +00:00
Roman Lebedev	2ecd35251f	[SimplifyCFG] mergeConditionalStoreToAddress(): consider cost, not instruction count Summary: As it can be see in the changed test, while `div` is really costly, we were speculating it. This does not seem correct. Also, the old code would run for every single insturuction in BB, instead of eagerly bailing out as soon as there are too many instructions. This function still has a problem that `PHINodeFoldingThreshold` is per-basic-block, while it should be for all the basic blocks. Reviewers: efriedma, craig.topper, dmgreen, jmolloy Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67315 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372255 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 19:46:57 +00:00
Roman Lebedev	ce09123b44	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before upcoming patch git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372245 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 18:38:40 +00:00
Wei Mi	2e1df5f759	[SampleFDO] Minimize performance impact when profile-sample-accurate is enabled. We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSampleAccurate is true, function without sample will be regarded as cold and this could potentially cause performance regression. To minimize the potential negative performance impact, we want to be a little conservative here saying if a function shows up in the profile, no matter as outline instance, inline instance or call targets, treat the function as not being cold. This will handle the cases such as most callsites of a function are inlined in sampled binary (thus outline copy don't get any sample) but not inlined in current build (because of source code drift, imprecise debug information, or the callsites are all cold individually but not cold accumulatively...), so that the outline function showing up as cold in sampled binary will actually not be cold after current build. After the change, such function will be treated as not cold even profile-sample-accurate is enabled. At the same time we lower the hot criteria of callsiteIsHot check when profile-sample-accurate is enabled. callsiteIsHot is used to determined whether a callsite is hot and qualified for early inlining. When profile-sample-accurate is enabled, functions without profile will be regarded as cold and much less inlining will happen in CGSCC inlining pass, so we can worry less about size increase and be aggressive to allow more early inlining to happen for warm callsites and it is helpful for performance overall. Differential Revision: https://reviews.llvm.org/D67561 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372232 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 16:06:28 +00:00
Sanjay Patel	e700dd9b79	[SimplifyLibCalls] fix crash with empty function name (PR43347) ...and improve some variable names while here. https://bugs.llvm.org/show_bug.cgi?id=43347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372227 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-18 14:33:40 +00:00
Teresa Johnson	584a08b94d	[PGO] Change hardcoded thresholds for cold/inlinehint to use summary Summary: The PGO counter reading will add cold and inlinehint (hot) attributes to functions that are very cold or hot. This was using hardcoded thresholds, instead of the profile summary cutoffs which are used in other hot/cold detection and are more dynamic and adaptable. Switch to using the summary-based cold/hot detection. The hardcoded limits were causing some code that had a medium level of hotness (per the summary) to be incorrectly marked with a cold attribute, blocking inlining. Reviewers: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67673 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372189 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 23:12:13 +00:00
Reid Kleckner	f624998449	[PGO] Don't use comdat groups for counters & data on COFF For COFF, a comdat group is really a symbol marked IMAGE_COMDAT_SELECT_ANY and zero or more other symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE. Typically the associative symbols in the group are not external and are not referenced by other TUs, they are things like debug info, C++ dynamic initializers, or other section registration schemes. The Visual C++ linker reports a duplicate symbol error for symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE even if they would be discarded after handling the leader symbol. Fixes coverage-inline.cpp in check-profile after r372020. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372182 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 21:10:49 +00:00
Roman Lebedev	0b9b1fa8a1	[NFC][InstCombine] dropRedundantMaskingOfLeftShiftInput(): some NFC diff shaving git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372171 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 19:32:26 +00:00
David Bolvansky	833ff6adf3	Reland "[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y)" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372142 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 17:12:24 +00:00
Krasimir Georgiev	d5f8516fb7	Revert "[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y)" Summary: This reverts commit r372101. Causes ASAN build bot failures: http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/14176 From http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/14176/steps/64-bit%20check-asan/logs/stdio: ``` [ RUN ] AddressSanitizer.StrNCatOOBTest /home/buildbots/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/lib/asan/tests/asan_str_test.cpp:462: Failure Death test: strncat(to - 1, from, 0) Result: failed to die. ``` Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67658 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372125 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 14:15:23 +00:00
Simon Pilgrim	f2aa921cfe	[LoopVectorize] Don't dereference a dyn_cast result. NFCI. The static analyzer is warning about potential null dereferences of dyn_cast<> results, we can use cast<> directly as we know that these cases should all be CastInst, which is why its working atm and anyway cast<> will assert if they aren't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372116 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 13:24:54 +00:00
Benjamin Kramer	1d7df59e12	Hide implementation details in namespaces. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 12:56:29 +00:00
Johannes Doerfert	80781fec87	[Attributor][Fix] Initialize the cache prior to using it Summary: There were segfaults as we modified and iterated the instruction maps in the cache at the same time. This was happening because we created new instructions while we populated the cache. This fix changes the order in which we perform these actions. First, the caches for the whole module are created, then we start to create abstract attributes. I don't have a unit test but the LLVM test suite exposes this problem. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372105 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 10:52:41 +00:00
David Bolvansky	d15afd3638	[SLC] Preserve attrs for strncpy(x, "", y) -> memset(align 1 x, '\0', y) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372101 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 10:25:38 +00:00
David Bolvansky	d4e3b6ef79	[InstCombine] Annotate strdup with deref_or_null git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372098 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 10:12:48 +00:00
David Bolvansky	f0f4b7afaa	[NFCI] Fixed buildbots git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372097 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 10:03:45 +00:00
Fangrui Song	2920626e02	[SimplifyLibCalls] Fix -Wunused-result after D53342/r372091 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372096 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 09:56:55 +00:00
David Bolvansky	077e4d5d58	[SimplifyLibCalls] Mark known arguments with nonnull Reviewers: efriedma, jdoerfert Reviewed By: jdoerfert Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D53342 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372091 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 09:32:52 +00:00
Florian Hahn	bac41ae77d	[LoopUnroll] Use LoopSize+1 as threshold, to allow unrolling loops matching LoopSize. We use `< UP.Threshold` later on, so we should use LoopSize + 1, to allow unrolling if the result won't exceed to loop size. Fixes PR43305. Reviewers: efriedma, dmgreen, paquette Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D67594 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372084 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 09:02:48 +00:00
Hideto Ueno	f6a402010b	[Attributor] Use Alias Analysis in noalias callsite argument deduction Summary: This patch adds a check of alias analysis in `noalias` callsite argument deduction. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67604 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372075 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 06:53:27 +00:00
Hideto Ueno	8a6eaada3c	[Attributor] Create helper struct for handling analysis getters Summary: This patch introduces a helper struct `AnalysisGetter` to put together analysis getters. In this patch, a getter for `AAResult` is also added for `noalias`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67603 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372072 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-17 05:45:18 +00:00
Reid Kleckner	713b7a45a9	[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups This fixes relocations against __profd_ symbols in discarded sections, which is PR41380. In general, instrumentation happens very early, and optimization and inlining happens afterwards. The counters for a function are calculated early, and after inlining, counters for an inlined function may be widely referenced by other functions. For C++ inline functions of all kinds (linkonce_odr & available_externally mainly), instr profiling wants to deduplicate these __profc_ and __profd_ globals. Otherwise the binary would be quite large. I made __profd_ and __profc_ comdat in r355044, but I chose to make __profd_ internal. At the time, I was only dealing with coverage, and in that case, none of the instrumentation needs to reference __profd_. However, if you use PGO, then instrumentation passes add calls to __llvm_profile_instrument_range which reference __profd_ globals. The solution is to make these globals externally visible by using linkonce_odr linkage for data as was done for counters. This is safe because PGO adds a CFG hash to the names of the data and counter globals, so if different TUs have different globals, they will get different data and counter arrays. Reviewers: xur, hans Differential Revision: https://reviews.llvm.org/D67579 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372020 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 18:49:09 +00:00
Roman Lebedev	d1f4a55c8c	[SimplifyCFG] FoldTwoEntryPHINode(): consider total speculation cost, not per-BB cost Summary: Previously, if the threshold was 2, we were willing to speculatively execute 2 cheap instructions in both basic blocks (thus we were willing to speculatively execute cost = 4), but weren't willing to speculate when one BB had 3 instructions and other one had no instructions, even thought that would have total cost of 3. This looks inconsistent to me. I don't think `cmov`-like instructions will start executing until both of it's inputs are available: https://godbolt.org/z/zgHePf So i don't see why the existing behavior is the correct one. Also, let's add it's own `cl::opt` for this threshold, with default=4, so it is not stricter than the previous threshold: will allow to fold when there are 2 BB's each with cost=2. And since the logic has changed, it will also allow to fold when one BB has cost=3 and other cost=1, or there is only one BB with cost=4. This is an alternative solution to D65148: This fix is mainly motivated by `signbit-like-value-extension.ll` test. That pattern comes up in JPEG decoding, see e.g. `Figure F.12 – Extending the sign bit of a decoded value in V` of `ITU T.81` (JPEG specification). That branch is not predictable, and it is within the innermost loop, so the fact that that pattern ends up being stuck with a branch instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial. This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240) \| metric \| old \| new \| delta \| % \| \| x86-mi-counting.NumMachineFunctions \| 37720 \| 37721 \| 1 \| 0.00% \| \| x86-mi-counting.NumMachineBasicBlocks \| 773545 \| 771181 \| -2364 \| -0.31% \| \| x86-mi-counting.NumMachineInstructions \| 7488843 \| 7486442 \| -2401 \| -0.03% \| \| x86-mi-counting.NumUncondBR \| 135770 \| 135543 \| -227 \| -0.17% \| \| x86-mi-counting.NumCondBR \| 423753 \| 422187 \| -1566 \| -0.37% \| \| x86-mi-counting.NumCMOV \| 24815 \| 25731 \| 916 \| 3.69% \| \| x86-mi-counting.NumVecBlend \| 17 \| 17 \| 0 \| 0.00% \| We significantly decrease basic block count, notably decrease instruction count, significantly decrease branch count and very significantly increase `cmov` count. Performance-wise, unsurprisingly, this has great effect on target RawSpeed benchmark. I'm seeing 5 major improvements: ``` Benchmark Time CPU Time Old Time New CPU Old CPU New ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean -0.3064 -0.3064 226.9913 157.4452 226.9800 157.4384 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median -0.3057 -0.3057 226.8407 157.4926 226.8282 157.4828 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev -0.4985 -0.4954 0.3051 0.1530 0.3040 0.1534 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean -0.1747 -0.1747 80.4787 66.4227 80.4771 66.4146 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median -0.1742 -0.1743 80.4686 66.4542 80.4690 66.4436 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev +0.6089 +0.5797 0.0670 0.1078 0.0673 0.1062 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean -0.1598 -0.1598 171.6996 144.2575 171.6915 144.2538 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median -0.1598 -0.1597 171.7109 144.2755 171.7018 144.2766 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev +0.4024 +0.3850 0.0847 0.1187 0.0848 0.1175 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean -0.0550 -0.0551 280.3046 264.8800 280.3017 264.8559 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median -0.0554 -0.0554 280.2628 264.7360 280.2574 264.7297 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev +0.7005 +0.7041 0.2779 0.4725 0.2775 0.4729 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean -0.0354 -0.0355 316.7396 305.5208 316.7342 305.4890 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median -0.0354 -0.0356 316.6969 305.4798 316.6917 305.4324 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev +0.0493 +0.0330 0.3562 0.3737 0.3563 0.3681 ``` That being said, it's always best-effort, so there will likely be cases where this worsens things. Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67318 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372009 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 16:18:24 +00:00
Sanjay Patel	49121b4df9	[InstCombine] remove unneeded one-use checks for icmp fold Related folds were added in: rL125734 ...the code comment about register pressure is discussed in more detail in: https://bugs.llvm.org/show_bug.cgi?id=2698 But 10 years later, perf testing bzip2 with this change now shows a slight (0.2% average) improvement on Haswell although that's probably within test noise. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 and rL371981 are related patches in this series. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 16:15:25 +00:00
Sanjay Patel	cba3b2c1c3	[InstCombine] remove unneeded one-use checks for icmp fold This fold and several others were added in: rL125734 <https://reviews.llvm.org/rL125734> ...with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 is a related patch in this series. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 12:54:34 +00:00
Sanjay Patel	1a3c0ac4ea	[InstCombine] fix comments to match code; NFC This blob was written before match() existed, so it could probably be reduced significantly. But I suspect it isn't well tested, so tests would have to be added to reduce risk from logic changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 12:12:05 +00:00
Simon Pilgrim	f4c41d6cba	[VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI. The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371975 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 11:22:44 +00:00
Simon Pilgrim	5a41182176	[SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371974 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 10:48:16 +00:00
Simon Pilgrim	c46379da2c	[SLPVectorizer] Don't dereference a dyn_cast result. NFCI. The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371973 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 10:35:09 +00:00
Stefan Stipanovic	e342d834af	[Attributor] Heap-To-Stack Conversion D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features. Reviewers: jdoerfert, uenoku, hfinkel, efriedma Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D65408 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371942 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-15 21:47:41 +00:00
Sanjay Patel	3050f70ce7	[InstCombine] remove unneeded one-use checks for icmp fold This fold and several others were added in: rL125734 ...with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. There are similar checks as noted with the TODO comments. I'm hoping to remove those restrictions too, but if any of these does cause a regression, it should be easier to correct by making small, individual commits. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-15 20:56:34 +00:00
Simon Pilgrim	81bcd7a858	[LoadStoreVectorizer] vectorizeLoadChain - ensure we find a valid Type down the load chain. NFCI. Silence static analyzer uninitialized variable warning by setting the LoadTy to null and then asserting we find a real value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371936 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-15 16:44:35 +00:00
Sanjay Patel	71f28d66dd	[SLP] limit vectorization of Constant subclasses (PR33958) This is a fix for: https://bugs.llvm.org/show_bug.cgi?id=33958 It seems universally true that we would not want to transform this kind of sequence on any target, but if that's not correct, then we could view this as a target-specific cost model problem. We could also white-list ConstantInt, ConstantFP, etc. rather than blacklist Global and ConstantExpr. Differential Revision: https://reviews.llvm.org/D67362 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371931 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-15 13:03:24 +00:00
Johannes Doerfert	60c9595c32	[Attributor][Fix] Use right type to replace expressions Summary: This should be obsolete once the functionality in D66967 is integrated. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371915 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-14 02:57:50 +00:00
Florian Hahn	768257cad5	[BasicBlockUtils] Add optional BBName argument, in line with BB:splitBasicBlock Reviewers: spatel, asbirlea, craig.topper Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D67521 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371819 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-13 08:03:32 +00:00
Alina Sbirlea	333387e4a8	[MemorySSA] Pass (for update) MSSAU when hoisting instructions. Summary: Pass MSSAU to makeLoopInvariant in order to properly update MSSA. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67470 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371748 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-12 17:12:51 +00:00
Sanjay Patel	21dbe0be4a	[InstCombine] rename variable for readability; NFC There's more that can be done here, but "OpI" doesn't convey that we casted to BinaryOperator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371682 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 22:31:34 +00:00
Eli Friedman	3473785eb3	[ConstantHoisting] Fix non-determinism. Differential Revision: https://reviews.llvm.org/D66114 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371644 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 18:55:00 +00:00
Petr Hosek	62e7ac6bbc	Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing the developer's annotation against a collected PGO profile. A more detailed proposal and discussion appears on the CFE-dev mailing list (http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a prototype of the initial frontend changes appear here in D65300 We revised the work in D65300 by moving the misexpect check into the LLVM backend, and adding support for IR and sampling based profiles, in addition to frontend instrumentation. We add new misexpect metadata tags to those instructions directly influenced by the llvm.expect intrinsic (branch, switch, and select) when lowering the intrinsics. The misexpect metadata contains information about the expected target of the intrinsic so that we can check against the correct PGO counter when emitting diagnostics, and the compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight. We use these branch weight values to determine when to emit the diagnostic to the user. A future patch should address the comment at the top of LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and UnlikelyBranchWeight values into a shared space that can be accessed outside of the LowerExpectIntrinsic pass. Once that is done, the misexpect metadata can be updated to be smaller. In the long term, it is possible to reconstruct portions of the misexpect metadata from the existing profile data. However, we have avoided this to keep the code simple, and because some kind of metadata tag will be required to identify which branch/switch/select instructions are influenced by the use of llvm.expect Patch By: paulkirth Differential Revision: https://reviews.llvm.org/D66324 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371635 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 16:19:50 +00:00
Florian Hahn	c60c66e7ae	Revert [InstCombine] Use SimplifyFMulInst to simplify multiply in fma. This introduces additional rounding error in some cases. See D67434. This reverts r371518 (git commit 18a1f0818b659cee13865b4fad2648d85984a4ed) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371634 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 16:17:03 +00:00
Whitney Tsang	0123ecb125	LLVM: Optimization Pass: Remove conflicting attribute, if any, before adding new read attribute to an argument Summary: Update optimization pass to prevent adding read-attribute to an argument without removing its conflicting attribute. A read attribute, based on the result of the attribute deduction process, might be added to an argument. The attribute might be in conflict with other read/write attribute currently associated with the argument. To ensure the compatibility of attributes, conflicting attribute, if any, must be removed before a new one is added. The following snippet shows the current behavior of the compiler, where the compilation process is aborted due to incompatible attributes. $ cat x.ll ; ModuleID = 'x.bc' %_type_of_d-ccc = type <{ i8, i8, i8, i8, i8 }> @d-ccc = internal global %_type_of_d-ccc <{ i8 null, i8 1, i8 13, i8 0, i8 -127 }>, align 8 define void @foo(i32* writeonly %.aaa) { foo_entry: %_param_.aaa = alloca i32, align 8 store i32 %.aaa, i32** %_param_.aaa, align 8 store i8 0, i8* getelementptr inbounds (%_type_of_d-ccc, %_type_of_d-ccc* @d-ccc, i32 0, i32 3) ret void } $ opt -O3 x.ll Attributes 'readnone and writeonly' are incompatible! void (i32) @foo in function foo LLVM ERROR: Broken function found, compilation aborted! The purpose of this changeset is to fix the above error. This fix is based on a suggestion from Johannes @jdoerfert (many thanks!!!) Authored By: anhtuyen Reviewer: nicholas, rnk, chandlerc, jdoerfert Reviewed By: rnk Subscribers: hiraditya, jdoerfert, llvm-commits, anhtuyen, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D58694 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371622 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 14:26:22 +00:00
Sanjay Patel	3813551fb7	[InstCombine] fold sign-bit compares of srem (srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking off the sign bit and the module (low) bits: https://rise4fun.com/Alive/jSO A '2' divisor allows slightly more folding: https://rise4fun.com/Alive/tDBM Any chance to remove an 'srem' use is probably worthwhile, but this is limited to the one-use improvement case because doing more may expose other missing folds. That means it does nothing for PR21929 yet: https://bugs.llvm.org/show_bug.cgi?id=21929 Differential Revision: https://reviews.llvm.org/D67334 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371610 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 12:04:26 +00:00
David Bolvansky	43dfaac978	[InstCombine] Fixed handling of isOpNewLike (PR11748) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371602 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 10:37:03 +00:00
Florian Hahn	fcb40334f1	[LoopInterchange] Drop unused splitInnerLoopHeader declaration. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371601 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 10:32:15 +00:00
Dmitri Gribenko	b91891ad06	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" This reverts commit r371584. It introduced a dependency from compiler-rt to llvm/include/ADT, which is problematic for multiple reasons. One is that it is a novel dependency edge, which needs cross-compliation machinery for llvm/include/ADT (yes, it is true that right now compiler-rt included only header-only libraries, however, if we allow compiler-rt to depend on anything from ADT, other libraries will eventually get used). Secondly, depending on ADT from compiler-rt exposes ADT symbols from compiler-rt, which would cause ODR violations when Clang is built with the profile library. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371598 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 09:16:17 +00:00
Florian Hahn	e72ccee9d5	[LoopInterchange] Properly move condition, induction increment and ops to latch. Currently we only rely on the induction increment to come before the condition to ensure the required instructions get moved to the new latch. This patch duplicates and moves the required instructions to the newly created latch. We move the condition to the end of the new block, then process its operands. We stop at operands that are defined outside the loop, or are the induction PHI. We duplicate the instructions and update the uses in the moved instructions, to ensure other users remain intact. See the added test2 for such an example. Reviewers: efriedma, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D67367 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371595 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 08:23:23 +00:00
Hideto Ueno	df18bf27eb	[Attributor] Implement "noalias" callsite argument deduction Summary: Now, `nocapture` is deduced in Attributor therefore, this patch introduces deduction for `noalias` callsite argument using `nocapture`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67286 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371590 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 07:00:33 +00:00
Hideto Ueno	58dc7cdf01	[Attributor][Fix] Manifest nocapture only in CSArgument or Argument Summary: We can query to Attributor whether the value is captured in the scope or not on the following way: ``` const auto & NoCapAA = A.getAAFor<AANoCapture>(*this, IRPosition::value(V)); ``` And if V is CallSiteReturned then `getDeducedAttribute` will add `nocatpure` to the callsite returned value. It is not valid. This patch checks the position is an argument or call site argument. This is tested in D67286. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67342 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371589 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 06:52:11 +00:00
Alexey Lapshin	6a1542545b	[Debuginfo][Instcombiner] Do not clone dbg.declare. TryToSinkInstruction() has a bug: While updating debug info for sunk instruction, it could clone dbg.declare intrinsic. That is wrong. There could be only one dbg.declare. The fix is to not clone dbg.declare intrinsic and to update it`s arguments, to not to point to sunk instruction. Differential Revision: https://reviews.llvm.org/D67217 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371587 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 06:07:16 +00:00

1 2 3 4 5 ...

22429 Commits