This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans. This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.
Differential Revision: https://reviews.llvm.org/D33009
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302936 91177308-0d34-0410-b5e6-96231b3b80d8
At this point in the code rhsWords is guaranteed to be non-zero and less than or equal to lhsWords. So if lhsWords is 1, rhsWords must also be 1. urem alread had the check removed so this makes all 3 consistent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302930 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds min/max population count, leading/trailing zero/one bit counting methods.
The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
Differential Revision: https://reviews.llvm.org/D32931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302925 91177308-0d34-0410-b5e6-96231b3b80d8
CodeViewDebug sets Asm to nullptr to disable debug info generation. You
can get a .ll file like no-cus.ll from 'clang -gcodeview -g0', which
happens in the ubsan test suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302923 91177308-0d34-0410-b5e6-96231b3b80d8
While debugging a predicate info problem, I noticed this was missing
a newline, making the debug output slightly less readable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302908 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow up patch for https://reviews.llvm.org/rL300440
to address a comment.
To make implementation to be consistent with other cases we just
ignore the remainder after distribution of remaining probability between
reachable edges.
If we reduced the probability of some edges coming to unreachable
blocks we should distribute the remaining part across other edges
coming to reachable blocks to satisfy the condition that sum of all
probabilities should be equal to one. If this remaining part is not
divided by number of "reachable" edges then we get this remainder.
This remainder probability should be pretty small. Other cases just ignore
if the sum of probabilities is not equal to one so we do the same.
Reviewers: chandlerc, sanjoy, vsk, junbuml, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D32124
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302883 91177308-0d34-0410-b5e6-96231b3b80d8
I tried to compile LLD using GCC 7.1.0 and got warnings like
"warning: this statement may fall through [-Wimplicit-fallthrough=]"
(some more details are here: D32907)
GCC's __cplusplus value is 201402L by default, so macro expands to nothing,
though GCC 7 has support for [[fallthrough]].
Patch uses gnu::fallthrough when it is available and fixes warning I am observing.
Initial idea of way to fix belongs to Davide Italiano.
Differential revision: https://reviews.llvm.org/D33036
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302878 91177308-0d34-0410-b5e6-96231b3b80d8
Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy()
with an undef source operand. It is not possible for the target to handle
this, as this flag is not passed to TII->copyPhysReg().
This patch solves this by treating such a COPY as an identity COPY.
Review: Matthias Braun
https://reviews.llvm.org/D32892
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302877 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot
always be trusted) at the end to fixup the CFG we keep the CFG updated as
we go along and remove or add branches and merge blocks.
This way we won't have any problems if the involved MBBs contain
unanalyzable instructions.
This fixes PR32721.
In that case we had a triangle
EBB
| \
| |
| TBB
| /
FBB
where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).
Reviewers: kparzysz, iteratee, MatzeB
Reviewed By: iteratee
Subscribers: aemerson, rengolin, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33037
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302876 91177308-0d34-0410-b5e6-96231b3b80d8
invariant PHI inputs and to rewrite PHI nodes during the actual
unswitching.
The checking is quite easy, but rewriting the PHI nodes is somewhat
surprisingly challenging. This should handle both branches and switches.
I think this is now a full featured trivial unswitcher, and more full
featured than the trivial cases in the old pass while still being (IMO)
somewhat simpler in how it works.
Next up is to verify its correctness in more widespread testing, and
then to add non-trivial unswitching.
Thanks to Davide and Sanjoy for the excellent review. There is one
remaining question that I may address in a follow-up patch (see the
review thread for details) but it isn't related to the functionality
specifically.
Differential Revision: https://reviews.llvm.org/D32699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302867 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds a resize method to APInt that manages deleting/allocating storage for an APInt and changes its bit width. Use this to simplify code in copy assignment and divide.
The assignment code in particular was overly complicated. Treating every possible case as a separate implementation. I'm also pretty sure the clearUnusedBits code at the end was unnecessary. Since we always copying whole words from the source APInt. All unused bits should be clear in the source.
Reviewers: hans, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33073
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302863 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out that the Fission/Split DWARF package format (DWP) is currently
insufficient to handle cross-CU (ref_addr) references. So for now,
duplicate any debug info needed in these situations:
* inlined_subroutine's abstract_origin
* inlined variable's abstract_origin
* types
Keep the ref_addr behavior in general, including in the split DWARF
inline debug info that can be emitted into the object files for online
symbolication.
Keep a flag to use the old (ref_addr) behavior for testing ways of
addressing this limitation in the DWP tool (& for those not using DWP
packaging).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302858 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In D30630 we will start writing custom event records. To avoid breaking
the tools that read the FDR mode records, we skip over these records.
To support these custom event records more effectively, we will have to
expose them in the trace loading API. Those changes will be forthcoming.
Reviewers: kpw, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33032
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302856 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch changes the function profile output order to be deterministic. In order to make it easier to understand, hottest functions (with most total samples) is ordered first.
Reviewers: dnovillo, davidxl
Reviewed By: dnovillo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33111
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302851 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).
This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling, and to add in profile summary metadata
so that getProfileCount can get the summary.
Reviewers: davidxl, danielcdh
Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D32877
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302844 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid using report_fatal_error, because it will ask the user to file a
bug. If the user attempts to disable SSE on x86_64 and them use floating
point, that's a bug in their code, not a bug in the compiler.
This is just a start. There are other ways to crash the backend in this
configuration, but they should be updated to follow this pattern.
Differential Revision: https://reviews.llvm.org/D27522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302835 91177308-0d34-0410-b5e6-96231b3b80d8
According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.
This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.
Differential Revision: https://reviews.llvm.org/D32880
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302834 91177308-0d34-0410-b5e6-96231b3b80d8
I ran the test-suite (including SPEC 2006) in PGO mode comparing cold
thresholds of 225 and 45. Here are some stats on the text size:
Out of 904 tests that ran, 197 see a change in text size. The average
text size reduction (of all the 904 binaries) is 1.07%. Of the 197
binaries, 19 see a text size increase, as high as 18%, but most of them
are small single source benchmarks. There are 3 multisource benchmarks
with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On
the other side of the spectrum, 31 benchmarks see >10% size reduction
and 6 of them are MultiSource.
I haven't run the test-suite with other values of inlinecold-threshold.
Since we have a cold callsite threshold of 45, I picked this value.
Differential revision: https://reviews.llvm.org/D33106
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302829 91177308-0d34-0410-b5e6-96231b3b80d8
Use the same switch technique to eliminate virtual successor accessors
from TerminatorInst. Extracted from D31261.
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302827 91177308-0d34-0410-b5e6-96231b3b80d8
The erase/remove from parent methods now use a switch table to remove
themselves from their appropriate parent ilist.
The copyAttributesFrom method is now completely non-virtual, since we
only ever copy attributes from a global of the appropriate type.
Pre-requisite to de-virtualizing Value to save a vptr
(https://reviews.llvm.org/D31261).
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302823 91177308-0d34-0410-b5e6-96231b3b80d8
Updates the MSP430 target to generate EABI-compatible libcall names.
As a byproduct, adjusts the hardware multiplier options available in
the MSP430 target, adds support for promotion of the ISD::MUL operation
for 8-bit integers, and correctly marks R11 as used by call instructions.
Patch by Andrew Wygle.
Differential Revision: https://reviews.llvm.org/D32676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302820 91177308-0d34-0410-b5e6-96231b3b80d8
The testcase in PR32984 shows a non linear compile time increase
after a change that made the LoopUnroll pass more aggressive
(increasing the threshold).
My profiling shows all the time of PHI elimination goes to
llvm::LiveVariables::addNewBlock. This is because we keep
Defs/Kills registers in a SmallSet and vfind(const T &V); is O(N).
Switching to a DenseSet reduces the time spent in the pass from
297 seconds to 97 seconds. Profiling still shows a lot of time is
spent iterating the data structure, so I guess there's room for
improvement.
Dan tells me GCC uses real set operations for live registers and
it takes no-time on this testcase. Matthias points out we might
want to switch all this to LiveIntervalAnalysis so it's not entirely
sure if a rewrite is worth it.
Differential Revision: https://reviews.llvm.org/D33088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302819 91177308-0d34-0410-b5e6-96231b3b80d8
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.
Additionally actually using it requires changing the output register
class, which wasn't done anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302813 91177308-0d34-0410-b5e6-96231b3b80d8