Summary:
When reading the metadata bitcode, create a type declaration when
possible for composite types when we are importing. Doing this in
the bitcode reader saves memory. Also it works naturally in the case
when the type ODR map contains a definition for the same composite type
because it was used in the importing module (buildODRType will
automatically use the existing definition and not create a type
declaration).
For Chromium built with -g2, this reduces the aggregate size of the
generated native object files by 66% (from 31G to 10G). It reduced
the time through the ThinLTO link and backend phases by about 20% on
my machine.
Reviewers: mehdi_amini, dblaikie, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27775
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289993 91177308-0d34-0410-b5e6-96231b3b80d8
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly :
Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.
Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl
Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22696
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289988 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 289920 (again).
I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
has not DIExpression. Unfortunately it is not possible to safely upgrade
these variables without adding a flag to the bitcode record indicating which
version they are.
My plan of record is to roll the planned follow-up patch that adds a
unit: field to DIGlobalVariable into this patch before recomitting.
This way we only need one Bitcode upgrade for both changes (with a
version flag in the bitcode record to safely distinguish the record
formats).
Sorry for the churn!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289982 91177308-0d34-0410-b5e6-96231b3b80d8
(some other implementations of an optional pretty printer print the full
name of the optional type (including template parameter) - but seems if
the template parameter isn't printed for std::vector, not sure why it
would be printed for optional, so erring on the side of consistency in
that direction here - compact, etc, as well)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289976 91177308-0d34-0410-b5e6-96231b3b80d8
This patch reapplies r289863. The original patch was reverted because it
exposed a bug causing the loop vectorizer to crash in the Python runtime on
PPC. The underlying issue was fixed with r289958.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289975 91177308-0d34-0410-b5e6-96231b3b80d8
`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The
only reason it worked at all is that `getMetadataID` returns something
unrelated -- it returns the subclass ID of the receiver (which is used
in `dyn_cast` etc.). That does not numerically match
`LLVMContext::MD_invariant_group` and ends up dropping `invariant_group`
along with every other metadata that does not numerically match
`LLVMContext::MD_invariant_group`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289973 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.
Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).
Recommiting with fix to avoid forming vld1dup if the type of the load
doesn't match the type of the vdup (see
https://llvm.org/bugs/show_bug.cgi?id=31404).
Differential Revision: https://reviews.llvm.org/D27694
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289972 91177308-0d34-0410-b5e6-96231b3b80d8
Reverting because this breaks lld's gdb_index support - it's probably
double counting the abbrev relocation offset.
This reverts commit r289954.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289961 91177308-0d34-0410-b5e6-96231b3b80d8
After r288909, instructions feeding predicated instructions may be scalarized
if profitable. Since these instructions will remain scalar, we shouldn't
attempt to type-shrink them. We should only truncate vector types to their
minimal bit widths. This bug was exposed by enabling the vectorization of loops
containing conditional stores by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289958 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: ThinLTO needs to invoke SampleProfileLoader pass during link time in order to annotate profile correctly after module importing.
Reviewers: davidxl, mehdi_amini, tejohnson
Subscribers: pcc, davide, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D27790
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289957 91177308-0d34-0410-b5e6-96231b3b80d8
atomic_load_add returns the value before addition, but sets EFLAGS based on the
result of the addition. That means it's setting the flags based on effectively
subtracting C from the value at x, which is also what the outer cmp does.
This targets a pattern that occurs frequently with reference counting pointers:
void decrement(long volatile *ptr) {
if (_InterlockedDecrement(ptr) == 0)
release();
}
Clang would previously compile it (for 32-bit at -Os) as:
00000000 <?decrement@@YAXPCJ@Z>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 31 c9 xor %ecx,%ecx
6: 49 dec %ecx
7: f0 0f c1 08 lock xadd %ecx,(%eax)
b: 83 f9 01 cmp $0x1,%ecx
e: 0f 84 00 00 00 00 je 14 <?decrement@@YAXPCJ@Z+0x14>
14: c3 ret
and with this patch it becomes:
00000000 <?decrement@@YAXPCJ@Z>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: f0 ff 08 lock decl (%eax)
7: 0f 84 00 00 00 00 je d <?decrement@@YAXPCJ@Z+0xd>
d: c3 ret
(Equivalent variants with _InterlockedExchangeAdd, std::atomic<>'s fetch_add
or pre-decrement operator generate the same code.)
Differential Revision: https://reviews.llvm.org/D27781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289955 91177308-0d34-0410-b5e6-96231b3b80d8
Input can be produced by ld -r, for example (a normal LLVM workflow
never hits this - LLVM only ever produces a single abbrev table in an
object (shared by multiple CUs), so the reloc's always 0, and when it's
linked together the relocation's resolved so it doesn't need to be
handled)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289954 91177308-0d34-0410-b5e6-96231b3b80d8
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block:
Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.
Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl
Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22696
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289951 91177308-0d34-0410-b5e6-96231b3b80d8
It currently is in an unnamed namespace and then it shouldn't be used
from something in the header file. This actually triggers a warning with
GCC:
../include/llvm/IR/Verifier.h:39:7: warning: ‘llvm::TBAAVerifier’ has a field ‘llvm::TBAAVerifier::Diagnostic’ whose type uses the anonymous namespace [enabled by default]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289942 91177308-0d34-0410-b5e6-96231b3b80d8
Add the minimal support necessary to select a function that returns the sum of
two i32 values.
This includes some support for argument/return lowering of i32 values through
registers, as well as the handling of copy and add instructions throughout the
GlobalISel pipeline.
Differential Revision: https://reviews.llvm.org/D26677
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289940 91177308-0d34-0410-b5e6-96231b3b80d8
stores by default
This uncovers a crasher in the loop vectorizer on PPC when building the
Python runtime. I'll send the testcase to the review thread for the
original commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289934 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This commits moves skipDebugInstructionsForward and
skipDebugInstructionsBackward from lib/CodeGen/IfConversion.cpp
to include/llvm/CodeGen/MachineBasicBlock.h and updates
some codgen files to use them.
This refactoring was suggested in https://reviews.llvm.org/D27688
and I thought it's best to do the refactoring in a separate
review, but I could also put both changes in a single review
if that's preferred.
Also, the names for the functions aren't the snappiest and
I would be happy to rename them if anybody has suggestions.
Reviewers: eli.friedman, iteratee, aprantl, MatzeB
Subscribers: MatzeB, llvm-commits
Differential Revision: https://reviews.llvm.org/D27782
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289933 91177308-0d34-0410-b5e6-96231b3b80d8
Add two public methods to ARMTargetLowering: CCAssignFnForCall and
CCAssignFnForReturn, which are just calling the already existing private method
CCAssignFnForNode. These will come in handy for GlobalISel on ARM.
We also replace all calls to CCAssignFnForNode in ARMISelLowering.cpp, because
the new methods are friendlier to the reader.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289932 91177308-0d34-0410-b5e6-96231b3b80d8
This patch appears to result in trampolines in vtables being miscompiled
when they in turn tail call a method.
I've posted some preliminary details about the failure on the thread for
this commit and talked to Hal. He was comfortable going ahead and
reverting until we sort out what is wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289928 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to be used (in a later patch) by the BitcodeReader
to detect invalid TBAA and drop them when loading bitcode, so that
we don't break client that have legacy bitcode with possible invalid
TBAA.
Differential Revision: https://reviews.llvm.org/D27838
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289927 91177308-0d34-0410-b5e6-96231b3b80d8
One more attempt to re-commit the patch r285355, which I had to revert in r285362, because some tests were failing (the reason is because the size of the line_table varied depending on the full file name).
In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler.
This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted.
Differential Revision: https://reviews.llvm.org/D16697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289925 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
This reapplies r289902 with additional testcase upgrades.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289920 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of checking whether a global referenced by a function being
imported is defined in the same module, speculatively always add the
referenced globals to the module's export list. After all imports are
computed, for each module prune any not in its defined set from its
export list.
For a huge C++ app with aggressive importing thresholds, even with
D27687 we spent a lot of time invoking modulePath() from
exportGlobalInModule (modulePath() was still the 2nd hottest routine in
profile). The reason is that with comdat/linkonce the summary lists for
each GUID can be long. For the app in question, for example, we were
invoking exportGlobalInModule almost 2 million times, and we traversed
an average of 63 entries in the summary list each time.
This patch reduced the thin link time for the app by about 10% (on top
of D27687) when using aggressive importing thresholds, and about 3.5% on
average with default importing thresholds.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27755
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289918 91177308-0d34-0410-b5e6-96231b3b80d8