This support is _very_ rudimentary, just enough to get some basic data
into the CodeView debug section.
Left to do is:
- Use the combined opcodes to save space.
- Do something about code offsets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259230 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both
This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16333
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259217 91177308-0d34-0410-b5e6-96231b3b80d8
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).
AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.
An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.
RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).
Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.
For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).
There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.
http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259201 91177308-0d34-0410-b5e6-96231b3b80d8
Also removed a few redundant `else`s.
Bug was found by a test I wrote for MemorySSA (in review at
http://reviews.llvm.org/D7864; shiny update coming soon). So, assuming
that lands at some point, this should be covered by that. If anyone
feels this deserves its own explicit test case, please let me know.
I'll write one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259179 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enables llvm-bcanalyzer to print the bitcode wrapper header
if the file has one, which is needed to test the changes made in
r258627 (bitcode-wrapper-header-armv7m.ll is the test case for r258627).
Differential Revision: http://reviews.llvm.org/D16642
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259162 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r259117.
The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259126 91177308-0d34-0410-b5e6-96231b3b80d8
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:
- .cv_file: Similar to DWARF .file directives
- .cv_loc: Similar to the DWARF .loc directive, but starts with a
function id. CodeView line tables are emitted by function instead of
by compilation unit, so we needed an extra field to communicate this.
Rather than overloading the .loc direction further, we decided it was
better to have our own directive.
- .cv_stringtable: Emits the codeview string table at the current
position. Currently this just contains the filenames as
null-terminated strings.
- .cv_filechecksums: Emits the file checksum table for all files used
with .cv_file so far. There is currently no support for emitting
actual checksums, just filenames.
This moves the line table emission code down into the assembler. This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.
David Majnemer collaborated on this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259117 91177308-0d34-0410-b5e6-96231b3b80d8
Re-commit of r258951 after fixing layering violation.
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259035 91177308-0d34-0410-b5e6-96231b3b80d8
This patch revamps the RegStackifier pass with a new tree traversal mechanism,
enabling three major new features:
- Stackification of values with multiple uses, using the result value of set_local
- More aggressive stackification of instructions with side effects
- Reordering operands in commutative instructions to enable more stackification.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259009 91177308-0d34-0410-b5e6-96231b3b80d8
Various bits we want to use the new ABI actually compile with "-arch armv7k
-miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how
slices work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258975 91177308-0d34-0410-b5e6-96231b3b80d8
ObjC ARC Optimizer.
The main implication of this is:
1. Ensuring that we treat it conservatively in terms of optimization.
2. We put the ASM marker on it so that the runtime can recognize
objc_unsafeClaimAutoreleasedReturnValue from releaseRV.
<rdar://problem/21567064>
Patch by Michael Gottesman!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258970 91177308-0d34-0410-b5e6-96231b3b80d8
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.
There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.
The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.
Differential Revision: http://reviews.llvm.org/D16590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258951 91177308-0d34-0410-b5e6-96231b3b80d8
Most of the time we only hit the small case, so it is beneficial to pull
it out of the insert_imp() implementation. This improves compile time
at least for non-LTO builds.
Differential Revision: http://reviews.llvm.org/D16619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258908 91177308-0d34-0410-b5e6-96231b3b80d8
This brings the compile time of Function.cpp from ~40s down to ~4s for
me locally. It also shaves off about 400KB of object file size in a
release+asserts build.
I also realized that the AMDGPU backend does not have any GCC builtin
names to match, so the extra lookup was a no-op. I removed it to silence
a zero-length string table array warning. There should be no functional
change here.
This change really ends the story of PR11951.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258897 91177308-0d34-0410-b5e6-96231b3b80d8
I tried to make the AMDGPU intrinsic info table use this instead of
another StringMatcher, and some issues arose.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258871 91177308-0d34-0410-b5e6-96231b3b80d8
Originally this change was causing failures on windows buildbots.
But those problems were fixed in r258806.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258811 91177308-0d34-0410-b5e6-96231b3b80d8
More cleanup to try to get all intrinsics using the correct
amdgcn prefix that are as close to the instruction as possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258786 91177308-0d34-0410-b5e6-96231b3b80d8
This is a recommit of r258620 which causes PR26293.
The original message:
Now LIR can turn following codes into memset:
typedef struct foo {
int a;
int b;
} foo_t;
void bar(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; ++i) {
f[i].a = 0;
f[i].b = 0;
}
}
void test(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; i += 2) {
f[i] = 0;
f[i+1] = 0;
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258777 91177308-0d34-0410-b5e6-96231b3b80d8