This patch adds simplified support for tail calls on ARM with XRay instrumentation.
Known issue: compiled with generic flags: `-O3 -g -fxray-instrument -Wall
-std=c++14 -ffunction-sections -fdata-sections` (this list doesn't include my
specific flags like --target=armv7-linux-gnueabihf etc.), the following program
#include <cstdio>
#include <cassert>
#include <xray/xray_interface.h>
[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() {
std::printf("In fC()\n");
}
[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() {
std::printf("In fB()\n");
fC();
}
[[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() {
std::printf("In fA()\n");
fB();
}
// Avoid infinite recursion in case the logging function is instrumented (so calls logging
// function again).
[[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret)
{
printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret));
}
int main(int argc, char* argv[]) {
__xray_set_handler(simplyPrint);
printf("Patching...\n");
__xray_patch();
fA();
printf("Unpatching...\n");
__xray_unpatch();
fA();
return 0;
}
gives the following output:
Patching...
XRay: functionId=3 type=0.
In fA()
XRay: functionId=3 type=1.
XRay: functionId=2 type=0.
In fB()
XRay: functionId=2 type=1.
XRay: functionId=1 type=0.
XRay: functionId=1 type=1.
In fC()
Unpatching...
In fA()
In fB()
In fC()
So for function fC() the exit sled seems to be called too much before function
exit: before printing In fC().
Debugging shows that the above happens because printf from fC is also called as
a tail call. So first the exit sled of fC is executed, and only then printf is
jumped into. So it seems we can't do anything about this with the current
approach (i.e. within the simplification described in
https://reviews.llvm.org/D23988 ).
Differential Revision: https://reviews.llvm.org/D25030
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284456 91177308-0d34-0410-b5e6-96231b3b80d8
This is harder to do for vpermilpd as shuffle combining turns the constant vector into an immediate since all vpermilpd's inputs with constant vector can also be encoded with the immediate form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284455 91177308-0d34-0410-b5e6-96231b3b80d8
When Error was threaded through these APIs back in r265606 the
"return" was missed here, which triggers a warning if/when I add
LLVM_NODISCARD to the Error type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284454 91177308-0d34-0410-b5e6-96231b3b80d8
-debug-only=subtarget-emitter prints a lot of machine model diagnostics.
This prunes the output so that the "No machine model for XXX on processor YYY"
only appears when there is definitely no machine model for that opcode.
Previously it was printing that error even if the opcode was covered by
a more general scheduling class.
<rdar://problem/15919845> [TableGen][CodeGenSchedule] Debug output does not help spotting the missing scheduling classes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284452 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This is especially important for 32-bit targets with 64-bit shuffle elements.This is similar to how PSHUFB and VPERMIL handle the same problem.
Reviewers: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25666
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284451 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are differences in codegen between Linux and Windows due to:
1. Using std::sort which uses quicksort which is a non-stable sort.
2. Iterating over Set data structure where the iteration order is
non deterministic.
Reviewers: arsenm, grosbach, junbuml, zinob, MatzeB
Subscribers: MatzeB, wdng
Differential Revision: https://reviews.llvm.org/D25695
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284441 91177308-0d34-0410-b5e6-96231b3b80d8
This resubmits commits 284425 and r284428, which were reverted
in r284429 due to some infinite recursion caused by an incorrect
selection of function overloads. Reproduced the failure on Linux
using GCC 4.8.4, and confirmed that with the new patch the tests
path on GCC as well as MSVC. So hopefully this fixes everything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284436 91177308-0d34-0410-b5e6-96231b3b80d8
load commands that use the MachO::sub_framework_command,
MachO::sub_umbrella_command, MachO::sub_library_command
and MachO::sub_client_command types but are not used in llvm
libObject code but used in llvm tool code.
This includes the LC_SUB_FRAMEWORK, LC_SUB_UMBRELLA,
LC_SUB_LIBRARY and LC_SUB_CLIENT load commands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284431 91177308-0d34-0410-b5e6-96231b3b80d8
raw_ostream has not afforded a lot of flexibility in terms of
how to format numbers when outputting. Wrap this all up into
a set of low level helper functions that can be used to output
numbers with arbitrary precision, alignment, format, etc and
then update raw_ostream to use these functions.
This will be useful for upcoming improvements to llvm's string
formatting libraries, but are still useful independently.
Differential Revision: https://reviews.llvm.org/D25497
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284425 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in:
https://reviews.llvm.org/D25685
This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch.
In a minor attempt to keep some structure, we're pulling the FP helper over along with its
integer sibling, but clearly we can and should do more refactoring of the similar helper
functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284421 91177308-0d34-0410-b5e6-96231b3b80d8
If -coverage is passed, but -g is not, clang populates the PassManager
pipeline with StripSymbols(debugOnly = true).
The stripSymbol pass therefore scans the list of named metadata,
drops !llvm.dbg.cu, but leaves !llvm.gcov and !0 (the compileUnit MD)
around. The verifier runs, and finds out that there's a CU not listed
in !llvm.dbg.cu (as it was previously dropped) -> crash.
When we strip debug info, so, check if there's coverage data,
and strip it as well, in order to avoid pending metadata left around.
Differential Revision: https://reviews.llvm.org/D25689
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284418 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Debug info should *not* affect code generation. This patch properly handles debug info to make sure the generated code are the same with or without debug info.
Reviewers: davidxl, mzolotukhin, jmolloy
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D25286
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284415 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds the necessary logic to support relocations to thumb functions in the COFF dynamic linker.
The jumps to function addresses are mostly blx, which requires the ISA selection bit when jumping to a thumb function.
Note: I'm determining if the relocation requires the ISA bit when creating the relocation entries and not when resolving the relocation. I have to do that because I need the ObjectFile and the actual Symbol, which are available only when creating the entries. It would require a gross refactor if I do it otherwise, but I'm okay with doing it if you think it's better.
Reviewers: peter.smith, compnerd
Subscribers: rengolin, sas
Differential Revision: https://reviews.llvm.org/D25151
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284410 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If we are loading an i16 value from a 32-bit memory location, then
we need to be able to truncate the loaded value to i16.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D25198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284397 91177308-0d34-0410-b5e6-96231b3b80d8
Based on post-commit review for D25585/r284180, rename
hardware_physical_concurrency to heavyweight_hardware_concurrency,
to better reflect what type of tasks it should be used for and
to enable other systems to map this to something other than the
number of physical cores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284390 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::getConstantPool will automatically determine an appropriate alignment if one is not specified. It does this by querying the type's preferred alignment. This can end up creating quite a lot of padding when the preferred alignment for vectors is 128.
In optimize-for-size mode, it makes sense to instead query the ABI type alignment which is often smaller and causes less padding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284381 91177308-0d34-0410-b5e6-96231b3b80d8
Not all ConstantExprs can be represented by a global variable, for example most
pointer arithmetic other than addition of a constant, so we can't convert these
values from switch statements to lookup tables.
Differential Revision: https://reviews.llvm.org/D25550
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284379 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor.
Reviewers: sanjoy, grosser
Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser
Differential Revision: https://reviews.llvm.org/D16492
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284378 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGenPrepare knows how to move a zext of a load into the same basic block
where the load lives. The goal is to help ISel match a zero-extending load
instead of two separated instructions.
CGP attempts to move a zext computation even if it lives in a basic block that
does not post-dominate the load's basic block. That means, the hoisted zext may
be speculated. Preserving the zext location would hurt the debugging experience
and the quality of sample pgo.
With this patch, when moving a zext near to its associated load, CGP no longer
propagates the zext's debug location. Instead, CGP conservatively reuses the
same debug location for the load and the zext.
An alternative approach would be to assign an artificial line-0 location to the
zext. However we don't want to over-use the 'line-0' for this particular case
because it would have a size cost in the line-table section for no additional
benefit.
Differential Revision: https://reviews.llvm.org/D25611
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284377 91177308-0d34-0410-b5e6-96231b3b80d8
With fix: hex edited the precompiled inputs from another testcases to pass new checks.
Original commit message:
[Object/ELF] - Check that e_shnum is null when e_shoff is.
Spec says (http://www.sco.com/developers/gabi/1998-04-29/ch4.eheader.html) :
e_shnum
This member holds the number of entries in the section header table. Thus the product of e_shentsize and e_shnum gives the section header table's size in bytes. If a file has no section header table, e_shnum holds the value zero.
Revealed using "id_000037,sig_11,src_000015,op_havoc,rep_8" from PR30540
That was the reason of crash in lld on incorrect input file.
Binary reduced using afl-min.
Differential revision: https://reviews.llvm.org/D25090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284374 91177308-0d34-0410-b5e6-96231b3b80d8
Spec says (http://www.sco.com/developers/gabi/1998-04-29/ch4.eheader.html) :
e_shnum
This member holds the number of entries in the section header table. Thus the product of e_shentsize and e_shnum gives the section header table's size in bytes. If a file has no section header table, e_shnum holds the value zero.
Revealed using "id_000037,sig_11,src_000015,op_havoc,rep_8" from PR30540
That was the reason of crash in lld on incorrect input file.
Binary reduced using afl-min.
Differential revision: https://reviews.llvm.org/D25090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284371 91177308-0d34-0410-b5e6-96231b3b80d8
If object has wrong (large) string table index and
also incorrect large value for amount of sections in total,
then section index passes the check:
if (Index >= getNumSections())
return object_error::invalid_section_index;
But result pointer then is far after end of file data, what
result in a crash.
Differential revision: https://reviews.llvm.org/D25081
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284369 91177308-0d34-0410-b5e6-96231b3b80d8
Uses of this have all been updated to use LLVM_NODISCARD, which
matches the C++17 [[nodiscard]] semantics rather than those of GCC's
__attribute__((warn_unused_result)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284367 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of annotating (most of) the StringRef API, we can just
annotate the type directly. This is less code and it will warn in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284364 91177308-0d34-0410-b5e6-96231b3b80d8