When llvm declarations have argument names, it's helpful to actually
print those names when debugging. Arguably, it'd be nice to print them
all the time, but that would mean the IR we output wouldn't round trip
through bitcode, which doesn't store the names.
Make the varous print() methods in AsmWriter optionally print "for
debug" and set that flag in the dump() methods. The only thing this
does differently for now is print the argument names in declarations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248692 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.
Reviewers: andrew.w.kaylor, majnemer, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248677 91177308-0d34-0410-b5e6-96231b3b80d8
llvm::format compiles down to snprintf which has no defined rounding for
floating point arguments, and MSVC has implemented it differently from
what the BSD libcs and glibc do. Try to emulate the glibc rounding
behavior to avoid changing tests.
While there simplify code a bit and move trivial methods inline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248665 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
The original checkin, r248606 had to be backed out due to an issue with
a ObjCXX unit test. That issue is now fixed, so re-landing.
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248637 91177308-0d34-0410-b5e6-96231b3b80d8
I realized that the live-out set computed for the return block is
missing the callee saved registers (the non-pristine ones to be exact).
This only affects the liveness computed for instructions inside the
function epilogue which currently none of the LivePhysRegs users in llvm
cares about, so this is just a drive-by fix without a testcase.
Differential Revision: http://reviews.llvm.org/D13180
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248636 91177308-0d34-0410-b5e6-96231b3b80d8
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:
Pros:
1. It uses only a half space of the current one.
2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.
Cons:
1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.
One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.
Differential revision: http://reviews.llvm.org/D12603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248633 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The default behavior is to omit the .section directive for .text, .data,
and sometimes .bss, but some targets may want to omit this directive for
other sections too.
The AMDGPU backend will uses this to emit a simplified syntax for section
switches. For example if the section directive is not omitted (current
behavior), section switches to .hsatext will be printed like this:
.section .hsatext,#alloc,#execinstr,#write
This is actually wrong, because .hsatext has some custom STT_* flags,
which MC doesn't know how to print or parse.
If the section directive is omitted (made possible by this commit),
section switches will be printed like this:
.hsatext
The motivation for this patch is to make it possible to emit sections
with custom STT_* flags without having to teach MC about all the target
specific STT_* flags.
Reviewers: rafael, grosbach
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248618 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248606 91177308-0d34-0410-b5e6-96231b3b80d8
Arguments to function calls marked "nocapture" can be marked as
non-escaping. However, nocapture is defined in terms of the lifetime
of the callee, and if the callee can directly or indirectly recurse to
the caller, the semantics of nocapture are invalid.
Therefore, we eagerly discover which SCC each function belongs to,
and later can check if callee and caller of a callsite belong to
the same SCC, in which case there could be recursion.
This means that we can't be so optimistic in
getModRefInfo(ImmutableCallsite) - previously we assumed all call
arguments never aliased with an escaping global. Now we need to check,
because a global could now be passed as an argument but still not
escape.
This also solves a related conformance problem: MemCpyOptimizer can
turn non-escaping stores of globals into calls to intrinsics like
llvm.memcpy/llvm/memset. This confuses GlobalsAA, which knows the
global can't escape and so returns NoModRef when queried, when
obviously a memcpy/memset call does indeed reference and modify its
arguments.
This fixes PR24800, PR24801, and PR24802.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248576 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This also adds the first set of tests for operand bundles.
The optimizer has not been audited to ensure that it does the right
thing with operand bundles.
Depends on D12456.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: maksfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D12457
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248551 91177308-0d34-0410-b5e6-96231b3b80d8
The doesn't seem to be a difference and ELFOSABI_NONE seems to be far more
common:
* Linux doesn't care when loading and puts ELFOSABI_NONE on core dumps.
* Gold and bfd ld produce files with ELFOSABI_NONE.
* Gold and bfd ld seems to ignore EI_OSABI other than for freebsd.
* Gas puts ELFOSABI_NONE in most .o files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248534 91177308-0d34-0410-b5e6-96231b3b80d8
These are necessary for implementing mem_fence for
OpenCL 2.0.
The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches `CallInst`s and `InvokeInst`s to maintain a set of
operand bundles as part of its operands. `CallInst`s and `InvokeInst`s
with operand bundles co-allocate some space before their `Use` array to
hold meta information about which of its operands are part of an operand
bundle.
The strings corresponding to the bundle tags are interned into
`LLVMContextImpl::BundleTagCache`
This change does not include any parsing / bitcode support. That's the
next change.
Depends on D12455.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: MatzeB, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248527 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.
This patch changes the handling of +t2dsp to be in line with other
architecture extensions.
Following a revert of r248152 and new review comments, this patch also includes
renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc.
The spelling of "t2dsp" is preserved, pending a further investigation of its
possible external usage.
Differential Revision: http://reviews.llvm.org/D12937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248519 91177308-0d34-0410-b5e6-96231b3b80d8
Allow a target to do something other than search for copies
that will avoid cross register bank copies.
Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.
I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid kinds
of subregister copies on some targets.
I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.
The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248478 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the header more readable and cleans up some unnecessary
header differences between NDEBUG and !NDEBUG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248462 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
With this change, subclasses of `llvm::User` will be able to co-allocate
a variable number of bytes (called a "descriptor") with the `llvm::User`
instance. The co-allocated descriptor can later be accessed using
`llvm::User::getDescriptor`. This will be used in later changes to
implement operand bundles.
This change steals one bit from `NumUserOperands`, but given that it is
still 28 bits wide I don't think this will be a practical issue.
This change does not allow allocating hung off uses with descriptors.
This only for simplicity, not for any fundamental reason; and we can
easily add this functionality later if needed.
Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet
Subscribers: pete, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12455
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248453 91177308-0d34-0410-b5e6-96231b3b80d8
Because the current proposal does not include that member function,
and we are trying to keep in line with that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248451 91177308-0d34-0410-b5e6-96231b3b80d8
...because that's what the cost model was intended to do.
As discussed in D12882, this fix has a temporary unintended consequence for
SimplifyCFG: it causes us to not speculate an fdiv. However, two wrongs make
PR24818 right, and two wrongs make PR24343 act right even though it's really
still wrong.
I intend to correct SimplifyCFG and add to CodeGenPrepare to account for this
cost model change and preserve the righteousness for the bug report cases.
https://llvm.org/bugs/show_bug.cgi?id=24818https://llvm.org/bugs/show_bug.cgi?id=24343
Differential Revision: http://reviews.llvm.org/D12882
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248439 91177308-0d34-0410-b5e6-96231b3b80d8
Note: I'm am not trying to describe what "should be"; I'm only describing what is true today.
This came out of my recent question to llvm-dev titled: When can the dominator tree not contain a node for a basic block?
Differential Revision: http://reviews.llvm.org/D13078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248417 91177308-0d34-0410-b5e6-96231b3b80d8
Add two new ways of accessing the unsafe stack pointer:
* At a fixed offset from the thread TLS base. This is very similar to
StackProtector cookies, but we plan to extend it to other backends
(ARM in particular) soon. Bionic-side implementation here:
https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
not emutls).
This is a re-commit of a change in r248357 that was reverted in
r248358.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248405 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.
I've refactored the call sites I could find easily to use getZero /
getOne.
Reviewers: hfinkel, majnemer, reames
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12947
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248362 91177308-0d34-0410-b5e6-96231b3b80d8
Add two new ways of accessing the unsafe stack pointer:
* At a fixed offset from the thread TLS base. This is very similar to
StackProtector cookies, but we plan to extend it to other backends
(ARM in particular) soon. Bionic-side implementation here:
https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
not emutls).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248357 91177308-0d34-0410-b5e6-96231b3b80d8
In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.
Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.
Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".
Differential Revision: http://reviews.llvm.org/D13033
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248291 91177308-0d34-0410-b5e6-96231b3b80d8
Because mod is always exact, this function should have never taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248195 91177308-0d34-0410-b5e6-96231b3b80d8
Based on conversations with Justin and a few others, these constructors
are really useful to have in the executable so that you can call them
from the debugger. After some measurements, these *particular* calls
aren't so problematic as to make them a good tradeoff for always inline.
Please let me know if there are other functions really needed for
debugging. The always inline attribute is a hack that we should only
really employ when it doesn't hurt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248188 91177308-0d34-0410-b5e6-96231b3b80d8
The definition of the DivergenceAnalysis pass was in a CPP
file and wasn't accessible to users of the analysis to get it
through "getAnalysis<>()".
This patch extracts the definition into a separate header that
can be used by users of the analysis to fetch the results.
Patch by Volkan Keles (vkeles@apple.com)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248186 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.
This patch changes the handling of +t2dsp to be in line with other
architecture extensions.
Following review comments, also updating the description of FeatureDSPThumb2
in ARM.td.
Differential Revision: http://reviews.llvm.org/D12937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248152 91177308-0d34-0410-b5e6-96231b3b80d8