Discovered during attributor testing comparing stats with
and without the attributor. Willreturn should not be inferred
for nonexact definitions.
Differential Revision: https://reviews.llvm.org/D100988
FunctionAnalysisManagerCGSCCProxy should not be preserved if any of its
keys may be invalid. Since we are not removing/adding functions in
FuncAttrs, it's fine to preserve it.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D100893
This is mostly stylistic cleanup after D100226, but not entirely. When skimming the code, I found one case where we weren't accounting for attributes on the callsite at all. I'm also suspicious we had some latent bugs related to operand bundles (which are supposed to be able to *override* attributes on declarations), but I don't have concrete test cases for those, just suspicions.
Aside: The only case left in the file which directly checks attributes on the declaration is the norecurse logic. I left that because I didn't understand it; it looks obviously wrong, so I suspect I'm misinterpreting the intended semantics of the attribute.
Differential Revision: https://reviews.llvm.org/D100689
Have funcattrs expand all implied attributes into the IR. This expands the infrastructure from D100400, but for definitions not declarations this time.
Somewhat subtly, this mostly isn't semantic. Because the accessors did the inference, any client which used the accessor was already getting the stronger result. Clients that directly checked presence of attributes (there are some), will see a stronger result now.
The old behavior can end up quite confusing for two reasons:
* Without this change, we have situations where function-attrs appears to fail when inferring an attribute (as seen by a human reading IR), but consuming code will see that it should have been implied. As a human trying to sanity check test results and study IR for optimization possibilities, this is exceedingly error-prone and confusing. (I'll note that I wasted several hours recently because of this.)
* We can have transforms which trigger without the IR appearing (on inspection) to meet the preconditions. This change doesn't prevent this from happening (as the accessors still involve multiple checks), but it should make it less frequent.
I'd argue in favor of deleting the extra checks out of the accessors after this lands, but I want that in its own review as a) it's purely stylistic, and b) I already know there's some disagreement.
Once this lands, I'm also going to do a cleanup change which will delete some now redundant duplicate predicates in the inference code, but again, that deserves to be a change of its own.
Differential Revision: https://reviews.llvm.org/D100226
Pretty straightforward use of existing infrastructure and port of the attributor inference rules for nosync.
A couple points of interest:
* I deliberately switched from "monotonic or better" to "unordered or better". This is simply me being conservative and is better in line with the rest of the optimizer. We treat monotonic conservatively pretty much everywhere.
* The operand bundle test change is suspicious. It looks like we might have missed something here, but if so, it's an issue with the existing nofree inference as well. I'm going to take a closer look at that separately.
* I needed to keep the previous inference from readnone. This surprised me, but made sense once I realized readonly inference goes to lengths to reason about local vs non-local memory and that writes to local memory are okay. This is fine for the purpose of nosync, but would e.g. prevent us from inferring nofree from readnone - which is slightly surprising.
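A minimal sketch of one reading of the first bullet, assuming it means that only non-atomic and unordered loads and stores are treated as non-synchronizing, while monotonic and stronger accesses conservatively block nosync. The `Ordering`, `Kind`, `Access`, and `maySynchronize` names are hypothetical stand-ins, not the pass's actual code.

```cpp
#include <cassert>

// Hypothetical stand-ins for the atomic orderings, weakest first.
enum class Ordering {
  NotAtomic, Unordered, Monotonic, Acquire, Release,
  AcquireRelease, SequentiallyConsistent
};

// Hypothetical instruction kinds that can carry an ordering.
enum class Kind { Load, Store, Fence, CmpXchg, AtomicRMW };

struct Access {
  Kind kind;
  Ordering ordering;
  bool singleThreadScope; // only meaningful for fences
};

// Returns true if the access must be assumed to synchronize with
// another thread, which would block nosync inference.
bool maySynchronize(const Access &A) {
  switch (A.kind) {
  case Kind::Fence:
    // Fences are never weaker than acquire.
    assert(A.ordering >= Ordering::Acquire && "invalid fence ordering");
    return !A.singleThreadScope;
  case Kind::CmpXchg:
  case Kind::AtomicRMW:
    return true;
  case Kind::Load:
  case Kind::Store:
    // Conservative rule: anything stronger than unordered counts,
    // so monotonic accesses are treated as synchronizing.
    return A.ordering > Ordering::Unordered;
  }
  return true; // not reached for valid kinds; stay conservative
}
```

Under this reading, a monotonic load is enough to prevent nosync from being inferred, whereas a rule keyed on "stronger than monotonic" would have let it through.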
Differential Revision: https://reviews.llvm.org/D99769
This implements the most basic possible nosync inference. The choice of inference rule is taken from the comments in attributor and the discussion on the review of the change which introduced the nosync attribute (0626367202c).
This is deliberately minimal. As noted in code comments, I do plan to add a more robust inference which actually scans the function IR directly, but a) I need to do some refactoring of the attributor code to use common interfaces, and b) I wanted to get something in. I also wanted to minimize the "interesting" analysis discussion since that's time intensive.
Context: This combines with existing nofree attribute inference to help prove dereferenceability in the ongoing deref-at-point semantics work.
Differential Revision: https://reviews.llvm.org/D99749
This moves the willReturn() helper from CallBase to Instruction,
so that it can be used in a more generic manner. This will make
it easier to fix additional passes (ADCE and BDCE), and will give
us one place to change if additional instructions should become
non-willreturn (e.g. there has been talk about handling volatile
operations this way).
I have also included the IntrinsicInst workaround directly in
here, so that it gets applied consistently. (As such this change
is not entirely NFC -- FuncAttrs will now use this as well.)
Differential Revision: https://reviews.llvm.org/D96992
The IR/MIR pseudo probe intrinsics don't get materialized into real machine instructions and therefore they don't incur runtime cost directly. However, they come with indirect cost by blocking certain optimizations. Some of the blocking is intentional (such as blocking code merge) for better counts quality, while the rest is accidental. This change unblocks perf-critical optimizations that do not affect counts quality. They include:
1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e., opeq transform
4. MIR function argument copy elision
5. IR stack protection (not perf-critical, but nice to have).
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D95982
If a function doesn't contain loops and does not call non-willreturn
functions, then it is willreturn. Loops are detected by checking
for backedges in the function. We don't attempt to handle finite
loops at this point.
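The rule above can be sketched as follows. This is a toy model, not the pass itself: the CFG is assumed to be an adjacency list of block indices with block 0 as the entry, and the `CFG`, `hasBackedge`, and `canInferWillReturn` names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: block i lists the indices of its successor blocks.
// Block 0 is the entry block.
using CFG = std::vector<std::vector<std::size_t>>;

// Depth-first search that reports an edge into a block that is still
// on the current DFS path, i.e. a backedge.
static bool dfsFindsBackedge(const CFG &G, std::size_t B,
                             std::vector<int> &State) {
  State[B] = 1; // on the current DFS path
  for (std::size_t S : G[B]) {
    assert(S < G.size() && "successor index out of range");
    if (State[S] == 1)
      return true;
    if (State[S] == 0 && dfsFindsBackedge(G, S, State))
      return true;
  }
  State[B] = 2; // fully explored
  return false;
}

bool hasBackedge(const CFG &G) {
  if (G.empty())
    return false;
  std::vector<int> State(G.size(), 0); // 0 = unvisited
  return dfsFindsBackedge(G, 0, State);
}

// willreturn is inferred only for a loop-free body in which every
// callee is itself willreturn. Finite loops are not handled.
bool canInferWillReturn(const CFG &G,
                        const std::vector<bool> &CalleeWillReturn) {
  if (hasBackedge(G))
    return false;
  for (bool W : CalleeWillReturn)
    if (!W)
      return false;
  return true;
}
```

A diamond-shaped CFG has no backedge and so qualifies, while a block that branches to itself does not, even if the loop is obviously finite.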
Differential Revision: https://reviews.llvm.org/D94633
Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison.
To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this changes `nonnull` to accept poison, and to yield poison (rather than immediate UB) if the input pointer is null.
This makes many transformations like below legal:
```
%p = gep inbounds %x, 1 ; %p is a non-null pointer or poison
call void @f(%p) ; instcombine converts this to call void @f(nonnull %p)
```
In exchange, these semantics make propagating `nonnull` to the caller illegal.
The reason is that passing poison to `nonnull` no longer raises UB immediately, so such a program is still well defined if the callee does not use the argument.
Having the `noundef` attribute there makes the propagation legal again.
```
define void @f(i8* %p) { ; functionattr cannot mark %p nonnull here anymore
call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p.
ret void
}
```
Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90529
Similar to D94125, derive `willreturn` for functions that are `readonly` and
`mustprogress` in FunctionAttrs.
To quote the reasoning from D94125:
Since D86233 we have `mustprogress` which, in combination with
`readonly`, implies `willreturn`. The idea is that every side-effect
has to be modeled as a "write". Consequently, `readonly` means there
is no side-effect, and `mustprogress` guarantees that we cannot "loop"
forever without side-effect.
Reviewed By: jdoerfert, nikic
Differential Revision: https://reviews.llvm.org/D94502
A function is noreturn if all blocks terminating with a ReturnInst
contain a call to a noreturn function. Skip looking at naked functions
since there may be asm that returns.
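As a rough illustration of the rule, assuming a simplified per-block summary rather than real IR (the `ToyBlock`, `ToyFunction`, and `canInferNoReturn` names are hypothetical):

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-block summary.
struct ToyBlock {
  bool endsInReturn;    // the terminator is a return
  bool callsNoReturnFn; // the block contains a call to a noreturn function
};

struct ToyFunction {
  bool isNaked;
  std::vector<ToyBlock> blocks;
};

// A function is treated as noreturn if every returning block also
// contains a call to a noreturn function. Naked functions are skipped
// because their inline asm may return.
bool canInferNoReturn(const ToyFunction &F) {
  assert(!F.blocks.empty() && "a definition has at least an entry block");
  if (F.isNaked)
    return false;
  for (const ToyBlock &B : F.blocks)
    if (B.endsInReturn && !B.callsNoReturnFn)
      return false;
  return true;
}
```

Note that a function with no returning block at all satisfies the condition vacuously in this sketch.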
This can be further refined in the future by checking unreachable blocks
and taking into account recursion. It looks like the attributor pass
does this, but that is not yet enabled by default.
This seems to help with code size under the new PM since PruneEH does
not run under the new PM, missing opportunities to mark some functions
noreturn, which in turn doesn't allow simplifycfg to clean up dead code.
https://bugs.llvm.org/show_bug.cgi?id=46858.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D93946
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.
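To make the distinction concrete, here is a toy model of the two meanings, assuming a location is described only by a size kind and that we ask whether a byte at a signed offset from the base pointer may be accessed. `ToyLocationSize`, `SizeKind`, and `mayAccessOffset` are hypothetical names and do not mirror the real class layout.

```cpp
#include <cassert>
#include <cstdint>

enum class SizeKind { Precise, AfterPointer, BeforeOrAfterPointer };

struct ToyLocationSize {
  SizeKind kind;
  std::uint64_t bytes; // only meaningful for Precise
};

// May the byte at base + Off be part of the described location?
bool mayAccessOffset(const ToyLocationSize &S, std::int64_t Off) {
  switch (S.kind) {
  case SizeKind::Precise:
    return Off >= 0 && static_cast<std::uint64_t>(Off) < S.bytes;
  case SizeKind::AfterPointer:
    return Off >= 0; // unknown extent, but never before the base
  case SizeKind::BeforeOrAfterPointer:
    return true; // anywhere within the underlying object
  }
  return true; // stay conservative
}

// An access four bytes before the passed pointer is ruled out by one
// interpretation and allowed by the other.
void beforePointerExample() {
  ToyLocationSize after{SizeKind::AfterPointer, 0};
  ToyLocationSize both{SizeKind::BeforeOrAfterPointer, 0};
  assert(!mayAccessOffset(after, -4));
  assert(mayAccessOffset(both, -4));
}
```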
The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
The legacy pass didn't properly detect indirect calls.
We can still remove the convergent attribute when there are indirect
calls. The LangRef says:
> When it appears on a call/invoke, the convergent attribute indicates
that we should treat the call as though we’re calling a convergent
function. This is particularly useful on indirect calls; without this we
may treat such calls as though the target is non-convergent.
So don't skip handling of convergent when there are unknown calls.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89826
To match NewPM pass name, and also for readability.
Also rename rpo-functionattrs -> rpo-function-attrs while we're here.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D84694
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.
This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.
The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.
The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer to the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.
Force any function containing a preallocated call to use the frame
pointer.
Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.
Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).
Aside from the tests added here, I checked that this codegen produces
correct code for something like
```
struct A {
A();
A(A&&);
~A();
};
void bar() {
foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```
by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.
Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
causes lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles  touches  affected_files  header
    342380       95            3604  llvm/include/llvm/ADT/STLExtras.h
    314730      234            1345  llvm/include/llvm/InitializePasses.h
    307036      118            2602  llvm/include/llvm/ADT/APInt.h
    213049       59            3611  llvm/include/llvm/Support/MathExtras.h
    170422       47            3626  llvm/include/llvm/Support/Compiler.h
    162225       45            3605  llvm/include/llvm/ADT/Optional.h
    158319       63            2513  llvm/include/llvm/ADT/Triple.h
    140322       39            3598  llvm/include/llvm/ADT/StringRef.h
    137647       59            2333  llvm/include/llvm/Support/Error.h
    131619       73            1803  llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
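The sort key of the table is simply touches multiplied by affected files, which can be checked at compile time. The `recompileScore` helper is a hypothetical name, and the last figure assumes the same 234 touches applied to the new count of 550 dependent files; it is an extrapolation, not a measured result.

```cpp
#include <cstdint>

// Score used to sort the table: number of commits touching the header
// times the number of object files that depend on it.
constexpr std::uint64_t recompileScore(std::uint64_t touches,
                                       std::uint64_t affectedFiles) {
  return touches * affectedFiles;
}

// Rows taken from the table.
static_assert(recompileScore(95, 3604) == 342380, "STLExtras.h row");
static_assert(recompileScore(234, 1345) == 314730, "InitializePasses.h row");

// Hypothetical score if the same 234 touches had each hit 550 files.
static_assert(recompileScore(234, 550) == 128700, "extrapolated score");
```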
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Enable the flag introduced in rL294998. The security concerns are no longer valid: the function signatures for the mentioned libc functions have no nonnull attribute (Clang does not generate them? I see no nonnull attr in LLVM IR for these functions), and since rL372091 we carefully annotate the callsites where we know that the size is static and non-zero. So let's enable this flag again.
llvm-svn: 372573
adding new read attribute to an argument
Summary: Update optimization pass to prevent adding read-attribute to an
argument without removing its conflicting attribute.
A read attribute, based on the result of the attribute deduction
process, might be added to an argument. The attribute might be in
conflict with another read/write attribute currently associated with the
argument. To ensure the compatibility of attributes, any conflicting
attribute must be removed before a new one is added.
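The remove-before-add step can be sketched like this, assuming attributes are modelled as a plain set. The `Attr` enum and `addReadAttr` helper are hypothetical and only illustrate the ordering of the two operations.

```cpp
#include <cassert>
#include <set>

// Hypothetical attribute kinds; only the first three conflict with
// each other.
enum class Attr { ReadNone, ReadOnly, WriteOnly, NoCapture };

// Attach a newly deduced read attribute, dropping any memory-access
// attribute it would be incompatible with first.
void addReadAttr(std::set<Attr> &Attrs, Attr Deduced) {
  assert((Deduced == Attr::ReadNone || Deduced == Attr::ReadOnly) &&
         "expected a read attribute");
  Attrs.erase(Attr::ReadNone);
  Attrs.erase(Attr::ReadOnly);
  Attrs.erase(Attr::WriteOnly);
  Attrs.insert(Deduced);
}
```

For an argument that starts out as writeonly and is deduced to be readnone, this leaves readnone alone on the argument instead of the incompatible pair, while unrelated attributes are untouched.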
The following snippet shows the current behavior of the compiler, where
the compilation process is aborted due to incompatible attributes.
$ cat x.ll
; ModuleID = 'x.bc'
%_type_of_d-ccc = type <{ i8*, i8, i8, i8, i8 }>
@d-ccc = internal global %_type_of_d-ccc <{ i8* null, i8 1, i8 13, i8 0,
i8 -127 }>, align 8
define void @foo(i32* writeonly %.aaa) {
foo_entry:
%_param_.aaa = alloca i32*, align 8
store i32* %.aaa, i32** %_param_.aaa, align 8
store i8 0, i8* getelementptr inbounds (%_type_of_d-ccc,
%_type_of_d-ccc* @d-ccc, i32 0, i32 3)
ret void
}
$ opt -O3 x.ll
Attributes 'readnone and writeonly' are incompatible!
void (i32*)* @foo
in function foo
LLVM ERROR: Broken function found, compilation aborted!
The purpose of this changeset is to fix the above error. This fix is
based on a suggestion from Johannes @jdoerfert (many thanks!!!)
Authored By: anhtuyen
Reviewer: nicholas, rnk, chandlerc, jdoerfert
Reviewed By: rnk
Subscribers: hiraditya, jdoerfert, llvm-commits, anhtuyen, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D58694
llvm-svn: 371622
There are scenarios where mutually recursive functions may cause the SCC
to contain both read only and write only functions. This removes an
assertion when adding read attributes, which caused a crash with the
provided test case, and instead just doesn't add the attributes.
Patch by Luke Lau <luke.lau@intel.com>
Differential Revision: https://reviews.llvm.org/D60761
llvm-svn: 366090
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).
Reviewers: jdoerfert
Differential Revision: https://reviews.llvm.org/D49165
llvm-svn: 365336
Create method `optForNone()` testing for the function level equivalent of
`-O0` and refactor appropriately.
Differential revision: https://reviews.llvm.org/D59852
llvm-svn: 357638
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
minted `CallBase` class instead of the `CallSite` wrapper.
This moves the largest interwoven collection of APIs that traffic in
`CallSite`s. While a handful of these could have been migrated with
a minorly more shallow migration by converting from a `CallSite` to
a `CallBase`, it hardly seemed worth it. Most of the APIs needed to
migrate together because of the complex interplay of AA APIs and the
fact that converting from a `CallBase` to a `CallSite` isn't free in its
current implementation.
Out of tree users of these APIs can fairly reliably migrate with some
combination of `.getInstruction()` on the `CallSite` instance and
casting the resulting pointer. The most generic form will look like `CS`
-> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there
is a more elegant migration. Hopefully, this migrates enough APIs for
users to fully move from `CallSite` to the base class. All of the
in-tree users were easily migrated in that fashion.
Thanks for the review from Saleem!
Differential Revision: https://reviews.llvm.org/D55641
llvm-svn: 350503
Summary: Debug intrinsics may be marked norecurse so that the calling function can be norecurse and optimized if needed. This avoids codegen optimisation differences when -g is used, such as those caused by the checks in globalOpt.cpp:processInternalGlobal.
Reviewers: chandlerc, jmolloy, aprantl
Reviewed By: aprantl
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D55187
llvm-svn: 348381
Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).
This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.
llvm-svn: 344186
The presence of readnone and an access range attribute (argmemonly,
inaccessiblememonly, inaccessiblemem_or_argmemonly) is considered an
error by the verifier. This seems strict but also not wrong. This
patch makes sure function attribute detection will remove all access
range attributes for readnone functions.
llvm-svn: 341927
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.
Reviewers: hfinkel, jdoerfert
Differential Revision: https://reviews.llvm.org/D48387
llvm-svn: 340537
This patch just extracts code into a separate function to remove some
duplication between the old and new pass manager pipeline. Due to the
different CGSCC iterators used, not all code duplication was eliminated.
llvm-svn: 338585
Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This
patch replaces such types with SmallPtrSet, because IMO it is slightly
clearer and allows us to get rid of unnecessarily including SmallSet.h
Reviewers: dblaikie, craig.topper
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D47836
llvm-svn: 334492
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
Summary:
This was motivated by the absence of PruneEH functionality in the new PM.
It was decided that a proper way to do PruneEH is to add NoUnwind inference
into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.
This change generalizes attribute handling implemented for (a removal of)
Convergent attribute, by introducing a generic builder-like class
AttributeInferer.
It registers all the attribute inference requests, storing per-attribute
predicates into a vector, and then goes through an SCC Node, scanning all
the instructions for not breaking attribute assumptions.
The main idea is that as soon as all the instructions from all the functions
of the SCC Node conform to the attribute assumptions, we are free to infer
the attribute as set for all the functions of the SCC Node.
It handles two distinct cases of attributes:
- those that might break due to derefinement of the function code
for these attributes we are allowed to apply inference only if all the
functions are "exact definitions". Example - NoUnwind.
- those that do not care about derefinement
for these attributes we are allowed to apply inference as soon as we see
any function definition. Example - removal of Convergent attribute.
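The shape of such a builder-like class can be sketched as below. This is a simplified model under stated assumptions: instructions carry just two flags, each request is checked in its own pass over the SCC, and all names (`ToyAttributeInferer`, `InferenceRequest`, `Func`, `Instr`) are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Instr {
  bool mayThrow;
  bool isConvergentCall;
};

struct Func {
  std::string name;
  bool exactDefinition = true;
  std::vector<Instr> body;
  std::vector<std::string> inferred; // results recorded by the inferer
};

// One inference request: the result to record, whether derefinement
// matters, and a predicate flagging instructions that break it.
struct InferenceRequest {
  std::string attr;
  bool requiresExactDefinition;
  std::function<bool(const Instr &)> breaksAssumption;
};

class ToyAttributeInferer {
  std::vector<InferenceRequest> Requests;

public:
  void registerRequest(InferenceRequest R) {
    assert(R.breaksAssumption && "a request needs a predicate");
    Requests.push_back(std::move(R));
  }

  // A request succeeds only if no instruction in any function of the
  // SCC breaks its assumption; it is then applied to every function.
  void run(std::vector<Func> &SCC) const {
    for (const InferenceRequest &R : Requests) {
      bool Valid = true;
      for (const Func &F : SCC) {
        if (R.requiresExactDefinition && !F.exactDefinition)
          Valid = false;
        for (const Instr &I : F.body)
          if (R.breaksAssumption(I))
            Valid = false;
        if (!Valid)
          break;
      }
      if (Valid)
        for (Func &F : SCC)
          F.inferred.push_back(R.attr);
    }
  }
};
```

A caller would register one request per attribute (for example one that requires exact definitions and one that does not) and then call `run` once per SCC.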
Also in this commit:
* Converted all the FunctionAttrs tests to use FileCheck and added new-PM
invocations to them
* FunctionAttrs/convergent.ll test demonstrates a difference in behavior between
new and old PM implementations. Marked with FIXME.
* PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg
combo as intended
* some of "other" tests were updated since function-attrs now infers 'nounwind'
even for old PM pipeline
* -disable-nounwind-inference hidden option added as a possible workaround for a supposedly
rare case when nounwind being inferred by default presents a problem
Reviewers: chandlerc, jlebar
Reviewed By: jlebar
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D44415
llvm-svn: 328377
- Fix for bug 36078.
- Prevent the functionattrs, function-attrs, globalopt and argpromotion passes
from changing naked functions.
- These passes can perform some alterations to the functions that should not be
applied. An example is removing parameters that are seemingly not used because
they are only referenced in the inline assembly. Another example is marking
the function as fastcc.
llvm-svn: 325788
Summary:
The aim is to make ModRefInfo checks and changes more intuitive
and less error prone using inline methods that abstract the bit operations.
Ideally ModRefInfo would become an enum class, but that change will require
a wider set of changes into FunctionModRefBehavior.
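A minimal sketch of what such inline helpers might look like. The helper names and bit values here are illustrative assumptions, and the toy type is written as an enum class for brevity even though the summary above leaves that conversion for later.

```cpp
#include <cstdint>

// Toy two-bit lattice: one bit for "may read", one for "may write".
enum class ToyModRefInfo : std::uint8_t {
  NoModRef = 0,
  Ref = 1,
  Mod = 2,
  ModRef = 3
};

constexpr std::uint8_t bits(ToyModRefInfo MRI) {
  return static_cast<std::uint8_t>(MRI);
}

// Queries that hide the bit tests from callers.
constexpr bool isNoModRef(ToyModRefInfo MRI) {
  return MRI == ToyModRefInfo::NoModRef;
}
constexpr bool isModSet(ToyModRefInfo MRI) {
  return (bits(MRI) & bits(ToyModRefInfo::Mod)) != 0;
}
constexpr bool isRefSet(ToyModRefInfo MRI) {
  return (bits(MRI) & bits(ToyModRefInfo::Ref)) != 0;
}

// Updates that hide the bit twiddling from callers.
constexpr ToyModRefInfo setMod(ToyModRefInfo MRI) {
  return static_cast<ToyModRefInfo>(bits(MRI) | bits(ToyModRefInfo::Mod));
}
constexpr ToyModRefInfo clearRef(ToyModRefInfo MRI) {
  return static_cast<ToyModRefInfo>(bits(MRI) & ~bits(ToyModRefInfo::Ref));
}

static_assert(isModSet(setMod(ToyModRefInfo::Ref)), "Ref plus Mod has Mod");
static_assert(clearRef(ToyModRefInfo::ModRef) == ToyModRefInfo::Mod,
              "clearing Ref leaves Mod");
```

Call sites then read as `isModSet(MRI)` instead of an open-coded mask test, which is where the reduction in error-proneness would come from.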
Reviewers: sanjoy, george.burgess.iv, dberlin, hfinkel
Subscribers: nlopes, llvm-commits
Differential Revision: https://reviews.llvm.org/D40749
llvm-svn: 319821