For non padded structs, we can just proceed and deaggregate them.
We don't want ot do this when there is padding in the struct as to not
lose information about this padding (the subsequents passes would then
try hard to preserve the padding, which is undesirable).
Also update extractvalue.ll and cast.ll so that they use structs with padding.
Remove the FIXME in the extractvalue of laod case as the non padded case is
handled when processing the load, and we don't want to do it on the padded
case.
Patch by: Amaury SECHET <deadalnix@gmail.com>
Differential Revision: http://reviews.llvm.org/D14483
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255600 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function. This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.
Depends on D15478.
Differential Revision: http://reviews.llvm.org/D15479
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255522 91177308-0d34-0410-b5e6-96231b3b80d8
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
but they are difficult to explain to others, even to seasoned LLVM
experts.
- catchendpad and cleanupendpad are optimization barriers. They cannot
be split and force all potentially throwing call-sites to be invokes.
This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
It is unsplittable, starts a funclet, and has control flow to other
funclets.
- The nesting relationship between funclets is currently a property of
control flow edges. Because of this, we are forced to carefully
analyze the flow graph to see if there might potentially exist illegal
nesting among funclets. While we have logic to clone funclets when
they are illegally nested, it would be nicer if we had a
representation which forbade them upfront.
Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
flow, just a bunch of simple operands; catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad. Their presence can be inferred
implicitly using coloring information.
N.B. The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for. An expert should take a
look to make sure the results are reasonable.
Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D15139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255422 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid O(N^2) behaviour when checking for bad bitcasts in `ConstantExpr`s
buried inside of aggregate initializers to `GlobalVariable`s. I've:
- centralized the "visited" set for recursing through `ConstantExpr`s so
that expressions are only visited once per Verifier run,
- removed the duplicate logic for the stack visit, and
- avoided recursing into other `GlobalValue`s.
This recovers roughly a 100x time difference in clang compiles of a
particular input file (filled with large cross-referencing tables) that
depends on whether `-disable-llvm-verifier` is on. This slowdown was
caused by r187506, which introduced these checks.
Now, avoiding `-disable-llvm-verifier` only causes a 2x slowdown for
this case.
(Interestingly, dumping the textual IR for this file starts at least
50GB of global variable initializers (I don't know the total, since I
killed the dump)...)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255269 91177308-0d34-0410-b5e6-96231b3b80d8
- This simplifies the CallSite class, arg_begin / arg_end are now
simple wrapper getters.
- In several places, we were creating CallSite instances solely to call
arg_begin and arg_end. With this change, that's no longer required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255226 91177308-0d34-0410-b5e6-96231b3b80d8
The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>)
overload has had a typo in it since it was written, where it will
create a Vector instead of an Array. This obviously doesn't work at
all, but it turns out that until r254991 there weren't actually any
callers of this overload. Fix the typo and add some test coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255157 91177308-0d34-0410-b5e6-96231b3b80d8
This new patch fixes a few bugs that exposed in last submit. It also improves
the test cases.
--Original Commit Message--
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
Differential Revision: http://reviews.llvm.org/D12781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255132 91177308-0d34-0410-b5e6-96231b3b80d8
Test case attached (test case also checks that we don't drop the calling
convention, but that functionality was correct before this patch).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255088 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, vectors of halfs end up as ConstantVectors, but there isn't
a good reason they can't be ConstantDataVectors. This should save some
memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254991 91177308-0d34-0410-b5e6-96231b3b80d8
254950 ended up being not NFC. The previous code was overriding the flags for whether an instruction read or wrote memory using the target specific flags returned via TTI. I'd missed this in my refactoring. Since I mistakenly built only x86 and didn't notice the number of unsupported tests, I didn't catch that before the original checkin.
This raises an interesting issue though. Given we have function attributes (i.e. readonly, readnone, argmemonly) which describe the aliasing of intrinsics, why does TTI have this information overriding the instruction definition at all? I see no reason for this, but decided to preserve existing behavior for the moment. The root issue might be that we don't have a "writeonly" attribute.
Original commit message:
[EarlyCSE] Simplify and invert ParseMemoryInst [NFCI]
Restructure ParseMemoryInst - which was introduced to abstract over target specific load and stores instructions - to just query the underlying instructions. In theory, this could be slightly slower than caching the results, but in practice, it's very unlikely to be measurable.
The simple query scheme makes it far easier to understand, and much easier to extend with new queries. Given I'm about to need to add new query types, doing the cleanup first seemed worthwhile.
Do we still believe the target specific intrinsic handling is worthwhile in EarlyCSE? It adds quite a bit of complexity and makes the code harder to read. Being able to delete the abstraction entirely would be wonderful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254957 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We are inserting both Scope and SP into the Seen map and check whether
it was already there in which case we skip the validation (the idea
being that we already checked this Subprogram before). However,
if (Scope == SP) as MDNodes, then inserting the Scope, will trigger
the Seen check causing us to incorrectly not validate this !dbg
attachment. Fix this by not performing the SP Seen check if Scope == SP
Reviewers: pcc, dexonsmith, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254887 91177308-0d34-0410-b5e6-96231b3b80d8
In 254760, I introduced the usage of a BumpPtrAllocator for the AnalysisUsage instances held by the PassManger. This turns out to have been incorrect since a BumpPtrAllocator does not run the destructors of objects when deallocating memory. Since a few of our SmallVector's had grown beyond their small size, we end up with some leaked memory. We need to use a SpecificBumpPtrAllocator instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254803 91177308-0d34-0410-b5e6-96231b3b80d8
Currently `OperandBundleUse::operandsHaveAttr` computes its result
without being given a specific operand. This is problematic because it
forces us to say that, e.g., even non-pointer operands in `"deopt"`
operand bundles are `readonly`, which doesn't make sense.
This commit changes `operandsHaveAttr` to work in the context of a
specific operand, so that we can give the operand attributes that make
sense for the operands's `llvm::Type`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254764 91177308-0d34-0410-b5e6-96231b3b80d8
The LegacyPassManager was storing an instance of AnalysisUsage for each instance of each pass. In practice, most instances of a single pass class share the same dependencies. We can't rely on this because passes can (and some do) have dynamic dependencies based on instance options.
We can exploit the likely commonality by uniqueing the usage information after querying the pass, but before storing it into the pass manager. This greatly reduces memory consumption by the AnalysisUsage objects. For a long pass pipeline, I measured a decrease in memory consumption for this storage of about 50%. I have not measured on the default O3 pipeline, but I suspect it will see some benefit as well since many passes are repeated (e.g. InstCombine).
Differential Revision: http://reviews.llvm.org/D14677
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254760 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a new target-independent calling convention for C++ TLS
access functions. It aims to minimize overhead in the caller by perserving as
many registers as possible.
The target-specific implementation for X86-64 is defined as following:
Arguments are passed as for the default C calling convention
The same applies for the return value(s)
The callee preserves all GPRs - except RAX and RDI
The access function makes C-style TLS function calls in the entry and exit
block, C-style TLS functions save a lot more registers than normal calls.
The added calling convention ties into the existing implementation of the
C-style TLS functions, so we can't simply use existing calling conventions
such as preserve_mostcc.
rdar://9001553
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254737 91177308-0d34-0410-b5e6-96231b3b80d8
This provides interface to get and set maximum function counts to Module. This
would allow things like determination of function hotness. The actual setting
of this max function count will have to be done in the frontend.
Differential Revision: http://reviews.llvm.org/D15003
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254647 91177308-0d34-0410-b5e6-96231b3b80d8
time.
The new overloaded function is used when an attribute is added to a
large number of slots of an AttributeSet (for example, to function
parameters). This is much faster than calling AttributeSet::addAttribute
once per slot, because AttributeSet::getImpl (which calls
FoldingSet::FIndNodeOrInsertPos) is called only once per function
instead of once per slot.
With this commit, clang compiles a file which used to take over 22
minutes in just 13 seconds.
rdar://problem/23581000
Differential Revision: http://reviews.llvm.org/D15085
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254491 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantDataArray::getImpl and ConstantDataVector::getImpl had a lot
of copy pasta in how they handled sequences of constants. Break that
out into a couple of simple functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254456 91177308-0d34-0410-b5e6-96231b3b80d8
By including the module name in the error message.
This makes the error message much more useful and
saves a trip to the debugger.
Reviewers: dexonsmith
Subscribers: dexonsmith, llvm-commits
Differential Revision: http://reviews.llvm.org/D14473
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254437 91177308-0d34-0410-b5e6-96231b3b80d8
They are as much trouble as aliases to declarations. They are requiring
the code generator to define a symbol with the same value as another
symbol, but the second symbol is undefined.
If representing this is important for some optimization, we could add
support for available_externally aliases. They would be *required* to
point to a declaration (or available_externally definition).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254170 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements a minimum spanning tree (MST) based instrumentation for
PGO. The use of MST guarantees minimum number of CFG edges getting
instrumented. An addition optimization is to instrument the less executed
edges to further reduce the instrumentation overhead. The patch contains both the
instrumentation and the use of the profile to set the branch weights.
Differential Revision: http://reviews.llvm.org/D12781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254021 91177308-0d34-0410-b5e6-96231b3b80d8
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".
For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253804 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D14903
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253796 91177308-0d34-0410-b5e6-96231b3b80d8
Terrifyingly, one of them is a mishandling of floating point vectors
in Constant::isZero(). How exactly this issue survived this long
is beyond me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253655 91177308-0d34-0410-b5e6-96231b3b80d8
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.
Differential Revision: http://reviews.llvm.org/D14150
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253544 91177308-0d34-0410-b5e6-96231b3b80d8
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253438 91177308-0d34-0410-b5e6-96231b3b80d8
The way prelink used to work was
* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.
There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
implementation is pretty broken. It talks about relocations that are
"resolved by the static linker". If they are resolved, there are none
left for the prelinker. What one needs to track is if an expression
will require only dynamic relocations that point to the same DSO.
At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)
This patch removes support for it. That is, it stops printing the
".local" sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253280 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.
Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.
Reviewers: aprantl, dexonsmith, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14275
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253186 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The Old personality function gets copied over, but the
Materializer didn't have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.
Reviewers: majnemer
Subscribers: jevinskie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253183 91177308-0d34-0410-b5e6-96231b3b80d8