477 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith
62ee08db9a IR: Disallow complicated function-local metadata
Disallow complex types of function-local metadata.  The only valid
function-local metadata is an `MDNode` whose sole argument is a
non-metadata function-local value.

Part of PR21532.

llvm-svn: 223564
2014-12-06 01:26:49 +00:00
Rafael Espindola
02dc2705ae Ask the module for its identified types.
When lazy reading a module, the types used in a function will not be visible to
a TypeFinder until the body is read.

This patch fixes that by asking the module for its identified struct types.
If a materializer is present, the module asks it. If not, it uses a TypeFinder.

This fixes pr21374.

I will be the first to say that this is ugly, but it was the best I could find.

Some of the options I looked at:

* Asking the LLVMContext. This could be made to work for gold, but not currently
  for ld64. ld64 will load multiple modules into a single context before merging
  them. This causes us to see types from future merges. Unfortunately,
  MappedTypes is not just a cache when it comes to opaque types. Once the
  mapping has been made, we have to remember it for as long as the key may
  be used. This would mean moving MappedTypes to the Linker class and having
  to drop the Linker::LinkModules static methods, which are visible from C.

* Adding an option to ignore function bodies in the TypeFinder. This would
  fix the PR by picking the worst result. It would work, but unfortunately
  we are currently quite dependent on the upfront type merging. I will
  try to reduce our dependency, but it is not clear that we will be able
  to get rid of it for now.

The only clean solution I could think of is making the Module own the types.
This would have other advantages, but it is a much bigger change. I will
propose it, but it is nice to have this fixed while that is discussed.

With the gold plugin, this patch takes the number of types in the LTO clang
binary from 52817 to 49669.

llvm-svn: 223215
2014-12-03 07:18:23 +00:00
Peter Collingbourne
837799f13b Prologue support
Patch by Ben Gamari!

This redefines the `prefix` attribute introduced previously and
introduces a `prologue` attribute.  There are three primary use cases
that these attributes aim to serve:

  1. Function prologue sigils

  2. Function hot-patching: Enable the user to insert `nop` operations
     at the beginning of the function which can later be safely replaced
     with a call to some instrumentation facility

  3. Runtime metadata: Allow a compiler to insert data for use by the
     runtime during execution. GHC is one example of a compiler that
     needs this for its tables-next-to-code feature.

Previously `prefix` served cases (1) and (2) quite well by allowing the user
to introduce arbitrary data at the entrypoint but before the function
body. Case (3), however, was poorly handled by this approach as it
required that prefix data was valid executable code.

Here we redefine the notion of prefix data to instead be data which
occurs immediately before the function entrypoint (i.e. the symbol
address). Since prefix data now occurs before the function entrypoint,
there is no need for the data to be valid code.

The previous notion of prefix data now goes under the name "prologue
data" to emphasize its duality with the function epilogue.

The intention here is to handle cases (1) and (2) with prologue data and
case (3) with prefix data.

References
----------

This idea arose out of discussions[1] with Reid Kleckner in response to a
proposal to introduce the notion of symbol offsets to enable handling of
case (3).

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html

Test Plan: testsuite

Differential Revision: http://reviews.llvm.org/D6454

llvm-svn: 223189
2014-12-03 02:08:38 +00:00
Richard Trieu
df554c2abd Add accessor macros to ConstantPlaceHolder, similar to those in the base class.
llvm-svn: 222502
2014-11-21 02:42:08 +00:00
Rafael Espindola
9b65f4ede9 Return the number of read bytes in MemoryObject::readBytes.
Returning more information will allow BitstreamReader to be simplified a bit
and changed to read 64 bits at a time.

llvm-svn: 221794
2014-11-12 17:11:16 +00:00
Rafael Espindola
bdb086c664 Reduce code duplication a bit. NFC.
llvm-svn: 221785
2014-11-12 14:48:38 +00:00
Rafael Espindola
20114e4c02 Remove redundant calls to isMaterializable.
This removes calls to isMaterializable in the following cases:

* It was redundant with a call to isDeclaration now that isDeclaration returns
  the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.

llvm-svn: 221050
2014-11-01 16:46:18 +00:00
NAKAMURA Takumi
aa3b216e2e Untabify.
llvm-svn: 220884
2014-10-29 23:44:35 +00:00
Rafael Espindola
ee06e286d8 Modernize the error handling of the Materialize function.
llvm-svn: 220600
2014-10-24 22:50:48 +00:00
Rafael Espindola
e4018b9baa Don't ever call materializeAllPermanently during LTO.
To do this, change the representation of lazy loaded functions.

The previous representation cannot differentiate between a function whose body
has been removed and one whose body hasn't been read from the .bc file. That
means that in order to drop a function, the entire body had to be read.
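
The distinction described above can be sketched with an explicit state per
function. This is only an illustration: `BodyState`, `LazyFunction`,
`dropBody` and `needsRead` are hypothetical names, not the representation
LLVM actually uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: three states instead of "has body / has no body".
enum class BodyState { Materialized, NotYetRead, Dropped };

struct LazyFunction {
  std::string name;
  BodyState state = BodyState::NotYetRead;
};

// Dropping a function only flips the state; the body is never read first.
inline void dropBody(LazyFunction &f) { f.state = BodyState::Dropped; }

// Only a body that is still in the .bc file needs reading on demand.
inline bool needsRead(const LazyFunction &f) {
  return f.state == BodyState::NotYetRead;
}
```

With a separate `Dropped` state, a linker-like client can discard a function
it does not need without paying for parsing it.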

llvm-svn: 220580
2014-10-24 18:13:04 +00:00
Rafael Espindola
f2df6380a9 clang-format two code snippets to make the next patch easy to read.
llvm-svn: 220484
2014-10-23 15:20:05 +00:00
Petar Jovanovic
ad76773e07 Do not destroy external linkage when deleting function body
The function deleteBody() converts the linkage to external and thus destroys
original linkage type value. Lack of correct linkage type causes wrong
relocations to be emitted later.
Calling dropAllReferences() instead of deleteBody() will fix the issue.

Differential Revision: http://reviews.llvm.org/D5415

llvm-svn: 218302
2014-09-23 12:54:19 +00:00
Chris Bieneman
84a520974a Eliminating static destructor for the BitCodeErrorCategory by converting to a ManagedStatic.
Summary: This is part of the overall goal of removing static initializers from LLVM.

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D5416

llvm-svn: 218149
2014-09-19 20:29:02 +00:00
Rafael Espindola
cbfc7723a8 Pass a && to getLazyBitcodeModule.
This forces callers to use std::move when calling it. It is somewhat odd to have
code with std::move that doesn't always move, but it is also odd to have code
without std::move that sometimes moves.
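
A minimal sketch of the calling convention discussed above, using stand-in
types. `Buffer`, `LazyModule` and `getLazyModule` are hypothetical and much
simpler than the real `getLazyBitcodeModule`; the failure rule (empty buffer)
is an assumption made for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for MemoryBuffer and Module.
struct Buffer { std::string data; };
struct LazyModule { std::unique_ptr<Buffer> owned; };

// Taking an rvalue reference forces callers to write std::move(buf),
// yet the move only happens on success.
inline std::unique_ptr<LazyModule>
getLazyModule(std::unique_ptr<Buffer> &&buf) {
  if (!buf || buf->data.empty())
    return nullptr;  // failure: the caller still owns the buffer
  std::unique_ptr<LazyModule> m(new LazyModule());
  m->owned = std::move(buf);  // success: ownership is transferred here
  return m;
}
```

After a successful call the caller's pointer is null; after a failed call it
is unchanged, which is the "std::move that doesn't always move" oddity the
message mentions.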

llvm-svn: 217049
2014-09-03 17:31:46 +00:00
Rafael Espindola
c3b6a25c56 Fix a double free in llvm::getBitcodeTargetTriple.
Unfortunately this is only used by ld64, so no testcase, but should fix the darwin LTO bootstrap.

llvm-svn: 216618
2014-08-27 21:11:13 +00:00
Rafael Espindola
a2d7cc97be Pass a std::unique_ptr<MemoryBuffer>& to getLazyBitcodeModule.
By taking a reference we can do the ownership transfer in one place instead of
expecting every caller to do it.

llvm-svn: 216492
2014-08-26 22:00:09 +00:00
Rafael Espindola
225cf75bef Pass a MemoryBufferRef when we can avoid taking ownership.
The attached patch simplifies a few interfaces that don't need to take
ownership of a buffer.

For example, both parseAssembly and parseBitcodeFile will parse the
entire buffer before returning. There is no need to take ownership.

Using a MemoryBufferRef makes it obvious in the type signature that
there is no ownership transfer.

llvm-svn: 216488
2014-08-26 21:49:01 +00:00
Duncan P. N. Exon Smith
54ed8c12fc BitcodeReader: Only create one basic block for each blockaddress
Block address forward-references are implemented by creating a
`BasicBlock` ahead of time that gets inserted in the `Function` when
it's eventually encountered.

However, if the same blockaddress was used in two separate functions
that were parsed *before* the referenced function (and the blockaddress
was never used at global scope), two separate basic blocks would get
created, one of which would be forgotten, creating invalid IR.

This commit changes the forward-reference logic to create only one basic
block (and always return the same blockaddress).
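
The "create only one" rule amounts to a get-or-create lookup keyed by the
referenced block. The sketch below assumes a (function index, block index)
key and uses hypothetical names (`Block`, `FwdRefs`, `getOrCreate`); the
reader's real bookkeeping may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>

// Hypothetical placeholder for a forward-referenced basic block.
struct Block {};

class FwdRefs {
  // Key is (function index, block index): an assumed identification scheme.
  std::map<std::pair<unsigned, unsigned>, std::unique_ptr<Block>> pending;

public:
  // Returns the existing placeholder if one was already created,
  // otherwise creates exactly one and remembers it.
  Block *getOrCreate(unsigned fn, unsigned bb) {
    std::unique_ptr<Block> &slot = pending[{fn, bb}];
    if (!slot)
      slot.reset(new Block());
    return slot.get();
  }

  std::size_t size() const { return pending.size(); }
};
```

Two functions that reference the same block before its owner is parsed now
receive the same placeholder, so nothing is forgotten.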

llvm-svn: 215805
2014-08-16 01:54:37 +00:00
Duncan P. N. Exon Smith
bee997a043 UseListOrder: Correctly count the number of uses
This is an off-by-one bug I found by inspection, which would only
trigger if the bitcode writer sees more uses of a `Value` than the
reader.  Since this is only relevant when an instruction gets upgraded
somehow, there unfortunately isn't a reasonable way to add test
coverage.

llvm-svn: 215804
2014-08-16 01:54:34 +00:00
Duncan P. N. Exon Smith
7aaaba94bb BitcodeReader: Fix non-determinism in use-list order
`BasicBlockFwdRefs` (and `BlockAddrFwdRefs` before it) was being emptied
in a non-deterministic order.  When predicting use-list order I've
worked around this another way, but even when parsing lazily (and we
can't recreate use-list order) use-lists should be deterministic.

Make them so by using a side-queue of functions with forward-referenced
blocks that gets visited in order.
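
One way to picture the side-queue: keep the fast unordered lookup, but record
first-seen order separately and drain in that order. `PendingFunctions`,
`addRef` and `drain` are hypothetical names for this sketch, and functions
are identified by string only for brevity.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <unordered_map>
#include <vector>

class PendingFunctions {
  // Iteration order of this map is unspecified, so it is never iterated.
  std::unordered_map<std::string, std::vector<int>> refs;
  // Side-queue: functions in the order they were first referenced.
  std::deque<std::string> order;

public:
  void addRef(const std::string &fn, int block) {
    auto it = refs.find(fn);
    if (it == refs.end()) {
      order.push_back(fn);
      it = refs.emplace(fn, std::vector<int>()).first;
    }
    it->second.push_back(block);
  }

  // Visit and empty the pending set in first-seen order.
  std::vector<std::string> drain() {
    std::vector<std::string> visited(order.begin(), order.end());
    order.clear();
    refs.clear();
    return visited;
  }
};
```

Because the drain order depends only on the input, repeated runs over the
same bitcode visit functions identically.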

llvm-svn: 214899
2014-08-05 17:49:48 +00:00
Duncan P. N. Exon Smith
03001780da UseListOrder: Fix blockaddress use-list order
`parseBitcodeFile()` uses the generic `getLazyBitcodeFile()` function as
a helper.  Since `parseBitcodeFile()` isn't actually lazy -- it calls
`MaterializeAllPermanently()` -- bypass the unnecessary call to
`materializeForwardReferencedFunctions()` by extracting out a common
helper function.  This removes the last of the use-list churn caused by
blockaddresses.

This highlights that we can't reproduce use-list order of globals and
constants when parsing lazily -- but that's necessarily out of scope.
When we're parsing lazily, we never have all the functions in memory, so
the use-lists of globals (and constants that reference globals) are
always incomplete.

This is part of PR5680.

llvm-svn: 214581
2014-08-01 22:27:19 +00:00
Duncan P. N. Exon Smith
323f635bbe BitcodeReader: Change mechanics of BlockAddress forward references, NFC
Now that we can reliably handle forward references to `BlockAddress`
(r214563), change the mechanics to simplify predicting use-list order.

Previously, we created dummy `GlobalVariable`s to represent block
addresses.  After every function was materialized, we'd go through any
forward references to its blocks and RAUW them with a proper
`BlockAddress` constant.  This causes some (potentially a lot of)
unnecessary use-list churn, since any constant expression that it's a
part of will need to be rematerialized as well.

Instead, pre-construct a `BasicBlock` immediately -- without attaching
it to its (empty) `Function` -- and use that to construct a
`BlockAddress`.  This constant will not have to be regenerated.  When
the function body is parsed, hook this pre-constructed basic block up
in the right place using `BasicBlock::insertInto()`.

Both before and after this change, the IR is temporarily in an invalid
state that gets resolved when `materializeForwardReferencedFunctions()`
gets called.

This is a prep commit that's part of PR5680, but the only functionality
change is the reduction of churn in the constant pool.

llvm-svn: 214570
2014-08-01 21:51:52 +00:00
Duncan P. N. Exon Smith
9829cda628 BitcodeReader: Fix some BlockAddress forward reference corner cases
`BlockAddress`es are interesting in that they can reference basic blocks
from *outside* the block's function.  Since basic blocks are not global
values, this presents particular challenges for lazy parsing.

One corner case was found in PR11677 and fixed in r147425.  In that
case, a global variable references a block address.  It's necessary to
load the relevant function to resolve the forward reference before doing
anything with the module.

By inspection, I found (and have fixed here) two other cases:

  - An instruction from one function references a block address from
    another function, and only the first function is lazily loaded.

    I fixed this the same way as PR11677: by eagerly loading the
    referenced function.

  - A function whose block address is taken is dematerialized, leaving
    invalid references to it.

    I fixed this by refusing to dematerialize functions whose block
    addresses are taken (if you have to load it, you can't unload it).

llvm-svn: 214559
2014-08-01 21:11:34 +00:00
Rafael Espindola
74ac2cd9f3 Have a single enum for "not a bitcode" error.
This is more convenient for callers. No functionality change, this will
be used in a next patch to the gold plugin.

llvm-svn: 214218
2014-07-29 21:01:24 +00:00
Rafael Espindola
06b2000418 Move the bitcode error enum to the include directory.
This will let users in other libraries know which error occurred. In particular,
it will be possible to check if the parsing failed or if the file is not
bitcode.

llvm-svn: 214209
2014-07-29 20:22:46 +00:00
Duncan P. N. Exon Smith
1ad861d158 Bitcode: Serialize (and recover) use-list order
Predict and serialize use-list order in bitcode.  This makes the option
`-preserve-bc-use-list-order` work *most* of the time, but this is still
experimental.

  - Builds a full value-table up front in the writer, sets up a list of
    use-list orders to write out, and discards the table.  This is a
    simpler first step than determining the order from the various
    overlapping IDs of values on-the-fly.

  - The shuffles stored in the use-list order list have an unnecessarily
    large memory footprint.

  - `blockaddress` expressions cause functions to be materialized
    out-of-order.  For now I've ignored this problem, so use-list orders
    will be wrong for constants used by functions that have block
    addresses taken.  There are a couple of ways to fix this, but I
    don't have a concrete plan yet.

  - When materializing functions lazily, the use-lists for constants
    will not be correct.  This use case is out of scope: what should the
    use-list order be, if it's incomplete?

This is part of PR5680.

llvm-svn: 214125
2014-07-28 21:19:41 +00:00
Hal Finkel
000be1bc2f Add a dereferenceable attribute
This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).

llvm-svn: 213385
2014-07-18 15:51:28 +00:00
Hal Finkel
2587ff3060 Rename AlignAttribute to IntAttribute
Currently the only kind of integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).

No functionality change intended.

llvm-svn: 213352
2014-07-18 06:51:55 +00:00
Reid Kleckner
d5cc38a11b Roundtrip the inalloca bit on allocas through bitcode
This was an oversight in the original support.  As it is, I stuffed this
bit into the alignment.  The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.
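
The arithmetic: log2 of 1 << 29 is 29, which fits in 5 bits (maximum 31), so
the next bit is free for a flag. The layout below is an assumed one for
illustration (low 5 bits for the log2 alignment, bit 5 for inalloca); the
exact encoding in the bitcode record may differ, for example by biasing the
alignment value.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: bits 0-4 hold log2(alignment), bit 5 holds the flag.
constexpr std::uint64_t kAlignBits = 5;
constexpr std::uint64_t kAlignMask = (1u << kAlignBits) - 1;  // 0x1f
constexpr std::uint64_t kInAllocaBit = 1u << kAlignBits;      // 0x20

inline std::uint64_t encode(unsigned log2Align, bool inAlloca) {
  assert(log2Align <= 29 && "MaximumAlignment is 1 << 29");
  return (log2Align & kAlignMask) | (inAlloca ? kInAllocaBit : 0);
}

inline unsigned decodeLog2Align(std::uint64_t field) {
  return static_cast<unsigned>(field & kAlignMask);
}

inline bool decodeInAlloca(std::uint64_t field) {
  return (field & kInAllocaBit) != 0;
}
```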

Reviewers: nicholas

Differential Revision: http://reviews.llvm.org/D3943

llvm-svn: 213118
2014-07-16 01:34:27 +00:00
Rafael Espindola
1aa8b39bd1 Fix a bug in the conversion to ErrorOr.
The regular end of the bitcode parsing is in the BitstreamEntry::EndBlock
case.

Should fix the LTO bootstrap on OS X (this function is only used by ld64).

llvm-svn: 212357
2014-07-04 20:05:56 +00:00
Rafael Espindola
6111790890 Revert "Convert a few std::strings to StringRef."
This reverts commit r212342.

We can get a StringRef into the current Record, but not one in the bitcode
itself since the string is compressed in it.

llvm-svn: 212356
2014-07-04 20:02:42 +00:00
Rafael Espindola
f67aea8080 Convert a few std::strings to StringRef.
llvm-svn: 212342
2014-07-04 14:12:46 +00:00
Rafael Espindola
40675153b8 Convert these functions to use ErrorOr.
llvm-svn: 212341
2014-07-04 13:52:01 +00:00
Rafael Espindola
f7b86978cb Remove unused old-style error handling.
If needed, an ErrorOr should be used.

llvm-svn: 212340
2014-07-04 13:30:13 +00:00
David Majnemer
abf7854d05 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
2014-06-27 18:19:56 +00:00
Alp Toker
f228194c3e IRReader: don't mark MemoryBuffers const
llvm-svn: 211883
2014-06-27 09:19:14 +00:00
Alp Toker
efad9949fa Propagate const-correctness into parseBitcodeFile()
llvm-svn: 211864
2014-06-27 04:48:32 +00:00
Eli Bendersky
def2619060 Rename loop unrolling and loop vectorizer metadata to have a common prefix.
[LLVM part]

These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix.  Metadata name
changes:

llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*
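
For out-of-tree code that still sees the old spellings, the two renames above
can be applied as a prefix rewrite. `upgradeLoopMetadataName` is a
hypothetical helper written for this sketch, not a function from the patch,
and the suffixes used with it (such as `width` or `count`) are only examples.

```cpp
#include <cassert>
#include <string>

// Rewrites the two old prefixes to the common 'llvm.loop.' form and
// leaves every other name untouched.
inline std::string upgradeLoopMetadataName(const std::string &name) {
  const std::string oldVec = "llvm.vectorizer.";
  const std::string oldUnroll = "llvm.loopunroll.";
  if (name.compare(0, oldVec.size(), oldVec) == 0)
    return "llvm.loop.vectorizer." + name.substr(oldVec.size());
  if (name.compare(0, oldUnroll.size(), oldUnroll) == 0)
    return "llvm.loop.unroll." + name.substr(oldUnroll.size());
  return name;
}
```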

This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata. 

Patch by Mark Heffernan.

llvm-svn: 211710
2014-06-25 15:41:00 +00:00
Rafael Espindola
23e5fb2297 Make ObjectFile and BitcodeReader always own the MemoryBuffer.
This allows us to just use a std::unique_ptr to store the pointer to the buffer.
The flip side is that they have to support releasing the buffer back to the
caller.

Overall this looks like a more efficient and less brittle api.

llvm-svn: 211542
2014-06-23 21:53:12 +00:00
Rafael Espindola
c19105cc44 Revert a C API difference that I incorrectly introduced.
LLVMGetBitcodeModuleInContext should not take ownership on error. I will
try to localize this odd api requirement, but this should get the bots green.

llvm-svn: 211213
2014-06-18 20:07:35 +00:00
Rafael Espindola
0608be5c6e Remove BitcodeReader::setBufferOwned.
We do have use cases for the bitcode reader owning the buffer or not, but we
always know which one we have when we construct it.

It might be possible to simplify this further, but this is a step in the
right direction.

llvm-svn: 211205
2014-06-18 18:55:41 +00:00
Tim Northover
b9ec29d7c5 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.
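
For readers more familiar with the C++11 side, the weak semantics and the
"{ iN, i1 }" result shape can be sketched with `std::atomic`. The wrapper
names `cmpxchgWeak` and `atomicIncrement` are hypothetical; only
`compare_exchange_weak` is the standard library call.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Returns {value loaded, success flag}, analogous to "{ iN, i1 }".
inline std::pair<int, bool> cmpxchgWeak(std::atomic<int> &obj, int expected,
                                        int desired) {
  bool ok = obj.compare_exchange_weak(expected, desired);
  // On failure, expected has been overwritten with the value actually
  // loaded; on success it still equals the loaded value.
  return {expected, ok};
}

// A weak cmpxchg may fail even when the comparison would succeed,
// so callers normally retry in a loop.
inline int atomicIncrement(std::atomic<int> &obj) {
  int old = obj.load();
  while (!obj.compare_exchange_weak(old, old + 1)) {
    // old now holds the freshly loaded value; try again.
  }
  return old;
}
```

The success flag is what lets the retry loop tell a spurious failure apart
from a real mismatch without a second load.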

llvm-svn: 210903
2014-06-13 14:24:07 +00:00
Rafael Espindola
98710599c1 Remove 'using std::error_code' from lib.
llvm-svn: 210871
2014-06-13 02:24:39 +00:00
Rafael Espindola
e726a14d05 Remove all uses of 'using std::error_code' from headers.
llvm-svn: 210866
2014-06-13 01:25:41 +00:00
Rafael Espindola
b0ac81f225 Don't import error_category into the llvm namespace.
llvm-svn: 210733
2014-06-12 01:45:43 +00:00
Rafael Espindola
80bf4067ff Mark a few functions noexcept.
This reduces the difference between std::error_code and llvm::error_code.

llvm-svn: 210591
2014-06-10 21:26:47 +00:00
Rafael Espindola
e5f71f18e0 Allow aliases to be unnamed_addr.
Aliases with unnamed_addr were in a strange state: the flag is stored in
GlobalValue and the language reference talks about "unnamed_addr aliases",
but the verifier was rejecting them.

It seems natural to allow unnamed_addr in aliases:

* It is a property of how it is accessed, not of the data itself.
* It is perfectly possible to write code that depends on the address
of an alias.

This patch then makes unnamed_addr legal for aliases. One side effect is that
the syntax changes for a corner case: In globals, unnamed_addr is now printed
before the address space.

llvm-svn: 210302
2014-06-06 01:20:28 +00:00
Tom Roeder
740d86dc79 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Rafael Espindola
133baba536 Clauses in a landingpad are always Constant. Use a stricter type.
llvm-svn: 210203
2014-06-04 18:51:31 +00:00
Rafael Espindola
87cd774844 Allow alias to point to an arbitrary ConstantExpr.
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just use "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00