It was previously repeatedly erasing elements from the middle of a vector,
causing O(n^2) worst-case run-time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232789 91177308-0d34-0410-b5e6-96231b3b80d8
From what I can tell, the code is checking for PHIs that expect any value from
this block, not just constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232697 91177308-0d34-0410-b5e6-96231b3b80d8
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break
anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate
Constraint_* values.
PR22883 was caused the matching operands copying the whole of the operand flags
for the matched operand. This included the constraint id which needed to be
replaced with the operand number. This has been fixed with a conversion
function. Following on from this, matching operands also used the operand
number as the constraint id. This has been fixed by looking up the matched
operand and taking it from there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232165 91177308-0d34-0410-b5e6-96231b3b80d8
This should complete the job started in r231794 and continued in r232045:
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.
This should complete the removal of insert/extract128 intrinsics.
The Clang precursor patch for this change was checked in at r232109.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232120 91177308-0d34-0410-b5e6-96231b3b80d8
This (r232027) has caused PR22883; so it seems those bits might be used by
something else after all. Reverting until we can figure out what else to do.
Original commit message:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232093 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we've replaced the vinsertf128 intrinsics,
do the same for their extract twins.
This is very much like D8086 (checked in at r231794):
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is also the LLVM sibling to the cfe D8275 patch.
Differential Revision: http://reviews.llvm.org/D8276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232045 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8171
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232027 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGen incorrectly ignores (assert from APInt) constant index bigger
than 2^64 in getelementptr instruction. This is a test and fix for that.
Patch by Paweł Bylica!
Reviewed By: rnk
Subscribers: majnemer, rnk, mcrosier, resistor, llvm-commits
Differential Revision: http://reviews.llvm.org/D8219
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231984 91177308-0d34-0410-b5e6-96231b3b80d8
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.
(Resubmitting this change after not being able to reproduce buildbot failure)
Differential Revision: http://reviews.llvm.org/D7760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231800 91177308-0d34-0410-b5e6-96231b3b80d8
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is the sibling patch for the Clang half of this change:
http://reviews.llvm.org/D8088
Differential Revision: http://reviews.llvm.org/D8086
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231794 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is part of the work to support memory constraints that behave
differently to 'm'. The subsequent patches will expand on the existing
encoding (which is a 32-bit int) and as a result in some flag words will no
longer fit into an i16. This problem only affected the MSP430 target which
appears to have 16-bit pointers.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D8168
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231783 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231740 91177308-0d34-0410-b5e6-96231b3b80d8
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231366 91177308-0d34-0410-b5e6-96231b3b80d8
a lookup, pass that in rather than use a naked call to getSubtargetImpl.
This involved passing down and around either a TargetMachine or
TargetRegisterInfo. Update all callers/definitions around the targets
and SelectionDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230699 91177308-0d34-0410-b5e6-96231b3b80d8
The logic is almost there already, with our special homogeneous aggregate
handling. Tweaking it like this allows front-ends to emit AAPCS compliant code
without ever having to count registers or add discarded padding arguments.
Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to
apply the logic to all integer arrays for more consistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230348 91177308-0d34-0410-b5e6-96231b3b80d8
Synthesizing a call directly using the MI layer would confuse the frame
lowering code. This is problematic as frame lowering is highly
sensitive the particularities of calls, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230129 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.
Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.
Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229413 91177308-0d34-0410-b5e6-96231b3b80d8
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
=> getFnAttribute(Kind)
getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
=> hasFnAttribute(Kind)
Also, add `Function::getFnStackAlignment()`, and canonicalize:
getAttributes().getStackAlignment(AttributeSet::FunctionIndex)
=> getFnStackAlignment()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229208 91177308-0d34-0410-b5e6-96231b3b80d8
When lowering memcpy, memset or memmove, this assert checks whether the pointer
operands are in an address space < 256 which means "user defined address space"
on X86. However, this notion of "user defined address space" does not exist
for other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227191 91177308-0d34-0410-b5e6-96231b3b80d8
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227113 91177308-0d34-0410-b5e6-96231b3b80d8
This change reverts the interesting parts of 226311 (and 227046). This change introduced two problems, and I've been convinced that an alternate approach is preferrable anyways.
The bugs were:
- Registery appears to require all users be within the same linkage unit. After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy. The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now. 227046 appears to have been an attempt to fix this, but I don't believe it does so completely.
- GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy. Each Strategy was being added to the GCModuleInfo multiple times.
Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass. I'm probably also going to kill off the Registry. Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis. (As original suggested by Chandler.) This will accomplish my original goal of being able to access GCStrategy from Transform/ without adding all of the builtin GCs to IR/.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227109 91177308-0d34-0410-b5e6-96231b3b80d8
Specifically, gc.result benefits from this greatly. Instead of:
gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...
We now have a gc.result.* that can specialize to literally any type.
Differential Revision: http://reviews.llvm.org/D7020
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226857 91177308-0d34-0410-b5e6-96231b3b80d8
The problem occurs when after vectorization we have type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional efforts for expanding loads and truncating stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes in the cost estimation for masked memory
operations, it will be submitted separately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226808 91177308-0d34-0410-b5e6-96231b3b80d8
This addresses part of llvm.org/PR22262. Specifically, it prevents
considering the densities of sub-ranges that have fewer than
TLI.getMinimumJumpTableEntries() elements. Those densities won't help
jump tables.
This is not a complete solution but works around the most pressing
issue.
Review: http://reviews.llvm.org/D7070
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226600 91177308-0d34-0410-b5e6-96231b3b80d8
This is in preparation for a fix to llvm.org/PR22262. One of the ideas
here is to first find a good jump table range first and then split
before and after it. Thereby, we don't need to use the
split-based-on-density heuristic at all, which can make the "binary
tree" deteriorate in various cases.
Also some minor cleanups.
No functional changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226551 91177308-0d34-0410-b5e6-96231b3b80d8
Note: This change ended up being slightly more controversial than expected. Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions.
Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance.
In the near future, this will allows me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.)
Differential Revision: http://reviews.llvm.org/D6811
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226311 91177308-0d34-0410-b5e6-96231b3b80d8
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.
This is in preparation for porting this analysis to the new pass
manager.
No functionality changed, and updates inbound for Clang and Polly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226078 91177308-0d34-0410-b5e6-96231b3b80d8
This adds handling for ExceptionHandling::MSVC, used by the
x86_64-pc-windows-msvc triple. It assumes that filter functions have
already been outlined in either the frontend or the backend. Filter
functions are used in place of the landingpad catch clause type info
operands. In catch clause order, the first filter to return true will
catch the exception.
The C specific handler table expects the landing pad to be split into
one block per handler, but LLVM IR uses a single landing pad for all
possible unwind actions. This patch papers over the mismatch by
synthesizing single instruction BBs for every catch clause to fill in
the EH selector that the landing pad block expects.
Missing functionality:
- Accessing data in the parent frame from outlined filters
- Cleanups (from __finally) are unsupported, as they will require
outlining and parent frame access
- Filter clauses are unsupported, as there's no clear analogue in SEH
In other words, this is the minimal set of changes needed to write IR to
catch arbitrary exceptions and resume normal execution.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D6300
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225904 91177308-0d34-0410-b5e6-96231b3b80d8
While, generally speaking, the process of lowering arguments for a patchpoint
is the same as lowering a regular indirect call, on some targets it may not be
exactly the same. Targets may not, for example, want to add additional register
dependencies that apply only to making cross-DSO calls through linker stubs,
may not want to load additional registers out of function descriptors, and may
not want to add additional side-effect-causing instructions that cannot be
removed later with the call itself being generated.
The PowerPC target will use this in a future commit (for all of the reasons
stated above).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225806 91177308-0d34-0410-b5e6-96231b3b80d8
This name is less descriptive, but it sort of puts things in the
'llvm.frame...' namespace, relating it to frameallocate and
frameaddress. It also avoids using "allocate" and "allocation" together.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225752 91177308-0d34-0410-b5e6-96231b3b80d8
These intrinsics allow multiple functions to share a single stack
allocation from one function's call frame. The function with the
allocation may only perform one allocation, and it must be in the entry
block.
Functions accessing the allocation call llvm.recoverframeallocation with
the function whose frame they are accessing and a frame pointer from an
active call frame of that function.
These intrinsics are very difficult to inline correctly, so the
intention is that they be introduced rarely, or at least very late
during EH preparation.
Reviewers: echristo, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D6493
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225746 91177308-0d34-0410-b5e6-96231b3b80d8
SwitchInst::getNumCases() returns unsinged, so using uint64_t to count cases
seems unnecessary.
Also fix a missing CHECK in the test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224393 91177308-0d34-0410-b5e6-96231b3b80d8
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224255 91177308-0d34-0410-b5e6-96231b3b80d8
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532. Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.
I have a follow-up patch prepared for `clang`. If this breaks other
sub-projects, I apologize in advance :(. Help me compile it on Darwin
I'll try to fix it. FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.
This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.
Here's a quick guide for updating your code:
- `Metadata` is the root of a class hierarchy with three main classes:
`MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from
the `Value` class hierarchy. It is typeless -- i.e., instances do
*not* have a `Type`.
- `MDNode`'s operands are all `Metadata *` (instead of `Value *`).
- `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.
If you're referring solely to resolved `MDNode`s -- post graph
construction -- just use `MDNode*`.
- `MDNode` (and the rest of `Metadata`) have only limited support for
`replaceAllUsesWith()`.
As long as an `MDNode` is pointing at a forward declaration -- the
result of `MDNode::getTemporary()` -- it maintains a side map of its
uses and can RAUW itself. Once the forward declarations are fully
resolved RAUW support is dropped on the ground. This means that
uniquing collisions on changing operands cause nodes to become
"distinct". (This already happened fairly commonly, whenever an
operand went to null.)
If you're constructing complex (non self-reference) `MDNode` cycles,
you need to call `MDNode::resolveCycles()` on each node (or on a
top-level node that somehow references all of the nodes). Also,
don't do that. Metadata cycles (and the RAUW machinery needed to
construct them) are expensive.
- An `MDNode` can only refer to a `Constant` through a bridge called
`ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).
As a side effect, accessing an operand of an `MDNode` that is known
to be, e.g., `ConstantInt`, takes three steps: first, cast from
`Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
third, cast down to `ConstantInt`.
The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
metadata schema owners transition away from using `Constant`s when
the type isn't important (and they don't care about referring to
`GlobalValue`s).
In the meantime, I've added transitional API to the `mdconst`
namespace that matches semantics with the old code, in order to
avoid adding the error-prone three-step equivalent to every call
site. If your old code was:
MDNode *N = foo();
bar(isa <ConstantInt>(N->getOperand(0)));
baz(cast <ConstantInt>(N->getOperand(1)));
bak(cast_or_null <ConstantInt>(N->getOperand(2)));
bat(dyn_cast <ConstantInt>(N->getOperand(3)));
bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));
you can trivially match its semantics with:
MDNode *N = foo();
bar(mdconst::hasa <ConstantInt>(N->getOperand(0)));
baz(mdconst::extract <ConstantInt>(N->getOperand(1)));
bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2)));
bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3)));
bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));
and when you transition your metadata schema to `MDInt`:
MDNode *N = foo();
bar(isa <MDInt>(N->getOperand(0)));
baz(cast <MDInt>(N->getOperand(1)));
bak(cast_or_null <MDInt>(N->getOperand(2)));
bat(dyn_cast <MDInt>(N->getOperand(3)));
bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));
- A `CallInst` -- specifically, intrinsic instructions -- can refer to
metadata through a bridge called `MetadataAsValue`. This is a
subclass of `Value` where `getType()->isMetadataTy()`.
`MetadataAsValue` is the *only* class that can legally refer to a
`LocalAsMetadata`, which is a bridged form of non-`Constant` values
like `Argument` and `Instruction`. It can also refer to any other
`Metadata` subclass.
(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223802 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce the ``llvm.instrprof_increment`` intrinsic and the
``-instrprof`` pass. These provide the infrastructure for writing
counters for profiling, as in clang's ``-fprofile-instr-generate``.
The implementation of the instrprof pass is ported directly out of the
CodeGenPGO classes in clang, and with the followup in clang that rips
that code out to use these new intrinsics this ends up being NFC.
Doing the instrumentation this way opens some doors in terms of
improving the counter performance. For example, this will make it
simple to experiment with alternate lowering strategies, and allows us
to try handling profiling specially in some optimizations if we want
to.
Finally, this drastically simplifies the frontend and puts all of the
lowering logic in one place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223672 91177308-0d34-0410-b5e6-96231b3b80d8
This can significantly reduce the size of the switch, allowing for more
efficient lowering.
I also worked with the idea of exploiting unreachable defaults by
omitting the range check for jump tables, but always ended up with a
non-neglible binary size increase. It might be worth looking into some more.
SimplifyCFG currently does this transformation, but I'm working towards changing
that so we can optimize harder based on unreachable defaults.
Differential Revision: http://reviews.llvm.org/D6510
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223566 91177308-0d34-0410-b5e6-96231b3b80d8
I'm recommiting the codegen part of the patch.
The vectorizer part will be send to review again.
Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
http://reviews.llvm.org/D6191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223348 91177308-0d34-0410-b5e6-96231b3b80d8
This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them.
With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now.
I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into it's own files and anyone not working on the statepoint support itself should be able to ignore it.
During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straight forward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases.
In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principal, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure.
Reviewed by: atrick, ributzka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223137 91177308-0d34-0410-b5e6-96231b3b80d8