In the code below on 32-bit targets, x would previously get forwarded to g()
without sign-extension to 32 bits as required by the parameter attribute.
void g(signed short);
void f(unsigned short x) {
g(x);
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262352 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes calculating correct value for builtin_object_size function
when pointer is used only in builtin_object_size function call and never
after that.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D17337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262337 91177308-0d34-0410-b5e6-96231b3b80d8
Idea behind this change is to make code shorter and as much common for all targets as possible. Let's even accept more code than is valid for a particular target, leaving it for the assembler to sort out.
64bit instructions decoding added.
Error\warning messages on unrecognized instructions operands added, InstPrinter allowed to print invalid operands helping to find invalid/unsupported code.
The change is massive and hard to compare with previous version, so it makes sense just to take a look on the new version. As a bonus, with a few TD changes following, it disassembles the majority of instructions. Currently it fully disassembles >300K binary source of some blas kernel.
Previous TODOs were saved whenever possible.
Patch by: Valery Pykhtin
Differential Revision: http://reviews.llvm.org/D17720
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262332 91177308-0d34-0410-b5e6-96231b3b80d8
Function lto_module_create_in_local_context() would previously
rely on the default LLVMContext being created for it by
LTOModule::makeLTOModule(). This context exits the program on
error and is not arranged to update sLastStringError in
tools/lto/lto.cpp.
Function lto_module_create_in_local_context() now creates an
LLVMContext by itself, sets it up correctly to its needs and then
passes it to LTOModule::createInLocalContext() which takes
ownership of the context and keeps it present for the lifetime of
the returned LTOModule.
Function LTOModule::makeLTOModule() is modified to take a
reference to LLVMContext (instead of a pointer) and no longer
creates a default context when nullptr is passed to it. Method
LTOModule::createInContext() that takes a pointer to LLVMContext
is removed because it allows to pass a nullptr to it. Instead
LTOModule::createFromBuffer() (that takes a reference to
LLVMContext) should be used.
Differential Revision: http://reviews.llvm.org/D17715
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262330 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.
The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.
Reviewers: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D10970
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262316 91177308-0d34-0410-b5e6-96231b3b80d8
Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented.
For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString:
string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod";
Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod).
Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal.
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17568
[AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods
With this change you should place optional operands in order specified by asm string:
clamp -> omod
offset -> glc -> slc -> tfe
Fixes for several tests.
Depends on D17568
Patch by: Sam Kolton
Review: http://reviews.llvm.org/D17644
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262314 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262299 91177308-0d34-0410-b5e6-96231b3b80d8
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262298 91177308-0d34-0410-b5e6-96231b3b80d8
This currently does not have the control over the bitwidth,
and there are missing optimizations to reduce the integer to
32-bit if it can be.
But in most situations we do want the sinking to occur.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262296 91177308-0d34-0410-b5e6-96231b3b80d8
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to
indicate that no catch object should be initialized.
This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.
To handle this, allocate the registration node prior to any other frame
object. This will ensure that catch objects will not be allocated
before the registration node.
This fixes PR26757.
Differential Revision: http://reviews.llvm.org/D17689
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262294 91177308-0d34-0410-b5e6-96231b3b80d8
Generally speaking, this can only happen with unreachable code.
However, neglecting to check for this condition would lead us to loop
forever.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262284 91177308-0d34-0410-b5e6-96231b3b80d8
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:
__m128 mload_zeros(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(0));
}
__m128 mload_fakeones(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(1));
}
__m128 mload_ones(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}
__m128 mload_oneset(float *f) {
return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}
...so none of the above will actually generate a masked load for optimized code.
This is the masked load counterpart to:
http://reviews.llvm.org/rL262064
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262269 91177308-0d34-0410-b5e6-96231b3b80d8
This change makes the verifier a little more paranoid. It was possible
to trick the verifier into crashing or infinite looping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262268 91177308-0d34-0410-b5e6-96231b3b80d8
We can actually have dependences between accesses with different
underlying types. Bail in this case.
A test will follow shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262267 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I re-benchmarked this and results are similar to original results in
D13259:
On ARM64:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
SingleSource/Benchmarks/Polybench/stencils/adi -19.78%
On x86:
SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14%
And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop
Distribution.
In terms of compile time, there is ~5% increase on both
SingleSource/Benchmarks/Misc/oourafft and
SingleSource/Benchmarks/Linkpack/linkpack-pc. These are both very tiny
loop-intensive programs where SCEV computations dominates compile time.
The reason that time spent in SCEV increases has to do with the design
of the old pass manager. If a transform pass does not preserve an
analysis we *invalidate* the analysis even if there was *no*
modification made by the transform pass.
This means that currently we don't take advantage of LLE and LV sharing
the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV
for LLE.
(There should be a way to work around this limitation in the case of
SCEV and LAA since both compute things on demand and internally cache
their result. Thus we could pretend that transform passes preserve
these analyses and manually invalidate them upon actual modification.
On the other hand the new pass manager is supposed to solve so I am not
sure if this is worthwhile.)
Reviewers: hfinkel, dberlin
Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16300
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262250 91177308-0d34-0410-b5e6-96231b3b80d8
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.
This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).
<rdar://problem/24611008>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262247 91177308-0d34-0410-b5e6-96231b3b80d8
This is long-standing dirtiness, as acknowledged by r77582:
The current trick is to select it into a merge_values with
the first definition being an implicit_def. The proper solution is
to add new ISD opcodes for the no-output variant.
Doing this before selection will let us combine away some constructs.
Differential Revision: http://reviews.llvm.org/D17659
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262244 91177308-0d34-0410-b5e6-96231b3b80d8