Summary:
Originally
i64 = umax t8, Constant:i64<4>
was expanded into
i32,i32 = umax Constant:i32<0>, Constant:i32<0>
i32,i32 = umax t7, Constant:i32<4>
Now instead the two produced umax:es return i32 instead of i32, i32.
Thanks to Jan Vesely for help with the test case.
Patch by mikael.holmen at ericsson.com
Reviewers: bogner, jvesely, tstellarAMD, arsenm
Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D28135
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291441 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I've noticed that these assertions don't trigger when the condition is false.
The problem is that the DEBUG(x) macro only executes x when the pass is
emitting debug output via the -debug and -debug-only=registerbankinfo command
line arguments.
Debug builds should always execute the assertions so use '#ifndef NDEBUG' instead.
Also removed an assertion that is only true the first time it's tested. <Target>RegisterBankInfo's constructor will re-use register banks causing them to be valid on subsequent tests. That
assertion will fail on the first test too in the near future.
Reviewers: t.p.northover, ab, rovka, qcolombet
Subscribers: dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D28358
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291235 91177308-0d34-0410-b5e6-96231b3b80d8
We used the logBase2 of the high instead of the ceilLogBase2 resulting
in the wrong result for certain values. For example, it resulted in an
i1 AssertZExt when the exclusive portion of the range was 3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291196 91177308-0d34-0410-b5e6-96231b3b80d8
do not use .cfi_sections. This requires checking if any non-declaration
function in the module needs an unwind table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291172 91177308-0d34-0410-b5e6-96231b3b80d8
Add an assert that checks whether liveins are up to date before they are
used.
- Do not print liveins into .mir files anymore in situations where they
are out of date anyway.
- The assert in the RegisterScavenger is superseded by the new one in
livein_begin().
- Skip parts of the liveness updating logic in IfConversion.cpp when
liveness isn't tracked anymore (just enough to avoid hitting the new
assert()).
Differential Revision: https://reviews.llvm.org/D27562
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291169 91177308-0d34-0410-b5e6-96231b3b80d8
To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level
basic blocks need to be initialized, as the AsmPrinter uses this link to be
able to print out labels for the basic blocks that are address-taken.
Most of the changes in this commit are about adapting existing tests to include
the basic block name that is now printed out in the MIR format, now that the
name becomes available as the link to the LLVM-IR basic block is initialized.
The relevant test change for the functionality added in this patch are the
added "(address-taken)" strings in
test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll.
Differential Revision: https://reviews.llvm.org/D28123
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291105 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero
extended. For example, given double 65534.0, without legalization:
fp-to-uint16: 65534.0 -> 0xfffe
With the legalization:
fp-to-sint32: 65534.0 -> 0x0000fffe
Without this patch, legalization wrongly emits a signed extend assertion,
which is consumed by later icmp instruction, and cause miscompile.
Note that the floating point value must be in [0, 65535), otherwise the
behavior is undefined.
This patch reverts r279223 behavior and adds more tests and
documentations.
In PR29041's context, James Molloy mentioned that:
We don't need to mask because conversion from float->uint8_t is
undefined if the integer part of the float value is not representable in
uint8_t. Therefore we can assume this doesn't happen!
which is totally true and good, because fptoui is documented clearly to
have undefined behavior when overflow/underflow happens. We should take
the advantage of this behavior so that we can save unnecessary mask
instructions.
Reviewers: jmolloy, nadav, echristo, kbarton
Subscribers: mehdi_amini, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28284
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291015 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of matching:
(a + i) + 1 -> (a + i, undef, 1)
Now it matches:
(a + i) + 1 -> (a, i, 1)
Reviewers: rengolin
Differential Revision: http://reviews.llvm.org/D26367
From: Evgeny Stupachenko <evstupac@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291012 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The InlineSpiller was accessing the DominatorTreeBase directly
through the public data member DT in the MachineDominatorTree.
This is not a good idea as the "cached" information in
SplitCriticalEdges is not applied before the access.
The DominatorTreeBase must be accessed through the member
function getBase() in MachineDominatorTree.
The fault was introduced in r266162.
I think the public data member DT in the MachineDominatorTree
should have been made private in the original code (r215576)
that introduced the concept of lazily updating the
MachineDominatorTree information from
MachineBasicBlock::SplitCriticalEdge().
Patch by Karl-Johan Karlsson <karl-johan.karlsson@ericsson.com>
Reviewers: wmi, qcolombet
Subscribers: llvm-commits, bjope, uabelho
Differential Revision: https://reviews.llvm.org/D27983
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290950 91177308-0d34-0410-b5e6-96231b3b80d8
Use getReturnedArgOperand() instead of rolling our own. Note that it's
equivalent because there can only be one 'returned' operand.
The existing code was also incorrect: there already was awkward logic to
ignore callee/EH blocks, but operands can now also be operand bundles,
in which case we'll look for non-existent parameter attributes.
Unfortunately, this isn't observable in-tree, as it only crashes when
exercising the regular call lowering logic with operand bundles.
Still, this is a nice small cleanup anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290905 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
No need to have this per-architecture. While there, unify 32-bit ARM's
behaviour with what changed elsewhere and start function names lowercase
as per the coding standards. Individual entry emission code goes to the
entry's own class.
Fully tested on amd64, cross-builds on both ARMs and PowerPC.
Reviewers: dberris
Subscribers: aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D28209
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290858 91177308-0d34-0410-b5e6-96231b3b80d8
GNU as rejects input where .cfi_sections is used after .cfi_startproc,
if the new section differs from the old. Adjust our output to always
emit .cfi_sections before the first .cfi_startproc to minimize necessary
code.
Differential Revision: https://reviews.llvm.org/D28011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290817 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies rL289013 (reverted in rL289014) with the fixes identified
in D21731. Should hopefully pass the buildbots this time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290809 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
`PromotedFloats` needs to be checked in
`DAGTypeLegalizer::PerformExpensiveChecks`. This patch fixes a few type
legalization failures with expansive checks for ARM fp16 tests.
Reviewers: baldrick, bogner, arsenm
Subscribers: arsenm, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D28187
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290796 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r290694. It broke sanitizer tests on Win64. I'll
probably bring this back, but the jump tables will just live in .text
like they do for MSVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290714 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a new intrinsic which is intended to provide memcpy functionality
with additional atomicity guarantees. Please refer to the review thread
or language reference for further details.
Differential Revision: https://reviews.llvm.org/D27133
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290708 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We were already using 32-bit jump table entries, but this was a
consequence of the default PIC model on Win64, and not an intentional
design decision. This patch ensures that we always use 32-bit label
difference jump table entries on Win64 regardless of the PIC model. This
is a good idea because it saves executable size and object file size.
Moving the jump tables to .rdata cleans up the disassembled object code
and reduces the available ROP targets, but it requires adding one more
RIP-relative lea to the code. COFF doesn't have relocations to express
the difference between two arbitrary symbols, so we can't use the jump
table label in the label difference like we do elsewhere.
Fixes PR31488
Reviewers: majnemer, compnerd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28141
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290694 91177308-0d34-0410-b5e6-96231b3b80d8
Jump table emission can switch to .rdata before
WinException::endFunction gets called. Just remember the appropriate
text section we started in and reset back to it when we end the
function. We were already switching sections back from .xdata anyway.
Fixes the first problem in PR31488, so that now COFF switch tables can
live in .rdata if we want them to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290678 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid extra (recursive) calls to computeKnownBits if we already know that there are no common known bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290490 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change rewrites a core component in the ImplicitNullChecks pass for
greater simplicity since the original design was over-complicated for no
good reason. Please review this as essentially a new pass. The change
is almost NFC and I've added a test case for a scenario that this new
code handles that wasn't handled earlier.
The implicit null check pass, at its core, is a code hoisting transform.
It differs from "normal" code transforms in that it speculates
potentially faulting instructions (by design), but a lot of the usual
hazard detection logic (register read-after-write etc.) still applies.
We previously detected hazards by keeping track of registers defined and
used by machine instructions over an instruction range, but that was
unwieldy and did not actually confer any performance benefits. The
intent was to have linear time complexity over the number of machine
instructions considered, but it ended up being N^2 is practice.
This new version is more obviously O(N^2) (with N capped to 8 by
default) in hazard detection. It does not attempt to be clever in
tracking register uses or defs (the previous cleverness here was a
source of bugs).
Once this is checked in, I'll extract out the `IsSuitableMemoryOp` and
`CanHoistLoadInst` lambda into member functions (they're too complicated
to be inline lambdas) and do some other related NFC cleanups.
Reviewers: reames, anna, atrick
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D27592
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290394 91177308-0d34-0410-b5e6-96231b3b80d8
We used to not check generic vregs, but that is actually a mistake given
nothing in the GlobalISel pipeline is going to fix the constraints on
target specific instructions. Therefore, the target has to have them
right from the start.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290380 91177308-0d34-0410-b5e6-96231b3b80d8
When generic virtual registers get constrained, because of a use on a
target specific operation for instance, we end up with regular virtual
registers with a type and that's perfectly fine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290376 91177308-0d34-0410-b5e6-96231b3b80d8
This is going to be needed to be able to constraint register class on
target specific instruction while the RegBankSelect pass did not run
yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290375 91177308-0d34-0410-b5e6-96231b3b80d8
Move the logic to constraint register from InstructionSelector to a
utility function. It will be required by other passes in the GlobalISel
pipeline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290374 91177308-0d34-0410-b5e6-96231b3b80d8
This is a succeeding patch of https://reviews.llvm.org/D22840 to address the
issue when a value to be merged into an int64 pair is in a different BB. Redoing
the store splitting in CodeGenPrepare so we can match the pattern across multiple
BBs and move some instructions into the same BB. We still keep the code in dag
combine so that we can catch cases that show up after DAG combining runs.
Differential Revision: https://reviews.llvm.org/D25914
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290365 91177308-0d34-0410-b5e6-96231b3b80d8
When the pipeliner is renaming phi values, it may need to iterate through
the phi operands to check for other phis. However, the pipeliner should
stop once it reaches a phi that is outside the pipelined loop.
Also, when the generateExistingPhis code is unable to reuse an existing
phi, the default code that computes the PhiOp2 is only to be used when
the pipeliner is generating the kernel. Otherwise, the phi may be a value
computed earlier in the same epilog.
Patch by Brendon Cahoon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290355 91177308-0d34-0410-b5e6-96231b3b80d8
When DwarfExpression is emitting a fragment that is located in a
register and that fragment is smaller than the register, and the
register must be composed from sub-registers (are you still with me?)
the last DW_OP_piece operation must not be larger than the size of the
fragment itself, since the last piece of the fragment could be smaller
than the last subregister that is being emitted.
rdar://problem/29779065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290324 91177308-0d34-0410-b5e6-96231b3b80d8
There are helpers for testing for constant or constant build_vector,
and for splat ConstantFP vectors, but not for a constantfp or
non-splat ConstantFP vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290317 91177308-0d34-0410-b5e6-96231b3b80d8
The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use.
The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.
The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it.
This aubmit also includes additional lit tests to cover better HVAs corner cases.
Differential Revision: https://reviews.llvm.org/D27392
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290240 91177308-0d34-0410-b5e6-96231b3b80d8
we used to print UNKNOWN instructions when the instruction to be printer was not
yet inserted in any BB: in that case the pretty printer would not be able to
compute a TII as the instruction does not belong to any BB or function yet.
This patch explicitly passes the TII to the pretty-printer.
Differential Revision: https://reviews.llvm.org/D27645
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290228 91177308-0d34-0410-b5e6-96231b3b80d8
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290214 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
This reapplies r289902 with additional testcase upgrades and a change
to the Bitcode record for DIGlobalVariable, that makes upgrading the
old format unambiguous also for variables without DIExpressions.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290153 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
MachineInstr::isIdenticalTo() is for some reason not
symmetric when comparing bundles, which gives us the
property:
I1->isIdenticalTo(*I2) != I2->isIdenticalTo(*I1)
when comparing bundles where one bundle is longer than
the other.
This patch makes sure that bundles of different length
always are considered as not being identical. Thus, the
result of the comparison will be the same regardless of
which side that happens to be to the left.
Reviewers: dexonsmith, jonpa, andrew.w.kaylor
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D27508
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290095 91177308-0d34-0410-b5e6-96231b3b80d8
The original version of the code in XRayInstrumentation.cpp assumed that
functions may not have empty machine basic blocks (or that the first one
couldn't be). This change addresses that by special-casing that specific
situation.
We provide two .mir test-cases to make sure we're handling this
appropriately.
Fixes llvm.org/PR31424.
Reviewers: chandlerc
Subscribers: varno, llvm-commits
Differential Revision: https://reviews.llvm.org/D27913
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290091 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
PseudoSourceValue can be used to attach a target specific value for "well behaved" side-effects lowered from target specific intrinsics.
This is useful whenever there is not an LLVM IR Value around when representing such "well behaved" side-effected operations in backends by attaching a MachineMemOperand with a custom PseudoSourceValue as this makes the scheduler not treating them as "GlobalMemoryObjects" which triggers a logic that makes the operation act like a barrier in the Schedule DAG.
This patch adds another Kind to the PseudoSourceValue object which is "TargetCustom". It indicates a type of PseudoSourceValue that has a target specific meaning (aka. LLVM shouldn't assume any specific usage for such a PSV).
It supports the possibility of having many different kinds of "TargetCustom" PseudoSourceValues.
We had a discussion about if this was valuable or not (in particular because there was a believe that PSV were going away sooner or later) but seems like they are not going anywhere and I think they are useful backend side.
It is not clear the interaction of this with MIRParser (do we need a target hook to parse these?) and I would like a comment from Alex about that :)
Reviewers: arphaman, hfinkel, arsenm
Subscribers: Eugene.Zelenko, llvm-commits
Patch By: Marcello Maggioni
Differential Revision: https://reviews.llvm.org/D13575
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290037 91177308-0d34-0410-b5e6-96231b3b80d8