Commit Graph

6782 Commits

Author SHA1 Message Date
Tom Stellard
48e90723ea Merging r322319:
------------------------------------------------------------------------
r322319 | matze | 2018-01-11 14:30:43 -0800 (Thu, 11 Jan 2018) | 7 lines

PeepholeOptimizer: Fix for vregs without defs

The PeepholeOptimizer would fail for vregs without a definition. If this
was caused by an undef operand abort to keep the code simple (so we
don't need to add logic everywhere to replicate the undef flag).

Differential Revision: https://reviews.llvm.org/D40763
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@329619 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-09 20:45:48 +00:00
Hans Wennborg
dcecdaaa04 Merging r323857:
------------------------------------------------------------------------
r323857 | rogfer01 | 2018-01-31 10:23:43 +0100 (Wed, 31 Jan 2018) | 19 lines

[ARM] Allow the scheduler to clone a node with glue to avoid a copy CPSR ↔ GPR.

In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do
copies CPSR ↔ GPR but not all Thumb1 targets implement them.

The schedule can attempt, before attempting a copy, to clone the instructions
but it does not currently do that for nodes with input glue. In this patch we
introduce a target-hook to let the hook decide if a glued machinenode is still
eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS .

As a follow-up of this change we should actually implement the copies for the
Thumb1 targets that do implement them and restrict the hook to the targets that
can't really do such copy as these clones are not ideal.

This change fixes PR35836.

Differential Revision: https://reviews.llvm.org/D42051


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324082 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-02 13:35:26 +00:00
Hans Wennborg
e307072026 Merging r323155:
------------------------------------------------------------------------
r323155 | chandlerc | 2018-01-22 23:05:25 +0100 (Mon, 22 Jan 2018) | 133 lines

Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..

Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.

The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.

However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.

On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.

This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886

We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
  __llvm_external_retpoline_r11
```
or on 32-bit:
```
  __llvm_external_retpoline_eax
  __llvm_external_retpoline_ecx
  __llvm_external_retpoline_edx
  __llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.

There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.

The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.

For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.

When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.

When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.

However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.

We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.

This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.

Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer

Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D41723
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324067 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-02 10:49:53 +00:00
Hans Wennborg
071c94c510 Merging r322003:
------------------------------------------------------------------------
r322003 | niravd | 2018-01-08 08:21:35 -0800 (Mon, 08 Jan 2018) | 11 lines

[DAG] Teach BaseIndexOffset to correctly handle with indexed operations

BaseIndexOffset address analysis incorrectly ignores offsets folded
into indexed memory operations causing potential errors in alias
analysis of pre-indexed operations.

Reviewers: efriedma, RKSimon, hfinkel, jyknight

Subscribers: hiraditya, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D41701
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322693 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 17:48:09 +00:00
Amara Emerson
d8de4cebd9 [AArch64][GlobalISel] Enable GlobalISel at -O0 by default
Tests updated to explicitly use fast-isel at -O0 instead of implicitly.

This change also allows an explicit -fast-isel option to override an
implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0.

Differential Revision: https://reviews.llvm.org/D41362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321655 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-02 16:30:47 +00:00
Guozhi Wei
de740eaa76 Revert r321377, it causes regression to https://reviews.llvm.org/P8055.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321528 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28 17:02:34 +00:00
Guozhi Wei
7f53692f1d [SimplifyCFG] Don't do if-conversion if there is a long dependence chain
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321377 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 18:54:04 +00:00
Eli Friedman
fcc597840c [Inliner] Restrict soft-float inlining penalty.
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 02:08:08 +00:00
Matt Arsenault
082879a7af TableGen: Allow setting SDNodeProperties on intrinsics
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.

Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321212 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 19:36:28 +00:00
Nemanja Ivanovic
39c264a5b7 [JumpTables] Let targets decide which switch instructions are suitable
This commits the non-controversial part of https://reviews.llvm.org/D41029
(making the queries virtual). The PPC-specific portion of this will be
committed in a subsequent patch once some of the finer points are ironed out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321182 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 15:44:32 +00:00
Krzysztof Parzyszek
44ef9b61ca Add optional SelectionDAG* parameter to SValue::dump and SDValue::dumpr
These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321180 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 15:15:04 +00:00
Daniel Sanders
b379571ef2 [globalisel][tablegen] Allow ImmLeaf predicates to use InstructionSelector members
NFC for currently supported targets. This resolves a problem encountered by
targets such as RISCV that reference `Subtarget` in ImmLeaf predicates.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321176 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 14:41:51 +00:00
Francis Visoiu Mistrih
235e856f64 [CodeGen] Move printing MO_BlockAddress operands to MachineOperand::print
Work towards the unification of MIR and debug output by refactoring the
interfaces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321113 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 21:47:14 +00:00
Francis Visoiu Mistrih
a6e4a6beb0 [CodeGen] Refactor printOffset from MO and MIRPrinter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321109 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 21:46:55 +00:00
Francis Visoiu Mistrih
fcfc7b225f [CodeGen] Move printing MO_CFIIndex operands to MachineOperand::print
Work towards the unification of MIR and debug output by refactoring the
interfaces.

Before this patch we printed "<call frame instruction>" in the debug
output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321084 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 16:51:52 +00:00
Matthias Braun
d53e70262d TargetLowering: Fix off-by-one error
This problem was present for a while, but somehow asan didn't catch
it before the refactoring in r321036.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321043 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 00:05:10 +00:00
Matthias Braun
209f048663 LiveStacks: Rename LiveStack.{h|cpp} to LiveStacks.{h|cpp}; NFC
Filenames should match the name of the class they contain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321037 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 23:19:44 +00:00
Matthias Braun
4f43d6f3e0 X86/AArch64/ARM: Factor out common sincos_stret logic; NFCI
Note:
- X86ISelLowering: setLibcallName(SINCOS) was superfluous as
  InitLibcalls() already does it.
- ARMISelLowering: Setting libcallnames for sincos/sincosf seemed
  superfluous as in the darwin case it wouldn't be used while for all
  other cases InitLibcalls already does it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321036 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 23:19:42 +00:00
Matthias Braun
40c166819e AArch64/X86: Factor out common bzero logic; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321035 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 23:14:28 +00:00
Francis Visoiu Mistrih
65ad22d969 [YAML] Add support for non-printable characters
LLVM IR function names which disable mangling start with '\01'
(https://www.llvm.org/docs/LangRef.html#identifiers).

When an identifier like "\01@abc@" gets dumped to MIR, it is quoted, but
only with single quotes.

http://www.yaml.org/spec/1.2/spec.html#id2770814:

"The allowed character range explicitly excludes the C0 control block
allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF."

http://www.yaml.org/spec/1.2/spec.html#id2776092:

"All non-printable characters must be escaped.
[...]
Note that escape sequences are only interpreted in double-quoted scalars."

This patch adds support for printing escaped non-printable characters
between double quotes if needed.

Should also fix PR31743.

Differential Revision: https://reviews.llvm.org/D41290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320996 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 17:38:03 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Matthias Braun
dfcb4f5344 MachineFunction: Slight refactoring; NFC
Slight cleanup/refactor in preparation for upcoming commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320882 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:46 +00:00
Matthias Braun
1203053a04 MachineModuleInfo: Remove unused function; NFC
Remove the unused setModule() function; it would be dangerous if someone
actually used it as it wouldn't reset/recompute various other module
related data.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320881 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:42 +00:00
Sanjay Patel
cc05ee7bde [CodeGen] fix documentation comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320835 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 18:09:33 +00:00
Francis Visoiu Mistrih
38e881da88 [CodeGen] Print stack object references as %(fixed-)stack.0 in both MIR and debug output
Work towards the unification of MIR and debug output by printing
`%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of
`<fi#-4>` (supposing there are 4 fixed stack objects).

Only debug syntax is affected.

Differential Revision: https://reviews.llvm.org/D41027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320827 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 16:33:45 +00:00
Francis Visoiu Mistrih
278b31c092 [MIR] Add support for missing CFI directives
The following CFI directives are suported by MC but not by MIR:

* .cfi_rel_offset
* .cfi_adjust_cfa_offset
* .cfi_escape
* .cfi_remember_state
* .cfi_restore_state
* .cfi_undefined
* .cfi_register
* .cfi_window_save

Add support for printing, parsing and update tests.

Differential Revision: https://reviews.llvm.org/D41230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 15:17:18 +00:00
Sam Clegg
5334180cf2 [WebAssembly] Implement @llvm.global_ctors and @llvm.global_dtors
Summary:
- lowers @llvm.global_dtors by adding @llvm.global_ctors
  functions which register the destructors with `__cxa_atexit`.
- impements @llvm.global_ctors with wasm start functions and linker metadata

See [here](https://github.com/WebAssembly/tool-conventions/issues/25) for more background.

Subscribers: jfb, dschuff, mgorny, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41211

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 00:17:10 +00:00
Matt Arsenault
45d0bf280d TLI: Allow using PSV for intrinsic mem operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320756 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 22:34:10 +00:00
Matt Arsenault
a40d3af28e DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320746 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 21:39:51 +00:00
Krzysztof Parzyszek
a3a5536590 Add MVT::v128i1, NFC
Hexagon HVX has type v128i8, comparing two vectors of that type will
produce v128i1 types in SelectionDAG.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 19:05:21 +00:00
Andrew V. Tischenko
e3c92f3b0d Any Target Asm comments should start from MachineInstr::TAsmComments value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320693 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 12:07:11 +00:00
Francis Visoiu Mistrih
3f63013fb4 [CodeGen] Print global addresses as @foo in both MIR and debug output
Work towards the unification of MIR and debug output by printing
`@foo` instead of `<ga:@foo>`.

Also print target flags in the MIR format since most of them are used on
global address operands.

Only debug syntax is affected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320682 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 10:03:09 +00:00
Francis Visoiu Mistrih
d347e9783d [CodeGen] Print jump-table index operands as %jump-table.0 in both MIR and debug output
Work towards the unification of MIR and debug output by printing `%jump-table.0` instead of `<jt#0>`.

Only debug syntax is affected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320566 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 10:30:59 +00:00
Matthias Braun
fa621d294f Rename LiveIntervalAnalysis.h to LiveIntervals.h
Headers/Implementation files should be named after the class they
declare/define.

Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 02:51:04 +00:00
Matthias Braun
f09cbb9774 Remove unnecessary includes; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320545 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 02:51:01 +00:00
Geoff Berry
3b391fe80e [MachineOperand][MIR] Add isRenamable to MachineOperand.
Summary:
Add isRenamable() predicate to MachineOperand.  This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand.  Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).

Reviewers: qcolombet, MatzeB, hfinkel

Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39400

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320503 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-12 17:53:59 +00:00
Alex Bradbury
181a9c9bcd [RISCV] Add custom CC_RISCV calling convention and improved call support
The TableGen-based calling convention definitions are inflexible, while
writing a function to implement the calling convention is very
straight-forward, and allows difficult cases to be handled more easily. With
this patch adds support for:
* Passing large scalars according to the RV32I calling convention
* Byval arguments
* Passing values on the stack when the argument registers are exhausted

The custom CC_RISCV calling convention is also used for returns.

This patch also documents the ABI lowering that a language frontend is 
expected to perform. I would like to work to simplify these requirements over 
time, but this will require further discussion within the LLVM community.

We add PendingArgFlags CCState, as a companion to PendingLocs.

The PendingLocs vector is used by a number of backends to handle arguments 
that are split during legalisation. However CCValAssign doesn't keep track of 
the original argument alignment. Therefore, add a PendingArgFlags vector which 
can be used to keep track of the ISD::ArgFlagsTy for every value added to 
PendingLocs.

Differential Revision: https://reviews.llvm.org/D39898



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320359 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-11 12:49:02 +00:00
Alex Bradbury
1e4e4646ad [RISCV] Support lowering FrameIndex
Introduces the AddrFI "addressing mode", which is necessary simply because 
it's not possible to write a pattern that directly matches a frameindex.

Ensure callee-saved registers are accessed relative to the stackpointer. This
is necessary as callee-saved register spills are performed before the frame
pointer is set.

Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can 
make use of it in the RISC-V backend.

Differential Revision: https://reviews.llvm.org/D39848


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320353 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-11 11:53:54 +00:00
Dylan McKay
68063cffdf Relax unaligned access assertion when type is byte aligned
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).

In these architectures, all types are aligned to 8-bits.

After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.

This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html

Reviewers: bogner, nemanjai, joerg, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cactus, llvm-commits

Differential Revision: https://reviews.llvm.org/D39946

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320243 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-09 06:45:36 +00:00
Francis Visoiu Mistrih
e28484a720 [CodeGen] Move printing MO_Immediate operands to MachineOperand::print
Work towards the unification of MIR and debug output by refactoring the
interfaces.

Add support for operand subreg index as an immediate to debug printing
and use ::print in the MIRPrinter.

Differential Review: https://reviews.llvm.org/D40965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320209 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-08 22:53:21 +00:00
Francis Visoiu Mistrih
fd11bc0813 [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320022 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 10:40:31 +00:00
Florian Hahn
f42965e37e [AArch64] Add patterns to replace fsub fmul with fma fneg.
Summary:
This patch adds MachineCombiner patterns for transforming
(fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower
latency on micro architectures where fneg is cheap.

Patch based on work by George Steed.

Reviewers: rengolin, joelkevinjones, joel_k_jones, evandro, efriedma

Reviewed By: evandro

Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319980 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 22:48:36 +00:00
Hans Wennborg
e33d953bea Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319824 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 20:22:20 +00:00
Bjorn Pettersson
efd6852a7b [DAGCombine] Handle big endian correctly in CombineConsecutiveLoads
Summary:
Found out, at code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.

A BUILD_PAIR is always having the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big endian targets, we should check
if the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.

Normally this bug only resulted in missed oppurtunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.

Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
  t76: i64,ch = load<LD8[FixedStack-9]
instead of
  t37: i32,ch = load<LD4[FixedStack-10]>
  t35: i32,ch = load<LD4[FixedStack-9]>
  t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transfomation is correct now.

Reviewers: niravd, hfinkel

Reviewed By: niravd

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D40444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319771 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 14:50:05 +00:00
Jonas Paulsson
48d461fa38 [Regalloc] Generate and store multiple regalloc hints.
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.

NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.

This should be an improvement generally for any target!

The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.

The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.

Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 10:52:24 +00:00
Daniel Sanders
02ef0610a9 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319739 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 05:52:07 +00:00
Matthias Braun
c93edadc6a MachineFrameInfo: Cleanup some parameter naming inconsistencies; NFC
Consistently use the same parameter names as the names of the affected
fields. This avoids some unintuitive abbreviations like `isSS`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319722 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 01:18:15 +00:00
Hans Wennborg
206bdbac74 Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 22:21:15 +00:00
Daniel Sanders
e35711676b [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319691 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 20:39:32 +00:00
Francis Visoiu Mistrih
ca0df55065 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 17:18:51 +00:00