Commit Graph

6817 Commits

Author SHA1 Message Date
Hiroshi Inoue
c0d997dcdf [NFC] fix trivial typos in comments
"the the" -> "the"



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323176 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-23 05:49:30 +00:00
Chandler Carruth
fd5a8723ce Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.

The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.

However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.

On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.

This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886

We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
  __llvm_external_retpoline_r11
```
or on 32-bit:
```
  __llvm_external_retpoline_eax
  __llvm_external_retpoline_ecx
  __llvm_external_retpoline_edx
  __llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.

There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.

The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.

For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.

When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.

When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.

However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.

We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.

This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.

Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer

Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D41723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323155 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 22:05:25 +00:00
Reid Kleckner
7193f5e4f0 [CodeGen] Shrink MachineOperand by 8 bytes on Windows
Use 'unsigned' for these bitfields so they actually pack together.
Previously it used three words for these bits instead of one.

Add some static_asserts to prevent this from being undone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323135 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 17:50:20 +00:00
Marina Yatsina
be6cfa9585 Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files.
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40333

Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323095 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:06:50 +00:00
Marina Yatsina
d6bf9cdf27 Rename ExecutionDepsFix files to ExecutionDomainFix
This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40332

Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323094 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:06:33 +00:00
Marina Yatsina
95ade18a36 ExecutionDepsFix refactoring:
- clang-format

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I131b126af13bc743bc5d69d83699e52b9b720979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323093 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:06:18 +00:00
Marina Yatsina
c77727aa82 ExecutionDepsFix refactoring:
- Moving comments to class definition in header file
- Changing comments to doxygen style
- Rephrase loop traversal explaining comment

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I9a12618db5b66128611fa71b54a233414f6012ac

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323092 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:06:10 +00:00
Marina Yatsina
8957ed6f20 ExecutionDepsFix refactoring:
- Removing LiveRegs

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I8ab56d99951a6d6981542f68d94c1f624f3c9fbf

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323091 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:06:01 +00:00
Marina Yatsina
982e8803a8 ExecutionDepsFix refactoring:
- Changing LiveRegs to be a vector

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: I9cdd364bd7bf2a0bf61ea41a48d4bd310ec3bce4

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323090 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:05:53 +00:00
Marina Yatsina
7dc5e8539c ExecutionDepsFix refactoring:
- Changing DenseMap<MBB*, LiveReg*> to SmallVector<LiveReg*>
- Now the MBB number will be the index of LiveReg in the vector.
- Adding asserts

This patch is NFC.

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: If4a3f141693d0361ddb292432337dbb63a1e69ee

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323089 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:05:45 +00:00
Marina Yatsina
f2a8b010a3 ExecutionDepsFix refactoring:
- Remove unneeded includes and unneeded members
- Use range iterators
- Variable renaming, typedefs, extracting constants
- Removing {} from one line ifs

This patch is NFC.

This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40330
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40331

Change-Id: Ib59060ab3fa5bee3bf2ca2045c24e572635ee7f6

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323088 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:05:37 +00:00
Marina Yatsina
ef279188ca Separate ExecutionDepsFix into 4 parts:
1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
3. BreakFalseDeps - Breaks false dependencies.
4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks.

This also included the following changes to ExcecutionDepsFix original logic:
1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class.
2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.

Additional changes in affected files:
1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate.
2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.

Additional refactoring changes will follow.

This commit is (almost) NFC.
The only functional change is that now BreakFalseDeps will break dependency for all register classes.
Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet.
In a future commit several instructions (and tests) will be added.

This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
Most of the patches are intended at refactoring the existent code.

Additional relevant reviews:
https://reviews.llvm.org/D40331
https://reviews.llvm.org/D40332
https://reviews.llvm.org/D40333
https://reviews.llvm.org/D40334

Differential Revision: https://reviews.llvm.org/D40330

Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323087 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 10:05:23 +00:00
Saleem Abdulrasool
860652c3f8 CodeGen: handle llvm.used properly for COFF
`llvm.used` contains a list of pointers to named values which the
compiler, assembler, and linker are required to treat as if there is a
reference that they cannot see.  Ensure that the symbols are preserved
by adding an explicit `-include` reference to the linker command.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323017 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-20 00:28:02 +00:00
Matthias Braun
09008367fc Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC
This avoids playing games with pseudo pass IDs and avoids using an
unreliable MRI::isSSA() check to determine whether register allocation
has happened.

Note that this renames:
- MachineLICMID -> EarlyMachineLICM
- PostRAMachineLICMID -> MachineLICMID
to be consistent with the EarlyTailDuplicate/TailDuplicate naming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322927 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-19 06:46:10 +00:00
Matthias Braun
9334f5c86b Split TailDuplicatePass into pre- and post-RA variant; NFC
Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This
avoids playing games with fake pass IDs and using MRI::isSSA() to
determine pre-/post-RA state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322926 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-19 06:08:17 +00:00
Matthias Braun
5d9e10f443 AArch64: Fix emergency spillslot being out of reach for large callframes
Re-commit of r322200: The testcase shouldn't hit machineverifiers
anymore with r322917 in place.

Large callframes (calls with several hundreds or thousands or
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.

This commit does several things:
- Compute max callframe size at the end of instruction selection.
- Add mirFileLoaded target callback. Use it to compute the max callframe size
  after loading a .mir file when the size wasn't specified in the file.
- Let TargetFrameLowering::hasFP() return true if there exists a
  callframe > 255 bytes.
- Always place the emergency spillslot close to FP if we have a frame
  pointer.
- Note that `useFPForScavengingIndex()` would previously return false
  when a base pointer was available leading to the emergency spillslot
  getting allocated late (that's the whole effect of this callback).
  Which made no sense to me so I took this case out: Even though the
  emergency spillslot is technically not referenced by FP in this case
  we still want it allocated early.

Differential Revision: https://reviews.llvm.org/D40876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322919 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-19 03:16:36 +00:00
Francis Visoiu Mistrih
7bee1ceb03 [CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::print
Committed r322867 too soon.

Differential Revision: https://reviews.llvm.org/D42239

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322868 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-18 18:05:15 +00:00
Francis Visoiu Mistrih
96ed12f5f3 [CodeGen] Print RegClasses on MI in verbose mode
r322086 removed the trailing information describing reg classes for each
register.

This patch adds printing reg classes next to every register when
individual operands/instructions/basic blocks are printed. In the case
of dumping MIR or printing a full function, by default don't print it.

Differential Revision: https://reviews.llvm.org/D42239

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322867 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-18 17:59:06 +00:00
Justin Bogner
3df0e39672 GlobalISel: Make MachineCSE runnable in the middle of the GlobalISel
Right now, it is not possible to run MachineCSE in the middle of the
GlobalISel pipeline. Being able to run generic optimizations between the
core passes of GlobalISel was one of the goals of the new ISel framework.
This is the first attempt to do it.

The problem is that MachineCSE pass assumes all register operands have a
register class, which, in GlobalISel context, won't be true until after the
InstructionSelect pass. The reason for this behaviour is that before
replacing one virtual register with another, MachineCSE pass (and most of
the other optimization machine passes) must check if the virtual registers'
constraints have a (sufficiently large) intersection, and constrain the
resulting register appropriately if such intersection exists.

GlobalISel extends the representation of such constraints from just a
register class to a triple (low-level type, register bank, register
class).

This commit adds MachineRegisterInfo::constrainRegAttrs method that extends
MachineRegisterInfo::constrainRegClass to such a triple.

The idea is that going forward we should use:

- RegisterBankInfo::constrainGenericRegister within GlobalISel's
  InstructionSelect pass
- MachineRegisterInfo::constrainRegClass within SelectionDAG ISel
- MachineRegisterInfo::constrainRegAttrs everywhere else regardless
  the target and instruction selector it uses.

Patch by Roman Tereshin. Thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322805 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-18 02:06:56 +00:00
Volkan Keles
45915dee3d Add a TargetOption to enable/disable GlobalISel
Summary:
This patch adds a new target option in order to control GlobalISel.
This will allow the users to enable/disable GlobalISel prior to the
backend by calling `TargetMachine::setGlobalISel(bool Enable)`.

No test case as there is already a test to check GlobalISel
command line options.
See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.

Reviewers: qcolombet, aemerson, ab, dsanders

Reviewed By: qcolombet

Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322773 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 22:34:21 +00:00
Benjamin Kramer
4101985b71 Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versa
compiler_rt doesn't provide them (yet), but libgcc does. PR34076.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322772 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 22:29:16 +00:00
Aditya Nandakumar
7c831c8c4a [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFC
https://reviews.llvm.org/D42149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322743 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17 19:31:33 +00:00
Volkan Keles
e628eadf7b [GlobalISel][TableGen] Add support for SDNodeXForm
Summary:
This patch adds CustomRenderer which renders the matched
operands to the specified instruction.

Targets can enable the matching of SDNodeXForm by adding
a definition that inherits from GICustomOperandRenderer and
GISDNodeXFormEquiv as follows.

def gi_imm8 : GICustomOperandRenderer<"renderImm8”>,
                       GISDNodeXFormEquiv<imm8_xform>;

Custom renderer functions should be of the form:
void render(MachineInstrBuilder &MIB, const MachineInstr &I);

Reviewers: dsanders, ab, rovka

Reviewed By: dsanders

Subscribers: kristof.beyls, javed.absar, llvm-commits, mgrang, qcolombet

Differential Revision: https://reviews.llvm.org/D42012

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322582 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-16 18:44:05 +00:00
Francis Visoiu Mistrih
19988dd44d [CodeGen][NFC] Correct case for printSubRegIdx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322541 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-16 10:53:11 +00:00
Amara Emerson
bc98528887 [GlobalISel][Legalizer] Convert some typedefs to using. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322466 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-15 00:44:20 +00:00
Matthias Braun
d0a44769ad PeepholeOptimizer: Fix for vregs without defs
The PeepholeOptimizer would fail for vregs without a definition. If this
was caused by an undef operand abort to keep the code simple (so we
don't need to add logic everywhere to replicate the undef flag).

Differential Revision: https://reviews.llvm.org/D40763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322319 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-11 22:30:43 +00:00
Wolfgang Pieb
861d9546fc [DWARF][NFC] Overload AsmPrinter::emitDwarfStringOffsets() to take a DwarfStringPoolEntry
record.

Differential Revision: https://reviews.llvm.org/D41920


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322250 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-11 02:35:00 +00:00
Matthias Braun
682df3f9a2 Revert "AArch64: Fix emergency spillslot being out of reach for large callframes"
Revert for now as the testcase is hitting a pre-existing verifier error
that manifest as a failure when expensive checks are enabled (or
-verify-machineinstrs) is used.

This reverts commit r322200.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322231 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 22:36:28 +00:00
Matthias Braun
d5651588f1 LiveRangeEdit: Inline markDeadRemat() into only user; NFC
This function was only called from a single place in which we didn't
even need the `if (DeadRemats)` check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322230 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 22:36:26 +00:00
Matthias Braun
9d85fa084a LiveRangeEdit: Simplify code; NFC
Simplify the code slightly: Instead of creating empty subranges in one
case and immediately removing them, do not create them in the first
place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322226 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 21:41:02 +00:00
Craig Topper
703dd67544 [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.

Most of this patch is just making sure we copy the scale around everywhere.

Differential Revision: https://reviews.llvm.org/D40055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322210 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 19:16:05 +00:00
Matthias Braun
11adaf9955 AArch64: Fix emergency spillslot being out of reach for large callframes
Large callframes (calls with several hundreds or thousands or
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.

This commit does several things:
- Compute max callframe size at the end of instruction selection.
- Add mirFileLoaded target callback. Use it to compute the max callframe size
  after loading a .mir file when the size wasn't specified in the file.
- Let TargetFrameLowering::hasFP() return true if there exists a
  callframe > 255 bytes.
- Always place the emergency spillslot close to FP if we have a frame
  pointer.
- Note that `useFPForScavengingIndex()` would previously return false
  when a base pointer was available leading to the emergency spillslot
  getting allocated late (that's the whole effect of this callback).
  Which made no sense to me so I took this case out: Even though the
  emergency spillslot is technically not referenced by FP in this case
  we still want it allocated early.

Differential Revision: https://reviews.llvm.org/D40876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322200 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 18:16:24 +00:00
Craig Topper
3db29f781f [lli] Make lli support -mcpu=native for CPU autodetection
llc, opt, and clang can all autodetect the CPU and supported features. lli cannot as far as I could tell.

This patch uses the getCPUStr() and introduces a new getCPUFeatureList() and uses those in lli in place of MCPU and MAttrs.

Ideally, we would merge getCPUFeatureList and getCPUFeatureStr, but opt and llc need a string and lli wanted a list. Maybe we should just return the SubtargetFeature object and let the caller decide what it needs?

Differential Revision: https://reviews.llvm.org/D41833

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322100 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-09 18:14:18 +00:00
Sanjay Patel
8abb7fa665 [SelectionDAG] lower math intrinsics to finite version of libcalls when possible (PR35672)
Ingredients in this patch:
1. Add HANDLE_LIBCALL defs for finite mathlib functions that correspond to LLVM intrinsics.
2. Plumbing to send TargetLibraryInfo down to SelectionDAGLegalize.
3. Relaxed math and library checking in SelectionDAGLegalize::ConvertNodeToLibcall() to choose finite libcalls.

There was a bug about determining the availability of the finite calls that should be fixed with:
rL322010

Not in this patch:
This doesn't resolve the question/bug of clang creating the intrinsic IR in the first place.
There's likely follow-up work needed to support the long double variants better.
There's room for improvement to reduce the code duplication.
Create finite calls that don't originate from a corresponding intrinsic or DAG node?

Differential Revision: https://reviews.llvm.org/D41338



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322087 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-09 15:41:00 +00:00
Jessica Paquette
3f8e15a628 [MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups
This commit does two things. Firstly, it adds a collection of flags which can
be passed along to the target to encode information about the MBB that an
instruction lives in to the outliner.

Second, it adds some of those flags to the AArch64 outliner in order to add
more stack instructions to the list of legal instructions that are handled
by the outliner. The two flags added check if

- There are calls in the MachineBasicBlock containing the instruction
- The link register is available in the entire block

If the link register is available and there are no calls, then a stack
instruction can always be outlined without fixups, regardless of what it is,
since in this case, the outliner will never modify the stack to create a
call or outlined frame.

The motivation for doing this was checking which instructions are most often
missed by the outliner. Instructions like, say

%sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy

are very common, but cannot be outlined in the case that the outliner might
modify the stack. This commit allows us to outline instructions like this.
  


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322048 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-09 00:26:18 +00:00
Nirav Dave
80fc28f975 [DAG] Teach BaseIndexOffset to correctly handle with indexed operations
BaseIndexOffset address analysis incorrectly ignores offsets folded
into indexed memory operations causing potential errors in alias
analysis of pre-indexed operations.

Reviewers: efriedma, RKSimon, hfinkel, jyknight

Subscribers: hiraditya, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D41701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322003 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-08 16:21:35 +00:00
Bob Wilson
ce3ee9dbde support phi ranges for machine-level IR
Add iterator ranges for machine instruction phis, similar to the IR-level
phi ranges added in r303964. I updated a few places to use this. Besides
general code simplification, this change will allow removing a non-upstream
change from Swift's copy of LLVM (in a better way than my previous attempt
in http://reviews.llvm.org/D19080).

https://reviews.llvm.org/D41672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321783 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-04 02:58:15 +00:00
Francis Visoiu Mistrih
063121dfe8 [CodeGen][NFC] Remove unused function declaration
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321758 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-03 20:56:29 +00:00
Sanjay Patel
668a58d36b [ExpandMemcmp] rename variables and add hook to override pref for number of loads per block; NFC
The preference only applies to 'memcmp() == 0' expansion, so try to make that clearer.
x86 will likely benefit by increasing the default value from '1' to '2' as seen in PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325
...so that is the planned follow-up to this clean-up step.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321756 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-03 20:02:39 +00:00
Amara Emerson
d8de4cebd9 [AArch64][GlobalISel] Enable GlobalISel at -O0 by default
Tests updated to explicitly use fast-isel at -O0 instead of implicitly.

This change also allows an explicit -fast-isel option to override an
implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0.

Differential Revision: https://reviews.llvm.org/D41362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321655 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-02 16:30:47 +00:00
Guozhi Wei
de740eaa76 Revert r321377, it causes regression to https://reviews.llvm.org/P8055.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321528 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28 17:02:34 +00:00
Guozhi Wei
7f53692f1d [SimplifyCFG] Don't do if-conversion if there is a long dependence chain
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321377 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 18:54:04 +00:00
Eli Friedman
fcc597840c [Inliner] Restrict soft-float inlining penalty.
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 02:08:08 +00:00
Matt Arsenault
082879a7af TableGen: Allow setting SDNodeProperties on intrinsics
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.

Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321212 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 19:36:28 +00:00
Nemanja Ivanovic
39c264a5b7 [JumpTables] Let targets decide which switch instructions are suitable
This commits the non-controversial part of https://reviews.llvm.org/D41029
(making the queries virtual). The PPC-specific portion of this will be
committed in a subsequent patch once some of the finer points are ironed out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321182 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 15:44:32 +00:00
Krzysztof Parzyszek
44ef9b61ca Add optional SelectionDAG* parameter to SValue::dump and SDValue::dumpr
These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321180 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 15:15:04 +00:00
Daniel Sanders
b379571ef2 [globalisel][tablegen] Allow ImmLeaf predicates to use InstructionSelector members
NFC for currently supported targets. This resolves a problem encountered by
targets such as RISCV that reference `Subtarget` in ImmLeaf predicates.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321176 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 14:41:51 +00:00
Francis Visoiu Mistrih
235e856f64 [CodeGen] Move printing MO_BlockAddress operands to MachineOperand::print
Work towards the unification of MIR and debug output by refactoring the
interfaces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321113 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 21:47:14 +00:00
Francis Visoiu Mistrih
a6e4a6beb0 [CodeGen] Refactor printOffset from MO and MIRPrinter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321109 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 21:46:55 +00:00
Francis Visoiu Mistrih
fcfc7b225f [CodeGen] Move printing MO_CFIIndex operands to MachineOperand::print
Work towards the unification of MIR and debug output by refactoring the
interfaces.

Before this patch we printed "<call frame instruction>" in the debug
output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321084 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19 16:51:52 +00:00