8137 Commits

Author SHA1 Message Date
Bevin Hansson
4f8b0d2f56 [Intrinsic] Add fixed point saturating division intrinsics.
Summary:
This patch adds intrinsics and ISelDAG nodes for signed
and unsigned fixed-point division:

```
llvm.sdiv.fix.sat.*
llvm.udiv.fix.sat.*
```

These intrinsics perform scaled, saturating division
on two integers or vectors of integers. They are
required for the implementation of the Embedded-C
fixed-point arithmetic in Clang.

Reviewers: bjope, leonardchan, craig.topper

Subscribers: hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71550
2020-02-24 10:50:52 +01:00
Bevin Hansson
d148a7c68f [MC] Widen the functional unit type from 32 to 64 bits.
Summary:
The type used to represent functional units in MC is
'unsigned', which is 32 bits wide. This is currently
not a problem in any upstream target as no one seems
to have hit the limit on this yet, but in our
downstream one, we need to define more than 32
functional units.

Increasing the size does not seem to cause a huge
size increase in the binary (an llc debug build went
from 1366497672 to 1366523984, a difference of 26k),
so perhaps it would be acceptable to have this patch
applied upstream as well.

Subscribers: hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71210
2020-02-24 09:37:00 +01:00
Craig Topper
8e843e8cc6 [SelectionDAG] Remove SelectionDAG::getTargetMemSDNode now that its not used.
Targets are expected to use getMemIntrinsicNode and not provide
their own subclasses. X86 was previously the only user.
2020-02-23 15:13:50 -08:00
Quentin Colombet
e023c34590 [GISel][KnownBits] Add a cache mechanism to speed compile time
This patch adds a cache that is valid only for the duration of a call
to getKnownBits. With such short lived cache we avoid all the problems
of cache invalidation while still getting the benefits of reusing
the information we already computed.

This cache is useful whenever an instruction occurs more than once
in a chain of computation.
E.g.,
v0 = G_ADD v1, v2
v3 = G_ADD v0, v1

Previously we would compute the known bits for:
v1, v2, v0, then v1 again and finally v3.

With the patch, now we won't have to recompute v1 again.

NFC
2020-02-21 14:31:42 -08:00
Sanjay Patel
8f7d151c8a [SelectionDAG] remove unused isFast() helper function; NFC
We want flag users to check individual fast-math flags,
not that all of them are set. This was also probably
not working as intended because NoFPExcept isn't always
set on non-strict nodes.
2020-02-21 16:58:10 -05:00
Hiroshi Yamauchi
e2050248f6 [BFI] Fix missed BFI updates in MachineSink.
Summary:
This prevents BFI queries on new blocks (from
MachineSinking::GetAllSortedSuccessors) and fixes a bunch of assert failures
under -check-bfi-unknown-block-queries=true.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74511
2020-02-21 09:50:54 -08:00
Craig Topper
8c6da7f0b6 [AArch64] Move isOverflowIntrOpRes help function to the ISD namespace in SelectionDAG.h. NFC
Enables sharing with an upcoming X86 change.
2020-02-20 08:50:17 -08:00
Sam Parker
a24415143b [NFC][RDA] Break-up initialization code
Separate out the initialization code from the loop traversal so
that the analysis can be reset and re-run by a user.
2020-02-20 14:59:42 +00:00
Djordje Todorovic
e67d8322ba Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGfaff707db82d.
A failure found on an ARM 2-stage buildbot.
The investigation is needed.
2020-02-20 14:41:39 +01:00
Bill Wendling
c708922cd0 Include static prof data when collecting loop BBs
Summary:
If the programmer adds static profile data to a branch---i.e. uses
"__builtin_expect()" or similar---then we should honor it. Otherwise,
"__builtin_expect()" is ignored in crucial situations. So we trust that
the programmer knows what they're doing until proven wrong.

Subscribers: hiraditya, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74809
2020-02-19 11:33:48 -08:00
Florian Hahn
476c0607dd [TargetLower] Update shouldFormOverflowOp check if math is used.
On some targets, like SPARC, forming overflow ops is only profitable if
the math result is used: https://godbolt.org/z/DxSmdB
This patch adds a new MathUsed parameter to allow the targets
to make the decision and defaults to only allowing it
if the math result is used. That is the conservative choice.

This patch also updates AArch64ISelLowering, X86ISelLowering,
ARMISelLowering.h, SystemZISelLowering.h to allow forming overflow
ops if the math result is not used. On those targets using the
overflow intrinsic for the overflow check only generates better code.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D74722
2020-02-19 11:28:33 +01:00
Djordje Todorovic
fdc5995043 Reland "[DebugInfo] Enable the debug entry values feature by default"
Differential Revision: https://reviews.llvm.org/D73534
2020-02-19 11:12:26 +01:00
Aditya Nandakumar
6daddf8fea [GlobalISel]: Fix some non determinism exposed in CSE due to not notifying observers about mutations + add verification for CSE
https://reviews.llvm.org/D67133

While investigating some non determinism (CSE doesn't produce wrong
code, it just doesn't CSE some times) in GISel CSE on an out of tree
target, I realized that the core issue was that there were lots of code
that mutates (setReg, setRegClass etc), but doesn't notify observers
(CSE in this case but this could be any other observer). In order to
make the Observer be available in various parts of code and to avoid
having to thread it through various API, the MachineFunction now has the
observer as field. This allows it to be easily used in helper functions
such as constrainOperandRegClass.
Also added some invariant verification method in CSEInfo which can
catch these issues (when CSE is enabled).
2020-02-18 14:54:57 -08:00
Simon Pilgrim
6d3c257dcf [TargetLowering] Add SimplifyMultipleUseDemandedBits 'all elements' helper wrapper. NFC. 2020-02-18 19:53:50 +00:00
Sander de Smalen
04e619f3c1 Add OffsetIsScalable to getMemOperandWithOffset
Summary:
Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo,
has the effect that all places where this information is used
(notably, TargetInstrInfo::getMemOperandWithOffset) will need
to consider Scale - and derived, Offset - possibly being scalable.

This patch adds a new operand `bool &OffsetIsScalable` to
TargetInstrInfo::getMemOperandWithOffset and fixes up all
the places where this function is used, to consider the
offset possibly being scalable.

In most cases, this means bailing out because the algorithm does not
(or cannot) support scalable offsets in places where it does some
form of alias checking for example.

Reviewers: rovka, efriedma, kristof.beyls

Reviewed By: efriedma

Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72758
2020-02-18 15:53:29 +00:00
Djordje Todorovic
a8a9374ec7 Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGa82d3e8a6e67.
2020-02-18 16:38:11 +01:00
Djordje Todorovic
7e0c075702 Reland "[DebugInfo] Enable the debug entry values feature by default"
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-18 14:41:08 +01:00
Matt Arsenault
7ca27c32cc AMDGPU/GlobalISel: Allow arbitrary global values
Treat unknown address spaces as global
2020-02-17 11:32:28 -08:00
Matt Arsenault
936ccc1714 GlobalISel: Allow running localizer earlier
This required legal and regbankselected MIR for seemingly no
reason. For AMDGPU this wouldn't see legalized G_GLOBAL_VALUEs.
2020-02-17 11:24:06 -08:00
Matt Arsenault
7779132b3d GlobalISel: Fix missing const 2020-02-17 09:30:41 -08:00
Simon Pilgrim
b87b1adeb2 [SelectionDAG] Expose the "getValidShiftAmount" helpers available. NFCI.
These are going to be useful in TargetLowering::SimplifyDemandedBits, so expose these helpers outside of SelectionDAG.cpp

Also add an getValidShiftAmountConstant early-out to getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant so we can use them for scalar cases as well.
2020-02-17 16:28:46 +00:00
Matt Arsenault
2158d90a04 GlobalISel: Add combine to narrow G_LSHR
Produce an unmerge to a narrower type and introduce a narrower shift
if needed. I wasn't sure if there was a better way to parameterize the
target's preferred shift type for the GICombineRule, so manually call
the combine helper.
2020-02-17 08:04:52 -08:00
Matt Arsenault
529fcd14cb GlobalISel: Add matcher for G_LSHR 2020-02-17 09:20:13 -05:00
Fangrui Song
13339a0dea [MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex} 2020-02-14 23:08:40 -08:00
Fangrui Song
a791526017 [AsmPrinter][XRay] Omit unique ID for xray_instr_map and xray_fn_idx
Follow-up for D74006.
2020-02-14 21:10:46 -08:00
Fangrui Song
917b37117e [AsmPrinter] Omit unique ID for __patchable_function_entries sections
Follow-up for D74006.

When the integrated assembler is used, we use SHF_LINK_ORDER.  The
linked-to symbol is part of ELFSectionKey, thus we can omit the unique
ID.
2020-02-14 20:54:54 -08:00
Matt Arsenault
c9e2bd579f GlobalISel: Remove unused function argument 2020-02-14 15:57:39 -08:00
Matt Arsenault
a191fc7b7d GlobalISel: Lower s64->s16 G_FPTRUNC
This is more or less directly ported from the AMDGPU custom lowering
for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES
instead of creating shift/trunc to extract the two halves, and zexting
an inverted compare instead of select_cc).

This also does not include the fast math expansion the DAG which
converts to f32 and then to f16. I think that belongs in a
pre-legalize combine instead.
2020-02-14 10:46:58 -08:00
Volkan Keles
7d4c004c85 [GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges
Like COPY instructions explained in D70616, we don't check the constraints
when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check
if registers can be replaced, or a COPY instruction needs to be built.

https://reviews.llvm.org/D70564
2020-02-14 10:45:58 -08:00
Matt Arsenault
1b60178765 TTI: Fix vectorization cost for bswap 2020-02-14 10:14:07 -08:00
Fangrui Song
f2049f8c24 [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
Fangrui Song
e81ba0b0a3 [AsmPrinter] De-capitalize all AsmPrinter::Emit* but EmitInstruction
Similar to rL328848.
2020-02-13 17:06:24 -08:00
Fangrui Song
216ba73a16 [AsmPrinter] De-capitalize some AsmPrinter::Emit* functions
Similar to rL328848.
2020-02-13 13:38:33 -08:00
Fangrui Song
2357c0bf62 [AsmPrinter] De-capitalize Emit{Function,BasicBlock]* and Emit{Start,End}OfAsmFile 2020-02-13 13:22:49 -08:00
Matt Arsenault
8f4091003d GlobalISel: Don't use LLT references
These should always be passed by value
2020-02-13 15:25:30 -05:00
Guozhi Wei
af1f058dcb [MBP] Partial tail duplication into hot predecessors
Current tail duplication embedded in MBP duplicates a BB into all or none of its predecessors without too much cost analysis. So sometimes it is duplicated into cold predecessors, and in other cases it may miss the duplication into hot predecessors.

This patch improves tail duplication in 3 aspects:

  A successor can be duplicated into part of its predecessors.
  A more fine-grained benefit analysis, combined with 1, now a successor is duplicated into hot predecessors only.
  If a successor can't be duplicated into one predecessor, it doesn't impact the duplication into other predecessors.

Differential Revision: https://reviews.llvm.org/D73387
2020-02-12 15:22:33 -08:00
Simon Pilgrim
714eb197b4 [TargetLowering] Add NegatibleCost enum for isNegatibleForFree return codes
The isNegatibleForFree/getNegatedExpression methods currently rely on a raw char value to indicate whether a negation is beneficial or not.

This patch replaces the char return value with an NegatibleCost enum to more clearly demonstrate what is implied.

It also renames isNegatibleForFree to getNegatibleCost to more accurately reflect whats going on.

Differential Revision: https://reviews.llvm.org/D74221
2020-02-12 11:51:42 +00:00
Djordje Todorovic
45581db685 Revert "[DebugInfo] Enable the debug entry values feature by default"
This reverts commit rG9f6ff07f8a39.

Found a test failure on clang-with-thin-lto-ubuntu buildbot.
2020-02-12 11:59:04 +01:00
Djordje Todorovic
69262470d0 [DebugInfo] Enable the debug entry values feature by default
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-12 10:25:14 +01:00
Craig Topper
42881b3a20 [X86][LegalizeTypes] Add SoftPromoteHalf support STRICT_FP_EXTEND and STRICT_FP_ROUND
This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses
them to implement soft promotion for the half type. This is
enough to provide basic support for __fp16 with strictfp.

Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C
is enabled.
2020-02-11 22:30:04 -08:00
Justin Lebar
0e4d775a3f Use std::foo_t rather than std::foo in LLVM.
Summary: C++14 migration. No functional change.

Reviewers: bkramer, JDevlieghere, lebedev.ri

Subscribers: MatzeB, hiraditya, jkorous, dexonsmith, arphaman, kadircet, lebedev.ri, usaxena95, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74384
2020-02-11 15:12:51 -08:00
Kai Luo
8bf871612b [NFC] Fix typo. 2020-02-11 13:58:35 +08:00
serge_sans_paille
35ac7e7659 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille
85fea1133e Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554f5f4f90342f40afd35b077f6d88213.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille
9120fd7c93 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
serge-sans-paille
3db641e8fe Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732bcf1911210903ee9811033d5588e0d.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille
a978f6dd05 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Benjamin Kramer
7939b1a9cd ArrayRef'ize spillCalleeSavedRegisters. NFCI. 2020-02-08 12:19:23 +01:00
Simon Pilgrim
f4ac4433ae [TargetLowering] Remove isDesirableToCombineBuildVectorToShuffleTruncate target hook. NFC.
This hasn't been used for years, its original implementation, D35700, had bugs that caused the reversion of most of the code, and since then x86 shuffle lowering/combining has handled most cases and can deal with the rest as well.
2020-02-08 08:55:51 +00:00
Nico Weber
52288fac0f Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad6813a4c8cdb8dc20ea6337aa1f61e0.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00