Commit Graph

138778 Commits

Author SHA1 Message Date
Amara Emerson
fcbefce153 [GlobalISel] Enable usage of BranchProbabilityInfo in IRTranslator.
We weren't using this before, so none of the MachineFunction CFG edges had the
branch probability information added. As a result, block placement later in the
pipeline was flying blind.

This is enabled only with optimizations enabled like SelectionDAG.

Differential Revision: https://reviews.llvm.org/D86824
2020-09-09 14:31:12 -07:00
Amara Emerson
6f86f1afef [GlobalISel][IRTranslator] Generate better conditional branch lowering.
This is a port of the functionality from SelectionDAG, which tries to find
a tree of conditions from compares that are then combined using OR or AND,
before using that result as the input to a branch. Instead of naively
lowering the code as is, this change converts that into a sequence of
conditional branches on the sub-expressions of the tree.

Like SelectionDAG, we re-use the case block codegen functionality from
the switch lowering utils, which causes us to generate some different code.
The result of which I've tried to mitigate in earlier combine patches.

Differential Revision: https://reviews.llvm.org/D86665
2020-09-09 13:16:11 -07:00
Amara Emerson
a7636dc8f8 [GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less.
This combine previously tried to take sequences like:
  %cond = G_ICMP pred, a, b
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and by inverting the compare predicate and swapping branch targets, delete the
G_BR and instead have a single conditional branch to the falsebb. Since in an
earlier patch we have a combine to fold not(icmp) into just an inverted icmp,
we don't need this combine to do as much. This patch instead generalizes the
combine by just looking for:
  G_BRCOND %cond, %truebb
  G_BR %falsebb
%truebb:
  ...
%falsebb:
  ...

and then inverting the condition using a not (xor). The xor can be folded away
in a separate combine. This change also lets us avoid some optimization code
in the IRTranslator.

I also think that deleting G_BRs in the combiner is unnecessary. That's
something that targets can decide to do at selection time and could simplify
generic code in future.

Differential Revision: https://reviews.llvm.org/D86664
2020-09-09 13:08:16 -07:00
Hiroshi Yamauchi
dc79f6327a [X86] Add support for using fast short rep mov for memcpy lowering.
Disabled by default behind an option.

Differential Revision: https://reviews.llvm.org/D86883
2020-09-09 12:46:40 -07:00
Jian Cai
cec86ef133 [MC] Resolve the difference of symbols in consecutive MCDataFragements
Try to resolve the difference of two symbols in consecutive MCDataFragments.
This is important for an idiom like "foo:instr; .if . - foo; instr; .endif"
(https://bugs.llvm.org/show_bug.cgi?id=43795).

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D69411
2020-09-09 12:35:43 -07:00
Fangrui Song
cc25215e10 [gcov] Give the __llvm_gcov_ctr load instruction a name for more readable output 2020-09-09 12:34:43 -07:00
Krzysztof Parzyszek
90ea855f41 [Hexagon] Account for truncating pairs to non-pairs when widening truncates
Added missing selection patterns for vpackl.
2020-09-09 14:31:52 -05:00
Fangrui Song
ce51bde5c3 [gcov] Don't split entry block; add a synthetic entry block instead
The entry block is split at the first instruction where `shouldKeepInEntry`
returns false. The created basic block has a br jumping to the original entry
block. The new basic block causes the function label line and the other entry
block lines to be covered by different basic blocks, which can affect line
counts with special control flows (fork/exec in the entry block requires
heuristics in llvm-cov gcov to get consistent line counts).

  int main() { // BB0
    return 0;  // BB2 (due to entry block splitting)
  }
  // BB1 is the exit block (since gcov 4.8)

This patch adds a synthetic entry block (like PGOInstrumentation and GCC) and
inserts an edge from the synthetic entry block to the original entry block. We
can thus remove the tricky `shouldKeepInEntry` and entry block splitting. The
number of basic blocks does not change, but the emitted .gcno files will be
smaller because we can save one GCOV_TAG_LINES tag.

  // BB0 is the synthetic entry block with a single edge to BB2
  int main() { // BB2
    return 0;  // BB2
  }
  // BB1 is the exit block (since gcov 4.8)
2020-09-09 12:25:24 -07:00
Guillaume Chatelet
21ae6eda63 [NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst.
See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168

This is the fixed version of D83375, which was submitted and reverted.

Differential Revision: https://reviews.llvm.org/D87373
2020-09-09 19:10:30 +00:00
Mark de Wever
63ced1a0ad Implements [[likely]] and [[unlikely]] in IfStmt.
This is the initial part of the implementation of the C++20 likelihood
attributes. It handles the attributes in an if statement.

Differential Revision: https://reviews.llvm.org/D85091
2020-09-09 20:48:37 +02:00
Krzysztof Parzyszek
3d66ccadfc [DSE] Handle masked stores 2020-09-09 13:31:31 -05:00
Ulrich Weigand
487a2ff693 [DAGCombine] Skip re-visiting EntryToken to avoid compile time explosion
During the main DAGCombine loop, whenever a node gets replaced, the new
node and all its users are pushed onto the worklist.  Omit this if the
new node is the EntryToken (e.g. if a store managed to get optimized
out), because re-visiting the EntryToken and its users will not uncover
any additional opportunities, but there may be a large number of such
users, potentially causing compile time explosion.

This compile time explosion showed up in particular when building the
SingleSource/UnitTests/matrix-types-spec.cpp test-suite case on any
platform without SIMD vector support.

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86963
2020-09-09 19:13:46 +02:00
Mircea Trofin
7134ae5a9c [NFC][MLInliner] Don't initialize in an assert.
Since the build bots have assertions enabled, this flew under the radar.
2020-09-09 09:56:07 -07:00
Simon Pilgrim
82fd96da09 X86CallFrameOptimization.cpp - use const references where possible. NFCI. 2020-09-09 16:35:08 +01:00
Simon Pilgrim
a943063683 X86FrameLowering::adjustStackWithPops - cleanup auto usage. NFCI.
Don't use auto for non-obvious types, and use const references.
2020-09-09 16:15:02 +01:00
Qiu Chaofan
6053db5a2e [PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering
In standard C library, both rint and nearbyint returns rounding result
in current rounding mode. But nearbyint never raises inexact exception.
On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising inexact
exception. So we can't select constrained fnearbyint into xvrdpic.

One exception here is xsrqpi, which will not raise inexact exception, so
fnearbyint f128 is okay here.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87220
2020-09-09 22:40:58 +08:00
Jay Foad
d81eb83109 [AMDGPU] Simplify S_SETREG_B32 case in EmitInstrWithCustomInserter
NFC.
2020-09-09 15:18:31 +01:00
Dmitry Preobrazhensky
ba5e71a51e [AMDGPU][MC] Improved diagnostic messages for invalid registers
Corrected parser to issue meaningful error messages for invalid and malformed registers.

See bug 41303: https://bugs.llvm.org/show_bug.cgi?id=41303

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D87234
2020-09-09 16:44:03 +03:00
Alon Kom
183a8b689c [MachinePipeliner] Fix II_setByPragma initialization
II_setByPragma was not reset between 2 calls of the MachinePipleiner pass

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D87088
2020-09-09 13:38:35 +00:00
Denis Antrushin
a5e6ed283a [Statepoints] Update DAG root after emitting statepoint.
Since we always generate CopyToRegs for statepoint results,
we must update DAG root after emitting statepoint, so that
these copies are scheduled before any possible local uses.
Note: getControlRoot() flushes all PendingExports, not only
those we generates for relocates. If that'll become a problem,
we can change it to flushing relocate exports only.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D87251
2020-09-09 20:22:10 +07:00
Ronak Chauhan
7073a2ba14 Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors"
This reverts commit 487a80531006add8102d50dbcce4b6fd729ab1f6.

Tests fail on big endian machines.
2020-09-09 18:01:28 +05:30
Simon Pilgrim
2c3a34bc05 [KnownBits] Move SelectionDAG::computeKnownBits ISD::ABS handling to KnownBits::abs
Move the ISD::ABS handling to a KnownBits::abs handler, to simplify future implementations in ValueTracking/GlobalISel.
2020-09-09 13:22:58 +01:00
David Stenberg
7d4bb5d4ca [UnifyFunctionExitNodes] Fix Modified status for unreachable blocks
If a function had at most one return block, the pass would return false
regardless if an unified unreachable block was created.

This patch fixes that by refactoring runOnFunction into two separate
helper functions for handling the unreachable blocks respectively the
return blocks, as suggested by @bjope in a review comment.

This was caught using the check introduced by D80916.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D85818
2020-09-09 13:36:03 +02:00
Juneyoung Lee
624f3d3053 [BuildLibCalls] Add more noundef to library functions
This patch follows D85345 and adds more noundef attributes to return values/arguments of library functions
that are mostly about accessing the file system or processes.

A few functions like `chmod` or `times` use typedef `mode_t` and `clock_t`.
They are neither struct nor union, so they cannot contain undef even if they're lowered to iN in IR. So, it is fine to add noundef to them.

- clock_t's actual type is size_t (C17, 7.27.1.3), so it isn't struct or union.

- For mode_t, either int or long is used in practice because programmers use bit manipulation. So, I think it is okay that it's never aggregate in practice.

After this patch, the remaining library functions are those that eagerly participate in optimizations: they can be removed, reordered, or
introduced by a transformation from primitive IR operations.
For them, a few testings is needed, since it may not be valid to add noundef anymore even if C standard says it's okay.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85894
2020-09-09 20:33:35 +09:00
Juneyoung Lee
bd9d25252a [ValueTracking] Add UndefOrPoison/Poison-only version of relevant functions
This patch adds isGuaranteedNotToBePoison and programUndefinedIfUndefOrPoison.

isGuaranteedNotToBePoison will be used at D75808. The latter function is used at isGuaranteedNotToBePoison.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84242
2020-09-09 20:00:26 +09:00
Simon Pilgrim
3a3a752c6d TrigramIndex.cpp - remove unnecessary includes. NFCI.
TrigramIndex.h already includes most of these.
2020-09-09 11:38:31 +01:00
Simon Pilgrim
a2525d173c ARMTargetParser.cpp - use auto const references in for range loops. NFCI.
Fix static analysis warnings about unnecessary copies.
2020-09-09 11:38:31 +01:00
Simon Pilgrim
095c0be790 [APFloat] Fix uninitialized variable in IEEEFloat constructors
Some constructors of IEEEFloat do not initialize member variable exponent.
Fix it by initializing exponent with the following values:

For NaNs, the `exponent` is `maxExponent+1`.
For Infinities, the `exponent` is `maxExponent+1`.
For Zeroes, the `exponent` is `maxExponent-1`.

Patch by: @nullptr.cpp (Yang Fan)

Differential Revision: https://reviews.llvm.org/D86997
2020-09-09 11:38:30 +01:00
Mirko Brkusanin
049222554b [AMDGPU] Workaround for LDS Misalignment bug on GFX10
Add subtarget feature check to avoid using ds_read/write_b96/128 with too
low alignment if a bug is present on that specific hardware.
Add this "feature" to GFX 10.1.1 as it is also affected.
Add global-isel test.
2020-09-09 11:46:09 +02:00
Florian Hahn
1aaacc4182 [EarlyCSE] Explicitly require AAResultsWrapperPass.
The MemorySSAWrapperPass depends on AAResultsWrapperPass and if
MemorySSA is preserved but AAResultsWrapperPass is not, this could lead
to a crash when updating the last user of the MemorySSAWrapperPass.

Alternatively AAResultsWrapperPass could be marked preserved by GVN, but
I am not sure if that would be safe. I am not sure what is required in
order to preserve AAResultsWrapperPass. At the moment, it seems like a
couple of passes that do similar transforms to GVN are preserving it.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87137
2020-09-09 09:14:50 +01:00
Denis Antrushin
1969b82658 [Statepoints] Properly handle const base pointer.
Current code in InstEmitter assumes all GC pointers are either
VRegs or stack slots - hence, taking only one operand.
But it is possible to have constant base, in which case it
occupies two machine operands.

Add a convinience function to StackMaps to get index of next
meta argument and use it in InsrEmitter to properly advance to
the next statepoint meta operand.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D87252
2020-09-09 14:07:00 +07:00
Sam Parker
12e368a93a [ARM] Try to rematerialize VCTP instructions
We really want to try and avoid spilling P0, which can be difficult
since there's only one register, so try to rematerialize any VCTP
instructions.

Differential Revision: https://reviews.llvm.org/D87280
2020-09-09 07:41:22 +01:00
Johannes Doerfert
4844c8dda8 [Attributor] Cleanup ::initialize of various AAs
This commit cleans up the ::initialize method of various AAs in the
following ways:
  - If an associated function is required, give up on declarations.
    This was discovered as a real problem when lots of llvm.dbg.XXX
    call sites were assumed `noreturn` until proven otherwise. That
    does not make any sense and caused huge regressions and missed
    deductions.
  - Require more associated declarations for function interface AAs.
  - Use the IRAttribute::initialize to determine if function interface
    AAs can be used in IPO, don't replicate the checks (especially
    isFunctionIPOAmendable) all over the place. Arguably the function
    declaration check should be moved to some central place to.
2020-09-09 01:38:25 -05:00
Fangrui Song
e83f76a177 [llvm-cov gcov] Simply computation of line counts and exit block counter 2020-09-08 23:15:37 -07:00
Johannes Doerfert
fd33d62d25 [Attributor] Associate the callback callee with a call site argument (if any)
If we have a callback, call site arguments were already associated with
the callback callee. Now we also associate the function with the
callback callee, thus we know ensure that the following holds true (if
all return nonnull):
   `getAssociatedArgument()->getParent() == getAssociatedFunction()`

To test this an early exit from
  `AAMemoryBehaviorCallSiteArgument::initialize``
is included as well. Without the change to getAssociatedFunction() this
kind of early exit for declarations would cause callback call site
arguments to miss out.
2020-09-09 00:52:17 -05:00
Johannes Doerfert
0e508fe501 [Attributor] Cleanup IRPosition::getArgNo usages
As we handle callback calls we need to disambiguate the call site
argument number from the callee argument number. While always equal in
non-callback calls, a callback comes with a partial parameter-argument
mapping so there is no implicit correspondence. Here we split
`IRPosition::getArgNo()` into two public functions, `getCallSiteArgNo()`
and `getCalleeArgNo()`. Usages are adjusted to pick the right one for
their purpose. This fixed some problems that would have been exposed as
we more aggressively optimize callbacks.
2020-09-09 00:52:17 -05:00
Johannes Doerfert
15dac76b77 [Attributor] Selectively look at the callee even when there are operand bundles
While operand bundles carry unpredictable semantics, we know some of
them and can therefore "ignore" them. In this case we allow to look at
the declaration of `llvm.assume` when asked for the attributes at a call
site. The assume operand bundles we have do not invalidate the
declaration attributes.

We cannot test this in isolation because the llvm.assume attributes are
determined by the parser. However, a follow up patch will provide test
coverage.
2020-09-09 00:52:17 -05:00
Johannes Doerfert
8c1b461de9 [Attributor] Provide a command line option that limits recursion depth
In `MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.cpp` we initialized
attributes until stack frame ~35k caused space to run out. The initial
size 1024 is pretty much random.
2020-09-09 00:47:02 -05:00
Max Kazantsev
e249eaa2a3 [NFC] Move functon from IndVarSimplify to SCEV
This function can be reused in other places.

Differential Revision: https://reviews.llvm.org/D87274
Reviewed By: fhahn, lebedev.ri
2020-09-09 11:20:59 +07:00
Krzysztof Parzyszek
7bb3107990 [Hexagon] Fix order of operands in V6_vdealb4w 2020-09-08 22:09:28 -05:00
Fangrui Song
789e9aff28 [llvm-cov gcov] Compute unmeasured arc counts by Kirchhoff's circuit law
For a CFG G=(V,E), Knuth describes that by Kirchoff's circuit law, the minimum
number of counters necessary is |E|-(|V|-1). The emitted edges form a spanning
tree. libgcov emitted .gcda files leverages this optimization while clang
--coverage's doesn't.

Propagate counts by Kirchhoff's circuit law so that llvm-cov gcov can
correctly print line counts of gcc --coverage emitted files and enable
the future improvement of clang --coverage.
2020-09-08 18:45:11 -07:00
Brad Smith
99286ab2f0 [PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D86165
2020-09-08 21:21:14 -04:00
Mircea Trofin
1513446701 [NFC][ThinLTO] EmbedBitcodeSection doesn't need the Config
Instead, passing in the command line options, initialized to nullptr. In
an upcoming patch, we can then use the parameter to pass actual command
line options.

Differential Revision: https://reviews.llvm.org/D87336
2020-09-08 17:14:44 -07:00
Krzysztof Parzyszek
10332af38a Handle masked loads and stores in MemoryLocation/Dependence
Differential Revision: https://reviews.llvm.org/D87061
2020-09-08 19:08:44 -05:00
David Blaikie
f813a8ccef Remove unused variable(s) 2020-09-08 16:58:01 -07:00
Craig Topper
18bea3ae49 [SelectionDAGBuilder] Remove Unnecessary FastMathFlags temporary. Use SDNodeFlags instead. NFCI
This was a missed simplication in D87200
2020-09-08 15:50:12 -07:00
David Blaikie
b72d78d1d8 llvm-symbolizer: Add optional "start file" to match "start line"
Since a function might have portions of its code coming from multiple
different files, "start line" is ambiguous (it can't just be resolved
relative to the file/line specified). Add start file to disambiguate it.
2020-09-08 15:40:58 -07:00
Craig Topper
065f5d3388 [SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.:
This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.

This required adding a SDNodeFlags to SelectionDAG::getSetCC.

Now we manage to contant fold some stuff undefs during the
initial getNode that we don't do in later DAG combines.

Differential Revision: https://reviews.llvm.org/D87200
2020-09-08 15:27:21 -07:00
Krzysztof Parzyszek
77adca5920 [Hexagon] Handle widening of truncation's operand with legal result
Failing example: v8i8 = truncate v8i32. v8i8 is legal, but v8i32 was
widened to HVX. Make sure that v8i8 does not get altered (even if it's
changed to another legal type).
2020-09-08 16:07:39 -05:00
Nikita Popov
dca909c3db [ValueTracking] Compute known bits of min/max intrinsics
Implement known bits for the min/max intrinsics based on the
recently added KnownBits primitives.
2020-09-08 21:08:17 +02:00