Commit Graph

4589 Commits

Author SHA1 Message Date
Pavel Labath
7400d52f48 Expose template parameters of endian specific types as class members
Summary:
This allows generic code to query these parameters, and is a common
practice in a lot of other template classes.

Reviewers: zturner, Bigcheese

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58969

llvm-svn: 355504
2019-03-06 14:09:02 +00:00
Igor Kudrin
fcf079ea32 [CommandLine] Allow grouping options which can have values.
This patch allows all forms of values for options to be used at the end
of a group. With the fix, it is possible to follow the way GNU binutils
tools handle grouping options better. For example, the -j option can be
used with objdump in any of the following ways:

$ objdump -d -j .text a.o
$ objdump -d -j.text a.o
$ objdump -dj .text a.o
$ objdump -dj.text a.o

Differential Revision: https://reviews.llvm.org/D58711

llvm-svn: 355185
2019-03-01 09:22:42 +00:00
Eli Friedman
35f2f3bd52 [AArch64] [Windows] Fix llvm-readobj -unwind output with many epilogs.
The number of epilog scopes may not fit into a uint8_t.

Fixes https://bugs.llvm.org/show_bug.cgi?id=40855

Differential Revision: https://reviews.llvm.org/D58693

llvm-svn: 355135
2019-02-28 20:33:22 +00:00
Bjorn Pettersson
a490753ab0 Add support for computing "zext of value" in KnownBits. NFCI
Summary:
The description of KnownBits::zext() and
KnownBits::zextOrTrunc() has confusingly been telling
that the operation is equivalent to zero extending the
value we're tracking. That has not been true, instead
the user has been forced to explicitly set the extended
bits as known zero afterwards.

This patch adds a second argument to KnownBits::zext()
and KnownBits::zextOrTrunc() to control if the extended
bits should be considered as known zero or as unknown.

Reviewers: craig.topper, RKSimon

Reviewed By: RKSimon

Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58650

llvm-svn: 355099
2019-02-28 15:45:29 +00:00
Fangrui Song
8a35c45412 [Dominators] Avoid potentially quadratic std::is_permutation
Summary:
If the two sequences are not equal, std::is_permutation may be O(N^2)
and indeed the case in libstdc++ and libc++. Use SmallPtrSet to prevent
pessimizing cases. On my machine, SmallPtrSet starts to outperform
std::is_permutation when there are 16 elements.

Reviewers: kuhar

Reviewed By: kuhar

Subscribers: kristina, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58373

llvm-svn: 355070
2019-02-28 05:16:01 +00:00
Alexandre Ganea
a891789c78 [Memory] Add basic support for large/huge memory pages
This patch introduces Memory::MF_HUGE_HINT which indicates that allocateMappedMemory() shall return a pointer to a large memory page.
However the flag is a hint because we're not guaranteed in any way that we will get back a large memory page. There are several restrictions:

- Large/huge memory pages aren't enabled by default on modern OSes (Windows 10 and Linux at least), and should be manually enabled/reserved.
- Once enabled, it should be kept in mind that large pages are physical only, they can't be swapped.
- Memory fragmentation can affect the availability of large pages, especially after running the OS for a long time and/or running along many other applications.

Memory::allocateMappedMemory() will fallback to 4KB pages if it can't allocate 2MB large pages (if Memory::MF_HUGE_HINT is provided)

Currently, Memory::MF_HUGE_HINT only works on Windows. The hint will be ignored on Linux, 4KB pages will always be returned.

Differential Revision: https://reviews.llvm.org/D58718

llvm-svn: 355065
2019-02-28 02:47:34 +00:00
Craig Topper
a1bd7416d0 [X86] Use X86_CPU_SUBTYPE_COMPAT for 'cascadelake' cpu.
This CPU is supported by at least libgcc trunk now so we should make it available to __builtin_cpu_is.

llvm-svn: 354913
2019-02-26 19:17:12 +00:00
Ganesh Gopalasubramanian
9a9159179f [X86] AMD znver2 enablement
This patch enables the following

1) AMD family 17h "znver2" tune flag (-march, -mcpu).
2) ISAs that are enabled for "znver2" architecture.
3) For the time being, it uses the znver1 scheduler model.
4) Tests are updated.
5) Scheduler descriptions are yet to be put in place.

Reviewers: craig.topper

Differential Revision: https://reviews.llvm.org/D58343

llvm-svn: 354897
2019-02-26 16:55:10 +00:00
Luke Cheeseman
b773e62491 [ARM] Add Cortex-M35P
- Add LLVM backend support for Cortex-M35P
- Documentation can be found at
  https://developer.arm.com/products/processors/cortex-m/cortex-m35p

Differentail Revision: https://reviews.llvm.org/D57763

llvm-svn: 354868
2019-02-26 12:02:12 +00:00
Roman Lebedev
7631c47235 Revert "[Support] Make raw_string_ostream unbuffered"
Shame on me, did not run all the tests, bots are angry.

This reverts commit r354819.

llvm-svn: 354822
2019-02-25 21:11:19 +00:00
Roman Lebedev
839691fdd7 [Support] Make raw_string_ostream unbuffered
Summary:
In D58580 i have noted that `llvm::to_string()` is a memory hog.
It uses `raw_string_ostream`, and since it was buffered,
every `raw_string_ostream` had a cost of `BUFSIZ` bytes
(which is `8192` at least here). So every `llvm::to_string()`
call, even to just print an `int`, costed `8192` bytes.

In D58580, getting rid of that buffering //had// significant
performance and memory consumption improvements for `llvm-xray convert`.

Similarly, in D58580 @rnk pointed out that the `raw_svector_ostream`
is already unbuffered, and `write_unsigned_impl` and friends
do internal buffering. So it should be ok performance-wise to just
make the `raw_string_ostream` itself unbuffered.

Here, i don't have any perf measurements.
Another letdown is that i'm leaving a loose-end - not deleting the
`flush()` method. I  don't expect that cleanup to be anything more
than just fixing every new compiler error, but i'm presently unable
to do that. Will look into that later.

Reviewers: rnk, zturner

Reviewed By: rnk

Subscribers: kristina, jdoerfert, llvm-commits, rnk

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58643

llvm-svn: 354819
2019-02-25 20:51:49 +00:00
Luke Cheeseman
6af96f2e37 [AArch64] Add support for Cortex-A76 and Cortex-A76AE
- Add LLVM backend support for Cortex-A76 and Cortex-A76AE
- Documentation can be found at
  https://developer.arm.com/products/processors/cortex-a/cortex-a76

llvm-svn: 354788
2019-02-25 15:08:27 +00:00
Duncan P. N. Exon Smith
e4b7b09efd VFS: Avoid some unnecessary std::string copies
Thread Twine a little deeper through the VFS to avoid unnecessarily
constructing the same std::string twice in a parameter sequence:

    Twine -> std::string -> StringRef -> std::string

Changing a few parameters from StringRef to Twine avoids the early call
to `Twine::str()`.

llvm-svn: 354739
2019-02-23 23:48:47 +00:00
Matt Arsenault
81d5a377eb Fix missing C++ mode comments
llvm-svn: 354590
2019-02-21 15:48:10 +00:00
Puyan Lotfi
5fc21d122b Fixing NDEBUG typo in include/llvm/Support/raw_ostream.h
NDEBUG is misspelled as NDBEBUG in include/llvm/Support/raw_ostream.h.

llvm-svn: 354495
2019-02-20 18:30:44 +00:00
Fangrui Song
6e6640e7ec [Dominators] Simplify and optimize path compression used in link-eval forest.
Summary:
* NodeToInfo[*] have been allocated so the addresses are stable. We can store them instead of NodePtr to save NumToNode lookups.
* Nodes are traversed twice. Using `Visited` to check the traversal number is expensive and obscure. Just split the two traversals into two loops explicitly.
* The check `VInInfo.DFSNum < LastLinked` is redundant as it is implied by `VInInfo->Parent < LastLinked`
* VLabelInfo PLabelInfo are used to save a NodeToInfo lookup in the second traversal.

Also add some comments explaining eval().

This shows a ~4.5% improvement (9.8444s -> 9.3996s) on

    perf stat -r 10 taskset -c 0 opt -passes=$(printf '%.0srequire<domtree>,invalidate<domtree>,' {1..1000})'require<domtree>' -disable-output sqlite-autoconf-3270100/sqlite3.bc

Reviewers: kuhar, sanjoy, asbirlea

Reviewed By: kuhar

Subscribers: brzycki, NutshellySima, kristina, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58327

llvm-svn: 354433
2019-02-20 04:39:42 +00:00
Fangrui Song
6f9ce942e6 [Dominators] Delete UpdateLevelsAfterInsertion in edge insertion of depth-based search for release builds
Summary:
After insertion of (From, To), v is affected iff
depth(NCD)+1 < depth(v) && path P from To to v exists where every w on P s.t. depth(v) <= depth(w)

All affected vertices change their idom to NCD.

If a vertex u has changed its depth, it must be a descendant of an
affected vertex v. Its depth must have been updated by UpdateLevel()
called by setIDom() of the first affected ancestor.

So UpdateLevelsAfterInsertion and its bookkeeping variable VisitedNotAffectedQueue are redundant.
Run them only in debug builds as a sanity check.

Reviewers: kuhar

Reviewed By: kuhar

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58369

llvm-svn: 354429
2019-02-20 02:35:24 +00:00
Daniel Sanders
6542491095 Annotate timeline in Instruments with passes and other timed regions.
Summary:
Instruments is a useful tool for finding performance issues in LLVM but it can
be difficult to identify regions of interest on the timeline that we can use
to filter the profiler or allocations instrument. Xcode 10 and the latest
macOS/iOS/etc. added support for the os_signpost() API which allows us to
annotate the timeline with information that's meaningful to LLVM.

This patch causes timer start and end events to emit signposts. When used with
-time-passes, this causes the passes to be annotated on the Instruments timeline.
In addition to visually showing the duration of passes on the timeline, it also
allows us to filter the profile and allocations instrument down to an individual
pass allowing us to find the issues within that pass without being drowned out
by the noise from other parts of the compiler.

Using this in conjunction with the Time Profiler (in high frequency mode) and
the Allocations instrument is how I found the SparseBitVector that should have
been a BitVector and the DenseMap that could be replaced by a sorted vector a
couple months ago. I added NamedRegionTimers to TableGen and used the resulting
annotations to identify the slow portions of the Register Info Emitter. Some of
these were placed according to educated guesses while others were placed
according to hot functions from a previous profile. From there I filtered the
profile to a slow portion and the aforementioned issues stood out in the
profile.

To use this feature enable LLVM_SUPPORT_XCODE_SIGNPOSTS in CMake and run the
compiler under Instruments with -time-passes like so:
  instruments -t 'Time Profiler' bin/llc -time-passes -o - input.ll'
Then open the resulting trace in Instruments.

There was a talk at WWDC 2018 that explained the feature which can be found at
https://developer.apple.com/videos/play/wwdc2018/405/ if you'd like to know
more about it.

Reviewers: bogner

Reviewed By: bogner

Subscribers: jdoerfert, mgorny, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D52954

llvm-svn: 354365
2019-02-19 18:18:31 +00:00
Fangrui Song
bd67e9adcd [Dominators] Fix and optimize edge insertion of depth-based search
Summary:
After (x,y) is inserted, depth-based search finds all affected v that satisfies:

depth(nca(x,y))+1 < depth(v) && there exists a path P from y to v where every w on P satisfies depth(v) <= depth(w)

This reduces to a widest path problem (maximizing the depth of the
minimum vertex in the path) which can be solved by a modified version of
Dijkstra with a bucket queue (named depth-based search in the paper).

The algorithm visits vertices in decreasing order of bucket number.
However, the current code misused priority_queue to extract them in
increasing order. I cannot think of a failing scenario but it surely may
process vertices more than once due to the local usage of Processed.

This patch fixes this bug and simplifies/optimizes the code a bit. Also
add more comments.

Reviewers: kuhar

Reviewed By: kuhar

Subscribers: kristina, jdoerfert, llvm-commits, NutshellySima, brzycki

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58349

llvm-svn: 354306
2019-02-19 05:16:52 +00:00
Saleem Abdulrasool
38ec336765 Support: use internal call_once on PPC64le
Always use the internal `call_once` for PPC64le.  This is needed to
support the Swift toolchain on PPC64le.

Patch by Sarvesh Tamba!

llvm-svn: 354045
2019-02-14 18:36:52 +00:00
Sam McCall
c2a2194aef Reapply [VFS] Allow multiple RealFileSystem instances with independent CWDs.
This reverts commit r351091.
The original mac breakages are addressed by ensuring the root directory
we're working from is fully symlink-resolved before starting.

Differential Revision: https://reviews.llvm.org/D58169

llvm-svn: 354026
2019-02-14 12:57:01 +00:00
Matt Arsenault
84e44687ee GlobalISel: Add G_FCANONICALIZE instruction
llvm-svn: 353719
2019-02-11 17:05:20 +00:00
Eugene Leviant
69a4eb7877 Attempt to fix buildbot after r353679 #2
llvm-svn: 353683
2019-02-11 10:17:17 +00:00
Eugene Leviant
7c75340e63 Attempt to fix buildbot after r353679
llvm-svn: 353681
2019-02-11 10:12:19 +00:00
Eugene Leviant
cd9bb80c89 Small refactoring of FileError. NFC.
Differential revision: https://reviews.llvm.org/D57945

llvm-svn: 353679
2019-02-11 09:49:37 +00:00
Mikhail R. Gadelha
7db7f7f32f This reverts commit 1440a848a635849b97f7a5cfa0ecc40d37451f5b.
and commit a1853e834c65751f92521f7481b15cf0365e796b.

They broke arm and aarch64

llvm-svn: 353590
2019-02-09 00:46:12 +00:00
Jessica Paquette
cb5da3f61a Recommit "[GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR""
After r353586, we won't fail on the AMDGPU floor pattern that was killing the
importer before.

llvm-svn: 353589
2019-02-09 00:37:31 +00:00
Craig Topper
ea7e6b3857 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
Adrian Prantl
10b7106901 Move SMTSolver dump() methods out-of-line.
This broke modularized non-local-submodule-visibility builds because
the function bodies pulled in extra dependencies.

llvm-svn: 353465
2019-02-07 21:03:18 +00:00
Matt Arsenault
c48f0dc588 GlobalISel: Try to make legalize rules more useful for vectors
Mostly keep the existing functions on scalars, but add versions which
also operate based on the vector element size.

llvm-svn: 353430
2019-02-07 17:25:51 +00:00
Fangrui Song
46cc0ce60a Fix misspelled filenames in file headers
llvm-svn: 353408
2019-02-07 14:38:25 +00:00
Mikhail R. Gadelha
dd981b7599 Move the SMT API to LLVM
Moved everything SMT-related to LLVM and updated the cmake scripts.

Differential Revision: https://reviews.llvm.org/D54978

llvm-svn: 353373
2019-02-07 03:19:45 +00:00
Petr Hosek
ec05af2652 [CMake] Unify scripts for generating VCS headers
Previously, there were two different scripts for generating VCS headers:
one used by LLVM and one used by Clang and lldb. They were both similar,
but different. They were both broken in their own ways, for example the
one used by Clang didn't properly handle monorepo resulting in an
incorrect version information reported by Clang.

This change unifies two the scripts by introducing a new script that's
used from both LLVM, Clang and lldb, ensures that the new script
supports both monorepo and standalone SVN and Git setups, and removes
the old scripts.

Differential Revision: https://reviews.llvm.org/D57063

llvm-svn: 353268
2019-02-06 03:51:00 +00:00
Thomas Preud'homme
fd51ca5973 Recommit: Add support for prefix-only CLI options
Summary:
Add support for options that always prefix their value, giving an error
if the value is in the next argument or if the option is given a value
assignment (ie. opt=val). This is the desired behavior for the -D option
of FileCheck for instance.

Copyright:
- Linaro (changes in version 2 of revision D55940)
- GraphCore (changes in later versions and introduced when creating
  D56549)

Reviewers: jdenny

Subscribers: llvm-commits, probinson, kristina, hiraditya,
JonChesterfield

Differential Revision: https://reviews.llvm.org/D56549

llvm-svn: 353172
2019-02-05 14:17:16 +00:00
Jessica Paquette
b78928bcbd Revert "[GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR"
This reverts commit b05ecba6d687fcb3078509220c67458bf1d77a2e.

Apparently adding floor breaks AMDGPU somehow, so I have to back this out
while I look into it.

llvm-svn: 353065
2019-02-04 17:32:47 +00:00
Jessica Paquette
b34b9ca867 [GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR
This introduces a generic opcode for floating point floor, working towards
selecting @llvm.floor.

Differential Revision: https://reviews.llvm.org/D57484

llvm-svn: 353057
2019-02-04 17:10:55 +00:00
Petr Hosek
be96a2d9e7 Revert "[CMake] Unify scripts for generating VCS headers"
This reverts commits r352729 and r352731: this broke Sanitizer Windows bots

llvm-svn: 352733
2019-01-31 07:12:43 +00:00
Petr Hosek
874d1380a0 [CMake] Unify scripts for generating VCS headers
Previously, there were two different scripts for generating VCS headers:
one used by LLVM and one used by Clang. They were both similar, but
different. They were both broken in their own ways, for example the one
used by Clang didn't properly handle monorepo resulting in an incorrect
version information reported by Clang.

This change unifies two the scripts by introducing a new script that's
used from both LLVM and Clang, ensures that the new script supports both
monorepo and standalone SVN and Git setups, and removes the old scripts.

Differential Revision: https://reviews.llvm.org/D57063

llvm-svn: 352729
2019-01-31 06:21:01 +00:00
Jessica Paquette
2b66bd8551 [GlobalISel] Introduce a G_FSQRT generic instruction
This introduces a generic instruction for computing the floating point
square root of a value.

Right now, we can't select @llvm.sqrt, so this is working towards fixing that.

llvm-svn: 352668
2019-01-30 20:49:50 +00:00
Sam Clegg
b75dc5a9f1 Add enum values to CodeGenOpt::Level
The absolute values of this enum are important at least in that
they get printed by SelectionDAGISel. e.g:
  `Before: -O2 ; After: -O0`

Differential Revision: https://reviews.llvm.org/D57430

llvm-svn: 352587
2019-01-30 02:08:34 +00:00
Jessica Paquette
e2b76b0235 [GlobalISel] Add G_FSIN and G_FCOS generic instructions
This introduces generic instrutions for floating point sin and cos, G_FCOS and
G_FSIN. It updates the tests, etc.

https://reviews.llvm.org/D57197
1/3

llvm-svn: 352400
2019-01-28 18:34:16 +00:00
Thomas Preud'homme
ebb7585889 Revert "Add support for prefix-only CLI options"
This reverts commit r351038.

llvm-svn: 352310
2019-01-27 09:02:46 +00:00
Matt Arsenault
5ad5b82541 GlobalISel: Fix address space limit in LLT
The IR enforced limit for the address space is 24-bits, but LLT was
only using 23-bits. Additionally, the argument to the constructor was
truncating to 16-bits.

A similar problem still exists for the number of vector elements. The
IR enforces no limit, so if you try to use a vector with > 65535
elements the IRTranslator asserts in the LLT constructor.

llvm-svn: 352264
2019-01-26 01:42:13 +00:00
Sam McCall
7010cdc6c4 [JSON] Work around excess-precision issue when comparing T_Integer numbers.
Reviewers: bkramer

Subscribers: kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D57237

llvm-svn: 352204
2019-01-25 15:05:33 +00:00
Matt Arsenault
50c7d5240c GlobalISel: Add helper to LLT to get a scalar or vector
llvm-svn: 352136
2019-01-25 00:10:49 +00:00
Davide Italiano
4a405d2a77 [Chrono] Remove ATTRIBUTE_ALWAYS inline from Chrono.h.
I discussed this with Pavel, who told me there was no real
thought behind this, and had no objection to remove the
attributes.

llvm-svn: 351893
2019-01-22 22:49:19 +00:00
Matt Arsenault
309650e2b2 GlobalISel: Make buildConstant handle vectors
Produce a splat build_vector similar to how
SelectionDAG::getConstant does.

llvm-svn: 351880
2019-01-22 21:31:02 +00:00
Serge Guelton
2a0c5337b6 Slight fix for r351820
llvm-svn: 351821
2019-01-22 13:57:29 +00:00
Serge Guelton
1e000b2c69 Fix llvm::is_trivially_copyable portability issues
llvm::is_trivially_copyable portability is verified at compile time using
std::is_trivially_copyable as the reference implementation.

Unfortunately, the latter is not available on all platforms, so introduce
a proper configure check to detect if it is available on the target platform.

In a similar manner, std::is_copy_assignable is not fully supported for gcc4.9.
Provide a portable (?) implementation instead.

Differential Revision: https://reviews.llvm.org/D57018

llvm-svn: 351820
2019-01-22 13:48:55 +00:00
Vitaly Buka
316c1b9dc6 Revert "Remove static_assert(value == std::is_trivially_copyable<T>::value)"
Upgraded the bot as workaround.

This reverts commit r351784.

llvm-svn: 351786
2019-01-22 07:22:45 +00:00