Summary:
We will use this in the AMDGPU backend in a subsequent patch
in the stack to lookup target-specific per-intrinsic information.
The generic CodeGenIntrinsic machinery is used to ensure that,
even though we don't calculate actual enum values here, we do
get the intrinsics in the right order for the binary search
index.
Change-Id: If61cd5587963a4c5a1cc53df1e59c5e4dec1f9dc
Reviewers: arsenm, rampitec, b-sumner
Subscribers: wdng, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D44935
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328937 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: echristo, zturner, samsonov
Reviewed By: echristo
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D45134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328935 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Adds -import-cutoff=N which will stop importing during the thin link
after N imports. Default is -1 (no limit).
Reviewers: wmi
Subscribers: inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D45127
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328934 91177308-0d34-0410-b5e6-96231b3b80d8
If a loop has a loop exiting latch, it can be profitable
to rotate the loop if it leads to the simplification of
a phi node. Perform rotation in these cases even if loop
rotate itself didnt simplify the loop to get there.
Differential Revision: https://reviews.llvm.org/D44199
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328933 91177308-0d34-0410-b5e6-96231b3b80d8
We mistakenly believe we might be able to fold this as a RMW operation, but that doesn't end up happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328929 91177308-0d34-0410-b5e6-96231b3b80d8
This Promote flag was alwasys set to true except in the default case. But in the default case we don't need to set PVT and can just return false.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328926 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer D44363 for a list of all the required patches.
Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon
Reviewed By: dexonsmith
Subscribers: david2050, llvm-commits
Differential Revision: https://reviews.llvm.org/D44944
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328925 91177308-0d34-0410-b5e6-96231b3b80d8
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC,
so replace a pair of casts with the equivalent node. We don't have to account for
special cases (NaN, INF) because out-of-range casts are undefined.
Differential Revision: https://reviews.llvm.org/D44909
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328921 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput.
This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet.
I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs.
Reviewers: RKSimon, GGanesh, courbet
Reviewed By: RKSimon
Subscribers: gchatelet, gbedwell, andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D44972
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328914 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache files corruption).
1. Place temp file to the same place where the caching directory is (instead of creating it the directory pointed to by TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to non-atomic "direct" write to the cache file.
2. if rename failed do not write to the cache file directly (direct write to the file is non-atomic and could cause data race conditions).
3. if cache file doesn't exist (e.g., because 'rename' failed or because some other reasons), bypass using the cache altogether.
Differential Revision: https://reviews.llvm.org/D45076
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328904 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This exposes WebAssembly passes for use on the command line (as
arguments to -print-before and the like).
Reviewers: dschuff, sunfish
Subscribers: MatzeB, jfb, sbc100, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D45103
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328901 91177308-0d34-0410-b5e6-96231b3b80d8
Two memory instructions with a dependency only on the address register
between the two (the first one of them being post-incrememnt) can be
packetized together after the offset on the second was updated to the
incremement value. Make sure that the new offset is valid for the
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328897 91177308-0d34-0410-b5e6-96231b3b80d8
VSQRT instructions.
There were still a few AVX instructions with an incorrect number of opcodes.
These should be fixed now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328892 91177308-0d34-0410-b5e6-96231b3b80d8
This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan.
This change resolves bug 36314.
Patch by Sam Kerner!
Differential Revision: https://reviews.llvm.org/D44784
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328890 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Make NVPTX require structured CFG. Added a temporary flag to
"roll back" the behavior for easy deployment.
Combined with D45008, this fixes several internal Nvidia GPU test
failures that we suspect to be ptxas miscompiles (PR27738).
Reviewers: jlebar
Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D45070
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328885 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Tail duplication easily breaks the structure of CFG, e.g. duplicating on
a region entry. If the structure is intended to be preserved, then we
may want to configure tail duplication, or disable it for structured
CFG. From our benchmark results disabling it doesn't cause performance
regression.
Notice that this currently affects AMDGPU backend. In the next patch, I
also plan to turn on requiresStructuredCFG for NVPTX.
All unit tests still pass.
Reviewers: jlebar, arsenm
Subscribers: jholewinski, sanjoy, wdng, tpr, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45008
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328884 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previous revision caused a leak in the echo test that got caught by the ASAN bots because of missing free of the handlers array and was reverted in r328759. Resubmitting the patch with that correction.
Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors.
Test is modified from SimplifyCFG because it contains many diverse usages of these instructions.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits, vlad.tsyrklevich
Differential Revision: https://reviews.llvm.org/D45100
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328883 91177308-0d34-0410-b5e6-96231b3b80d8
This will show more detail when using `llvm-pdbutil explain` on an
offset in the DBI or PDB streams. Specifically, it will dig into
individual header fields and substreams to give a more precise
description of what the byte represents.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328878 91177308-0d34-0410-b5e6-96231b3b80d8
most vector logic instructions.
Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input
operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328867 91177308-0d34-0410-b5e6-96231b3b80d8
instructions.
In the Btver2 model, there are a few InstRW overrides that don't specify a
ReadAfterLd for the register input operand.
As a result, a few AVX variants of horizontal operations and most vector logic
operations with a folded memory operand don't have a ReadAdvance info associated
to their input register operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328865 91177308-0d34-0410-b5e6-96231b3b80d8
Verify that the ReadAfterLd is correctly applied to FMA and 4-ops variable blend
instructions.
As Craig pointed out in D44726, some Intel models still have to be fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328861 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a couple of tests to verify the change introduced by revision
328823 ([X86] Correct the placement of ReadAfterLd in BEXTR and BZHI).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328859 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The phase attempts to transform operations that extract a portion of a value
into an SDWA src operand in cases where that value is used only once. It
was not prepared for this use to be the preserved portion of a value for
dst:UNUSED_PRESERVE, resulting in a crash or assert.
This change either rejects the illegal SDWA attempt, or in the case where
dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded
extract instruction.
Reviewers: arsenm, #amdgpu
Reviewed By: arsenm, #amdgpu
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D44364
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328856 91177308-0d34-0410-b5e6-96231b3b80d8