133235 Commits

Author SHA1 Message Date
Sam Kolton
4f6d8a41f5 [AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand.
Summary:
sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...).

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D20968

llvm-svn: 272384
2016-06-10 09:57:59 +00:00
Simon Pilgrim
9a6ac0909a [X86][AVX512] Added VPSLLDQ/VPSRLDQ memory fold tests
Memory operand is new for AVX512 (SSE/AVX2 didn't support it).

Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations).

Regenerated VPALIGNR test now that the shuffle comments work

llvm-svn: 272383
2016-06-10 09:56:20 +00:00
Sean Silva
7c30828bb9 Fix stale name in comment.
llvm-svn: 272382
2016-06-10 08:48:49 +00:00
Roger Ferrer Ibanez
f99b65e91e test commit: remove trailing whitespaces in README.txt
llvm-svn: 272380
2016-06-10 08:19:58 +00:00
Xinliang David Li
cadcbba1d6 Bug fix remove another illegal char from prof symbol name
End-end test with no integrated assembly should be added 
at some point (not done now because some bots are not properly configured to
support -no-integrated-as)

llvm-svn: 272376
2016-06-10 06:32:26 +00:00
Dan Liew
0e9bdedc89 [LibFuzzer] Fix some unit test crashes on OSX.
This fixes the following unit tests:

FuzzerDictionary.ParseOneDictionaryEntry
FuzzerDictionary.ParseDictionaryFile

The issue appears to be mixing non-ASan-ified code (LibFuzzer) and
ASan-ified code (the unittest) as the tests would pass fine if
everything was built with ASan enabled.

I believe the issue is that different implementations of std::vector<>
are being used in LibFuzzer and outside LibFuzzer (in the unittests).
For Libcxx (I've not seen the issue manifest for libstdc++) we can disable
the ASanified std::vector<> by definining the ``_LIBCPP_HAS_NO_ASAN`` macro.
Doing this fixes the tests on OSX.

Differential Revision: http://reviews.llvm.org/D21049

llvm-svn: 272374
2016-06-10 05:33:07 +00:00
Craig Topper
8db9cc42a5 Add missing include for r272369
llvm-svn: 272373
2016-06-10 05:19:42 +00:00
Craig Topper
62a6e30eb0 [AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ.
llvm-svn: 272371
2016-06-10 05:12:40 +00:00
Zachary Turner
ff9c91b5ca Make PDBFile take a StreamInterface instead of a MemBuffer.
This is the next step towards being able to write PDBs.
MemoryBuffer is immutable, and StreamInterface is our replacement
which can be any combination of read-only, read-write, or write-only
depending on the particular implementation.

The one place where we were creating a PDBFile (in RawSession) is
updated to subclass ByteStream with a simple adapter that holds
a MemoryBuffer, and initializes the superclass with the buffer's
array, so that all the functionality of ByteStream works
transparently.

llvm-svn: 272370
2016-06-10 05:10:19 +00:00
Zachary Turner
8110e5a8a3 Add support for writing through StreamInterface.
This adds method and tests for writing to a PDB stream.  With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157

llvm-svn: 272369
2016-06-10 05:09:12 +00:00
Craig Topper
d0d202e5ad [AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. Previously we were printing the mask operands as the register names.
llvm-svn: 272367
2016-06-10 04:48:05 +00:00
Daniel Dunbar
4f12eba9f0 [lit] Only gather redirected files for command failures.
- The intended use of this was just in diagnostics, so we shouldn't pay the
   cost of reading these all the time.

 - This will avoid including the full output of each command in tests which
   fail, but the most important use case for this was to gather the output of
   the specific command which failed.

llvm-svn: 272365
2016-06-10 04:17:30 +00:00
Matt Arsenault
738cbf4e5d AMDGPU: Fix trailing whitespace
llvm-svn: 272364
2016-06-10 02:18:02 +00:00
Qin Zhao
9f159944c1 [esan|cfrag] Add the struct field offset array in StructInfo
Summary:
Adds the struct field offset array in struct StructInfo.

Updates test struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21192

llvm-svn: 272362
2016-06-10 02:10:06 +00:00
Quentin Colombet
3f36e7cbaf [LiveRangeEdit] Add a test case for r272314.
The test case is not great espicially because it is still cumbersome to
run the regalloc pass with run-pass. (We miss a bunch of initiliazier to
be properly implemented.)

Related to llvm.org/PR27983

llvm-svn: 272360
2016-06-10 01:57:48 +00:00
Richard Trieu
7b4df3326b Add null checks before using a pointer.
llvm-svn: 272359
2016-06-10 01:42:05 +00:00
Quentin Colombet
e4aede34c0 [llc] Do not create the pass config several times for run-pass.
Thanks to Matthias Braun for spotting this.

llvm-svn: 272358
2016-06-10 01:12:06 +00:00
Quentin Colombet
3866bb8809 [llc] Add support for several run-pass options.
Previously we could run only one machine pass with the run-pass option.
With that patch, we can now specify several passes with several run-pass
options (or just one option with a list of comma separated passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.

Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!

llvm-svn: 272356
2016-06-10 00:52:10 +00:00
Qin Zhao
b4c774b3be [esan|cfrag] Disable load/store instrumentation for cfrag
Summary:
Adds ClInstrumentFastpath option to control fastpath instrumentation.

Avoids the load/store instrumentation for the cache fragmentation tool.

Renames cache_frag_basic.ll to working_set_slow.ll for slowpath
instrumentation test.

Adds the __esan_init check in struct_field_count_basic.ll.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21079

llvm-svn: 272355
2016-06-10 00:48:53 +00:00
Matt Arsenault
7bddaed185 Update call site attribute documentation
convergent is also accepted.

llvm-svn: 272353
2016-06-10 00:36:57 +00:00
Tom Stellard
b49c569b73 docs: Add AMDGPU relocation information
Summary:
This documents the various relocation types that are supported by the
Radeon Open Compute (ROC) runtime (which is essentially the dynamic
linker for AMDGPU).

Only R_AMDGPU_32 is not currently supported by the ROC runtime, but
it will usually be resolved at link time by lld.

Patch by: Konstantin Zhuravlyov

Reviewers: kzhuravl, rafael

Subscribers: rafael, arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D20952

llvm-svn: 272352
2016-06-10 00:31:13 +00:00
Matt Arsenault
f4490b9ad5 AMDGPU: v_cndmask_b32 does not def vcc
Fixes verifier errors after SIShrinkInstructions.

llvm-svn: 272351
2016-06-10 00:18:41 +00:00
Tom Stellard
7240ff2203 AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.

Reviewers: scchan, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19994

llvm-svn: 272349
2016-06-10 00:01:04 +00:00
Chris Bieneman
5e3b2f93ed [CMake] Removing fallback code for CMake versions before 3.1
This code is dead code now. Out with the old, in with the new!

llvm-svn: 272347
2016-06-09 23:53:22 +00:00
Tom Stellard
8ca66729b9 AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMI
Reviewers: arsenm, axeldavy

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19823

llvm-svn: 272346
2016-06-09 23:48:02 +00:00
Matt Arsenault
b7b3917848 AMDGPU: Fix flat atomics
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.

llvm-svn: 272345
2016-06-09 23:42:54 +00:00
Matt Arsenault
bcec847408 AMDGPU: Fix i64 global cmpxchg
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.

There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.

llvm-svn: 272344
2016-06-09 23:42:48 +00:00
Matt Arsenault
ce7bee3c77 AMDGPU: Fix missing and broken check lines in atomic tests
llvm-svn: 272343
2016-06-09 23:42:44 +00:00
Vitaly Buka
f126725677 Make sure that not interesting allocas are not instrumented.
Summary:
We failed to unpoison uninteresting allocas on return as unpoisoning is part of
main instrumentation which skips such allocas.

Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of
dynamic allocas is disabled it will not will not be unpoisoned.

PR27453

Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21207

llvm-svn: 272341
2016-06-09 23:31:59 +00:00
Matt Arsenault
7c29915da6 CodeGen: Allow verifier to run after MachineBlockPlacement
No tests break with this enabled.

llvm-svn: 272340
2016-06-09 23:31:55 +00:00
Eric Christopher
99f6f9fa1d Add aliases for mfvrsave/mtvrsave.
Update a test as we're now going to emit it for easier reading of
generated assembly as well.

llvm-svn: 272339
2016-06-09 23:27:48 +00:00
Matt Arsenault
ceb1d538ec AMDGPU: Run verifer after insert waits pass
llvm-svn: 272338
2016-06-09 23:19:14 +00:00
Matt Arsenault
524364ac4e AMDGPU: Remove incorrect assertion
I'm still not sure under what circumstances the offset here is non-0,
but private memory is not limited to 27-bits.

llvm-svn: 272337
2016-06-09 23:19:08 +00:00
Matt Arsenault
dde12252bd AMDGPU: Properly initialize SIShrinkInstructions
llvm-svn: 272336
2016-06-09 23:18:47 +00:00
George Burgess IV
f70bbc0aad [CFLAA] Handle global/arg attrs more sanely.
Prior to this patch, we used argument/global stratified attributes in
order to note that a value could have come from either dereferencing a
global/arg, or from the assignment from a global/arg.

Now, AttrUnknown is placed on sets when we see a dereference, instead of
the global/arg attributes. This allows us to be more aggressive in the
future when we see global/arg attributes without AttrUnknown.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21110

llvm-svn: 272335
2016-06-09 23:15:04 +00:00
Vitaly Buka
f7bd39fe64 Unpoison stack memory in use-after-return + use-after-scope mode
Summary:
We still want to unpoison full stack even in use-after-return as it can be disabled at runtime.

PR27453

Reviewers: eugenis, kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21202

llvm-svn: 272334
2016-06-09 23:05:35 +00:00
Alina Sbirlea
02d7508090 Reapply 272328 and 272329 as a single patch.
[cpu-detection] [amdfam10] Return barcelona, and amdfam10 for all other
subtypes. Address Bug 28067.

Along with the refactoring of Host.cpp, getHostCPUName() was modified to
return more precise types for CPUs in amdfam10.
However, callers of getHostCPUName() do string matching on type, so this
cannot be modified.
Currently there is support in the x86 backend for barcelona.
For all other subtypes the assumed return value is amdfam10.

Fix: getHostCPUName() returns barcelona subtype and amdfam10 for all
others. This can be extended further when support for the other subtypes
is added.

Differential revision: http://reviews.llvm.org/D21193

llvm-svn: 272333
2016-06-09 23:04:15 +00:00
Alina Sbirlea
5e36a74d30 Revert 272328 and 272329 to recommit as a single patch.
llvm-svn: 272332
2016-06-09 23:04:05 +00:00
Alina Sbirlea
263e70a93c Keep barcelona subtype for amdfam10
llvm-svn: 272329
2016-06-09 22:47:36 +00:00
Alina Sbirlea
d5696c2256 [cpu-detection] Return amdfam10 for all subtypes. Address Bug 28067.
Summary: Remove architecture subtype from the string returned by getHostCPUName(). String matching done on type.

Reviewers: llvm-commits, echristo

Subscribers: mehdi_amini

Differential Revision: http://reviews.llvm.org/D21193

llvm-svn: 272328
2016-06-09 22:47:12 +00:00
Chris Bieneman
0158e53ab5 [CMake] Cleanup ExternalProject usage of CMake 3.x features
All the ExternalProject features in use here are supported by CMake 3.4.3, so we don't need these version checks anymore.

llvm-svn: 272327
2016-06-09 22:41:36 +00:00
Easwaran Raman
57cf853e91 Use ProfileSummaryInfo in inline cost analysis.
Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo.

Differential revision: http://reviews.llvm.org/D21045

llvm-svn: 272321
2016-06-09 22:23:21 +00:00
Simon Pilgrim
10fdc6dca9 [X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments
llvm-svn: 272319
2016-06-09 22:03:15 +00:00
Quentin Colombet
75321fdc07 [LiveRangeEdit] Fix a crash in eliminateDeadDef.
When we delete a live-range, we check if that live-range is the origin of others
to keep it around for rematerialization. For that we check that the instruction
we are about to remove is the same as the definition of the VNI of the original
live-range.
If this is the case, we just shrink the live-range to an empty one.

Now, when we try to delete one of the children of such live-range (product of
splitting), we do the same check.
However, now the original live-range is empty and there is no way we can
access the VNI to check its definition, and we crash.

When we cannot get the VNI for the original live-range, that means we are not in
the presence of the original definition. Thus, this check does not need to happen
in that case and the crash is sloved!

This bug was introduced in r266162 | wmi | 2016-04-12 20:08:27. It affects every
target that uses the greedy register allocator.
To happen, we need to delete both a the original instruction and its split
products, in that order. This is likely to happen when rematerialization comes
into play.

Trying to produce a more robust test case. Will follow in a coming commit.

This fixes llvm.org/PR27983.

rdar://problem/26651519 

llvm-svn: 272314
2016-06-09 21:34:31 +00:00
Vedant Kumar
e22d636530 [docs] Fix indentation for a tool option
llvm-svn: 272309
2016-06-09 21:09:54 +00:00
Simon Pilgrim
4515a471d4 [X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics
Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ 

llvm-svn: 272308
2016-06-09 21:09:03 +00:00
Simon Pilgrim
c8cd1f56db [X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR
llvm-svn: 272307
2016-06-09 20:53:12 +00:00
Duncan P. N. Exon Smith
969e6070d4 BitcodeReader: Use std:::piecewise_construct when upgrading type refs
r267296 used std::piecewise_construct without using
std::forward_as_tuple, and r267298 hacked it out (using an emplace_back
followed by a couple of reset() calls) because of a problem on a bot.
I'm finally circling back to call forward_as_tuple as I should have to
begin with (thanks to David Blaikie for pointing out the missing piece).

Note that this code uses emplace_back() instead of
push_back(make_pair()) because the move constructor for TrackingMDRef is
expensive (cheaper than a copy, but still expensive).

llvm-svn: 272306
2016-06-09 20:46:33 +00:00
Simon Pilgrim
01d87ac7ba [X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts
512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget

llvm-svn: 272300
2016-06-09 20:13:58 +00:00
Justin Lebar
05e13f8eda [NVPTX] Add intrinsics for shfl instructions.
Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers.  Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D21160

llvm-svn: 272298
2016-06-09 20:04:08 +00:00