AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.
This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.
Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.
As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted
Differential Revision: http://reviews.llvm.org/D21660
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274013 91177308-0d34-0410-b5e6-96231b3b80d8
Some headers in IR depend on tablegen generated code. Modules builds triggered
generation of the LLVM_IR module (including headers dependant on intrinsic_gen),
imposing a unnecessary build dependency.
Reviewed by Richard Smith.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274006 91177308-0d34-0410-b5e6-96231b3b80d8
This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments.
Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication.
Differential Revision: http://reviews.llvm.org/D21148
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273999 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273996 91177308-0d34-0410-b5e6-96231b3b80d8
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273992 91177308-0d34-0410-b5e6-96231b3b80d8
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional. Thus we would
access a memory location that the original loop did not access.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273991 91177308-0d34-0410-b5e6-96231b3b80d8
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273990 91177308-0d34-0410-b5e6-96231b3b80d8
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
Changes since the initial commit:
- Avoid accidentally closing stdout twice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273985 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r273971. test/profile/instrprof-visibility.cpp is
failing because of an uncaught error in SafelyCloseFileDescriptor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273978 91177308-0d34-0410-b5e6-96231b3b80d8
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273977 91177308-0d34-0410-b5e6-96231b3b80d8
tests will want different IR.
Wanted this when writing tests for the proposed CG update stuff, and
this is an easily separable piece.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273973 91177308-0d34-0410-b5e6-96231b3b80d8
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273971 91177308-0d34-0410-b5e6-96231b3b80d8
Expose getBPI interface from BFI impl and use
it in graph viewer. This eliminates the dependency
on old PM interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273967 91177308-0d34-0410-b5e6-96231b3b80d8
the new pass manager.
This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.
Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273961 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently, MSVC complains if there's an implicit conversion from
`unsigned` to `unsigned long long`, if the `unsigned` is the result of
a bit shift.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273955 91177308-0d34-0410-b5e6-96231b3b80d8
This change reverts a "false" test that was placed to avoid regressions while the atomics pass was completed for the Sparc back-ends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273949 91177308-0d34-0410-b5e6-96231b3b80d8
allow a good error message to be produced.
I added the one test case that the object file tools could produce an error
message. The other two errors can’t be triggered if the input file is passed
through sys::fs::identify_magic(). But the malformedError("bad magic number")
does get triggered by the logic in llvm-dsymutil when dealing with a normal
Mach-O file. The other "File too small ..." error would take a logic error
currently to produce and is not tested for.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273946 91177308-0d34-0410-b5e6-96231b3b80d8
binutils ar uses -plugin to specify the LTO plugin, but LLVM doesn't
need this as it doesn't use a plugin for LTO. Accepting (and ignoring)
the option allows interoperability with existing build systems and
make downstream consumers life much easier.
No objections from Rafael on this change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273938 91177308-0d34-0410-b5e6-96231b3b80d8