Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks.
It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function.
I removed the support for extracting values from trees.
We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184647 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations.
The infrastructure has three potential users:
1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).
2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.
3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.
This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code:
void SAXPY(int *x, int *y, int a, int i) {
x[i] = a * x[i] + y[i];
x[i+1] = a * x[i+1] + y[i+1];
x[i+2] = a * x[i+2] + y[i+2];
x[i+3] = a * x[i+3] + y[i+3];
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179117 91177308-0d34-0410-b5e6-96231b3b80d8
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149468 91177308-0d34-0410-b5e6-96231b3b80d8