Add a note for -ffast-math optimization of vector norm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@153031 91177308-0d34-0410-b5e6-96231b3b80d8
Benjamin Kramer 2012-03-19 00:43:34 +00:00
parent a87a75fd80
commit 8118c94a55

@@ -922,3 +922,22 @@ _test2: ## @test2
The insertps's of $0 are pointless complex copies.
//===---------------------------------------------------------------------===//
[UNSAFE FP]
void foo(double, double, double);
void norm(double x, double y, double z) {
double scale = __builtin_sqrt(x*x + y*y + z*z);
foo(x/scale, y/scale, z/scale);
}
We currently generate an sqrtsd and 3 divsd instructions. This is bad: fp div is
slow and not pipelined. In -ffast-math mode we could compute "1.0/scale" first
(a single divsd) and emit 3 mulsd in place of the 3 divsd. This can be done as a
target-independent transform.
If we're dealing with floats instead of doubles we could even replace the sqrtss
and inversion with an rsqrtss instruction, which computes 1/sqrt faster at the
cost of reduced accuracy.
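A rough sketch of the float variant using the SSE intrinsic for rsqrtss, assuming an x86 target with <xmmintrin.h> available. The names rsqrt_approx and normf_rsqrt and the out[3] parameter are hypothetical and only serve the example. Intel documents the relative error of rsqrtss as at most 1.5 * 2^-12, so the results are good to roughly 11 bits unless a refinement step is added.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86 only */

/* Hypothetical helper: approximate 1/sqrt(v) with a single rsqrtss. */
static inline float rsqrt_approx(float v) {
  return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(v)));
}

/* Hypothetical helper: normalize (x, y, z) with one rsqrtss and three
   multiplies, with no sqrtss and no divss. */
void normf_rsqrt(float x, float y, float z, float out[3]) {
  float inv = rsqrt_approx(x*x + y*y + z*z);
  out[0] = x * inv;
  out[1] = y * inv;
  out[2] = z * inv;
}
```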
//===---------------------------------------------------------------------===//