Adam Nemet 049a31d53d Use FPContractModeKind universally
FPContractModeKind is the codegen option type, which is already ternary (off,
on, fast).  This change makes it the single type for contraction state
across the front-end:

* In FPOptions (i.e. in the Sema + in the expression nodes).
* In LangOpts::DefaultFPContractMode which is the option that initializes
FPOptions in the Sema.

Another way to look at this change is that, previously, fp-contractable on/off
were the only states handed to the front-end:
 * For "on", FMA folding was performed by the front-end
 * For "fast", we simply forwarded the flag to TargetOptions to handle it in
 LLVM

Now off/on/fast are all exposed because for fast we will generate
fast-math-flags during CodeGen.

This is a step toward moving fp-contraction=fast from an LLVM TargetOption to
a FastMathFlag in order to fix PR25721.
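
The three states described above can be sketched as a small standalone C++
model.  This is only an illustration under stated assumptions: ContractMode,
ContractPlan, planFor and resolveMode are hypothetical names invented here, not
Clang's declarations, and the "fast" row reflects the direction this message
says the work is heading (a flag generated during CodeGen), not code in this
commit.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for a ternary contraction mode; these names are
// hypothetical and are not copied from Clang's headers.
enum class ContractMode { Off, On, Fast };

// What each mode asks of the compiler in this sketch.
struct ContractPlan {
  bool frontEndFuses;      // front-end folds a + b * c itself
  bool emitsContractFlag;  // CodeGen tags FP ops so LLVM may fuse them later
};

// Map a mode to its plan: "on" fuses in the front-end, "fast" defers to
// LLVM through a flag, "off" does neither.
ContractPlan planFor(ContractMode mode) {
  switch (mode) {
  case ContractMode::On:
    return {true, false};
  case ContractMode::Fast:
    return {false, true};
  case ContractMode::Off:
    break;
  }
  return {false, false};
}

// Resolve the effective mode: an explicit option value wins, otherwise the
// target default applies (the message notes that CUDA defaults to fast).
ContractMode resolveMode(const std::string &optionValue,
                         ContractMode targetDefault) {
  if (optionValue == "off") return ContractMode::Off;
  if (optionValue == "on") return ContractMode::On;
  if (optionValue == "fast") return ContractMode::Fast;
  return targetDefault;  // empty or unrecognized: fall back to the default
}
```

With a single ternary value, an explicit "on" replaces a target default of
"fast" rather than coexisting with it, which is the situation the recommit
note below describes for the CUDA test.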

---
This is a recommit of r299027 with an adjustment to the test
CodeGenCUDA/fp-contract.cu.  The test assumed that FE-based folding of FMA
would not happen even though -ffp-contract=on is passed.

This is obviously wrong since the user is asking for folding explicitly with
the option.  CUDA differs in that -ffp-contract=fast is on by default.

The test used to "work" because contract=fast and contract=on were maintained
separately and we didn't fold in the FE because contract=fast was on due to
the target-default.  This patch consolidates the contract=on/fast/off state
into a ternary state, hence the change in behavior.
---

Differential Revision: https://reviews.llvm.org/D31167

llvm-svn: 299033
2017-03-29 21:54:24 +00:00

// REQUIRES: x86-registered-target
// REQUIRES: nvptx-registered-target
// By default we should fuse multiply/add into an fma instruction.
// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
// RUN: -disable-llvm-passes -o - %s | FileCheck -check-prefix ENABLED %s
// Explicit -ffp-contract=fast
// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
// RUN: -ffp-contract=fast -disable-llvm-passes -o - %s \
// RUN: | FileCheck -check-prefix ENABLED %s
// Explicit -ffp-contract=on -- fusing by front-end.
// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
// RUN: -ffp-contract=on -disable-llvm-passes -o - %s \
// RUN: | FileCheck -check-prefix ENABLED %s
// Explicit -ffp-contract=off should disable instruction fusing.
// RUN: %clang_cc1 -fcuda-is-device -triple nvptx-nvidia-cuda -S \
// RUN: -ffp-contract=off -disable-llvm-passes -o - %s \
// RUN: | FileCheck -check-prefix DISABLED %s
#include "Inputs/cuda.h"
__host__ __device__ float func(float a, float b, float c) { return a + b * c; }
// ENABLED: fma.rn.f32
// ENABLED-NEXT: st.param.f32
// DISABLED: mul.rn.f32
// DISABLED-NEXT: add.rn.f32
// DISABLED-NEXT: st.param.f32