Bug 585708 - Part 1: Rework SSE.h so that it supports and encourages putting intrinsics in a file separate from the main code. r,a2.0=dbaron

--HG--
extra : rebase_source : 6bc2169ba4e451e1d8a36540614024fe40902e62
Justin Lebar 2010-08-11 16:49:42 -07:00
parent ede16504ba
commit 32d92cb466
2 changed files with 67 additions and 303 deletions
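
For illustration, the pattern this commit encourages can be sketched in a single translation unit as follows. This is a minimal sketch, not code from the commit: the stand-in definitions of MOZILLA_MAY_SUPPORT_SSE2 and mozilla::supports_sse2, the gFakeCpuHasSSE2 test hook, and the sum_ints functions are all hypothetical. The body of mozilla::SSE2::sum_ints is plain scalar code standing in for intrinsics that would normally live in a separate file built with -msse2 under gcc.

```cpp
#include <cassert>
#include <cstddef>

// --- Stand-ins for what SSE.h provides (hypothetical, for this sketch) ---
#define MOZILLA_MAY_SUPPORT_SSE2 1
namespace mozilla {
// Test hook, NOT part of the real header: simulates CPUs with/without SSE2.
bool gFakeCpuHasSSE2 = true;
inline bool supports_sse2() { return gFakeCpuHasSSE2; }
}

// --- Would live in a separate file compiled with -msse2 under gcc ---
#ifdef MOZILLA_MAY_SUPPORT_SSE2
namespace mozilla {
namespace SSE2 {
// Scalar stand-in; a real version would use <emmintrin.h> intrinsics.
int sum_ints(const int* data, size_t len) {
  int total = 0;
  for (size_t i = 0; i < len; ++i) total += data[i];
  return total;
}
}
}
#endif

// --- Main code: contains no intrinsics, only the runtime dispatch ---
enum SumPath { PATH_NONE, PATH_SSE2, PATH_GENERIC };
SumPath gLastPath = PATH_NONE;  // records which branch ran (sketch only)

int sum_ints_unvectorized(const int* data, size_t len) {
  int total = 0;
  for (size_t i = 0; i < len; ++i) total += data[i];
  return total;
}

int sum_ints(const int* data, size_t len) {
  assert(data || len == 0);
#ifdef MOZILLA_MAY_SUPPORT_SSE2
  // Compile-time guard: the SSE2 symbol exists only when it can be built.
  // Runtime guard: call it only when the CPU reports support.
  if (mozilla::supports_sse2()) {
    gLastPath = PATH_SSE2;
    return mozilla::SSE2::sum_ints(data, len);
  }
#endif
  gLastPath = PATH_GENERIC;
  return sum_ints_unvectorized(data, len);
}
```

The two guards do different jobs: the #ifdef keeps the build from referencing a function that was never compiled, and the supports_sse2() call keeps the program from executing instructions the CPU lacks.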

@@ -20,6 +20,7 @@
*
* Contributor(s):
* L. David Baron <dbaron@dbaron.org>, Mozilla Corporation (original author)
* Justin Lebar <justin.lebar@gmail.com>, Mozilla Corporation
*
* Alternatively, the contents of this file may be used under the terms of
* either the GNU General Public License Version 2 or later (the "GPL"), or
@@ -47,11 +48,12 @@
* The public interface of this header consists of a set of macros and
* functions for Intel CPU features.
*
* CODE USING ASSEMBLY
* ===================
* DETECTING ISA EXTENSIONS
* ========================
*
* This header provides the following functions for determining whether the
* current CPU supports a particular instruction set extension:
*
* For each feature handled here, this header defines a single function
* that can be used to test whether to use code written in *assembly*:
* mozilla::supports_mmx
* mozilla::supports_sse
* mozilla::supports_sse2
@@ -60,145 +62,66 @@
* mozilla::supports_sse4a
* mozilla::supports_sse4_1
* mozilla::supports_sse4_2
* Note that the dynamic detection depends on cpuid intrinsics only
* available in gcc 4.3 or later and MSVC 8.0 (Visual C++ 2005) or
* later, so the dynamic detection returns false in older compilers.
* (This could be fixed by replacing the code with inline assembly.)
*
* CODE USING INTRINSICS
* =====================
* If you're writing code using inline assembly, you should guard it with a
* call to one of these functions. For instance:
*
* The remainder of the functions and macros that are the API from this
* header relate to code written using CPU intrinsics.
* if (mozilla::supports_sse2()) {
* asm(" ... ");
* }
* else {
* ...
* }
*
* In each macro-function pair, the function may not be available if the
* macro is undefined. They should be used in the following pattern:
* Note that these functions depend on cpuid intrinsics only available in gcc
* 4.3 or later and MSVC 8.0 (Visual C++ 2005) or later, so they return false
* in older compilers. (This could be fixed by replacing the code with inline
* assembly.)
*
* if (mozilla::use_abc()) {
* #ifdef MOZILLA_COMPILE_WITH_ABC
* // abc-specific code here
* #endif
* } else {
* // generic code here
* }
*
* In addition, on some platforms, the headers that contain the
* intrinsics for many of these features won't compile unless we define
* a particular macro first (to pretend that we gave gcc an appropriate
* -march option). Therefore, code using this header should NOT include
* the headers for intrinsics directly, but should instead request the
* header by defining the header macro given below *before* including
* this file (which, in practice, means before including *any* header
* files).
* USING INTRINSICS
* ================
*
* The dynamic detection depends on cpuid intrinsics only available in
* gcc 4.3 or later and MSVC 8.0 (Visual C++ 2005) or later, so the
* dynamic detection returns false in older compilers. However, it
* could be extended to avoid this restriction; see the code in
* mozilla/jpeg/jdapimin.c for some hints.
* This header also provides support for coding using CPU intrinsics.
*
* Macro: MOZILLA_COMPILE_WITH_MMX
* Function: mozilla::use_mmx
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_MMX
* Header: <mmintrin.h>
* For each mozilla::supports_abc function, we define a MOZILLA_MAY_SUPPORT_ABC
* macro which indicates that the target/compiler combination we're using is
* compatible with the ABC extension. For instance, x86_64 with MSVC 2003 is
* compatible with SSE2 but not SSE3, since although there exist x86_64 CPUs
* with SSE3 support, MSVC 2003 only supports through SSE2.
*
* Macro: MOZILLA_COMPILE_WITH_SSE
* Function: mozilla::use_sse
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE
* Header: <xmmintrin.h>
* Until gcc fixes #pragma target [1] [2] or our x86 builds require SSE2,
* you'll need to separate code using intrinsics into a file separate from your
* regular code. Here's the recommended pattern:
*
* Macro: MOZILLA_COMPILE_WITH_SSE2
* Function: mozilla::use_sse2
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE2
* Header: <emmintrin.h>
* #ifdef MOZILLA_MAY_SUPPORT_ABC
* namespace mozilla {
* namespace ABC {
* void foo();
* }
* }
* #endif
*
* Macro: MOZILLA_COMPILE_WITH_SSE3
* Function: mozilla::use_sse3
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE3
* Header: <pmmintrin.h>
* void foo() {
* #ifdef MOZILLA_MAY_SUPPORT_ABC
* if (mozilla::supports_abc()) {
* mozilla::ABC::foo(); // in a separate file
* return;
* }
* #endif
*
* Macro: MOZILLA_COMPILE_WITH_SSSE3
* Function: mozilla::use_ssse3
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSSE3
* Header: <tmmintrin.h>
* foo_unvectorized();
* }
*
* Macro: MOZILLA_COMPILE_WITH_SSE4A
* Function: mozilla::use_sse4a
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4A
* Header: <ammintrin.h>
* You'll need to define mozilla::ABC::foo() in a separate file and add the
* -mabc flag when using gcc.
*
* Macro: MOZILLA_COMPILE_WITH_SSE4_1
* Function: mozilla::use_sse4_1
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4_1
* Header: <smmintrin.h>
* [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39787
* [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41201
*
* Macro: MOZILLA_COMPILE_WITH_SSE4_2
* Function: mozilla::use_sse4_2
* Header Macro: MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4_2
* Header: <nmmintrin.h>
*/
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
// GCC
// FIXME: Is any of this available on arm? GCC seems to offer mmintrin.h
#if 0
// #pragma target doesn't actually work yet; making it work depends on
// http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39787 and
// http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41201 being fixed.
// Note that this means there are cases where mozilla::use_abc() will
// return true even though we can't define MOZILLA_COMPILE_WITH_ABC.
#define MOZILLA_SSE_HAVE_PRAGMA_TARGET
#endif
#ifdef MOZILLA_SSE_HAVE_PRAGMA_TARGET
// Limit compilation to compiler versions where the *intrin.h headers
// are present. (These ifdefs actually aren't useful because they're
// all (currently) weaker than MOZILLA_SSE_HAVE_PRAGMA_TARGET.)
#if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1) // GCC 3.1 and up
#define MOZILLA_COMPILE_WITH_MMX 1
#define MOZILLA_COMPILE_WITH_SSE 1
#endif // GCC 3.1 and up
#if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 3) // GCC 3.3 and up
#define MOZILLA_COMPILE_WITH_SSE2 1
#define MOZILLA_COMPILE_WITH_SSE3 1
#endif // GCC 3.3 and up
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3) // GCC 4.3 and up
#define MOZILLA_COMPILE_WITH_SSSE3 1
#define MOZILLA_COMPILE_WITH_SSE4A 1
#define MOZILLA_COMPILE_WITH_SSE4_1 1
#define MOZILLA_COMPILE_WITH_SSE4_2 1
#endif // GCC 4.3 and up
// GCC 4.3 and GCC 4.4 shipped with SSE5, but it is being removed in GCC
// 4.5; see http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01527.html and
// http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01536.html
#else
#ifdef __MMX__
#define MOZILLA_COMPILE_WITH_MMX 1
#endif
#ifdef __SSE__
#define MOZILLA_COMPILE_WITH_SSE 1
#endif
#ifdef __SSE2__
#define MOZILLA_COMPILE_WITH_SSE2 1
#endif
#ifdef __SSE3__
#define MOZILLA_COMPILE_WITH_SSE3 1
#endif
#ifdef __SSSE3__
#define MOZILLA_COMPILE_WITH_SSSE3 1
#endif
#ifdef __SSE4A__
#define MOZILLA_COMPILE_WITH_SSE4A 1
#endif
#ifdef __SSE4_1__
#define MOZILLA_COMPILE_WITH_SSE4_1 1
#endif
#ifdef __SSE4_2__
#define MOZILLA_COMPILE_WITH_SSE4_2 1
#endif
#endif
#ifdef __MMX__
// It's ok to use MMX instructions based on the -march option (or
@@ -263,99 +186,9 @@ namespace mozilla {
#endif
// We need to #include headers quite carefully for CPUID-tested
// compilation. GCC's headers for SSE intrinsics both:
// * have #error in them when the appropriate macro is not defined, and
// * depend on intrinsics that depend on -msse, etc.
// We could fix the first of these options with #define. However, we
// can fix the second only with #pragma directives introduced in GCC
// 4.4 (but not yet working quite well enough on any gcc version), whose
// availability influenced (above) whether MOZILLA_COMPILE_WITH_* is
// defined.
#ifdef MOZILLA_SSE_HAVE_PRAGMA_TARGET
#pragma GCC push_options
#endif
#if defined(MOZILLA_COMPILE_WITH_MMX) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_MMX)
#if !defined(__MMX__)
#pragma GCC target ("mmx")
#endif
#include <mmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE)
#if !defined(__SSE__)
#pragma GCC target ("sse")
#endif
#include <xmmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE2) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE2)
#if !defined(__SSE2__)
#pragma GCC target ("sse2")
#endif
#include <emmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE3) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE3)
#if !defined(__SSE3__)
#pragma GCC target ("sse3")
#endif
#include <pmmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSSE3) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSSE3)
#if !defined(__SSSE3__)
#pragma GCC target ("ssse3")
#endif
#include <tmmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE4A) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4A)
#if !defined(__SSE4A__)
#pragma GCC target ("sse4a")
#endif
#include <ammintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE4_1) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4_1)
#if !defined(__SSE4_1__)
#pragma GCC target ("sse4.1")
#endif
#include <smmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE4_2) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE4_2)
#if !defined(__SSE4_2__)
#pragma GCC target ("sse4.2")
#endif
#include <nmmintrin.h>
#endif
#ifdef MOZILLA_SSE_HAVE_PRAGMA_TARGET
#pragma GCC pop_options
#endif
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_AMD64))
// MSVC on x86 or amd64
// Limit compilation to compiler versions where the *intrin.h headers
// are present
#if 1 // Available at least back to Visual Studio 2003
#define MOZILLA_COMPILE_WITH_MMX 1
#define MOZILLA_COMPILE_WITH_SSE 1
#define MOZILLA_COMPILE_WITH_SSE2 1
#endif // Available at least back to Visual Studio 2003
#if _MSC_VER >= 1400
#include <intrin.h>
#define MOZILLA_SSE_HAVE_CPUID_DETECTION
@@ -394,28 +227,9 @@ namespace mozilla {
#define MOZILLA_PRESUME_SSE2
#endif
#if defined(MOZILLA_COMPILE_WITH_MMX) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_MMX)
#include <mmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE)
#include <xmmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE2) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE2)
#include <emmintrin.h>
#endif
#elif defined(__SUNPRO_CC) && (defined(__i386) || defined(__x86_64__))
// Sun Studio on x86 or amd64
#define MOZILLA_COMPILE_WITH_MMX 1
#define MOZILLA_COMPILE_WITH_SSE 1
#define MOZILLA_COMPILE_WITH_SSE2 1
#define MOZILLA_SSE_HAVE_CPUID_DETECTION
namespace mozilla {
@@ -488,21 +302,6 @@ namespace mozilla {
#define MOZILLA_PRESUME_SSE2
#endif
#if defined(MOZILLA_COMPILE_WITH_MMX) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_MMX)
#include <mmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE)
#include <xmmintrin.h>
#endif
#if defined(MOZILLA_COMPILE_WITH_SSE2) && \
defined(MOZILLA_SSE_INCLUDE_HEADER_FOR_SSE2)
#include <emmintrin.h>
#endif
#endif
namespace mozilla {
@@ -537,119 +336,85 @@ namespace mozilla {
}
#if defined(MOZILLA_PRESUME_MMX)
#define MOZILLA_MAY_SUPPORT_MMX 1
inline bool supports_mmx() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_MMX 1
inline bool supports_mmx() { return sse_private::mmx_enabled; }
#else
inline bool supports_mmx() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE)
#define MOZILLA_MAY_SUPPORT_SSE 1
inline bool supports_sse() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE 1
inline bool supports_sse() { return sse_private::sse_enabled; }
#else
inline bool supports_sse() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE2)
#define MOZILLA_MAY_SUPPORT_SSE2 1
inline bool supports_sse2() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE2 1
inline bool supports_sse2() { return sse_private::sse2_enabled; }
#else
inline bool supports_sse2() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE3)
#define MOZILLA_MAY_SUPPORT_SSE3 1
inline bool supports_sse3() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE3 1
inline bool supports_sse3() { return sse_private::sse3_enabled; }
#else
inline bool supports_sse3() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSSE3)
#define MOZILLA_MAY_SUPPORT_SSSE3 1
inline bool supports_ssse3() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSSE3 1
inline bool supports_ssse3() { return sse_private::ssse3_enabled; }
#else
inline bool supports_ssse3() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE4A)
#define MOZILLA_MAY_SUPPORT_SSE4A 1
inline bool supports_sse4a() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE4A 1
inline bool supports_sse4a() { return sse_private::sse4a_enabled; }
#else
inline bool supports_sse4a() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE4_1)
#define MOZILLA_MAY_SUPPORT_SSE4_1 1
inline bool supports_sse4_1() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE4_1 1
inline bool supports_sse4_1() { return sse_private::sse4_1_enabled; }
#else
inline bool supports_sse4_1() { return false; }
#endif
#if defined(MOZILLA_PRESUME_SSE4_2)
#define MOZILLA_MAY_SUPPORT_SSE4_2 1
inline bool supports_sse4_2() { return true; }
#elif defined(MOZILLA_SSE_HAVE_CPUID_DETECTION)
#define MOZILLA_MAY_SUPPORT_SSE4_2 1
inline bool supports_sse4_2() { return sse_private::sse4_2_enabled; }
#else
inline bool supports_sse4_2() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_MMX
inline bool use_mmx() { return supports_mmx(); }
#else
inline bool use_mmx() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE
inline bool use_sse() { return supports_sse(); }
#else
inline bool use_sse() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE2
inline bool use_sse2() { return supports_sse2(); }
#else
inline bool use_sse2() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE3
inline bool use_sse3() { return supports_sse3(); }
#else
inline bool use_sse3() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSSE3
inline bool use_ssse3() { return supports_ssse3(); }
#else
inline bool use_ssse3() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE4a
inline bool use_sse4a() { return supports_sse4a(); }
#else
inline bool use_sse4a() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE4_1
inline bool use_sse4_1() { return supports_sse4_1(); }
#else
inline bool use_sse4_1() { return false; }
#endif
#ifdef MOZILLA_COMPILE_WITH_SSE4_2
inline bool use_sse4_2() { return supports_sse4_2(); }
#else
inline bool use_sse4_2() { return false; }
#endif
}
#endif /* !defined(mozilla_SSE_h_) */
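
The supports_* definitions in this file follow one shape per extension: presume the feature when the build already targets it, otherwise ask cpuid at run time, otherwise report false. A rough sketch of that shape for SSE2 follows. It assumes gcc's or clang's <cpuid.h> and its __get_cpuid helper rather than the header's own sse_private machinery, it keys the first tier on the compiler's __SSE2__ macro as an approximation of MOZILLA_PRESUME_SSE2, and every name in the sketch namespace is hypothetical.

```cpp
#include <cassert>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>  // gcc/clang helper header that provides __get_cpuid
#define SKETCH_HAVE_CPUID_DETECTION 1
#endif

namespace sketch {

#if defined(__SSE2__)
// Tier 1: the compiler is already emitting SSE2, so presume it.
#define SKETCH_MAY_SUPPORT_SSE2 1
inline bool supports_sse2() { return true; }
#elif defined(SKETCH_HAVE_CPUID_DETECTION)
// Tier 2: ask the CPU once and cache the answer.
#define SKETCH_MAY_SUPPORT_SSE2 1
inline bool query_sse2() {
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  // CPUID leaf 1 reports feature flags; EDX bit 26 is SSE2.
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
  return (edx & (1u << 26)) != 0;
}
inline bool supports_sse2() {
  static const bool enabled = query_sse2();
  return enabled;
}
#else
// Tier 3: no way to detect, so never take the vectorized path.
inline bool supports_sse2() { return false; }
#endif

// Mirrors the macro as a constant so ordinary code can inspect it.
#ifdef SKETCH_MAY_SUPPORT_SSE2
const bool kMaySupportSSE2 = true;
#else
const bool kMaySupportSSE2 = false;
#endif

// Invariant: a build that cannot support SSE2 must never report it.
inline void check_invariant() {
  assert(kMaySupportSSE2 || !supports_sse2());
}

}  // namespace sketch
```

Note that the MAY_SUPPORT macro is defined in the first two tiers only, which is what lets callers use it as a compile-time guard around references to code in the separate intrinsics file.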

@@ -138,12 +138,11 @@ int main()
printf("Feature Presume Compile Support Use\n");
#define SHOW_INFO(featurelc_, featureuc_) \
printf( "%7s %1s %1s %1s %1s\n", \
printf( "%7s %1s %1s %1s\n", \
#featurelc_, \
PRESUME_##featureuc_##_STRING, \
COMPILE_##featureuc_##_STRING, \
(mozilla::supports_##featurelc_() ? "Y" : "-"), \
(mozilla::use_##featurelc_() ? "Y" : "-"));
(mozilla::supports_##featurelc_() ? "Y" : "-"));
SHOW_INFO(mmx, MMX)
SHOW_INFO(sse, SSE)
SHOW_INFO(sse2, SSE2)