Bug 1879873 - Remove kiss fft and openmax dl. r=karlt,sylvestre

Differential Revision: https://phabricator.services.mozilla.com/D201600
2024-11-23 12:51:06 +00:00 · 2024-02-28 12:50:26 +00:00 · 2024-02-28 12:50:26 +00:00 · d309b4eb34
commit d309b4eb34
parent 26f74e078a
39 changed files with 0 additions and 13093 deletions
--- a/config/external/moz.build
+++ b/config/external/moz.build
@@ -49,9 +49,6 @@ if not CONFIG["MOZ_SYSTEM_PNG"]:
 if not CONFIG["MOZ_SYSTEM_WEBP"]:
    external_dirs += ["media/libwebp"]
-if CONFIG["TARGET_CPU"] == "arm":
-    external_dirs += ["media/openmax_dl/dl"]
 if CONFIG["MOZ_FFVPX"]:
    external_dirs += ["media/ffvpx"]
@@ -59,7 +56,6 @@ if CONFIG["MOZ_JXL"]:
    external_dirs += ["media/libjxl", "media/highway"]
 external_dirs += [
-    "media/kiss_fft",
    "media/libcubeb",
    "media/libmkv",
    "media/libnestegg",
--- a/dom/media/webaudio/moz.build
+++ b/dom/media/webaudio/moz.build
@@ -130,8 +130,6 @@ if CONFIG["TARGET_CPU"] == "aarch64" or CONFIG["BUILD_ARM_NEON"]:
    LOCAL_INCLUDES += ["/third_party/xsimd/include"]
    SOURCES += ["AudioNodeEngineNEON.cpp"]
    SOURCES["AudioNodeEngineNEON.cpp"].flags += CONFIG["NEON_FLAGS"]
-    if CONFIG["BUILD_ARM_NEON"]:
-        LOCAL_INCLUDES += ["/media/openmax_dl/dl/api/"]
 # Are we targeting x86 or x64?  If so, build SSEX files.
 if CONFIG["INTEL_ARCHITECTURE"]:
--- a/media/kiss_fft/CHANGELOG
+++ b/media/kiss_fft/CHANGELOG
@@ -1,123 +0,0 @@
 1.3.0 2012-07-18
  removed non-standard malloc.h from kiss_fft.h
  moved -lm to end of link line
  checked various return values
  converted python Numeric code to NumPy
  fixed test of int32_t on 64 bit OS
  added padding in a couple of places to allow SIMD alignment of structs
 1.2.9 2010-05-27
  threadsafe ( including OpenMP )
  first edition of kissfft.hh the C++ template fft engine
 1.2.8 
  Changed memory.h to string.h -- apparently more standard
  Added openmp extensions.  This can have fairly linear speedups for larger FFT sizes.
 1.2.7 
  Shrank the real-fft memory footprint. Thanks to Galen Seitz.
 1.2.6 (Nov 14, 2006) The "thanks to GenArts" release.
  Added multi-dimensional real-optimized FFT, see tools/kiss_fftndr
  Thanks go to GenArts, Inc. for sponsoring the development.
 1.2.5 (June 27, 2006) The "release for no good reason" release.
   Changed some harmless code to make some compilers' warnings go away.
   Added some more digits to pi -- why not.
   Added kiss_fft_next_fast_size() function to help people decide how much to pad.
   Changed multidimensional test from 8 dimensions to only 3 to avoid testing 
   problems with fixed point (sorry Buckaroo Banzai).
 1.2.4 (Oct 27, 2005)   The "oops, inverse fixed point real fft was borked" release. 
   Fixed scaling bug for inverse fixed point real fft -- also fixed test code that should've been failing.
    Thanks to Jean-Marc Valin for bug report.
   Use sys/types.h for more portable types than short,int,long => int16_t,int32_t,int64_t
   If your system does not have these, you may need to define them -- but at least it breaks in a 
   loud and easily fixable way -- unlike silently using the wrong size type.
   Hopefully tools/psdpng.c is fixed -- thanks to Steve Kellog for pointing out the weirdness.
 1.2.3 (June 25, 2005)   The "you want to use WHAT as a sample" release.
    Added ability to use 32 bit fixed point samples -- requires a 64 bit intermediate result, a la 'long long'
    Added ability to do 4 FFTs in parallel by using SSE SIMD instructions. This is accomplished by
    using the __m128 (vector of 4 floats) as kiss_fft_scalar.  Define USE_SIMD to use this.
    I know, I know ...  this is drifting a bit from the "kiss" principle, but the speed advantages 
    make it worth it for some.  Also recent gcc makes it SOO easy to use vectors of 4 floats like a POD type.
 1.2.2 (May 6, 2005)   The Matthew release
    Replaced fixed point division with multiply&shift.  Thanks to Jean-Marc Valin for 
    discussions regarding.  Considerable speedup for fixed-point.
    Corrected overflow protection in real fft routines  when using fixed point.
    Finder's Credit goes to Robert Oschler of robodance for pointing me at the bug.
    This also led to the CHECK_OVERFLOW_OP macro.
 1.2.1 (April 4, 2004) 
    compiles cleanly with just about every -W warning flag under the sun
    reorganized kiss_fft_state so it could be read-only/const. This may be useful for embedded systems
    that are willing to predeclare twiddle factors, factorization.
    Fixed C_MUL,S_MUL on 16-bit platforms.
    tmpbuf will only be allocated if input & output buffers are same
    scratchbuf will only be allocated for ffts that are not multiples of 2,3,5
    NOTE: The tmpbuf,scratchbuf changes may require synchronization code for multi-threaded apps.
 1.2 (Feb 23, 2004)
    interface change -- cfg object is forward declaration of struct instead of void*
    This maintains type safety and lets the compiler warn/error about stupid mistakes.
            (prompted by suggestion from Erik de Castro Lopo)
    small speed improvements
    added psdpng.c -- sample utility that will create png spectrum "waterfalls" from an input file
        ( not terribly useful yet)
 1.1.1 (Feb 1, 2004 )
    minor bug fix -- only affects odd rank, in-place, multi-dimensional FFTs
 1.1 : (Jan 30,2004)
    split sample_code/ into test/ and tools/
    Removed 2-D fft and added N-D fft (arbitrary)
    modified fftutil.c to allow multi-d FFTs
    Modified core fft routine to allow an input stride via kiss_fft_stride()
    (eased support of multi-D ffts)
    Added fast convolution filtering (FIR filtering using overlap-scrap method, with tail scrap)
    Add kfc.[ch]: the KISS FFT Cache. It takes care of allocs for you ( suggested by Oscar Lesta ).
 1.0.1 (Dec 15, 2003)
    fixed bug that occurred when nfft==1. Thanks to Steven Johnson.
 1.0 : (Dec 14, 2003)
    changed kiss_fft function from using a single buffer, to two buffers.
    If the same buffer pointer is supplied for both in and out, kiss will
    manage the buffer copies.
    added kiss_fft2d and kiss_fftr as separate source files (declarations in kiss_fft.h )
 0.4 :(Nov 4,2003) optimized for radix 2,3,4,5
 0.3 :(Oct 28, 2003) woops, version 2 didn't actually factor out any radices other than 2.
        Thanks to Steven Johnson for finding this one.
 0.2 :(Oct 27, 2003) added mixed radix, only radix 2,4 optimized versions
 0.1 :(May 19 2003)  initial release, radix 2 only
--- a/media/kiss_fft/COPYING
+++ b/media/kiss_fft/COPYING
@@ -1,11 +0,0 @@
 Copyright (c) 2003-2010 Mark Borgerding
 All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- a/media/kiss_fft/README
+++ b/media/kiss_fft/README
@@ -1,134 +0,0 @@
 KISS FFT - A mixed-radix Fast Fourier Transform based upon the principle, 
 "Keep It Simple, Stupid."
    There are many great fft libraries already around.  Kiss FFT is not trying
 to be better than any of them.  It only attempts to be a reasonably efficient, 
 moderately useful FFT that can use fixed or floating data types and can be 
 incorporated into someone's C program in a few minutes with trivial licensing.
 USAGE:
    The basic usage for 1-d complex FFT is:
        #include "kiss_fft.h"
        kiss_fft_cfg cfg = kiss_fft_alloc( nfft ,is_inverse_fft ,0,0 );
        while ...
            ... // put kth sample in cx_in[k].r and cx_in[k].i
            kiss_fft( cfg , cx_in , cx_out );
            ... // transformed. DC is in cx_out[0].r and cx_out[0].i 
        free(cfg);
    Note: frequency-domain data is stored from dc up to 2pi.
    so cx_out[0] is the dc bin of the FFT
    and cx_out[nfft/2] is the Nyquist bin (if exists)
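    A small sketch of the bin layout described in the note above, assuming a
    known sample rate.  bin_frequency_hz is a hypothetical helper, not part of
    kiss_fft; it only converts a bin index into the frequency that bin represents.

```c
#include <stddef.h>

/* Hypothetical helper (not a kiss_fft function): centre frequency of bin k
 * for an nfft-point transform of data sampled at sample_rate_hz.
 * Bin 0 is DC; bin nfft/2 (even nfft only) is the Nyquist bin. */
static double bin_frequency_hz(size_t k, size_t nfft, double sample_rate_hz)
{
    return (double)k * sample_rate_hz / (double)nfft;
}
```

    For example, with nfft=1024 and 44100 Hz audio, bin 512 sits at 22050 Hz.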
    Declarations are in "kiss_fft.h", along with a brief description of the 
 functions you'll need to use. 
 Code definitions for 1d complex FFTs are in kiss_fft.c.
 You can do other cool stuff with the extras you'll find in tools/
    * multi-dimensional FFTs 
    * real-optimized FFTs  (returns the positive half-spectrum: (nfft/2+1) complex frequency bins)
    * fast convolution FIR filtering (not available for fixed point)
    * spectrum image creation
 The core fft and most tools/ code can be compiled to use float, double,
 Q15 short or Q31 samples. The default is float.
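    As a rough illustration of what Q15 samples imply: a 16-bit value stands for
    a fraction scaled by 2^15, and a product is formed in a 32-bit intermediate
    and then rounded back.  The sketch below mirrors the shape of the
    sround(smul(a,b)) macros in _kiss_fft_guts.h for FRACBITS 15; q15_mul is a
    hypothetical name used only here.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two Q15 fractions with rounding.
 * Assumes >> on a negative int32_t is an arithmetic shift, which is
 * implementation-defined in C but true on common compilers. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;      /* Q30 intermediate */
    return (int16_t)((prod + (1 << 14)) >> 15);  /* round, back to Q15 */
}
```

    Here 16384 stands for 0.5, so q15_mul(16384, 16384) yields 8192, i.e. 0.25.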
 BACKGROUND:
    I started coding this because I couldn't find a fixed point FFT that didn't 
 use assembly code.  I started with floating point numbers so I could get the 
 theory straight before working on fixed point issues.  In the end, I had a 
 little bit of code that could be recompiled easily to do ffts with short, float
 or double (other types should be easy too).  
    Once I got my FFT working, I was curious about the speed compared to
 a well respected and highly optimized fft library.  I don't want to criticize 
 this great library, so let's call it FFT_BRANDX.
 During this process, I learned:
    1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
    2. It took me an embarrassingly long time to get FFT_BRANDX working.
    3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size).
    4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode.
    It is wonderful that free, highly optimized libraries like FFT_BRANDX exist.
 But such libraries carry a huge burden of complexity necessary to extract every 
 last bit of performance.
    Sometimes simpler is better, even if it's not better.
 FREQUENTLY ASKED QUESTIONS:
 	Q: Can I use kissfft in a project with a ___ license?
 	A: Yes.  See LICENSE below.
 	Q: Why don't I get the output I expect?
 	A: The two most common causes of this are 
 		1) scaling : is there a constant multiplier between what you got and what you want?
 		2) mixed build environment -- all code must be compiled with same preprocessor 
 		definitions for FIXED_POINT and kiss_fft_scalar
 	Q: Will you write/debug my code for me?
 	A: Probably not unless you pay me.  I am happy to answer pointed and topical questions, but 
 	I may refer you to a book, a forum, or some other resource.
 PERFORMANCE:
    (on Athlon XP 2100+, with gcc 2.96, float data type)
    Kiss performed 10000 1024-pt cpx ffts in .63 s of cpu time.
    For comparison, it took md5sum twice as long to process the same amount of data.
    Transforming 5 minutes of CD quality audio takes less than a second (nfft=1024). 
 DO NOT:
    ... use Kiss if you need the Fastest Fourier Transform in the World
    ... ask me to add features that will bloat the code
 UNDER THE HOOD:
    Kiss FFT uses a time decimation, mixed-radix, out-of-place FFT. If you give it an input buffer  
    and output buffer that are the same, a temporary buffer will be created to hold the data.
    No static data is used.  The core routines of kiss_fft are thread-safe (but not all of the tools directory).
    No scaling is done for the floating point version (for speed).  
    Scaling is done both ways for the fixed-point version (for overflow prevention).
    Optimized butterflies are used for factors 2,3,4, and 5. 
    The real (i.e. not complex) optimization code only works for even length ffts.  It does two half-length
    FFTs in parallel (packed into real&imag), and then combines them via twiddling.  The result is 
    nfft/2+1 complex frequency bins from DC to Nyquist.  If you don't know what this means, search the web.
    The fast convolution filtering uses the overlap-scrap method, slightly 
    modified to put the scrap at the tail.
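    To illustrate the floating-point scaling note above: with no scaling in
    either direction, a forward transform followed by an inverse transform
    returns the input multiplied by nfft.  The sketch below is a naive 4-point
    DFT in exact integer arithmetic, not kiss_fft code; cpx_i and dft4 are
    hypothetical names.

```c
typedef struct { int r; int i; } cpx_i;

/* Unscaled naive 4-point DFT.  For nfft=4 the twiddles are powers of -i
 * (forward) or +i (inverse), so integer arithmetic is exact. */
static void dft4(const cpx_i in[4], cpx_i out[4], int inverse)
{
    /* w_fwd[q] = exp(-2*pi*i*q/4) */
    static const cpx_i w_fwd[4] = { {1, 0}, {0, -1}, {-1, 0}, {0, 1} };
    int j, k;
    for (k = 0; k < 4; ++k) {
        cpx_i acc = {0, 0};
        for (j = 0; j < 4; ++j) {
            cpx_i w = w_fwd[(j * k) % 4];
            if (inverse)
                w.i = -w.i;  /* inverse uses the conjugate twiddle */
            acc.r += in[j].r * w.r - in[j].i * w.i;
            acc.i += in[j].r * w.i + in[j].i * w.r;
        }
        out[k] = acc;
    }
}
```

    Transforming {1,2,3,4} forward and then inverse gives {4,8,12,16}, so a
    caller who wants the original values back has to divide by nfft.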
 LICENSE:
    Revised BSD License, see COPYING for verbiage. 
    Basically, "free to use&change, give credit where due, no guarantees"
    Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at 
    the other end.  See http://www.fsf.org/licensing/licenses
    A commercial license is available which removes the requirement for attribution.  Contact me for details.
 TODO:
    *) Add real optimization for odd length FFTs 
    *) Document/revisit the input/output fft scaling
    *) Make doc describing the overlap (tail) scrap fast convolution filtering in kiss_fastfir.c
    *) Test all the ./tools/ code with fixed point (kiss_fastfir.c doesn't work, maybe others)
 AUTHOR:
    Mark Borgerding
    Mark@Borgerding.net
--- a/media/kiss_fft/README.simd
+++ b/media/kiss_fft/README.simd
@@ -1,78 +0,0 @@
 If you are reading this, it means you think you may be interested in using the SIMD extensions in kissfft 
 to do 4 *separate* FFTs at once.
 Beware! Beyond here there be dragons!
 This API is not easy to use, is not well documented, and breaks the KISS principle.  
 Still reading? Okay, you may get rewarded for your patience with a considerable speedup 
 (2-3x) on intel x86 machines with SSE if you are willing to jump through some hoops.
 The basic idea is to use the packed 4 float __m128 data type as a scalar element.  
 This means that the format is pretty convoluted. It performs 4 FFTs per fft call on signals A,B,C,D.
 For complex data, the data is interlaced as follows:
 rA0,rB0,rC0,rD0,      iA0,iB0,iC0,iD0,   rA1,rB1,rC1,rD1, iA1,iB1,iC1,iD1 ...
 where "rA0" is the real part of the zeroth sample for signal A
 Real-only data is laid out:
 rA0,rB0,rC0,rD0,     rA1,rB1,rC1,rD1,      ... 
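 A portable scalar sketch of the real-only layout above, with no SSE intrinsics.
 pack4 and unpack4 are hypothetical names, not part of kissfft.  They assume the
 four signals A,B,C,D are stored back to back in one source array, n samples each.

```c
#include <stddef.h>

/* Interleave four length-n signals stored back to back in src
 * (A then B then C then D) into rA0,rB0,rC0,rD0, rA1,rB1,rC1,rD1, ... */
static void pack4(float *dst, const float *src, size_t n)
{
    size_t k, s;
    for (k = 0; k < n; ++k)
        for (s = 0; s < 4; ++s)
            dst[4 * k + s] = src[s * n + k];
}

/* Inverse of pack4: split the interleaved data back into four signals. */
static void unpack4(float *dst, const float *src, size_t n)
{
    size_t k, s;
    for (k = 0; k < n; ++k)
        for (s = 0; s < 4; ++s)
            dst[s * n + k] = src[4 * k + s];
}
```

 This is a 4xN transposition followed by its Nx4 inverse, done out of place.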
 Compile with gcc flags something like
 -O3 -mpreferred-stack-boundary=4  -DUSE_SIMD=1 -msse 
 Be aware of SIMD alignment.  This is the most likely cause of segfaults.  
 The code within kissfft uses scratch variables on the stack.  
 With SIMD, these must have addresses on 16 byte boundaries.  
 Search on "SIMD alignment" for more info.
 Robin at Divide Concept was kind enough to share his code for formatting to/from the SIMD kissfft.  
 I have not run it -- use it at your own risk.  It appears to do 4xN and Nx4 transpositions 
 (out of place).
 void SSETools::pack128(float* target, float* source, unsigned long size128)
 {
   __m128* pDest = (__m128*)target;
   __m128* pDestEnd = pDest+size128;
   float* source0=source;
   float* source1=source0+size128;
   float* source2=source1+size128;
   float* source3=source2+size128;
   while(pDest<pDestEnd)
   {
       *pDest=_mm_set_ps(*source3,*source2,*source1,*source0);
       source0++;
       source1++;
       source2++;
       source3++;
       pDest++;
   }
 }
 void SSETools::unpack128(float* target, float* source, unsigned long size128)
 {
   float* pSrc = source;
   float* pSrcEnd = pSrc+size128*4;
   float* target0=target;
   float* target1=target0+size128;
   float* target2=target1+size128;
   float* target3=target2+size128;
   while(pSrc<pSrcEnd)
   {
       *target0=pSrc[0];
       *target1=pSrc[1];
       *target2=pSrc[2];
       *target3=pSrc[3];
       target0++;
       target1++;
       target2++;
       target3++;
       pSrc+=4;
   }
 } 
--- a/media/kiss_fft/_kiss_fft_guts.h
+++ b/media/kiss_fft/_kiss_fft_guts.h
@@ -1,164 +0,0 @@
 /*
 Copyright (c) 2003-2010, Mark Borgerding
 All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
 /* kiss_fft.h
   defines kiss_fft_scalar as either short or a float type
   and defines
   typedef struct { kiss_fft_scalar r; kiss_fft_scalar i; }kiss_fft_cpx; */
 #include "kiss_fft.h"
 #include <limits.h>
 #define MAXFACTORS 32
 /* e.g. an fft of length 128 has 4 factors 
 as far as kissfft is concerned
 4*4*4*2
 */
 struct kiss_fft_state{
    int nfft;
    int inverse;
    int factors[2*MAXFACTORS];
    kiss_fft_cpx twiddles[1];
 };
 /*
  Explanation of macros dealing with complex math:
   C_MUL(m,a,b)         : m = a*b
   C_FIXDIV( c , div )  : if a fixed point impl., c /= div. noop otherwise
   C_SUB( res, a,b)     : res = a - b
   C_SUBFROM( res , a)  : res -= a
   C_ADDTO( res , a)    : res += a
 * */
 #ifdef FIXED_POINT
 #if (FIXED_POINT==32)
 # define FRACBITS 31
 # define SAMPPROD int64_t
 #define SAMP_MAX 2147483647
 #else
 # define FRACBITS 15
 # define SAMPPROD int32_t 
 #define SAMP_MAX 32767
 #endif
 #define SAMP_MIN -SAMP_MAX
 #if defined(CHECK_OVERFLOW)
 #  define CHECK_OVERFLOW_OP(a,op,b)  \
 	if ( (SAMPPROD)(a) op (SAMPPROD)(b) > SAMP_MAX || (SAMPPROD)(a) op (SAMPPROD)(b) < SAMP_MIN ) { \
 		fprintf(stderr,"WARNING:overflow @ " __FILE__ "(%d): (%d " #op" %d) = %ld\n",__LINE__,(a),(b),(SAMPPROD)(a) op (SAMPPROD)(b) );  }
 #endif
 #   define smul(a,b) ( (SAMPPROD)(a)*(b) )
 #   define sround( x )  (kiss_fft_scalar)( ( (x) + (1<<(FRACBITS-1)) ) >> FRACBITS )
 #   define S_MUL(a,b) sround( smul(a,b) )
 #   define C_MUL(m,a,b) \
      do{ (m).r = sround( smul((a).r,(b).r) - smul((a).i,(b).i) ); \
          (m).i = sround( smul((a).r,(b).i) + smul((a).i,(b).r) ); }while(0)
 #   define DIVSCALAR(x,k) \
 	(x) = sround( smul(  x, SAMP_MAX/k ) )
 #   define C_FIXDIV(c,div) \
 	do {    DIVSCALAR( (c).r , div);  \
 		DIVSCALAR( (c).i  , div); }while (0)
 #   define C_MULBYSCALAR( c, s ) \
    do{ (c).r =  sround( smul( (c).r , s ) ) ;\
        (c).i =  sround( smul( (c).i , s ) ) ; }while(0)
 #else  /* not FIXED_POINT*/
 #   define S_MUL(a,b) ( (a)*(b) )
 #define C_MUL(m,a,b) \
    do{ (m).r = (a).r*(b).r - (a).i*(b).i;\
        (m).i = (a).r*(b).i + (a).i*(b).r; }while(0)
 #   define C_FIXDIV(c,div) /* NOOP */
 #   define C_MULBYSCALAR( c, s ) \
    do{ (c).r *= (s);\
        (c).i *= (s); }while(0)
 #endif
 #ifndef CHECK_OVERFLOW_OP
 #  define CHECK_OVERFLOW_OP(a,op,b) /* noop */
 #endif
 #define  C_ADD( res, a,b)\
    do { \
 	    CHECK_OVERFLOW_OP((a).r,+,(b).r)\
 	    CHECK_OVERFLOW_OP((a).i,+,(b).i)\
 	    (res).r=(a).r+(b).r;  (res).i=(a).i+(b).i; \
    }while(0)
 #define  C_SUB( res, a,b)\
    do { \
 	    CHECK_OVERFLOW_OP((a).r,-,(b).r)\
 	    CHECK_OVERFLOW_OP((a).i,-,(b).i)\
 	    (res).r=(a).r-(b).r;  (res).i=(a).i-(b).i; \
    }while(0)
 #define C_ADDTO( res , a)\
    do { \
 	    CHECK_OVERFLOW_OP((res).r,+,(a).r)\
 	    CHECK_OVERFLOW_OP((res).i,+,(a).i)\
 	    (res).r += (a).r;  (res).i += (a).i;\
    }while(0)
 #define C_SUBFROM( res , a)\
    do {\
 	    CHECK_OVERFLOW_OP((res).r,-,(a).r)\
 	    CHECK_OVERFLOW_OP((res).i,-,(a).i)\
 	    (res).r -= (a).r;  (res).i -= (a).i; \
    }while(0)
 #ifdef FIXED_POINT
 #  define KISS_FFT_COS(phase)  floor(.5+SAMP_MAX * cos (phase))
 #  define KISS_FFT_SIN(phase)  floor(.5+SAMP_MAX * sin (phase))
 #  define HALF_OF(x) ((x)>>1)
 #elif defined(USE_SIMD)
 #  define KISS_FFT_COS(phase) _mm_set1_ps( cos(phase) )
 #  define KISS_FFT_SIN(phase) _mm_set1_ps( sin(phase) )
 #  define HALF_OF(x) ((x)*_mm_set1_ps(.5))
 #else
 #  define KISS_FFT_COS(phase) (kiss_fft_scalar) cos(phase)
 #  define KISS_FFT_SIN(phase) (kiss_fft_scalar) sin(phase)
 #  define HALF_OF(x) ((x)*.5)
 #endif
 #define  kf_cexp(x,phase) \
 	do{ \
 		(x)->r = KISS_FFT_COS(phase);\
 		(x)->i = KISS_FFT_SIN(phase);\
 	}while(0)
 /* a debugging function */
 #define pcpx(c)\
    fprintf(stderr,"%g + %gi\n",(double)((c)->r),(double)((c)->i) )
 #ifdef KISS_FFT_USE_ALLOCA
 // define this to allow use of alloca instead of malloc for temporary buffers
 // Temporary buffers are used in two cases: 
 // 1. FFT sizes that have "bad" factors. i.e. not 2,3 and 5
 // 2. "in-place" FFTs.  Notice the quotes, since kissfft does not really do an in-place transform.
 #include <alloca.h>
 #define  KISS_FFT_TMP_ALLOC(nbytes) alloca(nbytes)
 #define  KISS_FFT_TMP_FREE(ptr) 
 #else
 #define  KISS_FFT_TMP_ALLOC(nbytes) KISS_FFT_MALLOC(nbytes)
 #define  KISS_FFT_TMP_FREE(ptr) KISS_FFT_FREE(ptr)
 #endif
--- a/media/kiss_fft/kiss_fft.c
+++ b/media/kiss_fft/kiss_fft.c
@@ -1,408 +0,0 @@
 /*
 Copyright (c) 2003-2010, Mark Borgerding
 All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
 #include "_kiss_fft_guts.h"
 /* The guts header contains all the multiplication and addition macros that are defined for
 fixed or floating point complex numbers.  It also declares the kf_ internal functions.
 */
 static void kf_bfly2(
        kiss_fft_cpx * Fout,
        const size_t fstride,
        const kiss_fft_cfg st,
        int m
        )
 {
    kiss_fft_cpx * Fout2;
    kiss_fft_cpx * tw1 = st->twiddles;
    kiss_fft_cpx t;
    Fout2 = Fout + m;
    do{
        C_FIXDIV(*Fout,2); C_FIXDIV(*Fout2,2);
        C_MUL (t,  *Fout2 , *tw1);
        tw1 += fstride;
        C_SUB( *Fout2 ,  *Fout , t );
        C_ADDTO( *Fout ,  t );
        ++Fout2;
        ++Fout;
    }while (--m);
 }
 static void kf_bfly4(
        kiss_fft_cpx * Fout,
        const size_t fstride,
        const kiss_fft_cfg st,
        const size_t m
        )
 {
    kiss_fft_cpx *tw1,*tw2,*tw3;
    kiss_fft_cpx scratch[6];
    size_t k=m;
    const size_t m2=2*m;
    const size_t m3=3*m;
    tw3 = tw2 = tw1 = st->twiddles;
    do {
        C_FIXDIV(*Fout,4); C_FIXDIV(Fout[m],4); C_FIXDIV(Fout[m2],4); C_FIXDIV(Fout[m3],4);
        C_MUL(scratch[0],Fout[m] , *tw1 );
        C_MUL(scratch[1],Fout[m2] , *tw2 );
        C_MUL(scratch[2],Fout[m3] , *tw3 );
        C_SUB( scratch[5] , *Fout, scratch[1] );
        C_ADDTO(*Fout, scratch[1]);
        C_ADD( scratch[3] , scratch[0] , scratch[2] );
        C_SUB( scratch[4] , scratch[0] , scratch[2] );
        C_SUB( Fout[m2], *Fout, scratch[3] );
        tw1 += fstride;
        tw2 += fstride*2;
        tw3 += fstride*3;
        C_ADDTO( *Fout , scratch[3] );
        if(st->inverse) {
            Fout[m].r = scratch[5].r - scratch[4].i;
            Fout[m].i = scratch[5].i + scratch[4].r;
            Fout[m3].r = scratch[5].r + scratch[4].i;
            Fout[m3].i = scratch[5].i - scratch[4].r;
        }else{
            Fout[m].r = scratch[5].r + scratch[4].i;
            Fout[m].i = scratch[5].i - scratch[4].r;
            Fout[m3].r = scratch[5].r - scratch[4].i;
            Fout[m3].i = scratch[5].i + scratch[4].r;
        }
        ++Fout;
    }while(--k);
 }
 static void kf_bfly3(
         kiss_fft_cpx * Fout,
         const size_t fstride,
         const kiss_fft_cfg st,
         size_t m
         )
 {
     size_t k=m;
     const size_t m2 = 2*m;
     kiss_fft_cpx *tw1,*tw2;
     kiss_fft_cpx scratch[5];
     kiss_fft_cpx epi3;
     epi3 = st->twiddles[fstride*m];
     tw1=tw2=st->twiddles;
     do{
         C_FIXDIV(*Fout,3); C_FIXDIV(Fout[m],3); C_FIXDIV(Fout[m2],3);
         C_MUL(scratch[1],Fout[m] , *tw1);
         C_MUL(scratch[2],Fout[m2] , *tw2);
         C_ADD(scratch[3],scratch[1],scratch[2]);
         C_SUB(scratch[0],scratch[1],scratch[2]);
         tw1 += fstride;
         tw2 += fstride*2;
         Fout[m].r = Fout->r - HALF_OF(scratch[3].r);
         Fout[m].i = Fout->i - HALF_OF(scratch[3].i);
         C_MULBYSCALAR( scratch[0] , epi3.i );
         C_ADDTO(*Fout,scratch[3]);
         Fout[m2].r = Fout[m].r + scratch[0].i;
         Fout[m2].i = Fout[m].i - scratch[0].r;
         Fout[m].r -= scratch[0].i;
         Fout[m].i += scratch[0].r;
         ++Fout;
     }while(--k);
 }
 static void kf_bfly5(
        kiss_fft_cpx * Fout,
        const size_t fstride,
        const kiss_fft_cfg st,
        int m
        )
 {
    kiss_fft_cpx *Fout0,*Fout1,*Fout2,*Fout3,*Fout4;
    int u;
    kiss_fft_cpx scratch[13];
    kiss_fft_cpx * twiddles = st->twiddles;
    kiss_fft_cpx *tw;
    kiss_fft_cpx ya,yb;
    ya = twiddles[fstride*m];
    yb = twiddles[fstride*2*m];
    Fout0=Fout;
    Fout1=Fout0+m;
    Fout2=Fout0+2*m;
    Fout3=Fout0+3*m;
    Fout4=Fout0+4*m;
    tw=st->twiddles;
    for ( u=0; u<m; ++u ) {
        C_FIXDIV( *Fout0,5); C_FIXDIV( *Fout1,5); C_FIXDIV( *Fout2,5); C_FIXDIV( *Fout3,5); C_FIXDIV( *Fout4,5);
        scratch[0] = *Fout0;
        C_MUL(scratch[1] ,*Fout1, tw[u*fstride]);
        C_MUL(scratch[2] ,*Fout2, tw[2*u*fstride]);
        C_MUL(scratch[3] ,*Fout3, tw[3*u*fstride]);
        C_MUL(scratch[4] ,*Fout4, tw[4*u*fstride]);
        C_ADD( scratch[7],scratch[1],scratch[4]);
        C_SUB( scratch[10],scratch[1],scratch[4]);
        C_ADD( scratch[8],scratch[2],scratch[3]);
        C_SUB( scratch[9],scratch[2],scratch[3]);
        Fout0->r += scratch[7].r + scratch[8].r;
        Fout0->i += scratch[7].i + scratch[8].i;
        scratch[5].r = scratch[0].r + S_MUL(scratch[7].r,ya.r) + S_MUL(scratch[8].r,yb.r);
        scratch[5].i = scratch[0].i + S_MUL(scratch[7].i,ya.r) + S_MUL(scratch[8].i,yb.r);
        scratch[6].r =  S_MUL(scratch[10].i,ya.i) + S_MUL(scratch[9].i,yb.i);
        scratch[6].i = -S_MUL(scratch[10].r,ya.i) - S_MUL(scratch[9].r,yb.i);
        C_SUB(*Fout1,scratch[5],scratch[6]);
        C_ADD(*Fout4,scratch[5],scratch[6]);
        scratch[11].r = scratch[0].r + S_MUL(scratch[7].r,yb.r) + S_MUL(scratch[8].r,ya.r);
        scratch[11].i = scratch[0].i + S_MUL(scratch[7].i,yb.r) + S_MUL(scratch[8].i,ya.r);
        scratch[12].r = - S_MUL(scratch[10].i,yb.i) + S_MUL(scratch[9].i,ya.i);
        scratch[12].i = S_MUL(scratch[10].r,yb.i) - S_MUL(scratch[9].r,ya.i);
        C_ADD(*Fout2,scratch[11],scratch[12]);
        C_SUB(*Fout3,scratch[11],scratch[12]);
        ++Fout0;++Fout1;++Fout2;++Fout3;++Fout4;
    }
 }
 /* perform the butterfly for one stage of a mixed radix FFT */
 static void kf_bfly_generic(
        kiss_fft_cpx * Fout,
        const size_t fstride,
        const kiss_fft_cfg st,
        int m,
        int p
        )
 {
    int u,k,q1,q;
    kiss_fft_cpx * twiddles = st->twiddles;
    kiss_fft_cpx t;
    int Norig = st->nfft;
    kiss_fft_cpx * scratch = (kiss_fft_cpx*)KISS_FFT_TMP_ALLOC(sizeof(kiss_fft_cpx)*p);
    for ( u=0; u<m; ++u ) {
        k=u;
        for ( q1=0 ; q1<p ; ++q1 ) {
            scratch[q1] = Fout[ k  ];
            C_FIXDIV(scratch[q1],p);
            k += m;
        }
        k=u;
        for ( q1=0 ; q1<p ; ++q1 ) {
            int twidx=0;
            Fout[ k ] = scratch[0];
            for (q=1;q<p;++q ) {
                twidx += fstride * k;
                if (twidx>=Norig) twidx-=Norig;
                C_MUL(t,scratch[q] , twiddles[twidx] );
                C_ADDTO( Fout[ k ] ,t);
            }
            k += m;
        }
    }
    KISS_FFT_TMP_FREE(scratch);
 }
 static
 void kf_work(
        kiss_fft_cpx * Fout,
        const kiss_fft_cpx * f,
        const size_t fstride,
        int in_stride,
        int * factors,
        const kiss_fft_cfg st
        )
 {
    kiss_fft_cpx * Fout_beg=Fout;
    const int p=*factors++; /* the radix  */
    const int m=*factors++; /* stage's fft length/p */
    const kiss_fft_cpx * Fout_end = Fout + p*m;
 #ifdef _OPENMP
    // use openmp extensions at the 
    // top-level (not recursive)
    if (fstride==1 && p<=5)
    {
        int k;
        // execute the p different work units in different threads
 #       pragma omp parallel for
        for (k=0;k<p;++k) 
            kf_work( Fout +k*m, f+ fstride*in_stride*k,fstride*p,in_stride,factors,st);
        // all threads have joined by this point
        switch (p) {
            case 2: kf_bfly2(Fout,fstride,st,m); break;
            case 3: kf_bfly3(Fout,fstride,st,m); break; 
            case 4: kf_bfly4(Fout,fstride,st,m); break;
            case 5: kf_bfly5(Fout,fstride,st,m); break; 
            default: kf_bfly_generic(Fout,fstride,st,m,p); break;
        }
        return;
    }
 #endif
    if (m==1) {
        do{
            *Fout = *f;
            f += fstride*in_stride;
        }while(++Fout != Fout_end );
    }else{
        do{
            // recursive call:
            // DFT of size m*p performed by doing
            // p instances of smaller DFTs of size m, 
            // each one takes a decimated version of the input
            kf_work( Fout , f, fstride*p, in_stride, factors,st);
            f += fstride*in_stride;
        }while( (Fout += m) != Fout_end );
    }
    Fout=Fout_beg;
    // recombine the p smaller DFTs 
    switch (p) {
        case 2: kf_bfly2(Fout,fstride,st,m); break;
        case 3: kf_bfly3(Fout,fstride,st,m); break; 
        case 4: kf_bfly4(Fout,fstride,st,m); break;
        case 5: kf_bfly5(Fout,fstride,st,m); break; 
        default: kf_bfly_generic(Fout,fstride,st,m,p); break;
    }
 }
 /*  facbuf is populated by p1,m1,p2,m2, ...
    where 
    p[i] * m[i] = m[i-1]
    m0 = n                  */
 static 
 void kf_factor(int n,int * facbuf)
 {
    int p=4;
    double floor_sqrt;
    floor_sqrt = floor( sqrt((double)n) );
    /*factor out powers of 4, powers of 2, then any remaining primes */
    do {
        while (n % p) {
            switch (p) {
                case 4: p = 2; break;
                case 2: p = 3; break;
                default: p += 2; break;
            }
            if (p > floor_sqrt)
                p = n;          /* no more factors, skip to end */
        }
        n /= p;
        *facbuf++ = p;
        *facbuf++ = n;
    } while (n > 1);
 }
 /*
 *
 * User-callable function to allocate all necessary storage space for the fft.
 *
 * The return value is a contiguous block of memory, allocated with malloc.  As such,
 * it can be freed with free(), rather than a kiss_fft-specific function.
 * */
 kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem )
 {
    kiss_fft_cfg st=NULL;
    size_t memneeded = sizeof(struct kiss_fft_state)
        + sizeof(kiss_fft_cpx)*(nfft-1); /* twiddle factors*/
    if ( lenmem==NULL ) {
        st = ( kiss_fft_cfg)KISS_FFT_MALLOC( memneeded );
    }else{
        if (mem != NULL && *lenmem >= memneeded)
            st = (kiss_fft_cfg)mem;
        *lenmem = memneeded;
    }
    if (st) {
        int i;
        st->nfft=nfft;
        st->inverse = inverse_fft;
        for (i=0;i<nfft;++i) {
            const double pi=3.141592653589793238462643383279502884197169399375105820974944;
            double phase = -2*pi*i / nfft;
            if (st->inverse)
                phase *= -1;
            kf_cexp(st->twiddles+i, phase );
        }
        kf_factor(nfft,st->factors);
    }
    return st;
 }
 void kiss_fft_stride(kiss_fft_cfg st,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int in_stride)
 {
    if (fin == fout) {
        //NOTE: this is not really an in-place FFT algorithm.
        //It just performs an out-of-place FFT into a temp buffer
        kiss_fft_cpx * tmpbuf = (kiss_fft_cpx*)KISS_FFT_TMP_ALLOC( sizeof(kiss_fft_cpx)*st->nfft);
        kf_work(tmpbuf,fin,1,in_stride, st->factors,st);
        memcpy(fout,tmpbuf,sizeof(kiss_fft_cpx)*st->nfft);
        KISS_FFT_TMP_FREE(tmpbuf);
    }else{
        kf_work( fout, fin, 1,in_stride, st->factors,st );
    }
 }
 void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout)
 {
    kiss_fft_stride(cfg,fin,fout,1);
 }
 void kiss_fft_cleanup(void)
 {
    // nothing needed any more
 }
 int kiss_fft_next_fast_size(int n)
 {
    while(1) {
        int m=n;
        while ( (m%2) == 0 ) m/=2;
        while ( (m%3) == 0 ) m/=3;
        while ( (m%5) == 0 ) m/=5;
        if (m<=1)
            break; /* n is completely factorable by twos, threes, and fives */
        n++;
    }
    return n;
 }
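
Reviewer note: the removed kf_factor and kiss_fft_next_fast_size helpers above decide how a transform length is split into radix stages. The sketch below is an illustrative re-implementation, not the vendored code. The names sketch_factor and sketch_next_fast_size are hypothetical, and the sqrt() test from the original is swapped for an equivalent integer comparison so the example needs no math library.

```c
#include <stddef.h>

/* Illustrative only: fills facbuf with (radix, remaining length) pairs in the
 * p1,m1,p2,m2,... layout described above and returns the number of pairs.
 * Assumes n >= 1 and that facbuf has room for at least 64 ints. */
int sketch_factor(int n, int *facbuf)
{
    const int n0 = n; /* original length, used for the stop test */
    int p = 4;
    int stages = 0;

    /* factor out powers of 4, powers of 2, then any remaining primes */
    do {
        while (n % p) {
            switch (p) {
                case 4: p = 2; break;
                case 2: p = 3; break;
                default: p += 2; break;
            }
            /* Integer form of the original "p > floor(sqrt(n0))" check. */
            if ((long long)p * p > n0)
                p = n; /* no more small factors, take what is left */
        }
        n /= p;
        facbuf[2 * stages] = p;
        facbuf[2 * stages + 1] = n;
        ++stages;
    } while (n > 1);
    return stages;
}

/* Smallest k >= n whose only prime factors are 2, 3 and 5 (assumes n >= 1). */
int sketch_next_fast_size(int n)
{
    for (;;) {
        int m = n;
        while ((m % 2) == 0) m /= 2;
        while ((m % 3) == 0) m /= 3;
        while ((m % 5) == 0) m /= 5;
        if (m <= 1)
            return n;
        ++n;
    }
}
```

For a length of 12 this yields the pairs (4,3) and (3,1): one radix-4 stage over sub-transforms of length 3, then a radix-3 stage. Rounding a length up with sketch_next_fast_size keeps every stage on the specialised radix-2/3/4/5 butterflies rather than the generic one.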
--- a/media/kiss_fft/kiss_fft.h
+++ b/media/kiss_fft/kiss_fft.h
@ -1,124 +0,0 @@
 #ifndef KISS_FFT_H
 #define KISS_FFT_H
 #include <stdlib.h>
 #include <stdio.h>
 #include <math.h>
 #include <string.h>
 #ifdef __cplusplus
 extern "C" {
 #endif
 /*
 ATTENTION!
 If you would like:
 -- a utility that will handle the caching of fft objects
 -- real-only (no imaginary time component ) FFT
 -- a multi-dimensional FFT
 -- a command-line utility to perform ffts
 -- a command-line utility to perform fast-convolution filtering
 Then see kfc.h kiss_fftr.h kiss_fftnd.h fftutil.c kiss_fastfir.c
  in the tools/ directory.
 */
 #ifdef USE_SIMD
 # include <xmmintrin.h>
 # define kiss_fft_scalar __m128
 #define KISS_FFT_MALLOC(nbytes) _mm_malloc(nbytes,16)
 #define KISS_FFT_FREE _mm_free
 #else	
 #define KISS_FFT_MALLOC malloc
 #define KISS_FFT_FREE free
 #endif	
 #ifdef FIXED_POINT
 #include <sys/types.h>	
 # if (FIXED_POINT == 32)
 #  define kiss_fft_scalar int32_t
 # else	
 #  define kiss_fft_scalar int16_t
 # endif
 #else
 # ifndef kiss_fft_scalar
 /*  default is float */
 #   define kiss_fft_scalar float
 # endif
 #endif
 typedef struct {
    kiss_fft_scalar r;
    kiss_fft_scalar i;
 }kiss_fft_cpx;
 typedef struct kiss_fft_state* kiss_fft_cfg;
 /* 
 *  kiss_fft_alloc
 *  
 *  Initialize a FFT (or IFFT) algorithm's cfg/state buffer.
 *
 *  typical usage:      kiss_fft_cfg mycfg=kiss_fft_alloc(1024,0,NULL,NULL);
 *
 *  The return value from fft_alloc is a cfg buffer used internally
 *  by the fft routine or NULL.
 *
 *  If lenmem is NULL, then kiss_fft_alloc will allocate a cfg buffer using malloc.
 *  The returned value should be free()d when done to avoid memory leaks.
 *  
 *  The state can be placed in a user supplied buffer 'mem':
 *  If lenmem is not NULL and mem is not NULL and *lenmem is large enough,
 *      then the function places the cfg in mem and the size used in *lenmem
 *      and returns mem.
 *  
 *  If lenmem is not NULL and ( mem is NULL or *lenmem is not large enough),
 *      then the function returns NULL and places the minimum cfg 
 *      buffer size in *lenmem.
 * */
 kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem); 
 /*
 * kiss_fft(cfg,fin,fout)
 *
 * Perform an FFT on a complex input buffer.
 * for a forward FFT,
 * fin should be  f[0] , f[1] , ... ,f[nfft-1]
 * fout will be   F[0] , F[1] , ... ,F[nfft-1]
 * Note that each element is complex and can be accessed like
    f[k].r and f[k].i
 * */
 void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout);
 /*
 A more generic version of the above function. It reads its input from every Nth sample.
 * */
 void kiss_fft_stride(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int fin_stride);
 /* If kiss_fft_alloc allocated a buffer, it is one contiguous 
   buffer and can be simply free()d when no longer needed*/
 #define kiss_fft_free free
 /*
 Cleans up some memory that gets managed internally. Not necessary to call, but it might clean up 
 your compiler output to call this before you exit.
 */
 void kiss_fft_cleanup(void);
 /*
 * Returns the smallest integer k, such that k>=n and k has only "fast" factors (2,3,5)
 */
 int kiss_fft_next_fast_size(int n);
 /* for real ffts, we need an even size */
 #define kiss_fftr_next_fast_size_real(n) \
        (kiss_fft_next_fast_size( ((n)+1)>>1)<<1)
 #ifdef __cplusplus
 } 
 #endif
 #endif
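
Reviewer note: the kiss_fft_alloc comment in the removed header describes a two-pass protocol (query the size through lenmem, then build the state inside a caller-supplied buffer). The sketch below mimics that contract with a hypothetical toy type so it can run without the removed library. sketch_state, sketch_alloc and sketch_two_pass are invented names, and the table contents are placeholders rather than real twiddle factors.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kiss_fft_state: a header followed by a
 * variable-length table, all in one contiguous block. */
struct sketch_state {
    int nfft;
    float table[]; /* C99 flexible array member */
};

/* Follows the contract documented for kiss_fft_alloc:
 *  - lenmem == NULL: allocate with malloc
 *  - lenmem != NULL: use mem if it is non-NULL and *lenmem is large enough,
 *    otherwise return NULL; the required size is always written to *lenmem. */
struct sketch_state *sketch_alloc(int nfft, void *mem, size_t *lenmem)
{
    struct sketch_state *st = NULL;
    size_t needed;
    int i;

    if (nfft < 1)
        return NULL;
    needed = sizeof(struct sketch_state) + sizeof(float) * (size_t)nfft;

    if (lenmem == NULL) {
        st = malloc(needed);
    } else {
        if (mem != NULL && *lenmem >= needed)
            st = mem;
        *lenmem = needed;
    }
    if (st != NULL) {
        st->nfft = nfft;
        for (i = 0; i < nfft; ++i)
            st->table[i] = (float)i; /* placeholder for twiddle setup */
    }
    return st;
}

/* Two-pass use: query the size, allocate, then build in place.
 * Returns 0 on success and -1 on failure. */
int sketch_two_pass(int nfft)
{
    size_t len = 0;
    void *buf;
    struct sketch_state *st;
    int ok;

    /* Pass 1: mem == NULL, so no state is built and only the size is reported. */
    if (sketch_alloc(nfft, NULL, &len) != NULL || len == 0)
        return -1;
    buf = malloc(len);
    if (buf == NULL)
        return -1;
    /* Pass 2: the state is placed inside the caller's buffer. */
    st = sketch_alloc(nfft, buf, &len);
    ok = (st == buf) && (st->nfft == nfft);
    free(buf);
    return ok ? 0 : -1;
}
```

The point of the second form is that a caller can embed the state in a larger allocation of its own, which is how the removed kiss_fftr_alloc places its complex sub-state directly after its own struct.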
--- a/media/kiss_fft/kiss_fftr.c
+++ b/media/kiss_fft/kiss_fftr.c
@ -1,159 +0,0 @@
 /*
 Copyright (c) 2003-2004, Mark Borgerding
 All rights reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
 #include "kiss_fftr.h"
 #include "_kiss_fft_guts.h"
 struct kiss_fftr_state{
    kiss_fft_cfg substate;
    kiss_fft_cpx * tmpbuf;
    kiss_fft_cpx * super_twiddles;
 #ifdef USE_SIMD
    void * pad;
 #endif
 };
 kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem)
 {
    int i;
    kiss_fftr_cfg st = NULL;
    size_t subsize, memneeded;
    if (nfft & 1) {
        fprintf(stderr,"Real FFT optimization must be even.\n");
        return NULL;
    }
    nfft >>= 1;
    kiss_fft_alloc (nfft, inverse_fft, NULL, &subsize);
    memneeded = sizeof(struct kiss_fftr_state) + subsize + sizeof(kiss_fft_cpx) * ( nfft * 3 / 2);
    if (lenmem == NULL) {
        st = (kiss_fftr_cfg) KISS_FFT_MALLOC (memneeded);
    } else {
        if (*lenmem >= memneeded)
            st = (kiss_fftr_cfg) mem;
        *lenmem = memneeded;
    }
    if (!st)
        return NULL;
    st->substate = (kiss_fft_cfg) (st + 1); /*just beyond kiss_fftr_state struct */
    st->tmpbuf = (kiss_fft_cpx *) (((char *) st->substate) + subsize);
    st->super_twiddles = st->tmpbuf + nfft;
    kiss_fft_alloc(nfft, inverse_fft, st->substate, &subsize);
    for (i = 0; i < nfft/2; ++i) {
        double phase =
            -3.14159265358979323846264338327 * ((double) (i+1) / nfft + .5);
        if (inverse_fft)
            phase *= -1;
        kf_cexp (st->super_twiddles+i,phase);
    }
    return st;
 }
 void kiss_fftr(kiss_fftr_cfg st,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata)
 {
    /* input buffer timedata is stored row-wise */
    int k,ncfft;
    kiss_fft_cpx fpnk,fpk,f1k,f2k,tw,tdc;
    if ( st->substate->inverse) {
        fprintf(stderr,"kiss fft usage error: improper alloc\n");
        exit(1);
    }
    ncfft = st->substate->nfft;
    /*perform the parallel fft of two real signals packed in real,imag*/
    kiss_fft( st->substate , (const kiss_fft_cpx*)timedata, st->tmpbuf );
    /* The real part of the DC element of the frequency spectrum in st->tmpbuf
     * contains the sum of the even-numbered elements of the input time sequence
     * The imag part is the sum of the odd-numbered elements
     *
     * The sum of tdc.r and tdc.i is the sum of the input time sequence. 
     *      yielding DC of input time sequence
     * The difference tdc.r - tdc.i is the dot product of the input with [1,-1,1,-1,...],
     *      yielding Nyquist bin of input time sequence
     */
    tdc.r = st->tmpbuf[0].r;
    tdc.i = st->tmpbuf[0].i;
    C_FIXDIV(tdc,2);
    CHECK_OVERFLOW_OP(tdc.r ,+, tdc.i);
    CHECK_OVERFLOW_OP(tdc.r ,-, tdc.i);
    freqdata[0].r = tdc.r + tdc.i;
    freqdata[ncfft].r = tdc.r - tdc.i;
 #ifdef USE_SIMD    
    freqdata[ncfft].i = freqdata[0].i = _mm_set1_ps(0);
 #else
    freqdata[ncfft].i = freqdata[0].i = 0;
 #endif
    for ( k=1;k <= ncfft/2 ; ++k ) {
        fpk    = st->tmpbuf[k]; 
        fpnk.r =   st->tmpbuf[ncfft-k].r;
        fpnk.i = - st->tmpbuf[ncfft-k].i;
        C_FIXDIV(fpk,2);
        C_FIXDIV(fpnk,2);
        C_ADD( f1k, fpk , fpnk );
        C_SUB( f2k, fpk , fpnk );
        C_MUL( tw , f2k , st->super_twiddles[k-1]);
        freqdata[k].r = HALF_OF(f1k.r + tw.r);
        freqdata[k].i = HALF_OF(f1k.i + tw.i);
        freqdata[ncfft-k].r = HALF_OF(f1k.r - tw.r);
        freqdata[ncfft-k].i = HALF_OF(tw.i - f1k.i);
    }
 }
 void kiss_fftri(kiss_fftr_cfg st,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata)
 {
    /* output buffer timedata is stored row-wise */
    int k, ncfft;
    if (st->substate->inverse == 0) {
        fprintf (stderr, "kiss fft usage error: improper alloc\n");
        exit (1);
    }
    ncfft = st->substate->nfft;
    st->tmpbuf[0].r = freqdata[0].r + freqdata[ncfft].r;
    st->tmpbuf[0].i = freqdata[0].r - freqdata[ncfft].r;
    C_FIXDIV(st->tmpbuf[0],2);
    for (k = 1; k <= ncfft / 2; ++k) {
        kiss_fft_cpx fk, fnkc, fek, fok, tmp;
        fk = freqdata[k];
        fnkc.r = freqdata[ncfft - k].r;
        fnkc.i = -freqdata[ncfft - k].i;
        C_FIXDIV( fk , 2 );
        C_FIXDIV( fnkc , 2 );
        C_ADD (fek, fk, fnkc);
        C_SUB (tmp, fk, fnkc);
        C_MUL (fok, tmp, st->super_twiddles[k-1]);
        C_ADD (st->tmpbuf[k],     fek, fok);
        C_SUB (st->tmpbuf[ncfft - k], fek, fok);
 #ifdef USE_SIMD        
        st->tmpbuf[ncfft - k].i *= _mm_set1_ps(-1.0);
 #else
        st->tmpbuf[ncfft - k].i *= -1;
 #endif
    }
    kiss_fft (st->substate, st->tmpbuf, (kiss_fft_cpx *) timedata);
 }
--- a/media/kiss_fft/kiss_fftr.h
+++ b/media/kiss_fft/kiss_fftr.h
@ -1,46 +0,0 @@
 #ifndef KISS_FTR_H
 #define KISS_FTR_H
 #include "kiss_fft.h"
 #ifdef __cplusplus
 extern "C" {
 #endif
 /* 
 Real optimized version can save about 45% cpu time vs. complex fft of a real seq.
 */
 typedef struct kiss_fftr_state *kiss_fftr_cfg;
 kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem, size_t * lenmem);
 /*
 nfft must be even
 If you don't care to allocate space, use mem = lenmem = NULL 
 */
 void kiss_fftr(kiss_fftr_cfg cfg,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata);
 /*
 input timedata has nfft scalar points
 output freqdata has nfft/2+1 complex points
 */
 void kiss_fftri(kiss_fftr_cfg cfg,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata);
 /*
 input freqdata has  nfft/2+1 complex points
 output timedata has nfft scalar points
 */
 #define kiss_fftr_free free
 #ifdef __cplusplus
 }
 #endif
 #endif
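
Reviewer note: the removed kiss_fftr.h comments fix the buffer shapes for the real transform (nfft scalars in the time domain, nfft/2+1 complex bins in the frequency domain, nfft even). The sketch below shows one way a caller might size those buffers. All sketch_* names are hypothetical, and the complex type assumes the default float scalar.

```c
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the layout of kiss_fft_cpx when kiss_fft_scalar is float. */
typedef struct {
    float r;
    float i;
} sketch_cpx;

typedef struct {
    int nfft;             /* number of real time-domain samples */
    size_t nbins;         /* nfft/2 + 1 complex frequency bins */
    float *timedata;
    sketch_cpx *freqdata;
} sketch_fftr_buffers;

/* Number of complex bins for a real transform of length nfft,
 * or 0 if nfft is not a positive even number. */
size_t sketch_fftr_bins(int nfft)
{
    if (nfft <= 0 || (nfft & 1))
        return 0;
    return (size_t)nfft / 2 + 1;
}

/* Allocates zeroed buffers of the documented sizes.
 * Returns 0 on success and -1 on failure. */
int sketch_fftr_buffers_init(sketch_fftr_buffers *b, int nfft)
{
    size_t nbins = sketch_fftr_bins(nfft);

    if (nbins == 0)
        return -1;
    b->timedata = calloc((size_t)nfft, sizeof *b->timedata);
    b->freqdata = calloc(nbins, sizeof *b->freqdata);
    if (b->timedata == NULL || b->freqdata == NULL) {
        free(b->timedata);
        free(b->freqdata);
        return -1;
    }
    b->nfft = nfft;
    b->nbins = nbins;
    return 0;
}

void sketch_fftr_buffers_free(sketch_fftr_buffers *b)
{
    free(b->timedata);
    free(b->freqdata);
    b->timedata = NULL;
    b->freqdata = NULL;
}
```

Only nfft/2+1 bins are stored because the spectrum of a real signal is conjugate-symmetric, so the upper half carries no extra information.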
--- a/media/kiss_fft/moz.build
+++ b/media/kiss_fft/moz.build
@ -1,20 +0,0 @@
 # -*- Mode: python; indent-tabs-mode: nil; tab-width: 40 -*-
 # vim: set filetype=python:
 # This Source Code Form is subject to the terms of the Mozilla Public
 # License, v. 2.0. If a copy of the MPL was not distributed with this
 # file, You can obtain one at http://mozilla.org/MPL/2.0/.
 with Files("**"):
    BUG_COMPONENT = ("Core", "Web Audio")
 EXPORTS.kiss_fft += [
    'kiss_fft.h',
    'kiss_fftr.h',
 ]
 SOURCES += [
    'kiss_fft.c',
    'kiss_fftr.c',
 ]
 FINAL_LIBRARY = 'xul'
--- a/media/kiss_fft/moz.yaml
+++ b/media/kiss_fft/moz.yaml
@ -1,49 +0,0 @@
 schema: 1
 bugzilla:
  product: Core
  component: "Web Audio"
 origin:
  name: kiss_fft
  description: A mixed-radix Fast Fourier Transform
  url: https://github.com/mborgerding/kissfft
  release: 1c3d6f5aa9eb2bf2f18641f0a7e3e6f5e523a156 (2017-10-25T13:50:40Z).
  revision: 1c3d6f5aa9eb2bf2f18641f0a7e3e6f5e523a156
  license: BSD-3-Clause
  license-file: COPYING
 vendoring:
  url: https://github.com/mborgerding/kissfft
  source-hosting: github
  tracking: commit
  exclude:
    - ".*"
    - test
    - tools/fftutil.c
    - tools/psdpng.c
    - "tools/kiss_fftnd*"
    - tools/kiss_fastfir.c
    - "tools/kfc.*"
    - "tools/.*"
    - TIPS
    - kissfft.hh
    - tools/Makefile
    - Makefile
  keep:
    - COPYING
    - _kiss_fft_guts.h
    - kiss_fft.c
    - kiss_fft.h
    - tools/kiss_fftr.c
    - tools/kiss_fftr.h
  update-actions:
    - action: move-dir
      from: '{vendor_dir}/tools'
      to: '{vendor_dir}'
--- a/media/openmax_dl/LICENSE
+++ b/media/openmax_dl/LICENSE
@ -1,39 +0,0 @@
 Use of this source code is governed by a BSD-style license that can be
 found in the LICENSE file in the root of the source tree. All
 contributing project authors may be found in the AUTHORS file in the
 root of the source tree.
 The files were originally licensed by ARM Limited.
 The following files:
    * dl/api/omxtypes.h
    * dl/sp/api/omxSP.h
 are licensed by Khronos:
 Copyright (c) 2005-2008,2015 The Khronos Group Inc.
 Permission is hereby granted, free of charge, to any person obtaining a
 copy of this software and/or associated documentation files (the
 "Materials"), to deal in the Materials without restriction, including
 without limitation the rights to use, copy, modify, merge, publish,
 distribute, sublicense, and/or sell copies of the Materials, and to
 permit persons to whom the Materials are furnished to do so, subject to
 the following conditions:
 The above copyright notice and this permission notice shall be included
 in all copies or substantial portions of the Materials.
 MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
 KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
 SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
   https://www.khronos.org/registry/
 THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
--- a/media/openmax_dl/OWNERS
+++ b/media/openmax_dl/OWNERS
@ -1,3 +0,0 @@
 ajm@google.com
 kma@google.com
 rtoy@google.com
--- a/media/openmax_dl/README.chromium
+++ b/media/openmax_dl/README.chromium
@ -1,19 +0,0 @@
 Name: OpenMAX DL
 Short Name: OpenMax DL
 URL: https://silver.arm.com/download/Software/Graphics/OX000-BU-00010-r1p0-00bet0/OX000-BU-00010-r1p0-00bet0.tgz
 Version: 1.0.2
 License: BSD
 License File: LICENSE
 Security Critical: yes
 Description:
 Implementation of OpenMAX DL spec from ARM.  This is used to support
 WebAudio for Chromium on Android.
 Local Modifications:
 Only the FFT routines from the OpenMAX DL package are included.  The
 code was modified to work with gcc and a new implementation for a
 floating-point FFT was added.
 The original ARM license is unclear, but Google has obtained
 permission to relicense this code under a BSD license.
--- a/media/openmax_dl/README.mozilla
+++ b/media/openmax_dl/README.mozilla
@ -1,9 +0,0 @@
 Bug 1158741 added an omxSP_FFTInv_CCSToR_F32_Sfs_unscaled function as an
 optimization which performs the same operation as
 omxSP_FFTInv_CCSToR_F32_Sfs except it doesn't scale the results by the
 length of the FFT. For consistency with other FFT routines used, it does
 multiply the results by two.
 The affected files are:
 media/openmax_dl/dl/sp/api/omxSP.h
 media/openmax_dl/dl/sp/src/omxSP_FFTInv_CCSToR_F32_Sfs_unscaled_s.S
--- a/media/openmax_dl/dl/api/armCOMM_s.h
+++ b/media/openmax_dl/dl/api/armCOMM_s.h
@ -1,417 +0,0 @@
@// -*- Mode: asm; -*-
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This file was originally licensed as follows. It has been
@//  relicensed with permission from the copyright holders.
@//
@// 
@// File Name:  armCOMM_s.h
@// OpenMAX DL: v1.0.2
@// Last Modified Revision:   13871
@// Last Modified Date:       Fri, 09 May 2008
@// 
@// (c) Copyright 2007-2008 ARM Limited. All Rights Reserved.
@// 
@// 
@//
@// ARM optimized OpenMAX common header file
@//
 	.set	_SBytes, 0	@ Number of scratch bytes on stack
 	.set	_Workspace, 0	@ Stack offset of scratch workspace
 	.set	_RRegList, 0	@ R saved register list (last register number)
 	.set	_DRegList, 0	@ D saved register list (last register number)
        @// Work out a list of R saved registers, and how much stack space is needed.
 	@// gas doesn't support setting a variable to a string, so we set _RRegList to 
 	@// the register number.
 	.macro	_M_GETRREGLIST	rreg
 	.ifeqs "\rreg", ""
 	@ Nothing needs to be saved
 	.exitm
 	.endif
 	@ If rreg is lr or r4, save lr and r4
 	.ifeqs "\rreg", "lr"
 	.set	_RRegList, 4
 	.exitm
 	.endif
 	.ifeqs "\rreg", "r4"
 	.set	_RRegList, 4
 	.exitm
 	.endif
 	@ If rreg = r5 or r6, save up to register r6
 	.ifeqs "\rreg", "r5"
 	.set	_RRegList, 6
 	.exitm
 	.endif
 	.ifeqs "\rreg", "r6"
 	.set	_RRegList, 6
 	.exitm
 	.endif
 	@ If rreg = r7 or r8, save up to register r8
 	.ifeqs "\rreg", "r7"
 	.set	_RRegList, 8
 	.exitm
 	.endif
 	.ifeqs "\rreg", "r8"
 	.set	_RRegList, 8
 	.exitm
 	.endif
 	@ If rreg = r9 or r10, save up to register r10
 	.ifeqs "\rreg", "r9"
 	.set	_RRegList, 10
 	.exitm
 	.endif
 	.ifeqs "\rreg", "r10"
 	.set	_RRegList, 10
 	.exitm
 	.endif
 	@ If rreg = r11 or r12, save up to register r12
 	.ifeqs "\rreg", "r11"
 	.set	_RRegList, 12
 	.exitm
 	.endif
 	.ifeqs "\rreg", "r12"
 	.set	_RRegList, 12
 	.exitm
 	.endif
 	.warning "Unrecognized saved r register limit: \rreg"
 	.endm
 	@ Work out list of D saved registers, like for R registers.
 	.macro	_M_GETDREGLIST dreg
 	.ifeqs "\dreg", ""
 	.set	_DRegList, 0
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d8"
 	.set	_DRegList, 8
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d9"
 	.set	_DRegList, 9
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d10"
 	.set	_DRegList, 10
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d11"
 	.set	_DRegList, 11
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d12"
 	.set	_DRegList, 12
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d13"
 	.set	_DRegList, 13
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d14"
 	.set	_DRegList, 14
 	.exitm
 	.endif
 	.ifeqs "\dreg", "d15"
 	.set	_DRegList, 15
 	.exitm
 	.endif
 	.warning "Unrecognized saved d register limit: \dreg"
 	.endm
@//////////////////////////////////////////////////////////
@// Function header and footer macros
@//////////////////////////////////////////////////////////      
        @ Function Header Macro    
        @ Generates the function prologue
        @ Note that functions should all be "stack-moves-once"
        @ The FNSTART and FNEND macros should be the only places
        @ where the stack moves.
        @    
        @  name  = function name
        @  rreg  = ""   don't stack any registers
        @          "lr" stack "lr" only
        @          "rN" stack registers "r4-rN,lr"
        @  dreg  = ""   don't stack any D registers
        @          "dN" stack registers "d8-dN"
        @
        @ Note: ARM Architecture procedure call standard AAPCS
        @ states that r4-r11, sp, d8-d15 must be preserved by
        @ a compliant function.
 	.macro	M_START name, rreg, dreg
 	.set	_Workspace, 0
 	@ Define the function and make it external.
 	.global	\name
 #ifndef __clang__
 	.func	\name
 #endif
 	.section	.text.\name,"ax",%progbits
 	.arch armv7-a
 	.fpu neon
 	.syntax unified
 	.object_arch armv4
 	.align	2
 \name :		
 .fnstart
 	@ Save specified R registers
 	_M_GETRREGLIST	\rreg
 	_M_PUSH_RREG
 	@ Save specified D registers
        _M_GETDREGLIST  \dreg
 	_M_PUSH_DREG
 	@ Ensure size claimed on stack is 8-byte aligned
 	.if (_SBytes & 7) != 0
 	.set	_SBytes, _SBytes + (8 - (_SBytes & 7))
 	.endif
 	.if _SBytes != 0
 		sub	sp, sp, #_SBytes
 	.endif	
 	.endm
        @ Function Footer Macro        
        @ Generates the function epilogue
 	.macro M_END
 	@ Restore the stack pointer to its original value on function entry
 	.if _SBytes != 0
 		add	sp, sp, #_SBytes
 	.endif
 	@ Restore any saved R or D registers.
 	_M_RET
 	.fnend	
 #ifndef __clang__
 	.endfunc
 #endif
        @ Reset the global stack tracking variables back to their
 	@ initial values.
 	.set _SBytes, 0
 	.endm
 	@// Based on the value of _DRegList, push the specified set of registers 
 	@// to the stack.  Is there a better way?
 	.macro _M_PUSH_DREG
 	.if _DRegList == 8
 		vpush	{d8}
 	.exitm
 	.endif
 	.if _DRegList == 9
 		vpush	{d8-d9}
 	.exitm
 	.endif
 	.if _DRegList == 10
 		vpush	{d8-d10}
 	.exitm
 	.endif
 	.if _DRegList == 11
 		vpush	{d8-d11}
 	.exitm
 	.endif
 	.if _DRegList == 12
 		vpush	{d8-d12}
 	.exitm
 	.endif
 	.if _DRegList == 13
 		vpush	{d8-d13}
 	.exitm
 	.endif
 	.if _DRegList == 14
 		vpush	{d8-d14}
 	.exitm
 	.endif
 	.if _DRegList == 15
 		vpush	{d8-d15}
 	.exitm
 	.endif
 	.endm
 	@// Based on the value of _RRegList, push the specified set of registers 
 	@// to the stack.  Is there a better way?
 	.macro _M_PUSH_RREG
 	.if _RRegList == 4
 		stmfd	sp!, {r4, lr}
 	.exitm
 	.endif
 	.if _RRegList == 6
 		stmfd	sp!, {r4-r6, lr}
 	.exitm
 	.endif
 	.if _RRegList == 8
 		stmfd	sp!, {r4-r8, lr}
 	.exitm
 	.endif
 	.if _RRegList == 10
 		stmfd	sp!, {r4-r10, lr}
 	.exitm
 	.endif
 	.if _RRegList == 12
 		stmfd	sp!, {r4-r12, lr}
 	.exitm
 	.endif
 	.endm
 	@// The opposite of _M_PUSH_DREG
 	.macro  _M_POP_DREG
 	.if _DRegList == 8
 		vpop	{d8}
 	.exitm
 	.endif
 	.if _DRegList == 9
 		vpop	{d8-d9}
 	.exitm
 	.endif
 	.if _DRegList == 10
 		vpop	{d8-d10}
 	.exitm
 	.endif
 	.if _DRegList == 11
 		vpop	{d8-d11}
 	.exitm
 	.endif
 	.if _DRegList == 12
 		vpop	{d8-d12}
 	.exitm
 	.endif
 	.if _DRegList == 13
 		vpop	{d8-d13}
 	.exitm
 	.endif
 	.if _DRegList == 14
 		vpop	{d8-d14}
 	.exitm
 	.endif
 	.if _DRegList == 15
 		vpop	{d8-d15}
 	.exitm
 	.endif
 	.endm
 	@// The opposite of _M_PUSH_RREG
 	.macro _M_POP_RREG cc
 	.if _RRegList == 0
 		bx\cc lr
 	.exitm
 	.endif
 	.if _RRegList == 4
 		ldm\cc\()fd	sp!, {r4, pc}
 	.exitm
 	.endif
 	.if _RRegList == 6
 		ldm\cc\()fd	sp!, {r4-r6, pc}
 	.exitm
 	.endif
 	.if _RRegList == 8
 		ldm\cc\()fd	sp!, {r4-r8, pc}
 	.exitm
 	.endif
 	.if _RRegList == 10
 		ldm\cc\()fd	sp!, {r4-r10, pc}
 	.exitm
 	.endif
 	.if _RRegList == 12
 		ldm\cc\()fd	sp!, {r4-r12, pc}
 	.exitm
 	.endif
 	.endm
        @ Produce function return instructions
 	.macro	_M_RET cc
 	_M_POP_DREG \cc
 	_M_POP_RREG \cc
 	.endm	
        @// Allocate 4-byte aligned area of name
        @// |name| and size |size| bytes.
 	.macro	M_ALLOC4 name, size
 	.if	(_SBytes & 3) != 0
 	.set	_SBytes, _SBytes + (4 - (_SBytes & 3))
 	.endif
 	.set	\name\()_F, _SBytes
 	.set	_SBytes, _SBytes + \size
 	.endm
        @ Load word from stack
 	.macro M_LDR r, a0, a1, a2, a3
 	_M_DATA "ldr", 4, \r, \a0, \a1, \a2, \a3
 	.endm
        @ Store word to stack
 	.macro M_STR r, a0, a1, a2, a3
 	_M_DATA "str", 4, \r, \a0, \a1, \a2, \a3
 	.endm
        @ Macro to perform a data access operation
        @ Such as LDR or STR
        @ The addressing mode is modified such that
        @ 1. If no address is given then the name is taken
        @    as a stack offset
        @ 2. If the addressing mode is not available for the
        @    state being assembled for (eg Thumb) then a suitable
        @    addressing mode is substituted.
        @
        @ On Entry:
        @ $i = Instruction to perform (eg "LDRB")
        @ $a = Required byte alignment
        @ $r = Register(s) to transfer (eg "r1")
        @ $a0,$a1,$a2. Addressing mode and condition. One of:
        @     label {,cc}
        @     [base]                    {,,,cc}
        @     [base, offset]{!}         {,,cc}
        @     [base, offset, shift]{!}  {,cc}
        @     [base], offset            {,,cc}
        @     [base], offset, shift     {,cc}
 	@
 	@ WARNING: Most of the above are not supported, except the first case.
 	.macro _M_DATA i, a, r, a0, a1, a2, a3
 	.set	_Offset, _Workspace + \a0\()_F
 	\i\a1	\r, [sp, #_Offset]	
 	.endm
--- a/media/openmax_dl/dl/api/armOMX.h
+++ b/media/openmax_dl/dl/api/armOMX.h
@ -1,289 +0,0 @@
 /*
 *  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 *
 *  This file was originally licensed as follows. It has been
 *  relicensed with permission from the copyright holders.
 */
 /* 
 * 
 * File Name:  armOMX_ReleaseVersion.h
 * OpenMAX DL: v1.0.2
 * Last Modified Revision:   15322
 * Last Modified Date:       Wed, 15 Oct 2008
 * 
 * (c) Copyright 2007-2008 ARM Limited. All Rights Reserved.
 * 
 * 
 *
 * This file allows a version of the OMX DL libraries to be built where some or
 * all of the function names can be given a user specified suffix. 
 *
 * You might want to use it where:
 *
 * - you want to rename a function "out of the way" so that you could replace
 *   a function with a different version (the original version would still be
 *   in the library just with a different name - so you could debug the new
 *   version by comparing it to the output of the old)
 *
 * - you want to rename all the functions to versions with a suffix so that 
 *   you can include two versions of the library and choose between functions
 *   at runtime.
 *
 *     e.g. omxIPBM_Copy_U8_C1R could be renamed omxIPBM_Copy_U8_C1R_CortexA8
 * 
 */
 #ifndef _armOMX_H_
 #define _armOMX_H_
 #define ARMOMX_ENABLE_RENAMING 0
 #if ARMOMX_ENABLE_RENAMING
 /* We need to define these two macros in order to expand and concatenate the names */
 #define OMXCAT2BAR(A, B) omx ## A ## B
 #define OMXCATBAR(A, B) OMXCAT2BAR(A, B)
 /* Define the suffix to add to all functions - the default is no suffix */
 #define BARE_SUFFIX 
 /* Define what happens to the bare suffix-less functions, down to the sub-domain accuracy */
 #define OMXACAAC_SUFFIX    BARE_SUFFIX   
 #define OMXACMP3_SUFFIX    BARE_SUFFIX
 #define OMXICJP_SUFFIX     BARE_SUFFIX
 #define OMXIPBM_SUFFIX     BARE_SUFFIX
 #define OMXIPCS_SUFFIX     BARE_SUFFIX
 #define OMXIPPP_SUFFIX     BARE_SUFFIX
 #define OMXSP_SUFFIX       BARE_SUFFIX
 #define OMXVCCOMM_SUFFIX   BARE_SUFFIX
 #define OMXVCM4P10_SUFFIX  BARE_SUFFIX
 #define OMXVCM4P2_SUFFIX   BARE_SUFFIX
 /* Define what each bare, un-suffixed OpenMAX API function name is to be renamed to */
 #define omxACAAC_DecodeChanPairElt                        OMXCATBAR(ACAAC_DecodeChanPairElt, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeDatStrElt                          OMXCATBAR(ACAAC_DecodeDatStrElt, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeFillElt                            OMXCATBAR(ACAAC_DecodeFillElt, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeIsStereo_S32                       OMXCATBAR(ACAAC_DecodeIsStereo_S32, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeMsPNS_S32_I                        OMXCATBAR(ACAAC_DecodeMsPNS_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeMsStereo_S32_I                     OMXCATBAR(ACAAC_DecodeMsStereo_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodePrgCfgElt                          OMXCATBAR(ACAAC_DecodePrgCfgElt, OMXACAAC_SUFFIX)
 #define omxACAAC_DecodeTNS_S32_I                          OMXCATBAR(ACAAC_DecodeTNS_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_DeinterleaveSpectrum_S32                 OMXCATBAR(ACAAC_DeinterleaveSpectrum_S32, OMXACAAC_SUFFIX)
 #define omxACAAC_EncodeTNS_S32_I                          OMXCATBAR(ACAAC_EncodeTNS_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_LongTermPredict_S32                      OMXCATBAR(ACAAC_LongTermPredict_S32, OMXACAAC_SUFFIX)
 #define omxACAAC_LongTermReconstruct_S32_I                OMXCATBAR(ACAAC_LongTermReconstruct_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_MDCTFwd_S32                              OMXCATBAR(ACAAC_MDCTFwd_S32, OMXACAAC_SUFFIX)
 #define omxACAAC_MDCTInv_S32_S16                          OMXCATBAR(ACAAC_MDCTInv_S32_S16, OMXACAAC_SUFFIX)
 #define omxACAAC_NoiselessDecode                          OMXCATBAR(ACAAC_NoiselessDecode, OMXACAAC_SUFFIX)
 #define omxACAAC_QuantInv_S32_I                           OMXCATBAR(ACAAC_QuantInv_S32_I, OMXACAAC_SUFFIX)
 #define omxACAAC_UnpackADIFHeader                         OMXCATBAR(ACAAC_UnpackADIFHeader, OMXACAAC_SUFFIX)
 #define omxACAAC_UnpackADTSFrameHeader                    OMXCATBAR(ACAAC_UnpackADTSFrameHeader, OMXACAAC_SUFFIX)
 #define omxACMP3_HuffmanDecode_S32                        OMXCATBAR(ACMP3_HuffmanDecode_S32, OMXACMP3_SUFFIX)
 #define omxACMP3_HuffmanDecodeSfb_S32                     OMXCATBAR(ACMP3_HuffmanDecodeSfb_S32, OMXACMP3_SUFFIX)
 #define omxACMP3_HuffmanDecodeSfbMbp_S32                  OMXCATBAR(ACMP3_HuffmanDecodeSfbMbp_S32, OMXACMP3_SUFFIX)
 #define omxACMP3_MDCTInv_S32                              OMXCATBAR(ACMP3_MDCTInv_S32, OMXACMP3_SUFFIX)
 #define omxACMP3_ReQuantize_S32_I                         OMXCATBAR(ACMP3_ReQuantize_S32_I, OMXACMP3_SUFFIX)
 #define omxACMP3_ReQuantizeSfb_S32_I                      OMXCATBAR(ACMP3_ReQuantizeSfb_S32_I, OMXACMP3_SUFFIX)
 #define omxACMP3_SynthPQMF_S32_S16                        OMXCATBAR(ACMP3_SynthPQMF_S32_S16, OMXACMP3_SUFFIX)
 #define omxACMP3_UnpackFrameHeader                        OMXCATBAR(ACMP3_UnpackFrameHeader, OMXACMP3_SUFFIX)
 #define omxACMP3_UnpackScaleFactors_S8                    OMXCATBAR(ACMP3_UnpackScaleFactors_S8, OMXACMP3_SUFFIX)
 #define omxACMP3_UnpackSideInfo                           OMXCATBAR(ACMP3_UnpackSideInfo, OMXACMP3_SUFFIX)
 #define omxICJP_CopyExpand_U8_C3                          OMXCATBAR(ICJP_CopyExpand_U8_C3, OMXICJP_SUFFIX)
 #define omxICJP_DCTFwd_S16                                OMXCATBAR(ICJP_DCTFwd_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTFwd_S16_I                              OMXCATBAR(ICJP_DCTFwd_S16_I, OMXICJP_SUFFIX)
 #define omxICJP_DCTInv_S16                                OMXCATBAR(ICJP_DCTInv_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTInv_S16_I                              OMXCATBAR(ICJP_DCTInv_S16_I, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantFwd_Multiple_S16                  OMXCATBAR(ICJP_DCTQuantFwd_Multiple_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantFwd_S16                           OMXCATBAR(ICJP_DCTQuantFwd_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantFwd_S16_I                         OMXCATBAR(ICJP_DCTQuantFwd_S16_I, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantFwdTableInit                      OMXCATBAR(ICJP_DCTQuantFwdTableInit, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantInv_Multiple_S16                  OMXCATBAR(ICJP_DCTQuantInv_Multiple_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantInv_S16                           OMXCATBAR(ICJP_DCTQuantInv_S16, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantInv_S16_I                         OMXCATBAR(ICJP_DCTQuantInv_S16_I, OMXICJP_SUFFIX)
 #define omxICJP_DCTQuantInvTableInit                      OMXCATBAR(ICJP_DCTQuantInvTableInit, OMXICJP_SUFFIX)
 #define omxICJP_DecodeHuffman8x8_Direct_S16_C1            OMXCATBAR(ICJP_DecodeHuffman8x8_Direct_S16_C1, OMXICJP_SUFFIX)
 #define omxICJP_DecodeHuffmanSpecGetBufSize_U8            OMXCATBAR(ICJP_DecodeHuffmanSpecGetBufSize_U8, OMXICJP_SUFFIX)
 #define omxICJP_DecodeHuffmanSpecInit_U8                  OMXCATBAR(ICJP_DecodeHuffmanSpecInit_U8, OMXICJP_SUFFIX)
 #define omxICJP_EncodeHuffman8x8_Direct_S16_U1_C1         OMXCATBAR(ICJP_EncodeHuffman8x8_Direct_S16_U1_C1, OMXICJP_SUFFIX)
 #define omxICJP_EncodeHuffmanSpecGetBufSize_U8            OMXCATBAR(ICJP_EncodeHuffmanSpecGetBufSize_U8, OMXICJP_SUFFIX)
 #define omxICJP_EncodeHuffmanSpecInit_U8                  OMXCATBAR(ICJP_EncodeHuffmanSpecInit_U8, OMXICJP_SUFFIX)
 #define omxIPBM_AddC_U8_C1R_Sfs                           OMXCATBAR(IPBM_AddC_U8_C1R_Sfs, OMXIPBM_SUFFIX)
 #define omxIPBM_Copy_U8_C1R                               OMXCATBAR(IPBM_Copy_U8_C1R, OMXIPBM_SUFFIX)
 #define omxIPBM_Copy_U8_C3R                               OMXCATBAR(IPBM_Copy_U8_C3R, OMXIPBM_SUFFIX)
 #define omxIPBM_Mirror_U8_C1R                             OMXCATBAR(IPBM_Mirror_U8_C1R, OMXIPBM_SUFFIX)
 #define omxIPBM_MulC_U8_C1R_Sfs                           OMXCATBAR(IPBM_MulC_U8_C1R_Sfs, OMXIPBM_SUFFIX)
 #define omxIPCS_ColorTwistQ14_U8_C3R                      OMXCATBAR(IPCS_ColorTwistQ14_U8_C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR565ToYCbCr420LS_MCU_U16_S16_C3P3R      OMXCATBAR(IPCS_BGR565ToYCbCr420LS_MCU_U16_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR565ToYCbCr422LS_MCU_U16_S16_C3P3R      OMXCATBAR(IPCS_BGR565ToYCbCr422LS_MCU_U16_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR565ToYCbCr444LS_MCU_U16_S16_C3P3R      OMXCATBAR(IPCS_BGR565ToYCbCr444LS_MCU_U16_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR888ToYCbCr420LS_MCU_U8_S16_C3P3R       OMXCATBAR(IPCS_BGR888ToYCbCr420LS_MCU_U8_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR888ToYCbCr422LS_MCU_U8_S16_C3P3R       OMXCATBAR(IPCS_BGR888ToYCbCr422LS_MCU_U8_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_BGR888ToYCbCr444LS_MCU_U8_S16_C3P3R       OMXCATBAR(IPCS_BGR888ToYCbCr444LS_MCU_U8_S16_C3P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr420RszCscRotBGR_U8_P3C3R             OMXCATBAR(IPCS_YCbCr420RszCscRotBGR_U8_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr420RszRot_U8_P3R                     OMXCATBAR(IPCS_YCbCr420RszRot_U8_P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr420ToBGR565_U8_U16_P3C3R             OMXCATBAR(IPCS_YCbCr420ToBGR565_U8_U16_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr420ToBGR565LS_MCU_S16_U16_P3C3R      OMXCATBAR(IPCS_YCbCr420ToBGR565LS_MCU_S16_U16_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr420ToBGR888LS_MCU_S16_U8_P3C3R       OMXCATBAR(IPCS_YCbCr420ToBGR888LS_MCU_S16_U8_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr422RszCscRotBGR_U8_P3C3R             OMXCATBAR(IPCS_YCbCr422RszCscRotBGR_U8_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_CbYCrY422RszCscRotBGR_U8_U16_C2R          OMXCATBAR(IPCS_CbYCrY422RszCscRotBGR_U8_U16_C2R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr422RszRot_U8_P3R                     OMXCATBAR(IPCS_YCbCr422RszRot_U8_P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbYCr422ToBGR565_U8_U16_C2C3R            OMXCATBAR(IPCS_YCbYCr422ToBGR565_U8_U16_C2C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr422ToBGR565LS_MCU_S16_U16_P3C3R      OMXCATBAR(IPCS_YCbCr422ToBGR565LS_MCU_S16_U16_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbYCr422ToBGR888_U8_C2C3R                OMXCATBAR(IPCS_YCbYCr422ToBGR888_U8_C2C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr422ToBGR888LS_MCU_S16_U8_P3C3R       OMXCATBAR(IPCS_YCbCr422ToBGR888LS_MCU_S16_U8_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_CbYCrY422ToYCbCr420Rotate_U8_C2P3R        OMXCATBAR(IPCS_CbYCrY422ToYCbCr420Rotate_U8_C2P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr422ToYCbCr420Rotate_U8_P3R           OMXCATBAR(IPCS_YCbCr422ToYCbCr420Rotate_U8_P3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr444ToBGR565_U8_U16_C3R               OMXCATBAR(IPCS_YCbCr444ToBGR565_U8_U16_C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr444ToBGR565_U8_U16_P3C3R             OMXCATBAR(IPCS_YCbCr444ToBGR565_U8_U16_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr444ToBGR565LS_MCU_S16_U16_P3C3R      OMXCATBAR(IPCS_YCbCr444ToBGR565LS_MCU_S16_U16_P3C3R, OMXIPCS_SUFFIX)
 #define omxIPCS_YCbCr444ToBGR888_U8_C3R                   OMXCATBAR(IPCS_YCbCr444ToBGR888_U8_C3R, OMXIPCS_SUFFIX)
 #define omxIPPP_Deblock_HorEdge_U8_I                      OMXCATBAR(IPPP_Deblock_HorEdge_U8_I, OMXIPPP_SUFFIX)
 #define omxIPPP_Deblock_VerEdge_U8_I                      OMXCATBAR(IPPP_Deblock_VerEdge_U8_I, OMXIPPP_SUFFIX)
 #define omxIPPP_FilterFIR_U8_C1R                          OMXCATBAR(IPPP_FilterFIR_U8_C1R, OMXIPPP_SUFFIX)
 #define omxIPPP_FilterMedian_U8_C1R                       OMXCATBAR(IPPP_FilterMedian_U8_C1R, OMXIPPP_SUFFIX)
 #define omxIPPP_GetCentralMoment_S64                      OMXCATBAR(IPPP_GetCentralMoment_S64, OMXIPPP_SUFFIX)
 #define omxIPPP_GetSpatialMoment_S64                      OMXCATBAR(IPPP_GetSpatialMoment_S64, OMXIPPP_SUFFIX)
 #define omxIPPP_MomentGetStateSize                        OMXCATBAR(IPPP_MomentGetStateSize, OMXIPPP_SUFFIX)
 #define omxIPPP_MomentInit                                OMXCATBAR(IPPP_MomentInit, OMXIPPP_SUFFIX)
 #define omxIPPP_Moments_U8_C1R                            OMXCATBAR(IPPP_Moments_U8_C1R, OMXIPPP_SUFFIX)
 #define omxIPPP_Moments_U8_C3R                            OMXCATBAR(IPPP_Moments_U8_C3R, OMXIPPP_SUFFIX)
 #define omxSP_BlockExp_S16                                OMXCATBAR(SP_BlockExp_S16, OMXSP_SUFFIX)
 #define omxSP_BlockExp_S32                                OMXCATBAR(SP_BlockExp_S32, OMXSP_SUFFIX)
 #define omxSP_Copy_S16                                    OMXCATBAR(SP_Copy_S16, OMXSP_SUFFIX)
 #define omxSP_DotProd_S16                                 OMXCATBAR(SP_DotProd_S16, OMXSP_SUFFIX)
 #define omxSP_DotProd_S16_Sfs                             OMXCATBAR(SP_DotProd_S16_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTFwd_CToC_SC16_Sfs                        OMXCATBAR(SP_FFTFwd_CToC_SC16_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTFwd_CToC_SC32_Sfs                        OMXCATBAR(SP_FFTFwd_CToC_SC32_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTFwd_RToCCS_S16S32_Sfs                    OMXCATBAR(SP_FFTFwd_RToCCS_S16S32_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTFwd_RToCCS_S32_Sfs                       OMXCATBAR(SP_FFTFwd_RToCCS_S32_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTGetBufSize_C_SC16                        OMXCATBAR(SP_FFTGetBufSize_C_SC16, OMXSP_SUFFIX)
 #define omxSP_FFTGetBufSize_C_SC32                        OMXCATBAR(SP_FFTGetBufSize_C_SC32, OMXSP_SUFFIX)
 #define omxSP_FFTGetBufSize_R_S16S32                      OMXCATBAR(SP_FFTGetBufSize_R_S16S32, OMXSP_SUFFIX)
 #define omxSP_FFTGetBufSize_R_S32                         OMXCATBAR(SP_FFTGetBufSize_R_S32, OMXSP_SUFFIX)
 #define omxSP_FFTInit_C_SC16                              OMXCATBAR(SP_FFTInit_C_SC16, OMXSP_SUFFIX)
 #define omxSP_FFTInit_C_SC32                              OMXCATBAR(SP_FFTInit_C_SC32, OMXSP_SUFFIX)
 #define omxSP_FFTInit_R_S16S32                            OMXCATBAR(SP_FFTInit_R_S16S32, OMXSP_SUFFIX)
 #define omxSP_FFTInit_R_S32                               OMXCATBAR(SP_FFTInit_R_S32, OMXSP_SUFFIX)
 #define omxSP_FFTInv_CCSToR_S32_Sfs                       OMXCATBAR(SP_FFTInv_CCSToR_S32_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTInv_CCSToR_S32S16_Sfs                    OMXCATBAR(SP_FFTInv_CCSToR_S32S16_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTInv_CToC_SC16_Sfs                        OMXCATBAR(SP_FFTInv_CToC_SC16_Sfs, OMXSP_SUFFIX)
 #define omxSP_FFTInv_CToC_SC32_Sfs                        OMXCATBAR(SP_FFTInv_CToC_SC32_Sfs, OMXSP_SUFFIX)
 #define omxSP_FilterMedian_S32                            OMXCATBAR(SP_FilterMedian_S32, OMXSP_SUFFIX)
 #define omxSP_FilterMedian_S32_I                          OMXCATBAR(SP_FilterMedian_S32_I, OMXSP_SUFFIX)
 #define omxSP_FIR_Direct_S16                              OMXCATBAR(SP_FIR_Direct_S16, OMXSP_SUFFIX)
 #define omxSP_FIR_Direct_S16_I                            OMXCATBAR(SP_FIR_Direct_S16_I, OMXSP_SUFFIX)
 #define omxSP_FIR_Direct_S16_ISfs                         OMXCATBAR(SP_FIR_Direct_S16_ISfs, OMXSP_SUFFIX)
 #define omxSP_FIR_Direct_S16_Sfs                          OMXCATBAR(SP_FIR_Direct_S16_Sfs, OMXSP_SUFFIX)
 #define omxSP_FIROne_Direct_S16                           OMXCATBAR(SP_FIROne_Direct_S16, OMXSP_SUFFIX)
 #define omxSP_FIROne_Direct_S16_I                         OMXCATBAR(SP_FIROne_Direct_S16_I, OMXSP_SUFFIX)
 #define omxSP_FIROne_Direct_S16_ISfs                      OMXCATBAR(SP_FIROne_Direct_S16_ISfs, OMXSP_SUFFIX)
 #define omxSP_FIROne_Direct_S16_Sfs                       OMXCATBAR(SP_FIROne_Direct_S16_Sfs, OMXSP_SUFFIX)
 #define omxSP_IIR_BiQuadDirect_S16                        OMXCATBAR(SP_IIR_BiQuadDirect_S16, OMXSP_SUFFIX)
 #define omxSP_IIR_BiQuadDirect_S16_I                      OMXCATBAR(SP_IIR_BiQuadDirect_S16_I, OMXSP_SUFFIX)
 #define omxSP_IIR_Direct_S16                              OMXCATBAR(SP_IIR_Direct_S16, OMXSP_SUFFIX)
 #define omxSP_IIR_Direct_S16_I                            OMXCATBAR(SP_IIR_Direct_S16_I, OMXSP_SUFFIX)
 #define omxSP_IIROne_BiQuadDirect_S16                     OMXCATBAR(SP_IIROne_BiQuadDirect_S16, OMXSP_SUFFIX)
 #define omxSP_IIROne_BiQuadDirect_S16_I                   OMXCATBAR(SP_IIROne_BiQuadDirect_S16_I, OMXSP_SUFFIX)
 #define omxSP_IIROne_Direct_S16                           OMXCATBAR(SP_IIROne_Direct_S16, OMXSP_SUFFIX)
 #define omxSP_IIROne_Direct_S16_I                         OMXCATBAR(SP_IIROne_Direct_S16_I, OMXSP_SUFFIX)
 #define omxVCCOMM_Average_16x                             OMXCATBAR(VCCOMM_Average_16x, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_Average_8x                              OMXCATBAR(VCCOMM_Average_8x, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_ComputeTextureErrorBlock                OMXCATBAR(VCCOMM_ComputeTextureErrorBlock, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_ComputeTextureErrorBlock_SAD            OMXCATBAR(VCCOMM_ComputeTextureErrorBlock_SAD, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_Copy16x16                               OMXCATBAR(VCCOMM_Copy16x16, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_Copy8x8                                 OMXCATBAR(VCCOMM_Copy8x8, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_ExpandFrame_I                           OMXCATBAR(VCCOMM_ExpandFrame_I, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_LimitMVToRect                           OMXCATBAR(VCCOMM_LimitMVToRect, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_SAD_16x                                 OMXCATBAR(VCCOMM_SAD_16x, OMXVCCOMM_SUFFIX)
 #define omxVCCOMM_SAD_8x                                  OMXCATBAR(VCCOMM_SAD_8x, OMXVCCOMM_SUFFIX)
 #define omxVCM4P10_Average_4x                             OMXCATBAR(VCM4P10_Average_4x, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_BlockMatch_Half                        OMXCATBAR(VCM4P10_BlockMatch_Half, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_BlockMatch_Integer                     OMXCATBAR(VCM4P10_BlockMatch_Integer, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_BlockMatch_Quarter                     OMXCATBAR(VCM4P10_BlockMatch_Quarter, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_DeblockChroma_I                        OMXCATBAR(VCM4P10_DeblockChroma_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_DeblockLuma_I                          OMXCATBAR(VCM4P10_DeblockLuma_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_DecodeChromaDcCoeffsToPairCAVLC        OMXCATBAR(VCM4P10_DecodeChromaDcCoeffsToPairCAVLC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_DecodeCoeffsToPairCAVLC                OMXCATBAR(VCM4P10_DecodeCoeffsToPairCAVLC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_DequantTransformResidualFromPairAndAdd OMXCATBAR(VCM4P10_DequantTransformResidualFromPairAndAdd, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_FilterDeblockingChroma_HorEdge_I       OMXCATBAR(VCM4P10_FilterDeblockingChroma_HorEdge_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_FilterDeblockingChroma_VerEdge_I       OMXCATBAR(VCM4P10_FilterDeblockingChroma_VerEdge_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_FilterDeblockingLuma_HorEdge_I         OMXCATBAR(VCM4P10_FilterDeblockingLuma_HorEdge_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_FilterDeblockingLuma_VerEdge_I         OMXCATBAR(VCM4P10_FilterDeblockingLuma_VerEdge_I, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_GetVLCInfo                             OMXCATBAR(VCM4P10_GetVLCInfo, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InterpolateChroma                      OMXCATBAR(VCM4P10_InterpolateChroma, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InterpolateHalfHor_Luma                OMXCATBAR(VCM4P10_InterpolateHalfHor_Luma, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InterpolateHalfVer_Luma                OMXCATBAR(VCM4P10_InterpolateHalfVer_Luma, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InterpolateLuma                        OMXCATBAR(VCM4P10_InterpolateLuma, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InvTransformDequant_ChromaDC           OMXCATBAR(VCM4P10_InvTransformDequant_ChromaDC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InvTransformDequant_LumaDC             OMXCATBAR(VCM4P10_InvTransformDequant_LumaDC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_InvTransformResidualAndAdd             OMXCATBAR(VCM4P10_InvTransformResidualAndAdd, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_MEGetBufSize                           OMXCATBAR(VCM4P10_MEGetBufSize, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_MEInit                                 OMXCATBAR(VCM4P10_MEInit, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_MotionEstimationMB                     OMXCATBAR(VCM4P10_MotionEstimationMB, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_PredictIntra_16x16                     OMXCATBAR(VCM4P10_PredictIntra_16x16, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_PredictIntra_4x4                       OMXCATBAR(VCM4P10_PredictIntra_4x4, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_PredictIntraChroma_8x8                  OMXCATBAR(VCM4P10_PredictIntraChroma_8x8, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SAD_4x                                 OMXCATBAR(VCM4P10_SAD_4x, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SADQuar_16x                            OMXCATBAR(VCM4P10_SADQuar_16x, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SADQuar_4x                             OMXCATBAR(VCM4P10_SADQuar_4x, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SADQuar_8x                             OMXCATBAR(VCM4P10_SADQuar_8x, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SATD_4x4                               OMXCATBAR(VCM4P10_SATD_4x4, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_SubAndTransformQDQResidual             OMXCATBAR(VCM4P10_SubAndTransformQDQResidual, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_TransformDequantChromaDCFromPair       OMXCATBAR(VCM4P10_TransformDequantChromaDCFromPair, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_TransformDequantLumaDCFromPair         OMXCATBAR(VCM4P10_TransformDequantLumaDCFromPair, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_TransformQuant_ChromaDC                OMXCATBAR(VCM4P10_TransformQuant_ChromaDC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P10_TransformQuant_LumaDC                  OMXCATBAR(VCM4P10_TransformQuant_LumaDC, OMXVCM4P10_SUFFIX)
 #define omxVCM4P2_BlockMatch_Half_16x16                   OMXCATBAR(VCM4P2_BlockMatch_Half_16x16, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_BlockMatch_Half_8x8                     OMXCATBAR(VCM4P2_BlockMatch_Half_8x8, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_BlockMatch_Integer_16x16                OMXCATBAR(VCM4P2_BlockMatch_Integer_16x16, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_BlockMatch_Integer_8x8                  OMXCATBAR(VCM4P2_BlockMatch_Integer_8x8, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DCT8x8blk                               OMXCATBAR(VCM4P2_DCT8x8blk, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodeBlockCoef_Inter                   OMXCATBAR(VCM4P2_DecodeBlockCoef_Inter, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodeBlockCoef_Intra                   OMXCATBAR(VCM4P2_DecodeBlockCoef_Intra, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodePadMV_PVOP                        OMXCATBAR(VCM4P2_DecodePadMV_PVOP, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodeVLCZigzag_Inter                   OMXCATBAR(VCM4P2_DecodeVLCZigzag_Inter, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodeVLCZigzag_IntraACVLC              OMXCATBAR(VCM4P2_DecodeVLCZigzag_IntraACVLC, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_DecodeVLCZigzag_IntraDCVLC              OMXCATBAR(VCM4P2_DecodeVLCZigzag_IntraDCVLC, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_EncodeMV                                OMXCATBAR(VCM4P2_EncodeMV, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_EncodeVLCZigzag_Inter                   OMXCATBAR(VCM4P2_EncodeVLCZigzag_Inter, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_EncodeVLCZigzag_IntraACVLC              OMXCATBAR(VCM4P2_EncodeVLCZigzag_IntraACVLC, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_EncodeVLCZigzag_IntraDCVLC              OMXCATBAR(VCM4P2_EncodeVLCZigzag_IntraDCVLC, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_FindMVpred                              OMXCATBAR(VCM4P2_FindMVpred, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_IDCT8x8blk                              OMXCATBAR(VCM4P2_IDCT8x8blk, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_MCReconBlock                            OMXCATBAR(VCM4P2_MCReconBlock, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_MEGetBufSize                            OMXCATBAR(VCM4P2_MEGetBufSize, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_MEInit                                  OMXCATBAR(VCM4P2_MEInit, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_MotionEstimationMB                      OMXCATBAR(VCM4P2_MotionEstimationMB, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_PredictReconCoefIntra                   OMXCATBAR(VCM4P2_PredictReconCoefIntra, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_QuantInter_I                            OMXCATBAR(VCM4P2_QuantInter_I, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_QuantIntra_I                            OMXCATBAR(VCM4P2_QuantIntra_I, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_QuantInvInter_I                         OMXCATBAR(VCM4P2_QuantInvInter_I, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_QuantInvIntra_I                         OMXCATBAR(VCM4P2_QuantInvIntra_I, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_TransRecBlockCoef_inter                 OMXCATBAR(VCM4P2_TransRecBlockCoef_inter, OMXVCM4P2_SUFFIX)
 #define omxVCM4P2_TransRecBlockCoef_intra                 OMXCATBAR(VCM4P2_TransRecBlockCoef_intra, OMXVCM4P2_SUFFIX)
 #endif /* endif ARMOMX_ENABLE_RENAMING */
 #endif /* _armOMX_h_ */
--- a/media/openmax_dl/dl/api/omxtypes.h
+++ b/media/openmax_dl/dl/api/omxtypes.h
@ -1,286 +0,0 @@
 /**
 * File: omxtypes.h
 * Brief: Defines basic Data types used in OpenMAX v1.0.2 header files.
 *
 * Copyright (c) 2005-2008,2015 The Khronos Group Inc.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and/or associated documentation files (the
 * "Materials"), to deal in the Materials without restriction, including
 * without limitation the rights to use, copy, modify, merge, publish,
 * distribute, sublicense, and/or sell copies of the Materials, and to
 * permit persons to whom the Materials are furnished to do so, subject to
 * the following conditions:
 *
 * The above copyright notice and this permission notice shall be included
 * in all copies or substantial portions of the Materials.
 *
 * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
 * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
 * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
 *    https://www.khronos.org/registry/
 *
 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
 *
 */
 #ifndef _OMXTYPES_H_
 #define _OMXTYPES_H_
 #include <limits.h> 
 #ifdef __cplusplus
 extern "C" {
 #endif
 /*
 * Maximum FFT order supported by the twiddle table.  Only used by the
 * float FFT routines. Must be consistent with the table in
 * armSP_FFT_F32TwiddleTable.c.
 */
 #ifdef BIG_FFT_TABLE
 #define TWIDDLE_TABLE_ORDER 15
 #else
 #define TWIDDLE_TABLE_ORDER 12
 #endif
 #define OMX_IN
 #define OMX_OUT
 #define OMX_INOUT
 typedef enum {
    /* Mandatory return codes - use cases are explicitly described for each function */
    OMX_Sts_NoErr                    =  0,    /* No error, the function completed successfully */
    OMX_Sts_Err                      = -2,    /* Unknown/unspecified error */    
    OMX_Sts_InvalidBitstreamValErr   = -182,  /* Invalid value detected during bitstream processing */    
    OMX_Sts_MemAllocErr              = -9,    /* Not enough memory allocated for the operation */
    OMX_StsACAAC_GainCtrErr    	     = -159,  /* AAC: Unsupported gain control data detected */
    OMX_StsACAAC_PrgNumErr           = -167,  /* AAC: Invalid number of elements for one program   */
    OMX_StsACAAC_CoefValErr          = -163,  /* AAC: Invalid quantized coefficient value          */     
    OMX_StsACAAC_MaxSfbErr           = -162,  /* AAC: Invalid maxSfb value in relation to numSwb */    
 	OMX_StsACAAC_PlsDataErr		     = -160,  /* AAC: pulse escape sequence data error */
    /* Optional return codes - use cases are explicitly described for each function*/
    OMX_Sts_BadArgErr                = -5,    /* Bad Arguments */
    OMX_StsACAAC_TnsNumFiltErr       = -157,  /* AAC: Invalid number of TNS filters  */
    OMX_StsACAAC_TnsLenErr           = -156,  /* AAC: Invalid TNS region length  */   
    OMX_StsACAAC_TnsOrderErr         = -155,  /* AAC: Invalid order of TNS filter  */                  
    OMX_StsACAAC_TnsCoefResErr       = -154,  /* AAC: Invalid bit-resolution for TNS filter coefficients  */
    OMX_StsACAAC_TnsCoefErr          = -153,  /* AAC: Invalid TNS filter coefficients  */                  
    OMX_StsACAAC_TnsDirectErr        = -152,  /* AAC: Invalid TNS filter direction  */  
    OMX_StsICJP_JPEGMarkerErr        = -183,  /* JPEG marker encountered within an entropy-coded block; */
                                              /* Huffman decoding operation terminated early.           */
    OMX_StsICJP_JPEGMarker           = -181,  /* JPEG marker encountered; Huffman decoding */
                                              /* operation terminated early.                         */
    OMX_StsIPPP_ContextMatchErr      = -17,   /* Context parameter doesn't match the operation */
    OMX_StsSP_EvenMedianMaskSizeErr  = -180,  /* Even-sized Median Filter mask was replaced by an odd-sized one */
    OMX_Sts_MaximumEnumeration       = INT_MAX  /* Placeholder, forces enum of size OMX_INT */
 } OMXResult;          /** Return value or error value returned from a function. Identical to OMX_INT */
 /* OMX_U8 */
 #if UCHAR_MAX == 0xff
 typedef unsigned char OMX_U8;
 #elif USHRT_MAX == 0xff 
 typedef unsigned short int OMX_U8; 
 #else
 #error OMX_U8 undefined
 #endif 
 /* OMX_S8 */
 #if SCHAR_MAX == 0x7f 
 typedef signed char OMX_S8;
 #elif SHRT_MAX == 0x7f 
 typedef signed short int OMX_S8; 
 #else
 #error OMX_S8 undefined
 #endif
 /* OMX_U16 */
 #if USHRT_MAX == 0xffff
 typedef unsigned short int OMX_U16;
 #elif UINT_MAX == 0xffff
 typedef unsigned int OMX_U16; 
 #else
 #error OMX_U16 undefined
 #endif
 /* OMX_S16 */
 #if SHRT_MAX == 0x7fff 
 typedef signed short int OMX_S16;
 #elif INT_MAX == 0x7fff 
 typedef signed int OMX_S16; 
 #else
 #error OMX_S16 undefined
 #endif
 /* OMX_U32 */
 #if UINT_MAX == 0xffffffff
 typedef unsigned int OMX_U32;
 #elif ULONG_MAX == 0xffffffff
 typedef unsigned long int OMX_U32; 
 #else
 #error OMX_U32 undefined
 #endif
 /* OMX_S32 */
 #if INT_MAX == 0x7fffffff
 typedef signed int OMX_S32;
 #elif LONG_MAX == 0x7fffffff
 typedef long signed int OMX_S32; 
 #else
 #error OMX_S32 undefined
 #endif
 /* OMX_U64 & OMX_S64 */
 #if defined( _WIN32 ) || defined ( _WIN64 )
    typedef __int64 OMX_S64; /** Signed 64-bit integer */
    typedef unsigned __int64 OMX_U64; /** Unsigned 64-bit integer */
    #define OMX_MIN_S64			(0x8000000000000000i64)
    #define OMX_MIN_U64			(0x0000000000000000i64)
    #define OMX_MAX_S64			(0x7FFFFFFFFFFFFFFFi64)
    #define OMX_MAX_U64			(0xFFFFFFFFFFFFFFFFi64)
 #else
    typedef long long OMX_S64; /** Signed 64-bit integer */
    typedef unsigned long long OMX_U64; /** Unsigned 64-bit integer */
    #define OMX_MIN_S64			(-0x7FFFFFFFFFFFFFFFLL - 1)
    #define OMX_MIN_U64			(0x0000000000000000ULL)
    #define OMX_MAX_S64			(0x7FFFFFFFFFFFFFFFLL)
    #define OMX_MAX_U64			(0xFFFFFFFFFFFFFFFFULL)
 #endif
 /* OMX_SC8 */
 typedef struct
 {
  OMX_S8 Re; /** Real part */
  OMX_S8 Im; /** Imaginary part */	
 } OMX_SC8; /** Signed 8-bit complex number */
 /* OMX_SC16 */
 typedef struct
 {
  OMX_S16 Re; /** Real part */
  OMX_S16 Im; /** Imaginary part */	
 } OMX_SC16; /** Signed 16-bit complex number */
 /* OMX_SC32 */
 typedef struct
 {
  OMX_S32 Re; /** Real part */
  OMX_S32 Im; /** Imaginary part */	
 } OMX_SC32; /** Signed 32-bit complex number */
 /* OMX_SC64 */
 typedef struct
 {
  OMX_S64 Re; /** Real part */
  OMX_S64 Im; /** Imaginary part */	
 } OMX_SC64; /** Signed 64-bit complex number */
 /* OMX_F32 */
 typedef float OMX_F32; /** Single precision floating point, IEEE 754 */
 /* OMX_F64 */
 typedef double OMX_F64; /** Double precision floating point, IEEE 754 */
 /* OMX_FC32 */
 typedef struct
 {
  OMX_F32 Re; /** Real part */
  OMX_F32 Im; /** Imaginary part */	
 } OMX_FC32; /** single precision floating point complex number */
 /* OMX_FC64 */
 typedef struct
 {
  OMX_F64 Re; /** Real part */
  OMX_F64 Im; /** Imaginary part */	
 } OMX_FC64; /** double precision floating point complex number */
 /* OMX_INT */
 typedef int OMX_INT; /** Signed integer corresponding to the machine word length; its maximum value is INT_MAX */
 #define OMX_MIN_S8  	   	(-128)
 #define OMX_MIN_U8  		0
 #define OMX_MIN_S16		 	(-32768)
 #define OMX_MIN_U16			0
 #define OMX_MIN_S32			(-2147483647-1)
 #define OMX_MIN_U32			0
 #define OMX_MAX_S8			(127)
 #define OMX_MAX_U8			(255)
 #define OMX_MAX_S16			(32767)
 #define OMX_MAX_U16			(0xFFFF)
 #define OMX_MAX_S32			(2147483647)
 #define OMX_MAX_U32			(0xFFFFFFFF)
 typedef void OMXVoid;
 #ifndef NULL
 #define NULL ((void*)0)
 #endif
 /** Defines the geometric position and size of a rectangle, 
  * where x,y defines the coordinates of the top left corner
  * of the rectangle, with dimensions width in the x-direction 
  * and height in the y-direction */
 typedef struct {
 	OMX_INT x;      /** x-coordinate of top left corner of rectangle */
 	OMX_INT y;      /** y-coordinate of top left corner of rectangle */
 	OMX_INT width;  /** Width in the x-direction. */
 	OMX_INT height; /** Height in the y-direction. */
 }OMXRect;
 /** Defines the geometric position of a point. */
 typedef struct 
 {
 OMX_INT x; /** x-coordinate */
 OMX_INT y;	/** y-coordinate */
 } OMXPoint;
 /** Defines the dimensions of a rectangle, or region of interest in an image */
 typedef struct 
 {
 OMX_INT width;  /** Width of the rectangle, in the x-direction */
 OMX_INT height; /** Height of the rectangle, in the y-direction */
 } OMXSize;
 #ifdef __cplusplus
 }
 #endif
 #endif /* _OMXTYPES_H_ */
--- a/media/openmax_dl/dl/api/omxtypes_s.h
+++ b/media/openmax_dl/dl/api/omxtypes_s.h
@ -1,76 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This file was originally licensed as follows. It has been
@//  relicensed with permission from the copyright holders.
@//
@// 
@// File Name:  omxtypes_s.h
@// OpenMAX DL: v1.0.2
@// Last Modified Revision:   9622
@// Last Modified Date:       Wed, 06 Feb 2008
@// 
@// (c) Copyright 2007-2008 ARM Limited. All Rights Reserved.
@// 
@//
@// Mandatory return codes - use cases are explicitly described for each function 
 	.equ	OMX_Sts_NoErr, 0    @// No error the function completed successfully 
 	.equ	OMX_Sts_Err, -2    @// Unknown/unspecified error     
 	.equ	OMX_Sts_InvalidBitstreamValErr, -182  @// Invalid value detected during bitstream processing     
 	.equ	OMX_Sts_MemAllocErr, -9    @// Not enough memory allocated for the operation 
 	.equ	OMX_StsACAAC_GainCtrErr, -159  @// AAC: Unsupported gain control data detected 
 	.equ	OMX_StsACAAC_PrgNumErr, -167  @// AAC: Invalid number of elements for one program   
 	.equ	OMX_StsACAAC_CoefValErr, -163  @// AAC: Invalid quantized coefficient value               
 	.equ	OMX_StsACAAC_MaxSfbErr, -162  @// AAC: Invalid maxSfb value in relation to numSwb     
 	.equ	OMX_StsACAAC_PlsDataErr, -160  @// AAC: pulse escape sequence data error 
@// Optional return codes - use cases are explicitly described for each function
 	.equ	OMX_Sts_BadArgErr, -5    @// Bad Arguments 
 	.equ	OMX_StsACAAC_TnsNumFiltErr, -157  @// AAC: Invalid number of TNS filters  
 	.equ	OMX_StsACAAC_TnsLenErr, -156  @// AAC: Invalid TNS region length     
 	.equ	OMX_StsACAAC_TnsOrderErr, -155  @// AAC: Invalid order of TNS filter                    
 	.equ	OMX_StsACAAC_TnsCoefResErr, -154  @// AAC: Invalid bit-resolution for TNS filter coefficients  
 	.equ	OMX_StsACAAC_TnsCoefErr, -153  @// AAC: Invalid TNS filter coefficients                    
 	.equ	OMX_StsACAAC_TnsDirectErr, -152  @// AAC: Invalid TNS filter direction    
 	.equ	OMX_StsICJP_JPEGMarkerErr, -183  @// JPEG marker encountered within an entropy-coded block; 
                                            @// Huffman decoding operation terminated early.           
 	.equ	OMX_StsICJP_JPEGMarker, -181  @// JPEG marker encountered; Huffman decoding 
                                            @// operation terminated early.                         
 	.equ	OMX_StsIPPP_ContextMatchErr, -17   @// Context parameter doesn't match the operation
 	.equ	OMX_StsSP_EvenMedianMaskSizeErr, -180  @// Even-sized Median Filter mask was replaced by an odd-sized one
 	.equ	OMX_Sts_MaximumEnumeration, 0x7FFFFFFF
 	.equ	OMX_MIN_S8, (-128)
 	.equ	OMX_MIN_U8, 0
 	.equ	OMX_MIN_S16, (-32768)
 	.equ	OMX_MIN_U16, 0
 	.equ	OMX_MIN_S32, (-2147483647-1)
 	.equ	OMX_MIN_U32, 0
 	.equ	OMX_MAX_S8, (127)
 	.equ	OMX_MAX_U8, (255)
 	.equ	OMX_MAX_S16, (32767)
 	.equ	OMX_MAX_U16, (0xFFFF)
 	.equ	OMX_MAX_S32, (2147483647)
 	.equ	OMX_MAX_U32, (0xFFFFFFFF)
 	.equ	OMX_VC_UPPER, 0x1                 @// Used by the PredictIntra functions   
 	.equ	OMX_VC_LEFT, 0x2                 @// Used by the PredictIntra functions 
 	.equ	OMX_VC_UPPER_RIGHT, 0x40          @// Used by the PredictIntra functions   
 	.equ	NULL, 0
--- a/media/openmax_dl/dl/moz.build
+++ b/media/openmax_dl/dl/moz.build
@ -1,49 +0,0 @@
 # -*- Mode: python; indent-tabs-mode: nil; tab-width: 40 -*-
 # vim: set filetype=python:
 # This Source Code Form is subject to the terms of the Mozilla Public
 # License, v. 2.0. If a copy of the MPL was not distributed with this
 # file, You can obtain one at http://mozilla.org/MPL/2.0/.
 if CONFIG['TARGET_CPU'] == 'arm' and CONFIG['BUILD_ARM_NEON']:
    Library('openmax_dl')
    EXPORTS.dl.api += [
        'api/armCOMM_s.h',
        'api/armOMX.h',
        'api/omxtypes.h',
        'api/omxtypes_s.h',
    ]
    EXPORTS.dl.sp.api += [
        'sp/api/armSP.h',
        'sp/api/omxSP.h',
    ]
    SOURCES += [
        'sp/src/armSP_FFT_F32TwiddleTable.c',
        'sp/src/omxSP_FFTGetBufSize_R_F32.c',
        'sp/src/omxSP_FFTGetBufSize_R_S32.c',
        'sp/src/omxSP_FFTInit_R_F32.c',
    ]
    SOURCES += [
        'sp/src/armSP_FFT_CToC_FC32_Radix2_fs_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix2_ls_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix2_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix4_fs_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix4_ls_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix4_unsafe_s.S',
        'sp/src/armSP_FFT_CToC_FC32_Radix8_fs_unsafe_s.S',
        'sp/src/armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe_s.S',
        'sp/src/omxSP_FFTFwd_RToCCS_F32_Sfs_s.S',
        'sp/src/omxSP_FFTInv_CCSToR_F32_Sfs_unscaled_s.S',
    ]
    LOCAL_INCLUDES += [
        '..',
        'api'
    ]
    DEFINES['BIG_FFT_TABLE'] = True
    FINAL_LIBRARY = 'xul'
--- a/media/openmax_dl/dl/sp/api/armSP.h
+++ b/media/openmax_dl/dl/sp/api/armSP.h
@ -1,92 +0,0 @@
 /*
 *  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 *
 *  This file was originally licensed as follows. It has been
 *  relicensed with permission from the copyright holders.
 */
 /**
 * 
 * File Name:  armSP.h
 * OpenMAX DL: v1.0.2
 * Last Modified Revision:   7014
 * Last Modified Date:       Wed, 01 Aug 2007
 * 
 * (c) Copyright 2007-2008 ARM Limited. All Rights Reserved.
 * 
 * 
 *   
 * File: armSP.h
 * Brief: Declares APIs and basic data types used across the OpenMAX Signal Processing domain
 *
 */
 #ifndef _armSP_H_
 #define _armSP_H_
 #include "dl/api/omxtypes.h"
 #ifdef __cplusplus
 extern "C" {
 #endif
 /** FFT Specific declarations */
 extern  OMX_S32 armSP_FFT_S32TwiddleTable[1026];
 extern OMX_F32 armSP_FFT_F32TwiddleTable[];
 typedef struct  ARMsFFTSpec_SC32_Tag 
 {
    OMX_U32     N;
    OMX_U16     *pBitRev;    
    OMX_SC32    *pTwiddle;
    OMX_SC32    *pBuf;
 }ARMsFFTSpec_SC32;
 typedef struct  ARMsFFTSpec_SC16_Tag 
 {
    OMX_U32     N;
    OMX_U16     *pBitRev;    
    OMX_SC16    *pTwiddle;
    OMX_SC16    *pBuf;
 }ARMsFFTSpec_SC16;
 typedef struct  ARMsFFTSpec_R_SC32_Tag 
 {
    OMX_U32     N;
    OMX_U16     *pBitRev;    
    OMX_SC32    *pTwiddle;
    OMX_S32     *pBuf;
 }ARMsFFTSpec_R_SC32;
 typedef struct ARMsFFTSpec_R_FC32_Tag
 {
    OMX_U32 N;
    OMX_U16* pBitRev;
    OMX_FC32* pTwiddle;
    OMX_F32* pBuf;
 } ARMsFFTSpec_R_FC32;
 typedef struct ARMsFFTSpec_FC32_Tag
 {
    OMX_U32 N;
    OMX_U16* pBitRev;
    OMX_FC32* pTwiddle;
    OMX_FC32* pBuf;
 } ARMsFFTSpec_FC32;
 #ifdef __cplusplus
 }
 #endif
 #endif
 /*End of File*/
--- a/media/openmax_dl/dl/sp/api/omxSP.h
+++ b/media/openmax_dl/dl/sp/api/omxSP.h
--- a/media/openmax_dl/dl/sp/src/armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe_s.S
@ -1,294 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of
@//  armSP_FFTInv_CCSToR_S32_preTwiddleRadix2_unsafe_s.s to support float
@//  instead of SC32.
@//
@//
@// Description:
@// Compute the "preTwiddleRadix2" stage prior to the call to the complexFFT
@// It does a Z(k) = Feven(k) + jW^(-k) FOdd(k); k=0,1,2,...N/2-1 computation
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r1
 #define pFFTSpec        r2
 #define scale           r3
@// Output registers
 #define result          r0
@//Local Scratch Registers
 #define argTwiddle      r1
 #define argDst          r2
 #define argScale        r4
 #define tmpOrder        r4
 #define pTwiddle        r4
 #define pOut            r5
 #define subFFTSize      r7
 #define subFFTNum       r6
 #define N               r6
 #define order           r14
 #define diff            r9
@// Total num of radix stages required to complete the FFT
 #define count           r8
 #define x0r             r4
 #define x0i             r5
 #define diffMinusOne    r2
 #define round           r3
 #define pOut1           r2
 #define size            r7
 #define step            r8
 #define step1           r9
 #define twStep          r10
 #define pTwiddleTmp     r11
 #define argTwiddle1     r12
 #define zero            r14
@// Neon registers
 #define dX0     D0
 #define dShift  D1
 #define dX1     D1
 #define dY0     D2
 #define dY1     D3
 #define dX0r    D0
 #define dX0i    D1
 #define dX1r    D2
 #define dX1i    D3
 #define dW0r    D4
 #define dW0i    D5
 #define dW1r    D6
 #define dW1i    D7
 #define dT0     D8
 #define dT1     D9
 #define dT2     D10
 #define dT3     D11
 #define qT0     D12
 #define qT1     D14
 #define qT2     D16
 #define qT3     D18
 #define dY0r    D4
 #define dY0i    D5
 #define dY1r    D6
 #define dY1i    D7
 #define dY2     D4
 #define dY3     D5
 #define dW0     D6
 #define dW1     D7
 #define dW0Tmp  D10
 #define dW1Neg  D11
 #define half    D13
@ Structure offsets for the FFTSpec
        .set    ARMsFFTSpec_N, 0
        .set    ARMsFFTSpec_pBitRev, 4
        .set    ARMsFFTSpec_pTwiddle, 8
        .set    ARMsFFTSpec_pBuf, 12
        .MACRO FFTSTAGE scaled, inverse, name
        @// Read the size from structure and take log
        LDR     N, [pFFTSpec, #ARMsFFTSpec_N]
        @// Read other structure parameters
        LDR     pTwiddle, [pFFTSpec, #ARMsFFTSpec_pTwiddle]
        LDR     pOut, [pFFTSpec, #ARMsFFTSpec_pBuf]
        VMOV.F32    half, #0.5
        MOV     size,N,ASR #1                 @// preserve the contents of N
        MOV     step,N,LSL #2                 @// step = N/2 * 8 bytes
        @// Z(k) = 1/2 {[F(k) +  F'(N/2-k)] +j*W^(-k) [F(k) -  F'(N/2-k)]}
        @// Note: W^(k) is stored negated in the table, and the table values
        @// also need to be conjugated
        @// Z(0) : no need of twiddle multiply
        @// Z(0) = 1/2 { [F(0) +  F'(N/2)] +j [F(0) -  F'(N/2)] }
        VLD1.F32    dX0,[pSrc],step
        ADD     pOut1,pOut,step               @// pOut1 = pOut+ N/2*8 bytes
        VLD1.F32    dX1,[pSrc]!
        @// twStep = 3N/8 * 8 bytes pointing to W^1
        SUB     twStep,step,size,LSL #1
        MOV     step1,size,LSL #2             @// step1 = N/4 * 8 = N/2*4 bytes
        SUB     step1,step1,#8                @// (N/4-1)*8 bytes
        VADD.F32    dY0,dX0,dX1                   @// [b+d | a+c]
        VSUB.F32    dY1,dX0,dX1                   @// [b-d | a-c]
        VMUL.F32    dY0, dY0, half[0]
        VMUL.F32    dY1, dY1, half[0]
        @// dY0= [a-c | a+c] ;dY1= [b-d | b+d]
        VZIP.F32    dY0,dY1
        VSUB.F32   dX0,dY0,dY1
        SUBS   size,size,#2
        VADD.F32   dX1,dY0,dY1
        SUB     pSrc,pSrc,step
        VST1.F32    dX0[0],[pOut1]!
        ADD     pTwiddleTmp,pTwiddle,#8       @// W^2
        VST1.F32    dX1[1],[pOut1]!
        ADD     argTwiddle1,pTwiddle,twStep   @// W^1
        BLT     decrementScale\name
        BEQ     lastElement\name
        @// Z(k) = 1/2 {[F(k) +  F'(N/2-k)] +j*W^(-k) [F(k) -  F'(N/2-k)]}
        @// Note: W^k is stored negated in the table, and the table values
        @// also need to be conjugated.
        @//
        @// Process 4 elements at a time. E.g: Z(1),Z(2) and Z(N/2-2),Z(N/2-1)
        @// since both of them require F(1),F(2) and F(N/2-2),F(N/2-1)
        SUB     step,step,#24
 evenOddButterflyLoop\name :
        VLD1.F32    dW0r,[argTwiddle1],step1
        VLD1.F32    dW1r,[argTwiddle1]!
        VLD2.F32    {dX0r,dX0i},[pSrc],step
        SUB     argTwiddle1,argTwiddle1,step1
        VLD2.F32    {dX1r,dX1i},[pSrc]!
        SUB     step1,step1,#8                @// (N/4-2)*8 bytes
        VLD1.F32    dW0i,[pTwiddleTmp],step1
        VLD1.F32    dW1i,[pTwiddleTmp]!
        SUB     pSrc,pSrc,step
        SUB     pTwiddleTmp,pTwiddleTmp,step1
        VREV64.F32  dX1r,dX1r
        VREV64.F32  dX1i,dX1i
        SUBS    size,size,#4
        VSUB.F32    dT2,dX0r,dX1r                 @// a-c
        VADD.F32    dT3,dX0i,dX1i                 @// b+d
        VADD.F32    dT0,dX0r,dX1r                 @// a+c
        VSUB.F32    dT1,dX0i,dX1i                 @// b-d
        SUB     step1,step1,#8
        VMUL.F32    dT2, dT2, half[0]
        VMUL.F32    dT3, dT3, half[0]
        VMUL.F32    dT0, dT0, half[0]
        VMUL.F32    dT1, dT1, half[0]
        VZIP.F32    dW1r,dW1i
        VZIP.F32    dW0r,dW0i
        VMUL.F32   dX1r,dW1r,dT2
        VMUL.F32   dX1i,dW1r,dT3
        VMUL.F32   dX0r,dW0r,dT2
        VMUL.F32   dX0i,dW0r,dT3
        VMLS.F32   dX1r,dW1i,dT3
        VMLA.F32   dX1i,dW1i,dT2
        VMLA.F32   dX0r,dW0i,dT3
        VMLS.F32   dX0i,dW0i,dT2
        VADD.F32    dY1r,dT0,dX1i                 @// F(N/2 -1)
        VSUB.F32    dY1i,dX1r,dT1
        VREV64.F32  dY1r,dY1r
        VREV64.F32  dY1i,dY1i
        VADD.F32    dY0r,dT0,dX0i                 @// F(1)
        VSUB.F32    dY0i,dT1,dX0r
        VST2.F32    {dY0r,dY0i},[pOut1],step
        VST2.F32    {dY1r,dY1i},[pOut1]!
        SUB     pOut1,pOut1,step
        SUB     step,step,#32                 @// (N/2-4)*8 bytes
        BGT     evenOddButterflyLoop\name
        @// set both the ptrs to the last element
        SUB     pSrc,pSrc,#8
        SUB     pOut1,pOut1,#8
        @// Last element can be expanded as follows
        @// 1/2[Z(k) + Z'(k)] - j w^-k [Z(k) - Z'(k)] (since W^k is stored as
        @// -ve)
        @// 1/2[(a+jb) + (a-jb)] - j w^-k [(a+jb) - (a-jb)]
        @// 1/2[2a+j0] - j (c-jd) [0+j2b]
        @// (a+bc, -bd)
        @// Since (c,d) = (0,1) for the last element, result is just (a,-b)
 lastElement\name :
        VLD1.F32    dX0r,[pSrc]
        VST1.F32    dX0r[0],[pOut1]!
        VNEG.F32    dX0r,dX0r
        VST1.F32    dX0r[1],[pOut1]
 decrementScale\name :
        .endm
        M_START armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe,r4
            FFTSTAGE "FALSE","TRUE",Inv
        M_END
        .end
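
The pre-twiddle formula in the comments above can be checked with a small scalar model. This is a minimal sketch in Python, not the vendored implementation: the names `dft`, `pre_twiddle_radix2`, and `inverse_real_fft` are hypothetical, W^(-k) is computed directly instead of being read from the negated twiddle table, and a naive O(N^2) DFT stands in for the complex FFT stages that follow the pre-twiddle step.

```python
import cmath


def dft(x, sign=-1):
    """Naive DFT; sign=-1 is the forward transform, sign=+1 the unscaled inverse."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def pre_twiddle_radix2(F):
    """Z(k) = 1/2 {[F(k) + F'(N/2-k)] + j*W^(-k) [F(k) - F'(N/2-k)]}, k = 0..N/2-1.

    F is the N-point spectrum of a real signal; only bins 0..N/2 are read.
    F' denotes the complex conjugate and W = exp(-2*pi*j/N).
    """
    n = len(F)
    half = n // 2
    Z = []
    for k in range(half):
        a = F[k]
        b = F[half - k].conjugate()
        w_inv = cmath.exp(2j * cmath.pi * k / n)  # W^(-k), computed directly
        Z.append(0.5 * ((a + b) + 1j * w_inv * (a - b)))
    return Z


def inverse_real_fft(F):
    """Recover the real signal: an N/2-point inverse complex DFT of Z yields
    x[2n] in the real parts and x[2n+1] in the imaginary parts."""
    Z = pre_twiddle_radix2(F)
    half = len(Z)
    z = [v / half for v in dft(Z, sign=+1)]
    x = []
    for v in z:
        x.extend([v.real, v.imag])
    return x
```

The model packs even samples into the real part and odd samples into the imaginary part, which is why a single half-length complex transform is enough to recover a length-N real signal.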
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_fs_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_fs_unsafe_s.S
@ -1,134 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix2_fs_unsafe_s.S
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute the first stage of a Radix 2 DIT in-order out-of-place FFT
@// stage for a N point complex signal.
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define pPingPongBuf    r5
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define pointStep       r3
 #define outPointStep    r3
 #define grpSize         r4
 #define setCount        r4
 #define step            r8
 #define dstStep         r8
@// Neon Registers
 #define dX0     D0
 #define dX1     D1
 #define dY0     D2
 #define dY1     D3
        .MACRO FFTSTAGE scaled, inverse, name
        @// Define stack arguments
        @// update subFFTSize and subFFTNum into RN6 and RN7 for the next stage
        MOV        subFFTSize,#2
        LSR        grpSize,subFFTNum,#1
        MOV        subFFTNum,grpSize
        @// pT0+1 increments pT0 by 8 bytes
        @// pT0+pointStep = increment of 8*pointStep bytes = 4*grpSize bytes
        @// Note: outPointStep = pointStep for the first stage
        @// Note: setCount = grpSize/2 (reuse the updated grpSize for setCount)
        MOV        pointStep,grpSize,LSL #3
        RSB        step,pointStep,#8
        @// Loop on the sets for grp zero
 grpZeroSetLoop\name :
        VLD1.F32    dX0,[pSrc],pointStep
        VLD1.F32    dX1,[pSrc],step                   @// step = -pointStep + 8
        SUBS    setCount,setCount,#1
        VADD.F32    dY0,dX0,dX1
        VSUB.F32    dY1,dX0,dX1
        VST1.F32    dY0,[pDst],outPointStep
        @// dstStep =  step = -pointStep + 8
        VST1.F32    dY1,[pDst],dstStep
        BGT     grpZeroSetLoop\name
        @// reset pSrc to pDst for the next stage
        SUB     pSrc,pDst,pointStep                     @// pDst -= 2*grpSize
        MOV     pDst,pPingPongBuf
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix2_fs_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","FALSE",fwd
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix2_fs_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","TRUE",inv
        M_END
 	.end
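
The first radix-2 stage above needs no twiddle multiply: each loop iteration combines one input with the element half the array away and writes the sum and the difference with the same spacing. A minimal scalar sketch of that data flow, assuming a Python list of complex values in place of the interleaved float buffer (the name `radix2_first_stage` is hypothetical):

```python
def radix2_first_stage(src):
    """Out-of-place first radix-2 stage with all twiddle factors equal to 1.

    dst[i]          = src[i] + src[i + N/2]
    dst[i + N/2]    = src[i] - src[i + N/2]
    """
    n = len(src)
    half = n // 2
    dst = [0j] * n
    for i in range(half):
        a = src[i]
        b = src[i + half]
        dst[i] = a + b
        dst[i + half] = a - b
    return dst
```

In the assembly the same pattern is expressed through post-indexed loads and stores: `pointStep` advances by half the array and `step` (`-pointStep + 8`) moves back to the next complex element.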
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_ls_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_ls_unsafe_s.S
@ -1,153 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix2_ls_unsafe_s.S
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute the last stage of a Radix 2 DIT in-order out-of-place FFT
@// stage for a N point complex signal.
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define outPointStep    r3
 #define grpCount        r4
 #define dstStep         r5
 #define pTmp            r4
@// Neon Registers
 #define dWr     d0
 #define dWi     d1
 #define dXr0    d2
 #define dXi0    d3
 #define dXr1    d4
 #define dXi1    d5
 #define dYr0    d6
 #define dYi0    d7
 #define dYr1    d8
 #define dYi1    d9
 #define qT0     d10
 #define qT1     d12
        .MACRO FFTSTAGE scaled, inverse, name
        MOV     outPointStep,subFFTSize,LSL #3
        @// Update grpCount and grpSize right away
        MOV     subFFTNum,#1                            @//after the last stage
        LSL     grpCount,subFFTSize,#1
        @// update subFFTSize for the next stage
        MOV     subFFTSize,grpCount
        RSB      dstStep,outPointStep,#16
        @// Loop on 2 grps at a time for the last stage
 radix2lsGrpLoop\name :
        @ dWr = [pTwiddle[0].Re, pTwiddle[1].Re]
        @ dWi = [pTwiddle[0].Im, pTwiddle[1].Im]
        VLD2.F32    {dWr,dWi},[pTwiddle, :64]!
        @ dXr0 = [pSrc[0].Re, pSrc[2].Re]
        @ dXi0 = [pSrc[0].Im, pSrc[2].Im]
        @ dXr1 = [pSrc[1].Re, pSrc[3].Re]
        @ dXi1 = [pSrc[1].Im, pSrc[3].Im]
        VLD4.F32    {dXr0,dXi0,dXr1,dXi1},[pSrc, :128]!
        SUBS    grpCount,grpCount,#4                   @// grpCount is multiplied by 2
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   qT0,dWr,dXr1
            VMLA.F32   qT0,dWi,dXi1                       @// real part
            VMUL.F32   qT1,dWr,dXi1
            VMLS.F32   qT1,dWi,dXr1                       @// imag part
        .else
            VMUL.F32   qT0,dWr,dXr1
            VMLS.F32   qT0,dWi,dXi1                       @// real part
            VMUL.F32   qT1,dWr,dXi1
            VMLA.F32   qT1,dWi,dXr1                       @// imag part
        .endif
        VSUB.F32    dYr0,dXr0,qT0
        VSUB.F32    dYi0,dXi0,qT1
        VADD.F32    dYr1,dXr0,qT0
        VADD.F32    dYi1,dXi0,qT1
        VST2.F32    {dYr0,dYi0},[pDst],outPointStep
        VST2.F32    {dYr1,dYi1},[pDst],dstStep                  @// dstStep =  step = -outPointStep + 16
        BGT     radix2lsGrpLoop\name
        @// Reset and Swap pSrc and pDst for the next stage
        MOV     pTmp,pDst
        SUB     pDst,pSrc,outPointStep,LSL #1       @// pDst -= 4*size; pSrc -= 8*size bytes
        SUB     pSrc,pTmp,outPointStep
        @// Reset pTwiddle for the next stage
        SUB     pTwiddle,pTwiddle,outPointStep      @// pTwiddle -= 4*size bytes
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix2_ls_OutOfPlace_unsafe,r4,""
        FFTSTAGE "FALSE","FALSE",fwd
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix2_ls_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","TRUE",inv
        M_END
 	.end
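
The two `.ifeqs "\inverse", "TRUE"` branches above differ only in the sign pattern of the twiddle multiply: the forward branch multiplies by W, the inverse branch by its conjugate. A minimal scalar sketch of one butterfly, written to mirror the VMUL/VMLA/VMLS sequence (the names `twiddle_mul` and `butterfly` are hypothetical, and `w` is taken as whatever value the twiddle table holds):

```python
def twiddle_mul(w, x, inverse=False):
    """Complex multiply spelled out like the NEON sequence.

    forward: re = wr*xr - wi*xi, im = wr*xi + wi*xr   (w * x)
    inverse: re = wr*xr + wi*xi, im = wr*xi - wi*xr   (conj(w) * x)
    """
    wr, wi = w.real, w.imag
    xr, xi = x.real, x.imag
    if inverse:
        re = wr * xr + wi * xi
        im = wr * xi - wi * xr
    else:
        re = wr * xr - wi * xi
        im = wr * xi + wi * xr
    return complex(re, im)


def butterfly(x0, x1, w, inverse=False):
    """Returns the outputs in store order: (x0 - t, x0 + t)."""
    t = twiddle_mul(w, x1, inverse)
    return x0 - t, x0 + t
```

The difference is stored first in the assembly, which is what one would expect if the table holds negated twiddles, as the comments in the pre-twiddle file state; the sketch does not depend on that assumption.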
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix2_unsafe_s.S
@ -1,191 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix2_unsafe_s.s
@//  to support float instead of SC32.
@//
@// Description:
@// Compute a Radix 2 DIT in-order out-of-place FFT stage for an N point
@// complex signal.  This handles the general stage, not the first or last
@// stage.
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define outPointStep    r3
 #define pointStep       r4
 #define grpCount        r5
 #define setCount        r8
@//const           RN  9
 #define step            r10
 #define dstStep         r11
 #define pTable          r9
 #define pTmp            r9
@// Neon Registers
 #define dW      D0
 #define dX0     D2
 #define dX1     D3
 #define dX2     D4
 #define dX3     D5
 #define dY0     D6
 #define dY1     D7
 #define dY2     D8
 #define dY3     D9
 #define qT0     D10
 #define qT1     D11
        .MACRO FFTSTAGE scaled, inverse, name
        @// Define stack arguments
        @// Update grpCount and grpSize right away in order to reuse pGrpCount
        @// and pGrpSize regs
        LSR     subFFTNum,subFFTNum,#1                      @//grpSize
        LSL     grpCount,subFFTSize,#1
        @// pT0+1 increments pT0 by 8 bytes
        @// pT0+pointStep = increment of 8*pointStep bytes = 4*grpSize bytes
        MOV     pointStep,subFFTNum,LSL #2
        @// update subFFTSize for the next stage
        MOV     subFFTSize,grpCount
        @// pOut0+1 increments pOut0 by 8 bytes
        @// pOut0+outPointStep == increment of 8*outPointStep bytes =
        @//    4*size bytes
        SMULBB  outPointStep,grpCount,pointStep
        LSL     pointStep,pointStep,#1
        RSB      step,pointStep,#16
        RSB      dstStep,outPointStep,#16
        @// Loop on the groups
 radix2GrpLoop\name :
        MOV      setCount,pointStep,LSR #3
        VLD1.F32     dW,[pTwiddle],pointStep                @//[wi | wr]
        @// Loop on the sets
 radix2SetLoop\name :
        @// point0: dX0-real part dX1-imag part
        VLD2.F32    {dX0,dX1},[pSrc],pointStep
        @// point1: dX2-real part dX3-imag part
        VLD2.F32    {dX2,dX3},[pSrc],step
        SUBS    setCount,setCount,#2
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   qT0,dX2,dW[0]
            VMLA.F32   qT0,dX3,dW[1]                       @// real part
            VMUL.F32   qT1,dX3,dW[0]
            VMLS.F32   qT1,dX2,dW[1]                       @// imag part
        .else
            VMUL.F32   qT0,dX2,dW[0]
            VMLS.F32   qT0,dX3,dW[1]                       @// real part
            VMUL.F32   qT1,dX3,dW[0]
            VMLA.F32   qT1,dX2,dW[1]                       @// imag part
        .endif
        VSUB.F32    dY0,dX0,qT0
        VSUB.F32    dY1,dX1,qT1
        VADD.F32    dY2,dX0,qT0
        VADD.F32    dY3,dX1,qT1
        VST2.F32    {dY0,dY1},[pDst],outPointStep
        @// dstStep = -outPointStep + 16
        VST2.F32    {dY2,dY3},[pDst],dstStep
        BGT     radix2SetLoop\name
        SUBS    grpCount,grpCount,#2
        ADD     pSrc,pSrc,pointStep
        BGT     radix2GrpLoop\name
        @// Reset and Swap pSrc and pDst for the next stage
        MOV     pTmp,pDst
        @// pDst -= 4*size; pSrc -= 8*size bytes
        SUB     pDst,pSrc,outPointStep,LSL #1
        SUB     pSrc,pTmp,outPointStep
        @// Reset pTwiddle for the next stage
        @// pTwiddle -= 4*size bytes
        SUB     pTwiddle,pTwiddle,outPointStep
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix2_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","FALSE",FWD
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix2_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","TRUE",INV
        M_END
        .end
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_fs_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_fs_unsafe_s.S
@ -1,251 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix4_fs_unsafe_s.s
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute a first stage Radix 4 FFT stage for a N point complex signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define pPingPongBuf    r5
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define grpSize         r3
@// Reuse grpSize as setCount
 #define setCount        r3
 #define pointStep       r4
 #define outPointStep    r4
 #define setStep         r8
 #define step1           r9
 #define step3           r10
@// Neon Registers
 #define dXr0    D0
 #define dXi0    D1
 #define dXr1    D2
 #define dXi1    D3
 #define dXr2    D4
 #define dXi2    D5
 #define dXr3    D6
 #define dXi3    D7
 #define dYr0    D8
 #define dYi0    D9
 #define dYr1    D10
 #define dYi1    D11
 #define dYr2    D12
 #define dYi2    D13
 #define dYr3    D14
 #define dYi3    D15
 #define qX0     Q0
 #define qX1     Q1
 #define qX2     Q2
 #define qX3     Q3
 #define qY0     Q4
 #define qY1     Q5
 #define qY2     Q6
 #define qY3     Q7
 #define dZr0    D16
 #define dZi0    D17
 #define dZr1    D18
 #define dZi1    D19
 #define dZr2    D20
 #define dZi2    D21
 #define dZr3    D22
 #define dZi3    D23
 #define qZ0     Q8
 #define qZ1     Q9
 #define qZ2     Q10
 #define qZ3     Q11
        .MACRO FFTSTAGE scaled, inverse, name
        @// Define stack arguments
        @// pT0+1 increments pT0 by 8 bytes
        @// pT0+pointStep = increment of 8*pointStep bytes = 2*grpSize bytes
        @// Note: outPointStep = pointStep for the first stage
        MOV     pointStep,subFFTNum,LSL #1
        @// Update pSubFFTSize and pSubFFTNum regs
        VLD2.F32    {dXr0,dXi0},[pSrc, :128],pointStep          @//  data[0]
        @// subFFTSize is 1 during the first stage; update it to 4 for the next stage
        MOV     subFFTSize,#4
        @// Note: setCount = subFFTNum/4 (reuse the grpSize reg for setCount)
        LSR     grpSize,subFFTNum,#2
        VLD2.F32    {dXr1,dXi1},[pSrc, :128],pointStep          @//  data[1]
        MOV     subFFTNum,grpSize
        @// Calculate the step of input data for the next set
        @//MOV     setStep,pointStep,LSL #1
        MOV     setStep,grpSize,LSL #4
        VLD2.F32    {dXr2,dXi2},[pSrc, :128],pointStep          @//  data[2]
        @// setStep = 3*pointStep
        ADD     setStep,setStep,pointStep
        @// setStep = - 3*pointStep+16
        RSB     setStep,setStep,#16
        @//  data[3] & update pSrc for the next set
        VLD2.F32    {dXr3,dXi3},[pSrc, :128],setStep
        @// step1 = 2*pointStep
        MOV     step1,pointStep,LSL #1
        VADD.F32    qY0,qX0,qX2
        @// step3 = -pointStep
        RSB     step3,pointStep,#0
        @// grp = 0 a special case since all the twiddle factors are 1
        @// Loop on the sets : 2 sets at a time
 radix4fsGrpZeroSetLoop\name :
        @// Decrement setcount
        SUBS    setCount,setCount,#2
        @// finish first stage of 4 point FFT
        VSUB.F32    qY2,qX0,qX2
        VLD2.F32    {dXr0,dXi0},[pSrc, :128],step1          @//  data[0]
        VADD.F32    qY1,qX1,qX3
        VLD2.F32    {dXr2,dXi2},[pSrc, :128],step3          @//  data[2]
        VSUB.F32    qY3,qX1,qX3
        @// finish second stage of 4 point FFT
        .ifeqs "\inverse", "TRUE"
            VLD2.F32    {dXr1,dXi1},[pSrc, :128],step1          @//  data[1]
            VADD.F32    qZ0,qY0,qY1
            @//  data[3] & update pSrc for the next set, but not if it's the
            @//  last iteration so that we don't read past the end of the 
            @//  input array.
            BEQ     radix4SkipLastUpdateInv\name
            VLD2.F32    {dXr3,dXi3},[pSrc, :128],setStep
 radix4SkipLastUpdateInv\name:
            VSUB.F32    dZr3,dYr2,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VADD.F32    dZi3,dYi2,dYr3
            VSUB.F32    qZ1,qY0,qY1
            VST2.F32    {dZr3,dZi3},[pDst, :128],outPointStep
            VADD.F32    dZr2,dYr2,dYi3
            VST2.F32    {dZr1,dZi1},[pDst, :128],outPointStep
            VSUB.F32    dZi2,dYi2,dYr3
            VADD.F32    qY0,qX0,qX2                     @// u0 for next iteration
            VST2.F32    {dZr2,dZi2},[pDst, :128],setStep
        .else
            VLD2.F32    {dXr1,dXi1},[pSrc, :128],step1          @//  data[1]
            VADD.F32    qZ0,qY0,qY1
            @//  data[3] & update pSrc for the next set, but not if it's the
            @//  last iteration so that we don't read past the end of the 
            @//  input array.
            BEQ     radix4SkipLastUpdateFwd\name
            VLD2.F32    {dXr3,dXi3},[pSrc, :128],setStep
 radix4SkipLastUpdateFwd\name:
            VADD.F32    dZr2,dYr2,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VSUB.F32    dZi2,dYi2,dYr3
            VSUB.F32    qZ1,qY0,qY1
            VST2.F32    {dZr2,dZi2},[pDst, :128],outPointStep
            VSUB.F32    dZr3,dYr2,dYi3
            VST2.F32    {dZr1,dZi1},[pDst, :128],outPointStep
            VADD.F32    dZi3,dYi2,dYr3
            VADD.F32    qY0,qX0,qX2                     @// u0 for next iteration
            VST2.F32    {dZr3,dZi3},[pDst, :128],setStep
        .endif
        BGT     radix4fsGrpZeroSetLoop\name
        @// reset pSrc to pDst for the next stage
        SUB     pSrc,pDst,pointStep                     @// pDst -= 2*grpSize
        MOV     pDst,pPingPongBuf
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix4_fs_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","FALSE",fwd
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix4_fs_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","TRUE",inv
        M_END
        .end
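
The first radix-4 stage above is a 4-point butterfly with all twiddle factors equal to 1, and the forward and inverse branches differ only in which of the two rotated outputs is stored second and which is stored fourth. A minimal scalar sketch, assuming four complex inputs already gathered from the strided loads (the name `radix4_butterfly` is hypothetical):

```python
def radix4_butterfly(x0, x1, x2, x3, inverse=False):
    """4-point butterfly without twiddles; outputs are returned in store order."""
    # First level: sums and differences of inputs two apart.
    y0 = x0 + x2
    y2 = x0 - x2
    y1 = x1 + x3
    y3 = x1 - x3
    # Second level: combine, rotating y3 by -j or +j.
    z0 = y0 + y1
    z1 = y0 - y1
    z2 = y2 - 1j * y3  # (Yr2 + Yi3) + j*(Yi2 - Yr3)
    z3 = y2 + 1j * y3  # (Yr2 - Yi3) + j*(Yi2 + Yr3)
    if inverse:
        return z0, z3, z1, z2
    return z0, z2, z1, z3
```

For the forward case the returned tuple is the ordinary 4-point DFT of the inputs; for the inverse case it is the unscaled inverse DFT.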
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_ls_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_ls_unsafe_s.S
@ -1,339 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix4_ls_unsafe_s.s
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute a Radix 4 FFT stage for a N point complex signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@// Import symbols required from other files
@// (For example tables)
    @//IMPORT  armAAC_constTable
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define outPointStep    r3
 #define grpCount        r4
 #define dstStep         r5
 #define grpTwStep       r8
 #define stepTwiddle     r9
 #define twStep          r10
 #define pTmp            r4
 #define step16          r11
 #define step24          r12
@// Neon Registers
 #define dButterfly1Real02       D0
 #define dButterfly1Imag02       D1
 #define dButterfly1Real13       D2
 #define dButterfly1Imag13       D3
 #define dButterfly2Real02       D4
 #define dButterfly2Imag02       D5
 #define dButterfly2Real13       D6
 #define dButterfly2Imag13       D7
 #define dXr0                    D0
 #define dXi0                    D1
 #define dXr1                    D2
 #define dXi1                    D3
 #define dXr2                    D4
 #define dXi2                    D5
 #define dXr3                    D6
 #define dXi3                    D7
 #define dYr0                    D16
 #define dYi0                    D17
 #define dYr1                    D18
 #define dYi1                    D19
 #define dYr2                    D20
 #define dYi2                    D21
 #define dYr3                    D22
 #define dYi3                    D23
 #define dW1r                    D8
 #define dW1i                    D9
 #define dW2r                    D10
 #define dW2i                    D11
 #define dW3r                    D12
 #define dW3i                    D13
 #define qT0                     d14
 #define qT1                     d16
 #define qT2                     d18
 #define qT3                     d20
 #define qT4                     d22
 #define qT5                     d24
 #define dZr0                    D14
 #define dZi0                    D15
 #define dZr1                    D26
 #define dZi1                    D27
 #define dZr2                    D28
 #define dZi2                    D29
 #define dZr3                    D30
 #define dZi3                    D31
 #define qX0                     Q0
 #define qY0                     Q8
 #define qY1                     Q9
 #define qY2                     Q10
 #define qY3                     Q11
 #define qZ0                     Q7
 #define qZ1                     Q13
 #define qZ2                     Q14
 #define qZ3                     Q15
        .MACRO FFTSTAGE scaled, inverse , name
        @// Define stack arguments
        @// pOut0+1 increments pOut0 by 8 bytes
        @// pOut0+outPointStep == increment of 8*outPointStep bytes
        MOV     outPointStep,subFFTSize,LSL #3
        @// Update grpCount and grpSize right away
        VLD2.F32    {dW1r,dW1i},[pTwiddle, :128]             @// [wi|wr]
        MOV     step16,#16
        LSL     grpCount,subFFTSize,#2
        VLD1.F32    dW2r,[pTwiddle, :64]                     @// [wi|wr]
        MOV     subFFTNum,#1                            @//after the last stage
        VLD1.F32    dW3r,[pTwiddle, :64],step16              @// [wi|wr]
        MOV     stepTwiddle,#0
        VLD1.F32    dW2i,[pTwiddle, :64]!                    @// [wi|wr]
        SUB     grpTwStep,stepTwiddle,#8                @// grpTwStep = -8 to start with
        @// update subFFTSize for the next stage
        MOV     subFFTSize,grpCount
        VLD1.F32    dW3i,[pTwiddle, :64],grpTwStep           @// [wi|wr]
        MOV     dstStep,outPointStep,LSL #1
        @// AC.r AC.i BD.r BD.i
        VLD4.F32     {dButterfly1Real02,dButterfly1Imag02,dButterfly1Real13,dButterfly1Imag13},[pSrc, :256]!
        ADD     dstStep,dstStep,outPointStep            @// dstStep = 3*outPointStep
        RSB     dstStep,dstStep,#16                     @// dstStep = - 3*outPointStep+16
        MOV     step24,#24
        @// AC.r AC.i BD.r BD.i
        VLD4.F32     {dButterfly2Real02,dButterfly2Imag02,dButterfly2Real13,dButterfly2Imag13},[pSrc, :256]!
        @// Process two groups at a time
 radix4lsGrpLoop\name :
        VZIP.F32    dW2r,dW2i
        ADD     stepTwiddle,stepTwiddle,#16
        VZIP.F32    dW3r,dW3i
        ADD     grpTwStep,stepTwiddle,#4
        VUZP.F32     dButterfly1Real13, dButterfly2Real13   @// B.r D.r
        SUB     twStep,stepTwiddle,#16                  @// -16+stepTwiddle
        VUZP.F32     dButterfly1Imag13, dButterfly2Imag13   @// B.i D.i
        MOV     grpTwStep,grpTwStep,LSL #1
        VUZP.F32     dButterfly1Real02, dButterfly2Real02   @// A.r C.r
        RSB     grpTwStep,grpTwStep,#0                  @// -8-2*stepTwiddle
        VUZP.F32     dButterfly1Imag02, dButterfly2Imag02   @// A.i C.i
        @// grpCount is multiplied by 4
        SUBS    grpCount,grpCount,#8
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   dZr1,dW1r,dXr1
            VMLA.F32   dZr1,dW1i,dXi1                       @// real part
            VMUL.F32   dZi1,dW1r,dXi1
            VMLS.F32   dZi1,dW1i,dXr1                       @// imag part
        .else
            VMUL.F32   dZr1,dW1r,dXr1
            VMLS.F32   dZr1,dW1i,dXi1                       @// real part
            VMUL.F32   dZi1,dW1r,dXi1
            VMLA.F32   dZi1,dW1i,dXr1                       @// imag part
        .endif
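        @// Sketch of the twiddle multiply above in C-like pseudocode, assuming
        @// x = dXr1 + j*dXi1 and w = dW1r + j*dW1i (x, w and z are illustrative
        @// names, not symbols defined in this file):
        @//   forward:  z.re = w.re*x.re - w.im*x.im
        @//             z.im = w.re*x.im + w.im*x.re      (z = x * w)
        @//   inverse:  z.re = w.re*x.re + w.im*x.im
        @//             z.im = w.re*x.im - w.im*x.re      (z = x * conj(w))
        @// The same pattern is applied to (dW2, dX2) and (dW3, dX3).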
        VLD2.F32    {dW1r,dW1i},[pTwiddle, :128],stepTwiddle      @// [wi|wr]
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   dZr2,dW2r,dXr2
            VMLA.F32   dZr2,dW2i,dXi2                       @// real part
            VMUL.F32   dZi2,dW2r,dXi2
            VLD1.F32   dW2r,[pTwiddle, :64],step16           @// [wi|wr]
            VMLS.F32   dZi2,dW2i,dXr2                       @// imag part
        .else
            VMUL.F32   dZr2,dW2r,dXr2
            VMLS.F32   dZr2,dW2i,dXi2                       @// real part
            VMUL.F32   dZi2,dW2r,dXi2
            VLD1.F32    dW2r,[pTwiddle, :64],step16          @// [wi|wr]
            VMLA.F32   dZi2,dW2i,dXr2                       @// imag part
        .endif
        VLD1.F32    dW2i,[pTwiddle, :64],twStep              @// [wi|wr]
        @// move qX0 so as to load for the next iteration
        VMOV     qZ0,qX0
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   dZr3,dW3r,dXr3
            VMLA.F32   dZr3,dW3i,dXi3                       @// real part
            VMUL.F32   dZi3,dW3r,dXi3
            VLD1.F32    dW3r,[pTwiddle, :64],step24
            VMLS.F32   dZi3,dW3i,dXr3                       @// imag part
        .else
            VMUL.F32   dZr3,dW3r,dXr3
            VMLS.F32   dZr3,dW3i,dXi3                       @// real part
            VMUL.F32   dZi3,dW3r,dXi3
            VLD1.F32    dW3r,[pTwiddle, :64],step24
            VMLA.F32   dZi3,dW3i,dXr3                       @// imag part
        .endif
        VLD1.F32    dW3i,[pTwiddle, :64],grpTwStep           @// [wi|wr]
        @// Don't do the load on the last iteration so we don't read past the end
        @// of pSrc.
        addeq   pSrc, pSrc, #64
        beq     radix4lsSkipRead\name
        @// AC.r AC.i BD.r BD.i
        VLD4.F32     {dButterfly1Real02,dButterfly1Imag02,dButterfly1Real13,dButterfly1Imag13},[pSrc, :256]!
        @// AC.r AC.i BD.r BD.i
        VLD4.F32     {dButterfly2Real02,dButterfly2Imag02,dButterfly2Real13,dButterfly2Imag13},[pSrc, :256]!
 radix4lsSkipRead\name:
        @// finish first stage of 4 point FFT
        VADD.F32    qY0,qZ0,qZ2
        VSUB.F32    qY2,qZ0,qZ2
        VADD.F32    qY1,qZ1,qZ3
        VSUB.F32    qY3,qZ1,qZ3
        @// finish second stage of 4 point FFT
        .ifeqs  "\inverse", "TRUE"
            VSUB.F32    qZ0,qY2,qY1
            VADD.F32    dZr3,dYr0,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VSUB.F32    dZi3,dYi0,dYr3
            VADD.F32    qZ2,qY2,qY1
            VST2.F32    {dZr3,dZi3},[pDst, :128],outPointStep
            VSUB.F32    dZr1,dYr0,dYi3
            VST2.F32    {dZr2,dZi2},[pDst, :128],outPointStep
            VADD.F32    dZi1,dYi0,dYr3
            @// dstStep = -3*outPointStep + 16
            VST2.F32    {dZr1,dZi1},[pDst, :128],dstStep
        .else
            VSUB.F32    qZ0,qY2,qY1
            VSUB.F32    dZr1,dYr0,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VADD.F32    dZi1,dYi0,dYr3
            VADD.F32    qZ2,qY2,qY1
            VST2.F32    {dZr1,dZi1},[pDst, :128],outPointStep
            VADD.F32    dZr3,dYr0,dYi3
            VST2.F32    {dZr2,dZi2},[pDst, :128],outPointStep
            VSUB.F32    dZi3,dYi0,dYr3
            @// dstStep = -3*outPointStep + 16
            VST2.F32    {dZr3,dZi3},[pDst, :128],dstStep
        .endif
        BGT     radix4lsGrpLoop\name
        @// Reset and Swap pSrc and pDst for the next stage
        MOV     pTmp,pDst
        @// Extra increment done in final iteration of the loop
        SUB     pSrc,pSrc,#64
        @// pDst -= 4*size; pSrc -= 8*size bytes
        SUB     pDst,pSrc,outPointStep,LSL #2
        SUB     pSrc,pTmp,outPointStep
        SUB     pTwiddle,pTwiddle,subFFTSize,LSL #1
        @// Extra increment done in final iteration of the loop
        SUB     pTwiddle,pTwiddle,#16
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix4_ls_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","FALSE",fwd
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix4_ls_OutOfPlace_unsafe,r4
        FFTSTAGE "FALSE","TRUE",inv
        M_END
        .end
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix4_unsafe_s.S
@ -1,331 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix4_unsafe_s.s
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute a Radix 4 FFT stage for an N point complex signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@// Guarding implementation by the processor name
@// Import symbols required from other files
@// (For example tables)
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define subFFTNum       r6
 #define subFFTSize      r7
@//Output Registers
@//Local Scratch Registers
 #define grpCount        r3
 #define pointStep       r4
 #define outPointStep    r5
 #define stepTwiddle     r12
 #define setCount        r14
 #define srcStep         r8
 #define setStep         r9
 #define dstStep         r10
 #define twStep          r11
 #define t1              r3
@// Neon Registers
 #define dW1     D0
 #define dW2     D1
 #define dW3     D2
 #define dXr0    D4
 #define dXi0    D5
 #define dXr1    D6
 #define dXi1    D7
 #define dXr2    D8
 #define dXi2    D9
 #define dXr3    D10
 #define dXi3    D11
 #define dYr0    D12
 #define dYi0    D13
 #define dYr1    D14
 #define dYi1    D15
 #define dYr2    D16
 #define dYi2    D17
 #define dYr3    D18
 #define dYi3    D19
 #define qT0     d16
 #define qT1     d18
 #define qT2     d12
 #define qT3     d14
 #define dZr0    D20
 #define dZi0    D21
 #define dZr1    D22
 #define dZi1    D23
 #define dZr2    D24
 #define dZi2    D25
 #define dZr3    D26
 #define dZi3    D27
 #define qY0     Q6
 #define qY1     Q7
 #define qY2     Q8
 #define qY3     Q9
 #define qX0     Q2
 #define qZ0     Q10
 #define qZ1     Q11
 #define qZ2     Q12
 #define qZ3     Q13
        .MACRO FFTSTAGE scaled, inverse , name
        @// Define stack arguments
        @// Update grpCount and grpSize right away in order to reuse
        @// pGrpCount and pGrpSize regs
        LSL     grpCount,subFFTSize,#2
        LSR     subFFTNum,subFFTNum,#2
        MOV     subFFTSize,grpCount
        VLD1.F32     dW1,[pTwiddle]                    @//[wi | wr]
        @// pT0+1 increments pT0 by 8 bytes
        @// pT0+pointStep = increment of 8*pointStep bytes = 2*grpSize bytes
        MOV     pointStep,subFFTNum,LSL #1
        @// pOut0+1 increments pOut0 by 8 bytes
        @// pOut0+outPointStep == increment of 8*outPointStep bytes
        @//   = 2*size bytes
        MOV     stepTwiddle,#0
        VLD1.F32     dW2,[pTwiddle]                    @//[wi | wr]
        SMULBB  outPointStep,grpCount,pointStep
        LSL     pointStep,pointStep,#2             @// 2*grpSize
        VLD1.F32     dW3,[pTwiddle]                    @//[wi | wr]
        MOV     srcStep,pointStep,LSL #1           @// srcStep = 2*pointStep
        ADD     setStep,srcStep,pointStep          @// setStep = 3*pointStep
        RSB     setStep,setStep,#0                 @// setStep = - 3*pointStep
        SUB     srcStep,srcStep,#16                @// srcStep = 2*pointStep-16
        MOV     dstStep,outPointStep,LSL #1
        ADD     dstStep,dstStep,outPointStep       @// dstStep = 3*outPointStep
        @// dstStep = - 3*outPointStep+16
        RSB     dstStep,dstStep,#16
 radix4GrpLoop\name :
        VLD2.F32    {dXr0,dXi0},[pSrc],pointStep       @//  data[0]
        ADD      stepTwiddle,stepTwiddle,pointStep
        VLD2.F32    {dXr1,dXi1},[pSrc],pointStep       @//  data[1]
        @// set pTwiddle to the first point
        ADD      pTwiddle,pTwiddle,stepTwiddle
        VLD2.F32    {dXr2,dXi2},[pSrc],pointStep       @//  data[2]
        MOV      twStep,stepTwiddle,LSL #2
        @//  data[3] & update pSrc for the next set
        VLD2.F32    {dXr3,dXi3},[pSrc],setStep
        SUB      twStep,stepTwiddle,twStep         @// twStep = -3*stepTwiddle
        MOV      setCount,pointStep,LSR #3
        @// set pSrc to data[0] of the next set
        ADD     pSrc,pSrc,#16
        @// increment to data[1] of the next set
        ADD     pSrc,pSrc,pointStep
        @// Loop on the sets
 radix4SetLoop\name :
        .ifeqs  "\inverse", "TRUE"
            VMUL.F32   dZr1,dXr1,dW1[0]
            VMUL.F32   dZi1,dXi1,dW1[0]
            VMUL.F32   dZr2,dXr2,dW2[0]
            VMUL.F32   dZi2,dXi2,dW2[0]
            VMUL.F32   dZr3,dXr3,dW3[0]
            VMUL.F32   dZi3,dXi3,dW3[0]
            VMLA.F32   dZr1,dXi1,dW1[1]                @// real part
            VMLS.F32   dZi1,dXr1,dW1[1]                @// imag part
            @//  data[1] for next iteration
            VLD2.F32    {dXr1,dXi1},[pSrc],pointStep
            VMLA.F32   dZr2,dXi2,dW2[1]                @// real part
            VMLS.F32   dZi2,dXr2,dW2[1]                @// imag part
            @//  data[2] for next iteration
            VLD2.F32    {dXr2,dXi2},[pSrc],pointStep
            VMLA.F32   dZr3,dXi3,dW3[1]                @// real part
            VMLS.F32   dZi3,dXr3,dW3[1]                @// imag part
        .else
            VMUL.F32   dZr1,dXr1,dW1[0]
            VMUL.F32   dZi1,dXi1,dW1[0]
            VMUL.F32   dZr2,dXr2,dW2[0]
            VMUL.F32   dZi2,dXi2,dW2[0]
            VMUL.F32   dZr3,dXr3,dW3[0]
            VMUL.F32   dZi3,dXi3,dW3[0]
            VMLS.F32   dZr1,dXi1,dW1[1]                @// real part
            VMLA.F32   dZi1,dXr1,dW1[1]                @// imag part
            @//  data[1] for next iteration
            VLD2.F32    {dXr1,dXi1},[pSrc],pointStep
            VMLS.F32   dZr2,dXi2,dW2[1]                @// real part
            VMLA.F32   dZi2,dXr2,dW2[1]                @// imag part
            @//  data[2] for next iteration
            VLD2.F32    {dXr2,dXi2},[pSrc],pointStep
            VMLS.F32   dZr3,dXi3,dW3[1]                @// real part
            VMLA.F32   dZi3,dXr3,dW3[1]                @// imag part
        .endif
        @//  data[3] & update pSrc to data[0]
        @// But don't read on the very last iteration because that reads past
        @// the end of pSrc. The last iteration is grpCount = 4, setCount = 2.
        cmp     grpCount, #4
        cmpeq   setCount, #2                      @// Test setCount if grpCount = 4
        @// These are executed only if both grpCount = 4 and setCount = 2       
        addeq   pSrc, pSrc, setStep
        beq     radix4SkipRead\name
        VLD2.F32    {dXr3,dXi3},[pSrc],setStep
 radix4SkipRead\name:
        SUBS    setCount,setCount,#2
        @// finish first stage of 4 point FFT
        VADD.F32    qY0,qX0,qZ2
        VSUB.F32    qY2,qX0,qZ2
        @//  data[0] for next iteration
        VLD2.F32    {dXr0,dXi0},[pSrc, :128]!
        VADD.F32    qY1,qZ1,qZ3
        VSUB.F32    qY3,qZ1,qZ3
        @// finish second stage of 4 point FFT
        VSUB.F32    qZ0,qY2,qY1
        .ifeqs  "\inverse", "TRUE"
            VADD.F32    dZr3,dYr0,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VSUB.F32    dZi3,dYi0,dYr3
            VADD.F32    qZ2,qY2,qY1
            VST2.F32    {dZr3,dZi3},[pDst, :128],outPointStep
            VSUB.F32    dZr1,dYr0,dYi3
            VST2.F32    {dZr2,dZi2},[pDst, :128],outPointStep
            VADD.F32    dZi1,dYi0,dYr3
            VST2.F32    {dZr1,dZi1},[pDst, :128],dstStep
        .else
            VSUB.F32    dZr1,dYr0,dYi3
            VST2.F32    {dZr0,dZi0},[pDst, :128],outPointStep
            VADD.F32    dZi1,dYi0,dYr3
            VADD.F32    qZ2,qY2,qY1
            VST2.F32    {dZr1,dZi1},[pDst, :128],outPointStep
            VADD.F32    dZr3,dYr0,dYi3
            VST2.F32    {dZr2,dZi2},[pDst, :128],outPointStep
            VSUB.F32    dZi3,dYi0,dYr3
            VST2.F32    {dZr3,dZi3},[pDst, :128],dstStep
        .endif
        @// increment to data[1] of the next set
        ADD     pSrc,pSrc,pointStep
        BGT     radix4SetLoop\name
        VLD1.F32     dW1,[pTwiddle, :64],stepTwiddle    @//[wi | wr]
        @// subtract 4 since grpCount multiplied by 4
        SUBS    grpCount,grpCount,#4
        VLD1.F32     dW2,[pTwiddle, :64],stepTwiddle    @//[wi | wr]
        @// increment pSrc for the next grp
        ADD     pSrc,pSrc,srcStep
        VLD1.F32     dW3,[pTwiddle, :64],twStep         @//[wi | wr]
        BGT     radix4GrpLoop\name
        @// Reset and Swap pSrc and pDst for the next stage
        MOV     t1,pDst
        @// pDst -= 2*size; pSrc -= 8*size bytes
        SUB     pDst,pSrc,outPointStep,LSL #2
        SUB     pSrc,t1,outPointStep
        .endm
        M_START armSP_FFTFwd_CToC_FC32_Radix4_OutOfPlace_unsafe,r4
            FFTSTAGE "FALSE","FALSE",FWD
        M_END
        M_START armSP_FFTInv_CToC_FC32_Radix4_OutOfPlace_unsafe,r4
            FFTSTAGE "FALSE","TRUE",INV
        M_END
        .end
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix8_fs_unsafe_s.S
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_CToC_FC32_Radix8_fs_unsafe_s.S
@ -1,422 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of armSP_FFT_CToC_SC32_Radix8_fs_unsafe_s.s
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute the first Radix 8 FFT stage for an N point complex signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
@// Guarding implementation by the processor name
@//Input Registers
 #define pSrc            r0
 #define pDst            r2
 #define pTwiddle        r1
 #define subFFTNum       r6
 #define subFFTSize      r7
@// dest buffer for the next stage (not pSrc for first stage)
 #define pPingPongBuf    r5
@//Output Registers
@//Local Scratch Registers
 #define grpSize         r3
@// Reuse grpSize as setCount
 #define setCount        r3
 #define pointStep       r4
 #define outPointStep    r4
 #define setStep         r8
 #define step1           r9
 #define step2           r10
 #define t0              r11
@// Neon Registers
 #define dXr0    D0
 #define dXi0    D1
 #define dXr1    D2
 #define dXi1    D3
 #define dXr2    D4
 #define dXi2    D5
 #define dXr3    D6
 #define dXi3    D7
 #define dXr4    D8
 #define dXi4    D9
 #define dXr5    D10
 #define dXi5    D11
 #define dXr6    D12
 #define dXi6    D13
 #define dXr7    D14
 #define dXi7    D15
 #define qX0     Q0
 #define qX1     Q1
 #define qX2     Q2
 #define qX3     Q3
 #define qX4     Q4
 #define qX5     Q5
 #define qX6     Q6
 #define qX7     Q7
 #define dUr0    D16
 #define dUi0    D17
 #define dUr2    D18
 #define dUi2    D19
 #define dUr4    D20
 #define dUi4    D21
 #define dUr6    D22
 #define dUi6    D23
 #define dUr1    D24
 #define dUi1    D25
 #define dUr3    D26
 #define dUi3    D27
 #define dUr5    D28
 #define dUi5    D29
@// reuse dXr7 and dXi7
 #define dUr7    D30
 #define dUi7    D31
 #define qU0     Q8
 #define qU1     Q12
 #define qU2     Q9
 #define qU3     Q13
 #define qU4     Q10
 #define qU5     Q14
 #define qU6     Q11
 #define qU7     Q15
 #define dVr0    D24
 #define dVi0    D25
 #define dVr2    D26
 #define dVi2    D27
 #define dVr4    D28
 #define dVi4    D29
 #define dVr6    D30
 #define dVi6    D31
 #define dVr1    D16
 #define dVi1    D17
 #define dVr3    D18
 #define dVi3    D19
 #define dVr5    D20
 #define dVi5    D21
 #define dVr7    D22
 #define dVi7    D23
 #define qV0     Q12
 #define qV1     Q8
 #define qV2     Q13
 #define qV3     Q9
 #define qV4     Q14
 #define qV5     Q10
 #define qV6     Q15
 #define qV7     Q11
 #define dYr0    D16
 #define dYi0    D17
 #define dYr2    D18
 #define dYi2    D19
 #define dYr4    D20
 #define dYi4    D21
 #define dYr6    D22
 #define dYi6    D23
 #define dYr1    D24
 #define dYi1    D25
 #define dYr3    D26
 #define dYi3    D27
 #define dYr5    D28
 #define dYi5    D29
 #define dYr7    D30
 #define dYi7    D31
 #define qY0     Q8
 #define qY1     Q12
 #define qY2     Q9
 #define qY3     Q13
 #define qY4     Q10
 #define qY5     Q14
 #define qY6     Q11
 #define qY7     Q15
 #define dT0     D14
 #define dT1     D15
        .MACRO FFTSTAGE scaled, inverse, name
        @// Define stack arguments
        @// Update pSubFFTSize and pSubFFTNum regs
        @// subFFTSize = 1 on entry to the first stage, so it is 8 afterwards
        MOV     subFFTSize,#8
        ADR     t0,ONEBYSQRT2\name
        @// Note: setCount = subFFTNum/8 (reuse the grpSize reg for setCount)
        LSR     grpSize,subFFTNum,#3
        MOV     subFFTNum,grpSize
        @// pT0+1 increments pT0 by 8 bytes
        @// pT0+pointStep = increment of 8*pointStep bytes = grpSize bytes
        @// Note: outPointStep = pointStep for firststage
        MOV     pointStep,grpSize,LSL #3
        @// Calculate the step of input data for the next set
        @//MOV     step1,pointStep,LSL #1             @// step1 = 2*pointStep
        VLD2.F32    {dXr0,dXi0},[pSrc, :128],pointStep     @//  data[0]
        MOV     step1,grpSize,LSL #4
        MOV     step2,pointStep,LSL #3
        VLD2.F32    {dXr1,dXi1},[pSrc, :128],pointStep     @//  data[1]
        SUB     step2,step2,pointStep                 @// step2 = 7*pointStep
        @// setStep = - 7*pointStep+16
        RSB     setStep,step2,#16
        VLD2.F32    {dXr2,dXi2},[pSrc, :128],pointStep     @//  data[2]
        VLD2.F32    {dXr3,dXi3},[pSrc, :128],pointStep     @//  data[3]
        VLD2.F32    {dXr4,dXi4},[pSrc, :128],pointStep     @//  data[4]
        VLD2.F32    {dXr5,dXi5},[pSrc, :128],pointStep     @//  data[5]
        VLD2.F32    {dXr6,dXi6},[pSrc, :128],pointStep     @//  data[6]
        @//  data[7] & update pSrc for the next set
        @//  setStep = -7*pointStep + 16
        VLD2.F32    {dXr7,dXi7},[pSrc, :128],setStep
        @// grp = 0 a special case since all the twiddle factors are 1
        @// Loop on the sets
 radix8fsGrpZeroSetLoop\name :
        @// Decrement setCount
        SUBS    setCount,setCount,#2
        @// finish first stage of 8 point FFT
        VADD.F32    qU0,qX0,qX4
        VADD.F32    qU2,qX1,qX5
        VADD.F32    qU4,qX2,qX6
        VADD.F32    qU6,qX3,qX7
        @// finish second stage of 8 point FFT
        VADD.F32    qV0,qU0,qU4
        VSUB.F32    qV2,qU0,qU4
        VADD.F32    qV4,qU2,qU6
        VSUB.F32    qV6,qU2,qU6
        @// finish third stage of 8 point FFT
        VADD.F32    qY0,qV0,qV4
        VSUB.F32    qY4,qV0,qV4
        VST2.F32    {dYr0,dYi0},[pDst, :128],step1         @// store y0
        .ifeqs  "\inverse", "TRUE"
            VSUB.F32    dYr2,dVr2,dVi6
            VADD.F32    dYi2,dVi2,dVr6
            VADD.F32    dYr6,dVr2,dVi6
            VST2.F32    {dYr2,dYi2},[pDst, :128],step1     @// store y2
            VSUB.F32    dYi6,dVi2,dVr6
            VSUB.F32    qU1,qX0,qX4
            VST2.F32    {dYr4,dYi4},[pDst, :128],step1     @// store y4
            VSUB.F32    qU3,qX1,qX5
            VSUB.F32    qU5,qX2,qX6
            VST2.F32    {dYr6,dYi6},[pDst, :128],step1     @// store y6
        .ELSE
            VADD.F32    dYr6,dVr2,dVi6
            VSUB.F32    dYi6,dVi2,dVr6
            VSUB.F32    dYr2,dVr2,dVi6
            VST2.F32    {dYr6,dYi6},[pDst, :128],step1     @// store y2
            VADD.F32    dYi2,dVi2,dVr6
            VSUB.F32    qU1,qX0,qX4
            VST2.F32    {dYr4,dYi4},[pDst, :128],step1     @// store y4
            VSUB.F32    qU3,qX1,qX5
            VSUB.F32    qU5,qX2,qX6
            VST2.F32    {dYr2,dYi2},[pDst, :128],step1     @// store y6
        .ENDIF
        @// finish first stage of 8 point FFT
        VSUB.F32    qU7,qX3,qX7
        VLD1.F32    dT0[0], [t0]
        @// finish second stage of 8 point FFT
        VSUB.F32    dVr1,dUr1,dUi5
        @//  data[0] for next iteration
        VLD2.F32    {dXr0,dXi0},[pSrc, :128],pointStep
        VADD.F32    dVi1,dUi1,dUr5
        VADD.F32    dVr3,dUr1,dUi5
        VLD2.F32    {dXr1,dXi1},[pSrc, :128],pointStep     @//  data[1]
        VSUB.F32    dVi3,dUi1,dUr5
        VSUB.F32    dVr5,dUr3,dUi7
        VLD2.F32    {dXr2,dXi2},[pSrc, :128],pointStep     @//  data[2]
        VADD.F32    dVi5,dUi3,dUr7
        VADD.F32    dVr7,dUr3,dUi7
        VLD2.F32    {dXr3,dXi3},[pSrc, :128],pointStep     @//  data[3]
        VSUB.F32    dVi7,dUi3,dUr7
        @// finish third stage of 8 point FFT
        .ifeqs  "\inverse", "TRUE"
            @// calculate a*v5
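            @// Sketch of the two constant multiplies in this branch, assuming
            @// c = 1/sqrt(2) (loaded into dT0[0] from ONEBYSQRT2) and writing
            @// v5 = dVr5 + j*dVi5, v7 = dVr7 + j*dVi7 (illustrative notation):
            @//   a*v5 = c*(v5.re - v5.im) + j*c*(v5.re + v5.im)   i.e. a = c*(1 + j)
            @//   b*v7 = c*(v7.re + v7.im) + j*c*(v7.im - v7.re)   i.e. b = c*(1 - j)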
            VMUL.F32    dT1,dVr5,dT0[0]                   @// use dVi0 for dT1
            VLD2.F32    {dXr4,dXi4},[pSrc, :128],pointStep @//  data[4]
            VMUL.F32    dVi5,dVi5,dT0[0]
            VLD2.F32    {dXr5,dXi5},[pSrc, :128],pointStep @//  data[5]
            VSUB.F32    dVr5,dT1,dVi5                     @// a * V5
            VADD.F32    dVi5,dT1,dVi5
            VLD2.F32    {dXr6,dXi6},[pSrc, :128],pointStep @//  data[6]
            @// calculate  b*v7
            VMUL.F32    dT1,dVr7,dT0[0]
            VMUL.F32    dVi7,dVi7,dT0[0]
            VADD.F32    qY1,qV1,qV5
            VSUB.F32    qY5,qV1,qV5
            VADD.F32    dVr7,dT1,dVi7                     @// b * V7
            VSUB.F32    dVi7,dVi7,dT1
            SUB     pDst, pDst, step2                 @// set pDst to y1
            @// On the last iteration, this will read past the end of pSrc,
            @// so skip this read.
            BEQ     radix8SkipLastUpdateInv\name
            VLD2.F32    {dXr7,dXi7},[pSrc, :128],setStep   @//  data[7]
 radix8SkipLastUpdateInv\name:
            VSUB.F32    dYr3,dVr3,dVr7
            VSUB.F32    dYi3,dVi3,dVi7
            VST2.F32    {dYr1,dYi1},[pDst, :128],step1     @// store y1
            VADD.F32    dYr7,dVr3,dVr7
            VADD.F32    dYi7,dVi3,dVi7
            VST2.F32    {dYr3,dYi3},[pDst, :128],step1     @// store y3
            VST2.F32    {dYr5,dYi5},[pDst, :128],step1     @// store y5
            VST2.F32    {dYr7,dYi7},[pDst, :128]           @// store y7
            ADD pDst, pDst, #16
        .ELSE
            @// calculate  b*v7
            VMUL.F32    dT1,dVr7,dT0[0]
            VLD2.F32    {dXr4,dXi4},[pSrc, :128],pointStep @//  data[4]
            VMUL.F32    dVi7,dVi7,dT0[0]
            VLD2.F32    {dXr5,dXi5},[pSrc, :128],pointStep @//  data[5]
            VADD.F32    dVr7,dT1,dVi7                     @// b * V7
            VSUB.F32    dVi7,dVi7,dT1
            VLD2.F32    {dXr6,dXi6},[pSrc, :128],pointStep @//  data[6]
            @// calculate a*v5
            VMUL.F32    dT1,dVr5,dT0[0]                   @// use dVi0 for dT1
            VMUL.F32    dVi5,dVi5,dT0[0]
            VADD.F32    dYr7,dVr3,dVr7
            VADD.F32    dYi7,dVi3,dVi7
            SUB     pDst, pDst, step2                 @// set pDst to y1
            VSUB.F32    dVr5,dT1,dVi5                     @// a * V5
            VADD.F32    dVi5,dT1,dVi5
            @// On the last iteration, this will read past the end of pSrc,
            @// so skip this read.
            BEQ     radix8SkipLastUpdateFwd\name
            VLD2.F32    {dXr7,dXi7},[pSrc, :128],setStep   @//  data[7]
 radix8SkipLastUpdateFwd\name:
            VSUB.F32    qY5,qV1,qV5
            VSUB.F32    dYr3,dVr3,dVr7
            VST2.F32    {dYr7,dYi7},[pDst, :128],step1     @// store y1
            VSUB.F32    dYi3,dVi3,dVi7
            VADD.F32    qY1,qV1,qV5
            VST2.F32    {dYr5,dYi5},[pDst, :128],step1     @// store y3
            VST2.F32    {dYr3,dYi3},[pDst, :128],step1     @// store y5
            VST2.F32    {dYr1,dYi1},[pDst, :128]!          @// store y7
        .ENDIF
        @// update pDst for the next set
        SUB     pDst, pDst, step2
        BGT     radix8fsGrpZeroSetLoop\name
        @// reset pSrc to pDst for the next stage
        SUB     pSrc,pDst,pointStep                   @// pDst -= 2*grpSize
        MOV     pDst,pPingPongBuf
        .endm
        @// Allocate stack memory required by the function
        M_START armSP_FFTFwd_CToC_FC32_Radix8_fs_OutOfPlace_unsafe,r4
            FFTSTAGE "FALSE","FALSE",FWD
        M_END
 ONEBYSQRT2FWD:     .float  0.7071067811865476e0
        M_START armSP_FFTInv_CToC_FC32_Radix8_fs_OutOfPlace_unsafe,r4
            FFTSTAGE "FALSE","TRUE",INV
        M_END
 ONEBYSQRT2INV:     .float  0.7071067811865476e0
        .end
--- a/media/openmax_dl/dl/sp/src/armSP_FFT_F32TwiddleTable.c
+++ b/media/openmax_dl/dl/sp/src/armSP_FFT_F32TwiddleTable.c
--- a/media/openmax_dl/dl/sp/src/omxSP_FFTFwd_RToCCS_F32_Sfs_s.S
+++ b/media/openmax_dl/dl/sp/src/omxSP_FFTFwd_RToCCS_F32_Sfs_s.S
@ -1,404 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of omxSP_FFTFwd_RToCCS_S32_Sfs_s.s
@//  to support float instead of SC32.
@//
@//
@// Description:
@// Compute FFT for a real signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
        .extern  armSP_FFTFwd_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix4_fs_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix8_fs_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix4_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix4_fs_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix8_fs_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix2_OutOfPlace_unsafe
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
    @// Guarding implementation by the processor name
@// Import symbols required from other files
@// (For example tables)
        .extern  armSP_FFTFwd_CToC_FC32_Radix4_ls_OutOfPlace_unsafe
        .extern  armSP_FFTFwd_CToC_FC32_Radix2_ls_OutOfPlace_unsafe
@//Input Registers
 #define pSrc            r0
 #define pDst            r1
 #define pFFTSpec        r2
 #define scale           r3
@// Output registers
 #define result          r0
@//Local Scratch Registers
 #define argTwiddle      r1
 #define argDst          r2
 #define argScale        r4
 #define tmpOrder        r4
 #define pTwiddle        r4
 #define pOut            r5
 #define subFFTSize      r7
 #define subFFTNum       r6
 #define N               r6
 #define order           r14
 #define diff            r9
@// Total number of radix stages required to complete the FFT
 #define count           r8
 #define x0r             r4
 #define x0i             r5
 #define diffMinusOne    r2
 #define subFFTSizeTmp   r6
 #define step            r3
 #define step1           r4
 #define twStep          r8
 #define zero            r9
 #define pTwiddleTmp     r5
 #define t0              r10
@// Neon registers
 #define dX0       d0
 #define dzero     d1
 #define dZero     d2
 #define dShift    d3
 #define dX0r      d2
 #define dX0i      d3
 #define dX1r      d4
 #define dX1i      d5
 #define dT0       d6
 #define dT1       d7
 #define dT2       d8
 #define dT3       d9
 #define qT0       d10
 #define qT1       d12
 #define dW0r      d14
 #define dW0i      d15
 #define dW1r      d16
 #define dW1i      d17
 #define dY0r      d14
 #define dY0i      d15
 #define dY1r      d16
 #define dY1i      d17
 #define dY0rS64   d14.s64
 #define dY0iS64   d15.s64
 #define qT2       d18
 #define qT3       d20
@// last three elements
 #define dX1       d3
 #define dW0       d4
 #define dW1       d5
 #define dY0       d10
 #define dY1       d11
 #define dY2       d12
 #define dY3       d13
 #define half      d0
    @// Allocate stack memory required by the function
    @// Write function header
        M_START     omxSP_FFTFwd_RToCCS_F32_Sfs,r11,d15
@ Structure offsets for the FFTSpec
        .set    ARMsFFTSpec_N, 0
        .set    ARMsFFTSpec_pBitRev, 4
        .set    ARMsFFTSpec_pTwiddle, 8
        .set    ARMsFFTSpec_pBuf, 12
        @// Define stack arguments
        @// Read the size from structure and take log
        LDR     N, [pFFTSpec, #ARMsFFTSpec_N]
        @// Read other structure parameters
        LDR     pTwiddle, [pFFTSpec, #ARMsFFTSpec_pTwiddle]
        LDR     pOut, [pFFTSpec, #ARMsFFTSpec_pBuf]
        @//  N=1: treat separately
        CMP     N,#1
        BGT     sizeGreaterThanOne
        VLD1.F32    dX0[0],[pSrc]
        MOV     zero,#0
        VMOV.F32    dzero[0],zero
        VMOV.F32    dZero[0],zero
        VST3.F32    {dX0[0],dzero[0],dZero[0]},[pDst]
        B       End
 sizeGreaterThanOne:
        @// Do a N/2 point complex FFT including the scaling
        MOV     N,N,ASR #1                          @// N/2 point complex FFT
        CLZ     order,N                             @// N = 2^order
        RSB     order,order,#31
        MOV     subFFTSize,#1
        @//MOV     subFFTNum,N
        CMP     order,#3
        BGT     orderGreaterthan3                   @// order > 3
        CMP     order,#1
        BGE     orderGreaterthan0                   @// order > 0
        VLD1.F32    dX0,[pSrc]
        VST1.F32    dX0,[pOut]
        MOV     pSrc,pOut
        MOV     argDst,pDst
        BLT     FFTEnd
 orderGreaterthan0:
        @// set the buffers appropriately for various orders
        CMP     order,#2
        MOVEQ   argDst,pDst
        MOVNE   argDst,pOut
        @// Pass the first stage destination in RN5
        MOVNE   pOut,pDst
        MOV     argTwiddle,pTwiddle
        CMP     order,#1
        BGT     orderGreaterthan1
        @// order = 1
        BL      armSP_FFTFwd_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        B       FFTEnd
 orderGreaterthan1:
        CMP     order,#2
        BGT     orderGreaterthan2
        @// order =2
        BL      armSP_FFTFwd_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        BL      armSP_FFTFwd_CToC_FC32_Radix2_ls_OutOfPlace_unsafe
        B       FFTEnd
 orderGreaterthan2:@// order =3
        BL      armSP_FFTFwd_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        BL      armSP_FFTFwd_CToC_FC32_Radix2_OutOfPlace_unsafe
        BL      armSP_FFTFwd_CToC_FC32_Radix2_ls_OutOfPlace_unsafe
        B       FFTEnd
 orderGreaterthan3:
 specialScaleCase:
        @// Set input args to fft stages
        TST     order, #2
        MOVEQ   argDst,pDst
        MOVNE   argDst,pOut
        @// Pass the first stage destination in RN5
        MOVNE   pOut,pDst
        MOV     argTwiddle,pTwiddle
        @//check for even or odd order
        @// NOTE: The following combination of BL's would work fine even though
        @// the first BL would corrupt the flags. This is because the end of
        @// the "grpZeroSetLoop" loop inside
        @// armSP_FFTFwd_CToC_FC32_Radix4_fs_OutOfPlace_unsafe sets the Z flag
        @// to EQ
        TST     order,#0x00000001
        BLEQ    armSP_FFTFwd_CToC_FC32_Radix4_fs_OutOfPlace_unsafe
        BLNE    armSP_FFTFwd_CToC_FC32_Radix8_fs_OutOfPlace_unsafe
        CMP        subFFTNum,#4
        BLT     FFTEnd
 unscaledRadix4Loop:
        BEQ        lastStageUnscaledRadix4
         BL        armSP_FFTFwd_CToC_FC32_Radix4_OutOfPlace_unsafe
         CMP        subFFTNum,#4
         B        unscaledRadix4Loop
 lastStageUnscaledRadix4:
        BL      armSP_FFTFwd_CToC_FC32_Radix4_ls_OutOfPlace_unsafe
        B        FFTEnd
 FFTEnd:
 finalComplexToRealFixup:
        @// F(0) = 1/2 { [Z(0) + Z'(0)] - j [Z(0) - Z'(0)] }
        @//      = 1/2 { [(a+jb) + (a-jb)] - j [(a+jb) - (a-jb)] }
        @//      = 1/2 { [2a+j0] - j [0+j2b] }
        @//      = (a+b, 0)
        @// F(N/2) = 1/2 { [Z(0) + Z'(0)] + j [Z(0) - Z'(0)] }
        @//        = 1/2 { [(a+jb) + (a-jb)] + j [(a+jb) - (a-jb)] }
        @//        = 1/2 { [2a+j0] + j [0+j2b] }
        @//        = (a-b, 0)
        @// F(0) and F(N/2)
        VLD2.F32    {dX0r[0],dX0i[0]},[pSrc]!
        MOV     zero,#0
        VMOV.F32    dX0r[1],zero
        MOV     step,subFFTSize,LSL #3            @// step = N/2 * 8 bytes
        VMOV.F32    dX0i[1],zero
        @// twStep = 3N/8 * 8 bytes pointing to W^1
        SUB     twStep,step,subFFTSize,LSL #1
        VADD.F32    dY0r,dX0r,dX0i                    @// F(0) = ((Z0.r+Z0.i) , 0)
        MOV     step1,subFFTSize,LSL #2           @// step1 = N/2 * 4 bytes
        VSUB.F32    dY0i,dX0r,dX0i                    @// F(N/2) = ((Z0.r-Z0.i) , 0)
        SUBS    subFFTSize,subFFTSize,#2
        VST1.F32    dY0r,[argDst],step
        ADD     pTwiddleTmp,argTwiddle,#8         @// W^2
        VST1.F32    dY0i,[argDst]!
        ADD     argTwiddle,argTwiddle,twStep      @// W^1
        VDUP.F32    dzero,zero
        SUB     argDst,argDst,step
        BLT     End
        BEQ     lastElement
        SUB     step,step,#24
        SUB     step1,step1,#8                    @// (N/4-1)*8 bytes
        @// F(k) = 1/2[Z(k) +  Z'(N/2-k)] -j*W^(k) [Z(k) -  Z'(N/2-k)]
        @// Note: W^k is stored as negative values in the table
        @// Process 4 elements at a time. E.g: F(1),F(2) and F(N/2-2),F(N/2-1)
        @// since both of them require Z(1),Z(2) and Z(N/2-2),Z(N/2-1)
        ADR     t0, HALF
        VLD1.F32    half[0], [t0]
 evenOddButterflyLoop:
        VLD1.F32    dW0r,[argTwiddle],step1
        VLD1.F32    dW1r,[argTwiddle]!
        VLD2.F32    {dX0r,dX0i},[pSrc],step
        SUB     argTwiddle,argTwiddle,step1
        VLD2.F32    {dX1r,dX1i},[pSrc]!
        SUB     step1,step1,#8                    @// (N/4-2)*8 bytes
        VLD1.F32    dW0i,[pTwiddleTmp],step1
        VLD1.F32    dW1i,[pTwiddleTmp]!
        SUB     pSrc,pSrc,step
        SUB     pTwiddleTmp,pTwiddleTmp,step1
        VREV64.F32  dX1r,dX1r
        VREV64.F32  dX1i,dX1i
        SUBS    subFFTSize,subFFTSize,#4
        VSUB.F32    dT2,dX0r,dX1r                     @// a-c
        SUB     step1,step1,#8
        VADD.F32    dT0,dX0r,dX1r                     @// a+c
        VSUB.F32    dT1,dX0i,dX1i                     @// b-d
        VADD.F32    dT3,dX0i,dX1i                     @// b+d
        VMUL.F32   dT0,dT0,half[0]
        VMUL.F32   dT1,dT1,half[0]
        VZIP.F32    dW1r,dW1i
        VZIP.F32    dW0r,dW0i
        VMUL.F32   qT0,dW1r,dT2
        VMUL.F32   qT1,dW1r,dT3
        VMUL.F32   qT2,dW0r,dT2
        VMUL.F32   qT3,dW0r,dT3
        VMLA.F32   qT0,dW1i,dT3
        VMLS.F32   qT1,dW1i,dT2
        VMLS.F32   qT2,dW0i,dT3
        VMLA.F32   qT3,dW0i,dT2
        VMUL.F32  dX1r,qT0,half[0]
        VMUL.F32  dX1i,qT1,half[0]
        VSUB.F32    dY1r,dT0,dX1i                     @// F(N/2 -1)
        VADD.F32    dY1i,dT1,dX1r
        VNEG.F32    dY1i,dY1i
        VREV64.F32  dY1r,dY1r
        VREV64.F32  dY1i,dY1i
        VMUL.F32  dX0r,qT2,half[0]
        VMUL.F32  dX0i,qT3,half[0]
        VSUB.F32    dY0r,dT0,dX0i                     @// F(1)
        VADD.F32    dY0i,dT1,dX0r
        VST2.F32    {dY0r,dY0i},[argDst],step
        VST2.F32    {dY1r,dY1i},[argDst]!
        SUB     argDst,argDst,step
        SUB     step,step,#32                     @// (N/2-4)*8 bytes
        BGT     evenOddButterflyLoop
        @// set both the ptrs to the last element
        SUB     pSrc,pSrc,#8
        SUB     argDst,argDst,#8
        @// Last element can be expanded as follows
        @// 1/2[Z(k) + Z'(k)] + j w^k [Z(k) - Z'(k)]
        @// 1/2[(a+jb) + (a-jb)] + j w^k [(a+jb) - (a-jb)]
        @// 1/2[2a+j0] + j (c+jd) [0+j2b]
        @// (a-bc, -bd)
        @// Since (c,d) = (0,1) for the last element, result is just (a,-b)
 lastElement:
        VLD1.F32    dX0r,[pSrc]
        VST1.F32    dX0r[0],[argDst]!
        VNEG.F32    dX0r,dX0r
        VST1.F32    dX0r[1],[argDst]!
 End:
        @// Set return value
        MOV     result, #OMX_Sts_NoErr
        @// Write function tail
        M_END
 HALF:   .float  0.5
        .end
--- a/media/openmax_dl/dl/sp/src/omxSP_FFTGetBufSize_R_F32.c
+++ b/media/openmax_dl/dl/sp/src/omxSP_FFTGetBufSize_R_F32.c
@ -1,49 +0,0 @@
 /*
 *  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 *
 */
 #include "dl/api/armOMX.h"
 #include "dl/api/omxtypes.h"
 #include "dl/sp/api/armSP.h"
 #include "dl/sp/api/omxSP.h"
 /**
 * Function: omxSP_FFTGetBufSize_R_F32
 *
 * Description:
 * Computes the size of the specification structure required for the length
 * 2^order real FFT and IFFT functions.
 *
 * Remarks:
 * This function is used in conjunction with the 32-bit functions
 * <FFTFwd_RToCCS_F32_Sfs> and <FFTInv_CCSToR_F32_Sfs>.
 *
 * Parameters:
 * [in]  order       base-2 logarithm of the length; valid in the range
 *                    [1,12]. ([1,15] if BIG_FFT_TABLE is defined.)
 * [out] pSize	   pointer to the number of bytes required for the
 *			   specification structure.
 *
 * Return Value:
 * Standard omxError result. See enumeration for possible result codes.
 *
 */
 OMXResult omxSP_FFTGetBufSize_R_F32(OMX_INT order, OMX_INT *pSize) {
  if (!pSize || (order < 1) || (order > TWIDDLE_TABLE_ORDER))
    return OMX_Sts_BadArgErr;
  /*
   * The required size is the same as for R_S32, because the
   * elements are the same size and because ARMsFFTSpec_R_SC32 is
   * the same size as ARMsFFTSpec_R_FC32.
   */
  return omxSP_FFTGetBufSize_R_S32(order, pSize);
 }
--- a/media/openmax_dl/dl/sp/src/omxSP_FFTGetBufSize_R_S32.c
+++ b/media/openmax_dl/dl/sp/src/omxSP_FFTGetBufSize_R_S32.c
@ -1,91 +0,0 @@
 /*
 *  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 *
 *  This file was originally licensed as follows. It has been
 *  relicensed with permission from the copyright holders.
 */
 /**
 * 
 * File Name:  omxSP_FFTGetBufSize_R_S32.c
 * OpenMAX DL: v1.0.2
 * Last Modified Revision:   7777
 * Last Modified Date:       Thu, 27 Sep 2007
 * 
 * (c) Copyright 2007-2008 ARM Limited. All Rights Reserved.
 * 
 * 
 * Description:
 * Computes the size of the specification structure required.
 */
 #include "dl/api/armOMX.h"
 #include "dl/api/omxtypes.h"
 #include "dl/sp/api/armSP.h"
 #include "dl/sp/api/omxSP.h"
 /**
 * Function: omxSP_FFTGetBufSize_R_S32
 *
 * Description:
 * Computes the size of the specification structure required for the length
 * 2^order real FFT and IFFT functions.
 *
 * Remarks:
 * This function is used in conjunction with the 32-bit functions
 * <FFTFwd_RToCCS_S32_Sfs> and <FFTInv_CCSToR_S32_Sfs>.
 *
 * Parameters:
 * [in]  order       base-2 logarithm of the length; valid in the range
 *			   [0,12].
 * [out] pSize	   pointer to the number of bytes required for the
 *			   specification structure.
 *
 * Return Value:
 * Standard omxError result. See enumeration for possible result codes.
 *
 */
 OMXResult omxSP_FFTGetBufSize_R_S32(
     OMX_INT order,     
     OMX_INT *pSize
 )
 {
    OMX_INT     NBy2,N,twiddleSize;
    /* Check for order zero */
    if (order == 0)
    {
        *pSize = sizeof(ARMsFFTSpec_R_SC32)
                + sizeof(OMX_S32) * (2); /* Extra size 'N' is used in FFTInv_CCSToR_S32S16_Sfs as a temporary buf */   
        return OMX_Sts_NoErr;
    }
    NBy2 = 1 << (order - 1);
    N = NBy2<<1;
    twiddleSize = 5*N/8;            /* 3/4(N/2) + N/4 */
    /* 2 pointers to store bitreversed array and twiddle factor array */
    *pSize = sizeof(ARMsFFTSpec_R_SC32)
        /* Twiddle factors  */
           + sizeof(OMX_SC32) * twiddleSize
        /* Ping Pong buffer for doing the N/2 point complex FFT  */      
           + sizeof(OMX_S32) * (N<<1)  /* Extra size 'N' is used in FFTInv_CCSToR_S32_Sfs as a temporary buf */
           + 62 ;  /* Extra bytes to get 32 byte alignment of ptwiddle and pBuf */ 
    return OMX_Sts_NoErr;
 }
 /*****************************************************************************
 *                              END OF FILE
 *****************************************************************************/
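Review note: the size arithmetic in omxSP_FFTGetBufSize_R_S32 can be checked by hand with a small sketch such as the one below. The helper name `fft_spec_bytes` is hypothetical, and the element sizes (8 bytes for OMX_SC32, 4 bytes for OMX_S32) and the 16-byte header used in the usage example are assumptions, the latter based on the four 4-byte field offsets (0, 4, 8, 12) used by the assembly on 32-bit ARM.

```c
#include <assert.h>

/* Mirrors the arithmetic of the function above.
 * header_bytes stands in for sizeof(ARMsFFTSpec_R_SC32) (assumption). */
static long fft_spec_bytes(int order, long header_bytes) {
  const long complex_bytes = 8; /* assumed sizeof(OMX_SC32): two 32-bit values */
  const long scalar_bytes = 4;  /* assumed sizeof(OMX_S32) */

  if (order == 0)
    return header_bytes + scalar_bytes * 2;

  long n = 1L << order;
  long twiddles = 5 * n / 8; /* 3/4 * (N/2) + N/4, integer division */

  return header_bytes
       + complex_bytes * twiddles /* twiddle factors */
       + scalar_bytes * (2 * n)   /* ping-pong buffer plus N temporaries */
       + 62;                      /* slack for two 32-byte alignments */
}
```

For example, with a 16-byte header an order-10 transform (N = 1024) needs 16 + 8*640 + 4*2048 + 62 = 13390 bytes.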
--- a/media/openmax_dl/dl/sp/src/omxSP_FFTInit_R_F32.c
+++ b/media/openmax_dl/dl/sp/src/omxSP_FFTInit_R_F32.c
@ -1,210 +0,0 @@
 /*
 *  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
 *
 *  Use of this source code is governed by a BSD-style license
 *  that can be found in the LICENSE file in the root of the source
 *  tree. An additional intellectual property rights grant can be found
 *  in the file PATENTS.  All contributing project authors may
 *  be found in the AUTHORS file in the root of the source tree.
 *
 *  This is a modification of omxSP_FFTInit_R_S32.c to support float
 *  instead of S32.
 */
 #include "dl/api/armOMX.h"
 #include "dl/api/omxtypes.h"
 #include "dl/sp/api/armSP.h"
 #include "dl/sp/api/omxSP.h"
 /**
 * Function: omxSP_FFTInit_R_F32
 *
 * Description:
 * Initialize the real forward-FFT specification information struct.
 *
 * Remarks:
 * This function is used to initialize the specification structures
 * for functions <ippsFFTFwd_RToCCS_F32_Sfs> and
 * <ippsFFTInv_CCSToR_F32_Sfs>. Memory for *pFFTSpec must be
 * allocated prior to calling this function. The number of bytes
 * required for *pFFTSpec can be determined using
 * <FFTGetBufSize_R_F32>.
 *
 * Parameters:
 * [in]  order       base-2 logarithm of the desired block length;
 *                         valid in the range [1,12].  ([1,15] if
 *                         BIG_FFT_TABLE is defined.)
 * [out] pFFTSpec    pointer to the initialized specification structure.
 *
 * Return Value:
 * Standard omxError result. See enumeration for possible result codes.
 *
 */
 OMXResult omxSP_FFTInit_R_F32(OMXFFTSpec_R_F32* pFFTSpec, OMX_INT order) {
  OMX_INT i;
  OMX_INT j;
  OMX_FC32* pTwiddle;
  OMX_FC32* pTwiddle1;
  OMX_FC32* pTwiddle2;
  OMX_FC32* pTwiddle3;
  OMX_FC32* pTwiddle4;
  OMX_F32* pBuf;
  OMX_U16* pBitRev;
  OMX_U32 pTmp;
  OMX_INT Nby2;
  OMX_INT N;
  OMX_INT M;
  OMX_INT diff;
  OMX_INT step;
  OMX_F32 x;
  OMX_F32 y;
  OMX_F32 xNeg;
  ARMsFFTSpec_R_FC32* pFFTStruct = 0;
  pFFTStruct = (ARMsFFTSpec_R_FC32 *) pFFTSpec;
  /* Validate args */
  if (!pFFTSpec || (order < 1) || (order > TWIDDLE_TABLE_ORDER))
    return OMX_Sts_BadArgErr;
  /* Do the initializations */
  Nby2 = 1 << (order - 1);
  N = Nby2 << 1;
  /* optimized implementations don't use bitreversal */
  pBitRev = NULL;
  pTwiddle = (OMX_FC32 *) (sizeof(ARMsFFTSpec_R_SC32) + (OMX_S8*) pFFTSpec);
  /* Align to 32 byte boundary */
  pTmp = ((OMX_U32)pTwiddle) & 31;
  if (pTmp)
    pTwiddle = (OMX_FC32*) ((OMX_S8*)pTwiddle + (32 - pTmp));
  pBuf = (OMX_F32*) (sizeof(OMX_FC32)*(5*N/8) + (OMX_S8*) pTwiddle);
  /* Align to 32 byte boundary */
  pTmp = ((OMX_U32)pBuf)&31;                 /* (OMX_U32)pBuf % 32 */
  if (pTmp)
    pBuf = (OMX_F32*) ((OMX_S8*)pBuf + (32 - pTmp));
  /*
   * Filling Twiddle factors :
   *
   * exp^(-j*2*PI*k/ (N/2) ) ; k=0,1,2,...,3/4(N/2)
   *
   * An N/2-point complex FFT is used to compute the N-point real FFT.
   * The original twiddle table "armSP_FFT_F32TwiddleTable" is of size
   * (MaxSize/8 + 1). The rest of the values, i.e. up to MaxSize, are
   * calculated using the symmetries of sin and cos. The max size of
   * the twiddle table needed is 3/4(N/2) for a radix-4 stage.
   * W = (-2 * PI) / N
   * N = 1 << order
   * W = -PI >> (order - 1)
   */
  M = Nby2 >> 3;
  diff = TWIDDLE_TABLE_ORDER - (order - 1);
  /* step into the twiddle table for the current order */
  step = 1 << diff;
  x = armSP_FFT_F32TwiddleTable[0];
  y = armSP_FFT_F32TwiddleTable[1];
  xNeg = 1;
  if ((order - 1) >= 3) {
    /* i = 0 case */
    pTwiddle[0].Re = x;
    pTwiddle[0].Im = y;
    pTwiddle[2*M].Re = -y;
    pTwiddle[2*M].Im = xNeg;
    pTwiddle[4*M].Re = xNeg;
    pTwiddle[4*M].Im = y;
    for (i = 1; i <= M; i++) {
      j = i*step;
      x = armSP_FFT_F32TwiddleTable[2*j];
      y = armSP_FFT_F32TwiddleTable[2*j+1];
      pTwiddle[i].Re = x;
      pTwiddle[i].Im = y;
      pTwiddle[2*M-i].Re = -y;
      pTwiddle[2*M-i].Im = -x;
      pTwiddle[2*M+i].Re = y;
      pTwiddle[2*M+i].Im = -x;
      pTwiddle[4*M-i].Re = -x;
      pTwiddle[4*M-i].Im = y;
      pTwiddle[4*M+i].Re = -x;
      pTwiddle[4*M+i].Im = -y;
      pTwiddle[6*M-i].Re = y;
      pTwiddle[6*M-i].Im = x;
    }
  } else if ((order - 1) == 2) {
    pTwiddle[0].Re = x;
    pTwiddle[0].Im = y;
    pTwiddle[1].Re = -y;
    pTwiddle[1].Im = xNeg;
    pTwiddle[2].Re = xNeg;
    pTwiddle[2].Im = y;
  } else if ((order-1) == 1) {
    pTwiddle[0].Re = x;
    pTwiddle[0].Im = y;
  }
  /*
   * Now fill the last N/4 values: exp^(-j*2*PI*k/N),
   * k=1,3,5,...,N/2-1. These are used for the final twiddle fix-up for
   * converting complex to real FFT.
   */
  M = N >> 3;
  diff = TWIDDLE_TABLE_ORDER - order;
  step = 1 << diff;
  pTwiddle1 = pTwiddle + 3*N/8;
  pTwiddle4 = pTwiddle1 + (N/4 - 1);
  pTwiddle3 = pTwiddle1 + N/8;
  pTwiddle2 = pTwiddle1 + (N/8 - 1);
  x = armSP_FFT_F32TwiddleTable[0];
  y = armSP_FFT_F32TwiddleTable[1];
  xNeg = 1;
  if (order >=3) {
    for (i = 1; i <= M; i += 2) {
      j = i*step;
      x = armSP_FFT_F32TwiddleTable[2*j];
      y = armSP_FFT_F32TwiddleTable[2*j+1];
      pTwiddle1[0].Re = x;
      pTwiddle1[0].Im = y;
      pTwiddle1 += 1;
      pTwiddle2[0].Re = -y;
      pTwiddle2[0].Im = -x;
      pTwiddle2 -= 1;
      pTwiddle3[0].Re = y;
      pTwiddle3[0].Im = -x;
      pTwiddle3 += 1;
      pTwiddle4[0].Re = -x;
      pTwiddle4[0].Im = y;
      pTwiddle4 -= 1;
    }
  } else {
    if (order == 2) {
      pTwiddle1[0].Re = -y;
      pTwiddle1[0].Im = xNeg;
    }
  }
  /* Update the structure */
  pFFTStruct->N = N;
  pFFTStruct->pTwiddle = pTwiddle;
  pFFTStruct->pBitRev = pBitRev;
  pFFTStruct->pBuf = pBuf;
  return OMX_Sts_NoErr;
 }
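Review note: omxSP_FFTInit_R_F32 rounds both pTwiddle and pBuf up to a 32-byte boundary by casting the pointer to OMX_U32, which assumes pointers fit in 32 bits. A minimal sketch of the same rounding using `uintptr_t` is shown below; the helper name `align_up_32` is hypothetical and was not part of the removed code.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address up to the next multiple of 32 (no change if aligned). */
static uintptr_t align_up_32(uintptr_t addr) {
  uintptr_t rem = addr & 31; /* same as addr % 32 */
  return rem ? addr + (32 - rem) : addr;
}
```

Each rounding adds at most 31 bytes, so two of them stay within the 62 extra bytes that omxSP_FFTGetBufSize_R_S32 reserves for aligning pTwiddle and pBuf.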
--- a/media/openmax_dl/dl/sp/src/omxSP_FFTInv_CCSToR_F32_Sfs_unscaled_s.S
+++ b/media/openmax_dl/dl/sp/src/omxSP_FFTInv_CCSToR_F32_Sfs_unscaled_s.S
@ -1,284 +0,0 @@
@//
@//  Copyright (c) 2013 The WebRTC project authors. All Rights Reserved.
@//
@//  Copyright 2016, Mozilla Foundation and contributors
@//
@//  Use of this source code is governed by a BSD-style license
@//  that can be found in the LICENSE file in the root of the source
@//  tree. An additional intellectual property rights grant can be found
@//  in the file PATENTS.  All contributing project authors may
@//  be found in the AUTHORS file in the root of the source tree.
@//
@//  This is a modification of omxSP_FFTInv_CCSToR_S32_Sfs_s.s
@//  to support float instead of SC32.
@//
@//  It is further modified to produce an "unscaled" version, which
@//  actually multiplies by two for consistency with the other FFT functions
@//  in use.
@//
@//
@// Description:
@// Compute an inverse FFT for a complex signal
@//
@//
@// Include standard headers
 #include "dl/api/armCOMM_s.h"
 #include "dl/api/omxtypes_s.h"
@// Import symbols required from other files
@// (For example tables)
        .extern  armSP_FFTInv_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CToC_FC32_Radix4_fs_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CToC_FC32_Radix8_fs_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CToC_FC32_Radix4_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CToC_FC32_Radix2_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe
@// Set debugging level
@//DEBUG_ON    SETL {TRUE}
@// Guarding implementation by the processor name
      @// Guarding implementation by the processor name
@// Import symbols required from other files
@// (For example tables)
        .extern  armSP_FFTInv_CToC_FC32_Radix4_ls_OutOfPlace_unsafe
        .extern  armSP_FFTInv_CToC_FC32_Radix2_ls_OutOfPlace_unsafe
@//Input Registers
 #define pSrc            r0
 #define pDst            r1
 #define pFFTSpec        r2
 #define scale           r3
@// Output registers
 #define result          r0
@//Local Scratch Registers
 #define argTwiddle      r1
 #define argDst          r2
 #define argScale        r4
 #define tmpOrder        r4
 #define pTwiddle        r4
 #define pOut            r5
 #define subFFTSize      r7
 #define subFFTNum       r6
 #define N               r6
 #define order           r14
 #define diff            r9
@// Total number of radix stages required to complete the FFT
 #define count           r8
 #define x0r             r4
 #define x0i             r5
 #define diffMinusOne    r2
 #define round           r3
 #define pOut1           r2
 #define size            r7
 #define step            r8
 #define step1           r9
 #define twStep          r10
 #define pTwiddleTmp     r11
 #define argTwiddle1     r12
 #define zero            r14
@// Neon registers
 #define dX0     D0
 #define dShift  D1
 #define dX1     D1
 #define dY0     D2
 #define dY1     D3
 #define dX0r    D0
 #define dX0i    D1
 #define dX1r    D2
 #define dX1i    D3
 #define dW0r    D4
 #define dW0i    D5
 #define dW1r    D6
 #define dW1i    D7
 #define dT0     D8
 #define dT1     D9
 #define dT2     D10
 #define dT3     D11
 #define qT0     d12
 #define qT1     d14
 #define qT2     d16
 #define qT3     d18
 #define dY0r    D4
 #define dY0i    D5
 #define dY1r    D6
 #define dY1i    D7
 #define dzero   D20
 #define dY2     D4
 #define dY3     D5
 #define dW0     D6
 #define dW1     D7
 #define dW0Tmp  D10
 #define dW1Neg  D11
 #define sN      S0.S32
 #define fN      S1
@// two must be the same as dScale[0]!
 #define dScale  D2
 #define two S4
    @// Allocate stack memory required by the function
        M_ALLOC4        complexFFTSize, 4
    @// Write function header
        M_START     omxSP_FFTInv_CCSToR_F32_Sfs_unscaled,r11,d15
@ Structure offsets for the FFTSpec
        .set    ARMsFFTSpec_N, 0
        .set    ARMsFFTSpec_pBitRev, 4
        .set    ARMsFFTSpec_pTwiddle, 8
        .set    ARMsFFTSpec_pBuf, 12
        @// Define stack arguments
        @// Read the size from structure and take log
        LDR     N, [pFFTSpec, #ARMsFFTSpec_N]
        @// Read other structure parameters
        LDR     pTwiddle, [pFFTSpec, #ARMsFFTSpec_pTwiddle]
        LDR     pOut, [pFFTSpec, #ARMsFFTSpec_pBuf]
        @//  N=1: treat separately
        CMP     N,#1
        BGT     sizeGreaterThanOne
        VLD1.F32    dX0[0],[pSrc]
        VST1.F32    dX0[0],[pDst]
        B       End
 sizeGreaterThanOne:
        @// Call the preTwiddle Radix2 stage before doing the complex IFFT
        BL    armSP_FFTInv_CCSToR_F32_preTwiddleRadix2_unsafe
 complexIFFT:
        ASR     N,N,#1                             @// N/2 point complex IFFT
        M_STR   N, complexFFTSize                  @ Save N for scaling later
        ADD     pSrc,pOut,N,LSL #3                 @// set pSrc as pOut1
        CLZ     order,N                             @// N = 2^order
        RSB     order,order,#31
        MOV     subFFTSize,#1
        @//MOV     subFFTNum,N
        CMP     order,#3
        BGT     orderGreaterthan3                   @// order > 3
        CMP     order,#1
        BGE     orderGreaterthan0                   @// order > 0
        VLD1.F32    dX0,[pSrc]
        VST1.F32    dX0,[pDst]
        MOV     pSrc,pDst
        BLT     FFTEnd
 orderGreaterthan0:
        @// set the buffers appropriately for various orders
        CMP     order,#2
        MOVNE   argDst,pDst
        MOVEQ   argDst,pOut
        @// Pass the first stage destination in RN5
        MOVEQ   pOut,pDst
        MOV     argTwiddle,pTwiddle
        BGE     orderGreaterthan1
        BLLT    armSP_FFTInv_CToC_FC32_Radix2_fs_OutOfPlace_unsafe  @// order = 1
        B       FFTEnd
 orderGreaterthan1:
        MOV     tmpOrder,order                          @// tmpOrder = RN 4
        BL      armSP_FFTInv_CToC_FC32_Radix2_fs_OutOfPlace_unsafe
        CMP     tmpOrder,#2
        BLGT    armSP_FFTInv_CToC_FC32_Radix2_OutOfPlace_unsafe
        BL      armSP_FFTInv_CToC_FC32_Radix2_ls_OutOfPlace_unsafe
        B       FFTEnd
 orderGreaterthan3:
 specialScaleCase:
        @// Set input args to fft stages
        TST     order, #2
        MOVNE   argDst,pDst
        MOVEQ   argDst,pOut
        @// Pass the first stage destination in RN5
        MOVEQ   pOut,pDst
        MOV     argTwiddle,pTwiddle
        @//check for even or odd order
        @// NOTE: The following combination of BL's would work fine even though
        @// the first BL would corrupt the flags. This is because the end of
        @// the "grpZeroSetLoop" loop inside
        @// armSP_FFTInv_CToC_FC32_Radix4_fs_OutOfPlace_unsafe sets the Z flag
        @// to EQ
        TST     order,#0x00000001
        BLEQ    armSP_FFTInv_CToC_FC32_Radix4_fs_OutOfPlace_unsafe
        BLNE    armSP_FFTInv_CToC_FC32_Radix8_fs_OutOfPlace_unsafe
        CMP        subFFTNum,#4
        BLT     FFTEnd
 unscaledRadix4Loop:
        BEQ        lastStageUnscaledRadix4
         BL        armSP_FFTInv_CToC_FC32_Radix4_OutOfPlace_unsafe
         CMP        subFFTNum,#4
         B        unscaledRadix4Loop
 lastStageUnscaledRadix4:
        BL      armSP_FFTInv_CToC_FC32_Radix4_ls_OutOfPlace_unsafe
        B        FFTEnd
 FFTEnd:                                               @// Does only the scaling
        @ Scale inverse FFT result by 2 for consistency with other FFTs
        VMOV.F32    two, #2.0                   @ two = dScale[0]
        @// N = subFFTSize  ; dataptr = pDst
 scaleFFTData:
        VLD1.F32    {dX0},[pSrc]            @// pSrc contains pDst pointer
        SUBS    subFFTSize,subFFTSize,#1
        VMUL.F32    dX0, dX0, dScale[0]
        VST1.F32    {dX0},[pSrc]!
        BGT     scaleFFTData
 End:
        @// Set return value
        MOV     result, #OMX_Sts_NoErr
        @// Write function tail
        M_END
        .end
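Review note: both the forward and inverse drivers above choose their radix stages from the order of the N/2-point complex transform. Orders 1 to 3 use only radix-2 stages. Larger orders start with a radix-4 first stage when the order is even or a radix-8 first stage when it is odd, then run radix-4 stages to the end. The sketch below restates that selection in C as read from the branch structure; `plan_stages` and `planned_length` are hypothetical names, and the sketch ignores the first-stage/last-stage routine variants.

```c
#include <assert.h>

/* Writes the radix of each stage into radices[] (room for 16 entries is
 * enough for order <= 31) and returns the number of stages. */
static int plan_stages(int order, int radices[]) {
  int count = 0;
  if (order <= 3) {
    /* order 0: plain copy; orders 1..3: one radix-2 stage per bit */
    for (int i = 0; i < order; i++) radices[count++] = 2;
    return count;
  }
  /* Odd order: radix-8 first stage (3 bits); even: radix-4 (2 bits). */
  int remaining = order;
  if (order & 1) {
    radices[count++] = 8;
    remaining -= 3;
  } else {
    radices[count++] = 4;
    remaining -= 2;
  }
  /* What is left is always an even number of bits: radix-4 stages. */
  while (remaining > 0) {
    radices[count++] = 4;
    remaining -= 2;
  }
  return count;
}

/* Product of the planned radices; should equal 2^order. */
static long planned_length(int order) {
  int radices[16];
  int count = plan_stages(order, radices);
  long length = 1;
  for (int i = 0; i < count; i++) length *= radices[i];
  return length;
}
```

For order 7 this yields the stages 8, 4, 4, and for order 6 the stages 4, 4, 4.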
--- a/toolkit/content/license.html
+++ b/toolkit/content/license.html
@ -104,7 +104,6 @@
      <li><a href="about:license#jquery">jQuery License</a></li>
      <li><a href="about:license#k_exp">k_exp License</a></li>
      <li><a href="about:license#khronos">Khronos group License</a></li>
      <li><a href="about:license#kiss_fft">Kiss FFT License</a></li>
 #ifdef MOZ_USE_LIBCXX
      <li><a href="about:license#libc++">libc++ License</a></li>
 #endif
@ -2041,7 +2040,6 @@ WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
        <li><code>gfx/ots/</code></li>
        <li><code>gfx/ycbcr/</code></li>
        <li><code>ipc/chromium/</code></li>
        <li><code>media/openmax_dl/</code></li>
        <li><code>toolkit/components/reputationservice/</code></li>
        <li><code>toolkit/components/url-classifier/chromium/</code></li>
        <li><code>tools/profiler/</code></li>
@ -3116,80 +3114,6 @@ OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGE.
 </pre>
    <hr>
    <h1><a id="khronos"></a>Khronos group License</h1>
    <p>This license applies to the following files:</p>
    <ul>
      <li><code>media/openmax_dl/dl/api/omxtypes.h</code></li>
      <li><code>media/openmax_dl/dl/sp/api/omxSP.h</code></li>
    </ul>
 <pre>
 Copyright 2005-2008 The Khronos Group Inc. All Rights Reserved.
 These materials are protected by copyright laws and contain material
 proprietary to the Khronos Group, Inc.  You may use these materials
 for implementing Khronos specifications, without altering or removing
 any trademark, copyright or other notice from the specification.
 Khronos Group makes no, and expressly disclaims any, representations
 or warranties, express or implied, regarding these materials, including,
 without limitation, any implied warranties of merchantability or fitness
 for a particular purpose or non-infringement of any intellectual property.
 Khronos Group makes no, and expressly disclaims any, warranties, express
 or implied, regarding the correctness, accuracy, completeness, timeliness,
 and reliability of these materials.
 Under no circumstances will the Khronos Group, or any of its Promoters,
 Contributors or Members or their respective partners, officers, directors,
 employees, agents or representatives be liable for any damages, whether
 direct, indirect, special or consequential damages for lost revenues,
 lost profits, or otherwise, arising from or in connection with these
 materials.
 Khronos and OpenMAX are trademarks of the Khronos Group Inc.
 </pre>
    <hr>
    <h1><a id="kiss_fft"></a>Kiss FFT License</h1>
    <p>This license applies to files in the directory
      <code>media/kiss_fft/</code>.</p>
 <pre>
 Copyright (c) 2003-2010 Mark Borgerding
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright notice,
      this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.
    * Neither the author nor the names of any contributors may be used to
      endorse or promote products derived from this software without specific
      prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
 ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
 ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 </pre>
    <hr>
 #ifdef MOZ_USE_LIBCXX