2008-12-25 12:38:45 +00:00
|
|
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
|
|
|
<html>
|
2010-01-24 12:40:30 +00:00
|
|
|
|
<head>
|
|
|
|
|
<title>SoundTouch library README</title>
|
|
|
|
|
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
|
|
|
|
|
<meta http-equiv="Content-Language" content="en-us">
|
|
|
|
|
<meta name="author" content="Olli Parviainen">
|
|
|
|
|
<meta name="description" content="Readme file for SoundTouch audio processing library">
|
|
|
|
|
<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
|
|
|
|
|
<meta name="ProgId" content="FrontPage.Editor.Document">
|
|
|
|
|
<style> <!-- .normal { font-family: Arial }
|
|
|
|
|
--></style>
|
|
|
|
|
</head>
|
|
|
|
|
<body class="normal">
|
|
|
|
|
<hr>
|
|
|
|
|
<h1>SoundTouch audio processing library v1.5.1pre
|
|
|
|
|
</h1>
|
|
|
|
|
<p class="normal">SoundTouch library Copyright <20> Olli Parviainen 2001-2009
|
|
|
|
|
</p>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>1. Introduction
|
|
|
|
|
</h2>
|
|
|
|
|
<p>SoundTouch is an open-source audio processing library that allows changing the
|
|
|
|
|
sound tempo, pitch and playback rate parameters independently from each other,
|
|
|
|
|
i.e.:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Sound tempo can be increased or decreased while maintaining the original pitch
|
|
|
|
|
<li>
|
|
|
|
|
Sound pitch can be increased or decreased while maintaining the original tempo
|
|
|
|
|
<li>
|
|
|
|
|
Change playback rate that affects both tempo and pitch at the same time
|
|
|
|
|
<li>
|
|
|
|
|
Choose any combination of tempo/pitch/rate</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<h3>1.1 Contact information
|
|
|
|
|
</h3>
|
|
|
|
|
<p>Author email: oparviai 'at' iki.fi
|
|
|
|
|
</p>
|
|
|
|
|
<p>SoundTouch WWW page: <a href="http://www.surina.net/soundtouch">http://www.surina.net/soundtouch</a></p>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>2. Compiling SoundTouch</h2>
|
|
|
|
|
<p>Before compiling, notice that you can choose the sample data format if it's
|
|
|
|
|
desirable to use floating point sample data instead of 16bit integers. See
|
|
|
|
|
section "sample data format" for more information.</p>
|
|
|
|
|
<h3>2.1. Building in Microsoft Windows</h3>
|
|
|
|
|
<p>Project files for Microsoft Visual C++ 6.0 and Visual C++ .NET are supplied with
|
|
|
|
|
the source code package. </p>
|
|
|
|
|
<p>
|
|
|
|
|
Please notice that SoundTouch library uses processor-specific optimizations for
|
|
|
|
|
Pentium III and AMD processors. Visual Studio .NET and later versions supports
|
|
|
|
|
the required instructions by default, but Visual Studio 6.0 requires a
|
|
|
|
|
processor pack upgrade to be installed in order to support these optimizations.
|
|
|
|
|
The processor pack upgrade can be downloaded from Microsoft site at this URL:</p>
|
|
|
|
|
<p><a href="http://msdn.microsoft.com/en-us/vstudio/aa718349.aspx">http://msdn.microsoft.com/en-us/vstudio/aa718349.aspx</a></p>
|
|
|
|
|
<p>If the above URL is unavailable or removed, go to <a href="http://msdn.microsoft.com/">
|
|
|
|
|
http://msdn.microsoft.com</a> and perform a search with keywords "processor
|
|
|
|
|
pack".
|
|
|
|
|
</p>
|
|
|
|
|
<p>To build the binaries with Visual C++ compiler, either run "make-win.bat"
|
|
|
|
|
script, or open the appropriate project files in source code directories with
|
|
|
|
|
Visual Studio. The final executable will appear under the "SoundTouch\bin"
|
|
|
|
|
directory. If using the Visual Studio IDE instead of the make-win.bat script,
|
|
|
|
|
directories bin and lib may need to be created manually to the SoundTouch
|
|
|
|
|
package root for the final executables. The make-win.bat script creates these
|
|
|
|
|
directories automatically.
|
|
|
|
|
</p>
|
|
|
|
|
<h3>2.2. Building in Gnu platforms</h3>
|
|
|
|
|
<p>The SoundTouch library can be compiled in practically any platform supporting
|
|
|
|
|
GNU compiler (GCC) tools. SoundTouch have been tested with gcc version 3.3.4.,
|
|
|
|
|
but it shouldn't be very specific about the gcc version. Assembler-level
|
|
|
|
|
performance optimizations for GNU platform are currently available in x86
|
|
|
|
|
platforms only, they are automatically disabled and replaced with standard C
|
|
|
|
|
routines in other processor platforms.</p>
|
|
|
|
|
<p>To build and install the binaries, run the following commands in the SoundTouch/
|
|
|
|
|
directory:</p>
|
|
|
|
|
<table border="0" cellpadding="0" cellspacing="4">
|
|
|
|
|
<tbody>
|
|
|
|
|
<tr valign="top">
|
|
|
|
|
<td>
|
|
|
|
|
<pre>./configure -</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td>
|
|
|
|
|
<p>Configures the SoundTouch package for the local environment.</p>
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr valign="top">
|
|
|
|
|
<td>
|
|
|
|
|
<pre>make -</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td>
|
|
|
|
|
<p>Builds the SoundTouch library & SoundStretch utility.</p>
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr valign="top">
|
|
|
|
|
<td>
|
|
|
|
|
<pre>make install -</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td>
|
|
|
|
|
<p>Installs the SoundTouch & BPM libraries to <b>/usr/local/lib</b> and
|
|
|
|
|
SoundStretch utility to <b>/usr/local/bin</b>. Please notice that 'root'
|
|
|
|
|
privileges may be required to install the binaries to the destination
|
|
|
|
|
locations.</p>
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
</tbody>
|
|
|
|
|
</table>
|
|
|
|
|
<h4><b>2.2.1 Required GNU tools</b> </h4>
|
|
|
|
|
<p>
|
|
|
|
|
Bash shell, GNU C++ compiler, libtool, autoconf and automake tools are required
|
|
|
|
|
for compiling the SoundTouch library. These are usually included with the
|
|
|
|
|
GNU/Linux distribution, but if not, install these packages first. For example,
|
|
|
|
|
in Ubuntu Linux these can be acquired and installed with the following command:</p>
|
|
|
|
|
<pre><b>sudo apt-get install automake autoconf libtool build-essential</b></pre>
|
|
|
|
|
<h4><b>2.2.2 Problems with GCC compiler compatibility</b></h4>
|
|
|
|
|
<p>At the release time the SoundTouch package has been tested to compile in
|
|
|
|
|
GNU/Linux platform. However, in past it's happened that new gcc versions aren't
|
|
|
|
|
necessarily compatible with the assembler settings used in the optimized
|
|
|
|
|
routines. <b>If you have problems getting the SoundTouch library compiled, try the
|
|
|
|
|
workaround of disabling the optimizations</b> by editing the file
|
|
|
|
|
"include/STTypes.h" and removing the following definition there:</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>#define ALLOW_OPTIMIZATIONS 1</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<h4><b>2.2.3 Problems with configure script or build process</b> </h4>
|
|
|
|
|
<p>Incompatibilities between various GNU toolchain versions may cause errors when
|
|
|
|
|
running the "configure" script or building the source codes, if your GNU tool
|
|
|
|
|
versions are not compatible with the versions used for preparing the SoundTouch
|
|
|
|
|
kit. </p>
|
|
|
|
|
<p>To resolve the issue, regenerate the configure scripts with your local tool set
|
|
|
|
|
by running the "<b>./bootstrap</b>" script included in the SoundTouch source
|
|
|
|
|
code kit. After that, run the <b>configure</b> script and <b>make</b> as
|
|
|
|
|
usually.</p>
|
|
|
|
|
<h4><b>2.2.4 Compiler issues with non-x86 processors</b></h4>
|
|
|
|
|
<p>SoundTouch library works also on non-x86 processors.</p>
|
|
|
|
|
<p>However, in case that you get compiler errors when trying to compile for
|
|
|
|
|
non-Intel processor, edit the file "<b>source\SoundTouch\Makefile.am</b>" and
|
|
|
|
|
remove the "<b>-msse2</b>" flag on the <b>AM_CXXFLAGS </b>line:</p>
|
|
|
|
|
<pre><b>AM_CXXFLAGS=-O3 -fcheck-new -I../../include # Note: -msse2 flag removed!</b></pre>
|
|
|
|
|
<p>After that, run "<b>./bootstrap</b>" script, and then run <b>configure</b> and <b>make</b>
|
|
|
|
|
again.</p>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>3. About implementation & Usage tips</h2>
|
|
|
|
|
<h3>3.1. Supported sample data formats</h3>
|
|
|
|
|
<p>The sample data format can be chosen between 16bit signed integer and 32bit
|
|
|
|
|
floating point values, the default is 32bit floating point.
|
|
|
|
|
</p>
|
|
|
|
|
<p>
|
|
|
|
|
In Windows environment, the sample data format is chosen in file "STTypes.h" by
|
|
|
|
|
choosing one of the following defines:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
<span style="FONT-WEIGHT: bold">#define INTEGER_SAMPLES</span>
|
|
|
|
|
for 16bit signed integer
|
|
|
|
|
<li>
|
|
|
|
|
<span style="FONT-WEIGHT: bold">#define FLOAT_SAMPLES</span>
|
|
|
|
|
for 32bit floating point</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p>
|
|
|
|
|
In GNU environment, the floating sample format is used by default, but integer
|
|
|
|
|
sample format can be chosen by giving the following switch to the configure
|
|
|
|
|
script: <blockquote>
|
|
|
|
|
<pre>./configure --enable-integer-samples</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p>The sample data can have either single (mono) or double (stereo) audio channel.
|
|
|
|
|
Stereo data is interleaved so that every other data value is for left channel
|
|
|
|
|
and every second for right channel. Notice that while it'd be possible in
|
|
|
|
|
theory to process stereo sound as two separate mono channels, this isn't
|
|
|
|
|
recommended because processing the channels separately would result in losing
|
|
|
|
|
the phase coherency between the channels, which consequently would ruin the
|
|
|
|
|
stereo effect.</p>
|
|
|
|
|
<p>Sample rates between 8000-48000H are supported.</p>
|
|
|
|
|
<h3>3.2. Processing latency</h3>
|
|
|
|
|
<p>The processing and latency constraints of the SoundTouch library are:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Input/output processing latency for the SoundTouch processor is around 100 ms.
|
|
|
|
|
This is when time-stretching is used. If the rate transposing effect alone is
|
|
|
|
|
used, the latency requirement is much shorter, see section 'About algorithms'.
|
|
|
|
|
<li>
|
|
|
|
|
Processing CD-quality sound (16bit stereo sound with 44100H sample rate) in
|
|
|
|
|
real-time or faster is possible starting from processors equivalent to Intel
|
|
|
|
|
Pentium 133Mh or better, if using the "quick" processing algorithm. If not
|
|
|
|
|
using the "quick" mode or if floating point sample data are being used, several
|
|
|
|
|
times more CPU power is typically required.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<h3>3.3. About algorithms</h3>
|
|
|
|
|
<p>SoundTouch provides three seemingly independent effects: tempo, pitch and
|
|
|
|
|
playback rate control. These three controls are implemented as combination of
|
|
|
|
|
two primary effects, <em>sample rate transposing</em> and <em>time-stretching</em>.</p>
|
|
|
|
|
<p><em>Sample rate transposing</em> affects both the audio stream duration and
|
|
|
|
|
pitch. It's implemented simply by converting the original audio sample stream
|
|
|
|
|
to the desired duration by interpolating from the original audio samples.
|
|
|
|
|
In SoundTouch, linear interpolation with anti-alias filtering is used.
|
|
|
|
|
Theoretically a higher-order interpolation provide better result than 1st order
|
|
|
|
|
linear interpolation, but in audio application linear interpolation together
|
|
|
|
|
with anti-alias filtering performs subjectively about as well as higher-order
|
|
|
|
|
filtering would.</p>
|
|
|
|
|
<p><em>Time-stretching </em>means changing the audio stream duration without
|
|
|
|
|
affecting it's pitch. SoundTouch uses WSOLA-like time-stretching routines that
|
|
|
|
|
operate in the time domain. Compared to sample rate transposing,
|
|
|
|
|
time-stretching is a much heavier operation and also requires a longer
|
|
|
|
|
processing "window" of sound samples used by the processing algorithm, thus
|
|
|
|
|
increasing the algorithm input/output latency. Typical i/o latency for the
|
|
|
|
|
SoundTouch time-stretch algorithm is around 100 ms.</p>
|
|
|
|
|
<p>Sample rate transposing and time-stretching are then used together to produce
|
|
|
|
|
the tempo, pitch and rate controls:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
<strong>'Tempo'</strong>
|
|
|
|
|
control is implemented purely by time-stretching.
|
|
|
|
|
<li>
|
|
|
|
|
<strong>'Rate</strong>' control is implemented purely by sample rate
|
|
|
|
|
transposing.
|
|
|
|
|
<li>
|
|
|
|
|
<strong>'Pitch</strong>' control is implemented as a combination of
|
|
|
|
|
time-stretching and sample rate transposing. For example, to increase pitch the
|
|
|
|
|
audio stream is first time-stretched to longer duration (without affecting
|
|
|
|
|
pitch) and then transposed back to original duration by sample rate
|
|
|
|
|
transposing, which simultaneously reduces duration and increases pitch. The
|
|
|
|
|
result is original duration but increased pitch.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<h3>3.4 Tuning the algorithm parameters</h3>
|
|
|
|
|
<p>The time-stretch algorithm has few parameters that can be tuned to optimize
|
|
|
|
|
sound quality for certain application. The current default parameters have been
|
|
|
|
|
chosen by iterative if-then analysis (read: "trial and error") to obtain best
|
|
|
|
|
subjective sound quality in pop/rock music processing, but in applications
|
|
|
|
|
processing different kind of sound the default parameter set may result into a
|
|
|
|
|
sub-optimal result.</p>
|
|
|
|
|
<p>The time-stretch algorithm default parameter values are set by the following
|
|
|
|
|
#defines in file "TDStretch.h":</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>#define DEFAULT_SEQUENCE_MS AUTOMATIC
|
2009-01-25 14:12:12 +00:00
|
|
|
|
#define DEFAULT_SEEKWINDOW_MS AUTOMATIC
|
2009-12-28 20:51:18 +00:00
|
|
|
|
#define DEFAULT_OVERLAP_MS 8</pre>
|
2010-01-24 12:40:30 +00:00
|
|
|
|
</blockquote>
|
|
|
|
|
<p>These parameters affect to the time-stretch algorithm as follows:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
<strong>DEFAULT_SEQUENCE_MS</strong>: This is the default length of a single
|
|
|
|
|
processing sequence in milliseconds which determines the how the original sound
|
|
|
|
|
is chopped in the time-stretch algorithm. Larger values mean fewer sequences
|
|
|
|
|
are used in processing. In principle a larger value sounds better when slowing
|
|
|
|
|
down the tempo, but worse when increasing the tempo and vice versa. <br>
|
|
|
|
|
<br>
|
|
|
|
|
By default, this setting value is calculated automatically according to tempo
|
|
|
|
|
value.<br>
|
|
|
|
|
<li>
|
|
|
|
|
<strong>DEFAULT_SEEKWINDOW_MS</strong>: The seeking window default length in
|
|
|
|
|
milliseconds is for the algorithm that seeks the best possible overlapping
|
|
|
|
|
location. This determines from how wide a sample "window" the algorithm can use
|
|
|
|
|
to find an optimal mixing location when the sound sequences are to be linked
|
|
|
|
|
back together. <br>
|
|
|
|
|
<br>
|
|
|
|
|
The bigger this window setting is, the higher the possibility to find a better
|
|
|
|
|
mixing position becomes, but at the same time large values may cause a
|
|
|
|
|
"drifting" sound artifact because neighboring sequences can be chosen at more
|
|
|
|
|
uneven intervals. If there's a disturbing artifact that sounds as if a constant
|
|
|
|
|
frequency was drifting around, try reducing this setting.<br>
|
|
|
|
|
<br>
|
|
|
|
|
By default, this setting value is calculated automatically according to tempo
|
|
|
|
|
value.<br>
|
|
|
|
|
<li>
|
|
|
|
|
<strong>DEFAULT_OVERLAP_MS</strong>: Overlap length in milliseconds. When the
|
|
|
|
|
sound sequences are mixed back together to form again a continuous sound
|
|
|
|
|
stream, this parameter defines how much the ends of the consecutive sequences
|
|
|
|
|
will overlap with each other.<br>
|
|
|
|
|
<br>
|
|
|
|
|
This shouldn't be that critical parameter. If you reduce the
|
|
|
|
|
DEFAULT_SEQUENCE_MS setting by a large amount, you might wish to try a smaller
|
|
|
|
|
value on this.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p>Notice that these parameters can also be set during execution time with
|
|
|
|
|
functions "<strong>TDStretch::setParameters()</strong>" and "<strong>SoundTouch::setSetting()</strong>".</p>
|
|
|
|
|
<p>The table below summaries how the parameters can be adjusted for different
|
|
|
|
|
applications:</p>
|
|
|
|
|
<table border="1">
|
|
|
|
|
<tbody>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top"><strong>Parameter name</strong></td>
|
|
|
|
|
<td valign="top"><strong>Default value magnitude</strong></td>
|
|
|
|
|
<td valign="top"><strong>Larger value affects...</strong></td>
|
|
|
|
|
<td valign="top"><strong>Smaller value affects...</strong></td>
|
|
|
|
|
<td valign="top"><strong>Effect to CPU burden</strong></td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>SEQUENCE_MS</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Default value is relatively large, chosen for slowing down music
|
|
|
|
|
tempo</td>
|
|
|
|
|
<td valign="top">Larger value is usually better for slowing down tempo. Growing the
|
|
|
|
|
value decelerates the "echoing" artifact when slowing down the tempo.</td>
|
|
|
|
|
<td valign="top">Smaller value might be better for speeding up tempo. Reducing the
|
|
|
|
|
value accelerates the "echoing" artifact when slowing down the tempo
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Increasing the parameter value reduces computation burden</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>SEEKWINDOW_MS</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Default value is relatively large, chosen for slowing down music
|
|
|
|
|
tempo</td>
|
|
|
|
|
<td valign="top">Larger value eases finding a good mixing position, but may cause a
|
|
|
|
|
"drifting" artifact</td>
|
|
|
|
|
<td valign="top">Smaller reduce possibility to find a good mixing position, but
|
|
|
|
|
reduce the "drifting" artifact.</td>
|
|
|
|
|
<td valign="top">Increasing the parameter value increases computation burden</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>OVERLAP_MS</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Default value is relatively large, chosen to suit with above
|
|
|
|
|
parameters.</td>
|
|
|
|
|
<td valign="top"> </td>
|
|
|
|
|
<td valign="top">If you reduce the "sequence ms" setting, you might wish to try a
|
|
|
|
|
smaller value.</td>
|
|
|
|
|
<td valign="top">Increasing the parameter value increases computation burden</td>
|
|
|
|
|
</tr>
|
|
|
|
|
</tbody>
|
|
|
|
|
</table>
|
|
|
|
|
<h3>3.5 Performance Optimizations
|
|
|
|
|
</h3>
|
|
|
|
|
<p><strong>General optimizations:</strong></p>
|
|
|
|
|
<p>The time-stretch routine has a 'quick' mode that substantially speeds up the
|
|
|
|
|
algorithm but may degrade the sound quality by a small amount. This mode is
|
|
|
|
|
activated by calling SoundTouch::setSetting() function with parameter id
|
|
|
|
|
of SETTING_USE_QUICKSEEK and value "1", i.e.
|
|
|
|
|
</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<p>setSetting(SETTING_USE_QUICKSEEK, 1);</p>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p><strong>CPU-specific optimizations:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Intel MMX optimized routines are used with compatible CPUs when 16bit integer
|
|
|
|
|
sample type is used. MMX optimizations are available both in Win32 and Gnu/x86
|
|
|
|
|
platforms. Compatible processors are Intel PentiumMMX and later; AMD K6-2,
|
|
|
|
|
Athlon and later.
|
|
|
|
|
<li>
|
|
|
|
|
Intel SSE optimized routines are used with compatible CPUs when floating point
|
|
|
|
|
sample type is used. SSE optimizations are currently implemented for Win32
|
|
|
|
|
platform only. Processors compatible with SSE extension are Intel processors
|
|
|
|
|
starting from Pentium-III, and AMD processors starting from Athlon XP.
|
|
|
|
|
<li>
|
|
|
|
|
AMD 3DNow! optimized routines are used with compatible CPUs when floating point
|
|
|
|
|
sample type is used, but SSE extension isn't supported . 3DNow! optimizations
|
|
|
|
|
are currently implemented for Win32 platform only. These optimizations are used
|
|
|
|
|
in AMD K6-2 and Athlon (classic) CPU's; better performing SSE routines are used
|
|
|
|
|
with AMD processor starting from Athlon XP.
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2><a name="SoundStretch"></a>4. SoundStretch audio processing utility
|
|
|
|
|
</h2>
|
|
|
|
|
<p>SoundStretch audio processing utility<br>
|
|
|
|
|
Copyright (c) Olli Parviainen 2002-2009</p>
|
|
|
|
|
<p>SoundStretch is a simple command-line application that can change tempo, pitch
|
|
|
|
|
and playback rates of WAV sound files. This program is intended primarily to
|
|
|
|
|
demonstrate how the "SoundTouch" library can be used to process sound in your
|
|
|
|
|
own program, but it can as well be used for processing sound files.</p>
|
|
|
|
|
<h3>4.1. SoundStretch Usage Instructions</h3>
|
|
|
|
|
<p>SoundStretch Usage syntax:</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch infilename outfilename [switches]</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p>Where:
|
|
|
|
|
</p>
|
|
|
|
|
<table border="0" cellpadding="2" width="100%">
|
|
|
|
|
<tbody>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>"infilename"</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Name of the input sound data file (in .WAV audio file format).
|
|
|
|
|
Give "stdin" as filename to use standard input pipe.
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>"outfilename"</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Name of the output sound file where the resulting sound is saved
|
|
|
|
|
(in .WAV audio file format). This parameter may be omitted if you don't
|
|
|
|
|
want to save the output (e.g. when only calculating BPM rate with '-bpm'
|
|
|
|
|
switch). Give "stdout" as filename to use standard output pipe.</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre> [switches]</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Are one or more control switches.</td>
|
|
|
|
|
</tr>
|
|
|
|
|
</tbody>
|
|
|
|
|
</table>
|
|
|
|
|
<p>Available control switches are:</p>
|
|
|
|
|
<table border="0" cellpadding="2" width="100%">
|
|
|
|
|
<tbody>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-tempo=n </pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Change the sound tempo by n percents (n = -95.0 .. +5000.0 %)
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-pitch=n</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Change the sound pitch by n semitones (n = -60.0 .. + 60.0
|
|
|
|
|
semitones)
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-rate=n</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Change the sound playback rate by n percents (n = -95.0 .. +5000.0
|
|
|
|
|
%)
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-bpm=n</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Detect the Beats-Per-Minute (BPM) rate of the sound and adjust the
|
|
|
|
|
tempo to meet 'n' BPMs. When this switch is applied, the "-tempo" switch is
|
|
|
|
|
ignored. If "=n" is omitted, i.e. switch "-bpm" is used alone, then the BPM
|
|
|
|
|
rate is estimated and displayed, but tempo not adjusted according to the BPM
|
|
|
|
|
value.
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-quick</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Use quicker tempo change algorithm. Gains speed but loses sound
|
|
|
|
|
quality.
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-naa</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Don't use anti-alias filtering in sample rate transposing. Gains
|
|
|
|
|
speed but loses sound quality.
|
|
|
|
|
</td>
|
|
|
|
|
</tr>
|
|
|
|
|
<tr>
|
|
|
|
|
<td valign="top">
|
|
|
|
|
<pre>-license</pre>
|
|
|
|
|
</td>
|
|
|
|
|
<td valign="top">Displays the program license text (LGPL)</td>
|
|
|
|
|
</tr>
|
|
|
|
|
</tbody>
|
|
|
|
|
</table>
|
|
|
|
|
<p>Notes:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
To use standard input/output pipes for processing, give "stdin" and "stdout" as
|
|
|
|
|
input/output filenames correspondingly. The standard input/output pipes will
|
|
|
|
|
still carry the audio data in .wav audio file format.
|
|
|
|
|
<li>
|
|
|
|
|
The numerical switches allow both integer (e.g. "-tempo=123") and decimal (e.g.
|
|
|
|
|
"-tempo=123.45") numbers.
|
|
|
|
|
<li>
|
|
|
|
|
The "-naa" and/or "-quick" switches can be used to reduce CPU usage while
|
|
|
|
|
compromising some sound quality
|
|
|
|
|
<li>
|
|
|
|
|
The BPM detection algorithm works by detecting repeating bass or drum patterns
|
|
|
|
|
at low frequencies of <250Hz. A lower-than-expected BPM figure may be
|
|
|
|
|
reported for music with uneven or complex bass patterns.
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<h3>4.2. SoundStretch usage examples
|
|
|
|
|
</h3>
|
|
|
|
|
<p><strong>Example 1</strong></p>
|
|
|
|
|
<p>The following command increases tempo of the sound file "originalfile.wav" by
|
|
|
|
|
12.5% and stores result to file "destinationfile.wav":</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch originalfile.wav destinationfile.wav -tempo=12.5</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p><strong>Example 2</strong></p>
|
|
|
|
|
<p>The following command decreases the sound pitch (key) of the sound file
|
|
|
|
|
"orig.wav" by two semitones and stores the result to file "dest.wav":</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch orig.wav dest.wav -pitch=-2</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p><strong>Example 3</strong></p>
|
|
|
|
|
<p>The following command processes the file "orig.wav" by decreasing the sound
|
|
|
|
|
tempo by 25.3% and increasing the sound pitch (key) by 1.5 semitones. Resulting
|
|
|
|
|
.wav audio data is directed to standard output pipe:</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch orig.wav stdout -tempo=-25.3 -pitch=1.5</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p><strong>Example 4</strong></p>
|
|
|
|
|
<p>The following command detects the BPM rate of the file "orig.wav" and adjusts
|
|
|
|
|
the tempo to match 100 beats per minute. Result is stored to file "dest.wav":</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch orig.wav dest.wav -bpm=100</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<p><strong>Example 5</strong></p>
|
|
|
|
|
<p>The following command reads .wav sound data from standard input pipe and
|
|
|
|
|
estimates the BPM rate:</p>
|
|
|
|
|
<blockquote>
|
|
|
|
|
<pre>soundstretch stdin -bpm</pre>
|
|
|
|
|
</blockquote>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>5. Change History</h2>
|
|
|
|
|
<h3>5.1. SoundTouch library Change History
|
|
|
|
|
</h3>
|
|
|
|
|
<p><b>1.5.1pre:</b></p>
|
|
|
|
|
<UL>
|
|
|
|
|
<li>
|
|
|
|
|
Added automatic cutoff threshold adaptation to beat detection routine to better
|
|
|
|
|
adapt BPM calculation to different types of music
|
|
|
|
|
<LI>
|
|
|
|
|
Retired 3DNow! optimization support as 3DNow! is nowadays obsolete and
|
|
|
|
|
assembler code is nuisance to maintain.</LI></UL>
|
|
|
|
|
<p><strong>1.5.0:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Added normalization to correlation calculation and improvement automatic
|
|
|
|
|
seek/sequence parameter calculation to improve sound quality
|
|
|
|
|
<li>
|
|
|
|
|
Bugfixes:
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed negative array indexing in quick seek algorithm
|
|
|
|
|
<li>
|
|
|
|
|
FIR autoalias filter running too far in processing buffer
|
|
|
|
|
<li>
|
|
|
|
|
Check against zero sample count in rate transposing
|
|
|
|
|
<li>
|
|
|
|
|
Fix for x86-64 support: Removed pop/push instructions from the cpu detection
|
|
|
|
|
algorithm.
|
|
|
|
|
<li>
|
|
|
|
|
Check against empty buffers in FIFOSampleBuffer
|
|
|
|
|
<li>
|
|
|
|
|
Other minor fixes & code cleanup</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixes in compilation scripts for non-Intel platforms
|
|
|
|
|
<li>
|
|
|
|
|
Added Dynamic-Link-Library (DLL) version of SoundTouch library build, provided
|
|
|
|
|
with Delphi/Pascal wrapper for calling the dll routines
|
|
|
|
|
<li>
|
|
|
|
|
Added #define PREVENT_CLICK_AT_RATE_CROSSOVER that prevents a click artifact
|
|
|
|
|
when crossing the nominal pitch from either positive to negative side or vice
|
|
|
|
|
versa</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>1.4.1:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed a buffer overflow bug in BPM detect algorithm routines if processing more
|
|
|
|
|
than 2048 samples at one call </li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>1.4.0:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Improved sound quality by automatic calculation of time stretch algorithm
|
|
|
|
|
processing parameters according to tempo setting
|
|
|
|
|
<li>
|
|
|
|
|
Moved BPM detection routines from SoundStretch application into SoundTouch
|
|
|
|
|
library
|
|
|
|
|
<li>
|
|
|
|
|
Bugfixes: Usage of uninitialied variables, GNU build scripts, compiler errors
|
|
|
|
|
due to 'const' keyword mismatch.
|
|
|
|
|
<li>
|
|
|
|
|
Source code cleanup</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.3.1: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Changed static class declaration to GCC 4.x compiler compatible syntax.
|
|
|
|
|
<li>
|
|
|
|
|
Enabled MMX/SSE-optimized routines also for GCC compilers. Earlier the
|
|
|
|
|
MMX/SSE-optimized routines were written in compiler-specific inline assembler,
|
|
|
|
|
now these routines are migrated to use compiler intrinsic syntax which allows
|
|
|
|
|
compiling the same MMX/SSE-optimized source code with both Visual C++ and GCC
|
|
|
|
|
compilers.
|
|
|
|
|
<li>
|
|
|
|
|
Set floating point as the default sample format and added switch to the GNU
|
|
|
|
|
configure script for selecting the other sample format.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.3.0: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed tempo routine output duration inaccuracy due to rounding error
|
|
|
|
|
<li>
|
|
|
|
|
Implemented separate processing routines for integer and floating arithmetic to
|
|
|
|
|
allow improvements to floating point routines (earlier used algorithms mostly
|
|
|
|
|
optimized for integer arithmetic also for floating point samples)
|
|
|
|
|
<li>
|
|
|
|
|
Fixed a bug that distorts sound if sample rate changes during the sound stream
|
|
|
|
|
<li>
|
|
|
|
|
Fixed a memory leak that appeared in MMX/SSE/3DNow! optimized routines
|
|
|
|
|
<li>
|
|
|
|
|
Reduced redundant code pieces in MMX/SSE/3DNow! optimized routines vs. the
|
|
|
|
|
standard C routines.
|
|
|
|
|
<li>
|
|
|
|
|
MMX routine incompatibility with new gcc compiler versions
|
|
|
|
|
<li>
|
|
|
|
|
Other miscellaneous bug fixes
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.2.1: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Added automake/autoconf scripts for GNU platforms (in courtesy of David Durham)
|
|
|
|
|
<li>
|
|
|
|
|
Fixed SCALE overflow bug in rate transposer routine.
|
|
|
|
|
<li>
|
|
|
|
|
Fixed 64bit address space bugs.
|
|
|
|
|
<li>
|
|
|
|
|
Created a 'soundtouch' namespace for SAMPLETYPE definitions.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.2.0: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Added support for 32bit floating point sample data type with SSE/3DNow!
|
|
|
|
|
optimizations for Win32 platform (SSE/3DNow! optimizations currently not
|
|
|
|
|
supported in GCC environment)
|
|
|
|
|
<li>
|
|
|
|
|
Replaced 'make-gcc' script for GNU environment by master Makefile
|
|
|
|
|
<li>
|
|
|
|
|
Added time-stretch routine configurability to SoundTouch main class
|
|
|
|
|
<li>
|
|
|
|
|
Bugfixes</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.1.1: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Moved SoundTouch under lesser GPL license (LGPL). This allows using SoundTouch
|
|
|
|
|
library in programs that aren't released under GPL license.
|
|
|
|
|
<li>
|
|
|
|
|
Changed MMX routine organiation so that MMX optimized routines are now
|
|
|
|
|
implemented in classes that are derived from the basic classes having the
|
|
|
|
|
standard non-mmx routines.
|
|
|
|
|
<li>
|
|
|
|
|
MMX routines to support gcc version 3.
|
|
|
|
|
<li>
|
|
|
|
|
Replaced windows makefiles by script using the .dsw files
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.01: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
"mmx_gcc.cpp": Added "using namespace std" and removed "return 0" from a
|
|
|
|
|
function with void return value to fix compiler errors when compiling the
|
|
|
|
|
library in Solaris environment.
|
|
|
|
|
<li>
|
|
|
|
|
Moved file "FIFOSampleBuffer.h" to "include" directory to allow accessing the
|
|
|
|
|
FIFOSampleBuffer class from external files.
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.0: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Initial release
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p> </p>
|
|
|
|
|
<h3>5.2. SoundStretch application Change History
|
|
|
|
|
</h3>
|
|
|
|
|
<p><b>v1.5.0:</b></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Added "-speech" switch to activate algorithm parameters more suitable for
|
|
|
|
|
speech processing than the default parameters tuned for music processing.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.4.0:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Moved BPM detection routines from SoundStretch application into SoundTouch
|
|
|
|
|
library
|
|
|
|
|
<li>
|
|
|
|
|
Allow using standard input/output pipes as audio processing input/output
|
|
|
|
|
streams</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.3.0:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Simplified accessing WAV files with floating point sample format.
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.2.1: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed 64bit address space bugs.</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.2.0: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Added support for 32bit floating point sample data type
|
|
|
|
|
<li>
|
|
|
|
|
Restructured the BPM routines into separate library
|
|
|
|
|
<li>
|
|
|
|
|
Fixed big-endian conversion bugs in WAV file routines (hopefully :)</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.1.1: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed bugs in WAV file reading & added byte-order conversion for big-endian
|
|
|
|
|
processors.
|
|
|
|
|
<li>
|
|
|
|
|
Moved SoundStretch source code under 'example' directory to highlight
|
|
|
|
|
difference from SoundTouch stuff.
|
|
|
|
|
<li>
|
|
|
|
|
Replaced windows makefiles by script using the .dsw files
|
|
|
|
|
<li>
|
|
|
|
|
Output file name isn't required if output isn't desired (e.g. if using the
|
|
|
|
|
switch '-bpm' in plain format only)
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.1:</strong></p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Fixed "Release" settings in Microsoft Visual C++ project file (.dsp)
|
|
|
|
|
<li>
|
|
|
|
|
Added beats-per-minute (BPM) detection routine and command-line switch "-bpm"
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p><strong>v1.01: </strong>
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Initial release
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>6. Acknowledgements
|
|
|
|
|
</h2>
|
|
|
|
|
<p>Kudos for these people who have contributed to development or submitted bugfixes
|
|
|
|
|
since SoundTouch v1.3.1:
|
|
|
|
|
</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
Arthur A
|
|
|
|
|
<li>
|
|
|
|
|
Richard Ash
|
|
|
|
|
<li>
|
|
|
|
|
Stanislav Brabec
|
|
|
|
|
<li>
|
|
|
|
|
Christian Budde
|
|
|
|
|
<li>
|
|
|
|
|
Brian Cameron
|
|
|
|
|
<li>
|
|
|
|
|
Jason Champion
|
|
|
|
|
<li>
|
|
|
|
|
Patrick Colis
|
|
|
|
|
<li>
|
|
|
|
|
Justin Frankel
|
|
|
|
|
<li>
|
|
|
|
|
Jason Garland
|
|
|
|
|
<li>
|
|
|
|
|
Takashi Iwai
|
|
|
|
|
<li>
|
|
|
|
|
Paulo Pizarro
|
|
|
|
|
<li>
|
|
|
|
|
RJ Ryan
|
|
|
|
|
<li>
|
|
|
|
|
John Sheehy</li>
|
|
|
|
|
</ul>
|
|
|
|
|
<p>Moral greetings to all other contributors and users also!</p>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>7. LICENSE
|
|
|
|
|
</h2>
|
|
|
|
|
<p>SoundTouch audio processing library<br>
|
|
|
|
|
Copyright (c) Olli Parviainen</p>
|
|
|
|
|
<p>This library is free software; you can redistribute it and/or modify it under
|
|
|
|
|
the terms of the GNU Lesser General Public License version 2.1 as published by
|
|
|
|
|
the Free Software Foundation.</p>
|
|
|
|
|
<p>This library is distributed in the hope that it will be useful, but WITHOUT ANY
|
|
|
|
|
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
|
|
|
|
PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.</p>
|
|
|
|
|
<p>You should have received a copy of the GNU Lesser General Public License along
|
|
|
|
|
with this library; if not, write to the Free Software Foundation, Inc., 59
|
|
|
|
|
Temple Place, Suite 330, Boston, MA 02111-1307 USA</p>
|
|
|
|
|
<hr>
|
|
|
|
|
<!--
|
2008-12-25 12:38:45 +00:00
|
|
|
|
$Id$
|
|
|
|
|
-->
|
2010-01-24 12:40:30 +00:00
|
|
|
|
</body>
|
2008-12-25 12:38:45 +00:00
|
|
|
|
</html>
|