【转载】FFT教程

A Tutorial on the Fast Fourier Transform
1. Introduction to digital audio
If you are already familiar with general digital audio concepts, you can ski
p this section.
The most common type of digital audio recording is called pulse code modulat
ion (PCM). Pulse code modulation is what compact discs and most WAV files us
e. In PCM recording hardware, a microphone converts a varying air pressure (
sound waves) into a varying voltage. Then an analog-to-digital converter mea
sures (samples) the voltage at regular intervals of time. For example, in a
compact disc audio recording, there are exactly 44,100 samples taken every s                                           
econd. Each sampled voltage gets converted into a 16-bit integer. A CD conta
ins two channels of data: one for the left ear and one for the right ear, to
 produce stereo. The two channels are independent recordings placed "side by
 side" on the compact disc. (Actually, the data for the left and right chann
el alternate...left, right, left, right, ... like marching feet.)
The data that results from a PCM recording is a function of time. It often a
mazes people that a sequence of millions of integers on a compact disc recor
ding can yield music and speech. People tend to wonder, "How can a stream of
 numbers sound like an entire orchestra?" It seems magical, and it is! Yet t
he magic is not in the digital recording; it's in your ear and your brain. T
o understand why this is true, imagine that you could place a microscopic mo
vie camera in your ear to film your ear drum in slow motion. Suppose the mov
ie camera was so fast that it could take a picture every 1/44,100 of a secon
d. Also, suppose that the images this camera captured on film were so crisp
and sharp that you could discern 65,536 (64K) distinct positions of the ear
drum's surface as it moved back and forth in response to incoming sound wave
s. If you used this hypothetical technology to film your ear drum while list
ening to your best friend saying your name, then took the resulting movie an
d wrote down the numeric position of your ear drum in every frame of the mov
ie, you would have a digital PCM recording. If you could later make your ear
 drum move back and forth in accordance with the thousands of numbers you ha
dompact disc audio recording, there are exactly 44,100 samples taken every s                                           
econd. Each sampled voltage gets converted into a 16-bit integer. A CD conta
ins two channels of data: one for the left ear and one for the right ear, to
 produce stereo. The two channels are independent recordings placed "side by
 side" on the compact disc. (Actually, the data for the left and right chann
el alternate...left, right, left, right, ... like marching feet.)
The data that results from a PCM recording is a function of time. It often a
mazes people that a sequence of millions of integers on a compact disc recor
ding can yield music and speech. People tend to wonder, "How can a stream of
 numbers sound like an entire orchestra?" It seems magical, and it is! Yet t
 a series of three tiny bones. It consists of a spiral of tissue filled with
 liquid and thousands of tiny hairs. The hairs on the outside of the spiral
are longer than the hairs on the inside of the spiral. In fact, the hairs ge
t gradually smaller as you wind your way around the spiral to the inside. Ea
ch hair is connected to a nerve which feeds into the auditory nerve bundle g
oing to the brain. The longer hairs resonate with lower frequency sounds, an
d the shorter hairs with higher frequencies. Thus the cochlea serves to tran
sform the air pressure signal experienced by the ear drum into frequency inf
ormation which can be interpreted by the brain as tonality and texture. This
 way, we can tell the difference between adjacent notes on a piano, even if
they are played equally loud. The Fourier Transform is a mathematical techni
que for doing a similar thing: resolving any time-domain function into a fre
quency spectrum, much like a prism splitting light into a spectrum of colors
. This analogy is not perfect, but it gets the basic idea across.
3. The Fourier Transform as a mathematical co concept still holds. When you
hear more than one thing at a time, all the distinct sounds are physically m
ixed together in your ears as a single pattern of varying air pressure. Your
 ears and your brain work together to analyze this signal back into separate
 auditory sensations. It's literally all in your head!
2. Frequency information in a function of time
An organ in our inner ears called the cochlea enables us to detect tonality
in the sounds we hear. The cochlea is acoustically coupled to the eardrum by
 a series of three tiny bones. It consists of a spiral of tissue filled with
 liquid and thousands of tiny hairs. The hairs on the outside of the spiral
are longer than the hairs on the inside of the spiral. In fact, the hairs ge
t gradually smaller as you wind your way around the spiral to the inside. Ea
ch hair is connected to a nerve which feeds into the auditory nerve bundle g
oing to the brain. The longer hairs resonate with lower frequency sounds, an
d the shorter hairs with higher frequencies. Thus the cochlea serves to tran
sform the air pressure signal experienced by the ear drum into frequency inf
ormation which can be interpreted by the brain as tonality and texture. This
 way, we can tell the difference between adjacent notes on a piano, even if
they are played equally loud. The Fourier Transform is a mathematical techni
que for doing a similar thing: resolving any time-domain function into a fre
quency spectrum, much like a prism splitting light into a spectrum of colors
. This analogy is not perfect, but it gets the basic idea across.
3. The Fourier Transform as a mathematical concept
The Fourier Transform is based on the discovery that it is possible to take
any periodic function of time x(t) and resolve it into an equivalent infinit
e summation of sine waves and cosine waves with frequencies that start at 0
and increase in integer multiples of a base frequency f0 = 1/T, where T is t
he period of x(t). Here is what the expansion looks like:
An expression of the form of the right hand side of this equation is called
a Fourier Series. The job of a Fourier Transform is to figure out all the ak
 and bk values to produce a Fourier Series, given the base frequency and the
 function x(t). You can think of the a0 term outside the summation as the co
sine coefficient for k=0. There is no corresponding zero-frequency sine coef
ficient b0 because the sine of zero is zero, and therefore such a coefficien
t would have no effect.
Of course, we cannot do an infinite summation of any kind on a real computer
, so we have to settle for a finite set of sines and cosines. It turns out t
hat this is easy to do for a digitally sampled input, when we stipulate that
 there will be the same number of frequency output samples as there are time
 input samples. Also, we are fortunate that all digital audio recordings hav
e a finite length. We can pretend that the function x(t) is periodic, and th
at the period is the same as the length of the recording. In other words, im
agine the recording repeating forever, and call thiinto a spectrum of colors
. This analogy is not perfect, but it gets the basic idea across.
3. The Fourier Transform as a mathematical concept
The Fourier Transform is based on the discovery that it is possible to take
any periodic function of time x(t) and resolve it into an equivalent infinit
e summation of sine waves and cosine waves with frequencies that start at 0
and increase in integer multiples of a base frequency f0 = 1/T, where T is t
he period of x(t). Here is what the expansion looks like:
An expression of the form of the right hand side of this equation is called
a Fourier Series. The job of a Fourier Transform is to figure out all the ak
transform is functioning correctly, you could then generate all the sines an
d cosines at these frequencies, multiply them by their respective ak and bk
coefficients, add these all together, and you will get your original recordi
ng back! It's a bit spooky that this actually works!
4. The Discrete Fast Fourier Transform algorithm
The discrete FFT is an algorithm which converts a sampled complex-valued fun
ction of time into a sampled complex-valued function of frequency. Most of t
he time, we want to operate on real-valued functions, so we set all the imag
inary parts of the input to zero. If you want to use my source code, here ar
e some things you need to know.
Your input arrays and output arrays all must have a common size, which we wi
ll call n.
The value of n must be a positive integer power of 2. For example, an array
of 1024 is allowed, but one of size 1000 is not. The smallest allowed array
size is 2. There is no upper limit to the value of n other than limitations
inherent in memory allocation of the four arrays (input{real,imaginary}, out
put{real,imaginary}) and time limits inherent in executing this O(n*log(n))
algorithm.
The realOut array holds the coefficients of the c322 seconds, so the base fr
equency f0 will be 1 / 0.02322 = 43.07 Hz. If you process these 1024 samples
 with the FFT, the output will be the sine and cosine coefficients ak and bk
 for the frequencies 43.07Hz, 2*43.07Hz, 3*43.07Hz, etc. To verify that the
The ordering of the frequencies in the output arrays (realOut and imagOut) a
re a little bit strange, because they contain both positive and negative fre
quencies. Both positive and negative frequencies are necessary for the math
to work when the inputs are complex-valued (i.e. when at least one of the in
puts has a non-zero imaginary component). Most of the time, the FFT is used
for strictly real-valued inputs, and this is especially the case in digital
audio analysis. The FFT, when fed real-valued inputs, gives outputs whose po
sitive and negative frequencies are redundant. It turns out that they are co
mplex conjugates of each other, meaning that their real parts are equal and
their imaginary parts are negatives of each other. If your inputs are all re
al-valued, you can get all the frequency information you need just by lookin
g at the first half of the output arrays.
The first slot of the output, realOut[0] and imagOut[0], contains the averag
e value of all the ibut one of size 1000 is not. The smallest allowed array
size is 2. There is no upper limit to the value of n other than limitations
inherent in memory allocation of the four arrays (input{real,imaginary}, out
put{real,imaginary}) and time limits inherent in executing this O(n*log(n))
algorithm.
The realOut array holds the coefficients of the cosine waves in the Fourier
formula.
The imagOut array holds the coefficients of the sine waves in the Fourier fo
rmula.
while i = 1024 - 17 = 1007 would correspond to -732.13 Hz, etc.
The index n/2 is a special case: it corresponds to the Nyquist frequency, wh
ich is always half the sampling rate in any digital PCM recording. For examp
le, the Nyquist frequency in a CD audio recording is (44,100 Hz)/2 = 22,050
Hz. The Nyquist frequency is important because it is the highest frequency t
hat a PCM digital audio recording can reproduce. Nothing above this frequenc
y can be represented by PCM. If you try to sample a signal which is higher i
n frequency than the Nyquist frequency, severe distortion known as aliasing
occurs, analogous to optical illusions of slow or backwards motion in films
of wagon wheels in old Westerns. Recording engineers know that they have to
filter out any signals above the Nyquist frequency in their analog equipment
 before the signals are digitally sampled, in order to avoid aliasing proble
ms. Note that the Nyquist frequency limitation is inherent in the digital re
cording itself, whether or not an FFT is ever performed on the recording.
If the input to the FFT is real, the Nyquist frequency index n/2 in the outp
ut will always have a real valupart of every positive frequency index i = 1,
 2, 3, ..., n/2 - 1, is i' = n - i. Here's an example. Suppose the sampling
rate is 44,100 Hz, and we are using buffers of 1024 complex numbers for both
 the inputs and outputs. The frequency at i = 1 would be (44,100 Hz) * 1 / 1
024 = 43.07 Hz. The negative frequency counterpart would be at i = 1024 - 1
= 1023 (i.e. the last slot in the array), with a frequency of -43.07 Hz. Lik
ewise, i = 17 would correspond to a frequency of (43.07 Hz)*17 = 732.13 Hz,
        #include <math.h>
        double magnitude = sqrt ( realOut[i]*realOut[i] + imagOut[i]*imagOut[i]
);
        double angle = atan2 ( imagOut[i], realOut[i] );
If you are interested in doing the inverse conversion, from magnitude and an
gle to real and imaginary, use the following code:
        #include <math.h>
        double real = magnitude * cos(angle);
        double imag = magnitude * sin(angle);
If you need to have a rigorous mathematical understanding of the FFT, here i
s an equation which tells you the exact relationship between the inputs and
outputs. In this equation, xk is the kth complex-valued input (time-domain)
sample, yp is the pth complex-valued output (frequency-domain) sample, and n
 = 2^N is the total number of samples. Note that k and p are in the range 0
.. n-1.
Although this formula tells you what the FFT is equivalent to, this formula
is not how the FFT algor to zero due to floating-point roundoff errors). Thi
s is because, using the formula above, i' = n - n/2 = n/2, so the Nyquist fr
equency is its own negative frequency counterpart. Therefore, it must equal
its own complex conjugate, which in turn forces it to be a real number.
Sometimes you will be interested only in the magnitude of each frequency com
ponent, or its phase angle. Here is how you calculate each:
        #include <math.h>
        double magnitude = sqrt ( realOut[i]*realOut[i] + imagOut[i]*imagOut[i]
);
        double angle = atan2 ( imagOut[i], realOut[i] );
If you are interested in doing the inverse conversion, from magnitude and an
gle to real and imaginary, use the following code:
        #include <math.h>
        double real = magnitude * cos(angle);
        double imag = magnitude * sin(angle);
If you need to have a rigorous mathematical understanding of the FFT, here i
s an equation which tells you the exact relationship between the inputs and
outputs. In this equation, xk is the kth complex-valued input (time-domain)
sample, yp is the pth complex-valued output (frequency-domain) sample, and n
 = 2^N is the total number of samples. Note that k and p are in the range 0
.. n-1.
Although this formula tells you what the FFT is equivalent to, this formula
is not how the FFT algorithm is implemented. This formula requires O(n^2) op
erations, whereas the FFT itself is O(n*log2(n)). In other words, if you wer
e to use the formula above, it would be much slower than using the FFT algor
ithm. However, if you only need a small subset of the frequency spectrum (sa
y two or three frequency samples), or you have a number of samples that isn'
t a power of 2, this formula combined with some trig optimizations could be
of practical use. Note that I have done exactly this in my Turbo Pascal sour
ce code in a procedure called CalcFrequency.
5. Applications of the FFT
The FFT algorithm tends to be better suited to analyzing digital audio recor
dings than for filtering or synthesizing sounds. A common example is when yo
u want to do the software equivalent of a device known as a spectrum analyze
r, which electrical engineers use for displaying a graph of the frequency co
ntent of an electrical signal. You can use the FFT to determine the frequenc
y of a note played in recorded music, to try to recognize different kinds of
 birds or insects, etc. The FFT is also useful for things which have nothing
 to do with audio, such as image processing (using a two-dimensional version
 of the FFT). The FFT also has scientific/statistical applications, like try
ing to detect periodic fluctuations in stock prices, animal populations, etc
. F2^N is the total number of samples. Note that k and p are in the range 0
.. n-1.
Although this formula tells you what the FFT is equivalent to, this formula
is not how the FFT algorithm is implemented. This formula requires O(n^2) op
erations, whereas the FFT itself is O(n*log2(n)). In other words, if you wer
e to use the formula above, it would be much slower than using the FFT algor
ithm. However, if you only need a small subset of the frequency spectrum (sa
y two or three frequency samples), or you have a number of samples that isn'
t a power of 2, this formula combined with some trig optimizations could be
Here is an example C++ program of the preceeding method. Notice especially h
ow the functions FadeMix and ShiftData are called from main.
If you really want to do clean sounding algorithmic filters on digital audio
, you should check out time-domain filters (also known as linear filters), w
hich process the input audio samples one at a time, instead of processing bl
ocks of samples.
----------------------------------------------------------------------------
----
Other FFT web sites
FFTW - The "Fastest Fourier Transform in the West" - developed at MIT by Mat
teo Frigo and Steven G. Johnson. Look here if you are interested in portable
 FFT source code for multi-dimensional analysis, or if you have buffer sizes
 which cannot be a power of two.
The Fourier Page by Douglas Nunn. Has lots of interesting links... some tech
nical, some historical.
Public Domain FFT Code - Has public domain source code for Fortran and C. Al
so has a nice theoretical discussion of the continuous Fourier Transform, fr
om which all Discrete Fourier Transforms are derived (including the FFT).
FFT for 2D matrices in Fortran by Hon W Yau, Bhaven Avalani and Alok Choudha
ry.
Numerical Recipes home page. The Numerical Recipes people were responsible f
or the algorithm on which I based m is broken up into chunks like this and p
hich process the input audio samples one at a time, instead of processing bl
ocks of samples.
----------------------------------------------------------------------------
----
Other FFT web sites
FFTW - The "Fastest Fourier Transform in the West" - developed at MIT by Mat
teo Frigo and Steven G. Johnson. Look here if you are interested in portable
 FFT source code for multi-dimensional analysis, or if you have buffer sizes
 which cannot be a power of two.
The Fourier Page by Douglas Nunn. Has lots of interesting links... some tech
nical, some historical.
Public Domain FFT Code - Has public domain source code for Fortran and C. Al
so has a nice theoretical discussion of the continuous Fourier Transform, fr
om which all Discrete Fourier Transforms are derived (including the FFT).
FFT for 2D matrices in Fortran by Hon W Yau, Bhaven Avalani and Alok Choudha
ry.
Numerical Recipes home page. The ion of the continuous Fourier Transform, fr
om which all Discrete Fourier Transforms are derived (including the FFT).
FFT for 2D matrices in Fortran by Hon W Yau, Bhaven Avalani and Alok Choudha
ry.
Numerical Recipes home page. The Numerical Recipes people were responsible f
or the algorithm on which I based my C and Pascal implementations.
hich process the input audio samples one at a time, instead of processing bl
ocks of samples.
--------------------------------------------------------------------------------
Other FFT web sites
FFTW - The "Fastest Fourier Transform in the West" - developed at MIT by Mat
teo Frigo and Steven G. Johnson. Look here if you are interested in portable
 FFT source code for multi-dimensional analysis, or if you have buffer sizes
 which cannot be a power of two.
The Fourier Page by Douglas Nunn. Has lots of interesting links... some tech
nical, some historical.
Public Domain FFT Code - Has public domain source code for Fortran and C. Al
so has a nice theoretical discussion of the continuous Fourier Transform, fr
om which all Discrete Fourier Transforms are derived (including the FFT).
FFT for 2D matrices in Fortran by Hon W Yau, Bhaven Avalani and Alok Choudha
ry.
Numerical Recipes home page. The Numerical Recipes people were responsible f
or the algorithm on which I based my C and Pascal implementations. 

  • 1
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值