Fourier for dummies - under construction

Heater. · 2010-12-11 14:22

Here is a nice look at Tukey the man. www.stat.uchicago.edu/~pmcc/tukey/Tukey.pdf

Did you know that it was Tukey who first called the binary digit a "bit" and the first to call programs "software"?

Despite which he tried to stay away from computers as much as possible.

ErNa · 2010-12-12 05:12

There is always a story behind. Very interesting document! Thanks
Coming up to the half, it seems to be fascination, but not glue, what it is about!

Ray0665 · 2010-12-12 11:30

Thanks also heater I think the fact that he did all these things and had a disdain (maybe actual contempt) for the math is even more amazing. He certainly was a character. So there may still be hope for us.

I personally am at the point where I have gotten to be where I wanted to be in terms of the FFT. It could NOT have happened without this entire discussion side trips and all, and my head is spinning with all sorts of questions about how this can be presented to someone who does not read the entire thread that makes sense.

Do we follow my original outline? Is order of Ch12 good? Where should Beans pdf be positioned, and the expansion of bitwise reversal by Heater,

Somebody, Anybody!!! I am not sleeping and will explode soon... rescue me from this path before I start wanting to do theoretical math - I'm sick and need help bad....

Heater. · 2010-12-12 11:42

Moving on to practical matters.

I'm playing with two Fast Fourier Transforms in C. Both are attached.

fourier.c I found on the net, can't remember where. It seems to be a typical text book FFT implementation using a bit-reversal stage prior to getting on with the real work. And importantly doing all the work "in-place". The output ends up in the input buffer. Good for space constrained micro-controllers. I can't for the life of me get a grip on how this style of FFT works.

ditfft2.c is a bit different. I have written it from by simple understanding of how the maths works. The simple steps it takes to get from the DFT definition to a recursive "divide and conquer" algorithm. It uses recursion rather than the bit-reversal step that is normally employed to remove recursion. I has an input and output buffer so wastes memory somewhat.

ditfft2 is my baby, I like it, I understand it. It is easily derived from the C++ version I posted. Which itself is easily derived from the maths. Problem is a) it uses twice as much memory, b) It takes about 50% to 100% longer to execute on a 1024 sample set.

Now actually that's not so bad. I just skimmed an old paper on FFT which quoted another paper as saying "It has been shown that a recursive FFT implementation will be 6 times slower than the non-recursive approach"

How wrong these math heads can be:)

What is more amazing is that if I crank the sample size up to something like one million then my recursive effort starts to be twice as fast as the traditional code!! Not sure at what point they are neck and neck yet.

Dave Hein · 2010-12-12 11:56

You can do an in-place computation if you compute one stage at a time. A recursive algortithm computes parts of each stage until it reaches the final stage, so it needs to use two buffers. Look at the following diagram. If you compute the 2x2 butterfly operations one stage at a time, the results of each stage can be stored back in the orignal buffer. Also note that this form of the FFT has a minimal number of complex muliplies. The butterfly operations use only adds and subtracts with multiplications between the stages. Another way to reduce computations is to precompute all the sines and cosines that will be used in the tranform so you do re-compute the same sine or cosine multiple times.

The webpage at http://cnx.org/content/m12016/latest/ does a good job of describing the decimation-in-time algorithm.

Heater. · 2010-12-12 12:15

Dave,

Yep I know, I know. And I've seen that diagram a million times. Problems is the more I look at those things the more my head spins, it has not clicked into place. There is no way I could write a non-recursive FFT with bit-reversal from scratch. I just have not convinced myself of the link from the math to the code. It is the many incomplete and hand waving descriptions of that, as seen in the DSP guide and elsewhere, that have befuddled me for decades.

Still with my new found insights I will not give up.

Ray0665,

Somebody, Anybody!!! I am not sleeping and will explode soon... rescue me from this path before I start wanting to do theoretical math - I'm sick and need help bad....

You have my sympathies. I now believe Fourier is a mental illness. It's symptoms include: confusion, withdrawal from the world, feelings of inadequacy (suffers start to refer to themselves as dummies). An intense concentration on increasingly bizarre things, sufferers may start rambling about "imaginary numbers", "negative frequencies" the existence of "harmonics" in all things.

There is no known cure although therapy with other sufferers in self-help groups can alleviate the symptoms. This should be approached with care as sufferers often do not understand each other.

Practically, how we might proceed with "Fourier For Dumies" is a puzzle. I may have shone a light on the bit-reversal but I'm still not seeing how to use it.

As I said before I can't make the link from DFT definition to FFT without the maths so that starts to be out of scope for Dummies.

Where is Bean? He kicked of with a document so the ball is his court:)

Ray0665 · 2010-12-12 14:33

@heater
Fear not! I am here to help, for I have not delved into the recursive stuff yet only the iterative.

Remember the purpose of doing the bit reversal, I'm sure you do. Well the Butterfly does the reverse putting the samples back in the original order and shows where to do the multiplies adds and subtracts. The two point diagram on the to left of the thing Dave just posted is a single stage butterfly, its inputs is two of the scrambled points. that are combined and multiplied by the sin and cos elements. This gets repeated for all the points on one level and again on the next level until in the end it is unscrambled and you have the power spectrum.

More in a bit I have an urgent matter here....

Dave Hein · 2010-12-12 15:11

The key to understanding the flow diagrams is to understand the operations performed in each 2x2 butterfly operation. The FFT butterfly is shown below. This version is computed with a multiply, an add and a subtract. If the inputs are G and H, then the outputs can be computed in-place as follows:

temp = W * H
H = G - temp
G = G + temp

T

Edit:There were some errors in the original expressions I posted. The expressions are now correct.

Ray0665 · 2010-12-12 15:46

Hello again
To continue notice the pattern of the twiddle factors (Thanks Dave). Now here I found Ch 12 page 230 helpful and I don't pretend to be an expert because my focus is in the what and how rather than the theory behind it, but I think it is showing what happens to the frequency domain when we do the bit reversal in the time domain and even though we are not doing the shift and add as shown the bit reversal ends up in that place. Perhaps someone could give some input here?

Also I found the basic program on page 235 and the flow chart on 232 helpful as well.

Ray0665 · 2010-12-13 12:20

Here is a demo in spin of the bit reversal algorithm, More to follow....

Heater. · 2010-12-13 13:20

Ray0665,

You should know that Spin has bitwise reverse operators: ><, ><=

So you can write:

reversedIndex := originalIndex >< 10

Which will reverse the first 10 bits of originalIndex to create reversedIndex.

This would shorten and speed up your code:

Something like the following for an array of 1024 samples (10 bit wide index)

PUB BitReversal
    repeat A from 0 to 1023
        temp := Samples[A]
        B := A >< 10
        Samples[A] := Samples[B]
        Samples[B] := temp

Or something like that:)

Heater. · 2010-12-13 13:45

Dave Hein,

My head is full of butterflies!

No, it's not a flashback to our psychotropic chemistry experiments of the '70s.

I feel that your little hints, clues and pointers have caused a spark which is about to fire up a huge flash light on the bit-reversing FFT for me.

I've been running over the Douglas L. Jones page on http://cnx.org/content/m12016/latest/#fig2so many times. I'm have a feeling that I'm about to be happy with writing such an FFT from scratch.

One sticking point at the moment is: How does it work to go from a 2 by 2 butterfly with two "twiddle" factors to having just one fiddle factor? As in Jones' "Additional Simplification" section.

In that section he takes of into cuckoo land by referring to "the theory of multi-dimensional index maps" which of course links to an unintelligible page. I've never heard of this theory before, please tell me we don't need it for understanding the problem at hand!

Dave Hein · 2010-12-13 13:58

The butterfly with 2 multipliers is shown below. The value of Wn is exp(-i*2*PI/n), where i = sqrt(-1). Wn can also be written as cos(2*PI/n) - i * sin(2*PI/n). Note, the i in the diagram is intended to be an index, and not the imaginary i. We'll use "k" for the index instead of "i" to reduce the confusion.

The two multipliers in the diagram are Wn raised to the k-th power and Wn raised to the (k + n/2) power. The second multiplier is exp(-i*(k+n/2)*2*PI/n), which is the same as exp(-i*2*PI*k/n) * exp(-i*2*PI*n/(2*n)). The second factor is just exp(-i*PI) = cos(PI) - i * sin(PI) = -1. Therefore, the butterfly can be computed by multiplying the second input value times a single twiddle factor, and then computing the sum and difference.

Dave Hein · 2010-12-13 14:25

The 8-point FFT butterfly diagram would translate to the following psuedo-code:

fft8(complex x[8])
{
    complex w[4]

    // Compute W's
    w[0] = 1
    w[1] = cos(2*PI*1/8) - i * sin(2*PI*1/8)
    w[2] = cos(2*PI*2/8) - i * sin(2*PI*2/8)
    w[3] = cos(2*PI*3/8) - i * sin(2*PI*3/8)

    // Bit Reversal
    swap(x[1], x[4])
    swap(x[3], x[6])

    // First Stage
    bufferfly(x[0], x[1], w[0])
    bufferfly(x[2], x[3], w[0])
    bufferfly(x[4], x[5], w[0])
    bufferfly(x[6], x[7], w[0])

    // Second Stage
    bufferfly(x[0], x[2], w[0])
    bufferfly(x[1], x[3], w[2])
    bufferfly(x[4], x[6], w[0])
    bufferfly(x[5], x[7], w[2])

    // Third Stage
    bufferfly(x[0], x[4], w[0])
    bufferfly(x[1], x[5], w[1])
    bufferfly(x[2], x[6], w[2])
    bufferfly(x[3], x[7], w[3])
}

butterfly(complex a, complex b, complex c)
{
    complex temp
    temp = b * c
    b = a - temp
    a = a + temp
}

swap(complex a, complex b)
{
    complex temp
    temp = b
    b = a
    a = temp
}

Ray0665 · 2010-12-13 15:12

I did know about the bit reversal ops but it didn't net my mind when I was writing the demo. I was really just translating from an earlier program I wrote. Thanks for reminding us all. My goal was to isolate the bit reversal with the intent of including it in the final paper so speed is not a consideration. Next I think an implementation of the three main loops without the butterflies or twiddle factors and finally just the butterfly and twiddle factors

If we could get some more volunterers. That would be great

Heater. · 2010-12-13 22:10

Dave,

Brilliant thank you. That searchlight is seriously warming up now.

Strangely enough I now find myself seeing the twiddle factors W as the Nth roots of unity, of which there are N. Seeing them as vectors at equal spaced angles one can visualize that the +N/2 rotates the vector by 180 degrees, which is the same as reversing the vector by negation.

Still your explanation with exp()...is about the last missing piece of the puzzle for me.

As I said before I can't see how to get to the "FFT for Dummies", that is from definition to code, without all that rather undummy maths.

Heater. · 2010-12-13 23:57

Dave,

Why do the k values in your second stage Ws go 0 and 2 rather than 0 and 1 ?

// Second Stage
    bufferfly(x[0], x[2], w[0])
    bufferfly(x[1], x[3], w[2])
    bufferfly(x[4], x[6], w[0])
    bufferfly(x[5], x[7], w[2])

Instead of:

 // Second Stage
    bufferfly(x[0], x[2], w[0])
    bufferfly(x[1], x[3], w[1])
    bufferfly(x[4], x[6], w[0])
    bufferfly(x[5], x[7], w[1])

I know that's how appears in the diagram but it does not agree with my
reading of the decimation equation which I summarize as:

X(k) = DFT_OF_EVEN_ELEMENTS + W{k, N} * DFT_OF_ODD_ELEMENTS

Dave Hein · 2010-12-14 06:12

The w's I computed were based on N = 8, so w[k] = W{k,8}. In the second stage of the 8-point FFT, we're doing the last stage of a 4-point FFT. The weights should be W{1,4}, but we're using W{2,8}, which is equivalent.

Heater. · 2010-12-14 06:20

Dave,

Oh yes, I was just coming to that conclusion as your post arrived. At each stage down the N gets halved and so....

Which is actually the same as the answer I gave to a question many posts back but I did not see the connection immediately:)

Thanks for confirming my suspicions. Your pseudo code converted to C++ works just fine with the W{2,8}

Ray0665 · 2010-12-14 11:27

Updated bit reversal in spin. included use of prop bit reversal OP as suggested by heater.
And...

just to prove I still maintain my Dummy status the thing kept returning an ummodified array!!
After too many attempts the light finally went on.
you need do only the first N/2-1 items if you do them all the second half simply undoes everything.

Heater. · 2010-12-14 14:09

Cool. No worries, that's a kind of nice mistake to make:)

Heater. · 2010-12-15 13:21

Ray0665:

I don't think that bit reversal using >< is going to work. It will trample on itself. My fault for being hasty when I suggested using ><.

Thing is, with the rev operator in Spin and the rev instruction in PASM it is possible to fill the FFT array in the time decimated order as the data arrives, say from an ADC. There is almost no overhead with using REV. So it saves needing a bit-reversal stage all together.

Dave Hein · 2010-12-15 13:52

The Spin bit reversal will work if you use the same logic as the "classic" routine. Loop from I = 0 to N-1, but only swap if I < J.

Heater. · 2010-12-15 14:39

Oh my God. I have finally done it.

I've written my very own in-place, non-recursive, bit-reversing, Radix-2 Decimation In Time Fast Fourier Transform.

All without any peeking at existing source codes (honest). My few remaining brain cells are all holding hands in just the right way for me to see the whole thing from the mathematical DFT definition, through Cooley-Tukey decimation, bit-reversal, butterfly simplification (thanks Dave), right down to my code in C++.

All inspired by the lively discussion on this thread. Thank you everybody.

As I suspected if you can fight your way through all the layers of maths gloop and diversions there is actually a beautifully simple idea at the bottom of it all.

My effort is attached for your consideration. I've tried to keep it as straight forward and simple as possible with hopefully meaningful variable names and layout.

Perhaps next up is a C version, a Spin version and if I'm feeling adventurous a PASM version.

I really am a dummy, it's taken me about twenty five years to achieve this level of understanding!

Ray0665 · 2010-12-15 16:25

Dude... Well done!!! But, ya gotta do the spin version....
There seems to be an echo in here about the >< op undoing things. Must be the acoustics...
@Dave I would think N/2-1 is faster than if I < J

prof_braino · 2010-12-15 18:38

Heater. wrote: »

Oh my God. I have finally done it.

I've written my very own in-place, non-recursive, bit-reversing, Radix-2 Decimation In Time Fast Fourier Transform.
...
the whole thing from the mathematical DFT definition, through Cooley-Tukey decimation, bit-reversal, butterfly simplification (thanks Dave), right down to my code in C++.
..there is actually a beautifully simple idea at the bottom of it all.
...
I really am a dummy, it's taken me about twenty five years to achieve this level of understanding!

Congratulations!

So, we have success according the criterion stated in the original post, that being either Heater or I claim understanding by the time work stops.

But while we have evidence of understanding (heater's code), we don't have the text of the explanation yet.

I hope the thread continues in that direction, so I have time to re-read all the posts and study the code example.

Incidentally, it took less than a month of heater's hard work since the original post to reach this point. So far the baseline is twenty five years and three weeks till understanding.

Dave Hein · 2010-12-15 19:02

Ray0665 wrote: »

@Dave I would think N/2-1 is faster than if I < J

N/2-1 won't work for N greater than 8. For N=16, you would never swap locations 11 and 13, for example.

Heater, congratulations on the FFT. Now on to convolution, correlation, and many more things that can be done with the FFT.

Heater. · 2010-12-15 21:22

Ray0665,

Dude... Well done!!! But, ya gotta do the spin version....

Thank you, slave driver:)

Prof_Braino,

But while we have evidence of understanding (heater's code), we don't have the text of the explanation yet.

There in lies the problem. Having the idea in mind is a long way from being able to explain it to anyone who does not. Good teachers are hard to find. Plus as I have said before I can't get to the code from the DFT definition without all the maths gloop and that takes any current explanation from me out of scope for "dummies". Unless we start with "Engineering Maths For Dummies" first. It is actually not such a lot to get through.

Dave,

Heater, congratulations on the FFT. Now on to convolution, correlation, and many more things that can be done with the FFT.

Thank you Dave, and thanks again for the hand holding and tips. That is just what I needed to keep me looking at it until it clicked. As for convolution, correlation etc we will see. I think I'm going to take a rest from anything that feels like thinking until next year:)

Except for that Spin version....

Ray0665 · 2010-12-16 06:57

@Dave Thanks for that catch,
I updated the two uploads and hand checked output with 4,8,16,32,128 and 256 seems good now

Heater. · 2010-12-16 13:51

Attached is my first working effort at converting my home made FFT into Spin.

This takes 1024 samples of a signed 12 bit test signal and produces a nice amplitude spectrum in 512 samples which are printed via FullDuplexSerial. The attached image is a plot of the resulting spectrum given a test signal of two frequencies at maximum amplitude.

Hopefully it is clear how to adapt it to the input from and output to any application.

The complex multiply in the middle of the butterfly is written for clarity rather than speed. The twiddle tables could be reduced in size. Exercises for the reader.

Any questions:)

Fourier for dummies - under construction

Comments