Recursive function calls in PASM?

Heater. · 2010-12-10 01:24

Using PASMs CALL and RET we can easily write functions that call functions that call functions... What we can't do is have a function that calls itself, directly or indirectly. That's because CALL only save one return address in a fixed location, a recursive call would over write the first return address. Recursive calls require a stack of return addresses.

Now, over on the Fourier thread I have managed to gain enough understanding of the Fast Fourier Transform to be able to write, for the first time, an FFT of my own. This implementation requires recursive calls. For now, and the foreseeable future, I don't see myself understanding the more normal non-recursive, in-place, implementations.

So of course I'd like to cast my FFT in PASM. This will require recursive function calls and a stack.

I have some ideas has to how to go about this but have to ask has anyone tackled this issue before?

Initially I'm thinking of keeping the stack in HUB, this would make the pointer handling easier and remove any COG space issues.

Looks like I need:
1) JMPRET to make the calls, placing the return address in some LONG somewhere, retAddr.
3) On entry to the routine save that retAddr to the stack in HUB and adjust the stack pointer. A push.
4) To return, fetch the return address into to retAddr, adjust the SP and JMP indirectly through the retAddr. A pop.

Then of course I have some parameters to stack up as well....

Anyone got ideas about this?

My recursive FFT may be a bit slower than a pukka in-place FFT but it might also be a lot smaller code wise. Assuming the stack ops don't eat the COG space.

Ale · 2010-12-10 05:11

I did not use any recursive call... nor did I see that they were necessary... did I miss something ?

Heater. · 2010-12-10 06:19

Ale,

I did not use any recursive call... nor did I see that they were necessary... did I miss something ?

Brilliant, I seem to have missed your fftdemo along the way.

No, you did not miss anything. I guess you have used a traditional in-place (output goes where input was) FFT with a bit reversal stage at the beginning. And so you have no need for recursive calls. This is normal.

I, on the other hand, am not normal:) My problem is that I have no idea how or why an in-place FFT works yet.

What I am able to understand is:
1) Take the definition of a DFT. As a sum of complex exponentials multiplied by input samples.
2) Split that sum into two sums.
3) Factor out the so called "twiddle" factors from the second sum.
4) Realize that the two sums we now have are the same.
5) Therefore see that the DFT can be written as a recursive algorithm.
6) Create an implementation of that in C++.

This approach is outlined on wikipedia: http://en.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_algorithm

My C++ effort works well, it is about half the speed of a common in-place FFT in C but that is still hundreds of times faster than a naive DFT implementation. It may reach parity when I redo it in C.

It looks like this:

void ditfft2 (complex<double> *Y, complex<double> *X, int N, int s)
{
    if (N == 1)
    {
        *Y = *X;
    }
    else
    {
        ditfft2 (Y,       X,     N/2, 2*s);
        ditfft2 (Y + N/2, X + s, N/2, 2*s);
        for (int k = 0; k < N/2; k++)
        {
            complex<double>t =  Y[k];
            double angle = 2.0 * M_PI * k/N;
            double testCos = cos(angle);
            double testSin = -sin(angle);
            complex<double> twiddle(testCos, testSin);
            Y[k]       = t + twiddle * Y[k + N/2];
            Y[k + N/2] = t - twiddle * Y[k + N/2];
        }
    }
}

Given that this is what I understand, this is what I want to do in PASM.

Anyway the question about recursive calls in PASM may have other uses.

Ale · 2010-12-10 06:27

heater, you know you can "cheat" all you want... for a defined N you know how many calls are done all the way, thus duplicating some code it may be possible to "avoid" the recursivity a bit, do you follow ? (but I do not know if it will be faster than in-place).

Dave Hein · 2010-12-10 07:30

I don't program in PASM very often, but it seems like a recursive PASM call would look something like this:

RecursiveSub            ...
                        wrlong  RecursiveSub_ret, stackptr
                        add     stackptr, #4
                        call    #RecursiveSub
                        sub     stackptr, #4
                        rdlong  RecursiveSub_ret, stackptr
                        ...
RecursiveSub_ret        ret
                        ...
stackptr                res     1

The value of stackptr would need to be initialized to point to a block of hub RAM that would serve as the stack. Also, any variables that are used recursively would need to be stored on the stack as well.

Dave

Heater. · 2010-12-10 07:34

I don't mind cheating but for this exercise I want to understand the "cheats" rather than just borrow them blindly. Your idea of avoiding recusivity (word?) a bit by knowing N did occur to me but I can't see how to work it yet.

Just now my recursive FFT is taking about 50% longer for a 16K sample set than a "normal" FFT in C I found on the net somewhere. Not bad considering the code is a lot easier to understand (for me anyway). I moved a few things around and removed the N and s parameters like below. Probably I can remove the X and Y parameters as well.

int N;  // Sample count
int s;  // Span

void ditfft2 (complex<double> *Y, complex<double> *X)
{
    N >>= 1;
    s <<= 1;
    int n_by_2 = N >> 1;
    if (N == 1)
    {
        *Y = *X;
    }
    else
    {
        ditfft2(Y,       X);
        ditfft2(Y + n_by_2, X + s);
        for (int k = 0; k < n_by_2; k++)
        {
            complex<double>t =  Y[k];
            double angle = 2.0 * M_PI * k/N;
            double testCos = cos(angle);
            double testSin = -sin(angle);
            complex<double> twiddle(testCos, testSin);
            twiddle *= Y[k + n_by_2];
            Y[k]                 = t + twiddle;
            Y[k + n_by_2] = t - twiddle;
        }
    }
    N <<= 1;
    s >>= 1;
}

Dave Hein · 2010-12-10 08:15

OK well here is how your orignal C code would look in PASM. However, instead of saving X, Y, s and N on the stack you could just undo the adjustment to these values after returning from ditfft2. You could also eliminate saving the return address on the stack by using a mode flag to control where you jump. The routine then becomes a loop instead of a recursive call.

                ' Set up the initial values of Y, X, N and s
                ...
                call #ditfft2
                ...
ditfft2         ...
                ' ditfft2 (Y,       X,     N/2, 2*s);
                wrlong  ditfft2_ret, stackptr
                add     stackptr, #4
                wrlong  N, stackptr
                add     stackptr, #4
                wrlong  s, stackptr
                add     stackptr, #4
                shr     N, #1
                shl     s, #1
                call    #ditfft2
                sub     stackptr, #4
                rdlong  s, stackptr
                sub     stackptr, #4
                rdlong  N, stackptr
                sub     stackptr, #4
                rdlong  ditfft2_ret, stackptr

                ' ditfft2 (Y + N/2, X + s, N/2, 2*s);
                wrlong  ditfft2_ret, stackptr
                add     stackptr, #4
                wrlong  N, stackptr
                add     stackptr, #4
                wrlong  s, stackptr
                add     stackptr, #4
                 wrlong X, stackptr
                add     stackptr, #4
                wrlong  Y, stackptr
                add     stackptr, #4
                shr     N, #1
                shl     s, #1
                add     Y, N
                add     X, s
                call    #ditfft2
                sub     stackptr, #4
                rdlong  Y, stackptr
                sub     stackptr, #4
                rdlong  X, stackptr
                sub     stackptr, #4
                rdlong  s, stackptr
                sub     stackptr, #4
                rdlong  N, stackptr
                sub     stackptr, #4
                rdlong  ditfft2_ret, stackptr

                ...
ditfft2_ret     ret
                ...
Y               res        1
X               res        1
N               res        1
s               res        1
stackptr        res        1

Cluso99 · 2010-12-10 14:18

Dave: FWIW it is only necessary to save a word (lower 9 bits required) and of course add/sub #2.

Heater. · 2010-12-11 04:05

Dave, your recursive call solution looks quite fine.

I now have a version of my C++ FFT with no parameters, oops it's not on this computer so I can't post it. Removing those does not speed things up noticeable for my C++ running on a PC but it would simplify any PASM implementation a lot and probably gain some speed as it would not be required to PUSH and POP them from a stack in HUB.

I'll have a look at this mode flag idea... First I need to get rid of the complex class.

Recursive function calls in PASM?

Comments