FFT on a Propeller?

Piper984 · 2006-02-24 14:15

Hi gang,

I was wondering, is it possible to perform FFT analysis on the new pchip?· Anyone tried this yet?· Does Spin support arrays?

Gadgetman · 2006-02-24 14:17

FFT?

You mean Fast Fourier Transforms?
(Like they use in the SETI project?)

Doesn't they need floating-point maths?
(No I haven't seen any of that in what has been made available)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Don't visit my new website...

Martin Hebel · 2006-02-24 14:23

Spin does support arrays and will have a floating point library if this helps [noparse]:)[/noparse] With 32 bit wide data, floating point because fairly easy I imagine (relatively speaking).

-Martin

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Martin Hebel

Disclaimer: ANY Propeller statements made by me are subject to my inaccurate understanding of my limited time with it!
Southern Illinois University Carbondale -Electronic Systems Technologies
Personal Links with plenty of BASIC Stamp info
and SelmaWare Solutions - StampPlot - Graphical Data Acquisition and Control

Piper984 · 2006-02-24 14:39

Yes, Fast Fourier Transforms. If spin supports arrays and floating point math, then the issue is probably down to if the data being analyzed can fit into the pchips ram. 32K is the total amount of 'scratch pad' style ram that is avaiable to all the cogs? Or is it just 2K per cog?

Paul Baker · 2006-02-24 15:06

Yes, but its not going to be extremely fast, it wont be comparable to a DSP.

The hub has 32x8k memory accesable by all cogs, each cog has 2x8k memory on itself. Since the memory is organized by longs (1 long=4bytes), thats 8K and 512 longs respectively

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
·1+1=10

Post Edited (Paul Baker) : 2/24/2006 3:10:00 PM GMT

KenLem · 2006-02-24 15:53

There are integer based FFT routines. I know people do it on AVR's so it should be possible and faster on an Propeller.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.speechchips.com

Speech & Video IC's for BasicStamps

inaki · 2006-02-24 16:01

By the way, does Propeller support the instruction 'multiply and add' ?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

KenLem · 2006-02-24 16:05

I just looked at the PDF with ASM opcodes and it doesn't. They do provide mult and div routines but those instructions are not native.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.speechchips.com

Speech & Video IC's for BasicStamps

Paul Baker · 2006-02-24 16:37

No, MAC operations are not supported, multiply is only defined in Spin (no hardware multiply).

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
·1+1=10

pjv · 2006-02-24 16:53

Hi All;

Regarding FFT's.....since and SX already can do these at audio frequencies, I expect we can do much better with a Propeller in assembler.

Cheers

Peter (pjv)

Paul Baker · 2006-02-24 17:04

Hi Peter,
At first I was going to dispute based on the 50MIPS for SX vs. 20MIPS per cog on the Propeller, but I realized the Propeller having longs makes it roughly equivalent to 80MIPS per cog when compared to the SX's byte size. Also the Propeller's assembly code is much more efficient than the SX's resulting in fewer instructions/task.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
·1+1=10

Piper984 · 2006-02-24 17:37

Hi PJV, can you point me towards some SX reference material regarding audio FFT?? I would really be into learning about that. Thanks!!

Tracy Allen · 2006-02-24 19:32

Propeller has remarkable asm instructions that reflect Chip's depth of understanding of algorithms and signals. For example, the SUMC command adds or subtracts the source from the destination, depending on the carry bit. There is a family of asm instructions that condition the sign of the operation on the carry or zero bits. If you have ever done CORDIC algorithms or DSP in general, your eyes will light up at that, the efficiency that allows. And 32 bits per gulp.

There is an REV instruction that reverses the order of a selected number of bits in the source, and bit reversal is used in DSP algorithms and convolutions, and it is otherwise quite timeconsuming to microcode it. That instruction could also make MSBFIRST/LSBFIRST algorthms easier to code.

There are 9 instructions where the carry out is the parity of the result. Chip showed us a nice linear shift register random bit generator. The TEST instruction ANDs S with D, and the carry equals parity of the result. Those random number generators of course take that parity output of a certain set of taps on a shift register, and shift it in on the next iteration. This kind of instruction is great for encryption and checksum algorithms.

I'm still trying to get my head around the MUXx instructions. (among others) MUX. copies the carry or zero bit, or its complement, into the destination bits selected by using the source as a mask, and the resulting carry equals parity of the result. Like all the instructions it has options of direct and indirect addressing modes and optionally affecting the status flags. Huh?!

As to FFTs, integer algorthms are bound to be faster than anything that requires floating point. The most efficient integer algorithms require that the number of samples be restricted in some way, such as a power of two or a product of a certain set of prime numbers (based on the Chinese remainder theorem). The math can avoid much multiplcation and rely more on shift and add. FFT algorithms tend to be memory intensive. That would be a limitation, although an external RAM could be an option.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

pjv · 2006-02-24 21:29

Hi Piper;

Sure, its on the Parallax site in the Application Note section.........there's TONS of good stuff in "them thar App Notes" !!!!!

Sorry, I dont know how to make the link for you, but go to:

Parallax Home Page...... Downloads........ SX-Key and SX Tools....... Application Notes....... and it's near the bottom, labelled "Demo of FFT Implementation"

Cheers,

Peter (pjv)

Electronegativity · 2006-02-25 02:13

The first thing I thought of when I learned about the Propeller chip was array processing.

I'm still brainstorming it, but basic math seems to show that the bottleneck is in data transfer.

with an 8 bit parallel interface and 100 ns memory it would take about 1.6 milliseconds to get 16kb in, and another 1.6 ms to get it out.

Acheiving even this rate of data transfer is difficult because of the number of pins required to communicate with the memory.

A one megabyte RAM takes 20 pins to specify an address, plus 8 more for data transfer, and a couple more to specify read or write and enable the chip.

That leaves maybe 2 configurable I/O pins on the propeller.

I have been thinking about using an SX48 or 52 to specify the addresses sequentially while the propeller uses only 8 of its pins to read or write to memory but synchonization is a real issue.

Apologies for posting the same thing twice, but it seemed relevant to this discussion.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I wonder if this wire is hot...

Gadgetman · 2006-02-25 08:58

Why not use a couple of counters?

Hook the reset and inputs of the counters to output pins on the Propeller, then first Reset them, then pulse the pins to get to the correct address.
It may be a bit slow to find the first addresss you want to read from, but the next address is just one pulse away...

You can see one Real-world example of how it could be done Here
It's how the Psion Organiser One(1984) stored user-data in 8KB EPROMs...
(Similar techniques were used for modules of up to 128KB, but then with a few more counters that could be pulsed individually. Incidentally, you could hear when it wrote to EPROM as the charge-pump to 21V makes a ticking sound...)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Don't visit my new website...

Dan Roganti · 2006-02-25 14:17

Adding large memory this would be a good HW appl to create for this processor. Especially with FFT designs, you might want to consider using a SDRAM, requires fewer signals(plus you get much memory). We use to design with 8bit Sdram devices for situations like this(one chip was enough most of the time). Usually, we would have a Sdram controller built in a cpld which provided all the memory access timing for a MCU. But for a quick and dirty application,you could conceivably use one of the COGs to execute the sdram access timing. The other COGs which calculate the FFT could pass the address and data to the Sdram COG. As it was mentioned earlier, you may still face a bottleneck with the 8bit data path, (something you have to live with)but you gain more memory.

=Dan

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.
[noparse][[/noparse]My Corner of Cyberspace http://ragooman.home.comcast.net/ ]
[noparse][[/noparse]Pittsburgh Robotics Society Got Robot? http://www.pghrobotics.org/ ]
[noparse][[/noparse]Pittsburgh Vintage Comp.Society http://groups.yahoo.com/group/pghvintagecomp/ ]
.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.

Electronegativity · 2006-02-25 15:18

Hi Dan you referred to CPLDs:

Somebody said...
Usually, we would have a Sdram controller built in a cpld which provided all the memory access timing for a MCU.

This is a possibility that occurred to me as well, but I don't know any thing about CPLD's

I read Propeller Doc version 0.1 last night and it seems to say you can clock the propeller off an external source without using the internal PLL multiplier.

If this is true then I will probably try to clock a propeller and an SX52 off the same resonator and use the SX52 as the SDRAM controller.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I wonder if this wire is hot...

Electronegativity · 2006-02-25 15:20

Hi Dan you referred to CPLDs:

Somebody said...
Usually, we would have a Sdram controller built in a cpld which provided all the memory access timing for a MCU.

This is a possibility that occurred to me as well, but I don't know any thing about CPLD's

I read Propeller Doc version 0.1 last night and it seems to say you can clock the propeller off an external source without using the internal PLL multiplier.

If this is true then I will probably try to clock a propeller and an SX52 off the same resonator and use the SX52 as the SDRAM controller.

I would still like to know more about CPLDs though; could you give me a basic idea of how they work?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I wonder if this wire is hot...

Dan Roganti · 2006-02-26 04:27

hi,

You could think of a cpld as one big array of logic gates and FF's which can be interconnected thru a fuse matrix(in a limited fashion). You can replace a dozen or more logic chips with one cpld. Some of the projects at an old job was to revamp the existing telecom designs from the 80's containing several dozen chips and replace it with 1 cpld. You can also create several levels(stages) within the cpld depending on the density of the device(and the width of your design). Stages are basically how many ff's(or latchs) you have in a combination within your design. The width of your design describes how many inputs you want to include(and consumes more gates as a result).

One of the early languages for designing with cpld's is ABLE. Which used boolean expressions to define the combinatorial logic and included some high level programming syntax. In the late 70's/early 80's when these first arrived(when they were still PAL's) we had to bit bang the individual fuses in the matrix to program the devices, there was no progamming language to work with.

This is by no means comparable to a fpga but these are used quite extensively because they give your design alot of bang/buck. Even though the cpld may be highly integrated, there is still some measure of timing issues you have to keep in mind as with fpga's, setup time conflicts(metastability) never disappear, nothing's perfect.

The cpld are typically used for control logic functions to glue various hardware devices together. Nowadays the line between the high-end cpld's and the low-end fpga's has blurred to where the gate count on the newest cpld is getting higher. You can actually fit several diffferent types of interfaces in one device. Although the gate count is higher, the fpga is still more sophisticated and equipped for intricate designs(such as CPUs, DSPs, etc) One of the coolest hobbiest resources for cpld design is the Xilinx tool suite that is free and you can design with most of their devices including some of their fpga's.

hope that helps,
=Dan

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.
[noparse][[/noparse]My Corner of Cyberspace http://ragooman.home.comcast.net/ ]
[noparse][[/noparse]Pittsburgh Robotics Society Got Robot? http://www.pghrobotics.org/ ]
[noparse][[/noparse]Pittsburgh Vintage Comp.Society http://groups.yahoo.com/group/pghvintagecomp/ ]
.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.

Electronegativity · 2006-02-27 16:50

Thanks Dan, but I'm still a little bogged down in the jargon.

What does FF stand for?

What's an FPGA?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I wonder if this wire is hot...

Paul Baker · 2006-02-27 17:00

FF = flip-flop (a 1 bit register)
FPGA = Field Programmable Gate Array, a little more complex to program than a CPLD, but more verastile in its possible configurations.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
·1+1=10

Mike Young · 2007-07-24 04:59

I'm not sure what spectral analysis task the original poster wanted to tackle, but the following should be considered:

For specific applications the FFT may not be the most efficient means of calculating the answer.

Consider: if you wish to determine whether a specific frequency is present in a waveform (perhaps you want to decode DTMF tones), the Goertzel algorithm is more efficient. Also, if the spectral region you are interested in is constrained than chip-z may be better than FFT. If FFT is most appropriate, bear in mind that -- depending on the length of your data vector and padding -- higher radix transforms can save computations.

That said, there is no good substitute for a decent DSP chip -- though the Propellor would make a darn good supervisory controller wrapped around TI DSP or whatever your favorite toy happens to be. Note that some of the newer chips like TMS320C642x include DDR 333 DRAM interfaces on-board.

By the way, if you have a pretty hefty budget you can brute force this in the analog domain. Consider the following: an Analog Devices direct digital synthesizer block (AD9957) is more than capable of serving as the sweep generator for a crude spectrum analyzer. It will produce reasonably clean tones through 200MHz, and contains on-chip mixers for frequency translation loops. It has a simple serial interface and generating I- and Q- waveforms from multiple chips is of course trivial with DDS. Generating chirps with the AD9957 is trivial. Combine this with a nice image reject mixer (mixing your signal against the reference) a nice, tight bandpass filter (Sawtek surface acoustic wave filters are pretty phenomenal), a wide dynamic-range log amp (AD8313 or equivalent), and a 16- or higher depth DAC, you have the start of a nice low frequency spectrum analyzer. Your propellor can almost drive this bus in its sleep. Bear in mind that if you're going above a few tens of kilohertz you need to get your board layout act together, though.

V/R
Mike

FFT on a Propeller?

Comments