Shop OBEX P1 Docs P2 Docs Learn Events
Fast Hartley Transform — Parallax Forums

Fast Hartley Transform

nglngl Posts: 24
edited 2010-01-09 00:52 in Propeller 1
·· There has been some discussion of the Fast Fourier Transform implementation on the propeller in another thread on this forum.· One approach was detailed in the Propeller Wiki. I would suggest that a more practical approach to fft implementation is the Fast Hartley Transform (FHT), which works with real numbers,·therefore·running faster than the FFT. Furthermore, the FHT is its own inverse function and the FFT can be calculated from the FHT. Ullmann (sepwww.stanford.edu/theses/sep38/38_29_abs ) has described a FHT algorithm (expressed in Ratfor) which is easily translated·into Spin.

·· The attached spin file is a first implementation of·the 1-dimensional HFT, which also includes methods for calculating the Power Spectrum and the FFT. ViewPort was used to debug the program and view the results.· Things to do include providing a graphical data display, converting the subroutines (eg, butterfly) to pasm, timing studies with larger data sets,·and implementing·the 2-dimensional FHT.

Comments and suggestions are welcome.

Nick


Corrected spin file and url.

New attachments include updated FHT object, Functions.spin, Utilities.spin, Demo1 & Demo2.spin.

Post Edited (ngl) : 12/6/2009 2:11:07 AM GMT

Comments

  • KyeKye Posts: 2,200
    edited 2009-10-30 04:05
    Thank you for contributing.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Nyamekye,
  • DroneDrone Posts: 433
    edited 2009-10-30 12:04
    Thanks for the work Nick. The link in the Ullman reference in your post and in the source spawns a 404 Not Found. How useful did you find the Bracewell book on FHT? I've seen this book for sale on the Web but wondered if it was too dated. I think Bracewell developed the DHT, but I'm wondering if the book is too old to cover FHT variants.

    Regards, David

    Post Edited (Drone) : 10/30/2009 12:13:52 PM GMT
  • JamesxJamesx Posts: 132
    edited 2009-10-30 13:17
    Thanks, Nick.

    I am very interested in this work. My problem is that I don't know much about FFT or FHT, but having this code and references gives me somewhere to start. I am looking forward to your successes at getting it into ASM, as well as tests of timing.

    Jim C.
  • HannoHanno Posts: 1,130
    edited 2009-10-30 19:41
    Great job Nick!
    (And nice use of ViewPort!) Got any screenshots? Any chance you could apply this to my "Speech Recognition" challenge?
    Hanno

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Co-author of the official Propeller Guide- available at Amazon
    Developer of ViewPort, the premier visual debugger for the Propeller (read the review here, thread here),
    12Blocks, the block-based programming environment (thread here)
    and PropScope, the multi-function USB oscilloscope/function generator/logic analyzer
  • nglngl Posts: 24
    edited 2009-10-31 04:43
    David: I find Bracewell very useful - I am still using it.

    Hanno: ViewPort allowed me to quickly find programming errors. A great product. I will have to look at the speech recognition challange.
    I would like to use the xy widget in ViewPort to display the FHT results. Any hints on how to do this?

    There is a sign error in the FFT method. The last statement should be Im[noparse][[/noparse]k] := (-fx[noparse][[/noparse]k] + fx[noparse][[/noparse]len-k+2])/2

    I have converted the program to a true object which also includes convolution & correlation methods. I will post as soon as I have a working demo.
  • HannoHanno Posts: 1,130
    edited 2009-10-31 07:08
    Hi Nick,
    Thanks! The xy widget takes the values of two traces at the current timescale and plots them in xy fashion. Not sure what you want to do, but typically you would share at least 2 variables, select the proper timescale, and plot 2 channels.
    Hanno

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Co-author of the official Propeller Guide- available at Amazon
    Developer of ViewPort, the premier visual debugger for the Propeller (read the review here, thread here),
    12Blocks, the block-based programming environment (thread here)
    and PropScope, the multi-function USB oscilloscope/function generator/logic analyzer
  • BRBR Posts: 92
    edited 2009-10-31 16:12
    Nick,

    This is very cool. I don't have a copy of viewport (yet), so I made a version that uses the Parallax Serial Terminal. I set up the output so it can be directly imported into PLX_DAQ for easy plot/visualization. I also added sin, cos, step, and impulse functions to the examples you had so I could see the output with various waveform inputs.

    I included a copy of my PLX_DAQ excel file in the archive...for anybody who'd rather get a fresh copy of PLX_DAQ, it is available here:
    http://www.parallax.com/Portals/0/Downloads/sw/plx_daq_install.zip

    The output from the various input waveforms isn't exactly what I was expecting...but that may be as much due to my ignorance of the subject area as anything else. Between you, Jay, and lonesock, and others, it's been a veritable frequency transform cornucopia here in the forums lately.

    Looking forward to your next update.

    BR

    EDIT: updated archive attachment:
    -Added a random (white) noise function, pulse function
    -Added a simple measurement of #clock cycles required to complete H-xform
    -Added magnitude and phase plots in excel PLX_DAQ worksheet
    -Misc. tweaks

    Post Edited (BR) : 11/2/2009 3:19:54 AM GMT
  • nglngl Posts: 24
    edited 2009-11-03 01:22
    Attached find a 2nd version of the FHT project reformatted as a basic FHT object and a demo program to exercise the FHT object.

    BR:· Thanks for your interest and additions to FHT. I have incorporated them, with modifications to accomodate the new FHT object, in the attached demo program.· That is, your earlier additions.· I look forward to making further additions to the demo program.

    I have replaced attached programs with new versions containg BR's latest additions.

    Nick

    Post Edited (ngl) : 11/3/2009 2:07:22 AM GMT
  • ericballericball Posts: 774
    edited 2009-11-03 02:36
    I'm reading through US Patent 4,646,256 for the (Fast) Discrete Bracewell (Hartley) Transform and I'm not sure the RATFOR code is correct, particularly the initial bit reversal step. It might be a good idea to test it out by calculating the DHT (i.e. use the full O(N^2) calculation.), just watch out for scaling differences.

    None the less, it looks like an interesting alternative to the typical FFT.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Composite NTSC sprite driver: Forum
    NTSC & PAL driver templates: ObEx Forum
    OnePinTVText driver: ObEx Forum
  • nglngl Posts: 24
    edited 2009-11-03 05:38
    ericball: I did make one change to the bit reversal subroutine - butdis was initialized to 1 rather than 0 in Ullmann's document.
    The code works in that I could reproduce all Bracewell's examples that I tried.

    Nick

    I also cross-checked the FFT methgod using the Descrete Fourier Transform in Mathematica.

    Post Edited (ngl) : 11/3/2009 6:22:52 AM GMT
  • ericballericball Posts: 774
    edited 2009-11-04 14:13
    I've gone through Bracewell's patent and I think Ullman's algorithm must be a second generation algorithm. The Bracewell algorithm uses a 2 element butterfly, i.e. a1 = a0 + b0 * cas b1 = a0 - b0 * cas, rather than the 3 (4) element butterfly used by Ullman. I'm contemplating implementing the Bracewell algorithm in SPIN, but I'd like to know how many samples & sample width I should be targetting.

    I'm hoping I can squeeze the algorithm into 240 instructions, leaving 256 registers for the element array, which would speed things up slightly (and make the bit reversal permutation trivial as it would happen on the load from HUB to COG). Moving the data out to HUB RAM means the array could be any size, but the elements would have to be LONGs. (Probably 16+16 signed fixed point.)

    And what sample data do people use? 8 bit signed or unsigned, 16 bit signed, or some fixed point? The output is always signed.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Composite NTSC sprite driver: Forum
    NTSC & PAL driver templates: ObEx Forum
    OnePinTVText driver: ObEx Forum
  • BRBR Posts: 92
    edited 2009-11-05 04:27
    @Nick: thanks for posting your updated object/demo. I've been trying it out, seems to work fine. I took the liberty of making another tweak...I replaced all the "/2" and "/scale" operations with shift arithmetic right operations, also changed the scale to 1024 to facilitate this. Was worth about 8% improvement in execution speed, at the cost of code readability. Maybe worth the trouble, if for no other reason than for making the spin code match what would likely be the preferred implementation if you decide to covert it to ASM. It would be interesting to see how an ASM implementation of the FHT stacks up against Ale's FFT on the wiki page. The wiki claims 1024 sample FFTs @ 16 fps with a single cog.

    @ericball: as to sample data, 16 bit signed seems to be a good fit for the prop. cessnapilot has been proponing a "Qs15_16" fixed point format (http://forums.parallax.com/showthread.php?p=829958). I've been pondering what it would take to implement this for a digital filter project I've been experimenting with. Might be worth considering for your Bracewell algorithm, assuming you decide to put the data in the hub ram.
  • ericballericball Posts: 774
    edited 2009-11-05 14:07
    Yeah, I've come to the conclusion that storing the table in HUB RAM is the way to go, and not just so the table can be larger. I was thinking that storing the table in COG RAM would allow the HUB RAM table to be words or even bytes, but then there is a significant loss of precision when writing the output values back to HUB RAM; not good if you want to reverse the transform. So it's best to use the same format (probably Qs15_16) for both intermediate and final values.

    Using HUB RAM also opens the possibility of having multiple cogs process the data simultaneously. There would need to be some mutual synchronization to ensure each stage is completed before the cogs start the next stage. But other than that, the algorithm parallelizes quite nicely.

    Note: the scaling should be 1/sqrt(N) so the transform is it's own inverse. In my algorithm I'm going to shift the result as a part of every other stage, which should help with overflow.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Composite NTSC sprite driver: Forum
    NTSC & PAL driver templates: ObEx Forum
    OnePinTVText driver: ObEx Forum
  • ericballericball Posts: 774
    edited 2009-11-07 01:43
    BR said...
    It would be interesting to see how an ASM implementation of the FHT stacks up against Ale's FFT on the wiki page. The wiki claims 1024 sample FFTs @ 16 fps with a single cog.
    I've done a quick cycle count of my untested PASM FHT - 266 cycles * N * log2N , or about 30 1024 sample FHTs per second at 80 MHz using a single cog. Now, the output is a Hartley Transform cas frequency domain, so if you want the Fourier ei frequency domain you'll have to convert. Or you can just do all of the work in the cas domain directly. All the usual stuff should be possible, assuming you're dealing with real signals in the time domain.

    edit 2009-11-12
    It turns out the FHT starts doing the 4 element butterfly after two rounds of 2 element add/sub transforms. I didn't realize it until I started executing the third round by hand and discovering it didn't match the expected results. Then I went back through the patent and paid more attention to the initial theory section. Unfortunately, this makes the FHT more complex, the inner loop is now 364 cycles per sample, so around 21.5 1024 sample FHTs per second. (That's a minimum rate, actual may be slightly faster.)

    I also discovered a bunch of problems with my math routines - signed values are a little trickier to handle than unsigned values.

    I'm editing this to note my errors & progress (hopefully) without bumping the topic to the top.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Composite NTSC sprite driver: Forum
    NTSC & PAL driver templates: ObEx Forum
    OnePinTVText driver: ObEx Forum


    Post Edited (ericball) : 11/13/2009 1:10:34 AM GMT
  • nglngl Posts: 24
    edited 2009-11-30 23:07
    Attached find updated versions of the spin version of the Fast Hartley Transform described in the first post of this thread.· FHT_V2 now includes methods for calculating the FFT, power sprectrum, magnitude and the tangent (phase angle) of data sequences.· FHT_Utilities includes methods for manipulating data sequences, e.g., reversing, rotation, shifting, scaling, and centering.· FHT_Functions includes convolution and correlation methods based on the FHT.· The two demos exercise the FHT·(demo1) and convolution/correlation (demo2) methods.· Both demos include BR's tabulated output method based on the Parallax Serial Terminal.· I have added a·simple vga plotting method using Kwabena Agyeman's BMP2Engine.·
    The demos are structured to allow the user to select specific examples as well as· which datasets to plot.

    The following approximate times in milliseconds were recorded for the FHT as a function of data sequence lengths: 8 -> 3, 16 -> 6, 32 -> 15,
    64 -> 34, and 128 -> 77.·· Convolution of two datasets measured 24 msec for 16 and 257 msec for 128 member arrays.

    Nick
  • BRBR Posts: 92
    edited 2009-12-01 04:03
    @Nick: neat stuff. I'm going to have to play around with this a bit. I'd definitely like to try the correlation functionality out. I have a spin function that implements correlation/convolution via the input side algorithm, will be interesting to compare results versus what comes out of the FHT. They should be the same...

    I guess you've got a couple monitors available to you in your setup (one for viewing pst output and one for the VGA plot output from the demo board). I found Kye's driver on obex and am able to see your plot output. As an FYI, if you select "File/Archive/Project..." in the propeller tool, it will automatically kick out a nice zip archive containing everything needed for someone else to run the demo. Archives are easier to attach to forum posts, too.
  • nglngl Posts: 24
    edited 2009-12-06 02:12
    @BR: I have archived my files as you suggested, but the attachment manager would not accept my zipped files!

    I have edited the first post to include the current versions of all FHT files.
  • nglngl Posts: 24
    edited 2010-01-08 23:00
    ·Attached to this post is FHT_PASM_demo in which a pasm encoded Fast Hartley Transform (see 1st post in this thread for more details) is embedded.
    This FHT version runs about 20x faster ( 16 member sequence - 0.27 msec compared to 6 msec) than the spin version. Since this is
    my first attempt at serious pasm programming, I would appreciate any comments regarding possible improvements.

    Nick
  • stevenmess2004stevenmess2004 Posts: 1,102
    edited 2010-01-09 00:52
    I made a couple of changes and some comments in the attached file. There are probably other places you can do similar things as well. Other people will probably pick up more stuff than I did.
Sign In or Register to comment.