[HX512] hub DMA, The wrath of Phil ...

... will be upon me for using 4 (four) coginits. For that I apologize :)
With all the different memory solutions flying around, I thought I'd contribute my own, using the hx512 SRAM card with an emphasis on block transfers. Also, I was wondering: what is the fastest way to get a long into a cog (excluding the obvious mov dst, ina, as it's highly unlikely that all I/O pins are dedicated data lines)? Well, the first three candidates are (IMO):
- rdlong with perfect or slightly off hub window (8..20)
- 2x16bit assembly from within the cog (5 instructions)
- rdlong missing the hub window a bit more (21..23)
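To see why the candidates rank in this order, here is a small Python sketch of the cycle counts. It assumes the usual Propeller 1 timing (4 clocks per ordinary cog instruction, a hub window every 16 clocks, rdlong costing 8 clocks plus however long it waits for the window). The function names are made up for illustration and are not part of the driver.

```python
INSTR_CYCLES = 4   # assumption: clocks per ordinary cog instruction
HUB_PERIOD = 16    # assumption: each cog gets a hub window every 16 clocks
RDLONG_MIN = 8     # rdlong cost when it hits the hub window exactly


def rdlong_cycles(wait):
    """Cost of a rdlong that has to wait `wait` clocks for the hub window."""
    if not 0 <= wait < HUB_PERIOD:
        raise ValueError("wait must be in 0..15")
    return RDLONG_MIN + wait


def cog_assembly_cycles(instructions=5):
    """Cost of assembling a long from 2x16 bit inside the cog (5 instructions)."""
    return instructions * INSTR_CYCLES


if __name__ == "__main__":
    limit = cog_assembly_cycles()
    for wait in range(HUB_PERIOD):
        cost = rdlong_cycles(wait)
        verdict = "rdlong wins or ties" if cost <= limit else "in-cog assembly wins"
        print(f"wait {wait:2d}: rdlong {cost:2d} cycles vs {limit} -> {verdict}")
```

Under these assumptions the in-cog assembly (20 cycles) only beats rdlong when the hub window is missed badly enough to cost 21..23 cycles.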
Some transfer numbers (32 longs @80MHz):
- raw: 512+12 cycles, 19.5MB/s (12 cycles are cog interleave overhead)
- reading contiguous blocks from incrementing addresses: 18.6MB/s (added driver command handling)
- reading blocks from specific addresses: 17.1MB/s (added address setup of CPLD counter)
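The numbers above can be cross-checked with a few lines of Python. This is just arithmetic on the figures quoted in the list, assuming MB means 10^6 bytes (which is what makes the raw figure come out at 19.5). The cycle counts derived for the two driver cases are back-calculated from the rounded rates, so treat them as approximate.

```python
CLOCK_HZ = 80_000_000   # 80MHz system clock
BLOCK_BYTES = 32 * 4    # 32 longs per transfer


def throughput_mb_s(cycles, clock_hz=CLOCK_HZ, nbytes=BLOCK_BYTES):
    """Transfer rate in MB/s (1 MB = 10**6 bytes, an assumption) for one block."""
    return nbytes * clock_hz / cycles / 1e6


def cycles_for(mb_s, clock_hz=CLOCK_HZ, nbytes=BLOCK_BYTES):
    """Approximate clocks per block implied by a quoted MB/s figure."""
    return nbytes * clock_hz / (mb_s * 1e6)


if __name__ == "__main__":
    print(f"raw (512+12 cycles): {throughput_mb_s(512 + 12):.1f} MB/s")
    for label, rate in (("contiguous blocks", 18.6), ("specific addresses", 17.1)):
        print(f"{label}: {rate} MB/s is about {cycles_for(rate):.0f} cycles per block")
```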
ATM the driver is tailored for the Hydra. Because the data bus occupies P[23..16], I use counters for address handling, which means an extra I/O bit is consumed for the counter link. If the data bus is located at P[7..0] this is not required (address advance can be done in the driver threads). Also, the CPLD needs a more useful f/w. The original f/w can be used, but this limits the selectable address range (i.e. 1st 64K only). I'm running a modified f/w (hx512_cpld.1) which gives me a 3-cycle address setup (single command).
I don't know how many different revisions of the hx512 are out there (mine is using 10ns SRAM). Anything slower may not work at full speed as there are only 3 cycles between CS being issued to the card and ina being sampled. The CPLD has about 10ns propagation delay for CS, add the SRAM access time, outa to pin delay and pin sample setup time ... you get the picture (2 clock cycles is not enough, and I can't run with 4).
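A rough timing budget, as a Python sketch. It only uses the figures mentioned above (80MHz clock, about 10ns CPLD propagation delay, 10ns SRAM). The outa-to-pin delay and pin sample setup time are not quantified here, so the function merely reports how much time is left over for them.

```python
def margin_ns(cycles, clock_hz=80_000_000, cpld_ns=10.0, sram_ns=10.0):
    """Time left for outa-to-pin delay and pin sample setup, in ns.

    cpld_ns and sram_ns default to the approximate values quoted in the post.
    """
    period_ns = 1e9 / clock_hz          # 12.5ns at 80MHz
    return cycles * period_ns - cpld_ns - sram_ns


if __name__ == "__main__":
    for cycles in (2, 3):
        print(f"{cycles} cycles: {margin_ns(cycles):.1f}ns left for pin delays")
```

With 2 cycles only 5ns remain for everything on the Propeller side, which fits the observation that 2 cycles is not enough. A slower SRAM (pass a larger sram_ns) eats directly into the 3-cycle margin.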
Update: Released driver code. Most of the SPIN code is just there to make things easier to follow (I hope). The main object duplicates the hub ROM across the external SRAM, then reads it back from a location 4n away and compares it against its source. Basically a memory test with overkill access functionality (given that the SPIN main loop eats most of the runtime).
For people who still use the original f/w, there is a constant in the controller file (muxc.drv.sram.ctrl.spin) named hydra_sram_controller_03. This should be set to TRUE. The effect is that the card is set to post-increment mode during control.init, and the address setup function is redirected to a 2-cycle version (LO/HI) as supported by the original f/w (hx512_addr16).
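To illustrate what the LO/HI setup implies, here is a minimal Python sketch that splits an address into the two bytes. The function name is hypothetical, the LO-then-HI order is only my reading of "(LO/HI)", and the actual command encoding expected by the CPLD is not shown. It also makes the 1st-64K limit of the original f/w explicit.

```python
def split_addr16(addr):
    """Split a 16-bit SRAM address into (lo, hi) bytes.

    Hypothetical helper: addresses beyond the first 64K cannot be
    expressed with two bytes and are rejected.
    """
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("original f/w can only address the first 64K")
    return addr & 0xFF, (addr >> 8) & 0xFF


if __name__ == "__main__":
    lo, hi = split_addr16(0x1234)
    print(f"LO = ${lo:02X}, HI = ${hi:02X}")
```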
- The helper threads only know about data transfers; timing (i.e. clocks) is provided by the controller. The latter needs adjusting when the number of threads is changed. Timing diagrams are provided.
- The current thread implementation assumes the data bus starts with an offset (i.e. not P[7..0] -> muxc.drv.sram.cpld.rw.n.spin). As I don't have any other setup, there wasn't much point implementing the no-offset version (muxc.drv.sram.cpld.rw.0.spin). If there is demand, drop me a line.
Enjoy!
Post Edited (kuroneko) : 6/23/2009 6:47:01 AM GMT
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propalyzer: Propeller PC Logic Analyzer
http://forums.parallax.com/showthread.php?p=788230
I want to order some 6.25MHz crystals to up the clock speed to 100MHz (with PLL).
I am using the HX512 out of the box. Do I have to burn a different CPLD image to use the 4-cog driver?
I have 7 ProtoBoards, so lack of cogs isn't a problem. Cluso and others have dedicated one Prop chip just for RAM handling, so your approach falls into line with what others are doing. But you seem to be a Wizard, so I eagerly await your driver solution. Your mastery of the two counters approach is unlike the code I have seen to date. Thanks...
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
Post Edited (James Michael Huselton) : 6/22/2009 2:46:17 PM GMT
@all: I updated the original posting with the driver code.
Thanks for sharing.
Ron
Post Edited (Ron Sutcliffe) : 6/23/2009 8:30:51 AM GMT