The Quest for RAM
Retrobits
Posts: 46
Hi all,
I am on a Quest for RAM. I know this issue has been covered in the forum at some length in the past, but I'm curious what the current thinking is. Here's my plight:
I have a Propeller Professional Development Board, and a breadboard SD card unit. To this, I'd like to add some RAM that is compatible with the Catalina C compiler. I've looked at some of the devices that Catalina supports, and there seem to be various choices for RAM and interface designs. It would be awesome if the RAM choice had a commercially-available breakout board for use in a solderless breadboard, or if it was itself DIP compatible. Failing that, I'll figure out how to wire it up.
128K would be neat, 512K would be awesome, and more would be even better.
Thanks in advance for any ideas!
Oh, and P.S. - my goal is to use this RAM for a virtual disk drive unit I'm designing for a retro laptop computer called the Epson PX-8. I've been toying with this project for many years, and recently had some prototype success, so I'm very excited to take the next steps.
There is an open-source implementation of such a virtual drive that runs under Linux. It's written in C, and thus it would be cool if I could use Catalina and port as much of the code as possible with minimal rewrite. I saw that Catalina can use the SD card as a standard file system, but only if enough RAM is present - hence, my quest for more memory.
Thanks again, and take care,
- Earl
Comments
Here's my attempt at an 8-bit data bus SRAM. It's a stack of 256Kbit DIP chips to make a 256KByte module. It's still waiting for an application.
There are links to other memory modules in the above thread.
I don't think Cluso's RAMBlade is listed anywhere in the thread. That would be another module to look at.
I haven't tried Catalina myself yet. I don't know which modules work best with it.
Duane
Jazzed has solutions using Quad SPI srams. Bill has solutions too (see micronaughts - spelling doesn't look right), so search for Bill Henning. Dr_Acula has a number of DracBlades.
This schematic is version 5 http://www.smarthome.jigsy.com/files/documents/Propeller_v5.pdf
Essentially it is 3 latches and a 512k ram chip and uses 12 propeller pins.
The 'better' design I am working on uses one latch and 24 propeller pins and ought to get close to Cluso's design in terms of speed (the one that uses virtually all the propeller pins connected to the ram chip). I am porting all my designs over to the Gadget Gangster format as this has now been endorsed by Parallax as an officially supported platform.
The great thing about Catalina is that it supports a variety of ram chips. So you can start with one hardware design, write all your code, and very easily change to a different hardware design down the track, and all your code will still work. E.g., if you run out of ram even with 512k, you can then go to jazzed's 32mb design.
A virtual drive in C sounds great. Please keep us posted with progress on the project.
Actually that's 4MB QuadSPI Flash (2x for an 8 bit bus with 10 pins total).
I do have an 8x 32KB SPI SRAM board but it's a little slower than flash.
Samples of 64KB SPI SRAM are on the way now.
The really great thing about using SPI anything is its synchronous nature. For example, read/write data for a stock SRAM requires changing the address for every byte. With 2x QuadSPI you just send a clock strobe, so it's possible to read as fast as the Propeller HUB will let you (5MB/s at 80MHz). Getting the initial address takes longer, but on average with a cache the throughput is much higher. I just haven't tested the drivers with Catalina yet.
There is another language where I can switch between HUB and 2x QuadSPI flash for certain tests, and the performance impact is less than a 1.33x slowdown. More complicated programs like FFT run less than 2x slower (actual 1.47:1).
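To put rough numbers on the throughput argument above: a Propeller 1 cog gets one hub access window every 16 system clocks, which is where 5MB/s at 80MHz comes from if one byte moves per window. The sketch below works that out and shows how a one-off address setup cost gets amortised over a longer burst. The 64-clock setup cost and the function names are made up for illustration; they are not measurements of any real driver.

```python
HUB_WINDOW_CLOCKS = 16  # one hub access slot per cog every 16 system clocks (Propeller 1)

def hub_bytes_per_sec(clock_hz, bytes_per_access=1):
    """Peak rate when one hub access happens in every window."""
    return clock_hz * bytes_per_access // HUB_WINDOW_CLOCKS

def burst_bytes_per_sec(clock_hz, burst_len, setup_clocks):
    """Average rate for a burst: pay setup_clocks once, then one byte per hub window."""
    total_clocks = setup_clocks + burst_len * HUB_WINDOW_CLOCKS
    return clock_hz * burst_len / total_clocks

clock = 80_000_000
print("peak:", hub_bytes_per_sec(clock), "bytes/s")
for burst in (1, 16, 64, 256):
    # setup_clocks=64 is a hypothetical figure, chosen only to show the trend
    print(burst, "byte burst:", round(burst_bytes_per_sec(clock, burst, setup_clocks=64)), "bytes/s")
```

The longer the burst (for example a cache line fill), the closer the average gets to the hub limit, which is the point about a cache hiding the initial address cost.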
I've offered this service before, but it seems appropriate to restate it here:
If anyone wants Catalina support for their external RAM solution, all they have to do is send me a working board (and preferably some clue - e.g. some Spin code - about how it is supposed to work!) and I'll write the necessary XMM API for inclusion in the next release of Catalina.
Or alternatively you can do this yourself and just send me the appropriate platform-specific target files (see the Catalina target directory for examples).
Ross.
(I have the HX-20 and a PX-4. Not certain if I have a PX-8... my collection is not exactly organised... )
Are you aiming for an emulation of the optional floppy drive?
As for speed issues; I doubt that will be much of a problem.
(The HX-20 which is the origin of these models used 2 x 6301 CPUs running at 0.6MHz. The others are single-CPU, I believe. A bit faster, though. )
Ross,
I didn't know about this offer! Thanks. I'll definitely send you a Flashpoint Rampage in hopes you can make it work with Catalina.
I'm kinda focussing on using it for video memory right now...
Hi Earl
This may be on the outside edge of what you want, so here's food for thought. There is a method to use SD instead of RAM, which is a little slower than actual RAM, but much cheaper in terms of hardware and I/O resources.
On my project, we treat (one) SD as internal to the prop system, that is, organized and accessed to be optimized for speed. No FAT, no PC compatibility. A single block on the SD is accessed at any time, and the last block accessed is the current block. Areas of the disk are allocated into "files" and once defined, a file definition is permanent until the SD is reformatted. The files become a "memory map" of the SD; they can be read and the contents altered, but the size is fixed. The idea is that these restrictions reduce overhead and keep it fast.
The result is we can pretend the prop with an 8Gig SD card has 8Gig of RAM. Since the SD is persistent, the application(s) use the SD as both RAM and storage. I think usage might be similar to how they used to store programs in core memory, if I understand core memory. With some imagination, one can pretend that the COG memory is used as L1 cache and registers, HUB memory is used as 32k of L2 cache, and the SD is used as RAM and disk storage.
The trick becomes dividing the target applications and data into less-than-32K chunks, working on one disk block at a time, and keeping the SD channel saturated. If one can do this, one can have programs and data up to the size of the SD card. The techniques for this are necessarily a little different from those on a PC workstation, but might be similar to dealing with the 64K boundary on old DOS systems minus the segment registers. The transfer speed is limited to the transfer speed of the SD card, so it is not the solution for all applications. You can have a very large number of small (assembler) functions that run at full prop speed, and a very large number of sub-32K data chunks.
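As a rough illustration of the scheme described above (fixed-size "files" mapped onto raw blocks, with only one current block held in memory), here is a minimal host-side sketch. It is not the project's actual code: `FakeCard` and `FixedFile` are hypothetical names, the card is simulated with a bytearray, and 512 bytes is the usual SD block size.

```python
BLOCK_SIZE = 512  # usual SD block size in bytes

class FakeCard:
    """Stand-in for a raw SD card: a flat run of blocks, no FAT."""
    def __init__(self, num_blocks):
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return bytearray(self.data[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE])

    def write_block(self, n, buf):
        self.writes += 1
        self.data[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE] = buf

class FixedFile:
    """A fixed-size region of the card with a single 'current block' buffered."""
    def __init__(self, card, first_block, num_blocks):
        self.card = card
        self.first = first_block
        self.size = num_blocks * BLOCK_SIZE  # size never changes after creation
        self.current = None                  # card block number held in self.buf
        self.buf = None
        self.dirty = False

    def _select(self, offset):
        """Make the block containing offset current; return the index within it."""
        if not 0 <= offset < self.size:
            raise IndexError("offset outside fixed-size file")
        block = self.first + offset // BLOCK_SIZE
        if block != self.current:
            self.flush()                     # write back the old block if changed
            self.buf = self.card.read_block(block)
            self.current = block
        return offset % BLOCK_SIZE

    def read(self, offset):
        i = self._select(offset)
        return self.buf[i]

    def write(self, offset, value):
        i = self._select(offset)
        self.buf[i] = value
        self.dirty = True

    def flush(self):
        if self.dirty:
            self.card.write_block(self.current, self.buf)
            self.dirty = False

card = FakeCard(16)
f = FixedFile(card, first_block=4, num_blocks=2)
f.write(0, 0xAA)      # loads block 4
f.write(600, 0xCC)    # crosses into block 5: block 4 is written back first
f.flush()
print(card.reads, card.writes)
```

Accesses that stay inside the current block cost no card traffic at all, which is why keeping work within one block at a time matters so much in this approach.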
Yes that is a good way to do it. Sal felt the overhead of FAT support would more than double the SD support, and significantly impact performance, so he took the other option. The SD can still be accessed through the prop, and all file things that FAT does still get done.
The plan is to have a second SD slot that includes the full FAT support. This way, the "internal" SD is as fast as possible, and only the "external" SD is impacted. The thinking is that SD is cheap when an SD adapter is used instead of an actual slot, and only needs a couple of pins; also, the FAT support can be loaded when needed and eliminated when complete.
Thanks for the responses! Of the various choices for RAM, I think I'll be going with Cluso's RAMBlade. Having a full Propeller "co-processor" with extra RAM, micro SD card support and Catalina support would be a great solution for this. The primary Propeller would handle I/O and user interface, and the RAMBlade can do the storage and heavy lifting. On a side note, I'm also looking forward to playing with the CP/M implementation for RAMBlade - that should be a lot of fun!
Speed is not really an issue, compared to some applications (e.g., real-time hi-res video). I only need to support a 38400 baud connection - that's the speed at which the Epson PX-8 laptop talks to the disk drive.
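For a sense of scale, the arithmetic behind that is sketched below. It assumes 8N1 framing (10 bits on the wire per data byte), which is an assumption on my part and not something stated in this thread, and it ignores any protocol overhead.

```python
BAUD = 38400
BITS_PER_BYTE = 10  # assumes 8N1 framing: 1 start + 8 data + 1 stop bit

def bytes_per_sec(baud=BAUD, bits_per_byte=BITS_PER_BYTE):
    """Raw payload rate of the serial link."""
    return baud // bits_per_byte

def seconds_to_send(num_bytes, baud=BAUD):
    """Time to move num_bytes over the link, ignoring protocol overhead."""
    return num_bytes / bytes_per_sec(baud)

print(bytes_per_sec(), "bytes/s")
print(round(seconds_to_send(1024), 3), "s per 1024 bytes")
```

At a few kilobytes per second on the wire, any of the RAM or SD options discussed in this thread is far faster than the link itself.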
@Gadgetman: Yes, the hope is to emulate the Epson PF-10 portable floppy drive, which connects to the PX-8 computer over a 38400 baud serial connection. I believe it would also work for the PX-4. The Linux implementation of a virtual floppy (vfloppy) emulates four of those PF-10 drives simultaneously, and that would be my intent here as well. Disk image files serve as the virtual floppy media, similar to the approach used in emulators. In this case, those files would be on the SD card.
I've sent an e-mail to Cluso asking about purchasing an assembled RAMBlade. I'll keep the forum up to date on my progress - once I'm a bit further on a working prototype, I'll make a post and include some pictures.
Thanks again!
- Earl
I believe Cluso's design is the fastest one out there. Where I think it really shines is with Catalina in XMM mode, because once you have the hardware sorted, you can start coding and never worry about running out of space. And Ross is very close to getting Catalina to something that is extremely easy to install and use.
But I am in awe of prof_braino's concept of caches. This is a brand new way of looking at things, and it opens up the possibility of huge programs. In particular, if you are smart and code your program so that things that are read/write often are the bits that end up in cog/hub ram, and the 'mostly read only' bits are in the sd cache, then the program could be optimised to run much more efficiently.
I wonder if the concept could be extended to considering a "sram" cache? I think it sits somewhere between "hub cache" and "sd cache", a bit faster than sd, doesn't wear out, but of course, data is lost on power down and it is not as big as the sd cache (though bigger than the hub cache).
Once you abstract the hardware in such a way, you can optimise code and then hardware can be changed down the track without changing the code.
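A minimal sketch of that kind of abstraction, assuming a small LRU cache of fixed-size lines sitting in front of a swappable backing store. The class names (`DictBacking`, `CachedMemory`) and the line sizes are hypothetical; this is a host-side illustration, not Catalina's cache or anyone's driver.

```python
from collections import OrderedDict

class DictBacking:
    """Hypothetical slow store; could stand in for SRAM, flash or SD."""
    def __init__(self):
        self.lines = {}
        self.fetches = 0

    def load(self, line_no, line_size):
        self.fetches += 1
        return bytearray(self.lines.get(line_no, bytes(line_size)))

    def store(self, line_no, data):
        self.lines[line_no] = bytes(data)

class CachedMemory:
    """Small LRU cache of fixed-size lines in front of any backing store."""
    def __init__(self, backing, line_size=64, max_lines=4):
        self.backing = backing
        self.line_size = line_size
        self.max_lines = max_lines
        self.lines = OrderedDict()  # line_no -> bytearray, least recent first

    def _line(self, addr):
        line_no = addr // self.line_size
        if line_no in self.lines:
            self.lines.move_to_end(line_no)  # mark as most recently used
        else:
            if len(self.lines) >= self.max_lines:
                old_no, old = self.lines.popitem(last=False)  # evict least recent
                self.backing.store(old_no, old)  # always written back (no dirty bit)
            self.lines[line_no] = self.backing.load(line_no, self.line_size)
        return self.lines[line_no]

    def read(self, addr):
        return self._line(addr)[addr % self.line_size]

    def write(self, addr, value):
        self._line(addr)[addr % self.line_size] = value

backing = DictBacking()
mem = CachedMemory(backing, line_size=64, max_lines=2)
mem.write(0, 1)
mem.write(64, 2)
mem.write(128, 3)   # cache is full, so line 0 is evicted and written back
print(mem.read(0), backing.fetches)
```

The calling code only sees `read` and `write`, so replacing `DictBacking` with a different store would not change it, which is the hardware-independence being described.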
As an aside, a while back cluso designed the "triblade" with three propellers, some with sram. I don't know if it is still available, but it was the first propeller board I owned and I don't think I ever fully explored all the possibilities. It used high speed links between the propeller chips as well as autoloading one propeller to the next. And it had a micro SD and an external ram chip.
I'm thinking of taking this further, with a propeller plus sram dedicated to the highest resolution display possible within the timing constraints, coupled with a propeller plus sram dedicated to running big Catalina C programs as fast as the hardware will allow (24 pins devoted to external ram on each board). It is all very, very experimental, but thanks to this amazing forum and the generous time and help from a number of individuals, both parts of the project are going ahead in leaps and bounds. Like many projects, it will probably end up being a piece of code with multiple acknowledgments.
@retrobits, a cached drive could well end up being very useful to a number of software projects. For a start, if you use more pins than the 4 that connect to an sd card, it ought to fundamentally be faster than an sd card.
Nonetheless, my impression is that all external RAM can only run as fast as Hub RAM because the i/o is on that same bus. Am I wrong?
I don't have any performance figures to hand but I'm sure they have been posted on the various ext RAM threads.
Heater, just for hardware comparison sake I implemented heater_fft 2.0 in a language (David Betz xBasic) that supports HUB only and 2xQuadSPI Flash (10 pin interface).
The HUB only version finished heater_fft in about 2.574 seconds and the Flash version finished in 3.802 seconds. That's only a 47.7% slowdown.
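The slowdown figure can be checked from the two timings quoted above; the helper names in this small sketch are just for illustration.

```python
def slowdown_ratio(base_seconds, other_seconds):
    """How many times longer the second run took than the first."""
    return other_seconds / base_seconds

def slowdown_percent(base_seconds, other_seconds):
    """Extra run time as a percentage of the baseline."""
    return (slowdown_ratio(base_seconds, other_seconds) - 1) * 100

hub_seconds = 2.574    # HUB only run, from the post
flash_seconds = 3.802  # 2x QuadSPI Flash run, from the post
print(f"{slowdown_ratio(hub_seconds, flash_seconds):.3f}:1")
print(f"{slowdown_percent(hub_seconds, flash_seconds):.1f}% slower")
```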
I'd like to see the performance difference on one of the "faster" solutions. Flash is not RAM, but a similar SRAM solution is on my desk ... I have to port the cache driver before I'll know the performance difference for that.
HUB only at 96MHz
SpinSocket-Flash (4MB code space cached in HUB at 96MHz)
Those heater_fft results don't look right to me.
Maybe it was 1.0 instead of 2.0. All I know is the results were identical to the SPIN version with #define SPINBUTTERFLIES except for performance. I didn't bother with PASMBUTTERFLIES since it's not exercising the "business part" of the language.
I am now rereading Andre's documentation for the Xtreme 512K ram as it is an interesting 8bit wide device that was intended to be well suited for video support.
Actually, it's not too bad. It's certainly not the slowest XMM Ram platform I have. The "auto increment" function for addresses means that as long as you are reading successive bytes it is quite quick. This was intended for streaming data, but it works for streaming instruction fetches as well.
You only need to slow down on the corners (so to speak)!
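A toy model of why auto-increment helps sequential fetches: the host only has to send a full address when the next access is not the byte right after the previous one. The class below is a hypothetical sketch, not the actual XMM driver, and the wrap-around at the end of memory is an assumption.

```python
class AutoIncrementRam:
    """Toy RAM whose address register steps forward after every access."""
    def __init__(self, size):
        self.mem = bytearray(size)
        self.addr = None        # address currently held by the chip
        self.address_loads = 0  # times the host had to send a full address

    def read(self, addr):
        if addr != self.addr:
            self.addr = addr    # the slow "corner": reload the address
            self.address_loads += 1
        value = self.mem[self.addr]
        self.addr = (self.addr + 1) % len(self.mem)  # auto-increment (wrap assumed)
        return value

sequential = AutoIncrementRam(1024)
for a in range(256):
    sequential.read(a)          # straight-line fetches

scattered = AutoIncrementRam(1024)
for a in range(0, 256, 2):
    scattered.read(a)           # every access skips a byte

print(sequential.address_loads, scattered.address_loads)
```

Straight-line code or streamed data pays for one address load per run, while every jump or non-adjacent access pays for another.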
Ross.
The 10 pin SPI RAM and Flash solutions I'm producing are twice the performance for less than half the price. Big QuadSPI Flash chips are even cheaper. I received the 64K x8 SPI RAM samples (512KB/board) and will be testing them soon. QuadSPI Flash addressing is much faster than for SPI SRAM.