Adding 32MB SDRAM to Propeller ...

jazzed · 2010-08-03 02:57

Please see the SDRAM Module product announcement here:
http://forums.parallax.com/showthread.php?t=127044

Contents of this thread are kept mostly unchanged except for some picture edits.

Here are some results from experimenting with SDRAM on Propeller. Some of this was obliquely discussed in the 'DRAM adding' thread. Yes, these things are puzzles that need to be done :)

The attached board picture has two 32MB SDRAMs sitting on a GadgetGangster PlatformSD. The second SDRAM is not used now but could be functional with more effort. The wires are mostly experiments, and the little gate is the only part I had. There is one real mistake:) This particular design allows 8 free pins but uses SCL for clock. I have another schematic in process that allows 8 free pins, does not use SCL, and can time-share up to 4 more pins with smaller memory options (see schematic in second picture below).

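The output below shows some statistics and pseudo-random test output. A 32 byte read burst takes about 6us so the throughput is potentially > 5.3MB/s for 512 byte read. I'm using 32 byte bursts for buffer read/write so that the cache is more efficient (this could be ported to VMCOG except for the memory limit in that driver).

33 MB SDRAM Test Startup. Cache size: 4096 Tag count: 128 Line size: 32

SPIN Overhead Calls/s: 20000
Cached Buffers/s: 100000 about 3200 KB/s minus SPIN overhead.
Uncached Buffers/s: 33333 about 1066 KB/s minus SPIN overhead.

Pseudo-Random Pattern Test 33554 KB
----------------------------------------------------------------
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Test Complete!

Pseudo-Random Pattern Test 33554 KB
----------------------------------------------------------------
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Test Complete!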
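
As a quick sanity check on these figures, here is a small Python sketch (Python is used only for the arithmetic; the helper name `rate_kb_per_s` is made up for illustration). It assumes 1 KB = 1000 bytes, which is consistent with the test reporting a 32MB part as "33554 KB", and it takes the buffers-per-second values from the test output reported in this thread.

```python
BURST_BYTES = 32          # line size reported by the test
BURST_TIME_S = 6e-6       # "about 6us" per 32 byte read burst

def rate_kb_per_s(buffers_per_s, buffer_bytes=BURST_BYTES):
    """Throughput in KB/s (1 KB = 1000 bytes), truncated like the test output."""
    return buffers_per_s * buffer_bytes // 1000

# Raw burst rate implied by 32 bytes every 6 microseconds.
burst_rate_mb = BURST_BYTES / BURST_TIME_S / 1e6
print(f"raw burst rate: {burst_rate_mb:.2f} MB/s")

# Rates implied by the reported buffers per second.
print("cached:  ", rate_kb_per_s(100_000), "KB/s")
print("uncached:", rate_kb_per_s(33_333), "KB/s")
```

The raw burst figure is where the "> 5.3MB/s" estimate comes from; the lower cached and uncached numbers include the buffer management around each burst.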

I'll make a FAB available if there is interest. My initial hardware target is the GG Platform. As far as software goes, I plan to add support for Catalina, ZOG (via VMCOG if possible), and other development environments. Applications that I intend to try are rich video (32 to 512 byte reads at > 5MB/s) and others.

I should be able to improve cached performance once the flusher knows how to skip saving a non-dirty page before reload. That being said, Bill ;0 if you used fewer bits for your LRU counter, I could add this to VMCOG.

The first pic is the 64MB test FAB. The second pic is a more flexible 8/16/32MB option VGA/TV adapter schematic.

Cheers.
--Steve

jazzed · 2010-08-03 03:06

Reserved for software by OP

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Pages: Propeller JVM

Duane Degn · 2010-08-03 03:25

This is very interesting.

Many of the posts I've read concerning SDRAM have been with regard to retro computer emulation. I think I've seen some posts with regard to video buffers (I'm not sure if this is possible or not). Most of the discussion has been a bit over my head, but I do feel the ideas brushing against my hair every so often.

The question I've long wanted to ask without sounding too ignorant is:

Can this memory be accessed (without too much trouble) via a Spin program? I've been able to access Winbond flash chips. Is this much harder than that? (Assuming the hardware and drivers are already taken care of.)

I'm trying to write a mini database for the Prop. It wouldn't be a general database. It would just be so I could keep track of samples I'm testing. So far I've been writing it with the expectation of using a Winbond flash chip. Would SDRAM be a significant improvement with regard to speed?

Based on my reading, it seems it is possible to run larger programs than would normally fit in Hub memory. I don't think it is possible to use SDRAM to hold large Spin programs. If I've understood correctly, large programs would be written in C using LMM (Large Memory Monkeying or something like that).

It would be cool to use this without having to learn a lot of extra programming skills.

I like that this uses the Prop Platform.

Duane

jazzed · 2010-08-03 04:43

@Duane Degn,

I'm using a Spin program for testing the SDRAM, so yes it's easy to access. All you need to do is call a method in the driver. There is not much difference for PASM access except that for speed, PASM needs to keep track of its own buffer. The guts of the SDRAM driver are written in PASM.

This is the first published attempt I know of for using SDRAM (yes DDR2 is similarly possible and is a little cheaper). The retro emulators use lower density SRAM. Bill's Morpheus design uses SRAM for video buffering. There are a few discussions about EDO DRAM including a design I did a few years ago. The advantage of SDRAM, DDR2, and PSRAM (cell phone) is the clock. You can set up a page for read/write and strobe the data with the clock without having to worry about toggling address bits. It's easy to cheat with movs outa, data too :)

I haven't looked at the Winbond SPI flash much but will very soon for another project. I suspect that would be slower than an 8 bit wide memory. As far as a "sample database" goes you're probably just as well off to use the Flash as long as there are not long delays polling for the write to finish. In any event I assume the data has to be stored in non-volatile memory like SD Card or some other media at some point ... the question becomes is 33MB enough temporary storage?

Yes, Large Monkey Model LMM :) programs can execute code and read/write variables in external memory. Catalina, as I understand it, supports up to 16MB of executable code memory now, but that is more or less artificial as Ross has explained in email. I'm not sure if Zog has a limit or not, but having a big memory makes Zog GNU very attractive to me. I may port the Propeller JVM for such memory, but I'm not sure if that has value at this point unless someone is willing to pay for it. That could open further the possibility of Java 2 programs though.

I've experimented and published methodologies for using bigger Spin programs using EEPROM and have been able to run programs that would use more than 32K code, but there are difficult restrictions.

The Propeller Platform modules are very nice. If you ask me any add-on hardware features should target the Platform modules since they are good for prototyping. Flexibility always comes with a price though. I'll build a generic custom board later most likely since it's much easier that way to control pin allocation and provide "solutions" that don't require tinkering.

The skill for using such memory is built in to the driver object ... code sharing is your friend :). All you have to do is call methods with "data := readLong(address)" or "writeByte(address, byte)" for example ... simple.

Cheers,
--Steve


heater · 2010-08-03 07:17

Looks like another puzzle completed :)

Zog can execute C/C++ code and access data anywhere within a 32bit address space. 4 GBytes.
I have currently divided the memory space in such a way that some areas are redirected to HUB RAM or COG space. This is quite arbitrary and those areas could be placed anywhere so 32M or more is not a problem.

Zog looks forward to you getting this SDRAM supported in VMCOG.

Actually I was thinking we should create an RMCOG driver. "Real Memory" or "Real Mode" COG would have the same API as VMCOG for accessing external RAM but forgo all the virtual memory management and HUB space that takes up for the working set. Each access would go directly to the RAM hardware.

What's the point of RMCOG?

1) Imagine every existing ext RAM hardware design was supported by RMCOG.
2) Then any Spin application using the RM/VMCOG API would work on any ext RAM board.
3) Duane Degn here is an example of the many who could use ext RAM from Spin just as a "place to put stuff". Perhaps speed is not so critical. It would be great if such apps would just work with any ext RAM or FLASH or whatever.
4) Anyone doing new ext RAM designs only has to add a little driver code to VM/RMCOG to make it usable by all apps.

Perhaps a cache of the last block loaded, for those drivers that use block access, would speed along execution of ZOG and others. If Zog, for example, ran code from ext RAM but his data was in HUB, that would zip along.
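
To make the "same API, different backend" idea concrete, here is a rough Python simulation. Everything in it is hypothetical: the class names, `read_byte`/`write_byte`, and the 32 byte block size are illustrative only and are not the VMCOG or RMCOG interface. It just shows application code that does not care which backend sits underneath, plus the "cache of the last block loaded" variant.

```python
from abc import ABC, abstractmethod

class ExtMemory(ABC):
    """Hypothetical common interface; an app sees only these two calls."""
    @abstractmethod
    def read_byte(self, addr: int) -> int: ...
    @abstractmethod
    def write_byte(self, addr: int, value: int) -> None: ...

class DirectMemory(ExtMemory):
    """'Real mode' style backend: every access goes straight to the simulated RAM."""
    def __init__(self, size: int):
        self.ram = bytearray(size)
    def read_byte(self, addr):
        return self.ram[addr]
    def write_byte(self, addr, value):
        self.ram[addr] = value & 0xFF

class LastBlockCachedMemory(ExtMemory):
    """Backend that keeps a copy of the last block loaded."""
    def __init__(self, size: int, block: int = 32):
        self.ram = bytearray(size)
        self.block = block
        self.base = None              # start address of the cached block
        self.buf = bytearray(block)
        self.block_loads = 0
    def _load(self, addr):
        base = addr - addr % self.block
        if base != self.base:         # miss: fetch the whole block
            self.buf[:] = self.ram[base:base + self.block]
            self.base = base
            self.block_loads += 1
    def read_byte(self, addr):
        self._load(addr)
        return self.buf[addr - self.base]
    def write_byte(self, addr, value):
        self._load(addr)
        self.buf[addr - self.base] = value & 0xFF
        self.ram[addr] = value & 0xFF  # write through to the backing store

def store_and_sum(mem: ExtMemory) -> int:
    """Application code: identical regardless of backend."""
    for i in range(64):
        mem.write_byte(i, i)
    return sum(mem.read_byte(i) for i in range(64))

for backend in (DirectMemory(1024), LastBlockCachedMemory(1024)):
    print(type(backend).__name__, store_and_sum(backend))
```

Both backends return the same result; only the number of block transfers to the simulated RAM differs.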

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

Dr_Acula · 2010-08-03 08:05

This is very interesting.

I looked up sdram http://en.wikipedia.org/wiki/Synchronous_dynamic_random_access_memory

And I realised I probably have many of these chips sitting in the junk pile of old PCs. Just need to desolder them?

How many pins does it take to drive one of these - you say 8 pins are free so I am assuming 24 pins? But with the possibility of sharing pins somehow? Can these chips be placed in HiZ and then you can double up pins for, say, sd card access?

Great work!

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.smarthome.viviti.com/propeller

Toby Seckshund · 2010-08-03 09:39

The bent DracBlade I made has P0-P19 free, for memory.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Why did I think a new, more challenging, job was a good idea ??

Post Edited (Toby Seckshund) : 8/4/2010 8:12:10 AM GMT

heater · 2010-08-03 09:53

Toby, I'm trying to drink coffee here :)


Duane Degn · 2010-08-03 15:04

Steve,

Thank you for taking time to answer my questions. Your explanation really helped. I think I'm acronym impaired. SRAM has been discussed a lot, not SDRAM (at least not as much as SRAM).

Michael Green has an object for Microchip SPI SRAM and Winbond SPI Flash Memory. I confess to not being able to access my Winbond Flash (W25X16AVDAIZ 2MB) with Mike's object. This was a failing on my part, not Mike's.

I realized I needed to have a basic understanding of how the flash chip was working so I'd know which method to call in Mike's object. To make sure I understood some aspect of the chip I wrote my own method that uses the feature I was learning about. I ended up writing my own object. It's all in Spin so it's pretty slow, but I think I know how the flash chip works. One aspect of the chip I didn't use was its ability to be read from two bits per clock (the input pin may also serve as a second output pin). From what I understand about SDRAM (from your answer) it can be read from (and written to) 8 bits at a time.

The Winbond flash has a fast read option (single and double bit modes) that only requires the start address, with the address automatically advancing as it is read (the entire chip can be read this way). The main disadvantage to this flash chip (I think most are like this) is it can only be written to a page (256 bytes) at a time and the page has to have been previously erased (to all ones; I'd have thought zeros).

Editing data on the chip requires holding data in hub RAM while the necessary pages are erased and written to with updated data.
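
The erase-then-rewrite cycle described above can be sketched as a small Python simulation. This is not driver code for the Winbond part: `FakeFlash`, `update`, and especially the 4096 byte erase granularity are assumptions for illustration (check the datasheet for the real erase sizes). The 256 byte page size and the "erased means all ones" behavior are from the post.

```python
PAGE = 256        # program granularity mentioned in the post
SECTOR = 4096     # ASSUMED erase granularity; check the datasheet for the real part

class FakeFlash:
    def __init__(self, size):
        self.mem = bytearray(b"\xFF" * size)    # erased flash reads as all ones
    def erase_sector(self, addr):
        base = addr - addr % SECTOR
        self.mem[base:base + SECTOR] = b"\xFF" * SECTOR
    def program_page(self, addr, data):
        assert addr % PAGE == 0 and len(data) <= PAGE
        for i, b in enumerate(data):
            self.mem[addr + i] &= b             # programming can only clear bits

def update(flash, addr, new_bytes):
    """Read-modify-write: buffer the sector in RAM, erase, then reprogram."""
    base = addr - addr % SECTOR
    buf = bytearray(flash.mem[base:base + SECTOR])   # the copy held in 'hub RAM'
    start = addr - base
    buf[start:start + len(new_bytes)] = new_bytes
    flash.erase_sector(base)
    for off in range(0, SECTOR, PAGE):
        flash.program_page(base + off, buf[off:off + PAGE])

flash = FakeFlash(2 * SECTOR)
update(flash, 10, b"sample-1")
update(flash, 10, b"SAMPLE-2")    # overwrite works only because of the erase step
print(bytes(flash.mem[10:18]))
```

With SDRAM none of this bookkeeping is needed, since any byte can simply be overwritten; the price is that the contents are lost at power-down.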

It is good to know an option for faster, larger memory is available. I'll keep this in mind so if I find a problem SDRAM would solve I'll know where to find it.

This would be fun to play with when I have some free time (which occasionally happens during winter months (in the northern hemisphere)).

Thanks again for your reply,

Duane

Bill Henning · 2010-08-03 16:20

Well done!

But the schematic is a bit small to make out for my poor old eyes...

Actually, it's not the LRU counter bits that stop using the current VMCOG for your memory; it's the direct mapped nature of the MMU. (Unless what you meant was keeping some VM upper address bits there, using it basically as a "hashing" function for the pages. That would be easy to do with the next-gen VMCOG that I've started on.)

I've forked off a new branch of VMCOG for large VM's, however to be able to use 64MB of VM, I'll have to use both larger page sizes AND a hub-based page-present table.

I do NOT want to go to multi-level TLB's, the speed hit would be too large.

jazzed said...
Here are some results from experimenting with SDRAM on Propeller. Some of this was obliquely discussed in the "DRAM adding" thread. Yes, these things are puzzles that need to be done :)

The attached board picture has two 32MB SDRAMs sitting on a GadgetGangster PlatformSD. The second SDRAM is not used now but could be functional with more effort. The wires are mostly experiments, and the little gate is the only part I had. There is one real mistake :) This particular design allows 8 free pins but uses SCL for clock. I have another schematic in process that allows 8 free pins, does not use SCL, and can time-share up to 4 more pins with smaller memory options (see schematic in second picture below).

The output below shows some statistics and pseudo-random test output. A 32 byte read burst takes about 6us so the throughput is potentially > 5.3MB/s for 512 byte read. I'm using 32 byte bursts for buffer read/write so that the cache is more efficient (this could be ported to VMCOG except for the memory limit in that driver).
33 MB SDRAM Test Startup. Cache size: 4096 Tag count: 128 Line size: 32

SPIN Overhead Calls/s: 20000
Cached Buffers/s: 100000 about 3200 KB/s minus SPIN overhead.
Uncached Buffers/s: 33333 about 1066 KB/s minus SPIN overhead.

Pseudo-Random Pattern Test 33554 KB
----------------------------------------------------------------
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Test Complete!

Pseudo-Random Pattern Test 33554 KB
----------------------------------------------------------------
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Test Complete!
I'll make a FAB available if there is interest. My initial hardware target is the GG Platform. As far as software goes, I plan to add support for Catalina, ZOG (via VMCOG if possible), and other development environments. Applications that I intend to try are rich video (32 to 512 byte reads at > 5MB/s) and others.

I should be able to improve cached performance once the flusher knows how to skip saving a non-dirty page before reload. That being said, Bill ;0 if you used fewer bits for your LRU counter, I could add this to VMCOG.

The first pic is the 64MB test FAB. The second pic is a more flexible 8/16/32MB option VGA/TV adapter schematic.

Cheers.
--Steve

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com E-mail: mikronauts _at_ gmail _dot_ com
My products: Morpheus / Mem+ / PropCade / FlexMem / VMCOG / Propteus / Proteus / SerPlug
and 6.250MHz Crystals to run Propellers at 100MHz & 5.0" OEM TFT VGA LCD modules
Las - Large model assembler Largos - upcoming nano operating system

Post Edited (Bill Henning) : 8/3/2010 4:31:08 PM GMT

Bill Henning · 2010-08-03 16:23

Excellent suggestion... we think alike... see "Alternate Usage Model" in original post of VMCOG thread :)

heater said...
Looks like another puzzle completed :)

Zog can execute C/C++ code and access data anywhere within a 32bit address space. 4 GBytes.
I have currently divided the memory space in such a way that some areas are redirected to HUB RAM or COG space. This is quite arbitrary and those areas could be placed anywhere so 32M or more is not a problem.

Zog looks forward to you getting this SDRAM supported in VMCOG.

Actually I was thinking we should create an RMCOG driver. "Real Memory" or "Real Mode" COG would have the same API as VMCOG for accessing external RAM but forgo all the virtual memory management and HUB space that takes up for the working set. Each access would go directly to the RAM hardware.

What's the point of RMCOG?

1) Imagine every existing ext RAM hardware design was supported by RMCOG.
2) Then any Spin application using the RM/VMCOG API would work on any ext RAM board.
3) Duane Degn here is an example of the many who could use ext RAM from Spin just as a "place to put stuff". Perhaps speed is not so critical. It would be great if such apps would just work with any ext RAM or FLASH or whatever.
4) Anyone doing new ext RAM designs only has to add a little driver code to VM/RMCOG to make it usable by all apps.

Perhaps a cache of the last block loaded, for those drivers that use block access, would speed along execution of ZOG and others. If Zog, for example, ran code from ext RAM but his data was in HUB, that would zip along.


jazzed · 2010-08-03 16:37

@Heater,

I can add SDRAM support to a hacked variant of VMCOG for Zog. Do you have a Propeller Platform Module?
RMCOG is a good idea for adding hardware with an abstraction layer for any underlying implementation.

@Dr_Acula,

It's probably better to say that the board I have now uses 21 dedicated pins.
The next board will use 20 pins where 16 are dedicated and 4 are sharable.
Both designs use a '573 latch.

Here are possible pin assignments for the next board:

Propeller Pins : Usage
P0-11 : Dedicated SDRAM interface
P24-27 : Dedicated SDRAM interface
P28-29 : I2C
P30-31 : Serial Port
P12-15 : Sharable SDRAM interface
P16-23 : Free
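
For anyone who wants to double-check the pin budget, here is a small Python sketch (the `span` helper and group names are just for illustration). It confirms that the assignments above cover P0..P31 exactly once and that the SDRAM interface totals 20 pins, 16 dedicated and 4 sharable.

```python
def span(first, last):
    """Inclusive range of Propeller pin numbers as a set."""
    return set(range(first, last + 1))

pins = {
    "dedicated SDRAM": span(0, 11) | span(24, 27),
    "sharable SDRAM":  span(12, 15),
    "free":            span(16, 23),
    "I2C":             span(28, 29),
    "serial":          span(30, 31),
}

# Every pin P0..P31 should appear in exactly one group.
all_pins = [p for group in pins.values() for p in group]
assert len(all_pins) == len(set(all_pins)) == 32

for name, group in pins.items():
    print(f"{name:16s} {len(group):2d} pins")

sdram_total = len(pins["dedicated SDRAM"]) + len(pins["sharable SDRAM"])
print("SDRAM total:", sdram_total)
```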

I'm thinking these things are possible:

A fully functional computer with TV, SDCARD, Keyboard/Mouse, Serial/I2C ports.
A hi-res VGA LCD touch-screen platform using I2C devices for 256KB non-volatile storage.
Embedded systems not using video (web server, mp3 player, data-logger, etc...)

A more powerful LCD touch-screen device is possible by one-shot gating the serial port pins with special hardware.

@Duane,

It is good to experiment with code like you mention as it builds your abilities.
If you're having trouble with Mike's driver, he would be more than happy to help you get it working.
Make a new thread asking for help. Mike is very active on the forums.

Cheers,
--Steve


jazzed · 2010-08-03 16:48

@Bill,

Here's a snippet from what I'm using now. It should not be too hard to integrate something similar with VMCOG.
Granted, the number of instructions required to manage the buffers is more than I would like right now.

{{
=====================================================================
cache is organized as a table. The tagline table contains pointers to cache entries.
Each tagline has the following definition:

TAGLINE:    ccmx_xxxx
such that:  c=counter, d=dirty, m=c+d+xx, and x=match address
}}

con
    CACHESIZE   = $4000>>2           ' power of 2 cache size - could be less
    TAGCOUNT    = CACHESIZE/LINELEN ' cache with linelen bytes
    LINESHFT    = 5 ' 9             ' maximum 512 (1<< 9) minimum is 32 (1<<5)
    LINELEN     = 1<<LINESHFT       ' CACHELINE SIZE

    TAGMASK     = $003F_FFFF        ' $3FFFFF*512 = 2GB $3FFFFF*32 = 128MB
    COUNTMASK   = $FF80_0000        ' count
    DIRTYSHFT   = 22                ' shift by N to get dirty bit   test tag,DIRTYMASK wc
    DIRTYMASK   = 1 << DIRTYSHFT    ' dirty bit mask                muxc tag,DIRTYSHFT

{{
=====================================================================
cache
}}
dat ' cache
tagram          long 0 [TAGCOUNT]
cache           long 0 [CACHESIZE]
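
For readers who want to play with the tag layout off-chip, here is a Python sketch of how a tag long could be packed and unpacked using the masks from the snippet above. Two things are assumptions on my part and not taken from the driver: the name `COUNTSHFT`, and the idea that the match field stores the address shifted right by LINESHFT (which is what the "$3FFFFF*32 = 128MB" comment suggests).

```python
TAGMASK   = 0x003F_FFFF      # bits 21..0: match address (line number)
DIRTYSHFT = 22
DIRTYMASK = 1 << DIRTYSHFT   # bit 22: dirty flag
COUNTMASK = 0xFF80_0000      # bits 31..23: counter
COUNTSHFT = 23               # assumed name; not in the original snippet
LINESHFT  = 5                # 32 byte lines

def pack_tag(address, dirty=False, count=0):
    """Build a 32 bit tag word from a byte address, dirty flag, and counter."""
    line = (address >> LINESHFT) & TAGMASK
    return ((count << COUNTSHFT) & COUNTMASK) | (DIRTYMASK if dirty else 0) | line

def unpack_tag(tag):
    """Split a tag word back into its three fields."""
    return {
        "line":  tag & TAGMASK,
        "dirty": bool(tag & DIRTYMASK),
        "count": (tag & COUNTMASK) >> COUNTSHFT,
    }

tag = pack_tag(0x01FF_FFE0, dirty=True, count=3)
print(hex(tag), unpack_tag(tag))

assert TAGMASK | DIRTYMASK | COUNTMASK == 0xFFFF_FFFF   # fields tile the 32 bit long
assert (TAGMASK + 1) << LINESHFT == 128 * 1024 * 1024   # 128MB reach with 32 byte lines
```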


jazzed · 2010-08-03 21:13

Here lies a post where what I thought was a good idea once lived. RIP idea.

Using the CLK and ALE on the same pin is not possible as far as I can tell.
Now using original backup plan with P23 for clock.

Cluso99 · 2010-08-03 22:29

Nice work Steve. Schematic still hard on the eyes but perfectly readable.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:

· Home of the MultiBladeProps: TriBlade, RamBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: CPUs Z80 etc; Micros Altair etc; Terminals VT100 etc; (Index) ZiCog (Z80), MoCog (6809)
· Prop OS: SphinxOS, PropDos, PropCmd   Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz   MultiBlade Props: www.cluso.bluemagic.biz

jazzed · 2010-08-04 02:28

@Cluso99 - Thanks Ray.

I've done some research and it is possible to handle 3 SDRAM chip types (16, 32, 64 MB) from ISSI and Micron with one board design. The 16Mx8 parts are US $6 to $8, 32Mx8s are $10 to $12, and 64Mx8s are about $30. I guess yield is bad for the big ones, but it's nice that any of the chips can be used on the same board.

Seems to me that 2 different GadgetGangster Platform boards would serve existing needs: Option 1) SDRAM + VGA/TV + Mouse/KB connectors and Option 2) just SDRAM. I realize that there are boards for the GG Platform that provide VGA/TV, but I'm wary of stacking too much. I'll shoot FABs with VGA, etc... as soon as I do rework to verify that ALE/Clock can be on a single pin.

One or more single board Propeller Computers with these design elements should be ready before winter.

I'm so happy to realize a cheap and manufacturable high density solution. BTW, looking at the DDR specifications, there are some differences regarding data window alignment, but for the most part the use model is the same. One big problem though: 3V DDR chips do not appear to be available at all. Guess we'll have to wait for Prop2 to use DDR.

Cheers,
--Steve


Toby Seckshund · 2010-08-04 08:23

Jazzed

I took the VGA down to Hsync, Vsync, Foreground and Background. I also put an analog multiplexer (4053) into the I2C lines so that once the Hsyncs started, the EEPROM was switched out and the KBD in. I never got around to sharing the SD card pins, I just gave it the 4 pins from the usual KBD and Mouse. This gives 20 free pins. I guess the SD card ones would tip the balance if they were shared on the memory ones and the VGA was shifted up to P24-P27.


mikediv · 2010-08-04 14:47

Hi guys. Jazzed, as always I am floored that you share your designs so readily with us. It's awesome, man, thank you. I have a question: if I wire wrap up a board using your second schematic that uses 20 pins, is there any "simple" software that I can use to test the SDRAM chip to make sure it's working?
Thanks

Oh, also, doesn't SDRAM require constant refreshing or the data will be lost?

jazzed · 2010-08-04 17:25

@mikediv,

Yes, SDRAM requires refresh, that's built-in on my COG driver which I'll post in the near future. I haven't verified that sharing the ALE*/Clock (inverted) works, but plan to work on that today. Don't commit to this design until I have good results.

Wire-wrap for this SDRAM would be super-tough without a socket. The Schmartboard 72 pin adapter card has the right pitch and costs about $10.

As far as giving things away goes, I doubt I'll ever make a living selling these boards or any of my software to forum members :) I'll provide hardware at cost + postage + a small gratuity if someone wants it though.

@Toby,

I was also thinking about using some 4053's for changing the P28-31 definition which could be set with a 7474. Sadly, this would not be available on an add-on card without a propeller on board. Have you tried using SD Card on such a switched interface? It seems natural to try that, but I would be a little concerned about slew rates impacting performance.

I have another unique idea for the I2C bus.


Toby Seckshund · 2010-08-04 19:08

I had pondered on crudely sharing some pins so that loading stopped the serial comms and KBD or video for a bit (after all I am emulating early Z80 things) but the habit of VGA monitors going to sleep would make things worse. I had some CD4053s here but I think that the 74HC version would be better on speed issues.

It would be a backwards step on "style and grace" but getting all the housekeeping bits up onto the top 8 pins would leave the lower 24 for the memory. That would also be getting nearer and nearer to Cluso's Blade2 and may require Blade1 to get consistent In/Outs back.

My thought in using DRAM (old EDOs) was to get the chip count down by not using the latches and the 1 of 8 decoder, only to end up using added mplx ones elsewhere.


Ding-Batty · 2010-08-04 23:06

@jazzed,

In your most recent schematic, I don't see any connections to the Bank-Select pins BA0 and BA1 -- should they be connected to two of the A13..A15 pins?

The reason I have been looking at EDO DRAM is to minimize the number of I/O pins needed (within reason). But in your latest schematic, it looks like you are down to 4 control lines + Data + Address.

You could save one more I/O pin if instead of dividing the address lines as a group of 8 and a group of 5, you use 7 and 6 (or 7 and 7) -- that would give up to 64 MB address range (but you still have the BA pins floating :)

TD

Ding-Batty · 2010-08-04 23:20

Sorry to reply to my own post, but I got my math wrong...

7 address bits + latch + Row/column addressing gives up to 28 bits of address space, which is 256MB...
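
Spelled out as a tiny Python calculation (the function name `address_space` is just for illustration): each pin counts twice because of the latch, and twice again because row and column addresses are sent in separate cycles.

```python
def address_space(pins, latched=True, row_col=True):
    """Return (address bits, addressable locations) for a multiplexed address bus."""
    bits = pins * (2 if latched else 1) * (2 if row_col else 1)
    return bits, 1 << bits

bits, size = address_space(7)
print(f"7 pins -> {bits} address bits -> {size // 2**20} MB")
```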

TD

Bill Henning · 2010-08-04 23:52

Steve,

I've been chewing on this before commenting.

I don't think you need any counter bits at all - what you describe below seems to be a direct-mapped cache, with 32 byte cache lines, and tag bits for the upper address bits, which does not need LRU counts.

Say you wanted to use 4KB of HUB ram for the cache - that's 12 bits of address, minus 5 bits for the 32 byte cache line, leaving 7 bits.

This means your TLB will have 128 entries.

Now if you have 64MB of DRAM, that needs 26 bits of addressing; minus 12 bits (for 32 byte lines, with 128 entries), you need 14 bits in the TLB to hold the upper address bits.

This should work pretty well, and with the 32 byte cache line size, there should not be too much thrashing going on.

I don't think going to a 2 or 4 way associative design would buy you any performance - the extra checking IMHO would negate any advantages.
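
The bit budget above can be checked with a short Python sketch. The function names are made up for illustration; the numbers (4KB cache, 32 byte lines, 64MB of DRAM) are the ones used in this post. It also shows how a byte address would split into tag, table index, and line offset under that scheme.

```python
def cache_geometry(cache_bytes, line_bytes, ram_bytes):
    """Return (offset bits, index bits, tag bits) for a direct-mapped cache."""
    offset_bits = (line_bytes - 1).bit_length()
    index_bits = (cache_bytes // line_bytes - 1).bit_length()
    addr_bits = (ram_bytes - 1).bit_length()
    tag_bits = addr_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits

def split(addr, offset_bits, index_bits):
    """Split a byte address into (tag, index, offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

off, idx, tag = cache_geometry(4 * 1024, 32, 64 * 2**20)
print("offset bits:", off, "index bits:", idx, "tag bits:", tag)
print("top of 64MB splits into:", split(0x3FF_FFFF, off, idx))
```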

jazzed said...
@Bill,

Here's a snippet from what I'm using now. It should not be too hard to integrate something similar with VMCOG.
Granted, the number of instructions required to manage the buffers is more than I would like right now.

{{
=====================================================================
cache is organized as a table. The tagline table contains pointers to cache entries.
Each tagline has the following definition:

TAGLINE:    ccmx_xxxx
such that:  c=counter, d=dirty, m=c+d+xx, and x=match address
}}

con
    CACHESIZE   = $4000>>2           ' power of 2 cache size - could be less
    TAGCOUNT    = CACHESIZE/LINELEN ' cache with linelen bytes
    LINESHFT    = 5 ' 9             ' maximum 512 (1<< 9) minimum is 32 (1<<5)
    LINELEN     = 1<<LINESHFT       ' CACHELINE SIZE

    TAGMASK     = $003F_FFFF        ' $3FFFFF*512 = 2GB $3FFFFF*32 = 128MB
    COUNTMASK   = $FF80_0000        ' count
    DIRTYSHFT   = 22                ' shift by N to get dirty bit   test tag,DIRTYMASK wc
    DIRTYMASK   = 1 << DIRTYSHFT    ' dirty bit mask                muxc tag,DIRTYSHFT

{{
=====================================================================
cache
}}
dat ' cache
tagram          long 0 [TAGCOUNT]
cache           long 0 [CACHESIZE]


Post Edited (Bill Henning) : 8/5/2010 12:28:42 AM GMT

jazzed · 2010-08-05 00:11

Ding-Batty said...
In your most recent schematic, I don't see any connections to the Bank-Select pins BA0 and BA1 -- should they be connected to two of the A13..A15 pins?

Hey, you're right :) I had BA0-1 connected to some jumpers before. The ERC tool might have caught that, but it's best to catch it in advance. Thanks. I've looked at using only 7 pins in the latch, but I keep coming up short a pin.

Ding-Batty said...
7 address bits + latch + Row/column addressing gives up to 28 bits of address space, which is 256MB...

That's right in the case of EDO DRAM. For SDRAM the biggest chip I've seen is 64MB and the limit is 128MB for 8 bit data. The limit is because A10 is used for pre-charge at READ/WRITE CAS command time.
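
One way to arrive at the 128MB figure, as a Python sketch. The assumptions here are mine and should be checked against a datasheet: 13 multiplexed address lines (A0..A12, as in the 8 + 5 split discussed above), 2 bank-select bits, and A10 unavailable as a column address bit because it carries the precharge flag during READ/WRITE.

```python
ADDRESS_LINES = 13   # A0..A12 (assumed from the 8 + 5 split discussed above)
BANK_BITS = 2        # BA0, BA1

row_bits = ADDRESS_LINES          # all lines carry row address
col_bits = ADDRESS_LINES - 1      # A10 carries the precharge flag during READ/WRITE
total_bits = BANK_BITS + row_bits + col_bits

print(total_bits, "bits ->", (1 << total_bits) // 2**20, "MB with 8 bit data")
```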

@mikediv, BTW, the schematic shown is different than what I have in my office because the latch bits are on a different byte now; I'm doing that to save some instructions* on the address cycles. Consequently, the driver I have now would need to change for the new schematic. I haven't had time to finish the ALE/Clock experiments today as yard-work has taken priority.

*To summarize what Chip said in a Propeller2 post, getting the SDRAM going is like starting a train, but once it's running it's fast. It's tough to optimize the startup, but the data can flow at 10MB/s once it's started with the method that Phil, Jonathan, and I devised. The best news about longer startup cycles is it's easier to get 2 cogs sampling for double the data rate as long as the data remains on P0..7 (any other pin configuration for data makes that idea impossible).

Cheers.
--Steve


jazzed · 2010-08-05 00:36

Bill Henning said...
I've been chewing on this before commenting. ...

@Bill,

I do appreciate the thought and was wondering why you were so quiet. I'm glad to see your analysis is a positive evaluation. Yes the scheme is direct mapped (modulo hash) and the only thing I'm using counters for right now is for statistics and preventing reload of 0. I still have to work on my write-back flusher though :) I wonder if a write-through policy might be more effective since we can signal that the operation is complete while the write happens in the COG in the background (the next read would be blocked until the write is finished anyway).
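
To compare the two policies, here is a toy Python model. It is not the driver: the class, the workload, and the 128 line by 32 byte geometry are illustrative, and it counts SDRAM write transactions rather than time, so it ignores the fact that a write-through store moves less data than a full line flush and can overlap with other work.

```python
class DirectMappedCache:
    def __init__(self, lines=128, line_bytes=32, write_through=False):
        self.lines, self.line_bytes = lines, line_bytes
        self.write_through = write_through
        self.tags = [None] * lines      # which memory line each slot holds
        self.dirty = [False] * lines
        self.ram_reads = self.ram_writes = 0

    def _slot(self, addr):
        line = addr // self.line_bytes
        slot = line % self.lines        # modulo hash, as in a direct-mapped cache
        if self.tags[slot] != line:
            if self.dirty[slot]:        # only dirty lines are saved before reload
                self.ram_writes += 1
                self.dirty[slot] = False
            self.tags[slot] = line
            self.ram_reads += 1
        return slot

    def read(self, addr):
        self._slot(addr)

    def write(self, addr):
        slot = self._slot(addr)
        if self.write_through:
            self.ram_writes += 1        # every store goes to SDRAM immediately
        else:
            self.dirty[slot] = True     # defer until the line is evicted

def workload(cache):
    for _ in range(10):                 # ten stores to one line
        cache.write(0)
    cache.read(128 * 32)                # conflicting line forces an eviction
    for a in range(32, 32 * 9, 32):     # eight read-only lines, never dirty
        cache.read(a)
    return cache.ram_writes

print("write-back   :", workload(DirectMappedCache()))
print("write-through:", workload(DirectMappedCache(write_through=True)))
```

Which policy wins in practice depends on the access pattern; repeated stores to the same line favor write-back, while scattered single stores narrow the gap.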

A question now is how can we get ZOG to run using SDRAM?

Of course Catalina is on my language list and maybe even ICC. PropellerJVM support is up in the air at this point because of infrastructure requirements.

Question for Bean: Does it make sense to port a PropBASIC XMM? Do you have anything like that today?

Cheers.
--Steve


Bill Henning · 2010-08-05 01:17

Steve,

You are welcome :) and I think that which of write-through / delayed write will perform better will depend on the application's access pattern.

As for my reduced rate of posting... I've been a bit busy: consulting, trouble shooting my new boards, etc.

As for running ZOG on SDRAM... use the same interface as VMCOG :)

When I get around to making XMCOG for Morpheus, it will use the exact same interface, and treat unneeded api's as NOP's.

As long as the API/interface method is kept the same, it should be possible to change between direct access (XMCOG), VM+MMU (VMCOG) and your cache cog variant without the app caring!

Frankly, only {rd|wr}v{byte|word|long} should be used by normal applications... and perhaps a memsize call (to determine amount of available memory). The other API's are really for debuggers, debugging VMCOG, and OS use.
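
For anyone unfamiliar with the brace shorthand, this Python one-liner expands it into the six names it stands for. The names follow mechanically from the pattern above; their exact signatures are not shown in this thread, so treat the list as a reading aid only.

```python
from itertools import product

# Expand {rd|wr}v{byte|word|long} into individual call names.
names = [f"{op}v{size}" for op, size in product(("rd", "wr"), ("byte", "word", "long"))]
print(names)
```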

I suspect that given time, all flavors of LMM will also be made available in a VMCOG using version :)


Cluso99 · 2010-08-05 01:40

Steve, I think any of the code using external memory would appreciate drivers for your SDRAM. Every platform supported ultimately brings more users which is a good thing. Many who are likely to go down these paths bring other experiences and thoughts.

Pity we don't have Prop 1.5 to give us those desperately sought-after extra pins. If/when, we will see a major breakthrough with the upper-end prop developments.

I am pleased you will also offer a board. The more the merrier!


jazzed · 2010-08-08 14:38

mikediv wrote: »

I have a question if I wire wrap up a board using your second schematic that uses 20 pins is there any 'simple' software that I can use to test the Sram chip to make sure its working???

@mikediv and others:

Bad news. I've been fighting for a week now to make the shared CLK/ALE configuration work. I don't think it will ever work. I'll be removing that schematic and go back to my original design with perhaps a different CLK method.

Good news. I was able to reduce the number of instructions for cache access.

I'll update further progress later.

Thanks,
--Steve

jazzed · 2010-08-09 15:50

Hi. I've added a new 32MB schematic which allows 8 unshared pins to the social group "Thanks for the Memories." The schematic has other goodies that I hope will be useful.

Cheers,
--Steve

Ding-Batty · 2010-08-09 16:08

I just took a look at your schematic -- the jpeg is 428x373, according to Firefox, so it is just a little bit blurry.

Seriously, I clicked on the thumbnail to see the picture, then opened it in a new window to see the size. Is there another way to view the picture so that it has more detail, or is this part of the size-limitation "fun and games" we are seeing with the new forum software?

And one more question, perhaps a much larger one: What do you imagine the social group should be used for? What should we do there, and what in the public forums?

Heater. · 2010-08-09 16:09

Jazzed. I've just spent a couple of minutes wondering why any social group would want 8 free Propeller pins. Then it dawned on me you mean that space in the forum they call "social groups" ....:)

Thing is I don't want to join any social group just to view a schematic. I'm already a member of this forum, twice, isn't that enough?

What you have in that social group is a discussion about the topic of external RAM on the Prop. Well isn't that a thread? Can't it be out in the open here like all other thread topics? Am I missing a point here?
