The issue here is whether to proceed with the development of a Prop I design that's mostly completed already and simply adds the 2nd set of 32 I/O pins that was originally contemplated in the Prop I design (DIRB / INB / OUTB, etc.). Anything else, as suggested by others, would require that the design be redone, which would increase the projected cost for completion, delay the schedule to production, and take further resources away from Prop II development. There are no savings in providing just some of the 32 I/O pins, since they'd all have to be on the chip itself (or be removed from the existing design). Sure, you could use a smaller package, but chips are partially priced by area and, to recover that cost, you'd want to sell a full pinout version first, then later consider the costs and tradeoffs of a smaller package with some I/O pins unconnected.
I may be way off base here, but I would think that having a large DRAM buffer isn't going to buy you a whole lot. Maybe for high speed communications or synchronization to a logic signal; other than that, you are still limited to how much RAM the ucontroller has. The ucontroller cannot execute instructions from DRAM, nor can it address registers there (wrong architecture), so for LMM the only thing it will buy you is maybe quicker swap times with your virtual extender (I'm an old school DOS programmer so I am unfortunately painfully familiar with virtual extenders). If all you need is buffer storage, I would think that NVRAM might be the better alternative. Yes, it's slower, but it's much easier to use and interface with, plus it maintains state, which is something you usually want to do with data. Also, if you put instructions in the DRAM and swap them out using LMM, or whatever the extender is called, then you WILL lose them on power down, so you still need some sort of persistent storage for your program.
I've been poring over the SDRAM datasheet and this thing is NOT simple. Someone correct me if I'm wrong, but this isn't something that you can just configure and start using. Refresh signalling must be done periodically (and in consideration of 4 separate "banks"), which doubles the complexity of the state machine. Also, to get full random access on each r/w, there is a lot of cycling that must be done to get the train onto another track. In short, it's pretty ugly, but still very compelling due to the bits/$.
I'm thinking now that the best way to handle this is to have a cog program do all the signalling and act as a memory server for the other cogs. We can get the data rate up this way and the only hardware requirement might be some simple instruction which merges a data byte/word/long with some control states to be mapped out to an I/O port. We could then issue a command or read/write data every clock (every cog instruction) by using a CTR PLL to generate the same frequency that the cog runs at, but with tweaked PHS so that the SDRAM clock transition is ~180 degrees out of phase with the cog clock. That way, when reading or writing to the I/O port which connects to the SDRAM, you have adequate setup and hold times with respect to the SDRAM's clock. Cogs could request single locations or cache-like reads to/from hub ram via the memory-serving cog. Any objections?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
Post Edited (Chip Gracey (Parallax)) : 1/30/2009 8:20:24 PM GMT
soshimo, the DRAM isn't intended for large persistent data storage; it's more for work RAM. You could have your Spin program in it, and a compiler, or an emulator with CP/M or whatever running with plenty of RAM. Hell, you could probably emulate an 8MHz 286 and have DOS on it, for that old school goodness :D or is it new-school, as we've already got propaltair for real deal old school :D
Chip, that's fine :) Little cache reads are cool; it could be like a mini blitter function to and from hub RAM, almost like paging RAM banks in and out :D
Chip Gracey (Parallax) said...
I've been poring over the SDRAM datasheet and this thing is NOT simple. Someone correct me if I'm wrong, but this isn't something that you can just configure and start using.
You are right...it's no wonder that the memory controller in a PC is usually one of the most complex and hottest running chips in the system. The refresh is a real pain, and all you need is a small code error or timing issue and you end up with possibly undetected data corruption in RAM.
I think a 64 pin version of the current chip, with no other changes, and a static RAM (SRAM) would be much much easier...more expensive, but probably a viable solution. If the RAM was just big enough to hold some video buffer for the VGA and some packet buffers for the Ethernet we could really open up a world of applications beyond a regular microcontroller. No need to try and make a full blown PC, but I find the multi core aspects of the Propeller to be extremely innovative and if only some more RAM was easy to add you could open up to new markets.
At the height of the .com bubble a class of chips called Network Processors or NPUs were very popular. They were similar in structure to the Prop, but were overly complicated to program and were supplanted for the most part by general purpose CPUs. There are still a few vendors out there like Cavium and RMI, but they are really geared to networking. The Prop is so cool because it brings the multi-core mentality down into a technology that is accessible to the average designer.
I really hope to see a follow-on Prop soon.
One last related item...you asked if anyone is using LOCKS. I think a better question is, SHOULD anyone be using locks? My guess is there are a huge number of applications out there that should be using locks but don't. The results are intermittent errors that may go unnoticed because they are very infrequent. Any time two tasks are sharing a piece of data larger than the word width of the machine's memory access, you need a semaphore.
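To make the point about locks concrete, here is a minimal sketch in Python rather than Spin or PASM. The class `SharedPair` and the function `count_torn_reads` are made-up names for illustration, and `threading.Lock` only stands in for a hub lock; the idea is simply that a record wider than one memory access has to be read and written under the same semaphore.

```python
import threading


class SharedPair:
    """Two-word record that must always be read and written as a unit."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for a hub lock / semaphore
        self._lo = 0
        self._hi = 0

    def write(self, value):
        # Without the lock, a reader could see the new _lo with the old _hi.
        with self._lock:
            self._lo = value
            self._hi = value

    def read(self):
        with self._lock:
            return self._lo, self._hi


def count_torn_reads(iterations=20000):
    """Run one writer and one reader; count reads whose halves disagree."""
    pair = SharedPair()

    def writer():
        for i in range(iterations):
            pair.write(i)

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    for _ in range(iterations):
        lo, hi = pair.read()
        if lo != hi:
            torn += 1
    thread.join()
    return torn


print("torn reads with lock:", count_torn_reads())
```

With the lock in place the reader can never observe a half-updated pair. Removing the two `with self._lock:` lines is the situation described above: it may appear to work for a long time and then fail intermittently.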
@Baggers - yes, but you can only execute instructions from HUB RAM, correct? So, again, the only thing it will buy you is swap speed with your LMM extender. You are still limited to the size of HUB RAM for your program - which is 512 longs, if memory serves correctly. I don't see the speed increase justifying the added complexity of the memory controller for the DRAM. The old DOS extenders would run on an XT at 4.77MHz and use disk storage as the program store for swapping.
ASM program size is limited to 512 bytes, SPIN is limited to the size of the RAM.
My bad, hinv is right, 512 longs, not bytes...
On the SPIN code size....is there any reason that the interpreter could not be changed to pull instructions from external RAM...wherever that might be, for example the DRAM or whatever ends up being implemented by Chip?
Post Edited (awesomeduck) : 1/30/2009 11:35:10 PM GMT
Spin programs reside in HUB memory, which is 8192 longs, or 32KB.
PASM programs reside in COG RAM, of which each cog has 512 longs, or 2KB.
@Chip, that sounds like a really good way of going about it. I was actually thinking along the same lines but, since we only have 32 pins to play with currently, with a separate prop on a fast connection like Beau's 14MB 4wire for the transfer. Of course a different protocol could be used depending on how many pins you have free. Unfortunately, transferring it back using even Beau's 4wire connection isn't fast enough for a 640x480 frame buffer.
I wonder how much of the desire for a 64-I/O Prop I is driven simply by not having the Prop II available now? How would the argument in favour stand up if the Prop II were here?
I can accept the arguments for 64-I/O though for hobbyist use that's offset against greater difficulty of use, and while it solves one problem ( not enough I/O ) it doesn't solve the problem of lack of Hub RAM and the two I believe often go hand in hand.
There's also a potential irony I can see in adding DRAM support to the Prop II; as the extra I/O pins are taken away for DRAM use, fewer are available for interfacing.
While large memory may be nice to have, I'm wondering how useful it would really be in practice. If it cannot be used at speeds required for high-res display it's dead in the water for that and possibly only useful for niche application markets. Are there more useful things which could be added which would benefit a wider audience, make the Prop II more attractive overall ?
My favourite hobby-horse is proper on-chip ADC. If we can have a background task handling DRAM refreshing and the like plus special address mappings, why not something which can scan however many ADC's at whatever rate and store readings in an addressable 'hub section' where any Cog can get at them ?
There's another alternative to on-chip DRAM; get super-fast inter-Propeller comms working, then design a separate DRAM handling chip which can pair with that.
Personally I think the super fast inter-Propeller comms is the way to go. If I need more I/O lines I start converting my peripherals to serial and/or use an i2c expansion chip. Having a built-in high speed communication system for proprietary Parallax chips that is fast and easy to use would make me buy the Parallax I/O expander, the Parallax DRAM interfacer, Parallax etc. I can then use the same prop2 module (or several if I need more power) to do all my projects and just add in modules that I can easily communicate with, because their use is built into the system.
I imagine being able to interface to a Parallax expansion board as easily as writing to outb,c,d,e,f,g... or accessing hub memory on the second prop as easily as giving an I/O address starting with 1: (I assume prop 1 would be 0:)
If you did that, the one programming interface could let you program each of the chips, or just program it as 1 and let the compiler decide which chip's cog to use.
I think the real benefit of having the extra I/O would be enabling the addition of fast SRAM such as the CY7C1010DV33-10ZSXI and CY7C1049DV33-10VXI. These are about the same price as the DRAM (but with much smaller densities of 256K x 8 and 512K x 8), but they are a cinch to interface with and very fast (10ns), making them well suited for video buffering apps. But again the question remains: would a designer choose the Prop 2 for such an app?
I think there may still be some fundamental confusion going on in this thread, as Mike pointed out the Prop B is exactly like the current Prop except it has 32 additional I/O. If you are discussing how you want to see such and such a feature or other improvements to the architecture, you are actually arguing against there being a Prop B, you are instead arguing for a feature to be added to Prop 2. There's nothing wrong with this, just realize which side of the fence you are sitting on.
I don't know how fast we could get it to go, but on the current prop, you wouldn't have a lot of pins left over to do VGA anyway. I don't think that even on a 64 I/O Prop 1 it could be pushed to do 1600x1200 as that requires 57.6MB/s even at 8bit color.
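As a back-of-envelope check on that figure, here is a small Python sketch of the arithmetic. The function name `framebuffer_bandwidth` is made up for illustration, and the 30 frames per second is my assumption: it is the rate at which 1600x1200 at one byte per pixel works out to exactly 57.6MB/s, and the post does not state the rate it used.

```python
def framebuffer_bandwidth(width, height, bytes_per_pixel, frames_per_second):
    """Bytes per second needed to stream every pixel of every frame."""
    return width * height * bytes_per_pixel * frames_per_second


# 1600x1200 at 8-bit colour (1 byte per pixel), assumed 30 frames per second
rate_30 = framebuffer_bandwidth(1600, 1200, 1, 30)
# Same mode at an assumed 60 frames per second, for comparison
rate_60 = framebuffer_bandwidth(1600, 1200, 1, 60)

print(rate_30 / 1e6, "MB/s at 30 frames/s")
print(rate_60 / 1e6, "MB/s at 60 frames/s")
```

If the display is refreshed straight out of the external RAM at 60Hz, the requirement doubles, so the quoted number is best read as a lower bound.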
On a prop II, though, it is a different matter. There is no reason why it could not be run faster than that. If you need more speed, go 16 bits wide, which has a theoretical limit of around 266MB/sec for the series of chips that Chip is talking about.
I do like the idea of super fast inter-Propeller communication links (a la transputer & SEAforth); that would be great too.
I also like the idea of having an ADC/cog. I would argue that for many microcontroller projects an ADC is more needed than a video generator, not that I would want to get rid of it.
Well, since this has degraded to a Prop II wish list, I want it all ;^)
@Paul it sounds like you are saying that if we get a Prop B we won't get a Prop II. That isn't what you mean, is it?
BTW, what happened to your Parallax avatar?
No, that's not what I'm saying at all. I'm just saying that the Prop B is already defined to be the current prop with 32 additional I/O. And the discussion is whether people would buy this. The feature enhancement discussions are really in regards to the Prop 2, and I think some people don't understand this. I don't know, maybe this isn't what Chip is thinking, but this has been my understanding.
Since I'm getting a fair number of PMs regarding this, I'll go ahead and state it here, but if you want to say anything regarding it please PM me so we don't send this thread into OT land. I no longer work at Parallax, the parting was amicable. I will still be a presence on the forums, however I no longer speak with the authority of someone working for Parallax, but from the informed customer perspective.
Paul Baker said...
I think there may still be some fundamental confusion going on in this thread, as Mike pointed out the Prop B is exactly like the current Prop except it has 32 additional I/O. If you are discussing how you want to see such and such a feature or other improvements to the architecture, you are actually arguing against there being a Prop B, you are instead arguing for a feature to be added to Prop 2. There's nothing wrong with this, just realize which side of the fence you are sitting on.
I am definitely sitting on the prop 2 side of the fence. Prop 1 is great for the apps I am doing right now, but my dream to make a cheap 22" picture frame requires prop2 and a lot of RAM. I think proprietary high speed inter-chip communication in silicon would be the best way to make the chip expandable to everyone's needs, because then you can just keep adding chips when you run out of I/O, RAM, or cogs.
I like ADC and DAC but I am willing to use external chips or timers if need be. Especially if the external chips could be on that high speed bus (by the way, I do believe it should be a bus or ring so expandability is infinite).
awesomeduck said...
ASM program size is limited to 512 bytes, SPIN is limited to the size of the RAM.
Ah, but SPIN code is not machine code. It's byte code that is read by an interpreter. The interpreter is machine code and runs from HUB RAM. I still stand by my original statement, that machine code only runs from HUB RAM and you are still limited to 512 longs (2KB) of actual machine code that the CPU can address. It's an inherent limitation of the architecture of the CPU and you cannot get around it.
Well, there could be a way around the 512 long limit, but it would require having a new op code which could read a cog cache. That would allow reducing the cog RAM used for storage of data, so the entire 512 longs could be used for code.
Some people expressed doubts about whether there is anything the Propeller would do with all that RAM; some people mentioned:
- Framebuffer for video
- System memory for emulated 286 system
But don't forget:
- MP3 playback
- Possibly even (simple) video playback
- Video capture for a robot camera
- After capture you might use even more space analyzing the capture in a machine vision algorithm
You could also:
- Make a logic analyzer that records state of 24 Propeller pins (would store thousands or millions of snapshots of the pin states per second; 32 MB might be used up in just two minutes of capture)
- Make a 3-d persistence of vision display that works by projecting successive images onto a paper screen that's traveling rapidly forward and back 2 or 3 inches to change the depth of the image; 3-d voxel data has X^3 memory requirements.
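For the logic analyzer idea, the capture time is easy to estimate. The sketch below is only arithmetic in Python; `bytes_per_sample` and `capture_seconds` are made-up names, 32 MB is taken as 32 x 2^20 bytes, and the sample rates are illustrative assumptions rather than anything measured on a Propeller.

```python
def bytes_per_sample(pins):
    """Whole bytes needed to store one snapshot of the given number of pins."""
    return (pins + 7) // 8


def capture_seconds(memory_bytes, pins, samples_per_second):
    """How long a capture lasts before the buffer is full."""
    return memory_bytes / (bytes_per_sample(pins) * samples_per_second)


MEMORY = 32 * 2**20  # 32 MB buffer

for rate in (100_000, 1_000_000):
    seconds = capture_seconds(MEMORY, 24, rate)
    print(f"{rate} snapshots/s of 24 pins fills 32 MB in {seconds:.1f} s")
```

At 3 bytes per snapshot, roughly 100,000 snapshots per second is what fills 32 MB in about two minutes; at a million per second the buffer lasts only around 11 seconds.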
@soshimo machine code runs in COG RAM (512 longs) in each of the 8 cogs, not in HUB RAM, which is 8192 longs and is accessed in a round robin fashion by all of the 8 cogs.
I'm with Paul, a Prop B with an SRAM that can be used for video RAM would be very attractive. I don't know what the state of the B is right now, maybe someone could clue us in...does the current Prop die have all 64 I/Os and it's just a matter of wiring them out? I know the registers for 64 I/Os are already in the chip; it's not clear if they are just reserved spots though.
hinv, I am not sure I follow...there are plenty of pins for VGA on the current Prop....Look at the Hydra.
Also...could someone clarify for soshimo and me...SPIN code is limited only by the size of the RAM, only the SPIN interpreter needs to fit in the 512 long cog RAM...right? It's possible to rewrite the SPIN interpreter to access this to-be-determined DRAM or SRAM, for instruction fetch, correct?
@awesomeduck, there wouldn't be enough pins for VGA and also the 32MB SDRAM that Chip was talking about (24 pins) or 512KB SRAM (~30 pins), plus boot i2c. That is where the Prop B comes in with 64 I/Os. The port B I/Os are not wired up internally in the chip, so a new chip would have to be produced, not just a new package to put it in. The new chip has been designed, so AFAIK, there is no changing it now.
The spin interpreter runs in a cog and thus is a very cleverly packed ~500 long program. The spin interpreter loads bytecodes out of HUB ram which is 32KB.
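The pin shortage can be tallied with a few lines of Python. The memory pin counts are the ones quoted above; the 8 pins for VGA and 2 pins for the boot i2c EEPROM are my assumptions about a typical wiring, and all names in the sketch are made up for illustration.

```python
TOTAL_PINS = {"Prop I": 32, "Prop I-B": 64}
MEMORY_PINS = {"32MB SDRAM": 24, "512KB SRAM": 30}  # figures quoted in the post
VGA_PINS = 8       # assumption: typical 8-pin VGA wiring
I2C_BOOT_PINS = 2  # assumption: SCL + SDA for the boot EEPROM


def pins_left(chip, memory):
    """I/O pins remaining after memory, VGA and boot i2c are wired up."""
    return TOTAL_PINS[chip] - MEMORY_PINS[memory] - VGA_PINS - I2C_BOOT_PINS


for chip in TOTAL_PINS:
    for memory in MEMORY_PINS:
        print(chip, "+", memory, "->", pins_left(chip, memory), "pins left")
```

A negative result means the combination does not fit; under these assumptions both memories overflow the 32 pin part and leave a couple of dozen pins free on the 64 I/O part.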
hinv, I guess I was already assuming the 64 I/O pins :)
I have looked at the spin interpreter source code, and "cleverly packed" is an understatement! For many years I wrote real time firmware for routers and switches and I fully understand how hard it is to fit code in such a small space. Very impressed with SPIN to say the least.
The goal is to always sell a product. Or maybe at the worst, sell it for a really long time.
A 64 pin Prop I "B" would bring the full potential of it to the table for sure. After reading the thread, the extra pins, plus the RAM would put a lot of interesting options on the table.
So then, would those options be as interesting, given the Prop II "A", with 64 pins at the start?
I'm thinking maybe not.
I'm also thinking that the time to release on Prop II is a factor also. If it's not too long, the initial investment in Prop I "B" might not pay off so well, simply because the less hobby friendly package and maybe price bump would just nudge people to a Prop II. It's all about that flurry of sales, and some sustaining sales in the window between now and Prop II.
Also, since the Prop I "A" is already here, and will continue to be sold, that flurry would really only be about that sub-set of possible projects suited for a Prop I "B", and only until a Prop II "A" then exists.
There are a fair number of cases I can think of that just don't play out for a return. :(
Say Prop II "A" is developed with a memory interface of some generic kind, like that being discussed on this thread. Given the interest in RAM, and history telling us there is always interest in more RAM, wouldn't this then just render a Prop I"B" somewhat ill-suited for a lot of things?
In the shorter term, having a more capable Propeller might get some attention. The emulators being worked on will help some with that, as would more intense projects. Is that attention worth something, perhaps not in terms of sheer number of Prop I "B" chips sold, but for building greater demand for the Prop II?
If so, that opens up the potential for a return quite a bit wider!
Finally, can anyone think of a set of use cases where the Prop I "B" would be well suited, even with the Prop II "A" released? This is fairly easy to see with the existing Prop I "A", as its packaging and friendly nature will always appeal. It's tougher for the "B" chip, if released. Thoughts / ideas on that?
@soshimo: Absolutely true. However, the LMM code model does present an interesting case. Instructions still execute from the COG, but are fetched from the HUB, and potentially external RAM. Depending on how software is written, the COG could then be seen as a cache, where instead of silicon rules dictating how and what runs "in cache", the programmer deals with this as their needs demand.
On Prop I, there is a clear performance cost for this. From what I understand of the Prop II, this difference will be far less.
Not sure what that means in the context of this thread (probably nothing), but just thought I would comment.
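For anyone who has not seen LMM before, the shape of it can be sketched as a toy model in Python. This is not real PASM: the three-instruction "machine", the name `run_lmm` and the tuple encoding are all invented for illustration. The only point is that a small fixed loop living in the cog fetches each instruction from a larger memory and then executes it, so the program is no longer bounded by 512 longs.

```python
COG_LONGS = 512  # size of one cog's RAM, for comparison only


def run_lmm(hub_program):
    """Toy LMM kernel: fetch each instruction from 'hub' memory, run it in the 'cog'.

    Returns (accumulator, number_of_fetches).
    """
    acc = 0
    pc = 0
    fetches = 0
    while True:
        op, arg = hub_program[pc]  # the per-instruction fetch from the big memory
        pc += 1
        fetches += 1
        if op == "add":
            acc += arg
        elif op == "jmp":
            pc = arg               # jumps just rewrite the kernel's program counter
        elif op == "halt":
            return acc, fetches
        else:
            raise ValueError(f"unknown op {op!r}")


# A program far larger than one cog could hold natively.
program = [("add", 1)] * 1000 + [("halt", 0)]
result, fetches = run_lmm(program)
print(len(program), "instructions,", fetches, "fetches, result", result)
```

Every executed instruction costs one fetch, which is the performance cost mentioned above; swapping `hub_program` for an external RAM driver would be the same loop with a slower fetch.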
@Everyone: As Mike Green and others have said, the Prop I-B (Prop I-64) is only a 64 I/O version of the Prop I. Nothing else can/will be added.
There are two additional subjects going on in this thread:
* Whether it is worth Parallax spending $80K on bringing the Prop I-B (64 I/Os) to market.
* How to add SDRAM (or SRAM) to the Prop II simply.
Prop II:
If the SDRAM takes a cog to control it continually, then don't bother with SDRAM, because we will not have a cog to spare in Prop II (only 8 cogs unless you squeeze more). If another cog is required for high speed comms to other props, we only have 6 left. If the Prop II is limited to 64 I/O pins total (i.e. the RAM interface cannot be added on extra pins), then that is another reason to forget it. We will have to use multiple SRAMs, either with or without address latches, depending on speeds and density required.
Is there perhaps a simple way to perform a quick LE (latch enable strobe if required), -WE (write enable strobe if required) and -CE(chip enable, multiple -CE's with decoding) via a cog rd/wr style instruction using a few preset configuration register(s)?
Here is an example...
SRAM 8 bit wide:
B0...7: D0...7 data bits 0...7
B8...(n+8): A0...n address bits 0...n
B(n+9): LE latch enable strobe (optional)
B(n+10): -WE write enable strobe
B(n+11): -CE chip enable strobe
B(n+12)...: maybe more -CE for extra SRAMs
If multiplexing is used, then the higher address pins could be multiplexed on either B8...(n+8) for addresses An...(n+n) or
multiplexed on B0...(n+8) for addresses An...(n+n+8). -CEn could be part of these upper address pins.
SRAM 16 bits wide: would be similar, except the data bits would be on B0-15 and address lines A1... would start at B16... The multiplexing would need to start at B0...
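As a quick check of the 8-bit layout above, here is a small Python sketch (separate from any attachment to the post) that turns the formulas into Port B bit numbers. The names `sram_pin_map` and `pins_used` are hypothetical, and n = 18 is just an example, chosen because a 512K x 8 part has address lines A0...A18.

```python
def sram_pin_map(n):
    """Port B bit numbers for an 8-bit wide SRAM with address lines A0..An."""
    return {
        "data": list(range(0, 8)),         # B0..B7     carry D0..D7
        "address": list(range(8, n + 9)),  # B8..B(n+8) carry A0..An
        "LE": n + 9,                       # latch enable strobe (optional)
        "WE": n + 10,                      # -WE write enable strobe
        "CE": n + 11,                      # -CE chip enable strobe
    }


def pins_used(n):
    """Total Port B pins consumed when a single -CE is used."""
    return sram_pin_map(n)["CE"] + 1


layout = sram_pin_map(18)
print("address on B%d..B%d" % (layout["address"][0], layout["address"][-1]))
print("LE=B%d  -WE=B%d  -CE=B%d" % (layout["LE"], layout["WE"], layout["CE"]))
print("pins used:", pins_used(18))
```

For n = 18 this comes to 30 pins with no multiplexing, which lines up with the roughly 30 pins quoted earlier in the thread for a 512KB SRAM.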
Sample code attached.
What I am thinking is one 32bit long contains the data (four x 8 bits) and one 32bit long contains the linear byte address (just like the rdbyte/wrbyte instruction to hub), but it goes to the Port B pins. These two longs could be registers if that makes it simpler. One or more registers control how the interface works, set once by a cog (each cog?) depending on the SRAMs used. Any cog can access the SRAM (by using locks - the user's responsibility).
The instruction would use the register to determine:
* If a latch is used
* If the data stored on the latch is current (the same as last time?). If so, latching can be skipped resulting in faster access. (i.e. has the upper address changed?)
* Place the upper address lines (and perhaps -CEs) on Port B and strobe LE (i.e. could the silicon decode say the top 2 address bits, if enabled, for 4 x -CEs)
* Place the lower address lines on Port B
* Write (-WE strobe) or read (no -WE strobe) the data (byte/word/long)
* If read, store the data (unused bits masked to 0) in the destination
* Increment the address register by 1/2/4 (for byte/word/long)
Perhaps the Immediate bit could be used to indicate that A0 & A1 specify which byte/word (bit offset) is to be used in a read or write. In a read, only the relevant byte/word would be changed in the data register (other bits remain untouched) - this saves shift, and, or instructions if building a long. In write, only the relevant byte/word would be written to the SRAM (output on B0...8/16).
If the immediate bit is not set, then a read byte/word would place the byte/word into the lower bits and clear high order bits. In a write, the high order bits would be ignored.
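The steps above can be modelled in software to see how much the "skip the latch if the upper address is unchanged" rule saves. This is a behavioural sketch in Python, not a proposal for the silicon: the class `LatchedSram`, the 8 low address bits, and the little-endian byte order are all my assumptions, and it only covers the case where the immediate bit is not set.

```python
class LatchedSram:
    """Behavioural model of an 8-bit SRAM behind an upper-address latch."""

    def __init__(self, size, low_bits=8):
        self.mem = bytearray(size)
        self.low_bits = low_bits   # address bits driven directly on the port
        self.latched_upper = None  # value currently held in the external latch
        self.le_strobes = 0        # how many times LE had to be pulsed

    def _select(self, addr):
        upper = addr >> self.low_bits
        if upper != self.latched_upper:  # latching is skipped when unchanged
            self.latched_upper = upper
            self.le_strobes += 1

    def write(self, addr, value, width=1):
        """Write width bytes (1/2/4); returns the auto-incremented address."""
        for i in range(width):
            self._select(addr + i)
            self.mem[addr + i] = (value >> (8 * i)) & 0xFF  # high bits ignored
        return addr + width

    def read(self, addr, width=1):
        """Read width bytes into the low bits; unused high bits stay 0."""
        value = 0
        for i in range(width):
            self._select(addr + i)
            value |= self.mem[addr + i] << (8 * i)
        return value, addr + width


ram = LatchedSram(1024)
next_addr = ram.write(0x10, 0x11223344, width=4)
print(hex(next_addr), ram.read(0x10, width=4), "LE strobes:", ram.le_strobes)
```

Sequential accesses within one 256 byte page cost a single LE strobe in this model, so the latch overhead mostly disappears for block transfers and only shows up on scattered accesses.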
Chip, I have no idea how easy or difficult this would be in silicon. At least some of it may be easy, and whatever could be done would improve the access. Thinking further, I guess this all needs to be within each cog.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔ Links to other interesting threads:
So help me out here, for a non-hardware guy like me. Let's say we have the 64 I/O pin Prop B: how fast can it interface with external SRAM? Because if such a beast exists, and the external memory interface is fast enough, that would pretty much make a "Prop C chip" a reality.
Depending on how we manage the memory map and what the memory driver instructions look like, it may even be possible for the user to put "fast" code in HUB RAM and other code in external SRAM. Or an RTOS can sit in the HUB RAM (see http://www.imagecraft.com/pub/emos_avr.pdf for our eMOS RTOS), and the user tasks can sit in the external SRAM. Or...
Imagine the possibilities! Most people are not bumping up to the limit of processing power with the Prop 1 yet; it's mostly the memory limitation that is the wall. So Prop 1 B could solve that. When Prop II comes out, that will address the processing limits end of things.
So what's the theoretical throughput of external SRAM?
@hinv: The Interpreter could be modified to run from external RAM. I have written a faster version of Chip's Interpreter, but it is not fully debugged. There is expansion space available. See my "Prop Tools..." link in my signature to find the RamInterpreter - Faster thread.
PropI-B:
Since there is no answer to the timing of the availability of Prop II, I can only presume it is still over 1 year away. This makes the Prop I-B more viable.
Unfortunately, I am becoming disillusioned with the Prop II because of lack of pins (64) and cogs (8). While we can live with 2KB (512 longs) per cog (design constraint), to do the things we are thinking of, we are going to require cogs for fast comms and memory controller. This also takes pins. If it were here today, it would be fantastic, but >1 year away, there will be a host of other things we will want to do. It will be very fast, but no more than other chips. It will no longer be simple to use because we will have to multi-task the cogs, and then the 512 longs will be even more of an issue.
So, my suggestion:
* Immediately go for the Prop I-B
* Delay the Prop II to add more cogs and more pins (special purpose silicon and pins for SRAM or SDRAM) and a cheap SMT package mounted on a small footprint pcb with holes for the hobbyist to mount to another pcb. Use more silicon if required, at more expense.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
http://www.propgfx.co.uk/forum/ home of the PropGFX Lite
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Post Edited (Paul Baker) : 1/31/2009 3:40:19 AM GMT
The Prop II, though, is a different matter. There is no reason why it could not be run faster than that. If you need more speed, go to 16 bits, which has a theoretical limit of around 266MB/sec for the series of chips that Chip is talking about.
Super-fast inter-Propeller communication links (a la Transputer & SEAforth) would be great too.
I also like the idea of having an ADC per cog. I would argue that for many microcontroller projects an ADC is more needed than a video generator, not that I would want to get rid of it.
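As a rough sanity check on that figure: peak bus bandwidth is just clock rate times bus width in bytes. The sketch below assumes a 133 MHz single-data-rate SDRAM clock, which is my assumption and not something stated in this thread; the function name is made up for illustration.

```python
def peak_bandwidth_mb_per_s(clock_mhz: float, bus_bits: int) -> float:
    """Peak transfer rate in MB/s, assuming one transfer per clock cycle."""
    return clock_mhz * (bus_bits // 8)


# ASSUMPTION: 133 MHz clock; the thread does not name the clock rate.
ASSUMED_CLOCK_MHZ = 133

bw_8 = peak_bandwidth_mb_per_s(ASSUMED_CLOCK_MHZ, 8)    # 8-bit data bus
bw_16 = peak_bandwidth_mb_per_s(ASSUMED_CLOCK_MHZ, 16)  # 16-bit data bus

print(f"8-bit bus:  {bw_8:.0f} MB/s peak")
print(f"16-bit bus: {bw_16:.0f} MB/s peak")
```

This is a ceiling only. Refresh, row activation, and bank switching all eat into it, so sustained rates would be lower.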
Well, since this has degraded to a Prop II wish list, I want it all ;^)
Doug
BTW, what happened to your Parallax avatar?
Doug
Since I'm getting a fair number of PMs regarding this, I'll go ahead and state it here, but if you want to say anything regarding it please PM me so we don't send this thread into OT land. I no longer work at Parallax, the parting was amicable. I will still be a presence on the forums, however I no longer speak with the authority of someone working for Parallax, but from the informed customer perspective.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Post Edited (Paul Baker) : 1/31/2009 6:24:23 AM GMT
I am definitely sitting on the Prop 2 side of the fence. Prop 1 is great for the apps I am doing right now, but my dream of making a cheap 22" picture frame requires Prop 2 and a lot of RAM. I think proprietary high-speed inter-chip communication in silicon would be the best way to make the chip expandable to everyone's needs, because then you can just keep adding chips when you run out of I/O, RAM, or cogs.
I like ADC and DAC but I am willing to use external chips or timers if need be, especially if the external chips could be on that high-speed bus (by the way, I do believe it should be a bus or ring so expandability is infinite).
Ah, but SPIN code is not machine code. It's byte code that is read by an interpreter. The interpreter is machine code and runs from cog RAM. I still stand by my original statement that machine code only runs from cog RAM, and you are still limited to 512 longs (2KB) of actual machine code that the CPU can address. It's an inherent limitation of the architecture of the CPU and you cannot get around it.
- Framebuffer for video
- System memory for emulated 286 system
But don't forget:
- MP3 playback
- Possibly even (simple) video playback
- Video capture for a robot camera
- After capture you might use even more space analyzing the capture in a machine vision algorithm
You could also:
- Make a logic analyzer that records state of 24 Propeller pins (would store thousands or millions of snapshots of the pin states per second; 32 MB might be used up in just two minutes of capture)
- Make a 3-d persistence of vision display that works by projecting successive images onto a paper screen that's traveling rapidly forward and back 2 or 3 inches to change the depth of the image; 3-d voxel data has X^3 memory requirements.
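To put rough numbers on the logic analyzer and voxel ideas, here is a back-of-envelope sketch. It assumes the 24 pin states are packed into 3 bytes per snapshot with no padding or timestamps, and one byte per voxel; both are my assumptions, and the function names are made up.

```python
def capture_seconds(buffer_bytes: int, pins: int, samples_per_second: float) -> float:
    """How long a buffer lasts when each snapshot stores one bit per pin."""
    bytes_per_sample = (pins + 7) // 8  # round up to whole bytes
    return buffer_bytes / (bytes_per_sample * samples_per_second)


def voxel_bytes(side: int, bytes_per_voxel: int = 1) -> int:
    """Memory for a cubic voxel volume: grows with the cube of the side."""
    return side ** 3 * bytes_per_voxel


BUFFER = 32 * 1024 * 1024  # 32 MB

t_100k = capture_seconds(BUFFER, 24, 100_000)    # 100 thousand snapshots/s
t_1m = capture_seconds(BUFFER, 24, 1_000_000)    # 1 million snapshots/s

print(f"100k samples/s: {t_100k:.1f} s")
print(f"1M samples/s:   {t_1m:.1f} s")
print(f"256^3 voxels:   {voxel_bytes(256) // (1024 * 1024)} MB")
```

Under those assumptions the "two minutes" figure corresponds to roughly 100,000 snapshots per second; at a million per second the same buffer fills in well under a quarter of a minute.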
hinv, I am not sure I follow...there are plenty of pins for VGA on the current Prop....Look at the Hydra.
Also...could someone clarify for soshimo and me...SPIN code is limited only by the size of the RAM, only the SPIN interpreter needs to fit in the 512 long cog RAM...right? It's possible to rewrite the SPIN interpreter to access this to-be-determined DRAM or SRAM, for instruction fetch, correct?
The Spin interpreter runs in a cog and thus is a very cleverly packed ~500-long program. The Spin interpreter loads bytecodes out of HUB RAM, which is 32KB.
No need to wait for future versions of the prop.
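To make the split concrete: the interpreter loop itself is what has to fit in the cog, and only its fetch routine needs to know where the bytecodes live. Below is a toy sketch of that idea. The opcodes are invented for illustration and are NOT real Spin bytecodes, and `ExternalRam` is a hypothetical stand-in for whatever external memory driver might exist.

```python
from typing import Callable, List

# Invented opcodes for illustration only; these are NOT real Spin bytecodes.
OP_PUSH, OP_ADD, OP_HALT = 0x01, 0x02, 0xFF


def run(fetch: Callable[[int], int]) -> List[int]:
    """Run a toy stack machine, reading every bytecode via fetch(address)."""
    stack: List[int] = []
    pc = 0
    while True:
        op = fetch(pc)
        pc += 1
        if op == OP_PUSH:
            stack.append(fetch(pc))  # one-byte operand follows the opcode
            pc += 1
        elif op == OP_ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == OP_HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {op:#04x} at {pc - 1}")


program = bytes([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT])

# Backend 1: bytecodes in hub RAM, modelled as a 32KB bytearray.
hub_ram = bytearray(32 * 1024)
hub_ram[:len(program)] = program
hub_result = run(lambda addr: hub_ram[addr])


# Backend 2: bytecodes in a hypothetical external RAM; the read counter
# stands in for the extra cost of each external fetch.
class ExternalRam:
    def __init__(self, image: bytes) -> None:
        self.image = image
        self.reads = 0

    def read_byte(self, addr: int) -> int:
        self.reads += 1
        return self.image[addr]


ext = ExternalRam(program)
ext_result = run(ext.read_byte)
print(hub_result, ext_result, ext.reads)
```

The interpreter code is identical in both cases; what changes is how expensive each fetch is, which is where any slowdown from external memory would show up.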
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
I have looked at the Spin interpreter source code, and "cleverly packed" is an understatement! For many years I wrote real-time firmware for routers and switches and I fully understand how hard it is to fit code in such a small space. Very impressed with SPIN to say the least.
The goal is to always sell a product. Or maybe at the worst, sell it for a really long time.
A 64 pin Prop I "B" would bring the full potential of it to the table for sure. After reading the thread, the extra pins, plus the RAM would put a lot of interesting options on the table.
So then, would those options be as interesting, given the Prop II "A", with 64 pins at the start?
I'm thinking maybe not.
I'm also thinking that the time to release on Prop II is a factor also. If it's not too long, the initial investment in Prop I "B" might not pay off so well, simply because the less hobby friendly package and maybe price bump would just nudge people to a Prop II. It's all about that flurry of sales, and some sustaining sales in the window between now and Prop II.
Also, since the Prop I "A" is already here, and will continue to be sold, that flurry would really only be about that sub-set of possible projects suited for a Prop I "B", and only until a Prop II "A" then exists.
There are a fair number of cases I can think of that just don't play out for a return. :(
Say Prop II "A" is developed with a memory interface of some generic kind, like that being discussed on this thread. Given the interest in RAM, and history telling us there is always interest in more RAM, wouldn't this then just render a Prop I "B" somewhat ill-suited for a lot of things?
In the shorter term, having a more capable Propeller might get some attention. The emulators being worked on will help some with that, as would more intense projects. Is that attention worth something, perhaps not in terms of sheer number of Prop I "B" chips sold, but for building greater demand for the Prop II?
If so, that opens up the potential for a return quite a bit wider!
Finally, can anyone think of a set of use cases where the Prop I "B" would be well suited, even with the Prop II "A" released? This is fairly easy to see with the existing Prop I "A", as its packaging and friendly nature will always appeal. It's tougher for the "B" chip, if released. Thoughts / ideas on that?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
Chat in real time with other Propellerheads on IRC #propeller @ freenode.net
Safety Tip: Life is as good as YOU think it is!
On Prop I, there is a clear performance cost for this. From what I understand of the Prop II, this difference will be far less.
Not sure what that means in the context of this thread (probably nothing), but just thought I would comment.
There are two additional subjects going on in this thread:
* Whether it is worth Parallax spending $80K on bringing the Prop I-B (64 I/Os) to market.
* How to add SDRAM (or SRAM) to the Prop II simply.
Prop II:
If the SDRAM takes a cog to control it continually, then don't bother with SDRAM, because we will not have a cog to spare in Prop II (only 8 cogs unless you squeeze in more). If another cog is required for high-speed comms to other Props, we only have 6 left. If the Prop II is limited to 64 I/O pins total (i.e. the RAM interface cannot be added on extra pins), then that is another reason to forget it. We will have to use multiple SRAMs, either with or without address latches, depending on the speeds and density required.
Is there perhaps a simple way to perform a quick LE (latch enable strobe, if required), -WE (write enable strobe, if required) and -CE (chip enable; multiple -CEs with decoding) via a cog rd/wr style instruction using a few preset configuration registers?
Here is an example...
SRAM 8 bit wide:
B0...7: D0...7 data bits 0...7
B8...(n+8): A0...n address bits 0...n
B(n+9): LE latch enable strobe (optional)
B(n+10): -WE write enable strobe
B(n+11): -CE chip enable strobe
B(n+12)...: maybe more -CE for extra SRAMs
If multiplexing is used, then the higher address pins could be multiplexed on either B8...(n+8) for addresses An...(n+n) or multiplexed on B0...(n+8) for addresses An...(n+n+8). -CEn could be part of these upper address pins.
SRAM 16 bits wide: would be similar, except the data bits would be on B0-15 and address lines A1... would start at B16... The multiplexing would need to start at B0...
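For anyone who wants to see how many Port B pins the non-multiplexed 8-bit layout eats for a given SRAM size, here is a small sketch. The function name, the `extra_ce` parameter, and the 32-pin limit check are mine; the pin positions follow the list above.

```python
from typing import Dict, Tuple


def sram8_pin_map(n: int, extra_ce: int = 0) -> Dict[str, Tuple[int, int]]:
    """Return {signal: (first_pin, last_pin)} on Port B for address bits A0..An."""
    pins = {
        "D0..D7": (0, 7),
        f"A0..A{n}": (8, n + 8),
        "LE": (n + 9, n + 9),
        "-WE": (n + 10, n + 10),
        "-CE": (n + 11, n + 11),
    }
    if extra_ce:
        pins["-CE (extra)"] = (n + 12, n + 11 + extra_ce)
    last = max(hi for _, hi in pins.values())
    if last > 31:  # Port B is assumed to be B0..B31
        raise ValueError(f"layout needs pin B{last}, but Port B ends at B31")
    return pins


# Hypothetical example: A0..A16 addresses 2**17 bytes = 128KB.
pin_map = sram8_pin_map(16)
for signal, (lo, hi) in pin_map.items():
    print(f"{signal:>10}: B{lo}" + (f"..B{hi}" if hi != lo else ""))
```

With everything on its own pin, the layout runs out of Port B once n goes above 20, which is one argument for the latch/multiplexing option.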
Sample code attached.
What I am thinking is that one 32-bit long contains the data (four x 8 bits) and one 32-bit long contains the linear byte address (just like the rdbyte/wrbyte instruction to hub), but it goes to the Port B pins. These two longs could be registers if that makes it simpler. A register (or registers) controls how the interface works, which is set once by a cog (each cog?) depending on the SRAMs used. Any cog can access the SRAM (by using locks, which is the user's responsibility).
The instruction would use the register to determine:
* If a latch is used
* If the data stored on the latch is current (the same as last time?). If so, latching can be skipped resulting in faster access. (i.e. has the upper address changed?)
* Place the upper address lines (and perhaps -CEs) on Port B and strobe LE (i.e. could the silicon decode say the top 2 address bits, if enabled, for 4 x -CEs)
* Place the lower address lines on Port B
* Write (-WE strobe) or read (no -WE strobe) the data (byte/word/long)
* If read, store the data (unused bits masked to 0) in the destination
* Increment the address register by 1/2/4 (for byte/word/long)
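The steps above can be modelled in software to see how often the latch strobe would actually be needed. This is only a behavioural sketch, not a proposal for the silicon: the class and its fields are hypothetical, the latch is assumed to hold every address bit above the low 8, and multi-byte values are assumed to be little-endian.

```python
class LatchedSram:
    """Toy model of a byte-wide SRAM behind an upper-address latch."""

    def __init__(self, low_bits: int = 8, use_latch: bool = True) -> None:
        self.low_bits = low_bits    # address bits driven directly, not latched
        self.use_latch = use_latch
        self.latched_upper = None   # nothing latched yet
        self.le_strobes = 0         # how many times LE had to be pulsed
        self.cells = {}             # sparse memory contents
        self.address = 0            # auto-incrementing address register

    def _select(self, address: int) -> None:
        upper = address >> self.low_bits
        # Skip the LE strobe when the latch already holds this upper address.
        if self.use_latch and upper != self.latched_upper:
            self.latched_upper = upper
            self.le_strobes += 1
        # The lower address lines would be placed on Port B here.

    def write(self, value: int, size: int = 1) -> None:
        for i in range(size):       # size is 1, 2 or 4 (byte/word/long)
            self._select(self.address + i)
            self.cells[self.address + i] = (value >> (8 * i)) & 0xFF
        self.address += size        # increment by 1/2/4

    def read(self, size: int = 1) -> int:
        value = 0                   # unused upper bits stay 0
        for i in range(size):
            self._select(self.address + i)
            value |= self.cells.get(self.address + i, 0) << (8 * i)
        self.address += size
        return value


ram = LatchedSram(low_bits=8)
ram.address = 0x1FE
ram.write(0x11223344, size=4)       # touches 0x1FE, 0x1FF, 0x200, 0x201
ram.address = 0x1FE
value = ram.read(size=4)
print(hex(value), ram.le_strobes, hex(ram.address))
```

Eight byte accesses here need only four LE strobes, because the latch is reloaded only when the access crosses a 256-byte boundary. Sequential access within a page would need just one.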
Perhaps the Immediate bit could be used to indicate that A0 & A1 specify which byte/word (bit offset) is to be used in a read or write. In a read, only the relevant byte/word would be changed in the data register (other bits remain untouched), which saves shift, AND, and OR instructions if building a long. In a write, only the relevant byte/word would be written to the SRAM (output on B0...7/15).
If the immediate bit is not set, then a read byte/word would place the byte/word into the lower bits and clear high order bits. In a write, the high order bits would be ignored.
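Here is how I read the two read behaviours, sketched for the byte case only. The function names are made up, and the byte-lane order (A0 and A1 equal to 0 selecting the lowest byte) is my assumption.

```python
def merge_read(dest: int, byte_from_ram: int, address: int, immediate: bool) -> int:
    """Return the new 32-bit destination value after a byte read."""
    byte_from_ram &= 0xFF
    if not immediate:
        # Byte lands in the low bits; high-order bits are cleared.
        return byte_from_ram
    # Immediate set: A0/A1 pick the byte lane; other lanes stay untouched.
    shift = 8 * (address & 0b11)
    lane_mask = 0xFF << shift
    return (dest & ~lane_mask & 0xFFFFFFFF) | (byte_from_ram << shift)


def build_long(ram_bytes: bytes, base: int) -> int:
    """Assemble a long from four byte reads with no shift/AND/OR by the caller."""
    value = 0
    for i in range(4):
        value = merge_read(value, ram_bytes[base + i], base + i, immediate=True)
    return value


ram_bytes = bytes([0x44, 0x33, 0x22, 0x11])
print(hex(build_long(ram_bytes, 0)))
print(hex(merge_read(0xAABBCCDD, 0x5A, 2, immediate=True)))
print(hex(merge_read(0xAABBCCDD, 0x5A, 2, immediate=False)))
```

The point of the immediate form is the `build_long` loop: four reads to consecutive addresses fill the register lane by lane, with the masking and shifting done by the instruction instead of by extra code in the cog.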
Chip, I have no idea how easy or difficult this would be in silicon. At least some of it may be easy, and whatever could be done would improve the access. Thinking further, I guess this all needs to be within each cog.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Prop Tools under Development or Completed (Index)
· Emulators (Micros eg Altair, and Terminals eg VT100) - index
· Search the Propeller forums (via Google)
My cruising website is: ·www.bluemagic.biz
Post Edited (Cluso99) : 1/31/2009 7:22:22 AM GMT
Depending on how we manage the memory map and what the memory driver instructions look like, it may even be possible for the user to put "fast" code in HUB RAM and other code in external SRAM. Or an RTOS can sit in the HUB RAM (see http://www.imagecraft.com/pub/emos_avr.pdf for our eMOS RTOS), and the user tasks can sit in the external SRAM. Or...
Imagine the possibilities! Most people are not bumping up against the limit of processing power with the Prop 1 yet; it's mostly the memory limitation that is the wall. So Prop 1 B could solve that. When Prop II comes out, that will address the processing limits end of things.
So what's the theoretical throughput of external SRAM?
PropI-B:
Since there is no answer on the timing of the availability of Prop II, I can only presume it is still over 1 year away. This makes the Prop I-B more viable.
Unfortunately, I am becoming disillusioned with the Prop II because of lack of pins (64) and cogs (8). While we can live with 2KB (512 longs) per cog (design constraint), to do the things we are thinking of, we are going to require cogs for fast comms and memory controller. This also takes pins. If it were here today, it would be fantastic, but >1 year away, there will be a host of other things we will want to do. It will be very fast, but no more than other chips. It will no longer be simple to use because we will have to multi-task the cogs, and then the 512 longs will be even more of an issue.
So, my suggestion:
* Immediately go for the Prop I-B
* Delay the Prop II to add more cogs and more pins (special purpose silicon and pins for SRAM or SDRAM) and a cheap SMT package mounted on a small footprint PCB with holes for the hobbyist to mount to another PCB. Use more silicon if required, at more expense.