VGA and/or SRAM in a multi-prop system, how does it work?
rwgast_logicdesign
Posts: 1,464
OK, so I see lots of projects out there where a dedicated prop runs some peripherals, like VGA systems such as propGFX, or some of the mikronaughts things like mobius, I believe.
I'm having a hard time wrapping my head around MHz vs. ns times, and how systems benefit from using two Propellers where one controls a peripheral (or several) and the other does the bulk of the processing.
Let's take a parallel SRAM system, for instance (this is something I've had on my mind for a long time). Say you get eight 1-megabit chips, which makes 1 megabyte. Anyone who's played with RAM and looked at some numbers knows the fastest way to get external RAM on the prop is to connect SRAM chips to it in parallel; that way the prop can read a byte of memory in the same amount of time it would take to read a bit if there were only one RAM chip. So first off, here is some confusion: the RAM chips are 20 MHz, so what does that have to do with the number of ns it takes the Propeller to read/write the chips? The fastest the prop can read these chips is 12 ns, but how fast can the RAM's pins go? Anyway, for simplicity's sake we will say the Propeller can read/write the SRAM chips at 12 ns (I'm sorry if this is wrong, but like I said, it's for simplicity's sake). So we have our Propeller and we have our 8 SRAM chips connected to it, transferring data back and forth at 12 ns. Everything is peachy with 1 MB of fast RAM, but wait, the RAM took 16 pins... what to do? Now here's where the confusion really starts!
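On the MHz vs. ns question, the two are the same quantity seen from opposite ends: a frequency in MHz is one cycle every 1000/f nanoseconds. Here is a minimal sketch of that arithmetic. The 80 MHz system clock and the four clocks per instruction are assumptions about a typical Propeller setup, not numbers from the post; only the 20 MHz figure is quoted from it.

```python
def period_ns(freq_mhz):
    """Length of one clock cycle in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


PROP_CLOCK_MHZ = 80.0        # assumption: a common Propeller clock setting
CLOCKS_PER_INSTRUCTION = 4   # assumption: typical cost of one cog instruction
SRAM_MHZ = 20.0              # the "20 MHz" rating quoted in the post

prop_clock_ns = period_ns(PROP_CLOCK_MHZ)                    # one clock tick
prop_instr_ns = prop_clock_ns * CLOCKS_PER_INSTRUCTION       # one instruction
sram_cycle_ns = period_ns(SRAM_MHZ)                          # one SRAM cycle

print(f"Propeller clock tick : {prop_clock_ns} ns")
print(f"Propeller instruction: {prop_instr_ns} ns")
print(f"20 MHz SRAM cycle    : {sram_cycle_ns} ns")
```

Under these assumptions, the 12 ns figure looks like one clock tick rather than one complete read, and a single instruction takes about as long as one cycle of a 20 MHz part.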
So most projects want more than the 12 pins we have left on the prop after we add a buttload of RAM, an EEPROM, and TX/RX. There are a lot of solutions to this problem, some better than others. I think a very fast CPLD latching the RAM to the Propeller is probably the best, but it's also out of my skill range. One of the better solutions I've seen tossed around is to use a second prop as a RAM controller and let the first one do whatever it's there to do. So now we have P1 (prop #1) loaded with RAM and P2 (prop #2; let's say it has an SD card). Now that P1 is a RAM controller, we need a bus between the two props, and in my mind 9 lines sounds optimal: this would allow P2 to pull a byte of RAM from P1, plus one control line. But how much speed is this going to lose? It seems like P2 now needs to issue a command, wait for P1 to receive the command and execute the code, then wait around for 12 more ns while P1 grabs the byte. After P1 receives the byte, P2 still needs to wait for P1 to execute whatever code is in place to ship the byte off to P2, and P1 sending to P2 is another 12 ns. Using a prop as a RAM controller has now made it take 24 ns to read a byte instead of 12, plus all the code execution time on the Propeller (P1) controlling the RAM. The only way I can see this system as plausible is if the time it takes P1 is actually double or triple the time of the inter-Propeller communication. Let's then say you're using P2 to access SD, and you want to store its data in RAM and the RAM's data on SD; this takes all the time above plus the access time of the SD card. Plus, a 9-line bus is also a lot of pins. Is my train of thought even close to correct here?
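To put rough numbers on the round trip described above, here is a sketch that counts instructions per step and multiplies by an instruction time. Every instruction count is made up for illustration, and the 50 ns per instruction assumes an 80 MHz clock with four clocks per instruction; none of these values come from a real driver.

```python
INSTR_NS = 50.0  # assumption: 80 MHz clock, 4 clocks per instruction


def round_trip_ns(steps, instr_ns=INSTR_NS):
    """Total time for a sequence of steps, given instructions per step."""
    return sum(steps.values()) * instr_ns


# Hypothetical instruction counts, for illustration only.
direct_read = {
    "drive address pins": 1,
    "read data pins": 1,
}
via_controller = {
    "P2 puts command on bus": 2,
    "P1 notices and decodes command": 3,
    "P1 reads SRAM": 2,
    "P1 drives byte onto bus": 2,
    "P2 latches byte": 1,
}

direct_ns = round_trip_ns(direct_read)
relayed_ns = round_trip_ns(via_controller)

print(f"direct read   : {direct_ns} ns")
print(f"via controller: {relayed_ns} ns")
print(f"slowdown      : {relayed_ns / direct_ns:.1f}x")
```

With these made-up counts the handshake code, not the RAM access itself, dominates the relayed read, which matches the worry in the post that the code execution time on the controlling prop is the real cost.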
Now with VGA, I'm not even going to pretend I have an idea about what's going on. I'm guessing P1 would have a monitor connected to it and be running the video drivers; maybe this allows for more speed because P1 has all 8 cogs and all the internal RAM to dedicate to graphics. But how is P2 telling P1 what needs to be drawn? It's obviously some kind of parallel bus, but how wide is it, what kind of data are we sending over it, and how is the latency of the prop communications not slowing it way down, especially if you're drawing a frame, waiting 12 ns, drawing a frame, waiting 12 ns?
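One way to think about the "what kind of data" question is to compare shipping every pixel across the bus with shipping only short drawing commands and letting the video prop redraw the screen from its own memory. The sketch below does that arithmetic. The resolution, refresh rate, pixel depth, the 4-byte "put tile" command, and the number of tiles changed per frame are all assumptions chosen for illustration; this is not a description of how propGFX or any particular board works.

```python
WIDTH, HEIGHT = 640, 480        # assumption: standard VGA resolution
FRAMES_PER_SECOND = 60          # assumption: 60 Hz refresh
BYTES_PER_PIXEL = 1             # assumption: one byte per pixel

# Option A: the main prop sends every pixel of every frame.
raw_bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_bytes_per_second = raw_bytes_per_frame * FRAMES_PER_SECOND

# Option B: the main prop sends only commands describing what changed.
COMMAND_BYTES = 4               # hypothetical: opcode, column, row, tile number
TILES_CHANGED_PER_FRAME = 100   # assumption: a fairly busy screen

command_bytes_per_frame = COMMAND_BYTES * TILES_CHANGED_PER_FRAME
command_bytes_per_second = command_bytes_per_frame * FRAMES_PER_SECOND

print(f"raw pixels: {raw_bytes_per_second} bytes/s")
print(f"commands  : {command_bytes_per_second} bytes/s")
print(f"ratio     : {raw_bytes_per_second // command_bytes_per_second}x less traffic")
```

In the command-style arrangement the bus latency applies per command rather than per frame, since the video prop keeps generating the picture on its own between commands.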
Comments
And use the same clock (makes things easier).
You would need a framework to mesh the two (or more) prop chips.
I built such a mesh, but it is NOT programmed or optimized for high speed (my prop-to-prop communication is only 300,000 baud).
This framework would need to operate at the speed of the fastest device connected (in your case, the SRAM?).
300 kbaud wouldn't work for you.
At this point it becomes an issue of designing the two props around the sram.
And the operating cycle of the SRAM is how the props would communicate (using linked address or data lines).
You would need to structure a STATE MACHINE in prop-asm and SPIN that operates with the SRAM as its "heart".
Then you do the same with another cog on each prop to do the video in the same manner.
You would build the video framework integrated into the SRAM framework and interact with the SRAM via a third cog that is actually your "manager".
This third cog 'manager' would act like a traffic controller, with the primary duty of scheduling data exchange between the two props and the SRAM.
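As a rough illustration of the "traffic controller" idea, here is a sketch of a round-robin arbiter that hands the shared SRAM to one requester per bus cycle. It models only the scheduling logic, in Python rather than prop-asm; the class name `BusManager`, the client names, and the request strings are all hypothetical and are not taken from the framework described above.

```python
from collections import deque


class BusManager:
    """Round-robin traffic controller for one shared SRAM (hypothetical model)."""

    def __init__(self, clients):
        # One pending-request queue per client, plus the polling order.
        self.queues = {name: deque() for name in clients}
        self.order = deque(clients)

    def request(self, client, op):
        """A client asks for one SRAM operation."""
        self.queues[client].append(op)

    def next_grant(self):
        """Return (client, op) for the next bus cycle, or None if all are idle."""
        for _ in range(len(self.order)):
            client = self.order[0]
            self.order.rotate(-1)  # the client just polled moves to the back
            if self.queues[client]:
                return client, self.queues[client].popleft()
        return None


manager = BusManager(["prop1", "prop2", "video"])
manager.request("video", "read 0x100")
manager.request("video", "read 0x101")
manager.request("prop2", "write 0x200")

grants = [manager.next_grant() for _ in range(4)]
for grant in grants:
    print(grant)
```

Round-robin is only one possible policy; a real video cog has hard deadlines, so a design might well give it fixed time slots instead.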
If you don't want to deal with a high-speed communication scheme, you could also have the secondary prop SKI off the first prop and the SRAM by monitoring the SRAM's important lines; it then tells prop 1 what it wants the SRAM to do.
The first method allows a very high speed link between the prop chips and the core memory.
My basic tests with 55 prop chips and primitive coding showed me that very very high accuracy can be found if the "bugs" are worked out.
http://forums.parallax.com/showthread.php?127983-55-Parallax-Propeller-s-Parallells-Processing-of-Permanent-Perturbations.
I found that I was able to get pretty good accuracy: 300 kbaud serial output could be transmitted from TWO props with their TX outputs merged, so that if the props weren't synchronized, the baud rate timing would cause screen garbage.
I don't doubt at all that many megabaud (8.42) can be had with proper synchronization coding (which would be done at the prop-asm level).
One could take my SPIN example and create a prop-asm high speed version, (tweaking the circuit also)...
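To see why the jump from 300 kbaud to several megabaud pushes the work down to prop-asm, it helps to count how many clock ticks fit inside one bit. A minimal sketch, assuming an 80 MHz system clock and four clocks per instruction (both assumptions, not figures from the posts above):

```python
CLOCK_HZ = 80_000_000        # assumption: 80 MHz system clock
CLOCKS_PER_INSTRUCTION = 4   # assumption: typical cost of one cog instruction


def clocks_per_bit(baud, clock_hz=CLOCK_HZ):
    """How many system clock ticks fit inside one serial bit."""
    return clock_hz / baud


def bit_time_ns(baud):
    """Duration of one serial bit in nanoseconds."""
    return 1e9 / baud


slow = clocks_per_bit(300_000)      # the 300 kbaud link mentioned above
fast = clocks_per_bit(8_420_000)    # the 8.42 megabaud figure mentioned above

for name, baud, clocks in (("300 kbaud", 300_000, slow),
                           ("8.42 Mbaud", 8_420_000, fast)):
    print(f"{name}: {bit_time_ns(baud):.1f} ns per bit, "
          f"{clocks:.1f} clocks, "
          f"{clocks / CLOCKS_PER_INSTRUCTION:.1f} instructions per bit")
```

Under these assumptions there are dozens of instructions available per bit at 300 kbaud but only two or so at 8.42 megabaud, so any skew between two merged TX outputs has to stay well under a handful of clock ticks.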
The prop chip not only has 8 cores, but it also has n-chip programming.
To program 1 or 5,000 prop chips, it would only take the length of a single programming sequence, using a framework similar to mine.
http://forums.parallax.com/showthread.php?127983-55-Parallax-Propeller-s-Parallells-Processing-of-Permanent-Perturbations.
So if you had 8 props, technically you have 8 props with high-speed RAM, and none of the cogs on each separate prop has to wait for HUB access. Each prop can burst data, because access to the IO is exclusive to each prop; the cogs do not wait for the hub to muck with the pins.
When you start designing with more than a single prop chip, you open up many more possible scenarios.
And when you start to mess with variable clock SOURCES... you start to see how multiple props could aggregate their combined pins to do analysis of PULSES that are way faster than the prop chip can measure.
The method relies on clock loops that walk in and out of phase with each other in a predictable manner; if you use enough prop chips, you can modulate the XI of each prop so their clock loops are phasing.
Calculate the in-out phase interactions (have fun) and you will see the ways to use a handful of prop chips to measure a PULSE that lasts less than anything a single prop could ever measure.
But to do that kind of stuff, I would use a huge pile of prop chips with interphasing XI clocks. Also I am loosely stating this based upon garbage floating in my head....and a bit of visual imagination...:cool:
I can see it ... it would be a hack job, but it would be something to build and then smile at.
That's what pushing the edge of technology is all about, right?
I do not know.
Scalar computing using clock sources that are harmonics and various other combinations for 50+ prop chips makes for some interesting timing interactions.
If done right, a phase of all looping clocks could stair step each prop. The XI clock "low to high" threshold would transition at a point that was fractions of what a single prop could do.
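The stair-step idea can be illustrated with idealized arithmetic: if N chips sample on clocks that are evenly offset within one period, their combined sample points are spaced a period/N apart. The sketch below assumes a 12.5 ns clock period (an 80 MHz clock) and five chips whose phases are held exactly evenly spaced. Both values, and the assumption that such phasing can actually be achieved and held, are hypothetical and not established by the posts above.

```python
from fractions import Fraction


def staggered_edges(period_ns, n_chips):
    """Sample times (ns) within one period for n evenly phased clocks."""
    step = Fraction(period_ns) / n_chips
    return [k * step for k in range(n_chips)]


def samples_inside(pulse_ns, edges):
    """How many sample points land inside a pulse that starts at t = 0."""
    return sum(1 for t in edges if 0 <= t < pulse_ns)


PERIOD_NS = 12.5   # assumption: one tick of an 80 MHz clock
N_CHIPS = 5        # assumption: five props with evenly spaced clock phases

one_chip = staggered_edges(PERIOD_NS, 1)
five_chips = staggered_edges(PERIOD_NS, N_CHIPS)

PULSE_NS = 6       # hypothetical pulse shorter than one clock period

print("single chip sample points:", [float(t) for t in one_chip])
print("five chip sample points  :", [float(t) for t in five_chips])
print("samples inside pulse, 1 chip :", samples_inside(PULSE_NS, one_chip))
print("samples inside pulse, 5 chips:", samples_inside(PULSE_NS, five_chips))
```

In this idealized model a single chip can only tell that the pulse was present at one instant, while the five staggered chips bracket its width to within one 2.5 ns step.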
And .... :O .... the prop chip does ok with low to medium deviations from the internal set clock rate. If you deviate too much, things will mess up, but I am suggesting to use the fact that you can deviate at all to manipulate the period and position of each XI CLOCK LOOP input. When I started messing with variable clock signals coming into the XI.... I realized.... .... .. everything. :nerd:
Don't even get me started on NESTED clocks. (russian doll scenario) prop clocking a prop clocking a prop.... and all of them doing it in a variable manner. It starts to get really really messy in there.
Nested interphasing variable clocks.
I am NOT responsible for any gravitational / time lensing / paradox / zpe / effects as a result of utilizing this information. Ride at your own risk.