jazzed inspired NTSC/PAL Driver.

davidsaunders · 2011-03-23 15:57

jazzed challenged me to make a full-color video driver for television output, using the Propeller Platform (by Gadget Gangster) SDRAM module for the frame buffer and video output. The conditions are that it must exceed the current drivers in display capability, to the point of not having to repeat any pattern in order to fill the display, and that it must use 5 COGs or fewer.

And I accepted his challenge. I am going to order the SDRAM module as soon as I can afford it. In the meantime I will be using an SDRAM IC scavenged from an old DIMM to get started (I had already soldered leads to it for breadboarding in another project that is long since done). I am also looking through all my old hardware to find the appropriate resistors for the video out. Hopefully the IC on the SDRAM module is close enough that I will not have to make major modifications in order to use my code on the correct HW.

So as to keep the number of posts down, I have decided to post a periodic log of the progress I am making with this project. Here it is:

3/23/2011
1:) Pixel and simple line drawing functions implemented. Up to 2000 vector draw commands per
frame, though in some cases as few as 150 (figure approx. 2000 pixels updated by vector drawing
per frame).
2:) Still trying to get the output timing lined up for NTSC video.  This part may take a while.

3/23/2011
Wow what a day.  I have not accomplished this much in a single day in a while.
Today I accomplished the following:
1:)  Accepted the challenge to code a better TV video driver for the Propeller Platform using
external SDRAM.
2:) Managed to get an SDRAM driver working on the Propeller, that is based on an earlier,
unreleased driver that I had written for the AVR.
3:) Got the SDRAM read and write speed up to about 4,700,000 bytes per second max. This
is fast enough for this application.
4:) Began refreshing my knowledge of NTSC, and began studying PAL in relation to NTSC.

jazzed · 2011-03-23 16:19

davidsaunders wrote: »

... The conditions are that it must exceed the current drivers in display capability, to the point of not having to repeat any pattern in order to fill the display, and that it must use 5 COGs or fewer.

Remember the other driver uses 3 COGs for the sprite animations, so you really only get 2 cogs to do the "background" video layer and run the SDRAM driver. Having this will make beautiful GUIs possible.

I think it's great that you accepted the challenge. Video on Propeller has made some nice advances in the last 12 months, and if you can get something like this working by UPEW at the end of May, then we will all have something to celebrate in Rocklin.

No pressure though. We'll all be happy to enjoy your contribution whenever it's done.

davidsaunders · 2011-03-23 16:33

As I am looking at everything (including my old AVR code), I think that I can do this in two COGS with sprites and a mouse driver (sorry, KB would take another COG). If I am wrong, at least this will encourage me to keep the COG count way down. I say this because the AVR version took two AVRs (one of which was only using about half its available processing power), though it required a lot more support HW, and that project did have mouse and KB.

davidsaunders · 2011-03-23 17:34

It looks like it takes 40 longs of code and 4608 clocks (approx. 1.1 millisecond) to read a 256 byte page into COG mem. So if I can keep everything this small, I should be able to have one cog reading SDRAM while another renders a scan line, and have the one reading SDRAM scan the keyboard and mouse between finishing its read and the beginning of the next scan line (when it is its turn to render). I am keeping my fingers crossed.
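A rough check of these numbers, as a sketch only. The post does not state the system clock, so the 80 MHz value below is an assumption (the usual Propeller setting with a 5 MHz crystal and 16x PLL), and the function names are made up for illustration. Under that assumption 4608 clocks comes to about 57.6 microseconds at 18 clocks per byte, which is well short of the 1.1 millisecond estimate in the post, so either that estimate or the assumed clock is off.

```python
# Assumption: 80 MHz system clock; the thread does not state the real value.
CLKFREQ_HZ = 80_000_000


def clocks_to_us(clocks, clkfreq_hz=CLKFREQ_HZ):
    """Convert a count of system clocks to microseconds."""
    return clocks * 1_000_000 / clkfreq_hz


def bytes_per_second(num_bytes, clocks, clkfreq_hz=CLKFREQ_HZ):
    """Sustained throughput if num_bytes always take the given clocks."""
    return num_bytes * clkfreq_hz / clocks


PAGE_BYTES = 256     # page size from the post
PAGE_CLOCKS = 4608   # clocks per page read, from the post

clocks_per_byte = PAGE_CLOCKS / PAGE_BYTES
page_time_us = clocks_to_us(PAGE_CLOCKS)
throughput = bytes_per_second(PAGE_BYTES, PAGE_CLOCKS)

print(f"{clocks_per_byte:.1f} clocks per byte")
print(f"{page_time_us:.1f} us per 256 byte page")
print(f"{throughput:,.0f} bytes per second")
```

The resulting throughput of roughly 4.4 million bytes per second is in the same range as the "about 4700000 bytes per second" figure from the log above, which suggests the clock count rather than the millisecond estimate is the number to trust.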

Rayman · 2011-03-23 17:54

I'm looking forward to seeing what can be done. SDRAM offers some huge capacity and a low price. Plus, I guess it's the future as Chip promised SDRAM support in Prop2 (whenever that comes).

davidsaunders · 2011-03-23 18:28

HELP: How can you accurately sync code running in a COG to the Counter?? I am using the counter to clock the SDRAM, and I seem to be well out of sync (and so does my code relative to the generated clock).

trodoss · 2011-03-23 18:39

I think that cog synchronization was discussed in this thread:
http://forums.parallax.com/showthread.php?126813-Video-timing-New-discoveries-improved-synchronisation-code.

Hope that helps.
--trodoss

kuroneko · 2011-03-23 18:40

davidsaunders wrote: »

HELP: How can you accurately sync code running in a COG to the Counter??

Take your pick: either actively wait for a clock edge and go from there, or set up the clock so that it fits (phsx preset).

davidsaunders · 2011-03-23 18:56

Thank you trodoss.
Thank you kuroneko.

davidsaunders · 2011-03-23 19:00

Ok I feel a little space headed now. I had already used the info from the article at: http://www.linusakesson.net/programming/propeller/pllsync.php to sync my VGA stuff, and I did not even think of it for the SDRAM.

davidsaunders · 2011-03-23 20:25

Ok, back on track: the SDRAM is doing well with 256 byte burst reads and writes. Maybe I should actually take a couple of minutes to look at someone else's code to see if I get any ideas. And double duh: watch the counter that you are using and you have sync.

davidsaunders · 2011-03-23 20:46

And we are off. I will only post progress here from here on. Currently it takes 2704 clocks to write or read 256 bytes of SDRAM. So now that the SDRAM access is fast enough, there are no remaining barriers at this time. (I am sure that I will find some as I progress, though those will not be part of this thread; rather they will be directed to the most appropriate thread I can find.) I hope that it will not be more than a week or two before I have enough to post some form of code.
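To put the 2704-clock figure in context, here is a hedged sketch of the per-scan-line clock budget. It assumes an 80 MHz system clock (not stated in the thread) and uses the standard NTSC color line rate of 4.5 MHz / 286, about 15,734 Hz. The function and variable names are illustrative only.

```python
# Assumption: 80 MHz system clock; the thread does not state the real value.
CLKFREQ_HZ = 80_000_000

# NTSC color horizontal line rate: 4.5 MHz / 286, about 15,734.27 Hz.
NTSC_LINE_HZ = 4_500_000 / 286


def clocks_per_scan_line(clkfreq_hz=CLKFREQ_HZ, line_hz=NTSC_LINE_HZ):
    """System clocks available during one full scan line."""
    return clkfreq_hz / line_hz


BURST_BYTES = 256     # burst size from the post
BURST_CLOCKS = 2704   # clocks per burst, from the post

line_clocks = clocks_per_scan_line()
bursts_per_line = int(line_clocks // BURST_CLOCKS)
spare_clocks = line_clocks - bursts_per_line * BURST_CLOCKS

print(f"{line_clocks:.1f} clocks per scan line")
print(f"{BURST_CLOCKS / BURST_BYTES:.4f} clocks per byte")
print(f"{bursts_per_line} full burst(s) per line, {spare_clocks:.1f} clocks spare")
```

Under these assumptions one 256 byte burst fits in a scan line with a little under half of the line left over, which is the kind of slack the keyboard and mouse scanning mentioned earlier would have to fit into.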

jazzed · 2011-03-23 21:26

davidsaunders wrote: »

Ok, back on track: the SDRAM is doing well with 256 byte burst reads and writes. Maybe I should actually take a couple of minutes to look at someone else's code to see if I get any ideas. And double duh: watch the counter that you are using and you have sync.

This is in one of the links I posted for you in the other thread. SdramTest-8bit

davidsaunders · 2011-03-24 11:30

Thank you Jazzed, that helped a bit.
Some information on the progress:
The Video driver will use COGS 7 and 6 (at this point), and the hub mem used for parameters and buffers is defined by the following addresses:
sdrcmd = $77C0
sdrhub = $77C4
sdraddr = $77C8
drawbuff = $77D0
viddat = $7BD4
vidcmd = $7BD4
vidpar0 = $7BD8
vidpar1 = $7BDC
vidpar2 = $7BE0
vidpar3 = $7BE4
vidxsz = $7BE8
vidysz = $7BEC
linebuff = $7BF0
inbuff = $7BF0
outbuff = $7DF0
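
As a quick sanity check on this map, the sketch below lists which labels share an address and how much room each address has before the next one. The addresses are copied from the list above; the helper names are made up, and the 32 KB hub RAM limit ($8000) is the only outside fact used. The computed sizes are only upper bounds (the space up to the next label), not the sizes the driver actually uses.

```python
# Addresses copied from the post; everything else here is illustrative.
MEMORY_MAP = {
    "sdrcmd": 0x77C0,
    "sdrhub": 0x77C4,
    "sdraddr": 0x77C8,
    "drawbuff": 0x77D0,
    "viddat": 0x7BD4,
    "vidcmd": 0x7BD4,
    "vidpar0": 0x7BD8,
    "vidpar1": 0x7BDC,
    "vidpar2": 0x7BE0,
    "vidpar3": 0x7BE4,
    "vidxsz": 0x7BE8,
    "vidysz": 0x7BEC,
    "linebuff": 0x7BF0,
    "inbuff": 0x7BF0,
    "outbuff": 0x7DF0,
}

HUB_RAM_END = 0x8000  # Propeller hub RAM is 32 KB: $0000-$7FFF


def shared_addresses(mem_map):
    """Return {address: [names]} for addresses used by more than one label."""
    by_addr = {}
    for name, addr in mem_map.items():
        by_addr.setdefault(addr, []).append(name)
    return {addr: names for addr, names in by_addr.items() if len(names) > 1}


def room_until_next(mem_map, end=HUB_RAM_END):
    """Return {address: bytes available before the next label or end of RAM}."""
    addrs = sorted(set(mem_map.values()))
    return {addr: nxt - addr for addr, nxt in zip(addrs, addrs[1:] + [end])}


aliases = shared_addresses(MEMORY_MAP)
room = room_until_next(MEMORY_MAP)

for addr, names in aliases.items():
    print(f"${addr:04X} shared by {', '.join(names)}")
for addr, size in room.items():
    print(f"${addr:04X}: {size} bytes")
```

It reports two shared addresses (viddat/vidcmd and linebuff/inbuff), 1028 bytes from drawbuff to the next label, 512 bytes for inbuff, and 528 bytes from outbuff to the end of hub RAM. Whether the shared addresses are intentional aliases is not stated in the post.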

The details of the commands are still somewhat in limbo.

davidsaunders · 2011-03-24 19:23

What would be expected from this driver as far as callable Spin code is concerned? I ask because when I run into a snag, I work on support functions while working it out in the brain.

jazzed · 2011-03-24 20:40

davidsaunders wrote: »

What would be expected from this driver as far as callable Spin code is concerned? I ask because when I run into a snag, I work on support functions while working it out in the brain.

The driver you need to beat should be your guide. Do not ignore work that has come before you.

If at all possible, the SDRAM should serve as both a video buffer and a program code/data store. I suppose one could use 32MB for video buffers only. One thing at a time is fine.

Hardcoding addresses for each of your variables is probably a mistake. Create a "DAT structure" that can live anywhere in memory and be passed to your driver by reference.

Unfortunately a driver requiring 6 or 7 COGs will probably not get used. Maybe that can be optimized later.

JLocke · 2011-03-24 20:49

I think he said it will use cogs 6 & 7, not require 6 or 7 cogs.

Phil Pilgrim (PhiPi) · 2011-03-24 21:40

JLocke wrote:

I think he said it will use cogs 6 & 7, not require 6 or 7 cogs.

Ugh. Yes, that's what he said alright.

DavidSaunders, do not use coginit to start your cogs! It's very bad programming practice. Use cognew, and record the cog number that's returned, in case you need to stop the cog later.
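
The difference can be shown with a toy model, written in Python rather than Spin and offered only as a sketch. The `Hub` class and its method names are invented for illustration and mimic the documented behavior: cognew takes the next available (lowest-numbered free) cog and returns its number, or -1 if none is free, while coginit starts code in a specific cog whether or not something is already running there.

```python
NUM_COGS = 8  # a Propeller has 8 cogs


class Hub:
    """Toy model of cog allocation; not real Propeller code."""

    def __init__(self):
        # None means the cog is free; otherwise the name of what runs there.
        self.running = [None] * NUM_COGS

    def cognew(self, program):
        """Start program in the lowest-numbered free cog; return its ID or -1."""
        for cog_id, current in enumerate(self.running):
            if current is None:
                self.running[cog_id] = program
                return cog_id
        return -1

    def coginit(self, cog_id, program):
        """Start program in a specific cog, replacing whatever was there."""
        replaced = self.running[cog_id]
        self.running[cog_id] = program
        return replaced

    def cogstop(self, cog_id):
        """Free a cog, using the ID recorded when it was started."""
        self.running[cog_id] = None


hub = Hub()
main_cog = hub.cognew("main program")
video_cog = hub.cognew("video driver")      # ID is recorded for a later stop
clobbered = hub.coginit(0, "sdram driver")  # silently replaces the main program
hub.cogstop(video_cog)

print(main_cog, video_cog, clobbered, hub.running)
```

In this model the hard-coded start wipes out whatever another object had running in cog 0, which is the kind of trouble the advice is meant to avoid.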

-Phil

davidsaunders · 2011-03-24 21:44

Jazzed:
Cogs 6 & 7 constitute only 2 of the 8 cogs available in a Propeller. As to the thought on not using static locations: is there a way to make sure that DAT locations are as close to the top of mem as possible? I ask because the reason for this choice is to keep everything as much out of the way as possible. Most of the code that I have taken the time to look at only uses the lower addresses, and I wish to make it simple to overwrite the contents of hub RAM for applications that would benefit from doing so. Also, the buffers are just the default locations; the command and parameter stores are the only portion limited to the static address range.

davidsaunders · 2011-03-24 21:49

Phil:
Done. Now the CogInit statements are replaced with CogNew. There were only 3 of them total.

potatohead · 2011-03-24 22:25

When you use PAR as one method to control driver HUB parameter access, you can simply pass an address that makes sense, given the memory use at the time. If it's desirable to put it "at the top", it is not hard to do. Maybe somebody has a small region of memory, or wants to bundle several drivers together, or is using C or some other language. A binary driver that reads PAR and fetches its key data from there works with all of those.

A driver that exists in the COG can just be loaded when needed, or compiled into the HUB, perhaps in a buffer region, or any number of things.

There really should be no assumptions, other than those needed to make the driver work, such as memory needing LONG, or WORD alignment, etc...

davidsaunders · 2011-03-24 22:36

Ok, though HOW do you get it as close to the top of hub mem as you can, without using static locations?

jazzed · 2011-03-24 23:03

davidsaunders wrote: »

Ok, though HOW do you get it as close to the top of hub mem as you can, without using static locations?

If you insist on putting it in high memory you can. Defining every address for each location is bad programming practice though. Since the compiler's address allocation apparently is not good enough, you can use the Spin dataptr.long[n] notation or long[dataptr+n<<2] to create variable references and CON enumeration syntax to specify the fields. That's a lot of work, though, and it is prone to errors.

Letting the compiler set the addresses using a "DAT structure" is much better practice. DAT generates a "global" block within the code segment of a program. If you need to reference the block from high memory for some reason that's easy enough. The driver by definition will be singleton, so it doesn't matter if it's global or not.
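
The base-pointer-plus-field-index idea can be modeled as follows. This is a Python sketch, not Spin: the byte array stands in for hub RAM, the field order is hypothetical (only the names sdrcmd, sdrhub, sdraddr and vidcmd come from the thread), and `read_long`/`write_long` are invented helpers that play the role of long[dataptr+n<<2]. In Spin the shift binds tighter than the addition, so the expression means dataptr + (n<<2); Python needs the parentheses.

```python
import struct

HUB_RAM = bytearray(32 * 1024)  # stand-in for the 32 KB hub RAM

# Field indexes, in the spirit of a CON enumeration. Hypothetical layout.
SDRCMD, SDRHUB, SDRADDR, VIDCMD = range(4)
NUM_FIELDS = 4


def write_long(base, n, value):
    """Store a 32-bit little-endian long at field n of the block at base."""
    struct.pack_into("<I", HUB_RAM, base + (n << 2), value & 0xFFFFFFFF)


def read_long(base, n):
    """Fetch the 32-bit little-endian long at field n of the block at base."""
    return struct.unpack_from("<I", HUB_RAM, base + (n << 2))[0]


# The same accessors work wherever the block lives; only the base changes.
low_block = 0x1000                         # e.g. wherever the compiler put it
high_block = len(HUB_RAM) - NUM_FIELDS * 4  # e.g. pushed to the top of RAM

write_long(low_block, SDRADDR, 0x00123400)
write_long(high_block, SDRADDR, 0x00ABCD00)

print(hex(read_long(low_block, SDRADDR)), hex(read_long(high_block, SDRADDR)))
```

The driver only ever needs the base address, which is the value that would be handed over by reference, so moving the block to high memory changes one number instead of every definition.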

potatohead · 2011-03-25 00:28

Can you clarify "singleton" in this context for me Jazzed? Never really sure what those constructs are.

Interestingly, I think it's easier to use the compiler, freeing upper memory, which it won't touch. Build from the bottom and let it do what it does. Your DAT can be at the end, and then you can rather easily call out absolute addresses in higher memory as you see fit. The Parallax drivers do this, taking advantage of the bottom up approach of the compiler.

From there, assuming a HUB memory model, it's easy to just deal with buffers as addresses.

Since this is external memory, to a degree, some of that might not make sense, depending on what it is you want your driver to do, of course.

Andrey Demenev · 2011-03-25 03:31

Phil Pilgrim (PhiPi) wrote: »

DavidSaunders, do not use coginit to start your cogs! It's very bad programming practice.

I would say that it's not bad practice, but an advanced technique. For example, if you need 2 or 4 cogs with interleaved access to hub memory.

jazzed · 2011-03-25 08:13

potatohead wrote: »

Can you clarify "singleton" in this context for me Jazzed?

A singleton is "a set of one." In this case only one instance of the driver object will be used (if for some reason you want 2 displays, then that's a different story). The driver may be referred to multiple times like some versions of FullDuplexSerial.spin using DAT instead of VAR data - my version is called FullDuplexSingleton.spin.

Since object references can not be passed in spin (a good aspect of spin in some ways), such singletons are necessary for any object to have access to a display or serial console. Of course this can be a problem if many objects try to access the same resource simultaneously in a stream, but that is manageable by spin-locks or programmer discipline.

Wikipedia has a page dedicated to the singleton. Read the first paragraph.

coginit has a place in some code, as Andrey mentions; it is rarely necessary though. Phil is just trying to help people stay out of trouble. Let's not argue about it here.

davidsaunders · 2011-03-25 09:30

Thank you all.

I did put a note in the main spin source stating that these need to be loaded in consecutive COGS and as such to make sure that the first two available COGS are consecutive and that no cogs are started between the two. And this morning I made an optimization that voids that. It is now best for them to be separated by three cogs each direction, being for example in COG 7 and COG 3, or 6 and 2, etc...

davidsaunders · 2011-03-25 09:37

Jazzed:
Thank you again for your challenge. This is very good for me, as my "Day Job" is the creation of the MultiDB68K (backed by an investor that wishes to remain a silent partner), an Amiga clone that uses Propellers to simulate the custom HW. For more info on that you can check out: http://davidsaunders.cwahi.net/MultiDB68K/index.html.

By the way, if anyone has any suggestions for a better name than MultiDB68K, we are still in need of one.

Phil Pilgrim (PhiPi) · 2011-03-25 10:09

davidsaunders wrote:

It is now best for them to be separated by three cogs each direction, being for example in COG 7 and COG 3, or 6 and 2, etc...

And the reason for this is ... ?

-Phil

davidsaunders · 2011-03-25 10:20

The reason is the relative instruction timing: this gives 8 clocks from the start of one cog's Hub RAM window to the start of the other's, in either direction. The two cogs swap function every scan line, so it does not work very well to have them next to one another, which would assume that one always has a specific function. For some things it works best if the COG handling video puts a value in HUB RAM and the mem handler then reads it, and for some things vice versa. I am not explaining this very well; I hope the basic gist comes across. There are a couple of things where this increases the speed significantly.

davidsaunders · 2011-03-25 10:33

Phil:
Ok, here we go: take a situation where one COG has to send a request to the other, the other has to respond, and the timing has to be very predictable. Since the current function of the two COGs changes every scan line, this cannot be done unless the distance between the two COGs' HUB access windows (in the forward direction) is equal both ways. Coding two different versions to handle the difference would require that the distance for each be a constant (as computing it at init would take extra code that would have to be overwritten later in order to make space for the driver code), and a bit of extra work.
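
The symmetry argument can be checked with a few lines of arithmetic. The sketch below relies on the Propeller hub serving the 8 cogs round-robin, moving to the next cog every 2 system clocks (so each cog gets a window every 16 clocks). The function name is made up for illustration.

```python
NUM_COGS = 8
CLOCKS_PER_SLOT = 2  # the hub advances to the next cog every 2 system clocks


def hub_window_distance(from_cog, to_cog):
    """Clocks from the start of from_cog's hub window to to_cog's next one.

    Intended for two different cogs; the modulo handles wrap-around
    from cog 7 back to cog 0.
    """
    return ((to_cog - from_cog) % NUM_COGS) * CLOCKS_PER_SLOT


for a, b in [(7, 3), (6, 2), (6, 7)]:
    forward = hub_window_distance(a, b)
    backward = hub_window_distance(b, a)
    print(f"cogs {a} and {b}: {forward} clocks one way, {backward} the other")
```

For cogs 7 and 3, or 6 and 2, the distance is 8 clocks in both directions, so the same code works whichever cog is currently the requester. For adjacent cogs such as 6 and 7 it is 2 clocks one way and 14 the other, which is why swapping roles every scan line would need two differently timed versions.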
