high-color TV drivers - how much time do I get between waitvids?
Dennis Ferron
Posts: 480
I'm working on a TV driver that is supposed to be able to use a variable number of bits per pixel - one, two, four, or eight, and so I've set the waitvid instruction to quit after showing just 4 pixels instead of after 16. This way, I can load the value %%0123 for pixels and put the pixel colors into bytes in the palette argument.
This means that I'm doing four times as many waitvids as the "stock" Parallax TV driver does. Still, I'm running at 80 MHz and grabbing bytes from Hub RAM an entire long at a time. I set up a test program that grabs the long from hub RAM and does four waitvids, but does the waitvids on dummy values to see how many nops I can put between the waitvids, so that I'll know what I have to work with if I am going to "decode" the bits.
I'm only able to put 2 nops extra between the last waitvid and the rdlong. If I try to put 3, things get shaky because the Propeller seems to not make it back in time for the next waitvid (or hub access?) and if I put 4 nops it doesn't work at all.
Should I really have only 8 cycles to work with here?
(I can put as many as 11 nops between waitvids inside the group of four; it's just right after the whole group of four that I can only insert 2.)
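The four-pixel trick from the start of this post can be modeled in a few lines of Python. This is only a sketch, not driver code: it assumes that in 4-color mode the video hardware shifts the pixel long out two bits at a time, least significant bits first, and that a 2-bit value n selects byte n of the colors long (byte 0 being bits 7..0). The function name and the color bytes are made up for illustration; check the Propeller manual before relying on the ordering.

```python
def waitvid_4color(colors: int, pixels: int, count: int = 4) -> list[int]:
    """Model one 4-color-mode video frame.

    Returns the color bytes in the order they would be displayed.
    Assumption: pixels shift out 2 bits at a time, LSBs first, and each
    2-bit value n selects byte n of `colors` (byte 0 = bits 7..0).
    """
    shown = []
    for _ in range(count):
        index = pixels & 0b11                      # next 2-bit pixel value
        shown.append((colors >> (8 * index)) & 0xFF)
        pixels >>= 2                               # move to the next pixel
    return shown


PIXELS_0123 = 0b00_01_10_11   # what Spin writes as %%0123
PIXELS_3210 = 0b11_10_01_00   # what Spin writes as %%3210
colors = 0x0D_0C_0B_0A        # hypothetical color bytes, one per pixel

print([hex(c) for c in waitvid_4color(colors, PIXELS_0123)])
print([hex(c) for c in waitvid_4color(colors, PIXELS_3210)])
```

Under these assumptions, %%0123 shows the highest byte of the colors long first, while %%3210 shows the lowest byte first, so the pattern chosen decides which byte order the long read from hub RAM needs.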
Comments
So any chance of elaborating a bit? What res are you trying to do? etc.
For instance, I've done a 640x480 text mode (it had to be a 16-pixel-wide font though :) so it's still 40x30, as it's a 16x16 font :) and it looks great. (PAL & NTSC, although NTSC does overscan, lol.)
Baggers.
See the file sw_sk_tv_drv_021.spin within the Dodgy Kong download.
This does 256 pixels horizontal resolution. I can insert 4 nops into the loop before it breaks up.
What do you need extra instructions in the loop for?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Help to build the Propeller wiki - propeller.wikispaces.com
At 160 pixels, I can put 10 NOPs in there. At 320, I can only place two, maybe three (I'm not near a running Prop right now to check). This loop will do 480 tops. Pretty much the same as Cardboard's loop...
What resolution are you testing on, Dennis? 320? 256? Your results sound about right for 256 and up... I ran into a similar scenario, trying to do a character mode and/or color lookups. Unrolling some of the loop (doing a few waitvids before doing the djnz) is a little gain, but it's costly in terms of COG space. If you use an in-COG scan line buffer, you can do some bit mashing during the porches and sync. That might be enough time to do most of a line, then chip away at the rest while the scanline is drawing.
I was headed in that direction until I got sidetracked with getting overlay video to work.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
To answer your question, CardboardGuru, about what I need the extra instructions for: I was hoping to have a user-defined number of bits per pixel, which would require shifting and bit-banging the result from rdlong to get the numerical pixel value for each of the four pixel positions, and then looking it up in a color table and constructing the colors to pass to waitvid on the fly.
In other words, I knew I could use 8 bits per pixel and have a straightforward mapping, but I was hoping to be able to use just 5 or 6 bits per pixel and a CLUT because not all 256 byte values map to a valid NTSC color. However, I now realize that it's not really feasible within my timing constraints to do all that work per pixel in the TV driver.
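To make the per-pixel work concrete, here is a rough Python model of the decode step described above: split a long into fixed-width pixel values, look each one up in a color table, and pack four color bytes into each long that would be handed to waitvid. It is a sketch under assumptions, not the driver itself. The function names, the CLUT contents, the choice of lowest bits as the first pixel, and the packing of the first pixel into the lowest byte are all hypothetical, and only widths that divide a long evenly (1, 2, 4, 8) are handled.

```python
def unpack_pixels(long_value: int, bits_per_pixel: int) -> list[int]:
    """Split a 32-bit long into pixel indices, lowest bits first (assumed order)."""
    if bits_per_pixel not in (1, 2, 4, 8):
        raise ValueError("only 1, 2, 4 or 8 bits per pixel are modeled here")
    mask = (1 << bits_per_pixel) - 1
    return [(long_value >> shift) & mask for shift in range(0, 32, bits_per_pixel)]


def build_color_longs(long_value: int, bits_per_pixel: int, clut: list[int]) -> list[int]:
    """Look up every pixel in the CLUT and pack four color bytes per long."""
    indices = unpack_pixels(long_value, bits_per_pixel)
    longs = []
    for start in range(0, len(indices), 4):
        packed = 0
        for pos, index in enumerate(indices[start:start + 4]):
            packed |= (clut[index] & 0xFF) << (8 * pos)   # first pixel in byte 0
        longs.append(packed)
    return longs


clut = [0x02, 0x07, 0x0A, 0x0D]   # hypothetical color bytes for 2 bits per pixel
for value in build_color_longs(0x000000E4, 2, clut):
    print(hex(value))
```

Each pixel costs a shift, a mask, a table read, and a merge, which on the Propeller is several instructions per pixel. That is the work that does not fit between waitvids at these pixel rates.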
I see that if I just implement an 8-bit per pixel TV driver and build the scanline in another cog, such as you are doing for Dodgy Kong, then things will be much better for me. In fact, the reason I didn't start from your Dodgy Kong drivers is that on my TV to VGA converter, Dodgy Kong's TV signal produces vertical black pinstripes, but the Parallax driver does not. Since I do not know why one works for me and the other doesn't, I started hacking on the Parallax driver so that I knew I'd be using timing values which looked good on my TV-VGA converter.
To clear up any remaining mystery: Because I expected to have some spare cycles (the Parallax TV driver was designed to work at less than max clock speed), I traded execution efficiency away for better abstraction and made function calls like "DoLine" in place of one giant amorphous loop. (I don't make a function call for every pixel, just every line!) It may be that the final ret / next call at the end of a line is enough to put me over the limit with fewer nops after the waitvid than I should normally be able to do.
Potatohead's suggestion that I could do some extra computation in the blank periods to chip away at the line before I enter the actual visual portion has merit, but I don't think I want to do it that way. It might work but I'd have to keep track of both where I am in the decoding and where I am displaying, and the piecemeal computation between waitvids could get harder to follow than an Escher drawing. I'd rather just throw another cog at the problem; this gets me closer to what I want to do eventually anyway; I will have more complex jobs for that second cog as time goes on.
Post Edited (Dennis Ferron) : 8/4/2007 6:45:05 AM GMT
(1) Set your pixel clock
Cheap video displays have 320 x 234 pixels; refreshing them at 60 Hz needs a pixel clock of around 4.4 MHz. Interlacing (= doubled lines) makes no difference here and only improves things on a full-grown TV or VGA monitor. True NTSC is 760 pixels per line, which also needs a grown-up display and a doubled pixel clock! Note that, when adhering to the standard, this will only improve B&W, not color!
So let's have 4 MHz (alternatively 8 MHz).
(2) Outputting 4 pixels per WAITVID keeps the hardware busy for 1 mys (0.5 mys), so you have 80 (40) processor ticks for the complete loop. The loop itself needs WAITVID, DJNZ, ADD, RD...., altogether 20 to 35 ticks.
At the low end of that range, this leaves 60 (20) ticks, or 15 (5) instructions, for intermediate computations. (*)
When using no CLUT (raw 8-bit values from HUB), this will work fine.
(3) The number of colours is limited in PAL/NTSC to 16 @ about 5 luma levels = 80 true colours plus 6 "true" gray values ("chroma off")
But most of you are probably aware of all this..
Edit1: (*) Note there is no need to consider stretched waiting for the HUB cycle! You can always fill your loop!
Edit2: There is a full thread in the Propeller forum explaining "mys" and other units
Post Edited (deSilva) : 8/4/2007 5:36:53 PM GMT
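The arithmetic in step (2) can be checked with a short Python sketch. The function name and parameters are made up for illustration; the 20 and 35 tick overheads are the ends of the range quoted above, and 4 ticks per instruction is assumed for the ordinary instructions that fill the loop.

```python
def loop_budget(sysclk_hz: int, pixel_clock_hz: int, pixels_per_waitvid: int,
                overhead_ticks: int, ticks_per_instruction: int = 4):
    """Return (ticks per waitvid, spare ticks, spare instructions)."""
    # Time the video hardware is busy with one waitvid, in system clock ticks.
    ticks_per_waitvid = sysclk_hz * pixels_per_waitvid // pixel_clock_hz
    spare_ticks = ticks_per_waitvid - overhead_ticks
    return ticks_per_waitvid, spare_ticks, spare_ticks // ticks_per_instruction


for pixel_clock in (4_000_000, 8_000_000):
    for overhead in (20, 35):
        total, spare, instructions = loop_budget(80_000_000, pixel_clock, 4, overhead)
        print(f"{pixel_clock // 1_000_000} MHz, overhead {overhead}: "
              f"{total} ticks, {spare} spare, {instructions} instructions")
```

With a 20-tick loop this reproduces the 60 (20) spare ticks and 15 (5) instructions above; with a 35-tick loop the 8 MHz case has only 5 ticks left, room for a single instruction.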
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.