VGA Video Graph Overlay
xgaProp8
Posts: 13
(NOTE: Has been edited...too many dumb bugs in the original post to waste anyone's valuable time!)
Hello all magic Propellerheads...I'm trying to do something that at least to me sounds simple. But I'm having great difficulty wrapping my head around much of anything to do with VGA video generation!
Basically, I'm using the "Hidemaru" 50x18 VGA driver by Marko Lukat (aka "kuroneko") to drive a computer monitor at 800x600 pixels with a 50x18 Propeller font tile pattern. Works GREAT...I'd almost consider donating a few $$ 'cause it's so perfect!
One major challenge: I read through the PASM source code there, but for the life of me, I can't figure out how the video data gets sent to the video module (or I/O pins)! The only "waitvid" instructions are for HSync and VSync. I'm aware that the code does some dynamic recompiling (as it were), and that could perhaps be the magic dust that makes it shine. Or perhaps there's some secret tool hidden in there to enable "multitasking", as it were, so the video hardware keeps outputting data WHILE the cog is decoding the font. (If I understand correctly, WAITVID stalls the cog until the video generator is ready to take the next data chunk.)
Anyway, I'm hoping to make a single-cog graph driver to "piggyback" on top of the VGA signals from the aforementioned 800x600 video driver. It's already driving the monitor...if I'm thinking correctly, I could synchronize my code to the HSync and VSync signals to make a driver that I could readily start and stop at whim (enable/disable graph). Watch the VSync pin to determine the start of the frame...count HSyncs to pass the blanking period, and get to where I want to display the graph (and determine how long the graph is), etc.
Just looking at the code required to decode (pardon the pun) the Propeller ROM font in kuroneko's driver, I am nearly certain that making a realtime 3-color graph (maybe even 4-color?) should be a piece of cake.
My basic idea is the following:
- Input graph data is stored in a LONG array (each of the four bytes in one long is a different graph color, i.e. R, G, B)...depending on routine efficiency, anywhere from 50 to 800 longs (= 16 pixels down to 1 pixel per data point in X). NOTE: The values will have to be inverted for the graph to display right-side up
- Wait for VSync...
- Graph driver waits, say, for line 300 (counting HSyncs)
- Loop: Read a LONG from the data array, quickly compare each of its bytes with the scanline. Use "=" for a dot-graph, and "<" for an area-graph. Byte 0 = red, byte 1 = green, byte 2 = blue, etc.
- Output video data. Get next LONG...repeat all the way across the scanline. (I'm thinking there should EASILY be enough time to handle this, with how involved it is to decode the Propeller ROM font!)
- Repeat previous two steps for 256 scanlines (for full graph height)
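To check the comparison logic in the steps above without touching the hardware, here is a rough host-side model in Python (not Propeller code). The function names are made up for illustration, and it assumes the inversion is done as 255 - value and that scanlines are counted 0..255 from the top of the graph window.

```python
def unpack_channels(sample):
    """Split one 32-bit graph long into its four byte-sized channel values."""
    return [(sample >> (8 * i)) & 0xFF for i in range(4)]


def lit_channels(sample, scanline, area=False):
    """Return one bool per channel: is that channel lit on this scanline?

    Values are inverted (255 - value) so larger values plot higher up.
    """
    lit = []
    for value in unpack_channels(sample):
        row = 255 - value                  # assumed inversion, see lead-in
        if area:
            lit.append(row < scanline)     # "<" test: fill below the data point
        else:
            lit.append(row == scanline)    # "=" test: a single dot
    return lit


# byte 0 = 200, byte 1 = 100, bytes 2 and 3 = 0
sample = 200 | (100 << 8)
print(lit_channels(sample, 55))              # dot graph, scanline 55
print(lit_channels(sample, 160, area=True))  # area graph, scanline 160
```

In PASM each of those tests would be one compare plus one flag-to-bits instruction per channel, which is where the time per data point goes.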
If there's somehow not enough time to do a quick [byte/scanline] comparison for each pixel, it easily could be made coarser, namely, 2 pixels each have the same data point...or 4 pixels...or 8 pixels...or all 16. All that matters is getting the data on the screen!
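A quick back-of-the-envelope check of that trade-off, again as a host-side Python sketch rather than Propeller code. It assumes an 80 MHz system clock, a 40 MHz pixel clock, and 4 clocks per ordinary PASM instruction (hub reads cost more); if your setup differs, change the constants.

```python
SYSTEM_CLOCK_HZ = 80_000_000   # assumption: 5 MHz crystal x 16 PLL
PIXEL_CLOCK_HZ = 40_000_000    # assumption: 800x600 pixel clock
CLOCKS_PER_INSTRUCTION = 4     # assumption: ordinary instruction, no hub access
SCREEN_WIDTH = 800


def budget(pixels_per_point):
    """Return (data points per line, instruction slots available per data point)."""
    clocks_per_pixel = SYSTEM_CLOCK_HZ // PIXEL_CLOCK_HZ
    slots = pixels_per_point * clocks_per_pixel // CLOCKS_PER_INSTRUCTION
    points = SCREEN_WIDTH // pixels_per_point
    return points, slots


for n in (1, 2, 4, 8, 16):
    points, slots = budget(n)
    print(f"{n:2d} px/point: {points:3d} longs per line, {slots} instruction slots per point")
```

Under those assumptions a 1-pixel data point leaves less than one instruction per point, while 16 pixels per point leaves about 8 slots, and a hub read eats more than one of them.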
EDIT: EDIT: EDIT: Got a number of the bugs worked out, but need some serious help on how to make the main loop more efficient! The code below is extremely coarse, but it graphs two colors. I've programmed it to demo at least something; it must be started after the "hidemaru" driver release. Improvements are needed! I've crammed 3 "waitvids" into the routine to output data, but clearly, there has to be a better way! Somehow the "Hidemaru" driver is completely re-bitting the font data for EACH "waitvid"-sized output...and that's not easy code!
Can someone at least tell me if this is somewhat viable, or if I'm completely out in left field? PLEASE NOTE: if you try to test this, it DOES NOT drive HSync or VSync. I'm purposely trying to design it to run in tandem with the Hidemaru 50x18 tile driver by kuroneko.
VAR
  byte  cogNum                                  ' cog ID + 1 (0 = not running)
  long  graphdata[150]                          '#of pixels across the screen. 4 simultaneous graphs permitted.

PUB start | a, b

  b := 0
  repeat a from 0 to 255 step 15
    graphdata[b++] := (^^a * 6) | ((255 - a) << 8) '| ((?a & $FF) << 8)

  ' cognew returns -1 on failure and 0 is a valid cog, so store ID + 1
  return cogNum := cognew(@driver, @graphdata) + 1   ' start the PASM video driver

PUB stop

  if cogNum
    cogstop(cogNum~ - 1)                        ' stop the cog and clear cogNum

DAT             org     0                       ' video driver

driver          mov     graphptr, par           ' because the only mailbox parameter IS "graph." Not planning on making a super-configurable routine...yet!
'NOT giving options for "configurable Y position, configurable X-range", etc. Hardcode for one purpose. That greatly simplifies the init routine.
                neg     href, cnt               ' -4 hub window reference (%%) ??????????????????????

' Upset video h/w and relatives.

                movi    ctrb, #%0_11111_000     ' LOGIC always (loader support) '###????? VCO+128 (lower three), counter mode "ALWAYS", no output pin (upper five)
                movi    ctra, #%0_00001_110     ' PLL, VCO/2
                movi    frqa, #%0001_00000      ' 5MHz * 16 / 2 = 40MHz = pixel clock

                movs    frqa, #res_x/4          ' |
                movs    frqa, #0                ' insert res_x into phsa

                mov     vscl, hvis              ' 1/16
                mov     vcfg, vcfg_norm         ' VGA, 4 colour mode ### was vcfg_sync

                rdlong  cnt, #0
                shr     cnt, #10                ' ~1ms
                add     cnt, cnt
                waitcnt cnt, #0                 ' PLL needs to settle

                waitvid zero, #0                ' dummy (first one is unpredictable)
                waitvid zero, #0                ' point of reference

                add     href, cnt               ' get current sync slot
                shr     href, #1                ' 2 system clocks per pixel
                sub     href, #9                ' 9..16 >> 0..7
                neg     href, href              ' |
                and     href, #%111             ' calculate adjustment

' WHOP is reasonably far away so we can update vscl without re-sync.

                add     vscl, href              ' |
                waitvid zero, #0                ' stretch frame
                sub     vscl, href              ' |
                waitvid zero, #0                ' restore frame

' At this point all WHOPs are aligned to 16h+11. In fact they are aligned to 32h+11
' due to the frame counter covering 2 hub windows.

'Finished init of video generators, etc.
'FULL DISCLAIMER: I dunno what all that code above does...I'm just leaving it right there.

                mov     dira, mask              ' drive outputs. NOTE: Does not drive VSYNC/HSYNC pins.

' horizontal timing 800(800) 40(40) 128(128) 88(88)
' vertical timing 600(600) 1(1) 4(4) 23(23)
'=================================================================================================
vsync           ' An experiment here...
                '...just wait for the VSync signal from the video driver.
' We don't need to do the video sync/drive portion of anything. Just overlaying on top of the already-generated signal.
'               waitcnt cnt, #0                 ' re-sync for steady video. RESULTS IN MESSED UP TEXT COLORS/NO GRAPH OUTPUT
'               waitvid sync, #0

                waitpeq vsync_pin, vsync_pin    ' Wait for the VSync pin to go HIGH
                waitpne vsync_pin, vsync_pin    ' Wait for the VSync pin to go LOW

                mov     yline, #0               ' starting at the top of the screen.
                mov     gLines, #graph_height   ' number of lines in the graph

' Synced to the VSync signal. We need to wait a number of HSync lines to get out of the vertical blanking,
'and to where we want to output video data.
'------------------------------------------------------------------------------------------------
line            waitpeq hsync_pin, hsync_pin    ' Wait for the HSync pin to go HIGH
                waitpne hsync_pin, hsync_pin    ' Wait for the HSync pin to go LOW
                add     yLine, #1

'Start of range: just pick line 300
                cmp     yLine, #graph_start wc
        if_c    jmp     #line
'----------------------------------
'At or beyond #graph_start
'Need to wait out past the blanking portion...
                mov     graphptr, par           'graph base address
                mov     ycomp, yline
                sub     ycomp, #graph_start     'zero-correct

                waitvid sync, #0
                waitvid sync, #0
                waitvid sync, #0

                mov     data, #0
                mov     xBlocks, #17            '800 / 16 = 50

pixel           'read a byte. Have 5 instructions available in this inner loop...
                rdlong  temp, graphptr
                add     graphptr, #4
                waitvid graph_colors, data

'               mov     data, #0
                mov     ch, temp
                and     ch, #$FF
                cmp     ch, ycomp wc
                muxc    data, ch_1
                waitvid graph_colors, data

                ror     temp, #8
                and     temp, #$FF
                cmp     temp, ycomp wc
                muxc    data, ch_2
                waitvid graph_colors, data

                djnz    xBlocks, #pixel
'----------------------------------
                waitvid graph_colors, #$00      ' mute output

'How many rows to display?
                djnz    gLines, #line
'Done. Wait for the next frame.
                jmp     #vsync                  ' next frame
'=================================================================================================
' initialised data and/or presets

sync            long    $0200                   ' locked to %00 {%hv}
hvis            long    1 << 12 | 16            ' 1/16
vcfg_norm       long    %0_01_1_00_000 << 23 | pgrp << 9 | vpin
vsync_pin       long    %01 << (pgrp * 8)
hsync_pin       long    %10 << (pgrp * 8)
mask            long    vpin << (pgrp * 8)      '### | %11 << (sgrp * 8) ###...removed VSync/HSync pins. We're only trying to sync to the text screen
graph_colors    long    $C0300C00               ' R/G/B/black...
all_on          long    $FFFFFFFF
ch_1            long    $55555555
ch_2            long    $AAAAAAAA
ch_3            long    $FFFFFFFF

href            res     1                       ' hub window reference < setup +3 (%%)
temp            res     1
ch              res     1
graphptr        res     1                       ' screen buffer < setup +4 (%%)
data            res     1
yline           res     1
ycomp           res     1                       'value y-offset corrected for graph offset
gLines          res     1
xBlocks         res     1

tail            fit

CON
  zero         = $1F0                           ' par (dst only)
  vpin         = $0FC                           ' pin group mask
  pgrp         = 1                              ' pin group
  dcolour      = %%0220_0010                    ' default colour
  res_x        = 800                            ' |
  res_y        = 600                            ' |
  res_m        = 4                              ' UI support
  graph_start  = 300
  graph_height = 256
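For anyone trying to follow what the cmp/muxc pairs in the inner loop are building, here is a simplified host-side model in Python (not Propeller code). It assumes 4-colour mode takes 2 bits per pixel starting from the lowest bits, and it resets data for every long, which the posted loop does not do (its "mov data, #0" is commented out). The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
CH_1 = 0x55555555          # low bit of every 2-bit pixel (ch_1 in the listing)
CH_2 = 0xAAAAAAAA          # high bit of every 2-bit pixel (ch_2 in the listing)
GRAPH_COLORS = 0xC0300C00  # graph_colors in the listing


def muxc(dest, mask, carry):
    """Model of MUXC: copy the carry flag into every dest bit selected by mask."""
    return ((dest | mask) if carry else (dest & ~mask)) & MASK32


def waitvid_data(value_1, value_2, ycomp):
    """Build the pixel long for one graph sample, two channels."""
    data = 0
    data = muxc(data, CH_1, value_1 < ycomp)  # CMP value, ycomp WC: C set when value < ycomp
    data = muxc(data, CH_2, value_2 < ycomp)
    return data


def colour_byte(data, pixel):
    """Colour byte selected by one 2-bit pixel (pixel 0 = lowest two bits)."""
    index = (data >> (2 * pixel)) & 0b11
    return (GRAPH_COLORS >> (8 * index)) & 0xFF


for v1, v2 in ((150, 200), (10, 200), (200, 10), (10, 20)):
    data = waitvid_data(v1, v2, 100)
    print(f"{v1:3d},{v2:3d} -> data ${data:08X}, colour byte ${colour_byte(data, 0):02X}")
```

If the model is right, channel 1 alone picks colour 1, channel 2 alone picks colour 2, and both together pick colour 3, so overlapping areas show up in the third colour.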
Comments
PLEASE NOTE the following:
- Both the VGA driver and overlay routine are using pin group 1, starting on P8. I understand that most boards use pin group 2 (= start at P16) to output VGA signals. You will need to change the PGRP value at the bottom of the overlay driver, and both VGRP and SGRP at the bottom of kuroneko's driver, to the appropriate value. I can't test it on group 2; your mileage may vary.
- There is a "carry" bug with the very compressed "graph match" routine, where the two axes interact with each other, causing a one-line error on the other graph. But there simply isn't room for more instructions in the loop.
There's only a handful of LONGs free in the COG due to "unrolling" the 50-iteration video loop (= 400 LONGs), which also uses the WHOP technique (i.e. putting the output data on the COG's internal bus at precisely the right time instead of using a WAITVID instruction). Subtract another 50 LONGs for a buffer of the graph data (there isn't time to read from HUB RAM during the output loop), and you basically have: COG is 512 LONGs - 16 (SFRs) - 50 graph LONGs - 400 unrolled output LONGs = only 46 LONGs left over for all support code, data constants, variables, etc. I dare anyone to try to get more axes/points than this in one COG!
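The budget above is easy to re-check on a PC. This little Python sketch uses only the figures quoted in the paragraph; the variable names are just for illustration.

```python
COG_LONGS = 512            # total cog RAM
SFR_LONGS = 16             # special function registers
GRAPH_BUFFER_LONGS = 50    # cog-resident copy of the graph data
UNROLLED_ITERATIONS = 50   # one iteration per 16-pixel block
UNROLLED_LONGS = 400       # size of the unrolled output loop

free_longs = COG_LONGS - SFR_LONGS - GRAPH_BUFFER_LONGS - UNROLLED_LONGS
longs_per_iteration = UNROLLED_LONGS // UNROLLED_ITERATIONS

print(f"free longs: {free_longs}")
print(f"longs per unrolled iteration: {longs_per_iteration}")
```

So each 16-pixel block gets 8 longs of unrolled code, and everything else has to fit in what is left.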
It still has some startup timing errors (likely to do with the video PLL startup), but that only results in the graph randomly being shifted over about 8 pixels. (Once started, it remains rock-solid.) But as far as I'm concerned, it does work. Enjoy!
P.S. Same NOTE as above post applies: you will have to modify CONstants in both the tile driver and graph driver if your VGA pins are on a different group than P8-P15.
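If you do move to another pin group, this host-side Python sketch (not Propeller code) evaluates the same shift expressions the overlay's DAT section uses for vsync_pin, hsync_pin and mask, so you can see which pins each value covers. It only models the overlay driver; the VGRP/SGRP constants in kuroneko's driver are not covered here.

```python
VPIN = 0x0FC   # vpin from the overlay's CON block


def overlay_masks(pgrp):
    """Evaluate the overlay's pin masks for a given pin group (0..3)."""
    shift = pgrp * 8
    return {
        "vsync_pin": 0b01 << shift,   # %01 << (pgrp * 8)
        "hsync_pin": 0b10 << shift,   # %10 << (pgrp * 8)
        "mask": VPIN << shift,        # vpin << (pgrp * 8)
    }


for pgrp in (1, 2):
    masks = overlay_masks(pgrp)
    print(pgrp, {name: f"${value:08X}" for name, value in masks.items()})
```

For group 1 that gives VSync on P8, HSync on P9 and colour outputs on P10-P15; group 2 shifts everything up by eight pins.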