Help calculating a waitcnt delta — Parallax Forums

Help calculating a waitcnt delta

pullmoll Posts: 817
edited 2010-05-01 17:12 in Propeller 1
I'm trying to get a video mode working that pushes the Prop to the limit. I figured out some numbers that work, but I don't have enough time for even a single hub access instruction in the video emitting cog. I wanted to write -1 to a long for the other cog to synchronize on whenever a character row (of 12 scanlines) was done. This won't work.

Now my other idea was to exactly calculate the number of clock cycles it takes the video cog to emit 12 scanlines and then synchronize the renderer by means of waitcnt on every character row.

These are values I have found or want to use on a DracBlade with 5MHz xtal and 16x PLL, i.e. 80MHz:

vcfg : vmode=%01, cmode=1, pin group=2, 8 pins
ctra : 15 << 23 (PLLx16)
frqa : 14 << 24 (5MHz * 14 = 70 MHz PLL clock)
vscl : 3 << 12 + 3 * 16 (3 PLLs per pixel, frames are 16 pixels)
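
For anyone checking these values, here is a minimal Python sketch that decodes them. It assumes the Propeller 1 field layout (CTRA: CTRMODE in bits 30..26, PLLDIV in bits 25..23; VSCL: PixelClocks in bits 19..12, FrameClocks in bits 11..0) and reads the VSCL line as (3 << 12) + 3 * 16, which is how Spin's operator precedence should evaluate it. All variable names are made up for illustration.

```python
# Sketch only: decode the register values quoted above.
# Field positions are an assumption based on the Propeller 1 datasheet layout.
CLKFREQ = 80_000_000              # 5 MHz crystal x 16 system PLL

ctra = 15 << 23
frqa = 14 << 24
vscl = (3 << 12) + 3 * 16         # Python needs the parentheses, Spin does not

ctr_mode = (ctra >> 26) & 0x1F    # 1 = PLL internal (video) mode
pll_div = (ctra >> 23) & 0x7      # 7 = VCO / 1

nco_hz = CLKFREQ * frqa / 2**32   # rate fed into the counter PLL
pll_hz = nco_hz * 16 / 2 ** (7 - pll_div)

pixel_clocks = (vscl >> 12) & 0xFF    # PLL clocks per pixel
frame_clocks = vscl & 0xFFF           # PLL clocks per waitvid frame
pixels_per_frame = frame_clocks // pixel_clocks
pixel_hz = pll_hz / pixel_clocks

print(f"CTRMODE={ctr_mode:05b} PLLDIV={pll_div:03b}")
print(f"PLL clock   : {pll_hz / 1e6:.3f} MHz")
print(f"pixel clock : {pixel_hz / 1e6:.5f} MHz")
print(f"pixels/frame: {pixels_per_frame}")
```

Under these assumptions the decoded values agree with the comments in the list: a 70 MHz PLL clock, 3 PLL clocks per pixel and 16 pixels per frame.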

The scanlines consist of
6 * 16 pixels for hblank and hsync, split as: 1x porch, 3x hsync, 2x porch
640 active pixels (40 x 16)

The frame consists of
480 active scanlines
11 bottom porch scanlines
2 vsync lines
31 top porch scanlines

The pixel and colour buffer consists of 2 character rows, and my goal was to alternately render the character row that is currently not displayed.
For this mode this makes 40*12*2 longs of pixel data and 40*2 longs of color data.

My font is 8x12, so the interesting number is the cycles it takes to emit 12 scanlines.
The pixel clock is 70 MHz / 3 = 23.33333 MHz and there are 6*16+640 = 736 total pixels per scanline, so the horizontal frequency is 31.703 kHz.
If I got that right, this is 31.542 us per scanline (which is slightly out of the spec of 31.77 us per scanline for the VGA 640x480 mode).
Then this value * 12 is 378.514 us and at 80 MHz this should be equal to a CNT delta of 30281.14 ~= 30281 clocks.
With this delta my "waitcnt wait, wait_add" hangs until the counter wraps, or so it seems.
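
The arithmetic above can be re-checked with exact fractions. This is only a sketch in Python (not Propeller code); the constant names are illustrative and the numbers are the ones listed in this post.

```python
from fractions import Fraction

# Values taken from the post; names are illustrative only.
CLKFREQ = 80_000_000            # system clock in Hz
PLL_HZ = 70_000_000             # video PLL clock in Hz
PLL_PER_PIXEL = 3               # PLL clocks per pixel (VSCL)
BLANK_PIXELS = 6 * 16           # porches plus hsync
ACTIVE_PIXELS = 640
ROW_LINES = 12                  # scanlines per character row
FRAME_LINES = 480 + 11 + 2 + 31

line_pixels = BLANK_PIXELS + ACTIVE_PIXELS
pixel_hz = Fraction(PLL_HZ, PLL_PER_PIXEL)
line_hz = pixel_hz / line_pixels
line_us = 1_000_000 / line_hz
row_clocks = ROW_LINES * CLKFREQ / line_hz   # CNT delta per character row
frame_hz = line_hz / FRAME_LINES

print(f"pixels per line : {line_pixels}")
print(f"line frequency  : {float(line_hz):.1f} Hz")
print(f"line period     : {float(line_us):.3f} us")
print(f"CNT delta / row : {row_clocks} = {float(row_clocks):.3f}")
print(f"frame rate      : {float(frame_hz):.2f} Hz")
```

With these inputs the delta comes out as 211968/7, i.e. 30281 plus 1/7 of a clock, so a fixed integer delta of 30281 would slip by one system clock every 7 character rows relative to the video output. The 524 total lines give a frame rate of about 60.5 Hz.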

My questions are:
a) did I do the math right, or am I missing something?
b) Should I try to get (closer to) an integer CNT delta by tweaking the 70MHz PLL frequency?
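
On question b), one way to look for a PLL frequency that gives an integer delta is a brute-force search. The sketch below (Python, illustrative names) assumes FRQA is stepped as n << 24, so the PLL runs at n * 5 MHz, and assumes a usable PLL range of 64 to 128 MHz, which is worth checking against the Propeller datasheet. Changing the PLL clock also changes the pixel clock and the line frequency, so any candidate still has to be something the monitor accepts.

```python
CLKFREQ = 80_000_000
PLL_STEP_HZ = 5_000_000             # assumption: FRQA = n << 24 gives n * 5 MHz
PLL_CLOCKS_PER_ROW = 12 * 736 * 3   # lines * pixels per line * PLL clocks per pixel


def integer_delta_candidates(lo_hz=64_000_000, hi_hz=128_000_000):
    """Return (n, pll_hz, cnt_delta) for PLL clocks whose row time is a whole
    number of system clocks. The frequency limits are assumed, not verified."""
    found = []
    for n in range(1, 32):
        pll_hz = n * PLL_STEP_HZ
        if not lo_hz <= pll_hz <= hi_hz:
            continue
        system_clocks = PLL_CLOCKS_PER_ROW * CLKFREQ
        if system_clocks % pll_hz == 0:
            found.append((n, pll_hz, system_clocks // pll_hz))
    return found


candidates = integer_delta_candidates()
for n, pll_hz, delta in candidates:
    print(f"FRQA = {n} << 24 -> PLL {pll_hz // 1_000_000} MHz, CNT delta {delta}")
```

The 70 MHz setting (n = 14) does not show up in the result because 12 * 736 * 3 * 80 is not divisible by 70.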

TIA
Juergen

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pullmoll's Propeller Projects

Post Edited (pullmoll) : 4/29/2010 10:09:13 PM GMT

Comments

  • Jack Buffington Posts: 115
    edited 2010-04-29 23:46
    I'm just finishing up a project where I am dealing with capturing 640x480x60Hz. There are 800 pixels per line when you include the sync and porches. The pixel clock is 25.175MHz. TinyVGA.com has been a help to me on this project.

    -Jack
  • kuroneko Posts: 3,623
    edited 2010-04-30 02:34
    pullmoll said...
    Then this value * 12 is 378.514 us and at 80 MHz this should be equal to a CNT delta of 30281.14 ~= 30281 clocks.
    With this delta my "waitcnt wait, wait_add" hangs until the counter wraps, or so it seems.

    My questions are:
    a) did I do the math right, or am I missing something?
    Your column loop alone consumes 1+189*(CHAR_W/2)+1 normal instructions (hub instructions count as 2 normal ones). With CHAR_W being 80, that's 30248 cycles. And that's not counting the code between :row and :col.
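
    This count can be turned into a quick budget check. A minimal sketch in Python, assuming 4 system clocks per normal instruction and taking the 189 instruction equivalents per loop pass and the 30281-clock row budget from this thread; the names are illustrative.

```python
CYCLES_PER_INSTRUCTION = 4   # a normal Propeller 1 instruction takes 4 clocks
CHAR_W = 80                  # value quoted in the post
LOOP_BODY = 189              # instruction equivalents per pass of the column loop
ROW_BUDGET = 30281           # CNT delta for 12 scanlines, from the first post


def column_loop_cycles(char_w=CHAR_W, body=LOOP_BODY):
    """System clocks used by the column loop alone: 1 + body * (char_w / 2) + 1
    instructions, each counted as 4 clocks."""
    instructions = 1 + body * (char_w // 2) + 1
    return instructions * CYCLES_PER_INSTRUCTION


used = column_loop_cycles()
slack = ROW_BUDGET - used
print(f"column loop: {used} clocks, budget: {ROW_BUDGET}, slack: {slack}")
```

    That leaves 33 clocks, roughly 8 normal instructions, for everything between :row and :col.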

    Post Edited (kuroneko) : 4/30/2010 2:39:17 AM GMT
  • potatohead Posts: 10,261
    edited 2010-04-30 05:36
    Why can't you just sync during the VBLANK, and have the two cogs run together at that point?

    Seems to me, whatever buffer the video emitting cog is using just needs to be filled. Fill it with the graphics cog, using a counter in that cog as a time reference.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Propeller Wiki: Share the coolness!
    8x8 color 80 Column NTSC Text Object
    Safety Tip: Life is as good as YOU think it is!
  • pullmoll Posts: 817
    edited 2010-04-30 06:08
    potatohead said...
    Why can't you just sync during the VBLANK, and have the two cogs run together at that point?

    Writing a hub long at each frame is too much for the video emitting cog. Yes, really! As soon as I try this, the whole thing goes out of sync.

  • pullmoll Posts: 817
    edited 2010-04-30 06:17
    kuroneko said...
    pullmoll said...
    Then this value * 12 is 378.514 us and at 80 MHz this should be equal to a CNT delta of 30281.14 ~= 30281 clocks.
    With this delta my "waitcnt wait, wait_add" hangs until the counter wraps, or so it seems.

    My questions are:
    a) did I do the math right, or am I missing something?
    Your column loop alone consumes 1+189*(CHAR_W/2)+1 normal instructions (hub instructions count as 2 normal ones). With CHAR_W being 80, that's 30248 cycles. And that's not counting the code between :row and :col.

    How did you get the value 189? Did you count the instructions or look into the listing? I now counted them too and got the same result...
    If my renderer is too slow - I wouldn't have expected that - I would need to run two rendering cogs, one doing odd lines, the other even lines... or one doing the first half, one the second half of the character rows. That makes the whole attempt almost useless :-/


    Post Edited (pullmoll) : 4/30/2010 6:39:33 AM GMT
  • potatohead Posts: 10,261
    edited 2010-04-30 06:35
    That doesn't sound right to me. I can understand getting stuck on the actual scan lines. This is VGA right? But the vertical blank period is actually pretty long.

    And because we might be confused on what cog does what, is your "emitter" cog the one that is actually putting the signal out onto the pins? There is another cog feeding graphics to a line buffer, or buffers as well, let's call that one the "graphics" cog.

    The emitter should write a value, to be read by the graphics cog, so that it knows what elements of the display to build. With high dot clocks, doing this during active scans is going to be tough, but during the vertical blank, there should be fairly long periods of time where the sync pulses are output, where you could easily nest a wrlong, right after a waitivd.


    Post Edited (potatohead) : 4/30/2010 6:43:25 AM GMT
  • pullmoll Posts: 817
    edited 2010-04-30 06:51
    potatohead said...
    That doesn't sound right to me. I can understand getting stuck on the actual scan lines. This is VGA right? But the vertical blank period is actually pretty long.

    And because we might be confused on what cog does what, is your "emitter" cog the one that is actually putting the signal out onto the pins? There is another cog feeding graphics to a line buffer, or buffers as well, let's call that one the "graphics" cog.

    The emitter should write a value, to be read by the graphics cog, so that it knows what elements of the display to build. With high dot clocks, doing this during active scans is going to be tough, but during the vertical blank, there should be fairly long periods of time where the sync pulses are output, where you could easily nest a wrlong, right after a waitivd.

    This is almost VGA. I had to tweak the timing to get a video signal that synchronizes on my LCD monitor without the frequencies being out of range.

    The emitter is the vgavid.spin cog, yes. It does nothing but shuffle hub longs with attributes and colors into waitvids. It repeats the same buffer after 2 character rows, i.e. 24 scanlines.
    I couldn't believe it myself that writing just one hub long at the "frame" label in vgavid.spin was too much, but you can verify it yourself. Just "wrlong minus1, sync_ptr" there. If I do this, the whole thing doesn't sync anymore. It seems as if there is absolutely no cycle left to waste at that point, or at any point for that matter.

    What I would like to have is a sync signal after the emitter has sent out the first 12 lines, so that the generator can begin to render the first half of the buffer again while the emitter is active on the second half.

    Unfortunately, as kuroneko pointed out, my generator is too slow anyway, so it makes no sense at all to try to synchronize both cogs :(


    Post Edited (pullmoll) : 4/30/2010 6:57:12 AM GMT
  • potatohead Posts: 10,261
    edited 2010-04-30 06:55
    You know, that kind of thing works great on TV. If you run s-video, your average TV will do plenty of pixels for this. I went ahead and deleted my other post, because of the speed issue overall mentioned above.

  • kuroneko Posts: 3,623
    edited 2010-04-30 07:01
    pullmoll said...
    How did you get the value 189? Did you count the instructions or look into the listing? I now counted them too and got the same result...
    For the record, yes, I did count them :) Just to make sure, as stuff like this is too often missed or overlooked.
  • pullmoll Posts: 817
    edited 2010-04-30 07:08
    kuroneko said...
    pullmoll said...
    How did you get the value 189? Did you count the instructions or look into the listing? I now counted them too and got the same result...
    For the record, yes, I did count them :) Just to make sure, as stuff like this is too often missed or overlooked.

    Thanks for the effort! I like being convinced by numbers that I'm trying to do something impossible. Time to give up on this attempt :)

    At least I now know how to write a less demanding VGA driver for the emulations. With just 40x25 or 64x16 characters without any colors that should be as easy as TV.

    Thanks again,
    Juergen

  • pullmoll Posts: 817
    edited 2010-05-01 07:10
    Going down to 320x240, I finally got at least something working as expected.
    This code generates a display of 40x20 characters with 2 colours per character, defined by 4 bits of background and 4 bits of foreground colour, i.e. a subset of the 64 possible colours indexed through a palette.
    If you have the time and patience to test it, would you mind reporting whether it works for you?

    Now from this working version I'll try to go to higher resolutions, both horizontally and vertically.

    Edit: with 384x240 I can't seem to find a value for FRQA that doesn't lead to colour seams on the characters.
    See version 0.0.6 - you can tune the FRQA value while it runs using PST (or your terminal program) and the keys i for increment, d for decrement.
    The relation it starts with is one of the best I found.

    Edit2: Using an 8x8 font with double scan works fine, so version 0.0.7 is a 40x30 mode with 2 colours per character.

    Edit3: This is about the limit that's possible with my renderer. 48x30 characters with 2 colours. 2880 bytes of frame buffer plus 3 * 24*8 = 576 longs of pixel buffer plus 3 * 24 longs of colour buffer.
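
    The buffer sizes can be reproduced with a small helper. This is a sketch in Python under two assumptions inferred from the numbers in this thread, not confirmed by the driver source: each character cell takes 2 bytes in the frame buffer, and one long of pixel data covers 2 characters on a scanline. The function name and parameters are made up for illustration.

```python
def footprint(cols, rows, font_h=8, buffered_rows=3,
              bytes_per_cell=2, chars_per_long=2):
    """Estimate buffer sizes for a text mode.

    bytes_per_cell and chars_per_long are assumptions chosen so that the
    result matches the figures quoted in the thread.
    """
    longs_per_line = cols // chars_per_long
    return {
        "frame_bytes": cols * rows * bytes_per_cell,
        "pixel_longs": buffered_rows * longs_per_line * font_h,
        "colour_longs": buffered_rows * longs_per_line,
    }


for cols in (40, 48, 56):
    print(cols, "x 30:", footprint(cols, 30))
```

    For the 48x30 mode this gives 2880 bytes of frame buffer, 576 longs of pixel buffer and 72 longs of colour buffer, matching the figures above.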


    Post Edited (pullmoll) : 5/1/2010 4:33:43 PM GMT
  • pullmoll Posts: 817
    edited 2010-05-01 09:28
    potatohead said...
    You know, that kind of thing works great on TV. If you run s-video, your average TV will do plenty of pixels for this. I went ahead and deleted my other post, because of the speed issue overall mentioned above.

    Well, I now believe there _is_ a speed issue :) You can do much more with the TV mode, because there you have half the frequencies and twice the time and number of cycles to do your stuff. For VGA it is much harder to achieve halfway reasonable results, even at lower resolutions. It also looks like VGA in one cog would be very limited, if possible at all.

  • potatohead Posts: 10,261
    edited 2010-05-01 16:01
    Agreed.

    I've not done much VGA, just because I am rarely where I've got a monitor! (wonders why laptops don't have a VGA in mode) A display like the one you just completed is possible in one COG on TV, though I don't know if the color indirection is possible. Probably, but when I tried it, I ran out of cycles. Then again, I didn't use all the speed tricks at that time either.

    One reason I really like TV is so that I can run a video capture, putting the propeller output into a window on my laptop. Composite breaks down some over 320 pixels, though on newer devices it's surprisingly good. Anything HD capable will squeeze far more out of that signal than one would expect. S-video brings near monitor quality to a 320 pixel display, and does very well up to the 640 pixel display.

    The real bummer with NTSC TV is the lower vertical resolution. Without interlace, it's about 200 lines. On the other hand, Prop doesn't have a lot of RAM either, putting TV right in the sweet spot.

    I've a VGA monitor that syncs all the way down to the 15 kHz frequencies TV uses. Wonder how many of them will actually do that? If a lot of them do, perhaps setting up the CGA timings with VGA output would balance the speed issue out, yielding much better one COG displays.

    TV has a similar problem, if one wants to exceed S-video resolutions, especially vertical where a progressive scan is very nice to have. Basically, a component video output is necessary, and that consumes pins, or is color limited, or grey scale only. Not really all that possible with one COG there either.

    Nice job on this driver though. Looks great!

  • pullmoll Posts: 817
    edited 2010-05-01 16:07
    potatohead said...
    I've a VGA monitor that syncs all the way down to the 15 kHz frequencies TV uses. Wonder how many of them will actually do that? If a lot of them do, perhaps setting up the CGA timings with VGA output would balance the speed issue out, yielding much better one COG displays.

    The times of CGA are history with LCD monitors. Mine (brand new) starts accepting video signals only from 30kHz and higher, and vertically it seems I must have 50 to 75 Hz, so the possible range is quite limited.
    potatohead said...
    TV has a similar problem, if one wants to exceed S-video resolutions, especially vertical where a progressive scan is very nice to have. Basically, a component video output is necessary, and that consumes pins, or is color limited, or grey scale only. Not really all that possible with one COG there either.

    Nice job on this driver though. Looks great!

    Yes, the latest one may actually be usable for something. The trick was to go with 3 character rows pixel buffer, because with 2 I couldn't get rid of rendering overlapping the video output. 40x30 chars is not _that_ bad and with some block and line graphics in the upper 128 characters (not yet) you could even do some games in this mode. 2400 bytes frame buffer isn't too much either.

    Now I got 48x30 working also. Don't know what I did wrong before. It's all very peculiar adjustments of number and position of the porches and sync signals, plus finding some reasonable value for the PLL and pixel clocks, so that the VSCL divider produces stable boundaries. This is what's missing from version 0.0.8, too: a good (stable) relation of PLL clock, thus pixel clock and the possible hsync + vsync frequency range.

    Edit: Woah! Now even 56x30 works... still no "nice" PLL clock without colour seams, though.


    Post Edited (pullmoll) : 5/1/2010 4:34:17 PM GMT
  • potatohead Posts: 10,261
    edited 2010-05-01 16:52
    Why not use a reduced pixel font for 80 cols? You've now got plenty of pixels for that, if you can live with the bigger color cells.

    And the seams are not that big of a deal, honestly. It would be nice to eliminate them, but overall, that display looks pretty great!

    That 8x8 font rocks on TV. It's built to keep artifacts at a minimum, allowing readable 80 column display. On VGA, those are not issues, meaning you should be able to pack all the characters in with the resolution you've got. Somebody here posted up a 5x8 font... There are 4x8 fonts that work too, though they can be painful to read. How many cogs do you have to render with?


    Post Edited (potatohead) : 5/1/2010 4:59:31 PM GMT
  • pullmoll Posts: 817
    edited 2010-05-01 17:12
    potatohead said...
    Why not use a reduced pixel font for 80 cols? You've now got plenty of pixels for that, if you can live with the bigger color cells.

    And the seams are not that big of a deal, honestly. It would be nice to eliminate them, but overall, that display looks pretty great!

    That 8x8 font rocks on TV. It's built to keep artifacts at a minimum, allowing readable 80 column display. On VGA, those are not issues, meaning you should be able to pack all the characters in with the resolution you've got. Somebody here posted up a 5x8 font... There are 4x8 fonts that work too, though they can be painful to read. How many cogs do you have to render with?

    I don't want to lose the per character attribute, because I intend to attach my ANSI/VT100 emulation to such a text display. I could try to use a shorter VSCL pixel frame like 14 or 12 pixels for a 7x10 or 6x10 font perhaps. Not sure if that works out. For the emulations I already got everything I need, even more than I need, because they were all monochrome so far.

    There's one video output cog and one renderer cog. The overall footprint including the 256 character font is ~ 3600 longs for the 56x30 mode.
