P2 DVI/VGA driver

rogloh · 2024-11-26 09:49

So I've been trying to improve the PAL and NTSC stuff in my video driver in the last couple of days. I took Chip's more accurate timing calculations from this thread
https://forums.parallax.com/discussion/173102/ntsc-pal-driver-without-dot-crawl-new-version/p1
and recalculated timing parameters for these four SDTV modes output using composite/S-video:
Progressive: 288p (PAL), 240p (NTSC)
Interlaced: 576i (PAL), 480i (NTSC)

I also took @Wuerfel_21's updated colourspace matrix settings for NTSC/PAL and added the 3.3V DAC pin setting for improved yellow output range.

Final result is a lot better than my original settings which we knew were less than ideal.

For NTSC and PAL via S-video, results are very good as expected - it looked amazing on my Plasma actually. And for composite NTSC both progressive and interlaced look pretty good too. For PAL the results are still not ideal, with visible chroma crawl in interlaced mode, despite tweaking the timing to get the colour subcarrier frequency as close as I could to 4433618.7516 Hz and computing XFREQ accurately with respect to this frequency in an attempt to lock to it. While the text is crisp and the colour is good, it shimmers with visible colour dots (and I've set up 40 column text instead of 80, which helps reduce the high resolution of the text) and the vertical graphical lines also crawl upwards. A video is attached of it in action. Progressive mode PAL does look better at least: there is no obvious crawl and everything looks frozen, but it does show the dot pattern - see photo. Actually it does crawl very slowly, but the cycle time is over tens of seconds so it's not really noticeable. On my Dell LCD (photo'd) these dots are highly noticeable, but the same signal into the Plasma TV looks less dotty and more like it's wavy and underwater, so it's been filtered somewhat.

I also spent time on this today and experimentally added in the special burst gate blanking PAL uses (Bruch blanking) in the hope that might have helped out with displays that don't like seeing colour bursts on incorrect lines, but still no improvement so far unfortunately. I'm at a loss as to what can actually improve this more. It seems like there is still something wrong but I can't tell what when looking on the scope.
See this link for details in the useful Video Demystified book online
https://archive.org/details/video-demystified-5th-edition/page/n305/mode/2up (page 306 via browser, real page 286 in book)
and more details here:
http://www.bbceng.info/additions/2019/ETD Books/TV Signal Coding - Supplemenary information - ETD Training Book.pdf

TonyB_ · 2024-11-26 12:28

What PAL colour subcarrier frequency (fcsc) and line frequency (fline) are you using?

EDIT:
The exact relationship is fcsc = 283.75 * fline + 25 Hz
where fline = 15625 Hz and fcsc = 4.43361875 MHz

The extra 25 Hz introduces one or two horrible big prime numbers when synthesizing fcsc

Just wondering whether you've tried fcsc = 283.75 * fline
If sysclk = 283.75 MHz and fline = 15625 Hz then fcsc = sysclk / 64 = 4.43359375 MHz
20.0 MHz * 227 / 16 = 283.75 MHz
283.75 MHz / 19 (or 20) pixel clock outputs very close (or close) to square pixels for 768x576 res.
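
As a quick check of the figures above, here is a minimal Python sketch that recomputes the exact PAL subcarrier, the sysclk / 64 approximation and the two candidate pixel clocks. The variable names are illustrative only, and the 14.75 MHz square-pixel reference is the commonly quoted PAL square-pixel rate, which is an assumption here rather than something stated in the post.

```python
from fractions import Fraction

F_LINE = 15_625                                   # PAL line frequency in Hz
CYCLES_PER_LINE = Fraction(1135, 4)               # 283.75 colour cycles per line

fcsc_exact = CYCLES_PER_LINE * F_LINE + 25        # includes the 25 Hz offset
fcsc_no_offset = CYCLES_PER_LINE * F_LINE         # offset dropped

sysclk = Fraction(20_000_000 * 227, 16)           # 20.0 MHz * 227 / 16

print(float(fcsc_exact))       # 4433618.75
print(float(fcsc_no_offset))   # 4433593.75
print(float(sysclk))           # 283750000.0

# Assumed reference: PAL square pixels are usually quoted at 14.75 MHz.
SQUARE_PIXEL_CLOCK = 14_750_000
pixel_clocks = {}
for divider in (19, 20):
    pixclk = sysclk / divider
    error_pct = float((pixclk - SQUARE_PIXEL_CLOCK) / SQUARE_PIXEL_CLOCK * 100)
    pixel_clocks[divider] = (float(pixclk), error_pct)
    print(divider, float(pixclk), round(error_pct, 2))
```

Dropping the 25 Hz term is what makes the carrier an exact sysclk / 64, at the cost of about 5.6 ppm of frequency error.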

Wuerfel_21 · 2024-11-26 13:23

Yea the exact length and number of scanlines is what affects it. For lowres NTSC you ideally want each scanline to be exactly N+0.5 color cycles and an odd number of total scanlines (263 probably). That way the errors even out the best (for static images). Each scanline will have roughly the opposite error of its upper/lower neighbor and the errors invert every frame. For interlace it's a 90° shift every field, which I guess is good?
Right now I'm not sure what's ideal for PAL (since the phase is flipping on its own to begin with).
PAL60 is kinda messed up in this regard from the beginning, nothing matches up nicely. Even "real" PAL60 sources like the Wii have annoying dot crawl. Though that uses a really nice video DAC chip ("AVE-RVL") that filters the signal to avoid artifacts in the first place, so it's not as bad.
Maybe that could be built externally i.e. output S-Video from the P2 chip and put a switchable 3.58/4.43 MHz trap filter on the Y line before mixing the signals. The common AD725 encoder has something like that built-in, so I think that works in practice. You'd really want to filter IQ before modulation, too, but oh well.
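
The NTSC phase relationships described above can be checked with a little arithmetic. This is only a sketch: it assumes N = 227 (the usual 227.5 colour cycles per NTSC line), and the function name is made up for illustration.

```python
from fractions import Fraction

def phase_advance_deg(cycles_per_line, lines):
    """Subcarrier phase advance in degrees (mod 360) over `lines` scanlines."""
    total_cycles = Fraction(cycles_per_line) * Fraction(lines)
    return float((total_cycles % 1) * 360)

CYCLES_PER_LINE = Fraction(455, 2)    # 227.5, i.e. N + 0.5 with N = 227 (assumed)

per_line = phase_advance_deg(CYCLES_PER_LINE, 1)                      # 180.0
per_frame_263 = phase_advance_deg(CYCLES_PER_LINE, 263)               # 180.0
per_frame_262 = phase_advance_deg(CYCLES_PER_LINE, 262)               # 0.0
per_field_525i = phase_advance_deg(CYCLES_PER_LINE, Fraction(525, 2)) # 270.0

print(per_line, per_frame_263, per_frame_262, per_field_525i)
```

With 263 lines the phase inverts on every line and every frame, whereas an even count such as 262 would repeat the same phase each frame. For 262.5-line fields the advance is 270°, which is the same as a -90° step per field.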

TonyB_ · 2024-11-26 13:48

This webpage is worth a read:
The 625/50 PAL Video Signal and TV Compatible Graphics Modes

rogloh · 2024-11-26 22:15

@TonyB_ said:
What PAL colour subcarrier frequency (fcsc) and line frequency (fline) are you using?

Well I first started out with what Chip used which was an integer 4433618Hz and then tweaked it further by adjusting to 4433618.75 and even 4433618.7516 (true frequency that re-aligns colour phase every 8 fields with that extra Fline/625 offset added).
Line frequency was computed from this using 283.75 colour cycles per line, as in Chip's code below.

pal_cf          = 4_433_618                             'colorburst frequency
pal_cc          = round(283.75 * 4.0)                   'color cycles per line * 4 to preserve fraction
...
 dotf := muldiv64(x_total, pal ? pal_cf * 4  * 128 : ntsc_cf * 4 * 128, pal ? pal_cc : ntsc_cc) 'compute pixel clock * 128
 cf := (pal ? pal_cf : ntsc_cf) frac freq
...
  i := 31 - encod freq                               'compute very accurate streamer frequency to stop dot-crawl
  xf := ((dotf >> (7-i)) frac (freq << i) + 1) >> 1

@TonyB_ said:
EDIT:
The exact relationship is fcsc = 283.75 * fline + 25 Hz
where fline = 15625 Hz and fcsc = 4.43361875 MHz

The extra 25 Hz introduces one or two horrible big prime numbers when synthesizing fcsc

Just wondering whether you've tried fcsc = 283.75 * fline
If sysclk = 283.75 MHz and fline = 15625 Hz then fcsc = sysclk / 64 = 4.43359375 MHz
20.0 MHz * 227 / 16 = 283.75 MHz
283.75 MHz / 19 (or 20) pixel clock outputs very close (or close) to square pixels for 768x576 res.

I think the original PAL timing was probably somewhat better before I started tweaking it so I'll probably revert back to that for now. IIRC it showed the text dots (perhaps more of an LCD monitor artefact with poor filtering) and jagged vertical lines but less actual visible crawl.

I will also try out your P2 clock numbers TonyB_, but if they don't improve things further than Chip's stuff, for PAL use I'd probably just stick with S-video for the output with PAL. Composite PAL seems too hard to get spot on with the P2 HW even when you make all attempts at getting everything computed correctly, unless PAL always has this amount of crawl and relies on very good filtering from devices to reduce it.

rogloh · 2024-11-26 22:29

Here was the hacked up vsync handling I'd tested - copied it into the mouse sprite code for now to make room for testing until I remove DVI to free COGRAM space it needs. When I scoped it I captured all four cases of the colour bursts at sync time for the four fields and saw it was doing the correct thing. Also the PAL phase (CQ value) gets reset to the same initial value before the back porch blanking begins after this code, so the burst has the correct polarity, and is XORd to flip it to the other polarity per each scan line after that. This is what is meant to occur AFAIK.

interlacedcode                                                'some different sync code patches
                            testb   fieldcount,#0 wz
                            testb   fieldcount,#1 wc

            if_c_ne_z       sets    doburst, #hsync0        'disable colourburst fields 2,3
                            callpa  #1, #blank              'send a blank line
                            sets    doburst, #hsync0        'disable all colourburst cases
            if_nz           call    #\hsync                 'send
            if_nz           xcont   m_slim, hsync0           'send 1/2 line for field 1,3

                            rep #2, #5-0                    'defaults to PAL
                            xcont   sync_000, hsync1        'generate horizontal blanking/sync
                            xcont   sync_001, hsync0

                            decod   status, #31             'update status - in vertical sync
                            setbyte status, fieldcount, #2  'update field counter in status
                            wrlong  status, statusaddr

                            rep #2, #5-0                    'defaults to PAL
                            xcont   sync_002, hsync1        'generate horizontal blanking/sync
                            xcont   sync_003, hsync0
                            cogatn  #0-0

                            rep #2, #5-0                    'defaults to PAL
                            xcont   sync_000, hsync1        'generate horizontal blanking/sync
                            xcont   sync_001, hsync0

            if_z            xcont   m_half, hsync0  'remainder after half line (line 318, field 2,4)


            if_c_eq_z       callpa  #1,#blank                 'send a full line (fields 1,4, lines 6,319)
                            sets    doburst, #colourburst    'enable all colourburst outputs from now

            if_00           setd    patchvbp, #15   ' 7-21    22-309 (288 lines)
            if_01           setd    patchvbp, #15   ' 319 - 333 inclusive  334-621 (288 lines)
            if_10           setd    patchvbp, #16   ' 6-21   22-309 (288 lines)
            if_11           setd    patchvbp, #14   ' 320-333 inclusive  334-621 (288 lines)

                            jmp     #fieldloop

rogloh · 2024-11-26 23:44

@TonyB_ said:
Just wondering whether you've tried fcsc = 283.75 * fline
If sysclk = 283.75 MHz and fline = 15625 Hz then fcsc = sysclk / 64 = 4.43359375 MHz
20.0 MHz * 227 / 16 = 283.75 MHz
283.75 MHz / 19 (or 20) pixel clock outputs very close (or close) to square pixels for 768x576 res.

Just tried this, it does give good results but still has that bad dot crawl I already see with my own tests. A brief video is attached showing the output on my plasma using these settings - gives similar results to mine.

@Wuerfel_21 said:
Maybe that could be built externally i.e. output S-Video from the P2 chip and put a switchable 3.58/4.43 MHz trap filter on the Y line before mixing the signals. The common AD725 encoder has something like that built-in, so I think that works in practice. You'd really want to filter IQ before modulation, too, but oh well.

Yeah I think many of the problems now probably mainly come down to combining Luma+Chroma inside the P2 without filtering because the S-video output looks really nice.

TonyB_ · 2024-11-27 00:39

@rogloh said:

@TonyB_ said:
Just wondering whether you've tried fcsc = 283.75 * fline
If sysclk = 283.75 MHz and fline = 15625 Hz then fcsc = sysclk / 64 = 4.43359375 MHz
20.0 MHz * 227 / 16 = 283.75 MHz
283.75 MHz / 19 (or 20) pixel clock outputs very close (or close) to square pixels for 768x576 res.

Just tried this, it does give good results but still has that bad dot crawl I already see with my own tests. A brief video is attached showing the output on my plasma using these settings - gives similar results to mine.

Thanks for testing. I think fcsc = 283.75 * fline is the best that can be achieved. It seems that's what Chip's code is doing and results using his code should look similar. To minimize PAL dot crawl, a "precision offset" of 1/625 * fline needs to be added to 283.75 * fline but it's a tiny addition, only 5.6 ppm. The insurmountable problem on the P2 is that fline and fcsc are both integer fractions of sysclk, which would need to be 10 times faster for the precision offset to be possible.

Wuerfel_21 · 2024-11-27 01:59

I assume you tried the obvious "just add/subtract 1 from CFREQ" thing?
Another factor in this is that XZERO resets the pixel NCO phase (so if used, each scanline is an integer number of cycles), but not the carrier phase. So XFREQ can be an exact fraction of the master clock, but CFREQ can only be multiples of clkfreq/2³². Smart choice of system clock can help here. IMO at the clock ratios involved in SDTV (<= clkfreq/16) the XFREQ does not need to be an exact fraction, as long as XZERO is used to keep everything aligned.
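
To put a number on the carrier resolution mentioned above, here is a small sketch that assumes the colour carrier NCO is a plain 32-bit phase accumulator clocked at sysclk, as described in the post. The 283.75 MHz clock is the one used elsewhere in this thread; the helper names are illustrative.

```python
from fractions import Fraction

SYSCLK = 283_750_000

def cfreq_for(f_carrier):
    """Nearest 32-bit NCO increment for a carrier frequency in Hz."""
    return round(Fraction(f_carrier) * 2**32 / SYSCLK)

def carrier_from(cfreq):
    """Carrier frequency in Hz actually produced by a given increment."""
    return Fraction(cfreq * SYSCLK, 2**32)

step_hz = float(Fraction(SYSCLK, 2**32))      # roughly 0.066 Hz per CFREQ step

target = Fraction(443_361_875, 100)           # 4433618.75 Hz
cfreq = cfreq_for(target)                     # 67109242
error_hz = float(carrier_from(cfreq) - target)

print(step_hz, cfreq, error_hz)
```

So the carrier on its own can land within a fraction of a hertz of the nominal PAL value at this clock. The line length is still constrained to a whole number of sysclks when XZERO is used, which is a separate limitation.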

rogloh · 2024-11-27 02:28

@Wuerfel_21 said:
I assume you tried the obvious "just add/subtract 1 from CFREQ" thing?

You mean keep tweaking its value until I see it working nicer? Or something dynamic in the code? What is obvious here?

Another factor in this is that XZERO resets the pixel NCO phase (so if used, each scanline is an integer number of cycles), but not the carrier phase. So XFREQ can be an exact fraction of the master clock, but CFREQ can only be multiples of clkfreq/2³². Smart choice of system clock can help here. IMO at the clock ratios involved in SDTV (<= clkfreq/16) the XFREQ does not need to be an exact fraction, as long as XZERO is used to keep everything aligned.

Yeah I do an xzero per scanline so the phase error doesn't accumulate. It's always the correct number of clock cycles per line.

hsync                       xzero   m_sn, hsync1            'generate the sync pulse
                            wrlong  status, statusaddr      'update the sync status per line
dobreeze                    xcont   m_br, hsync0            'do breezeway before colour burst
                            setcq   cq                      'reapply CQ for PAL colour changes
doburst                     xcont   m_cb, colourburst       'do the PAL/NTSC colour burst
flipref                     xor     cq, palflipcq           'toggle PAL colour output per scanline
bp                          xcont   m_bv, hsync0            'generate the back porch
notify      _ret_           cogatn  #0-0                    'notify new scan line status during hsync

Wuerfel_21 · 2024-11-27 02:34

Obvious in the sense of "just wiggle the CFREQ value up and down and see if that improves it". Also, remember that the XFREQ needs to be rounded up if you want an exact fraction, otherwise there's an extra cycle after XZERO.

rogloh · 2024-11-27 02:56

Hey @Wuerfel_21 , are these colorspace values for component SD and component HD known to be correct in NeoVGA? I was going to use these too but didn't know if they were calculated as the working wasn't shown unlike what you did for PAL/NTSC, it's just raw values. Am talking about your params for BT.601 and Rec.709 below.

cmode_ypbpr601 ' YPbPr (with Rec. 601 matrix for SDTV)
long $00_00_00_00   ' DAC blanking value
long Y_BLANK        ' Blanking color
long Y_SYNC         ' HSync color
long Y_SYNC         ' VSync color
long Y_BLANK        ' HSync+VSync color
long 0              ' Color burst color (NTSC/PAL only)
long 0              ' Color burst frequency  (NTSC/PAL only)
long ( 45&$FF) << 24 + (-38&$FF) << 16 + ( -7&$FF) << 8 + 128   ' CY
long ( 27&$FF) << 24 + ( 53&$FF) << 16 + ( 10&$FF) << 8 + BLANK_LEVEL   ' CI
long (-15&$FF) << 24 + (-30&$FF) << 16 + ( 45&$FF) << 8 + 128   ' CQ
long 0              ' CQ XOR value (NTSC/PAL only)

cmode_ypbpr709 ' YPbPr (with Rec. 709 matrix for HDTV)
long $00_00_00_00   ' DAC blanking value
long Y_BLANK        ' Blanking color
long Y_SYNC         ' HSync color
long Y_SYNC         ' VSync color
long Y_BLANK        ' HSync+VSync color
long 0              ' Color burst color (NTSC/PAL only)
long 0              ' Color burst frequency  (NTSC/PAL only)
long ( 45&$FF) << 24 + (-41&$FF) << 16 + ( -4&$FF) << 8 + 128   ' CY
long ( 19&$FF) << 24 + ( 64&$FF) << 16 + (  7&$FF) << 8 + BLANK_LEVEL   ' CI
long (-10&$FF) << 24 + (-35&$FF) << 16 + ( 45&$FF) << 8 + 128   ' CQ
long 0              ' CQ XOR value (NTSC/PAL only)
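
One way to sanity-check these raw values is to regenerate them from the standard luma coefficients. The sketch below makes two assumptions that are inferred from the numbers rather than stated anywhere in the thread: that the rows are ordered CY = Pr, CI = Y, CQ = Pb, and that the scale factor is 90.

```python
def ypbpr_rows(kr, kb, scale):
    """Return the (Pr, Y, Pb) rows of R, G, B coefficients, rounded to integers."""
    kg = 1.0 - kr - kb
    y = (kr, kg, kb)
    pb = (-0.5 * kr / (1 - kb), -0.5 * kg / (1 - kb), 0.5)
    pr = (0.5, -0.5 * kg / (1 - kr), -0.5 * kb / (1 - kr))
    return [tuple(round(c * scale) for c in row) for row in (pr, y, pb)]

def differences(table, calc):
    """Element-wise (table - calculated) for each row."""
    return [tuple(t - c for t, c in zip(tr, cr)) for tr, cr in zip(table, calc)]

SCALE = 90   # inferred from the tables, not documented in the thread

# Rows as posted, in CY, CI, CQ order
table_601 = [(45, -38, -7), (27, 53, 10), (-15, -30, 45)]
table_709 = [(45, -41, -4), (19, 64, 7), (-10, -35, 45)]

calc_601 = ypbpr_rows(0.299, 0.114, SCALE)     # BT.601 Kr, Kb
calc_709 = ypbpr_rows(0.2126, 0.0722, SCALE)   # Rec. 709 Kr, Kb

diff_601 = differences(table_601, calc_601)
diff_709 = differences(table_709, calc_709)
print(diff_601)
print(diff_709)
```

Under those assumptions the BT.601 table matches exactly, and the Rec. 709 table differs only in the blue luma weight (7 in the table versus 6.498 before rounding). The posted luma row sums to exactly 90, so that entry may have been nudged deliberately.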

Wuerfel_21 · 2024-11-27 03:07

Ought to be correct, but test it if unsure.

I wonder when you're actually supposed to use which. Where does 480p "EDTV" fall? What decode matrices are TVs using de facto?

rogloh · 2024-11-27 03:55

I think SDTV/EDTV analog progressive component video like 480p/576p uses BT.601 and HDTV resolutions like 720p/1080i/p use the Rec. 709 one. However it's somewhat confusing. In fact after DVB-T DTV was initially released in Australia around 2001, some channel (7?) that broadcast 576p actually had the gall to call it "HD" so they didn't have to invest in more expensive equipment at the time proper HDTV came in. What a joke that was.
That link to that Demystified book I posted above had more details listed in Chapters 3 and 5.

TonyB_ · 2024-11-27 11:03

@rogloh
According to the BBC TV Signal Coding document, there should be diagonal lines moving slowly up or down when using an odd number of quarter cycles, which is true when fcsc = 283.75 * fline.

Does NTSC composite have less dot crawl than PAL? Just wondering what happens if you keep sysclk = 283.75 MHz and fcsc = sysclk / 64 and adjust fline a little:

fline = fcsc / 283.5,  sysclks/line @ 283.75 MHz = 18144
fline = fcsc / 283.75, sysclks/line @ 283.75 MHz = 18160  **tested**
fline = fcsc / 284.0,  sysclks/line @ 283.75 MHz = 18176

Dot crawl might be reduced and replaced by vertical colour banding for last example.
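
For reference, the three line lengths follow directly from fcsc = sysclk / 64, and the resulting line frequencies can be computed the same way. This is just a worked-arithmetic sketch with illustrative names.

```python
SYSCLK = 283_750_000
CARRIER_DIVIDER = 64            # fcsc = sysclk / 64

line_options = {}
for cycles_per_line in (283.5, 283.75, 284.0):
    sysclks_per_line = int(cycles_per_line * CARRIER_DIVIDER)
    fline = SYSCLK / sysclks_per_line
    deviation_pct = (fline / 15_625 - 1) * 100
    line_options[cycles_per_line] = (sysclks_per_line, fline, deviation_pct)
    print(cycles_per_line, sysclks_per_line, round(fline, 2), round(deviation_pct, 3))
```

The 283.5 and 284.0 variants move the line rate by a little under 0.1 % from the nominal 15625 Hz.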

rogloh · 2024-11-27 11:35

Hi @TonyB_
I tested your other two variants with 283.5 and 284.0 as the divisors. 283.5 was probably quite a bit better than the 284.0 case. I'd have to say on first impression of using 283.5, it probably looks better than 283.75 on my LCD monitor - still to drag it over and take a look with the Plasma in the other room. I really need a good way to switch out these values dynamically, or to show two different outputs from two COGs side by side on two monitors, so I can compare visually faster rather than recompute values, rebuild and reload etc. and in the meantime forget what the last one looked like. But that will take too long to set up for now. I'll see if I can reload XFREQ from keypress or something...

Update: yeah on the Plasma the 283.5 setting looked quite a bit better than the original 283.75 setting. I just can't understand why the correct value of 283.75 is not working out for PAL timing on the P2. I think I'll keep this 283.5 value for now as the best yet.

One thing I'm sort of wondering is if I do go and start to tweak the XFREQ value slightly until I think it's best visually, whether that will only work "best" on my own P2 given its crystal frequency will be different to other people's P2s, or whether it will still be applicable on other systems.

For our reference, these are the values I used that Chip's timing generator code computes for the modified 283.5 colour cycles/line and a 283.75MHz clock (with 908 total pixels/line) for PAL 50Hz
ntsc.start(1, 1, 908, 720, 0, 288, 0, 283750000, @timing[0])

xfreq 107469456
cfreq 67109231
breeze = 13 burst = 32
horiz 30 67 91 activepixels =720

TonyB_ · 2024-11-27 12:30

@rogloh said:
Update: yeah on the Plasma the 283.5 setting looked quite a bit better than the original 283.75 setting. I just can't understand why the correct value of 283.75 is not working out for PAL timing on the P2. I think I'll keep this 283.5 value for now as the best yet.

One thing I'm sort of wondering is if I do go and start to tweak the XFREQ value slightly until I think it's best visually, whether that will only work "best" on my own P2 given its crystal frequency will be different to other people's P2s, or whether it will still be applicable on other systems.

For our reference, these are the values I used that Chip's timing generator code computes for the modified 283.5 colour cycles/line and a 283.75MHz clock (with 908 total pixels/line) for PAL 50Hz
ntsc.start(1, 1, 908, 720, 0, 288, 0, 283750000, @timing[0])

xfreq 107469456
cfreq 67109231
breeze = 13 burst = 32
horiz 30 67 91 activepixels =720

I calculate that xfreq = 107468869 is more accurate for fline = fcsc / 283.5.
18144 / 908 = 19.9823788546
2^31 / 19.9823788546 = 107468868.628
Round up to 107468869

Might not make any difference. Daft question, you are testing interlaced video?

EDIT:
xfreq for 283.75 = 107374183 = $666_6666 + 1
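
The same calculation in integer arithmetic, as a minimal sketch. It follows the 2^31 scaling used in the working above and rounds up, as suggested earlier in the thread for XFREQ; the function name is illustrative.

```python
def xfreq_for(pixels_per_line, sysclks_per_line):
    """Smallest streamer NCO increment that reaches the exact pixel rate."""
    numerator = (1 << 31) * pixels_per_line
    return -(-numerator // sysclks_per_line)       # integer ceiling division

PIXELS_PER_LINE = 908

xfreq_283_5 = xfreq_for(PIXELS_PER_LINE, 18144)    # 107468869
xfreq_283_75 = xfreq_for(PIXELS_PER_LINE, 18160)   # 107374183

print(xfreq_283_5, xfreq_283_75, hex(xfreq_283_75))
```

Using integers avoids any doubt about floating-point rounding in the intermediate 19.98... sysclks-per-pixel value.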

My P2 Eval B has been out of action for a while. It was setup to work with an old Windows PC but that or the Eval was giving off a nasty burning smell and smoke. I switched both off and haven't looked for the cause yet.

rogloh · 2024-11-27 14:05

@TonyB_ said:

@rogloh said:

I calculate that xfreq = 107468869 is more accurate for fline = fcsc / 283.5.
18144 / 908 = 19.9823788546
2^31 / 19.9823788546 = 107468868.628
Round up to 107468869

Hmm, yeah I've also noticed that Chip's calculations don't give exactly the same results as something done manually using higher precision, e.g. via a Numbers spreadsheet on Mac. Could be down to single precision floats in the toolchain perhaps.
EDIT: actually looking at the code again, that doesn't make sense as the computation is done using MULDIV64 in SPIN2 and the frac operator. I wonder if some precision is getting lost there somehow.
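
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the posted snippet derives the pixel clock from pal_cf = 4_433_618 Hz, whereas the manual figure assumes fcsc = sysclk / 64 = 4433593.75 Hz. The Python below emulates the snippet under assumed Spin2 semantics (FRAC as an unsigned (a << 32) / b, ENCOD as the index of the top set bit, MULDIV64 as an unsigned multiply then divide) and assumes x_total = 908 with 283.5 cycles per line, matching the test described earlier.

```python
def muldiv64(a, b, c):
    """Assumed Spin2 MULDIV64: unsigned a * b / c with a 64-bit intermediate."""
    return (a * b) // c

def frac(a, b):
    """Assumed Spin2 FRAC: unsigned (a << 32) / b."""
    return (a << 32) // b

def encod(x):
    """Assumed Spin2 ENCOD: index of the highest set bit."""
    return x.bit_length() - 1

def chip_timing(x_total, cycles_per_line, carrier_hz, freq):
    """Emulate the posted xf/cf calculation for one mode."""
    cc = round(cycles_per_line * 4.0)                      # colour cycles per line * 4
    dotf = muldiv64(x_total, carrier_hz * 4 * 128, cc)     # pixel clock * 128
    cf = frac(carrier_hz, freq)
    i = 31 - encod(freq)
    xf = (frac(dotf >> (7 - i), freq << i) + 1) >> 1
    return xf, cf

xf, cf = chip_timing(908, 283.5, 4_433_618, 283_750_000)
manual_xf = -(-((1 << 31) * 908) // 18144)                 # assumes fcsc = sysclk / 64

print(xf, cf)            # 107469456 67109231
print(xf - manual_xf)    # 587
```

Under these assumptions the emulation reproduces the reported xfreq and cfreq, and the gap of 587 counts is about 5.5 ppm, the same ratio as 4433618 to 4433593.75. That would point to the choice of carrier frequency rather than lost precision.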

Might not make any difference. Daft question, you are testing interlaced video?

Yes.

EDIT:
xfreq for 283.75 = 107374183 = $666_6666 + 1

rogloh · 2024-11-28 02:15

BTW I tried out @Wuerfel_21 's SDTV and HDTV component colourspace settings on my LCD and Plasma TV. RGBI colour swatches look reasonable on first impression so there's not some gross error apparent. I don't do colorimetry or have calibration gear etc so wouldn't know if it's accurate but it looks okay to me. Photo shows it as very grainy, looks far better in real life although I wasn't using a proper coax - this is 720p50 component HD.

I've also added a bunch of new stock resolution timings to the driver code. Not all are supported by my test gear here so YMMV. I did test 720p50, 720p60, 1080p24, 1080i50, 1080i60 and they worked on my plasma TV. Not sure progressive 1080p50 and 1080p60 are supported over component on my plasma but I should try that out too. Update: nope, nothing displayed, unless it needs a proper coax instead of the cheap old DVD component cables I used. I recall you weren't allowed to do 1080p50/60 over component with consumer gear. FTS.

PUB getTiming(resolution) : result
' returns pre-defined timings for common video resolutions. 
    case resolution
        RES_640x350         : return @ega_timing
        RES_640x400         : return @vga400_timing
        RES_640x480         : return @vga_timing
        RES_720x480         : return @sdtv_480p
        RES_720x576         : return @sdtv_576p
        RES_800x600         : return @svga_timing
        RES_800x600_DVI     : return @svga_dvi_timing
        RES_1024x768        : return @xga_timing
        RES_1280x720p24     : return @hd24_timing
        RES_1280x720p50     : return @hd50_timing
        RES_1280x720p60     : return @hd60_timing
        RES_1280x1024       : return @sxga_timing
        RES_1440x900        : return @wsxga_timing
        RES_1400x1050       : return @wsxga_plus_timing
        RES_1600x1200       : return @uxga_timing
        RES_1920x1080       : return @fullhd60_timing ' default full HD mode
        RES_1920x1080p24    : return @fullhd24_timing
        RES_1920x1080p25    : return @fullhd25_timing
        RES_1920x1080p50    : return @fullhd50_timing
        RES_1920x1080p30    : return @fullhd30_timing
        RES_1920x1080p60    : return @fullhd60_timing
        RES_1920x1080i50    : return @fullhd50i_timing
        RES_1920x1080i60    : return @fullhd60i_timing
        RES_1920x1200       : return @wuxga_timing
        other               : return @vga_timing ' default to VGA resolution

TonyB_ · 2024-12-13 11:14

@rogloh said:

@TonyB_ said:

@rogloh said:

I calculate that xfreq = 107468869 is more accurate for fline = fcsc / 283.5.
18144 / 908 = 19.9823788546
2^31 / 19.9823788546 = 107468868.628
Round up to 107468869

Hmm, yeah I've also noticed that Chip's calculations don't give exactly the same results as something done manually using higher precision, e.g. via a Numbers spreadsheet on Mac. Could be down to single precision floats in the toolchain perhaps.
EDIT: actually looking at the code again, that doesn't make sense as the computation is done using MULDIV64 in SPIN2 and the frac operator. I wonder if some precision is getting lost there somehow.

Might not make any difference. Daft question, you are testing interlaced video?

Yes.

Just wondering whether dot crawl for non-interlaced PAL composite is better, the same or worse than for interlaced.

rogloh · 2024-12-13 11:35

From my recollection non-interlaced was always better than interlaced with PAL in terms of quality, but I've not worked on it for a couple of weeks. I believe it's looking much better in general now vs what I had before and I've revamped the default driver timings - of course it can still be tweaked manually if required. Right now I have some other stuff I want to do before Xmas so won't have time to look at it any more for now.

Wuerfel_21 · 2024-12-29 19:38

I am once again violently reminded of the lack of doubling for PSRAM source buffers

rogloh · 2024-12-29 20:47

LOL, yeah that can't be done at the moment. In theory it might be doable with the other analog outputs if the code is changed to use the streamer rate change for pixel doubling and I'm at least moving in that direction.
To double DVI you'd need to have a three line scan buffer. One filling with PSRAM contents, one being doubled, and another being output. My driver uses a two scan line buffer.

Something that may ameliorate this restriction is that if you want to pixel double, it's more likely you can probably use HUB RAM given it takes less memory to hold a 320x240 frame for example. And if you want to use larger resolutions with PSRAM you probably don't need to double in the first place. Not perfect I know, but at least it helps a little. Also you can still line double with PSRAM IIRC, just not double the pixel widths - though this will create non-square pixels.

evanh · 2024-12-29 23:27

DVI/HDMI monitors are far more forgiving of wild timings. Try just outputting a 320x240 mode straight. It does depend on the particular monitor but many will happily scale up something very low res for you. They can certainly all tell what you're putting out; it's just a question of whether a given monitor wants to accept it or not.

Minimum dotclock is still 25 MHz ... so blanking would need to be extended accordingly to retain 50 or 60 Hz refresh. I found Hsync fixed at 64 dots wide and centred in the blanking worked well.
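
To get a feel for how much blanking that implies, here is a rough sketch of the frame budget for a 320x240 active area at the 25 MHz minimum dot clock. It only does the arithmetic; how the blanking is split between horizontal and vertical is left open, and the names are illustrative.

```python
DOTCLOCK_MIN = 25_000_000
H_ACTIVE, V_ACTIVE = 320, 240

def frame_budget(refresh_hz):
    """Return (dot clocks per frame, fraction of them that are active pixels)."""
    total = DOTCLOCK_MIN / refresh_hz
    active = H_ACTIVE * V_ACTIVE
    return total, active / total

for refresh in (50, 60):
    total, active_fraction = frame_budget(refresh)
    print(refresh, round(total), f"{active_fraction:.1%}")
```

At 60 Hz only about 18 % of each frame's dot clocks carry active pixels, so the rest has to be absorbed by extended blanking.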

Vsync of 2 lines did the job too. Vsync placement can be anywhere that suits you. I placed it in the middle of blanking when blanking was large to fully utilise Roger's API limitations. Otherwise I used a front porch of 1 line.

rogloh · 2024-12-30 00:04

@evanh said:
Vsync of 2 lines did the job too. Vsync placement can be anywhere that suits you. I placed it in the middle of blanking when blanking was large to fully utilise Roger's API limitations. Otherwise I used a front porch of 1 line.

The updated version I am now working with uses 16 bits per sync timing parameter and should allow way more vertical and horizontal blanking - far more than makes sense even.

Wuerfel_21 · 2024-12-30 00:10

@rogloh said:
Something that may ameliorate this restriction is that if you want to pixel double, it's more likely you can probably use HUB RAM given it takes less memory to hold a 320x240 frame for example. And if you want to use larger resolutions with PSRAM you probably don't need to double in the first place. Not perfect I know, but at least it helps a little. Also you can still line double with PSRAM IIRC, just not double the pixel widths - though this will create non-square pixels.

Double 320x240 16bpp buffers do fit in Hub RAM, but leave little space for anything else. The idea is to reclaim the second buffer by doing a sort of triple-buffer scheme with one hub buffer and two external ones that finished frames can be copied into.
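
The memory budget behind this, as a quick sketch assuming the P2's 512 KB of hub RAM:

```python
HUB_RAM_BYTES = 512 * 1024                 # P2 hub RAM

frame_bytes = 320 * 240 * 2                # one 16bpp frame: 153600 bytes
double_buffer_bytes = 2 * frame_bytes      # 307200 bytes

left_with_two = HUB_RAM_BYTES - double_buffer_bytes   # 217088 bytes
left_with_one = HUB_RAM_BYTES - frame_bytes           # 370688 bytes

print(frame_bytes, left_with_two, left_with_one)
```

Keeping only one frame in hub RAM and moving finished frames to external memory frees a further 150 KB for code and data.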

rogloh · 2024-12-30 00:26

@Wuerfel_21 said:
Double 320x240 16bpp buffers do fit in Hub RAM, but leave little space for anything else. The idea is to reclaim the second buffer by doing a sort of triple-buffer scheme with one hub buffer and two external ones that finished frames can be copied into.

Yeah that could work out. You'd just have to manage the copying into HUBRAM from PSRAM yourself. Given you can do block copies in the background if you use my driver, that may not be too onerous, though it can burn one extra COG.

evanh · 2024-12-30 01:13

@rogloh said:
The updated version I am now working with uses 16 bits per sync timing parameter and should allow way more vertical and horizontal blanking - far more than makes sense even.

Gimme!

Wuerfel_21 · 2024-12-30 02:36

@rogloh said:

@Wuerfel_21 said:
Double 320x240 16bpp buffers do fit in Hub RAM, but leave little space for anything else. The idea is to reclaim the second buffer by doing a sort of triple-buffer scheme with one hub buffer and two external ones that finished frames can be copied into.

Yeah that could work out. You'd just have to manage the copying into HUBRAM from PSRAM yourself. Given you can do block copies in the background if you use my driver, that may not be too onerous, though it can burn one extra COG.

Block copy really doesn't take that long, so for this simple testbed/demo program it's fine to do it synchronously for now (currently runs at ~15 FPS). I swapped in some old version of my video driver that I already modified for PSRAM operation and that can scale the framebuffer just fine.

pik33 · 2025-01-06 09:27

A video driver I use in my stuff (Basic interpreter, player) works like this:

in line #x:
- preload line #x+2 from the PSRAM to the hub ram cache. The cache keeps 4 lines.
- draw sprites on line #x+1
- stream the current line #x to the video output

I don't do the pixel doubling in the current version of the driver but having the line in the hub should make it easy - I have older versions of my drivers that have no sprites, but can multiply pixels vertically and horizontally (x1,2,4,8)
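
The per-line schedule described above could be modelled roughly as follows. This is only an illustration of the idea, not the actual driver code: the slot mapping (line number modulo 4) and the helper names are assumptions.

```python
CACHE_LINES = 4    # the hub RAM cache keeps 4 lines

def schedule(x):
    """Cache slots touched while line x is being displayed (assumed mapping)."""
    return {
        "preload": (x + 2) % CACHE_LINES,   # PSRAM -> hub for line x + 2
        "sprites": (x + 1) % CACHE_LINES,   # sprites drawn onto line x + 1
        "stream": x % CACHE_LINES,          # line x goes to the video output
    }

def multiply_line(pixels, factor):
    """Horizontal pixel multiplication of a line already held in hub RAM."""
    return [p for p in pixels for _ in range(factor)]

for x in range(4):
    print(x, schedule(x))
print(multiply_line([1, 2, 3], 2))   # [1, 1, 2, 2, 3, 3]
```

With four slots the three stages never touch the same slot during a given line, which leaves one slot spare.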
