Not sure of ideal sync positions. Some use large sync length with equal sized small porches, others use 50% blanking as back porch, while others still use tight leading sync with huge back porch.
So I have changed it to now use 44 for the hsync width and 88 for the front porch. It's slightly different now but my LCD still won't show all the characters.
Here's the updated build with this timing change. It should show 240 columns of text, but after re-syncing the monitor I see only about 202 columns (~1600 pixels), so it might not be detecting the mode correctly. I know my video driver can already hit 240 columns in 1920x1200 at 308MHz - just double checked it now too. That's the native LCD resolution.
Give it a try if you can and see if it helps you...?
I wonder if the monitor is somehow over-scanning etc because it thinks this is a TV resolution...? This is pretty heavily reduced blanking too I guess.
So I have changed it to now use 44 for the hsync width and 88 for the front porch. It's slightly different now but my LCD still won't show all the characters.
Huh, that one is working for me on the cheapo TV ... and also fine on a 1920x1200 monitor. Cool. Impressive how little an adjustment that was.
I tried both of your suggestions just now, unfortunately the Dell LCD I have doesn't like them. Both will sync but still show the same problem. The monitor does say it is 1920x1080@60Hz in the info, but it still crops/scales it, despite going to 1:1 pixels and resyncing etc. Might just be my monitor's limitations; perhaps 1080 is not really supported over VGA and normally works only via one of the component video inputs, which it overscans.
Same problem. Also tried 160-40-80 too, same again. My monitor doesn't seem to like this horizontal 67.5kHz timing with 1125 total lines and tries to scale it. I'm also playing around with other numbers to see if I can get it to work at 50Hz instead.
Massive blanking increase! Not reduced timing any longer. That's exactly what I've found in the past with LCDs - make the blanking too short for the monitor and you lose the right edge of the picture.
That's how they do 50Hz with 1080p HD by sharing the same dot clock and same number of total lines with the 60Hz version. They just pad out the horizontal front porch.
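As a rough check of that padding trick, here is a small Python sketch. It assumes the standard 1080p dot clock of 148.5 MHz, which is not stated in the thread; the 1125 total lines and the 67.5 kHz line rate are the figures quoted above. The function name is just for illustration.

```python
# Assumption: standard 1080p dot clock of 148.5 MHz (not stated in the thread).
PIXEL_CLOCK_HZ = 148_500_000
TOTAL_LINES = 1125   # total lines per frame, as quoted in the thread
H_ACTIVE = 1920      # visible pixels per line


def h_total_for(refresh_hz):
    """Total pixels per scan line (active + blanking) for a given refresh rate."""
    pixels_per_frame, remainder = divmod(PIXEL_CLOCK_HZ, refresh_hz)
    assert remainder == 0, "refresh rate does not divide the dot clock evenly"
    h_total, remainder = divmod(pixels_per_frame, TOTAL_LINES)
    assert remainder == 0, "line count does not divide the frame evenly"
    return h_total


for hz in (60, 50):
    h_total = h_total_for(hz)
    line_rate_khz = PIXEL_CLOCK_HZ / h_total / 1000
    line_time_us = h_total / PIXEL_CLOCK_HZ * 1e6
    print(f"{hz} Hz: h_total={h_total}, h_blank={h_total - H_ACTIVE}, "
          f"line rate={line_rate_khz:.2f} kHz, line time={line_time_us:.2f} us")

# Extra blanking pixels per line that the 50 Hz variant has to absorb.
extra_blanking = h_total_for(50) - h_total_for(60)
```

Under those assumptions the line grows from 2200 to 2640 pixels, so 440 extra blanking pixels per line, and each scan line lasts roughly 17.8 us instead of 14.8 us.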
This 50Hz operation has boosted the performance quite a bit too. The scan line is longer now in time, and allows more opportunities for other COGs to get access after video reads its 1920 bytes. Pixels/s is boosted up to 174500!
This is definitely going to be fast enough to build a responsive 1080p frame buffer based GUI with the P2 with HyperRAM. I just need to check the mouse sprite render time isn't impacted. Will have to back off the write bursts of Round Robin COGs slightly if it is.
For those without HyperRAM boards here's a quick movie I took of this graphics demo. I only recorded the top left corner for slightly better readability, the full monitor area is much bigger than this (240 columns total) and it is all being written to. Apologies for my camera auto-focus changing as the brightness changes.
Reminds me a lot of the benchmark thing that comes with Allegro 4. If I had my DOS computer set up right now, I'd use it to check how fast that can draw lines. My current PC gets some 9 million solid horizontal lines per second in GDI mode...
One thing I found after adjusting the porches for my 1080p50Hz timing was that the mouse cursor has no time to be drawn in the back porch at this clock speed. I tweaked the timing and moved 104 pixels from front porch to back porch, and was then able to get my LCD monitor to sync the 1080 mode. It now has enough time for the mouse sprite to be drawn and still keep up.
It's quite tricky to have a mouse drawn in time with the external memory because unlike the normal hub RAM sourced modes you need to delay until the end of the line has arrived before you can start rendering it (in case the mouse is positioned right at the right end of the line). With an external memory source, there are plenty of cycles after the mailbox request is issued where we could try to draw a mouse but unfortunately we can't use these because we need to delay until after hsync to account for the worst latency before the full data arrives. This may need some further thought at 60Hz too which could be a problem...
Thanks, so I take it you got it to run okay at 1080p on your monitor Brian?
Yep, running fine.
I'll let it run all day on my glob top board.
Cool. If it's anything like my own setup it should run ok all day at normal temperature anyway.
I think I found out why 1080p60 over VGA doesn't work right on the Dell 2405FPW. It was never a properly supported resolution apparently on this older monitor. I guess they wanted people to upgrade to their next model...people have asked about this here:
At least I can use 1920x1080@50Hz on this thing and still get it pixel perfect (with narrow black bars top/bottom).
Pushing up to its native 1920x1200 with the external frame buffer is just a fraction too high. I can get both my drivers to run this fast and see a full image, but there is some slight pixel fuzz/corruption operating at 308MHz which looks like the HyperRAM signal quality is borderline. This was at 60Hz but I don't think this LCD can accept 1920x1200@50Hz. I guess I could experiment further at some point with the timing. Maybe also try the registered clock signal instead of pass through, as evanh recommended.
The 1080p60 resolution with external RAM and drawing a mouse sprite is an issue. I see screen distortion when I try to enable it (which shows I am running out of instruction cycles) and I can't drop the blanking enough to get it to have enough time to output in 8bpp mode unlike the 50Hz variant of 1080. I'm still looking into it but I might have finally reached the processing limit after all this time! Maybe I could wait for a HyperRAM driver acknowledgment via COGATN and then try to render the mouse before the horizontal front porch and sync pulse starts, but this way may be prone to driver lockup problems and it could exceed the last 3 video driver instructions available to me if I steal them back from a recent interlaced text misprovisioning fix (which is non-essential).
Update: I found I could get a 1920x1200@60Hz 4bpp display working at sysclk/2 HyperRAM transfer rate with the P2 @308MHz, but again with no mouse. Still it could become handy if you have a monitor this size and want to make the most of its resolution.
Update2: I found I could get the mouse to render with 1920x1080@60Hz if I wait for the mailbox request completion then draw the mouse. It works as long as the video service jitter remains properly bounded and enforced by the HyperRAM driver COG. It seems like it would be a bit too fragile though and would cause video lockup issues if the external memory driver hangs or something else ever overwrites its mailbox. I'm not sure I like/trust it 100% but at least there is a way.
Time to step up to sysclock/1 for writes. :P And HyperBus V2 parts.
Yeah. Having sysclk/1 writes would be better for horizontal lines and block fills and copies. It won't help the random access rate much though, as the overhead for small access sizes swamps the transfer time.
I sort of wonder whether this whole P2 data sampling timing problem with HyperRAM would ever truly be totally solved for the reads. Can any external circuitry that uses the RWDS output from the HyperRAM ever help us and make the programmable read delay totally temperature/frequency independent and alleviate those sensitive operating regions? We sort of need to use the HyperRAM's RWDS to clock the read data into a latching element, and have the P2 drive out the read clock it can use to sample its data from on the other side, a bit like an elastic FIFO. Not really sure what the best circuit for that would be...or is this just an inherent problem with these RAMs and the way the P2 captures its input data?
No, there is no true synchronous clock input for I/O on the prop2. It can't be a slave like that. RWDS won't do anything useful there. The way a sync clock is utilised on the prop2 is by oversampling an ordinary input with the sysclock, eg: SPI smartpin watching for a clock transition to decide if data bit is valid. This means data rate has to be below sysclock/2. In other words, no more than sysclock/3.
So that leaves us with only the method we already are using to get a data rate equal to sysclock. Synchronised by the sysclock itself ... but has the read data latencies we have to compensate for.
The greatest gains will always come from shorter tracks between prop2 and HR - tighter board design. I'm hoping this will at least widen the usable frequency bands and narrow the bad bands as well as push them all higher up. I'm not expecting this to fix the temperature gradient.
I guess when trying to exceed 300 MB/s, the V2 parts will help too but at this stage it's just guessing. It may only improve the read data frequency bands a little. I'd be happy with that though.
Roger,
For the HyperRAM opportunities, is it preferable to have larger v-blanking or h-blanking?
EDIT: What I mean is that certain display modes are quite flexible as to what sync rates can be thrown at them. There are sometimes options as to how much time goes to h-blanking vs v-blanking. And for all modes there is always a large usable range of dot clocks.
The only thing the monitors and TVs struggle with is deciding on what the resolution coming at them is. Which seems to be decided on just by the syncs. Primarily h-sync.
EDIT2: The down side to this arrangement is they can only support resolutions that are stored in their internal mode list.
There is a well supported display mode of 848x480@60. Its VESA listing shows a dot clock of 33.75 MHz but I've happily trimmed the blanking down to bring it in at 30 MHz while still holding at 60 Hz v-sync. Operating at 300 MHz for DVI, I guess is still too fast for the HyperRAM though.
On the other hand, maybe sticking with sysclock/2 for the moment is better anyway. Revisit sysclock/1 with a designed project board for the job.
Roger,
For the HyperRAM opportunities, is it preferable to have larger v-blanking or h-blanking?
If you have a choice, more horizontal blanking time is better. It gives other COGs' HyperRAM requests more time to be serviced after the video transfer completes, so more non-video bursts fit on each scan line, which helps short access performance. It also improves responsiveness, because more round-robin COG requests get serviced sooner. Extra vertical blanking doesn't really help by comparison, unless the video driver had a way to notify the HyperRAM driver that it is in vertical blanking so that it could make longer use of the spare time (which we don't have currently). I do fragment the round-robin bursts anyway, so a long burst from one of those COGs can be partially completed on each scan line even if there isn't enough time on a line to complete the entire burst. That's the magic and tricky part of it all.
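To make the fragmenting idea concrete, here is a toy Python model. It is only a sketch and not the actual driver logic: the function name, the abstract "transfer units", the per-turn burst cap and all the numbers are made up for illustration. Each scan line services video first, then hands the leftover time to queued COG requests in round-robin order, and anything unfinished carries over to the next line.

```python
from collections import deque


def schedule_lines(video_cost, line_budget, max_burst, requests, max_lines=1000):
    """Toy round-robin scheduler (hypothetical, not the real HyperRAM driver).

    video_cost  -- units of each line consumed by the video fetch
    line_budget -- total units available per scan line
    max_burst   -- largest chunk one COG may transfer per turn
    requests    -- dict of COG name -> units still to transfer
    Returns one list per scan line of (name, units) chunks serviced.
    """
    queue = deque(requests.items())
    log = []
    while queue and len(log) < max_lines:
        spare = line_budget - video_cost      # time left after video is served
        served = []
        while queue and spare > 0:
            name, remaining = queue.popleft()
            chunk = min(remaining, spare, max_burst)
            spare -= chunk
            served.append((name, chunk))
            if remaining > chunk:
                # Unfinished burst goes to the back of the queue and resumes later.
                queue.append((name, remaining - chunk))
        log.append(served)
    return log


log = schedule_lines(video_cost=60, line_budget=100, max_burst=25,
                     requests={"cog1": 70, "cog2": 10})
for line_number, served in enumerate(log, start=1):
    print(line_number, served)
```

In this made-up case the 70-unit request from "cog1" is split into four chunks across two scan lines, while the short "cog2" request still gets serviced on the first line. A wider blanking interval corresponds to a larger `line_budget - video_cost`, which is why extra h-blanking helps short accesses in this model.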
There is a well supported display mode of 848x480@60. Its VESA listing shows a dot clock of 33.75 MHz but I've happily trimmed the blanking down to bring it in at 30 MHz while still holding at 60 Hz v-sync. Operating at 300 MHz for DVI, I guess is still too fast for the HyperRAM though.
On the other hand, maybe sticking with sysclock/2 for the moment is better anyway. Revisit sysclock/1 with a designed project board for the job.
I found I can just make HyperRAM work ok at 297MHz with sysclk/1 reads and your 300MHz is pretty close to that. I did start to see problems when I tried 308MHz though.
A panel with only 848 pixels in 8bpp mode running sysclk/1 transfers at 300MHz could have its video request serviced in something like ~3-4us on the line (single transfer). This leaves in the vicinity of 26us or more for video writes and other non-video accesses, which is a lot of remaining memory performance. Even that panel in truecolour mode still leaves more than half the memory bandwidth remaining.
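A back-of-envelope Python sketch of that estimate follows. It assumes one byte moves per sysclk at sysclk/1 (one per two sysclks at sysclk/2) and borrows the line length from the 30 MHz 848x480 timing posted in this thread (848 + 16 + 88 + 88 pixels). It counts raw burst time only, so command and latency overhead would come on top; the names are illustrative.

```python
SYSCLK_HZ = 300_000_000
PIXEL_CLOCK_HZ = 30_000_000
H_VISIBLE = 848
H_TOTAL = 848 + 16 + 88 + 88   # assumed from the 30 MHz timing in this thread

LINE_US = H_TOTAL / PIXEL_CLOCK_HZ * 1e6   # duration of one scan line


def raw_fetch_us(bytes_per_pixel, sysclk_divider):
    """Raw burst time for one line of video, ignoring command/latency overhead.

    Assumes one byte is transferred every `sysclk_divider` system clocks.
    """
    total_bytes = H_VISIBLE * bytes_per_pixel
    return total_bytes * sysclk_divider / SYSCLK_HZ * 1e6


for label, bytes_per_pixel in (("8bpp", 1), ("16bpp", 2), ("32bpp", 4)):
    for divider in (1, 2):
        busy_us = raw_fetch_us(bytes_per_pixel, divider)
        print(f"{label} sysclk/{divider}: {busy_us:5.2f} us of {LINE_US:.2f} us "
              f"({100 * busy_us / LINE_US:.0f}% of the line)")
```

With these assumptions the raw 8bpp fetch at sysclk/1 is about 2.8 us, which fits the ~3-4 us estimate once overhead is added, and the 32bpp fetch at sysclk/1 takes roughly a third of the line, so more than half the bandwidth is still free.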
Sysclk/2 is much safer though at 300MHz, and should still leave heaps of extra bandwidth in 8-16bpp modes with 848 pixel wide displays. There should even be a fair bit left in 32bpp modes.
Cool, here's the config for 30 MHz that even an old 5:4 LCD monitor will sync to with full area coverage. Naturally it only perceives 640x480 though, so isn't really of value.
hd848x480_timing    'resolution 848x480, 59 Hz with 30.0 MHz pixel clock
            long   CLK300MHz
            long   300000000
                   '_HSyncPolarity___FrontPorch__SyncWidth___BackPorch__Columns
                   '     1 bit         7 bits      8 bits      8 bits    8 bits
            long   (SYNC_POS<<31) | ( 16<<24) | ( 88<<16) | ( 88<<8 ) | (848/8)
                   '_VSyncPolarity___FrontPorch__SyncWidth___BackPorch__Visible
                   '     1 bit         8 bits      3 bits      9 bits   11 bits
            long   (SYNC_POS<<31) | (  1<<23) | (  2<<20) | (  6<<11) | 480
            long   10 << 8
            long   0
            long   0   ' reserved for CFRQ parameter
Casting that aside and cutting deep into h-blanking, a modern TV can even shrink down to 28 MHz dot clock:
hd848x480_timing    'resolution 848x480, 60 Hz with 28.0 MHz pixel clock
            long   CLK280MHz
            long   280000000
                   '_HSyncPolarity___FrontPorch__SyncWidth___BackPorch__Columns
                   '     1 bit         7 bits      8 bits      8 bits    8 bits
            long   (SYNC_POS<<31) | (  8<<24) | ( 40<<16) | ( 40<<8 ) | (848/8)
                   '_VSyncPolarity___FrontPorch__SyncWidth___BackPorch__Visible
                   '     1 bit         8 bits      3 bits      9 bits   11 bits
            long   (SYNC_POS<<31) | (  3<<23) | (  2<<20) | ( 14<<11) | 480
            long   10 << 8
            long   0
            long   0   ' reserved for CFRQ parameter
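For anyone wanting to sanity-check numbers like these before flashing, here is a small Python sketch that packs the two timing longs using the bit layout from the comments above and works out the resulting refresh rate. Two things are assumptions on my part: that the horizontal porch and sync values are in pixels (only the column count is divided by 8), and that the `10 << 8` entry is the sysclk to pixel clock divider. The real value of `SYNC_POS` isn't shown in the thread, so polarity is just passed in as a bit.

```python
def pack_h(polarity, front, sync, back, visible):
    """Horizontal timing long: 1 + 7 + 8 + 8 + 8 bits, columns stored as visible/8."""
    assert polarity in (0, 1)
    assert front < (1 << 7) and sync < (1 << 8) and back < (1 << 8)
    assert visible % 8 == 0 and visible // 8 < (1 << 8)
    return (polarity << 31) | (front << 24) | (sync << 16) | (back << 8) | (visible // 8)


def pack_v(polarity, front, sync, back, visible):
    """Vertical timing long: 1 + 8 + 3 + 9 + 11 bits."""
    assert polarity in (0, 1)
    assert front < (1 << 8) and sync < (1 << 3) and back < (1 << 9) and visible < (1 << 11)
    return (polarity << 31) | (front << 23) | (sync << 20) | (back << 11) | visible


def refresh_hz(sysclk_hz, divider, h, v):
    """Refresh rate from (front, sync, back, visible) tuples for each axis."""
    pixel_clock = sysclk_hz / divider   # assumed meaning of the `10 << 8` entry
    return pixel_clock / (sum(h) * sum(v))


MODES = {
    "30 MHz": (300_000_000, 10, (16, 88, 88, 848), (1, 2, 6, 480)),
    "28 MHz": (280_000_000, 10, (8, 40, 40, 848), (3, 2, 14, 480)),
}

for name, (sysclk, divider, h, v) in MODES.items():
    print(f"{name}: h_total={sum(h)}, v_total={sum(v)}, "
          f"refresh={refresh_hz(sysclk, divider, h, v):.2f} Hz, "
          f"h_long=${pack_h(1, *h):08X}, v_long=${pack_v(1, *v):08X}")
```

Under those assumptions the 30 MHz table comes out at about 58.99 Hz and the 28 MHz one at about 59.95 Hz, which agrees with the "59 Hz" and "60 Hz" comments in the tables.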
Interesting result. I wonder if 960x540@60Hz is achievable with 30MHz. Keep in mind that the custom timings can't be shrunk all the way down to zero; there are some software limits in the code. You may begin to hit them at some point, especially if you enable borders/mouse etc., which use that spare time too. A good thing with the DVI output is that the P2:pixel clock ratio is high at 10, giving the CPU lots more cycles to play with.
Comments
Not sure of ideal sync positions. Some use large sync length with equal sized small porches, others use 50% blanking as back porch, while others still use tight leading sync with huge back porch.
https://timetoexplore.net/blog/video-timings-vga-720p-1080p
EDIT: Through personal experience, I've found that using a tiny front porch usually gives most leeway. So something like 1920-16-48-216.
Give 1920-44-44-192 a try. It might work better for you.
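A quick Python sketch comparing the two suggested horizontal timings. It reads the shorthand as visible-front porch-sync-back porch, which is my inference from the "tiny front porch" remark, and it assumes the standard 148.5 MHz 1080p60 dot clock; the function name is illustrative.

```python
PIXEL_CLOCK_HZ = 148_500_000   # assumed standard 1080p60 dot clock


def parse_h_timing(text):
    """Split 'visible-front-sync-back' shorthand into named fields plus the total."""
    visible, front, sync, back = (int(part) for part in text.split("-"))
    return {
        "visible": visible,
        "front_porch": front,
        "sync": sync,
        "back_porch": back,
        "total": visible + front + sync + back,
    }


for spec in ("1920-16-48-216", "1920-44-44-192"):
    timing = parse_h_timing(spec)
    line_rate_khz = PIXEL_CLOCK_HZ / timing["total"] / 1000
    print(f"{spec}: total={timing['total']} pixels, "
          f"blanking={timing['total'] - timing['visible']}, "
          f"line rate={line_rate_khz:.2f} kHz")
```

Both variants add up to 2200 pixels per line, so the 67.5 kHz line rate is unchanged; only the split of the 280 blanking pixels between the porches and the sync pulse differs.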
Attached is this 50Hz version for 1080p.
EDIT: Or it detects the wrong resolution.
Very nice work Roger.
I think I found out why 1080p60 over VGA doesn't work right on the Dell 2405FPW. It was never a properly supported resolution apparently on this older monitor. I guess they wanted people to upgrade to their next model...people have asked about this here:
https://www.dell.com/community/Monitors/Full-HD-1080p-on-Dell-1920x1200-display-such-as-2405FPW-or/td-p/2498149
Casting that aside and cutting deep into h-blanking, a modern TV can even shrink down to 28 MHz dot clock.
Tested with both VGA and HDMI.