All PASM2 gurus - help optimizing a text driver over DVI?

potatohead · 2019-10-27 17:12

I think the syncs are

rogloh wrote: »

I should also try 60Hz then too (if I can get that high).

Update: Tried it at 350MHz with 70MHz dot clock sending 1920x540@60Hz, didn't help. Now it thinks it's 640x480 res. So doesn't look good for a 1920 wide mode at this point. Maybe an HDTV might do better if it takes 30Hz signals over VGA or possibly component.

Some HDTV sets recognize 24p modes. There are Blu Ray discs that output that for cinematography reasons.

evanh · 2019-10-27 23:40

Every 50 Hz capable TV can do that. It's just an interlaced signal in which both fields come from the same frame in time.

rogloh · 2019-10-28 00:01

I think I've almost hit about the limit of my code now....

1920x1440x4bpp@50Hz with a 350MHz P2 clock. Looks amazingly fine on that Sony CRT, if slightly flickery due to the low refresh.

I've now got to try it on the 1920x1200 LCD.

Update: I can't quite get the LCD to recognize 1920x1200 at the proper resolution yet. It thinks it is 1600x1200 and scales accordingly, but it does look reasonably good. I was sending the timing in the link below, but at ~51Hz refresh with the P2 at 325MHz, as I didn't want to clock the P2 up to the 386MHz it would need. There's probably an official reduced blanking variant this monitor will actually recognize if I get the timing right for it.

http://tinyvga.com/vga-timing/1920x1200@60Hz

OMFG, it even runs 8bpp at this res too.

evanh · 2019-10-28 00:53

Oh, this is all analogue RGB (VGA) signalling! Took me a while to realise you aren't using the HDMI accessory. I gather you need the extra sysclocks for the scan line handling features. Active cooling time.

rogloh · 2019-10-28 01:34

Yes I should have mentioned that evanh. I am back on VGA at the moment, though I've been on DVI as well (via HDMI accessory board). This driver will do both, selectable at init time.

While I thought I'd hit the limit, I hadn't, as I was factoring in pixel doubling. Without that enabled I can actually get 16bpp at 1920x1200 now. Looks amazing, albeit scaled down to 1600 size on my LCD until I figure out a way to convince the LCD it really is 1920x1200. That photo repeated several times is 640x350x16bpp IIRC. I am using per scanline skew to rewind it on the start of the next scanline. You can see I am running out of hub memory, the upper 512kB returns black, then the images start again. Looks awesome on the monitor.
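To put the hub memory remark in numbers, here is a minimal Python sketch. The helper name `frame_bytes` is made up for illustration, the 640x350x16bpp size is the "IIRC" figure from the post, and 512kB is taken as 512 * 1024 bytes of hub RAM.

```python
HUB_RAM_BYTES = 512 * 1024  # P2 hub RAM, taken as 512 KiB


def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one packed frame (hypothetical helper)."""
    return width * height * bits_per_pixel // 8


source_image = frame_bytes(640, 350, 16)    # the repeated photo
full_screen = frame_bytes(1920, 1200, 16)   # a true full-screen buffer

print(source_image, source_image <= HUB_RAM_BYTES)
print(full_screen, round(full_screen / HUB_RAM_BYTES, 1))
```

The small image fits in hub RAM, while a real 1920x1200x16bpp frame buffer would need close to nine times the hub RAM, which is why the picture has to be repeated rather than stored in full.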

evanh · 2019-10-28 02:03

Code displayed as pixels ... brings back fond memories, oddly.

rogloh wrote: »

... get 16bpp at 1920x1200 now. Looks amazing, albeit scaled down to 1600 size on my LCD until I figure out a way to convince the LCD it really is 1920x1200.

You can try going for something like 20% hblanking for resolving this but I think the real barrier here is the monitor doesn't know 50 Hz for any 1920 across mode. It's expecting 60 Hz. You can likely give it 59 Hz without issue.

PS: 74.5 kHz hsync is the other one. The further away this is the more foreign it becomes to the monitor.

rogloh · 2019-10-28 02:33

Yes! Good call evanh.

I found this information online:

2405FPW's resolutions:

Preset Display Modes
Display Mode        Horizontal Frequency (kHz)  Vertical Frequency (Hz)  Pixel Clock (MHz)  Sync Polarity (Horizontal/Vertical)
VGA, 720 x 400      31.5                        70.1                     28.3               -/+
VGA, 640 x 480      31.5                        59.9                     25.2               -/-
VESA, 640 x 480     37.5                        75.0                     31.5               -/-
VESA, 800 x 600     37.9                        60.3                     40.0               +/+
VESA, 800 x 600     46.9                        75.0                     49.5               +/+
VESA, 1024 x 768    48.4                        60.0                     65.0               -/-
VESA, 1024 x 768    60.0                        75.0                     78.8               +/+
VESA, 1152 x 864    67.5                        75.0                     108.0              +/+
VESA, 1280 x 1024   64.0                        60.0                     108.0              +/+
VESA, 1280 x 1024   80.0                        75.0                     135.0              +/+
VESA, 1600 x 1200   75.0                        60.0                     162.0              +/+
VESA, 1920 x 1200   74.0                        60.0                     154.0              +/-

So I tried out a 154 MHz pixel clock with the timings below and it worked. Full 1920x1200 res width, 60Hz, 16bpp from the P2 when running at 308MHz, gorgeous! In some future display-only cases, where the driver cog does no copying or text rendering and the rendering is done by external sprite type cog(s) for example, I suspect it could potentially output in true colour too.

' 60Hz 1920X1200
        V_VISIBLE       = 1200
        V_FP            = 1 
        V_SYNC          = 3 
        V_BP            = 29

        H_VISIBLE       = 1920
        H_FP            = 16 
        H_SYNC          = 16  
        H_BP            = 128 

        HZ              = 60
        V_SYNC_POLARITY = SYNC_POS
        H_SYNC_POLARITY = SYNC_NEG
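
As a sanity check on these constants, the pixel clock and line rate can be recomputed from the totals. This is a minimal Python sketch, not driver code; the function name `mode_rates` is made up for illustration.

```python
def mode_rates(h_visible, h_fp, h_sync, h_bp,
               v_visible, v_fp, v_sync, v_bp, refresh_hz):
    """Return (h_total, v_total, pixel_clock_hz, line_rate_hz) for a video mode."""
    h_total = h_visible + h_fp + h_sync + h_bp   # pixels per scan line
    v_total = v_visible + v_fp + v_sync + v_bp   # lines per frame
    line_rate_hz = v_total * refresh_hz
    pixel_clock_hz = h_total * line_rate_hz
    return h_total, v_total, pixel_clock_hz, line_rate_hz


# Values copied from the 60Hz 1920X1200 constants above
h_total, v_total, pclk, hfreq = mode_rates(1920, 16, 16, 128,
                                           1200, 1, 3, 29, 60)
print(h_total, v_total)   # 2080 1233
print(pclk)               # 153878400
print(hfreq)              # 73980
```

An exact 60Hz refresh would need about 153.9MHz, so a 154MHz pixel clock with these totals lands marginally above 60Hz with a line rate near 74.0kHz, in line with the monitor's preset table entry.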

Update: In theory with three HyperRAM boards fitted, all working together, that might be able to deliver the data required for a full screen picture at this bit depth and size.... that'd be a sight to behold. P2-EVAL with A/V breakout on one side, and 3 double wide breakouts on the other 3 edges. Leaves you with perhaps a USB keyboard on the remaining 8 pins plus the serial and flash. Who's first to try?
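
A rough way to size that idea is to compare the average pixel data rate against a per-board burst rate. The Python sketch below is only an estimate: the function names are made up, the 190MB/s per-board figure is the HyperRAM burst rate ozpropdev reports in this thread at 380MHz, and it ignores blanking time, access latency, command overhead, and any writes into the memory.

```python
import math


def stream_rate(width, height, bytes_per_pixel, refresh_hz):
    """Average bytes per second needed to refresh the whole screen."""
    return width * height * bytes_per_pixel * refresh_hz


def boards_needed(rate_bytes_per_s, per_board_bytes_per_s):
    """Smallest whole number of boards whose combined burst rate covers the stream."""
    return math.ceil(rate_bytes_per_s / per_board_bytes_per_s)


PER_BOARD = 190_000_000  # assumed burst rate per HyperRAM board, bytes/s

for label, bytes_per_pixel in (("16bpp", 2), ("24bpp packed", 3), ("32-bit longs", 4)):
    rate = stream_rate(1920, 1200, bytes_per_pixel, 60)
    print(label, rate, boards_needed(rate, PER_BOARD))
```

Under these assumptions 16bpp needs roughly 276MB/s (two boards), while true colour stored either packed or as 32-bit longs comes out at three boards.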

evanh · 2019-10-28 02:43

Cool, that's definitely a reduced blanking mode. I suspect that the +/- sync polarity is a convention to indicate this as well, but I doubt it actually matters in practice.

ozpropdev · 2019-10-28 05:15

Looking good Roger!
I've got hyperram burst speeds at 190MB/s now.
Hope to have a driver to match soon.
Here's a capture with my new text logic analyzer of hyperram burst read with P2 @ 380MHz.

D0         __________________________/^^^^^^\_/^\_/^\_/^\_______/^\_/^\_/^^^\___/^^^\___/^^^
                                              |
D1         __________________________/^^^^^^\___/^^^\_/^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
D2         __________________________/\_____________/^\_/^\___/^\___/^^^^^\_______/^^^\_____
                                              |
D3         __________________________/^^^^^^\___________/^^^^^^^^^^^\_/^^^^^^^\_/^^^^^\_/^\_
                                              |
D4         __________________________/^^^^^^^\/^\_/^\_/^^^\_/^^^^^^^^^^^\_____/^^^^^^^^^\_/^
                                              |
D5         __________________________/^^^^^^^\__/^^^\_/^^^^^^^^^^^^^^^^^\_/^^^^^^^^^^^^^\_/^
                                              |
D6         __________________________/\/^^^^\_______/^\_/^^^\_/^^^\_/^\_/^\_/^\_/^^^^^^^^^^^
                                              |
D7         __________________________/^^^^^^\_____________/^^^^^^^^^^^^^\_/^^^^^^^^^^^^^\___
                                              |
RAM_CLK    ____________________/^\__________________________________________________________
                                              |
FLASH_CLK  ___/\/\/\/\/\/\/\/\/\/\/\/\____/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^
                                              |
RAM_RWDS   ___________________________/\____/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_
                                              |
FLASH_RWDS ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
RAM_CS     _________________________________________________________________________________
                                              |
FLASH_CS   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
INT        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
reset      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
                                              |
                                   #00125=$ea11

cgracey · 2019-10-28 05:22

I've had 3-wire Y-Pb-Pr working recently at 1080p. I would post the code, but the power is out here in Red Bluff, California, and my office is blacked out. I've just got my phone. Anyway, I was running 1080p at a system and pixel clock of 148.5MHz.

rogloh · 2019-10-28 05:23

The perfect combination Brian.

Can't wait to integrate these two parts soon...later this week with any luck. I want to see a full screen high colour image at some nice resolution.

Liking the look of your handy analyzer.

rogloh · 2019-10-28 05:27

cgracey wrote: »

I've had 3-wire Y-Pb-Pr working recently at 1080p. I would post the code, but the power is out here in Red Bluff, California, and my office is blacked out. I've just got my phone. Anyway, I was running 1080p at a system and pixel clock of 148.5MHz.

Ok, that sounds promising Chip. When you next get access to your code, I would just like to know the proper colour space coefficients you used that worked fine with sync levels, didn't overdrive, handled negative values etc. In the meantime I can make my own educated guess at it using various snippets I've read and what Wuerfel_21 had provided.

cgracey · 2019-10-28 05:33

I found it, thanks to Google:

https://forums.parallax.com/discussion/comment/1475170/#Comment_1475170

cgracey · 2019-10-28 05:40

Here is the best HDMI I could achieve, 720p24:

https://forums.parallax.com/discussion/comment/1475226/#Comment_1475226

In portrait mode, that could do 90x160 characters of an 8x8 font.
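
The character count follows directly from the rotated resolution and the font cell size. A tiny Python sketch (the helper name `text_grid` is made up for illustration):

```python
def text_grid(width_px, height_px, font_w=8, font_h=8):
    """Whole character cells (columns, rows) that fit on a screen."""
    return width_px // font_w, height_px // font_h


print(text_grid(720, 1280))    # 720p rotated to portrait -> (90, 160)
print(text_grid(1920, 1080))   # 1080p landscape, same font -> (240, 135)
```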

I would love to throw away all PC's to make a self-hosted development platform that worked directly with monitors, keyboard, and mouse.

rogloh · 2019-10-28 06:41

cgracey wrote: »

I found it, thanks to Google:

https://forums.parallax.com/discussion/comment/1475170/#Comment_1475170

Thanks for recalling this Chip. I will try it out too now.

cgracey wrote: »

I would love to throw away all PC's to make a self-hosted development platform that worked directly with monitors, keyboard, and mouse.

This should certainly be possible soon. If you make at least some of your tools SPIN2 based we should be golden. The video stuff is coming along nicely, and should dovetail with a HyperRAM (or other) arbiter interface for a decent shared memory setup. USB keyboard and mouse integration is all we'd probably need after that (ideally with USB hub support over time, for a COG reduction too).

rogloh · 2019-10-28 06:50

I've been thinking about a 1:1 P2 clock / pixel clock ratio for my driver. I think it could be doable with my current architecture in a bare bones pass-through mode, without text or pixel doubling, and still be usable with other sprite or text mode drivers or simple frame buffers. It probably would not support a mouse sprite in all cases (because that could corrupt your frame buffer data), and the palette table size may be constrained by the horizontal back porch timing, though a ping/pong method could perhaps get around that, with the palette fully loaded during the line before and then flipped. It only needs 256 clocks to load (multiplied by some streamer load factor of ~1.5x, so if there are at least 384 pixels on the line there should be time to do it). I've been wanting to add a sprite mode to this driver, so this might reduce the clock requirements and let it operate with an underclocked P2, which opens things up for more applications and, with any luck, still gets to full HD at 148.5MHz as Chip had done.
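
The palette timing estimate can be written down as a small budget check. This Python sketch only restates the arithmetic in the post: the function names are made up, the 1.5x load factor is the rough figure quoted above, and the 2200 pixel line total in the example is the standard 1080p60 value rather than anything taken from the driver.

```python
import math


def palette_load_clocks(entries=256, load_factor=1.5):
    """Estimated clocks to load the palette while the streamer is also busy."""
    return math.ceil(entries * load_factor)


def palette_fits_in_line(line_pixels, clocks_per_pixel=1, entries=256, load_factor=1.5):
    """True if one scan line provides enough clocks for a ping/pong palette reload."""
    return palette_load_clocks(entries, load_factor) <= line_pixels * clocks_per_pixel


print(palette_load_clocks())        # 384
print(palette_fits_in_line(2200))   # True: a 1080p line at a 1:1 clock ratio
print(palette_fits_in_line(320))    # False: too few clocks on such a short line
```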

Cluso99 · 2019-10-28 07:28

cgracey wrote: »

I've had 3-wire Y-Pb-Pr working recently at 1080p. I would post the code, but the power is out here in Red Bluff, California, and my office is blacked out. I've just got my phone. Anyway, I was running 1080p at a system and pixel clock of 148.5MHz.

I had VGA with text (240x135) at 1920x1080 at 148.5MHz on P2D2 and 150MHz on P2-Eval many moons ago. The text was readable but tiny on my 24" Acer LCD Monitor.

Cluso99 · 2019-10-28 07:32

Might be better to just use a couple of P2's in sync for extra hub memory rather than hyperram

rogloh · 2019-10-28 07:44

Nice. Did you do it all in a single COG, Cluso? If so, that's only around 8 clocks or ~4 instructions per character. Perhaps if you read a whole font scanline into the LUT ahead of time during blanking and used monochrome text, something could just about work at those speeds, but it would have to be very well optimized.
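
The "8 clocks or ~4 instructions" figure comes from the character width and the clock ratio. A minimal Python sketch, assuming the usual 2 clocks per PASM2 instruction and using a made-up helper name:

```python
def per_char_budget(h_visible, columns, sysclk_hz, pixel_clk_hz, clocks_per_instr=2):
    """Return (clocks, instructions) available per character cell on a scan line."""
    pixels_per_char = h_visible / columns
    clocks = pixels_per_char * sysclk_hz / pixel_clk_hz
    return clocks, clocks / clocks_per_instr


# 240 columns at 1920 wide with sysclk equal to the 148.5MHz pixel clock
print(per_char_budget(1920, 240, 148_500_000, 148_500_000))   # (8.0, 4.0)
# Same grid with the P2 at twice the pixel clock (308MHz vs 154MHz)
print(per_char_budget(1920, 240, 308_000_000, 154_000_000))   # (16.0, 8.0)
```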

Cluso99 · 2019-10-28 08:54

It was just proof of concept. That is where we discovered the jitter when dividing the crystal down to 0.5MHz.

I just used a fixed block of text that I converted into a bitmap for display. I didn't try to do anything more than proving the display worked. Also I was only doing 1bpp with a simple 2 long lookup so I changed colors on a line by line (text) basis.

I used the mix of text and graphics for a logic analyser test case. IIRC I based it on Brian's logic analyser code.
https://forums.parallax.com/discussion/169288/vga-1920x1080x4bpp-148-5mhz-rock-stable-and-240-chars-x-135-lines/p3
There's a nice pic of the display on this link a few posts down the page.

Postedit
I was quite prepared to use a second cog if necessary. I was more interested in seeing what the streamer could do, and how many pixels and/or characters I could get on the screen.
My comment was more about getting 1920x1080 pixels on a VGA screen, not how to do it - you've been going great guns on that

cgracey · 2019-10-28 14:18

Cluso99 wrote: »

...That is where we discovered the jitter when dividing the crystal down to 0.5MHz...

That was Rev A silicon, right? The Rev B silicon seems to work fine at all crystal divisions.

evanh · 2019-10-28 14:40

cgracey wrote: »

Cluso99 wrote: »

...That is where we discovered the jitter when dividing the crystal down to 0.5MHz...

That was Rev A silicon, right? The Rev B silicon seems to work fine at all crystal divisions.

Yep, I'm certain Cluso is talking about way back at beginning of year.

The revB does have jitter in lower frequencies. Basically, the lower the VCO frequency is, the worse it gets. It's not really notable until below 40 MHz but I'd recommend to stay above 80 MHz and use XDIVP to divide lower.
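
One way to apply that advice is to pick the smallest post-divider that keeps the VCO at or above 80MHz. The Python sketch below is an illustration only: it assumes the XDIVP choices are 1 and the even values 2 to 30, the function name is made up, and it does not search for the XDIV/XMUL pair that produces the VCO frequency or check any upper VCO limit.

```python
# Assumed XDIVP post-divider choices: 1, then even values from 2 to 30
XDIVP_CHOICES = [1] + list(range(2, 31, 2))


def pick_post_divider(target_hz, min_vco_hz=80_000_000):
    """Return (xdivp, vco_hz) for the smallest divider with VCO >= min_vco_hz, or None."""
    for xdivp in XDIVP_CHOICES:
        vco_hz = target_hz * xdivp
        if vco_hz >= min_vco_hz:
            return xdivp, vco_hz
    return None


print(pick_post_divider(25_200_000))    # (4, 100800000)
print(pick_post_divider(148_500_000))   # (1, 148500000)
print(pick_post_divider(2_000_000))     # None: even /30 leaves the VCO at 60MHz
```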

evanh · 2019-10-28 14:42

Chip,
In Pnut, is there a way to bound the result of a constant macro? Or conditional constant assignment like the ? : operator?

TonyB_ · 2019-10-28 15:05

ozpropdev wrote: »

Looking good Roger!
I've got hyperram burst speeds at 190MB/s now.
Hope to have a driver to match soon.
Here's a capture with my new text logic analyzer of hyperram burst read with P2 @ 380MHz.

D0         __________________________/^^^^^^\_/^\_/^\_/^\_______/^\_/^\_/^^^\___/^^^\___/^^^
                                              |
D1         __________________________/^^^^^^\___/^^^\_/^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
D2         __________________________/\_____________/^\_/^\___/^\___/^^^^^\_______/^^^\_____
                                              |
D3         __________________________/^^^^^^\___________/^^^^^^^^^^^\_/^^^^^^^\_/^^^^^\_/^\_
                                              |
D4         __________________________/^^^^^^^\/^\_/^\_/^^^\_/^^^^^^^^^^^\_____/^^^^^^^^^\_/^
                                              |
D5         __________________________/^^^^^^^\__/^^^\_/^^^^^^^^^^^^^^^^^\_/^^^^^^^^^^^^^\_/^
                                              |
D6         __________________________/\/^^^^\_______/^\_/^^^\_/^^^\_/^\_/^\_/^\_/^^^^^^^^^^^
                                              |
D7         __________________________/^^^^^^\_____________/^^^^^^^^^^^^^\_/^^^^^^^^^^^^^\___
                                              |
RAM_CLK    ____________________/^\__________________________________________________________
                                              |
FLASH_CLK  ___/\/\/\/\/\/\/\/\/\/\/\/\____/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^
                                              |
RAM_RWDS   ___________________________/\____/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_
                                              |
FLASH_RWDS ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
RAM_CS     _________________________________________________________________________________
                                              |
FLASH_CS   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
INT        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
reset      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                              |
                                              |
                                   #00125=$ea11

How about _ instead of ^ for highs and two lines per signal?

                                      ______  |_   _   _         _   _   ___     ___     ___
D0         __________________________/      \_/ \_/ \_/ \_______/ \_/ \_/   \___/   \___/
                                      ______  |  ___   _____________________________________
D1         __________________________/      \_|_/   \_/                                     
                                              |      _   _     _     _____         ___
D2         __________________________/\_______|_____/ \_/ \___/ \___/     \_______/   \_____
                                      ______  |          ___________   _______   _____   _
D3         __________________________/      \_|_________/           \_/       \_/     \_/ \_
                                      _______ |_   _   ___   ___________       _________   _
D4         __________________________/       \/ \_/ \_/   \_/           \_____/         \_/
                                      _______ |  ___   _________________   _____________   _
D5         __________________________/       \|_/   \_/                 \_/             \_/
                                        ____  |      _   ___   ___   _   _   _   ___________
D6         __________________________/\/    \_|_____/ \_/   \_/   \_/ \_/ \_/ \_/
                                      ______  |            _____________   _____________
D7         __________________________/      \_|___________/             \_/             \___
                                _             |
RAM_CLK    ____________________/ \____________|_____________________________________________
                                           _  |_   _   _   _   _   _   _   _   _   _   _   _
FLASH_CLK  ___/\/\/\/\/\/\/\/\/\/\/\/\____/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/
                                             _|  _   _   _   _   _   _   _   _   _   _   _
RAM_RWDS   ___________________________/\____/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_/ \_
           ___________________________________|_____________________________________________
FLASH_RWDS                                    |
                                              |
RAM_CS     ___________________________________|_____________________________________________
           ___________________________________|_____________________________________________
FLASH_CS                                      |
           ___________________________________|_____________________________________________
INT                                           |
           ___________________________________|_____________________________________________
reset                                         |

Trailing spaces could be removed, as above. I think cursor looks better on every line except at edges for this alternative.

EDIT:
Hmm, it looks a lot better in a text editor than it does here inside a code block!

jmg · 2019-10-28 19:03

TonyB_ wrote: »

How about _ instead of ^ for highs and two lines per signal?
...
Hmm, it looks a lot better in a text editor than it does here inside a code block!

I've seen '_' used for LOW and '=' for HIGH, as a common Text.Waveform standard (and ~ used for float).

Example:

D0         ~~~_______________________/======\_/=\_/=\_/=\_______/=\_/=\_/===\___/===\___/===
                                              |
D1         ~~~_______________________/======\___/===\_/=====================================
                                              |
D2         ~~~_______________________/\_____________/=\_/=\___/=\___/=====\_______/===\_____
                                              |
D3         ~~~_______________________/======\___________/===========\_/=======\_/=====\_/=\_
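
For anyone wanting to experiment with these character conventions, here is a minimal Python sketch of a text waveform renderer. It is not ozpropdev's analyzer code: the function names are made up, it assumes one character per sample with '/' or '\' drawn on the sample where the level changes, and it leaves out the float state and the cursor column.

```python
def render_wave(samples, low="_", high="="):
    """Render 0/1 samples as text, one character per sample, edges as / and \\."""
    chars = []
    prev = None
    for sample in samples:
        level = 1 if sample else 0
        if prev is not None and level != prev:
            chars.append("/" if level else "\\")   # rising or falling edge
        else:
            chars.append(high if level else low)   # steady level
        prev = level
    return "".join(chars)


def render_capture(signals, low="_", high="=", label_width=11):
    """Render a dict of {name: samples} as labelled rows, one signal per line."""
    return "\n".join(
        f"{name:<{label_width}}{render_wave(samples, low, high)}"
        for name, samples in signals.items()
    )


print(render_capture({
    "D0": [0, 0, 0, 1, 1, 1, 0, 0, 1, 0],
    "CLK": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}))
```

Passing `high="^"` gives the original style, so the two conventions can be compared on the same capture.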

jmg · 2019-10-28 19:09

ozpropdev wrote: »

Here's a capture with my new text logic analyzer of hyperram burst read with P2 @ 380MHz.

Nice plots. Are those labels right?
RAM_RWDS seems to follow FLASH_CLK?
What is the width of RAM_CS?

Tubular · 2019-10-28 19:11

Parallax font. It just needs parallax font : )

Actually I would have probably chosen ~ by default, but I quite like the 'reaching high' look of the ^

Being able to cut and paste timing diagrams like this, from what is effectively a *380 MHz* logic analyzer inside the same silicon, is going to be really helpful

Cluso99 · 2019-10-28 20:46

cgracey wrote: »

Cluso99 wrote: »

...That is where we discovered the jitter when dividing the crystal down to 0.5MHz...

That was Rev A silicon, right? The Rev B silicon seems to work fine at all crystal divisions.

Yes, Rev A. You recommended a different divide/multiply which worked

ozpropdev · 2019-10-28 21:36

jmg wrote: »

ozpropdev wrote: »

Here's a capture with my new text logic analyzer of hyperram burst read with P2 @ 380MHz.

Nice plots. Are those labels right?
RAM_RWDS seems to follow FLASH_CLK?
What is the width of RAM_CS?

I wondered if anyone would notice that.
What's happening there is the RAM_CLK pin switches from IO mode to a smart pin mode.
When this happens the state returned becomes the smartpin status.
This means I then lose the actual clock state from the capture.
To get around this I use the input selector feature to use FLASH_CLK to peek at RAM_CLK output.
The input selector has a limited reach of +/-3 so FLASH_CLK was chosen.
Another thing to note is the smartpin complete state appears earlier than expected because of the delays in the IO path.

RAM_CLK    ____________________/^\__________________________________________________________
                                              |
FLASH_CLK  ___/\/\/\/\/\/\/\/\/\/\/\/\____/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^\_/^

evanh · 2019-10-29 00:29

ozpropdev wrote: »

Another thing to note is the smartpin complete state appears earlier than expected because of the delays in the IO path.

That was initially a frustration for me when I imagined that each smartpin was geo-located next to its physical custom pin cell.

I eventually concluded that that was never going to happen even if we all asked Chip to make it so. The size of the interconnecting buses makes the optimiser want to clump the smartpins in the centre of the silicon between the streamers/cogs.
