P2 DVI/VGA driver



  • rogloh Posts: 2,776
    edited 2020-08-19 - 15:20:58
    With PhiPi's information linked in the previous post I was able to build some tools to let me convert the original P1 font down to 8x16 size, with all the characters included. Then some of the symbols were hand tweaked after that to try to best represent the original P1 in the smaller character size. It's not going to be identical to the original of course, but it's close and looks very nice and clean on a high resolution LCD screen. Here's a photo of it (1:1 pixel accurate) on my Dell Ultrasharp (set to 1600x1200). I will include this font in the release of this video driver.

  • That's pretty slick - you might want to tweak the $ sign though, it looks kinda lopsided
  • rogloh Posts: 2,776
    edited 2020-08-20 - 07:26:29
    Yeah Wuerfel_21 it does look offset a bit. Most regular chars are 6 pixels wide, though I have widened a few others like M,m,W,w,Y,X etc to rebalance. I'll play more there. By the way I do find the box drawing symbols in the first 16 characters a bit confusing. I think some of them were special for P1 with 2bpp HW (I can't really recall making good use of those) but it now may not make as much sense for normal two colour stuff on the P2. So I may make a secondary font set which repurposes these for double thick box outlines which might be useful for push buttons. There are 8 symbols there that possibly could be reused for that. The dot and rounded square could be turned into two radio button/check box states. This could be helpful for hires text UIs. There are already other arrow characters that could be useful for control of scroll bars etc.

    I'm also going to mull over the dual HyperRAM module thing a bit more. My HyperRAM driver was designed for multi-COGs, multi-banks, and multi-modules, and in my video driver software I do already have some infrastructure for interlaced source where it selects from different source locations in each field, though that is currently automatic while for HyperRAM we may also want manual control as to when the switching occurs. The dual HyperRAM thing also needs two different mailboxes. I do currently have two mailbox registers, and one is just a fixed offset from the other to accelerate the computation of my split up mailbox writes. If I can rework the code a bit I might be able to have a region control bit select between the two somehow. This would let us flip between two different mailboxes per display frame or even nominate them dynamically per region. It's probably asking to crash when abused but if done carefully it might let us have very high performance writes on one HyperRAM bus, while the video COG displays out of the second HyperRAM bus at the same time.

    I have 2 instructions free for this. If I can make it fit it would be the icing on the cake. :smile:
  • Relatedly: Have you seen my ROM font compatible 8x8 font?
  • Wuerfel_21 wrote: »
    Relatedly: Have you seen my ROM font compatible 8x8 font?
    No not sure I have, does it look nice and easy to read? Is there a link to see a picture of it or other screenshot?

    The smaller fonts are very handy in lower resolutions to increase the number of rows. At high resolutions they can start to become microscopic however. I know a while back I tried a 6 scan line font which looks ridiculously tiny but gets a crazy amount of text on screen (up to 240x200 on the 1920x1200 monitor). I had posted an image of that on my original PASM2 gurus driver thread here:


    An 8x8 font there instead would be a lot easier to read and much easier to rotate for portrait mode use too. The possible 150x240 or 135x240 (@1080p) is convenient for file editing.
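    The column and row counts quoted above are just the screen size divided by the character cell size. A minimal sketch of that arithmetic in Python (not part of the driver; the 8-pixel width of the 6 scan line font is inferred from the 240 column figure):

```python
def text_grid(width_px, height_px, char_w, char_h):
    """Return (columns, rows) of whole character cells that fit on the screen."""
    return width_px // char_w, height_px // char_h

# 1920x1200 landscape with an 8-pixel-wide, 6-scan-line font
print(text_grid(1920, 1200, 8, 6))   # (240, 200)

# Portrait use of an 8x8 font: the panel is rotated, so width and height swap
print(text_grid(1200, 1920, 8, 8))   # (150, 240)
print(text_grid(1080, 1920, 8, 8))   # (135, 240)
```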
  • I also have an 8x8 font. It’s from P1 AIfont? (hippy) IIRC but slightly tweaked. I’ll post it in the morning.
  • Wuerfel_21 Posts: 927
    edited 2020-08-20 - 13:21:34
    Mine can be found here: https://forums.parallax.com/discussion/167894/a-rom-font-compatibile-8x8-font

    It's basically (f)unscii with approximations of the Parallax ROM characters added by me.

    If you mess with the source, you can compile multiple styles, sizes and character sets
  • Thanks for posting, that looks rather good Wuerfel_21 and the drawing symbols are there. The various European lower case "a" symbols do get squished down but it's probably unavoidable.
  • Here is AIChip's 8x8 font tweaked
  • rogloh Posts: 2,776
    edited 2020-08-21 - 02:10:43
    (Thanks Cluso).

    So for enabling multiple HyperRAM buses in my video driver, overnight I was able to come up with an idea that might work...

    I already keep all my HyperRAM mailboxes in contiguous order over different driver instances. Each spawned HyperRAM driver gets its own instance of 8x3 longs for the mailboxes (3 longs per COG). Then the next instance follows on from the prior one. This is key.

    I already have two mailbox address registers in my code. One points to the first mailbox long for the video driver COG to use, the other to the second mailbox long for the COG. These two addresses get used to burst write two request parameters and then to trigger the HyperRAM read request command and to read back the result if waiting for completion (the wait only actually happens when drawing a mouse, just so we don't start to render it before the scan line data has been returned).

    My screen buffer parameter data in each region is currently maintained in this format:
    '   SCREEN_BUFFER_1 - defines the text or graphics mode screen's start address
    '   ===============   
    '   bit                                                      bit
    '   31 30-28 27  24 23                                        0
    '   -----------------------------------------------------------
    '  | E |Rsvd| Bank |         Screen Buffer Base Address        |
    '   -----------------------------------------------------------
    '   External Memory (E)
    '   -------------------
    '       When in graphics mode, the buffer base address can be used for external
    '       memory such as HyperRAM. 
    '       The E flag's value should be set to 0 for HUB RAM, 1 for external RAM.
    '   Bank
    '   ----
    '       Selects the bank for external memory accesses, ignored for hub access.
    '       Addresses will wrap within the same 16MB external memory bank.
    '   Screen Buffer Base Address
    '   --------------------------
    '       The Hub or External memory address to begin reading from at the start 
    '       of the region being displayed. 

    I think I can modify it so the E+Rsvd bits above get instead re-claimed as a nibble to identify the instance of the HyperRAM driver. A zero nibble would get used to indicate internal memory accesses instead of external memory accesses so the driver instances would be based at 1 not 0 from the video client's perspective.
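    To make the proposed layout concrete, here is a small Python model of it (illustration only, not driver code). It assumes the top nibble (bits 31-28) becomes the driver instance, with the bank nibble and 24-bit address unchanged from the table above; the function names are made up for this sketch.

```python
HUB = 0  # instance nibble 0 means "display from hub RAM"

def pack_screen_buffer(instance, bank, address):
    """Build a screen buffer parameter: instance in bits 31-28, bank in 27-24, address in 23-0."""
    if not 0 <= instance <= 0xF:
        raise ValueError("instance must fit in 4 bits")
    if not 0 <= bank <= 0xF:
        raise ValueError("bank must fit in 4 bits")
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    return (instance << 28) | (bank << 24) | address

def unpack_screen_buffer(value):
    """Split a screen buffer parameter back into (instance, bank, address)."""
    return (value >> 28) & 0xF, (value >> 24) & 0xF, value & 0xFFFFFF

# Second HyperRAM driver instance, bank 3, offset $012345
param = pack_screen_buffer(2, 3, 0x012345)
print(hex(param))                    # 0x23012345
print(unpack_screen_buffer(param))   # (2, 3, 74565)
```

    Because the instance, bank and address all live in one long, a single long write switches a region between hub RAM and any external driver.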

    The current code to do external memory setup stuff for graphics regions is this:
                                mov     save, ptrb              'preserve for fifo use later
                                mov     d, ptra                 'preserve initial source pointer
    extm_test                   testb   screenaddr1, #31 wz     'check for external memory usage
                                testb   modedata, #13 wc        'check transparent/sprite mode and
                if_c_and_nz     mov     save, ptra              '...display from ptra if no ext mem
                if_z            getnib  a, screenaddr1, #6      'extract bank address
                if_z            setnib  ptra, a, #6             'copy into external request
                if_z            setnib  ptra, #EXTMEMREQ, #7    'add memory read request to address
    p6          if_z            mov     a, #COLS                'transfer column "units" of memory data
                if_z            shl     a, bppidx               '...multiplied by bpp into HUB RAM
                if_z            setq    #2-1                    'write 2 values
                if_z            wrlong  save, mailbox2          'setup memory request information
                if_z            wrlong  ptra, mailbox1          'initiate memory read request
                if_z            add     ptra, linebufsize       'increase ptra by this amount
                                bitz    mouseptr, #23           'remember for late mouse render
                if_c_or_z       jmp     #copy_done              'no need to do any copy this time

    If I can find a free temporary working register (my "xx" below) I think I can do this instead, by eliminating the old mailbox1 COG register (now leaving 3 COG RAM locations free for the new instructions) and only maintaining my mailbox2 as the parameter base address of the first real instance for this driver COG id - 96 (which I will still compute at COG init time). This allows the top nibble of each screen buffer parameter to index into the different HyperRAM drivers for this COG id which is ideal because it all remains together so changing it is atomic. Plus I was already using Z flag to differentiate internal/external access so I can still get that with the MUL instruction saving a further instruction which would not have fit otherwise. I think this might just work. :smile: :smile: :smile:
                                mov     save, ptrb              'preserve for fifo use later
                                mov     d, ptra                 'preserve initial source pointer
                                getnib  xx, screenaddr1, #7     'get ext-mem driver instance (0=hub only)
    extm_test                   mul     xx, #96 wz              'scale by mailbox bytes per driver
                                add     xx, mailbox2            'compute mailbox2 address
                                testb   modedata, #13 wc        'check transparent/sprite mode and
                if_c_and_z      mov     save, ptra              '...display from ptra if no ext mem
                if_nz           getnib  a, screenaddr1, #6      'extract bank address
                if_nz           setnib  ptra, a, #6             'copy into external request
                if_nz           setnib  ptra, #EXTMEMREQ, #7    'add memory read request to address
    p6          if_nz           mov     a, #COLS                'transfer column "units" of memory data
                if_nz           shl     a, bppidx               '...multiplied by bpp into HUB RAM
                if_nz           setq    #2-1                    'write 2 values
                if_nz           wrlong  save, xx                'setup memory request information
                if_nz           sub     xx, #4  
                if_nz           wrlong  ptra, xx                'initiate memory read request
                if_nz           add     ptra, linebufsize       'increase ptra by this amount
                                bitnz   mouseptr, #23           'remember for late mouse render
                if_c_or_nz      jmp     #copy_done              'no need to do any copy this time
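    The address arithmetic in that sequence can be modelled in a few lines of Python. This is only a sketch under the assumptions stated in the post (8 COGs x 3 longs = 96 bytes of mailbox per driver instance, instances laid out back to back); the base address used in the example is hypothetical.

```python
LONG_BYTES = 4
LONGS_PER_COG = 3
COGS = 8
BYTES_PER_DRIVER = COGS * LONGS_PER_COG * LONG_BYTES   # 96

def mailbox2_register(first_instance_base, cog_id):
    """Value precomputed at COG init: this COG's second mailbox long in instance 1, minus 96."""
    second_long = first_instance_base + cog_id * LONGS_PER_COG * LONG_BYTES + LONG_BYTES
    return second_long - BYTES_PER_DRIVER

def request_addresses(screenaddr1, mailbox2):
    """Mirror getnib/mul/add/sub: return (trigger_addr, params_addr), or None for hub RAM."""
    instance = (screenaddr1 >> 28) & 0xF        # getnib xx, screenaddr1, #7
    scaled = instance * BYTES_PER_DRIVER        # mul xx, #96 wz
    if scaled == 0:
        return None                             # Z set: no external memory request
    params_addr = scaled + mailbox2             # add xx, mailbox2 (second mailbox long)
    trigger_addr = params_addr - LONG_BYTES     # sub xx, #4 (first mailbox long)
    return trigger_addr, params_addr

base = 0x1000                                   # hypothetical mailbox base of instance 1
mb2 = mailbox2_register(base, cog_id=2)
print(request_addresses(0x1000_0000, mb2))      # instance 1 -> (4120, 4124)
print(request_addresses(0x2000_0000, mb2))      # instance 2 -> 96 bytes further on
print(request_addresses(0x0000_0000, mb2))      # None: hub RAM
```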
  • rogloh Posts: 2,776
    edited 2020-08-21 - 02:35:14
    The other real bonus with the approach above is that I think it should remain fully compatible with my driver's region interlaced mode where each field displays graphical source data from a different buffer for that region, so you could have the flip done automatically for you if you choose that approach. This would be perfect for video applications where other COGs work to fill a frame buffer in one HyperRAM for one field, while the video COG displays from another different HyperRAM at the same time, and the video HW output will flip to use the other HyperRAM device automatically on the next field without intervention. My Screen_buffer_3 and Screen_buffer_4 parameters are used for this.

    It doesn't have to actually be interlaced source material either. The auto-flipping will work for non-interlaced source material when you enable a region's interlaced mode. It will still just happen at the field rate (or entire frame in non-interlaced terminology).
  • Multiple HyperRAM driver access from the video driver is now all coded and builds with zero longs left in the COG. I've just run it and now see the top nibble of the screen buffer parameter is controlling sourcing pixels from hub vs external memory, just as I wanted. When set to 0 it displays from hub; when it's 1, data comes from the first HyperRAM bus; when greater than 1, the region tries to use the corresponding HyperRAM driver's bus, but because those boards do not exist in my system it just alternates the two prior scan lines left in the line buffer, since it never gets a mailbox response (as expected). I also can see my automatic interlaced source mode working with this feature too.

    To see it properly switching video from different HyperRAM boards in all its glory I would need a second HyperRAM board to test for real. Unfortunately we are still in lockdown for probably at least another month here so I can't even travel more than the 5km needed to visit the local Melbourne guys to test it with one of their own boards without the risk of a $1652 on the spot fine! To ship one any faster from Parallax would cost me $282 so that's not happening either.

    I'm actually pretty happy about getting this final capability into my video driver. It's now going to allow us some serious P2 video write performance with two or more HyperRAM memories on different buses for those that really need the most performance. With this firmware you could even have up to 5 buses with all the P2 pins and different COGs working on different buses with no contention. Might make a good video pipeline if you can carve up the workload appropriately. :smile:
  • msrobots Posts: 3,250
    edited 2020-08-22 - 11:43:12
    @rogloh, I have a P2 rev B and two Hyper-Ram Boards.

    Can you give me some Code and simple instructions to test it for you?

  • rogloh Posts: 2,776
    edited 2020-08-23 - 05:23:10
    Yeah I may need to do that, msrobots. But I will first need to resolve some weird problem I noticed tonight with HyperRAM addressing that now seems to be truncating my read scan line data somehow - it's most likely a regression as I have not seen this one before when I last tested HyperRAM hires graphics, so my chosen temp register may not be good to use or something new is going on. Then once that is restored I'll try to build up a demo binary that draws into one RAM while outputting from another and compare that against single RAM performance. It should be a useful test to exercise the feature.

    Update: Thankfully a false alarm on the bug, it was just an invalid line buffer size passed into my driver from the test program. :smile:
  • So once I finally understood how these box drawing characters were meant to be used on the P1 (I'd forgotten it), I was able to get my 8x16 Propeller font tweaked a bit to support them. It looks okay and will be useful for text based UIs. The characters should be capable of basically doing the same things as available on the P1 (minus those 3d bevel effects with 2bpp output of course).
    The characters to use to form a box/button shape are these:
    0   12   8
    10       11
    1   12   9

    The bottom character 12 can also be 13 instead to create an underscore below the character above it in the box as shown in the attached picture. This is useful for accelerator keys or perhaps to highlight focus etc. A mouse will be able to move over this and click it. The "mouse click down" state could be an inverse font or different text colour inside perhaps.
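    As an illustration of that layout, the sketch below builds the three rows of character codes for a button in Python (not driver code; the constant and function names are invented here, only the code values 0, 1, 8-13 come from the table above):

```python
# Character codes taken from the box layout above
TOP_LEFT, TOP_RIGHT = 0, 8
BOTTOM_LEFT, BOTTOM_RIGHT = 1, 9
LEFT, RIGHT = 10, 11
HORIZONTAL, HORIZONTAL_UNDERSCORE = 12, 13

def button_rows(label, underline_at=None):
    """Return three rows of character codes that draw a box around `label`.

    If `underline_at` is given, the bottom edge under that label position
    uses code 13 so the letter above appears underscored (accelerator key).
    """
    codes = [ord(c) for c in label]
    top = [TOP_LEFT] + [HORIZONTAL] * len(codes) + [TOP_RIGHT]
    middle = [LEFT] + codes + [RIGHT]
    bottom_edge = [HORIZONTAL] * len(codes)
    if underline_at is not None:
        bottom_edge[underline_at] = HORIZONTAL_UNDERSCORE
    bottom = [BOTTOM_LEFT] + bottom_edge + [BOTTOM_RIGHT]
    return [top, middle, bottom]

for row in button_rows("OK", underline_at=0):
    print(row)
```

    Each row would then be written into consecutive screen buffer rows at the button's column position.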
  • rogloh,

    Just trying to get a handle (with: p2videodrv0_91b) on regions... Looks like you can specify the height of a region in pixels, but can you specify the region horizontally? I would like to build regions around user interface items that may be left-aligned somewhere other than pixel 0 on any line.

  • I think you will find there isn’t any time available in the horizontal line to permit different regions.
  • Cluso99 wrote: »
    I think you will find there isn’t any time available in the horizontal line to permit different regions.
    Thanks, Cluso!

  • rogloh Posts: 2,776
    edited 2020-08-27 - 04:08:12
    Yes Cluso is correct @dgately. The way it works is that you can only reload a different region mode one line at a time, one scan line before the line starts - there is no way to have two regions side by side, only above/below each other.

    Have you been able to get 800x480 working with some timing tweaks?

    The newer code I am putting together fixes some of the text printing API issues, which were incomplete/buggy in the beta version. In the meantime if you manage your own screen buffer contents it won't be an issue.
  • rogloh wrote: »
    Have you been able to get 800x480 working with some timing tweaks?
    640x480 is working well!

    What areas of code would I need to mod, for HDMI for 800x480?
    I see this:
    ' obtain the VGA's timing for a given resolution
     	'timing := vid.getTiming(VID#RES_1024x768) 
     	'timing := vid.getTiming(VID#RES_800x600) 
     	timing  := vid.getTiming(VID#RES_640x480)

    Also, I do notice that when I've tried to use either SmartSerial or JonnyMac's latest FullDuplexSerial to send text back to the terminal, that I'm not able to get good results. There's some kind of memory issue, I think... I get a few correct characters back, but then garbage. If I place the printStr(), out(), Tx(), Dec(), Hex(), etc... type statements before InitDisplay, they "kind of" work, but placing any terminal related output after that definitely results in 3-5 characters of good text, then garbage. I've attached the demoHDMI.spin2 code. It's set to use FullDuplexSerial, but can be converted to use SmartSerial with a few mods.
    Application "Wish" to activate'; exit 0
    ( Entering terminal mode.  Press Ctrl-] to exit. )

  • rogloh Posts: 2,776
    edited 2020-08-28 - 01:49:10
    Ok @dgately the terminal problems are baud rate mismatches because the P2 PLL clock frequency changes when the video is initialised and this affects the bit timing. You need to either start the serial object after you init video or restart the serial again at the same baud rate once the video is initialized (that works for me).
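    A quick numerical illustration of why the text turns to garbage, as a Python sketch. It assumes the serial object fixes its bit period in system clocks when it is started; the 160 MHz starting clock is purely an assumed example, while 252 MHz is the video clock used in the timings in this thread.

```python
def bit_period_clocks(sysclk_hz, baud):
    """Clocks per serial bit, as computed once when the serial object starts."""
    return round(sysclk_hz / baud)

def effective_baud(period_clocks, sysclk_hz):
    """Baud rate actually produced when that bit period runs at a given clock."""
    return sysclk_hz / period_clocks

BAUD = 115_200
OLD_CLK = 160_000_000    # assumed clock before video init (example value only)
NEW_CLK = 252_000_000    # clock after the video driver reprograms the PLL

stale = bit_period_clocks(OLD_CLK, BAUD)      # period computed before video init
fresh = bit_period_clocks(NEW_CLK, BAUD)      # period after restarting serial

print(f"stale period -> {effective_baud(stale, NEW_CLK):.0f} baud")
print(f"fresh period -> {effective_baud(fresh, NEW_CLK):.0f} baud")
```

    With the stale period the line runs far off the nominal rate, which is why restarting the serial object after the video init restores clean output.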

    To get video tweaked, I would create a copy of the longs originally defined here for 800x600 with reduced blanking over DVI and experiment with that by playing with the sync values. As I don't have one of these LCDs I don't know what the best values are unfortunately, it would need some experimentation unless a proper data sheet is found.
    svga_dvi_timing   ' massively reduced blanking for 800x600 50Hz at 25.2MHz clk YMMV
                long   CLK252MHz
                long   252000000
                       '     1 bit         7 bits      8 bits      8 bits    8 bits
                long   (SYNC_POS<<31) | (  8<<24) | (  8<<16) | (  8<<8 ) | (800/8)
                       '     1 bit         8 bits      3 bits      9 bits   11 bits
                long   (SYNC_POS<<31) | (  2<<23) | (  2<<20) | ( 11<<11) | 600
                long    10 << 8
                long    0
                long    0   ' reserved for CFRQ parameter

    For the 800x480 LCD

    refresh rate = pixel frequency / (total horizontal pixels * total vertical lines)
    total horizontal pixels = horizontal front porch + horizontal sync + horizontal back porch + 800
    total vertical lines = vertical front porch lines + vertical sync lines + vertical back porch lines + 480

    The pixel frequency below is set to 25.2MHz (P2 clock=252MHz).

    Ideally you want to keep the LCD refresh at or close to 60Hz. There are possibly other minimum values for syncs and blanking portions for this panel but I don't know them. My numbers below are just a starting point to play with but I have made the total width 824 pixels which is 800 active pixels + 24 pixels of horizontal blanking (8+8+8), and 510 lines which is 480 active lines + (5+3+22) of vertical blanking lines.

    25.2Mpixels/s / (824 pixels x 510 lines) ~ 60Hz.
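    The same calculation as a short Python sketch, handy when trying other blanking values (illustration only; the function name is made up):

```python
def refresh_hz(pixel_clock_hz, h_active, h_blanking, v_active, v_blanking):
    """Refresh rate = pixel clock / (total pixels per line * total lines per frame)."""
    total_pixels = h_active + sum(h_blanking)
    total_lines = v_active + sum(v_blanking)
    return pixel_clock_hz / (total_pixels * total_lines)

# 800 active pixels + (8+8+8) blanking, 480 active lines + (5+3+22) blanking
rate = refresh_hz(25_200_000, 800, (8, 8, 8), 480, (5, 3, 22))
print(f"{rate:.2f} Hz")   # 59.97 Hz
```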

    So you would just include this data timing format in the DAT section in your top application object and try to tweak the numbers of blanking pixels and blanking lines in the two longs below until it works (keeping the 800 active pixels and 480 lines constant while doing this):
    custom_lcd_timing
    long CLK252MHz
    long 252000000
    long (1<<31) | (8 << 24) | (8 << 16) | (8 << 8) | (800/8) ' example starting point only
    long (1<<31) | (5 << 23) | (3 << 20) | (22 << 11) | 480  ' example starting point only
    long 10 << 8
    long 0
    long 0
    You also need to modify your initialisation code to now use the custom timing data instead of standard VGA 640x480 mode by changing this code:
    	' obtain the VGA's timing for a given resolution
     	timing := vid.getTiming(VID#RES_640x480) 
    to this :
     	timing := @custom_lcd_timing
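    When tweaking the two packed longs it is easy to overflow a field. The Python sketch below packs them with range checks, using the field widths given in the comments of the 800x600 timing (1/7/8/8/8 bits horizontally, 1/8/3/9/11 bits vertically). The argument names a, b, c are deliberately neutral, since which blanking field is the front porch, sync or back porch is not spelled out here; this is an aid for experimenting, not the driver's own code.

```python
def pack_fields(fields):
    """Pack (value, shift, width) triples into one long, rejecting values that overflow."""
    word = 0
    for value, shift, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word |= value << shift
    return word

def horizontal_long(sync_pos, a, b, c, active_pixels):
    """Horizontal timing long: 1-bit polarity, 7/8/8-bit blanking fields, active pixels / 8."""
    return pack_fields([(sync_pos, 31, 1), (a, 24, 7), (b, 16, 8), (c, 8, 8),
                        (active_pixels // 8, 0, 8)])

def vertical_long(sync_pos, a, b, c, active_lines):
    """Vertical timing long: 1-bit polarity, 8/3/9-bit blanking fields, 11-bit active lines."""
    return pack_fields([(sync_pos, 31, 1), (a, 23, 8), (b, 20, 3), (c, 11, 9),
                        (active_lines, 0, 11)])

print(hex(horizontal_long(1, 8, 8, 8, 800)))   # 0x88080864
print(hex(vertical_long(1, 5, 3, 22, 480)))    # 0x82b0b1e0
```

    Note that the middle vertical field is only 3 bits wide, so values above 7 there would spill into the neighbouring field if packed by hand.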
  • The timing numbers above are actually pretty aggressive in terms of reduced horizontal blanking, which puts more load on the driver and could ultimately overload it once they are too low. It limits other things like drawing borders and mice and may cause timing failures if they are enabled. So in general it's nicer to reduce the vertical lines down more and give as much horizontal blanking time to the driver for its own per scan line overheads. You have 510 lines to play with which is a lot. My driver doesn't do very much during these blanking lines.
  • rogloh Posts: 2,776
    edited 2020-08-28 - 01:50:36
    Looking at that data sheet I had found earlier in the 7" LCD thread here http://forums.parallax.com/discussion/comment/1503509/#Comment_1503509

    if this is the same type of panel and so does apply to your LCD module, then the timing numbers above may not work.

    According to the data sheet you'd probably want to use a higher clock like the 270MHz one I defined in order to meet the 26.4Mpixel minimum. You could try this one... 882 total pixels, 510 total lines, ~60Hz refresh.
    long CLK270MHz
    long 270000000
    long (1<<31) | (16 << 24) | (16 << 16) | (50 << 8) | (800/8) ' example starting point only
    long (1<<31) | (7 << 23) | (3 << 20) | (20 << 11) | 480  ' example starting point only
    long 10 << 8
    long 0
    long 0

    EDIT: there were some incorrect values in the longs above that would have not worked, and also in the prior post! Fixed.
  • Thanks, rogloh!
    I'll give this some time in the next day or two. I'll report my results.

  • Terminal operations are working now... (made sense).

    I'm a bit lost on creating the 270 MHz... Gave it a try, though!
    Not getting video with the attached version of the demo (for 800x480)
    CLK270MHz    = %1_000100_0010000110_0000_10_00 '(20MHz/5) * 135/2= 270   MHz (for 800x480)
    A bit over my head...

  • evanh Posts: 9,970
    edited 2020-08-29 - 02:14:13
    You'll need to use DIVP of 1 rather than 2. 270 x 2 = 540 MHz. 540 MHz is too fast for the VCO in the PLL. Try this instead:
    CLK270MHz = %1_001001_0010000110_1111_10_00     '20MHz / 10 * 135 / 1 = 270 MHz

    PS: What will happen with DIVP of /2 is the VCO taps out at about 400 MHz so then sysclock is around 200 MHz. I say around because the PLL can't get a lock to the crystal. And without the PLL locked, sysclock frequency then becomes die temperature dependent. As it warms up the VCO frequency will go lower, well below 400 MHz.
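    To check a clock mode value before trying it, the fields can be decoded with a few lines of Python. This is a sketch based on the layout used in the two values posted above (%E_DDDDDD_MMMMMMMMMM_PPPP_CC_SS, with the divider and multiplier stored minus one, and PPPP = %1111 meaning divide by 1, otherwise (PPPP+1)*2); the 20 MHz crystal is taken from the comments in those values.

```python
def decode_pll_mode(mode, xtal_hz=20_000_000):
    """Return (vco_hz, sysclk_hz) for a P2 clock mode value, per the layout described above."""
    xdiv = ((mode >> 18) & 0x3F) + 1       # 6-bit crystal divider, stored minus 1
    xmul = ((mode >> 8) & 0x3FF) + 1       # 10-bit VCO multiplier, stored minus 1
    pppp = (mode >> 4) & 0xF               # 4-bit post-divider selector
    divp = 1 if pppp == 0b1111 else (pppp + 1) * 2
    vco_hz = xtal_hz * xmul // xdiv
    return vco_hz, vco_hz // divp

first_try = 0b1_000100_0010000110_0000_10_00   # 20 MHz / 5 * 135 / 2
corrected = 0b1_001001_0010000110_1111_10_00   # 20 MHz / 10 * 135 / 1

print(decode_pll_mode(first_try))   # (540000000, 270000000)
print(decode_pll_mode(corrected))   # (270000000, 270000000)
```

    Both values nominally give 270 MHz, but the first asks the VCO to run at 540 MHz, which is the problem described above.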
  • Sorry about the 270MHz, I forgot that wasn't already defined in the code you have and had left it to you to work out.

    If that timing doesn't work for you another one to try might be this, which gives exactly 60Hz refresh with a faster 31.5MHz pixel clock (note that the P2 is running quite overclocked at 315MHz). Total pixels = 1000, total lines = 525. This gives lots of time in the horizontal blanking for mice/borders etc.
    long CLK315MHz
    long 315000000
    long (1<<31) | (16 << 24) | (96 << 16) | (88 << 8) | (800/8) ' example starting point only
    long (1<<31) | (33 << 23) | (2 << 20) | (10 << 11) | 480  ' example starting point only
    long 10 << 8
    long 0
    long 0

    Chip seemed to use the P2 @ 330MHz in his 800x480 HDMI demo in the PNut demo files. Whether or not that timing worked on your LCD panel (or some other monitor) I'm not sure, I don't think I've seen that indicated 100% definitively from anyone. 330MHz would be too high for HyperRAM sysclk/1 operation though you could go down to half that.
  • Thanks, Evan!
    That was, of course, correct!

    Working 800x480 sample code attached...

  • GREAT! :smiley: :smiley:
  • rogloh Posts: 2,776
    edited 2020-08-29 - 02:22:26
    @dgately In my newer version about to be released I am having success with better printing and you can use SEND redirection with ersmith's formatting and it works nicely in PNut and FastSpin now. Much easier than just writing single chars and calling printStr etc. Of course you can still manage the screen buffer independently with your own hub writes etc, but it's nice to have the screen region scroll for you and just use simple code like this:
    OBJ
            f   : "ers_fmt"  ' format utility from Eric Smith
            vid : "p2videodrv"
    PUB demo() : i, j
            send := @vid.tx
            send("Hello", f.nl())
            repeat i from 0 to 10
                j:= i*i 
                send(f.dec(i), 9, f.dec(j), f.nl())

    I'm probably going to remove the inbuilt hex/dec printing stuff in the newer video driver API given this formatting capability exists now. No need to have a myriad of different driver versions of formatting APIs. If a common way is standardized it would be easier to port code to different output devices. SEND redirection is a good way to do this.
