Anti-aliased 24-bits-per-pixel HDMI — Parallax Forums

Anti-aliased 24-bits-per-pixel HDMI

I've been working on graphics for the P2, using the P2-EC32MB Edge module.

The PSRAM buffers 960x540 screens at 24bpp for a really nice picture over HDMI. The resolution isn't super high, but it looks surprisingly good with anti-aliasing.

I took the anti-aliased line-draw routine I made for the PC that the DEBUG displays use and got it running on the P2.

To try this code, you'll need a P2-EC32MB module and the DIGITAL VIDEO OUT board which connects 8 pins to an HDMI connector. I will talk about this on the live Propeller Forum tomorrow.

https://drive.google.com/file/d/1UVdZ3K8Q_14O703ysN0Moq1a7pkLBiCz/view?usp=sharing

Next, I want to make a triangle renderer with a Z buffer for 3D graphics.

With the "qHD" mode, or quarter-HD, we'll be able to show really nice anti-aliased fonts and graphics at the same time.


Comments

  • cgracey Posts: 14,133
    edited 2024-02-14 09:11

    Here are some colored lines. These are 1.5 pixels wide. The anti-aliased line draw has 8 sub-bits for each X, Y, and diameter. So, lines can be placed in X and Y at offsets of 256ths of a pixel. Line diameter is similar, but gets halved to make a radius in 256ths of a pixel. The minimum diameter is $100, or 1 whole pixel.
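
The 8 sub-bits described above amount to 8.8 fixed point: a value in 256ths of a pixel. A minimal Python sketch of that arithmetic follows. The names `to_fixed` and `radius_from_diameter` are hypothetical, and rejecting diameters below $100 with an error is an assumption; the post only states the minimum, not how the routine enforces it.

```python
SUB_BITS = 8                 # 8 fractional bits per coordinate, per the post
ONE_PIXEL = 1 << SUB_BITS    # $100 = 256 = one whole pixel
MIN_DIAMETER = ONE_PIXEL     # minimum line diameter stated in the post


def to_fixed(pixels: float) -> int:
    """Convert a measurement in pixels to 256ths of a pixel."""
    return round(pixels * ONE_PIXEL)


def radius_from_diameter(diameter_fixed: int) -> int:
    """Halve a diameter (in 256ths) to get a radius in 256ths of a pixel."""
    if diameter_fixed < MIN_DIAMETER:
        # Assumption: treat a too-small diameter as an error.
        raise ValueError("diameter must be at least $100 (one pixel)")
    return diameter_fixed >> 1


diameter = to_fixed(1.5)                 # the 1.5-pixel-wide lines: $180
radius = radius_from_diameter(diameter)  # $C0
```

So a 1.5-pixel line would carry a diameter of $180 and a radius of $C0 under this encoding.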

  • rogloh Posts: 5,184

    Is there a simple way to get it to build with flexspin or does it need to be PNut only? I was hoping to run this demo on my Mac however I immediately encountered two problems when I tried to build with flexspin...

    1. "repeat x with y" needed patching to the old "repeat y from 0 to x-1" - was simple to fix
    2. setregs doesn't appear to be implemented - I'll need to go check the latest version's release docs etc as I'm still on 6.2 beta but it's probably related to management of variables held in COGRAM which I suspect still differs significantly between flexspin and PNut.
    Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2023 Total Spectrum Software Inc. and contributors
    Version 6.2.0-beta-v6.1.7-2-g588815b8 Compiled on: Jun 15 2023
    LineDrawAntiAlias.spin2
    |-PSRAM_driver.spin2
    |-HDMI_960x540_24bpp.spin2
    LineDrawAntiAlias.spin2:131: error: syntax error, unexpected '#'
    LineDrawAntiAlias.spin2:137: error: syntax error, unexpected '#'
    

    Maybe I'll just have to wait to get some form of Windows running again...I am planning a dual boot MacBook Pro setup in time with a newer larger SSD fitted once I get to opening the thing and installing it.

  • evanh Posts: 15,209
    edited 2024-02-14 11:50

    @rogloh said:
    2. setregs doesn't appear to be implemented - I'll need to go check the latest version's release docs etc as I'm still on 6.2 beta but it's probably related to management of variables held in COGRAM which I suspect still differs significantly between flexspin and PNut.

    Looking at LineDrawAntiAlias.spin2 I see a large DAT section starting with ORG with all RES declares then everything else is ORGH. I'm not sure what Flexspin will make of a DAT section like that but if it compiles that section then it shouldn't be hard to then make a SETREGS like function to match.

    Maybe I'll just have to wait to get some form of Windows running again...I am planning a dual boot MacBook Pro setup in time with a newer larger SSD fitted once I get to opening the thing and installing it.

    Pnut, unlike Proptool, runs on Wine fine. It even does the full debug features.

  • evanh Posts: 15,209

    Chip,
    Your hblanking is way too short! I've found 80 to be about the minimum.

    PS: I've got it working via Wine. The no-picture had me scratching my head for a while though. Had to double check each part of the setup before discovering you had the horizontal blanking at only 16!

  • @rogloh said:
    1. "repeat x with y" needed patching to the old "repeat y from 0 to x-1" - was simple to fix

    That I'm pretty sure is in there, just the current version.

    @cgracey said:
    To try this code, you'll need a P2-EC32MB module and the DIGITAL VIDEO OUT board which connects 8 pins to an HDMI connector.

    Will need to make a VGA patch... Though the monitor I've been using is kinda dying. I got a fancy new capture card that could do HDMI, but ran into some medium driver issues. So bad monitor situation currently.

    @cgracey said:
    Next, I want to make a triangle renderer with a Z buffer for 3D graphics.

    My research on the topic has actually led me to believe that doing higher convex N-gons directly can be faster than just triangles. The setup is somewhat more complex, but that gets made up for when you draw a quadrilateral in one go (instead of two triangles drawing adjacent spans). That also makes clipping easier - when the tip of a triangle pokes outside the screen, clipping actually turns it into a quadrilateral. I think the worst case is a hexagon when all three tips are cropped by different clip planes. Of course clipping a quad can turn it into an octagon, but there really isn't a difference between rasterizing quads vs octagons. Just need to iterate through more vertices. Though interpolating values across the face is somewhat more complicated (need to recalculate scale factor per scanline) but would generally be nicer than an affine transform. I think perspective-correct interpolation needs per-scanline work, anyways, so maybe it doesn't matter there. I really haven't fully worked it out, either.

  • @cgracey said:

    Is there a message in there Chip?
    Stare long enough, and it seems to suggest... "don't touch the lonely red line" :smiley:

  • cgracey Posts: 14,133

    @evanh said:
    Chip,
    Your hblanking is way too short! I've found 80 to be about the minimum.

    PS: I've got it working via Wine. The no-picture had me scratching my head for a while though. Had to double check each part of the setup before discovering you had the horizontal blanking at only 16!

    Yeah, I found that on my TV it could be set minimally, in order to get to 60Hz refresh.

    All this timing was carry-in from the analog era. It seems that most of it can be squeezed out in HDMI. Sorry it was too short for your TV. I don't know what the minimum really is. This resolution was standard on many cell phones 12 years ago, but has since been eclipsed by higher resolutions.

  • cgracey Posts: 14,133

    @Wuerfel_21 said:

    @rogloh said:
    1. "repeat x with y" needed patching to the old "repeat y from 0 to x-1" - was simple to fix

    That I'm pretty sure is in there, just the current version.

    @cgracey said:
    To try this code, you'll need a P2-EC32MB module and the DIGITAL VIDEO OUT board which connects 8 pins to an HDMI connector.

    Will need to make a VGA patch... Though the monitor I've been using is kinda dying. I got a fancy new capture card that could do HDMI, but ran into some medium driver issues. So bad monitor situation currently.

    @cgracey said:
    Next, I want to make a triangle renderer with a Z buffer for 3D graphics.

    My research on the topic has actually led me to believe that doing higher convex N-gons directly can be faster than just triangles. The setup is somewhat more complex, but that gets made up for when you draw a quadrilateral in one go (instead of two triangles drawing adjacent spans). That also makes clipping easier - when the tip of a triangle pokes outside the screen, clipping actually turns it into a quadrilateral. I think the worst case is a hexagon when all three tips are cropped by different clip planes. Of course clipping a quad can turn it into an octagon, but there really isn't a difference between rasterizing quads vs octagons. Just need to iterate through more vertices. Though interpolating values across the face is somewhat more complicated (need to recalculate scale factor per scanline) but would generally be nicer than an affine transform. I think perspective-correct interpolation needs per-scanline work, anyways, so maybe it doesn't matter there. I really haven't fully worked it out, either.

    Yeah, it seems quadrilaterals would be fine. Even triangles typically get broken into TWO triangles at rendering, so that each begins and ends on a common Y. A section identical to the screen memory can be maintained in the PSRAM to act as a per-pixel Z buffer. Only nearer pixels get written to the screen memory and the corresponding location in the Z buffer is updated with the new distance. By alpha-blending the polygons onto the screen, I think it would look pretty good.
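
The per-pixel Z-buffer test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the buffers are plain in-memory lists rather than PSRAM regions, the screen is shrunk to 8x4, smaller `z` is taken to mean nearer, and `plot` is a hypothetical name.

```python
WIDTH, HEIGHT = 8, 4          # tiny stand-in for the 960x540 screen
FAR = float("inf")            # assumption: larger z means farther away

# One color entry and one depth entry per pixel, laid out identically.
color_buf = [[(0, 0, 0)] * WIDTH for _ in range(HEIGHT)]
z_buf = [[FAR] * WIDTH for _ in range(HEIGHT)]


def plot(x, y, z, color):
    """Write the pixel only if it is nearer than what is already stored."""
    if z < z_buf[y][x]:
        z_buf[y][x] = z           # record the new, nearer distance
        color_buf[y][x] = color   # and overwrite the screen pixel
        return True
    return False                  # hidden behind an earlier pixel


plot(2, 1, 5.0, (255, 0, 0))   # red at depth 5: stored
plot(2, 1, 9.0, (0, 0, 255))   # blue behind it: rejected
```

Each plotted pixel costs one depth read plus, when it passes, a depth write and a color write.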

  • Rayman Posts: 13,955

    Neat stuff. FTDI’s EVE series does subpixel stuff like that. This could be something I’d use with 7” hdmi tfts

    3D would be neat for accelerometer and/or IMU display…

  • @cgracey said:

    Yeah, it seems quadrilaterals would be fine. Even triangles typically get broken into TWO triangles at rendering, so that each begins and ends on a common Y.

    One can do it like that, but you end up splitting the long edge then. It's better to think of "which side will need a new vertex next" and then grab the next one up/down (depending on whether you're loading a right or a left vertex) and then recalculate that edge only. This all needs a bit of thought since you can go through multiple vertices without crossing an integer scanline boundary where you'd actually get to draw anything.

    A section identical to the screen memory can be maintained in the PSRAM to act as a per-pixel Z buffer. Only nearer pixels get written to the screen memory and the corresponding location in the Z buffer is updated with the new distance. By alpha-blending the polygons onto the screen, I think it would look pretty good.

    It's either or. Blending and Z-Buffer don't mix, because you can't meaningfully render behind something that's already been drawn with semi-transparency. So depending on how you do it, you'll either have weird occlusion effects (when blended geometry updates Z buffer) or a weird layering issue where further away objects seem to be on top of near objects (when blended geometry only reads Z buffer). This has always been the case. If you want to blend any pixels with the existing buffer (including edge AA), you need to draw in back-to-front order (or, with Z-buffering, draw all opaque geometry first and then the blended geometry in-order). This is slightly relaxed if the blend mode is commutative, i.e. you're doing add/sub or XOR blending. In such cases the blended geometry doesn't need to be sorted with itself, only with respect to the opaque stuff (no dice if you want to have both additive and alpha blend in the same scene though).

    Of course, if you can sort the entire scene you can just toss the Z-buffer entirely. Loads of ways to do this, all slightly fiddly (BSP trees say hello!). The most general is to just have a load of buckets (between 256 and 1024 or so?) that you sort each primitive into based on the Z value of its vertices. Generally the average of them all, but for something like the walls and floors of a room that objects may directly rub up against, you'd rather use the deepest vertex's Z to avoid sorting issues. This is how loads of CPU-only and early hardware 3D (Playstation 1, SEGA Saturn, etc) did it. PS1 documentation calls this method/the data structure an "ordering table". This requires that geometry processing and pixel drawing are done as two phases. But it's always somewhat imperfect, of course. You can very easily create a construct that defies any attempt at sorting:

    A BSP tree would split up such a construct and provide perfect ordering (not necessarily perfect depth sorting! things that don't overlap can be in any order...) in all cases, but is only good for fully static geometry. Of course, since the order that things get added to a single ordering table bucket is relevant in itself, one could combine BSP and OT to get perfect sorting within each static object and approximate sorting between objects.

    tl;dr: approaches to 3D graphics are infinite in number and infinitely interesting. Haven't even talked about the Quake edge-sorting algorithm (contrary to what many believe, Quake does not use BSPs for ordering).
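
The bucket-based "ordering table" described in this post can be sketched roughly as follows. This is not PS1 or P2 code, just an illustration of the two-phase idea: the depth range, the scene data, and all function names are hypothetical, and larger Z is assumed to mean farther from the viewer.

```python
NUM_BUCKETS = 256            # the post suggests somewhere between 256 and 1024
Z_NEAR, Z_FAR = 0.0, 100.0   # hypothetical depth range covered by the table


def bucket_index(z):
    """Map a depth to a bucket index, clamping to the ends of the table."""
    t = (z - Z_NEAR) / (Z_FAR - Z_NEAR)
    return min(NUM_BUCKETS - 1, max(0, int(t * NUM_BUCKETS)))


def sort_key_z(vertex_zs, use_deepest=False):
    """Average Z by default; deepest vertex for walls and floors."""
    return max(vertex_zs) if use_deepest else sum(vertex_zs) / len(vertex_zs)


def build_ordering_table(primitives):
    """Phase 1 (geometry): drop every primitive into its depth bucket."""
    table = [[] for _ in range(NUM_BUCKETS)]
    for name, vertex_zs, use_deepest in primitives:
        table[bucket_index(sort_key_z(vertex_zs, use_deepest))].append(name)
    return table


def draw_order(table):
    """Phase 2 (pixels): walk the buckets back to front."""
    return [name for bucket in reversed(table) for name in bucket]


scene = [
    ("crate", [10.0, 12.0, 11.0], False),
    ("floor", [5.0, 60.0, 60.0], True),    # floor sorts by its deepest vertex
    ("lamp", [30.0, 31.0, 32.0], False),
]
order = draw_order(build_ordering_table(scene))   # floor, lamp, crate
```

Within one bucket, primitives keep their insertion order, which is the property the post relies on when it suggests combining a BSP tree with the table.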

  • cgracey Posts: 14,133

    Wuerfel_21, I didn't say what I meant quite right. I know that alpha-blending whole polygons is impossible without Z-ordering per pixel. I meant to say that I would blend the edges, as in anti-alias them. I don't think this would have any detrimental effect. All polygons would be considered opaque, but the edges might as well get blended to reduce jaggies.

  • Wuerfel_21 Posts: 4,541
    edited 2024-02-14 21:09

    @cgracey said:
    Wuerfel_21, I didn't say what I meant quite right. I know that alpha-blending whole polygons is impossible without Z-ordering per pixel. I meant to say that I would blend the edges, as in anti-alias them. I don't think this would have any detrimental effect. All polygons would be considered opaque, but the edges might as well get blended to reduce jaggies.

    There's no difference though - the weird effect would be isolated to the edges, but you always get artifacts if you do something that reads the color underneath at all.

  • evanh Posts: 15,209
    edited 2024-02-14 21:50

    @cgracey said:

    @evanh said:
    Chip,
    Your hblanking is way too short! I've found 80 to be about the minimum.

    PS: I've got it working via Wine. The no-picture had me scratching my head for a while though. Had to double check each part of the setup before discovering you had the horizontal blanking at only 16!

    Yeah, I found that on my TV it could be set minimally, in order to get to 60Hz refresh.

    All this timing was carry-in from the analog era. It seems that most of it can be squeezed out in HDMI. Sorry it was too short for your TV. I don't know what the minimum really is. This resolution was standard on many cell phones 12 years ago, but has since been eclipsed by higher resolutions.

    Yeah, I don't know what is a safe generalised minimum either.

    As for the resolution, in DVI/HDMI there's no restrictions on selection other than multiples of 8 for horizontal and obviously there is a max resolution supported.

    You could choose a resolution from the desired dotclock and refresh: Start with 32 MHz and 60 Hz. 32e6 / 60 = 533e3 total dot area, sqrt = 730, x 1.333 = 974 htot, - 80 hblank = 894, round = 896 hres, / 1.78 = 504 vres.

    Interestingly, tweaking these, I find my TV is good down to 60 hblanking here. I'm not sure how I figured 80 as the minimum to be honest.

    EDIT: So, redoing it at hblank = 60: ... 974 htot, - 60 hblank = 914, round = 912 hres, / 1.78 = 513 vres.

    Of course, 960x540 works fine at 56 Hz refresh too.
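
The arithmetic in this post can be reproduced with a short Python sketch. It follows the steps as written (total dots per frame, square root scaled for a 16:9 shape, subtract hblank, round to a multiple of 8) and, like the post, ignores vertical blanking. The function name `pick_resolution` and its parameters are hypothetical.

```python
import math


def pick_resolution(dotclock_hz, refresh_hz, hblank, aspect=16 / 9):
    """Estimate htot, hres and vres from a dot clock and refresh rate."""
    total_dots = dotclock_hz / refresh_hz          # dots per frame
    # sqrt(total * 16/9) equals sqrt(total) x 1.333, as in the post.
    htot = round(math.sqrt(total_dots * aspect))
    hres = round((htot - hblank) / 8) * 8          # multiple of 8 for DVI/HDMI
    vres = round(hres / aspect)
    return htot, hres, vres


print(pick_resolution(32e6, 60, hblank=80))   # (974, 896, 504)
print(pick_resolution(32e6, 60, hblank=60))   # (974, 912, 513)
```

Both results match the figures worked out by hand above.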

  • evanh Posts: 15,209
    edited 2024-02-14 22:43

    Huh, never expected that. My TV is also fussy about the vertical back porch (top blanking lines). It needs a minimum of 9 lines there. I had been unsure about where to place the vsync, so it looks like there's more leeway when it's at the beginning of the blanking.

    Roger,
    Need more allocated bits for this in your timings structure!

  • evanh Posts: 15,209
    edited 2024-02-14 23:27

    Here's an example using the tightest DVI/HDMI blanking timings for my TV:

     Sysclock freq = 320 MHz   Dotclock freq = 32.0 MHz
     Hres=1280  hfp=4 hsync=52 hbp=4  Htot=1340   Hfreq = 23881 Hz
     Vres=640  vfp=1 vsync=2 vbp=9  Vtot=652   Vfreq = 36.6 Hz
    

    EDIT: It also seems to be happy to accept up to 75 Hz refresh rate but I know other monitors I have top out at 60 Hz refresh.

     Sysclock freq = 172 MHz   Dotclock freq = 17.2 MHz
     Hres=640  hfp=4 hsync=52 hbp=4  Htot=700   Hfreq = 24571 Hz
     Vres=320  vfp=1 vsync=2 vbp=9  Vtot=332   Vfreq = 74.0 Hz
    

    EDIT2: Uh-oh, so the vblanking has more complexity here. It can go lower when the vres is lower, which means it probably also requires more when the vres is higher ... or not. The 640 vres didn't need any more blanking ... and 640x800 is fine too ... 640x1080 also good. That's the max vertical.
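
The Htot, Vtot, Hfreq and Vfreq figures in the two listings above follow directly from the porch and sync values. A minimal Python sketch of that relationship, with a hypothetical function name and parameter names taken from the listing labels:

```python
def scan_rates(dotclock_hz, hres, hfp, hsync, hbp, vres, vfp, vsync, vbp):
    """Return (htot, vtot, hfreq_hz, vfreq_hz) for a set of video timings."""
    htot = hres + hfp + hsync + hbp     # dots per scan line, blanking included
    vtot = vres + vfp + vsync + vbp     # lines per frame, blanking included
    hfreq = dotclock_hz / htot          # scan lines per second
    vfreq = hfreq / vtot                # frames per second
    return htot, vtot, hfreq, vfreq


# First listing: 1280x640 at a 32.0 MHz dot clock.
htot, vtot, hfreq, vfreq = scan_rates(32.0e6, 1280, 4, 52, 4, 640, 1, 2, 9)
print(htot, vtot, round(hfreq), round(vfreq, 1))   # 1340 652 23881 36.6
```

Feeding in the second listing (17.2 MHz, 640x320, same blanking) gives 700, 332, 24571 Hz and 74.0 Hz.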

  • rogloh Posts: 5,184
    edited 2024-02-15 00:11

    @Wuerfel_21 said:

    @rogloh said:
    1. "repeat x with y" needed patching to the old "repeat y from 0 to x-1" - was simple to fix

    That I'm pretty sure is in there, just the current version.

    Yeah I'm outta date yet again. Need to upgrade.

    @evanh said:

    @rogloh said:
    2. setregs doesn't appear to be implemented - I'll need to go check the latest version's release docs etc as I'm still on 6.2 beta but it's probably related to management of variables held in COGRAM which I suspect still differs significantly between flexspin and PNut.

    Looking at LineDrawAntiAlias.spin2 I see a large DAT section starting with ORG with all RES declares then everything else is ORGH. I'm not sure what Flexspin will make of a DAT section like that but if it compiles that section then it shouldn't be hard to then make a SETREGS like function to match.

    I've started working down that path. I just need to call the smoothline function from Spin somehow - or make it inline.

    @evanh said:
    Huh, never expected that. My TV is also fussy about the vertical back porch (top blanking lines). It needs a minimum of 9 lines there. I had been unsure about where to place the vsync, so it looks like there's more leeway when it's at the beginning of the blanking.

    Roger,
    Need more allocated bits for this in your timings structure!

    LOL, your favourite bugbear. How many bits do you need for it?

    Speaking of timings I temporarily commented out the DAT PASM section in Chip's demo code and hacked in my own smoothline function for flexspin to use without the anti alias stuff and found my resurrected Dell2405FPW did accept the timings. They are tight! 16 pixels of horizontal and 9 lines of vertical blanking with a 960x540 active area at 60Hz. Cool.

    Amazing you can get something with this little blanking going. No way it works with VGA at this rate, it needs more blanking for that. My own video driver certainly wouldn't be able to do this little horizontal blanking due to its other housekeeping code required in this interval, like issuing external memory reads and loading in palettes to LUTRAM etc. Also, I found it didn't sync at all on another 17 inch TFT I have though (Samsung B1740). So not all monitors are going to like this signal.

  • evanh Posts: 15,209
    edited 2024-02-15 01:10

    @rogloh said:
    LOL, your favourite bugbear. How many bits do you need for it?

    I'll get back to you on that. I want to allow space for VRR in the vertical allocations.

    ... They are tight! 16 pixels of horizontal and 9 lines of vertical blanking with a 960x540 active area at 60Hz. Cool.

    I think Chip might have it at just 8 lines blanking. Isn't the single first line also the sync? ie front porch = 0.

    And that 8 works for me. Just had to up the hblanking to 60.

    ... My own video driver certainly wouldn't be able to do this little horizontal blanking due to its other housekeeping code required in this interval, like issuing external memory reads and loading in palettes to LUTRAM etc. ...

    Oh, I was using your driver in my testing above ... I still need the hblanking of 60 but I can reduce the vblanking further using Chip's driver. So blanking of 60x8, instead of 60x12, works now. Not sure of other resolutions, Chip's program needs 960x540 specifically.

  • rogloh Posts: 5,184

    @evanh said:
    I think Chip might have it at just 8 lines blanking. Isn't the single first line also the sync? ie front porch = 0.

    Just double checked the code, yes you are correct. It's just 8 vblank lines total including the sync.

  • cgracey Posts: 14,133

    @rogloh said:

    @evanh said:
    I think Chip might have it at just 8 lines blanking. Isn't the single first line also the sync? ie front porch = 0.

    Just double checked the code, yes you are correct. It's just 8 vblank lines total including the sync.

    Yeah, it's one vsync line and seven blanks. I need to know how tight this can be safely pushed. Ada said today that we need something like 34 total horizontal blank pixel periods to accommodate data packets for sound.

    It would be good to know exact numbers.

  • evanh Posts: 15,209
    edited 2024-02-15 10:54

    An old Dell U2412M DVI monitor (my first LCD monitor) wants a minimum hblank of 68. But it's a lot fussier about resolution options too. More like how VGA inputs work. Ah, uh-oh, the tight timings only seem to work for modes that weren't a VGA type mode. Basically, it's rubbish at adjusting even though it could do so more easily than the fixed modes list it has been programmed with.

    So it looks like the fully flexible resolution detection is actually a newish (last ten years or so) ability of monitors and TVs.

  • evanh Posts: 15,209
    edited 2024-02-15 10:57

    @evanh said:
    So it looks like the fully flexible resolution detection is actually a newish (last ten years or so) ability of monitors and TVs.

    Maybe that came along with firmwares that supported HDMI. Dunno.

  • pik33 Posts: 2,352

    Here are my timings for 1024x600@50 Hz:

    '                      bf.hs, hs,  bf.vis  visible, up p., vsync, down p.,  cpl, total lines, clock,       hubset                                scanlines  ud bord mode reserved
    timings         long   8,     60,  8,       1024,   7,     4,     1,        128, 600,         340500000,   %1_100111__10_1010_1000__1111_1011,   600,        0,     192, 0, 0
    

    76 pixel horizontal, 12 lines vertical.

    It worked on what I managed to connect to the P2...

  • TonyB_ Posts: 2,127

    How low could sysclk go for 960x540 @ 50Hz? (I don't have the Digital Video Out board.)

  • pik33 Posts: 2,352
    edited 2024-02-15 22:33

    How low could sysclk go for 960x540 @ 50Hz? (I don't have the Digital Video Out board.)

    If sync timings similar to my 1024x600 ones are used, something about 290 MHz.

  • evanh Posts: 15,209
    edited 2024-02-16 01:16

    Chip's timings: 50 x (540 + 8) x (960 + 16) x 10 = 267.424 MHz
    Evan's timings: 50 x (540 + 8) x (960 + 60) x 10 = 279.48 MHz
    Pik's timings: 50 x (540 + 12) x (960 + 76) x 10 = 285.936 MHz

    PS: Pik's timings will provide the most universal coverage. That'll suit my old Dell because it's not a recognised VGA mode and therefore it'll accept reduced blanking.
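
The three sysclk figures above all come from the same product: refresh rate times total lines times total dots per line times 10. A minimal Python sketch, with a hypothetical function name; the factor of 10 is taken from the post and matches the 10:1 sysclock-to-dotclock ratio in the timing listings earlier in the thread.

```python
def sysclk_hz(refresh_hz, vres, vblank, hres, hblank, clocks_per_pixel=10):
    """Minimum system clock for a mode, given total blanking in each axis."""
    return refresh_hz * (vres + vblank) * (hres + hblank) * clocks_per_pixel


# 960x540 at 50 Hz with each set of blanking figures from the thread.
for who, vblank, hblank in (("Chip", 8, 16), ("Evan", 8, 60), ("Pik", 12, 76)):
    mhz = sysclk_hz(50, 540, vblank, 960, hblank) / 1e6
    print(f"{who}: {mhz:.3f} MHz")
```

This prints 267.424, 279.480 and 285.936 MHz, matching the hand calculations.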

  • rogloh Posts: 5,184
    edited 2024-02-16 02:05

    I was able to modify Chip's code slightly to run on flexspin to see this demo on my Mac. :smile: It just runs a separate COG and waits for a command to occur via a cmd mailbox. Quick hack for now.

    If anyone else wants to run it without using PNut/PropTool give this version a go by replacing LineDrawAntiAlias.spin2 with this attached file (and keep the other two source files as they are from Chip's code).

    Tested on flexspin v6.2.0-beta.

    Not sure why the top frame buffer scan line is not being cleared fully at the start (driver cog not ready?), but didn't dig very far into it. Also there is a warning about no flags being set and if you try to fix it with a "wz" added it then messes up the anti-aliased graphics. Maybe these are flexspin vs PNut differences or some other bugs. In any case you still see the demo.

    LineDrawAntiAlias.spin2:660: warning: instruction cmp used without flags being set

    EDIT: yes if I add a startup delay of 1sec before gfx are drawn the uncleared scan line disappears. Seems to be COG startup timing related.

  • evanh Posts: 15,209
    edited 2024-02-16 03:08

    Replacing the REPEAT with the WAITMS worked for me.

      coginit(NEWCOG, @gfxcog, @cmdbuf)
    '  repeat while cmdbuf[0]
      waitms(1)
    

    Which suggests the REPEAT isn't working.

  • rogloh Posts: 5,184
    edited 2024-02-16 03:46

    @evanh said:
    Replacing the REPEAT with the WAITMS worked for me.

      coginit(NEWCOG, @gfxcog, @cmdbuf)
    '  repeat while cmdbuf[0]
      waitms(1)
    

    Which suggests the REPEAT isn't working.

    I found it happens earlier on and it needs a 1ms wait after the hdmi COG is spawned before the first PSRAM clearing write access can occur. Still not entirely sure why. EDIT: One possible theory is that it could be priority related if the HDMI COG is spawned while a large PSRAM write is underway and the video COG's PSRAM initial reads are delayed and get out of sync with the scan line being rendered. Unlike my driver there is no priority for video COGs and fragmentation for non-realtime COGs in Chip's PSRAM driver, which could cause problems like this.

    PUB start()
    
      psram.start()
    
      hdmi.start(0, psram.pointer(), 0)
      waitms(1)  '  <<<  adding this fixes graphics issue with first scan line
    
      psram_ptr := psram.pointer() + cogid() * 12
    
      mapbase := 0
      pixeltype := @smooth_pixel1
    
  • pik33 Posts: 2,352

    Chip's timings: 50 x (540 + 8) x (960 + 16) x 10 = 267.424 MHz

    16 horizontal? I had problems even at 60 with several monitors.

  • evanh Posts: 15,209

    @pik33 said:
    16 horizontal? I had problems even at 60 with several monitors.

    Were those all using DVI/HDMI links?
