VGA 800x600 4bpp driver with text & graphics support + XGA 1024x768 4bpp - Page 4 — Parallax Forums

VGA 800x600 4bpp driver with text & graphics support + XGA 1024x768 4bpp

124

Comments

  • Rayman Posts: 14,641
    edited 2016-09-11 17:31
    I've modified gfx2 in order to give WSVGA output (1024x600@72Hz)
    See attached.

    Mostly just multiplied things by 1024/800
    But, output was jagged, looked like the XGA driver posted here.
    Fixed that by changing a XCONT to XZERO.

    There was also one hardcoded ##800 in there that had to change to w.

    I remember we had to use that trick before to stabilize the display.
    Zeros the phase for each line so that rollovers (if they occur) are at same place on every line.
    This trick could probably also fix the XGA driver here.
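
A toy Python model of why zeroing the phase on each line can help. This is only a sketch, not the streamer's actual logic: an accumulator adds a frequency word every clock and emits a pixel tick on rollover. The 8-bit accumulator, the frequency word 80 and the 10 clocks per line are made-up numbers for illustration, and `rollover_clocks` is a hypothetical helper.

```python
def rollover_clocks(freq, clocks_per_line, lines, reset_each_line, bits=8):
    """Return, for each line, the clock indices at which the accumulator rolls over."""
    mask = (1 << bits) - 1
    phase = 0
    per_line = []
    for _ in range(lines):
        if reset_each_line:
            phase = 0  # models zeroing the phase at the start of the line
        ticks = []
        for clk in range(clocks_per_line):
            phase += freq
            if phase > mask:  # rollover: a new pixel starts on this clock
                phase &= mask
                ticks.append(clk)
        per_line.append(ticks)
    return per_line


free_running = rollover_clocks(80, 10, 3, reset_each_line=False)
zeroed = rollover_clocks(80, 10, 3, reset_each_line=True)
print("phase carried over:", free_running)
print("phase zeroed:      ", zeroed)
```

With the phase carried over, the rollovers land on different clocks from one line to the next, which would show up as jagged vertical edges. With the phase zeroed, every line gets the same pattern.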
  • Rayman wrote: »
    This trick could probably also fix the XGA driver here.
    You're right, Rayman.
    It seems to have fixed the XGA display "jagged edge issue" too.
    Nice one! :)


  • Rayman Posts: 14,641
    Back to looking at the 1024x600 version of this...
    Had to change all adra to PA and adrb to PB to compile with the V12 version of P2.

    One nice thing about width=1024 is that you could use a shift instead of a multiply in the set pixel routine:
    'RJA:  Note that nice thing about w=1024 is that multiply is just a shift, don't need this cordic
                    mov     ptra,y1
                    qmul    ptra,##w/2 'RJA 800 / 2          '2 pixels per byte
                    getrnd  ptra   'RJA:  What does this do? Stalling?
                    getqx   ptra
    

    I am curious what the "getrnd" statement is there for. Not even sure what it does. Get a rounded number? It would work the same without that statement, right?
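
The shift-for-multiply point can be sketched in Python. This is a model under assumptions, not the driver's code: the function names are hypothetical, and returning the nibble index as `x & 1` assumes nothing about which nibble the driver uses for the even pixel.

```python
def pixel_address(x, y, width):
    """Generic 4bpp address: two pixels per byte, so a line is width // 2 bytes."""
    return y * (width // 2) + (x >> 1), x & 1


def pixel_address_1024(x, y):
    """Same result for width = 1024: 1024 // 2 = 512 = 2**9, so y * 512 is y << 9."""
    return (y << 9) + (x >> 1), x & 1


# Both routes agree for a sample pixel on a 1024-wide screen.
print(pixel_address(801, 3, 1024) == pixel_address_1024(801, 3))
```

For the 800-wide mode the line length is 400 bytes, which is not a power of two, so a real multiply is still needed there.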
  • cgracey Posts: 14,152
    GETRND gets a pseudo-random number from the hub. It doesn't make any sense to me why that instruction is in the code that you showed.
  • Rayman Posts: 14,641
    edited 2016-10-10 18:35
    Thanks. I'm tinkering more with this driver.

    Just substituted in Chip's line drawing routine from Graphics.spin:
    ' Plot line from px,py to dx,dy     (from graphics.spin)
    ' RJA:  Replaced line draw code with that of P1's graphics.spin
    'plot line from x1,y1 to x2,y2 in current color
    'one difference is that need to restore x1,y1 before exiting
    
    plot_line
                            'First, form px,py and dx,dy
                            mov     dx,x2
                            mov     dy,y2
                            mov     px,x1
                            mov     py,y1
                            
                            'cmps    dx,x1           wc', wr  'get x difference
                            subs    dx,x1          'P2 version of above
                            negc    asx,#1                   'set x direction
    
                            'cmps    dy,y1           wc', wr  'get y difference
                            subs    dy,y1           'P2 version of above
                            negc    asy,#1                   'set y direction
    
                            abs     dx,dx                   'make differences absolute
                            abs     dy,dy
    
                            cmp     dx,dy           wc      'determine dominant axis
            if_nc           tjz     dx,#.last               'if both differences 0, plot single pixel
            if_nc           mov     count,dx                'set pixel count
            if_c            mov     count,dy
                            mov     ratio,count             'set initial ratio
                            shr     ratio,#1
            if_c            jmp     #.yloop                 'x or y dominant?
    
    
    .xloop                  call    #set_pixel                   'dominant x line
                            add     x1,asx
                            sub     ratio,dy        wc
            if_c            add     ratio,dx
            if_c            add     y1,asy
                            djnz    count,#.xloop
    
                            jmp     #.last                  'plot last pixel
    
    
    .yloop                  call    #set_pixel                  'dominant y line
                            add     y1,asy
                            sub     ratio,dx        wc
            if_c            add     ratio,dy
            if_c            add     x1,asx
                            djnz    count,#.yloop
    
    .last                   call    #set_pixel                   'plot last pixel
    
    linepd_ret              ret
    
    'count   long 0
    ratio   long 0
    asx     long 0
    asy     long 0
    dx      long 0
    dy      long 0
    px      long 0
    py      long 0
    

    Haven't taken the time to figure out exactly how it works, but it seems very efficient...

    Did have some trouble with "cmps" and its "wr". Don't think "wr" exists on P2.
    But, eventually figured out I can just swap it with "subs".
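
For anyone trying to follow how the routine works, here is a rough Python model of the same loop structure. It is a sketch that mirrors the listing above rather than a verified port: `plot_line` returns the pixels instead of calling `set_pixel`, and the unsigned borrow from `sub ratio,... wc` is modelled as the running value going negative.

```python
def plot_line(x1, y1, x2, y2):
    """Return the pixels visited from (x1, y1) to (x2, y2), endpoints included."""
    dx, dy = x2 - x1, y2 - y1
    asx = -1 if dx < 0 else 1        # x step direction
    asy = -1 if dy < 0 else 1        # y step direction
    dx, dy = abs(dx), abs(dy)
    x, y, pixels = x1, y1, []
    if dx >= dy:                     # x dominant (also covers dx == dy == 0)
        ratio = dx >> 1              # start the error term at half the count
        for _ in range(dx):
            pixels.append((x, y))
            x += asx
            ratio -= dy
            if ratio < 0:            # borrow: time to step the minor axis
                ratio += dx
                y += asy
    else:                            # y dominant
        ratio = dy >> 1
        for _ in range(dy):
            pixels.append((x, y))
            y += asy
            ratio -= dx
            if ratio < 0:
                ratio += dy
                x += asx
    pixels.append((x, y))            # the ".last" pixel
    return pixels


print(plot_line(0, 0, 4, 2))
```

The loop runs once per step along the dominant axis, and the error term steps the other axis each time it borrows, so the last pixel always lands on the far endpoint.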
  • Rayman wrote: »
    Back to looking at the 1024x600 version of this...
    Had to change all adra to PA and adrb to PB to compile with the V12 version of P2.

    One nice thing about width=1024 is that you could use a shift instead of a multiply in the set pixel routine:
    'RJA:  Note that nice thing about w=1024 is that multiply is just a shift, don't need this cordic
                    mov     ptra,y1
                    qmul    ptra,##w/2 'RJA 800 / 2          '2 pixels per byte
                    getrnd  ptra   'RJA:  What does this do? Stalling?
                    getqx   ptra
    

    I am curious what the "getrnd" statement is there for. Not even sure what it does. Get a rounded number? It would work the same without that statement, right?
    The GETRND instruction was put in between the CORDIC instructions simply to index the random number in the idle time waiting for the CORDIC result.
  • Rayman Posts: 14,641
    edited 2016-10-11 16:10
    Ok thanks. That makes sense. There's a lot of time to wait with cordic.
    I think you could have used the regular 16-bit MUL instruction there instead though, right?

    The SVGA and WSVGA modes look very stable on my home setup.
    There is only one flaw that I see.
    On the very bottom line, all the way on the right edge, there looks to be two pixels that are always white and one that flickers white.

    It's very strange. I think it has to be the monitor because I don't see any way it could be the driver.
    However, the 640x480 mode doesn't show this.
    Anyway, I'll try it on a different monitor and see what's what...

    Worst case, maybe it's possible to output the native resolution and do the scaling inside the P2.
  • Rayman wrote: »
    I think you could have used the regular 16-bit MUL instruction there instead though, right?
    Correct Ray.
    Not sure why I used the Cordic there, must have been in spinning fozzie mode. :)


    I've made that change and updated code to V12 FPGA image.
  • Rayman Posts: 14,641
    Great, thanks. I'm still thinking this 4-bit color mode is going to be the best. All the nibble opcodes make it pretty efficient. Not quite as good as 8-bit, but plenty for simple text windows.

    BTW: I'm not sure I'm sold on this cordic for multiply. 38 clocks right? Seems to me that simple rotate and add version should be 32 clocks. Maybe I'm missing something...
  • potatohead Posts: 10,261
    edited 2016-10-12 02:49
    Cordic can be happening in parallel.
  • The beauty of the Cordic is you can have more than 1 operation in the pipeline.
    Time to do other stuff and/or service interrupts while calculations are in progress. Neat!

    I had a quick look at coding a 32 * 32 multiply and arrived at 44 clocks.
    Maybe some fresh minds/eyes could trim that down. ;)
    '32 * 32 multiply
    
    	mov	v0,##$12345678	'test values
    	mov	v1,##$87654321
    
    	getword	r1,v0,#1	'2
    	getword	w1,v1,#1	'4
    	mul	r1,w1		'6
    
    	getword	r0,v0,#0	'8
    	getword	w1,v1,#0	'10
    	mul	r0,w1		'12
    
    	getword	w0,v0,#1	'14
    	getword	w1,v1,#0	'16
    	mul	w0,w1		'18
    	getword	w1,w0,#0	'20
    	getword	w2,w0,#1	'22
    	shl	w1,#16		'24
    	add	r0,w1 wc,wz	'26
    	addx	r1,w2		'28
    
    	getword	w0,v0,#0	'30
    	getword	w1,v1,#1	'32
    	mul	w0,w1		'34
    	getword	w1,w0,#0	'36
    	getword	w2,w0,#1	'38
    	shl	w1,#16		'40
    	add	r0,w1 wc,wz	'42
    	addx	r1,w2		'44 clocks
    
    'r1:r0 = $09A0CD05_70B88D78
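
A Python model of the same four-partial-product scheme, handy for checking the result quoted above. It is a sketch that follows the listing's structure; `mul32x32` is a hypothetical name and the masks stand in for 32-bit register wraparound.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul32x32(v0, v1):
    """Unsigned 32 x 32 -> 64-bit multiply built from four 16 x 16 products."""
    a_hi, a_lo = v0 >> 16, v0 & MASK16
    b_hi, b_lo = v1 >> 16, v1 & MASK16
    r1 = a_hi * b_hi                  # high long starts as hi * hi
    r0 = a_lo * b_lo                  # low long starts as lo * lo
    for cross in (a_hi * b_lo, a_lo * b_hi):
        total = r0 + ((cross & MASK16) << 16)   # add low word of cross term, shifted up
        carry = total >> 32                     # carry out of the low long (add ... wc)
        r0 = total & MASK32
        r1 = (r1 + (cross >> 16) + carry) & MASK32  # addx: high word plus carry
    return r1, r0


r1, r0 = mul32x32(0x12345678, 0x87654321)
print(f"r1:r0 = ${r1:08X}_{r0:08X}")
```

The test values from the listing give the same r1:r0 pair as a direct 64-bit multiply.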
    
    
  • Rayman Posts: 14,641
    edited 2016-10-12 13:38
    I guess I'd like to know more about the accuracy and precision of the cordic solver.
    It's been a long while, but seem to recall that the precision depended on how many iterations were made. I think this was originally going to be user decided.

    Anyway, if multiply and divide give exactly the same result as the long multiply and long division, like what ozpropdev just posted, then I guess it's perfect.

    For the other cordic functions, I guess I'm not as concerned. Although, it would be nice to have a statement about precision in the docs...

    This is all a bit off topic from this thread though...
  • cgracey Posts: 14,152
    Rayman wrote: »
    I guess I'd like to know more about the accuracy and precision of the cordic solver.
    It's been a long while, but seem to recall that the precision depended on how many iterations were made. I think this was originally going to be user decided.

    Anyway, if multiply and divide give exactly the same result as the long multiply and long division, like what ozpropdev just posted, then I guess it's perfect.

    For the other cordic functions, I guess I'm not as concerned. Although, it would be nice to have a statement about precision in the docs...

    This is all a bit off topic from this thread though...

    Because it is pipelined, it always does the whole 32 iterations. It is as close to perfect as you can get with 32 bits.
  • jmg Posts: 15,173
    Rayman wrote: »
    The SVGA and WSVGA modes look very stable on my home setup.
    There is only one flaw that I see.
    On the very bottom line, all the way on the right edge, there looks to be two pixels that are always white and one that flickers white.

    It's very strange. I think it has to be the monitor because I don't see any way it could be the driver.
    However, the 640x480 mode doesn't show this.
    Anyway, I'll try it on a different monitor and see what's what...

    Did you check other monitors ?
    Are those pixels outside or inside the blanking rectangle ?
    What happens if you vary the blanking edges, and total clocks per line, slightly ?

  • Rayman Posts: 14,641
    Just did. I was surprised to see the same thing on a different monitor.
    Maybe there is something strange going on with the driver...

    I modified the demo to put a box around the visible pixels.
    Came out pink this time.
    Look in the corner for the white pixels:

    [Attached photo: 1632 x 1224, 344K]
  • jmg Posts: 15,173
    Rayman wrote: »
    I modified the demo to put a box around the visible pixels.
    Came out pink this time.
    Look in the corner for the white pixels:

    So those should be pink, but instead are white ?
    Does look like some issue within P2

    Rayman wrote: »
    .... I'm still thinking this 4-bit color mode is going to be the best. All the nibble opcodes make it pretty efficient. Not quite as good as 8-bit, but plenty for simple text windows.

    Is this 4b going via CLUT ? 4b via a table is probably just tolerable, as you could update the table every line, for blends, for example.


  • Rayman Posts: 14,641
    For games, you probably want 8-bit mode.
    But, for basic GUIs with controls and text, I think this is best.
  • 4 bit mode is a good happy medium that leaves you at least half of the available hub ram for code.
    800 * 600 @ 8 bits = 480k , doesn't leave much room for code.
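
The buffer arithmetic is easy to check with a few lines of Python. This is a quick sketch: the 512 KB hub size is the figure used in this thread, and `framebuffer_bytes` is a hypothetical helper.

```python
HUB_RAM_BYTES = 512 * 1024  # 524,288 bytes


def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for a packed bitmap of the given size and depth."""
    return width * height * bits_per_pixel // 8


for name, w, h, bpp in (("800x600 8bpp", 800, 600, 8),
                        ("800x600 4bpp", 800, 600, 4),
                        ("1024x600 4bpp", 1024, 600, 4),
                        ("1024x768 4bpp", 1024, 768, 4)):
    size = framebuffer_bytes(w, h, bpp)
    print(f"{name}: {size:,} bytes, {HUB_RAM_BYTES - size:,} left")
```

At 800x600 the 4-bit buffer takes 240,000 bytes, a bit under half of the hub, while the 8-bit buffer takes 480,000.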
  • jmg Posts: 15,173
    ozpropdev wrote: »
    4 bit mode is a good happy medium that leaves you at least half of the available hub ram for code.
    800 * 600 @ 8 bits = 480k , doesn't leave much room for code.

    Of course, but 4 bits -> 8/16b CLUT -> DACs keeps the RAM cost the same, but gives more pastel colour choices which are trendier than the primary test pattern colours..

  • Rayman Posts: 14,641
    Right, that's what this driver does... 16 colors using 4-bits per pixel...

    You can make the 16-color palette with any colors you want...
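
As a rough illustration of the 4-bit index plus palette idea, here is a Python sketch. The low-nibble-first pixel order and the 24-bit RGB palette entries are assumptions made for the example, not details taken from the driver.

```python
def expand_line(packed, palette):
    """Turn packed 4bpp bytes into a list of palette colours, two pixels per byte."""
    assert len(palette) == 16
    out = []
    for byte in packed:
        out.append(palette[byte & 0x0F])   # first pixel: low nibble (assumed order)
        out.append(palette[byte >> 4])     # second pixel: high nibble
    return out


# Hypothetical palette: 16 grey levels, but any 16 colours could go here.
greys = [(i * 17) * 0x010101 for i in range(16)]
print([f"{c:06X}" for c in expand_line(bytes([0x10, 0xF8]), greys)])
```

Changing a palette entry recolours every pixel that uses that index without touching the bitmap, so the RAM cost stays at 4 bits per pixel.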
  • rjo__ Posts: 2,114
    gfx7:

    I can't take a picture of it tonight. But in 640x480 mode, roughly the top 1/3 of the image shows a horizontal displacement error... has an almost sinusoidal character. The upper right corner shows a "tear", where it appears that the horizontal reset hasn't been reached yet but no data is being sent. This all disappears in the lower portion of the screen... but in the lower right corner there are 4 spurious pixels at the end of the last line: two set, one empty and one set.

    The distortion artifact does not occur in Chip's 8bit VGA sample.

    I don't understand how a timing error can occur in the top 1/3 but not the bottom 2/3:)

  • rjo__ Posts: 2,114
    After playing some more, changing to:

    fpix = 25_000_000.0

    seems to fix it...

    But I still don't get it:)

  • jmg Posts: 15,173
    rjo__ wrote: »
    ... But in 640x480 mode, roughly the top 1/3 of the image shows a horizontal displacement error... has an almost sinusoidal character. ...
    I don't understand how a timing error can occur in the top 1/3 but not the bottom 2/3:)

    That sounds like a PLL effect, where the handling CPU is unsure about locking.
    Check also for gaps or incorrect Line sync placement during vertical sync.
    Do all your Monitors show the same artifact ?
    rjo__ wrote: »
    After playing some more, changing to:

    fpix = 25_000_000.0

    seems to fix it...

    But I still don't get it:)
    What was fpix before ?
    If you vary it some more, what is the stable range ?
  • rjo__ Posts: 2,114
    I have only checked on one monitor... my bench is so full of jumpers and wires, I might have to call the
    fire department for help.

    Originally fpix = 25_175_000.0

    I didn't systematically look for the range, but everything I tried (except 25_000_000.0 and 25_000_001.0) showed
    the same type of distortion to varying degrees.

    I tried changing the base value by numbers, which were multiples of 5 and 8.



  • rjo__ Posts: 2,114
    ozpropdev wrote: »
    4 bit mode is a good happy medium that leaves you at least half of the available hub ram for code.
    800 * 600 @ 8 bits = 480k , doesn't leave much room for code.

    This perked me up a bit.

    I am all into imaging, and 8 bits is my default bit depth. I am happy at 640x480 VGA, BUT I always assumed
    that I was going to be using off-chip memory to scroll data to the screen. If that isn't reasonable to assume, I need to rethink my parameters a bit.
    I can do a lot with the HUBRAM, but if it isn't actually possible to scroll the screen from external memory, then I am really going to need more P2's and I should be spending time on that issue now.
  • It gets even better, Rjo: if you use the video buffer to initialise the cogs and then re-use the space, the difference is 512k less 480,000, i.e. 44,288 bytes. And P2 code is significantly more dense than P1...

    But yes I have no doubt you'll need more P2's
  • Rich
    Try these settings for 640 * 480 @72Hz
    Seems to work better for my monitor.
    '640 x 480 72Hz mode
    '{
    	w = 640
    	h=480
    	fpix = 31_500_000.0
    	top_blanks = 9
    	sync_blanks = 2
    	bottom_blanks = 29
    	before_sync = 24
    	sync = 40
    	before_visible = 128
    '}
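
A quick Python check of what refresh rate these settings imply. It is a sketch that assumes the named fields simply add up to the line and frame totals, which is how the names read; the driver source is not shown here, and `refresh_rate_hz` is a hypothetical helper.

```python
def refresh_rate_hz(fpix, w, h, before_sync, sync, before_visible,
                    top_blanks, sync_blanks, bottom_blanks):
    """Return (frame rate in Hz, pixel clocks per line, lines per frame)."""
    clocks_per_line = w + before_sync + sync + before_visible
    lines_per_frame = h + top_blanks + sync_blanks + bottom_blanks
    return fpix / (clocks_per_line * lines_per_frame), clocks_per_line, lines_per_frame


rate, per_line, per_frame = refresh_rate_hz(
    fpix=31_500_000.0, w=640, h=480,
    before_sync=24, sync=40, before_visible=128,
    top_blanks=9, sync_blanks=2, bottom_blanks=29)
print(f"{per_line} clocks/line, {per_frame} lines/frame, {rate:.2f} Hz")
```

Under that assumption the totals come out as 832 clocks per line and 520 lines per frame, which gives a frame rate a little under 73 Hz.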
    

  • rjo__ Posts: 2,114
    ozpropdev wrote: »
    Rich
    Try these settings for 640 * 480 @72Hz
    Seems to work better for my monitor.
    '640 x 480 72Hz mode
    '{
    	w = 640
    	h=480
    	fpix = 31_500_000.0
    	top_blanks = 9
    	sync_blanks = 2
    	bottom_blanks = 29
    	before_sync = 24
    	sync = 40
    	before_visible = 128
    '}
    

    That locks up my screen:) (black screen... IR remote controller largely, but not completely ignored ... as though the monitor develops Parkinson's)
    It is a new Vizio 1080p.

    I really like 640x400 @70Hz. I get full screen no artifacts. Had to change fpix to fpix = 25_000_000.0, though.

    I wonder if the fpix issue isn't because of a relationship between fpix and xfrq. Note this line:

    setxfrq ##round(fset)
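
One way to poke at that idea in Python is to see how exactly each fpix maps onto an integer frequency word. This is a sketch under assumptions and not a confirmed diagnosis: the 80 MHz system clock and the 2^31 scale factor are assumed, the driver's own formula for fset is not shown in this thread, and `nco_word` is a hypothetical helper.

```python
from fractions import Fraction


def nco_word(fpix_hz, sysclk_hz, scale_bits=31):
    """Return (rounded frequency word, exact value minus rounded word)."""
    exact = Fraction(fpix_hz, sysclk_hz) * (1 << scale_bits)
    word = round(exact)
    return word, exact - word


for fpix in (25_000_000, 25_175_000):
    word, err = nco_word(fpix, 80_000_000)
    print(f"fpix={fpix}: word={word}, rounding error={float(err):+.2f}")
```

Under these assumptions 25 MHz is exactly 5/16 of the clock, so the word is exact and the rollover pattern repeats every 16 clocks, while 25.175 MHz needs rounding and has no short repeat. Whether that accounts for the distortion seen on screen is not established here.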
  • rjo__ Posts: 2,114
    Tubular wrote: »
    It gets even better, Rjo: if you use the video buffer to initialise the cogs and then re-use the space, the difference is 512k less 480,000, i.e. 44,288 bytes. And P2 code is significantly more dense than P1...

    But yes I have no doubt you'll need more P2's

    I am building my tower as we speak:)
  • jmg Posts: 15,173
    rjo__ wrote: »
    I really like 640x400 @70Hz. I get full screen no artifacts. Had to change fpix to fpix = 25_000_000.0, though.
    The std report is for 720 x 400 @ 70Hz , which is needed during PC Boot, so I doubt that is going away ever.
    640x480 just might be different enough to have issues ?

    Does the test display have non-black background, cleanly blanked ?
    I've found the auto-scale engines in Monitors, to be a little fussy around those areas.
    To decide on centre, and clocks, I think they use those image-edges in the decision process, whilst the ancient CRT could not care less....
