HyperRam/Flash as VGA screen buffer (Now XGA, 720p & 1080p) & Rev.B

Rayman · 2019-03-16 21:34

The attached file outputs a VGA image stored in HyperRam.
Code is a lot cleaner now with Ver.1N.

The read rate is 125 MBPS, much faster than the VGA pixel rate of 25 MHz.
There were some bugs initially, caused by an OUTA that should have been OUTH, but all fixed now.
Ver.1P is cleaned up and uses smartpin for HR clock and uses COGATN to synchronize HR with VGA.
Also, switched from waitse1 to waitx to get data starting from the first byte bin (instead of fifth).

Update: See ~page 4 for 16bpp code and images
Update2: See ~page 6 for some 1080p tests
Update3: Now works with HyperFlash too! See ~page 9.
Update4: Now have some examples working with Rev.B eval board and Parallax modules, see page 9.

jmg · 2019-03-16 21:40

Rayman wrote: »

Thought I had it, and then these corrupt horizontal lines started showing up...

Must be doing something wrong, not sure what yet...
They start showing up after about 10 seconds
....

Interesting effect.
You could try cooling P2 or HyperRAM chip with a can of freeze and see if it clears?
What is the time between artifacts here?

Rayman · 2019-03-16 22:39

Think I fixed that...
Seems it doesn't like me reading 640 bytes at a time.
Switched to 10x reads of 64 bytes and no more lines...
I don't see anything in the datasheet that says I can't do 640 bytes though...

Next minor issue is sync between vga driver and hyperram, needs some tweaking...

Rayman · 2019-03-16 23:30

Think I see what happened..
There's a 4 us limit for transactions (tCSM) and I was over 5 us.
Guess I can do 2 reads of 320 pixels...
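
A rough check of those numbers, as a sketch only. It assumes the 125 MBPS read rate and the 4 us tCSM figure quoted in this thread; `overhead_us` is a hypothetical allowance for command/address and latency cycles and is left at zero here.

```python
# Sketch: how long CS# stays low for one read, compared with the tCSM limit.
# Assumed values: 125 MB/s read rate and a 4 us tCSM, both taken from this thread.
# overhead_us is a hypothetical allowance for command/address and latency cycles.

def burst_time_us(num_bytes, rate_mbps=125.0, overhead_us=0.0):
    """Time CS# stays low for one read of num_bytes, in microseconds."""
    # 125 MB/s is 125 bytes per microsecond, so bytes / rate gives microseconds.
    return num_bytes / rate_mbps + overhead_us

def max_burst_bytes(t_csm_us=4.0, rate_mbps=125.0, overhead_us=0.0):
    """Largest byte count whose transaction still fits inside tCSM."""
    return int((t_csm_us - overhead_us) * rate_mbps)

full_line = burst_time_us(640)   # one 640-byte scan line in a single read
half_line = burst_time_us(320)   # the same line split into two reads

print(f"640 bytes: {full_line:.2f} us")
print(f"320 bytes: {half_line:.2f} us")
print("max bytes per transaction:", max_burst_bytes())
```

With these assumptions a single 640-byte read exceeds the limit and a 320-byte read fits with margin, which matches the behaviour described above.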

whicker · 2019-03-17 19:21

Impressive progress!

jmg · 2019-03-17 20:14

Rayman wrote: »

Think I fixed that...
Seems it doesn't like me reading 640 bytes at a time.
Switched to 10x reads of 64 bytes and no more lines...
I don't see anything in the datasheet that says I can't do 640 bytes though...
...
Think I see what happened..
There's a 4 us limit for transactions (tCSM) and I was over 5 us.
Guess I can do 2 reads of 320 pixels...

Good progress. The datasheet is vague on tCSM: one example divides the refresh time by the number of rows, and it is unclear whether that is a hard per-transaction ceiling, or whether it is OK to meet it only as an average and so still satisfy overall refresh.
Your tests indicate that is more a per-time ceiling.
That would mean it is the RAM warming up that shifts the refresh timing just enough to start to have an effect ?
That said, if they spec 4us, on a bench test being over 5us, is not a huge violation ?
If you do 320 pixel reads, and heat the RAM, is it still ok ? (is the whole RAM refreshing ok ?)

Rayman · 2019-03-17 20:56

It's not even warm, I think the chip is fine.

I'm having trouble figuring out a way to coordinate with HSYNC. looks like INA doesn't work because it's in smartpin mode...

whicker · 2019-03-17 21:23

I have a hunch that refreshing pauses when CS# is Low and it has received the upper 4 bytes (of 6) of the Command-Address with that Address being in memory space, until CS# goes High again.

Plenty of time to figure this out, though.

Rayman · 2019-03-17 22:21

I have something that works now. See top post.
It's stable and positioned correctly.

Also, there were some black pixels on the beak that are gone now. That was strange...

Not completely happy with it, but at least a proof of concept.

Martin Hodge · 2019-03-17 22:34

Is the dithering in the original image, or is that a limitation of the software/hardware?

rogloh · 2019-03-17 22:55

Nice work Rayman, perseverance pays off.

It will be interesting to see how people resolve the write accesses if a HyperRAM is used as a graphics buffer and can be shared between some COGs. We'd need guaranteed read bandwidth for display and some remaining bandwidth for intermittent write accesses from other COG(s), ideally keeping write latency as low as practical. I wonder if writes could be deferred to the end of each scan line being read in, or if you'd want finer-grained interleaving between R/W. For block transfers that approach might be okay, but for lots of small accesses, such as rendering text pixels, I guess it might not be as good and probably starts to reduce the number of write transfers possible per frame. There'll probably be some sweet spot in between for different applications.

Rayman · 2019-03-17 23:23

The image is the same 8-bpp, 256 color palette that is used with the examples from Chip.
There probably is some dithering.

The read rate is 125 MBPS when the VGA output rate is 25 MBPS.
That's a decent ratio. Should be time to do updates.
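
As a back-of-envelope sketch of that ratio (the function name is made up for illustration; it ignores blanking time and per-transaction overhead, so treat the result as a rough estimate only):

```python
# Sketch: fraction of HyperRAM read bandwidth consumed by the display fetch.
# Assumes the 125 MB/s read rate and 25 MHz pixel rate quoted in this thread.
# Blanking intervals and command/latency overhead are deliberately ignored.

def display_share(pixel_rate_mhz, bytes_per_pixel, read_rate_mbps):
    """Display data rate divided by the available read rate."""
    return pixel_rate_mhz * bytes_per_pixel / read_rate_mbps

share_8bpp = display_share(25.0, 1, 125.0)    # 8 bpp palette mode
share_16bpp = display_share(25.0, 2, 125.0)   # 16 bpp mode

print(f"8 bpp display share:  {share_8bpp:.0%}")
print(f"16 bpp display share: {share_16bpp:.0%}")
print(f"left for updates at 8 bpp: {1 - share_8bpp:.0%}")
```

Under those assumptions the 8 bpp display takes about a fifth of the read bandwidth, and the remainder is what updates would have to share.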

Rayman · 2019-03-17 23:24

Would like to have a better way to sync with VGA HSync.

Right now, I have a jumper to P5 that I use to monitor HSync.

evanh · 2019-03-18 02:18

You could probably tack a COGATN or similar into the VGA hsync subroutine.

Cluso99 · 2019-03-18 02:19

Rayman,
Not sure if this is what you are asking...
In my VGA 1920x1080 driver, I changed the colors between successive groups of horizontal lines. This required syncing with the horizontal lines. Take a look at that thread for my latest code.

AJL · 2019-03-18 02:25

Rayman wrote: »

Would like to have a better way to sync with VGA HSync.

Right now, I have a jumper to P5 that I use to monitor HSync.

If I read your code and the documentation correctly you should be able to use Smartpin 61 with the input offset selector set to x011 (+3) to read pin 0.
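
A small sketch of that pin arithmetic. The offset table is my reading of the smartpin input selector field, and the modulo-64 wrap-around is an assumption; neither has been tested on hardware.

```python
# Sketch: which physical pin a smartpin's A input looks at for a given selector.
# ASSUMPTION: the low three selector bits encode the relative offsets below and
# the pin number wraps modulo 64. Bit 3 (the "x" in x011) only inverts the input.
# Selector %100 (this pin's OUT bit) is not a pin offset and is left out.

REL_OFFSETS = {
    0b000: 0,    # this pin
    0b001: +1,
    0b010: +2,
    0b011: +3,
    0b101: -3,
    0b110: -2,
    0b111: -1,
}

def source_pin(smartpin, selector):
    """Return the pin number read by `smartpin` for a 4-bit input selector."""
    offset = REL_OFFSETS[selector & 0b111]   # drop the invert bit
    return (smartpin + offset) % 64

print(source_pin(61, 0b0011))   # smartpin 61 with +3
print(source_pin(0, 0b0111))    # smartpin 0 with -1
```

The first call corresponds to the smartpin 61 suggestion above: with a +3 offset it would watch pin 0.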

Setting SE2 to match INB 29 rising, and using mode %10010 AND !Y[2] = Time X A-input highs/rises/edges, you can probably replace the TESTP loops with WAITSE2s, with an AKPIN to prepare for the next event should you not need the timing results.

BlanksLoop can include WXPIN with the total number (65?) of edges to wait, while HyperWait could WXPIN with two edges if you are confident that you won't start waiting in the wrong state.

This also goes some way toward freeing you from using the TESTP instruction ;-) With 3 SE sources you could also use one for VSync and be completely free :-)

Cluso99 · 2019-03-18 10:35

@Rayman
Here is what I do to get access to the HS. Note I swap palettes between some HS rows.
It may give you some ideas.

'+-------[ Setup Streamer ]----------------------------------------------------+ 
vid
                 loc     ptra,#\@palette_pairs          ' set 1st color palette   
                 setq2   #2-1                           ' (only need 2)
                 rdlong  lut_start,ptra                 ' set next palette color
                 mov     ctr1,#135                      ' & set row ctr

                setxfrq ##fset                  'set transfer frequency to xxxMHz
                dirh    #vsync                  'make vsync pin output

                'the next 4 lines may be commented out to bypass level scaling
                setcy   ##intensity << 24       '\ r      set colorspace for rgb
                setci   ##intensity << 16       '| g
                setcq   ##intensity << 08       '| b
                setcmod #%01_0_000_0            '/ enable colorspace conversion

                wrpin   dacmode_s,#0            '\ enable dac modes in pins 0..3
                wrpin   dacmode_c,#1            '|
                wrpin   dacmode_c,#2            '|
                wrpin   dacmode_c,#3            '|
                setnib  dira,#$f,#0             '/ & enable output


                rdfast  ##w*h*bpp/32/16,##@video_buffer ' wraps ~253KB bitmap (longs/16) 

'+-------[ Display Screen ]----------------------------------------------------+ 
' Field loop
field           outnot  #vsync                  '\ vsync on
                mov     x,#sync_blanks          '|
                call    #blank                  '|
                outnot  #vsync                  '/ vsync off
'-----------------------------------------------
                mov     x,#top_blanks
                call    #blank
'-----------------------------------------------
' display visible screen 1bpp
                mov     x,##h                   'set visible lines
                 loc     ptra,#\@palette_pairs           ' set 1st color palette   
                 mov     ctr1,#1                         ' & force new load below
.visible        xcont   m_bs,#0                 'horizontal sync (before sync)
                xcont   m_sn,#1                 'horizontal sync (sync)
                 djnz    ctr1,#.skip                     ' set new palette?
                 setq2   #2-1                            ' (only need 2)
                 rdlong  lut_start,ptra                  ' set next palette color
                 add     ptra,#2*4                       ' next color pair
                 mov     ctr1,#135                       ' & set row ctr
.skip           xcont   m_bv,#0                 'horizontal sync (before visible)
                xcont   m_rf,#0                 'visible line                   
                djnz    x,#.visible             'another line?
'-----------------------------------------------
                mov     x,##bottom_blanks
                call    #blank
'-----------------------------------------------
                jmp     #field                  'loop
'-----------------------------------------------

' display blank lines
blank           xcont   m_bs,#0                 'horizontal sync (before sync)
                xcont   m_sn,#1                 'horizontal sync (sync)
                xcont   m_bv,#0                 'horizontal sync (before visible)
                xcont   m_vi,#0                 'blank visible
        _RET_   djnz    x,#blank
'+-----------------------------------------------------------------------------+

' Initialized data
dacmode_s       long    %0000_0000_000_1011000000000_01_00000_0         'hsync is 123-ohm, 3.3V
dacmode_c       long    %0000_0000_000_1011100000000_01_00000_0         'R/G/B are 75-ohm, 2.0V

m_bs            long    $CF000000+before_sync
m_sn            long    $CF000000+sync
m_bv            long    $CF000000+before_visible
m_vi            long    $CF000000+w             'visible
'm_rf            long    $6F000000+w             '4bit RFLONG LUT
'm_rf            long    $5F000000+w             '2bit RFLONG LUT
m_rf            long    $4F000000+w             '1bit RFLONG LUT

Rayman · 2019-03-18 12:57

Thanks. I think my problem is the xcont buffering messing up timing...
Have to look into inserting fake or broken up xcont lines to clear that buffer...
Or, maybe I was just doing something wrong.
Now that I have it working, I'll try again...

jmg · 2019-03-18 20:17

Rayman wrote: »

I'm having trouble figuring out a way to coordinate with HSYNC. looks like INA doesn't work because it's in smartpin mode...
Would like to have a better way to sync with VGA HSync.
Right now, I have a jumper to P5 that I use to monitor HSync.

Hmm.. It would certainly be generally useful and expected in P2, to be able to trigger from any pin edge, without needing a jumper, or even a pin-remap.

I think you are saying the INA pin-in-attach fails (due to smart pin mode?), so this part of the docs cannot apply ?

‘SETSE1 D/#’, for which D/# selects the event:

%001_PPPPPP = INA/INB bit of pin %PPPPPP rises
%010_PPPPPP = INA/INB bit of pin %PPPPPP falls
%011_PPPPPP = INA/INB bit of pin %PPPPPP changes

It's not clear what INA/INB means, as there seems no bit to select A/B in this setup ?

Rayman · 2019-03-18 20:29

Well, just reading INA doesn't work in this smartpin mode.
But, maybe SETSE1 would?
I have a feeling it won't, but worth trying...
AJL's smartpin offset idea is interesting. Not sure if that works either...

I have a feeling that actual pin status may not be available in smart pin mode...

AJL · 2019-03-18 21:35

jmg wrote: »

Rayman wrote: »

I'm having trouble figuring out a way to coordinate with HSYNC. looks like INA doesn't work because it's in smartpin mode...
Would like to have a better way to sync with VGA HSync.
Right now, I have a jumper to P5 that I use to monitor HSync.

Hmm.. It would certainly be generally useful and expected in P2, to be able to trigger from any pin edge, without needing a jumper, or even a pin-remap.

I think you are saying the INA pin-in-attach fails (due to smart pin mode?), so this part of the docs cannot apply ?

‘SETSE1 D/#’, for which D/# selects the event:

%001_PPPPPP = INA/INB bit of pin %PPPPPP rises
%010_PPPPPP = INA/INB bit of pin %PPPPPP falls
%011_PPPPPP = INA/INB bit of pin %PPPPPP changes

It's not clear what INA/INB means, as there seems no bit to select A/B in this setup ?

With 6 bits in the %PPPPPP field the A/B select occurs from the uppermost bit.

For these events it's probably less helpful to think of INA/INB 0..31, and more helpful to think of IN 0..63.
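
A minimal sketch of that encoding, using only the three event modes quoted above. The helper names are made up for illustration.

```python
# Sketch: build the 9-bit SETSE1 selector %MMM_PPPPPP and show how the top bit
# of the 6-bit pin field picks INA (pins 0..31) or INB (pins 32..63).
# Only the rise/fall/change modes quoted from the docs are covered.

SE_MODES = {"rises": 0b001, "falls": 0b010, "changes": 0b011}

def se_selector(mode, pin):
    """Return the D/# value for SETSEx: mode in bits 8..6, pin in bits 5..0."""
    if not 0 <= pin <= 63:
        raise ValueError("pin must be 0..63")
    return (SE_MODES[mode] << 6) | pin

def in_register(pin):
    """Map IN 0..63 to the (register, bit) pair it corresponds to."""
    return ("INB" if pin & 0b100000 else "INA", pin & 0b011111)

sel = se_selector("rises", 61)
print(f"selector for pin 61 rising: %{sel >> 6:03b}_{sel & 63:06b}")
print("pin 61 is", in_register(61))
print("pin 5 is", in_register(5))
```

Pin 61 comes out as INB bit 29, which is the same pin referred to earlier as "INB 29".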

When a pin cell is placed in smartpin mode the IN signal for that cell is subverted for the purposes of the smartpin function, which for the DAC modes indicates end of sample period. That's where the remap to another smartpin can provide a means to read the pin state.

Rayman · 2019-03-18 21:45

Is there anywhere where Chip says this works?
I know it can read the IN state, but not sure what that means in smartpin mode...

jmg · 2019-03-18 21:51

AJL wrote: »

With 6 bits in the %PPPPPP field the A/B select occurs from the uppermost bit.

For these events it's probably less helpful to think of INA/INB 0..31, and more helpful to think of IN 0..63.

Ah, of course.... I was thinking INB was some alternate pathway...

AJL wrote: »

When a pin cell is placed in smartpin mode the IN signal for that cell is subverted for the purposes of the smartpin function, which for the DAC modes indicates end of sample period. That's where the remap to another smartpin can provide a means to read the pin state.

That makes sense, but given P2's performance, I had not expected to also lose the ability to simply sense any pin edge.

AJL · 2019-03-18 23:09

Rayman wrote: »

Is there anywhere where Chip says this works?
I know it can read the IN state, but not sure what that means in smartpin mode...

This is another case where multiple signals have the same name, at least in part because of history.

On P1, IN simply meant the input state of the pin.

On P2, the state of the pin can be read by smartpins or can bypass the smartpin cell for direct reading by the COGs.
So, if you'll forgive some nomenclature creation on my part:
You have IN (PIN) and IN (COG), which in normal operation are the same signal. If my understanding is correct, IN (COG) is where the INA/INB is relevant.
In smartpin mode, the path is interrupted with IN (PIN) being supplied to the input selector circuits, and IN (COG) being an output of the smartpin cell fed to the COGs.
So the SE events are operating on the raw pin input in normal operation or the smartpin outputs in smartpin mode.
This would allow code to setup a smartpin cell, enable it, do any further preparatory processing and then a WAITSEx configured for IN rising edge to wait for the smartpin cell to complete its operation.

By using the input pin selector logic, it allows a smartpin to take its A and B inputs from any pin in the ±3 range, raw or inverted as necessary, and perform the smartpin function on that. While a smartpin cell can be a repository, or perform many automated logical functions, there's no simple read of the pin state.
In the OP's situation there appear to be only two approaches that can work:

Build the sync tracking into the frame generation routine, so you know what the sync pin state is and don't need to read it; or
Assign the sync tracking to a smartpin cell that is in the range and not being used for other purposes.

I don't have a P2 to test this with, so I'll need to rely on others to test the theory.

Cluso99 · 2019-03-19 01:11

Rayman wrote: »

Is there anywhere where Chip says this works?
I know it can read the IN state, but not sure what that means in smartpin mode...

Chip uses smartpins to sample the autobaud in the ROM. He uses P0 & P1 to do this for P63/62.

Rayman · 2019-03-19 09:50

Are p63/62 in smartpin mode then?

Rayman · 2019-03-19 09:59

Shoot. After running overnight the bad lines are back...

Need to read the datasheet again and see what I'm doing wrong...

evanh · 2019-03-19 10:47

Rayman wrote: »

Are p63/62 in smartpin mode then?

Yes. All four smartpins are set up in quick succession:

reset_serial	andn	dira,#%11		'disable timing measurements for autobaud

		setint1	#0			'disable int1

		mov	head,#0			'reset serial buffer pointers
		mov	tail,#0

		dirl	#rx_pin			'disable receiver
		wrpin	#%00_11111_0,#rx_pin	'configure rx_pin for asynchronous receive, always input

		wrpin	#%01_11110_0,#tx_pin	'configure tx_pin for asynchronous transmit, always output
		dirh	#tx_pin			'enable transmitter

		wrpin	mths,#rx_ths		'configure rx_ths for timing high states

		wrpin	mtne,#rx_tne		'configure rx_tne for timing negative edges
		wxpin	#1,#rx_tne		'report each cycle
		wypin	#0,#rx_tne		'measure fall to fall

		setse1	#%110<<6+rx_tne		'set se1 to trigger on rx_tne high

		mov	ijmp1,#autobaud_isr	'set int1 jump vector to autobaud ISR

		setint1	#4			'set int1 to trigger on se1 (rx_tne high)

		or	dira,#%11		'enable timing measurements for autobaud

Cluso99 · 2019-03-19 10:48

Rayman wrote: »

Are p63/62 in smartpin mode then?

Yes, they are also in smartpin mode to function as uarts. P0&1 both connect to P63 I think to time the receive high and low times while autobauding. See the ROM code.

Yanomani · 2019-03-19 14:12

Hi Rayman

Based on what I found in the SPIN 2 code available in the first post, there are some instances where #Pin_CSn polarity is changed BEFORE #Pin_CK is driven LOW.

If you look carefully at the available datasheets (Cypress, ISSI) and also the HyperBus specs, #Pin_CK MUST be LOW (inactive) some time BEFORE ANY #Pin_CSn transition (tCSH (0 ns), tCSS (3 ns) in the Cypress datasheets).

Also, since you are using a helper COG to toggle #Pin_CK during the data transfer period, you must ensure that the last transition during a Data Memory Read operation leaves #Pin_CK in a LOW (inactive) state during and after the last byte read.

Similarly, the same applies to any Data Memory Write operation; the last #Pin_CK transition, which latches the last byte into the HyperRam, MUST be a low-going transition.

In both cases, #Pin_CK must be kept low until #Pin_CSn fully transitions from Low to High (tCSH, tCSS).
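
One way to check a captured or simulated pin trace against that rule, as a sketch only. The event format and function name are hypothetical, and it checks only the clock level at each chip-select edge, not the tCSS/tCSH margins.

```python
# Sketch: flag every CS# transition that happens while CK is still high.
# The trace format (time_ns, signal, level) is a made-up illustration.
# Idle state is assumed to be CK low and CS# high.

def ck_high_at_cs_edges(events):
    """Return the times of CS# edges that occur while CK is high."""
    ck = 0
    cs = 1
    violations = []
    for time_ns, signal, level in events:
        if signal == "CK":
            ck = level
        elif signal == "CS":
            if level != cs and ck != 0:
                violations.append(time_ns)
            cs = level
    return violations

# Clock returns low before CS# is released: no violations expected.
good_trace = [(0, "CS", 0), (10, "CK", 1), (20, "CK", 0), (30, "CS", 1)]
# CS# is released while the clock is still high.
bad_trace = [(0, "CS", 0), (10, "CK", 1), (30, "CS", 1), (40, "CK", 0)]

print("good trace violations:", ck_high_at_cs_edges(good_trace))
print("bad trace violations:", ck_high_at_cs_edges(bad_trace))
```

A real check would also need the setup and hold margins from the datasheet; this only shows the ordering requirement.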

If needed (for any other intended application), #Pin_CK can be toggled freely while #Pin_CSn is actively driven HIGH, but the timing constraints must be observed before any transition to its active state.

It's also strongly recommended that #Pin_RSTn, #Pin_CSn and #Pin_CK are ALWAYS actively driven, keeping them either LOW or HIGH as needed by the control routines. This ensures that noise or other interference has no opportunity to upset the HyperBus interface.

Hope it helps a bit.

Henrique

Rayman · 2019-03-19 16:44

Thanks. That might be the problem (I hope so.)
I have a feeling the HR clock is still running when I deselect the chip...
