SDRAM Driver — Parallax Forums

SDRAM Driver

cgracey Posts: 14,206
edited 2013-05-09 20:29 in Propeller 2
Here is the SDRAM driver I've been working on.

It took a long time to get this sorted out, since scheduling, functionality, and efficiency all had to be optimized together. I had to come up with a new way to document what was happening concurrently with the instructions. The driver wound up being only 52 longs.

This runs on the DE2-115 platform and uses one of the on-board 32M x 16 SDRAM chips. It treats it as a 16M x 16 device, so that it will be faithful to what the actual Prop2 chip will use. The 16M x 16 part is only $2 in volume, whereas the 32M x 16 is $12.

This driver reads and writes 2^n QUADs at a time. When doing maximal 64 QUAD (1024 byte) transfers, the timing efficiency is ~89%, or 107MB/s out of a theoretical 120MB/s.
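As a quick sanity check of those figures (a sketch added for illustration; the function names are hypothetical and not part of the driver): at 60MHz a 16-bit bus moves two bytes per clock, which gives the 120MB/s ceiling, and 107MB/s is about 89% of that. The clock count per 64-QUAD transfer is only an estimate implied by those two quoted numbers, not a measured value.

```python
CLOCK_HZ = 60_000_000      # system clock quoted in the post (60MHz)
BYTES_PER_CLOCK = 2        # 16-bit SDRAM data bus, one word per clock
QUAD_BYTES = 16            # one QUAD = four longs = 16 bytes


def theoretical_mb_per_s(clock_hz=CLOCK_HZ):
    """Peak bus rate in MB/s if every clock carried data."""
    return clock_hz * BYTES_PER_CLOCK / 1_000_000


def efficiency(actual_mb_per_s, clock_hz=CLOCK_HZ):
    """Fraction of the peak rate that is actually achieved."""
    return actual_mb_per_s / theoretical_mb_per_s(clock_hz)


def implied_clocks(quads, eff):
    """Estimated total clocks for one transfer at the given efficiency."""
    data_clocks = quads * QUAD_BYTES // BYTES_PER_CLOCK
    return data_clocks / eff


if __name__ == "__main__":
    eff = efficiency(107)
    print(f"peak: {theoretical_mb_per_s():.0f} MB/s")
    print(f"efficiency at 107 MB/s: {eff:.1%}")
    print(f"implied clocks per 64-QUAD transfer: {implied_clocks(64, eff):.0f}")
```

The 64-QUAD burst itself needs 512 data clocks, so the rest of the implied total is command setup, refresh, and list scanning.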

It requires a new DE2-115 configuration file, which is included here:

Prop2_SDRAM_Driver.zip

Here is the actual driver source (also in the .zip file):
'********************************
'*				*
'*  Propeller II SDRAM Driver	*
'*     for 16M x 16 devices	*
'*	  (FPGA version)	*
'*				*
'*  Version 0.1			*
'*  9 April 2013		*
'*  by Chip Gracey		*
'*				*
'********************************

{
SDRAM connections:

	P85 = cke (held high)		Port C
	P84 = cs
	P83 = ras
	P82 = cas
	P81 = we
	P80 = udqm (not used)
	P79 = ldqm (not used)
   P78..P77 = ba[1..0]
   P76..P64 = a[12..0]

   P63..P48 = dq[15..0]			Port B


  Note:	All pin directions have a 2-clock delay
	All pin outputs going to SDRAM have a 3-clock delay, since they are registered at the pin
	All pin inputs coming from SDRAM have a 2-clock delay, since they are registered at the pin


Commands (pairs of longs):

name		quads	bytes	hub_address (+0, set 2nd)			sdram_address (+1, set 1st)
------------------------------------------------------------------------------------------------------------------------
rw_1024		64	1024	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W111	%xxxx_xxxA_AAAA_AAAA_AAAA_AA00_0000_0000
rw_512		32	 512	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W110	%xxxx_xxxA_AAAA_AAAA_AAAA_AAA0_0000_0000
rw_256		16	 256	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W101	%xxxx_xxxA_AAAA_AAAA_AAAA_AAAA_0000_0000
rw_128		 8	 128	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W100	%xxxx_xxxA_AAAA_AAAA_AAAA_AAAA_A000_0000
rw_64		 4	  64	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W011	%xxxx_xxxA_AAAA_AAAA_AAAA_AAAA_AA00_0000
rw_32		 2	  32	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W010	%xxxx_xxxA_AAAA_AAAA_AAAA_AAAA_AAA0_0000
rw_16		 1	  16	%xxxx_xxxx_xxxx_xxxA_AAAA_AAAA_AAAA_W001	%xxxx_xxxA_AAAA_AAAA_AAAA_AAAA_AAAA_0000
skip_done	 0	   0	%0000_0000_0000_0000_0000_0000_0000_0000	%xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx
end_of_list	 0	   0	%0000_0000_0000_0000_0000_0000_0000_1000	%xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx


The driver scans a list for commands. Commands take two longs and are structured as hub_address + sdram_address. To have
the driver perform a read or write operation on the SDRAM, first set the sdram_address (2nd long) and then the hub_address
(1st long). The driver will set each hub_address long to skip_done (0) when its associated operation has completed.

Before starting the driver, build the command list structure with pairs of 0's, then terminate it with an 8. When launching
the driver into a cog, point the parameter (S) to the start of the command list.


Command list (longs):

	hub_address (skip_done/rw_xxxx)
	sdram_address

	hub_address (skip_done/rw_xxxx)
	sdram_address

	hub_address (skip_done/rw_xxxx)
	sdram_address

	...

	end_of_list
}

DAT		org
'
'
' Initialize SDRAM				clocks	hub	
'						----------------
sdram_driver	getptra	list			'1	-		save command list address

		reps	#6000,#1		'1	-		repeat instruction for 100us @60MHz
		mov	pinc,h003FFFFF		'1	-		cke and cs high
		mov	dirc,h003FFFFF	wz	'6000	-		set SDRAM signals to outputs (100us), Z = 0
'
'
' Scan list for commands
'
:finish	if_z	wrlong	cmd,ptra[-2]		'1	0		if command finished, set it to skip_done (cmd = 0)

		mov	pinc,h00240400		'1	1		issue 'precharge all' command
		setb	pinc,#20	wz	'1	2		deselect, Z = 0
:reset	if_z	setptra	list			'1	3		if end_of_list, reset list pointer; satisfy Trp

		mov	pinc,h00200037		'1	4		issue 'set mode' command
		setb	pinc,#20		'1	5		deselect, satisfy Tmrd

		mov	pinc,h00226000		'1	6		issue 'auto refresh' command
		setb	pinc,#20		'1	7		deselect, satisfy Trfc in 8 more clocks

:next		rdlong	cmd,ptra++[2]	wz	'3	0..2		check command, point to next
		setptrb	cmd			'1	3		in case command, set hub address
	if_z	jmp	#:next			'1/4	4/4..7		if skip_done, next command

		decod3	cmd		wc	'1	5		decode size bits, C = write
		and	cmd,#%11111110	wz	'1	6		isolate bits 7..1, Z = 1 if end_of_list
	if_z	jmp	#:reset			'1/4	7/7..2		if end_of_list, reset list pointer
'
'
' Execute read/write command			clocks	hub	XFR write SDRAM		XFR read SDRAM
'						------------------------------------------------------
		rdlong	adr,ptra[-1]		'3	0..2	-			-			get SDRAM address

		shl	adr,#7			'1	3	-			-			make 'active' command with bank and row address
		or	adr,#%1_0011_00		'1	4	-			-			(ba[1..0], a[12..0]) = adr[24..10]
		rol	adr,#15			'1	5	-			-

		mov	pinc,adr		'1	6	-			-			issue 'active' command
		setb	pinc,#20		'1	7	-			-			deselect, satisfy Trcd in 1 more clock

	if_c	rdquad	ptrb++			'1	0	-			-			if write, read initial QUADs from hub

		setbc	:quad,#0		'1	1	-			-			set wrquad/rdquad according to read/write

	if_c	setxfr	#%010_011		'1	2	<QUADs_to_16_pins>	-			if write, configure XFR at hub cycle 2

		shr	adr,#22+1		'1	3	-			-			make blank command with bank and column address
		and	pinc,h00306000		'1	4	-			-			(ba[1..0], a[12..0]) = (adr[24..23], %0000, adr[9..1])
		or	pinc,adr		'1	5	-			-

	if_c	mov	dirb,hFFFF0000		'1	6	-			-			if write, enable data outputs

	if_nc	setxfr	#%100_011		'1	7				<16_pins_to_QUADs>	if read, configure XFR at hub cycle 7

	if_nc	xor	pinc,h001A0000		'1	0	-			-			if read, issue 'read' command
	if_nc	setb	pinc,#20		'1	1	-			-			if read, deselect

	if_c	xor	pinc,h00180000		'1	2	-			-			if write, issue 'write' command (aligns with XFR on next clock)
	if_c	setb	pinc,#20		'1	3	output QUAD0 w0		-			if write, deselect; read: SDRAM sees 'read' command

		shr	cmd,#1			'1	4	output QUAD0 w1		-			get loop count

	if_nc	nop	#7			'1	5	output QUAD1 w0		-			if read, pad time
		'(nop)				'1	6				-			read: SDRAM starts outputting data stream
		'(nop)				'1	7				-
		'(nop)				'1	0				input w0		read: SDRAM data stream begins arriving in XFR
		'(nop)				'1	1				input w1 -> QUAD0
		'(nop)				'1	2				input w0
		'(nop)				'1	3				input w1 -> QUAD1
		'(nop)				'1	4				input w0

:quad		'(wrquad/rdquad)		'1	5	output QUAD1 w0		input w1 -> QUAD2
		'(wrquad/rdquad)		'1	6	output QUAD1 w1		input w0
		'(wrquad/rdquad)		'1	7	output QUAD2 w0		input w1 -> QUAD3
		wrquad	ptrb++			'1	0	output QUAD2 w1		input w0		write: read next QUADS; read: write current QUADS
		djnz	cmd,#:quad		'1	1	output QUAD3 w0		input w1 -> QUAD0	loop for each set of QUADs
		'(djnz looping)			'1	2	output QUAD3 w1		input w0
		'(djnz looping)			'1	3	output QUAD0 w0 (new)	input w1 -> QUAD1	write: RDQUAD data valid
		'(djnz looping)			'1	4	output QUAD0 w1		input w0

		mov	pinc,h002C0000		'1	2	output QUAD3 w1		-			issue 'burst terminate' command (aligns with XFR on next clock)
		mov	dirb,#0		wz	'1	3	-			-			cancel data outputs (aligns with 'burst terminate'), Z=1
		jmp	#:finish		'4	4..7	-			-			finish up, scan for next command
'
' Constants
'
h00180000	long	$00180000		'write' command toggle
h001A0000	long	$001A0000		'read' command toggle
h00200037	long	$00200037		'set mode' command (full-page r/w bursts, cas latency = 3)
h00226000	long	$00226000		'auto refresh' command
h00240400	long	$00240400		'precharge all' command
h002C0000	long	$002C0000		'burst terminate' command
h00306000	long	$00306000		'blank command mask
h003FFFFF	long	$003FFFFF		'control pins mask
hFFFF0000	long	$FFFF0000		'data pins mask
'
'
' Variables
'
list		res	1			'command list pointer
cmd		res	1			'command
adr		res	1			'address

Tomorrow, I hope to make a video example using this driver. It's good at 60MHz for displays which consume up to 107MB/s of data.
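The command table in the driver's header comment can be turned into a small host-side helper. The following is only a sketch of that encoding, not part of the driver: the function names are hypothetical, and it assumes that W = 1 requests a write to SDRAM (the source comments say "C = write" after decoding the command) and that addresses must be aligned as the zero bits in the table suggest.

```python
# Size codes from the command table: bits 2..0 of the hub_address long.
SIZE_CODES = {16: 0b001, 32: 0b010, 64: 0b011, 128: 0b100,
              256: 0b101, 512: 0b110, 1024: 0b111}
SKIP_DONE = 0            # slot is idle or finished
END_OF_LIST = 0b1000     # terminator long
WRITE_BIT = 1 << 3       # the "W" bit (assumed: 1 = write to SDRAM)


def make_command(hub_addr, sdram_addr, nbytes, write):
    """Return (hub_long, sdram_long) for one rw_xxxx command."""
    if nbytes not in SIZE_CODES:
        raise ValueError("size must be 16..1024 bytes, power of two")
    if hub_addr % 16 or not 0 <= hub_addr < (1 << 17):
        raise ValueError("hub address must be QUAD-aligned and fit 17 bits")
    if sdram_addr % nbytes or not 0 <= sdram_addr < (1 << 25):
        raise ValueError("SDRAM address must be size-aligned and fit 25 bits")
    hub_long = hub_addr | (WRITE_BIT if write else 0) | SIZE_CODES[nbytes]
    return hub_long, sdram_addr


def quad_count(hub_long):
    """Mirror of the driver's decod3 + shr #1: QUADs in the transfer."""
    return (1 << (hub_long & 0b111)) >> 1


def split_sdram_address(adr):
    """Bank/row/column fields as described in the driver comments."""
    bank = (adr >> 23) & 0x3       # adr[24..23] -> ba[1..0]
    row = (adr >> 10) & 0x1FFF     # adr[22..10] -> a[12..0] for 'active'
    col = (adr >> 1) & 0x1FF       # adr[9..1]   -> column for read/write
    return bank, row, col
```

When posting a command, the sdram_address long would be written first and the hub_address long second, since a non-zero hub_address long is what the driver acts on.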

Comments

  • Sapieha Posts: 2,964
    edited 2013-04-09 04:32
    Hi Chip.

    Thanks
  • Sapieha Posts: 2,964
    edited 2013-04-09 04:40
    Hi Chip.

    What are the differences from the old one?

    Did you add PS2 and some other board resources?
    cgracey wrote: »
    Here is the SDRAM driver I've been working on.

    It requires a new DE2-115 configuration file, which is included here:

    Prop2_SDRAM_Driver.zip

    Here is the actual driver source (also in the .zip file):

    .
  • cgracey Posts: 14,206
    edited 2013-04-09 04:47
    Sapieha wrote: »
    Hi Chip.

    What are the differences from the old one?

    Did you add PS2 and some other board resources?

    I just changed the clock generation for the SDRAM.

    Could you tell me what you would like to see for the PS2 connections, etc.?
  • Sapieha Posts: 2,964
    edited 2013-04-09 04:52
    Hi Chip.

    I'm working on a self-contained BASIC interpreter for the P2.
    In other words, you can think of it like the old home computers.

    I hope that tells you everything I need it for!
    cgracey wrote: »
    I just changed the clock generation for the SDRAM.

    Could you tell me what you would like to see for the PS2 connections, etc.?
  • cgracey Posts: 14,206
    edited 2013-04-09 05:07
    Sapieha wrote: »
    Hi Chip.

    I'm working on a self-contained BASIC interpreter for the P2.
    In other words, you can think of it like the old home computers.

    I hope that tells you everything I need it for!

    Sounds neat. I want to make something like that, too.
  • Sapieha Posts: 2,964
    edited 2013-04-09 05:11
    Hi Chip.

    You will!
    I'm halfway done with the (COG) run-time interpreter module.

    cgracey wrote: »
    Sounds neat. I want to make something like that, too.
  • Bill Henning Posts: 6,445
    edited 2013-04-09 05:30
    Thanks Chip!

    This is going to be fun...
    cgracey wrote: »
    Here is the SDRAM driver I've been working on.

  • David Betz Posts: 14,516
    edited 2013-04-09 05:35
    Sapieha wrote: »
    Hi Chip.

    You will!
    I'm halfway done with the (COG) run-time interpreter module.
    This sounds cool! Are you writing it all in PASM?
  • David Betz Posts: 14,516
    edited 2013-04-09 05:36
    cgracey wrote: »
    Here is the SDRAM driver I've been working on.
    Thanks Chip! This should be useful for XMM modes in PropGCC for P2!
  • Sapieha Posts: 2,964
    edited 2013-04-09 05:40
    Hi David.

    Yes, the run time is written in PASM. It will work in any COG, and the thinking is to allow multiple instances.

    David Betz wrote: »
    This sounds cool! Are you writing it all in PASM?
  • Sapieha Posts: 2,964
    edited 2013-04-09 05:42
    Ps. David.

    To answer your question from another thread:

    What I'm working on is no longer confidential.
  • David Betz Posts: 14,516
    edited 2013-04-09 05:44
    Sapieha wrote: »
    Hi David.

    Yes, the run time is written in PASM. It will work in any COG, and the thinking is to allow multiple instances.
    I'm looking forward to seeing it especially hooked up to Chip's new SDRAM driver!
  • Bill Henning Posts: 6,445
    edited 2013-04-09 05:48
    Chip,

    I just read the driver - very nicely done ... I like it!

    For now, we can read a quad, modify it, and write it back - it will work fine.

    however if you are taking requests :-)

    For graphics use, it would be handy (and faster for things like line drawing, drawing circles etc) to add:

    - read/write long
    - read/write word
    - potentially even read/write byte (for 8 bit per pixel modes)

    so that individual pixels can be read/written from/to a frame buffer.

    If you don't have time to add it, obviously you can leave it as an exercise to the readers :)

    Now I am going back to analyzing your code... and trying it.

    Edit:

    I really like the way you documented every cycle, even ones that did not have associated instructions. I think I will start to use the same (or similar) style in my pasm2 code.
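The "read a quad, modify it, and write it back" approach mentioned above can be sketched as follows. This is an illustrative model only: `SdramModel` is a hypothetical stand-in that uses a `bytearray` in place of the real driver and command list, and it moves nothing smaller than one aligned 16-byte QUAD, which matches the smallest transfer (rw_16) in the command table.

```python
QUAD_BYTES = 16


class SdramModel:
    """Hypothetical stand-in for the driver: whole aligned QUADs only."""

    def __init__(self, size):
        self.mem = bytearray(size)

    def read_quad(self, addr):
        if addr % QUAD_BYTES:
            raise ValueError("QUAD reads must be 16-byte aligned")
        return bytearray(self.mem[addr:addr + QUAD_BYTES])

    def write_quad(self, addr, data):
        if addr % QUAD_BYTES or len(data) != QUAD_BYTES:
            raise ValueError("QUAD writes must be aligned and 16 bytes long")
        self.mem[addr:addr + QUAD_BYTES] = data


def write_byte(sdram, addr, value):
    """Read-modify-write: change one byte using only QUAD transfers."""
    base = addr & ~(QUAD_BYTES - 1)     # start of the containing QUAD
    quad = sdram.read_quad(base)        # 1) fetch the whole QUAD
    quad[addr - base] = value & 0xFF    # 2) patch the one byte
    sdram.write_quad(base, quad)        # 3) write the QUAD back


def read_byte(sdram, addr):
    """Fetch the containing QUAD and pick out one byte."""
    base = addr & ~(QUAD_BYTES - 1)
    return sdram.read_quad(base)[addr - base]
```

Each pixel write costs two full QUAD transfers this way, which is why native long/word/byte commands would be faster for line and circle drawing.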
  • Baggers Posts: 3,019
    edited 2013-04-09 06:10
    Thanks Chip, I look forward to having a play with this tonight :)
  • Cluso99 Posts: 18,069
    edited 2013-04-09 06:49
    Nice work Chip!
  • Seairth Posts: 2,474
    edited 2013-04-09 08:02
    I really like the way you documented every cycle, even ones that did not have associated instructions. I think I will start to use the same (or similar) style in my pasm2 code.

    Agreed! Now, if we could only get the editor to show that (clocks and hub window) automatically!
  • rogloh Posts: 5,837
    edited 2013-04-09 08:15
    This is all good stuff and things are getting very interesting. An SDRAM controller in a COG is certainly no cakewalk, especially once you have multiple requestors using different burst sizes and want good performance and low latency. So great work again Chip for already getting things going!

    It will be really fun to see how things develop further in this area and how people start architecting their code to allow combined use of SDRAM for both hires graphics frame buffers and LMM code for example and what caching schemes are used etc. That is what I've always been personally interested in, something that is capable of providing a nice big flat memory space for an LMM VM to access for both its application and some graphics memory buffers. It seems it could soon be a reality and I look forward to seeing just how high the graphics resolutions and bit depths will realistically get with all this potential SDRAM bandwidth once the VM starts to compete for access to the same memory. We'll have to see where things go from here and which partitioning approaches best cut it.

    Very interesting times ahead... :smile:
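For multiple requestors, one simple arrangement is to give each client its own slot in the command list. This is an assumption for illustration; the thread does not prescribe it. The sketch below models the handshake described in the driver's header comment: the list starts as pairs of zeros terminated by an 8, a client writes sdram_address first and hub_address second, and the driver clears hub_address to skip_done (0) when the operation completes. All function names are hypothetical, and refresh and timing are not modeled.

```python
SKIP_DONE = 0
END_OF_LIST = 0b1000


def new_command_list(slots):
    """Pairs of zeros (hub_address, sdram_address), then the terminator."""
    return [SKIP_DONE, 0] * slots + [END_OF_LIST]


def post_request(cmd_list, slot, hub_long, sdram_addr):
    """Client side: fill one slot, arming it with the hub_address long."""
    i = slot * 2
    if cmd_list[i] != SKIP_DONE:
        raise RuntimeError("slot still busy")
    cmd_list[i + 1] = sdram_addr    # 2nd long is set first
    cmd_list[i] = hub_long          # 1st long is set last; non-zero = go


def is_done(cmd_list, slot):
    """Client side: the driver zeroes the slot when the transfer is over."""
    return cmd_list[slot * 2] == SKIP_DONE


def driver_pass(cmd_list, execute):
    """Model of one scan over the list, from the start to end_of_list."""
    i = 0
    while cmd_list[i] != END_OF_LIST:
        if cmd_list[i] != SKIP_DONE:
            execute(cmd_list[i], cmd_list[i + 1])
            cmd_list[i] = SKIP_DONE     # mark the command finished
        i += 2
```

Because the list is scanned in order, slots nearer the start are served first on each pass, so slot order acts as a crude priority in this model.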
  • jazzed Posts: 11,803
    edited 2013-04-09 08:20
    Great to see this. Thanks.
  • tritonium Posts: 543
    edited 2013-04-09 11:42
    Sapieha wrote: »
    Hi Chip.

    I'm working on a self-contained BASIC interpreter for the P2.
    In other words, you can think of it like the old home computers.

    I hope that tells you everything I need it for!

    Oh yes,yes,yes,yes,yes,
    goody,goody,goody

    :lol::lol::lol::lol::lol::lol::lol::lol:

    (happy days are here again.......)

    sigh

    Dave
  • David Betz Posts: 14,516
    edited 2013-04-09 12:03
    tritonium wrote: »
    Oh yes,yes,yes,yes,yes,
    goody,goody,goody

    :lol::lol::lol::lol::lol::lol::lol::lol:

    (happy days are here again.......)

    sigh

    Dave
    You can already run my little ebasic interpreter directly on the P2 (or P1). I guess I should hook it up to a PS2 keyboard and TV/VGA monitor though. Right now it just works through a serial connection.
  • pedward Posts: 1,642
    edited 2013-04-09 12:14
    Chip, any reason this code wouldn't work on a DE0? It's only 57 longs and I could see some of the wizards embedding this in some time-sliced magic for a combined framebuffer driver.
  • cgracey Posts: 14,206
    edited 2013-04-09 14:25
    pedward wrote: »
    Chip, any reason this code wouldn't work on a DE0? It's only 57 longs and I could see some of the wizards embedding this in some time-sliced magic for a combined framebuffer driver.

    It should run fine on the DE0-Nano. The only problem, of course, is that it would have to be altered to do more than just serve SDRAM. You'll need to keep a single-threaded model, unless you can figure out what timeslots can be consistently forfeited. Whooo, there's a challenge!

    ...

    Now that I'm thinking about it, there are no time slots available, except for 7 cycles during a read, which can't be relied upon. The way to do this would be to open up some multiple of 8 cycles for other threads inside the command-fetch loop.
  • cgracey Posts: 14,206
    edited 2013-04-09 14:38
    For graphics use, it would be handy (and faster for things like line drawing, drawing circles etc) to add:

    - read/write long
    - read/write word
    - potentially even read/write byte (for 8 bit per pixel modes)

    so that individual pixels can be read/written from/to a frame buffer.

    I know, we need byte/word/long access, as well, especially for drawing graphics. I'm looking into it...
  • Bill Henning Posts: 6,445
    edited 2013-04-09 14:40
    Thank you!
    cgracey wrote: »
    I know, we need byte/word/long access, as well, especially for drawing graphics. I'm looking into it...
  • pedward Posts: 1,642
    edited 2013-04-09 14:58
    cgracey wrote: »
    I know, we need byte/word/long access, as well, especially for drawing graphics. I'm looking into it...

    The most sensible solution would be to have a tile cache in main memory, then do a BLIT to SDRAM. I assume that a 64 Quad transfer is the most efficient, so that would be the natural size. The problem comes from doing operations across the memory boundaries, but perhaps putting boundary detection into the primitives would give you a good hook to save/load blocks on demand.

    I seem to recall you stating that at high color depths and resolutions, e.g. 1920x1080, there wasn't any memory bandwidth left to draw into the buffer.

    It seems to me that to save cycles, the tile buffer needs to be in COG memory, to avoid hub access.
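The boundary handling mentioned above could, for example, break a block copy into commands the driver accepts. This sketch assumes, from the zero bits in the command table, that each rw_xxxx transfer must start at an SDRAM address aligned to its own size and that hub addresses must be QUAD-aligned. `split_transfer` is a hypothetical helper, not something from the driver.

```python
SIZES = (1024, 512, 256, 128, 64, 32, 16)   # rw_1024 .. rw_16, in bytes


def split_transfer(sdram_addr, hub_addr, length):
    """Split a QUAD-aligned byte range into (hub, sdram, size) commands.

    Greedy: at each step take the largest size that both fits in the
    remaining length and is aligned at the current SDRAM address.
    """
    if sdram_addr % 16 or hub_addr % 16 or length % 16:
        raise ValueError("addresses and length must be multiples of 16")
    chunks = []
    while length > 0:
        for size in SIZES:
            if size <= length and sdram_addr % size == 0:
                break   # size 16 always qualifies, so this always triggers
        chunks.append((hub_addr, sdram_addr, size))
        sdram_addr += size
        hub_addr += size
        length -= size
    return chunks
```

For example, a 1056-byte block starting 16 bytes below a 1024-byte boundary comes out as one rw_16, one rw_1024, and a trailing rw_16, so the full-efficiency 64-QUAD burst is still used for the aligned middle part.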
  • cgracey Posts: 14,206
    edited 2013-04-09 15:08
    pedward wrote: »
    It seems to me that to save cycles, the tile buffer needs to be in COG memory, to avoid hub access.

    Well, the SDRAM can read/write 16 bits per clock, which is the hub transfer rate (or 128 bits every 8 clocks for RDQUAD/WRQUAD). So, SDRAM data can move between pins, a server cog, the hub, and other cogs, all at the same rate.

    I think it's cleanest to just make a generic SDRAM server, thereby decoupling any video dependency from it, and vice-versa. There is no speed advantage to be had, though saving a cog could be important.
  • David Betz Posts: 14,516
    edited 2013-04-09 15:16
    cgracey wrote: »
    I think it's cleanest to just make a generic SDRAM server, thereby decoupling any video dependency from it, and vice-versa. There is no speed advantage to be had, though saving a cog could be important.

    Saving a COG on the DE0-Nano might be particularly important! :-)
  • Ariba Posts: 2,690
    edited 2013-04-09 15:43
    David Betz wrote: »
    Saving a COG on the DE0-Nano might be particularly important! :-)

    So true !

    Would we also need a new FPGA configuration for the DE0-Nano to get an SDRAM driver working?
    If so, it would be wasted time for me to try to get it working on a DE0.

    Andy
  • cgracey Posts: 14,206
    edited 2013-04-09 16:11
    Ariba wrote: »
    So true !

    Would we also need a new FPGA configuration for the DE0-Nano to get an SDRAM driver working?
    If so, it would be wasted time for me to try to get it working on a DE0.

    Andy

    I'm compiling a new file for the DE0-Nano that will make it work like the DE2-115. I'll post it in 20 minutes, or so, once it's done.
  • cgracey Posts: 14,206
    edited 2013-04-09 16:25
    Here is the new DE0_Nano_Prop2 config file that should make the SDRAM work properly:

    DE0_Nano_Prop2.zip