Prop2 Flash loader

Comments

  • With 'FLASH' on and both P59 switches off, you only have 100 ms to start serial comms after reset.
    After that time, if a valid 'Prop' checksum is calculated from the 1024 bytes loaded from SPI flash, the code executes.
    This will block you from starting TAQOZ from a terminal.
    Melbourne, Australia
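Editor's sketch: going by the post above and the `checksum` routine in the loader source further down this thread, the boot check appears to amount to the 256 little-endian longs of the first 1024 bytes summing (with 32-bit wraparound) to $706F7250, the long that spells 'Prop'. A minimal host-side Python model under that assumption; the function names are hypothetical and this is not the boot ROM's own code. Unlike the loader routine, it zeroes the last long before summing.

```python
import struct

PROP_MAGIC = 0x706F7250  # bytes 'P','r','o','p' read as one little-endian long


def long_sum(block: bytes) -> int:
    """Sum the 256 little-endian longs of a 1024-byte block, modulo 2**32."""
    if len(block) != 1024:
        raise ValueError("expected exactly 1024 bytes")
    return sum(struct.unpack("<256I", block)) & 0xFFFFFFFF


def patch_checksum(block: bytes) -> bytes:
    """Return a copy whose last long makes the whole block sum to PROP_MAGIC."""
    body = block[:-4] + bytes(4)  # treat the last long as zero while summing
    fix = (PROP_MAGIC - long_sum(body)) & 0xFFFFFFFF
    return body[:-4] + struct.pack("<I", fix)


def would_boot(block: bytes) -> bool:
    """True if the block passes the assumed 'Prop' sum check."""
    return long_sum(block) == PROP_MAGIC
```

Calling `would_boot(patch_checksum(stage1_bytes))` on a 1024-byte stage1 image should then return `True`, which is the condition the flash loader has to establish before programming.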
  • evanh wrote: » Try turning on P59 Up. I can make it program with both P59 Up and Down on together.
    samuell wrote: » Thanks evanh! That worked while programming with Spin 2 GUI. I get the program loaded up and apparently running. P56 blinks, although it doesn't show anything on the terminal yet.
    evanh wrote: » Next step is turn P59 Up off again and press the reset button.
    samuell wrote: » Strange, after pressing reset, with P59 Up already in the off position, P56 blinks no more. ...

    At that point it has run and finished. If you didn't have the terminal monitoring already then you would have missed the report.

    P56 blinking just indicates that programming of the flash is complete. To get it to run from the flash requires the reset. So, repeated resets will keep rerunning the program in flash.
    We have the vastness of the internet and yet billions of people decided to spend most of their time within a horribly designed, fake-news emporium of a website that sucks every possible piece of personal information out of you so it can sell it to others. And they see nothing wrong with that.
  • ozpropdev wrote: »
    With 'FLASH' on and both P59 switches off, you only have 100 ms to start serial comms after reset.
    After that time, if a valid 'Prop' checksum is calculated from the 1024 bytes loaded from SPI flash, the code executes.
    This will block you from starting TAQOZ from a terminal.

    Thanks! That's what I thought.
    evanh wrote: » Try turning on P59 Up. I can make it program with both P59 Up and Down on together.
    samuell wrote: » Thanks evanh! That worked while programming with Spin 2 GUI. I get the program loaded up and apparently running. P56 blinks, although it doesn't show anything on the terminal yet.
    evanh wrote: » Next step is turn P59 Up off again and press the reset button.
    samuell wrote: » Strange, after pressing reset, with P59 Up already in the off position, P56 blinks no more. ...
    evanh wrote: » At that point it has run and finished. If you didn't have the terminal monitoring already then you would have missed the report. P56 blinking just indicates that programming of the flash is complete. To get it to run from the flash requires the reset. So, repeated resets will keep rerunning the program in flash.

    So, how can I see the report? I'm programming with Spin 2 GUI and see nothing on the terminal window after I press the reset button, with the switches in the correct position.

    Kind regards, Samuel Lourenço
  • Oops, I never ran Oz's version. There is a 20 second pause before the number gets printed. You might not be waiting long enough.
  • I seem to need the P59 pull-up on in order to reprogram flash.

    Also, don't need the P59 pull-down for anything...
    Prop Info and Apps: http://www.rayslogic.com/
  • evanh wrote: »
    Oops, I never ran Oz's version. There is 20 seconds pause before the number gets printed. You might not be waiting long enough.
    Well, I notice that the cursor on the Spin 2 IDE terminal advances every so often. This might be an issue with the baud rate.

    Kind regards, Samuel Lourenço
  • Rayman Posts: 9,906
    edited 2019-01-21 - 16:43:42
    My plan to use this to automatically load flash from SpinEdit isn't going so well...

    I can't figure out why hardcoding the file size doesn't work...

    In this test, I've just hard coded the file size and commented out the 3 lines that overwrite the file size. Doesn't work... Can't figure out why not...

    It works for a very small program like "Larson scanner", but not this big one...
  • Never mind, figured it out...
    Added this line to fix it.
    rdlong byte_count,##size
    
    Don't know why, but it works...

    So, I take the fastspin binary of this and remove the last 32 bytes (fastspin seems to pad the end with 28 bytes of zero for some reason) to remove the file size. Then, I add the filesize. Then, add the program's binary. Add in those 28 zeros (just in case). Load this into ram and it programs the flash.

    Now, I have Load Flash option in SpinEdit. Thanks ozpropdev!
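Editor's sketch: the image-building steps Rayman describes above could look roughly like this on the host side. It assumes the loader binary really does end in a 4-byte size long followed by 28 zero bytes (the "last 32 bytes" in the post) and that the size is stored as a little-endian long; `build_load_image` is a hypothetical name, not something from SpinEdit.

```python
import struct

TRAILER_LEN = 32  # per the post: 4-byte size long + 28 bytes of zero padding
PAD_LEN = 28      # zero padding re-appended at the end, "just in case"


def build_load_image(loader: bytes, program: bytes) -> bytes:
    """Splice a program binary and its size into a copy of the loader binary."""
    if len(loader) < TRAILER_LEN:
        raise ValueError("loader binary is shorter than its assumed trailer")
    head = loader[:-TRAILER_LEN]             # drop old size long and padding
    size = struct.pack("<I", len(program))   # assumption: little-endian long
    return head + size + program + bytes(PAD_LEN)
```

The result is what would be loaded into RAM; the loader then programs the flash from it.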
  • Hi ozpropdev

    I am using your flash loader to load Catalina programs into Flash on the P2_EVAL - many thanks! But I have noticed that it sometimes does not program correctly on the first attempt, but generally seems to work on the second.

    Have you seen anything like this in your own testing?

    Ross.
    Catalina - a FREE ANSI C compiler for the Propeller.
    Download it from http://catalina-c.sourceforge.net/
  • I see it predates the discovery about unreliable PLL mode switching. He's used the older simpler method that is known to randomly fail.
  • evanh wrote: »
    I see it predates the discovery about unreliable PLL mode switching. He's used the older simpler method that is known to randomly fail.

    Ah! Thanks. I will amend the program.
  • evanh wrote: »
    I see it predates the discovery about unreliable PLL mode switching. He's used the older simpler method that is known to randomly fail.

    Hmmm. Still having problems. Can you point me to the correct way we are supposed to set the clock on boot? I thought I was doing it correctly, but perhaps I am not.
  • Here's my write up - https://forums.parallax.com/discussion/comment/1466702/#Comment_1466702

    There was also a big discussion in another topic. Basic structure is to remember and reuse the prior mode config to cleanly switch back to RCFAST clock source before making any adjustment to PLL configuration.
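Editor's sketch: the "reuse the prior mode config" idea can be illustrated with the values the loader source later in this thread passes to HUBSET. This Python model only reproduces the bit packing of `.clk_mode` and the order of the writes seen in that source; the function names are hypothetical, and the settle delays between writes are not modelled.

```python
XSEL_PLL = 0b11  # XI+PLL clock source, matching the XSEL constant in the loader


def pll_mode(xdiv: int, xmul: int, xdivp: int, xosc: int) -> int:
    """Pack the PLL fields the way the loader's .clk_mode long does."""
    xpppp = ((xdivp >> 1) + 15) & 0xF  # 1->15, 2->0, 4->1, ... as in the source
    return ((1 << 24) | ((xdiv - 1) << 18) | ((xmul - 1) << 8)
            | (xpppp << 4) | (xosc << 2))


def hubset_values(mode: int) -> list[int]:
    """Values written to HUBSET, in order, for one run-fast-then-hand-over cycle."""
    running = mode | XSEL_PLL
    return [
        mode,             # start crystal and PLL, clock source still RCFAST
        running,          # after the settle delay, switch over to the PLL
        running & ~0b11,  # later: back to RCFAST with the same PLL config
        0,                # only then shut the crystal and PLL down
    ]
```

With the constants from the source (XDIV = 20, XMUL = 160, XDIVP = 1, XOSC = %10), `pll_mode` gives $014C_9FF8, and the third write repeats that value rather than jumping straight to 0.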
  • Hi RossH
    I haven't had any issues with the flash loader so far.
    I've been using it for quite some time now with my micropython stuff.
    My eval board does seem to behave slightly differently to others though. :)
  • ozpropdev wrote: »
    Hi RossH
    I haven't had any issues with the flash loader so far.
    I've been using it for quite some time now with my micropython stuff.
    My eval board does seem to behave slightly differently to others though. :)

    Yes, it might not be the P2 itself. It could be the P2 EVAL board. Or the boot ROM. I have noticed some odd things previously - they generally seem to sort themselves out with enough power cycles, SD Card removal/re-insertions and/or reboots :(

    But your flash loader is very useful - thanks!
  • Ross,
    IIRC the current ROM boot code for SD can leave the DO pin from the SD card driven. This interferes with the Flash SPI such that it will not work. This has hopefully been fully corrected in the new ROM by forcing the SD card to release DO after each use/transaction.
    Not sure if this is causing your problems.
    My Prop boards: P8XBlade2 , RamBlade , CpuBlade , TriBlade
    P1 Prop OS (also see Sphinx, PropDos, PropCmd, Spinix)
    Website: www.clusos.com
    P1: Tools (Index) , Emulators (Index) , ZiCog (Z80)
    P2: Tools & Code , Tricks & Traps
  • Cluso99 wrote: »
    Ross,
    IIRC the current ROM boot code for SD can leave the DO pin from the SD card driven. This interferes with the Flash SPI such that it will not work. This has hopefully been fully corrected in the new ROM by forcing the SD card to release DO after each use/transaction.
    Not sure if this is causing your problems.

    That may be causing some odd problems I was having with some other code, even if it is not causing this particular problem.
  • For the current engineering samples, consider that if you access the SD, then Flash is a no-go, even after reset!!!

    There was an old thread where this was discussed: a sequence of clocks (96*8, if memory serves me correctly) forces the SD card to release the DO pin.
  • evanh Posts: 8,259
    edited 2019-10-19 - 01:29:13
    I've solved the loadp2 issue with needing to turn P59-UP on and off all the time. The pauses in the loading sequence were just too long, causing the 100 ms timeout to occur. So that sorts the revA Eval board.

    RevB Eval board doesn't seem to program its Flash memory yet. The programming sequence completes but the reset doesn't boot. I'm about to look into this now ...
  • SPI Flash programs Ok on my RevB Eval board.. :)
  • evanh Posts: 8,259
    edited 2019-10-19 - 05:06:09
    Lol, I've done a large amount of re-engineering your programmer/loader code over the last day, Brian. Along with fixing up loadp2 as well.

    Only now did I decide to meter the electrical connections for continuity. Amazingly, pin #1 (Chip Select) of the Flash chip was floating in air. The solder blob was only on top. Reheating it fixed it. Tested and works now.
    [Attached image: cs_fail.JPG]
  • evanh Posts: 8,259
    edited 2019-10-19 - 13:25:50
    Brian,
    This is my current stage1 SPI read routine that is added to front of Flash chip program. Ignore the longer 32 bits.

    Tell me if you can understand the comments describing the SPI clocking. The built-in compensation really makes a big difference with the ability to overclock it for higher bit rates. I probably should have just used the hardware resources. I might give that a whirl next.
    read_byte4
    		outh	#spi_clk		'OUT takes 4 sysclocks to present to the pin
    		nop
    		outl	#spi_clk		'tells SPI chip to clock the second bit out
    
    		rep	@.loop, #31		'one bit per 8 sysclocks, plenty of leeway to accommodate poor slewing
    		outh	#spi_clk
    		testp	#spi_do		wc	'IN takes another 4 or 5 sysclock to present from the pin
    		outl	#spi_clk		'SPI chip clocks out data on falling edge
    		rcl	pa, #1
    .loop
    		nop
    		testp	#spi_do		wc	'picks up data from OUTL seven instructions prior (14 sysclocks)
    	_ret_	rcl	pa, #1			'  the last OUTL is for first bit of next word, if any
    
  • jmg Posts: 14,088
    evanh wrote: »
    .... The built-in compensation really makes a big difference with the ability to overclock it for higher bit rates. I probably should have just used the hardware resources. I might give that a whirl next.

    Here are some new data sheets I spotted that you could glance at when doing SPI work:
    http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-HP-MRAM.pdf
    http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-ULP-MRAM.pdf
    P2 may even boot from these ?
    Ultra low power one is spec'd at 10MHz which seems quite slow, but may be good enough to boot P2, in a low-power system.

    These parts have useful other registers too
    RDSR  05h Read Status Register         1 10MHz or 54MHz
    RDID  0Fh Read Device ID               4 10MHz or 54MHz
    RUID  4Ch Read Unique ID               8 10MHz or 54MHz
    RDSN  C3h Read Serial Number Register  8 10MHz or 54MHz
    
  • evanh wrote: »
    Brian,
    This is my current stage1 SPI read routine that is added to front of Flash chip program. Ignore the longer 32 bits.

    Tell me if you can understand the comments describing the SPI clocking. The built-in compensation really makes a big difference with the ability to overclock it for higher bit rates. I probably should have just used the hardware resources. I might give that a whirl next.
    read_byte4
    		outh	#spi_clk		'OUT takes 4 sysclocks to present to the pin
    		nop
    		outl	#spi_clk		'tells SPI chip to clock the second bit out
    
    		rep	@.loop, #31		'one bit per 8 sysclocks, plenty of leeway to accommodate poor slewing
    		outh	#spi_clk
    		testp	#spi_do		wc	'IN takes another 4 or 5 sysclock to present from the pin
    		outl	#spi_clk		'SPI chip clocks out data on falling edge
    		rcl	pa, #1
    .loop
    		nop
    		testp	#spi_do		wc	'picks up data from OUTL seven instructions prior (14 sysclocks)
    	_ret_	rcl	pa, #1			'  the last OUTL is for first bit of next word, if any
    
    Hi Evan
    Had a quick look and I think I see what you're trying to do.
    The original code works fine up to system clock <= 275MHz.
    What speed are you getting now?

    A variant of loader that uses the Hyperflash would be pretty slick.
    Yet another thing to add to the forever growing TODO list :lol:

  • ozpropdev wrote: »
    The original code works fine up to system clock <= 275MHz.
    What speed are you getting now?
    Right up to crashing just above 400 MHz sysclock. Tested at 5 MHz too, to make sure the compensation isn't over the top.
  • evanh Posts: 8,259
    edited 2019-10-20 - 04:46:44
    jmg wrote: »
    http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-HP-MRAM.pdf
    http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-ULP-MRAM.pdf
    P2 may even boot from these ?
    Ultra low power one is spec'd at 10MHz which seems quite slow, but may be good enough to boot P2, in a low-power system.
    Nice. Two models for two distinct uses - Either as a do-it-all non-volatile buffer RAM or just simple low power rugged boot storage.

    PS: The FBGA package is clearly in need of becoming Hyperbus capable. :D
  • ozpropdev wrote: »
    A variant of loader that uses the Hyperflash would be pretty slick.
    Yet another thing to add to the forever growing TODO list :lol:
    The problem there is an SPI/SD device is still needed for booting as well. If going down that path then probably wise to turn any HyperFlash parts into FAT filesystem storage. As opposed to HyperRAM being used as a large buffer.

  • evanh Posts: 8,259
    edited 2019-10-31 - 02:29:47
    Brian,
    I've been back at this again. It dawned on me that smartpins doing rx behave differently enough from tx that it'd make a lot of sense to use smartpins. So I've added a bunch of init code and replaced that read4 routine above again ... for doing dual-SPI fast reads. Nicely suits booting the onboard SPI Flash chip.
    {
    Prop2 Flash loader
    Version 1.2 17th January 2019 - ozpropdev
    18 Oct 2019   Reengineered the programming bitbashing to resolve an issue that turned out to be a faulty board - Evan H
    31 Oct 2019   Modified to use dual smartpins for block reads with DualSPI signalling
    
    Writes user code (.obj) and loader into flash.
    On P2-ES Eval board "FLASH" switch must be on.
    
    
    "CODE" is stored in FLASH starting @ $1_0000
    First long is code size in bytes.
    
    See end of program for examples of how to include users .obj file.
    
    }
    
    con
    
    		#58,spi_do,spi_di,spi_clk,spi_cs
    
    		write_enable = $06
    		block_unlock = $98
    		block_erase_64k = $D8
    		read_status = 5
    		device_id = $ab
    		enable_reset = $66
    		device_reset = $99
    		read_data = 3
    		page_program = 2
    		read_dual = $3b		' "Fast Read Dual Output" SPI command
    
    '==============================================================================================
    
    dat		org
    
    		drvh	#spi_cs
    		drvl	#spi_clk
    		drvl	#spi_di
    
    'faster loading
    		hubset	 .clk_mode		'config crystal and PLL - still running RCFAST
    		waitx	 ##25_000_000/100	'wait for crystal/PLL to ramp up
    		or	 .clk_mode, #XSEL	'select clock mode
    		hubset	 .clk_mode		'engage
    
    'compute checksum for SPI flash boot
    		call	#checksum
    
    'reset flash
    		call	#chip_reset
    'erase flash
    		mov	addr, #0		'erase_stage1
    		call	#erase_64k
    
    		mov	addr, ##$1_0000		'erase_code
    		mov	blocks, ##512 / 64
    .loop
    		call	#erase_64k
    		add	addr, ##$1_0000
    		djnz	blocks, #.loop
    
    'copy stage1 loader
    		call	#copy_stage1
    
    'copy code to $1_0000
    
    		mov	byte_count,##@code_end - @code
    		loc	ptra,#@size
    		wrlong	byte_count,ptra
    
    		call	#copy_code
    
    		hubset	##%0001 << 28	'hard reset for reboot to Flash
    
    		jmp	#$
    
    
    .clk_mode	long	1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
    
    '------------------------------------------------
    chip_reset
    		call	#busy
    'read device ID for scope to view
    		mov	pa, #device_id
    		outl	#spi_cs
    		call	#send_byte
    		call	#send_addr24		'dummy address
    		call	#read_byte
    		outh	#spi_cs
    
    		mov	pb, #2			'2 us pause in case was sleeping
    		call	#pause_us
    'do the reset
    		callpa	#enable_reset, #send_command
    		callpa	#device_reset, #send_command
    		mov	pb, #50			'50 us pause to let the internal reset occur
    		call	#pause_us
    'clear locks
    		callpa	#write_enable, #send_command
    		callpa	#block_unlock, #send_command
    		jmp	#busy
    
    '------------------------------------------------
    erase_64k	callpa	#write_enable,#send_command
    		mov	pa,#block_erase_64k
    		outl	#spi_cs
    		call	#send_byte
    		call	#send_addr24
    		outh	#spi_cs
    		call	#busy
    		ret
    
    copy_stage1	mov	pages,#4
    		mov	addr,#0
    		loc	ptra,#@stage1
    .loop2		callpa	#write_enable,#send_command
    		mov	byte_count,#256
    		outl	#spi_cs
    		mov	pa,#page_program
    		call	#send_byte
    		call	#send_addr24
    .loop		rdbyte	pa,ptra++
    		call	#send_byte
    		djnz	byte_count,#.loop
    		outh	#spi_cs
    		call	#busy
    		add	addr,#256
    		djnz	pages,#.loop2
    		ret
    
    copy_code	mov	pages,byte_count
    		shr	pages,#8
    		add	pages,#2
    		mov	addr,##$1_0000
    		loc	ptra,#@size
    .loop2		callpa	#write_enable,#send_command
    		mov	byte_count,#256
    		outl	#spi_cs
    		mov	pa,#page_program
    		call	#send_byte
    		call	#send_addr24
    .loop		rdbyte	pa,ptra++
    		call	#send_byte
    		djnz	byte_count,#.loop
    		outh	#spi_cs
    		call	#busy
    		add	addr,#256
    		djnz	pages,#.loop2
    		mov	pb, #2			'2 us pause
    		jmp	#pause_us
    
    '------------------------------------------------
    send_command
    		outl	#spi_cs
    		call	#send_byte
    	_ret_	outh	#spi_cs
    
    '------------------------------------------------
    send_addr24
    		getbyte	pa, addr, #2
    		call	#send_byte
    		getbyte	pa, addr, #1
    		call	#send_byte
    		getbyte	pa, addr, #0
    		jmp	#send_byte
    
    '------------------------------------------------
    send_byte
    		shl	pa, #32-7	wc
    
    		rep	@.loop, #8
    		outc	#spi_di
    		outh	#spi_clk
    		shl	pa, #1	wc
    		outl	#spi_clk
    .loop
    		ret			wcz	'preserve C/Z flags
    
    '------------------------------------------------
    read_byte
    		outh	#spi_clk
    
    		rep	@.loop, #7
    		outl	#spi_clk		'needs to be about 6 clocks early due to I/O buffering
    		testp	#spi_do		wc	'read in bit prior to clock
    		outh	#spi_clk
    		rcl	val,#1
    .loop
    		outl	#spi_clk
    		testp	#spi_do		wc	'read final bit
    		rcl	val,#1
    		ret			wcz	'preserve C/Z flags
    
    '------------------------------------------------
    busy
    		mov	pa, #read_status
    		outl	#spi_cs
    		call	#send_byte
    		call	#read_byte
    		outh	#spi_cs
    		testb	val, #0		wc	'write in progress
    	if_nc	ret			wcz	'preserve C/Z flags
    		jmp	#busy
    
    '------------------------------------------------
    checksum
    		loc	ptra, #@stage1
    		mov	pa, #0
    
    		rep	@.loop, #256
    		rdlong	pb, ptra++
    		add	pa, pb
    .loop
    		subr	pa, ##$706F7250 'Prop'
    		wrlong	pa, ptra[-1]
    		ret
    
    '------------------------------------------------
    pause_us
    		rep	@.rend, pb
    		waitx	#(CLOCKFREQ / 1_000_000)	'one microsecond - assumes a round number of MHz
    .rend
    		ret
    
    
    blocks		long	0
    count		long	0
    addr		long	0
    pages		long	0
    xx		long	0
    byte_count	long	0
    val		long	0
    
    '==============================================================================================
    con
    	XTALFREQ	= 20_000_000			'PLL stage 0: crystal frequency
    	XDIV		= 20				'PLL stage 1: crystal divider (1..64)
    	XMUL		= 160				'PLL stage 2: crystal / div * mul (1..1024)
    	XDIVP		= 1				'PLL stage 3: crystal / div * mul / divp (1,2,4,6..30)
    
    	XOSC		= %10				' OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
    	XSEL		= %11				' XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
    	XPPPP		= ((XDIVP>>1) + 15) & $F	' 1->15, 2->0, 4->1, 6->2...30->14
    	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
    
    	AF_PLUS1	= (%0001 << 28)
    	AF_PLUS2	= (%0010 << 28)
    	AF_PLUS3	= (%0011 << 28)
    	BF_PLUS1	= (%0001 << 24)
    	BF_PLUS2	= (%0010 << 24)
    	BF_PLUS3	= (%0011 << 24)
    
    	P_REGD		= (%1 << 16)			' turn on clocked digital I/O (registered pins)
    	SP_OUT		= (%1 << 6)			' force on pin output when DIR operates smartpin
    	SPM_PULSES	= %00100_0 |SP_OUT		' pulse/cycle output
    	SPM_SSER_TX	= %11100_0 |SP_OUT		' sync serial transmit (A-data, B-clock)
    	SPM_SSER_RX	= %11101_0			' sync serial receive (A-data, B-clock)
    
    
    	DMADIV		= 4		'160 MHz sysclock / 4 = 40 MHz SPI clock (with dual SPI makes 80 Mbit/s or 10 MB/s)
    
    dat
    		orgh	$400
    		org
    
    stage1	
    'config pin for SPI chip select
    		drvh	#spi_cs
    		drvl	#spi_clk
    		drvl	#spi_di
    
    'faster loading
    		hubset	 .clk_mode		'config crystal and PLL - still running RCFAST
    		waitx	 .pause			'wait for crystal/PLL to ramp up
    		or	 .clk_mode, #XSEL	'select clock mode
    		hubset	 .clk_mode		'engage
    
    'load code @$1_0000 to hub address 0
    
    		mov	pb, ##$1_0000		'Flash address to load
    		outl	#spi_cs
    		callpa	#read_dual, #send_byte2
    
    		getbyte	pa, pb, #2		'send Flash reading address
    		call	#send_byte2
    		getbyte	pa, pb, #1
    		call	#send_byte2
    		getbyte	pa, pb, #0
    		call	#send_byte2
    
    'config one smartpin for SPI clock
    		wrpin	#SPM_PULSES, #spi_clk
    		dirl	#spi_clk				'SPI clock still driven low by the smartpin
    		wxpin	##((DMADIV/2)<<16) | DMADIV, #spi_clk	'pulse width (space->mark) and period respectively
    		dirh	#spi_clk
    
    		wypin	#8, #spi_clk		'pace out dummy clocks required by "Fast Read Dual Output"
    		waitx	#50
    
    'config two smartpins for SPI dual data
    		fltl	#spi_do
    		fltl	#spi_di
    		wrpin	##SPM_SSER_RX | BF_PLUS2, #spi_do
    		wrpin	##SPM_SSER_RX | BF_PLUS1, #spi_di
    		wxpin	#15, #spi_do				'32 bits at a time
    		wxpin	#15, #spi_di
    		dirh	#spi_do
    		dirh	#spi_di
    
    'get length of binary data
    		setse1	#(%001<<6)|spi_do
    		wypin	#16, #spi_clk		'16 clock for first 32 bits containing binary length
    		pollse1				'clear prior event - needs a spacer instruction from SETSE1
    
    		call	#read_byte4		'get the "size" value
    		movbyts	pa, #%%0123		'endian swap 24bit length in bytes
    		add	pa, #3			'round up
    		shr	pa, #2			'scale to longwords
    		mov	.lcount, pa
    
    'full-on continuous burst, right up to sysclock/2!
    		wrfast	#0, #0			'start FIFO at beginning of hubRAM
    		shl	pa, #4			'x16 clocks per longword
    		wypin	pa, #spi_clk		'start clocking for the full length
    .loop
    		call	#read_byte4
    		movbyts	pa, #%%0123		'want as little-endian
    		wflong	pa
    		djnz	.lcount, #.loop
    
    		outh	#spi_cs
    		rdfast	#0, #0			'flush the FIFO
    
    'go back to RCFAST mode before handover
    		andn	 .clk_mode, #%11	'select RCFAST clock mode while retaining the old PLL config
    		hubset	 .clk_mode		'switch to RCFAST, critical reliability workaround for hardware bug
    		hubset	 #0			'shutdown crystal and PLL
    		waitx	 .pause			'wait for crystal shutdown, emulating hard reset conditions
    
    		coginit	#0, #0			'kick it!
    
    
    .clk_mode	long	1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
    .pause		long	25_000_000/100
    .lcount		long	0
    
    
    '------------------------------------------------
    send_byte2
    		shl	pa, #32-7	wc
    		rep	@.loop,#8
    		outc	#spi_di
    		outh	#spi_clk
    		shl	pa, #1		wc
    		outl	#spi_clk
    .loop		
    		ret
    
    '------------------------------------------------
    read_byte4
    		waitse1				'wait for smartpin (spi_do) buffer full event
    
    		rdpin	pa, #spi_do		'16-bit shift-in as little-endian (odd bits)
    		rdpin	pb, #spi_di		'(even bits)
    		rev	pa			'but SPI data is stored as big-endian (odd bits)
    		rev	pb			'(even bits)
    		rolword	pa, pb, #0		'combine to a single 32-bit word
    	_ret_	mergew	pa			'untangle the odd-even pattern
    '------------------------------------------------
    
    		fit	$100
    		orgf	$100
    
    '==============================================================================================
    
    		orgh
    
    size		long	0			'located at Flash address $1_0000
    
    code
    
    'example code indicating programming succeeded
    
    		drvh	#56			'LED56 off
    		drvl	#57			'LED57 on
    
    		rep	@.floop, #0		'loop forever toggling the LEDs
    		outnot	#56
    		outnot	#57
    		waitx	##(25_000_000/4)
    .floop
    
    
    '		file	"_P2 Invaders 2.0.52_eval.obj"
    code_end
    
    
    

    EDIT: Done a small tidy up. Added back in crystal clock setting for faster Flash programming. Had originally been removed when I had the unsoldered chip select pin on my revB board and I thought the issue must have been software.
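Editor's sketch: per the comments in `read_byte4` above, one data pin carries the odd-numbered bits and the other the even-numbered bits of each 32-bit word, and `movbyts pa, #%%0123` then swaps the byte order. The Python below models only that intended result (lane interleave plus byte swap); it does not emulate the REV, ROLWORD or MERGEW instructions themselves, and the function names are hypothetical.

```python
def merge_dual(odd: int, even: int) -> int:
    """Interleave two 16-bit lanes into one 32-bit word.

    Bit i of `odd` becomes bit 2*i+1 of the result and bit i of `even`
    becomes bit 2*i, so the top bit of the word comes from the odd lane.
    """
    out = 0
    for i in range(16):
        out |= ((odd >> i) & 1) << (2 * i + 1)
        out |= ((even >> i) & 1) << (2 * i)
    return out


def split_dual(word: int) -> tuple[int, int]:
    """Inverse of merge_dual: split a 32-bit word into (odd, even) lanes."""
    odd = even = 0
    for i in range(16):
        odd |= ((word >> (2 * i + 1)) & 1) << i
        even |= ((word >> (2 * i)) & 1) << i
    return odd, even


def byte_swap32(word: int) -> int:
    """Reverse the four bytes of a 32-bit word (big-endian <-> little-endian)."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")
```

Each long written to hub RAM then corresponds to `byte_swap32(merge_dual(odd, even))`, with the two lanes read from the pins the source labels as odd (`spi_do`) and even (`spi_di`).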
  • evanh Posts: 8,259
    edited 2019-10-31 - 23:45:51
    Just done some experimenting with pin registering and found that sync serial smartpin mode surprisingly works better without. And then can even double the SPI clock rate by setting X[5] = 1 of the serial rx smartpins.

    I can't visualise why but testing has definitely proved it. Tested a 300 kByte binary using SPI clock = sysclock/2 with sysclock from 4 MHz to 160 MHz on the revA Eval board with its long SPI tracks. So up to 80 MHz SPI clock (20 MBytes/s)! I wouldn't be surprised to see the SPI clock attenuated down to something like one volt.

    EDIT: Ah, registering just the SPI clock pin does help a small amount. Err, or not, it fails the 4 MHz sysclock test. Hmm, that's not a good sign ...

    EDIT2: Right, given that issue, I figure pin registering all round is a good idea. With both clock and data pins registered the revA Eval Board works up to 60 MHz SPI clock (120 MHz sysclock) and the revB Eval Board works up to 115 MHz SPI clock (230 MHz sysclock). PS: Room temperature of 21 °C.
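Editor's sketch: the clock and throughput figures quoted in this thread follow from simple arithmetic, which this Python snippet reproduces. It assumes dual SPI moves two bits per SPI clock and ignores command, address and dummy-clock overhead; the function names are hypothetical.

```python
def clock_freq(xtal_hz: int, xdiv: int, xmul: int, xdivp: int) -> int:
    """Sysclock from the PLL constants, mirroring CLOCKFREQ in the loader."""
    return round(xtal_hz / xdiv * xmul / xdivp)


def dual_spi_rate(sysclock_hz: float, divider: int) -> tuple[float, float]:
    """Return (SPI clock in Hz, payload rate in bytes per second) for dual SPI."""
    spi_hz = sysclock_hz / divider
    bytes_per_s = spi_hz * 2 / 8  # two data bits per SPI clock
    return spi_hz, bytes_per_s
```

With the loader's constants this gives a 160 MHz sysclock; a divider of 4 yields 40 MHz SPI and 10 MB/s, and sysclock/2 yields 80 MHz SPI and 20 MB/s, matching the numbers in the posts.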
  • evanh Posts: 8,259
    edited 2019-10-31 - 23:46:41
    Hmm, now I've successfully tested a faster config of:
    - Falling edge SPI clock. Had always previously been configured as rising edge.
    - Post-clock-edge data-in sampling (late sampling).
    - Data pins registered, clock pin unregistered.

    Works on revB Eval Board at 2 MHz sysclock (1 MHz SPI clock) at -10 °C. This was the critical test: it demonstrates that the setup is not easily fooled into an early sample.

    I think the reason it works is because the unregistered SPI clock out has enough of a natural delay line. I'm not quite sure how the post-clock sampling actually works but it seems to still get in before the Flash chip has responded to the falling clock edge. The other three registered/unregistered combinations don't work in this setup.

    Doh! The Flash programming routines fail at 300 MHz sysclock. :( Something else to fix ... hacked around ... Whoa! At 25°C, pulling 360 MHz sysclock (180 MHz SPI clock) now! A mere 35% above rating of the Flash chip. :D EDIT: Err, well, its rating is at 85 °C.

    EDIT2: 360 MHz fell over around 30 °C. 340 MHz made it to 65 °C. 330 MHz got to 80 °C. 320 MHz got about 100 °C.

    PS: Take those high measurements with some salt. I'm doing this in an open space with a cheap **** hair dryer, so the gradients are getting large above 60 °C.

    PPS: revA Eval Board reaches 200 MHz sysclock (100 MHz SPI clock) at 21 °C with this config.

    EDIT3: Updated source code to de-glitch the transition from 1-bit to 2-bit SPI.