With 'FLASH' on and both P59 switches off, you only have 100 ms to start serial comms after reset.
After that time, if a valid 'Prop' checksum is calculated from the 1024 bytes loaded from SPI flash, the code executes.
This will block you from starting TAQOZ from a terminal.
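As a host-side illustration of that checksum rule, here is a minimal Python sketch. It assumes the check is "the 256 little-endian longs of the first 1024 bytes sum to 'Prop' ($706F7250), modulo 2^32", which is what the loader's own checksum routine in this thread arranges by patching the last long. The function names are hypothetical and not part of any tool.

```python
import struct

PROP_MAGIC = 0x706F7250  # the bytes "Prop" read as one little-endian long
BOOT_SIZE = 1024         # bytes loaded from SPI flash at boot, per the post above


def long_sum(block: bytes) -> int:
    """Sum the 256 little-endian longs of a 1024-byte block, modulo 2**32."""
    if len(block) != BOOT_SIZE:
        raise ValueError("boot block must be exactly 1024 bytes")
    return sum(struct.unpack("<256I", block)) & 0xFFFFFFFF


def patch_checksum(block: bytes) -> bytes:
    """Overwrite the last long so that the whole block sums to PROP_MAGIC."""
    if len(block) != BOOT_SIZE:
        raise ValueError("boot block must be exactly 1024 bytes")
    body = block[:-4]
    partial = sum(struct.unpack("<255I", body)) & 0xFFFFFFFF
    last = (PROP_MAGIC - partial) & 0xFFFFFFFF
    return body + struct.pack("<I", last)


def is_bootable(block: bytes) -> bool:
    """True when the block would pass the assumed 'Prop' checksum test."""
    return long_sum(block) == PROP_MAGIC
```

A block that passes this test is what stops the boot from falling through to the serial/TAQOZ path once the 100 ms window has expired.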
Try turning on P59 Up. I can make it program with both P59 Up and Down on together.
Thanks evanh! That worked while programming with Spin 2 GUI. I get the program loaded up and apparently running. P56 blinks, although it doesn't show anything on the terminal yet.
The next step is to turn P59 Up off again and press the reset button.
Strange, after pressing reset, with P59 Up already in the off position, P56 blinks no more. ...
At that point it has run and finished. If you didn't have the terminal monitoring already then you would have missed the report.
P56 blinking just indicates that programming of the flash is complete. To get it to run from the flash requires the reset. So, repeated resets will keep rerunning the program in flash.
So, how can I see the report? I'm programming with Spin 2 GUI and see nothing on the terminal window after I press the reset button, with the switches in the correct position.
My plan to use this to automatically load flash from SpinEdit isn't going so well...
I can't figure out why hardcoding the file size doesn't work...
In this test, I've just hard coded the file size and commented out the 3 lines that overwrite the file size. Doesn't work... Can't figure out why not...
It works for a very small program like "Larson scanner", but not this big one...
Never mind, figured it out...
Added this line to fix it.
rdlong byte_count,##size
Don't know why, but it works...
So, I take the fastspin binary of this and remove the last 32 bytes (fastspin seems to pad the end with 28 bytes of zero for some reason) to remove the file size. Then, I add the filesize. Then, add the program's binary. Add in those 28 zeros (just in case). Load this into ram and it programs the flash.
Now, I have Load Flash option in SpinEdit. Thanks ozpropdev!
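The splice described above can be sketched on the host side roughly as follows. This is only an illustration under stated assumptions: the loader binary is taken to end with a 4-byte size long followed by 28 zero bytes (the 32 bytes mentioned in the post), the size is written little-endian, and `build_flash_image` is a hypothetical helper name, not something from SpinEdit or fastspin.

```python
import struct

TRAILER_LEN = 32  # 4-byte size long + 28 bytes of zero padding, per the post
PAD_LEN = 28      # padding re-appended "just in case"


def build_flash_image(loader: bytes, program: bytes) -> bytes:
    """Return loader (minus its trailer) + size long + program + zero padding."""
    if len(loader) < TRAILER_LEN:
        raise ValueError("loader binary is shorter than its expected trailer")
    head = loader[:-TRAILER_LEN]                # drop old size long and padding
    size_long = struct.pack("<I", len(program)) # assumed little-endian
    return head + size_long + program + bytes(PAD_LEN)
```

The result is what gets loaded into RAM; the loader then finds the size long immediately ahead of the program bytes.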
I am using your flash loader to load Catalina programs into Flash on the P2_EVAL - many thanks! But I have noticed that it sometimes does not program correctly on the first attempt, but generally seems to work on the second.
Have you seen anything like this in your own testing?
I see it predates the discovery about unreliable PLL mode switching. He's used the older simpler method that is known to randomly fail.
Hmmm. Still having problems. Can you point me to the correct way we are supposed to set the clock on boot? I thought I was doing it correctly, but perhaps I am not.
There was also a big discussion in another topic. Basic structure is to remember and reuse the prior mode config to cleanly switch back to RCFAST clock source before making any adjustment to PLL configuration.
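To make that ordering concrete, here is a small Python model of the HUBSET values involved. It only computes numbers and does not touch hardware. The field layout and constants are copied from the loader source posted in this thread (XDIV = 20, XMUL = 160, XDIVP = 1, XOSC = %10, XSEL = %11); the function names are hypothetical. Note that the Spin2 expression in the listing relies on shifts binding tighter than addition, so the Python version needs explicit parentheses.

```python
XDIV, XMUL, XDIVP = 20, 160, 1
XOSC = 0b10              # crystal with 15 pF loading
XSEL = 0b11              # clock source: crystal + PLL
RCFAST_HZ = 25_000_000   # nominal figure the loader uses for its WAITX delay


def xpppp(divp: int) -> int:
    """Encode the PLL post-divider: 1->15, 2->0, 4->1, 6->2 ... 30->14."""
    return ((divp >> 1) + 15) & 0xF


def pll_config(xdiv: int, xmul: int, xdivp: int, xosc: int) -> int:
    """Clock mode with the PLL configured but RCFAST still selected."""
    return ((1 << 24) | ((xdiv - 1) << 18) | ((xmul - 1) << 8)
            | (xpppp(xdivp) << 4) | (xosc << 2))


def engage(config: int, xsel: int = XSEL) -> list:
    """HUBSET values in order: set up crystal/PLL, (wait), then switch over."""
    return [config, config | xsel]


def release(engaged_mode: int) -> list:
    """HUBSET values in order: back to RCFAST keeping the PLL config, then off."""
    return [engaged_mode & ~0b11, 0]


RAMP_WAIT_CLOCKS = RCFAST_HZ // 100  # about 10 ms between the two engage steps
```

The point of `release` is that the first value differs from the running mode only in the two clock-select bits, so the PLL settings are untouched at the moment the clock source changes.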
Hi RossH
I haven't had any issues with the flash loader so far.
I've been using it for quite some time now with my micropython stuff.
My eval board does seem to behave slightly differently to others though.
Yes, it might not be the P2 itself. It could be the P2 EVAL board. Or the boot ROM. I have noticed some odd things previously - they generally seem to sort themselves out with enough power cycles, SD Card removal/re-insertions and/or reboots.
Ross,
IIRC the current ROM boot code for SD can leave the DO pin from the SD card driven. This interferes with the Flash SPI such that it will not work. This has hopefully been fully corrected in the new ROM by forcing the SD card to release DO after each use/transaction.
Not sure if this is causing your problems.
That may be causing some odd problems I was having with some other code, even if it is not causing this particular problem.
For the current engineering samples, consider that if you access the SD, then Flash is a no-go, even after reset!!!
There was an old thread where this was discussed: if memory serves me correctly, there is a sequence of clocks (96*8) that forces the SD card to release the DO pin.
I've solved the loadp2 issue with needing to turn P59-UP on and off all the time. The pauses in the loading sequence were just too long, causing the 100 ms timeout to occur. So that sorts the revA Eval board.
RevB Eval board doesn't seem to program its Flash memory yet. The programming sequence completes but the reset doesn't boot. I'm about to look into this now ...
Lol, I've done a large amount of re-engineering your programmer/loader code over the last day, Brian. Along with fixing up loadp2 as well.
Only now did I decide to meter the electrical connections for continuity. Amazingly, pin #1 (Chip Select) of the Flash chip was floating in air. The solder blob was only on top. Reheating it fixed it. Tested and works now.
Brian,
This is my current stage1 SPI read routine that is added to the front of the Flash chip program. Ignore the longer 32 bits.
Tell me if you can understand the comments describing the SPI clocking. The built-in compensation really makes a big difference with the ability to overclock it for higher bit rates. I probably should have just used the hardware resources. I might give that a whirl next.
read_byte4
outh #spi_clk 'OUT takes 4 sysclocks to present to the pin
nop
outl #spi_clk 'tells SPI chip to clock the second bit out
rep @.loop, #31 'one bit per 8 sysclocks, plenty of leeway to accommodate poor slewing
outh #spi_clk
testp #spi_do wc 'IN takes another 4 or 5 sysclock to present from the pin
outl #spi_clk 'SPI chip clocks out data on falling edge
rcl pa, #1
.loop
nop
testp #spi_do wc 'picks up data from OUTL seven instructions prior (14 sysclocks)
_ret_ rcl pa, #1 ' the last OUTL is for first bit of next word, if any
RDSR  05h  Read Status Register         1  10MHz or 54MHz
RDID  0Fh  Read Device ID               4  10MHz or 54MHz
RUID  4Ch  Read Unique ID               8  10MHz or 54MHz
RDSN  C3h  Read Serial Number Register  8  10MHz or 54MHz
Hi Evan
Had a quick look and I think I see what you're trying to do.
The original code works fine up to a 275 MHz system clock.
What speed are you getting now?
A variant of loader that uses the Hyperflash would be pretty slick.
Yet another thing to add to the forever growing TODO list
The problem there is an SPI/SD device is still needed for booting as well. If going down that path then probably wise to turn any HyperFlash parts into FAT filesystem storage. As opposed to HyperRAM being used as a large buffer.
Brian,
I've been back at this again. It dawned on me that smartpins doing rx behave differently enough from tx that it'd make a lot of sense to use smartpins. So I've added a bunch of init code and replaced that read4 routine above again ... for doing dual-SPI fast reads. Nicely suits booting the onboard SPI Flash chip.
{
Prop2 Flash loader
Version 1.2 17th January 2019 - ozpropdev
18 Oct 2019 Reengineered the programming bitbashing to resolve an issue that turned out to be a faulty board - Evan H
31 Oct 2019 Modified to use dual smartpins for block reads with DualSPI signalling
Writes user code (.obj) and loader into flash.
On P2-ES Eval board "FLASH" switch must be on.
"CODE" is stored in FLASH starting @ $1_0000
First long is code size in bytes.
See end of program for examples of how to include users .obj file.
}
con
#58,spi_do,spi_di,spi_clk,spi_cs
write_enable = $06
block_unlock = $98
block_erase_64k = $D8
read_status = 5
device_id = $ab
enable_reset = $66
device_reset = $99
read_data = 3
page_program = 2
read_dual = $3b ' "Fast Read Dual Output" SPI command
'==============================================================================================
dat org
drvh #spi_cs
drvl #spi_clk
drvl #spi_di
'faster loading
hubset .clk_mode 'config crystal and PLL - still running RCFAST
waitx ##25_000_000/100 'wait for crystal/PLL to ramp up
or .clk_mode, #XSEL 'select clock mode
hubset .clk_mode 'engage
'compute checksum for SPI flash boot
call #checksum
'reset flash
call #chip_reset
'erase flash
mov addr, #0 'erase_stage1
call #erase_64k
mov addr, ##$1_0000 'erase_code
mov blocks, ##512 / 64
.loop
call #erase_64k
add addr, ##$1_0000
djnz blocks, #.loop
'copy stage1 loader
call #copy_stage1
'copy code to $1_0000
mov byte_count,##@code_end - @code
loc ptra,#@size
wrlong byte_count,ptra
call #copy_code
hubset ##%0001 << 28 'hard reset for reboot to Flash
jmp #$
.clk_mode long 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
'------------------------------------------------
chip_reset
call #busy
'read device ID for scope to view
mov pa, #device_id
outl #spi_cs
call #send_byte
call #send_addr24 'dummy address
call #read_byte
outh #spi_cs
mov pb, #2 '2 us pause in case was sleeping
call #pause_us
'do the reset
callpa #enable_reset, #send_command
callpa #device_reset, #send_command
mov pb, #50 '50 us pause to let the internal reset occur
call #pause_us
'clear locks
callpa #write_enable, #send_command
callpa #block_unlock, #send_command
jmp #busy
'------------------------------------------------
erase_64k callpa #write_enable,#send_command
mov pa,#block_erase_64k
outl #spi_cs
call #send_byte
call #send_addr24
outh #spi_cs
call #busy
ret
copy_stage1 mov pages,#4
mov addr,#0
loc ptra,#@stage1
.loop2 callpa #write_enable,#send_command
mov byte_count,#256
outl #spi_cs
mov pa,#page_program
call #send_byte
call #send_addr24
.loop rdbyte pa,ptra++
call #send_byte
djnz byte_count,#.loop
outh #spi_cs
call #busy
add addr,#256
djnz pages,#.loop2
ret
copy_code mov pages,byte_count
shr pages,#8
add pages,#2
mov addr,##$1_0000
loc ptra,#@size
.loop2 callpa #write_enable,#send_command
mov byte_count,#256
outl #spi_cs
mov pa,#page_program
call #send_byte
call #send_addr24
.loop rdbyte pa,ptra++
call #send_byte
djnz byte_count,#.loop
outh #spi_cs
call #busy
add addr,#256
djnz pages,#.loop2
mov pb, #2 '2 us pause
jmp #pause_us
'------------------------------------------------
send_command
outl #spi_cs
call #send_byte
_ret_ outh #spi_cs
'------------------------------------------------
send_addr24
getbyte pa, addr, #2
call #send_byte
getbyte pa, addr, #1
call #send_byte
getbyte pa, addr, #0
jmp #send_byte
'------------------------------------------------
send_byte
shl pa, #32-7 wc
rep @.loop, #8
outc #spi_di
outh #spi_clk
shl pa, #1 wc
outl #spi_clk
.loop
ret wcz 'preserve C/Z flags
'------------------------------------------------
read_byte
outh #spi_clk
rep @.loop, #7
outl #spi_clk 'needs to be about 6 clocks early due to I/O buffering
testp #spi_do wc 'read in bit prior to clock
outh #spi_clk
rcl val,#1
.loop
outl #spi_clk
testp #spi_do wc 'read final bit
rcl val,#1
ret wcz 'preserve C/Z flags
'------------------------------------------------
busy
mov pa, #read_status
outl #spi_cs
call #send_byte
call #read_byte
outh #spi_cs
testb val, #0 wc 'write in progress
if_nc ret wcz 'preserve C/Z flags
jmp #busy
'------------------------------------------------
checksum
loc ptra, #@stage1
mov pa, #0
rep @.loop, #256
rdlong pb, ptra++
add pa, pb
.loop
subr pa, ##$706F7250 '"Prop" (little-endian)
wrlong pa, ptra[-1]
ret
'------------------------------------------------
pause_us
rep @.rend, pb
waitx #(CLOCKFREQ / 1_000_000) 'one microsecond - assumes a round number of MHz
.rend
ret
blocks long 0
count long 0
addr long 0
pages long 0
xx long 0
byte_count long 0
val long 0
'==============================================================================================
con
XTALFREQ = 20_000_000 'PLL stage 0: crystal frequency
XDIV = 20 'PLL stage 1: crystal divider (1..64)
XMUL = 160 'PLL stage 2: crystal / div * mul (1..1024)
XDIVP = 1 'PLL stage 3: crystal / div * mul / divp (1,2,4,6..30)
XOSC = %10 ' OSC ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
XSEL = %11 ' XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
XPPPP = ((XDIVP>>1) + 15) & $F ' 1->15, 2->0, 4->1, 6->2...30->14
CLOCKFREQ = round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
AF_PLUS1 = (%0001 << 28)
AF_PLUS2 = (%0010 << 28)
AF_PLUS3 = (%0011 << 28)
BF_PLUS1 = (%0001 << 24)
BF_PLUS2 = (%0010 << 24)
BF_PLUS3 = (%0011 << 24)
P_REGD = (%1 << 16) ' turn on clocked digital I/O (registered pins)
SP_OUT = (%1 << 6) ' force on pin output when DIR operates smartpin
SPM_PULSES = %00100_0 |SP_OUT ' pulse/cycle output
SPM_SSER_TX = %11100_0 |SP_OUT ' sync serial transmit (A-data, B-clock)
SPM_SSER_RX = %11101_0 ' sync serial receive (A-data, B-clock)
DMADIV = 4 '160 MHz sysclock / 4 = 40 MHz SPI clock (with dual SPI makes 80 Mbit/s or 10 MB/s)
dat
orgh $400
org
stage1
'config pin for SPI chip select
drvh #spi_cs
drvl #spi_clk
drvl #spi_di
'faster loading
hubset .clk_mode 'config crystal and PLL - still running RCFAST
waitx .pause 'wait for crystal/PLL to ramp up
or .clk_mode, #XSEL 'select clock mode
hubset .clk_mode 'engage
'load code @$1_0000 to hub address 0
mov pb, ##$1_0000 'Flash address to load
outl #spi_cs
callpa #read_dual, #send_byte2
getbyte pa, pb, #2 'send Flash reading address
call #send_byte2
getbyte pa, pb, #1
call #send_byte2
getbyte pa, pb, #0
call #send_byte2
'config one smartpin for SPI clock
wrpin #SPM_PULSES, #spi_clk
dirl #spi_clk 'SPI clock still driven low by the smartpin
wxpin ##((DMADIV/2)<<16) | DMADIV, #spi_clk 'pulse width (space->mark) and period respectively
dirh #spi_clk
wypin #8, #spi_clk 'pace out dummy clocks required by "Fast Read Dual Output"
waitx #50
'config two smartpins for SPI dual data
fltl #spi_do
fltl #spi_di
wrpin ##SPM_SSER_RX | BF_PLUS2, #spi_do
wrpin ##SPM_SSER_RX | BF_PLUS1, #spi_di
wxpin #15, #spi_do '32 bits at a time
wxpin #15, #spi_di
dirh #spi_do
dirh #spi_di
'get length of binary data
setse1 #(%001<<6)|spi_do
wypin #16, #spi_clk '16 clock for first 32 bits containing binary length
pollse1 'clear prior event - needs a spacer instruction from SETSE1
call #read_byte4 'get the "size" value
movbyts pa, #%%0123 'endian swap 24bit length in bytes
add pa, #3 'round up
shr pa, #2 'scale to longwords
mov .lcount, pa
'full-on continuous burst, right up to sysclock/2!
wrfast #0, #0 'start FIFO at beginning of hubRAM
shl pa, #4 'x16 clocks per longword
wypin pa, #spi_clk 'start clocking for the full length
.loop
call #read_byte4
movbyts pa, #%%0123 'want as little-endian
wflong pa
djnz .lcount, #.loop
outh #spi_cs
rdfast #0, #0 'flush the FIFO
'go back to RCFAST mode before handover
andn .clk_mode, #%11 'select RCFAST clock mode while retaining the old PLL config
hubset .clk_mode 'switch to RCFAST, critical reliability workaround for hardware bug
hubset #0 'shutdown crystal and PLL
waitx .pause 'wait for crystal shutdown, emulating hard reset conditions
coginit #0, #0 'kick it!
.clk_mode long 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
.pause long 25_000_000/100
.lcount long 0
'------------------------------------------------
send_byte2
shl pa, #32-7 wc
rep @.loop,#8
outc #spi_di
outh #spi_clk
shl pa, #1 wc
outl #spi_clk
.loop
ret
'------------------------------------------------
read_byte4
waitse1 'wait for smartpin (spi_do) buffer full event
rdpin pa, #spi_do '16-bit shift-in as little-endian (odd bits)
rdpin pb, #spi_di '(even bits)
rev pa 'but SPI data is stored as big-endian (odd bits)
rev pb '(even bits)
rolword pa, pb, #0 'combine to a single 32-bit word
_ret_ mergew pa 'untangle the odd-even pattern
'------------------------------------------------
fit $100
orgf $100
'==============================================================================================
orgh
size long 0 'located at Flash address $1_0000
code
'example code indicating programming succeeded
drvh #56 'LED56 off
drvl #57 'LED57 on
rep @.floop, #0 'loop forever toggling the LEDs
outnot #56
outnot #57
waitx ##(25_000_000/4)
.floop
' file "_P2 Invaders 2.0.52_eval.obj"
code_end
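For readers following the stage1 read path, the lane arrangement can be modelled on a host in Python. This is a sketch of the signalling only, not of the smartpin instructions: it assumes the usual "Fast Read Dual Output" ($3B) convention where each clock delivers two bits, with the odd-numbered bits of each byte on DO and the even-numbered bits on DI, most significant first, which matches the "(odd bits)" and "(even bits)" comments in the listing. All function names are hypothetical.

```python
def split_lanes(data: bytes):
    """Return (do_bits, di_bits): the per-clock bit seen on each data pin."""
    do_bits, di_bits = [], []
    for byte in data:
        for shift in (6, 4, 2, 0):             # four clocks per byte, MSB first
            do_bits.append((byte >> (shift + 1)) & 1)  # odd bits 7,5,3,1
            di_bits.append((byte >> shift) & 1)        # even bits 6,4,2,0
    return do_bits, di_bits


def merge_lanes(do_bits, di_bits) -> bytes:
    """Re-interleave the two lanes into bytes, in flash address order."""
    out = bytearray()
    for i in range(0, len(do_bits), 4):
        byte = 0
        for odd, even in zip(do_bits[i:i + 4], di_bits[i:i + 4]):
            byte = (byte << 2) | (odd << 1) | even
        out.append(byte)
    return bytes(out)


def clocks_for(n_bytes: int) -> int:
    """SPI clocks needed: round up to whole longs, 16 clocks per 32-bit long."""
    longs = (n_bytes + 3) >> 2
    return longs * 16
```

Each 32-bit word therefore arrives as two 16-bit halves, one per pin, which is why the listing sets both receive smartpins to 16 bits and issues 16 clocks per long. The bytes come out in flash address order, so interpreting four of them as a little-endian long gives the value stored in hub RAM.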
EDIT: Done a small tidy up. Added back in crystal clock setting for faster Flash programming. Had originally been removed when I had the unsoldered chip select pin on my revB board and I thought the issue must have been software.
Just done some experimenting with pin registering and found that sync serial smartpin mode surprisingly works better without. And then can even double the SPI clock rate by setting X[5] = 1 of the serial rx smartpins.
I can't visualise why but testing has definitely proved it. Tested a 300 kByte binary using SPI clock = sysclock/2 with sysclock from 4 MHz to 160 MHz on the revA Eval board with its long SPI tracks. So up to 80 MHz SPI clock (20 MBytes/s)! I wouldn't be surprised to see the SPI clock attenuated down to something like one volt.
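The arithmetic behind those figures is simple enough to check with a few lines of Python. This is just a calculator with a hypothetical function name: SPI clock is sysclock divided by the divider, and dual SPI moves two bits per SPI clock.

```python
def dual_spi_rate(sysclock_hz: int, divider: int = 2, lanes: int = 2):
    """Return (SPI clock in Hz, payload rate in bytes per second)."""
    spi_clock = sysclock_hz // divider
    bytes_per_second = spi_clock * lanes // 8
    return spi_clock, bytes_per_second


# 160 MHz sysclock at sysclock/2 gives an 80 MHz SPI clock and 20 MB/s
print(dual_spi_rate(160_000_000))
# the loader's DMADIV = 4 setting gives a 40 MHz SPI clock and 10 MB/s
print(dual_spi_rate(160_000_000, divider=4))
```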
EDIT: Ah, registering just the SPI clock pin does help a small amount. Err, or not, it fails the 4 MHz sysclock test. Hmm, that's not a good sign ...
EDIT2: Right, given that issue, I figure pin registering all round is a good idea. With both clock and data pins registered the revA Eval Board works up to 60 MHz SPI clock (120 MHz sysclock) and the revB Eval Board works up to 115 MHz SPI clock (230 MHz sysclock). PS: Room temperature of 21 °C.
Hmm, now I've successfully tested a faster config of:
- Falling edge SPI clock. Had always previously been configured as rising edge.
- Post-clock-edge data-in sampling (late sampling).
- Data pins registered, clock pin unregistered.
Works on revB Eval Board at 2 MHz sysclock (1 MHz SPI clock) at -10 °C. This was the critical test. It demonstrates that this config is not easily fooled into an early sample.
I think the reason it works is because the unregistered SPI clock out has enough of a natural delay line. I'm not quite sure how the post-clock sampling actually works but it seems to still get in before the Flash chip has responded to the falling clock edge. The other three registered/unregistered combinations don't work in this setup.
Doh! The Flash programming routines fail at 300 MHz sysclock. Something else to fix ... hacked around ... Whoa! At 25°C, pulling 360 MHz sysclock (180 MHz SPI clock) now! A mere 35% above rating of the Flash chip. EDIT: Err, well, its rating is at 85 °C.
EDIT2: 360 MHz fell over around 30 °C. 340 MHz made it to 65 °C. 330 MHz got to 80 °C. 320 MHz got to about 100 °C.
PS: Take those high measurements with some salt. I'm doing this in an open space with a cheap Smile hair dryer, so the gradients are getting large above 60 °C.
PPS: revA Eval Board reaches 200 MHz sysclock (100 MHz SPI clock) at 21 °C with this config.
EDIT3: Updated source code to de-glitch the transition from 1-bit to 2-bit SPI.
Comments
Kind regards, Samuel Lourenço
Also, don't need the P59 pull-down for anything...
Ross.
Ah! Thanks. I will amend the program.
But your flash loader is very useful - thanks!
Here are new data sheets I spotted, you could glance at, when doing SPI work ?
http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-HP-MRAM.pdf
http://www.avalanche-technology.com/wp-content/uploads/2019/10/1Mb-16Mb-Serial-ULP-MRAM.pdf
P2 may even boot from these ?
Ultra low power one is spec'd at 10MHz which seems quite slow, but may be good enough to boot P2, in a low-power system.
These parts have useful other registers too
PS: The FBGA package is clearly in need of becoming Hyperbus capable.