SPI flash programmer code
cgracey
Here is a program you can run that will program a connected SPI flash with the OUTB blinker program.
If you put pull-ups on both spi_cs and spi_ck, it will work with the last release.
I found a bug in the 100ms timeout code in the ROM booter, where relative addressing was being used and I didn't realize it. I fixed this on my end, but I need to make new FPGA images to fix it on your end. If you use both pull-ups now, it will work.
This boot process takes about 15ms. Five milliseconds is spent reading 1KB from the flash, then ten milliseconds is spent validating it before executing it.
' Program SPI flash with signed OUTB blinker program
' - Connect SPI flash (M25P80 okay) with pull-ups on spi_cs and spi_ck
' - Blinks OUTB on boot-up

CON             spi_cs = 61
                spi_ck = 60
                spi_di = 59
                spi_do = 58

DAT             org
'
'
' Init SPI pins
'
                outh    #spi_cs
                dirh    #spi_cs
                dirh    #spi_ck
                dirh    #spi_di
'
'
' Erase 1st $1000 bytes
'
                call    #spi_wrena              'write enable

                mov     cmd,cmd_erase           'sector erase
                call    #spi_cmd32

                call    #spi_wait               'wait for completion
'
'
' Program first $400 bytes
'
                loc     ptra,#\pgmdata          'point to program data

.program        call    #spi_wrena              'write enable

                mov     cmd,cmd_program         'page program
                or      cmd,adr
                call    #spi_cmd32

.byte           rdbyte  cmd,ptra++              'get byte
                mov     x,#8                    'send byte
                shl     cmd,#24
                call    #spi_out

                add     adr,#1                  'page done?
                test    adr,#$FF        wz
        if_nz   jmp     #.byte

                call    #spi_wait               'wait for completion

                testb   adr,#10         wz      'another page?
        if_z    jmp     #.program
'
'
' Read data back to outa for viewing on logic analyzer
'
                mov     dira,#$1FF

.read1k         mov     cmd,cmd_read            'start read
                call    #spi_cmd32

                outh    #8                      'trigger signal
                outl    #8

                decod   y,#10                   'read byte to outa
.read           call    #spi_in
                setbyte outa,cmd,#0
                djnz    y,#.read

                jmp     #.read1k                'loop
'
'
' SPI write enable
'
spi_wrena       mov     cmd,#$06                'write enable
                call    #spi_cmd8
                ret
'
'
' SPI wait while busy
'
spi_wait        mov     cmd,#$05
                call    #spi_cmd8

.wait           call    #spi_in
                test    cmd,#$01        wc
        if_c    jmp     #.wait

                ret
'
'
' SPI command
'
spi_cmd32       mov     x,#32
                jmp     #spi_cmd

spi_cmd8        mov     x,#8
                shl     cmd,#24

spi_cmd         outh    #spi_cs
                outl    #spi_cs
'
'
' SPI long/byte out (x=bits, cmd=msbdata)
'
spi_out         rep     @.r,x
                shl     cmd,#1          wc
                outc    #spi_di
                outh    #spi_ck
                outl    #spi_ck
.r              ret
'
'
' SPI byte in (cmd)
'
spi_in          rep     @.r,#8
                outh    #spi_ck
                outl    #spi_ck
                testin  #spi_do         wc      'due to latencies, 'testin' is from 2 clocks before 'outh'
                rcl     cmd,#1
.r              ret
'
'
' Data
'
cmd_erase       long    $20_00_00_00
cmd_program     long    $02_00_00_00
cmd_read        long    $03_00_00_00
adr             long    0
'
'
' Variables
'
cmd             res     1
x               res     1
y               res     1
'
'
' Program Data
'
' first 20 bytes are blinker program:
'
'               not     dirb
'.lp            not     outb
'               waitx   ##20_000_000/4
'               jmp     #.lp
'
' last 32 bytes are signature (key=0)
'
                orgh

pgmdata         byte    $FB,$F7,$23,$F6,$FD,$FB,$23,$F6,$25,$26,$80,$FF,$28,$80,$66,$FD 'blinker program
                byte    $F0,$FF,$9F,$FD,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
                byte    $99,$AA,$44,$98,$86,$E2,$C8,$71,$C3,$1E,$60,$BF,$A3,$36,$19,$7A 'SHA-256/HMAC signature
                byte    $F5,$3D,$53,$97,$5C,$AF,$BA,$BB,$B7,$7F,$C3,$0A,$B4,$24,$02,$40
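For anyone who wants to see the command traffic without reading the PASM, here is a rough Python model of what the listing puts on the data-out pin: one write enable, one sector erase, then four 256-byte page programs, each followed by a status poll. It is only a sketch of the framing. The function names are made up for illustration, and it does not model the clocking, the busy loop, or the read-back loop.

```python
PAGE_SIZE = 0x100    # the listing ends a page when the low 8 address bits wrap
IMAGE_SIZE = 0x400   # the listing stops after $400 bytes

CMD_WRITE_ENABLE = 0x06
CMD_READ_STATUS = 0x05
CMD_SECTOR_ERASE = 0x20
CMD_PAGE_PROGRAM = 0x02


def cmd32(opcode: int, address: int) -> bytes:
    """Pack an 8-bit opcode and a 24-bit address into 4 bytes, sent MSB first."""
    return ((opcode << 24) | (address & 0xFFFFFF)).to_bytes(4, "big")


def programming_frames(image: bytes):
    """Yield (label, bytes sent to the flash) for each chip-select frame, in order."""
    if len(image) != IMAGE_SIZE:
        raise ValueError("this model expects exactly $400 bytes")
    yield "write enable", bytes([CMD_WRITE_ENABLE])
    yield "sector erase", cmd32(CMD_SECTOR_ERASE, 0)
    yield "poll status", bytes([CMD_READ_STATUS])
    for adr in range(0, IMAGE_SIZE, PAGE_SIZE):
        yield "write enable", bytes([CMD_WRITE_ENABLE])
        yield "page program", cmd32(CMD_PAGE_PROGRAM, adr) + image[adr:adr + PAGE_SIZE]
        yield "poll status", bytes([CMD_READ_STATUS])


if __name__ == "__main__":
    for label, payload in programming_frames(bytes(IMAGE_SIZE)):
        print(f"{label:13s} {len(payload):3d} bytes, starts with {payload[:4].hex()}")
```

Each page program frame carries 4 command/address bytes plus 256 data bytes, which is why the PASM sends the command once and then loops on bytes until the address wraps.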
Comments
nick
All flash boot programs must be signed. That's how we tell whether it's really a program or just a bunch of random data, or whether a flash chip is even connected.
Can you do similar code for a quad-connected flash? That is likely to be the much more common final usage,
i.e. boot and program from a quad-connected part.
IIRC it needs to define and control the extra pins, and needs to issue a Reset command.
How does that time vary with code size? Is it linear, or is there some block step involved in the validation?
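To make the "signed image" idea concrete, here is a small Python sketch of a 1KB image whose last 32 bytes are an HMAC-SHA256 tag over the first 992 bytes. The layout (body, then tag) follows the listing's comments. Everything else is an assumption for illustration: in particular, treating "key=0" as a 32-byte all-zero key, and hashing the body exactly as stored. It is not claimed to reproduce the signature bytes in the listing or the ROM's actual procedure.

```python
import hashlib
import hmac

IMAGE_SIZE = 1024
TAG_SIZE = 32
ASSUMED_KEY = bytes(32)  # assumption: "key=0" modelled as an all-zero 32-byte key


def sign_image(code: bytes, key: bytes = ASSUMED_KEY) -> bytes:
    """Zero-pad code to 992 bytes and append an HMAC-SHA256 tag (hypothetical layout)."""
    body_size = IMAGE_SIZE - TAG_SIZE
    if len(code) > body_size:
        raise ValueError("code does not fit in front of the signature")
    body = code.ljust(body_size, b"\x00")
    return body + hmac.new(key, body, hashlib.sha256).digest()


def looks_like_program(image: bytes, key: bytes = ASSUMED_KEY) -> bool:
    """True only if the trailing 32 bytes match the tag computed over the body."""
    if len(image) != IMAGE_SIZE:
        return False
    body, tag = image[:-TAG_SIZE], image[-TAG_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

The useful property is the one described above: an erased flash (all $FF), a floating data line (all $00), or random contents will fail the check, so a passing check doubles as "a flash chip with a program is really there".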
These are stimulating questions.
About quad SPI, the pins would be connected as such:
61 = SPI_CS
60 = SPI_CK
59 = SPI_DQ0 (DO for 1-bit)
58 = SPI_DQ1 (DI for 1-bit)
57 = SPI_DQ2 (WP for 1-bit)
56 = SPI_DQ3 (HOLD for 1-bit)
That would be adding two connections, pins 57 and 56, to make the extra data bits. Note that this makes the nibble (#6) MSB-LSB reversed. A single 'REV' instruction could be used before/after program/read long to reverse the bits. This allows nice compatibility with the 1-bit connection scheme.
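Here is a quick Python check of the reversal idea, under stated assumptions: pins 59..56 are read as a 4-bit value with pin 59 as the MSB, the flash sends each byte high nibble first with DQ3 as the nibble's MSB, and the captured nibbles are packed with the first arrival in the lowest nibble of the long. The packing order is my assumption, not something stated in this thread. With that packing, one 32-bit bit reversal recovers the four bytes in MSB-first order.

```python
def rev4(nibble: int) -> int:
    """Reverse the bit order of a 4-bit value."""
    return int(f"{nibble & 0xF:04b}"[::-1], 2)


def rev32(value: int) -> int:
    """Reverse the bit order of a 32-bit value (what a REV of a full long does)."""
    return int(f"{value & 0xFFFFFFFF:032b}"[::-1], 2)


def pins_for_nibble(nibble: int) -> int:
    """Value seen on pins 59..56 (pin 59 = MSB) when DQ0..DQ3 sit on pins 59..56."""
    return rev4(nibble)


def capture_long(data: bytes) -> int:
    """Pack 8 pin nibbles into a long, first arrival in the lowest nibble (assumed)."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    packed = 0
    position = 0
    for byte in data:
        for nibble in (byte >> 4, byte & 0xF):  # high nibble goes out first
            packed |= pins_for_nibble(nibble) << (4 * position)
            position += 1
    return packed


if __name__ == "__main__":
    raw = capture_long(bytes([0x12, 0x34, 0x56, 0x78]))
    print(f"captured {raw:08X}, after reversal {rev32(raw):08X}")
```

If the hardware packs nibbles the other way round, the reversal would have to be done per nibble instead, so the packing order is the thing to confirm.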
I'd rather not put nibble mode in the ROM, because it would insist that two more pins always be dedicated for quad SPI flash, when some customers would want just the single-bit hookup. They could turn that mode on in their own booter code (part of PNut, eventually), if they want to load their main program very quickly. They could also use the streamer to move nibbles at Fclk/2, since the clock must be toggled high and low for each nibble. At 160MHz, that's 40MB/s, or the whole 512KB flash-->hub in about 13ms.
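The streamer arithmetic is easy to check in a few lines of Python. This assumes exactly one nibble per two system clocks and ignores command, address and dummy cycles, so treat it as a best case.

```python
def quad_transfer_seconds(n_bytes: int, fclk_hz: float) -> float:
    """Best-case time to move n_bytes at one nibble per two system clocks."""
    nibbles_per_second = fclk_hz / 2
    bytes_per_second = nibbles_per_second / 2
    return n_bytes / bytes_per_second


if __name__ == "__main__":
    for size in (500_000, 512 * 1024):
        ms = quad_transfer_seconds(size, 160e6) * 1000
        print(f"{size} bytes at 160MHz: {ms:.2f} ms")
```

At 160MHz this gives 12.5ms for 500,000 bytes and a little over 13ms for a full 512 x 1024 bytes.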
About the SHA-256/HMAC, the booter currently handles this on a post-load basis, but it could be made per-byte, as the data comes in. At 20MHz, it takes ~10ms/KB. At 160MHz, that will be ~1.25ms/KB. That's 0.64s/512KB. The user may be running some simpler checksum, or even decryption, instead.
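The validation figures scale as follows, assuming (my assumption) that the time is linear in both image size and clock period, starting from the ~10ms/KB at 20MHz quoted above:

```python
def hmac_seconds(kilobytes: float, fclk_hz: float,
                 ref_ms_per_kb: float = 10.0, ref_fclk_hz: float = 20e6) -> float:
    """Estimated validation time, scaled linearly from a reference measurement."""
    return kilobytes * (ref_ms_per_kb / 1000.0) * (ref_fclk_hz / fclk_hz)


if __name__ == "__main__":
    print(f"1KB at 20MHz:    {hmac_seconds(1, 20e6) * 1000:.2f} ms")
    print(f"1KB at 160MHz:   {hmac_seconds(1, 160e6) * 1000:.2f} ms")
    print(f"512KB at 160MHz: {hmac_seconds(512, 160e6):.2f} s")
```

If the hash has a fixed per-block or finalisation cost, small images will take slightly longer per KB than this suggests.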
Once the booter is loaded from SPI flash, the user will still have access to the SHA-256/HMAC code. He can handle things however he wants. The flash booter is $F8 longs. It plops right in amid the ROM booter code and gets JMP'd to. It can pick up right from where the ROM booter left off, if it wants to. All the ROM booter code is still present and callable.
How does the streamer manage the reverse nibble you mentioned?
Or do you mean use REV for commands and address, and skip it for data R/W?
That would make the 1-bit load code different from the quad code, which could get messy?
I'd agree the ROM is OK in 1-bit mode, but it needs to issue a reset from quad (which I think you have done?),
and the question about the WP and HOLD pins is: does the user have to add external pull-ups, or can the ROM apply pull-ups, saving two parts on the BOM?
If so, I think it's the reverse of how the breakout boards are set up, but I'll check.
Here's a pcb layout for the signal order. Looks good.
It'll be fine.
[Pinout table; columns: SPI FLASH, Quad SPI FLASH, SD CARD SPI, SD CARD quad]
Here is the link, with the pinout also repeated
http://forums.parallax.com/discussion/comment/1386295/#Comment_1386295
Note: The awkward use of the SD pinout is to permit possible use of quad mode SD at a later date.
1. This permits the later use of QUAD SPI without the need to swap pins/bits around when using QUAD SPI.
2. This also permits SD to be used on the same pins instead of (ie without) SPI.
3. When accessing Flash normally (not Quad mode), #HOLD=1. This is compatible with the SD Card being deselected. Only if Flash was determined not to be present would the SD Card be tried.
Wait. Sorry. P59 needs to be DI and P58 needs to be DO. If that's not the case, the ROM code won't work.
That looks good, Cluso.
I've been looking up current SPI flash offerings at Digikey and everything new seems to be quad SPI. I'm thinking maybe just dedicating 4 data pins, as you've shown, and always run in quad mode.
So, you are pretty sure that your pinout permits concurrent connection of both quad SPI and SD card? That pinout certainly works well for quad SPI, alone. I don't know the specifics on the SD card, though.
While perusing Digikey, I looked up the new Hyper RAM part. There are many Hyper flash chips, but only one Hyper RAM. They all have the same pinout, it seems. I wish they would have made that Hyper RAM with maybe 16MB of flash, as well. That would have solved everything (except SD).
I kind of agree. The soft reset process for the quad-supporting chips is to send commands $66 then $99. The trouble is, if you're in single-bit SPI mode, you send them one bit at a time, while if you're in quad SPI mode, you send them 4 bits at a time. Well, you don't know if the part is in single or quad mode in a warm boot.
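To show the two framings side by side, here is a Python sketch that lists the per-clock line levels for $66 and $99 in each format. Sending the quad-format pair first and the single-bit pair second is my assumption for illustration, not something any datasheet quoted here prescribes.

```python
RESET_ENABLE = 0x66
RESET = 0x99


def single_bit_frame(cmd: int) -> list:
    """DQ0 level for each of the 8 clocks, MSB first."""
    return [(cmd >> bit) & 1 for bit in range(7, -1, -1)]


def quad_frame(cmd: int) -> list:
    """(DQ3, DQ2, DQ1, DQ0) levels for each of the 2 clocks, high nibble first."""
    return [tuple((nibble >> line) & 1 for line in (3, 2, 1, 0))
            for nibble in (cmd >> 4, cmd & 0xF)]


def reset_in_both_formats() -> list:
    """One chip-select frame per entry: (format, command, per-clock levels)."""
    frames = []
    for fmt, encode in (("quad", quad_frame), ("single", single_bit_frame)):
        for cmd in (RESET_ENABLE, RESET):
            frames.append((fmt, cmd, encode(cmd)))
    return frames


if __name__ == "__main__":
    for fmt, cmd, levels in reset_in_both_formats():
        print(f"{fmt:6s} ${cmd:02X}: {len(levels)} clocks {levels}")
```

The same command byte takes 8 clocks in one format and 2 in the other, which is why a booter that guesses the wrong mode sends something the part will not recognise.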
Micron offers a 'rescue' procedure to get a part back into single-bit mode, no matter what:
That procedure seems unique among manufacturers. I'm not clear, yet, on how we can be certain we're in single-bit mode.
These different manufacturers use different sector sizes in their parts, too, which affects erase and program procedures. Not fun. I think we'll need to support several chips from PNut.exe. The 'programmer' will get downloaded using the text protocol, and then we'll send data fast to it, so that it programs the 2nd stage booter and the application data. Later, we can implement key-based encryption.
The pin arrangement was not specifically designed for the use of both FLASH and SD. My thoughts were that you would boot FLASH or SD, but not both. If Flash exists, then it will hold the code for whatever other devices are attached (including SD), so the SD pins do not have to be shared.
The pin out I have chosen permits D0-3 to be the correct way around and on a byte boundary. You will note I chose the SD to possibly permit the quad mode to be used providing there is an open source spec able to be used. No point in excluding this possibility for later.
On the P1 where I have shared pins and been short of pins, I have had to use a single gate chip to assist with decoding.
BTW did you ever get around to asking OnSemi about on-chip Flash/OTP/EEPROM?
"If the bus mode is not known, the master device can transmit two RSTIO instructions using both the SDI and SQI formats to ensure a proper reset to SPI mode."
The RSTIO instruction is 0xFF. Maybe there's some magic in being all 1's...
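One way to see why all 1's is special: the same waveform decodes to $FF bytes whichever way the part is listening. The Python sketch below decodes a single-bit transmission on DQ0 both ways, assuming the other three data lines sit high (pull-ups or idle levels, which is an assumption about the board).

```python
def single_view(dq0_levels: list) -> bytes:
    """Bytes a single-bit part assembles from DQ0, one bit per clock, MSB first."""
    value = 0
    for level in dq0_levels:
        value = (value << 1) | level
    return value.to_bytes(len(dq0_levels) // 8, "big")


def quad_view(dq0_levels: list, other_lines_level: int = 1) -> bytes:
    """Bytes a quad-mode part assembles if DQ1..DQ3 sit at other_lines_level."""
    upper = (0b111 * other_lines_level) << 1
    nibbles = [upper | level for level in dq0_levels]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def bits_msb_first(cmd: int) -> list:
    """DQ0 levels for one command byte sent in single-bit format."""
    return [(cmd >> bit) & 1 for bit in range(7, -1, -1)]


if __name__ == "__main__":
    for cmd in (0xFF, 0x66):
        levels = bits_msb_first(cmd)
        print(f"sent ${cmd:02X}: single sees {single_view(levels).hex()}, "
              f"quad sees {quad_view(levels).hex()}")
```

With $FF the quad-mode view is four $FF bytes, so the first one it sees is still the reset opcode. With $66 the quad-mode view is garbage, which is the warm-boot problem.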
Maybe I remembered wrong, maybe Kye's approach was to clock in 1's, not 0's. Have to check, been a while...
Kye told me that he sent an $FF command three times. I don't see how that would do anything.
I was about to say, surely that was a series of 1's?
The Micron scheme Chip mentions above clocks in 1's, but the CS# work also seems to imply a frame check may exist on some devices?
Checking Winbond, I see they mention 0xff (8 clocks) or 0xffff (16 clocks) for quad and dual exit respectively.
Of course, they are unclear on whether you need both, to exit either, or not...
They do explicitly mention 8 and 16 clocks.
See above: Winbond could be saying both are needed, in which case I'd send $ff 3 times, but actually framed as 8 + 16, which seems closer to what Micron are getting at?
Fremont, Winbond and Adesto all say 0xff is reset, and Winbond and Adesto give two framings, 0xff and 0xffff, for handling quad and dual exit.
Those are all for smaller flash sizes, in the 16Mb region: sub-20c parts, likely to be the most commonly used for boot.
Addit:
For larger (eg) 256Mb sizes, I see Winbond and Micron have this to say (2 x 8 bit commands adjacent )
“Enable Reset (66h)” and “Reset (99h)” instructions can be issued in SPI. To avoid accidental reset, both
instructions must be issued in sequence. Any other commands other than “Reset (99h)” after the “Enable
Reset (66h)” command will disable the “Reset Enable” state. A new sequence of “Enable Reset (66h)” and
“Reset (99h)” is needed to reset the device. Once the Reset command is accepted by the device, the device
will take approximately tRST=30us to reset. During this period, no command will be accepted.
Data corruption may happen if there is an on-going or suspended internal Erase or Program operation when
Reset command sequence is accepted by the device. It is recommended to check the BUSY bit and the
SUS bit in Status Register before issuing the Reset command sequence.
However, Micron also have a sticky XIP mode, that can skip commands (just sends nibble address), which is closer to HyperFLASH, and they also add this gem:
It is recommended that the device exit XIP mode before executing these two commands to initiate a reset.
Hmm, not easy for a watchdog reset to cover that?
Digging more, I find Xb is output on DQ0, 7th clock, as Confirm XIP:
Xb is the XIP confirmation bit and should be set as follows: 0 to keep XIP state; 1 to exit XIP mode and return to standard read mode.
- ie, 0xff here on DQ0 would exit XIP sticky.
That suggests 0xff, 0xffff, 0x66, 0x99 (then 30us) might work on everything?
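Written out as data, the candidate sequence looks like this in Python. The `send_frame` and `sleep` callables are hypothetical stand-ins for whatever drives CS/CK/DQ0 and waits. Whether this sequence really recovers every part is exactly the open question, so this is only a way to pin down the framing being proposed.

```python
CANDIDATE_SEQUENCE = [
    ("exit quad / XIP", b"\xFF"),      # 8 clocks of 1's in one CS frame
    ("exit dual", b"\xFF\xFF"),        # 16 clocks of 1's in one CS frame
    ("enable reset", b"\x66"),
    ("reset", b"\x99"),
]
T_RST_SECONDS = 30e-6  # reset time from the datasheet text quoted above


def run_recovery(send_frame, sleep) -> None:
    """Send each payload in its own chip-select frame, then wait out tRST."""
    for _label, payload in CANDIDATE_SEQUENCE:
        send_frame(payload)
    sleep(T_RST_SECONDS)


def total_clocks() -> int:
    """Clock count if every frame is sent single-bit on DQ0."""
    return sum(8 * len(payload) for _label, payload in CANDIDATE_SEQUENCE)


if __name__ == "__main__":
    log = []
    run_recovery(lambda p: log.append(p.hex()), lambda s: log.append(f"wait {s * 1e6:.0f}us"))
    print(log, total_clocks(), "clocks")
```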
Best avoided, if possible as that sounds like its own can of worms...
However the 0xff is still useful to exit XIP, before issuing the reset.
Question is, do smaller parts tolerate those 0x66, 0x99 commands, which seem to be unallocated?
Easy enough to test I guess.
One of the shared commands is reset, $ff
As for being unable to reset certain modes then perhaps you could "allow" (not mandatory) for extra I/O but I find that kludgy. At some point an SD only boot would fail too if I didn't have it inserted
FWIW to initialise an SD from an unknown state, they say (and Kye does) send 74 clocks with MOSI (DI) = 1's. Reading his code, it looks like that is 74 x $FF (i.e. 74*8 clocks).
I have been looking at the Spansion/Cypress S25FL116KOXMFI043 family from Mouser (0.39/100). What is interesting with these parts is that they can replace the ~$15 Altera Flash chips used on Cyclone IV & V (others too???) for configuration.
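For reference, the SD requirement is at least 74 clocks with DI high before the first command, so the minimum whole number of $FF bytes is small. A Python sketch of the arithmetic (the function name is just for illustration):

```python
import math

MIN_INIT_CLOCKS = 74  # minimum clocks with DI high before the first SD command


def init_preamble(min_clocks: int = MIN_INIT_CLOCKS) -> bytes:
    """Smallest run of $FF bytes that supplies at least min_clocks clocks."""
    return b"\xFF" * math.ceil(min_clocks / 8)


if __name__ == "__main__":
    preamble = init_preamble()
    print(f"{len(preamble)} bytes = {len(preamble) * 8} clocks")
```

Ten bytes (80 clocks) meets the minimum. Sending 74 bytes, as described above, is 592 clocks, which is well over it.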
Chip, can you investigate putting some Flash/OTP/Eeprom on chip with OnSemi? It could solve a lot of problems, depending on cost of course.