P2 SD Boot Code (v32+)
Cluso99
Need a new thread
P2 has the following preferred pins...
P63 = RXD serial
P62 = TXD serial
P61 = SPI_CS
P60 = SPI_CLK
P59 = SPI_DQ = SPI_DO
P58 = (optional SPI_DI)
Pullup definitions...
P61 = Pullup = load from attached SPI FLASH or SPI SD
P60 = Pullup = if the program loaded successfully, run SPI FLASH program
Now, because there are now more (and possibly will be further) SPI commands for the FLASH that may interfere with the SD setup, I propose we define an alternative...
P60 = Pulldown = SPI SD attached; if loaded successfully, run SPI SD program
By testing the pullups and pulldowns before selecting the FLASH or SD, the boot code can determine whether FLASH or SD is attached. This is an either/or case. Having both FLASH and SD attached is not being considered at this point. Both may work, but we are not necessarily catering for it now because there is insufficient time to ensure it works correctly.
So we have...
load = |        | FLASH  | FLASH  | SD       |
boot = | Serial | FLASH  | Serial | SD       |
P61  = | none   | pullup | pullup | pullup   |
P60  = | xxxx   | pullup | none   | pulldown |
If P61=pullup and P60=pullup/none, we could try FLASH and if not found, then try SD.
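The table above can be sketched as a small decision function. This is only an illustration of the proposal, not the actual boot ROM: the names `Pull` and `boot_plan` are made up here, and treating a pulldown on P61 the same as "none" is an assumption (the table only lists "none" and "pullup" for P61). The "try FLASH, then SD if not found" fallback is not modelled.

```python
from enum import Enum


class Pull(Enum):
    """Resistor state sensed on a boot pin (hypothetical names)."""
    NONE = "none"
    UP = "pullup"
    DOWN = "pulldown"


def boot_plan(p61, p60):
    """Return (load_source, boot_target) following the proposed table.

    load_source is None when nothing is loaded over SPI.
    """
    if p61 is not Pull.UP:
        # Assumption: anything other than a pullup on P61 means serial only.
        return (None, "Serial")
    if p60 is Pull.UP:
        return ("FLASH", "FLASH")    # load FLASH, run it
    if p60 is Pull.NONE:
        return ("FLASH", "Serial")   # load FLASH, then go to serial
    return ("SD", "SD")              # pulldown on P60: load SD, run it


if __name__ == "__main__":
    for p61 in (Pull.NONE, Pull.UP):
        for p60 in Pull:
            print(p61.value, p60.value, "->", boot_plan(p61, p60))
```

With P61 not pulled up, P60 is a don't-care (the "xxxx" column).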
IMHO there is no need for a "load SD and then go to serial" option, because we can remove the SD card if the code is faulty.
I am presuming here that serial will be used to start Tachyon. Chip/Peter will need to confirm.
Chip,
Why couldn't COG #0 be started to look for serial while COG #1 is started to try and load FLASH/EEPROM ?
Comments
Is that pinout streamer compatible with QuadSPI and OctaSPI ?
ISTR a pinout map went past that was QuadSPI and OctaSPI streamer aligned ?
Anything is possible, but that's starting to move away from KISS, and assumes 2 working COGS, and could get 'interesting' on a system with Flash and Serial, as flash will be busy or done just as serial gets started...
That request was lost a long time ago in the interest of minimal pins. Any use of Quad or Octal mode will require additional separate 4 or 8 DQ pins.
P59 = SPI_MISO
P58 = SPI_MOSI
It's only minimal pins, at minimal performance, and using a subset of the part's abilities. Can't really see the marketing sense of that ?
The clock pin can still be shared. The CS is not shareable unless daisy chaining. That only leaves pins 58 and 59. I suspect those can be reused in a 4-bit mode, maybe with bit reordering in the Prop2 pin selectors.
These parts are very common, and XIP is now de facto on larger SPI-connected MCUs.
The Streamer I think is the most constraining issue here, as I think the Streamer cannot map nibble-sets to ANY pin offset.
That must then impose a certain order on the boot pins ?
We only need three pins to achieve a 50ms 512KB boot load using an 8-pin SPI flash. We can add one more pin and make SD work, as well, on a mutually-exclusive basis. That's where we are headed for boot-up.
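As a rough check of what a 50ms, 512KB load implies for the SPI clock, here is a small calculation. It assumes one data bit per clock on a single data pin and ignores command/address overhead; the function name is made up for illustration.

```python
def required_bit_rate(image_bytes, load_seconds, data_pins=1):
    """Clock rate (Hz) needed to move image_bytes in load_seconds.

    Assumes one bit per data pin per clock and no protocol overhead.
    """
    bits = image_bytes * 8
    return bits / load_seconds / data_pins


if __name__ == "__main__":
    rate = required_bit_rate(512 * 1024, 0.050)
    print(f"{rate / 1e6:.1f} MHz SPI clock")  # about 83.9
```

So the figure quoted corresponds to a 1-bit SPI clock in the region of 84 MHz under these assumptions.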
Where care is needed is in aligning those 1-bit SPI pins that are used with the streamer, so a user PCB design can run their code at full P2 speed.
You do not need any more boot pins to allow a user to run QuadSPI/OctSPI, as those parts have reset commands to return to single SPI, and some parts also have a RESET pin.
What needs to be avoided, is having to tell a user they need to connect TWO flash parts, if they want a fully operational P2.
Besides the PCB and BOM impact, just managing two lots of code, and getting that code into the two memories will be a user and support nightmare.
For selfish reasons I'd like to try SD with Quad mode, but I can add extra pins to try it out. Not going to happen before silicon tho.
Anyway, Chip has decided the pinout. Done.
If 1-SPI really was 'good enough', that's all that would be available.
Pinout is simple mapping. Someone needs to do a connection mapping for Boot-Flash and QuadSPI or OctSPI, and count the pins required & show that to someone like Ken, who has to sell this.
Then imagine Parallax taking that to any potential large customer, who says "Err, what..? You expect me to have to connect TWO memories to this thing, to use it properly ?!"
Jmg, I register your concern about this, but something like XIP is pointless on this chip. QSPI will never compete with internal memory bandwidth and who's going to want to run a (Q)SPI flash chip on a continuous basis, anyway? I see it necessary to hold the user program and some data that gets read periodically, but what's the point of a fast 4-data-pin ROM? QSPI does nothing to speed up writing, only reading. If I thought this was important, I'd accommodate it in the full sense. As it is, you CAN have QSPI, but with reversed data pins.
Can you also connect an OctSPI, and boot from that in 1-SPI, but use the same memory in x8 later ?
Some obvious use cases :
Font storage for LCD, where precious CODE memory is reserved in RAM, and Font info is stored in Flash. For that to work, users need to be able to quickly read the font info.
Likewise for graphics and background images for MMI - those update from flash.
Stage 2 boot and code overlays : Initial boot is always 1-SPI, but some users will want stage-2 fast boot, or faster load of code overlays during their system operation.
https://forums.parallax.com/discussion/comment/1424951/#Comment_1424951
Can all that not be accommodated with reversed data pins? I hate to interrupt the pin map for a use case that is going to be rare.
The SCH link Ariba gives above, shows use of bridged pins, where 2 prop pins connect together - but the data pins do look to be in the right order, not reversed ?
That's done in order to align with the streamer, as the current 1-SPI pin map does not streamer align.
A quad-connect then consumes 8 pins, when it might have been 6, and an Oct connect consumes 12 when it might have been 10.
It does save having to connect two memories, (which was a real 'ugh'), but the downside is it does consume 2 more Prop pins, and orphans 2 others...
Maybe that's a tolerable compromise ? Still seems quite high on the 'kludge meter' ?
If someone wanted to connect an Oct/Quad part, with no lost or orphaned pins, they could use a small sub-30c MCU as the Stage 1 boot, and connect like this :
During and coming out of P2 reset, the SPI memory is held at CS=H, and the boot MCU loads roughly 8k to 16k of code, which then loads from SPI. (The MCU then tristates.)
It's 2 chips, but small MCUs are getting ever cheaper and smaller, and this connection is for someone who really needs all those prop pins, in neat streamer groups.
Such an MCU can manage the SPI memory HW RESET and even the P2 reset & watchdog function too.
Addit: added Dual/Single connect, which looks to be able to co-exist with P63.62 UART
If someone expands to Quad, the rule is that you cannot use the P63.62 UART at the same time, but could BBU on that UART. Note 1-SPI Boot could be possible from that pin map (minor pin change).
If someone expands to Octal, the rule is that you cannot use the P63.62 UART at the same time, but could BBU on that UART, and there is no 1-SPI Boot (needs MCU stage 1 boot).
@jmg - We appreciate the information and input, but none of us can see where we would use it, although I can see plenty of uses for P2 and how I would use it.
Do you have an application in mind?
How will not having wide SPI affect the application's outcome?
Some use cases are 4 posts up ^, and some pin-outs to suit 8/4/2/1 SPI use are 2 posts up ^. (included caveats/rules)
Speed always matters, and limiting users to only 1-SPI guarantees things are slower than they could have been.
In some cases wider SPI will allow lower clock speeds and thus lower power.
I quite like that as a minimal pin use scenario.
But what I am asking is: do you have a real application in mind? Unless it is a real application, we tend to over-spec our whiz-bang project just in case, and as experience has shown us, we end up not using those fancy features and get by just fine. You can still hook up fast & wide Flash as one extra part for that esoteric application.
If I've read jmg's writings correctly, only two pins, P62 for Tx and P63 for Rx, are required for minimal booting. Admittedly, that involves programming another "boot" MCU that has onboard Flash.
Yes, boot does that now, it starts checking UART, but can skip if a pin is pulled low.
I think it is still useful to be able to SPI-BOOT, the question is really over the pin map to use.
The pin map I suggested above focuses on that minimal pin use scenario for Quad SPI users.
That has some caveats in simultaneous use of UART and SPI, but for those using UART to bring up the board, that's likely fine.
Close, however that MCU is only mandatory for the 10 pin OctaSPI connection, when the Boot-via-1-SPI is not possible.
In the other connections, Boot-via-1-SPI is still possible. 1-2 SPI connect leaves the UART completely free.
4-SPI means you cannot use UART and Quad at the same time. If the user needs a system UART, choose 2 other pins.
Yes, a partner MCU allows you to do anything, but you do not want to be bumped to a partner MCU 'band aid' too early.
ie P2 boot should be able to allow your minimal pin use scenario, for as many SPI modes as practical.
Looks to me like it can support UART (2 pins) plus 1-SPI or 2-SPI (4 more pins), and even 4-SPI (same pin cost) is possible with no MCU, with some simultaneous use exclusions.
The MCU use is only really mandated for OctaSPI.
The current ROM mapping is poorly compatible with a 4-SPI connection, and is even incompatible with a 2-SPI connection.
I've not seen how SD mixes into this yet either...
BTW: The existing basic 3-pin booter (1-SPI) will be fine for extending to 2-SPI. The pin order isn't going to matter.
OK, let's take something quite mainstream, like an 800x480 LCD display HMI (neither over-spec'd nor esoteric).
Memory map that inside the P2, and we have 2^19 - 800*480 - 2^14 = 123904 bytes of free code space.
Not much room for your fonts in there, barely enough room for code !
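The memory budget above can be reproduced in a few lines. This assumes one byte per pixel for the 800x480 frame buffer, which is what the formula implies; the 2^14 bytes are subtracted exactly as in the post, whose purpose for them is not stated. The constant names are made up for illustration.

```python
HUB_RAM_BYTES = 2**19          # 512 KB of P2 memory
FRAME_BYTES = 800 * 480        # assumes 1 byte per pixel
RESERVED_BYTES = 2**14         # subtracted in the post; purpose not stated there

free_bytes = HUB_RAM_BYTES - FRAME_BYTES - RESERVED_BYTES

if __name__ == "__main__":
    print(free_bytes)                      # 123904
    print(f"{free_bytes / 1024:.0f} KB")   # 121 KB
```

At two bytes per pixel the frame buffer alone would exceed 512 KB, so the one-byte assumption is already the generous case.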
Now imagine you want to support a 16 bit font index (multi language, global product) and 64x32 fonts, because you want this to look, well, not amateurish & to sell globally.
Such a font can need 16MBytes, and each char is 256 bytes. (+ more for smaller fonts...)
You claim 2048 clocks is fine to load a char, but the designer might beg to differ, and prefer 512 or 256 clocks, or 128 if brave....
My point is P2 should not fight that choice.
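The font numbers above work out as follows. This sketch assumes 1 bit per pixel glyphs and one bit per data pin per clock, with no command/address overhead. Reading the 128-clock case as an 8-bit bus with two transfers per clock (DDR) is an assumption, not something the post states; the function names are made up for illustration.

```python
def glyph_bytes(width_px, height_px, bits_per_pixel=1):
    """Storage for one glyph, in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def clocks_per_glyph(nbytes, data_pins, transfers_per_clock=1):
    """Clocks to read nbytes over data_pins, ignoring protocol overhead."""
    return nbytes * 8 // (data_pins * transfers_per_clock)


GLYPH = glyph_bytes(64, 32)        # 256 bytes per character
FONT_BYTES = 2**16 * GLYPH         # 16-bit index -> 16 MiB

if __name__ == "__main__":
    print(GLYPH, FONT_BYTES // 2**20)
    for pins in (1, 2, 4, 8):
        print(pins, "pin(s):", clocks_per_glyph(GLYPH, pins), "clocks")
    # Assumption: 8 pins with double data rate
    print("8 pins DDR:", clocks_per_glyph(GLYPH, 8, transfers_per_clock=2))
```

So 1-SPI gives the 2048 clocks per character, 4-SPI gives 512, and 8-SPI gives 256.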
Interesting idea, but I fear those +/- 3 muxes do not apply to the streamer mapping ?
Streamer mapping is rather block-allocated.