P2 SD BOOT ROM v2 (for P2 respin Feb 2019) +exFAT trial
Cluso99
I have just been doing some experiments with SD boot times.
These times are a little higher than the real code will be, since I still have a small section of debugging code enabled.
These times are with a Mizo 8GB SDHC U1 (Class 10) card formatted as FAT32, loading the small boot file "_BOOT_P2.BIX", which is located in the first ~60 directory entries. I have estimated RCFAST at 23MHz.
Cards do vary considerably.
I have included warm boot results. A cold boot means the SD card has not been previously initialised since power-up. The SD ROM Booter will only ever be run from a cold boot. For these tests the code was downloaded first and then run, so I cannot tell whether the cold boot will be any slower from the physical ROM, where the SD card will have had power applied for a shorter time before initialisation.
             COG       HUB       %faster
Cold boot    78.3ms    85.3ms    8.2%
Warm boot    23.4ms    30.7ms    23.7%
From this, you will see there isn't much benefit in having the SD Boot ROM run its code from COG versus HUB. I haven't included the copy from hub to cog, but it will be minimal.
Note: The new SD Boot ROM is not expected to support calling these routines from user programs. This also means the Monitor will no longer support loading or running files from SD.
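For anyone wanting to check the %faster column, here is a minimal sketch. It assumes the percentage is the time saved relative to the HUB time (my reading of the table, not something stated in the post); the function name is just for illustration.

```python
def percent_faster(cog_ms: float, hub_ms: float) -> float:
    """Time saved by running from COG, as a percentage of the HUB time.

    Assumption: this is how the %faster column was derived.
    """
    return (hub_ms - cog_ms) / hub_ms * 100.0


# Figures from the table in the first post.
cold = percent_faster(78.3, 85.3)
warm = percent_faster(23.4, 30.7)

print(f"cold boot: {cold:.2f}% faster, {85.3 - 78.3:.1f}ms saved")
print(f"warm boot: {warm:.2f}% faster, {30.7 - 23.4:.1f}ms saved")
```

The absolute saving is about 7ms in both cases, which is why the percentage looks large for the warm boot but small for the cold boot that the ROM will actually perform.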
Comments
Note the much slower cold boot times!!!
It is perfectly OK like in the current ROM
Mike
I disagree (with moving to COG/LUT to speed up the code), which is why I am posting the timing: moving the routine(s) to COG/LUT does not deliver the benefits expected.
For reference, here is what I sent to Chip today for confirmation. I am sure he won't mind me posting this.
You can discuss this here if you wish.
As far as I can tell, most/all are in agreement to remove support for older SD cards (prior to SDHC, and those SD & SDHC cards using the old byte mode sector addressing). SDXC cards will be supported. Only cards formatted with FAT32 will support file load/run plus MBR & VOL load/run, while those that are not FAT32 formatted will only support MBR load/run (e.g. exFAT, which is patented).
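To make the support matrix above concrete, here is a rough sketch in Python. It is not the ROM's detection logic (that is not shown in this thread); it only assumes the standard MBR layout, where the first partition entry starts at offset 0x1BE, its type byte sits 4 bytes in, FAT32 uses types 0x0B/0x0C, and the sector ends with the 0x55 0xAA signature. The function name and the "MBR"/"VOL"/"FILE" labels are hypothetical.

```python
MBR_SIGNATURE = b"\x55\xAA"      # standard signature at bytes 510..511
PARTITION_TABLE_OFFSET = 0x1BE   # first partition entry in a standard MBR
FAT32_TYPES = {0x0B, 0x0C}       # FAT32 (CHS) and FAT32 (LBA) partition types


def boot_options(sector0: bytes) -> set:
    """Return which boot methods would be available for this card.

    Hypothetical helper: only mirrors the policy described in the post.
    """
    if len(sector0) != 512 or sector0[510:512] != MBR_SIGNATURE:
        return set()                       # no valid MBR, nothing to try
    partition_type = sector0[PARTITION_TABLE_OFFSET + 4]
    if partition_type in FAT32_TYPES:
        return {"MBR", "VOL", "FILE"}      # FAT32: everything is supported
    return {"MBR"}                         # e.g. exFAT (type 0x07): MBR only


def make_sector(partition_type: int) -> bytes:
    """Build a blank test sector with a signature and one partition type."""
    sector = bytearray(512)
    sector[PARTITION_TABLE_OFFSET + 4] = partition_type
    sector[510:512] = MBR_SIGNATURE
    return bytes(sector)


print(sorted(boot_options(make_sector(0x0C))))
print(sorted(boot_options(make_sector(0x07))))
```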
Msrobots, what were you not liking? Was it not being able to run the booter from the monitor? Or not being able to call the boot routine?
Moving the SD Booter to COG/LUT doesn't give much gain, and removes the ability to call the SD Booter from a user program (not very likely, as it would be preferable to load full-blown support anyway). But it also means that the Monitor cannot load and/or run files from a FAT32 formatted card either, which it currently can.
Your thoughts???
and... thanks for your efforts again Cluso
I think the question is whether it should run in the cog or in the hub, right? I think running in the hub is much better, seeing that there is only a little speed penalty. There is a chance that you will return to my booter, correct? And you only use a few cog registers for your code. That leaves my code intact in the cog, right, so that it can be returned to and my booter can attempt a serial connection?
For instance, on the P2D2 $0100_0EFB is the value I can HUBSET with to change the clock to exactly what I need, which in this case is 180MHz.
At RCFAST speeds this is the best I can get with multiblock reads (it's an old Sandisk card, not the best).
But at 180MHz:
So this is already 8 times faster and after card initialization a 64kB file will load in 35ms rather than 280ms. All it needs is for the SD loader to switch from RCFAST to the specified clock mode. It doesn't even have to think, all it has to do is read the value and HUBSET. If the value is zero it will stay in RCFAST (hubset #0 == RCFAST).
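As a rough sanity check on the 8x figure, here is a small sketch. It assumes the file load is limited by the P2 clock rather than by the card, and it borrows the 23MHz RCFAST estimate from the first post; both are assumptions, not measurements, and the function name is only illustrative.

```python
def scaled_load_ms(load_ms_at_ref: float, ref_hz: float, target_hz: float) -> float:
    """Scale a load time linearly with clock frequency.

    Assumption: the transfer is clock-bound, so time is proportional
    to 1 / frequency.
    """
    return load_ms_at_ref * ref_hz / target_hz


RCFAST_HZ = 23_000_000    # estimate quoted in the first post
FAST_HZ = 180_000_000     # clock selected by HUBSET $0100_0EFB on the P2D2

estimate = scaled_load_ms(280.0, RCFAST_HZ, FAST_HZ)
print(f"speed-up: {FAST_HZ / RCFAST_HZ:.1f}x, 64kB load: {estimate:.1f}ms")
```

That lands close to the 35ms quoted above, so the measured speed-up is about what a straight clock ratio would predict.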
BTW, I haven't updated my V2 images yet but I am adding the final touches to the built-in SD formatter that can format an SD perfectly and compliantly, excepting of course that you can format 64GB cards and up with FAT32. Also there are some extra options for BACKUP:
BACKUP <filename> --- backup to the specified file if found (creates a file if needed)
BACKUP BIX --- backup to _BOOT_P2.BIX
BACKUP MBR --- backup to sector 1 and set MBR signature (FAT32 not required)
BACKUP FLASH --- backup to serial Flash (^R restores but does not boot yet, needs 2nd stage loader or new ROM)
Yes, it currently does this and runs in hub. So, that is precisely what it does now
I currently jump to "try_serial" if the SD boot fails, and your code is intact as you freed up the cog space required for the variables $1C0-$1EF.
The Monitor overwrites your boot code at $FC000-$FC0FF for the serial read buffer (which you said I could use).
So it's only a minor tweak, as I found a few instructions to shave, and removing support for the older cards (which haven't been sold for years anyway) shaves a few more. It's amazing what you can shave when you look at code again.
I think Peter had talked about this idea in the past. Is it far too dangerous to add at this point? Maybe it doesn't add so much benefit if there are no default input/output device pins pre-allocated, but the general thought was it could allow a standalone system to boot without even needing a flash/sd device fitted or even a serial console present to trigger entering it from reset.
Perhaps the lack of a serial console to enter TAQOZ is a step too far... as what would the P2 then do without some default IO channel to interact with and control it?
I just ran the SD Boot test from hub up to reading the MBR on a SanDisk Ultra 64GB SDHA U1 C10 which comes formatted with exFAT.
And this is before you actually load the program which we could assume for the moment would typically be in the 64k to 128k range. So we need figures not just for being ready to load the boot code, but fully loaded ready to run. TAQOZ may be optional for some, but all of us will depend upon this bootloader, so it needs to be flexible.
The load time is mostly dependent upon the SD card used.
Well, call this a user request if you like, but for the sake of reading one location and performing a hubset, that is all you need to add. That is being flexible.
I don't understand why you wouldn't run the P2 faster because once it eventually boots, it will be switched to a much higher frequency in almost all cases.
The other user request is simply this, if serial boot fails, if Flash boot fails, if SD boot fails, then pass control to TAQOZ and it can do its own boot tests etc. If there is nothing to boot and no serial terminal etc then it can do a shutdown.
Or dependent upon the "format" used?
If you reformat that SD card from exFAT to FAT32, does that change the boot test timings significantly?
And still asking if there will be a way to start TAQOZ with a mailbox instead of pins 63/62, or at least 2 other pins with the ability to set the RX mode to listen to another pin?
Enjoy!
Mike
But I say again, the maximum saving over the currently supported methods is ~6ms out of a total of 85ms/158ms/236ms/etc, depending on which SD card you use. A two stage boot is simple and foolproof and is currently a supported option. I am not being difficult, I am just unconvinced.
Chip, what do you think?
I pass control back to try_serial as requested by Chip.
You will need to get Chip to change his serial code to pass to TAQOZ if it times out.
I am happy with that provided there is no pulldown on P59 which is a lockout of serial as you requested. Though, equally, it is quite simple to enter the 5 character serial sequence to go to TAQOZ.
By little, maybe 1-3ms which is nothing compared to card choice.
Either you patch the code or TAQOZ patches its code. We are way short on ROM space, so my guess is that it's not going to happen in the TAQOZ ROM.
Plus, TAQOZ will send signals and use IOs that potentially could be used for other functions. That would trigger devices connected to those pins. You don't want that either. Anyway, it is just a matter of pressing three keys to call TAQOZ. Over-simplification ruins things, like M$ does.
Kind regards, Samuel Lourenço
@Cluso99 - what size boot file are you talking about when you quote boot times? It seems as if you are only quoting SD initialization because a 64kB "load" (after init) still takes 280ms at RCFAST even with TAQOZ doing a fast multiblock read. This is where having a simple clock long in the MBR (that can also easily be validated next to a copy of itself) is all that is required to make the difference between slow or fast boot. Initially TAQOZ was going to handle the SD boot, and this would have been one of the options.
@Chip - since this is an option it doesn't affect normal RCFAST boot, but if the user sets the boot clock long in the MBR, then that is up to the user to validate, just as the actual boot image is up to the user to validate. If the user has a bad boot image there is nothing the booter can do anyway whereas the clock config word can easily be validated as mentioned.
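A minimal sketch of that validation, in Python for readability rather than P2 assembly. The offset of the clock long in the MBR is hypothetical (no location is given in this thread); the only things taken from the discussion are that the long sits next to a copy of itself and that a zero value means staying in RCFAST.

```python
import struct

CLOCK_LONG_OFFSET = 0x170   # hypothetical offset, not defined in this thread


def boot_clock_mode(sector0: bytes) -> int:
    """Return the HUBSET value to use, or 0 to stay in RCFAST.

    The clock long is accepted only if the long that follows it is an
    identical copy; anything else falls back to 0.
    """
    value, copy = struct.unpack_from("<II", sector0, CLOCK_LONG_OFFSET)
    return value if value == copy else 0


def make_sector(value: int, copy: int) -> bytes:
    """Build a blank 512-byte test sector holding a clock long and its copy."""
    sector = bytearray(512)
    struct.pack_into("<II", sector, CLOCK_LONG_OFFSET, value, copy)
    return bytes(sector)


good = boot_clock_mode(make_sector(0x0100_0EFB, 0x0100_0EFB))
bad = boot_clock_mode(make_sector(0x0100_0EFB, 0x0100_0EFA))
print(f"valid pair  -> hubset ${good:08X}")
print(f"mismatched  -> hubset ${bad:08X} (stay in RCFAST)")
```

A blank or unprogrammed MBR also comes out as 0, so existing cards would boot at RCFAST exactly as they do now.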
Now the other thing is, is there any reason why the final boot stage can't fall through to TAQOZ? At the very least when you power up your new P2 it can come to life and the user can even format or check the SD card and hardware etc.
Yeah, finally.
It is perfectly OK if the mailbox is in HUB not ROM.
All that is needed is to jmp COG0 (or even any COG?) to some ROM address, so it can load TAQOZ into the lower 64K with TAQOZ using a mailbox instead of Smartpins. This would help a lot in using TAQOZ as a smart IO cog from other languages.
Where you place the mail box does not matter as long as it is a fixed address after loading TAQOZ from ROM. Then any program can find it.
Thanks for considering this.
Mike
My SD times are for loading a file less than 2KB. So as you can see, those significant times are all to do with initialising the SD card. Nothing can be done to shorten these times. There is no miraculous time benefit in running at 180MHz or even 3GHz for that matter. It is what the SD card takes.
That <2KB file can contain a complete SD Boot code loader running at whatever crystal/clock you desire, which then loads and runs another file (say _BOOT_P2.BIZ) at full speed. This is known as a 2 stage loader. Very little time is lost using this method compared with using a value in the MBR to set a higher clock frequency: maybe 6ms, against anything from 80-250+ms for initialisation plus the actual file load time, which Peter suggested was 35ms for 64KB (I haven't timed this part).
And this works now with the current ROM.
BTW I hadn’t seen a valid reason for having a 2 stage loader for the Flash. But this is a reason to have it.
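To put the figures from both sides of this discussion in one place, here is a rough time budget. It mixes numbers measured on different cards by different people (85ms cold boot from the first post, Peter's 280ms and 35ms for a 64KB load, and the ~6ms two stage penalty), so treat it as an illustration only; the names are hypothetical.

```python
COLD_BOOT_MS = 85.0          # card init plus small boot file (first post, HUB)
TWO_STAGE_PENALTY_MS = 6.0   # estimated cost of two stage vs a clock long in the MBR
LOAD_RCFAST_MS = 280.0       # 64KB load at RCFAST (Peter's figure)
LOAD_FAST_MS = 35.0          # 64KB load at 180MHz (Peter's figure)


def total_ms(load_ms: float, saving_ms: float = 0.0) -> float:
    """Cold boot time, less any saving, plus the 64KB load time."""
    return COLD_BOOT_MS - saving_ms + load_ms


scenarios = {
    "everything at RCFAST": total_ms(LOAD_RCFAST_MS),
    "two stage loader": total_ms(LOAD_FAST_MS),
    "clock long in MBR": total_ms(LOAD_FAST_MS, TWO_STAGE_PENALTY_MS),
}

for name, ms in scenarios.items():
    print(f"{name:22s} {ms:6.1f}ms")
```

On these numbers the big win is loading the 64KB at the higher clock; which of the two ways you get there differs by only the ~6ms.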
I also don't want to fall back into TAQOZ if it means twiddling a bunch of pins and causing surprise behavior on people's hardware. That would be a disaster, as Samuel pointed out.
I kind of like the idea of just shutting down if nothing boots.