I really like the FAT-less idea of just writing file records sequentially into the flash, cancelling flags in old file records as you go. Then, when things fill up, reconsolidate the data at the beginning of the Flash. This takes almost no code and could even be added to the ROM Monitor. No need for directory hierarchies, really.
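To make the idea concrete, here is a minimal sketch of such a record log, modelled in Python on a bytearray rather than real flash. The header layout (flag byte, 15-byte name, 32-bit length), the flag values, and the `RecordLog` class are all assumptions made up for illustration; nothing here is an agreed format. Compaction is done through a RAM copy, which a real device with limited RAM would have to replace with a spare block.

```python
import struct

ERASED = 0xFF          # value of every byte after a flash erase
FLAG_LIVE = 0xFE       # hypothetical: one bit cleared marks a record in use
FLAG_CANCELLED = 0x00  # clearing the remaining bits cancels it, no erase needed
HEADER = struct.Struct("<B15sI")  # flag, NUL-padded name, payload length


class RecordLog:
    """Toy model of sequential file records with cancel flags."""

    def __init__(self, size):
        self.flash = bytearray([ERASED]) * size
        self.end = 0  # offset of the first unwritten byte

    def _records(self):
        # Walk the records from the start until unwritten space is reached.
        pos = 0
        while pos + HEADER.size <= len(self.flash):
            flag, raw_name, length = HEADER.unpack_from(self.flash, pos)
            if flag == ERASED:
                break
            yield pos, flag, raw_name.rstrip(b"\0"), length
            pos += HEADER.size + length

    def _append(self, name, data):
        HEADER.pack_into(self.flash, self.end, FLAG_LIVE, name, len(data))
        start = self.end + HEADER.size
        self.flash[start:start + len(data)] = data
        self.end = start + len(data)

    def compact(self):
        # Keep only live records and rewrite them from the beginning.
        live = []
        for pos, flag, name, length in self._records():
            if flag == FLAG_LIVE:
                start = pos + HEADER.size
                live.append((name, bytes(self.flash[start:start + length])))
        self.flash[:] = bytes([ERASED]) * len(self.flash)  # models an erase
        self.end = 0
        for name, data in live:
            self._append(name, data)

    def write(self, name, data):
        need = HEADER.size + len(data)
        if self.end + need > len(self.flash):
            self.compact()
        if self.end + need > len(self.flash):
            raise MemoryError("flash full even after compaction")
        old = [pos for pos, flag, rec, _ in self._records()
               if flag == FLAG_LIVE and rec == name]
        self._append(name, data)
        for pos in old:  # cancel older versions only after the new one is written
            self.flash[pos] = FLAG_CANCELLED

    def read(self, name):
        for pos, flag, rec, length in self._records():
            if flag == FLAG_LIVE and rec == name:
                start = pos + HEADER.size
                return bytes(self.flash[start:start + length])
        return None
```

Cancelling only ever clears bits in the flag byte, which is what lets it happen without an erase. Writing the new record before cancelling the old one is a design choice in this sketch so that a version of the file always exists.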
A standardized interface between the Propeller 2 tool (and others) and Propeller file I/O would be more useful, in my opinion: something that would load a simple (well, as simple as file system management can be) program into the Prop2 RAM and then read/write the file system. This would allow a standard set of tools to be used to directly access custom memory devices such as Propeller-connected SD cards, other flash devices, the Propeller boot memory, networks of Propellers or anything else people can come up with. The actual file system objects would then need to have two versions, one for compile-time access and another for run-time code execution.
I like the simple approach too Chip. All that requires is a simple record header so everybody knows what is in the flash. Just read through...
Adding a little bit to the monitor seems like a worthy thing to do. One could squirt firmware and/or data over and just write it. And that could be a nice reference for the simple header too.
@Bill: You were thinking what I was. Boot, get right to SD for anything more than basics. Agreed.
A system like this has worked for Forth for all these years and look where it is today! (ok, bad example!)
Kidding aside, just set aside the first few blocks as master bit tables and a simple directory for names and offsets and away you go. I see this more for static files (look up tables, code overlays, ?) than dynamic data files. Maybe you can even get away with specifying a number of blocks upon file creation. Creation and deletion are easy; compaction when you need to recover those deleted blocks requires some gymnastics but nothing really complicated.
Anything more complex can go to SD like others have already mentioned.
Just saw Bill's post. Yes! 2) can we agree that the first long in the flash defines how big the boot area is?
Totally agreed. From there...
I think we can just use a simple header marker, maybe some hexwords (http://nedbatchelder.com/text/hexwords.html), followed by a number of blocks. Then just read through 'em as needed.
One thing about the flash I've seen is that you can only erase in 256-byte chunks (this is the smallest size you can erase).
Nice thing about TAR is that each file has a 512 byte header that describes size and name, etc.
So, it works out pretty well and is simple.
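As a rough illustration of how little code the TAR walk needs, here is a sketch that steps through the 512-byte headers of a ustar image held in memory. The field offsets (name at 0, octal size at 124) come from the ustar format; the function names and the demo file contents are made up, and the sketch ignores long names, checksums and pax extensions.

```python
import io
import tarfile

BLOCK = 512  # tar headers and data are padded to 512-byte blocks


def list_tar_members(image):
    """Return (name, data_offset, size) for each member of a tar image."""
    members = []
    pos = 0
    while pos + BLOCK <= len(image):
        header = image[pos:pos + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end
            break
        name = header[0:100].split(b"\0", 1)[0].decode("ascii")
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field or "0", 8)  # size is stored as octal text
        data_start = pos + BLOCK
        members.append((name, data_start, size))
        # Skip the data, rounded up to a whole number of blocks.
        pos = data_start + (size + BLOCK - 1) // BLOCK * BLOCK
    return members


def build_demo_image():
    """Build a small ustar archive in memory to stand in for the flash."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in (("prog1.p2", b"\x01" * 700),
                              ("prog2.p2", b"\x02" * 10)):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    for entry in list_tar_members(build_demo_image()):
        print(entry)
```

A loader on the chip would do the same walk with flash reads instead of slices: read a header, compare the name, and either load `size` bytes or skip ahead to the next header.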
I have written a NOR-flash-optimized Unix like filesystem for my projects, however I cannot open source it as wifey would kill me if I gave away man-months of work.
Right, and this is why this is probably not worth pursuing. Filesystems are a huge, huge black hole of a time suck. I think that baking in a specific implementation of a given filesystem into the monitor is going to take a long time before there's confidence about turning it into silicon, and even then I'm sure someone will discover some subtle incompatibility after it's in the wild.
When I was thinking about this, it seemed that just loading flash with a tarball of the files you want is the easiest way...
Then, you can easily transfer to PC too...
This is an interesting idea. Combine this with the SPI drivers people have made for the RaspPi: "tar -cf - prog1.p2 prog2.p2 prog3.p2 | ./spi_write.py"
I really think it is too early to decide on anything except how the size of the second stage bootloader is indicated in the flash.
I proposed the first long be the size of the code to be loaded, sequentially.
That boot loader could then boot from SD card, tar files, paper tape etc.
This also keeps the monitor simple & easy ... a 'boot' command just reads the boot block size, loads it, and jumps to it.
Such simplicity postpones 1e6 postings discussing the merits of this method or that, fat vs punch card decks et al.
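A sketch of what that 'boot' step amounts to, modelled in Python on a byte string. It assumes the size long is little-endian, counts bytes, and does not include itself; none of those details were settled in this thread, and `read_boot_block` is a made-up name.

```python
import struct


def read_boot_block(flash):
    """Return the boot code that follows the leading size long."""
    (size,) = struct.unpack_from("<I", flash, 0)
    if size == 0xFFFFFFFF:
        raise ValueError("flash looks erased, no boot block present")
    if 4 + size > len(flash):
        raise ValueError("boot block size runs past the end of flash")
    return flash[4:4 + size]  # the loader would copy this and jump to it
```

Everything after those `size` bytes is left for the boot block itself to interpret.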
One interesting bootblock (LATER) would be P2 making the flash chip appear like a flash drive (over USB) and simply writing BOOT.IMG out as the boot block when a PC wrote to it.
Another would be waiting for 5-10 seconds after bootup for a USB CDC protocol download...
This makes sense, but there are still some details. Would it be cogexec or hubexec? Would it be expecting any parameters such as which pins the flash is on or will that be hard-wired?
Wasn't this all locked down before? Unless the "Using the Propeller 2 Monitor" doc is out of date it says the boot order is serial, then SPI flash, then dropping to monitor.
This tries to specify how an un-encrypted flash boot would work.
Good point about cog/hubexec... I think greatest flexibility would be obtained by hubexec, as it would allow a larger than cog sized boot image without having to load a cog and jump out of it.
Don't we have a long for cog starting now that tells us similar things? Can't they be consistent formats?
Say that boot long says 2K. Prop fetches it, and the very first HUBEXEC instruction is a COGNEW to start a COG. Then again, say it isn't, and that code goes and grabs 100K more... then it fires off a few cogs and tasks.
Or, that long says 512 bytes, and it starts up really quick in HUBEXEC, which then goes and does something...
Or, get right to it, take a long, slow boot and fetch a big image, the first few instructions fire off everything.
This is what Bill is getting at.
Also, COG order can matter because of the DACs. Less shuffling of them with a start in HUBEXEC.
Right now, the first 512 longs of flash is the booter image which gets authenticated and, in turn, runs and loads the main app. That needs to be at the start of memory, sans any other requirement. If authentication fails on that fixed data block, there's no bootable image, period. This keeps things consistent and simple, as there's no possibility of false pointers sending the ROM booter on a fool's errand. It's 512 longs that either pass authentication, or don't. This same rule applies for serial download.
What comes after those initial 512 longs could be anything. Is there wisdom in actually mandating something there, like a long which tells the absolute address of where any file-system data begins? Maybe another long specifying the last address (plus 1?) to be used by file-system data?
About these 256-byte-block-erasable flash chips: These are 'data' flash chips and are way lower density (more expensive) than the bigger 4k-byte-block-erasable flash chips. We are better off, I think, with the 64Mb/128Mb/256Mb devices that cost $1..$3 for a whole lot more memory. So, we'd need a file system that works with 4KB block erasure.
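To show what 4KB erasure means for anything layered on top, here is a small sketch that works out which sectors have to be erased to rewrite a byte range. The 4096-byte sector size is from the post above; the function name is made up.

```python
SECTOR = 4096  # smallest erasable unit on the larger flash chips


def sectors_to_erase(addr, length):
    """Return the start addresses of every 4KB sector touched by a write."""
    if length <= 0:
        return []
    first = addr // SECTOR
    last = (addr + length - 1) // SECTOR
    return [sector * SECTOR for sector in range(first, last + 1)]
```

Rewriting even 200 bytes that straddle a sector boundary costs two 4KB erases, and any other live data in those sectors has to be saved and written back, which is why an append-and-cancel scheme suits these chips better than updating in place.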
I forgot about the authentication. Yes, that's gotta be one block, atomic. Or we have a hole!
Given we have to have that image to start, I don't think we need anything after it. Whatever authenticated image people choose to use will understand what it needs from the flash.
I think you're right. Being a non-removable memory, it won't be accessible outside of what the booter eventually loads to perform reads and writes. So, the app, starting from the authenticated booter, is the sole controller of flash usage. Within that framework, rules can be set, but that's outside the scope of what we are talking about here.
It might be presumptuous to give the ROM monitor any more control over the flash than block read/write/erase. That would be so low-level that it may be superfluous, anyway.
Well, with block read, write, erase, one could send an image through the terminal and write it, then boot. Could even happen authenticated or for an authenticated one, if that option were provided for.
Other than that, it seems to me the terminal can handle data otherwise.
Agreed, leave it as is. Anything can be built on top of what we have.
As I see it, a microSD socket (cost <$1) is the best way to go for file systems. It's easily removable and can be read and written by a PC, etc. microSD cards are cheap and user supplied. I have quite a few just from old phones.
Just regret I didn't pursue getting a boot to SD so that we didn't require flash on the board.
I've got the FPGA compiling well now with all the new instructions and new ROM. I just need to get the docs updated to get the next release done. This has taken longer than I thought it would.
I really like the idea that on a "standardized" base or development P2 platform, we get lots of extra flash memory available by default and quite a bit greater than the ~256kB hub RAM size. This will allow plenty of space for other assets that can be dynamically used such as fonts, wavetable samples, large look up tables, etc, without necessarily mandating the use of external SD cards at all times for this purpose.
Also someday (soon I hope) someone will invent some fancy new XMMC model which could potentially read P2 code data from this flash to be cached/executed in hub RAM. So there's no doubt there will be plenty of uses for it. It also provides some amount of resilience/redundancy in the field during software upgrades, with the ability to have multiple validated boot images etc. This is all without requiring any SD cards to be fitted, but if you want/need a full blown R/W filesystem on SD you can still add that too. In my view it's always good to be able to boot to something useful if the SD is not fitted or gets corrupted etc, not just break to a ROM monitor. That would be fine for us developers, but not necessarily for end user customers.
I recall comments in FPGA discussions along the lines that the FPGA loader engine will stream clocks until it finds a 'key' header, and then it clocks the image in from there.
It's not bulletproof, but it could give a simple means of being less dependent on strict physical location, and in conventional SPI memory it costs very little: just a couple of distinct leader words.
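A sketch of that kind of scan, in Python on a byte string. The 8-byte key below is invented for the example, and `find_image` is a made-up name; a real loader would clock bytes in and compare as it goes rather than search a buffer.

```python
# Hypothetical leader words; any distinct, unlikely pattern would do.
DEFAULT_KEY = bytes.fromhex("B007C0DE5032F1A5")


def find_image(flash, key=DEFAULT_KEY):
    """Return the offset just past the first key header, or None."""
    index = flash.find(key)
    if index < 0:
        return None
    return index + len(key)
```

As noted, it isn't bulletproof: data that happens to contain the key earlier in the flash would be mistaken for the header, so a longer key or a check on what follows would reduce the risk.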
Yes, and that memory is only going to get cheaper/byte too.
QuadSPI support in SerDes will make such devices run close to their abilities, without costing SW resource.
Execute in place should be possible, for the least often used code.
Really, it's just going to fetch a COG image from the first 512 longs, authenticate it, then carry on from there. Once that image loads, it can do whatever, look for headers, etc... Necessary for a reliable secure code model.
And this is documented in the loader.spin code posted. We already know exactly what it does.
Yeah I am not sure which particular device Chip has in mind but I am hoping it will be a QuadSPI based one and we can utilize the high speed performance from it one day.
By the way, I'm just looking at QuadSPI now, do we have a nibble swap operation in the P2 instruction set? That may come in very useful if we don't have it already. I do see ESWAP4 but don't know if that does bitwise endian swaps or nibble swaps in a byte (or both).
I've got some ideas on handling firmware upgrades with ping-pong flash loads, I need to work out one of the issues, then I'll lay out the whole idea here.
Thanks ozpropdev,
Now we can see what it does. It seems to swap nibbles in the whole long and reverse the byte order as well. Would have been good to also have a nibble swap opcode variant that swaps just the nibbles of the least significant byte, or alternatively each nibble within each of the 4 bytes in the long (that might be more versatile and still let you deal with single byte values).
So if you started with, say,
$ABCDEF01
and did a nibble swap (NSWAP?) on it, you'd get:
$BADCFE10
If you passed just a single byte value as
$000000AB
you'd get
$000000BA
This operation could be good for doing 4 bit data transfers and let you deal with sending/receiving MS or LS nibble first etc, without needing to rotate and mask etc.
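For reference, the per-byte nibble swap described here can be modelled with two masks and two shifts. This is a Python sketch of the suggested operation only; `nswap` is a made-up name, not an existing P2 instruction, and it says nothing about what ESWAP4 actually does.

```python
def nswap(value):
    """Swap the two nibbles within each byte of a 32-bit long."""
    value &= 0xFFFFFFFF
    low_to_high = (value & 0x0F0F0F0F) << 4   # low nibbles move up
    high_to_low = (value & 0xF0F0F0F0) >> 4   # high nibbles move down
    return low_to_high | high_to_low


if __name__ == "__main__":
    print(f"${nswap(0xABCDEF01):08X}")
    print(f"${nswap(0x000000AB):08X}")
```

Without a dedicated opcode, the same thing in software would take roughly two ANDs, two shifts and an OR on a copy of the value, which is the rotate-and-mask work this would save.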
By the way what does ESWAP8 do, reverse bytes in the long I guess?
Comments
The M25PE and M45PE series have a small erasable page of just 256 bytes.
Digikey has the M25PE16-VMW6TGCT in stock: 2M x 8 (16 Mbit) at $1.34 per unit.
The tradeoff is that it is a lower density flash.
1) I'd suggest that we reserve the first 32KB.
Why 32KB?
That's big enough to implement a more sophisticated SD / whatever system.
Reserving 256KB does not make sense to me, as it would be wasted in most cases.
OR
2) can we agree that the first long in the flash defines how big the boot area is?
Either of the above allows postponing the more sophisticated boot solutions until later.
Heck, once USB works, we might want a boot loader that waits 5 sec after reset for a USB download!
p.s.
I have dusted off my DE2-115 waiting for the next image...
&
I also dusted off my DE0-Nano with the add-on board.
Or booting from a FAT SD card...
Will the loader still support 0-key images (i.e. no encryption)? I hope so...
With this reminder... I agree. Let's just leave it at the 512-long initial loader, which can do more complex things (e.g. SD boot) as needed.
Here's the result of an ESWAP4 D.