serial boot loader



IAN STROME
08-29-2007, 06:45 PM
Does anyone know the full serial boot protocol?

After reset goes high, the Prop looks for a Break on P31 (RxData) within 10 ms.
Is this correct? What baud rate, parity, etc.? Any help appreciated.

Regards.

:- The component most likely to fail will be in the least accessible place.

Mike Green
08-29-2007, 10:10 PM
Have you looked at: http://forums.parallax.com/showthread.php?p=591445?

This pretty well defines the protocol. There's a similar program written in Python as well.

The program actually uses a ratiometric protocol encoded in a 115KBaud data stream. As far as
the sender is concerned, it's an 8 bit, no parity mode with 1 stop bit although it should work with
2 stop bits.

Post Edited (Mike Green) : 8/29/2007 2:17:02 PM GMT

Ariba
08-30-2007, 05:22 AM
Here you can also find some useful information:

http://forums.parallax.com/showthread.php?p=611536
(last posting on first page)

Andy

IAN STROME
08-30-2007, 08:54 AM
Many thanks to Mike + Andy for those links.
Plenty there to chew on. I'm trying to get the Prop to talk to FPC under DOS, i.e.
(NO WINDOZE). The Python code was a bit involved, too many parentheses.
Finally worked out the LFSR was a pseudo-random sequence generator, as used
on the secure mil stuff I worked on 30+ years ago, only instead of using the full
byte it only uses bit 0, hence the stream of 250 1's and 0's.
I will post again when I get it working.
Regards to all Ian
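
For anyone following along, the bit stream Ian describes can be sketched as below. This is only a sketch: the seed ("P") and the tap positions are recalled from Chip's PropellerLoader.spin rather than stated anywhere in this thread, so treat them as assumptions and check them against that source. The function name is made up for illustration.

```python
def lfsr_bits(count=250, seed=ord("P")):
    """Return `count` bits from an 8-bit LFSR, taking only bit 0 each step.

    ASSUMPTION: the seed "P" and the taps at bits 7, 5, 4 and 1 are recalled
    from PropellerLoader.spin; verify them before relying on this.
    """
    lfsr = seed & 0xFF
    bits = []
    for _ in range(count):
        bits.append(lfsr & 1)  # only bit 0 is used, as noted in the post
        feedback = ((lfsr >> 7) ^ (lfsr >> 5) ^ (lfsr >> 4) ^ (lfsr >> 1)) & 1
        lfsr = ((lfsr << 1) | feedback) & 0xFF  # keep the register 8 bits wide
    return bits


if __name__ == "__main__":
    stream = lfsr_bits()
    print(len(stream), "bits:", "".join(str(b) for b in stream[:32]), "...")
```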

:- All tolerances add up in the same direction.

hippy
09-04-2007, 11:50 PM
A more formal specification of the protocol would be very welcome, by myself and I am sure others.

I think I can understand the transmission protocol but am entirely unclear on the hows and whys of "WaitBit" in Chip's PropellerLoader.spin, and especially how that would be implemented when using UART, byte receive only.

I'm also having trouble relating PropellerLoader.spin ( which seems to just send a stream of 1T and 2T 'bits' ) to BootSequence.spin found elsewhere which seems to be byte oriented, with particular importance given to $F9 bytes.

From the unframed bit sequences sent in PropellerLoader.spin it would seem possible to pack as many 1T and 2T bits as will fit into each of the serially framed back-to-back bytes ( up to 5 x 1T's ), but BootSequence.spin states encoding is "up to 3 bits sent per serial byte".

I've looked at all the loader variants written but something isn't gelling and I remain completely confused. It's completely 'beyond me', so if anyone can explain the protocol in a manner I can understand, I'd be very grateful.

Mike Green
09-05-2007, 12:28 AM
hippy,
The primary thing to remember in the download protocol is that this is not really asynchronous serial. It's a speed independent ratiometric protocol that just happens to use a UART on the PC end because that's what's available. The 1T/2T bits are just that. These are bits packed into 3 bit "cells" using the start bit of the asynchronous character cell as the first bit, so you can get up to 3 bits sent per "character" (3 x 3 = 9 + stop bit = 10 bit times per UART "frame"). For simplicity, a lot of the data from the PC is sent one bit per asynchronous character, particularly the handshaking with the feedback shift register. The Propeller doesn't use a UART and is not really framing the information into asynchronous characters.

If you look at the $F9 bytes as bits, you get %11111010. If you append the start bit on the low order end, you get %111110100 which is used as a sync code since it has a 2T interval followed by a 1T interval and all the rest is idle state (logic 1). The Propeller boot loader is running off the internal RCFAST clock which varies from chip to chip, but is fairly stable over a short period of time (seconds to minutes). The Propeller can't do asynchronous I/O without a known, accurate clock, so it uses this protocol for the boot load. The sync characters provide a 2T and 1T clock and the 2:1 ratio provides a clear distinction between the two states (and idle - the third state). The Propeller end just has to catch the bits as they come in and pack them as needed for whatever is being received. For responding, it can just use the times from the sync bits.
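
To make the sync byte concrete, here is a small sketch that expands a byte into the ten bit times a UART puts on the line (start bit, eight data bits LSB first, stop bit) and lists the widths of the low pulses in the order they are sent. The helper names are invented for illustration, and the levels are logical (0 = low, 1 = idle), ignoring any RS-232 level inversion.

```python
def frame_bits(byte):
    """Logic levels of one 8N1 frame in transmission order."""
    data = [(byte >> i) & 1 for i in range(8)]  # UARTs send the LSB first
    return [0] + data + [1]                     # start bit low, stop bit high


def low_pulses(byte):
    """Widths (in bit times) of each run of low bits, in the order sent."""
    widths, run = [], 0
    for level in frame_bits(byte):
        if level == 0:
            run += 1
        elif run:
            widths.append(run)  # a high level ends the current low pulse
            run = 0
    return widths


if __name__ == "__main__":
    for b in (0xF9, 0xFF, 0xFE):
        print(f"${b:02X}", "".join(map(str, frame_bits(b))), low_pulses(b))
```

Under these assumptions $F9 carries one 1T pulse and one 2T pulse with the rest of the frame idle, while $FF and $FE carry a single 1T or 2T pulse respectively.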

hippy
09-05-2007, 01:38 AM
@ Mike ...

Mike Green said...
The primary thing to remember in the download protocol is that this is not really asynchronous serial. It's a speed independent ratiometric protocol that just happens to use a UART on the PC end because that's what's available.

Yes, I fully understand that. It's a very clever trick.

Mike Green said...
The 1T/2T bits are just that. These are bits packed into 3 bit "cells" using the start bit of the asynchronous character cell as the first bit, so you can get up to 3 bits sent per "character" (3 x 3 = 9 + stop bit = 10 bit times per UART "frame").

Looking at PropellerLoader.spin that doesn't seem to be the case. This "three bit cell" notion implies that it is not just the period of the bit time which is important but the period between the bit times, that doesn't seem to be enforced in PropellerLoader.spin.

Perhaps I should explain further - $FF is a 1T from just the start bit, $FE is a 2T because of the start bit followed by the lsb. So is $55 valid ? That has 5 x 1T bits, but it doesn't fit the "three bit cell" requirement.

I guess I'm asking if this "three bit cell" requirement is an actual requirement or simply a convenient way of looking at things when it comes to packing bits into a byte sent via a UART.

Is there a requirement that a one bit is sent as "one high period, two low period", and a zero bit sent as a "two high period, one low period" on the physical (RS232) TX line, or is it simply the difference between the length of the high periods that the Propeller uses ? This is not at all clear from pointing to the example code, and why I'd like a more formal explanation. As best I can see, PropellerLoader.spin doesn't comply with any "three bits cell" timing.

Mike Green said...
For simplicity, a lot of the data from the PC is sent one bit per asynchronous character, particularly the handshaking with the feedback shift register. The Propeller doesn't use a UART and is not really framing the information into asynchronous characters. If you look at the $F9 bytes as bits, you get %11111010. If you append the start bit on the low order end, you get %111110100

Again, I fully understand that, and can understand how a single byte can be used to send a 1T or 2T ( and many other combinations of bits ) - it's %11111001, and %111110010 but I'm not noting that to get bogged down in debate over typos.

If a "three bit cell" isn't a requirement, the same 1T then 2T could be achieved using $CF, %110011110, or is that not correct, or not allowed ?

Mike Green said...
... which is used as a sync code since it has a 2T interval followed by a 1T interval and all the rest is idle state (logic 1).

And this is where I am confused, because that 2T followed by a 1T is nothing special in a free-form bit stream and will undoubtedly occur many times while transferring an Eeprom image.

Apart from that bit sequence being used to allow the Propeller to re-adjust its bit timing, it is not clear what other significance it may have.

To move to talking of $F9 suggests there is some 'byte framing' or specific timing which makes the data stream something more than a free-form bit stream. It implies there is some importance to the time between the 1T and the 2T.

Mike Green said...
The Propeller boot loader is running off the internal RCFAST clock which varies from chip to chip, but is fairly stable over a short period of time (seconds to minutes). The Propeller can't do asynchronous I/O without a known, accurate clock, so it uses this protocol for the boot load. The sync characters provide a 2T and 1T clock and the 2:1 ratio provides a clear distinction between the two states (and idle - the third state). The Propeller end just has to catch the bits as they come in and pack them as needed for whatever is being received. For responding, it can just use the times from the sync bits.

Again, I understand that the Propeller sees either 1T or 2T bits and is aware when there are neither, and understand how it works to cater for wildly varying RCFAST clocks, and probably quite impressively so.

I can also understand the Propeller readjusting its timing on every bit received or even on a specific sync character, but where is the explanation which details this, or states how often a sync character sequence must be sent ? Am I right that the sync characters are sent at the start of downloading during LFSR 'handshaking' and then it's fingers crossed that RCFAST doesn't drift too far during the entire download ?

Apart from the lack of clarity on whether it's a free form stream of 1T and 2T bits, a "three bit cell" and any other special case for sync characters, it really is what comes back from the Propeller and dealing with it using a PC which only has a UART which is where my difficulties lie.

Ariba
09-05-2007, 08:13 PM
Hello hippy
Here is my understanding of the Loader protocol:

hippy said...
I guess I'm asking if this "three bit cell" requirement is an actual requirement or simply a convenient way of looking at things when it comes to packing bits into a byte sent via a UART.

The 3-bit cells are not required, but they are a simple way to code it. If you have only 1T pulses, you can pack 5 of them in a byte.

hippy said...
And this is where I am confused, because that 2T followed by a 1T is nothing special in a free-form bit stream and will undoubtedly occur many times while transferring an Eeprom image.

No, it's nothing special. The PC and the Propeller both count the bits, so both know when the Prop has to respond and when not.
When the PC expects a response bit from the Prop, it sends $F9 bytes, which are 1T/2T sync patterns for the Prop, each of which synchronizes 1 bit of the response.

hippy said...
Am I right that the sync characters are sent at the start of downloading during LFSR 'handshaking' and then it's fingers crossed that RCFAST doesn't drift too far during the entire download ?

I think so too, but I don't know for sure. Does it matter?

hippy said...
Apart from the lack of clarity on whether it's a free form stream of 1T and 2T bits, a "three bit cell" and any other special case for sync characters, it really is what comes back from the Propeller and dealing with it using a PC which only has a UART which is where my difficulties lay.

I made a timing diagram. See the attached picture; I hope it clarifies things a little.

Cheers
Andy

hippy
09-05-2007, 11:00 PM
@ Ariba : Many thanks Andy, and especially for your GIF. I'm at a disadvantage here because I have neither Propeller hardware to experiment with nor scope to look at what's happening.

I think I got 'hung-up' on how PropellerLoader.spin appears to work, and the way it samples the response bits. Plus I couldn't relate the 'byte oriented' to the 'bit stream' mechanism which use different bit packing techniques. Your image and re-reading the Delphi code Chip provided now ties it all together.

If I've got it right, download is basically (1) do the synchronising, LFSR handshaking and version checking, (2) download the image three bits per cell or using more dense packing, (3) send $F9's to entice back 1T/$FF or 2T/$FE responses.

I suppose it all boils down to example code and descriptions of how something is or can be done is not the same as the specification of the actual protocol used. You've helped me make a great leap forward and I'm very grateful.

IAN STROME
09-07-2007, 11:57 AM
Hi All,
I've been looking at this for a few days.
It's probably worth noting :-
a zero is encoded as 2 bit times low followed by 1 bit time high
a one is encoded as 1 bit time low followed by 2 bit times high
So you can't have 5 1T's per byte :- not enough room.
I'm sure there's a name for this encoding, but I can't remember it. Too long ago!!

This gives the codes for your 3 bits per byte as follows (see JPEG diagram):-
000 = $92
001 = $93
010 = $9A
011 = $9B
100 = $D2
101 = $D3
110 = $DA
111 = $DB

So encoding a 32-bit long takes 11 bytes: 32 / 3 = 10 remainder 2, i.e. 10 full bytes plus 1 byte for the remaining 2 bits.

I too had problems with VB, Delphi, etc., trying to understand what the hell
was going on in the code, so I plugged the Prop into PC Term 0.3, fired up my
old homebrew 68K system, and jumpered a spare UART into the serial lines to
see what was going on. I'm getting 4 bytes @ various intervals when
uploading, namely :- $F2, $F3, $FA, and $FB. These in the 3 bits per byte
scheme decode as binary 00, 01, 10, 11. Any ideas what these are??

I think I may have just answered my own question: these codes are for
the end of a 32-bit long with 2 bits left over, so 10 bytes of the above codes
followed by 1 byte of either $F2, $F3, $FA, or $FB. I suppose in effect
signifying end of frame!!
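
Ian's table and the leftover bytes can be reproduced with a few lines of code. The sketch below assumes the bits of a long go out least-significant first (so the rightmost digit in the table is the first cell on the wire); the thread does not state the bit order, so check that against the loader sources. The function name is hypothetical.

```python
def encode_long(value):
    """Encode a 32-bit value as 11 serial bytes, three protocol bits per byte.

    ASSUMPTION: bits are sent least-significant first.
    """
    bits = [(value >> i) & 1 for i in range(32)]
    out = bytearray()
    for i in range(0, 30, 3):                  # ten full bytes, 3 bits each
        b0, b1, b2 = bits[i:i + 3]
        # $92 is three "zero" cells; setting bit 0, 3 or 6 shortens the
        # low pulse of the matching cell from 2T to 1T, making it a "one"
        out.append(0x92 | b0 | (b1 << 3) | (b2 << 6))
    b0, b1 = bits[30:32]                       # two bits left over
    out.append(0xF2 | b0 | (b1 << 3))          # third cell stays idle (high)
    return bytes(out)


if __name__ == "__main__":
    print(encode_long(0x00000000).hex(" "))
    print(encode_long(0xFFFFFFFF).hex(" "))
```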

It's amazing what Black Bottle whisky can do!!

:- Sliding fits don't

hippy
09-07-2007, 01:08 PM
@ Ian : It's probably worth noting :-
a zero is encoded as 2 bit times low followed by 1 bit time high
a one is encoded as 1 bit time low followed by 2 bit times high
So you can't have 5 1t's per byte :- not enough room.

That's how it's done in Chip's Delphi code example, but whether it's a requirement or just an easy to use technique remains unanswered. I'm on my way towards testing the 'dense packing' method but haven't got there yet.

@ Ariba : I've now got a Propeller and a Bootloader written and running, so thanks for your help.

I finally gave up with a 'from theory/assumption' approach and translated the Delphi example into VB3/VB6 almost line for line, and it's starting to look similar to Filip's code in http://forums.parallax.com/showthread.php?p=611536. I'm now tweaking it to see what does work and what breaks it.

VB3's MScomm.vbx won't support baud rates over 14400 so that's been an interesting issue as I'd like to get the VB3 version running for 16-bit systems, and the 'Propeller IDE' I was hoping to wedge this into is already written in VB3. Although full 32-bit support is the end goal, I'd hoped to leave the porting to last.

The VB6 works down to 14400 under Win 98SE, but not below 38400 on Win XP. I've also found that the LFSR handshaking after Reset needs to be buffered and then sent back-to-back, as per the Delphi code. Trying to send a byte at a time results in no response ever coming back from the Propeller.

There's no explicit waiting in that sequence so I presume VB6 runtime / Windows does something on every send ( schedules other tasks etc ) and I'm hitting some undocumented timeout limit. This is where a formal spec would really help, so I wouldn't have to be presuming or guessing.

None the less, I do have something working, so that's a great leap forward and a big 'question mark' crossed off the project plan.

Fred Hawkins
09-07-2007, 03:25 PM
"a zero is encoded as 2 bit times low followed by 1 bit time high
a one is encoded as 1 bit time low followed by 2 bit times high"

Alternatively: start bit low, one bit data, end bit high.

Which means 2 zero bits are always in the first two bits of the three, 2 high bits are always in the last two bits of the three. 111 and 000 are impossible sequences.


____
So, given 3 sequential bits received:

001 = data 0
010 = last two bits of preceding sequence (which was data 0), start bit of second sequence
011 = data 1
100 = last bit of preceding sequence, starting two bits of second sequence (which is data 0)
101 = last bit of preceding sequence, starting two bits of second sequence (which is data 1)
110 = last two bits of preceding sequence (which was data 1), start of second sequence

Filling out the other two cases of the eight possible bit sequences: 000, 111 = data is suspicious. (Are we really connected?)

Interesting. In three bits you figure out the synchronization.
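
Fred's claim is easy to check by brute force. The sketch below encodes every 4-bit data pattern as three-bit-time cells (001 for data 0, 011 for data 1, as in his list) and collects every 3-bit window that can appear in the stream. The names are invented for illustration.

```python
from itertools import product

CELL = {0: "001", 1: "011"}  # data 0 and data 1 as three bit times each


def windows(data_bits):
    """All 3-bit windows seen while sliding over the encoded stream."""
    stream = "".join(CELL[b] for b in data_bits)
    return {stream[i:i + 3] for i in range(len(stream) - 2)}


def all_windows(n=4):
    """Union of the windows over every possible n-bit data pattern."""
    seen = set()
    for data in product((0, 1), repeat=n):
        seen |= windows(data)
    return seen


if __name__ == "__main__":
    print(sorted(all_windows()))  # 000 and 111 should be absent
```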

Ariba
09-08-2007, 09:06 AM
I still think that only the Low time is important (1T or 2T Low) for a 1 or 0 bit. Whether there are one or two High bits after that does not matter (see the PropLoader.spin). Certainly there is a minimal High time, that we don't know, but why should it be longer after a 1T Pulse than after a 2T Pulse?

@hippy
I had the same problems when I made the Loader of PropTerminal. I experimented with the (Windows) TimeOut settings of the COM ports, but nothing helped. The Bootloader in the Propeller is very timing sensitive; if Windows delays a byte too long, the timing is lost.
My solution was also to program the Loader code exactly like the Pascal example, then I had no more problems.

Andy

Edit: The PropTerminal and its Loader work well under Win98SE.

hippy
09-08-2007, 11:01 AM
It's very odd. As best I can tell it's the amount of 'idle' between bit-pulses which seems to cause the loss of comms. The key seems to be in keeping data churning out the serial port.

Using one byte to send every LFSR bit from the PC and 3-bits per byte / 11 bytes per long I cannot get the baud rate down below 38400 without it failing ( VB6/Win98SE ).

Using the dense packing algorithm ( 1 = two bits "01", 0 = three bits "001", all packed back to back ) for LFSR send and Eeprom image transfer I can get down to 4800 baud without error with VB6, down to 9600 baud with VB3. Without the dense packing it just doesn't work at all with VB3 ( capped at 14400 baud max ).

Lower baud rate transfers are esoteric for PC users, but it's an issue when considering slower speed development devices (PDAs), wireless and over-TCP/IP programming.

At least I seem to have an answer to the 'three bits per byte' query; no it's not a necessity, just convenient, and using dense packing should ( on average ) shave near 30% off download times. A couple of seconds for a full 32KB image isn't significant at 115200 baud, but it amounts to a lot at lower baud rates.


Edited (2007-09-09) : I was wrong about the 30% saving. The Propeller Tool pads unused Eeprom image with zeros, and they take three bits minimum no matter what scheme is used for download, so the saving works out nearer just 3%. The moral is, only download .binary files whenever possible. For compiler / assembler writers, it is possible to pad unused Eeprom image with ones. This gives an average 30% and up to 50% saving when using dense packing for download for full 32KB images.
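
The effect of zero padding versus one padding can be seen by counting serial bytes for each scheme. The sketch below is a simplification: it fills 10 bit times per frame greedily and ignores long boundaries, so the numbers are indicative only. The function name is hypothetical.

```python
def frames_needed(bits, dense):
    """Count serial bytes needed when filling 10 bit times per frame.

    dense=False: every bit takes a 3-bit-time cell.
    dense=True:  a one takes 2 bit times, a zero still takes 3.
    """
    frames, used = 0, 10          # start "full" so the first bit opens a frame
    for bit in bits:
        width = 2 if (dense and bit) else 3
        if used + width > 10:     # no room left, start a new frame
            frames += 1
            used = 0
        used += width
    return frames


if __name__ == "__main__":
    for name, data in (("zeros", [0] * 30), ("ones", [1] * 30)):
        print(name, frames_needed(data, False), frames_needed(data, True))
```

With 30 zero bits both schemes need 10 bytes, while 30 one bits need 10 bytes in cells but only 6 when densely packed, which is why the padding value matters.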


When I've got my code tidied up and a few odds and ends sorted out I'll post it to the forum.

Post Edited (hippy) : 9/9/2007 8:48:48 PM GMT

IAN STROME
09-09-2007, 08:31 AM
Hi again
Yes hippy, I think you're quite right. I couldn't get it to boot @ less than
14400 under DOS, so this is obviously in the timing in the PROP.
I imagine anything under about 69 uSec ( 1 bit time @ 14400 baud ) is
considered by the PROP as effectively a false start bit.

Another question?
When loading a file to the PROP when does it expect an $F9 for sync??

Best Regards to all
Ian

:- The transistor is there to protect the fuse.

hippy
09-09-2007, 10:07 AM
Not sure what I've done but as long as "dense packing" is used, I've been able to successfully identify and download at 4800 baud. How much that is PC dependent remains to be seen.

The $F9 needs to be sent at the beginning to set the timing, then it is sent later to solicit bit responses, to read the LFSR and version, and after download to get the acknowledgements of download checksum okay, write to Eeprom is okay and verify Eeprom is okay - note these after download are zero bits, not one bits ( that had me thinking things were not working for a while ).
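
On the PC side, turning the bytes that come back into bits could look like the sketch below. It assumes the responses follow the same convention as the data sent to the Prop (1T low, seen as $FF, is a one; 2T low, seen as $FE, is a zero), and that an acknowledgement is a zero bit as described above. The function names are invented for illustration.

```python
def response_bit(rx_byte):
    """Map one byte received from the Prop to the bit it represents."""
    if rx_byte == 0xFF:   # only the start bit is low: 1T pulse, a one
        return 1
    if rx_byte == 0xFE:   # start bit plus one low data bit: 2T pulse, a zero
        return 0
    raise ValueError(f"unexpected response byte ${rx_byte:02X}")


def acknowledged(rx_bytes):
    """True if every acknowledgement came back as a zero bit."""
    return all(response_bit(b) == 0 for b in rx_bytes)


if __name__ == "__main__":
    print(acknowledged([0xFE, 0xFE, 0xFE]))  # checksum, write, verify all okay
```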

I've attached my version of the downloader to add to the other examples. VB6 so needs the VB6 runtime installing plus MScomm32.ocx. Tested under Win 98SE and Win XP SP2.

jazzed
05-19-2008, 12:47 AM
Hippy, thanks for sharing this package ... works like a charm.
Have you given thought to making a command line version with parameters?
I could use such a feature in "Eclipse" with C projects. TIA

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed

hippy
05-19-2008, 03:56 AM
Glad it's been useful.

Looking at the source code I appear to have split the GUI from the main body so it should be fairly easy to change to a command line version. Being VB6 it won't be a true command line / console version but a cheat of using a not shown main form but no one should ever notice that.

Any particular command line switches you / Eclipse would like ? Should be easy enough to add and change as necessary. I usually try and do *nix -switch and windows /switch support to cater for whatever people prefer.

jazzed
05-19-2008, 05:08 AM
Hi Hippy,

Thanks for giving this command line request some consideration. Since you are saving all parameters in the system registry, a very simple but non-intuitive route would be to let the user set basic parameters with the GUI (could be same executable if flags are detected) or command line with flags so that they are "sticky" unless overridden by a command line flag.

This flexibility would allow each user to select their com port without changing the committed CVS source script (too bad VB6 mscomm does not have autodetect). Having a "relative" path for the binary file would also be nice where source control is used. The file to load would be the most often changed parameter. The Eclipse "user defined" environment variables don't seem to get passed to the shell for make, etc....

The success MsgBox would go away of course, and failure messages would be written to command window (I have no idea how to print to stdout in VB).

Given the previous description, you probably have a good idea of parameters by now. I'm thinking the parameters below would be useful. Any would be optional to change the "registered" state except for eeprom, help, and verify.

-c # | such that # is com port number ... optional
-e | program eeprom ... program eeprom if and only if present
-f path\filename | path is absolute or relative and filename is binary file ... optional
-h (or --help) | for parameter help ... show help if and only if present
-v | ... verify for eeprom if and only if present ?
Hope this is not too much to consider. Thanks again.
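
For illustration only, the flags jazzed proposes could be parsed along these lines. This is a sketch in Python, not anything to do with how the VB6 tool handles its command line; the program name and the destination names are hypothetical, and -h/--help comes for free from argparse.

```python
import argparse


def build_parser():
    """Parser for the proposed loader switches (names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="loader")
    parser.add_argument("-c", dest="com_port", type=int,
                        help="com port number (optional)")
    parser.add_argument("-e", dest="eeprom", action="store_true",
                        help="program eeprom only if present")
    parser.add_argument("-f", dest="filename",
                        help="absolute or relative path of the binary file")
    parser.add_argument("-v", dest="verify", action="store_true",
                        help="verify eeprom only if present")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-c", "3", "-e", "-f", "demo.binary"])
    print(args)
```

Options left out on the command line come back as None or False, which is where the "sticky" registry values would be substituted.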

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed

hippy
05-19-2008, 08:32 AM
Probably not entirely Eclipse friendly but a rough and ready first attempt. For example ...

AiLoad32 /?
AiLoad32 -h
AiLoad32 -find
AiLoad32 -com:3 -identify
AiLoad32 -com:3 c:\mydir\myfile.eeprom
AiLoad32 -com:3 -e c:/mydir/myfile.eeprom

-com:<n> is com port, -e writes to eeprom, no parameters and the GUI launches.

This is callable from within any app but only reports by MsgBox. I've not had brilliant success with VB6 and STDOUT so it will have to go in the To Do pile as it could take a while. Someone with VB.Net experience should be able to turn it into a proper console app.

I've found a bug which I'll hunt down and fix - Switch the Prop off during download, and it still reports download success !

jazzed
05-19-2008, 09:39 AM
OK :)
I was able to add the tool to a makefile to download. Message boxes pop up on pass or fail ... generally a good thing in that it doesn't make a zombie program. Tried many ways to define -com:<port> on a "user" basis for the makefile to pick up, but nothing worked. I'll hard-code hack it in for now, though using CVS for sharing would make this unpalatable ... since I'm the only one who cares now, that is not so bad I guess.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed

Luis Digital
05-19-2008, 09:49 AM
IAN STROME said...
Many thanks to Mike + Andy for those links.
Plenty there to chew on. I'm trying to get the Prop to talk to FPC under DOS, i.e.
(NO WINDOZE). The Python code was a bit involved, too many parentheses.
Finally worked out the LFSR was a pseudo-random sequence generator, as used
on the secure mil stuff I worked on 30+ years ago, only instead of using the full
byte it only uses bit 0, hence the stream of 250 1's and 0's.
I will post again when I get it working.
Regards to all Ian

:- All tolerances add up in the same direction.


I modified code that someone from Parallax posted in a forum. It works in Windows and Linux.

It should give you a head start, since many functions are the same under DOS, "Delay(ms)" for example.

It should also run on Mac OS X without problems after a few small modifications.

Free Pascal (http://www.luisdigital.com/programacion/fpc/) is the best.

Good luck!

davidsaunders
04-21-2011, 03:55 PM
OK this seems to be the best place for this. What is the slowest serial that can be used to load the prop directly? I wish to get a Prop loader that will work at 19200 baud, is this even possible?

This way I can start work on a compiler and loader for older computers to use in loading and programming for the Propeller. As most of you know I am working on an Amiga clone; as such many of my potential customers still use older (first gen) Amiga (A1000, A500, A2000, A600) and Atari ST (520/1040 ST/STE/STF/STFM) computers and I would like them to be able to redo the firmware without having to use the system that they get from me if they need or want to.

Mike Green
04-21-2011, 04:43 PM
There is a tight timeout in the initial identification exchange between the PC and the Prop's bootloader. You'd have to look at the ROM loader source code to figure out the worst case time limit, but I think you need around 115 kbaud on the PC side for this to work. If you "bit-bang" the download protocol rather than use a UART, you ought to be able to use something like an Atari. Remember that the Prop is using a self-timing protocol and the PC end uses a UART to generate it because it has to. The actual bit rate for the identification portion of the protocol is around 12 kbit/s because the PC end is using a whole character frame for a single bit. It's only for the actual program download where 3 bits are packed into a character frame.
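
The arithmetic behind those figures, as a small sketch: each 8N1 character frame is 10 bit times, so the protocol bit rate is the baud rate divided by 10 and multiplied by the number of protocol bits carried per frame. The function name is made up, and 115200 baud is assumed as the PC-side rate.

```python
def protocol_bit_rate(baud, bits_per_frame, frame_bit_times=10):
    """Protocol bits per second when each UART frame carries bits_per_frame."""
    return baud / frame_bit_times * bits_per_frame


if __name__ == "__main__":
    print(protocol_bit_rate(115200, 1))  # identification: one bit per frame
    print(protocol_bit_rate(115200, 3))  # download: three bits per frame
```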