P2 bootloader challenge
Peter Jakacki
Posts: 10,193
While I am waiting for some initial bootloader code to be included in the P2 FPGA images, I have been busy writing and testing Tachyon for the P2. The problem is that I have a kernel that gets loaded with PNut, and I then copy and paste multiple source code files into Tachyon to develop my final image. If I had the bootloader in ROM I could just have it load an image I have saved onto serial Flash or SD. That got me thinking: why don't I just write a bootloader, so that for the moment I can use PNut to load the bootloader, which will then load my image from Flash or SD? By doing so I will prove or disprove the methods that I advocate.
Now I throw the same challenge out to all those rather vocal forumistas to get off their "buts" and prove you can do it your way rather than saying "but" all the time. While Chip is taking a Sunday break why don't we try some methods and present the results for Chip to review this week? I know I won't have any problems at all with a bootloader and this at least solves my immediate problem to a small extent.
I use the same SPI routines for Flash and for SD which also include block transfer functions too. To kick that part of it off here are the basic SPI functions although all my source code is out there in my Dropbox.
'********************** SPI READ/WRITE *********************

' SPI>BUF ( dst cnt -- )
SPI2BUF         mov     R1,tos
                wrfast  #0,tos1
.L0             call    #SPIRD
                wfbyte  tos
                djnz    r1,#.L0
                jmp     #DROP2

' SPIRD ( dummy -- dat )
SPIRD           rep     @.end,#8        ' 8 bits
                xor     outa,sck        ' clock
                xor     outa,sck
                test    ina,miso wc     ' read data from card
                rcl     tos,#1          ' shift in msb first
.end            ret

' BUF>SPI ( src cnt -- )
BUF2SPI         mov     R1,tos
                rdfast  #0,tos1
.L0             rfbyte  tos
                call    #SPIWR8
                djnz    r1,#.L0
                jmp     #DROP2

' SPIWR8 ( byte -- byte )
' Shift 8 bits from data[0..7] out and leave data on stack (restored with other bytes zeroed)
'
SPIWR8          shl     tos , #24       ' left justify 8-bit data s
'
' SPIWR ( data -- data<<8 )
'
SPIWR           rep     #3 , #8
                rol     tos,#1 wc       ' output next msb
                muxc    outa,mosi
                xor     outa,sck        ' clock
                xor     outa,sck        ' clock
                ret
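For readers without the FPGA image to hand, here is a rough Python model of the bit shuffling in SPIWR8 and SPIRD above. It is only a sketch of the shift/rotate logic: there are no pins or timing in it, and the names `spi_wr8`, `spi_rd`, `emit_bit` and `sample_bit` are made up for illustration, not part of Tachyon.

```python
MASK32 = 0xFFFFFFFF

def spi_wr8(tos, emit_bit):
    """Model of SPIWR8: left-justify the byte, then rotate out 8 bits, MSB first."""
    tos = (tos << 24) & MASK32                 # shl tos,#24
    for _ in range(8):                         # rep ...,#8
        carry = (tos >> 31) & 1                # rol tos,#1 wc -> C = old bit 31
        tos = ((tos << 1) | carry) & MASK32
        emit_bit(carry)                        # muxc outa,mosi, then a clock pulse
    return tos                                 # byte is back in bits 0..7

def spi_rd(tos, sample_bit):
    """Model of SPIRD: sample MISO into carry, then shift carry into tos."""
    for _ in range(8):
        carry = sample_bit() & 1               # test ina,miso wc
        tos = ((tos << 1) | carry) & MASK32    # rcl tos,#1
    return tos

# Loopback: feed the bits written by spi_wr8 straight back into spi_rd.
wire = []
left = spi_wr8(0xA5, wire.append)
bits = iter(wire)
got = spi_rd(0, lambda: next(bits))
print(wire, hex(left), hex(got))   # [1, 0, 1, 0, 0, 1, 0, 1] 0xa5 0xa5
```

The eight rotates bring the byte back round to the low end of the long, which is why the comment on SPIWR8 says the data is left on the stack with the other bytes zeroed.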
Here I dump memory from both devices which use these common SPI functions:
TF2$ 0 $40 SF DUMPL
00.0000: B412.ECFC B410.ECFC B40E.ECFC B40C.ECFC ................
00.0010: B40A.ECFC B408.ECFC B406.ECFC B404.ECFC ................
00.0020: 2802.ECFC B400.ECFC 207E.65FD 7000.00FF (....... ~e.p...
00.0030: 003E.04F6 224A.00F6 024A.84F1 0048.04F6 .>.."J...J...H.. ok
TF2$ 0 $40 SD DUMPL
00.0000: 4C42.3250 0069.7A00 0000.4DB1 0000.0000 P2BL.zi..M......
00.0010: 0000.0000 0000.0000 0000.0000 0000.0000 ................
00.0020: 0000.0000 0000.0000 0000.0000 0000.0000 ................
00.0030: 0000.0000 0000.0000 0000.0000 0000.0000 ................ ok
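As a quick cross-check of the SD dump, the first row can be decoded in a few lines of Python. This is only a sketch: it takes the four longs exactly as printed and reads each one low byte first, which is what makes the hex line up with the ASCII column for the SD row (the SF rows appear to pair up the other way round, so this applies to the SD row only). What the fields after the "P2BL" tag mean is not stated in the post, and the helper name `printable` is made up.

```python
import struct

# First row of the SD dump, longs as printed (the dot is just a separator).
longs = [0x4C423250, 0x00697A00, 0x00004DB1, 0x00000000]

# Pack each long low byte first (little-endian).
raw = b"".join(struct.pack("<I", v) for v in longs)

def printable(data):
    """Render bytes the way the dump's ASCII column does: '.' for non-printables."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)

print(raw[:4])          # b'P2BL'
print(printable(raw))   # P2BL.zi..M......
```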
I think I could program up a Prop hooked up to the serial port so that it loads the bootloader automatically on reset, to save me having to run PNut. But that is an extra step which would prove unnecessary if Chip starts implementing some of these bootloader functions.
Finally, this is the reason I need a bootloader: look at my modules (you don't want to see the dictionary list).
TF2# Q
--------------------------------------------------------------------------------
CODE MEMORY @ $00.82D7 for 29,399
NAME MEMORY @ $00.ABBE for 11,709
DATA MEMORY @ $00.E2DE for 329
FREE MEMORY = 10,471
MODULES LOADED:
73DE: P2ASM.fth Tachyon Forth inline assembler for the P2 151022-0000
63E3: EASYNET.fth WIZNET NETWORK SERVERS 151111.0800
5773: W5500.fth WIZNET W5500 driver for TF2 151110.1500
4839: EASYFILE.fth FAT32 Virtual Memory Access File System Layer V1.1 for TF2 151119-1200
41B3: SDCARD.fth P2 SD CARD Toolkit - 151119.1200
2A40: EXTEND.fth TACHYON FORTH EXTENSIONS for the P2 - 151121-1400
MON 12:07:04
--------------------------------------------------------------------------------
ok
TF2#
What? You do want to see the dictionary listing?
(sorry, the forum said it was too big)
Comments
I think the SPI speed needs to be limited to no more than ~20MHz to cover the most common suspects.
You may want a NOP in the CLK pulses, to bring the duty cycle closer to 50%?
If the file happens to be a single fragment then there would be only a single entry in the block list. Just like yours.
Then that 'very minor technicality' becomes very costly indeed.
Smarter to get this stuff right from the beginning, not kick the can down the road and rely on some later checks to (hopefully) catch all the "Don't worry about's"
Do, or do not, there is "no guessing" or naysay in this thread which is about actually putting your code where your mouth is.
I agree at the moment. Unless Chip says otherwise, the overall plan is to get the smart pins done right now, and continue to refine the core design. We are likely to find a few more teething pain type things to get cleaned up.
Once it's starting to look like a chip, there will be some time to settle the ROM. Synthesis and some design checks took a while last time, and that was a great time to sort out the ROM. Bet it all goes about the same on this run.
Getting a "just boot for now" job done, means we can simulate a lot of stuff, and doing that is important.
@Peter: Holy Smile! Well, your Forth-Fu is beyond question at this point.
Peter needs a boot loader to continue on his path, and we all need to hook up some SD cards and try some of this stuff out.
See the video driver efforts?
Same deal. Those will need to get some parametric type improvement. In fact, I'm late on some sync code I want to get done for TV... Point is, we build some rough code, test, explore the features, and what will fall out of that is some nicely refined code that can handle clock speed changes, etc... In fact, sync of frequency changes for pixel clocks is one thing that fell out of the current body of, "let's put stuff on the display" code out there right now.
Way back on P1, it was the same way. Walk before you run. It won't hurt a bit to toss some temporary SD card code in there. It's all gonna get redone and sorted as the final image gets done.
Here you go, since you claim to want to 'just do it' - enjoy.
I don't have to prove that I can get things done because I've been busy doing it. I have no problems writing my own bootloader that works and I have even shown the code that is common to both serial Flash and SD.
PLEASE - WRITE CODE - MAKE IT WORK - WE WILL ALL BENEFIT
?? Err nope.
The Boot clock will be much lower than the PLL ability (just as it is now, in P1), but you still need to avoid too-narrow clock pulses being the limiting factor.
The NOP fixes a clock skew issue, but you are welcome to provide a smaller means to fix the clock skew.
Peter is asking for code - I provided some.
You also seem to think the final ROM code will be radically different - why ?
If you get the details right now, there should be no reason for radically different ROM code.
Perhaps we all could be very enlightened with an example.
"there should be..."
Sure, and there is how it's actually being done. Two different things. And yes, that means the ROM code is yet to be seriously sorted out. We are running on a small stub for testing the core design, just like last time.
Now, I'm going to get back on some streamer tests and ideally get the VSYNC stuff I've had queued done.
@Peter, everyone: Is the SD card on the DE2 connected to the FPGA image?
Example already given above, maybe you missed it ?
http://forums.parallax.com/discussion/comment/1355551/#Comment_1355551
I suggest instead of arguing, you just go do it yourself. That way you can prove what you want works.
My bootloader is posted in the PropOS thread here
http://forums.parallax.com/discussion/138251/a-propeller-os-that-can-run-on-multiple-hardware/p1
It only requires converting to P2 code, which I will do sometime soon.
It works on the P1. I ship both commercial and hobbyist products using this bootloader.
And I am just about to release a new P1 board that includes an SD card with this bootloader.
Ok, this is getting really out there in the weeds.
What Peter wants is for people to post up SD card boot code methods. A lot has been proposed. Chip is asking for specifics, and the best way to do that is to provide some code.
Now, say his code was modified to do the FAT file blocks, or some other scheme? That's something that is wanted.
Or, ignore his code and contribute a PASM routine to read boot files from an SD card somehow. That's wanted too.
There has been a lot of discussion on various methods, and maybe it's best if some code got written to explore the merits of those methods. That is what is wanted.
It's wanted to see how many instructions it might take, difficulty of implementation, etc... and it's wanted for others to test too. Does it work on their board, their SD card, format, etc.. ?
With these code bodies, we can make some better decisions. If they are small, or perhaps can be combined to use common routines, a nice SD boot scheme that makes a lot of people happy might not take all that many instructions or time.
This means that you can't write code that toggles a clock pin and immediately reads a data pin.
I've got this SPI code which I've been using to read and write reliably from serial Flash, SD cards, and WIZnet chips so I'm not sure what you are saying.
I'm curious about it too.
Make sure to document it in your GDoc file instead of here. Otherwise, it will eventually get buried...
OUCH! This is going to be a BIG source of problems for the unwary!
It seems that three clocks are needed between changing OUTA/OUTB and seeing the effect on INA/INB. This means that four clocks would be safe for reading SPI parts, as three is cutting it maybe too close. That's a two-instruction delay needed between outputs and inputs.
This means for a fast synchronous serial input, clock outputs need to be staggered relative to inputs.
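To picture the latency point, here is a toy Python model of an output looped straight back to an input through a fixed delay. It is a sketch under stated assumptions: the 3-clock figure is the one quoted above, 2 clocks per instruction is inferred from "four clocks" being called a two-instruction delay, and `PinModel` and `read_after` are invented names. A real SPI part adds its own clock-to-output delay on top of this.

```python
from collections import deque

LATENCY = 3            # clocks from OUT change to IN seeing it (figure quoted above)
CLOCKS_PER_INSTR = 2   # assumption: inferred from "four clocks" = two instructions

class PinModel:
    """Output register looped back to the input through a fixed-latency delay line."""
    def __init__(self, latency=LATENCY):
        self.out = 0
        self.pipe = deque([0] * latency, maxlen=latency)

    def tick(self, clocks=1):
        for _ in range(clocks):
            self.pipe.append(self.out)   # oldest sample falls off the front

    def read(self):
        return self.pipe[0]              # what the input register shows right now

def read_after(clocks):
    """Set the output high, wait the given number of clocks, then read the input."""
    pin = PinModel()
    pin.out = 1
    pin.tick(clocks)
    return pin.read()

seen = [read_after(n) for n in range(6)]
instr_gap = -(-LATENCY // CLOCKS_PER_INSTR)   # round up to whole instruction times

print(seen)                                    # [0, 0, 0, 1, 1, 1]
print(instr_gap, instr_gap * CLOCKS_PER_INSTR) # 2 4
```

In this model a read taken fewer than three clocks after the edge still returns the old level, and rounding up to whole instructions gives the four-clock gap mentioned above.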
Do you have the Clk-op and tsu numbers? That +1 SysCLK margin needs to be enough to satisfy the SPI Clk-op and the P2 tsu values.
Best to get the details right early, and avoid thermal or process issues later.
I see a small 10MHz SPI EE device specs 40ns MAX Ck-op & Twh,Twl 40ns Max
A SPI SRAM specs 32ns in E-Temp version, which are both > 1 added SysCLK
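The comparison can be put into numbers with a few lines of Python. This is a sketch only: the 40 ns and 32 ns figures are the ones quoted above, while the 80 MHz SysCLK is a hypothetical value chosen for illustration (the thread does not state the clock), and `sysclks_to_cover` is an invented helper.

```python
def sysclks_to_cover(delay_ns, sysclk_hz):
    """Whole SysCLK periods needed to span a delay (integer ceiling division)."""
    return -(-delay_ns * sysclk_hz // 1_000_000_000)

SYSCLK_HZ = 80_000_000   # hypothetical SysCLK, not taken from the thread

for name, delay_ns in (("SPI EE, 40 ns", 40), ("SPI SRAM, 32 ns", 32)):
    print(name, "->", sysclks_to_cover(delay_ns, SYSCLK_HZ), "SysCLKs")
# SPI EE, 40 ns -> 4 SysCLKs
# SPI SRAM, 32 ns -> 3 SysCLKs
```

At that assumed clock both parts need more than one extra SysCLK before their output is valid; at a slower SysCLK the count drops accordingly.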
Hehe, you mean to say a NOP (or two) is needed, like this ?
@jmg, you are stuck in a clock loop harping on about "your" nop, as if that is something only you could come up with.