What has changed since v26? The loader doesn't seem to work now. Sysclock default? CLKSET changes/defaults?
Oh oh OH, never mind. PNut now downloads in Wine!!!
We are using Prop_Txt exclusively to download now, so there are no funny binary characters to upset other OSes and whatnot. That must be why it works now. No more two-stage loading, either.
USB demo running OK on P2v30, and the CRCNIB instruction is the cat's meow :-D
On-the-fly CRC calculation in four instructions and the CRC16 lookup table is history! Haven't gotten to the CRC5 code yet, but should be able to soon.
How much code space does that table removal save?
JP/JNP were removed from the design a long time ago because they took too long to settle, and those slots became do-nothing. However, I just now thought to remove them from the assembler. Sorry about that.
The CRC16 table lookup code is two instructions longer and the table takes up half of the host cog's LUT space.
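For anyone wondering what the table-free approach amounts to, here is a minimal Python sketch (not P2 code) that compares a nibble-at-a-time CRC16 update against the classic 256-entry lookup table. It assumes the standard USB CRC16 parameters (polynomial x^16 + x^15 + x^2 + 1, all-ones seed, inverted result, data taken LSB first). The function names are made up for illustration, and the bit ordering is not claimed to match what CRCNIB does in hardware.

```python
POLY16_REFLECTED = 0xA001  # x^16 + x^15 + x^2 + 1, bit-reversed for right shifts


def crc_step_bit(crc, bit, poly):
    """Advance a right-shifting (LSB-first) CRC register by one data bit."""
    feedback = (crc ^ bit) & 1
    crc >>= 1
    if feedback:
        crc ^= poly
    return crc


def crc_step_nibble(crc, nibble, poly):
    """Advance the CRC by four data bits, least significant bit first."""
    for i in range(4):
        crc = crc_step_bit(crc, (nibble >> i) & 1, poly)
    return crc


def crc16_usb(data):
    """Table-free CRC16: two nibble steps per byte (hypothetical helper)."""
    crc = 0xFFFF
    for byte in data:
        crc = crc_step_nibble(crc, byte & 0xF, POLY16_REFLECTED)
        crc = crc_step_nibble(crc, byte >> 4, POLY16_REFLECTED)
    return crc ^ 0xFFFF


def build_table():
    """Build the 256-entry table that the table-driven variant needs."""
    table = []
    for value in range(256):
        crc = value
        for _ in range(8):
            crc = (crc >> 1) ^ POLY16_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = build_table()


def crc16_usb_table(data):
    """Table-driven CRC16 for comparison: one lookup per byte."""
    crc = 0xFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFF


print(hex(crc16_usb(b"123456789")))  # 0xb4c8
```

Both variants give the same result; the trade-off is the 256 entries of table storage against a few more update steps per byte.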
Chip
Here's my code that ran fine on V29 P123-A9.
The only changes made for V30 were 3 x DJNS instructions updated to DJNF.
The problem appears to be hub/CORDIC related, judging by the screen (VGA) artifacts and erratic code execution.
I've been out running around today, so I have not had a chance to look at this code yet. I'm certainly going to look at it when I get home in a few hours.
Attached is SD Boot Test v116 for use with pnut.exe v30.
The program will output to the serial port (P62/63) at 115,200 baud once the SD Card has been initialised and read. Below is what I get on PST.
I would welcome any test results on any SD Cards you may have. While all cards are currently supported, I think we should probably only support Type 3 (SDHC block mode addressing) and later.
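As background on why the card type matters, here is a small Python sketch of the addressing difference. Standard-capacity cards take a byte address in the read/write command argument, while SDHC and later cards take a 512-byte block number. The function name and the `block_addressed` flag are hypothetical, and mapping "Type 3" to block addressing is an assumption based on the description above, not something taken from the boot code.

```python
BLOCK_SIZE = 512  # bytes per SD data block


def read_command_argument(block_number, block_addressed):
    """Return the 32-bit argument for a single-block read (hypothetical helper).

    block_addressed=True models SDHC-style cards (argument is a block number);
    False models standard-capacity cards (argument is a byte address).
    """
    if block_addressed:
        return block_number
    return block_number * BLOCK_SIZE


print(read_command_argument(3, block_addressed=True))   # 3
print(read_command_argument(3, block_addressed=False))  # 1536
```

Supporting only block-addressed cards would remove the multiply and the branch from the boot path.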
I'm running your bug demo code and I'm seeing strange things, for sure.
Okay. I think I know what it is. I made a mistake that caused pipelined and cancelled CORDIC instructions to generate spurious hub commands during the next valid instruction.
I'm recompiling now.
Thanks Chip for checking that.
Of course. Good thing this came up, because I discovered another potential pitfall during hubexec: On the right combination of hub slice timing and hubexec fetching, the slice could come up before 'get' and the flop that only allows one slice opportunity would have gone to 0, causing the instruction to hang. Got that fixed.
New Version 30a at the top of this thread fixes the hub-instruction bug that Ozpropdev found. Currently, there's only an 8-cog Prop123-A9 image, but I'm compiling an 8-cog BeMicro-A9 image that I'll add in the next hour.
Thank you for finding that problem. There are some things that are hard to anticipate. I'm going through the design looking for any other such pitfalls. I don't think there's occasion for any others, but I need to look.
v30a looking good. USB CRC5 calculations all being done using CRCBIT/CRCNIB. All lookup tables gone!
Outgoing CRC5 calculation on a token packet is only about six clocks more expensive than the table lookup, but it is easy enough to interleave between tx byte output. CRC5 verification on incoming token packets is faster than table lookup, because the data plus CRC is 16 bits long and so only CRCNIB is needed. Way too easy.
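To illustrate the verify trick, here is a minimal Python sketch (again not P2 code). It assumes the standard USB CRC5 parameters (polynomial x^5 + x^2 + 1, all-ones seed, inverted result, LSB-first bits) and a token field made of a 7-bit address, a 4-bit endpoint and the 5-bit CRC. The helper names are hypothetical. Running the CRC over all 16 bits of a good token leaves a fixed residual, so verification is just a compare after sixteen bit steps, which is four nibble steps.

```python
POLY5_REFLECTED = 0x14  # x^5 + x^2 + 1, bit-reversed for right shifts
CRC5_RESIDUAL = 0x06    # register value after an error-free 16-bit token field


def crc5_bits(value, nbits, crc=0x1F):
    """Run the CRC5 register over nbits of value, least significant bit first."""
    for i in range(nbits):
        feedback = (crc ^ (value >> i)) & 1
        crc >>= 1
        if feedback:
            crc ^= POLY5_REFLECTED
    return crc


def make_token_field(address, endpoint):
    """Build the 16-bit token field: 11 data bits with the CRC5 in the top 5 bits."""
    data = (address & 0x7F) | ((endpoint & 0xF) << 7)
    crc = crc5_bits(data, 11) ^ 0x1F
    return data | (crc << 11)


def token_field_ok(field):
    """Verify by running the CRC over data + CRC and checking the residual."""
    return crc5_bits(field & 0xFFFF, 16) == CRC5_RESIDUAL


field = make_token_field(0x15, 0xE)
print(token_field_ok(field))      # True
print(token_field_ok(field ^ 1))  # False
```

Generating the CRC covers 11 bits, which is not a whole number of nibbles, so the outgoing side needs some single-bit steps as well.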
Thanks, Chip, for adding CRCBIT/CRCNIB!
Sounding very good - no tables means more room for larger data packet buffers, and a CRC opcode just looks polished.
USB is going to be very important for P2 - have you tried larger packet payloads yet?
IIRC Mouse/Kbd packets are quite small?
Meanwhile, in the small MCU sandpit, the STC8F series indicate USB bootloaders (direct USB pins) on even the cheapest 8 & 16 pin ones. My Chinese is not great, but an earlier schematic showed a 24MHz crystal for USB, which is now gone.
I suspect this is SW based, bit-banged LS-USB as was done on AVR (so miles from passing approvals). More detail on just how they do this is hard to find, but it appears 'good enough' to load code.
Gosh, if we could get soft USB running out of the boot ROM, we could avoid the whole FTDI problem.
But I would be pretty happy with @Cluso99's SD-boot or TAQOZ or both, able to pull needed stuff from SD.
I really do like the decision for the text-based serial boot, this is really system agnostic and will even allow COBOL based systems to boot and program a P2, treating it as a serial printer...
What I am worried about is that all these last-minute changes of the last couple of weeks may introduce side effects.
@Chip, please: tidying things up at the end, before release, is a common urge every programmer has. I do it too; everybody does. But it is a very dangerous thing to do after a long development. It happens that one 'simplifies' stuff that was kept apart intentionally years ago, after long and deep thinking, for a reason that has since been forgotten.
Sure, the CRC stuff is useful, and it is important that you apply your skill to make the instruction set beautiful, squeezing another use out of one or more existing instructions.
That is what makes the P1 so wonderful. A beautiful instruction-set giving access to hardware and misused(?) over the last decade to do amazing things not even planned for when released.
What I hope for is that the instruction-set finally settles down and @Chip could work on SPIN2 again. And the PropGCC guys could start to work on a new compiler with confidence.
Peter J. can handle opcode changes in minutes, but say PropBasic2 or other things will not start before opcodes settle down, I guess.
I was not quite meaning that - the example of small-MCU boot, was really to show how widespread USB is, in the MCU space.
The P2 ROM is too small to hold a USB loader, but a 2nd stage loader could offer USB.
I've not seen large-data-block download numbers yet for P2-USB.
A cheaper, more flexible P2-USB solution than 'the whole FTDI problem' is the new EFM8UB3, which has 40K of flash and costs under 90c. It could provide a useful P2 debug interface and power monitoring/metering support.
Comments
I'll look into it shortly.
Great!
Thanks a lot! Just ordered one. Yihaa!
Excellent news!
The CRC5 can mostly be precalculated anyway.
Here's a txt format image for downloading to the Prop2.
anyways,
Mike