P2D2, TAQOZ and HyperRAM design decisions
Peter Jakacki
This P2 sure is keeping me busy but there are a couple of design decisions I thought I'd get some feedback on.
First off I've tested out a lot of extras for TAQOZ and I'd like to include them in the ROM but I only have maybe 12kB of ROM available. The current ROM is this:
ADDR    BYTES  FUNCTION
========================================
$FC000   1376  BOOTER
$FC560   1304  SD BOOT VARIABLES & CODE
$FCA78   1455  LMM DEBUGGER
$FD027  12195  TAQOZ
$FFFCA     54  FREE
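As a quick sanity check on the map above, here is a minimal Python sketch that verifies each region starts where the previous one ends and totals the sizes. The names `ROM_MAP`, `ROM_END` and `check_map` are just for illustration, and `ROM_END = $100000` is an assumption inferred from the last row ($FFFCA + 54).

```python
# ROM regions copied from the table: (start address, size in bytes, function)
ROM_MAP = [
    (0xFC000, 1376, "BOOTER"),
    (0xFC560, 1304, "SD BOOT VARIABLES & CODE"),
    (0xFCA78, 1455, "LMM DEBUGGER"),
    (0xFD027, 12195, "TAQOZ"),
    (0xFFFCA, 54, "FREE"),
]
ROM_END = 0x100000  # assumed end of ROM, inferred from the last row


def check_map(rom_map, rom_end):
    """Return the total size, raising ValueError if regions leave gaps or overlap."""
    for (addr, size, name), (next_addr, _, _) in zip(rom_map, rom_map[1:]):
        if addr + size != next_addr:
            raise ValueError(f"{name} does not end at ${next_addr:05X}")
    last_addr, last_size, last_name = rom_map[-1]
    if last_addr + last_size != rom_end:
        raise ValueError(f"{last_name} does not end at ${rom_end:05X}")
    return sum(size for _, size, _ in rom_map)


if __name__ == "__main__":
    print(check_map(ROM_MAP, ROM_END), "bytes in total")
```

The regions are contiguous and add up to 16384 bytes, so the 54 bytes marked FREE really is all that is left in a 16k ROM.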
TAQOZ is made up of PASM, HUBEXEC, and 16-bit wordcode as well as a dictionary, but I have added many extras and the dictionary has grown too. How do I fit all that into 12k? Of course I could remove some other useless/useful code but that's not enough. However, if I take the binary compiled for testing in RAM and compress it with LZMA, it comes down to around 8k. I'm looking at what would be required in P2ASM to extract the contents of the archive and load it into RAM. Do you think I could write a decompressor in around 2k of assembly?
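The budget question above can be tried on a PC before writing any assembly. This is a rough sketch using Python's standard `lzma` module. The 12195-byte slot is the current TAQOZ entry in the ROM table and the 2k decoder size is the guess from the post. The function names are hypothetical, and a real image may compress differently with other LZMA settings.

```python
import lzma

TAQOZ_SLOT_BYTES = 12195   # current TAQOZ region in the ROM table
DECODER_BYTES = 2 * 1024   # guessed size of the assembly decompressor


def compressed_size(image):
    """Size of the image after LZMA compression (legacy .lzma container)."""
    return len(lzma.compress(image, format=lzma.FORMAT_ALONE))


def fits_in_slot(image, slot=TAQOZ_SLOT_BYTES, decoder=DECODER_BYTES):
    """True if the compressed image plus the decoder fits in the ROM slot."""
    return compressed_size(image) + decoder <= slot


if __name__ == "__main__":
    # With the figures from the post: 8k compressed + 2k decoder against the slot
    print(8 * 1024 + DECODER_BYTES, "of", TAQOZ_SLOT_BYTES, "bytes")
```

Going by the numbers in the post alone, 8k of archive plus a 2k decoder comes to 10240 bytes, which leaves roughly 1.9k spare in the existing slot.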
Second, the P2D2 hardware now has a tiny 3x3mm QFN20 Busy Bee micro supervising the power, reset, DTR, and all 6 boot lines. This chip is initially programmed from the P2 itself, and I even have an assembler for it written in TAQOZ. This means it provides the brown-out and power-on reset, optional watchdog, and DTR edge detection, plus it can also load up the P2 directly, even replacing the boot ROM in RAM. It might do this by reading the SPI Flash or by using a Busy Bee with larger Flash, in which case compressing TAQOZ in the ROM would not be needed.
Thirdly, I don't think it would hurt to include HyperRAM or even QSPI RAM but certainly HyperRAM is available in tiny 16M byte BGA24 packs which I could integrate into the P2D2 itself as an option. Due to the bus speed that they can handle plus what we would need them to handle they are best placed close to the P2 itself, that is, on the P2D2 which is already a very tight and compact design.
Note also that rather than relying on a multilayer PCB (although it could be manufactured that way), there is a separate thermal PCB that can be surface-mount soldered to the bottom of the P2D2, with the advantage that both sides of this PCB can have rather thick copper without restriction. However, this will only be an option as my P2D2 runs at 240MHz without any cooling at present. The thermal PCB has cutouts to allow for the microSD socket to be used on that side of the board. Also the P2D2 has a 0.2" strip on the serial end with a small reset switch and larger LEDs that can be mounted on either side of the board. Maybe the thermal PCB could be loaded with a whole stack of LEDs and buffers!
Comments
Excuse the lack of formatting.
If the 6x8 BGA24 is too tight to fit, there are also SO8 PSRAM parts like IPS6404L-SQ-SPN, LY68L6400SLIT - these look to have similar refresh rules as HyperRAM.
If you do decide to bump the Flash size, I see that in QFN20 you can get 40 kB of Flash in the EFM8UB3 (USB for free?), and if you use QFN24 (3x3) there are 32 kB and 64 kB Flash choices.
The UB3 pinout is a slight change, as they nudge in 2 dedicated USB pins. It also comes with a USB bootloader, which may simplify things.
https://docs.google.com/spreadsheets/d/1385RUevHgwYEnhL7wUaVAYPej5wvg9WPlq1Zm8usNqU/edit#gid=2094062265
https://www.cemetech.net/forum/viewtopic.php?t=11406&start=0
a poster there claims
"Also, if anyone is looking for alternatives to this, check out this topic for a similar compression / decompression algorithm
http://www.cemetech.net/forum/viewtopic.php?t=11292
The decompression code is considerably smaller and it yields similar compression ratios given the same data.
The compressor is somewhat slower.
I don't know if the decompression is faster, slower, or nearly equivalent.
I prefer lz4 because it's pretty easy to install with a system package manager, but take your pick based on where your priorities are."
dlz77 code here - seems to be simple lookups, which explains why it is fast. I think 36 LOC vs 72 LOC above ?
ZX7 unpacker in 142 bytes for 65C02
LZ4 unpacker in 136 bytes for 65C02
LZ compressor/decompressor
http://pferrie.host22.com/misc/appleii.htm
Please whatever you decide, keep the memory optional.
If you start to add memory / peripherals to your board it would likely start to require pins. I am designing a nice PCB to make good use of a P2D2 and I am already accessing all the P2 IO pins apart from SD/Flash! It's almost done now.
Was recently looking at that small Quad SPI espressif PSRAM myself as a possible on board RAM solution, plus keep the ability to add other memory types via breakouts etc. What is nice is that its SO-8 footprint is generic and allows other (Q)SPI devices etc.
Update: Yeah maybe your thermal PCB could have the footprint for a memory device..
@thej - thanks for that, I'm converting the 65C02 code across as an exercise, and of course it takes fewer instructions than on the 65C02 although it may take more memory. We'll see.
@rogloh - The RAM will be optional and whatever 11 I/O it uses it will be just like the boot pins in that they are all totally accessible. That was the other option too since the HyperRAM is a fairly large 6x8mm package but the problem is that it would mean not having as thick a copper layer as I would like. But mind you, the RAM could have its own little pcb to stack onto the back of the P2D2.
Also the HyperRAM will connect up to P47..P57 so that the connections are very short. I'm trying to fit it in just below the 50mil pads at P52..P57 on the bottom but otherwise it will be mounted on a thin pcb that will sit in the same spot and connect to the 50mil pads.
So it also looks like I should be able to compress TAQOZ into the ROM, and when it is called it will simply run a small assembly routine to decompress the image into RAM. That lets me pack a lot into a little for all P2 chips, but the P2D2 will also have the alternate booter that the Busy Bee can load in independent of the Flash or SD.
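To give a feel for how small such a "decompress into RAM" routine can be, here is a Python model of an LZ4 block decoder, one of the formats linked earlier in the thread. It is only an illustration of the byte-level logic a ROM routine would follow, not P2 assembly and not necessarily the format that will end up in the ROM. The function names are made up for this sketch.

```python
def read_length(src, i, length):
    """Extend a 4-bit length of 15 with extra bytes until one is below 255."""
    if length == 15:
        while True:
            extra = src[i]
            i += 1
            length += extra
            if extra != 255:
                break
    return length, i


def lz4_block_decode(src):
    """Decode one raw LZ4 block (no frame header) into bytes."""
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]
        i += 1
        # High nibble: number of literal bytes to copy straight through
        lit_len, i = read_length(src, i, token >> 4)
        out += src[i:i + lit_len]
        i += lit_len
        if i >= len(src):
            break  # the last sequence carries literals only
        # Two-byte little-endian offset back into the output so far
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("bad match offset")
        # Low nibble: match length minus 4
        match_len, i = read_length(src, i, token & 0x0F)
        match_len += 4
        for _ in range(match_len):
            out.append(out[-offset])  # byte-wise so overlapping copies work
    return bytes(out)


if __name__ == "__main__":
    block = bytes([0x35]) + b"abc" + bytes([0x03, 0x00, 0x10]) + b"!"
    print(lz4_block_decode(block))
```

The whole decoder is two length reads and two copy loops, which is why the 6502 versions linked above fit in well under 200 bytes.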
EDIT: Other changes on the P2D2 include:
* 1M pull-down on SD CS - this makes it easier to detect the card, it is either high or low, no need for a float test.
* P-channel to switch power to the microSD to bypass any weird mode - controlled from Busy Bee via I2C or every reset.
Looking at that compression code I pasted above, they mention adjusting the second ccccc / bbbbbbbbbb split to tune the compression.
On P2 code the chunks will all be 32b, so the counts and offsets could all have 4x the reach, and it means any 32b opcode already used (anywhere in the previous 1024 32b words) will compress 2:1.
A scan of the ROM code should let you find the optimal split of c/b bits for P2 opcodes.
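A scan like that could be prototyped along these lines. This Python sketch assumes a hypothetical token layout, not the one from the linked post: a 1-bit flag, then either a raw 32-bit long or a match made of `offset_bits` of distance and `count_bits` of run length, both counted in longs. It only estimates size with a greedy match search and does not produce a real archive.

```python
import struct


def to_longs(image):
    """Split a binary image into little-endian 32-bit longs (tail zero-padded)."""
    image = image + b"\x00" * (-len(image) % 4)
    return list(struct.unpack(f"<{len(image) // 4}I", image))


def estimate_bits(longs, offset_bits, count_bits):
    """Estimated compressed size in bits for one offset/count split."""
    window = 1 << offset_bits   # how far back a match may start, in longs
    max_run = 1 << count_bits   # longest run one token can describe
    n = len(longs)
    bits = 0
    i = 0
    while i < n:
        best = 0
        for start in range(max(0, i - window), i):
            run = 0
            while run < max_run and i + run < n and longs[start + run] == longs[i + run]:
                run += 1
            best = max(best, run)
        if best:
            bits += 1 + offset_bits + count_bits   # flag + match token
            i += best
        else:
            bits += 1 + 32                         # flag + raw long
            i += 1
    return bits


def best_split(longs, token_bits=15):
    """Try every split of token_bits; return (bits, offset_bits, count_bits)."""
    return min(
        (estimate_bits(longs, b, token_bits - b), b, token_bits - b)
        for b in range(1, token_bits)
    )
```

Running `best_split(to_longs(rom_image))` on the real ROM binary would show which split suits P2 opcodes. With a 10/5 split a repeated long costs 16 bits instead of 32, which is the 2:1 figure mentioned above.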
1) how much longer will it take to boot?
2) what extra features are you planning to add into the newly freed extra space?
At present TAQOZ needs a terminal sequence to become active so "boot time" would seem unimportant but I want the new ROM to fall through to TAQOZ if no boot was found in the timeout period. So even if TAQOZ took a second to boot (although I doubt it would be that long), it wouldn't matter.
The extra features are to do with the SD and FAT32, allowing files to be renamed, copied, created, loaded, and saved etc. With this, plus some standalone features that let it run a keyboard and VGA monitor, it would be possible to write code without any special tools or even a PC. If I can fit a logic analyser in there too, I will.
Nonetheless, if we had a choice to have more or have less for the same "price", which would we choose?
If you do manage to get it to boot to a minimal dev environment with no external PC/serial console needed that would be a great achievement indeed. One tricky part might be in determining/allocating which pins to use for the KB/display to keep things flexible as this would vary with HW implementation.
The boot pins are fixed and we work with that. If we have fixed VGA and PS/2 pins for TAQOZ in boot ROM then that shouldn't be a problem either. If a system config is available then it could use that, much like the clock config in Flash or SD that I've mentioned before. So if we have a bare P2 chip that we have placed on a board and for some reason we do not have a PC, we can still talk to that chip and write code for it and on it. Imagine if the majority of PCs were struck by a super StuxCryptoDoomFicker virus or a Winever update solved the problem of the BSOD and made it black instead? Maybe, maybe not, but for no cost we have an independent software development tool that doesn't need a PC to get it up and going and at the very least a real computer on a chip, locked into the chip.
As mentioned in my many posts, I have VGA and keyboard capability built-in, that is what I have been using now for some time, running 5x7 font over a 640x480 8-bpp VGA along with PS/2 keyboard. Of course I could add a mouse too and some basic editing facility. It's much more than DOS ever did but even that needed to be booted from a disk.
Well yes, as I am developing the tools that is, so I still need a PC in the meantime. But I can load/run/compile files as standalone and even assemble and program 8051 code for the helper micro. I need to write a simple editor I suppose, but I have quite a few aspects of P2 development on my plate, not just the software. Right now I am making up schematic symbols and pcb footprints for HyperRAM and integrating that into my P2D2 revision, for one.
I also have added single key commands to TAQOZ so that function keys which I represent with hex codes $F1 to $FC and any other 8th bit code will automatically search for a word with the keycode in its name in the TAQOZ dictionary. So if I define a word kFC then when I hit F12 it immediately searches the dictionary for kFC and if found executes it. In this case I just get it to list the files in short format with:
While I can view bmp files standalone, I want to be able to play wav files next, which is fairly easy, especially for the P2. Then I can write some simple games such as a VGA version of the Breakout game I've had running on the P1. But I'm not really interested in the games (yeah yeah), but they are a good platform for testing software.
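The single-key command lookup described in this post can be modelled in a few lines. This is a Python sketch of the idea only, not TAQOZ code: the dictionary here is an ordinary Python dict and the function names are invented for the example.

```python
def key_word_name(keycode):
    """Name of the dictionary word for a key code, e.g. $FC (F12) -> 'kFC'."""
    return f"k{keycode:02X}"


def dispatch_key(keycode, dictionary):
    """Run the word bound to an 8th-bit key code; return True if one was found."""
    if keycode < 0x80:
        return False  # ordinary 7-bit characters are not looked up
    word = dictionary.get(key_word_name(keycode))
    if word is None:
        return False
    word()
    return True


if __name__ == "__main__":
    words = {"kFC": lambda: print("listing files...")}
    dispatch_key(0xFC, words)   # F12 is defined, so its word runs
    dispatch_key(0xF1, words)   # F1 has no word defined, nothing happens
```

Because the binding is just a naming convention, defining or redefining a word such as kFC is all it takes to remap a function key.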
I've translated the ZX7 decoder (standard version) from Z80 to p2asm and it's only ~50 longs for code and registers. I think (a) most of the ROM should be compressed and (b) smaller size is more important than quicker decompression.
ZX7
http://www.worldofspectrum.org/infoseekid.cgi?id=0027996