What would be a good idea for a new CPU and platform to try on a P2?
I sort of think it would be nice to have something loosely based on the 6502, but built more for performance. For instance, make all the regular instructions at least 16-bit, and maybe keep the opcodes at 8 bits with variable-length instructions. Then one wouldn't have to get as complex as the '816, with its multiple modes, etc. If X and Y are 16 bits wide, then 20- to 24-bit memory addressing would not be an issue, and Page 0 would be 64K. I guess it would be wise to throw in some 8-bit and 32-bit ops. I'd want RNG, multiplication, and division (with modulus) to be part of the instruction set. Those are things the early machines lacked as instructions.
I don't know if BCD support should be added, or how to handle it. For instance, with 16 bits, would all 4 nibbles be treated as a 4-digit decimal number? One reason the older machines used BCD was not just accuracy in accounting, but also to simplify things like scores in games. Yes, BCD arithmetic takes more overhead, but converting to ASCII is easier: just add each nibble to the ASCII code for 0 and build that into a string. Converting from binary to ASCII is a little harder. I know on the PC platforms, a way to do that was to keep dividing by 10, keep the remainder each time, add each remainder to the ASCII code for 0, and build the string from right to left. I don't know if there are other ways to do that. I know you can use repeated subtraction in place of dividing, but that gets messy fast and can take a while. I think going the other way would be to subtract the ASCII offset from each character, multiply each digit by its place value, and add the results. I don't think it would be feasible to have both of these conversions as instructions, since you'd need more registers to specify the source, the destination buffer, and the number of characters involved.
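To make those conversions concrete, here is a minimal Python sketch of the three of them. The function names are made up for illustration, and the ASCII-to-binary routine uses the "multiply the running total by 10" form, which gives the same result as multiplying each digit by its place value.

```python
ASCII_ZERO = ord("0")

def binary_to_ascii(value):
    """Repeated divide-by-10, collecting remainders (string built right to left)."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        value, remainder = divmod(value, 10)
        digits.append(chr(ASCII_ZERO + remainder))
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))

def bcd16_to_ascii(bcd):
    """Treat a 16-bit word as four packed BCD digits, one per nibble."""
    return "".join(chr(ASCII_ZERO + ((bcd >> shift) & 0xF)) for shift in (12, 8, 4, 0))

def ascii_to_binary(text):
    """Subtract the ASCII offset from each digit and accumulate in base 10."""
    value = 0
    for ch in text:
        value = value * 10 + (ord(ch) - ASCII_ZERO)
    return value
```

The BCD version needs no division at all, which is the whole appeal for things like score displays.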
Are there any more instructions that would be good to add? I guess a stack would be good, and likely calls and interrupts. There are questions about where to put the reset vector, the interrupt vectors, etc.
And then there are the technical issues involved. When you get to 16-bit and wider data, a question becomes what to do about alignment. On the P2, alignment is not an issue, since you can read 1-4 bytes from the hub at any address. However, part of me would like to include the ability to add external SRAM, preferably parallel (yes, I know 1M of word addresses means about 40 GPIO lines: 20 address, 16 data, plus control). In that case, an unaligned access would incur a penalty, since you'd need to access the memory twice, incrementing the address in between. I know for a homebrew design using just chips, one could mitigate this by having two address and data buses, allowing you to read the low byte from the odd bus and the high byte from the next address on the even bus. But that sort of complex trick is out of the question here.
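Here is a rough Python model of that alignment penalty, assuming a 16-bit-wide external SRAM and little-endian byte order (both are my assumptions, and `read_word16` is a made-up name). It returns the value along with how many bus accesses it took.

```python
def read_word16(sram, byte_addr):
    """Read a little-endian 16-bit value from a word-wide memory.

    sram is a list of 16-bit words; byte_addr is a byte address.
    Returns (value, bus_accesses).
    """
    word_index = byte_addr >> 1
    if byte_addr & 1 == 0:
        # Aligned: the whole value sits in one word, so one access is enough.
        return sram[word_index], 1
    # Unaligned: our low byte is the high byte of the first word,
    # and our high byte is the low byte of the following word.
    low = sram[word_index] >> 8
    high = sram[word_index + 1] & 0xFF
    return (high << 8) | low, 2
```

With `sram = [0x2211, 0x4433]`, reading at byte address 0 takes one access and reading at byte address 1 takes two.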
And how would fetching be handled? If you assume a one-byte opcode and a one-byte operand, that is a nice and even 16 bits. But what if it is a single-byte instruction? Then the next byte is an opcode too. I know the 6502 discarded these extra bytes. It wasn't until the limited run of the 65CE02 that they were recycled. That "data" byte would get forwarded back into the pipeline, and while that happened, the immediate operand was loaded (or the next instruction, if this happened twice in a row). And what about the maximum immediate size? Should I add some 24-bit instructions? That would seem efficient, as a single-byte opcode and a 3-byte immediate would be an even 32 bits. And of course, what about handling within the P2? Should one just do 32-bit fetches from the hub?
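One way to picture the "always fetch 32 bits" idea is the Python sketch below. The opcode numbers and the `OPERAND_BYTES` table are entirely hypothetical, and I'm assuming little-endian immediates. Each fetch grabs four bytes at the program counter, uses the low byte as the opcode, and keeps only as many operand bytes as that opcode needs.

```python
# Hypothetical operand sizes (in bytes) for a few made-up opcodes.
OPERAND_BYTES = {0x00: 0, 0x01: 1, 0x02: 2, 0x03: 3}

def fetch(hub, pc):
    """Fetch one instruction with a single 32-bit little-endian read.

    Returns (opcode, immediate, next_pc).
    """
    # Pad with zeros so a fetch near the end of memory still yields 4 bytes.
    window = int.from_bytes(hub[pc:pc + 4].ljust(4, b"\x00"), "little")
    opcode = window & 0xFF
    n = OPERAND_BYTES[opcode]
    # Keep only the n operand bytes that follow the opcode.
    immediate = (window >> 8) & ((1 << (8 * n)) - 1)
    return opcode, immediate, pc + 1 + n
```

Any bytes fetched beyond the instruction's length are simply ignored here; a smarter version could hold on to them instead of re-reading.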
And how would this interpretation be implemented? I mean, is there an efficient way to branch based on virtual opcodes without polling, or testing bits and walking instruction trees? How would one go from the virtual opcode to its "microcode"? Does the P2 have an efficient way to deal with 256 instruction handlers?
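The general technique I have in mind is a 256-entry jump table indexed by the opcode byte, so there is no bit testing at all. Here is a minimal Python sketch of that idea. The handler names are made up, the two opcode values are borrowed from the 6502 (NOP and INX) purely for illustration, and this says nothing about how the P2 itself would do it.

```python
def op_nop(cpu):
    pass

def op_inx(cpu):
    # 16-bit X register, wrapping at 0xFFFF.
    cpu["x"] = (cpu["x"] + 1) & 0xFFFF

def op_illegal(cpu):
    raise ValueError("illegal opcode")

# One entry per possible opcode byte; unassigned slots trap.
HANDLERS = [op_illegal] * 256
HANDLERS[0xEA] = op_nop
HANDLERS[0xE8] = op_inx

def step(cpu, memory):
    """Fetch one opcode and jump straight to its handler."""
    opcode = memory[cpu["pc"]]
    cpu["pc"] += 1
    HANDLERS[opcode](cpu)
```

The cost per instruction is one table lookup and one indirect call, no matter how many opcodes are defined.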
I mentioned the possibility of using parallel SRAM. I know, some would consider that not worth the effort and suggest PSRAM or some other serial arrangement. But I can see that as possible if one has plenty of multiplexers on the board (or hell, the /CS line on the memory) and allocates a line for that. And peripherals can watch that line too and know when they can speak to the P2 (even if it's another P2).
And then what about sound? I'd like the sound to be a little better than what existed in the retro era. The TI chip was nice in that you had 3 tone channels and a noise channel. The POKEY improved on that in that you could use all 4 channels as you saw fit. Plus, it added a 16-bit sound mode and gave you the ability to bit-bang through it too. And the IBM PC, while it had the worst sound of all, at least gave you the full range of pitches. One of the high pitches I sometimes used was 15,750 Hz. That one is kind of nostalgic. And on the Gigatron, I think you can only go up to around 3900 Hz or so before aliasing gets you. I never messed with a SID and don't even know how to program it. It seems a bit complex to me. At least it is more in tune, though jazz musicians and demo makers don't seem to care. There are features I wish had existed on the old platforms, such as a note mode and a chord mode. For some songs, at least 5 channels would be handy. It would also be nice to have a library of the most common samples built in. So have square, sine, triangle, ramp/sawtooth, noise, percussion, and anything used by common games. Maybe some of the more interesting nonstandard waveforms too, like 2 triangles with a convex center, one that starts as a square or triangle and transitions into a sine, one that starts as a square but ends in noise, etc. So I'd like more options and better sound than in the old days, but nothing too awfully complex. And vibrato, pitch-bending, and a few other variations would be nice.
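As a sketch of what a built-in waveform library could look like, here is some Python that builds a few 256-entry tables and plays them back with a 32-bit phase accumulator. The table size, the signed 8-bit sample range, and the names are all my own assumptions, not a worked-out design.

```python
import math

TABLE_SIZE = 256

def make_tables():
    """Build one cycle of each basic waveform as signed 8-bit samples."""
    sine = [round(127 * math.sin(2 * math.pi * i / TABLE_SIZE)) for i in range(TABLE_SIZE)]
    square = [127 if i < TABLE_SIZE // 2 else -127 for i in range(TABLE_SIZE)]
    saw = [i - 128 for i in range(TABLE_SIZE)]
    # Rises from -128 to 127 over the first half, then falls back down.
    triangle = [2 * i - 128 if i < 128 else 383 - 2 * i for i in range(TABLE_SIZE)]
    return {"sine": sine, "square": square, "saw": saw, "triangle": triangle}

def render(table, freq_hz, sample_rate, count):
    """Step through a table using a 32-bit phase accumulator."""
    step = round(freq_hz * (1 << 32) / sample_rate)
    phase = 0
    out = []
    for _ in range(count):
        out.append(table[phase >> 24])  # top 8 bits select the table entry
        phase = (phase + step) & 0xFFFFFFFF
    return out
```

Vibrato and pitch bends would then just be small changes to `step` over time, and a morphing waveform could blend between two tables.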
And what about the video? I wouldn't mind something that could do 320x240 and 640x480. And of course, to do that in bitmap mode, you'd need 75K and 300K respectively, assuming 8-bit color. Hardware sprites of some sort would be nice. So would a text mode. I don't know what to think about an indirection table like the Gigatron uses. That is there to make special effects easier, such as scrolling, duplicating lines, flipping the screen, etc. But that approach can be clunky to work with and fragments the memory. And David Murray thought of a way to do better than CGA, and trade that off against memory. That is to store things so that each line has a 16-byte header to define the colors for that line, and then let each byte represent 2 pixels. So you can use up to 256 colors on a page, but only 16 per line. You could do something similar with 4-color modes by having a 4-byte header, and then 4 pixels per byte. Once again, you can have 256 colors on a page, but only 4 per line. That would be more useful than the CGA hi-res mode. However, you couldn't scroll horizontally in these multi-pixel modes unless you scroll by 2-4 pixels at a time (or get really complex). And I've thought about ways to have a 9-bit output, which isn't possible on an 8-bit system unless you let the exact opcode number determine the state of the highest bit. But then, how would one store it? So 16-bit would make that more feasible. At that point, you might as well go with 15 bits.
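To check the memory numbers and show how the per-line palette scheme would decode, here is a small Python sketch. The function names are invented, and I'm assuming the high nibble of each byte is the left-hand pixel, which is just a guess at the layout.

```python
def framebuffer_bytes(width, height, bits_per_pixel=8):
    """Size of a plain bitmap with no per-line headers."""
    return width * height * bits_per_pixel // 8

def decode_line_16(line):
    """Decode one scanline: a 16-byte palette header, then 2 pixels per byte.

    Returns the list of 8-bit color values to send to the display.
    """
    palette, packed = line[:16], line[16:]
    pixels = []
    for b in packed:
        pixels.append(palette[b >> 4])    # left pixel (assumed high nibble)
        pixels.append(palette[b & 0x0F])  # right pixel
    return pixels
```

Under these assumptions, a 320x240 frame in the 16-colors-per-line mode would take 240 * (16 + 160) = 42,240 bytes, versus 76,800 at 8 bits per pixel.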
I thought it was interesting how text mode worked under MS-DOS. You had 80x25 text. That is 2000 characters. However, it was stored as 4000 bytes, giving each character a 16-color foreground and background. The character code selected the glyph, which determined where the pixels were, and the color byte determined what colors they were.
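Here is that layout as a short Python sketch. The helper names are mine. It assumes the usual PC arrangement of a character byte followed by an attribute byte, with the foreground in the low nibble and the background in the high nibble (ignoring the blink bit).

```python
COLS, ROWS = 80, 25

def cell_offset(col, row):
    """Byte offset of a character cell; its attribute byte is at offset + 1."""
    return (row * COLS + col) * 2

def decode_cell(char_byte, attr_byte):
    """Split a cell into (character code, foreground, background)."""
    foreground = attr_byte & 0x0F
    background = (attr_byte >> 4) & 0x0F
    return char_byte, foreground, background
```

For example, the byte pair 0x41, 0x1F decodes to character code 0x41 ("A") with foreground 15 on background 1.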
Speaking of hardware sprites and layers, what do you think would be good? I mean, how large, and how many? Atari used a player/missile graphics (PMG) scheme. One thing that I thought up that I don't think has ever been tried is a hardware chase mode. You could designate one sprite as the target and assign it to the others, and the pursuing sprites would try to find it. They would need to know the background color and could only move over the background color. So they would avoid the borders and not walk through walls, and return a signal when a collision with the target occurs. And even a sprite and texture library in hardware could be nice. For instance, what if you had Pacman, ghosts, stars, numbers, walls, doors, etc.? So you'd have many of the elements used in games already there.
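To show roughly what one tick of such a chase mode might do, here is a deliberately naive Python sketch. Everything in it is hypothetical: the screen is a small grid of color values, sprites are single cells, and the pursuer just tries the horizontal move first and then the vertical one.

```python
def chase_step(grid, pursuer, target, background=0):
    """Move the pursuer one cell toward the target over background cells only.

    grid[y][x] holds color values; pursuer and target are (x, y) tuples.
    Returns (new_position, collided).
    """
    px, py = pursuer
    tx, ty = target
    dx = (tx > px) - (tx < px)  # -1, 0, or +1
    dy = (ty > py) - (ty < py)
    # Try the horizontal move first, then the vertical one.
    for nx, ny in ((px + dx, py), (px, py + dy)):
        if (nx, ny) == (px, py):
            continue  # nothing to do on this axis
        if (nx, ny) == (tx, ty) or grid[ny][nx] == background:
            return (nx, ny), (nx, ny) == (tx, ty)
    # Both moves blocked (or already on the target): stay put.
    return (px, py), (px, py) == (tx, ty)
```

A greedy rule like this gets stuck behind walls that sit directly between the two sprites, so a real version would need some way to route around obstacles.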
Plus, why not a video mode (or modes) that uses some sort of display list format? There are ways to have "opcodes" for what you want to happen on the screen. So for simple screens, you'd do better to set the foreground and background, position things, and describe what you have on the page. That way it can work as compression.
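As a toy illustration of the display list idea, here is a Python sketch with three made-up commands ("clear", "color", and "rect") that get expanded into a pixel grid. A real format would presumably be packed bytes rather than tuples, so treat the command set as a placeholder.

```python
def run_display_list(commands, width, height):
    """Expand a list of drawing commands into a 2D grid of color values."""
    screen = [[0] * width for _ in range(height)]
    color = 0
    for cmd in commands:
        op = cmd[0]
        if op == "clear":
            # Fill the whole screen with the given background color.
            for row in screen:
                row[:] = [cmd[1]] * width
        elif op == "color":
            # Select the foreground color for later drawing commands.
            color = cmd[1]
        elif op == "rect":
            # Filled rectangle, clipped to the screen edges.
            _, x, y, w, h = cmd
            for yy in range(y, min(y + h, height)):
                for xx in range(x, min(x + w, width)):
                    screen[yy][xx] = color
        else:
            raise ValueError("unknown display list command")
    return screen
```

Even in this tiny form, three commands can describe a screen that would otherwise need one stored value per pixel.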