first steps toward a 6502 core
ericball
Recently there was some discussion regarding emulating the Atari 2600 on the Propeller and some of the challenges which would be faced (e.g. tight coupling between CPU & GPU). Although my opinion was fairly negative on the possibility, the idea piqued my interest. I have experience programming the 2600, so I'm familiar with the architecture and the challenges it provides.
The obvious first step was to code up a 6502 core, keeping in mind the requirements for use in a 2600 emulator. Attached is my first attempt at a 6502 core. It is _not_ complete nor functional. I also have not reviewed anyone else's code, so there may be some tricks which can be used.
I started by reviewing the 6502 instruction set, looking for patterns which could be used to simplify instruction decoding. Although there are obvious patterns (e.g. ALU instructions, addressing modes), there are also exceptions to those patterns. And the exceptions would require just as much logic to handle as the patterns.
Really what I needed was an easy way to split the opcode into instruction and addressing mode portions. I realized I could do it with a unique jump table. The opcode byte is used as an index into a table, where each entry has the form "jmpret instruction, #addrmode nr". The entry is copied into the code stream and executed, invoking the addressing mode code. When that returns, the copied entry is shifted right by 9 bits (placing the destination in the source position), and then a jmp copied_entry invokes the instruction code. (Instructions with implicit addressing use jmp #instruction.)
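As a rough illustration of the two-stage dispatch described above, here is a minimal Python model. It is only a sketch: the handler addresses are made-up values, the helper names (make_entry, decode) are hypothetical, and the PASM opcode and flag bits of a real jmpret instruction are ignored. Only the 9-bit destination and source fields are modelled.

```python
# Hypothetical cog addresses for a few handlers (illustrative values only).
ADDR_IMMEDIATE = 0x040
ADDR_ZEROPAGE = 0x048
INSTR_LDA = 0x100
INSTR_ADC = 0x110

def make_entry(instr_addr, addrmode_addr):
    """Pack two 9-bit cog addresses: destination field above source field."""
    assert 0 <= instr_addr < 512 and 0 <= addrmode_addr < 512
    return (instr_addr << 9) | addrmode_addr

# Sparse table keyed by 6502 opcode (only four opcodes shown).
TABLE = {
    0xA9: make_entry(INSTR_LDA, ADDR_IMMEDIATE),  # LDA #imm
    0xA5: make_entry(INSTR_LDA, ADDR_ZEROPAGE),   # LDA zp
    0x69: make_entry(INSTR_ADC, ADDR_IMMEDIATE),  # ADC #imm
    0x65: make_entry(INSTR_ADC, ADDR_ZEROPAGE),   # ADC zp
}

def decode(opcode):
    """Return (addressing mode handler, instruction handler) for an opcode."""
    entry = TABLE[opcode]
    addrmode = entry & 0x1FF   # source field: the handler that runs first
    entry >>= 9                # move the destination into the source position
    instr = entry & 0x1FF      # now usable as the second jump target
    return addrmode, instr

print(decode(0xA9))
```

The point of the layout is that one table word carries both targets, so a single lookup per opcode selects the addressing mode and the instruction.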
I observed that the "official" instructions typically have bit 1 cleared, and no "official" instruction has both bits 0 and 1 set. Therefore, if I reverse the opcode byte before using it as an index, gaps in the table are minimized. Support for the "unofficial" instructions (which are not fully decoded by the 6502) is left as a future exercise.
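Taking that observation at face value, reversing the byte moves bits 0 and 1 to bits 7 and 6, so every opcode with both low bits set lands in the top quarter of the index range. A small Python check of that arithmetic (rev8 is an illustrative helper name, not from the attached code):

```python
def rev8(b):
    """Reverse the bit order of one byte."""
    r = 0
    for _ in range(8):
        r = (r << 1) | (b & 1)
        b >>= 1
    return r

# Highest table index used by any opcode that does not have bits 0 and 1 both set.
slots_needed = max(rev8(op) for op in range(256) if op & 3 != 3) + 1
print(slots_needed)
```

Under that assumption the table can stop after three quarters of the 256 possible entries.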
Although the code is not complete, I can already see some challenges. First is going to be squeezing everything into cog RAM. (Even without the "unofficial" instructions.) I know there is some redundant code which can be combined, but I haven't started the more complicated instructions either.
The second is performance. The 6502 in the 2600 runs at 1.19MHz. So if we assume a 76.3636MHz system clock (64 times the CPU clock), that only gives 13 normal instructions between memory accesses. (95.4545MHz will yield 4 additional instructions.) I haven't counted, but I'm not confident that hard-to-emulate instructions like SBC can be done in that many clock cycles.
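The clock figures can be reproduced from the NTSC colorburst frequency. This is only a sketch of the arithmetic: it assumes the 2600 CPU runs at one third of colorburst and that an ordinary PASM instruction takes 4 system clocks. It does not try to model hub access time, so it does not reproduce the count of 13.

```python
from fractions import Fraction

COLORBURST_MHZ = Fraction(315, 88)   # NTSC color subcarrier, about 3.579545 MHz
CPU_MHZ = COLORBURST_MHZ / 3         # 2600 CPU clock, about 1.19 MHz
CLOCKS_PER_INSTRUCTION = 4           # assumption: ordinary PASM instruction

def system_clock_mhz(clocks_per_cpu_cycle):
    """System clock needed for a given number of clocks per 6502 cycle."""
    return float(CPU_MHZ * clocks_per_cpu_cycle)

for clocks in (64, 80):
    print(clocks, round(system_clock_mhz(clocks), 4))

# Going from 64 to 80 clocks per 6502 cycle adds this many instruction slots.
extra_slots = (80 - 64) // CLOCKS_PER_INSTRUCTION
print(extra_slots)
```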
The final challenge will be working out the details of the CPU/GPU interaction - like how to handle WSYNC.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: Forum
NTSC & PAL driver templates: ObEx Forum
OnePinTVText driver: ObEx Forum
Post Edited (ericball) : 10/16/2009 4:24:46 PM GMT
Comments
There are Spin and PASM drivers to use TriBlade's SRAM and microSD, and also compile options to use the limited hub memory instead. I am not sure what hardware the Atari 2600 had. We use Brad's bst.
IIRC, someone else on the forum started a 6502 emulation a long time ago, but I have not heard anything lately. Heater may know. I think the thread could be in the Hydra section?
Heater started a 6809 emulator also which may be of some help as they were to some degree similar.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade, RamBlade, RetroBlade, TwinBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index) ZiCog (Z80), MoCog (6809)
· Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz  MultiBladeProp is: www.bluemagic.biz/cluso.htm
http://forums.parallax.com/showthread.php?p=767703
I'm not sure what speed we have actually hit with the ZiCog Z80 emulator but reports from Cluso and Dr_A indicate we are keeping up with a real 4MHz Z80.
But there is a trick here. ZiCog contains most 8080 instruction emulation in normal PASM code which runs at quite a lick, but the extra instructions introduced by the Z80 are mostly implemented as PASM overlays which make quite a performance hit. The trick is that most of the software we are interested in, from the CP/M world, uses only 8080 op codes so the overlay overhead does not matter. Even code that needs a Z80 often only uses the block move instructions for which the overhead of being an overlay is not so bad. Quite fortuitous really.
When it comes to the 6809 it needs a lot of code to emulate all the instructions and extensive addressing modes. This has led me to create a lot of overlays. The problem here is that, unlike the Z80, all the instructions and modes are needed all the time. So performance is bad.
I'm trying to come up with a scheme to use two COGs. I tried multi-cogs for the Z80 but it ended up no faster than what we have now. But now I want to try something different.
The idea is:
1) All the op code handler functions that are now overlays move to a second COG.
2) The dispatch table will contain the COG addresses of the required handler functions in whichever COG.
3) Dispatching will drop the COG address into HUB where the second COG will pick it up and use it to jump to the handler.
The downside is that all registers must be kept in HUB for the COGs to share, which is a slowdown. For using external RAM, the RAM driver gets a bit more tricky.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (ericball) : 10/8/2009 7:51:16 PM GMT
http://forum.6502.org/
I am thinking you might need multiple COGs synchronized via hub and pin states to deal with all the possibilities. Fortunately the 2600 didn't have much RAM so you can put lots of logic in Hub RAM and still have room for typical ROM images. You will probably need another cog to emulate the TIA. That will probably require a custom driver of some kind since the 2600 generated its own video timing in much the same way the Prop does, only differently and more primitively.
I think emulating some of the more byzantine cartridge RAM and paging schemes is just going to be beyond the Prop. But if you could get it to run Combat and other 2K and 4K carts, that would still be pretty nifty.
@heater: Lately I have been compiling with the 8080 option. There do not seem to be any problems yet. As for block moves and compares, I suggest we use the TriBlade driver for that as it is optimised and will be much quicker - it is resident and can be directly called pasm cog to pasm cog. The next big speedup will possibly be replacing the Spin code with PASM. I haven't even looked at your ZiCog emulation code. The RamBlade will be faster, even before the extra overclocking I am anticipating.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
However that code was abandoned in favour of the ZiCog approach which turned out to be faster and allowed expansion via overlays to the full Z80.
You realize that CP/M is an 8080 program; that's why we don't care much about the speed hit of overlaying the extra Z80 ops.
I tried "multiple COGs synchronized via hub and pin states" for a 4 COG Z80. It turned out slower! I'm going to revisit this idea a different way for 6809 emulation.
Cluso: Using 8080 mode should not upset CP/M and most progs. The BIOS knows what processor it is running on and behaves accordingly. BUT you cannot rebuild CP/M 2 without Z80 because BOOTGEN is written in SPL which assumes a Z80. Also LS, XFORMAT and a few other things.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
http://www.propgfx.co.uk/forum/ home of the PropGFX Lite
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I've also been doing some thinking on how to interface the 6502 core with a hypothetical TIA & RIOT. There are several challenges:
1. Handling memory map shadows - at the very least RAM is shadowed $0080 - $00FF == $0180 - $01FF. The problem is the number of cycles to perform the test & mask.
2. Handling "strobed" memory addresses. Many of the TIA "registers" were write addresses which caused actions rather than saving values.
3. Handling read only / write only registers sharing the same address.
4. Handling bank switching (although I think this would have to be integrated into the core).
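For the first item in the list above, the usual description of the 2600 memory map decodes only a few address lines, which is what produces the shadows. The Python sketch below assumes that conventional decode (A12 selects the cartridge, A7 separates the TIA from RAM/RIOT, A9 separates RAM from the RIOT registers). The function name and region labels are illustrative and are not taken from the attached core.

```python
def decode_2600(addr):
    """Map a CPU address to (region, offset) using partial address decoding."""
    addr &= 0x1FFF                    # the 6507 only brings out 13 address lines
    if addr & 0x1000:                 # A12 set: cartridge ROM
        return ("ROM", addr & 0x0FFF)
    if not addr & 0x0080:             # A12 clear, A7 clear: TIA
        return ("TIA", addr & 0x3F)
    if not addr & 0x0200:             # A7 set, A9 clear: 128 bytes of RAM
        return ("RAM", addr & 0x7F)
    return ("RIOT", addr & 0x1F)      # A7 and A9 set: RIOT I/O and timer

# The zero page RAM and its stack page shadow resolve to the same cell.
print(decode_2600(0x0080), decode_2600(0x0180))
```

In PASM each of these tests costs a couple of instructions, which is where the cycle concern in item 1 comes from.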
One idea I've had is to not attempt to write directly to the HUB RAM. Instead the 6502 core writes both the address & data to a HUB LONG. Other cogs can poll that LONG and then act accordingly - zeroing out the LONG once they've acted on it. One of the cogs (like the one handling the audio generator) could also handle RAM, and write the value to the RAM array. This would remove a chunk of logic from the 6502 core and handle strobe registers.
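A sketch of how such a mailbox LONG could be packed, in Python for clarity. The field layout and the "valid" flag in bit 31 are my assumptions, not part of the idea as stated. The flag is there because a write of $00 to address $0000 would otherwise look the same as an empty (zeroed) mailbox.

```python
VALID = 1 << 31   # assumed flag bit so that an all-zero long always means "empty"

def pack_write(addr, data):
    """Pack a 13-bit address and an 8-bit data byte into one 32-bit mailbox value."""
    return VALID | ((addr & 0x1FFF) << 8) | (data & 0xFF)

def unpack_write(word):
    """Return (addr, data), or None if the mailbox is empty."""
    if not word & VALID:
        return None
    return (word >> 8) & 0x1FFF, word & 0xFF

mailbox = pack_write(0x0002, 0x00)   # e.g. a strobe write to TIA address $02
print(unpack_write(mailbox))
mailbox = 0                          # consumer clears the long once it has acted
print(unpack_write(mailbox))
```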
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I just did a quick count of ADC and it's ~80 cycles from rdbyte to rdbyte, so it might be possible to get a reasonable clock rate depending on how many cycles get chewed up in memory mapping.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
One particular game (Chase H.Q) didn't work correctly and I couldn't for the life of me find the bug since all other games seemed to work perfectly. That was the ONLY game in my collection of 400 games that actually did use the BCD mode.
I think it is time for me to cancel my plans of making a 6502 emulator on the prop now that I see that you are coming along nicely.
And yes the 6502 is a very simple but nice CPU.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I put in the 2600 address decode last night and did some cycle counting this morning. Worst case is ADC (without decimal mode) which needs 96 cycles from operand rdbyte to rdbyte of the next opcode. That means a 2600 emulator would need a 114.55MHz clock (7.159MHz (2*colorburst) crystal).
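For reference, this is the work binary-mode ADC has to do, written as a small Python sketch rather than PASM. It shows the standard 6502 flag rules for carry, zero, negative and overflow; the function name and the dictionary of flags are illustrative only, and decimal mode is not handled. The 96-clock figure also matches the 114.55MHz number, since 96 times the roughly 1.19318MHz CPU clock is about 114.545MHz.

```python
def adc_binary(a, m, carry_in):
    """6502 ADC in binary mode: returns (result, flags)."""
    total = a + m + carry_in
    result = total & 0xFF
    flags = {
        "C": total > 0xFF,                           # carry out of bit 7
        "Z": result == 0,
        "N": bool(result & 0x80),
        # Overflow: operands share a sign and the result's sign differs.
        "V": bool(~(a ^ m) & (a ^ result) & 0x80),
    }
    return result, flags

print(adc_binary(0x50, 0x50, 0))
print(adc_binary(0xFF, 0x01, 0))
```

Each flag is cheap on its own, but together with the operand fetch and address decode they add up to the cycle count quoted above.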
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
Chat in real time with other Propellerheads on IRC #propeller @ freenode.net
Safety Tip: Life is as good as YOU think it is!
I did some thinking last night about emulating the TIA and the amount of processing required is huge. Each CPU clock (96 CLK) is 3 pixels. So imagine the TIA "kernel" as being something like
Taking away the 33 cycles required by the RDLONG 22/WAITVID 7/DJNZ 4 leaves only 63 cycles for processing (worst case). That's only 15 instructions. You might be able to squeeze in another 3 instructions if you figure out how to sync the waitvid and rdlong. Even with 18 instructions, I don't think it is possible to accomplish the necessary processing. Perhaps multiple cogs working in parallel (along with pre-processing by the 6502 core, i.e. translating TIA COLUM values to Propeller colors) could assist with the processing, but then you have the difficulty of retrieving updates from the hub and merging them together.
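The budget above is simple enough to check mechanically. A short Python sketch using the cycle counts quoted in the post and assuming 4 clocks per ordinary PASM instruction (names are illustrative):

```python
CLOCKS_PER_CPU_CYCLE = 96
OVERHEAD = {"rdlong": 22, "waitvid": 7, "djnz": 4}   # worst-case figures from the post
CLOCKS_PER_INSTRUCTION = 4                           # assumption: ordinary PASM instruction

def tia_budget(clocks_per_cycle=CLOCKS_PER_CPU_CYCLE, overhead=OVERHEAD):
    """Return (spare clocks, whole instructions that fit) per 6502 cycle."""
    spare = clocks_per_cycle - sum(overhead.values())
    return spare, spare // CLOCKS_PER_INSTRUCTION

print(tia_budget())
```

Changing the overhead figures shows how little headroom any realistic tuning of the loop buys.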
Another strategy would be to have two cogs alternating between rendering the pixels to cog RAM and outputting the pixels. The challenge with this idea is that both cogs will still need to be performing register updates in sync with the 6502 core during both phases.
In any case, there will be significant challenges simply emulating the numerous flags which affect the output (e.g. NUSIZE, reflection, priority, enable), not to mention horizontal positioning and collision detection (which might be better handled by a dedicated cog).
So even if it is possible to emulate the TIA on the Propeller, the journey is not a simple one. Nor is it a journey which I plan on making. (I'll probably finish off the 6502 core, but I will leave it to others to make use of it.)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I share your concerns about TIA. IMHO, writing the CPU state to the HUB every clock, while having other COGs attempt to perform the tasks that TIA does in tandem was the only idea that really made sense to me. All of that rolls up to a buffer where a TV COG then draws to the screen, or builds the draw list from. Baggers had the idea of a draw list, and to use it to buffer the screen, thus decoupling any emulation from the actual drawing of the TV image. An awful lot can be done with draw, or display lists (not the Atari meaning of the word), more like the OGL meaning of the word.
IMHO that could consume an entire propeller, rendering a VCS project a multi-prop one.
If you find it makes sense to finish the 6502 core, please do.
@Heater, or perhaps an Apple done in a style similar to the CP/M effort.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
How about emulating an Apple ][ instead?
The video is a bit weird at 280 pixels with bit 7 indicating half a pixel shift, but it sounds easier than emulating the TIA.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Please use mikronauts _at_ gmail _dot_ com to contact me off-forum, my PM is almost totally full
Morpheus & Mem+dual Prop SBC w/ 512KB kit $119.95, 2MB memory IO board kit $89.95, both kits $189.95
www.mikronauts.com - my site 6.250MHz custom Crystals for running Propellers at 100MHz
Las - Large model assembler for the Propeller Largos - a feature full nano operating system for the Propeller
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
(Sorry about the formatting, AA changed forum software.) Hmm... I wonder if it would be feasible to simply duplicate how the Apple ][ created hires graphics (luma only and let artifacts provide color) rather than try to translate it into Propeller colors.
Yes and no. I suspect that many games still did some cycle counting. And let's not forget the Disk ][ interface had some timing dependencies.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
The Apple// had 2 reserved memory sections for each card. I am sure it is documented somewhere. The emulator will ultimately have to plug in the code here and check, but as I said, later.
As Drac found, pacman on CPM is now too fast on ZiCog/TriBlade so some nops will be required :)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
First CP/M, then APPLE ][... What'll be next? 6510? (Commodore)
It's fun to hang out with code wizards.. :)
OBC
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
New to the Propeller?
Visit the: The Propeller Pages @ Warranty Void.
I will also shortly post an asynchronous version which won't try to duplicate the 6502 clock/cycle timing.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
(god, you are fast!)