6502 CPU Emulator - Page 2 — Parallax Forums

6502 CPU Emulator

Comments

  • rogloh Posts: 5,837
    edited 2021-04-28 11:25

    @Wuerfel_21 said:
    Not every R/W, just the ones going to a tricky area. RAM and ROM usually sit at the top and bottom, so those cases can be caught early with a CMP+JMP each. C64 in particular has a weird memory map with lots of bank switching though. Still, the bottom of memory is always RAM on a 6502, for obvious reasons.

    Yeah, some fast way to match the range of addresses is required. A CAM would be ideal but we don't have that, so some sequential address testing combined with limited range checking is required, unless you burn a 64kB block of HUB that tells you what to do for every possible address (the brute force approach). That's very wasteful, however. There'll be a happy medium somewhere.

  • @rogloh said:

    @Wuerfel_21 said:
    Not every R/W, just the ones going to a tricky area. RAM and ROM usually sit at the top and bottom, so those cases can be caught early with a CMP+JMP each. C64 in particular has a weird memory map with lots of bank switching though. Still, the bottom of memory is always RAM on a 6502, for obvious reasons.

    Yeah, some fast way to match the range of addresses is required. A CAM would be ideal but we don't have that, so some sequential address testing combined with limited range checking is required, unless you burn a 64kB block of HUB that tells you what to do for every possible address (the brute force approach). That's very wasteful, however. There'll be a happy medium somewhere.

    Most systems tend to map stuff on page boundaries, so having a per-page table of function pointers could work. Doesn't even need to include the pages that are covered by the top/bottom CMPs
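
    A minimal sketch of the per-page idea, in Python for illustration rather than P2 assembly. The memory map (RAM below $8000, ROM from $E000, one I/O page at $D0xx) and all names are assumptions made up for the example, not taken from the emulator:

```python
# Hypothetical memory map, for illustration only.
RAM_TOP = 0x8000    # assumed: everything below this is plain RAM
ROM_BASE = 0xE000   # assumed: everything from here up is ROM

memory = bytearray(0x10000)


def default_read(addr):
    """Plain memory access for pages that need no special handling."""
    return memory[addr]


def io_read(addr):
    """Placeholder device: returns the low address byte so the call is visible."""
    return addr & 0xFF


# One entry per 256-byte page; only the "tricky" pages get a special handler.
page_read = [default_read] * 256
page_read[0xD0] = io_read


def read_byte(addr):
    addr &= 0xFFFF
    # Fast paths first: the two compare-and-branch checks for RAM and ROM.
    if addr < RAM_TOP or addr >= ROM_BASE:
        return memory[addr]
    # Everything in between is dispatched through the per-page table.
    return page_read[addr >> 8](addr)
```

    The table costs 256 entries instead of 64K, and entries for pages already covered by the two fast-path checks are never consulted, so they could be dropped as suggested.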

  • Here is a more interactive demo running the Woz monitor.

    This means I built an Apple-1 ? :)

    If you don't know how to use it (like me), I found this site, where I also got the binaries:
    https://www.sbprojects.net/projects/apple1/wozmon.php

    Remember to use capital letters!

    Since I never used that thing before, and fought a bit with the smart pin serial configuration, I hope I got it right.
    The code also got some skip-pattern optimizations.

  • Using XBYTE would produce the smallest and fastest 6502 emulator, without any doubt. XBYTE was designed to handle more complicated CPUs than the 6502.

    There is big scope for code saving, freeing up space for hardware emulation. For example, you don't need to duplicate the fetch code for every arithmetic or logical operation. Instead load a register with either (a) the address of the subroutine for the operation or (b) the skip pattern for the operation and combine code for similar operations in the same skip block.

    You will be amazed how much code disappears when using skipping as much as possible.
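
    A rough sketch of option (a), written in Python and therefore without real skip patterns, which are a P2 feature. It only shows the sharing idea: one operand-fetch routine per addressing mode, one routine per operation, and a table that pairs them. The opcode values are the standard 6502 ones; the structure, the names and the omission of flag updates are simplifications assumed for the example:

```python
memory = bytearray(0x10000)
cpu = {"a": 0, "pc": 0}


def fetch_pc():
    """Read the byte at PC and advance PC."""
    value = memory[cpu["pc"]]
    cpu["pc"] = (cpu["pc"] + 1) & 0xFFFF
    return value


# Operand fetch, written once per addressing mode.
def operand_immediate():
    return fetch_pc()


def operand_zero_page():
    return memory[fetch_pc()]


# Operations, written once each (N and Z flag updates omitted for brevity).
def op_ora(value):
    cpu["a"] |= value


def op_and(value):
    cpu["a"] &= value


def op_eor(value):
    cpu["a"] ^= value


# Each opcode only names which fetch routine and which operation to combine.
OPCODES = {
    0x09: (operand_immediate, op_ora),
    0x05: (operand_zero_page, op_ora),
    0x29: (operand_immediate, op_and),
    0x25: (operand_zero_page, op_and),
    0x49: (operand_immediate, op_eor),
    0x45: (operand_zero_page, op_eor),
}


def step():
    fetch_operand, operation = OPCODES[fetch_pc()]
    operation(fetch_operand())
```

    Six opcodes are covered by two fetch routines and three operations; adding an addressing mode adds one routine instead of one per instruction.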

  • How likely is it that the 6502 will fetch an instruction from $FFFF then wrap around to address $0000? If the answer is very unlikely or not at all, you could save more code and eliminate the PC variable.

    When you need to know PC, use the GETPTR instruction and subtract the base address. Note that a hidden GETPTR PB is done for each new bytecode and RFBYTE/RFWORD post-increment the hub pointer automatically.
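
    The idea can be modelled in a few lines of Python. The hub base address and the class are hypothetical; `read_byte` merely stands in for a post-incrementing RFBYTE and `pc` for GETPTR followed by the subtraction:

```python
HUB_BASE = 0x40000  # hypothetical hub address where the 6502's 64 KB image starts
hub = bytearray(HUB_BASE + 0x10000)


class Fetcher:
    """Keeps only a hub pointer; the 6502 PC is derived from it on demand."""

    def __init__(self, pc):
        self.ptr = HUB_BASE + pc

    def read_byte(self):
        value = hub[self.ptr]
        self.ptr += 1  # post-increment, with no wrap from $FFFF to $0000
        return value

    def pc(self):
        return self.ptr - HUB_BASE
```

    Note that nothing here handles a fetch running past $FFFF, which is exactly the case the question above asks about.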

  • @TonyB_ said:
    There is big scope for code saving, freeing up space for hardware emulation. For example, you don't need to duplicate the fetch code for every arithmetic or logical operation. Instead load a register with either (a) the address of the subroutine for the operation or (b) the skip pattern for the operation and combine code for similar operations in the same skip block.

    You will be amazed how much code disappears when using skipping as much as possible.

    This is a work in progress. I tend to get things working before doing any optimization, and the skip patterns gave me a lot of trouble, which is why the initially posted code is as it is (the Woz monitor demo already has some changes). I spent a lot of time trying to understand why an instruction wasn't working, only to discover that I had typed the pattern backward, or that I had added an instruction and forgotten to add the corresponding skip (or not-skip) bit to the pattern. One problem at a time, please.

    I spent a few hours today optimizing the skip patterns, one group of instructions at a time, with repeated tests to make sure nothing broke. This is something to do once you have working code; otherwise it is a nightmare.

    How likely is it that the 6502 will fetch an instruction from $FFFF then wrap around to address $0000? If the answer is very unlikely or not at all, you could save more code and eliminate the PC variable.

    That was an excess of caution on my part. Only later, when most of the code was already working that way, did I realize that the 6502 has its reset and interrupt vectors at the top of the address space, so any code that wraps around is very likely broken anyway. I have already removed it from my current development code.

    As said before, XBYTE will be the last thing to do; there are already a number of other things to do.

  • TonyB_ Posts: 2,195
    edited 2021-04-29 14:13

    I have written a program to generate skip patterns automatically. Without it my 8086 emulator would have been a nightmare.

    http://forums.parallax.com/discussion/comment/1513077/#Comment_1513077

  • rogloh Posts: 5,837
    edited 2021-05-02 06:46

    @TonyB_ said:
    I have written a program to generate skip patterns automatically. Without it my 8086 emulator would have been a nightmare.

    http://forums.parallax.com/discussion/comment/1513077/#Comment_1513077

    Was that Z80 emulator (that I think you were doing a while back) also completed, @TonyB_? I could have a use for that CPU if you plan to release it to the public and if it is cycle accurate.

  • TonyB_ Posts: 2,195
    edited 2021-05-02 10:37

    @rogloh said:

    @TonyB_ said:
    I have written a program to generate skip patterns automatically. Without it my 8086 emulator would have been a nightmare.

    http://forums.parallax.com/discussion/comment/1513077/#Comment_1513077

    Was that Z80 emulator (that I think you were doing a while back) also completed, @TonyB_? I could have a use for that CPU if you plan to release it to the public and if it is cycle accurate.

    I stopped all work on the Z80 when starting the 8086. I finished untested fast and slow (cycle-accurate) Z80 emulators before then, apart from the skip patterns. I need to modify my Z80 emulators to include auto-generated patterns and lessons learned from the 8086. The cycle-accurate Z80 in particular could have a lot more free space for system hardware emulation.

    One self-contained cog can emulate the Z80, without a doubt. The sheer quantity of instructions is the main issue. XBYTE was designed to cater for the Z80, to which I will return fairly soon, then decide about a public release. Emulator complexity for some popular CPUs in increasing order: 6502, 8080, Z80, 8086.

  • pik33 Posts: 2,387

    .... M68000?

  • Got a bit hooked with the Apple1 and went a little further with the emulation...

    https://youtu.be/gN8vVUQVLIM

    Sorry for the poor video quality.

  • cgracey Posts: 14,206

    @macca said:
    Got a bit hooked with the Apple1 and went a little further with the emulation...

    https://youtu.be/gN8vVUQVLIM

    Sorry for the poor video quality.

    Looks good, Macca. Was the Apple I monochrome?

  • @cgracey said:
    Looks good, Macca. Was the Apple I monochrome?

    Yes, it was composite monochrome. The emulator outputs VGA (green instead of white because I like it more...).
    I want to add the composite output soon.

  • rogloh Posts: 5,837
    edited 2021-05-04 05:56

    Nice work and nice monitor macca. I actually have the same one, a little Sony Trinitron 15 inch 100ES, which is perfect for emulator stuff with VGA. I noticed it dumped on a sidewalk around here for our annual hard garbage collection. Its VGA cable had been cut off but it was in pristine condition and I nabbed it before it rained and added a new cable. Works fine and is still in top condition. I now use it for my souped up Z80 Microbee system with P1 VGA output and it matches the whole 90's look I was going for at the time.

  • Does this 6502 have the equivalent of a /RDY line? Is there a way for an outside cog to tell the 6502 to stop for a bit? I was thinking that for the Atari 8-bit computers, a halt line could be added (MOS SALLY mod), but as I've thought more, no, only /RDY would be needed since the P2 already has intrinsic concurrent DMA support, so the 6502 portion only needs to pause, not unlatch the memory.

    And I'd like to see an Atari 800 clone that can do VGA. And I'd like to see some enhancements. For instance, what if the unused opcode space was used for things the P2 is particularly good at? I'd like to see extra opcodes to do things like 24-bit memory addressing, multiplication, division, and other floating point functions. Thus, one could rewrite the ROM to use those additional opcodes in the FP emulator, the memory management, etc., as well as any drawing routines. Plus SIO routines could be improved (if one plans on adding that support). And it would be nice to rewrite the BASIC cartridge and optimize it too. The Turbo BASIC cartridge was a rewrite that used more efficient drawing and FPU routines, effectively doubling the speed of many interpreted programs. And the Veronica BASIC cartridge took things a step further by also including an '816 in the cartridge that was clocked faster than the system CPU. A rewritten BASIC might have midway performance. Faster than Turbo BASIC, but not as fast as Veronica BASIC. With 24-bit memory opcode mods, such a BASIC could access more than the 40-48K that is normally available.

    I like the Kim Klone build (not to be confused with a KIM-1 clone) where a guy took a 65C02 and built a wrapper for it using TTL and programmable logic, forming a wrap-around coprocessor that adds more instructions to the CPU. It replaced many of the NOPs and proxied some of them so both the CPU and the extension could work on something together. So something like that could be added to the 6502 core. It would only offer gains with specially-written programs without disturbing existing programs.

  • @PurpleGirl said:
    Does this 6502 have the equivalent of a /RDY line? Is there a way for an outside cog to tell the 6502 to stop for a bit? I was thinking that for the Atari 8-bit computers, a halt line could be added (MOS SALLY mod), but as I've thought more, no, only /RDY would be needed since the P2 already has intrinsic concurrent DMA support, so the 6502 portion only needs to pause, not unlatch the memory.

    It doesn't have a specific method to emulate it, but one can easily be added. However, as you pointed out, it isn't really needed in a generic emulated system. It only affects timing, so the real question is: is there a way to add CPU cycles when needed by the emulated hardware? Yes, but it depends on how the hardware is emulated. There are a number of methods to do that, but I don't know enough about the Atari 8-bit computers to give useful advice on that.
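
    One of those methods, sketched in Python under the assumption that the emulator keeps a running cycle count: the emulated hardware requests a number of stall cycles, and the CPU loop adds them when it accounts for the next instruction. The class and method names are hypothetical and do not come from the emulator:

```python
class EmulatedCpu:
    """Cycle accounting only; instruction execution is left out."""

    def __init__(self):
        self.cycles = 0
        self.pending_stall = 0

    def request_stall(self, cycles):
        # Called by emulated hardware (for example a DMA engine) to hold the CPU.
        self.pending_stall += cycles

    def account(self, instruction_cycles):
        # Charge the instruction plus any cycles requested since the last one.
        total = instruction_cycles + self.pending_stall
        self.pending_stall = 0
        self.cycles += total
        return total
```

    This only stretches timing at instruction boundaries; a cycle-exact halt in the middle of an instruction would need a finer-grained model.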

    And I'd like to see an Atari 800 clone that can do VGA.

    There are a lot of things I like to see... :)

    And I'd like to see some enhancements. For instance, what if the unused opcode space was used for things the P2 is particularly good at? I'd like to see extra opcodes to do things like 24-bit memory addressing, multiplication, division, and other floating point functions. Thus, one could rewrite the ROM to use those additional opcodes in the FP emulator, the memory management, etc., as well as any drawing routines. Plus SIO routines could be improved (if one plans on adding that support). And it would be nice to rewrite the BASIC cartridge and optimize it too. The Turbo BASIC cartridge was a rewrite that used more efficient drawing and FPU routines, effectively doubling the speed of many interpreted programs. And the Veronica BASIC cartridge took things a step further by also including an '816 in the cartridge that was clocked faster than the system CPU. A rewritten BASIC might have midway performance. Faster than Turbo BASIC, but not as fast as Veronica BASIC. With 24-bit memory opcode mods, such a BASIC could access more than the 40-48K that is normally available.

    Is the source code for all these things available? Because without source code you can't modify anything...
    Anyway, I'll not implement P2-specific opcodes because it will be outside the scope of the emulator (you can do it yourself of course).

  • @macca said:

    @PurpleGirl said:
    Does this 6502 have the equivalent of a /RDY line? Is there a way for an outside cog to tell the 6502 to stop for a bit? I was thinking that for the Atari 8-bit computers, a halt line could be added (MOS SALLY mod), but as I've thought more, no, only /RDY would be needed since the P2 already has intrinsic concurrent DMA support, so the 6502 portion only needs to pause, not unlatch the memory.

    It doesn't have a specific method to emulate it, but one can easily be added. However, as you pointed out, it isn't really needed in a generic emulated system. It only affects timing, so the real question is: is there a way to add CPU cycles when needed by the emulated hardware? Yes, but it depends on how the hardware is emulated. There are a number of methods to do that, but I don't know enough about the Atari 8-bit computers to give useful advice on that.

    Yeah, I am not sure if ANTIC would need the CPU to not run, but I think it would help for compatibility in case the display list is overwritten mid-process. While that wouldn't pose a hardware race (thanks P2), there is a chance that old and new frame buffer info could get comingled. The Atari 800 was about the only 6502-based computer that used bus-mastering DMA. The C64 used concurrent DMA that interleaved itself with the CPU cycles. I don't know what Apple machines did here, but it was a rather simple architecture from the descriptions in this thread.

    And I'd like to see an Atari 800 clone that can do VGA.

    There are a lot of things I like to see... :)

    Obviously. The trick here would be to emulate the GTIA chip and incorporate this in the emulation. Thus, instead of NTSC (or PAL/SECAM) output and color burst stuff, it could be made to emulate a similar VGA mode (even if it is halved, quartered, thirded, etc). Halved might be pushing it, but the only way to know is to try. It is just a matter of reducing the pixel clock to the timing required to produce the number of pixels across that is needed and then duplicating lines to keep the aspect ratio. If there are extra pixels, they would be part of the overscan area. The A8 machines included a visible overscan region much like the C64 included a visible border.

    Is the source code for all these things available? Because without source code you can't modify anything...
    Anyway, I'll not implement P2-specific opcodes because it will be outside the scope of the emulator (you can do it yourself of course).

    Well, I imagine some have disassembled these ROMs, but if not, one can do that themselves. It isn't hard to dump a cartridge and then run it through a cross-platform disassembler. Of course, that would be messy to decipher, but it would be a start. Plus there is enough documentation out there to be able to find things such as the various vectors. I won't delve into legal speculation here though. One could attempt to contact Warner Communications or whoever owns the rights now, but they might not even answer.

    But yeah, I'm not asking you to implement such. Still, some '816, 'C02, 24-bit address support, and FPU stuff would be helpful. It wouldn't be P2-specific, and adding such shouldn't impact compatibility. Thus, if one upgrades the ROMs to use new opcodes, they could offer modest overall gains while not impacting things excessively. So some improvements in the ROM could offset any losses from any overhead added by adding any 20-24 bit Address support. I think modding the core might be a cleaner way one could do this, but one could split added stuff into another cog and use it as a frontend for yours. Thus design purity is maintained while adding such. Just brainstorming for now.

  • @PurpleGirl said:
    Obviously. The trick here would be to emulate the GTIA chip and incorporate this in the emulation. Thus, instead of NTSC (or PAL/SECAM) output and color burst stuff, it could be made to emulate a similar VGA mode (even if it is halved, quartered, thirded, etc). Halved might be pushing it, but the only way to know is to try. It is just a matter of reducing the pixel clock to the timing required to produce the number of pixels across that is needed and then duplicating lines to keep the aspect ratio. If there are extra pixels, they would be part of the overscan area. The A8 machines included a visible overscan region much like the C64 included a visible border.

    Rendering the screen and sending the rendered lines to a display are separable (and arguably should be separated) - check MegaYume's video code (trigger warning: giant mess). It takes lines from a 2x320x32bpp buffer and puts them onto the screen according to the selected output mode, which can be NTSC/PAL60 composite/s-video, component, VGA at various resolutions, HDMI/DVI, 6bit LCD signalling and anything else one could come up with that has vaguely similar vertical timing to an SDTV signal. It also snoops the VDP mode register to make interlace and 256 wide modes happen.

  • pik33 Posts: 2,387
    edited 2022-08-22 21:11

    And I'd like to see an Atari 800 clone that can do VGA.
    takes lines from a 2x320x32bpp buffer and puts them onto the screen according to the selected output mode,

    We can fill the A8 framebuffer and then convert it to any P2 resolution/framerate we want, as in the ****yume series. For the A8, however, this can have several side effects: a lag of 2 frames (we need to complete 2 frames to cover the strange things A8 programmers can do, for example 480 lines interlaced) and a jerky or torn picture if the framerate does not match. The same happens in software emulators, including Altirra, the best of them. I ended up with my own custom-defined resolution, which is not always possible on a PC, and glitches still occur sometimes.

    The best way to do this is to make the P2 screen synchronous with the A8 screen and double/triple the lines on the fly. This gives a two-line lag (= no lag) and a stable picture. Two, not one, as we have to have at least 2 lines buffered to properly simulate the PAL multicolor effects, where every second line is displayed in 16-level grayscale while the rest are 16 dark colors. If we don't compute this properly, a lot of demos and even several games will look B/W while they are actually in color.
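
    The line doubling/tripling part on its own is simple; a minimal Python sketch, assuming scanlines arrive one at a time as lists of pixel values (the PAL color blending across two buffered lines is not modelled here):

```python
def scale_lines(source_lines, factor):
    """Emit every source scanline `factor` times, in order, as it arrives."""
    for line in source_lines:
        for _ in range(factor):
            yield line


# Example: two source lines doubled into four output lines.
doubled = list(scale_lines([[1, 2], [3, 4]], 2))
```

    Because each source line is emitted as soon as it arrives, the output never needs a full frame buffer, which is where the low latency comes from.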

    HDMI is much easier to do. There is no problem signalling nearly anything, including for example 896x624 at 50 Hz, and making a 768x480 visible region, with borders where needed, to get a pixel aspect ratio of about 1:1, which is a good approximation for PAL, but not NTSC, Ataris.
    For VGA, 720x576 may be usable: 384 pixels in a wide playfield mode is theoretical only; the manual https://www.virtualdub.org/downloads/Altirra Hardware Reference Manual.pdf says that in reality it is up to 354, so 360 should be good enough.

  • pik33 Posts: 2,387
    edited 2022-08-22 20:58

    And I'd like to see some enhancements. For instance, what if the unused opcode space was used for things the P2 is particularly good at?

    This will make a system without software. Look at VBXE: this excellent graphics card has only a few demos and games for it. Of course, I want a VBXE; it may be doable on PSRAM systems.

    I'd like to see extra opcodes to do things like 24-bit memory addressing, multiplication, division, and other floating point functions. Thus, one could rewrite the ROM to use those additional opcodes in the FP emulator, the memory management, etc., as well as any drawing routines.

    The best way to get a "Turbo A8" is to emulate the 65816; there are several pieces of hardware and software for the A8 that use this CPU.

    Plus SIO routines could be improved (if one plans on adding that support).

    I don't think we need to emulate SIO at all. What can we connect to it? A8 disk drives, printers and other SIO devices are even rarer than A8s. SIO2SD should be emulated, but all the communication will happen inside the P2. We may add a cassette audio input for the nostalgic experience, but then it is much easier to connect an MP3 player or a smartphone with the audio file to a P2 ADC and make it decode the modulation than to find a 1010 and cassette tapes in good enough condition. Also, for a nostalgic experience, I already printed a 1050-style case for a 3.5" drive. This will probably be connected to a P2 via UART.

    And it would be nice to rewrite the BASIC cartridge and optimize it too. The Turbo BASIC cartridge was a rewrite that used more efficient drawing and FPU routines, effectively doubling the speed of many interpreted programs. And the Veronica BASIC cartridge took things a step further by also including an '816 in the cartridge that was clocked faster than the system CPU. A rewritten BASIC might have midway performance. Faster than Turbo BASIC, but not as fast as Veronica BASIC. With 24-bit memory opcode mods, such a BASIC could access more than the 40-48K that is normally available.

    Is the source code for all these things available? Because without source code you can't modify anything...

    A lot of material and information is available here: https://www.virtualdub.org/altirra.html - Altirra OS and Basic are open source. The manual I linked a post earlier is priceless. Altirra Basic is a rewritten A8 BASIC that fits the standard 8 kB space, a drop-in replacement that avoids copyright issues.

    The Polish Atari community is strong. Here you can find nearly all A8-related stuff: https://www.atarionline.pl

    You can legally download A8 ROMs from here: http://www.atari.org.pl/PLus/index_pl.htm - file xf25.zip. This is an interesting story. xf25.zip is a very old A8 emulator. The author was given permission to include the original ROMs with it, so here it is. The emulator is now useless, but these ROMs are still useful.

  • On SIO emulation, if you want to go that far, just add USB support. They are cousin protocols. And yes, I have a 410 drive and disks.

    And emulating an '816 is passé. My goal would be to do something different. The '816 has no FP functions, but with a P2, we can add them.

  • What can connect to SIO?

    FujiNet!

    https://fujinet.online/shop/hardware/fujinet-1-6/

    I second doing 65816 as an expansion. There already is some software support.

    (Greets all! Still have my head down on a long project involving lasers, metal 3D printing and milling)

  • hinv Posts: 1,255

    @rogloh said:
    Great stuff. With a P2 SIDCOG and this 6502 probably what we need now is just some VIC-II graphics emulation plus CIA chip and some C64 code could potentially start to run in a complete emulator. Who's game to write a full VIC-II emulator? Are you intending to do one of those too next @macca?

    Then it is going to need a special version to emulate the 6510 instead which allows for the bank switching of banks of ROM.

  • @hinv said:

    @rogloh said:
    Great stuff. With a P2 SIDCOG and this 6502 probably what we need now is just some VIC-II graphics emulation plus CIA chip and some C64 code could potentially start to run in a complete emulator. Who's game to write a full VIC-II emulator? Are you intending to do one of those too next @macca?

    Then it is going to need a special version to emulate the 6510 instead which allows for the bank switching of banks of ROM.

    AFAIK, the 6510 is a 6502 with an on-chip I/O port at address $00/$01 which can be easily emulated by adding the logic to the memory read/write functions.
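
    A simplified sketch of that logic in Python. It models only the data direction register at $00, the port latch at $01, and one C64-style banking decision (KERNAL ROM visible at $E000-$FFFF while the HIRAM bit, bit 1, is high). The ROM contents are a placeholder, input bits are assumed to be pulled high, and the upper port bits are not modelled accurately:

```python
ram = bytearray(0x10000)
kernal_rom = bytes([0xEA]) * 0x2000  # placeholder ROM contents (assumption)

port = {"ddr": 0x00, "latch": 0x00}


def port_lines():
    """Output bits come from the latch; input bits are assumed pulled high."""
    return (port["latch"] | ~port["ddr"]) & 0xFF


def read_byte(addr):
    addr &= 0xFFFF
    if addr == 0x0000:
        return port["ddr"]
    if addr == 0x0001:
        return port_lines()
    if addr >= 0xE000 and port_lines() & 0x02:  # HIRAM high: ROM is banked in
        return kernal_rom[addr - 0xE000]
    return ram[addr]


def write_byte(addr, value):
    addr &= 0xFFFF
    value &= 0xFF
    if addr == 0x0000:
        port["ddr"] = value
    elif addr == 0x0001:
        port["latch"] = value
    else:
        ram[addr] = value  # writes land in RAM even where ROM is banked in
```

    Writing $07 to $00 and then $05 to $01 clears HIRAM, after which reads from $E000 and up return the RAM underneath the ROM.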

  • @macca said:

    @hinv said:

    @rogloh said:
    Great stuff. With a P2 SIDCOG and this 6502 probably what we need now is just some VIC-II graphics emulation plus CIA chip and some C64 code could potentially start to run in a complete emulator. Who's game to write a full VIC-II emulator? Are you intending to do one of those too next @macca?

    Then it is going to need a special version to emulate the 6510 instead which allows for the bank switching of banks of ROM.

    AFAIK, the 6510 is a 6502 with an on-chip I/O port at address $00/$01 which can be easily emulated by adding the logic to the memory read/write functions.

    That makes sense, if it is implemented in hub memory, then your peripherals in other cogs can poll for it. If performance is critical, then one could put Page 0 in a shared LUT and the device that needs the port in the neighboring cog.
