Commodore 64 Emulator - Page 5 — Parallax Forums

Commodore 64 Emulator


Comments

  • yerpa58yerpa58 Posts: 25
    edited 2007-11-09 20:48
    deSilva,

    I agree that emulating any instruction set is difficult, but the C-64 used (if I remember right) a 6502-series CPU which had hard-wired logic for opcodes. It was not microcoded, so it looked more like today's RISC chips. I think all instructions took no more than two CPU cycles each. The CPU was greatly assisted by the graphics chip, which shared memory with the CPU and did sprites, transparency, layers, and collision detection in hardware. Also, the SID sound chip was way ahead of its time, with hi-res hardware frequency generators, A/D conversion for proportional joysticks, and a tunable switched-capacitor audio filter.

    It would seem a shame to use the Propeller's 80 MIPS to emulate an antique 1 MIPS machine.

    However, emulating the best features of the C-64 in a modern design sounds more interesting. I am especially interested in following the Propeller audio threads in this forum. Some of the Prop-generated sound samples provided in this forum are quite amazing!
  • AleAle Posts: 2,363
    edited 2007-11-09 20:52
    Well, as I see it, you threw in the towel. It is a hobby; it does not have to make money. Most of what is done here is done for personal satisfaction, for learning purposes and to have a bit of fun.

    Controlling 32 servos with a Propeller is also a waste of resources. That can be done with an AVR. A 6502 emulator can also be done with an AVR, and has been done several times. It works. Those two architectures are closer to each other than the Propeller is to either. But therein lies the challenge.
    I think it can be done, and I'd like to attempt it. It may have little value, but there is loads to learn from it. What do you want to do? Number crunching? Try to implement the DOOM engine; that is also challenging, and maybe much more difficult. The memory architecture hinders almost all conventional applications, and that's where new ways of solving problems come in. Besides, without external helpers... it could be very difficult to achieve better than a few colors at 128x96.
    Have fun

    Ale
  • Fred HawkinsFred Hawkins Posts: 997
    edited 2007-11-09 21:08
    Which is why I brought up FPGAs. While it is true that an FPGA can probably solve this problem without a Propeller, it's my guess that a Propeller with an FPGA assist would work as well. There are some bit moves and data sizes that on the Propeller are a genuine pain in the butt -- "this dog can't hunt" -- and would benefit from a "don't bother with that, my trusty sidekick here does that stuff."

    (Found it!* took 74 minutes)

    Here's a guy who uses an FPGA to help out a PIC to dump a camera card's contents onto a hard drive.

    http://home.nikocity.de/andymon/hfg/Alya/alya.html

    If I read him right, he's just using it for wiring, but the general idea -- FPGA as a helper -- seems sound.

    *caveat, if you have a webpage that brags about your hacks, make sure its title helps find it again. "The alya project" doesn't help much.

    Post Edited (Fred Hawkins) : 11/9/2007 9:43:44 PM GMT
  • deSilvadeSilva Posts: 2,967
    edited 2007-11-09 22:02
    Please! I hate discussing matters when the other side offers no substance. When I say you will need 3 µs, I have my reasons. When you say you do not believe it, that is your right.

    What I do not understand is why you do not take the 10 minutes of coding - as I have suggested above - to implement a simple 6502 instruction???


    So I did it: LDA nnn,X

    Quite unoptimized, I count about 28 COG instructions at 4 ticks each + 6 HUB instructions at - say - 16 ticks each = 112 + 96 = 200+ ticks = about 2.5 µs

    Note that it does not help that the graphics co-processor is a co-processor; you have to emulate its instructions as well :)
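
    The tick estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the 80 MHz clock, the 4-tick cog instruction and the "say 16" ticks per hub access are the round figures used in this thread, and the function names are made up for illustration.

```python
# Rough cost model for one emulated 6502 instruction on a Propeller cog.
# All constants are assumptions taken from the round figures in the thread.
CLOCK_HZ = 80_000_000   # assumed 80 MHz system clock
COG_TICKS = 4           # ticks per ordinary cog instruction
HUB_TICKS = 16          # "say 16" ticks per hub access (real cost varies)


def emulation_ticks(cog_instructions: int, hub_instructions: int) -> int:
    """Total clock ticks spent on one emulated instruction."""
    return cog_instructions * COG_TICKS + hub_instructions * HUB_TICKS


def ticks_to_us(ticks: int, clock_hz: int = CLOCK_HZ) -> float:
    """Convert clock ticks to microseconds."""
    return ticks * 1_000_000 / clock_hz


if __name__ == "__main__":
    ticks = emulation_ticks(28, 6)   # 28 cog + 6 hub instructions
    print(ticks, "ticks,", ticks_to_us(ticks), "microseconds")
```

    With 28 cog and 6 hub instructions this gives 208 ticks, a little over the 2.5 µs quoted.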

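    DAT
    ' $BD = LDA nnnn,X
    ' assume we wired DATA to pins 0..7 and ADDR to pins 8..23
    loop
      RDWORD instrcounter, #PCinHUB ' get instruction counter (6502 PC)
      MOV OUTA, instrcounter
      SHL OUTA, #8                  ' PC onto the address pins; instrcounter stays unshifted
      MOV  opcode, INA              ' get opcode byte
      AND  opcode, #$FF
      SUB  opcode, #lowrange WC     ' C set: opcode is below this COG's range
      IF_C  JMP #jmpaway
      MOVS  jmpdest, #jmptable      ' patch the source field of the JMP below
      ADD   jmpdest, opcode 
    
      CMP  opcode, #(highrange-lowrange) WC ' C clear: opcode is above this COG's range
      IF_NC  JMP #jmpaway
    jmpdest
      JMP  0-0                      ' indirect jump through the selected table entry
    jmptable
     ....
       long instBD   'LDA nnnn,X
     .....
    
    
    instBD
       ADD OUTA,#$100               ' address + 1: low operand byte
       MOV bytelow, INA
       AND bytelow,#$FF
       ADD OUTA,#$100               ' address + 2: high operand byte
       MOV newaddr, INA
       AND newaddr,#$FF
       SHL newaddr,#8
       OR  newaddr, bytelow         ' newaddr = nnnn
       RDBYTE  bytelow,#XinHUB      ' the X register is kept in HUB memory
       ADD newaddr, bytelow         ' effective address = nnnn + X
       SHL newaddr, #8              ' align with the address pins 8..23
       MOV OUTA, newaddr
       MOV thebyte, INA
       WRBYTE thebyte, #AinHUB      ' A := loaded byte (WRBYTE stores only the low 8 bits)
       AND  thebyte, #$FF WZ
       IF_Z  WRBYTE aOne,  #ZimHUB  ' Z flag = 1 when the byte is zero
       IF_NZ WRBYTE aZero, #ZimHUB    
       AND  thebyte, #$80 WZ
       IF_Z  WRBYTE aZero, #NimHUB  ' N flag = bit 7 of the byte
       IF_NZ WRBYTE aOne,  #NimHUB
    
       ADD instrcounter, #3         ' LDA nnnn,X is a 3-byte instruction
       WRWORD  instrcounter, #PCinHUB
       
       JMP #loop     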
    
    

    Post Edited (deSilva) : 11/9/2007 10:09:03 PM GMT
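
    For comparison, here is what the LDA nnnn,X handler is supposed to compute, written as a small Python reference model. It is a sketch only: the register dictionary and the function name are invented for illustration, and the page-crossing cycle penalty is ignored because only the visible result (A, the Z and N flags, and the program counter) is modelled.

```python
def lda_abs_x(mem: bytearray, regs: dict) -> None:
    """Reference model of 6502 opcode $BD (LDA absolute,X)."""
    pc = regs["PC"]
    # The operand is a little-endian 16-bit base address after the opcode.
    base = mem[(pc + 1) & 0xFFFF] | (mem[(pc + 2) & 0xFFFF] << 8)
    addr = (base + regs["X"]) & 0xFFFF      # index with X, wrap at 64 KB
    value = mem[addr]
    regs["A"] = value
    regs["Z"] = 1 if value == 0 else 0      # zero flag
    regs["N"] = 1 if value & 0x80 else 0    # negative flag = bit 7
    regs["PC"] = (pc + 3) & 0xFFFF          # three-byte instruction


if __name__ == "__main__":
    mem = bytearray(65536)
    mem[0x1000:0x1003] = bytes([0xBD, 0x00, 0x20])   # LDA $2000,X
    mem[0x2005] = 0x80
    regs = {"PC": 0x1000, "A": 0, "X": 5, "Z": 0, "N": 0}
    lda_abs_x(mem, regs)
    print(regs)
```

    A model like this is handy as a test oracle when checking a cog implementation one opcode at a time.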
  • AleAle Posts: 2,363
    edited 2007-11-09 22:35
    deSilva:

    That implementation is not what I had in mind. I was expressing an idea, to be refined, not something set in stone :). I'll try to do it and come back. Your code shows (sadly) that a traditional approach, even if we had >> 2K COG RAM, will not work :-(
  • CardboardGuruCardboardGuru Posts: 443
    edited 2007-11-09 23:46
    I wouldn't be at all put off by speed constraints. It's amazing what optimisations can come up after you've produced some initial code to solve a problem and it turns out not to be good enough. With a chip as unusual as the Prop, the first thought solution may very well not be the best.

    And if you can't get the speed up to that of the original, it's still an achievement, and you can still run some programs. And once the next generation Prop comes out you're good to go for full speed.

    Don't let anyone tell you it can't be done.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Help to build the Propeller wiki - propeller.wikispaces.com
    Prop Room Robotics - my web store for Roomba spare parts in the UK
  • deSilvadeSilva Posts: 2,967
    edited 2007-11-09 23:52
    :)
  • AribaAriba Posts: 2,690
    edited 2007-11-10 02:52
    When I look at the instruction set of the 6510 CPU, an LDA abs,X takes 4 cycles to execute - that is 4 µs at a 1 MHz clock. So even deSilva's unoptimized code is faster than the original!
    The 6510 CPU needs more clock cycles for the more complex addressing modes, and that helps a lot when emulating those addressing modes.

    Or am I wrong?

    Andy

    P.S. I also think that the graphics coprocessor with the sprites is the real challenge.
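
    The comparison above amounts to checking an emulated instruction against the time the real CPU would have taken. A minimal sketch, assuming the thread's round 1 MHz clock and the documented base cycle counts for a handful of opcodes (no page-crossing penalty); the helper names are hypothetical.

```python
# Base cycle counts for a few 6502 opcodes (page-crossing penalty ignored).
BASE_CYCLES = {
    0xA9: 2,  # LDA #imm
    0xAD: 4,  # LDA abs
    0xBD: 4,  # LDA abs,X
    0x20: 6,  # JSR abs
    0x60: 6,  # RTS
}


def original_time_us(opcode: int, cpu_hz: int = 1_000_000) -> float:
    """Time the real CPU needs for this opcode, in microseconds."""
    return BASE_CYCLES[opcode] * 1_000_000 / cpu_hz


def fits_budget(opcode: int, emulated_us: float) -> bool:
    """True if the emulation is at least as fast as the original."""
    return emulated_us <= original_time_us(opcode)


if __name__ == "__main__":
    for op in (0xBD, 0xA9):
        print(hex(op), original_time_us(op), fits_budget(op, 2.6))
```

    Under these assumptions a 2.6 µs handler fits easily into the 4 µs of LDA abs,X, but the same cost would be too slow for a 2-cycle instruction such as LDA #imm, so the fixed fetch and dispatch overhead matters most for the short instructions.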
  • deSilvadeSilva Posts: 2,967
    edited 2007-11-10 08:31
    When I said "unrewarding" above I wanted to express the fact that you cannot gain anything substantial from the parallel working in the COGs. However, due to the mass of code needed for all 6502 instructions, you will need two or even four of them to hold it all, with additional overhead for inter-COG synchronization, requiring the registers to be kept in HUB memory...

    So this seems to be a very inappropriate example to show off the power of the Propeller :)

    Post Edited (deSilva) : 11/10/2007 10:58:55 AM GMT
  • hippyhippy Posts: 1,981
    edited 2007-11-10 14:14
    deSilva said...
    So this seems to be a very inappropriate example to show off the power of the Propeller :)
    Conversely, as it seems so hard to do (except on a micro with a very high clock speed), success would raise the Propeller above its peers.
  • JT CookJT Cook Posts: 487
    edited 2007-11-10 19:46
    Before we start counting COGs for the 6502, let's throw something out here.

    We will probably need 5 COGs just for video rendering (1 for the TV driver and 4 for rendering), 1 for audio, and of course 1 for the SPIN interpreter, which leaves us 1 COG (2 if we remove sound). Then of course we have the CIA chips and VIC operations (the two could probably be combined into 1 COG).

  • JT CookJT Cook Posts: 487
    edited 2007-11-10 19:58
    Oh yeah, and 1 COG for keyboard input.
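
    Tallying the cog budget from the two posts above is simple arithmetic; the sketch below just makes it explicit. The allocation is the one proposed in the thread, and the dictionary keys are labels chosen here for illustration.

```python
TOTAL_COGS = 8  # a Propeller has eight cogs

# Allocation proposed in the thread, before the 6502, CIA and VIC logic.
PLAN = {
    "TV driver": 1,
    "rendering": 4,
    "audio": 1,
    "SPIN interpreter": 1,
    "keyboard": 1,
}


def cogs_left(plan: dict, total: int = TOTAL_COGS) -> int:
    """Number of cogs still free after the given allocation."""
    return total - sum(plan.values())


if __name__ == "__main__":
    print("free cogs:", cogs_left(PLAN))
```

    With the keyboard cog included, nothing is left for the CPU emulation itself, which is what the following posts react to.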
  • deSilvadeSilva Posts: 2,967
    edited 2007-11-11 09:26
    What JT Cook asks for is no joke.
    It digs up some of the issues of the COG concept. Life is easy as long as you have no more tasks than you have COGs and your application code fits into 512 words. When you have a VERY LOW SPEED application (below 10 kHz), you can switch to SPIN, thus working nicely around the memory restriction.

    Needing more than 8 tasks, however, can become a nightmare of painstakingly arranging polling code and counting clock ticks, as the well-understood method of choice - interrupts - is not available...

    Some clever tricks in some drivers ("co-routines") work as long as the code fits into 512 words.
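
    As a loose analogy for the co-routine trick mentioned above, here is a sketch of cooperative round-robin polling using Python generators. It is not how any particular Propeller driver is written; the task names and the scheduler are invented, and each `yield` simply stands for the point where one task voluntarily hands the cog to the next.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # task finished, drop it
        queue.append(current)      # otherwise schedule it again


if __name__ == "__main__":
    log = []
    run_round_robin([task("keyboard", 2, log), task("cia", 1, log)])
    print(log)
```

    Because nothing preempts a task, every task has to reach its next yield within the timing budget of the most demanding one, which is exactly the tick counting complained about above.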

    I would not complain if I had 64 COGs (well not this year at least...)

    Post Edited (deSilva) : 11/11/2007 9:32:49 AM GMT
  • AleAle Posts: 2,363
    edited 2007-11-11 17:51
    Let's divide and conquer :). One task at a time; if not... nothing will be done. 64 COGs would be great, but 128 I/O pins would also be useful.
  • BaggersBaggers Posts: 3,019
    edited 2007-11-11 19:44
    Good idea Ale, and if we run out of cogs, there's always Prop2, by which time we'll have most of the code ready, as that'll have ample RAM and ample cogs and ample I/Os and more than ample speed ;)
  • CardboardGuruCardboardGuru Posts: 443
    edited 2007-11-12 02:49
    There are other approaches too.
    • Large memory model, if anyone gets that working in a practical way.
    • Rather than use SPIN, write your own byte code interpreter which is tailored for the problem space. Think of it as a metaphor for the microcode used to implement instructions on a CISC CPU.
    • Use a table of data in hub memory with each row representing a machine code instruction and the columns representing different elements of that instruction (register, addressing mode, type of operation, flags affected etc.) Then rather than implementing each instruction in the cog program, you are writing generic table driven code.
    • Only emulate a subset of instructions needed to run a particular program.
    Worth investigating.
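
    The table-driven idea in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not a proposed design: the `OpInfo` columns, the five-entry table and the `step` function are all made up here, and only a tiny subset of opcodes is covered (which also illustrates the "subset" bullet).

```python
from typing import NamedTuple


class OpInfo(NamedTuple):
    mnemonic: str   # type of operation
    mode: str       # addressing mode
    length: int     # instruction length in bytes
    flags: str      # flags affected


# One row per opcode; a real table would live in hub memory.
OPCODES = {
    0xA9: OpInfo("LDA", "imm", 2, "NZ"),
    0xAD: OpInfo("LDA", "abs", 3, "NZ"),
    0xBD: OpInfo("LDA", "abs,x", 3, "NZ"),
    0xAA: OpInfo("TAX", "impl", 1, "NZ"),
    0xE8: OpInfo("INX", "impl", 1, "NZ"),
}


def operand_address(mode, mem, pc, x):
    """Generic addressing-mode resolution shared by all instructions."""
    if mode == "imm":
        return (pc + 1) & 0xFFFF
    base = mem[(pc + 1) & 0xFFFF] | (mem[(pc + 2) & 0xFFFF] << 8)
    if mode == "abs":
        return base
    if mode == "abs,x":
        return (base + x) & 0xFFFF
    raise ValueError(f"unsupported mode {mode}")


def step(mem, regs):
    """Execute one instruction using only the table and generic code."""
    info = OPCODES[mem[regs["PC"]]]
    if info.mnemonic == "LDA":
        addr = operand_address(info.mode, mem, regs["PC"], regs["X"])
        result = regs["A"] = mem[addr]
    elif info.mnemonic == "TAX":
        result = regs["X"] = regs["A"]
    elif info.mnemonic == "INX":
        result = regs["X"] = (regs["X"] + 1) & 0xFF
    else:
        raise ValueError(f"unsupported operation {info.mnemonic}")
    if "Z" in info.flags:
        regs["Z"] = int(result == 0)
    if "N" in info.flags:
        regs["N"] = result >> 7
    regs["PC"] = (regs["PC"] + info.length) & 0xFFFF
```

    The trade-off is the usual one: the code stays small because addressing and flag handling are written once, but every instruction pays for the table lookup.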

    How about starting simpler? A Vic-20 is less complicated, with less memory and therefore less complicated programs. Or maybe a PET. It would get the 6502 emulator working, ready to do the C64 when the PROP II comes out.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Help to build the Propeller wiki - propeller.wikispaces.com
    Prop Room Robotics - my web store for Roomba spare parts in the UK
  • BaggersBaggers Posts: 3,019
    edited 2007-11-12 06:18
    CardboardGuru said...
    There are other approaches too.
    • Large memory model, if anyone gets that working in a practical way.
    • Rather than use SPIN, write your own byte code interpreter which is tailored for the problem space. Think of it as a metaphor for the microcode used to implement instructions on a CISC CPU.
    • Use a table of data in hub memory with each row representing a machine code instruction and the columns representing different elements of that instruction (register, addressing mode, type of operation, flags affected etc.) Then rather than implementing each instruction in the cog program, you are writing generic table driven code.
    • Only emulate a subset of instructions needed to run a particular program.
    Worth investigating.

    How about starting simpler? A Vic-20 is less complicated, with less memory and therefore less complicated programs. Or maybe a PET. It would get the 6502 emulator working, ready to do the C64 when the PROP II comes out.

    That's why I've been thinking of doing a Z80 emulator using the Large memory model, to start simple, with say a ZX81 emulator, and later do a Spectrum emu, either on Prop2 or HX512.
    But that's for later on, as I've got other stuff to finish first, and I don't have much spare time with work being busy at the mo.
  • AleAle Posts: 2,363
    edited 2007-11-12 06:21
    Cardboard:

    - The idea of the subset is quite sound. Anyway for any other model... we need the processor emulator, maybe with looser timings.
    I was diagramming yesterday the parallel tasks with more or less accurate timings and it seems to fit, as I explained before. I'll see if I can put a diagram together to show this, with some code.

    More than Prop2 we may need 2 Props :)
  • J.T.J.T. Posts: 31
    edited 2007-11-14 08:53
    Can't help but laugh at all the old C-64 references....

    Load $,8,1. That's classic.

    SYS 49152...I'm trying to recall what this did? (I think it was like "ctrl+alt+del", a reset?)

    I remember you could crash the thing by entering 99E88 into a number field. (overflow error)

    And the graphics were "sprites".

    ....I would download games from the local "BBS".· It would take like all night and as it was downloading it would add a dash like every 5 seconds and go across your screen:


    I had a Vic-20 first!

    fun stuff!
  • scottascotta Posts: 168
    edited 2007-11-15 13:05
    $FFD2 is my favorite Vic-20 address :)
  • Ken PetersonKen Peterson Posts: 806
    edited 2007-11-15 13:31
    LDA #$48
    JSR $FFD2
    LDA #$49
    JSR $FFD2
    LDA #$0D
    JSR $FFD2
    
    

    Gosh....those were the days!
    :)
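
    For readers who never typed that listing in: $FFD2 is the KERNAL's CHROUT entry, which prints the character held in A. The sketch below interprets just the two opcodes the snippet uses (LDA #imm and JSR abs) and treats a JSR to $FFD2 as a call to a Python callback. The `run` function and the callback are hypothetical stand-ins, not a real KERNAL emulation.

```python
CHROUT = 0xFFD2  # KERNAL character-output entry point

# The bytes of the listing above: LDA #$48, JSR $FFD2, and so on.
PROGRAM = bytes([
    0xA9, 0x48, 0x20, 0xD2, 0xFF,
    0xA9, 0x49, 0x20, 0xD2, 0xFF,
    0xA9, 0x0D, 0x20, 0xD2, 0xFF,
])


def run(program: bytes, chrout) -> None:
    """Interpret LDA #imm and JSR $FFD2 only; anything else is an error."""
    pc = 0
    a = 0
    while pc < len(program):
        opcode = program[pc]
        if opcode == 0xA9:                      # LDA #imm
            a = program[pc + 1]
            pc += 2
        elif opcode == 0x20:                    # JSR abs
            target = program[pc + 1] | (program[pc + 2] << 8)
            if target != CHROUT:
                raise NotImplementedError(hex(target))
            chrout(a)                           # stand-in for the KERNAL routine
            pc += 3
        else:
            raise NotImplementedError(hex(opcode))


if __name__ == "__main__":
    out = []
    run(PROGRAM, out.append)
    print(bytes(out))
```

    Trapping KERNAL entry points like this is a common shortcut in emulators, since it allows simple programs to run before any ROM code is emulated.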

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


    The more I know, the more I know I don't know. Is this what they call Wisdom?
  • Oldbitcollector (Jeff)Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-04-01 09:20
    I thought I'd post an answer to Dr A. in this thread instead of the "Thank you, Parallax" thread.

    This is still a very worthy project, and we have many of the pieces in place.

    * We have an SDRAM solution for memory
    * We have SID emulation
    * We have most of the requirements for a VIC emulation
    * We have a primitive 6502 emulator. Perhaps the 6510 could be achieved from it.

    I suspect that we should probably do this in steps, starting with a VIC-20 emulation, or perhaps even a PET emulator and work our way up. Personally, I think it would be a hoot to see Omega Race or Pirate Cove running off a prop.

    Thoughts? Is a VIC-20 emulation thread in order?

    OBC
  • trodosstrodoss Posts: 577
    edited 2011-04-01 09:39
    @OBC,
    I think that would be a great idea! The VICE emulator project would be a good resource (http://www.viceteam.org/)
  • Oldbitcollector (Jeff)Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-04-01 10:14
    I was thinking the exact same thing... I've been looking at the VICE emulator monitor this morning..

    Can some of the other Propeller "emulator" experts provide some direction on some starting points based on your experience with the Propeller?

    OBC
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-04-02 00:08
    Where would you start?

    Do you have a board with external memory? Attached is Juergen's CP/M emulation. The 'main' program is CPM.Spin.

    There are a few interesting things in there. A graphics driver. Sound driver. But I think the main thing is the LMM emulation code. Juergen said that this was the breakthrough that meant it was possible to keep coding opcode emulations without always having to worry about running out of cog space.

    For a C64, the disk drive format will be different but hopefully not too different.
    The opcodes are different but you maybe can drop in code already done.
    I think the sound chip emulation will be different - maybe you can hack the code that is already done.


    What was the standard graphics resolution for a C64?
  • potatoheadpotatohead Posts: 10,261
    edited 2011-04-02 00:19
    How fast is C on the Dracblade compared to SPIN?
  • Ahle2Ahle2 Posts: 1,179
    edited 2011-04-02 02:17
    The 6502 and the 6510 are almost identical; the only difference is the 8-bit I/O port, which is controlled by two memory-mapped registers at addresses $00 and $01 (data direction and data).
    These registers were used to map the different ROM chips in and out of the address space and also to handle communication with the datasette drive.
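
    To make the ROM mapping concrete, here is a simplified sketch of how the low three bits of the port at $01 select what the CPU sees. It assumes all three bits are configured as outputs and that no cartridge is attached, so the cartridge control lines and the data direction register are ignored; the function name is invented.

```python
def visible_at(addr: int, port: int) -> str:
    """What the CPU sees at addr for a given value of the $01 port.

    Simplified: no cartridge, port bits 0-2 assumed to be outputs.
    """
    loram = bool(port & 0x01)   # bit 0: BASIC ROM enable
    hiram = bool(port & 0x02)   # bit 1: KERNAL ROM enable
    charen = bool(port & 0x04)  # bit 2: I/O (1) or character ROM (0)

    if 0xA000 <= addr <= 0xBFFF and loram and hiram:
        return "BASIC ROM"
    if 0xE000 <= addr <= 0xFFFF and hiram:
        return "KERNAL ROM"
    if 0xD000 <= addr <= 0xDFFF and (loram or hiram):
        return "I/O" if charen else "CHAR ROM"
    return "RAM"


if __name__ == "__main__":
    for port in (0x37, 0x36, 0x35, 0x34):
        print(hex(port), [visible_at(a, port) for a in (0xA000, 0xD020, 0xE000)])
```

    An emulator has to run a check like this on every memory access (or rebuild a page table whenever $01 is written), which is the main extra work compared with a plain 6502.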

    The hardest part to emulate, BY FAAAAAR, is the VIC-II; everything else is either already done or quite easy to do.

    The 6502/6510 could have been very easy to emulate if it wasn't for the illegal opcodes.
    These are not fully documented even today. They are just byproducts of the way the 6502 selects opcodes by combining addressing-mode bits, instruction bits and instruction-type bits.
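
    The bit-field view described above is commonly written as the pattern aaabbbcc: three instruction bits, three addressing-mode bits and two group bits. The sketch below splits an opcode that way and names the fields for the regular cc = 01 group only. The helper names are made up, and the remark about the cc = 11 column is the usual informal explanation, not a complete account of the illegal opcodes.

```python
# Instruction and addressing-mode names for the regular group cc = 01.
GROUP1_OPS = ["ORA", "AND", "EOR", "ADC", "STA", "LDA", "CMP", "SBC"]
GROUP1_MODES = ["(zp,X)", "zp", "#imm", "abs", "(zp),Y", "zp,X", "abs,Y", "abs,X"]


def split_opcode(opcode: int) -> tuple:
    """Split an opcode byte into its (aaa, bbb, cc) bit fields."""
    return (opcode >> 5) & 0x07, (opcode >> 2) & 0x07, opcode & 0x03


def describe(opcode: int) -> str:
    """Name group-1 opcodes; report other groups by their raw fields."""
    aaa, bbb, cc = split_opcode(opcode)
    if cc == 0b01:
        return f"{GROUP1_OPS[aaa]} {GROUP1_MODES[bbb]}"
    return f"group cc={cc:02b} aaa={aaa:03b} bbb={bbb:03b}"


if __name__ == "__main__":
    for op in (0xBD, 0xA9, 0xAD, 0xAE, 0xAF):
        print(hex(op), split_opcode(op), describe(op))
```

    Here $AD (LDA abs) and $AE (LDX abs) differ only in the cc field, and the undocumented $AF has both cc bits set, which is often given as the reason it behaves like a combined load of A and X.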

    I have been coding CPU emulators for the 6502, 6510 and HuC6280 (a pimped 6502 used in the PCe).
    I'm sure it shouldn't be any problem to make a (quite) accurate emulation running at full speed in a single cog. I don't know how accurate Ericball's emulation is; it may be a good start for the 6510.

    A Vic20 is very simple compared to the C64. It uses the same CPU, but that's all it shares with the C64. So basing a C64 emulator on a Vic20 emulator makes no sense at all.

    /Ahle2
  • AleAle Posts: 2,363
    edited 2011-04-02 02:52
    @Ahle2:
    As the code for the C64 has already been written, the undocumented opcodes used by each piece of software are known. Do you happen to know how many of those opcodes are used, and how often? Because shooting for a perfect emulation may not be the best approach... methinks...

    Let's say you want to run "Skate or Die" (I don't make it that easy anyways...) ;-)
  • Ahle2Ahle2 Posts: 1,179
    edited 2011-04-02 03:00
    @Dr_Acula
    The standard graphics modes used on the C64 were:
    - Highres, 1bpp per character/tile at 320x200 resolution (used in the basic interpreter)
    - Multicolor lowres, 2bpp per character/tile at 160x200 resolution (used in most games)
    Colors can be selected from the fixed palette of 16 entries.
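
    As a quick sanity check on the two modes listed above, the pixel data alone takes the same amount of memory in both. The helper below is only arithmetic; it ignores the separate color information and the character/tile organisation.

```python
def bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed to store the raw pixel data of one frame."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    print("hires     :", bitmap_bytes(320, 200, 1), "bytes")
    print("multicolor:", bitmap_bytes(160, 200, 2), "bytes")
```

    Both come to 8000 bytes, roughly a quarter of the Propeller's 32 KB hub RAM if a full frame were buffered there.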

    This might not seem too hard to emulate on a Prop in a single cog.
    But the VIC-II in the C64 was nothing like its counterparts in other 8-bit machines of the time.
    It featured hardware scrolling, extremely good sprite capabilities (for the time), hardware collision detection, hardware sprite scaling (it's true)... etc etc ... etc ..

    I would be surprised if it could be done in less than 4 cogs.

    /Ahle2
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-04-02 05:14
    How fast is C on the Dracblade compared to SPIN?

    C is faster, but LMM is going to be a lot faster again.