Console Emulation - Page 7 — Parallax Forums

Console Emulation


Comments

  • So, uuhhhhhh it now doesn't work with DEBUG enabled, either? I diffed the code and I didn't leave something stupid on accident, no, there's something fishy going on for sure.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 12:26

    Okay, I figured it out: Due to uninitialized registers, the VDPR would trigger a spurious HBlank interrupt on startup. Due to questionable coding practice, it seems these games that don't use that interrupt don't have a particularly sensible IRQ 4 vector (Ms. Pacman literally fills all the vectors it doesn't think it needs with zeroes. Good job. That game is also not TMSS compliant, so, uh, yeah.)

    It'd work with DEBUG enabled since that changes the timing of cog startup, such that the CPU inits after the erroneous HInt and thus doesn't jump into the nonsense vector once interrupts are enabled (Ms. Pacman only enables interrupts during the actual game, all the other screens are timed by busy-waiting on the V counter)

    Now I'll just try to get the new VGA driver to more consistently sync up with the VDP emulation and then I'm almost ready to start with the PSRAM stuff. Still got some stubborn 128k ROMs.
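
To make the startup race described above concrete, here is a toy Python model (not the emulator's code). It assumes a single latched HBlank interrupt flag, and it assumes that CPU init discards whatever was latched before it came up. `boots_cleanly` and its arguments are made-up names for illustration.

```python
def boots_cleanly(cpu_init_time, spurious_hint_time=None):
    """Return True if no stale HBlank interrupt is pending when the game enables interrupts.

    Times are arbitrary units since power-up. spurious_hint_time=None models
    a VDP whose registers are initialized, so no spurious interrupt fires.
    """
    events = [(cpu_init_time, "cpu_init")]
    if spurious_hint_time is not None:
        events.append((spurious_hint_time, "hint"))

    pending = False
    for _, kind in sorted(events):
        if kind == "hint":
            pending = True    # the interrupt gets latched
        else:
            pending = False   # assumption: CPU init drops anything latched earlier
    # The game enables interrupts later. A stale latch would send the CPU
    # through an IRQ 4 vector that the game never set up.
    return not pending


print(boots_cleanly(cpu_init_time=20, spurious_hint_time=10))  # CPU starts late (DEBUG build)
print(boots_cleanly(cpu_init_time=5, spurious_hint_time=10))   # CPU starts early
print(boots_cleanly(cpu_init_time=5))                          # registers initialized
```

In this model only the middle case ends with a stale interrupt pending, which matches the observation that the DEBUG build masked the problem by shifting cog startup timing.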

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 13:07

    Okay, so here is a version with the new video code with correct Vcounter emulation and H32 pixel clock stretching. Currently VGA only, but you can choose between Line2x (->480p), Line3x (->720p) and Line4x (->960p) modes (see top of megayume_upper.spin2 and MegaVGA.spin2). The 4X mode is too much for the crappy monitor I have hooked up, so if someone could check if that one actually works, that'd be neat. Might just add NTSC mode next. Interlace needs thinking about, too, but I think that has to wait until I can get Sonic 2 to run (to my knowledge the only game that actually uses that mode).

    EDIT: Oop, forgot to comment out some of the Pin 38/39 LED stuff again.

  • roglohrogloh Posts: 5,840
    edited 2022-01-05 13:58

    Very nice. Gameplay seems to work well and video looks silky smooth. I'm secretly a tad disappointed it needs your own custom video driver now so I won't be able to use it on my own personal P2 LCD platform, but it's great work anyway. This was probably going to happen at some point given your other needs.

    Update: I also tried the 4x mode and it seemed to display fine on my Sony CRT at the 63kHz line rate.
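
The mode names and the 63kHz figure follow from simple arithmetic. A small Python check, assuming the driver's base timing is close to nominal NTSC 240p (about 15.734 kHz line rate); the real driver timing may differ slightly:

```python
NTSC_ACTIVE_LINES = 240
NTSC_LINE_RATE_KHZ = 15.734  # nominal NTSC horizontal rate (assumed base timing)


def scaled_mode(multiplier):
    """Active line count and approximate line rate for a LineNx mode."""
    return {
        "active_lines": NTSC_ACTIVE_LINES * multiplier,
        "line_rate_khz": round(NTSC_LINE_RATE_KHZ * multiplier, 1),
    }


for n in (2, 3, 4):
    print(n, scaled_mode(n))
```

With a multiplier of 4 this gives 960 active lines at roughly 62.9 kHz, in line with the ~63kHz reported on the CRT.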

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 16:43

    @rogloh said:
    Very nice. Gameplay seems to work well and video looks silky smooth. I'm secretly a tad disappointed it needs your own custom video driver now so I won't be able to use it on my own personal P2 LCD platform, but it's great work anyway. This was probably going to happen at some point given your other needs.

    Adding LCD support shouldn't be too difficult, right? Can't do the dynamic pixel clock, so have to add a special case for that, but HDMI would need that, too. Speaking of, need to contrive a HDMI mode that actually pushes the clock to ~320MHz. Tricky, since vertical timing needs to be a multiple of NTSC 240p. Then again, I think blanking is supposed to auto-calibrate, right? So just add enormous HBlanks?

    Update: I also tried the 4x mode and it seemed to display fine on my Sony CRT at the 63kHz line rate.

    Cool. Though the higher multipliers are really intended for use with LCD monitors that do smooth scaling to increase image sharpness :)

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 18:08

    Got NTSC working with the new video code... PAL60 is next up, I guess.
    Also got the pillarboxing for LCD/HDMI implemented already.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 19:08

    Here's a version with NTSC and PAL60 support. The latter is only good for svideo, the dot crawl on composite is a bit much.

    @rogloh If you want to look into wrenching LCD support or whatever in there, go ahead. I'm done with the video code for now.
    I've already implemented the pillarbox thing. If the H32 NCO value is zero, it adds the pillars.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-05 19:50

    Also, I just fixed the last blackscreening ROMs (by implementing 8 bit VDP port access), so it really is PSRAM time!

    EDIT: Intermediate release. Moreso for my own purpose of diffing later on.

  • Wow, you seem to be making really fast progress now. I'll have a look at your video driver sometime when I can.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-06 00:02

    Started on the PSRAM read code. Reads... something (should print out the "SEGA MEGA DRIVE " text from a Megadrive ROM header (at $100)). I think it only actually gets one of the RAM slices into read mode??? But it seems to be consistent at that.
    Tired for today, need to go to bed.

    CON
    '{
    _CLKFREQ = 300_000_000
    PSRAM_CLK = 56
    PSRAM_SELECT = 57
    PSRAM_BASE = 40
    '}
    
    {
    _CLKFREQ = 10_000_000
    PSRAM_CLK = 15
    PSRAM_SELECT = 14
    PSRAM_BASE = 24
    '}
    
    PSRAM_WAIT = 12
    PSRAM_DELAY = 3
    
    DAT
            org
            asmclk
            setq2 #511
            rdlong 0,##@lutstuff
    
            fltl #PSRAM_CLK
            wrpin ##P_TRANSITION|P_OE, #PSRAM_CLK
            wxpin #1, #PSRAM_CLK
            drvl #PSRAM_CLK
    
            drvh #PSRAM_SELECT
    
            waitx #200
    
            debug("init ok")
    
            wrlong #"A",##@romio_area
    
            mov iter,#5
    .lp
            mov pa,##$100 >> 2
            mov read_longs,#4
            call #readburst
            debug(lstr(#@romio_area,#16),uhex_byte_array(#@romio_area,#16))
            djnz iter,#.lp
    
            jmp #$
    
    readburst
                  setxfrq nco_slow
                  wrfast  bit31,##@romio_area
                  mov tmp1,#(8+PSRAM_WAIT)*2
                  shl read_longs,#1
                  setword read_cmd,read_longs,#0
                  shl read_longs,#1
                  add tmp1,read_longs
                  setbyte pa,#$EB,#3
                  splitb  pa
                  rev     pa
                  movbyts pa, #%%0123
                  mergeb  pa
                  drvl  #PSRAM_SELECT
                  drvl  bus_pinfield
                  xinit addr_cmd,pa
                  wypin tmp1,#PSRAM_CLK
                  xcont #PSRAM_WAIT,#0
                  setq nco_fast
                  xcont #PSRAM_DELAY,#0
                  fltl bus_pinfield
                  setq nco_slow
                  xcont read_cmd,#0
                  waitxfi
            _ret_ drvh #PSRAM_SELECT
    
    
    bit31         long  negx
    bus_pinfield  long PSRAM_BASE addpins 15
    addr_cmd      long (PSRAM_BASE<<17)|X_PINS_ON | X_IMM_8X4_LUT + 8
    read_cmd      long (PSRAM_BASE<<17)|X_WRITE_ON| X_16P_2DAC8_WFWORD
    
    nco_fast      long $8000_0000
    nco_slow      long $4000_0000
    
    read_longs    long 4
    
    tmp1          res 1
    iter          res 1
    
    
    DAT
            org $200
    lutstuff
                  long $0000
                  long $1111
                  long $2222
                  long $3333
                  long $4444
                  long $5555
                  long $6666
                  long $7777
                  long $8888
                  long $9999
                  long $AAAA
                  long $BCCC
                  long $CCCC
                  long $DDDD
                  long $EEEE
                  long $FFFF
    
    
    
    DAT
                  orgh
                  alignl
    
    romio_area    byte 0[256]
    
    
    
    
  • ...
    And then it immediately hits me that the LUT table has a typo. Owie ouch. Eitherhow, after setting WAIT to 10 and DELAY to 3, I beget this here joyful output. Good night.
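
A typo like this is easy to catch mechanically, since every entry in the posted `lutstuff` table is just its index repeated in all four nibbles (index times $1111). A minimal Python check of the first listing's table:

```python
def expected_lut():
    """Entry n replicates nibble n across all four 4-bit lanes of the 16-bit bus."""
    return [n * 0x1111 for n in range(16)]


def find_lut_typos(table):
    """Return (index, got, want) for every entry that breaks the pattern."""
    return [
        (i, got, want)
        for i, (got, want) in enumerate(zip(table, expected_lut()))
        if got != want
    ]


# Table as posted in the first listing
posted = [
    0x0000, 0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666, 0x7777,
    0x8888, 0x9999, 0xAAAA, 0xBCCC, 0xCCCC, 0xDDDD, 0xEEEE, 0xFFFF,
]

for index, got, want in find_lut_typos(posted):
    print(f"entry {index}: got ${got:04X}, expected ${want:04X}")
```

The bad entry is the one used for nibble $B, which is the second nibble of the $EB read command. That would be consistent with only one of the four RAM slices seeing a valid read command, though that is an inference from the listing, not something confirmed in the thread.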

  • Well done. That was fast. Your first reads of the PSRAM :smile:

    Only 24 instructions in your read burst code routine, so the read overhead should remain quite low and can be dominated by a longer transfer itself, which is good.

    It would be nice to time the actual transfer of the small data reads (e.g. as individual 16 bit words) and also a longer burst to help you determine what size fits best with your ROM instruction pre-fetching. At 300MHz you might get something between 2-4 single read transfers done per microsecond once you add your execution time, so with pre-fetching hopefully your emulator can still perform well.
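
A rough back-of-envelope version of that estimate in Python. This is a sketch under loud assumptions, not a measurement: it takes the clock transition count from the posted `readburst` formula, assumes one transition per system clock, assumes 2 clocks for each of the 24 instructions, and simply adds the two even though some instructions run while the transfer is in flight. `PSRAM_WAIT` is taken as 10.

```python
def burst_cycles(longs, wait_clocks=10, instr_count=24, cycles_per_instr=2):
    """Crude cycle estimate for one read burst (likely pessimistic)."""
    # (8 command/address clocks + wait clocks) * 2 transitions, plus 4 transitions per long
    transitions = (8 + wait_clocks) * 2 + longs * 4
    overhead = instr_count * cycles_per_instr  # assumed: no overlap with the transfer
    return transitions + overhead


def reads_per_microsecond(longs, sysclk_mhz=300):
    return sysclk_mhz / burst_cycles(longs)


for n in (1, 4, 16):
    print(n, burst_cycles(n), round(reads_per_microsecond(n), 2))
```

Under these assumptions a single-long read costs 88 clocks, or about 3.4 reads per microsecond at 300MHz, which falls inside the 2-4 range mentioned above. Actual timing on hardware is the only reliable number.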

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-06 11:40

    I can move a couple of instructions after the address/delay, too, so that's a few more cycles that can be saved. I wonder if there's a right moment to write to the clock pin a second time, then I could eliminate some more of those frontloaded instructions.
    But let's get it working before trying stuff like that.

    CON
    '{
    _CLKFREQ = 300_000_000
    PSRAM_CLK = 56
    PSRAM_SELECT = 57
    PSRAM_BASE = 40
    '}
    
    {
    _CLKFREQ = 10_000_000
    PSRAM_CLK = 15
    PSRAM_SELECT = 14
    PSRAM_BASE = 24
    '}
    PSRAM_WAIT  = 10
    PSRAM_DELAY = PSRAM_WAIT*2+3
    
    DAT
            org
            asmclk
            setq2 #511
            rdlong 0,##@lutstuff
    
            fltl #PSRAM_CLK
            wrpin ##P_TRANSITION|P_OE, #PSRAM_CLK
            wxpin #1, #PSRAM_CLK
            drvl #PSRAM_CLK
    
            drvh #PSRAM_SELECT
    
            setxfrq nco_slow
    
            waitx #200
    
            debug("init ok")
    
            wrlong #"A",##@romio_area
    
            mov iter,#5
    .lp
            mov pa,##$100 >> 2
            mov read_longs,#4
            call #readburst
            debug(lstr(#@romio_area,#16),uhex_byte_array(#@romio_area,#16))
            djnz iter,#.lp
    
            jmp #$
    
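    readburst
                  mov tmp1,#(8+PSRAM_WAIT)*2      ' clock transitions for command+address (8 clocks) plus wait clocks
                  shl read_longs,#2               ' 4 clock transitions per long
                  add tmp1,read_longs
                  setbyte pa,#$EB,#3              ' read command $EB goes in the top byte, above the address
                  splitb  pa                      ' these four instructions together
                  rev     pa                      ' reverse the nibble order of pa
                  movbyts pa, #%%0123
                  mergeb  pa
                  drvl  #PSRAM_SELECT             ' select the PSRAM
                  drvl  bus_pinfield              ' drive the data bus
                  xinit addr_cmd,pa               ' stream out command + address via the LUT
                  wypin tmp1,#PSRAM_CLK           ' start the clock transitions
                  setq nco_fast
                  xcont #PSRAM_DELAY,#0           ' idle through the wait period
                  wrfast  bit31,read_dest         ' FIFO writes to the hub destination
                  shr read_longs,#1               ' word count (note: read_longs is left doubled on return)
                  setword read_cmd,read_longs,#0
                  waitxmt
                  fltl bus_pinfield               ' release the bus so the PSRAM can drive it
                  setq nco_slow
                  xcont read_cmd,#0               ' capture the data words into hub RAM
                  waitxfi
            _ret_ drvh #PSRAM_SELECT              ' deselect and return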
    
    
    bit31         long  negx
    bus_pinfield  long PSRAM_BASE addpins 15
    addr_cmd      long (PSRAM_BASE<<17)|X_PINS_ON | X_IMM_8X4_LUT + 8
    read_cmd      long (PSRAM_BASE<<17)|X_WRITE_ON| X_16P_2DAC8_WFWORD
    
    nco_fast      long $8000_0000
    nco_slow      long $4000_0000
    
    read_longs    long 4
    read_dest     long @romio_area
    
    tmp1          res 1
    iter          res 1
    
    
    DAT
            org $200
    lutstuff
                  long $0000
                  long $1111
                  long $2222
                  long $3333
                  long $4444
                  long $5555
                  long $6666
                  long $7777
                  long $8888
                  long $9999
                  long $AAAA
                  long $BBBB
                  long $CCCC
                  long $DDDD
                  long $EEEE
                  long $FFFF
    
    
    
    DAT
                  orgh
                  alignl
    
    romio_area    byte 0[256]
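
The `splitb` / `rev` / `movbyts` / `mergeb` sequence in `readburst` is not obvious at a glance. The Python sketch below models those four instructions bit by bit, following my reading of the P2 instruction descriptions (treat the bit mappings as an assumption), and shows that their net effect is to reverse the order of the eight nibbles. Presumably this is so the streamer, sending the low nibble first, puts the $EB command on the bus before the address.

```python
def splitb(s):
    """SPLITB: bit k of nibble j moves to bit j of byte k."""
    d = 0
    for nibble in range(8):
        for bit in range(4):
            if (s >> (nibble * 4 + bit)) & 1:
                d |= 1 << (bit * 8 + nibble)
    return d


def mergeb(s):
    """MERGEB: inverse of SPLITB, bit j of byte k moves to bit k of nibble j."""
    d = 0
    for byte in range(4):
        for bit in range(8):
            if (s >> (byte * 8 + bit)) & 1:
                d |= 1 << (bit * 4 + byte)
    return d


def rev(s):
    """REV: reverse all 32 bits."""
    return int(f"{s:032b}"[::-1], 2)


def reverse_bytes(s):
    """MOVBYTS D,#%%0123: reverse the byte order."""
    return int.from_bytes(s.to_bytes(4, "little"), "big")


def prepare_command(long_address, command=0xEB):
    """Model of the setbyte/splitb/rev/movbyts/mergeb sequence in readburst."""
    pa = (long_address & 0x00FFFFFF) | (command << 24)  # setbyte pa,#$EB,#3
    return mergeb(reverse_bytes(rev(splitb(pa))))


def reverse_nibbles(x):
    """Direct nibble reversal, for comparison."""
    return int(f"{x:08X}"[::-1], 16)


pa = 0x100 >> 2  # same address the test program reads
print(f"${prepare_command(pa):08X}")
```

For the test program's address this yields $040000BE, the nibble-reversed form of $EB000040.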
    
    
  • Behold, it works!

    Instructions:
    Use the provided PSRAM loading script loadit.rb to preload a ROM into PSRAM.
    Use build.sh to compile the emulator and then load megayume_lower.binary

    Here's what I've tried so far. Keep in mind that Z80 is completely missing still, so that'll explain a lot of the stuff that flat-out doesn't work.

    If you want to play along at home, please acquire the ROM images yourself, I'll get into hell's kitchen (no not the Gordon Ramsay one, the actual one) if I distribute all of these. Well, not really, but you get the point. The CRC32 of my ROMs is provided for comparison. Also, make sure to fiddle the region setting between overseas and domestic, some games only work in one region or have slight differences.

    | Game Name | ROM CRC32 | What it does |
    |---|---|---|
    | Sonic the Hedgehog | F9394E97 | Works! As far as I tested, anyways. |
    | Sonic the Hedgehog 2 | 1E9A2E0D | Hangs after level is loaded. |
    | Sonic the Hedgehog 3 Complete | 2BD564B1 | Blackscreen after menu (need backup SRAM?) |
    | Thunder Force IV | 8D606480 | Playable with minor graphics corruption. Crashes on game over. |
    | Streets of Rage | BFF227C6 | Crashes on character selection or demo start. |
    | Streets of Rage 2 | E01FA526 | White screen. |
    | Xeno Crisis | AC5F4CCA | Black screen. |
    | MUSHA Aleste | 8FDE18AB | Hang on SEGA logo. |
    | Castlevania: Bloodlines | FB1EA6DF | Hang on Konami logo. |
    | Out Run | FDD9A8D2 | Screen full of corrupted road graphics. |
    | TITAN Overdrive | 03826F26 | Black screen. What did I expect? |
    | Columns | 03163D7A | Still works. |
    | Ishido | B1DE7D5E | Still works. |
    | Ms. Pacman | AF041BE6 | Still works. |
    | Flicky | 4291C8AB | Still mostly works, broken timer. |
    | Panorama Cotton | 9E57D92E | WORKS?? Interesting, considering OutRun didn't. |
    | Gynoug | 1B69241F | Black screen. |
    | Bubsy: Encounters of the furred kind | 3E30D365 | Black screen. Thank god. |
    | Vectorman | D38B3354 | Black screen. |
    | Shining Force | E0594ABE | Black screen. |

    Notable is that I don't think any of the games that do work have performance issues. Thunderforce IV usually drops a few frames when things get busy, but it seems that doesn't happen (until it crashes, anyways)

    Also, in terms of games that work without immediately evident issues, Sonic 1 and Panorama Cotton sure are an odd pair. I only remembered the latter exists after OutRun failed and I wanted another one with a 3D road effect, so I didn't test very far at all.

  • Well done Wuerfel_21! Having that PSRAM working fast enough totally rules. Now you'll have to fix whatever emulator bugs are left and add the Z80.

  • In case of random crashes or startup hangs, you may also want to validate all of the ROMs are getting loaded into the PSRAM, maybe with some checksum or something. I found a problem yesterday and it might be script failure. A whole chunk of a downloaded picture into PSRAM which looked to be about 256kB (slice size) seemed to be missing from the image being output. The ruby script you have might be not detecting any failure in loadp2 to connect/download to the P2.
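
One way to do the checksum validation suggested above, sketched in Python. It only shows the comparison logic: computing a CRC32 in the same 8-hex-digit form as the test table, and finding the first byte that differs between the original image and a readback. How the readback is obtained from the P2 is left open, and the function names are made up for illustration.

```python
import zlib


def rom_crc32(data: bytes) -> str:
    """CRC32 of a ROM image as 8 uppercase hex digits."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08X}"


def first_mismatch(original: bytes, readback: bytes):
    """Offset of the first differing byte, or None if the images are identical."""
    for offset, (a, b) in enumerate(zip(original, readback)):
        if a != b:
            return offset
    if len(original) != len(readback):
        return min(len(original), len(readback))
    return None


# Example with synthetic data: a 1 KiB chunk missing from the middle of the readback
image = bytes(range(256)) * 16
damaged = image[:1024] + bytes(1024) + image[2048:]
print(rom_crc32(image), rom_crc32(damaged), first_mismatch(image, damaged))
```

A missing slice-sized chunk like the one described would show up both as a CRC mismatch and as a mismatch offset at the start of the hole.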

  • @rogloh said:
    In case of random crashes or startup hangs, you may also want to validate all of the ROMs are getting loaded into the PSRAM, maybe with some checksum or something. I found a problem yesterday and it might be script failure. A whole chunk of a downloaded picture into PSRAM which looked to be about 256kB (slice size) seemed to be missing from the image being output. The ruby script you have might be not detecting any failure in loadp2 to connect/download to the P2.

    Yeah, It isn't checking the return code. Should probably do that.

    But that isn't the case, it has worked for me every time and conversely, the games that don't work still don't even after loading them again.

    Also interesting:

    • Your LCD board has a single PSRAM, right? If you confer me a driver for that, I can try to add support and see how slow it is.
    • I've had it running Sonic in attract mode for like an hour and it didn't do anything weird, though the Edge gets really hot. So it's nice and stable I guess.
    • Altered Beast (CRC: 154D59BB) also works! But lots of transient graphics glitches, which I think should be blanked out, but the flag for that isn't implemented yet. Some of the other games also exhibit this, but not to such an extent. Sonic actually avoids it by fading to black for any screen transitions.
    • A Hedgehog, a witch and a muscly dude walk into a bar. I feel like there's a joke there somewhere.
    • There is actually a noticeable issue in Panorama Cotton: The shadows don't scale properly (always seem to draw at the smallest size). Kind of a bad thing in a game that is predicated on fake 3D depth cues like that.
  • Your LCD board has a single PSRAM, right? If you confer me a driver for that, I can try to add support and see how slow it is.

    I've not got that done yet. Am still considering how/when to do it. Hopefully it's not a major rewrite to my existing code but it is quite a bit different to the 16 bit bus case. In general it is simpler as everything is byte addressable. The trickiest case is actually an 8 bit wide bus (2 PSRAM chips).

    I've had it running Sonic in attract mode for like an hour and it didn't do anything weird, though the Edge gets really hot. So it's nice and stable I guess.

    Yes I noticed this too. At 325MHz with my external frame buffer video from PSRAM the P2 Edge gets very warm to hot, but it remains stable too for long periods of testing. It mainly seems to be the P2 that is the source of the heat not so much the other devices on the board like the memory and regulators, though a thermal camera would be the best way to confirm that.

  • Additional observations:

    • Figured out that long-standing "Flicky timer issue" (I didn't even notice, but the issue also affected the round counter, which would always stay at one). Another case of the classic TEST vs. TESTB typo in the BCD add/sub instructions. I went through every remaining TEST in the code, I think that actually was the last of these bugs. Didn't seem to fix anything else though.
    • MUSHA actually hard-crashes - the keyboard's capslock stops responding
    • Completing a level in Thunderforce does not crash and works as you'd want it to.
    • I don't know why, but Sonic 2 didn't lock up one time I tried and let me complete Emerald Hill Act 1, albeit with Sonic and Tails' sprites all jumbled up. It then locked up at the start of Act 2 as usual. Strange.

    Eitherhow, here's a new version with the fixed BCD instructions and the screen blank flag implemented.

  • @rogloh said:
    The trickiest case is actually an 8 bit wide bus (2 PSRAM chips).

    Why is that?

  • roglohrogloh Posts: 5,840
    edited 2022-01-07 01:08

    Because it can't natively represent a full 32 bit word or be atomically byte addressable. At least the other two cases (4 or 1 chip) gains one of those advantages. The native size of the contents addressable is 16 bits instead, which sits right in the middle and you need special handling for the other sizes. In any case I figured out how to do read-modify-write to overcome these sorts of limitations.
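
A minimal Python sketch of the read-modify-write idea for a memory whose native unit is 16 bits. This is a generic illustration, not rogloh's driver: `WordMemory`, the helper names and the little-endian byte lane choice are all assumptions.

```python
class WordMemory:
    """Memory whose smallest native access is one 16-bit word."""

    def __init__(self, words):
        self.words = [0] * words

    def read_word(self, index):
        return self.words[index]

    def write_word(self, index, value):
        self.words[index] = value & 0xFFFF


def write_byte(mem, byte_addr, value):
    """Byte write done as read-modify-write of the containing word."""
    index = byte_addr >> 1
    word = mem.read_word(index)
    if byte_addr & 1:
        word = (word & 0x00FF) | ((value & 0xFF) << 8)  # replace the high byte
    else:
        word = (word & 0xFF00) | (value & 0xFF)         # replace the low byte
    mem.write_word(index, word)


def read_long(mem, byte_addr):
    """A 32-bit read needs two native word accesses (word-aligned address assumed)."""
    index = byte_addr >> 1
    return mem.read_word(index) | (mem.read_word(index + 1) << 16)


mem = WordMemory(4)
mem.write_word(0, 0x1234)
write_byte(mem, 1, 0xAB)
print(hex(mem.read_word(0)))
```

Both the byte write and the long read need extra handling here, which is the "sits right in the middle" problem described above.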

  • Got the Thunderforce crash on log. Apparently this here MOVE.W is triggering an address error trying to access $FFFF_6E0D. As one can see, A0 should be $C00004 (VDP control port). And even if it wasn't, that should generate the address error in the preceding instructions. A true mystery.

  • roglohrogloh Posts: 5,840
    edited 2022-01-07 15:03

    Whenever you see something that you suspect could be an error in an instruction, you also need to be sure it is not a PSRAM read error. If the PSRAM input timing you are using is marginal at your clock rate/temperature, there is going to be some potential chance of intermittent read errors. If a crash is detectable it would be good to capture whatever opcodes were read and compare with the original ROM source at the same offset to be sure it is not PSRAM read errors. Also if you run my PSRAM timing utility over the P2 frequency you use you can ensure the PSRAM input timing you have setup is basically in the middle of the acceptable range of values not right at the edge where the signal integrity may not be there when the data is sampled. This may vary with temperature too so it would be good to run it once the system is quite hot to be sure your input timing is valid.

  • I'm also not sure you have the extra "half step" in input timing where you enable/disable the clocked/live setting on the data pins. This gives you finer input timing granularity and becomes important to help the sampling remain centered in the data window as much as possible.

  • No, I don't have any of the pins in clocked mode right now. But I think PSRAM data errors would take a different form. The errors I get happen the same way every time.

    I think I narrowed the Thunderforce crash down to something getting corrupted by the DMA transfer that is triggered by the move at 19BC. It is not A0 though.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-07 17:17

    Yep, DMA was corrupting the instruction cache, classic. This happened because the DMA code wasn't resetting the ROM read length and the PSRAM code was leaving it doubled, so any transfer that got split into multiple chunks would actually read exponentially more from PSRAM with each chunk. The next thing after the general-purpose ROM read buffer is the ROM code cache, so that got bungled and everything crashed.

    Here's some updated game testing (tested more, but left out uninteresting ones):

    | Game Name | ROM CRC32 | What it does |
    |---|---|---|
    | Sonic the Hedgehog 2 | 1E9A2E0D | Works! At least as far as I got (Chemical Plant Act 2). Black screen on 2 player mode. |
    | Sonic the Hedgehog 3 Complete | 2BD564B1 | Gets into the game, but softlocks at a certain point. |
    | Thunder Force IV | 8D606480 | Works! Minor graphics issues in intro and menus. |
    | Streets of Rage | BFF227C6 | Works! |
    | Streets of Rage 2 | E01FA526 | Runs, but heavily corrupted graphics |
    | MUSHA Aleste | 8FDE18AB | Still crashes after SEGA logo |
    | Out Run | FDD9A8D2 | Works! Played a full game, no issues. |
    | Vectorman | D38B3354 | Grey screen after BlueSky logo. |
    | TITAN Overdrive | 03826F26 | Black screen after loading animation. |
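
The DMA read-length bug described at the top of this post can be reproduced in a few lines of Python. The model below mirrors the `shl read_longs,#2` / `shr read_longs,#1` pair from the posted test routine, which leaves the length register doubled on return; the emulator's actual DMA code is not shown in the thread, so `RomReader` and `dma_chunks` are illustrative stand-ins.

```python
class RomReader:
    def __init__(self):
        self.read_longs = 0
        self.transferred = []  # longs fetched by each call

    def readburst(self):
        self.read_longs <<= 2                          # shl read_longs,#2 (clock transitions)
        self.transferred.append(self.read_longs // 4)  # longs actually streamed in
        self.read_longs >>= 1                          # shr read_longs,#1 (word count), left doubled


def dma_chunks(chunk_longs, chunks, reset_each_time):
    reader = RomReader()
    reader.read_longs = chunk_longs
    for _ in range(chunks):
        if reset_each_time:
            reader.read_longs = chunk_longs  # the fix: caller reloads the length
        reader.readburst()
    return reader.transferred


print(dma_chunks(4, 3, reset_each_time=False))  # each chunk reads twice as much as the last
print(dma_chunks(4, 3, reset_each_time=True))
```

Without the reset, the third chunk already writes four times the intended amount into the read buffer, which is how the data spilled into whatever sits after it.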
  • Fixed the MUSHA crash. Was setting the Window split X to a position outside the screen, which causes some of the VDPR's calculations to underflow, leading to it trying to draw an infinitely wide Plane A, wrecking all memory in the process.

    Game now works, but for some reason the icons are missing from the status display (and you can see Plane B through the holes).
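
The underflow itself is the usual unsigned-subtraction trap. A hedged Python illustration with made-up names, since the VDPR's real calculation is not shown in the thread: if the split position exceeds the screen width, the 32-bit difference wraps to a huge value, and clamping the split first avoids that.

```python
MASK32 = 0xFFFFFFFF


def plane_a_width_unchecked(screen_width, split_x):
    """32-bit register arithmetic: wraps around when split_x > screen_width."""
    return (screen_width - split_x) & MASK32


def plane_a_width_clamped(screen_width, split_x):
    """Clamp the split position to the screen before subtracting."""
    return screen_width - min(split_x, screen_width)


print(plane_a_width_unchecked(320, 336))  # wraps to a value near 2**32
print(plane_a_width_clamped(320, 336))
```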

  • Turns out MUSHA actually uses VRAM copy to move these little icons into place and I implemented that feature so totally wrong that it didn't even trigger at all (Copy trigger is more like a read command, whereas the other DMA ops use write commands). Also, it is the only DMA op that operates on bytes, not words.

    Well, I guess that works now. Doesn't seem to benefit any other game, perhaps because VRAM copy is of very little use.

  • Wuerfel_21Wuerfel_21 Posts: 5,114
    edited 2022-01-08 17:12

    Before going down the rabbit hole of tracking down the graphical issues in TF4 and SOR2 (a very cool task, because as I found out, no PC emulator actually supports breaking on VRAM access. Regen claims to support it, but it doesn't actually work. Genius. At least Gens can log DMAs...) and whatever the heck is going on with Sonic 3, I thought to look into all the games in my test set that don't boot at all.

    | Game Name | ROM CRC32 | Why it doesn't work |
    |---|---|---|
    | Shining Force | E0594ABE | Needs Z80 RAM. |
    | Vectorman | D38B3354 | Address Error (jump to nonsense address) |
    | Castlevania: Bloodlines | FB1EA6DF | Just needs region to be set to overseas. Not sure why it hangs on the logo instead of stating that. Silly Konami. |
    | Xeno Crisis | AC5F4CCA | Not sure, but it keeps reading Z80 RAM, so something Z80-related I guess? |
    | Bubsy: Encounters of the furred kind | 3E30D365 | Similar thing, slightly odd code that checks some location in Z80 RAM runs continually. |
    | Gynoug | 1B69241F | Needs Z80 RAM. |
    | TITAN Overdrive | 03826F26 | Needs Z80 emulation (sets a RAM location to zero, enables Z80 for a bit and then checks if it is still zero). |

    So indeed, my suspicion that a lot of them need Z80 (or at least its RAM) is sorta confirmed, but there's also one that seems to have an actual issue and one that's just stupid.


    Check out this weird little loop that Shining Force gets stuck on (Gynoug has a similar thing).

    "Yeah mate make sure the value actually got written". No idea why they do that (Z80 has to be halted to access its bus, so it's not some race condition thing).


    As said, Castlevania just fails to tell you that the region is incorrect. If you set it to overseas (or use a Japanese ROM, I guess), it actually works, including the somewhat involved water effect in Stage 2:

    This is accomplished by using a raster interrupt to change three things mid-frame:
    - enable shadow/highlight mode to darken everything
    - change VSRAM (a mirrored copy of the level is kept in VRAM and updated as you destroy candles and stuff like that)
    - change sprite table base. This doesn't update the VDP's internal cache of sprite Y positions, so every sprite is duplicated and the unwanted version is moved off-screen horizontally in the sprite table for each part of the screen.


    Somewhat unrelatedly, the reason Sonic 2 blackscreens on 2 player mode (and when you wait too long on the title screen, being that the first demo in the rotation is of 2 player mode) is simply that it waits on the odd-field flag to be set. So that really just needs a proper interlace mode implementation.


    So, yeah, will probably check out that Vectorman crash, because that seems easier to debug :)

    Also, should probably find some more games that don't work.

  • Vectorman problem identified and fixed. Yet another typo kinda thing wherein I wrote zerox pb,#24 instead of zerox pb,#23.
    Game seems to work without issues now, nothing else seems affected.
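
Why the off-by-one matters: ZEROX zero-extends above the given bit number, so `#23` keeps bits 23..0 (24 bits) while `#24` keeps 25 bits. A Python model of the instruction; the example assumes the value being masked is a 68000 address (24-bit bus), which the post does not state explicitly.

```python
def zerox(value, top_bit):
    """ZEROX D,S: keep bits S[4:0]..0 of D and clear everything above."""
    return value & ((1 << ((top_bit & 31) + 1)) - 1)


address = 0x01C00004  # junk in bit 24, valid 24-bit address below it
print(hex(zerox(address, 23)))  # 24-bit result
print(hex(zerox(address, 24)))  # bit 24 leaks through
```

With `#24`, any stray bit 24 survives the mask, which would be enough to turn a valid target into a nonsense address.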
