So, uuhhhhhh it now doesn't work with DEBUG enabled, either? I diffed the code and I didn't leave something stupid on accident, no, there's something fishy going on for sure.
Okay, I figured it out: Due to uninitialized registers, the VDPR would trigger a spurious HBlank interrupt on startup. Due to questionable coding practice, it seems these games that don't use that interrupt don't have a particularly sensible IRQ 4 vector (Ms. Pacman literally fills all the vectors it doesn't think it needs with zeroes. Good job. That game is also not TMSS compliant, so, uh, yeah.)
It'd work with DEBUG enabled since that changes the timing of cog startup, such that the CPU inits after the erroneous HInt and thus doesn't jump into the nonsense vector once interrupts are enabled (Ms. Pacman only enables interrupts during the actual game, all the other screens are timed by busy-waiting on the V counter)
Now I'll just try to get the new VGA driver to more consistently sync up with the VDP emulation and then I'm almost ready to start with the PSRAM stuff. Still got some stubborn 128k ROMs.
Okay, so here is a version with the new video code with correct Vcounter emulation and H32 pixel clock stretching. Currently VGA only, but you can choose between Line2x (->480p), Line3x (->720p) and Line4x (->960p) modes (see top of megayume_upper.spin2 and MegaVGA.spin2). The 4X mode is too much for the crappy monitor I have hooked up, so if someone could check if that one actually works, that'd be neat. Might just add NTSC mode next. Interlace needs thinking about, too, but I think that has to wait until I can get Sonic 2 to run (to my knowledge the only game that actually uses that mode).
EDIT: Oop, forgot to comment out some of the Pin 38/39 LED stuff again.
Very nice. Gameplay seems to work well and video looks silky smooth. I'm secretly a tad disappointed it needs your own custom video driver now so I won't be able to use it on my own personal P2 LCD platform, but it's great work anyway. This was probably going to happen at some point given your other needs.
Update: I also tried the 4x mode and it seemed to display fine on my Sony CRT at the 63kHz line rate.
@rogloh said:
Very nice. Gameplay seems to work well and video looks silky smooth. I'm secretly a tad disappointed it needs your own custom video driver now so I won't be able to use it on my own personal P2 LCD platform, but it's great work anyway. This was probably going to happen at some point given your other needs.
Adding LCD support shouldn't be too difficult, right? Can't do the dynamic pixel clock, so have to add a special case for that, but HDMI would need that, too. Speaking of, need to contrive a HDMI mode that actually pushes the clock to ~320MHz. Tricky, since vertical timing needs to be a multiple of NTSC 240p. Then again, I think blanking is supposed to auto-calibrate, right? So just add enormous HBlanks?
Update: I also tried the 4x mode and it seemed to display fine on my Sony CRT at the 63kHz line rate.
Cool. Though the higher multipliers are really intended for use with LCD monitors that do smooth scaling to increase image sharpness
Here's a version with NTSC and PAL60 support. The latter is only good for svideo, the dot crawl on composite is a bit much.
@rogloh If you want to look into wrenching LCD support or whatever in there, go ahead. I'm done with the video code for now.
I've already implemented the pillarbox thing. If the H32 NCO value is zero, it adds the pillars.
Started on the PSRAM read code. Reads... something (should print out the "SEGA MEGA DRIVE " text from a Megadrive ROM header (at $100)). I think it only actually gets one of the RAM slices into read mode??? But it seems to be consistent at that.
Tired for today, need to go to bed.
CON
'{
_CLKFREQ = 300_000_000
PSRAM_CLK = 56
PSRAM_SELECT = 57
PSRAM_BASE = 40
'}
{
_CLKFREQ = 10_000_000
PSRAM_CLK = 15
PSRAM_SELECT = 14
PSRAM_BASE = 24
'}
PSRAM_WAIT = 12
PSRAM_DELAY = 3
DAT
org
asmclk
setq2 #511
rdlong 0,##@lutstuff
fltl #PSRAM_CLK
wrpin ##P_TRANSITION|P_OE, #PSRAM_CLK
wxpin #1, #PSRAM_CLK
drvl #PSRAM_CLK
drvh #PSRAM_SELECT
waitx #200
debug("init ok")
wrlong #"A",##@romio_area
mov iter,#5
.lp
mov pa,##$100 >> 2
mov read_longs,#4
call #readburst
debug(lstr(#@romio_area,#16),uhex_byte_array(#@romio_area,#16))
djnz iter,#.lp
jmp #$
readburst
setxfrq nco_slow
wrfast bit31,##@romio_area
mov tmp1,#(8+PSRAM_WAIT)*2
shl read_longs,#1
setword read_cmd,read_longs,#0
shl read_longs,#1
add tmp1,read_longs
setbyte pa,#$EB,#3
splitb pa
rev pa
movbyts pa, #%%0123
mergeb pa
drvl #PSRAM_SELECT
drvl bus_pinfield
xinit addr_cmd,pa
wypin tmp1,#PSRAM_CLK
xcont #PSRAM_WAIT,#0
setq nco_fast
xcont #PSRAM_DELAY,#0
fltl bus_pinfield
setq nco_slow
xcont read_cmd,#0
waitxfi
_ret_ drvh #PSRAM_SELECT
bit31 long negx
bus_pinfield long PSRAM_BASE addpins 15
addr_cmd long (PSRAM_BASE<<17)|X_PINS_ON | X_IMM_8X4_LUT + 8
read_cmd long (PSRAM_BASE<<17)|X_WRITE_ON| X_16P_2DAC8_WFWORD
nco_fast long $8000_0000
nco_slow long $4000_0000
read_longs long 4
tmp1 res 1
iter res 1
DAT
org $200
lutstuff
long $0000
long $1111
long $2222
long $3333
long $4444
long $5555
long $6666
long $7777
long $8888
long $9999
long $AAAA
long $BCCC ' typo in this first listing: should be $BBBB
long $CCCC
long $DDDD
long $EEEE
long $FFFF
DAT
orgh
alignl
romio_area byte 0[256]
...
And then it immediately hits me that the LUT table has a typo. Owie ouch. Eitherhow, after setting WAIT to 10 and DELAY to 3, I beget this here joyful output. Good night.
Well done. That was fast. Your first reads of the PSRAM
Only 24 instructions in your read burst code routine, so the read overhead should remain quite low and can be dominated by a longer transfer itself, which is good.
It would be nice to time the actual transfer of the small data reads (e.g. as individual 16 bit words) and also a longer burst to help you determine what size fits best with your ROM instruction pre-fetching. @300MHz you might get something between 2-4 single read transfers done per microsecond once you add your execution time, so with pre-fetching hopefully your emulator can still perform well.
I can move a couple of instructions after the address/delay, too, so that's a few more cycles that can be saved. I wonder if there's a right moment to write to the clock pin a second time, then I could eliminate some more of those frontloaded instructions.
But let's get it working before trying stuff like that.
CON
'{
_CLKFREQ = 300_000_000
PSRAM_CLK = 56
PSRAM_SELECT = 57
PSRAM_BASE = 40
'}
{
_CLKFREQ = 10_000_000
PSRAM_CLK = 15
PSRAM_SELECT = 14
PSRAM_BASE = 24
'}
PSRAM_WAIT = 10
PSRAM_DELAY = PSRAM_WAIT*2+3
DAT
org
asmclk
setq2 #511
rdlong 0,##@lutstuff
fltl #PSRAM_CLK
wrpin ##P_TRANSITION|P_OE, #PSRAM_CLK
wxpin #1, #PSRAM_CLK
drvl #PSRAM_CLK
drvh #PSRAM_SELECT
setxfrq nco_slow
waitx #200
debug("init ok")
wrlong #"A",##@romio_area
mov iter,#5
.lp
mov pa,##$100 >> 2
mov read_longs,#4
call #readburst
debug(lstr(#@romio_area,#16),uhex_byte_array(#@romio_area,#16))
djnz iter,#.lp
jmp #$
readburst
mov tmp1,#(8+PSRAM_WAIT)*2
shl read_longs,#2
add tmp1,read_longs
setbyte pa,#$EB,#3
splitb pa
rev pa
movbyts pa, #%%0123
mergeb pa
drvl #PSRAM_SELECT
drvl bus_pinfield
xinit addr_cmd,pa
wypin tmp1,#PSRAM_CLK
setq nco_fast
xcont #PSRAM_DELAY,#0
wrfast bit31,read_dest
shr read_longs,#1
setword read_cmd,read_longs,#0
waitxmt
fltl bus_pinfield
setq nco_slow
xcont read_cmd,#0
waitxfi
_ret_ drvh #PSRAM_SELECT
bit31 long negx
bus_pinfield long PSRAM_BASE addpins 15
addr_cmd long (PSRAM_BASE<<17)|X_PINS_ON | X_IMM_8X4_LUT + 8
read_cmd long (PSRAM_BASE<<17)|X_WRITE_ON| X_16P_2DAC8_WFWORD
nco_fast long $8000_0000
nco_slow long $4000_0000
read_longs long 4
read_dest long @romio_area
tmp1 res 1
iter res 1
DAT
org $200
lutstuff
long $0000
long $1111
long $2222
long $3333
long $4444
long $5555
long $6666
long $7777
long $8888
long $9999
long $AAAA
long $BBBB
long $CCCC
long $DDDD
long $EEEE
long $FFFF
DAT
orgh
alignl
romio_area byte 0[256]
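For anyone following along without a P2 at hand, here is a rough host-side Python model of the two bits of arithmetic in readburst: the splitb/rev/movbyts/mergeb shuffle on pa and the transition count written to the clock pin. The bit layouts follow my reading of the PASM2 instruction descriptions, and the function names are made up for this sketch. The net effect of the shuffle appears to be a reversal of the nibble order, presumably so the $EB command nibbles leave the streamer first, but treat that interpretation as an assumption.

```python
MASK32 = 0xFFFFFFFF

def splitb(s):
    """Model of SPLITB: source bit 4*j+b lands in bit j of byte b."""
    d = 0
    for b in range(4):
        for j in range(8):
            d |= ((s >> (4 * j + b)) & 1) << (8 * b + j)
    return d

def mergeb(s):
    """Model of MERGEB: the inverse of SPLITB."""
    d = 0
    for b in range(4):
        for j in range(8):
            d |= ((s >> (8 * b + j)) & 1) << (4 * j + b)
    return d

def rev(s):
    """Model of REV: reverse all 32 bits."""
    return int(f"{s & MASK32:032b}"[::-1], 2)

def movbyts_0123(s):
    """Model of MOVBYTS D,#%%0123: reverse the byte order."""
    return int.from_bytes((s & MASK32).to_bytes(4, "little"), "big")

def prepare_addr_cmd(long_addr, command=0xEB):
    """Mirror the pa shuffle in readburst (hypothetical helper name)."""
    pa = (long_addr & 0x00FFFFFF) | (command << 24)  # setbyte pa,#$EB,#3
    pa = splitb(pa)
    pa = rev(pa)
    pa = movbyts_0123(pa)
    return mergeb(pa)

def clock_transitions(read_longs, wait_clocks):
    """Mirror tmp1: (8 + WAIT) * 2 transitions, plus 4 per long read."""
    return (8 + wait_clocks) * 2 + read_longs * 4

print(hex(prepare_addr_cmd(0x100 >> 2)))  # nibbles of $EB000040 reversed
print(clock_transitions(4, 10))
```

With the test program's values (header at $100, 4 longs, WAIT of 10) this gives $040000BE for the streamer and 52 clock pin transitions.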
Instructions:
Use the provided PSRAM loading script loadit.rb to preload a ROM into PSRAM.
Use build.sh to compile the emulator and then load megayume_lower.binary
Here's what I've tried so far. Keep in mind that Z80 is completely missing still, so that'll explain a lot of the stuff that flat-out doesn't work.
If you want to play along at home, please acquire the ROM images yourself, I'll get into hell's kitchen (no not the gordon ramsay one, the actual one) if I distribute all of these. Well, not really, but you get the point. The CRC32 of my ROMs is provided for comparison. Also, make sure to fiddle the region setting between overseas and domestic, some games only work in one region or have slight differences.
| Game Name | ROM CRC32 | What it does |
|---|---|---|
| Sonic the Hedgehog | F9394E97 | Works! As far as I tested, anyways. |
| Sonic the Hedgehog 2 | 1E9A2E0D | Hangs after level is loaded. |
| Sonic the Hedgehog 3 Complete | 2BD564B1 | Blackscreen after menu (need backup SRAM?) |
| Thunder Force IV | 8D606480 | Playable with minor graphics corruption. Crashes on game over. |
| Streets of Rage | BFF227C6 | Crashes on character selection or demo start. |
| Streets of Rage 2 | E01FA526 | White screen. |
| Xeno Crisis | AC5F4CCA | Black screen. |
| MUSHA Aleste | 8FDE18AB | Hang on SEGA logo. |
| Castlevania: Bloodlines | FB1EA6DF | Hang on Konami logo. |
| Out Run | FDD9A8D2 | Screen full of corrupted road graphics. |
| TITAN Overdrive | 03826F26 | Black screen. What did I expect? |
| Columns | 03163D7A | Still works. |
| Ishido | B1DE7D5E | Still works. |
| Ms. Pacman | AF041BE6 | Still works. |
| Flicky | 4291C8AB | Still mostly works, broken timer. |
| Panorama Cotton | 9E57D92E | WORKS?? Interesting, considering OutRun didn't. |
| Gynoug | 1B69241F | Black screen. |
| Bubsy: Encounters of the furred kind | 3E30D365 | Black screen. Thank god. |
| Vectorman | D38B3354 | Black screen. |
| Shining Force | E0594ABE | Black screen. |
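If you want to compare your own dumps against the CRC32 values listed above, a small host-side check along these lines should do. This is a sketch in Python rather than anything from the emulator package; the helper names rom_crc32 and rom_matches are invented for the example.

```python
import zlib

def rom_crc32(path, chunk_size=65536):
    """Return the CRC32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as f:
        # Feed the file through zlib.crc32 in chunks to keep memory use low.
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"

def rom_matches(path, expected_hex):
    """Compare a file's CRC32 against a hex string such as 'F9394E97'."""
    return rom_crc32(path) == expected_hex.strip().upper()
```

Calling rom_matches with a ROM path and the value from the CRC32 column returns True only when the dump is the same one that was tested here.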
Notable is that I don't think any of the games that do work have performance issues. Thunderforce IV usually drops a few frames when things get busy, but it seems that doesn't happen (until it crashes, anyways)
Also, in terms of games that work without immediately evident issues, Sonic 1 and Panorama Cotton sure are an odd pair. I only remembered the latter exists after OutRun failed and I wanted another one with a 3D road effect, so I didn't test very far at all.
Well done Wuerfel_21! Having that PSRAM working fast enough totally rules. Now you'll have to fix whatever emulator bugs are left and add the Z80.
In case of random crashes or startup hangs, you may also want to validate all of the ROMs are getting loaded into the PSRAM, maybe with some checksum or something. I found a problem yesterday and it might be script failure. A whole chunk of a downloaded picture into PSRAM which looked to be about 256kB (slice size) seemed to be missing from the image being output. The ruby script you have might be not detecting any failure in loadp2 to connect/download to the P2.
@rogloh said:
In case of random crashes or startup hangs, you may also want to validate all of the ROMs are getting loaded into the PSRAM, maybe with some checksum or something. I found a problem yesterday and it might be script failure. A whole chunk of a downloaded picture into PSRAM which looked to be about 256kB (slice size) seemed to be missing from the image being output. The ruby script you have might be not detecting any failure in loadp2 to connect/download to the P2.
Yeah, it isn't checking the return code. Should probably do that.
But that isn't the case, it has worked for me every time and conversely, the games that don't work still don't even after loading them again.
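For reference, checking the loader's exit status could look roughly like the sketch below. It is written in Python for illustration (loadit.rb itself is Ruby), run_loader is a made-up name, and it assumes the loader tool exits with a non-zero status when it fails to connect or download, which has not been verified here.

```python
import subprocess
import sys

def run_loader(command):
    """Run a loader command and raise if it exits with a non-zero status."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the failure instead of silently carrying on with a
        # partially loaded PSRAM.
        raise RuntimeError(
            f"loader failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; the real argument
    # list would be whatever the loading script already passes to loadp2.
    print(run_loader([sys.executable, "-c", "print('chunk loaded')"]).strip())
```

This only catches failures the tool reports itself; a checksum read back from PSRAM would still be needed to catch silently missing chunks.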
Also interesting:
Your LCD board has a single PSRAM, right? If you confer me a driver for that, I can try to add support and see how slow it is.
I've had it running Sonic in attract mode for like an hour and it didn't do anything weird, though the Edge gets really hot. So it's nice and stable I guess.
Altered Beast (CRC: 154D59BB) also works! But lots of transient graphics glitches, which I think should be blanked out, but the flag for that isn't implemented yet. Some of the other games also exhibit this, but not to such an extent. Sonic actually avoids it by fading to black for any screen transitions.
A Hedgehog, a witch and a muscly dude walk into a bar. I feel like there's a joke there somewhere.
There is actually a noticeable issue in Panorama Cotton: The shadows don't scale properly (always seem to draw at the smallest size). Kind of a bad thing in a game that is predicated on fake 3D depth cues like that.
Your LCD board has a single PSRAM, right? If you confer me a driver for that, I can try to add support and see how slow it is.
I've not got that done yet. Am still considering how/when to do it. Hopefully it's not a major rewrite to my existing code but it is quite a bit different to the 16 bit bus case. In general it is simpler as everything is byte addressable. The trickiest case is actually an 8 bit wide bus (2 PSRAM chips).
I've had it running Sonic in attract mode for like an hour and it didn't do anything weird, though the Edge gets really hot. So it's nice and stable I guess.
Yes I noticed this too. At 325MHz with my external frame buffer video from PSRAM the P2 Edge gets very warm to hot, but it remains stable too for long periods of testing. It mainly seems to be the P2 that is the source of the heat not so much the other devices on the board like the memory and regulators, though a thermal camera would be the best way to confirm that.
Figured out that long standing "Flicky timer issue" (I didn't even notice, but the issue also affected the round counter, which would always stay at one). Another case of the classic TEST vs. TESTB typo in the BCD add/sub instructions. I went through every remaining TEST in the code, I think that actually was the last of these bugs. Didn't seem to fix anything else though.
MUSHA actually hard-crashes - the keyboard's capslock stops responding
Completing a level in Thunderforce does not crash and works as you'd want it to.
I don't know why, but Sonic 2 didn't lock up one time I tried and let me complete Emerald Hill Act 1, albeit with Sonic and Tails' sprites all jumbled up. It then locked up at the start of Act 2 as usual. Strange.
Eitherhow, here's a new version with the fixed BCD instructions and the screen blank flag implemented.
Why is that?
Because it can't natively represent a full 32 bit word or be atomically byte addressable. At least the other two cases (4 or 1 chip) gain one of those advantages. The native size of the contents addressable is 16 bits instead, which sits right in the middle and you need special handling for the other sizes. In any case I figured out how to do read-modify-write to overcome these sorts of limitations.
Got the Thunderforce crash on log. Apparently this here MOVE.W is triggering an address error trying to access $FFFF_6E0D. As one can see, A0 should be $C00004 (VDP control port). And even if it wasn't, that should generate the address error in the preceding instructions. A true mystery.
Whenever you see something that you suspect could be an error in an instruction, you also need to be sure it is not a PSRAM read error. If the PSRAM input timing you are using is marginal at your clock rate/temperature, there is going to be some potential chance of intermittent read errors. If a crash is detectable it would be good to capture whatever opcodes were read and compare with the original ROM source at the same offset to be sure it is not PSRAM read errors. Also if you run my PSRAM timing utility over the P2 frequency you use you can ensure the PSRAM input timing you have setup is basically in the middle of the acceptable range of values not right at the edge where the signal integrity may not be there when the data is sampled. This may vary with temperature too so it would be good to run it once the system is quite hot to be sure your input timing is valid.
I'm also not sure you have the extra "half step" in input timing where you enable/disable the clocked/live setting on the data pins. This gives you finer input timing granularity and becomes important to help the sampling remain centered in the data window as much as possible.
No, I don't have any of the pins in clocked mode right now. But I think PSRAM data errors would take a different form. The errors I get happen the same way every time.
I think I narrowed the Thunderforce crash down to something getting corrupted by the DMA transfer that is triggered by the move at 19BC. It is not A0 though.
Yep, DMA was corrupting the instruction cache, classic. This happened because the DMA code wasn't resetting the ROM read length and the PSRAM code was leaving it doubled, so any transfer that got split into multiple chunks would actually read exponentially more from PSRAM with each chunk. The next thing after the general-purpose ROM read buffer is the ROM code cache, so that got bungled and everything crashed.
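The doubling can be reproduced on paper from the readburst test listing posted earlier, which shifts read_longs left by 2 and then right by 1. Below is a minimal Python model of just that arithmetic. The emulator's real DMA code is not shown in this thread, so chunked_read and its parameters are hypothetical stand-ins.

```python
def readburst(state):
    """Model only the read_longs arithmetic of the readburst listing."""
    requested = state["read_longs"]
    state["read_longs"] <<= 2  # shl read_longs,#2 (data phase transitions)
    state["read_longs"] >>= 1  # shr read_longs,#1 (16-bit word count)
    return requested           # longs transferred by this call

def chunked_read(chunks, chunk_longs, reset_each_chunk):
    """Issue several bursts, optionally resetting the length each time."""
    state = {"read_longs": chunk_longs}
    sizes = []
    for _ in range(chunks):
        if reset_each_chunk:
            state["read_longs"] = chunk_longs
        sizes.append(readburst(state))
    return sizes

print(chunked_read(4, 4, reset_each_chunk=False))  # length doubles per chunk
print(chunked_read(4, 4, reset_each_chunk=True))   # length stays constant
```

Without the reset, a four-chunk transfer of 4 longs each reads 4, 8, 16 and 32 longs, which is how the data ran past the read buffer and into whatever sits behind it.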
Here's some updated game testing (tested more, but left out uninteresting ones):
| Game Name | ROM CRC32 | What it does |
|---|---|---|
| Sonic the Hedgehog 2 | 1E9A2E0D | Works! At least as far as I got (Chemical Plant Act 2). Black screen on 2 player mode. |
| Sonic the Hedgehog 3 Complete | 2BD564B1 | Gets into the game, but softlocks at a certain point. |
Fixed the MUSHA crash. Was setting the Window split X to a position outside the screen, which causes some of the VDPR's calculations to underflow, leading to it trying to draw an infinitely wide Plane A, wrecking all memory in the process.
Game now works, but for some reason the icons are missing from the status display (and you can see Plane B through the holes).
Turns out MUSHA actually uses VRAM copy to move these little icons into place and I implemented that feature so totally wrong that it didn't even trigger at all (Copy trigger is more like a read command, whereas the other DMA ops use write commands). Also, it is the only DMA op that operates on bytes, not words.
Well, I guess that works now. Doesn't seem to benefit any other game, perhaps because VRAM copy is of very little use.
Before going down the rabbit hole of tracking down the graphical issues in TF4 and SOR2 (a very cool task, because as I found out, no PC emulator actually supports breaking on VRAM access. Regen claims to support it, but it doesn't actually work. Genius. At least Gens can log DMAs...) and whatever the heck is going on with Sonic 3, I thought to look into all the games in my test set that don't boot at all.
| Game Name | ROM CRC32 | Why it doesn't work |
|---|---|---|
| Shining Force | E0594ABE | Needs Z80 RAM. |
| Vectorman | D38B3354 | Address Error (jump to nonsense address) |
| Castlevania: Bloodlines | FB1EA6DF | Just needs region to be set to overseas. Not sure why it hangs on the logo instead of stating that. Silly Konami. |
| Xeno Crisis | AC5F4CCA | Not sure, but it keeps reading Z80 RAM, so something Z80-related, I guess? |
| Bubsy: Encounters of the furred kind | 3E30D365 | Similar thing, slightly odd code that checks some location in Z80 RAM runs continually. |
| Gynoug | 1B69241F | Needs Z80 RAM. |
| TITAN Overdrive | 03826F26 | Needs Z80 emulation (sets a RAM location to zero, enables Z80 for a bit and then checks if it is still zero). |
So indeed, my suspicion that a lot of them need Z80 (or at least its RAM) is sorta confirmed, but there's also one that seems to have an actual issue and one that's just stupid.
Check out this weird little loop that Shining Force gets stuck on (Gynoug has a similar thing).
"Yeah mate make sure the value actually got written". No idea why they do that (Z80 has to be halted to access its bus, so it's not some race condition thing).
As said, Castlevania just fails to tell you that the region is incorrect. If you set it to overseas (or use a japanese ROM, I guess), it actually works, including the somewhat involved water effect in Stage 2:
This is accomplished by using a raster interrupt to change three things mid-frame:
- enable shadow/highlight mode to darken everything
- change VSRAM (a mirrored copy of the level is kept in VRAM and updated as you destroy candles and stuff like that)
- change sprite table base. This doesn't update the VDP's internal cache of sprite Y positions, so every sprite is duplicated and the unwanted version is moved off-screen horizontally in the sprite table for each part of the screen.
Somewhat unrelatedly, the reason Sonic 2 blackscreens on 2 player mode (and when you wait too long on the title screen, being that the first demo in the rotation is of 2 player mode) is simply that it waits on the odd-field flag to be set. So that really just needs a proper interlace mode implementation.
So, yeah, will probably check out that Vectorman crash, because that seems easier to debug
Also, should probably find some more games that don't work.
Vectorman problem identified and fixed. Yet another typo kinda thing wherein I wrote zerox pb,#24 instead of zerox pb,#23.
Game seems to work without issues now, nothing else seems affected.
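The off-by-one is easier to see with the instruction modelled on the host. As I understand PASM2, ZEROX D,#N clears everything above bit N, so #23 keeps 24 bits and #24 keeps 25. The Python sketch below assumes pb was holding a 68000 address at that point, which would make 24 bits the intended width; the post above does not spell that out, so take it as a guess.

```python
def zerox(value, bit):
    """Model of ZEROX D,#bit: keep bits bit..0, clear everything above."""
    return value & ((1 << (bit + 1)) - 1) & 0xFFFFFFFF

addr = 0x01FF0000               # example value with bit 24 set
print(hex(zerox(addr, 23)))     # bit 24 is cleared, a 24 bit result
print(hex(zerox(addr, 24)))     # bit 24 survives, a 25 bit result
```

The two versions agree for any value that already fits in 24 bits, which would explain why only one game tripped over it.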
Comments
Got NTSC working with the new video code... PAL60 is next up, I guess.
Also got the pillarboxing for LCD/HDMI implemented already.
Also, I just fixed the last blackscreening ROMs (by implementing 8 bit VDP port access), so it really is PSRAM time!
EDIT: Intermediate releae. Moreso for my own purpose of diffing later on.
Wow, you seem to be making really fast progress now. I'll have a look at your video driver sometime when I can.
Behold, it works!
Additional observations: