So indeed, my suspicion that a lot of them need Z80 (or at least its RAM) is sorta confirmed, but there's also one that seems to have an actual issue and one that's just stupid.
Check out this weird little loop that Shining Force gets stuck on (Gynoug has a similar thing).
"Yeah mate make sure the value actually got written". No idea why they do that (Z80 has to be halted to access its bus, so it's not some race condition thing).
@rogloh said:
So the question is are you gonna make a Z80 emulator too?
Presumably, yes. As said, it needs to at least pretend to actually run at 3.58 MHz, so that'll be fun to figure out.
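One way to make the Z80 "pretend" to run at 3.58 MHz is to hand it a cycle budget derived from how far the 68000 has progressed. A minimal sketch, assuming the usual NTSC dividers (master clock /7 for the 68000, /15 for the Z80); the class and method names are made up for illustration and are not MegaYume code:

```python
# Assumed NTSC clock dividers: 68000 = master/7, Z80 = master/15.
M68K_DIV = 7
Z80_DIV = 15


class Z80Budget:
    """Tracks how many Z80 cycles are owed for the 68000 cycles executed."""

    def __init__(self):
        self.master_ticks = 0  # leftover master-clock ticks not yet spent

    def advance(self, m68k_cycles):
        """Return the whole Z80 cycles to run after `m68k_cycles` 68000 cycles."""
        self.master_ticks += m68k_cycles * M68K_DIV
        z80_cycles, self.master_ticks = divmod(self.master_ticks, Z80_DIV)
        return z80_cycles
```

Keeping the remainder in master-clock ticks avoids drift: over any 15 cycles of 68000 time the Z80 is granted exactly 7.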
I know you need to wait for a bus request to actually go through, but this is checking whether the actual RAM access has actually written the actual value...
RAM paranoia?
Reading Tony's link I note a comment about the Z80 having the ability to also access the 68k's RAM. Maybe that type of access is unreliable and needs the verification. Doh! I somehow assumed that was Z80 code without reading it. 0_o
Maybe the 68k can hammer data through without halting the Z80. Has the Z80 actually been halted there?
EDIT: Another possibility is the halt request has no check coded and is instead relying on the write validation to confirm the Z80 has halted and the write is also done. This has the bonus of allowing some free processing time in the interim.
@evanh said:
Reading Tony's link I note a comment about the Z80 having the ability to also access the 68k's RAM. Maybe that type of access is unreliable and needs the verification.
Actually, it can't. It can only access the cartridge ROM (I think the RAM can be addressed, but it doesn't actually work?). There are plenty of bugs related to the bus arbiter, but 68k access to the Z80 RAM failing isn't one of them. Perhaps some programmers just generally distrusted the entire thing, thus "RAM paranoia".
@evanh said:
Maybe the 68k can hammer data through without halting the Z80. Has the Z80 actually been halted there?
Seems like it is halted. There's some general confusion over the polarity of the BUSREQ register. As far as I can tell, the correct info is that you write a 1 to request/0 to release, and you read a 1 when the bus is busy and a 0 when the bus is clear (so read/write sorta have opposite polarity).
@Coley said:
Ada, it may be worth contacting Baggers, he has a mostly complete Z80 emulator and I think Cluso has one too.
I know. I will look into what's there when I get to it. I want to fix the apparent issues first before making the code larger. Though I'd have to make quite some modifications to the Z80 core to wrench in timing, bus logic, etc, so writing my own also seems appealing.
@Wuerfel_21 said:
There's some general confusion over the polarity of the BUSREQ register. As far as I can tell, the correct info is that you write a 1 to request/0 to release, and you read a 1 when the bus is busy and a 0 when the bus is clear (so read/write sorta have opposite polarity).
Write a 1 for a halt request then read a 1 when it's halted. Quote below from Tony's link:
So once you write $100 to Z80BusReq, you need to keep reading back from it until it also returns $100 (at which point the Z80 has let go of the bus).
That snippet you've posted wouldn't have to wait around for the halted signal. It can rely on the data write verify to know the Z80 has halted ... while also confirming the data is written. I'm now thinking this is the likely scenario.
But it is not like that.
Mind, this is incorrect in SEGA's official documentation, too (as are many things).
The proof is in the pudding though. Here's the implementation of that register from GenPlusGX (note that it also emulates the unused bits being filled with whatever data lingers on the bus):
@evanh said:
As for the C code, the comments match the action but I'd need to read up on zstate to know what is really happening there.
zstate is just the BUSREQ and RESET bits. So if it is 3, bus is requested and Z80 is not held in reset, which is the condition for the bus to be clear.
Also, I know I'm right because stuff stops working if I change megayume's behavior to match the documentation. Try it yourself: comment out the BITNOT on line 2622 and notice how nothing works anymore.
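To make the read polarity concrete, here is a small sketch of the behaviour described above. It is not the GenPlusGX code; the function name and the `open_bus` parameter are hypothetical, and only the `zstate == 3` condition and the bit 8 ($100) position come from the discussion:

```python
BUSREQ_BIT = 0x0100  # bit 8 of the 16-bit BUSREQ register


def busreq_read(zstate, open_bus):
    """Value the 68000 sees when reading BUSREQ.

    zstate   -- combined BUSREQ/RESET bits; 3 means "bus requested and
                Z80 not held in reset", i.e. the bus is clear.
    open_bus -- whatever 16-bit value lingers on the bus (fills unused bits).
    """
    busy = 0 if zstate == 3 else BUSREQ_BIT  # 0 = clear, 1 = busy
    return (open_bus & ~BUSREQ_BIT & 0xFFFF) | busy
```

So after writing $100 to request the bus, a game polls until bit 8 reads back as 0, not 1.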
Either way, I'm guessing that particular game doesn't bother checking the bus state. It is blindly requesting then relying on the data write validation instead.
@evanh said:
Either way, I'm guessing that particular game doesn't bother checking the bus state. It is blindly requesting then relying on the data write validation instead.
Could well be. Maybe there also was a bug like that in early prototype hardware and the code just never got changed as the hardware got fixed and perhaps even carried forward into new games.
It's not a problem, anyways, as it should just work when the Z80 stuff is implemented. I could implement the RAM access right now, actually, but I think that'd break games that use it as a mailbox and wait until the Z80 clears a location. This works right now because it always reads zero.
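The trade-off can be modelled in a few lines. This is only a sketch: the two RAM classes and both helper functions are hypothetical, and real games loop forever where this code gives up after `max_tries`:

```python
class StubZ80Ram:
    """Unimplemented Z80 RAM: writes are dropped, reads always return 0."""

    def write(self, addr, value):
        pass

    def read(self, addr):
        return 0


class PlainZ80Ram:
    """Working RAM, but with no Z80 running to ever clear a mailbox."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells.get(addr, 0)


def write_verified(ram, addr, value, max_tries=1000):
    """Write-verify loop: rewrite until the read-back matches, else None."""
    for tries in range(1, max_tries + 1):
        ram.write(addr, value)
        if ram.read(addr) == value:
            return tries
    return None


def wait_mailbox_clear(ram, addr, max_tries=1000):
    """Mailbox wait: poll until the location reads zero, else None."""
    for tries in range(1, max_tries + 1):
        if ram.read(addr) == 0:
            return tries
    return None
```

With the stub, the write-verify loop never finishes but mailbox waits return immediately; with plain RAM and no Z80, it is the other way around once a non-zero command has been posted.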
What really is a problem is the strange issues that go on in Sonic 3.
The symptoms appear to be different between regular Sonic 3 and Sonic 3 Complete: while the issue starts happening at the exact same point, regular S3 just corrupts slightly instead of immediately locking up. What happens then depends on what you do. You can feel your way to the special stage entrance, and upon leaving it the corruption mostly undoes itself (but it comes back if you go back to the start of the level and go forward again from there). Either way, it seems to always end up on a $Fxxx instruction somewhere, triggering the Line F emulator trap, which presumably hangs the game. Also, some objects just don't spawn, which seems to indicate that the issue manifests in the object spawning code: selecting S&K from S3C's menu to start in Mushroom Hill leads to a slightly hilarious sight wherein all the large mushrooms are missing their swaying caps (see below). Relatedly, regular S&K doesn't work. I think that needs emulation of its custom mapping hardware, which seems unnecessary given S3C is a thing.
Further debugging required.
The stems still animate!
This slightly cursed image is what it locked up on, btw
@TonyB_ said:
Wuerfel_21, could you remind us how many cogs are needed and what each one does?
Currently, the cogs are used as such:
Spin stuff (currently just maps inputs, but I think I want to move that into the USB driver itself so this cog can do stuff without messing up the input latency. In particular, this cog will need to concern itself with writing backup SRAM to SD card)
USB driver
VDP renderer (renders palette index and priority for all 3 layers into a buffer)
VDP compositor (composites the layers together, does CRAM lookup and applies shadow/highlight effect)
Video output
68000 emulation (also does DMA and other I/O related things)
That leaves two, which will be used for the Z80 and YM2612/PSG, the latter I already have an emulation core for (search "OPN2cog").
I think "Sega Genesis/Mega Drive Emulation" would be a more informative title than "Console Emulation".
Perhaps. But this is a split thread, so I can't rename it. @VonSzarvas I guess
I think the following Z80 emulator is possible:
Streamer available, e.g. for video (interrupt-driven)
~256 longs in LUT RAM available for non-Z80 code
~320 longs saved by using SKIPF/EXECF
Thus it might be possible to combine the Z80 cog with another, so that only seven are needed.
yeah, thought that it wouldn't be terribly demanding. The streamer wouldn't be available for background tasks due to the PSRAM access, so it can't be combined with the video cog.
On Sonic 3 issues, here's the current status:
I traced one possible crash (there are multiple ways it can crash) back to this function here. A0 is a pointer into the object table and this function intends to set up a plant object, I guess. The final MOVE installs a code pointer into the object struct. For some reason, this writes the nonsense value $9BEA_3228, which upon jumping to crashes the thing.
Will have to trace it back further. If luck is bad, it's just that $2C(a0) (which seems to be the object's subtype) became corrupted by something else before and I'll have to go even deeper.
EDIT: Yes it is that. Pain is the most profound of human emotions.
Figured out the issue that was causing the corrupted stripe of graphics in the intro to Thunderforce IV (and also the freakout the background has in the setup menu. The fadeout of the mission select screen is still bugged.). The issue is that the VDP was designed by actual insane people and VRAM fill doesn't remotely do what you'd expect.
What you'd expect it to do: Repeat the VRAM word write N times with the same data.
How it actually works: Do one normal VRAM word write, then take the upper byte of the data and start filling N bytes with that, preincrementing the address with the stride register each time.
How the documentation tries to explain it:
M E N T A L.
There's too much debugging gunk in the code right now, so if you really want to see corruption free TF4 (and really only that, I don't think anything else is fixed by this) right now, replace the impl of .do_fill with this:
.do_fill
              getbyte mk_memtmp1,mk_memvalue,#1
              rdword mk_memtmp2,#vdp_dmalen_w wz
        if_z  setword mk_memtmp2,#1,#1
              wrword #0,#vdp_dmalen_w
              debug("VRAM fill of ",uhex_word_(mk_memtmp2)," bytes with ",uhex_word_(mk_memtmp1)," to ",uhex_word_array_(#@vdp_io_address_w,#1)," and increment ",uhex_byte_array_(#@vdp_autoinc_b,#1))
              ' VRAM fill is very very strange
              mov ptrb,mk_memtmp0
              call #.wr_dataport_internal
              rdbyte mk_memtmp0,#vdp_autoinc_b
.fill_loop
              add ptrb,mk_memtmp0
              setword ptrb,#vdp_ram>>16,#1
              wrbyte mk_memtmp1,ptrb
              djnz mk_memtmp2,#.fill_loop
              ret wcz
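For anyone who doesn't read PASM, the same fill behaviour can be sketched in Python. This is a simplified model and not a translation of MegaYume: the function name is made up, the initial word write is assumed to be word-aligned and big-endian, and any byte-lane swapping done by real hardware is ignored:

```python
def vram_fill(vram, addr, data, length, autoinc):
    """Model of VDP "VRAM fill" on a 64 KiB bytearray; returns the final address."""
    if length == 0:
        length = 0x10000  # as in the PASM above: a zero length counts as $10000

    # Step 1: one ordinary word write of the full 16-bit data.
    base = addr & 0xFFFE
    vram[base] = data >> 8
    vram[base + 1] = data & 0xFF

    # Step 2: fill `length` bytes with the UPPER byte only, stepping the
    # address by the auto-increment register before each byte write.
    upper = data >> 8
    for _ in range(length):
        addr = (addr + autoinc) & 0xFFFF
        vram[addr] = upper
    return addr
```

For example, filling 3 bytes from $10 with data $ABCD and an increment of 2 leaves AB CD AB 00 AB 00 AB in VRAM: the low byte $CD only ever appears once.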
Also tracked the issue with Streets of Rage 2 down: That one needs the VDP to actually honor the "DMA enable" flag, for it does what looks like a DMA trigger when it really just wants to write to VRAM? IDK why, but since the DMA length counter is zero after a transfer, that trashes the entire VRAM! Twice!
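The failure mode is easy to see in a sketch. The names here are hypothetical, and the position of the DMA enable flag (bit 4 of mode register 2) is stated as an assumption rather than taken from this thread:

```python
DMA_ENABLE = 0x10  # assumed: bit 4 of VDP mode register 2 (register #1)


def dma_transfer_length(mode_reg2, dma_len):
    """Units moved by something that looks like a DMA trigger.

    Returns 0 when DMA is disabled, so the access degrades to a plain
    VRAM write setup. A length register of zero means a full $10000 run.
    """
    if not (mode_reg2 & DMA_ENABLE):
        return 0
    return dma_len if dma_len != 0 else 0x10000
```

Ignoring the enable flag turns the second case into the first: a stale zero length becomes a $10000-unit transfer over all of VRAM.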
Code changes for this are kinda scattered across three places and I still have a lot of debugging nonsense in there, so no ZIP until I find the Sonic 3 bug or give up on that.
EDIT: SOR2 also helped me figure out that the operands to CMPM were reversed when I noticed that I'd always place first on the highscore table. oops. No, fixing this also doesn't seem to fix anything else (well, I haven't scrutinized all the highscore tables, which is about the only place I can think of to use CMPM). Then again, I think the only unknown issues in my test set are Sonic 3 crash, the shadows in Panorama Cotton and the stage select fadeout in TF4, so that's slim pickings. Should try some more games. Well, I can tell you that Crusader of Centy (also known as Ragnacënty or Soleil. Localization department worked overtime on that one.) seems to at least start, but IDK if it runs into issues due to lack of backup SRAM. I can also tell you that Another World / Out of this World doesn't work because it waits on the HBlank flag? Okay. I guess I could simply randomize that flag on each read....
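On the CMPM mix-up: the 68000 compare instructions compute destination minus source, so for `CMPM (Ay)+,(Ax)+` the value at (Ax) is the minuend. A small sketch of the flag calculation (the helper name is made up) shows why swapping the operands flips the outcome of a highscore comparison:

```python
def cmp_flags(src, dst, bits=16):
    """Condition codes for a 68000-style compare, i.e. dst - src."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    src &= mask
    dst &= mask
    res = (dst - src) & mask
    return {
        "N": bool(res & sign),                      # result negative
        "Z": res == 0,                              # operands equal
        "V": bool((dst ^ src) & (dst ^ res) & sign),  # signed overflow
        "C": src > dst,                             # unsigned borrow
    }
```

With src = $0100 and dst = $0200 the carry is clear; with the operands reversed it is set, so a "higher than" branch takes the opposite path.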
TF4 fade issue resolved, too: reading CRAM back wasn't working, lol (well, was giving VSRAM and vice-versa).
Notice how all of these last couple bugs were related to the VDP memory interface. As I said, designed by insane people. The memory interface, that is. The actual rendering features are nice for what they are. Though having a write-through sprite attribute cache instead of dedicated OAM is weird, but I guess that's the TMS9918 heritage of having sprite data in VRAM.
I return once again from my exile to bring good news! That pesky "Sonic 3 issue" has been vanquished! No, not by trudging through the game's code and trying to figure out where it goes wrong like a good girl, but by becoming lazy and trying random ROMs to procrastinate. In this case, the excellent 240p test suite, a video test pattern generator. I noticed that the video mode setting was perma-stuck on interlaced and couldn't be changed (even though it actually was in progressive mode. Non-doubleres interlaced mode is actually bugged, but that is a separate issue...). After a brief attempt at compiling it from source to acquire debugging symbols, I gave up and debugged it as it was. The issue turns out to be that ADDQ/SUBQ of an address register affected the condition flags (address register ops NEVER affect the condition flags), and this was brought to light by GCC's amazing ability to re-order instructions in very strange ways, such as in this function that is meant to check if the interlace mode is currently enabled:
What a lame issue. I'm really quite stupid to miss that one, aren't I?
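For reference, the rule in question can be sketched like this. It is a toy model of a long-sized ADDQ with hypothetical names, not the emulator's code; the CCR is laid out as X N Z V C in bits 4 to 0:

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def addq_long(regs, ccr, reg, imm):
    """ADDQ.L #imm,reg with imm in 1..8; returns the new CCR value."""
    old = regs[reg]
    res = (old + imm) & MASK32
    regs[reg] = res

    if reg.startswith("a"):
        return ccr  # address register destination: flags are left alone

    n = int(bool(res & SIGN32))
    z = int(res == 0)
    v = int(not (old & SIGN32) and bool(res & SIGN32))  # positive wrapped negative
    c = int(old + imm > MASK32)
    return (c << 4) | (n << 3) | (z << 2) | (v << 1) | c  # X mirrors C
```

A compiler is free to slot an `ADDQ #x,An` between a test and the branch that consumes it, which only works if the emulator keeps the flags intact there.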
Well, it works now, so I guess that leaves the only mystery issue to be the Panorama Cotton shadow size issue, but I have no idea where to even start on that one and it works fine other than that, so I'll just do proper interlace mode next, then Z80/Audio and hope that something else hits on the same bug in a more debuggable manner. I have some suspicions about DIVU/DIVS, might also just double check those.
Changes in MegaYume alpha 016:
Fix CMPM operands being swapped
Fix ADDQ/SUBQ #x, An writing flags when it shouldn't
Don't do DMA transfers if DMA isn't enabled in mode register
Fix VRAM fill
Fix interlace enable / doubleres mode register bit confusion
@Wuerfel_21 Good work. It's got to be tricky to debug instruction problems this way but you've managed well. It'll be great to have the Z80 stuff working soon too, hopefully it won't give you as much grief.
Added proper interlace mode support. Real interlace on TV output, screen bobbing on VGA (well, if you double a half-line, you just get a whole line, so in that way it makes sense).
Added hack to get Sonic 2 multiplayer and Out of this World to run (random Hblank flag on status register read).
Edited NTSC/PAL60 timings to better center the picture (at least on my CRT TV, for whatever that's worth)
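The HBlank hack from the list above amounts to something like the following sketch. The function name is made up and the flag position (bit 2 of the status word) is an assumption; the point is only that a game spinning on the flag will see it in either state before long:

```python
import random

HBLANK = 0x04  # assumed position of the HBlank flag in the VDP status word


def read_status(true_status, rng=random):
    """Return the status word with the HBlank bit replaced by a coin flip."""
    flag = HBLANK if rng.getrandbits(1) else 0
    return (true_status & ~HBLANK & 0xFFFF) | flag
```

All other status bits pass through untouched, so code that checks VBlank or the FIFO flags is unaffected.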
@rogloh can you test that 4X mode again to make sure it interlaces correctly? (As said, only game I know of to actually use interlace mode is Sonic 2's multiplayer mode) It should work though.
@Wuerfel_21 said:
@rogloh can you test that 4X mode again to make sure it interlaces correctly? (As said, only game I know of to actually use interlace mode is Sonic 2's multiplayer mode) It should work though.
I ran it with both the 2X and 4X modes on the CRT and it appeared to work okay. My monitor reported 63kHz and 60Hz in the 4X mode for the timings.
Well, if it works that's nice. In VGA mode, the switch between progressive and interlace should be seamless (since to the monitor, nothing changes - the image just shakes up and down).
@evanh said:
Analogue has that softness. The natural lack of quantisation. Another World.
Yeah.
Here's my favorite image to demonstrate this dissonance between digital pixels and real CRT display:
Comments
This might help:
https://plutiedev.com/using-the-z80
Well, the documentation matches the link.
Wuerfel_21, thanks for the cog reminder.
I think the following Z80 emulator is possible:
Okay, here's another one:
Since I had the CRT running, here are some nice screen photos (using PAL60 SVideo mode):