Console Emulation - Page 47 — Parallax Forums

Console Emulation

Comments

  • I have a lot of better things to do, so I'm looking into fixing MegaYume's sprite cache emulation to be properly accurate (should fix sprite glitches and broken effects in some games and just be the right thing to do). Minor problem: I wrote the sprite logic a long time ago and it's totally cryptic gobbledygook with multiple layers of indirection (somewhat necessary to get the correct masking behaviours).

    So currently sprite cache is kept internally in renderer's cogRAM and recomputed on every blank line. This caches Y position, size and link field (cursed entity that is...) of the sprites. But to match real hardware, this should instead be recomputed when the sprite table is written to (so that job'll have to be moved to the CPU cog, just like CRAM format conversion is). But currently sprite cache is not formatted in a way that would be fast to update one sprite at a time. There's also dependency on current video mode (progressive vs. interlaced). Owie owie.

  • Well, it was a success, but the internet died and now I can't upload it (without major hassle, anyways). Take this bad picture, wherein the rather clever snow effect in Stone Protectors now correctly displays, as proof.

  • Well, everything's okay again, so I pushed new MegaYume V1.3 beta code to github.

    I fixed the aforementioned sprite cache behaviour, which should fix glitches or missing sprites in a few games, including:

    • Stone Protectors (missing snow effect)
    • Gauntlet IV (missing sprites when there's a lot of enemies)
    • King of the Monsters (everything was broken)
    • Samurai Shodown (occasional glitches)

    I also fixed an edge case (or rather, an edge case of an edge case) relating to sprite masking and the sprite tile limit, so now the test screen ROM shows "6. MASK S1 ON DOT OVERFLOW" as "PASS".

    I also optimized the whole rendering code some more. In particular, I found a faster way to handle 4bpp -> 8bpp conversion. The comments make me remember @rogloh helped find out about the current method, so I guess I'll tag him (oops I already did).

    Old code:

                  rdlong vdpr_tiledata,vdpr_tmp4
    
                  ' rogloh's optimized variant...
    
                  ' Prepare attributes
                  mov vdpr_tmp1, vdpr_tile
                  shr vdpr_tmp1, #13 ' Just pal+priority
    
                  testb vdpr_tile, #11 wc ' mirror bit
        if_nc     splitb vdpr_tiledata ' reverse nibbles in tile
        if_nc     rev vdpr_tiledata
        if_nc     movbyts vdpr_tiledata, #%%0123
        if_nc     mergeb vdpr_tiledata
    
                  ' bit magic (space out nybbles)
                  getword vdpr_tilebuffer1, vdpr_tiledata, #1    ' first group
                  movbyts vdpr_tilebuffer1, #%%3120
                  mergew vdpr_tilebuffer1
                  movbyts vdpr_tilebuffer1, #%%3120
                  splitw vdpr_tilebuffer1
                  getword vdpr_tilebuffer2, vdpr_tiledata, #0    ' second group
                  movbyts vdpr_tilebuffer2, #%%3120
                  mergew vdpr_tilebuffer2
                  movbyts vdpr_tilebuffer2, #%%3120
                  splitw vdpr_tilebuffer2
    
                  ' add attributes to  non-transparent pixels
                  test vdpr_tilebuffer1,vdpr_pixnibtest+0 wz
            if_nz setnib vdpr_tilebuffer1,vdpr_tmp1,#1
                  test vdpr_tilebuffer1,vdpr_pixnibtest+1 wz
            if_nz setnib vdpr_tilebuffer1,vdpr_tmp1,#3
                  test vdpr_tilebuffer1,vdpr_pixnibtest+2 wz
            if_nz setnib vdpr_tilebuffer1,vdpr_tmp1,#5
                  test vdpr_tilebuffer1,vdpr_pixnibtest+3 wz
            if_nz setnib vdpr_tilebuffer1,vdpr_tmp1,#7
                  test vdpr_tilebuffer2,vdpr_pixnibtest+0 wz
            if_nz setnib vdpr_tilebuffer2,vdpr_tmp1,#1
                  test vdpr_tilebuffer2,vdpr_pixnibtest+1 wz
            if_nz setnib vdpr_tilebuffer2,vdpr_tmp1,#3
                  test vdpr_tilebuffer2,vdpr_pixnibtest+2 wz
            if_nz setnib vdpr_tilebuffer2,vdpr_tmp1,#5
                  test vdpr_tilebuffer2,vdpr_pixnibtest+3 wz
            if_nz setnib vdpr_tilebuffer2,vdpr_tmp1,#7
    
                  [...]
    
    vdpr_pixnibtest       long 15<<0,15<<8,15<<16,15<<24
    
    

    New code:

                  rdlong vdpr_tilebuffer1,vdpr_tmp4
    
                  ' rogloh's optimized variant...
    
                  ' Prepare attributes
                  getword vdpr_tmp1, vdpr_tile,#0 '' Note: this changed to getword because vdpr_tile now has garbage in top half
                  shr vdpr_tmp1, #13 ' Just pal+priority
    
                  testb vdpr_tile, #11 wc ' mirror bit
                  splitb vdpr_tilebuffer1 ' reverse nibbles in tile
        if_nc     rev vdpr_tilebuffer1
        if_nc     movbyts vdpr_tilebuffer1, #%%0123
                  not vdpr_tmp2,vdpr_tilebuffer1
                  mergeb vdpr_tilebuffer1
    
                  ' generate transparency mask
                  getword vdpr_tmp3,vdpr_tmp2,#1
                  and vdpr_tmp2,vdpr_tmp3
                  getbyte vdpr_tmp3,vdpr_tmp2,#1
                  and vdpr_tmp2,vdpr_tmp3
    
                  ' bit magic (space out nybbles)
                  movbyts vdpr_tilebuffer1, #%%3120
                  mergew vdpr_tilebuffer1
                  movbyts vdpr_tilebuffer1, #%%3120
                  splitw vdpr_tilebuffer1
                  mov vdpr_tilebuffer2,vdpr_tilebuffer1
                  shr vdpr_tilebuffer1,#4
                  and vdpr_tilebuffer1,vdpr_lowernibbles
                  and vdpr_tilebuffer2,vdpr_lowernibbles
    
                  ' add attributes to  non-transparent pixels
                  skip vdpr_tmp2
                  setnib vdpr_tilebuffer2,vdpr_tmp1,#1
                  setnib vdpr_tilebuffer2,vdpr_tmp1,#3
                  setnib vdpr_tilebuffer2,vdpr_tmp1,#5
                  setnib vdpr_tilebuffer2,vdpr_tmp1,#7
                  setnib vdpr_tilebuffer1,vdpr_tmp1,#1
                  setnib vdpr_tilebuffer1,vdpr_tmp1,#3
                  setnib vdpr_tilebuffer1,vdpr_tmp1,#5
                  setnib vdpr_tilebuffer1,vdpr_tmp1,#7
    
                  [...]
    
    vdpr_lowernibbles     long $0F0F0F0F
    

    It's 34 vs 30 instructions, so not a huge saving, but the methodology is quite different. (The more efficient spacing-out also happens, in a slightly different version, with tile planes.) I feel like there might be an even more efficient way.
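
    For anyone following along without a P2 at hand, here is a minimal Python reference model of what the new sequence computes. It is a sketch under assumptions, not a translation of the PASM: the function names (`splitb`, `skip_mask`, `expand_row`) are made up for illustration, the SPLITB bit layout is assumed from the P2 instruction documentation, and the mirror handling is left out.

```python
MASK32 = 0xFFFFFFFF

def splitb(d):
    """Model of SPLITB (assumed layout): result bit 8*j + k takes source
    bit 4*k + j, so byte j collects bit j of each of the eight nibbles."""
    out = 0
    for j in range(4):
        for k in range(8):
            out |= ((d >> (4 * k + j)) & 1) << (8 * j + k)
    return out

def skip_mask(row):
    """Bit k of the result is set when pixel (nibble) k is zero, i.e. transparent."""
    m = ~splitb(row) & MASK32   # not
    m &= m >> 16                # getword #1, and  (also clears the top word)
    m &= (m >> 8) & 0xFF        # getbyte #1, and  (leaves only byte 0)
    return m

def expand_row(row, attr):
    """Expand eight 4bpp pixels into two longs of four 8bpp pixels each.
    Returns (buffer1, buffer2): buffer2 holds pixels 0..3, buffer1 pixels 4..7.
    The attribute nibble is only written to non-transparent pixels."""
    skip = skip_mask(row)
    bufs = [0, 0]  # [buffer2, buffer1]
    for k in range(8):
        pix = (row >> (4 * k)) & 0xF
        if not (skip >> k) & 1:          # a set skip bit suppresses the setnib
            pix |= (attr & 0xF) << 4
        bufs[k // 4] |= pix << (8 * (k % 4))
    return bufs[1], bufs[0]
```

    The point of the model is the mask: ANDing the four inverted bit-plane bytes together gives one bit per pixel that is set only when all four bits of that pixel are zero, which is exactly the pattern SKIP needs to drop the matching setnib.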

  • Yeah, SKIP comes in very handy for a sequence of binary choices, especially when executing single instructions, although you can still use it for groups of 2, 4, 8 or 16 instructions as well if you do some initial bit replicating when creating the skip mask.
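
    The bit replication mentioned above can be sketched in Python as follows. This is only an illustration: `replicate_bits` is a hypothetical helper, and on the P2 itself the same mask would be built with a handful of instructions rather than a loop. It assumes the usual SKIP convention that a 1 bit skips the corresponding instruction.

```python
def replicate_bits(choices, count, group):
    """Build a SKIP-style mask in which each of `count` choice bits
    covers `group` consecutive instructions (1 = skip that group)."""
    if count * group > 32:
        raise ValueError("a skip mask only covers 32 instructions")
    ones = (1 << group) - 1
    mask = 0
    for i in range(count):
        if (choices >> i) & 1:
            mask |= ones << (i * group)
    return mask
```

    For example, `replicate_bits(0b101, 3, 2)` gives `0b110011`, which skips the first and third pair of instructions and runs the second.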

  • Rayman Posts: 14,161

    Nice to get gauntlet working!

  • @Rayman said:
    Nice to get gauntlet working!

    Well, I thought it was working already. But then a friend came over and we actually played a few levels... Owie. I still am not 100% sure it works now. Weird if it wouldn't though. Notice the text in the status area has different colors for each player? They're actually the same color being changed by a raster interrupt. The bug I encountered is that at some threshold of active sprites, some sections of the screen will have no sprites drawn in them. This matches up with the player sections. I think the same raster IRQ is used to rewrite the sprite table mid-frame in these cases (but for some reason, this is only done when needed instead of always). With the old code, the Y positions would be stuck at the previous values.

  • Rayman Posts: 14,161

    Need to try gauntlet again…
    Guessing multiplayer can work now?

  • @Rayman said:
    Need to try gauntlet again…
    Guessing multiplayer can work now?

    The multiplayer has worked ever since I switched it to usbnew. Just have to hook up multiple game pads to a hub. I also added 4-player adapter support on the emulation side shortly after (which can be sort of wonky and doesn't quite work with every game that it ought to work with, but it works with Gauntlet).

  • Ever tried the HDMI video modes in recent times and wondered why they don't work? That's because I broke them when adding LCD support. Fixed now.

  • evanh Posts: 15,423
    edited 2023-11-25 12:57

    Ada,
    In MegaYume, after loading a ROM from the SD, does an unchanging solid magenta display have a meaning?

    I'm struggling to work out what I've done wrong. I've downloaded, configured and compiled the latest emulator from https://github.com/IRQsome/MegaYume
    The load menu comes up fine on the HDMI TV, the USB keyboard control allows me to select the ROM to load, it shows a brief loading bar zip across the display, then it goes magenta and that's it. I can CTRL+ESC and try again with the same result. ROM name is RASTAN2.MD, which is renamed from Rastan Saga II (USA).md. I've played it briefly in the past on an older MegaYume without issue.

    PS: Using an Eval Board, I've tried both a 4-bit PSRAM add-on (from Rayman) and Parallax's HyperRAM add-on. Both have worked fine for me in the past.

    PPS: I suspect my prior success with MegaYume was using Edge32MB only.

  • Wuerfel_21 Posts: 4,686
    edited 2023-11-25 12:51

    Solid magenta in MegaYume usually means that it's not reading the external RAM properly.

    By the same token, could also be a corrupted ROM file, can happen on SD cards that are being subjected to experiments(TM).

  • evanh Posts: 15,423

    @Wuerfel_21 said:
    ... can happen on SD cards that are being subjected to experiments(TM).

    I've reinit'd the FAT32 filesystem and it boots a _BOOT_P2.BIX bootfile no problem. Okay, I've done the RAM expansion thing to death, maybe I need to try a fresh ROM then. I only have the one ...

  • evanh Posts: 15,423

    Aargh! No, it was the RAM. Don't know what I was doing wrong with Rayman's add-on, but it seems there's a bug in the handling of the HyperRAM add-on. It doesn't work when using a base pin of 32.

  • @evanh said:
    Aargh! No, it was the RAM. Don't know what I was doing wrong with Rayman's add-on, but it seems there's a bug in the handling of the HyperRAM add-on. It doesn't work when using a base pin of 32.

    Really? Have to dig one out to figure out the issue...

  • evanh Posts: 15,423
    edited 2023-11-25 14:00

    Okay, worked out the problem with Rayman's 4-bit PSRAM add-on with some trial and error testing: It needed PSRAM_WAIT = 6 instead of 5.

    ' Enable one of these to select the exmem type to use
    '#define USE_PSRAM16
    '#define USE_PSRAM8
    #define USE_PSRAM4
    '#define USE_HYPER
    
    ' Rayman's 24MB 4-bit PSRAM add-on
    PSRAM_BASE = 40
    PSRAM_CLK = 4+PSRAM_BASE
    PSRAM_SELECT = 5+PSRAM_BASE
    PSRAM_BANKS = 3 ' Only used to stop further banks from interfering
    
    PSRAM_WAIT  = 6
    PSRAM_DELAY = 13
    PSRAM_SYNC_CLOCK = true
    PSRAM_SYNC_DATA = true
    
    ' Uncomment for slower memory clock
    '#define USE_PSRAM_SLOW
    
  • No, you want to increase DELAY. WAIT is for the actual chip latency that needs to be clocked through. If you increase WAIT, there'll be too many clock pulses. Check the RAMCONFIG.MD, it has correct values for all sorts of setups.

  • evanh Posts: 15,423

    Yep, I know. I've not gone looking at the datasheets yet, but my guess is I ended up with a different brand of PSRAM than what you tested with. On that note ...

  • No, you're just being thick. The indicated timing for the Rayslogic 24MB board is:

    PSRAM_WAIT = 5
    PSRAM_DELAY = 15
    PSRAM_SYNC_CLOCK = true
    PSRAM_SYNC_DATA = true
    

    If you do WAIT=6 and DELAY=13, that results in the same read offset, but there'll be an extraneous clock pulse at the end of each transfer. That doesn't really matter, but it's just more correct that way.
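
    As a worked check of the "same read offset" claim: the two settings quoted in this thread are consistent with one WAIT clock being worth two DELAY units. That ratio is only inferred from these two pairs and is not taken from the MegaYume source, and `read_offset` is a made-up name for illustration.

```python
def read_offset(wait, delay):
    # Assumption inferred from the two settings quoted in the thread:
    # one WAIT clock shifts the read point by two DELAY units.
    return 2 * wait + delay

indicated = read_offset(5, 15)    # values indicated for the 24MB board
trial = read_offset(6, 13)        # values found by trial and error
```

    Under that assumption both come out to the same offset, and the only difference is the extra clock pulse that WAIT=6 sends at the end of each transfer.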

  • evanh Posts: 15,423
    edited 2023-11-25 14:29

    I tried 5 and 15; it gave a skewed magenta display at the load menu. I didn't bother loading the game.

    Datasheet for the APS6404L PSRAM says 6 for both Quad and QPI

  • ????

    Okay now I'm confused. How did WAIT=5 ever work then?

  • @evanh said:
    I tried 5 and 15; it gave a skewed magenta display at the load menu. I didn't bother loading the game.

    Skewed? You have a visual of this? Sounds weird.

  • evanh Posts: 15,423

    @Wuerfel_21 said:
    Skewed? You have a visual of this? Sounds weird.

    Argh! I can't reproduce it. That 5 + 15 now works! And so does 6 + 13.

  • evanh Posts: 15,423

    So, does that mean the correct value for PSRAM_WAIT is one less than what the datasheet says?

  • @evanh said:
    So, does that mean the correct value for PSRAM_WAIT is one less than what the datasheet says?

    I want to say yes?

  • evanh Posts: 15,423

    NeoYume working on Eval board with Rayman's 96MB add-on using the following config:

    '#define USE_PSRAM16
    #define USE_PSRAM8
    '#define USE_PSRAM4
    '#define USE_HYPER
    
    PSRAM_BASE = 32
    PSRAM_CLK = 8+PSRAM_BASE addpins 1
    PSRAM_SELECT = 10+PSRAM_BASE
    PSRAM_BANKS = 6
    
    ' \/ Uncomment if PSRAM_BANKS = 1 for speedup
    '#define USE_PSRAM_NOBANKS
    
    PSRAM_WAIT  = 5
    PSRAM_DELAY = 16
    PSRAM_SYNC_CLOCK = false
    PSRAM_SYNC_DATA = false
    
    ' \/ Uncomment for lameness
    #define USE_PSRAM_SLOW
    

    With the default registered pins it started okay, but it didn't take long before it crashed.

  • Wuerfel_21 Posts: 4,686
    edited 2023-11-26 01:13

    Well isn't that nice. There's a bit of a catch to NeoYume reliability: the ROMs are neatly segmented into graphics, sound and code. Corruption in the former two is entirely transient and can never crash the system. The CPUs are completely isolated from the contents of Cx/Vx/S1 ROMs. They can only see M1 (Z80), Px and the BIOS (68000). Due to the (somewhat arbitrarily defined in neoyume_gamedb) load orders of games, the code ROMs end up in different banks for different games. So if you want to be sure, you actually need to test a few different-sized games.

  • Rayman Posts: 14,161

    I think there's also the slow mode right? That sometimes fixes things...

  • The 96MB board needs slow mode to begin with. Too chunky for the 170 MHz overclock.

  • evanh Posts: 15,423
    edited 2023-11-26 01:54

    It's the only option I have on hand. I can now finally move on to the purpose of this effort - Load-testing of something substantial other than my over-the-top synthetic high-power tests.

    EDIT: I guess I could dig up a smaller game that fits in 16MB. But I'd be waiting for your hyperRAM fix first. I don't want to use my good Eval board for this and my damaged Eval board will only work with the RAM board at base pin P32.

  • The 96MB board should be fine. Mine is perfectly reliable as long as you don't take it outside on a hot day.

    Will look into fixing the HyperRAM mode. Might be tomorrow if it isn't a super quick fix.
