Console Emulation - Page 12 — Parallax Forums

Console Emulation


Comments

  • Wuerfel_21 Posts: 4,601
    edited 2022-02-12 16:35

    @rogloh said:
    @Wuerfel_21 For your H counter: if you are not yet using ATN, you might be able to send an ATN on each scanline start to the emulator COG from your video driver or your other render COG(s), which interrupts and latches the current P2 clock value in an ISR. Then you can try to interpolate the pixel position on the scanline from the current offset from this time value when you need to read it. How much overhead that would add is unknown (it might need a division operation in the CORDIC, for example). However, if the pixel output timing is very different due to your VDP/compositor COG processing vs. the real HW delays, this may not work quite as well.

    I ended up doing basically that. No interrupt needed, since the render cog already busy-waits on the scanline counter to increment. The overhead is reasonable, since I can precalc the reciprocal of what the elapsed cycles would need to be divided by. Since a scanline is never longer than 2^16 cycles, the fast multiplier can be used for the calculation.

                  ' Compute timing constant for H counter emulation
                  ' (note that "pixel" here means hcounter values, which are actually 2 pixels)
                  rdlong pa,#_clkf
                  qmul pa,##round(6.36e-5/211.0*4294967296.0) ' cy/s * s/px -> cy/px
                  getqy pa ' cycles per pixel
                  getqx pb ' fractional part
                  rolword pa,pb,#1
                  qfrac #1,pa
                  getqx pa ' our timing reciprocal
                  wrword pa,#vdp_cyc2hcnt_w
                  debug("H counter timing constant: ",udec_word_array(#vdp_cyc2hcnt_w,#1))
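For readers who do not speak PASM2, the fixed-point arithmetic in the snippet above can be modelled roughly as follows. This is a sketch under assumptions, not the emulator's code: the function names are hypothetical, the 327_600_000 Hz clock is only an example value (it appears in a video config later in this thread), and the final "multiply, then shift right by 16" step is a guess at how the 16-bit constant is consumed by the fast multiplier.

```python
# Fixed-point model of the H counter timing constant (illustrative sketch only).
HCOUNT_STEPS = 211       # H counter values per scanline, as in the snippet
LINE_SECONDS = 6.36e-5   # scanline duration in seconds, as in the snippet


def hcounter_reciprocal(clkfreq):
    """Return 'H counter steps per clock cycle' as a 0.16 fixed-point word."""
    # Seconds per step as a 0.32 fixed-point constant (the ## operand of QMUL).
    k = round(LINE_SECONDS / HCOUNT_STEPS * 4294967296.0)
    product = clkfreq * k                  # QMUL: 32x32 -> 64-bit product
    whole = product >> 32                  # GETQY: integer cycles per step
    frac = product & 0xFFFFFFFF            # GETQX: fractional cycles per step
    # ROLWORD pa,pb,#1: shift the integer part up, append the top 16 fraction bits.
    cycles_per_step = ((whole << 16) | (frac >> 16)) & 0xFFFFFFFF   # 16.16 format
    recip = (1 << 32) // cycles_per_step   # QFRAC #1,pa: (1 << 32) / pa
    return recip & 0xFFFF                  # WRWORD stores only the low word


def cycles_to_hcounter(elapsed_cycles, recip):
    """Assumed usage: one 16x16 multiply, then drop the 16 fraction bits."""
    assert 0 <= elapsed_cycles < (1 << 16), "must fit the 16-bit multiplier input"
    return (elapsed_cycles * recip) >> 16


if __name__ == "__main__":
    clk = 327_600_000                      # example clock, an assumption
    recip = hcounter_reciprocal(clk)
    full_line = int(clk * LINE_SECONDS)    # cycles in one whole scanline
    print(recip, full_line, cycles_to_hcounter(full_line, recip))
```

The point of precomputing the reciprocal is that the per-read cost becomes a single multiply instead of a CORDIC division, which only works because the elapsed cycle count fits in 16 bits.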
    

    It doesn't fix the demo quite as much as you'd hope, but the textured twister part at 3:14 is now working.

    EDIT: paste in new video link:


    Also, found another issue: The Star Wars-esque text crawl in Monster World IV's attract mode doesn't render. Seems to just be software rendering... This is broken in most other emulators (Exodus in particular hangs on that part!), too, so IDK what the heck is going on with that.


    Alpha 025 changes:

    • Add guesstimated H timing (only correct for H40)
    • Fix some more undocumented 68000 flags (fixes crash in Bloodshot)
    • Extend ROM region to $7FFFFF, as on real MD without MegaCD attached.
  • So, am trying to add HyperRAM support. Problem is that 2:00 AM is not the time to do that....

    It works, but only for certain alignments of the hub-side buffer? Why? That is completely irrelevant to anything. It seems the pattern of working/not working repeats every 8 bytes, which is even stranger (you'd expect 32 bytes = one full set of hub slices).

    CON
    '{
    _CLKFREQ = 300_000_000
    HYPER_ACCESSORY = 0 ' Base pin for P2EVAL HyperRAM board
    HYPER_CLK    =  8+HYPER_ACCESSORY
    HYPER_RWDS   = 10+HYPER_ACCESSORY
    HYPER_SELECT = 12+HYPER_ACCESSORY
    HYPER_BASE   =  0+HYPER_ACCESSORY
    HYPER_RESET  = 15+HYPER_ACCESSORY
    '}
    
    {
    _CLKFREQ = 10_000_000
    '}
    
    HYPER_WAIT  = 28
    HYPER_DELAY = HYPER_WAIT*2+1
    
    DAT
            org
            asmclk
    
    
    
            fltl #HYPER_CLK
            wrpin ##P_TRANSITION|P_OE, #HYPER_CLK
            wxpin #2, #HYPER_CLK
            drvl #HYPER_CLK
    
            wrpin ##P_INVERT_OUTPUT,#HYPER_SELECT
            drvl #HYPER_SELECT
    
            drvh #HYPER_RESET
    
            setxfrq nco_slow
    
            waitx #200
    
            debug("init ok")
            debug(uhex_long(#@romio_area))
    
    
            mov iter,#5
    .lp
            wrlong #"A",##@romio_area
            mov pa,##$100 >> 2
            mov read_longs,#4
            call #readburst
            rdlong tmp1,##@romio_area
            debug(uhex_long(tmp1),lstr(#@romio_area,#16),uhex_byte_array(#@romio_area,#16))
            djnz iter,#.lp
    
            jmp #$
    
    readburst
                  wrfast bit31,read_dest
                  mov tmp1,#(6+HYPER_WAIT)
                  shl read_longs,#3
                  add tmp1,read_longs
                  rczr pa wcz
                  rczl tmp2
                  and tmp2,#%11
                  shl tmp2,#9
                  setbyte pa,#%101_00000,#3 ' read linear burst
                  movbyts pa, #%%0123
                  drvh  #HYPER_SELECT
                  drvl  bus_pinfield
                  xinit addr_cmd1,pa
                  wypin tmp1,#HYPER_CLK
                  xcont addr_cmd2,tmp2
                  setq nco_fast
                  xcont #HYPER_DELAY,#0
                  shr read_longs,#1
                  setword read_cmd,read_longs,#0
                  waitxmt
                  fltl bus_pinfield
                  setq nco_slow
                  xcont read_cmd,#0
                  waitxfi
           _ret_  drvl #HYPER_SELECT
                  'nop
                  'nop
                  'ret ' read as: _ret_ nop. No idea why there needs to be a NOP, but there needs to be.
    
    
    bit31         long  negx
    bus_pinfield  long HYPER_BASE addpins 7
    addr_cmd1     long (HYPER_BASE<<17)|X_PINS_ON | X_IMM_4X8_1DAC8 + 4
    addr_cmd2     long (HYPER_BASE<<17)|X_PINS_ON | X_IMM_4X8_1DAC8 + 2
    read_cmd      long (HYPER_BASE<<17)|X_WRITE_ON| X_8P_1DAC8_WFBYTE
    
    nco_fast      long $8000_0000
    nco_slow      long $4000_0000
    
    read_longs    long 4
    read_dest     long @romio_area
    
    tmp1          res 1
    tmp2          res 1
    iter          res 1
    
    
    DAT
                  orgh
                  alignl
                  byte 0[8] ' <---- WTF????
    romio_area    byte 0[256]
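As background for the "32 bytes = one full set of hub slices" remark before the listing: P2 hub RAM is interleaved across 8 slices, one long (4 bytes) per slice, so an address pattern tied to hub slices would normally repeat every 32 bytes. A minimal sketch of that mapping, with hypothetical helper names:

```python
HUB_SLICES = 8       # the P2 hub RAM is split into 8 slices
BYTES_PER_LONG = 4   # consecutive longs sit in consecutive slices


def hub_slice(byte_addr):
    """Return which hub slice holds the long containing byte_addr."""
    return (byte_addr // BYTES_PER_LONG) % HUB_SLICES


def slices_for_offsets(base, step, count):
    """Slices hit when the buffer start moves by `step` bytes, `count` times."""
    return [hub_slice(base + i * step) for i in range(count)]


if __name__ == "__main__":
    print(slices_for_offsets(0, 4, 9))   # one long at a time: wraps after 8 longs
    print(slices_for_offsets(0, 8, 5))   # 8-byte steps: only every other slice
```

Moving the buffer by 8 bytes lands on a different slice each time until 32 bytes have passed, which is why an 8-byte repeat looked odd in the post above.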
    
  • evanh Posts: 15,274
    edited 2022-02-13 02:09
                  xinit addr_cmd1,pa
                  wypin tmp1,#HYPER_CLK
                  xcont addr_cmd2,tmp2
    

    The pulse gen smartpins are a headbanger ... They don't have a begin-pulses-exactly-now equivalent to XINIT command. Smartpin TRANS and PULSE modes only cycle like XCONT does, ie: Once enabled, with DIR, the cycling of future pulses becomes boundary locked to the period specified with WXPIN. The period is cycling even when no pulses are generated.

    There's a few solutions:

    • Use DIRL clkpin/DIRH clkpin at opportune places in the code.
    • Use WXPIN #1,clkpin/WXPIN period,clkpin the same way. Note the #1 takes effect after completion of prior period.
    • Craft the timing of other hardware to fit. Not usually an option but could be less code.
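The boundary-locking behaviour described above can be pictured with a toy model. It is not cycle-accurate and ignores any fixed pipeline latency in the smartpin; the function name and the time values are hypothetical. The only idea it captures is that the period counter free-runs from the moment the pin is enabled, so a pulse request is held until the next period boundary.

```python
def first_transition_delay(t_request, t_enable, period):
    """Clocks between a pulse request and the next period boundary.

    t_enable is the clock on which the pin was enabled (DIR raised); period
    boundaries then fall on t_enable, t_enable + period, and so on.
    """
    assert period > 0 and t_request >= t_enable
    return (t_enable - t_request) % period


if __name__ == "__main__":
    period = 2  # matches the WXPIN #2 used for HYPER_CLK in the test program
    # Pin enabled once at start-up: the delay depends on when the request lands.
    print([first_transition_delay(t, 0, period) for t in range(100, 104)])
    # Pin re-enabled a fixed 3 clocks before each request: the delay is constant.
    print([first_transition_delay(t, t - 3, period) for t in range(100, 104)])
    # With a period of 1 there is no phase to get wrong.
    print([first_transition_delay(t, 0, 1) for t in range(100, 104)])
```

In this model, re-enabling the pin at a fixed point before the burst (the DIRL/DIRH suggestion) removes the dependence on when the code happens to run. It may also relate to the 8-byte pattern seen earlier, since shifting a hub address by one long shifts hub access timing by one clock and two longs make one full period of 2, but the thread does not confirm that.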
  • @Wuerfel_21 said:
    So, am trying to add HyperRAM support. Problem is that 2:00 AM is not the time to do that....

    Hehe, yeah HyperRAM debug and 2.00AM don't go together well unless you are fully dosed on caffeine. :smile:

    It works, but only for certain alignments of the hub-side buffer? Why? That is completely irrelevant to anything. It seems the pattern of working/not working repeats every 8 bytes, which is even stranger (you'd expect 32 bytes = one full set of hub slices).

    First guess on this is that you might not be giving the streamer FIFO enough time to prepare itself if you are using bit31 on the WRFAST command, which doesn't wait more than 2 cycles. Maybe try a wrfast #0, read_dest instead of wrfast bit31, read_dest to see if that helps with anything.

  • @Wuerfel_21 said:
    Figured out the case of the missing SEGA sound (and the speech samples in Altered Beast) and of course the solution was blindingly obvious - the YM2612 pan registers all reset to $C0 (left and right channels both enabled). I was resetting them to zero, so while the sample was in fact playing, the output was effectively muted until something initialized that register for channel 6.

    I don't know why it took me so long to figure this out. The Sonic 2 disassembly even helpfully notes that the game forgets to set the register:

    Eitherhow, here's a new ZIP and Sonic 3 in action:

    [Pops head up to look around]

    NICE!

    I just had to chime in here and express my appreciation for your work!

  • @rogloh said:

    It works, but only for certain alignments of the hub-side buffer? Why? That is completely irrelevant to anything. It seems the pattern of working/not working repeats every 8 bytes, which is even stranger (you'd expect 32 bytes = one full set of hub slices).

    First guess on this is that you might not be giving the streamer FIFO enough time to prepare itself if you are using bit31 on the WRFAST command, which doesn't wait more than 2 cycles. Maybe try a wrfast #0, read_dest instead of wrfast bit31, read_dest to see if that helps with anything.

    Nah, that doesn't fix it, I obviously tried that.

    @evanh said:

                  xinit addr_cmd1,pa
                  wypin tmp1,#HYPER_CLK
                  xcont addr_cmd2,tmp2
    

    The pulse gen smartpins are a headbanger ... They don't have a begin-pulses-exactly-now equivalent to XINIT command. Smartpin TRANS and PULSE modes only cycle like XCONT does, ie: Once enabled, with DIR, the cycling of future pulses becomes boundary locked to the period specified with WXPIN. The period is cycling even when no pulses are generated.

    Oh. I guess that may have something to do with it? I just worked off the PSRAM code, which uses the maximum speed (one transition every cycle) and thus isn't affected by that.

  • evanh Posts: 15,274

    @Wuerfel_21 said:
    ... which uses the maximum speed (one transition every cycle) and thus isn't affected by that.

    You got it. :)

  • Yep, resetting the clock pin seems to fix everything.

    Now to integrate this into the emulator.

    Very fun with PropTool still not supporting preprocessing.

  • @Wuerfel_21 said:
    Very fun with PropTool still not supporting preprocessing.

    Not! Is there even a safe way to test for PropTool spin vs Flexspin environment that doesn't break PropTool compiles?

  • @rogloh said:

    @Wuerfel_21 said:
    Very fun with PropTool still not supporting preprocessing.

    Not! Is there even a safe way to test for PropTool spin vs Flexspin environment that doesn't break PropTool compiles?

    You mean at compile time? no, not really.

    At runtime you can ofc check for any number of things, but that's silly.


    Got the HyperRAM stuff to mostly work. Z80 doesn't work for some reason.

  • Wuerfel_21 Posts: 4,601
    edited 2022-02-13 13:46

    Okay, got it sorted. Here's the version with HyperRAM support. Please test it if you have the necessary setup.

    To switch to a different memory type search for "HyperRAM as ROM", "PSRAM as ROM" and "HUB RAM as ROM" and make sure that only the sections you want are enabled.

    Yes, the final produkt(tm) will use preprocessor, but right now I need it to work in proptool for ease of development.

    The loadit.rb script has also been upgraded:
    - loader code and related nonsense now live in a memstuff directory
    - loadit.rb can now locate the loader relative to itself
    - Errors while loading or compiling the loader are now reported (but I am still too lazy to print the actual error messages)
    - You may now pass --hyper to use the HyperRAM loader
    - You may now pass -p COMx to override the serial port without editing the script

    So to load game.bin into HyperRAM on COM8, you'd do ruby loadit.rb --hyper -p COM8 game.bin

    Next stop is HDMI support (which I need to do on the P2EVAL, for there is a curious localized shortage of USB to DC barrel cables and I thus can't power the P2EDGE and the crusty HDMI screen at once)

  • evanh Posts: 15,274

    That'll be -p /dev/serial/by-id/usb-Parallax_Inc_Propeller_P2-ES_EVAL_P23YOO42-if00-port0 for me. Long winded but I know it's the right board then. Kind of annoying trying to load one board and getting the other because the connect order changed.

  • evanh Posts: 15,274

    hmm, I've got a choice between Ruby v2.7 via Apt or v3.1 via Snap. Recommendations?

  • @evanh said:
    hmm, I've got a choice between Ruby v2.7 via Apt or v3.1 via Snap. Recommendations?

    I'm still on v2.6, lol. Either should work.

  • evanh Posts: 15,274

    2.7 it is. Next step is there's no game.bin ... and build.sh makes two binaries and I'm guessing those aren't a game anyway.

  • @evanh said:
    2.7 it is. Next step is there's no game.bin ... and build.sh makes two binaries and I'm guessing those aren't a game anyway.

    The build process is a twofer - megayume_upper contains the spin upstart and USB code (don't forget to adjust your pins and video type there). It is assembled for high memory and then binary included into megayume_lower, which contains the emulator core. So the one you need to start is megayume_lower.binary.

    A game ROM you'd have to acquire yourself. Putting "[game name] rom" into Google usually suffices. It should have a .bin, .gen or .md extension. If it is .smd (ye olde interleaved format), go find a different dump.

  • evanh Posts: 15,274
    edited 2022-02-13 15:31

    Okay, so say I had a game.rom, and everything is built for using hyperRAM. What are the steps to then load and run the game?

    EDIT: Oops, make that game.bin. Still the question is about fitting the emulator in with the game.bin

  • Wuerfel_21 Posts: 4,601
    edited 2022-02-13 15:38

    ./build.sh to build the emulator. Once again, mind the USB, audio and video pin settings.
    ruby loadit.rb --hyper -p COMx game.bin to preload the ROM (note that if your RAM board isn't on pins 0..15, you also need to edit memstuff/hyper_loadit.spin2)
    loadp2 -p COMx megayume_lower.binary to start everything up.

  • evanh Posts: 15,274

    cool! I have a picture. :) Gonna have to play with the hsync timings though ...

  • Rayman Posts: 14,032

    Have to try this out now that there's a hyperram version...

    @Wuerfel_21 Are you doing this on a PC? Sounds like it if using PropTool...

    I don't have ruby, looks like I need to figure that out...

  • @evanh said:
    cool! I have a picture. :) Gonna have to play with the hsync timings though ...

    For VGA 2X, the timings should(tm) be completely standard. For the other VGA modes, yeah, those are just whatever. The NTSC modes should match the original hardware timings closely.

    Note: do not touch the vertical timings unless you know what you're doing - if there's too little blanking, that messes up the Vcounter emulation.


    @Rayman said:
    Have to try this out now that there's a hyperram version...

    @Wuerfel_21 Are you doing this on a PC? Sounds like it if using PropTool...

    No of course not, I am doing this on my Amiga that I don't have. ;P

    I don't have ruby, looks like I need to figure that out...

    On Windows, use RubyInstaller.


    HDMI progress report: no progress. Only sadness. Extra fun because I do in fact not own a GHz bandwidth logic analyzer.

  • evanh Posts: 15,274
    edited 2022-02-13 16:17

    How to invert hsync polarity? EDIT: And vsync too. EDIT2: And hblanking is aggressively small too. Any options for changing the clocks or ratios?

    vga3x_config
    long 327_600_000
    ' Timing
    long    3 - 1 ' line multiplier minus one
    long    15 ' native front porch lines
    long    3 ' native sync lines
    long    48 ' native back porch lines
    long    round(2147483648.0/13.0/2.0*1.5) ' Sync NCO value
    long    round(2147483648.0/13.0/2.0*1.5) ' H40 NCO value
    long    round(2147483648.0/13.0/2.0*1.5*256.0/320.0) ' H32 NCO value
    long    %00 ' blanking color
    
    long    X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 8,$00  ' HSync section 1 (front porch)
    long    X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 64,$01 ' HSync section 2 (sync pulse)
    long    X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 8,$00 ' HSync section 3 (back porch)
    long                         0,0 ' HSync padding 4
    long                         0,0 ' HSync padding 5
    long                         0,0 ' HSync padding 6
    long                         0,0 ' HSync padding 7
    long                         0,0 ' HSync padding 8
    
    long    X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 39,$00  ' VSync section 1 (front porch)
    long   X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 2,$01  ' VSync section 2 (sync pulse)
    long   X_DACS_3_2_1_0|X_IMM_1X32_4DAC8 + 39 + 320,$00 ' VSync section 3 (back porch + active)
    long                         0,0 ' VSync padding 4
    long                         0,0 ' VSync padding 5
    long                         0,0 ' VSync padding 6
    long                         0,0 ' VSync padding 7
    long                         0,0 ' VSync padding 8
    
    ' Color conversion
    long    %0_01_1_000_1 ' CMOD mode + flags
    long    $5A_00_00_00 ' CY
    long    $00_5A_00_00 ' CI
    long    $00_00_5A_00 ' CQ
    long    $00_00_00_00 ' CQ XORlternate value
    long    0 ' CFRQ
    
  • Inverting HSync polarity is one of the bits in CMOD, see P2 documentation. For VSYNC, you'd have to add a wrpin(discrete_vsync,P_INVERT_OUTPUT) somewhere to the start function (there are other ways, but that seems the least hacky).

    Since when is VGA sync polarity an actual issue though? I don't think I've ever seen a monitor that cares.

  • evanh Posts: 15,274

    @Wuerfel_21 said:
    For VGA 2X, the timings should(tm) be completely standard. For the other VGA modes, yeah, those are just whatever. The NTSC modes should match the original hardware timings closely.

    Note: do not touch the vertical timings unless you know what you're doing - if there's too little blanking, that messes up the Vcounter emulation.

    Ah, MODE_VGA3X was default, I'll try MODE_VGA2X ...

  • evanh Posts: 15,274

    Okay, MODE_VGA2X works much better. :) Now for some sound cables ...

  • Yeah, VGA3X is just kinda oddball. It works for my particular crappy monitor. It sorta misdetects it as 1024x768, which is its native panel resolution, thus really crisp pixels. For pretty much any other monitor, VGA2X or VGA4X is going to be better. 3X also doesn't quite support interlace mode (because you can't really shift the image by 0.5*3 = 1.5 pixels).

  • evanh Posts: 15,274

    Still haven't found a suitable audio cable. The game, Rastan II, runs though. I've used a USB keyboard. Took a while to find the jump (C) and stab (X) keys.

  • evanh Posts: 15,274

    Ah, right, had to add VGA_BASEPIN + to the audio pins config.
    Sound goes now ... man, Amiga audio was so much better alright ... turning the volume down now.

  • @evanh said:
    Ah, right, had to add VGA_BASEPIN + to the audio pins config.
    Sound goes now ... man, Amiga audio was so much better alright ... turning the volume down now.

    I think Rastan II just has kinda questionable sound design ;P


    On the HDMI front, I've hit an unfortunate roadblock - it seems I cannot get it to produce pixels at half-rate (NCO $06666667). I tried making only the visible area half-rate, but that's no dice, either. Well, will have to invent some terrible hack to feed pixels manually.

  • evanh Posts: 15,274

    NCO numbers like that sometimes need the use of XZERO instead of XCONT. I've not coded a display driver myself so not sure when it helps.
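To put the NCO numbers in this thread into perspective, here is a small sketch of the arithmetic. It assumes the streamer NCO rolls over at bit 31, which matches the thread's own constants ($8000_0000 as nco_fast, and 2147483648.0 as the base in the VGA config). The helper names are hypothetical, and treating the 327_600_000 at the top of vga3x_config as the system clock is an assumption.

```python
NCO_WRAP = 1 << 31   # $8000_0000: the NCO rolls over once per clock at this value


def clocks_per_transfer(nco):
    """Average system clocks between streamer transfers for a given NCO value."""
    return NCO_WRAP / nco


def transfer_rate_hz(clkfreq, nco):
    """Average transfer (pixel) rate in Hz for a given clock and NCO value."""
    return clkfreq * nco / NCO_WRAP


if __name__ == "__main__":
    print(clocks_per_transfer(0x8000_0000))   # nco_fast from the HyperRAM test
    print(clocks_per_transfer(0x4000_0000))   # nco_slow from the HyperRAM test
    print(clocks_per_transfer(0x0666_6667))   # the HDMI half-rate value
    # Sync/H40 NCO exactly as written in vga3x_config.
    vga3x = round(2147483648.0 / 13.0 / 2.0 * 1.5)
    print(transfer_rate_hz(327_600_000, vga3x))
```

By this arithmetic $06666667 works out to very nearly 20 clocks per pixel, i.e. half the rate of a 10-clocks-per-pixel setup. Values like this do not divide 2^31 evenly, so the NCO phase drifts slightly over time, which may be the kind of case the XZERO remark above is about.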
