dotsidplayer - a failed attempt. (For now...) — Parallax Forums

jhh Posts: 28
edited 2013-03-01 22:14 in Propeller 1
Hello.

Thought I might share something, even though it didn't quite work out the way I thought it might. I know it's something of some interest to some folks here, and maybe it generates some interesting discussion or maybe someone finds it's a bug I missed vs. a fail, or can re-use bits in a more successful attempt. So I'll post it and see where it goes, as I probably wouldn't spend much more time on it otherwise, and it'll do no good rotting on my hard drive.

It was supposed to be a sidplayer that directly plays .sid files. It's built on a cog for the C3 called c3io I have been playing with, and uses Ahle2's awesome SIDcog.

I took the 6502 core from the pacman NES sources, and bits of the memory access for a port of zog that c3io has, glued SIDcog to the 6502, and wrote a loader based on what I saw in the C sources for tinysid. The 6502 core also gets some spin routines that let you "JSR" into some code and have that return back to spin, which lets one implement the "glue" in SPIN. This is the technique tinysid (and probably most other) "dot" sidplayers use, as it really just involves calling an init function, and then calling the sound routine at 50/100 Hz.
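
As a rough illustration of the loader and the init/play scheme described above, here is a minimal Python sketch. It assumes the standard PSID/RSID header layout (big-endian fields, load address taken from the first two data bytes when the header field is zero). The names `parse_psid_header`, `play_frames` and the `call` callback are hypothetical and are not taken from dotsidplayer or tinysid.

```python
import struct

def parse_psid_header(data: bytes) -> dict:
    """Parse the fixed part of a PSID/RSID header (all fields big-endian)."""
    (magic, version, data_offset, load, init, play,
     songs, start_song, speed) = struct.unpack_from(">4sHHHHHHHI", data, 0)
    if magic not in (b"PSID", b"RSID"):
        raise ValueError("not a PSID/RSID file")
    payload_offset = data_offset
    if load == 0:
        # Load address is stored little-endian in the first two data bytes.
        load = data[data_offset] | (data[data_offset + 1] << 8)
        payload_offset += 2
    if init == 0:
        # An init address of zero means "same as the load address".
        init = load

    def text(offset: int) -> str:
        # Text fields are 32 bytes, NUL-padded.
        return data[offset:offset + 32].split(b"\0", 1)[0].decode("latin-1")

    return {
        "version": version, "load": load, "init": init, "play": play,
        "songs": songs, "start_song": start_song, "speed": speed,
        "name": text(0x16), "author": text(0x36), "released": text(0x56),
        "payload_offset": payload_offset,
    }

def play_frames(call, header: dict, frames: int) -> None:
    """Call init once, then the play routine once per frame.

    `call(addr, a=0)` is a hypothetical hook that JSRs into the emulated
    6502 at `addr` with the accumulator set to `a` and returns on RTS.
    """
    call(header["init"], a=header["start_song"] - 1)  # song number is 0-based
    for _ in range(frames):
        call(header["play"])
```

A real player would pace the loop in `play_frames` at 50 Hz (or 100 Hz for double-speed tunes) rather than calling back to back.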

Awesome, right? Well... not so much yet. It plays... something, but it's not quite right. It makes noise depending on what sid file you throw at it. I put a couple in the archive. "Sweet.sid" looks to run close to fast enough, but generates clicks. "Dawn.sid" probably runs too slow, but generates a wild soundscape that you'll probably want to turn down.

Confirmed problems are:

- It's just not quite fast enough. The problem is probably less the 6502 emulation and more the VM/serial RAM on the C3. Maybe a bigger/smarter cache?
- The above is also why the delay loop is commented out between calls. Putting it in often results in the deadline passing, which sends the prop into the "53 second pause".
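
The "53 second pause" follows from the 32-bit system counter: a wait on a target value that has already passed does not complete until the counter wraps all the way around. A small Python sketch of the arithmetic, assuming an 80 MHz system clock (an assumption; the clock setting is not stated in the post), with hypothetical helper names:

```python
CLKFREQ_HZ = 80_000_000   # assumed system clock, not stated in the post
COUNTER_MODULUS = 1 << 32  # the system counter is 32 bits wide

def wrap_seconds(clkfreq_hz: int = CLKFREQ_HZ) -> float:
    """Seconds for the 32-bit counter to wrap once."""
    return COUNTER_MODULUS / clkfreq_hz

def ticks_until(target: int, now: int) -> int:
    """Ticks a wait-for-counter-match blocks for, given the current count."""
    return (target - now) % COUNTER_MODULUS

frame_ticks = CLKFREQ_HZ // 50            # one 50 Hz frame in clock ticks
on_time = ticks_until(frame_ticks, frame_ticks - 100)  # 100 ticks to spare
too_late = ticks_until(frame_ticks, frame_ticks + 100)  # deadline missed by 100 ticks
print(on_time, too_late / CLKFREQ_HZ)
```

Missing the deadline by even a few ticks turns a short wait into almost a full counter wrap, which is about 53.7 seconds at the assumed clock.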

Suspected problems are:

- the 6502 core I have has bugs the pacman game never triggered? (IE: it implemented just enough?)
- this probably is why Sweet.sid doesn't make much sound despite running fast enough
- there is some subtlety of how I've glued SIDcog that isn't quite right...
- I've made bugs I haven't yet uncovered. :)

So anyway... Have a look if you've got a C3 handy and this sort of thing interests you, or you feel like porting it over to your platform of choice. (boards with more direct memory attachment might do better here...)

Comments

  • HollyMinkowski Posts: 1,398
    edited 2011-04-23 23:29
    Interesting. How many MIPS does your 6502 emulator do?

    6502 asm looks like it would be fun to code in. It must have been
    enjoyable to program for the old Apple computer in 6502 asm.
    I think that old computer only had like 1 MIPS or so; amazing
    they could do much of anything.
  • Ahle2 Posts: 1,179
    edited 2011-04-24 03:11
    I'm not at home now, so I can't test your code on my C3.
    However, I'm pretty sure the problem has to do with the "dummy byte" that I put between each SID channel's registers to "long align" them. This was done to optimize SIDcog so it could run at 31 kHz. Otherwise it would have been running at 15 kHz, and that's not an option. Have a look at the VAR section to see where those "dummy bytes" are. The "updateRegisters" method takes care of the alignment issue when playing register dumps.

    Btw, download the latest version of SIDcog in the OBEX.

    /Ahle2
  • Ahle2 Posts: 1,179
    edited 2011-04-24 03:30
    I can have a look at your code when I get home tomorrow. It might be an issue of how you are handling JSR calls as well.
    Btw, I can see if I can finish my C3 SPI RAM driver (with the help of kuroneko??) that does 20 Mbit/sec using the video generator.
    Then the memory bandwidth certainly isn't an issue anymore.
    I can manage to get it to work like 90% of the time I launch it in a cog. It's an issue with the PLL phase, and I think kuroneko is the man to talk to.

    /Ahle2
  • potatohead Posts: 10,261
    edited 2011-04-24 08:49
    @Holly, it's actually a few hundred K instructions per second! Ops consume 2-7 cycles, depending on their addressing mode, indexed, indirect, absolute, z-page, etc... and their form, simple increment register, add, store, etc..
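
A quick Python check of the "few hundred K instructions per second" figure, assuming a clock of roughly 1.023 MHz for the Apple II (an assumption; the post gives no clock rate) and the 2 to 7 cycles per instruction quoted above:

```python
APPLE_II_CLOCK_HZ = 1_023_000  # assumed approximate clock, not from the post

def instructions_per_second(clock_hz: int, cycles_per_instruction: int) -> float:
    """Throughput if every instruction took the same number of cycles."""
    return clock_hz / cycles_per_instruction

best_case = instructions_per_second(APPLE_II_CLOCK_HZ, 2)   # all 2-cycle ops
worst_case = instructions_per_second(APPLE_II_CLOCK_HZ, 7)  # all 7-cycle ops
print(f"{worst_case:,.0f} to {best_case:,.0f} instructions per second")
```

Real code mixes short and long instructions, so sustained throughput lands somewhere between the two bounds.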

    Yes, it was fun. The Apple, of the different computers, actually shipped to be programmed on and hacked on. In the ROM was a mini line assembler (you compute your own branches) and a memory monitor. That's where my first assembly language programs were done. SPIN runs somewhere in the range of where a 6502 does in assembly. I've written a few basic test programs and SPIN holds up well against an old 6502!

    If you want to have a play there are plenty of emulators. Should you do that, be sure and check out a 6809 machine. That's the most elegant and powerful 8 bitter ever made, IMHO. Addictive, be warned.

    @jhh: Nice project. I hope you see some success!
  • jhh Posts: 28
    edited 2011-04-24 16:34
    Ahle2, I actually started first by using the spin function to update the registers, and thought that might be a source of slowdowns, and so switched to pointing directly to that same base where the "dummy bytes" are. Here's the wrsid routine called when writing memory that munges the offsets from the 6502 cog:
    wrsid
                            mov     addr, _AB
                            and     addr, sid_mask                  ' mask $FC00
                            cmp     addr, sid_base_addr     wz      ' check for $D400
                  if_nz     jmp     #wrsid_ret                      ' not a SID write, nothing to do
                            mov     addr, _AB
                            and     addr, #$1F                      ' mask lower 5 bits
                            cmp     addr, #6                wc, wz  ' munge it into nice longs for sidcog
                  if_a      add     addr, #1                        ' past voice 1: skip its dummy byte
                            cmp     addr, #14               wc, wz
                  if_a      add     addr, #1                        ' past voice 2: skip its dummy byte
                            cmp     addr, #22               wc, wz
                  if_a      add     addr, #1                        ' past voice 3: skip its dummy byte
                            add     addr, sidReg                    ' use as offset to sid registers
                            wrbyte  _DB, addr
    wrsid_ret               ret
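
For readers who don't follow PASM, the offset munging in `wrsid` can be mirrored in a few lines of Python. This is only a restatement of the routine above for checking the mapping by hand; the function name `sidcog_offset` is hypothetical, and the sketch returns the offset before `sidReg` is added.

```python
SID_MASK = 0xFC00  # same mask as sid_mask in the routine above
SID_BASE = 0xD400  # same value as sid_base_addr

def sidcog_offset(address: int):
    """Map a 6502 address to an offset into SIDcog's long-aligned registers.

    Returns None when the address is outside the SID's address range.
    """
    if (address & SID_MASK) != SID_BASE:
        return None
    reg = address & 0x1F   # lower 5 bits select the SID register
    if reg > 6:            # past voice 1: skip the first dummy byte
        reg += 1
    if reg > 14:           # past voice 2: skip the second dummy byte
        reg += 1
    if reg > 22:           # past voice 3: skip the third dummy byte
        reg += 1
    return reg

for register in (0, 6, 7, 13, 14, 20, 21, 24):
    print(register, "->", sidcog_offset(SID_BASE + register))
```

With this mapping each voice's seven registers start on an 8-byte boundary (offsets 0, 8 and 16) and the filter/volume registers land at offsets 24 to 27.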
    

    It might be something about the JSRs as you say, but when I compare the 6502 code from tinysid to that of the cog, it looks like there might be some differences in addressing mode handling that might cause incorrect interpretations. The 6502 cog does all addressing handling before the opcode, where tinysid sets a "mode" and calls the opcode, which may take different paths in get/set memory routines depending on the mode. I actually started to rework the 6502 cog to look more like that but it looked like it would get too big to fit and shelved it.

    I just came across another 6502 core called 'q6502.spin' which seems more complete, but uses LMM. I am just trying to wrap my head around what LMM is all about... but my understanding is speed is less than non-LMM assembly, so I'm not sure it's a good candidate to work with for this project.

    That routine you're talking about for SPI sounds awesome. I'm sure I could increase my speed 3-4x too by just ditching the 'bidirectional SPI' I have in c3io right now and going with something using the hardware timers (i.e. as in Kye's SD card driver).

    Thanks for having a look. :)
  • Ahle2 Posts: 1,179
    edited 2011-04-25 02:43
    I tried to run it on my C3, but the file couldn't be read from the card. I know the card is okay and works on my C3 when running other applications.

    Here is the output:
    Starting IO/Cache driver...0000FFFF
    Mounting SD...0000FF9D
    Starting SIGCog...
    Starting 6502 Cog...
    Loading: Dawn.sid

    0000FFFF
    Name:
    Authour:
    Copyright:

    Clearing RAM: ................................

    Loading SID File to RAM:  -1 Bytes Loaded.
    Done
    load_addr: 0000FFFF
    init_addr: 0000FFFF
    play_addr: 0000FFFF

    Running the sidfile init!
    

    Your offset handling seems to be correct as far as I can see, but I don't understand the "add addr, sidReg"; it's set to 0 anyway?!

    I'm not sure what you mean by how the "6502 cog" vs tinysid handles addressing modes.
    In the 6502 emulator I wrote for the SID dumper, opcodes are handled like:
    - First the address is "calculated" using the selected addressing mode.
    - Then the actual instruction is interpreted.
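
The two-phase scheme described above (resolve the address first, then run the instruction) can be sketched in Python as follows. This is an illustrative toy covering only a few LDA/STA opcodes and ignoring status flags and cycle counts. It is not the code of the SID dumper, the 6502 cog, or tinysid, and the names `resolve`, `step` and `OPCODES` are hypothetical.

```python
def resolve(mode: str, cpu: dict, mem: bytearray) -> int:
    """Phase 1: consume the operand bytes and return the effective address."""
    pc = cpu["pc"]
    if mode == "imm":                      # operand is the byte after the opcode
        cpu["pc"] += 1
        return pc
    if mode == "zp":                       # zero page
        cpu["pc"] += 1
        return mem[pc]
    if mode == "zpx":                      # zero page indexed by X, wraps in page 0
        cpu["pc"] += 1
        return (mem[pc] + cpu["x"]) & 0xFF
    if mode == "abs":                      # 16-bit little-endian address
        cpu["pc"] += 2
        return mem[pc] | (mem[pc + 1] << 8)
    raise ValueError(f"unknown addressing mode {mode!r}")

def lda(cpu: dict, mem: bytearray, addr: int) -> None:
    cpu["a"] = mem[addr]                   # flags omitted in this sketch

def sta(cpu: dict, mem: bytearray, addr: int) -> None:
    mem[addr] = cpu["a"]

# opcode -> (instruction handler, addressing mode)
OPCODES = {
    0xA9: (lda, "imm"), 0xA5: (lda, "zp"), 0xB5: (lda, "zpx"), 0xAD: (lda, "abs"),
    0x85: (sta, "zp"), 0x8D: (sta, "abs"),
}

def step(cpu: dict, mem: bytearray) -> None:
    """Fetch one opcode, resolve its address, then execute it."""
    opcode = mem[cpu["pc"]]
    cpu["pc"] += 1
    handler, mode = OPCODES[opcode]
    addr = resolve(mode, cpu, mem)         # phase 1: addressing mode
    handler(cpu, mem, addr)                # phase 2: the instruction itself

mem = bytearray(0x10000)
mem[0x1000:0x1005] = bytes([0xA9, 0x0F, 0x8D, 0x18, 0xD4])  # LDA #$0F ; STA $D418
cpu = {"pc": 0x1000, "a": 0, "x": 0}
step(cpu, mem)
step(cpu, mem)
```

Because the handlers only ever see a resolved address, each instruction is written once and reused across all of its addressing modes.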
  • jhh Posts: 28
    edited 2011-04-25 16:39
    Hrmmm. Interesting. This one is using fsrw which I think only likes max 2GB FAT16 filesystems, so that might be a part of it. Maybe I'll flip it over to use Kye's driver as it's a lot more forgiving to what it reads.

    sidReg is set on init of the 6502 cog to the pointer returned when starting SIDCog (IE: the ch1 frequency low byte value passed as PAR to the cog). So sid register offset (0-27) + the base to get the right address for SIDcog to read.

    The way you describe your dump utility seems right along the lines of how the 6502 cog is implemented, and I've heard the results of that, so maybe the approach isn't the issue (although the cog might still be missing something the utility does.)

    Was the dump tool you wrote something for which source is available? Perhaps I could compare the two.
  • lonesock Posts: 917
    edited 2011-04-25 17:35
    FSRW 2.6 can handle FAT32 just fine (and is pretty quick), but only Kye's code can handle directories.

    Off topic, I think the name of the player should be "El SID". Probably already taken, but still... [8^)

    Jonathan
  • jhh Posts: 28
    edited 2011-04-25 18:38
    I'm not sure what it is then. I do have filesystems that fsrw hates but Kye's code copes with. Maybe it's to do with how it's partitioned?

    Might be some other problem too, like a bug in the sd init code of c3io triggered by a card I haven't come across yet. (c3io uses code inspired by, but not the same as, Kye's).
  • Ahle2 Posts: 1,179
    edited 2011-04-25 22:42
    No, the source code to the dumper isn't published. I can dig through my HD and see if I can find the source to the latest version.

    My experience is that FSRW always works on any card. The same card that didn't work with your player works flawlessly on the same C3 with my chiptune player using FSRW.
    On the other hand, I have seen some cards that don't work with Kye's fatengine.
  • jhh Posts: 28
    edited 2011-04-26 18:00
    I'll take it as a bug report and look into it. :)

    Question: Ahle2, do you typically use BST or the official tools? (Trying to figure out what might be different....only thing I can think of...)
  • lonesock Posts: 917
    edited 2011-04-27 09:10
    Assuming you are using FSRW 2.6, make sure you are using the "safe_spi.spin" block driver (called from fsrw.spin). That block driver still does:

    * SD / SDHC / MMC support
    * multi-block mode
    * read-ahead
    * write-behind
    * SPI output at 20 Mb/s
    * SPI input was slowed down to 10 Mb/s.

    That slowdown was to work around some cog<=>pin issues that show up when using some combinations (only at PLL16X).

    I do have a slightly newer version of the block driver if you want to test it out. The code is a bit smaller, the Spin calling routines were reordered and optimized a bit, and the initialization routines seem to handle coming out of certain weird states better (like the prop being reset in the middle of a multi-block transfer to the SD card).

    Jonathan
  • jhh Posts: 28
    edited 2011-04-27 19:11
    I'm going to go back and look at this all again. I think I started with other code: the adaptations to the C3 on the ftp.propeller-chip.com site. I had a quick look at the attachment and it looks like a nice bit of code, but different from where I started.

    Is the official 2.6 FSRW in OBEX?

    c3io is meant to be a "replacement cog" specific to the C3's shared SPI bus, to allow safe concurrency by interleaving operations that use the SPI bus. Unfortunately it means mucking about in the low levels of everything, and I'm a bit of a noob to the platform to boot, so I am most certainly breaking things. :) But this is very helpful. c3io is so alpha, it should have its own torpedo of truth tour.

    Spinning.
  • jhh Posts: 28
    edited 2011-04-27 19:50
    So yes, found it in the obex... think I'll go re-write some things in c3io, and spit out a new dotsidplayer at the end of that. I think I was working from an earlier version or something that forked from it some time ago?

    I especially like the 'hub_cog_transfer' bit. I may have to evict some things to make room for such a thing. :) I've been cringing every time I use rdbyte/wrbyte as it seems a wasted opportunity, and that's a nice way around that.
  • Ahle2 Posts: 1,179
    edited 2011-06-22 02:49
    Is this project dead? If so, I might give it a shot myself!

    / Ahle2
  • jhh Posts: 28
    edited 2013-03-01 22:14
    I might dust it off and take another run at this. I had to set aside the various projects I had on the go to focus on other things.

    Anyone else get further while I was away? :)