reSound - A sound driver and mixer for the P2

cgracey · 2020-02-01 17:19

Johannes, I'm looking forward to you writing some code to synthesize sounds, from scratch, at high resolution and high sample rate. I think the audio quality will be way higher than this Amiga sound.

The video of the Amiga program showing the sound trigger data scrolling upward was really cool! I love it when very small data drives big things.

rogloh · 2020-02-01 23:46

Update: problem resolved, see later post below.

So I downloaded the latest flexgui 4.1.2 and retried this drumbeat demo using newest 4.1.2 fastspin and latest bundled loadp2 version on my Mac. Unfortunately I still have the same result on two rev B boards. The reSound drumbeat demo fails to output anything, while mod file tracker demo succeeds on the same pins.

I wonder if the binary generated by Fastspin on Mac OS X is different from the one generated on the PC? I have attached my binaries and matching P2asm files in case @ersmith or @Ahle2 want to take a look at if/why this could be happening. Both demos were compiled to output on pins 6 and 7. I have included both the mod player and the drumbeat demo binaries that Fastspin generated, in case someone wants to see if these work in their setup on pins 6/7 with the A/V breakout board fitted on P0-P7, or possibly diff them against their own output from the Windows version of Fastspin 4.1.2.

Here was the output from the compile and run step. You can see the compiler version is the latest.

"/Users/roger/Downloads/flexgui-2/bin/fastspin.mac" -2 -l -D_BAUD=230400 -O1 -I "/Users/roger/Downloads/flexgui-2/include"  "/Users/roger/Downloads/reSoundAlfaExamples/Ahle2_reSoundSamplePlayback/main.spin2"
Propeller Spin/PASM Compiler 'FastSpin' (c) 2011-2020 Total Spectrum Software Inc.
Version 4.1.2 Compiled on: Jan 31 2020
main.spin2
|-reSound.spin2
main.p2asm
Done.
Program size is 273760 bytes
Finished at Sun Feb  2 10:28:13 2020
/Users/roger/Downloads/flexgui-2/bin/mac_terminal.sh "/Users/roger/Downloads/flexgui-2/bin/loadp2.mac"  -b230400 "/Users/roger/Downloads/reSoundAlfaExamples/Ahle2_reSoundSamplePlayback/main.binary" "-9/Users/roger/Downloads/reSoundAlfaExamples/Ahle2_reSoundSamplePlayback" -k

And loadp2 was run using the version bundled as loadp2.mac...

RLs-MacBook-Pro:~ roger$ /Users/roger/Downloads/flexgui-2/bin/loadp2.mac -b230400 /Users/roger/Downloads/reSoundAlfaExamples/Ahle2_reSoundSamplePlayback/main.binary -9/Users/roger/Downloads/reSoundAlfaExamples/Ahle2_reSoundSamplePlayback -k; osascript -e 'tell application "Wish" to activate'; exit 0
( Entering terminal mode.  Press Ctrl-] to exit. )

potatohead · 2020-02-02 00:13

Ahle2 wrote: »

I will try to figure out what's wrong with the drum beat example. It works for me! It may have something to do with my hacky way of setting the hub pointer reference with an offset into the reSound binary! That's the warning you saw rogloh!

I just got a chance to run everything. Had to go look up the schematic to understand what pins go to the headphones, but other than that, it all worked. I am using the latest FlexGUI, just downloaded a few minutes ago.

Man, those mods sound great. I will echo memories of great times.

Had some fun making up some beats with the simple example shown.

It's been said a couple times, once by me for sure, this is the Amiga of microcontrollers. Kind of felt that today, and while working with Roger's amazing video driver. It's been fun to do some stuff in PAL.

Travel coding with HDMI has been hit or miss. I love this, because I can just package it all up, and use headphones.

I went ahead and listened at default volume, and for my AKG earbuds, that's just about too loud. Subjectively, it's way better than P1, and the 8-bit-ness of the samples is loud and clear. I agree with the ~80 dB assessment based on what I heard with the beat example. It's good.

There will come a day when some of us are composing on a P2 tracker running on the P2 + SD card and external RAM. That will be a fun day.

rogloh · 2020-02-02 01:00

Finally figured out the problem.

Along the way I must have inadvertently compiled the reSound.spin2 file which generated a resound.binary file and clobbered the file with the same name as distributed with the source. Once I refreshed the source from the original zip file contents I now have the drumbeat demo working.

Damn that!

Ahle2 · 2020-02-02 07:44

> @cgracey said:
> Johannes, I was just curious about your subjective experience. I'm happy to hear it sounds good! It may be fine to always use the PWM-dither mode, since there will be maybe a dozen full cycles before a sample change. The noise-dither mode is a lot noisier. Let your ears be the judge.

The SNR will probably be much better using the PWM mode. I will make some measurements with different modes.

Ahle2 · 2020-02-02 07:54

> @Wuerfel_21 said:
> Ahle2 wrote: »
>
> * Can output on up to 8 smart pins for surround sound
>
>
>
>
> Have you thought of implementing surround matrixing? That's probably more commonly supported/useful than discrete surround channels.
> Although doing it properly would require applying a phase shift to the surround channels. I don't even have a clue of how that would work...

The surround option is not a priority at this time. Regarding matrix mixing, I don't like the thought of the driver doing too much processing and in the end being slower for something that will not be used very much. In my opinion Dolby Pro Logic never sounded good. All my surround amplifiers, through the years, have had discrete analog inputs for all channels.

Ahle2 · 2020-02-02 08:03

> @dgately said:
> Really great example of what P2 can do!
>
> Both modules & sample versions working as compiled with fastspin -v 4.1.1 (same as in flexgui v 4.1.1), for pins 6 & 7.
> And, I'm getting the kick drum, without issue. The BassU8 sample works as well. Not as much luck with TrumpetS8, but that could be my choice of frequencies.
>
> dgately

Actually, the examples don't really use the full potential of the driver. You could probably load 4 modules at the same time and let one cog instance process a total of 16 channels at that 4 x 44100 Hz mixing rate. Going down to 44100 Hz, you could process 64 channels with a little bit of overclocking.
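As a rough back-of-the-envelope check of those numbers, here is a small sketch. It assumes the mixer's cost scales linearly with channels times mixing rate, which is only an illustration and not a measured property of reSound (the "little bit of overclocking" suggests real scaling is slightly worse). The function name and parameters are made up for this example.

```python
# Rough channel budget. ASSUMPTION: mixer cost scales linearly with
# (channels x mixing rate); this is illustrative, not measured.
BASE_RATE_HZ = 44_100

def channels_at(rate_hz, ref_channels=16, ref_rate_hz=4 * BASE_RATE_HZ):
    """Channels one cog could mix at rate_hz, given a reference operating point."""
    budget = ref_channels * ref_rate_hz   # channel-samples per second
    return budget // rate_hz

print(channels_at(4 * BASE_RATE_HZ))  # 16
print(channels_at(BASE_RATE_HZ))      # 64
```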

Ahle2 · 2020-02-02 08:05

> @JonnyMac said:
> That's really cool, and sounds great. I am hoping that Chip will write a multi-file SD file reader that can be used to feed this. For lots of projects I need a background track with effects mixed in and played over it as needed.

I guess the P2 will be put to good use in Hollywood doing sound and leddy things? Right!?

Ahle2 · 2020-02-02 08:55

> @potatohead said:
> Ahle2 wrote: »
>
> I will try to figure out what's wrong with the drum beat example. It works for me! It may have something to do with my hacky way of setting the hub pointer reference with an offset into the reSound binary! That's the warning you saw rogloh!
>
>
>
>
> I just got a chance to run everything. Had to go look up the schematic to understand what pins go to the headphones, but other than that, it all worked. I am using the latest FlexGUI, just downloaded a few minutes ago.
>
> Man, those mods sound great. I will echo memories of great times.
>
> Had some fun making up some beats with the simple example shown.
>
> It's been said a couple times, once by me for sure, this is the Amiga of microcontrollers. Kind of felt that today, and while working with Roger's amazing video driver. It's been fun to do some stuff in PAL.
>
> Travel coding with HDMI has been hit or miss. I love this, because I can just package it all up, and use headphones.
>
> I went ahead and listened at default volume, and for my AKG earbuds, that's just about too loud. Subjectively, it's way better than P1, and the 8-bit-ness of the samples is loud and clear. I agree with the ~80 dB assessment based on what I heard with the beat example. It's good.
>
> There will come a day when some of us are composing on a P2 tracker running on the P2 + SD card and external RAM. That will be a fun day.

Yes, the P2 is such a unique chip, just like the Amiga was in its day. I guess Chip is the Jay Miner of microcontrollers then?!

Ahle2 · 2020-02-02 12:17

> @cgracey said:
> Johannes, I'm looking forward to you writing some code to synthesize sounds, from scratch, at high resolution and high sample rate. I think the audio quality will be way higher than this Amiga sound.
>
> The video of the Amiga program showing the sound trigger data scrolling upward was really cool! I love it when very small data drives big things.

A synthesizer is on my list!

Yes, it will be of better quality than 8-bit samples! But that was extremely cool for a computer in the '80s. Especially without taxing the CPU, all in hardware.

JonnyMac · 2020-02-02 16:18

Ahle2 wrote: »

I guess the P2 will be put to good use in Hollywood doing sound and leddy things? Right!?

That is the idea. I have two good friends in the business who lean more to the fabrication side while I focus on the coding side. We have talked about collaborating on a P1 or P2 board for running props and effects. One of the guys does work in the arcade gaming industry, too, so sound is important. We're hoping that the power of the P2 will allow us to build a board that will handle the generic stuff we do, and have an option module that will allow for multi-channel sound. Again, a typical application with sound has a background and needs other sounds overlaid on top of that -- of course, those secondary sounds and their timing are programmatic; they can't be baked into the background.

potatohead · 2020-02-02 17:38

Lol, yeah. We could say Chip is the Jay Miner of micros. Honestly, the two have some things in common.

Re: Dolby

I feel the same way about it. To me, it's a movie format. Not necessarily something one would use to appreciate music. The 7 channel one I have now, on some productions, is pretty respectable for casual, get the feels type experiences. I guess a lot depends on who masters it.

I did not know surround amps came with discrete inputs. Now I need one.

evanh · 2020-02-03 09:12

This is cool. Digging up some mods I note the mod player trips up on the more complex tracks. Light-of-day has stretched/wobbly instruments in a few places. Held it together really well on the whole, I've had much worse mod players.

evanh · 2020-02-03 09:17

I worked out how to reduce the system clock down to 80 MHz. First attempt was funny as, the instruments were fine but the sequencing was running real slow. Second attempt with the mixing frequency made the instruments lower pitch as well. Only after that did I find the tracker sourcecode had a couple of matching constants.

Ahle2 · 2020-02-03 15:28

evanh wrote: »

This is cool. Digging up some mods I note the mod player trips up on the more complex tracks. Light-of-day has stretched/wobbly instruments in a few places. Held it together really well on the whole, I've had much worse mod players.

Thanks Evan!

The module player was made to have something to test reSound with. I know there are a lot of problems and stuff that needs to be implemented, but it was deemed good enough for now. Most modules play okay though. My main focus is getting reSound up to release status first. And I think I'll need to make an example of using reSound as a sound buffer with interrupt handling for wav playback and stuff like that. Is there a stable SD driver that I could use?

rogloh · 2020-02-03 20:34

So I fully disassembled your binary @Ahle2 to learn how you do it (don't worry, won't post here) and if/how it might cope with using a HyperRAM buffer. For that I added symbolic names so in theory I can now fully recompile this code myself and experiment with it a bit. I really like how you've made very good use of skipf and scas and the fifo to do things and allowing the flexibility of multiple output channels without incurring the overhead when you don't need them; this is some high performance code. Fantastic work.

Based on what I saw I think HyperRAM + audio may not be good friends yet given how the mixer works. We will likely need an intermediate hub RAM buffer and something dragging audio into the hub RAM so your mixer can access it. Probably portions of things like large WAV files can be stored/buffered in HyperRAM if they can't be streamed off an SD card in real-time, then brought in pieces to hub RAM before the mixer output.

For MOD file playback and wavetable synthesis, due to the random access nature of the reads, it would be hard to use the HyperRAM directly when shared with video. Because of all the setup overheads, and latency to get the data back, these small HyperRAM accesses have very limited performance and you might only get a few read opportunities per scanline, which won't always work for mixing higher sample rate audio with low resolution video (e.g. 44.1 kHz audio with VGA at a 31.5 kHz line rate). An intermediate transfer buffer in hub RAM may help in situations that make use of sequential data, but it's these random accesses within sample buffers that kill us.
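To put a number on "a few read opportunities per scanline", here is a small sketch of the arithmetic. It assumes one sample fetch per channel per output sample (no interpolation and no prefetch), which is a simplification chosen for illustration; the function name is hypothetical.

```python
# How many sample reads the mixer needs per video scanline.
# ASSUMPTION: one fetch per channel per output sample.
AUDIO_RATE_HZ = 44_100
LINE_RATE_HZ = 31_500   # VGA line rate quoted in the post

def reads_needed_per_scanline(channels,
                              audio_rate_hz=AUDIO_RATE_HZ,
                              line_rate_hz=LINE_RATE_HZ):
    """Average number of sample reads required during one scanline."""
    return channels * audio_rate_hz / line_rate_hz

for ch in (1, 4, 8):
    print(ch, "channels:", round(reads_needed_per_scanline(ch), 2), "reads/line")
```

Under that assumption a 4-channel MOD already needs between five and six reads per line on average, so "a few" opportunities per scanline gets tight quickly.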

cheezus · 2020-02-04 01:44

Ahle2 wrote: »

Is there a stable SD driver that I could use?

I need to update the FSRW port for Rev B still. I've been putting it off until I get my P2D2 but I guess that's going to be a while still? I think you should be able to use SDSPI_Bashed, the waitx may need to be adjusted due to the extra flops. I know the smartpin modes have changed and need to be updated to Rev B. I think the extra flops will change the timing as well. Sorry I haven't got this done yet.
https://forums.parallax.com/discussion/169786/sd-drivers-for-reva-fsrw-working-and-physical-spi#latest

evanh · 2020-02-04 02:46

Sync serial smartpin modes didn't change at all.

Only smartpin changes was the four USB modes compacted into a single mode to make way for the new ADC bitstream filtering modes. Only other I/O change was BITDAC pin mode went dual 4-bit level settable rather than an 8-bit level.

Ahle2 · 2020-02-04 09:23

rogloh,

That's not allowed!

I love the skipf instruction in combination with the fifo. You could go completely dynamic without losing any performance when some options are not needed. Chip is a genius!

The more I work with the P2, the more I see how rigid most other designs are. Not much innovation for decades, apart from performance wise (pipelines, SIMD, faster clock, caches, superscalar.. etc)

evanh · 2020-02-04 10:55

Commercial interests tend to lock innovation behind NDAs, patents and copyright hoarding. That and the PC industry wiped everything else out.

Ahle2 · 2020-02-05 07:39

cheezus wrote: »

Ahle2 wrote: »

Is there a stable SD driver that I could use?

I need to update the FSRW port for Rev B still. I've been putting it off until I get my P2D2 but I guess that's going to be a while still? I think you should be able to use SDSPI_Bashed, the waitx may need to be adjusted due to the extra flops. I know the smartpin modes have changed and need to be updated to Rev B. I think the extra flops will change the timing as well. Sorry I haven't got this done yet.
https://forums.parallax.com/discussion/169786/sd-drivers-for-reva-fsrw-working-and-physical-spi#latest

Thank you, I am looking forward to a finished FSRW port for the P2!

I will try SDSPI_Bashed.. Maybe you could add multiple file support as well?!

Ahle2 · 2020-02-05 07:42

rogloh,

Maybe there is a way of using lut sharing as a cache for audio data with a hyperRAM driver on the other end? Just thinking out loud!!

cheezus · 2020-02-05 23:19

Ahle2 wrote: »

cheezus wrote: »

Ahle2 wrote: »

Is there a stable SD driver that I could use?

I need to update the FSRW port for Rev B still. I've been putting it off until I get my P2D2 but I guess that's going to be a while still? I think you should be able to use SDSPI_Bashed, the waitx may need to be adjusted due to the extra flops. I know the smartpin modes have changed and need to be updated to Rev B. I think the extra flops will change the timing as well. Sorry I haven't got this done yet.
https://forums.parallax.com/discussion/169786/sd-drivers-for-reva-fsrw-working-and-physical-spi#latest

Thank you, I am looking forward to a finished FSRW port for the P2! I will try SDSPI_Bashed.. Maybe you could add multiple file support as well?!

Evanh pointed out that smartpin modes didn't change, and thinking about it I'm not sure if the extra flops change timing? I know they will for the bit-bashed version.. Hopefully sdspi_asm will just work for Rev B? There is still a bug in the bpin setup, see 2nd from last post in thread. I'm not sure how I'd implement multiple file support.. I think as it is you could include multiple copies of FSRW, but really haven't got that far yet.

I'm very excited about this, as well as SidCog2. Been meaning to get the eval accessory board set but money's been tight. I think tonight I'll wire up an audio board and actually play with some audio stuff. Most of my coding has been focused on getting the touchscreen/memory stuff working and now it's fairly functional. I have an idea of using my memory board to run a DAC board, something similar to an AKAI MPC-1000 but that's a way off. It's nowhere near hyper-bus throughput but 16-bit bus is very nice for audio imo.

evanh · 2020-02-05 23:47

Oops, forgot the extra flops, the extra stages apply to all I/O.

For smartpins, it doesn't affect rx but it reduces max tx rate. More precisely it limits how small the tx to sysclock ratio can be. Chip says 5:1 but that is very tight and high sysclocks will likely fail. I usually recommend 8:1 but I guess 6:1 is probably safe up to 240 MHz - Which gives you up to 40 MHz SPI clock.

Will need tested.

PS: And that's without any registered pins. If all are registered then it adds another two to the minimum ratio.

rogloh · 2020-02-06 00:23

Ahle2 wrote: »

rogloh,

Maybe there is a way of using lut sharing as a cache for audio data with a hyperRAM driver on the other end? Just thinking out loud!!

Well that could allow direct transfers between two COGs but it doesn't really solve the issue of small random accesses from a buffer in my view as the HyperRAM driver won't know what audio sample to cache next if it is not simply sequential data. Think of HyperRAM as something that at best can allow up to around 1 million individual random accessed elements (longs/bytes/words) per second at ~252MHz, when nothing else is using it, then scale this down by the fraction of time that it gets used by video/other COGs, which might vary from perhaps 25% to near 100% depending on resolution etc. Or you may have this memory fully dedicated to audio in some applications I guess.

If the audio accesses are just single 16-bit audio samples you might only achieve 2 MB/s out of it this way with exclusive access. That could still do multiple channels at 44.1 kHz, and the mixer COG might be able to do other things while it is waiting for the next sample to arrive after it is requested during this 1 us latency period. But if you make it a burst read instead, you can get MUCH higher performance out of it and the memory can then be shared with other COGs. The HyperRAM used on the Parallax module is rated for up to 200 MB/s at 3 V (when operated at 100 MHz), and newer RAMs coming out are higher still. So with larger read burst sizes you can hit much higher bandwidths.
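The figures above can be turned into a quick estimate. This sketch uses the roughly 1 million random accesses per second quoted earlier and assumes one access per channel per output sample; `video_share` is a made-up parameter for the fraction of HyperRAM time taken by video or other COGs.

```python
# Rough random-access budget for single-sample HyperRAM reads.
# ASSUMPTIONS: ~1M random accesses/s (figure from the post), one access
# per channel per output sample, 16-bit samples.
RANDOM_ACCESSES_PER_S = 1_000_000
SAMPLE_BYTES = 2
AUDIO_RATE_HZ = 44_100

def random_access_budget(video_share=0.0):
    """Return (bytes per second, whole channels at 44.1 kHz) left for audio."""
    accesses = RANDOM_ACCESSES_PER_S * (1.0 - video_share)
    bytes_per_s = accesses * SAMPLE_BYTES
    channels = int(accesses // AUDIO_RATE_HZ)
    return bytes_per_s, channels

print(random_access_budget(0.0))    # exclusive access
print(random_access_budget(0.75))   # video using 75% of the time
```

With exclusive access this gives about 2 MB/s and 22 channels; with video taking 75% of the time it drops to about 5 channels, which is why burst reads into hub RAM look more attractive.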

Audio is interesting because in general it doesn't really need very high bandwidth compared to video, it just needs low latency and either sequential access (playback of a WAV file) or the much more demanding (for HyperRAM) random access if source wavetable data is played back at variable frequencies like your mixer COG does with its mod file samples etc. HyperRAM is great for sequential access but not good for random access because you can't change the address in the middle of the burst access, you need to start a whole new one.

In applications that share HyperRAM with video, perhaps some type of hybrid implementation might be possible where any "active" instrument waveforms are temporarily held in hub RAM while a much larger library or file of them could be stored in HyperRAM and then read into hub RAM with bursts and used whenever instruments change during playback. Then your mixer would only ever need to play back from hub RAM and not incur the memory access penalty from HyperRAM using small accesses. But how well it would work might depend on how frequently these instruments change and how many you are using etc.
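A minimal sketch of that hybrid idea, modelled in Python rather than PASM: active waveforms live in a bounded pool standing in for hub RAM, and a miss triggers one large burst read. Everything here is hypothetical, including the `InstrumentCache` class, the `burst_read` callable standing in for a HyperRAM driver request, and the least-recently-used eviction policy; none of it is part of reSound or any existing driver.

```python
from collections import OrderedDict

class InstrumentCache:
    """Keeps recently used instrument waveforms in a bounded 'hub RAM' pool.

    burst_read is a HYPOTHETICAL callable standing in for a HyperRAM
    burst transfer. A waveform larger than the whole budget is still
    stored, so real code would need to handle that case.
    """
    def __init__(self, burst_read, hub_budget_bytes):
        self.burst_read = burst_read
        self.hub_budget = hub_budget_bytes
        self.used = 0
        self.resident = OrderedDict()   # instrument id -> bytes, in LRU order

    def get(self, instrument_id):
        if instrument_id in self.resident:
            self.resident.move_to_end(instrument_id)   # mark as recently used
            return self.resident[instrument_id]
        data = self.burst_read(instrument_id)          # one large sequential read
        # Evict least recently used waveforms until the new one fits.
        while self.resident and self.used + len(data) > self.hub_budget:
            _, evicted = self.resident.popitem(last=False)
            self.used -= len(evicted)
        self.resident[instrument_id] = data
        self.used += len(data)
        return data

# Usage: count how many burst reads a short instrument sequence causes.
library = {"kick": bytes(4000), "bass": bytes(6000), "lead": bytes(5000)}
bursts = []

def fake_burst_read(name):
    bursts.append(name)
    return library[name]

cache = InstrumentCache(fake_burst_read, hub_budget_bytes=10_000)
for name in ["kick", "bass", "kick", "lead", "kick"]:
    cache.get(name)
print(bursts)   # ['kick', 'bass', 'lead']
```

Five instrument triggers cost only three bursts here; as the post says, how well this works in practice depends on how often instruments change and how many fit in hub RAM.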

jmg · 2020-02-06 00:54

rogloh wrote: »

.... but it doesn't really solve the issue of small random accesses from a buffer in my view as the HyperRAM driver won't know what audio sample to cache next if it is not simply sequential data. Think of HyperRAM as something that at best can allow up to around 1 million individual random accessed elements (longs/bytes/words) per second at ~252MHz, when nothing else is using it, then scale this down by the fraction of time that it gets used by video/other COGs, which might vary from perhaps 25% to near 100% depending on resolution etc.
Or you may have this memory fully dedicated to audio in some applications I guess.

Maybe more advanced Audio + Video systems will allocate TWO HyperRAMs ?
Or, a design could use HyperRAM for Video, and QuadSPI memory for the Audio, for a mid-level pin cost. (LY68L6400SLIT is spec'd at 144MHz if not crossing a page boundary)

evanh · 2020-02-06 01:05

Ah, the advantage of a ratio of 8:1 is there is four sysclocks of lag from the SPI clock rise to the SPI tx data out appearing at its pin. So 8:1 perfectly delays the tx by half the SPI clock period which is nicely bang on the low going SPI clock edge. Exactly as desired.

At 6:1, the tx pin transitions one sysclock after the SPI clock low, leaving still two sysclocks of tx data setup before the next rising SPI clock.

At 5:1 you're down to a single sysclock of setup.
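The setup margins described above follow from simple subtraction, sketched here. The 4-sysclock lag is taken from the post; the function name is made up, and the model ignores registered pins (which, per the earlier PS, add two to the minimum ratio).

```python
# Sketch of the tx setup margin for a sync-serial smartpin.
# ASSUMPTION: tx data appears 4 sysclocks after the SPI clock rise,
# as stated in the post; pins are not registered.
TX_LAG_SYSCLOCKS = 4

def spi_tx_margin(sysclock_mhz, ratio):
    """Return (SPI clock in MHz, sysclocks of tx setup before the next rising edge)."""
    spi_mhz = sysclock_mhz / ratio
    setup_sysclocks = ratio - TX_LAG_SYSCLOCKS
    return spi_mhz, setup_sysclocks

for ratio in (8, 6, 5):
    print(ratio, spi_tx_margin(240, ratio))
```

At 240 MHz this gives 30 MHz with 4 sysclocks of setup at 8:1, 40 MHz with 2 at 6:1, and 48 MHz with only 1 at 5:1.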

rogloh · 2020-02-06 02:10

Yeah jmg there are probably plenty of other solutions. In any case I'd hope we can still come up with something reasonably usable for audio+video with just a single HyperRAM implementation which could be quite common, by trying to remove/reduce the random access load element from the HyperRAM. In many audio cases it may not need to use external memory at all. So this seems mainly to be a problem when you want to do wavetable synthesis with samples in external HyperRAM memory and share the memory with other COGs, particularly with video.

I'm working on this HyperRAM driver right now so if there are special things we can try to add for assisting audio accesses, now is a great time to add them in there. I'm just not sure what they would be given the way HyperRAM works. The only thing I can think of is creating a list of address read requests, instead of a single read address request in the mailbox, but this would only really reduce the setup overheads slightly for the audio requesting COG, and it won't alleviate the latency burden on the HyperRAM itself, so it may not be worth putting in all the extra complexity for that, especially if it slows regular accesses further as well.

Right now the HyperRAM COG uses most (~85%) of the COGRAM space due to loop unrolling optimisations and probably about 1/3 LUTRAM or so to help speed up the bank to device mapping. So there's still some more room left for code / data and I can shuffle a few things around between LUT and COG space if required to rebalance. I don't expect adding HyperFlash support to need too much more space either as I have decided to pass the programming sequence logic burden back to the calling COG. If you browse this data sheet you'll see why I want to do that...

http://www.issi.com/WW/pdf/26KS-KL128S-256S-512S.pdf

jmg · 2020-02-06 03:08

rogloh wrote: »

I'm working on this HyperRAM driver right now so if there are special things we can try to add for assisting audio accesses, now is a great time to add it in there. I'm just not sure what they would be given the way HyperRAM works.

A device with longer tCSM would be nice, and I see the OctaRAM info is now up at ISSI (but they spec the same 4us MAX for tCSM)

http://www.issi.com/US/product-cellular-ram.shtml#jump2
http://www.issi.com/WW/pdf/66-67WVO16M8DALL-BLL.pdf

This part family has a read pattern command, which could be useful with P2 for margin verify checking.
Overall, very similar to HyperRAM, but the 6 bytes Cmd.Adr splits differently. (speed is 133MHz on 3.3V parts)

The 128M HyperRAM with ECC, specs faster 16Mx8 IS66/67WVH16M8EDBLL 2.7-3.6V 133MHz

Ahle2 · 2020-02-06 07:32

Rogloh,

To be honest, it really isn't random access at all. It is a "variable number of multiplexed audio streams of variable read rate". There is order to the chaos in a sense!

But I understand what you are saying. Still, it could be seen as X number of fifos (caches) that are read sequentially at different rates, where the read address loops after X number of reads.
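A small sketch of that view, using a per-channel phase accumulator. The addresses, lengths and step values are invented for illustration, and this is not the actual reSound inner loop. Each stream on its own advances monotonically until it wraps; only the interleaved sequence that reaches the memory looks scattered.

```python
def stream_addresses(start, length, step, count):
    """Yield the sample addresses one channel reads for `count` output samples.

    step is the playback rate relative to the mixing rate (1.0 = original
    pitch). All values here are illustrative.
    """
    phase = 0.0
    for _ in range(count):
        yield start + int(phase)
        phase = (phase + step) % length   # wrap: the read address loops

a = list(stream_addresses(start=1000, length=8, step=0.5, count=6))
b = list(stream_addresses(start=5000, length=4, step=1.5, count=6))

# The mixer visits channels in turn, so the memory sees the two
# streams interleaved.
interleaved = [addr for pair in zip(a, b) for addr in pair]
print(a)            # [1000, 1000, 1001, 1001, 1002, 1002]
print(b)            # [5000, 5001, 5003, 5000, 5002, 5003]
print(interleaved)
```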
