WAV Play Experiment

JonnyMac · 2021-02-03 20:26

I was inspired by @Baggers to create easy code for playing a WAV file from RAM -- the experimental result is attached. It will play 8- or 16-bit, mono or stereo files. Now that it all seems to be working, I am moving it to a proper object so that I can change stereo volume and playback speed on-the-fly (the object will use one cog).

While we don't have structures in Spin, it is nice that with the P2 we no longer have to worry about word or long alignment, which lets us keep a group of variables in the declared order. WAV files have a 44-byte header that is a mix of longs and words. We can declare variables for the header and simply copy the header bytes into them with an xxxxmove instruction (bytemove, wordmove, or longmove). Neat. I used bytemove to reinforce the idea of the 44-byte header (wordmove and longmove work, too).

Here are the header variables (notice the size mix):

  ' WAV header structure
  ' -- do not change order of these variables

  long  chunkId
  long  chunkSize
  long  format
  long  subChunk1Id
  long  subChunk1Size
  word  audioFormat
  word  numChannels
  long  sampleRate
  long  byteRate
  word  blockAlign
  word  bitsPerSample
  long  subChunk2Id
  long  subChunk2Size

Copying the header from the embedded WAV is this easy now:

  bytemove(@chunkId, p_dat, 44)                                 ' copy WAV header to variables
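
For anyone who wants to check a file's header on a PC before embedding it, here is a minimal Python sketch of the same 44-byte layout. It is not part of the attached Spin2 code. It assumes a canonical PCM file in which the "data" chunk immediately follows the "fmt " chunk, and the names WAV_HEADER, FIELDS, and parse_wav_header are made up for illustration.

```python
import struct

# Little-endian, no padding: 5 longs, 2 words, 2 longs, 2 words, 2 longs = 44 bytes.
# The three chunk IDs and the format tag are read as 4-byte strings for readability.
WAV_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")

FIELDS = ("chunkId", "chunkSize", "format", "subChunk1Id", "subChunk1Size",
          "audioFormat", "numChannels", "sampleRate", "byteRate",
          "blockAlign", "bitsPerSample", "subChunk2Id", "subChunk2Size")


def parse_wav_header(data):
    """Unpack the first 44 bytes of a canonical PCM WAV image into a dict."""
    return dict(zip(FIELDS, WAV_HEADER.unpack_from(data, 0)))


# Build a fake 16-bit stereo 44.1 kHz header (1000 bytes of audio) and read it back.
demo = WAV_HEADER.pack(b"RIFF", 36 + 1000, b"WAVE", b"fmt ", 16,
                       1, 2, 44100, 44100 * 4, 4, 16, b"data", 1000)
hdr = parse_wav_header(demo)
print(WAV_HEADER.size, hdr["numChannels"], hdr["sampleRate"], hdr["bitsPerSample"])
```

The "<" prefix tells struct to use little-endian byte order with no alignment padding, which is the same assumption the bytemove trick makes about the variable block.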

This is the program output to PST (ANSI terminals seem to work, too -- change T_TYPE constant).

Code updated 05 FEB 2021. Includes bug fix (thanks, Chip!) and simplification of the volume control code.

Baggers · 2021-02-03 23:10

That's awesome @JonnyMac

dgately · 2021-02-04 02:45

flexprop may not agree with PropTool in byte moving the WAV header structure...

Copying the vars one at a time results in a good display (though the WAV file does not play correctly):

Two sounds are heard (one at high volume and one at low volume), but each is just a "PHHHT"-like sound.

JonnyMac · 2021-02-04 03:05

The ZERP file has a 1kHz sine on one channel and a 440Hz sine on the other -- the 16-bit version should sound very clear. Perhaps @ersmith can sort out why this program doesn't work in FlexProp -- it's really pretty straightforward. I only tested with Propeller Tool, but did check the ANSI output using PuTTY.

The audio playback gets what it needs from the header without using any of the displayed global variables. The object I'm writing is based on that inline assembly.

JonnyMac · 2021-02-04 03:36

I thought I'd have a look to see if I could help out, but I was only reminded why I have been so turned off by Spin compilers. How does a line of code that seems like it could translate to this:

__byte_move     rdbyte    tmp, arg02
                wrbyte    tmp, arg01
                add       arg02, #1
                add       arg01, #1
                djnz      arg03, #__byte_move
                ret

... become this?

__system__bytemove
        mov     _var01, arg01
        cmps    arg01, arg02 wcz
 if_ae  jmp     #LR__0142
        loc     pa,     #(@LR__0136-@LR__0135)
        call    #FCACHE_LOAD_
LR__0135
        cmps    arg03, #3 wcz
 if_be  jmp     #LR__0137
        rdlong  _var02, arg02
        wrlong  _var02, arg01
        add     arg01, #4
        add     arg02, #4
        sub     arg03, #4
        jmp     #LR__0135
LR__0136
LR__0137
        mov     _var03, arg03 wz
 if_e   jmp     #LR__0148
        loc     pa,     #(@LR__0140-@LR__0138)
        call    #FCACHE_LOAD_
LR__0138
        rep     @LR__0141, _var03
LR__0139
        rdbyte  _var02, arg02
        wrbyte  _var02, arg01
        add     arg01, #1
        add     arg02, #1
LR__0140
LR__0141
        jmp     #LR__0148
LR__0142
        add     arg01, arg03
        add     arg02, arg03
        mov     _var04, arg03 wz
 if_e   jmp     #LR__0147
        loc     pa,     #(@LR__0145-@LR__0143)
        call    #FCACHE_LOAD_
LR__0143
        rep     @LR__0146, _var04
LR__0144
        sub     arg01, #1
        sub     arg02, #1
        rdbyte  _var02, arg02
        wrbyte  _var02, arg01
LR__0145
LR__0146
LR__0147
LR__0148
        mov     result1, _var01
__system__bytemove_ret
        ret

The truth is, I don't care because I'm not using anything but official Parallax tools at the moment, but those using non-Parallax compilers might want to know why it takes so much code for such a simple function.

whicker · 2021-02-04 04:42

Based on the example by dgately, it looks like Flexprop is reordering the variables so that longs are first, followed by words. There's no other context to tell it not to.

This variable reordering is more "normal" in the world of compilers. It's for this reason that the struct keyword was invented; it means "don't mess with this."

https://stackoverflow.com/questions/9486364/why-cant-c-compilers-rearrange-struct-members-to-eliminate-alignment-padding#9487640
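
To see why a longs-first reordering breaks the 44-byte copy, this Python sketch compares each field's offset in declared order with its offset after sorting longs ahead of words. The sort is only an assumed model of the reordering described above, not FlexProp's actual algorithm, and the helper name offsets is hypothetical.

```python
# (name, size in bytes) in the order the variables are declared in the first post.
DECLARED = [
    ("chunkId", 4), ("chunkSize", 4), ("format", 4),
    ("subChunk1Id", 4), ("subChunk1Size", 4),
    ("audioFormat", 2), ("numChannels", 2),
    ("sampleRate", 4), ("byteRate", 4),
    ("blockAlign", 2), ("bitsPerSample", 2),
    ("subChunk2Id", 4), ("subChunk2Size", 4),
]


def offsets(fields):
    """Return {name: byte offset} for fields packed back to back."""
    out, pos = {}, 0
    for name, size in fields:
        out[name] = pos
        pos += size
    return out


declared = offsets(DECLARED)
# Assumed model: all longs first, then words, keeping declared order within each size.
reordered = offsets(sorted(DECLARED, key=lambda f: -f[1]))

moved = [name for name, _ in DECLARED if declared[name] != reordered[name]]
for name in moved:
    print(f"{name:14s} file offset {declared[name]:2d} -> variable offset {reordered[name]:2d}")
```

Under this model the first five longs stay put, but sampleRate lands at offset 20, so a straight 44-byte copy fills it with the file's audioFormat and numChannels words instead. That would match a display that is wrong from the format fields onward.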

whicker · 2021-02-04 05:53

It's too bad computing has diluted the meaning of terminology. I went down the trail of researching record as opposed to structure. But even with records usually being more rigid in their implementation, they still come with system-dependent padding and alignment concerns such that you can't be sure what you ultimately get.

A data block might still mean an exact and unambiguous representation of how variables are laid out in memory. Consistent in the order they're declared and their size and their absence of secret padding... but even then I'm not so sure.

Wuerfel_21 · 2021-02-04 11:54

@whicker said:
Based on the example by dgately, it looks like Flexprop is reordering the variables so that longs are first, followed by words. There's no other context to tell it not to.

This is intended behavior for Spin1, but apparently not for Spin2.

@JonnyMac said:
I thought I'd have a look to see if I could help out, but I was only reminded why I have been so turned off by Spin compilers. How does a line of code that seems like it could translate to this:
[...]
... become this?
[...]
The truth is, I don't care because I'm not using anything but official Parallax tools at the moment, but those using non-Parallax compilers might want to know why it takes so much code for such a simple function.

Well, it's simple:

If you wrote bytemove like you did there it wouldn't work in all cases. Read the documentation or spinterpreter source. It copies upwards or downwards depending on where the source/destination are to avoid overwriting something that it will later want to read.
It's faster that way - The function as generated by flexspin will be vastly faster than a simple rdbyte/wrbyte loop for big bytemoves. (Smaller constant-length bytemoves are instead inlined into the caller)
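
The overlap point can be illustrated with a short Python sketch (a model of the behavior, not the FlexProp or interpreter source). naive_bytemove copies forward only, like the simple rdbyte/wrbyte loop; safe_bytemove picks the direction from the relative position of source and destination. Both function names are made up here.

```python
def naive_bytemove(mem, dst, src, count):
    """Forward-only copy, like the simple rdbyte/wrbyte loop."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def safe_bytemove(mem, dst, src, count):
    """Copy upward or downward so overlapping regions are not clobbered."""
    if dst <= src:
        indices = range(count)             # destination below source: copy upward
    else:
        indices = reversed(range(count))   # destination above source: copy downward
    for i in indices:
        mem[dst + i] = mem[src + i]


a = bytearray(b"ABCDEFGH")
naive_bytemove(a, 2, 0, 4)   # source bytes are overwritten before they are read
b = bytearray(b"ABCDEFGH")
safe_bytemove(b, 2, 0, 4)    # "ABCD" arrives intact at offset 2
print(a, b)
```

With the destination two bytes above the source, the forward-only loop leaves "ABABABGH" while the direction-aware version leaves "ABABCDGH".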

So pack your prejudice away for today.

Baggers · 2021-02-04 14:49

Wow, so glad I use PASM. It's like the early days of C compilers: they were terrible compared to pure asm, and it took many, many years before they became good.

ersmith · 2021-02-04 15:58

FlexProp is optimized for speed, not for simplicity of the generated code. In the case of bytemove, it has to handle both upwards and downwards moves (that's a Spin language requirement). For the P2 it takes advantage of the ability to do long moves to speed up the copy in a common case. Finally there's also the FCACHE code that copies the loop into local memory so it can run even faster. All of this adds complication. Yes, if we didn't care about correctness or about speed we could do the straightforward translation you proposed @JonnyMac .

The WAV file problem is, as @Wuerfel_21 and @whicker figured out, due to the object variables being re-ordered the same way as in Spin1. There are some distinct performance advantages to this (unaligned reads/writes are possible on P2, but slower). If enough objects rely on variable ordering then I'll re-visit this.

@Baggers: FlexProp isn't for everyone, but performance wise it's not "terrible". The code is complicated and "ugly" precisely because it's trying to be fast.

Wuerfel_21 · 2021-02-04 16:16

Or to bring the topic back to audio playback: I think P2 is powerful enough to run Tremor, the integer-only OGG decoder. Someone should try that.

JonnyMac · 2021-02-04 16:28

The WAV file problem is, as @Wuerfel_21 and @whicker figured out, due to the object variables being re-ordered the same way as in Spin1.

That only explains the broken display of the header information, which Dennis sorted on his own. The playback code is atomic and not reliant on the variables used in the display.

JonnyMac · 2021-02-04 17:12

@dgately I know why audio playback is not working: The Spin compiler stores the system frequency (clkfreq) at hub address $44, while FlexProp puts the system frequency at hub address 20 ($14).

The fix (I'm asking you to do this manually to verify what I did works [I do have proper audio from FlexProp after these changes]).
-- Add a local variable to the play_wav() method called systix.
-- Add this line before the inline pasm code
systix := clkfreq

-- Look for this line:
rdlong smpltix, #$44

...which you'll find about six lines above the .fix_level label. Change that line to this:
mov smpltix, systix

I will try to figure out an elegant way to deal with this incompatibility issue with my WAV player object.

Wuerfel_21 · 2021-02-04 17:17

Ye, looks fine to me, so that not working is probably an actual bug to add to the pile. (written before the post above)

Unrelatedly: SAL is not the way to scale a sample. Use SHL. Or if you want proper levels, MUL/MULS or ROLBYTE+SAR.

JonnyMac · 2021-02-04 17:50

Did I misunderstand SAL? It appears to be a left shift that pads with the original bit0, so $7F is promoted to $7FFF instead of $7F00 as with SHL (assuming a shift value of 8). I've tried both and there is no discernible difference in the audio.

Assembly is not my core strength, and as most of my code is for public consumption I tend to keep things very simple (i.e., public code is not highly optimized). This is the code in question:

.mono8          rdbyte    ls, p_wav                             ' read sample
                add       p_wav, #1                             ' point to next
                subs      ls, #$80                              ' convert to signed
                sal       ls, #7                                ' expand sample, make 1v p-p
                mov       rs, ls                                ' copy to right channel
                jmp       #.set_volume

My thoughts: This is an unsigned byte so subtracting $80 gets us to a signed long. The SAL #7 promotes it to 16 bits and then divides by two; this gives a maximum of 1v peak-to-peak from the DAC into the external amplifier.

You have a lot more experience with PASM and A/V coding than I have. I'd love to see how you would do this. I put up this experimental code so that it could be improved before being folded into a general-purpose object.

Wuerfel_21 · 2021-02-04 18:09

You understood SAL correctly, it's just that doing it that way effectively halves the resolution, because, for example, $71 becomes $71FF and $72 becomes $7200, which is almost the same value.

I think this would be more correct (untested though)

.mono8          rdbyte    ls, p_wav                             ' read sample
                add       p_wav, #1                             ' point to next
                subs      ls, #$80                              ' convert to signed
                muls      ls, #$0101                            ' expand sample
                sar       ls, #1                                ' make 1v p-p
                mov       rs, ls                                ' copy to right channel
                jmp       #.set_volume
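
For readers who want to compare the scalings without hardware, here is a small Python model of the three approaches discussed above. It assumes SAL fills the vacated low bits with the original bit 0, as described in this thread; the function names are hypothetical and the model ignores everything except the arithmetic.

```python
MASK32 = 0xFFFF_FFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a signed long."""
    x &= MASK32
    return x - (1 << 32) if x & 0x8000_0000 else x


def sal(value, shift):
    """Shift left, filling vacated bits with the original bit 0."""
    fill = ((1 << shift) - 1) if value & 1 else 0
    return to_signed32((value << shift) | fill)


def shl(value, shift):
    """Plain shift left, filling vacated bits with zeros."""
    return to_signed32(value << shift)


def expand_muls(raw_byte):
    """Model of subs #$80 / muls #$0101 / sar #1 for one unsigned 8-bit sample."""
    signed = raw_byte - 0x80          # -128..127
    return (signed * 0x0101) >> 1     # byte replicated into both halves, then halved


# Adjacent input codes $71 and $72, shifted by 8 as in the example above.
print(hex(sal(0x71, 8)), hex(sal(0x72, 8)))   # 0x71ff 0x7200: almost the same value
print(hex(shl(0x71, 8)), hex(shl(0x72, 8)))   # 0x7100 0x7200: one full step apart
print(expand_muls(0x00), expand_muls(0x80), expand_muls(0xFF))
```

In this model SAL makes each odd code land one count below the next even code, which is the "halved resolution" effect, while the multiply by $0101 keeps the steps even and reaches close to full scale at both ends.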

dgately · 2021-02-04 18:13

@JonnyMac said:
@dgately I know why audio playback is not working: The Spin compiler stores the system frequency (clkfreq) at hub address $44, while FlexProp puts the system frequency at hub address 20 ($14).

The fix (I'm asking you to do this manually to verify what I did works [I do have proper audio from FlexProp after these changes]).
-- Add a local variable to the play_wav() method called systix.
-- Add this line before the inline pasm code
systix := clkfreq

-- Look for this line:
rdlong smpltix, #$44

...which you'll find about six lines above the .fix_level label. Change that line to this:
mov smpltix, systix

I will try to figure out an elegant way to deal with this incompatibility issue with my WAV player object.

Yes, this gets WavFile3 (zerp-16.wav) to play, and it sounds the same as playing that file from my Mac Desktop. The other 3 files are distorted and either very loud (01_milshot.wav) or very quiet (zerp-8 & 8.wav).

Wuerfel_21 · 2021-02-04 18:16

You might also want to add support for A-law compression (logarithmic PCM). It's 8 bits per sample, but it sounds so much better than straight 8 bit PCM and takes little code to decompress.
Here's the code for that (in P1 ASM, can be optimized a bit for P2):

              '' Decompress A-law
              '' Compressed sample is in atmp1
              '' output is in sfxsample, scale determined by ALAW_BASESHL (= 19 for 31 bit, would be 3 for 15 bit)
              '' alaw_bias is a constant: 1<<(ALAW_BASESHL-1)  
              '' alaw_leading is a constant: $10<<ALAW_BASESHL
              xor atmp1,#$55
              mov sfxsample,atmp1
              and sfxsample,#$0F ' mantissa isolated
              shl sfxsample,#ALAW_BASESHL
              add sfxsample,alaw_bias
              mov atmp2,atmp1
              shr atmp2,#4
              and atmp2,#7 wz ' exponent isolated
        if_nz add sfxsample,alaw_leading
        if_nz sub atmp2,#1
              shl sfxsample,atmp2
              test atmp1,#$80 wc
              negnc sfxsample,sfxsample
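
As a cross-check, here is a line-for-line Python port of the routine above, using ALAW_BASESHL = 3 (the "15 bit" case mentioned in the comments). It is a host-side sketch for experimenting with sample values, not tested P2 code, and the function name alaw_decode is made up.

```python
ALAW_BASESHL = 3                       # 3 gives roughly 15-bit output, per the comments above
ALAW_BIAS = 1 << (ALAW_BASESHL - 1)    # half-step bias added to the mantissa
ALAW_LEADING = 0x10 << ALAW_BASESHL    # implicit leading one for non-zero exponents


def alaw_decode(code):
    """Expand one 8-bit A-law code into a signed linear sample."""
    code ^= 0x55                                   # undo the alternate-bit inversion
    sample = ((code & 0x0F) << ALAW_BASESHL) + ALAW_BIAS
    exponent = (code >> 4) & 7
    if exponent:
        sample += ALAW_LEADING
        exponent -= 1
    sample <<= exponent
    # The asm negates when bit 7 is clear (test ... wc followed by negnc).
    return sample if code & 0x80 else -sample


print(alaw_decode(0xD5), alaw_decode(0x55))   # smallest positive and negative steps
print(alaw_decode(0xAA), alaw_decode(0x2A))   # largest positive and negative values
```

With this scale the output runs from -16128 to +16128, so it could be fed to the same volume stage as the halved 16-bit samples.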

JonnyMac · 2021-02-04 20:04

@Wuerfel_21 said:
You understood SAL correctly, it's just that doing it that way effectively halves the resolution, because, for example, $71 becomes $71FF and $72 becomes $7200, which is almost the same value.

I think this would be more correct (untested though)
.mono8          rdbyte    ls, p_wav                             ' read sample
                add       p_wav, #1                             ' point to next
                subs      ls, #$80                              ' convert to signed
                muls      ls, #$0101                            ' expand sample
                sar       ls, #1                                ' make 1v p-p
                mov       rs, ls                                ' copy to right channel
                jmp       #.set_volume

Thanks, I will give that a try.

JonnyMac · 2021-02-04 20:08

@Wuerfel_21 said:
You might also want to add support for A-law compression (logarithmic PCM). It's 8 bits per sample, but it sounds so much better than straight 8 bit PCM and takes little code to decompress.
Here's the code for that (in P1 ASM, can be optimized a bit for P2):

              '' Decompress A-law
              '' Compressed sample is in atmp1
              '' output is in sfxsample, scale determined by ALAW_BASESHL (= 19 for 31 bit, would be 3 for 15 bit)
              '' alaw_bias is a constant: 1<<(ALAW_BASESHL-1)  
              '' alaw_leading is a constant: $10<<ALAW_BASESHL
              xor atmp1,#$55
              mov sfxsample,atmp1
              and sfxsample,#$0F ' mantissa isolated
              shl sfxsample,#ALAW_BASESHL
              add sfxsample,alaw_bias
              mov atmp2,atmp1
              shr atmp2,#4
              and atmp2,#7 wz ' exponent isolated
        if_nz add sfxsample,alaw_leading
        if_nz sub atmp2,#1
              shl sfxsample,atmp2
              test atmp1,#$80 wc
              negnc sfxsample,sfxsample

Again, thank you. The 8-bit uncompressed sounds a bit ratty, so I'll give this a try. Even running at 200MHz (my standard), there are about 4500 system ticks per 44.1kHz sample period -- there is plenty of time to do a bit of simple decompression. That said, my goal is always simplicity for the sake of teaching and inspiring. That code doesn't look bad, though.
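
The tick budget is just the system clock divided by the sample rate. A quick Python check (the function name is hypothetical, and integer division is an assumption about how the count would be rounded):

```python
def ticks_per_sample(clkfreq_hz, sample_rate_hz):
    """System clock ticks available in one sample period (integer division)."""
    return clkfreq_hz // sample_rate_hz


ticks = ticks_per_sample(200_000_000, 44_100)
print(ticks)   # 4535
```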

JonnyMac · 2021-02-05 16:01

I had a late-night code review with Chip -- he helped me track down a bug (I was setting the mode bits backward) and simplify the volume control code. It sounds better and looks good on a 'scope (1v, p-p, centered at 1v). I will move this code into a cog that will allow the user to start and stop playback at will, to put playback on hold for later release, to change left and right volume levels (0% to 100%) on-the-fly, and to change the playback rate (50% to 200%) on-the-fly.

Updated code is in the first post. If you're a FlexProp user, you'll have to make a small adjustment -- it's noted in the file.
