A simple dithered wav player - now dithers every 1 microsecond

pik33 Posts: 782
edited June 2012 in Propeller 1
Edit: now this object dithers every 1 microsecond.



I needed this for my VGA player - I want it to play more than just SIDs. Kye's WAV object from OBEX is good, with all its dithering, but it is simply too big and needs 2 cogs.

So I took the dithering code from it and wrote something like this. Maybe it will be useful for someone. The code needs cleaning (the aaa, bbb, ccc etc. variables need renaming, the constant "1812" needs to be computed from clkfreq, etc.)
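The constant 1812 is the number of system clock ticks between two samples. As a rough sketch of how it could be derived from the clock frequency (written in Python only so the arithmetic can be checked; `ticks_per_sample` and `actual_rate_hz` are illustrative names, not part of the object):

```python
def ticks_per_sample(clkfreq_hz: int, sample_rate_hz: int) -> int:
    """System clock ticks between two samples, rounded to the nearest tick."""
    return (clkfreq_hz + sample_rate_hz // 2) // sample_rate_hz


def actual_rate_hz(clkfreq_hz: int, ticks: int) -> float:
    """Playback rate that results from waiting `ticks` clocks per sample."""
    return clkfreq_hz / ticks


if __name__ == "__main__":
    clkfreq = 80_000_000                            # _clkfreq used by the demo
    print(ticks_per_sample(clkfreq, 44_100))        # ticks for 44.1 kHz
    print(round(actual_rate_hz(clkfreq, 1812), 1))  # rate with the hard-coded 1812
```

With an 80 MHz clock the rounded value is 1814, so the hard-coded 1812 plays about 0.1% fast (roughly 44.15 kHz). The post does not say whether that offset is intentional. In the object itself the same division could presumably be done in Spin before the cog is started.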

A demo plays wav.wav from SD. This wav has to be 44.1 kHz/16 bit stereo, because the player doesn't use any wav header information.
{{
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//  Simple sample player object
//  Piotr Kardasz pik33@o2.pl
//  Plays a 512 16-bit stereo samples buffer in endless loop
//  Used dither code from WAV player engine by Kwabena W. Agyeman
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
}}


CON

  _clkmode = xtal1+pll16x
  _clkfreq = 80_000_000

  _SD_DO = 0
  _SD_CLK = 1
  _SD_DI = 2
  _SD_CS = 3
  _SD_WP = -1 ' -1 ifnot installed.
  _SD_CD = -1 ' -1 ifnot installed.

VAR

  byte cog
  long buf[512]
  long bufnum

obj fat : "kyefat_lfn" 'use Kye's SD driver from OBEX here

pub demo |i

fat.FATEngineStart(_SD_DO, _SD_CLK, _SD_DI, _SD_CS, _SD_WP, _SD_CD, -1, -1, -1)
i:=fat.mountpartition(0)
start(11,10,6)

repeat
  fat.openfile(string("wav.wav"),"R")
  repeat
    repeat until bufnum>1024
    i:=fat.readdata(@buf,1024)

    repeat until bufnum<1024
    i:=fat.readdata(@buf+1024,1024)
  until i<>1024


pub getbuf
return @buf

pub getbufnum
return @bufnum


PUB start(left, right, ditherlevel)

'**** Starts a dithering sample player cog. Dither level=2..31

  bytefill(@buf,0,2048)                ' buf is 512 longs = 2048 bytes
  bufptr:=@buf
  stop

'**** Kye's dither procedure initialization

  ditherLeftCounterSetup := ((left & $1F) + constant(%00110 << 26))    ' %00110 = duty-cycle counter mode (the forum stripped the percent sign)
  ditherRightCounterSetup := ((right & $1F) + constant(%00110 << 26))
  ditherOutputMask := |<left | |<right
  leftDitherShift := rightDitherShift := (ditherlevel & $1F)
    
'**** start a cog

  cog := cognew(@init, @bufnum) + 1    ' store cog id + 1, so 0 means "not running" and stop can use cog-1
  return cog
  
PUB stop

if cog
  cogstop(cog-1)
cog := 0
  

DAT


                        org     0


init                    mov     ctra,               ditherLeftCounterSetup     ' Setup counter modes to duty cycle mode.
                        mov     ctrb,               ditherRightCounterSetup
                        mov     dira,               ditherOutputMask
  '
                        mov     bufptr2,par
                        mov     aaa,cnt
                        add     aaa,bbb

ditherLoop              cmp     aaa,cnt      wc

                 if_nc  jmp     #p00
                        add     aaa, bbb

                        mov     fff,bufptr
                        add     fff,bufcnt
                        rdword  ccc,fff
                        add     ccc,offset
                        and     ccc,ffff
                        shl     ccc,#13
                        mov     ggg,ccc
                        shl     ccc,#1
                        add     ggg,ccc
                        shl     ccc,#1
                        add     ccc,ggg

                        add     ccc,offset2
                        add     fff,#2
                        rdword  ddd,fff
                        add     ddd,offset
                        and     ddd,ffff
                        shl     ddd,#13
                        mov     hhh,ddd
                        shl     ddd,#1
                        add     hhh,ddd
                        shl     ddd,#1
                        add     ddd,hhh
                        add     ddd,offset2
                        add     bufcnt,#4
                        wrlong  bufcnt,bufptr2
                        and     bufcnt,eee

' **** Kye's dither procedure

p00                     test    leftLFSR,           leftTaps wc                ' Iterate left dither source
                        rcl     leftLFSR,           #1                         '

                        test    rightLFSR,          rightTaps wc               ' Iterate right dither source
                        rcl     rightLFSR,          #1                         '

                        mov     ditherlbuffer, ccc
                        mov     ditherLCounter,     leftLFSR                   '
                        sar     ditherLCounter,     leftDitherShift            '

                        mov     ditherrbuffer,ddd
                        mov     ditherRCounter,     rightLFSR                  '
                        sar     ditherRCounter,     rightDitherShift           '

                        add     ditherLBuffer,      ditherLCounter             ' Apply dither.
                        mov     frqa,               ditherLBuffer              ' Output.
                                               
                        add     ditherRBuffer,      ditherRCounter             ' Apply dither.
                        mov     frqb,               ditherRBuffer              ' Output.
                        jmp     #ditherLoop                                    '
                        

ditherAdjust            long    $80_00_00_00                                   ' Prevents popping.

leftTaps                long    $A4_00_00_80                                   ' Left LFSR taps.
rightTaps               long    $80_A0_10_00                                   ' Right LFSR taps.

leftLFSR                long    1                                              ' Initial value.
rightLFSR               long    1                                              ' Initial value.

ditherLeftCounterSetup  long    0
ditherRightCounterSetup long    0
ditherOutputMask        long    0

leftDitherShift         long    0
rightDitherShift        long    0

ditherLeftAddress       long    0
ditherRightAddress      long    0


aaa long 0
bbb long 1812
ccc long 0
ddd long 0
eee long $0000_07FF
fff long 0
ggg long 0
hhh long 0
bufptr long 0
bufcnt long 0
bufptr2 long 0
offset2 long $1000_0000
offset long $8000
ffff long $FFFF
ditherLBuffer           res     1
ditherLCounter          res     1
ditherRBuffer           res     1
ditherRCounter          res     1


                        fit     496
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//                                                  TERMS OF USE: MIT License
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation
// files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy,
// modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the
// Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
// Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
// WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
// COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Comments

  • JonnyMac Posts: 6,087
    edited June 2012
    The litmus test is playing a 200Hz pure sine wave (you can generate one in Audacity). I worked with Chip Gracey when developing the code for the EFX-TEK AP-16+ and he maintained then that you cannot dither in the same cog that is reading and playing your samples; the dithering must be constant and at such a rate as to induce white noise onto the signal that is filtered by the RC circuit connected to the output pin.
    Jon McPhalen
    Hollywood, CA
    It's Jon or JonnyMac -- please do not call me Jonny.
  • Kye Posts: 2,200
    edited June 2012
    I had to produce a dual cog solution also because of this. That's why the dither cog is running by itself. If you get some loud stereo speakers you can hear the difference. When using only a small desktop speaker the noise is hard to discern.
    Nyamekye,
  • pik33 Posts: 782
    edited June 2012
    Maybe. But this code plays MUCH better than code without any dithering. I will do this test with a 200 Hz sine wave.
  • jazzed Posts: 11,803
    edited June 2012
    >> obj fat : "kyefat_lfn" 'use Kye's SD driver from OBEX here

    Got a link? I don't see a file like that.

    Please post a full .zip package if possible.

    Thanks,
    --Steve
  • pik33 Posts: 782
    edited June 2012
  • jazzed Posts: 11,803
    edited June 2012
    Thanks. Noticed your license quote isn't spin commented. C++?
    --Steve
  • pik33 Posts: 782
    edited June 2012
    Now, this is a finished object.

    After all investigations:

    - the dithering effect is worse than in Kye's 2-cog version: the noise level is higher.
    - this object plays better than a sample player without any dither: here there is lower-level white noise, while without any dither there is louder, irregular noise

    - when not dithering at all, the noise depends on the wav file, the cogs used (less noise was achieved with cog 0 playing, cog 6 running the Spin interpreter and cog 7 the SD driver), and maybe the weather and currency exchange rate :)

    - it may be useful when no cogs are available for the 2-cog version.
  • pik33 Posts: 782
    edited June 2012
    So,

    (1) all can be done better
    (2) there is no such thing as "impossible" with the Propeller

    Kye's original dither procedure does one dithering loop in 20 instructions (= 1 us/loop).

    This procedure also does it every 1 us, evenly timed.

    The dither procedure doesn't read from the hub, so it is 15 instructions long.
    Calling it from the main loop leaves me 4 instructions, or 2 instructions plus one hub access, between calls. So this is 1 us dither combined with the sample play procedure.
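The 1 us figure can be checked with a little arithmetic. The sketch below assumes the usual Propeller 1 timing (4 clocks per ordinary instruction, one hub window per cog every 16 clocks) and the 80 MHz clock from the demo; the names are only illustrative.

```python
CLOCKS_PER_INSTRUCTION = 4   # most Propeller 1 instructions take 4 clocks
HUB_WINDOW_CLOCKS = 16       # each cog gets a hub window every 16 clocks

DITHER_BODY = 15             # dither routine, including its ret
CALL = 1                     # the call #dither itself
BETWEEN_CALLS = 4            # instructions executed between two calls


def period_us(instructions: int, clkfreq_hz: int = 80_000_000) -> float:
    """Time taken by a run of ordinary instructions, in microseconds."""
    clocks = instructions * CLOCKS_PER_INSTRUCTION
    return clocks * 1_000_000 / clkfreq_hz


def stays_hub_aligned(instructions: int) -> bool:
    """True if the period is a whole number of hub windows."""
    return (instructions * CLOCKS_PER_INSTRUCTION) % HUB_WINDOW_CLOCKS == 0


if __name__ == "__main__":
    total = DITHER_BODY + CALL + BETWEEN_CALLS
    print(total, period_us(total), stays_hub_aligned(total))
```

Because 20 instructions are 80 clocks, which is exactly five hub windows, a hub access placed in the same slot of every period should keep hitting its window. That is presumably why a hub read can replace two of the four instructions between calls without disturbing the timing.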

    Edit: what does this forum do with the percent sign??? I had to change to hex...
    DAT
    
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    '                       DAC Dither Driver
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    
                            org     0
    
    ' //////////////////////Initialization/////////////////////////////////////////////////////////////////////////////////////////
    
    init                    mov     ctra,               ditherLeftCounterSetup     ' Setup counter modes to duty cycle mode.
                            mov     ctrb,               ditherRightCounterSetup
                            mov     dira,               ditherOutputMask
      '
                            mov     bufptr2,par
                            mov     time,cnt
                            add     time,delay
    
    
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    '                       Dither
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    
    loop                    cmp     time,cnt      wc   'if it is not time to get next sample
                     if_c   jmp     #p01
                            nop
                            call    #dither
                            jmp     #loop
    
    p01
    
                            add     time,delay
                            call    #dither
                            mov     ptr,bufptr
                            add     ptr,bufcnt         'pointer to next sample
                            rdword  lsample,ptr        'get left sample
                            call    #dither            '16 instr
                            add     lsample,offset     'convert to unsigned
                            add     ptr, #2
                            rdword  rsample,ptr        'hub window after 18 instr.
                            call    #dither
                            and     lsample,ffff
                            shl     lsample,#13        'multiply *$E000 to give it margin for dither
                            mov     temp,lsample
                            shl     lsample,#1
                            call    #dither
                            add     temp,lsample
                            shl     lsample,#1
                            add     lsample,temp
                            add     lsample,offset2    '$10000000 up!
                            call    #dither
                            add     rsample,offset
                            and     rsample,ffff
                            shl     rsample,#13
                            mov     temp,rsample
                            call    #dither
                            shl     rsample,#1
                            add     temp,rsample
                            shl     rsample,#1
                            add     rsample,temp
                            call    #dither
                            add     rsample,offset2
                            mov     rs2,rsample
                            mov     ls2,lsample
                            add     bufcnt,#4
                            call    #dither
                            nop
                            and     bufcnt,bufmask
                            wrlong  bufcnt,bufptr2
                            call    #dither
                            jmp    #loop
    
    ' **** Kye's dither procedure
    
    dither                  test    leftLFSR,           leftTaps wc                ' Iterate left dither source
                            rcl     leftLFSR,           #1                         '
    
                            test    rightLFSR,          rightTaps wc               ' Iterate right dither source
                            rcl     rightLFSR,          #1                         '
    
                            mov     ditherlbuffer,      ls2
                            mov     ditherLCounter,     leftLFSR                   '
                            sar     ditherLCounter,     leftDitherShift            '
    
                            mov     ditherrbuffer,      rs2
                            mov     ditherRCounter,     rightLFSR                  '
                            sar     ditherRCounter,     rightDitherShift           '
    
                            add     ditherLBuffer,      ditherLCounter             ' Apply dither.
                            mov     frqa,               ditherLBuffer              ' Output.
    
                            add     ditherRBuffer,      ditherRCounter             ' Apply dither.
                            mov     frqb,               ditherRBuffer              ' Output.
    dither_ret              ret                                    '               ' 15 instruction
                            
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    '                       Data
    ' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    
    
    
    leftTaps                long    $A4_00_00_80                                   ' Left LFSR taps.
    rightTaps               long    $80_A0_10_00                                   ' Right LFSR taps.
    
    leftLFSR                long    1                                              ' Initial value.
    rightLFSR               long    1                                              ' Initial value.
    
    ' //////////////////////Configuration Settings/////////////////////////////////////////////////////////////////////////////////
    
    ditherLeftCounterSetup  long    0
    ditherRightCounterSetup long    0
    ditherOutputMask        long    0
    
    leftDitherShift         long    0
    rightDitherShift        long    0
    
    ' //////////////////////Addresses//////////////////////////////////////////////////////////////////////////////////////////////
    
    ditherLeftAddress       long    0
    ditherRightAddress      long    0
    
    ' //////////////////////Run Time Variables/////////////////////////////////////////////////////////////////////////////////////
    
    time       long 0
    delay      long 1812
    lsample    long 0
    rsample    long 0
    ls2 long 0
    rs2 long 0
    bufmask    long $0000_07ff
    ptr        long 0
    temp long 0
    
    bufptr  long 0
    bufcnt  long 0
    bufptr2 long 0
    offset2 long $1000_0100
    offset  long $8000
    ffff    long $FFFF
    
    ditherLBuffer           res     1
    ditherLCounter          res     1
    ditherRBuffer           res     1
    ditherRCounter          res     1
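For readers who want to follow the numbers without a Propeller, here is a small Python model of what the listing does to one sample. It is only a sketch: it assumes that `test ... wc` sets C to the parity of the masked bits, that `rcl` shifts C into bit 0, and it uses $1000_0000 for offset2 as the code comment says (the data section above has $1000_0100). The function names are made up for this illustration.

```python
MASK32 = 0xFFFFFFFF
LEFT_TAPS = 0xA4000080     # leftTaps from the listing
OFFSET2 = 0x10000000       # value named in the code comment


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def lfsr_step(lfsr: int, taps: int) -> int:
    """test lfsr,taps wc / rcl lfsr,#1: shift in the parity of the taps."""
    carry = bin(lfsr & taps).count("1") & 1
    return ((lfsr << 1) | carry) & MASK32


def scale_sample(sample16: int) -> int:
    """Signed 16-bit sample -> unsigned, times $E000, plus offset2."""
    unsigned = (sample16 + 0x8000) & 0xFFFF
    return (unsigned * 0xE000 + OFFSET2) & MASK32


def frq_value(sample16: int, lfsr: int, shift: int) -> int:
    """Value written to FRQA/FRQB: scaled sample plus shifted LFSR noise."""
    dither = to_signed32(lfsr) >> shift          # sar
    return (scale_sample(sample16) + dither) & MASK32


def fits_32bit(shift: int) -> bool:
    """True if the extreme samples plus extreme dither never wrap around."""
    lo = scale_sample(-32768) + (to_signed32(0x80000000) >> shift)
    hi = scale_sample(32767) + (to_signed32(0x7FFFFFFF) >> shift)
    return lo >= 0 and hi <= MASK32


if __name__ == "__main__":
    lfsr = 1
    for _ in range(4):
        lfsr = lfsr_step(lfsr, LEFT_TAPS)
        print(hex(frq_value(0, lfsr, 6)))
```

In this model silence (sample 0) maps to $8000_0000, i.e. 50% duty, and the scaled samples span $1000_0000 to $EFFF_2000, which leaves room at both ends for the dither noise at the demo's dither level of 6.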
    
  • pik33 Posts: 782
    edited June 2012
    And here is the zipped object:
  • Mark_T Posts: 1,679
    edited June 2012
    There is another approach to generating audio at high quality that I've been experimenting with and has some advantages:

    o Less audible noise than dithering (from what I can tell playing with the code examples in this thread)

    o Can be easily amplified in digital domain - duty-cycle mode signals have to be filtered to analog close to the Prop and aren't then amenable to digital transmission or class-D amplification.

    o Less sensitive to jitter and digital noise from neighbouring pins - the output frequency is 256 times lower being PWM

    Basically the scheme is a sigma-delta modulation scheme driving phase-invariant PWM (ie class-D) output. I've been using an audio sample rate of 48kHz and PWM at 192kHz (4-times oversampling). Using the standard 5MHz crystal (instead of my slightly overclocked 6.144MHz) these frequencies would be about 39kHz and 156kHz.

    Advantages:
    o Using a PWM output signal switching at 192kHz the output transitions are 2.6us apart rather than 10ns with duty-cycle mode - thus jitter in the output transitions of 50ps yields an error of 0.002% rather than 0.5% for duty cycle mode - this is a real issue I believe, as I found that changing the dithering for one channel affects the other's sound quality (try listening to only the left channel and then changing the feedback taps for the right channel)

    o Using 4-times oversampling allows sigma-delta techniques to permit significantly lower-resolution PWM than the audio samples. I use 24 bit audio and convert to 8 bit PWM. Clocking the PWM at 192kHz makes anti-alias filtering of the output simple, but is (just) low enough to be amplified digitally (class-D).

    o The use of a filter in the sigma delta conversion feedback path pushes the quantization noise into higher frequencies and away from the audio band. Dithering spreads quantization noise evenly across the available bandwidth - it is quite audible in the code posted above, whereas I struggle to hear any noise from the sigma-delta technique.

    o The combination of 4-times oversampling and 2-pole feedback filtering seems to produce very acceptable results with few demands on the processor (the filtering is two integrate steps) - I believe my output cog is in waitcnt for 75% of the time or so. This would allow it to do other work such as multiplying the signal by a volume level.
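The figures quoted in this post can be reproduced with a few lines of arithmetic. This is only a check of the numbers, with two assumptions that are not stated in the post: the 6.144 MHz crystal is multiplied by 16 by the PLL, and "transitions 2.6us apart" means two edges per PWM period. The function names are illustrative.

```python
def scaled_rate_hz(rate_hz: float, crystal_hz: float,
                   ref_crystal_hz: float = 6_144_000) -> float:
    """Rate obtained when the same code runs from a different crystal."""
    return rate_hz * crystal_hz / ref_crystal_hz


def clock_period_ns(crystal_hz: float, pll: int = 16) -> float:
    """System clock period, assuming a 16x PLL (an assumption)."""
    return 1e9 / (crystal_hz * pll)


def jitter_error_percent(jitter_s: float, interval_s: float) -> float:
    """Edge jitter as a percentage of the time between edges."""
    return 100.0 * jitter_s / interval_s


if __name__ == "__main__":
    print(scaled_rate_hz(48_000, 5_000_000))     # audio rate with a 5 MHz crystal
    print(scaled_rate_hz(192_000, 5_000_000))    # PWM rate with a 5 MHz crystal
    print(clock_period_ns(6_144_000))            # duty-mode edge spacing
    pwm_edge_spacing = 1 / (2 * 192_000)         # two edges per PWM period
    print(jitter_error_percent(50e-12, pwm_edge_spacing))
    print(jitter_error_percent(50e-12, 10e-9))
```

Under those assumptions the 5 MHz figures come out at 39062.5 Hz and 156250 Hz, the edge spacing at 192 kHz PWM is about 2.6 us, and 50 ps of jitter is about 0.002% of that spacing against 0.5% of a 10 ns duty-mode clock period.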

    My example code uses in-phase class-D outputs (two signals in-phase whose difference provides the signal - gives noticeably less power loss in the output stage) - this requires two counters for one audio channel as the outputs have different mark-space ratios - the signal is encoded as that difference in fact.

    Alternatively the more usual class-D modulation (where the outputs are in anti-phase) can be used, needing only one counter per audio channel.

    The demo in my example code uses a cordic routine to generate 24 bit sine wave with a slow frequency sweep.

    The guts of the sigma-delta conversion is:
    :loop           rdlong  samp, sampaddr          ' output value 32 bit signed (i.e. 24 bits left-justified)
                    sar     samp, #8
                                                    ' 8-bit output delta-sigma modulator, 2-pole noise-shaping filter:
                    add     accum, samp             ' integrator 1
                    add     accum2, accum           ' integrator 2
                    mov     top, accum2
                    sar     top, #16                ' 8 bits in the toppart
                    mov     feedb, top
                    shl     feedb, #16
                    sub     accum, feedb            ' feedback the quantized output to both integrators
                    sub     accum2, feedb
    
                    maxs    top, MAXVAL             ' clip so class D waveforms don't over modulate
                    mins    top, MINVAL
    
    
    
    The 8 bit quantized value in 'top' then goes to modulate the PWM... If the integrators are omitted then lots of stray spurious noises are present from the quantization mixing across the audio band. Using only one integrator is intermediate in effect. Adding another integrator will NOT work, however, as the system becomes unstable - more sophisticated digital filtering is needed for more than 2 poles to prevent the filter reaching 180° phase shift (two integrators are at the precise limit of stability).
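The PASM fragment above can be modelled step by step in Python to see that the coarse output tracks the fine input on average. This is a sketch, not the author's code: MAXVAL and MINVAL are not given in the post, so 127 and -128 are assumed here, and Python integers do not wrap at 32 bits (which does not matter for inputs in the 24-bit range).

```python
MAXVAL = 127     # assumed clip limits; the post does not give them
MINVAL = -128


def sigma_delta(samples32):
    """Two-integrator delta-sigma modulator, 32-bit samples in, 8-bit out."""
    accum = 0
    accum2 = 0
    out = []
    for sample in samples32:
        samp = sample >> 8                 # sar samp,#8 -> 24-bit signed
        accum += samp                      # integrator 1
        accum2 += accum                    # integrator 2
        top = accum2 >> 16                 # quantized output
        feedb = top << 16
        accum -= feedb                     # feed the quantized value back
        accum2 -= feedb
        out.append(max(MINVAL, min(MAXVAL, top)))   # maxs / mins clipping
    return out


if __name__ == "__main__":
    level = 0x123456                       # constant 24-bit input level
    out = sigma_delta([level << 8] * 4096)
    print(sum(out) / len(out), level / 65536)
```

For a constant input the output hops between neighbouring 8-bit codes, and its long-run average matches the input divided by 65536; the feedback keeps the accumulated difference between the two bounded.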
  • pik33 Posts: 782
    edited June 2012
    This is an interesting thing to experiment with. In reality, using duty mode, we have less than 11 bit DAC. So, maybe it is better to reduce resolution from 16 to 8 bit and add noise shaping filter. I will try it.
  • Mark_T Posts: 1,679
    edited June 2012
    pik33 wrote:
    This is an interesting thing to experiment with. In reality, using duty mode, we have less than 11 bit DAC. So, maybe it is better to reduce resolution from 16 to 8 bit and add noise shaping filter. I will try it.

    The output stage I use is based on MIC4422 MOSFET driver chips (about 1 ohm output resistance which isn't great, but they do switch nice and fast (25ns) and run from 4.5V to 18V and take 3.3V inputs fine). I run at about 8V and this drives an 8-ohm load (an old Acoustic Research AR18s speaker). The output filter is 10uH and 680nF, plus a big ferrite toroid over both speaker leads to reduce the common-mode high-frequency components.
    [Attached image: ClassD-output.jpg]
  • I've been working on class D again recently, and the limit I thought existed for only 2 integrate steps seems
    wrong - I can use 4 integrate steps, so long as I allow for the wider range of output values this creates.
    4 steps (or even 5) pushes the noise higher in frequency, but means there is an increase in amplitude
    of the output to accommodate the high frequency noise signal.

    I also have used 384kHz PWM to make output filtering easier.

    About 90dB signal/noise ratio seems possible from a 16 bit signal source using 6 bit PWM, but I see
    harmonic distortion due to output filter components, I believe mainly the inductors - some are a lot
    better than others. Gapped/unshielded seem to be best; toroidal inductors are hopeless, producing
    harmonic and intermodulation distortion in bucketloads. There is some information out there about
    using ferrite beads on class D outputs being really bad for distortion, for the same reason. Gapped
    or air-cored inductors are the lowest distortion, and of course unfiltered class-D is an option too
    (esp. for short cable runs).
  • Another observation I made is that class D boosted via a FAN7380 high/low MOSFET driver + MOSFET bridge
    produced much more noise than I expected when allowing the FAN7380's built-in shoot-through prevention
    mechanism to control the timing - providing separately timed high and low input signals was necessary to
    make the noise level acceptable.

    I guess the shoot-through prevention delay in the FAN7380 is pattern-sensitive and not a constant delay,
    thus causing intermodulation.

    The cleanest signal I get is with the MIC4422 driver chip directly, although its power handling and maximum
    voltage are limited. They internally have minimal shoot-through by just being extremely fast I believe.

    An issue I haven't figured out is whether free-wheel diodes would be a useful extra bit of protection circuitry
    for the outputs of the MIC4422 (ie whether its internal output drivers already have body diodes or not) - the
    typical class-D output filter is inductive.