A simple dithered wav player - now dithers every 1 microsecond

Edit: now this object dithers every 1 microsecond.
I needed this for my VGA player - I want it to play not only SIDs... Kye's WAV object from OBEX is good, with all this dithering, but it is simply too big and needs 2 cogs.
So I took the dithering code from it and wrote something like this. Maybe it can be useful for someone. The code needs cleaning (the aaa, bbb, ccc etc. variables need renaming, the constant "1812" needs to be calculated from clkfreq, etc.)
A demo plays wav.wav from SD. This wav has to be 44.1 kHz/16 bit stereo - the player doesn't use any wav header information
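A small Python sketch (a model of the arithmetic, not Propeller code) of how the "1812" constant could be derived from the clock frequency. The function names are made up for illustration, and rounding to the nearest tick is an assumption.

```python
def ticks_per_sample(clkfreq, sample_rate):
    """System clock ticks between two samples, rounded to the nearest tick."""
    return round(clkfreq / sample_rate)


def effective_rate(clkfreq, ticks):
    """Sample rate actually produced when a sample is fetched every `ticks` clocks."""
    return clkfreq / ticks


if __name__ == "__main__":
    print(ticks_per_sample(80_000_000, 44_100))  # 1814
    print(effective_rate(80_000_000, 1812))      # about 44.15 kHz with the hard-coded 1812
```

At 80 MHz the computed value is 1814 rather than 1812, so the hard-coded constant plays roughly 0.1% fast. In Spin the same value could presumably be obtained with `clkfreq / 44_100` before starting the cog.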
{{
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Simple sample player object
// Piotr Kardasz pik33@o2.pl
// Plays a 512 16-bit stereo samples buffer in endless loop
// Used dither code from WAV player engine by Kwabena W. Agyeman
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
}}
CON
  _clkmode = xtal1+pll16x
  _clkfreq = 80_000_000
  _SD_DO = 0
  _SD_CLK = 1
  _SD_DI = 2
  _SD_CS = 3
  _SD_WP = -1 ' -1 if not installed.
  _SD_CD = -1 ' -1 if not installed.
VAR
  byte cog
  long buf[512]
  long bufnum
OBJ
  fat : "kyefat_lfn" 'use Kye's SD driver from OBEX here
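PUB demo | i
  fat.FATEngineStart(_SD_DO, _SD_CLK, _SD_DI, _SD_CS, _SD_WP, _SD_CD, -1, -1, -1)
  i := fat.mountpartition(0)
  start(11, 10, 6)
  repeat
    fat.openfile(string("wav.wav"), "R")
    repeat
      repeat until bufnum > 1024            ' wait until the player is in the second half
      i := fat.readdata(@buf, 1024)         ' refill the first half
      repeat until bufnum < 1024            ' wait until the player wraps to the first half
      i := fat.readdata(@buf + 1024, 1024)  ' refill the second half
    until i <> 1024                         ' short read = end of file: reopen and play again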
PUB getbuf
  return @buf
PUB getbufnum
  return @bufnum
PUB start(left, right, ditherlevel)
'**** Starts a dithering sample player cog. Dither level=2..31
  longfill(@buf, 0, 512) ' clear the whole buffer (512 longs = 2048 bytes)
  bufptr := @buf
  stop
'**** Kye's dither procedure initialization
  ditherLeftCounterSetup := ((left & $1F) + constant(%00110 << 26)) ' %00110 = duty cycle mode (the forum ate the percent sign)
  ditherRightCounterSetup := ((right & $1F) + constant(%00110 << 26))
  ditherOutputMask := |<left | |<right
  leftDitherShift := rightDitherShift := (ditherlevel & $1F)
'**** start a cog
  cog := cognew(@init, @bufnum) + 1 ' keep cog ID + 1, so 0 means "not running" as stop expects
  return cog
PUB stop
  if cog
    cogstop(cog-1)
    cog := 0
DAT
org 0
init mov ctra, ditherLeftCounterSetup ' Setup counter modes to duty cycle mode.
mov ctrb, ditherRightCounterSetup
mov dira, ditherOutputMask
'
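mov bufptr2,par ' hub address of bufnum, passed by cognew
mov aaa,cnt ' aaa = time of the next sample fetch
add aaa,bbb ' bbb = system clocks per sample
ditherLoop cmp aaa,cnt wc ' C is set once cnt has passed aaa
if_nc jmp #p00 ' not time for a new sample yet: only dither
add aaa, bbb ' schedule the next sample
mov fff,bufptr
add fff,bufcnt ' fff = hub address of the next stereo sample
rdword ccc,fff ' left sample, signed 16 bit
add ccc,offset
and ccc,ffff ' convert to unsigned 0..$FFFF
shl ccc,#13
mov ggg,ccc
shl ccc,#1
add ggg,ccc
shl ccc,#1
add ccc,ggg ' ccc = sample * $E000 (7 << 13)
add ccc,offset2 ' centre it, leaving margin for the dither
add fff,#2
rdword ddd,fff ' right sample, same conversion as the left one
add ddd,offset
and ddd,ffff
shl ddd,#13
mov hhh,ddd
shl ddd,#1
add hhh,ddd
shl ddd,#1
add ddd,hhh
add ddd,offset2
add bufcnt,#4 ' advance by one stereo sample (4 bytes)
wrlong bufcnt,bufptr2 ' report the play position to Spin (bufnum)
and bufcnt,eee ' wrap at 2048 bytes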
' **** Kye's dither procedure
p00 test leftLFSR, leftTaps wc ' Iterate left dither source
rcl leftLFSR, #1 '
test rightLFSR, rightTaps wc ' Iterate right dither source
rcl rightLFSR, #1 '
mov ditherlbuffer, ccc
mov ditherLCounter, leftLFSR '
sar ditherLCounter, leftDitherShift '
mov ditherrbuffer,ddd
mov ditherRCounter, rightLFSR '
sar ditherRCounter, rightDitherShift '
add ditherLBuffer, ditherLCounter ' Apply dither.
mov frqa, ditherLBuffer ' Output.
add ditherRBuffer, ditherRCounter ' Apply dither.
mov frqb, ditherRBuffer ' Output.
jmp #ditherLoop '
ditherAdjust long $80_00_00_00 ' Prevents popping.
leftTaps long $A4_00_00_80 ' Left LFSR taps.
rightTaps long $80_A0_10_00 ' Right LFSR taps.
leftLFSR long 1 ' Initial value.
rightLFSR long 1 ' Initial value.
ditherLeftCounterSetup long 0
ditherRightCounterSetup long 0
ditherOutputMask long 0
leftDitherShift long 0
rightDitherShift long 0
ditherLeftAddress long 0
ditherRightAddress long 0
aaa long 0
bbb long 1812
ccc long 0
ddd long 0
eee long $0000_07FF
fff long 0
ggg long 0
hhh long 0
bufptr long 0
bufcnt long 0
bufptr2 long 0
offset2 long $1000_0000
offset long $8000
ffff long $FFFF
ditherLBuffer res 1
ditherLCounter res 1
ditherRBuffer res 1
ditherRCounter res 1
fit 496
{{
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// TERMS OF USE: MIT License
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation
// files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy,
// modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the
// Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the
// Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
// WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
// COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
}}
Comments
Got a link? I don't see a file like that.
Please post a full .zip package if possible.
Thanks,
--Steve
http://obex.parallax.com/objects/619/
--Steve
After all investigations:
- The dithering effect is worse than in Kye's 2-cog version: the noise level is higher.
- This object plays better than a sample player without any dither: it gives lower-level white noise, whereas without dither there is louder, irregular noise.
- Without any dithering, the noise depends on the wav file, the cogs used (less noise achieved with cog 0 playing, cog 6 spin interpreter, cog 7 sd), and maybe the weather and the currency exchange rate.
- It may be useful when there are not enough free cogs for the 2-cog version.
(1) all can be done better
(2) there is no such thing as "impossible" with the Propeller
Kye's original dither procedure does one dithering loop in 20 instructions (= 1 us/loop).
This procedure... also does it every 1 us, evenly timed.
The dither procedure doesn't read from the hub, so it is 15 instructions long.
Calling it from the main loop leaves me 4 instructions, or 2 instructions plus one hub access, between calls. So this is 1 us dither together with the sample play procedure.
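The timing budget above can be checked with a few lines of Python (a model only, not Propeller code). It assumes 4 system clocks per ordinary instruction at 80 MHz and ignores the extra wait of hub instructions; the function name is made up for illustration.

```python
CLOCKS_PER_INSTRUCTION = 4  # ordinary Propeller 1 instructions; hub accesses take longer


def loop_time_us(instructions, clkfreq=80_000_000):
    """Time in microseconds needed to execute `instructions` ordinary instructions."""
    return instructions * CLOCKS_PER_INSTRUCTION * 1_000_000 / clkfreq


if __name__ == "__main__":
    call = 1      # the call #dither itself
    dither = 15   # dither routine including its ret
    between = 4   # instructions left for the player between two calls
    print(loop_time_us(call + dither + between))  # 1.0
```

So one call, the 15-instruction routine and 4 instructions of player code add up to 20 instructions, which is the same 1 us period as Kye's original loop.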
Edit: what does this forum do with the percent sign??? I had to change to hex...
DAT
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
' DAC Dither Driver
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
org 0
' //////////////////////Initialization/////////////////////////////////////////////////////////////////////////////////////////
init mov ctra, ditherLeftCounterSetup ' Setup counter modes to duty cycle mode.
mov ctrb, ditherRightCounterSetup
mov dira, ditherOutputMask
'
mov bufptr2,par
mov time,cnt
add time,delay
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
' Dither
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
loop cmp time,cnt wc 'if it is not time to get next sample
if_c jmp #p01
nop
call #dither
jmp #loop
p01 add time,delay
call #dither
mov ptr,bufptr
add ptr,bufcnt 'pointer to next sample
rdword lsample,ptr 'get left sample
call #dither '16 instr
add lsample,offset 'convert to unsigned
add ptr, #2
rdword rsample,ptr 'hub window after 18 instr.
call #dither
and lsample,ffff
shl lsample,#13 'multiply *$E000 to give it margin for dither
mov temp,lsample
shl lsample,#1
call #dither
add temp,lsample
shl lsample,#1
add lsample,temp
add lsample,offset2 '$10000000 up!
call #dither
add rsample,offset
and rsample,ffff
shl rsample,#13
mov temp,rsample
call #dither
shl rsample,#1
add temp,rsample
shl rsample,#1
add rsample,temp
call #dither
add rsample,offset2
mov rs2,rsample
mov ls2,lsample
add bufcnt,#4
call #dither
nop
and bufcnt,bufmask
wrlong bufcnt,bufptr2
call #dither
jmp #loop
' **** Kye's dither procedure
dither test leftLFSR, leftTaps wc ' Iterate left dither source
rcl leftLFSR, #1 '
test rightLFSR, rightTaps wc ' Iterate right dither source
rcl rightLFSR, #1 '
mov ditherlbuffer, ls2
mov ditherLCounter, leftLFSR '
sar ditherLCounter, leftDitherShift '
mov ditherrbuffer, rs2
mov ditherRCounter, rightLFSR '
sar ditherRCounter, rightDitherShift '
add ditherLBuffer, ditherLCounter ' Apply dither.
mov frqa, ditherLBuffer ' Output.
add ditherRBuffer, ditherRCounter ' Apply dither.
mov frqb, ditherRBuffer ' Output.
dither_ret ret '
' 15 instruction
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
' Data
' /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
leftTaps long $A4_00_00_80 ' Left LFSR taps.
rightTaps long $80_A0_10_00 ' Right LFSR taps.
leftLFSR long 1 ' Initial value.
rightLFSR long 1 ' Initial value.
' //////////////////////Configuration Settings/////////////////////////////////////////////////////////////////////////////////
ditherLeftCounterSetup long 0
ditherRightCounterSetup long 0
ditherOutputMask long 0
leftDitherShift long 0
rightDitherShift long 0
' //////////////////////Addresses//////////////////////////////////////////////////////////////////////////////////////////////
ditherLeftAddress long 0
ditherRightAddress long 0
' //////////////////////Run Time Variables/////////////////////////////////////////////////////////////////////////////////////
time long 0
delay long 1812
lsample long 0
rsample long 0
ls2 long 0
rs2 long 0
bufmask long $0000_07ff
ptr long 0
temp long 0
bufptr long 0
bufcnt long 0
bufptr2 long 0
offset2 long $1000_0100
offset long $8000
ffff long $FFFF
ditherLBuffer res 1
ditherLCounter res 1
ditherRBuffer res 1
ditherRCounter res 1
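The sample conversion in the listing above (add offset, mask, shift by 13, multiply by 7 with shifts and adds, add offset2) can be modelled in Python to see which FRQ values it produces. This is only a sketch of the arithmetic; the function name is made up, and the default offset2 of $1000_0000 is the value from the first listing.

```python
def scale_sample(sample, offset2=0x1000_0000):
    """Model of the PASM conversion of one signed 16-bit sample to a 32-bit FRQ value."""
    word = sample & 0xFFFF            # what rdword returns for a signed sample
    v = (word + 0x8000) & 0xFFFF      # add offset, and ffff: signed -> unsigned
    a = v << 13                       # shl #13
    g = a                             # mov temp, sample
    a <<= 1                           # shl #1
    g += a                            # temp = 3 * (v << 13)
    a <<= 1                           # shl #1
    a += g                            # sample = 7 * (v << 13) = v * $E000
    return (a + offset2) & 0xFFFF_FFFF


if __name__ == "__main__":
    for s in (-32768, 0, 32767):
        print(s, hex(scale_sample(s)))
```

The output range runs from $1000_0000 to $EFFF_2000, with silence at $8000_0000, so about $1000_0000 of margin is left at both ends for the dither that is added afterwards.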
o Less audible noise than dithering (from what I can tell playing with the code examples in this thread)
o Can be easily amplified in digital domain - duty-cycle mode signals have to be filtered to analog close to the Prop and aren't then amenable to digital transmission or class-D amplification.
o Less sensitive to jitter and digital noise from neighbouring pins - the output frequency is 256 times lower being PWM
Basically the scheme is a sigma-delta modulation scheme driving phase-invariant PWM (ie class-D) output. I've been using an audio sample rate of 48kHz and PWM at 192kHz (4-times oversampling). Using the standard 5MHz crystal (instead of my slightly overclocked 6.144MHz) these frequencies would be about 39kHz and 156kHz.
Advantages:
o Using a PWM output signal switching at 192kHz the output transitions are 2.6us apart rather than 10ns with duty-cycle mode - thus jitter in the output transitions of 50ps yields an error of 0.002% rather than 0.5% for duty cycle mode - this is a real issue I believe, as I found that changing the dithering for one channel affects the other's sound quality (try listening to only the left channel and then changing the feedback taps for the right channel)
o Using 4-times oversampling allows sigma-delta techniques to permit significantly lower-resolution PWM than the audio samples. I use 24 bit audio and convert to 8 bit PWM. Clocking the PWM at 192kHz makes anti-alias filtering of the output simple, but is (just) low enough to be amplified digitally (class-D).
o The use of a filter in the sigma delta conversion feedback path pushes the quantization noise into higher frequencies and away from the audio band. Dithering spreads quantization noise evenly across the available bandwidth - it is quite audible in the code posted above, whereas I struggle to hear any noise from the sigma-delta technique.
o The combination of 4-times oversampling and 2-pole feedback filtering seems to produce very acceptable results with few demands on the processor (the filtering is two integrate steps) - I believe my output cog is in waitcnt for 75% of the time or so. This would allow it to do other work such as multiplying the signal by a volume level.
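The frequencies and jitter figures quoted above can be reproduced with a short Python calculation. This is a sketch under assumptions: a 16x PLL and 512 system clocks per PWM period are not stated in the post, they are simply the values that reproduce the quoted numbers, and the function names are made up.

```python
def pwm_rates(crystal_hz, pll=16, clocks_per_pwm_period=512, oversampling=4):
    """Return (PWM frequency, audio sample rate) in Hz for a given crystal."""
    sysclk = crystal_hz * pll
    pwm = sysclk / clocks_per_pwm_period
    return pwm, pwm / oversampling


def jitter_error_percent(jitter_s, transition_spacing_s):
    """Timing error of one output transition relative to the spacing between transitions."""
    return 100.0 * jitter_s / transition_spacing_s


if __name__ == "__main__":
    print(pwm_rates(6_144_000))                  # 192 kHz PWM, 48 kHz audio
    print(pwm_rates(5_000_000))                  # about 156 kHz PWM, 39 kHz audio
    print(jitter_error_percent(50e-12, 2.6e-6))  # PWM: roughly 0.002 %
    print(jitter_error_percent(50e-12, 10e-9))   # duty-cycle mode: roughly 0.5 %
```

The 2.6 us spacing is half of the 192 kHz PWM period, i.e. the distance between a rising and a falling edge at 50% duty.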
My example code uses in-phase class-D outputs (two signals in-phase whose difference provides the signal - gives noticeably less power loss in the output stage) - this requires two counters for one audio channel as the outputs have different mark-space ratios - the signal is encoded as that difference in fact.
Alternatively the more usual class-D modulation (where the outputs are in anti-phase) can be used, needing only one counter per audio channel.
The demo in my example code uses a cordic routine to generate 24 bit sine wave with a slow frequency sweep.
The guts of the sigma-delta conversion is:
:loop rdlong samp, sampaddr ' output value 32 bit signed (i.e. 24 bits left-justified)
sar samp, #8
' 8-bit output delta-sigma modulator, 2-pole noise-shaping filter:
add accum, samp ' integrator 1
add accum2, accum ' integrator 2
mov top, accum2
sar top, #16 ' 8 bits in the toppart
mov feedb, top
shl feedb, #16
sub accum, feedb ' feedback the quantized output to both integrators
sub accum2, feedb
maxs top, MAXVAL ' clip so class D waveforms don't over modulate
mins top, MINVAL
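A rough Python model of the loop above, for experimenting with it off the chip. It takes the samples as they are after the `sar samp, #8` (24-bit signed) and does not model 32-bit wraparound. The clip limits of 127 and -128 are placeholders, since the values of MAXVAL and MINVAL are not given in the post.

```python
def sigma_delta(samples, max_val=127, min_val=-128):
    """Two-integrator sigma-delta model: 24-bit signed samples in, 8-bit values out."""
    accum = 0    # integrator 1
    accum2 = 0   # integrator 2
    out = []
    for samp in samples:
        accum += samp
        accum2 += accum
        top = accum2 >> 16       # arithmetic shift, like sar: quantized output
        feedb = top << 16
        accum -= feedb           # feed the quantized output back to both integrators
        accum2 -= feedb
        out.append(max(min_val, min(max_val, top)))  # clip like maxs/mins
    return out


if __name__ == "__main__":
    level = 1_000_000                      # constant 24-bit input
    out = sigma_delta([level] * 1000)
    print(sum(out) / len(out), level / 65536)
```

For a constant input the output toggles between neighbouring 8-bit values, and its average settles on input / 65536, which is the point of the feedback: the quantization error is pushed into the fast toggling instead of staying in the audio band.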
The 8 bit quantized value in 'top' then goes to modulate the PWM... If the integrators are omitted then lots of stray spurious noises are present from the quantization mixing across the audio band. Using only one integrator is intermediate in effect. Adding another integrator will NOT work, however, as the system becomes unstable - more sophisticated digital filtering is needed for more than 2 poles to prevent the filter reaching 180 degrees of phase shift (two integrators are at the precise limit of stability).
The output stage I use is based on MIC4422 MOSFET driver chips (about 1 ohm output resistance, which isn't great, but they do switch nice and fast (25ns), run from 4.5V to 18V and take 3.3V inputs fine). I run at about 8V and this drives an 8-ohm load (an old Acoustic Research AR18s speaker). The output filter is 10uH and 680nF, plus a big ferrite toroid over both speaker leads to reduce the common-mode high-frequency components.
wrong - I can use 4 integrate steps, so long as I allow for the wider range of output values this creates.
4 steps (or even 5) pushes the noise higher in frequency, but means there is an increase in amplitude
of the output to accommodate the high frequency noise signal.
I also have used 384kHz PWM to make output filtering easier.
About 90dB signal/noise ratio seems possible from a 16 bit signal source using 6 bit PWM, but I see
harmonic distortion due to output filter components, I believe mainly the inductors - some are a lot
better than others (gapped/unshielded seem to be best - toroidal inductors are hopeless, producing
harmonic and intermodulation distortion in bucketloads). There is some information out there about
using ferrite beads on class D outputs being really bad for distortion, for the same reason. Gapped
or air-cored inductors give the lowest distortion, and of course unfiltered class-D is an option too
(esp for short cable runs).
produced much more noise than I expected when allowing the FAN7380's built-in shoot-through prevention
mechanism to control the timing - providing separately timed high and low input signals was necessary to
make the noise level acceptable.
I guess the shoot-through prevention delay in the FAN7380 is pattern-sensitive and not a constant delay,
thus causing intermodulation.
The cleanest signal I get is with the MIC4422 driver chip directly, although its power handling and maximum
voltage are limited. They internally have minimal shoot-through by just being extremely fast I believe.
An issue I haven't figured out is whether free-wheel diodes would be a useful extra bit of protection circuitry
for the outputs of the MIC4422 (ie whether its internal output drivers already have body diodes or not) - the
typical class-D output filter is inductive.