Audio Frequency Shifter (Hear your PING!)
Phil Pilgrim (PhiPi)
This is a continuation of my FIR Filter thread. I'm posting it as a new thread because it has more general appeal and doesn't require any special equipment besides a Propeller Demo Board and stereo headphones. Please refer to the original thread, though, for the more technical details and for how to create your own filter.
What this demo does is to shift the frequencies of sounds picked up by the Demo Board's microphone by a constant, user-selectable amount. In this manner, sounds which the microphone can hear but which we can't are brought into the human audio range. Also, sounds which we can hear can be shifted in frequency to let us pick out aspects that may have been more subtle at their normal frequency. Here's a block diagram of the demo, as implemented on the Demo Board:
Sounds picked up by the microphone are mixed with two square waves that are 90° out of phase with each other. The output of each mixer consists of signals representing the sum and difference frequencies of the audio and the square waves from the I and Q local oscillators. So, for example, if you had two audio frequencies of, say, 400 Hz and 2000 Hz and mixed them with 1000 Hz square waves, you would get signals of 600 Hz (1000 - 400), 1000 Hz (2000 - 1000), 1400 Hz (1000 + 400), and 3000 Hz (1000 + 2000). We are interested here only in the difference frequencies. The mixer products from input frequencies that start out above the local oscillator (LO) frequency are the "upper" outputs; those from audio below the LO frequency are the "lower" outputs. The upper and lower outputs may have the same frequency, but they can still be separated, because they emerge from the two mixers with different phase relationships.
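To make the phase argument concrete, here is a minimal Python sketch (not the Propeller code from the archive). It assumes ideal sine and cosine LOs instead of the square waves used on the Demo Board, and floating-point multiplication instead of the counter-based mixers. The function name, the 48 kHz simulation rate, and the one-second window are my own choices for illustration. It measures the phase of the Q mixer's difference product relative to the I mixer's for an input above and an input below a 1000 Hz LO. Both inputs produce a 400 Hz difference tone, but the sign of the phase is opposite.

```python
import cmath
import math

def quadrature_phase_deg(f_in, f_lo, fs=48_000, seconds=1.0):
    """Phase (degrees) of the Q mixer's difference product relative to the I mixer's.

    Assumes ideal sinusoidal LOs; frequencies should be whole Hz so that the
    one-second correlation window holds an integer number of cycles.
    """
    n_samples = int(fs * seconds)
    f_diff = abs(f_in - f_lo)
    i_bin = 0j
    q_bin = 0j
    for n in range(n_samples):
        t = n / fs
        audio = math.cos(2 * math.pi * f_in * t)
        i_mix = audio * math.cos(2 * math.pi * f_lo * t)   # I mixer output
        q_mix = audio * math.sin(2 * math.pi * f_lo * t)   # Q mixer output
        # Correlate with the difference frequency only (a single-bin DFT),
        # which ignores the sum-frequency product.
        probe = cmath.exp(-2j * math.pi * f_diff * t)
        i_bin += i_mix * probe
        q_bin += q_mix * probe
    return math.degrees(cmath.phase(q_bin / i_bin))

upper = quadrature_phase_deg(1400, 1000)   # input above the LO
lower = quadrature_phase_deg(600, 1000)    # input below the LO
print(f"upper: {upper:+.1f} deg, lower: {lower:+.1f} deg")
```

The two results differ only in sign, which is the property that lets a pair of filters plus an add or subtract keep one side and cancel the other.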
(Since the LO produces square waves, there are also mixer products from the audio and the LO's odd harmonics. Where the LO has a very high frequency w.r.t. the signal inputs, this is not a problem, since those signals are way outside the passband of anything downstream. In this case, however, there could be some harmonic products that get through. I attempted to produce a sine/cosine wave LO, which outputs DUTY mode pulses. The idea was that these pulses would mix statistically with the pulses from the sigma-delta counter output to produce the desired mixer products without the square wave harmonics. That may have been the case. But the method produced so much background white noise that it nearly obscured the audio. Nonetheless, I've left the sine/cosine LO in the archived code, in case anyone wants to play with it.)
The FIR filter software is what enables us to select either the upper or lower mixer products and suppress those from the "other side". Adding the two filtered mixer outputs selects the upper range; subtracting, the lower range. Typically, we use the upper range, because sounds that are higher in frequency there are also higher in frequency after mixing. In the lower range, any frequency shifts are inverted: a trombone that's sliding upward in frequency will sound like it's sliding downward after mixing if its original frequency is below that of the LO.
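The inversion in the lower range follows directly from the difference arithmetic. A small Python sketch, with a hypothetical helper name and made-up slide frequencies, shows a rising slide coming out rising in the upper range and falling in the lower range:

```python
def shift(f_in, lofreq):
    """Return (shifted frequency in Hz, which range the input falls in).

    Only the difference product is modeled; an input exactly at the LO
    frequency lands at 0 Hz and is labeled 'lower' here for simplicity.
    """
    side = "upper" if f_in > lofreq else "lower"
    return abs(f_in - lofreq), side

LOFREQ = 1000

# A slide that rises from 1300 Hz to 1500 Hz (above the LO)...
rising_above = [shift(f, LOFREQ)[0] for f in (1300, 1400, 1500)]
# ...and one that rises from 300 Hz to 500 Hz (below the LO).
rising_below = [shift(f, LOFREQ)[0] for f in (300, 400, 500)]

print(rising_above)   # still rising after the shift
print(rising_below)   # now falling after the shift
```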
In any event, the demo code allows you to select which output(s) you want to listen to, by assigning UPPER, LOWER, BOTH, or INPUT to the constant OUTPUT. Selecting BOTH puts the UPPER output in one stereo channel; LOWER, in the other. This produces some interesting spatialization effects with headphones, as does selecting INPUT, which puts the raw output of each mixer in its own stereo channel.
The two constants that you will want to play with are SAMPRATE and LOFREQ. LOFREQ selects the local oscillator frequency. This equals the amount by which the audio is shifted downward in frequency. SAMPRATE defines the rate (in Hz) used to sample the mixer outputs before they're filtered. This determines the frequency range of the shifted audio that you will be able to hear. The filter's passband (at 6 dB down) is preset to equal 45% of the sampling rate, centered on 1/4 of the sampling rate. For example, if you set LOFREQ to 2000 Hz, SAMPRATE to 1000 Hz, and OUTPUT to UPPER, the audio input will be shifted down by 2000 Hz. The passband for the filters will be 450 Hz wide, centered at 250 Hz. This means that you will hear signals that originally ranged from 2025 Hz to 2475 Hz translated to the 25 Hz to 475 Hz range. With this setting, tapping a crystal wine glass will sound like Big Ben, with pronounced vibrato not noticeable at the original frequency.
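The passband arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers given in the post (45% of the sampling rate wide, centered on 1/4 of the sampling rate); the function names are hypothetical and are not taken from the demo code.

```python
def passband(samprate):
    """Return (low, high) edges in Hz of the shifted audio passband."""
    width = samprate * 45 / 100     # 45% of the sampling rate
    center = samprate / 4           # centered on 1/4 of the sampling rate
    return center - width / 2, center + width / 2

def audible_input_range(lofreq, samprate, output="UPPER"):
    """Return the (low, high) range of original frequencies that get through."""
    low, high = passband(samprate)
    if output == "UPPER":
        return lofreq + low, lofreq + high
    if output == "LOWER":
        return lofreq - high, lofreq - low
    raise ValueError("output must be 'UPPER' or 'LOWER'")

print(passband(1000))                       # shifted range you will hear
print(audible_input_range(2000, 1000))      # original range, upper side
print(audible_input_range(39_000, 3900))    # the Ping))) settings
```

With the Ping))) settings, the original range that gets through brackets 40 kHz, which ends up near 1000 Hz after the shift.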
To hear the output from a Parallax Ping))) ultrasonic ranger, set LOFREQ to 39_000 Hz and SAMPRATE to 3900 Hz. (SAMPRATE should always be a submultiple of LOFREQ to avoid the generation of spurious beat frequencies between the two.) You will hear loud clicks when the output transducer is placed above the microphone and perhaps even echoes for nearby targets. But keep in mind that the microphone is not really optimal for picking up sounds at 40 kHz. For that you'd need an ultrasonic microphone or transducer.
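If you want to try other sampling rates, a quick way to honor the submultiple rule is to list the exact divisors of LOFREQ that fall in the range you care about. The Python helpers below are hypothetical and exist only for picking values to type into the constants.

```python
def is_submultiple(samprate, lofreq):
    """True if lofreq is an exact integer multiple of samprate."""
    return samprate > 0 and lofreq % samprate == 0

def submultiple_rates(lofreq, min_rate, max_rate):
    """List every submultiple of lofreq between min_rate and max_rate (Hz)."""
    return [
        lofreq // k
        for k in range(1, lofreq + 1)
        if lofreq % k == 0 and min_rate <= lofreq // k <= max_rate
    ]

print(is_submultiple(3900, 39_000))
print(submultiple_rates(39_000, 3000, 5000))
```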
One additional note: Don't expect to be able to transpose music by this technique. It won't sound right, because the frequency translation is constant, not proportional to the original frequency. So an 880 Hz "A" shifted down by 440 Hz will still be an "A", but the 1047 Hz "C" above it will end up at 607 Hz, which lies between "D" and "D#".
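The note arithmetic can be verified in Python by measuring each frequency in equal-tempered semitones relative to A 440. This is just a sketch; the note table and function names are mine.

```python
import math

NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def semitones_above_a440(freq):
    """Distance from A 440 in equal-tempered semitones (may be fractional)."""
    return 12 * math.log2(freq / 440.0)

def nearest_note(freq):
    """Return (note name, error in cents) for the closest equal-tempered note."""
    semis = semitones_above_a440(freq)
    nearest = round(semis)
    return NOTE_NAMES[nearest % 12], 100 * (semis - nearest)

SHIFT = 440
for original in (880, 1047):
    shifted = original - SHIFT
    print(original, nearest_note(original), "->", shifted, nearest_note(shifted))

# The interval between the two notes changes, which is why music sounds wrong.
before = semitones_above_a440(1047) - semitones_above_a440(880)
after = semitones_above_a440(607) - semitones_above_a440(440)
print(f"interval before: {before:.2f} semitones, after: {after:.2f} semitones")
```

The original pair is about three semitones apart; after the constant shift, the same pair is more than five semitones apart.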
Have fun with it! Questions and comments, as always, are invited.
-Phil
Comments
Unless I'm missing something, haven't you given us the Propeller version of the Bat Detector that was being discussed here:
http://forums.parallax.com/showthread.php?131953-The-Bat-detector-from-Nuts-and-Volts&highlight=detector
That thread was, in part, what inspired this investigation. The missing essentials are the input transducer and, probably, a preamp. Unfortunately, there are virtually no bats where I live, so it would be hard for me to test a completed bat detector. Hopefully, someone else can pick up the thread from here to try it out in the field.
-Phil
1) Can you make me sound more like a Dalek ("ex-termmm-in-ateeeee")?
2) Can you make 'monotonal speech' follow a programmed profile (a la the famous "autotune")?
3) I know just where to find bats, lots and lots of bats. They somehow relocated them all to a bend in the river. Might need a good boom mic to pick out a smaller sample rather than the whole screaming colony.
Great write-up.
I'm looking forward to a chance to try it out. I do have ultrasonic mics and other tools around here to compare it with.
1) Shifting all frequencies by a constant amount will not give you that effect, I'm afraid. As with music, you would need to shift proportionally to frequency.
2) Small shifts like that might be possible, if it's done in two stages: a: Shift to a much higher frequency with a variable LO; b: Shift back down into the audible range with a fixed LO. The reason for doing it like this is that you want to keep the LO out of the resulting passband, else the upper and lower channels will intersect each other. The quality from going through two shifts will likely suffer, though.
3) A directional mic would certainly be a benefit.
-Phil
Super neat, Phil!
I used to make pitch benders by having a circular buffer into which input samples were written at one rate, and output samples were read at another rate. The pitch change was always proportional using this technique - a simple ratio of output to input rates. The problem was when one pointer crossed the other there would be a step function in the output - a click. This could have been fixed by cross-fading between two buffers, but I couldn't figure out how to do that at the time.
It would be interesting to get a wide-band ultrasonic transducer and pitch-bend it down to 1/10th frequency. You could hear bats and insects pretty well, I suppose. Maybe even other stuff from man-made sources.
-Phil
Maybe 256 samples for an 8 kHz sample rate. This was on an Apple ][ and I was bending my 14-year-old voice DOWN an octave or two to sound scary.
It's a very simple concept and it works really well, except for the pointer crossings.
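For anyone who wants to experiment, here is a rough Python sketch of the circular-buffer idea described above. It is a reconstruction from the description, not the original Apple ][ code: one sample is written per tick, and the read pointer advances by a fractional ratio, with no interpolation or cross-fading. The names and the test tone are assumptions.

```python
import math

def pitch_bend(samples, ratio, buf_len=256):
    """Write samples into a circular buffer at one rate, read at ratio times that rate.

    ratio < 1 lowers the pitch, ratio > 1 raises it. Whenever one pointer laps
    the other, the output jumps by buf_len input samples, which is the source
    of the clicks mentioned above.
    """
    buf = [0.0] * buf_len
    out = []
    read_pos = 0.0
    for n, sample in enumerate(samples):
        buf[n % buf_len] = sample                   # write pointer: one step per sample
        out.append(buf[int(read_pos) % buf_len])    # read pointer: 'ratio' steps per sample
        read_pos += ratio
    return out

def zero_crossings(signal):
    """Count sign changes, a crude pitch estimate."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))

FS = 8000
tone = [math.sin(2 * math.pi * 1000 * n / FS + 0.3) for n in range(FS)]
bent = pitch_bend(tone, 0.5)    # one octave down

print(zero_crossings(tone), zero_crossings(bent))
```

With this particular tone the 256-sample jump happens to be a whole number of cycles, so no click appears. Try a 1030 Hz tone to get a discontinuity at each pointer crossing.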
You could also make a flange effect by inputting at a constant rate, then outputting at the same average rate, with sinusoidal slow-down and speed-up, remixing the output back into the input.
Rather than using two buffers and cross-fading, I'll bet a single buffer with an all-pass filter on the output would mitigate the clicks equally well. It would be easy enough to try, at least.
-Phil
My hearing is much better than my knees. The problem I have is knowing whether it's the knees or if my bicycle's power train needs lubrication.
-Phil
John Abshier
The 3 main types of bat detector are heterodyne (like Phil's), time expansion (like Chip's), and frequency division. A further distinction is full spectrum (including amplitude) vs frequency or zero crossing only. Time expansion detectors usually capture at a high rate, like 192kHz, into a buffer, and identified "events" are played back at a lower rate. The Wildlife Acoustics detector streams the full spectrum data at 192kHz onto a CF card for later playback, and has a squelch option to avoid recording a lot of noise. Bats can call at frequencies that span 8kHz to 120kHz, so any of these can require tuning manually or by some algorithm.
Phil, at 39kHz LOFREQ, that is 2051 clock cycles. Do you have an idea of the maximum sampling rate given code execution time? Nice try with the SIN/COS mixer. I have a question about the mixer. I see you use the logic mode xor (which the docs call LOGIC A <> B). That does seem like the logical choice, but could you elaborate on why you chose that mode? I was thinking about the synthesized audio that Ahle2 posted a couple of weeks ago. I don't know how he does it, if there is actual mixing from different cogs producing different audio tracks, and if so, how is analog mixing accomplished?
It simulates multiplication, which is how an analog product mixer operates. (Had I chosen logic equality, instead of XOR, the correlation would have been more apparent, but the results are the same.) Basically, the counter counts high pulses from the ADC when the LO output is low and low pulses from the ADC when the LO output is high. There's a more thorough explanation of the mixing process in this thread:
The only difference lies in the frequencies involved -- analog vs. RF.
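As a sanity check on the claim that XOR simulates multiplication, here is a short Python sketch. It treats each bit as a bipolar value (0 as -1, 1 as +1) and compares the XOR count over a window with the sum of products. The function names and the sample bit patterns are made up for illustration; the real work is done by the counter hardware.

```python
def to_bipolar(bit):
    """Map a logic level to a signed value: 0 -> -1, 1 -> +1."""
    return 2 * bit - 1

def xor_count(adc_bits, lo_bits):
    """What the counter accumulates: ticks where ADC and LO bits differ."""
    return sum(a ^ b for a, b in zip(adc_bits, lo_bits))

def product_sum(adc_bits, lo_bits):
    """What an ideal product mixer would accumulate on the bipolar signals."""
    return sum(to_bipolar(a) * to_bipolar(b) for a, b in zip(adc_bits, lo_bits))

# Bit by bit, the product is +1 when the bits match and -1 when they differ,
# so product == 1 - 2 * (a XOR b) for every combination.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, a ^ b, to_bipolar(a) * to_bipolar(b))

adc = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0]   # made-up sigma-delta bitstream
lo = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]    # made-up square-wave LO

# Over a window of N bits: product_sum == N - 2 * xor_count.
print(xor_count(adc, lo), product_sum(adc, lo))
```

The count and the product sum differ only by a fixed offset and a scale factor of -2. Logic equality instead of XOR would flip the sign, which is consistent with the remark that the results are the same.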
For combining various sounds into a single track, it might be as simple as scaling and adding their amplitudes. Once you deviate from a linear transform, though, as is done in the product mixer, you end up with frequency components other than the ones you started with.
-Phil
The Polaroid/Senscomp transducer does not need all of 300 V to operate as a microphone, and the current is sub-nanoamp. What! You don't have a B battery lying around? One option is a stack of 50 CR2032 coin cells (as if you have those.)
A voltage multiplier or switcher can give that boost with the necessary nA output. Here is one with the 4 stages, dual diodes at the output of one of those neat Coilcraft LPR6235 series transformers. The transformer steps up the input AC to about 25V, and the V-multiplier takes it up to 200 V+.
This program is quite fun to play with! Try speaking into the mic with the default settings and listen to yourself. Oddly enough, I did not notice the clicks that Chip mentioned, so I didn't bother to add any cross-fading or filtering.
One thing to keep in mind, if using this for downshifting ultrasonics, is that this method is much better suited to sounds of a continuous nature, rather than the intermittent chirps emitted by the Ping))) or by bats. The reason is that the buffer will get overwritten n times before it's read out once. Any sound that isn't sustained for at least n complete buffer write cycles will be missed entirely upon readout if it gets overwritten.
Have fun!
-Phil
When I was young (pre-teen) I used my Amiga 500 and an audio sampling tool named "Audio Master IV" for realtime pitchbending telephone pranks. I still have some tapes lying around.
I'm sure a circular buffer was used because that 7.14 MHz 68000 couldn't do many calculations at >30 kHz in realtime (can't remember the exact sample frequency).
The program I had
The computer I had
The sampler I had
@Tracy Allen
"Ahle2 posted a couple of weeks ago. I don't know how he does it, if there is actual mixing from different cogs producing different audio tracks, and if so, how is analog mixing accomplished?"
Retronitus uses a SINGLE cog for EVERYTHING (synthesis, mixing, audio output, music handling, instrument handling.. etc). You could launch it, give it a pointer to music and instrument data and then kill all other cogs; It would still play music.
The mixing of the 8 audio channels is accomplished by some shifting and adding to produce the final sample before output.
Now I feel waves of nostalgia pumping through my veins (or it may be serotonin)
I found another clip of "Audio master" running realtime audio with a home built sampler. (without pitchbend though)
I used CorelDraw to create the block diagram, then rendered it as a GIF in Corel PhotoPaint.
-Phil
Jonathan
So I guess it doesn't have to be ultrasonic.
-Phil